Deep Knowledge Tracing: Paper Study Notes (Brief Summary), Part 1
Contents
Deep Knowledge Tracing (DKT)
Recurrent Neural Networks(RNN)
Bayesian Knowledge Tracing (BKT)
Learning Factors Analysis (LFA)
Performance Factors Analysis (PFA)
SPARse Factor Analysis (SPARFA)
Deep Knowledge Tracing (DKT)
Chris Piech∗ , Jonathan Spencer∗ , Jonathan Huang∗‡, Surya Ganguli∗ , Mehran Sahami∗ , Leonidas Guibas∗ , Jascha Sohl-Dickstein∗† ∗Stanford University, †Khan Academy, ‡Google
Published at: NIPS'15 (an A-ranked AI conference)
Contributions
A novel application of recurrent neural networks (RNN) to tracing student knowledge. (Model Introduction)
Demonstration that our model does not need expert annotations. (Previous Work)
A 25% gain in AUC over the best previous result. (Experimental results)
The ability to power a number of other applications. (Other Applications)
Knowledge Tracing
Def: Knowledge tracing is the task of modelling student knowledge over time so that we can accurately predict how students will perform on future interactions.
Usually by observing the correctness of doing exercises.
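Concretely, DKT encodes each student interaction as a one-hot vector over (exercise, correctness) pairs: with M distinct exercises, the input at each time step has length 2M. A minimal sketch (the function name is mine; the encoding itself is the one described in the paper):

```python
import numpy as np

def encode_interaction(exercise_id, correct, num_exercises):
    """One-hot encode a (exercise, correctness) pair into a length-2M vector."""
    x = np.zeros(2 * num_exercises)
    # first M slots: answered correctly; last M slots: answered incorrectly
    x[exercise_id if correct else num_exercises + exercise_id] = 1.0
    return x

x = encode_interaction(exercise_id=3, correct=False, num_exercises=5)
# exactly one entry is set, in the "incorrect" half of the vector
```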
Motivation
- Develop computer-assisted education by building models of large scale student trace data on MOOCs.
- Resources can be suggested to students based on their individual needs.
- Content which is predicted to be too easy or too hard can be skipped or delayed.
- Formal testing is no longer necessary if a student’s ability undergoes continuous assessment.
- The knowledge tracing problem is inherently difficult. Most previous work in education relies on first order Markov models with restricted functional forms.
Recurrent Neural Networks(RNN)
Long Short Term Memory (LSTM)
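For reference, the paper's vanilla RNN variant uses the updates h_t = tanh(W_hx x_t + W_hh h_{t-1} + b_h) and y_t = σ(W_yh h_t + b_y), where y_t is interpreted as per-exercise correctness probabilities. A NumPy sketch (shapes and initialization here are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, h_prev, params):
    """One step of a vanilla RNN: returns the new hidden state and output."""
    W_hx, W_hh, b_h, W_yh, b_y = params
    h_t = np.tanh(W_hx @ x_t + W_hh @ h_prev + b_h)
    y_t = sigmoid(W_yh @ h_t + b_y)  # per-exercise correctness probabilities
    return h_t, y_t

rng = np.random.default_rng(0)
M, H = 5, 8  # number of exercises, hidden size (illustrative)
params = (rng.normal(size=(H, 2 * M)), rng.normal(size=(H, H)),
          np.zeros(H), rng.normal(size=(M, H)), np.zeros(M))

x = np.zeros(2 * M)
x[3] = 1.0  # one-hot input for a single interaction
h1, y1 = rnn_step(x, np.zeros(H), params)
```

The LSTM variant replaces this single tanh update with gated cell-state updates, but the input/output interface is the same.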
Previous Work
Bayesian Knowledge Tracing (BKT)
Standard Bayesian Knowledge Tracing (BKT) (1995)
Extensions:
- Contextualization of guessing and slipping estimates
- Estimating prior knowledge for individual learners
- Estimating problem difficulty

Drawbacks:
- The binary representation of student understanding may be unrealistic.
- The meaning of the hidden variables and their mappings onto exercises can be ambiguous, rarely meeting the model’s expectation of a single concept per exercise.
- The binary response data used to model transitions imposes a limit on the kinds of exercises that can be modeled.
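The standard BKT update can be sketched as follows: the posterior mastery probability is conditioned on the observed answer via the guess/slip parameters, then a learning transition is applied. The function name and parameter values below are illustrative; the update rule is the standard one:

```python
def bkt_update(p_mastery, correct, p_learn, p_guess, p_slip):
    """One BKT observation step: condition on the answer, then allow learning."""
    if correct:
        num = p_mastery * (1 - p_slip)
        den = num + (1 - p_mastery) * p_guess
    else:
        num = p_mastery * p_slip
        den = num + (1 - p_mastery) * (1 - p_guess)
    posterior = num / den                         # P(mastered | observation)
    return posterior + (1 - posterior) * p_learn  # learning transition

p = 0.3  # prior mastery (illustrative)
for obs in [True, True, False]:
    p = bkt_update(p, obs, p_learn=0.2, p_guess=0.2, p_slip=0.1)
```

Note how the state is a single scalar probability per skill, which is exactly the binary-representation restriction criticized above.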
Learning Factors Analysis (LFA)
SPARse Factor Analysis (SPARFA) (JMLR 2014)
The probabilities that the learners answer the questions correctly are governed by Z = WC + M, where W maps questions to concepts, C holds each learner’s concept knowledge, and M captures intrinsic question difficulty.
Three observations:
- Typical educational domains of interest involve only a small number of key concepts.
- Each question involves only a small subset of the abstract concepts.
- The entries of W should be non-negative.
Drawback: They are both more restricted in functional form and more expensive (due to inference of latent variables) than the method we present here.
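SPARFA maps Z through an inverse link function Φ to get response probabilities; the sketch below uses a logistic link as one common choice, and all variable names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
num_questions, num_concepts, num_learners = 4, 2, 3

# W: sparse, non-negative question-to-concept associations
W = np.abs(rng.normal(size=(num_questions, num_concepts)))
# C: each learner's knowledge of each concept
C = rng.normal(size=(num_concepts, num_learners))
# M: intrinsic difficulty per question, broadcast across learners
M = rng.normal(size=(num_questions, 1))

Z = W @ C + M
prob_correct = 1.0 / (1.0 + np.exp(-Z))  # logistic link as one choice of Phi
```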
Performance Factors Analysis (PFA)
RNN Model
In contrast to hidden Markov models as they appear in education, which are also dynamic, RNNs have a high-dimensional, continuous representation of latent state.
A notable advantage of the richer representation of RNNs is their ability to use information from an input in a prediction at a much later point in time.
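In DKT the output vector y_t is read as one predicted correctness probability per exercise, and training minimizes the negative log-likelihood of the next observed answer. A minimal sketch of that readout (the helper name is mine):

```python
import numpy as np

def next_step_loss(y_t, next_exercise, next_correct):
    """Binary cross-entropy on the entry of y_t for the next attempted exercise."""
    p = y_t[next_exercise]  # predicted P(correct) for that exercise
    return -(np.log(p) if next_correct else np.log(1.0 - p))

y_t = np.array([0.9, 0.4, 0.7])  # illustrative per-exercise predictions
loss = next_step_loss(y_t, next_exercise=1, next_correct=True)
```

Because the hidden state carries the whole interaction history, the entry read off at time t can depend on inputs seen many steps earlier, which is the long-range advantage noted above.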
Experiment results
Expectimax vs. mixing (exercises from different topics are intermixed) vs. blocking (students answer a series of exercises of the same type).
Tested different curricula for selecting exercises on a subset of five concepts over the span of 30 exercises from the ASSISTments dataset.
Future Work
- Incorporate other features as inputs (such as time taken)
- Explore other educational impacts (such as hint generation, dropout prediction)
- Validate hypotheses posed in the education literature (such as spaced repetition, modeling how students forget)
- Track knowledge over more complex learning activities