HCII PhD Thesis Proposal, "Building Better Behavioral Measurement Models from Fine-Grained Data with AI"

Steven C. Dang

Wednesday, June 17, 2020 - 12:00pm to 1:30pm
Meeting ID: 975 1423 1041


Ken Koedinger (HCII), Chair

Adam Perer (HCII)

Nikolas Martelaro (HCII)

Artur Dubrawski (RI)

Sidney D'Mello (University of Colorado: Boulder)



Making education truly personalized requires recognizing the whole learner. Educational technology has adopted advanced models of student knowledge and knowledge acquisition, but lacks similarly valid models of students' motivations. Development of motivation models has been held back by the challenge of operationalizing existing research into valid measurement models across educational technology products. Creating a measurement model is prone to the threat of mono-method bias, whereby any single measure is unlikely to capture the target construct accurately. Overcoming mono-method bias requires developers to devote additional effort to elaborating more measures into a measurement model, which may be methodologically challenging, labor intensive, or both. There is a need to scaffold measurement model development with intelligent tools to make it more attainable. In this thesis, I elaborate a semi-automated method for developing measurement models of student motivation that are less prone to mono-method bias threats. I evaluate the method through the development of a tool and its application to a large student dataset.


In my prior work, I have explored motivational measurement through the lens of students' behavioral diligence. When applying an existing operational model of diligence to observational log data, I found reliability shortcomings attributable to unencoded confounds across different curricula for which the measurement model could not account. Extending the idea of measurement scales from psychometrics, I was able to improve the reliability of the measurement model by adding more behavioral measures with different biases. These models used coarse-grained information about aggregate student behavior, but fine-grained data afford measures that can account for additional observable factors that may differ across students. Leveraging existing fine-grained models of unproductive gaming-the-system behaviors, I validated that such behaviors were likely indicators of student diligence. I was also able to leverage motivational theory to identify observable factors that partially explained the variability of such behaviors between students. Prior motivational theory does not adequately define how these factors interact to influence diligent behaviors. Therefore, the challenge remains how to use this information to define new behavioral measures for a measurement model.


In my proposed work, I introduce Behavioral Item Response Theory, a framework for defining an observational, behavior-based measurement model that addresses issues of mono-method bias. I explore the viability of the framework by using it to design a semi-automated method of building measurement models of diligence and evaluate its performance on a simulated dataset. In contrast to survey-question measures of motivational constructs, which are portable across learning environments, behavioral measures are more sensitive to the particulars of each learning environment, so model developers need support in identifying behavioral measures across learning environments. The proposed method uses a semi-automated, data-driven bootstrap approach to discover and define behavioral measures from fine-grained multivariate log data, while relying on a human in the loop to judge which behaviors are relevant for inclusion and to manage model bias threats. To support efficient review of large quantities of student behavior data, I build on prior research in text replay and explainable AI to develop an interactive visualization that summarizes event sequences, and I validate the viability of the approach with a randomized user study. I then evaluate the performance of the method on real student data through a user study that explores how domain experts are able to leverage algorithmic scaffolds for measurement model elaboration and evaluation. Overall, this proposed work makes theoretical and methodological contributions toward building measurement models more efficiently from fine-grained observational behavior data.


Document: https://drive.google.com/file/d/1J0UGcRtLib9zmiRrM8Eg9KK_-u9djLF0/view?usp=sharing

Queenie Kravitz