Models

A model is an abstract representation of a user's unique keystroke pattern.

Purpose of the Models


At the start of the project, we had not yet collected any data from our survey, so we relied on an online data set from Carnegie Mellon containing keystroke data from 51 typists to fine-tune and test the models. We created three models, each intended to abstractly represent a user’s keystroke dynamics given 10 initial inputs. This representation would then be compared against new login attempts to determine, based on the keystrokes, whether a new input came from the true user. Of the three models tested, the first two are statistical models and the third is a neural network.



Key Metrics



Figure 1: Receiver Operating Characteristic Curve.


Ultimately, we assess the models based on their True and False Accept Rates. The True Accept Rate is the rate at which a model correctly accepts the true user, and the False Accept Rate is the rate at which a model incorrectly accepts a false user as the true user. Ideally, we want a 100% True Accept Rate and a 0% False Accept Rate. In practice the two trade off against each other, and we take the best tradeoff to lie on the Equal Error Rate line, where True Accept Rate + False Accept Rate = 1. At that point the rate of rejecting the true user (1 minus the True Accept Rate) equals the False Accept Rate, so the model’s performance is equal between true and false user inputs. When we plot the True Accept Rate against the False Accept Rate on the ROC curve in Figure 1, we can read off the best accept rates where the curve intersects the black Equal Error Rate line.
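As a rough illustration of how these rates and the equal error point can be computed, here is a minimal sketch. It is not the project's evaluation code. It assumes each model outputs a distance-like score per login attempt (lower means more similar to the true user) and that an attempt is accepted when its score is at or below a threshold. The function names and the sample scores are hypothetical.

```python
import numpy as np


def accept_rates(genuine_scores, impostor_scores, threshold):
    """Return (TAR, FAR), accepting any attempt whose score is <= threshold.

    Assumption: scores are distances, so lower means "more like the true user".
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    tar = float(np.mean(genuine <= threshold))   # true user correctly accepted
    far = float(np.mean(impostor <= threshold))  # false user wrongly accepted
    return tar, far


def equal_error_point(genuine_scores, impostor_scores):
    """Sweep candidate thresholds; return the one where TAR + FAR is closest to 1."""
    candidates = np.unique(np.concatenate([genuine_scores, impostor_scores]))
    best = None
    for t in candidates:
        tar, far = accept_rates(genuine_scores, impostor_scores, t)
        gap = abs(far - (1.0 - tar))  # distance from the Equal Error Rate line
        if best is None or gap < best[0]:
            best = (gap, float(t), tar, far)
    _, threshold, tar, far = best
    return threshold, tar, far


# Hypothetical scores for illustration only (not data from the project).
genuine = [0.1, 0.2, 0.3, 0.4, 0.9]
impostor = [0.35, 0.6, 0.7, 0.8, 1.0]

threshold, tar, far = equal_error_point(genuine, impostor)
print(f"threshold={threshold:.2f}  TAR={tar:.2f}  FAR={far:.2f}")
```

Each candidate threshold gives one (False Accept Rate, True Accept Rate) point on the ROC curve. The sweep keeps the point nearest the Equal Error Rate line, which is a simple stand-in for reading the intersection off the plot.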