Wish list / Typing-Fingerprinting
« on: July 27, 2009, 05:08:38 pm »
As I've retired from my duties as GM lead and am no longer involved in development either, I think it's fair to note down some ideas that might be helpful in the future.
I worked on a proof-of-concept for this idea in Python for a while, and I think it can work with some tweaking.
The problem is the following:
It is hard to judge whether two characters in-game are played by the same person or not. As a GM you can still check whether the IP is the same, but as a player you can only judge by the way they act, taking into account factors like typing speed, spelling errors, word usage, inside knowledge and so on. The latter also only works if you know the two characters very closely.
As a GM you often don't have this close view of these characteristics, so if someone logs in with a different char and a different IP it's nearly impossible to tell for sure whether it's the same person or not.
This issue is the same for any application where identity is a critical factor, and therefore the approaches to the problem are manifold:
- using the MAC address
- using other hardware characteristics like the hard drive serial number, CPU and such
- using unique accounts that are sold in RL
- registering accounts by credit card number
- registering by making a payment via money transfer (PayPal, etc.)
Any of these measures, however, can be circumvented and is massively intrusive into the user's privacy, and none of them tackles the issue directly: identifying the user while they use the product.
The idea is to use what's already there: the player's typing while they play the game.
For this, all keystrokes are recorded client-side as pairs of (previous_stroke, current_stroke). These pairs are used as keys in a dictionary that records the corresponding timing and the number of times the combination was entered.
If the combination is entered again, the timing is updated, either by averaging or by using a weighted term.
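A minimal sketch of the recording step described above might look like this. It is not the prototype itself: the class name `DigraphRecorder`, the `weight` parameter and the choice of the time between two consecutive key presses as "the timing" are all assumptions made for illustration.

```python
import time


class DigraphRecorder:
    """Records the average delay between consecutive keystrokes per key pair."""

    def __init__(self, weight=None):
        # weight=None: plain running average over all samples.
        # Otherwise: weighted update, new = (1 - weight) * old + weight * sample.
        self.weight = weight
        # (previous_stroke, current_stroke) -> [average_delay_seconds, count]
        self.stats = {}
        self._prev_key = None
        self._prev_time = None

    def on_key(self, key, timestamp=None):
        """Feed one keystroke; timestamp is in seconds (defaults to now)."""
        if timestamp is None:
            timestamp = time.monotonic()
        if self._prev_key is not None:
            delay = timestamp - self._prev_time
            pair = (self._prev_key, key)
            if pair not in self.stats:
                self.stats[pair] = [delay, 1]
            else:
                avg, count = self.stats[pair]
                if self.weight is None:
                    avg += (delay - avg) / (count + 1)
                else:
                    avg = (1 - self.weight) * avg + self.weight * delay
                self.stats[pair] = [avg, count + 1]
        self._prev_key = key
        self._prev_time = timestamp
```

A real client would call `on_key` from its key-press handler. Long pauses (the player stops to think or walks away) would probably need to be discarded before they reach the average, which this sketch does not do.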
Every now and again the whole client-side dictionary is read out and turned into a list sorted by timing (fastest combinations first). All combinations that were entered only once, or fewer than some threshold of X times, are left out.
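The read-out step could be sketched roughly as follows, assuming the dictionary maps each key pair to an (average delay, count) tuple. The function name `build_report` and the default threshold of 3 are hypothetical; the post leaves X open.

```python
def build_report(stats, min_count=3):
    """Return [(pair, average_delay), ...] sorted fastest first.

    stats maps (previous_stroke, current_stroke) -> (average_delay, count).
    Pairs seen fewer than min_count times are dropped as too noisy.
    """
    frequent = [
        (pair, avg)
        for pair, (avg, count) in stats.items()
        if count >= min_count
    ]
    return sorted(frequent, key=lambda item: item[1])
```

For example, with `{("t", "h"): (0.12, 40), ("q", "z"): (0.30, 1), ("h", "e"): (0.09, 25)}` the rarely typed pair `("q", "z")` is left out and `("h", "e")` comes first because it is the fastest.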
This sorted list is sent to the server, where a similar list is kept for every account and updated each time a client reports back.
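On the server side, updating the stored list for an account could look something like the sketch below. The post does not say how the update works, so the weighted blend, the `weight` default and the names `merge_report` and `as_sorted_list` are assumptions.

```python
def merge_report(profile, report, weight=0.2):
    """Fold one client report into an account's stored profile (in place).

    profile: dict mapping key pair -> stored average delay in seconds
    report:  list of (pair, average_delay) tuples as sent by the client
    weight:  how strongly a new report pulls the stored value (assumed)
    """
    for pair, delay in report:
        if pair in profile:
            profile[pair] = (1 - weight) * profile[pair] + weight * delay
        else:
            profile[pair] = delay
    return profile


def as_sorted_list(profile):
    """Return the stored profile as a list sorted fastest first."""
    return sorted(profile.items(), key=lambda item: item[1])
```

Storing the profile as a dictionary and sorting it only when needed keeps the update cheap; the sorted form is what gets compared between accounts.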
The sorted list provides a kind of distribution that shows which keystroke combinations the user has trained best. This is a distinctive feature that can be used to tell one player from another, and also to tell who is using a bot and who isn't.
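How two sorted lists would be compared is not spelled out here. One possible measure, shown purely as an assumption, is the Spearman rank correlation over the key pairs that both lists share: it looks only at the order of the combinations, not at the absolute timings. The function name `rank_similarity` is hypothetical.

```python
def rank_similarity(list_a, list_b):
    """Spearman rank correlation over the pairs present in both lists.

    Each list holds key pairs ordered fastest first, without duplicates.
    Returns a value between -1 (opposite order) and 1 (same order),
    or None if fewer than two pairs are shared.
    """
    shared = set(list_a) & set(list_b)
    if len(shared) < 2:
        return None
    # Re-rank within the shared pairs only, keeping each list's order.
    order_a = [pair for pair in list_a if pair in shared]
    order_b = [pair for pair in list_b if pair in shared]
    rank_b = {pair: rank for rank, pair in enumerate(order_b)}
    n = len(shared)
    d_squared = sum((rank - rank_b[pair]) ** 2 for rank, pair in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A value close to 1 would suggest the same typist, but where to put the cut-off, and how many shared pairs are needed before the number means anything, would have to be found by testing with real players.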
I used Python with Tkinter to build a prototype for this, which worked quite well in the first tests I ran.
To reduce the load on the server, it might be possible to identify the combinations used most often in English, use these as the base for the distribution, and then take a discrete function to describe each user's unique typing fingerprint (using a fast Fourier transform, for example).
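One way to read this idea is sketched below; it is a guess at what it could look like, not a worked-out design. The timings are sampled at a fixed list of reference pairs, which gives a discrete function, and only the first few Fourier coefficients are kept as a compact descriptor. The `REFERENCE_PAIRS` list is illustrative rather than a measured ranking, the names are hypothetical, and the transform is a plain discrete Fourier transform written with the standard library (an FFT would compute the same coefficients faster).

```python
import cmath

# Illustrative reference digraphs; an assumption, not a measured ranking.
REFERENCE_PAIRS = [
    ("t", "h"), ("h", "e"), ("i", "n"), ("e", "r"),
    ("a", "n"), ("r", "e"), ("o", "n"), ("a", "t"),
]


def timing_vector(profile, default=0.0):
    """Sample the profile at the fixed reference pairs, in a fixed order."""
    return [profile.get(pair, default) for pair in REFERENCE_PAIRS]


def dft(samples):
    """Plain O(n^2) discrete Fourier transform of a list of numbers."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def fingerprint(profile, keep=4):
    """Magnitudes of the first `keep` coefficients as a compact descriptor."""
    coefficients = dft(timing_vector(profile))
    return [abs(c) for c in coefficients[:keep]]
```

The server would then store `keep` numbers per account instead of the whole list. Whether so few coefficients still separate players well enough is an open question that only tests could answer.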