Author Identification


Harshit Maheshwari
Computer Science, IIT-Kanpur

Author identification is an area with wide scope for further research. Different authors have different styles of writing, and it is these small stylistic differences that separate one author from another.
In this project we experiment with the frequencies of punctuation marks, average word length, occurrences of unique words, and sentence length (calculated using the Stanford sentence parser) to try to identify the author.
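
The features listed above could be computed roughly as in the following sketch. It is illustrative only and is not the project's actual code: the function name `extract_features`, the feature names, and the choice of punctuation marks are hypothetical. A plain regular-expression split stands in for the Stanford parser, and "occurrences of unique words" is interpreted here as the ratio of distinct words to total words.

```python
import re
import string
from collections import Counter


def extract_features(text):
    """Return a dict of simple stylometric features for one text sample."""
    # Assumption: splitting on ., ! and ? approximates sentence boundaries;
    # the project itself used the Stanford parser for this step.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = max(len(words), 1)  # avoid division by zero on empty input

    features = {}

    # Punctuation frequency: occurrences of each mark per word.
    punct_counts = Counter(ch for ch in text if ch in string.punctuation)
    for mark in ",;:!?-":
        features["punct_" + mark] = punct_counts[mark] / n_words

    # Average word length in characters.
    features["avg_word_len"] = sum(len(w) for w in words) / n_words

    # Assumed interpretation of "unique words": distinct words / total words.
    features["unique_word_ratio"] = len(set(words)) / n_words

    # Average sentence length in words.
    features["avg_sentence_len"] = len(words) / max(len(sentences), 1)

    return features
```

Calling `extract_features` on each text sample yields one feature vector per sample, which can then be passed to any classifier.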
The most difficult task was coming up with different features and then selecting the most discriminating ones.
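
The text does not say how the most discriminating features were chosen. One common approach, shown here purely as an assumption, is to rank features by a Fisher-style score: the spread of per-author means divided by the spread within each author's samples. The function name `rank_features` and the input layout are hypothetical.

```python
from statistics import mean, pvariance


def rank_features(samples_by_author):
    """Rank feature names, most discriminating first.

    samples_by_author maps each author to a list of feature dicts,
    all of which share the same keys (hypothetical layout).
    """
    names = sorted(next(iter(samples_by_author.values()))[0])
    scores = {}
    for name in names:
        # One list of values per author for this feature.
        groups = [[s[name] for s in samples]
                  for samples in samples_by_author.values()]
        overall = mean(v for g in groups for v in g)
        # Between-author scatter: how far each author's mean is from the overall mean.
        between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups)
        # Within-author scatter: how much each author's own samples vary.
        within = sum(len(g) * pvariance(g) for g in groups)
        # Small constant guards against division by zero.
        scores[name] = between / (within + 1e-12)
    return sorted(names, key=lambda n: scores[n], reverse=True)
```

A feature scores highly when authors differ from each other on it while each author stays consistent across their own texts.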