Style mining of electronic messages for multiple authorship discrimination: First Results
S. Argamon, M. Saric, and S.S. Stein 2003 Proceedings of ACM Conference on Knowledge Discovery and Data Mining

This paper treats the problem of authorship attribution for electronic messages using computational stylistics, that is, text categorization by style. Writing style can reveal aspects of a text such as affect, genre, register, and personality. Stylistic analysis is independent of content and focuses on choices in lexical use, syntactic structure, and discourse strategy. Variants of the Exponentiated Gradient (EG) algorithm are used to categorize text by style and were found to be effective models for author identification. The authors highlight the importance of stylistic text analysis in processing complex text collections. The test corpus, which the authors will make publicly available, is one of the first collections for research on stylistic attribution.



Subject: Psychology
Resource Type: Scientific Resources:Overview/Reference Work, Conference Proceedings