Forensic Linguistics

Every individual uses language in ways that are unique to them, much like a fingerprint. This distinctive use of grammar and other language features is known as a person's idiolect. An idiolect can be used to identify the author of a document, which is an important aspect of forensic linguistics. In this project, we investigate authorship attribution for a variety of documents, including highly formal technical writing (Feng, Banerjee, and Choi; 2012b) and collaborative multi-author documents (Zuo, Zhao, and Banerjee; 2019).
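
As a rough illustration of the idea behind authorship attribution, the sketch below compares function-word frequency profiles using cosine similarity. It is a minimal toy example and not the method used in the cited papers; the word list, the function names (`profile`, `cosine`, `attribute`), and the sample texts are all assumptions made for illustration.

```python
from collections import Counter
import math
import re

# Hypothetical, very small function-word list chosen for illustration only;
# practical stylometric systems use far richer features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it",
                  "is", "was", "i", "for", "on", "with", "but"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute(unknown_text, known_texts_by_author):
    """Return the candidate author whose profile is most similar to the unknown text."""
    target = profile(unknown_text)
    return max(
        known_texts_by_author,
        key=lambda author: cosine(target, profile(known_texts_by_author[author])),
    )
```

Given a dictionary that maps each candidate author to a sample of their writing, `attribute` returns the candidate whose function-word profile is closest to that of the disputed text. With samples this small the result is not reliable evidence of authorship; the sketch only shows the general shape of the comparison.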

This is one of Banerjee's areas of interest within NLP. The project, however, is not his current focus. It sees sporadic progress when there are students interested in pursuing this topic.

Research Group

Publications

[Feng, Banerjee, and Choi; 2012b]
  • Song Feng, Ritwik Banerjee, and Yejin Choi. Characterizing Stylistic Elements in Syntactic Structure. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1522–1533. Association for Computational Linguistics, 2012. [ PDF ]
[Zuo, Zhao, and Banerjee; 2019]
  • Chaoyuan Zuo, Yu Zhao, and Ritwik Banerjee. Style Change Detection with Feed-forward Neural Networks. In Working Notes of CLEF 2019 – Conference and Labs of the Evaluation Forum, Vol. 2380. CEUR Workshop Proceedings (CEUR-WS.org), 2019. [ PDF ]