Deception in Language

Language is a medium for conveying information, but unfortunately it is also often used to deceive. We have explored the detection of deceptive language in online reviews and found that stylometric aspects of language play an important role in exposing the deceptive intent of writers (Feng, Banerjee, and Choi 2012a). We have also studied how such language is produced by investigating the differences in how people type when writing truthful as opposed to deceptive texts, which revealed interesting parallels between typing patterns and speech patterns when people lie (Banerjee et al. 2014). On a related note, we have also used stylometric aspects of language to identify the traits of individual writers (Feng, Banerjee, and Choi 2012b).

Research Group

Research Products

  • The dataset contains truthful and deceptive writings from two domains: business reviews and essays on two topics of social interest (gun control and gay marriage). The data is available for download as compressed tar.bz2 files:
The uncompressed dataset consists of files with tab-separated values. The key log data is found in the last column, titled ReviewMeta. This field contains a list of KeyUp, KeyDown, and MouseUp event logs. Note that the first event timestamp is not always zero. The event logs have the following formats:
    [timestamp] KeyUp/KeyDown [javascript keycode]
    [timestamp] MouseUp [begin-index] [end-index]
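
The two event formats above can be read with a small parser. The following is a minimal sketch, not the authors' own tooling. It assumes that the events in a ReviewMeta value are separated by whitespace, and that timestamps, key codes, and indices are integers; neither detail is stated above, so adjust the tokenization if the actual files differ. The names Event, parse_events, and rebase_timestamps are hypothetical and chosen only for illustration. Because the first timestamp is not always zero, the sketch also shows how to shift timestamps so that each log starts at zero.

```python
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    """One logged event (hypothetical container, not part of the dataset)."""
    timestamp: int
    kind: str                      # "KeyUp", "KeyDown", or "MouseUp"
    keycode: Optional[int] = None  # set for KeyUp/KeyDown events
    begin: Optional[int] = None    # set for MouseUp events
    end: Optional[int] = None      # set for MouseUp events


def parse_events(review_meta: str) -> List[Event]:
    """Parse a ReviewMeta string into events.

    Assumption: tokens are whitespace-separated and all numbers are integers.
    """
    tokens = review_meta.split()
    events: List[Event] = []
    i = 0
    while i < len(tokens):
        if i + 1 >= len(tokens):
            raise ValueError("truncated event at token %d" % i)
        timestamp = int(tokens[i])
        kind = tokens[i + 1]
        if kind in ("KeyUp", "KeyDown"):
            # [timestamp] KeyUp/KeyDown [javascript keycode]
            events.append(Event(timestamp, kind, keycode=int(tokens[i + 2])))
            i += 3
        elif kind == "MouseUp":
            # [timestamp] MouseUp [begin-index] [end-index]
            events.append(Event(timestamp, kind,
                                begin=int(tokens[i + 2]),
                                end=int(tokens[i + 3])))
            i += 4
        else:
            raise ValueError("unknown event type: %r" % kind)
    return events


def rebase_timestamps(events: List[Event]) -> List[Event]:
    """Shift timestamps so that the first event occurs at time zero."""
    if not events:
        return []
    t0 = events[0].timestamp
    return [e._replace(timestamp=e.timestamp - t0) for e in events]


# Made-up sample input, for illustration only (not taken from the dataset).
sample = "1500 KeyDown 72 1580 KeyUp 72 2100 MouseUp 0 5"
rebased = rebase_timestamps(parse_events(sample))
```

To apply this to the actual files, one option is to read each row with Python's csv.DictReader using delimiter="\t" and pass the ReviewMeta value to parse_events, assuming the first row of each file holds the column titles.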