TITLE

Automated Authorship Attribution Using Advanced Signal Classification Techniques

AUTHOR(S)
Ebrahimpour, Maryam; Putniņš, Tālis J.; Berryman, Matthew J.; Allison, Andrew; Ng, Brian W.-H.; Abbott, Derek
PUB. DATE
February 2013
SOURCE
PLoS ONE;Feb2013, Vol. 8 Issue 2, p1
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other based on a Support Vector Machine (SVM). The classification features we exploit are based on word frequencies in the text. We adopt an approach of preprocessing each text by stripping it of all characters except a-z and space. This is in order to increase the portability of the software to different types of texts. We test the methodology on a corpus of undisputed English texts, and use leave-one-out cross validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus. And finally, we apply our methodology to the question of the authorship of the Letter to the Hebrews by comparing it against a number of original Greek texts of known authorship. These tests identify where some of the limitations lie, motivating a number of open questions for future work. An open source implementation of our methodology is freely available for use at https://github.com/matthewberryman/author-detection.
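The preprocessing and word-frequency steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which is at the repository linked above): the function names `preprocess` and `word_frequencies` are hypothetical, and lowercasing the input, mapping whitespace to a single space, and normalizing counts to relative frequencies are all assumptions the abstract does not spell out.

```python
import re
from collections import Counter


def preprocess(text):
    """Reduce a text to the characters a-z and space.

    Assumptions (not stated in the abstract): uppercase letters are
    lowercased rather than dropped, and any whitespace run (including
    newlines) becomes a single space so adjacent words do not merge.
    """
    lowered = text.lower()
    spaced = re.sub(r"\s+", " ", lowered)      # newlines, tabs -> one space
    stripped = re.sub(r"[^a-z ]", "", spaced)  # delete everything else
    return re.sub(r" +", " ", stripped).strip()


def word_frequencies(text, vocabulary):
    """Return the relative frequency of each vocabulary word in the text.

    The resulting list is the kind of feature vector a classifier such
    as MDA or an SVM could consume; how the vocabulary is chosen is left
    open here.
    """
    words = preprocess(text).split()
    counts = Counter(words)
    total = len(words)
    return [counts[w] / total if total else 0.0 for w in vocabulary]
```

For example, `word_frequencies("The cat; the hat.", ["the", "cat", "dog"])` gives `[0.5, 0.25, 0.0]`: the cleaned text has four words, two of which are "the" and one of which is "cat".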
ACCESSION #
87623568

 

Related Articles

  • ROCK clustering algorithm based on the P system with active membranes. YUZHEN ZHAO; XIYU LIU; WENPING WANG // WSEAS Transactions on Computers;2014, Vol. 13, p289 

    The ROCK algorithm plays an important role in data mining and data analysis, which can help people discover knowledge from large amounts of data. In this paper, an improved ROCK algorithm based on the P system with active membranes is constructed. Since the P system has great parallelism, it...

  • The Application of Collaborative Filtering Recommendation and FP-Tree Algorithms in Building Web Mining System. Zhen Liu; Wen-xian Xiao; Xia-li Zhao; Qi Lu // Journal of Convergence Information Technology;Sep2012, Vol. 7 Issue 17, p215 

    Web mining is the application of data mining on the Web, which uses data mining techniques to extract interesting, useful patterns and implicit information from network resources. Collaborative filtering technology determines the degree of similarity by comparing users' past interests and behavior, to...

  • Association Rule Pruning based on Interestingness Measures with Clustering. Kannan, S.; Bhaskaran, R. // International Journal of Computer Science Issues (IJCSI);Jul2010, Vol. 7 Issue 4, p35 

    Association rule mining plays a vital part in knowledge mining. The difficult task is discovering knowledge or useful rules from the large number of rules generated for reduced support. For pruning or grouping rules, several techniques are used such as rule structure cover methods, informative...

  • IARM with User Specified Constraint and K-Subset Methodology. Kalmodia, Sangita; Mungara, Jitendranath // International Journal of Database Theory & Application;Dec2012, Vol. 5 Issue 4, p73 

    Considers the problem of discovering interesting association rules among item sets in a database. Algorithms for mining association rules are practical methods for finding interesting rules implied in large databases. The authors propose an innovative approach, beyond minimum support and minimum confidence...

  • Study on the Improvement of TFIDF Algorithm in Data Mining. Hongfei Sun; Wei Hou // Advanced Materials Research;2014, Issue 1042, p106 

    To remedy the defects of traditional data mining methods and improve the data mining effect, the paper proposes an improved TFIDF algorithm and applies it to data mining, based on an analysis of the flaws and insufficiencies in the simple calculation method, minimum value...

  • Association Rule Mining and Network Analysis in Oriental Medicine. Yang, Dong Hoon; Kang, Ji Hoon; Park, Young Bae; Park, Young Jae; Oh, Hwan Sup; Kim, Seoung Bum // PLoS ONE;Mar2013, Vol. 8 Issue 3, p1 

    Extracting useful and meaningful patterns from large volumes of text data is of growing importance. In the present study we analyze vast amounts of prescription data, generated from the book of oriental medicine to identify the relationships between the symptoms and the associated medicines used...

  • Identification of Bicluster Regions in a Binary Matrix and Its Applications. Chen, Hung-Chia; Zou, Wen; Tien, Yin-Jing; Chen, James J. // PLoS ONE;Aug2013, Vol. 8 Issue 8, p1 

    Biclustering has emerged as an important approach to the analysis of large-scale datasets. A biclustering technique identifies a subset of rows that exhibit similar patterns on a subset of columns in a data matrix. Many biclustering methods have been proposed, and most, if not all, algorithms...

  • Quantifying Collective Attention from Tweet Stream. Sasahara, Kazutoshi; Hirata, Yoshito; Toyoda, Masashi; Kitsuregawa, Masaru; Aihara, Kazuyuki // PLoS ONE;Apr2013, Vol. 8 Issue 4, p1 

    Online social media are increasingly facilitating our social interactions, thereby making available a massive “digital fossil” of human behavior. Discovering and quantifying distinct patterns using these data is important for studying social behavior, although the rapid time-variant...

  • ALGORYTM WSPIERAJĄCY PROCES PODEJMOWANIA DECYZJI O STANIE OBIEKTU Z WYKORZYSTANIEM SYGNAŁÓW WIBROAKUSTYCZNYCH. BONIECKI, RAFAŁ; MICIAK, MIROSŁAW // Studia i Materialy Polskiego Stowarzyszenia Zarzadzania Wiedza;2011, Issue 47, p36 

    In this paper we propose an algorithm for applications using Java technology and ORM (object-relational mapping). The main objective is to promote the application of decision-making on the state of technical facility on the basis of vibroacoustic
