Comparison of Collocation Extraction Measures for Document Indexing

Petrovic, Sasa; Snajder, Jan; Basic, Bojana Dalbelo; Kolar, Mladen
December 2006
Journal of Computing & Information Technology;Dec2006, Vol. 14 Issue 4, p321
Academic Journal
Automatic extraction of collocations from a corpus is a well-known problem in the field of natural language processing. It is typically carried out using a statistical measure that indicates whether two words occur together more often than chance would predict. As various authors have proposed an abundance of such measures, we have compared some of them on the task of extracting collocations from a corpus of Croatian legal documents for the purpose of document indexing. We also propose and evaluate extensions of these measures for collocations consisting of three words.
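
As an illustration of what such a statistical measure can look like, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one widely used association measure. The abstract does not name the measures that were compared, so the choice of PMI is an assumption made here for illustration, and the function name pmi_scores and its min_count parameter are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score each adjacent word pair by pointwise mutual information.

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ).
    A positive score means the pair occurs more often than it would
    if the two words were independent. Illustrative sketch only.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose estimates are unreliable
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


if __name__ == "__main__":
    toks = "new york is big new york is far".split()
    ranked = sorted(pmi_scores(toks, min_count=2).items(),
                    key=lambda kv: kv[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 3))
```

Candidate pairs are ranked by score, and the top of the ranking is taken as the list of likely collocations. PMI tends to overrate rare pairs, which is why a frequency cutoff such as min_count is commonly applied first.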


Related Articles

  • Augmented Transition Network Based Morphological Tagger. Patil, Vijayalaxmi. R.; Nagalakshmi, S. // International Journal of Advanced Networking & Applications;Sep2010 Supplement, Vol. 2 Issue S1, p496 

    Natural language processing (NLP) is a very difficult and the most challenging problem in the field of Artificial Intelligence (AI). Natural language understanding is one of the tasks of NLP. Developing a system to understand the natural language is difficult mainly because of the ambiguity...

  • Pregroups and Natural Language Processing. Lambek, Joachim // Mathematical Intelligencer;Spring2006, Vol. 28 Issue 2, p41 

    The article shows that pregroups have been recently introduced in order to assist in natural language processing by examining small fragments of three modern European languages. This apparently modern algebraic concept of a pregroup had actually been around for some time in recreational...

  • Natural Language.  // Network Dictionary;2007, p330 

    An encyclopedia entry for the term "Natural Language" is presented. It is a language spoken or written by humans as opposed to a language used to program or communicate with computers. The study of natural language is one of the most complex areas of artificial intelligence, since human...

  • Query Based Intelligent Web Interaction with Real World Knowledge. Ong Sing Goh; Chun Che Fung; Kok Wai Wong // New Generation Computing;2008, Vol. 26 Issue 1, p3 

    This paper describes an integrated system based on open-domain and domain-specific knowledge for the purpose of providing query-based intelligent web interaction. It is understood that general purpose conversational agents are not able to answer questions on specific domain subject. On the other...

  • The Enhancement of Arabic Stemming by Using Light Stemming and Dictionary-Based Stemming. Alhanini, Yasir; Aziz, Mohd Juzaiddin Ab // Journal of Software Engineering & Applications;Sep2011, Vol. 4 Issue 9, p522 

    Word stemming is one of the most important factors that affect the performance of many natural language processing applications such as part of speech tagging, syntactic parsing, machine translation system and information retrieval systems. Computational stemming is an urgent problem for Arabic...

  • Relationship Between Natural Language Processing and AI. Joshi, Aravind K. // AI Magazine;Fall98, Vol. 19 Issue 3, p95 

    Opinion. Comments on the relationship between natural language processing and artificial intelligence, while focusing on the role of constrained formal-computational systems. Information on the use of constrained systems; Characterization of the approach using constrained formal-computational...

  • Determining Senses for Word Sense Disambiguation in Turkish. Orhan, Zeynep; Altan, Zeynep // International Journal of Humanities & Social Sciences;2007, Vol. 1 Issue 4, p174 

    Word sense disambiguation is an important intermediate stage for many natural language processing applications. The senses of an ambiguous word are the classification of usages for that specific word. This paper deals with the methodologies of determining the senses for a given word if they can...

  • Cognitive Approaches to Natural Language Processing. Ball, Jerry; Arney, Chris; Marcu, Mitchell; Nirenburg, Sergei // AI Magazine;Spring2008, Vol. 29 Issue 1, p100 

    The article reports on the symposium titled "Cognitive Approaches to Natural Language Processing." The symposium is developed by the Association for the Advancement of Artificial Intelligence (AAAI). It highlights research in natural language processing at the convergence of artificial (AI) and...

  • A New Direction in AI. Zadeh, Lotfi A. // AI Magazine;Spring2001, Vol. 22 Issue 1, p73 

    Outlines the computational theory of perceptions (CTP) and suggests that CTP can be used as an additional artificial intelligence tool in computing and reasoning with perception-based information. Principal features of CTP; Role of precisiated natural language (PNL) in CTP; Principal rule...

