Zipf's Law in a Random Text from English With a New Ranking Method

Saxena, Anurag; Jauhari, Monika; Gupta, B. M.
July 2007
DESIDOC Bulletin of Information Technology;Jul2007, Vol. 27 Issue 4, p51
Academic Journal
Zipf's law has repeatedly attracted informetricians, and many studies have explored its application to various areas. However, a few parameters largely affect any such study: the power law embedded in Zipf's law, the ranking method, the type of text studied, and the behaviour of the extreme regions of the Zipf curve. This paper addresses all of these points using a random English-language text from the computer science literature. The text is called random because of the highly specific nature of its technical words. The paper studies the properties of this text and compares the product of rank and frequency under three ranking procedures. It also analyses the behaviour of the data in the extreme regions of the Zipf curve. The ranking procedure and the type of text are observed to have a definite bearing on the behaviour of the Zipf curve.
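
The comparison of rank-frequency products described in the abstract can be sketched as follows. Zipf's law predicts that rank times frequency stays roughly constant, and the way tied frequencies are ranked changes that product. The abstract does not name the paper's three ranking procedures, so the three tie-handling conventions below ("ordinal", "min", "dense") and the function name `rank_frequency_products` are assumptions chosen purely for illustration, not the authors' method.

```python
from collections import Counter


def rank_frequency_products(words, method="ordinal"):
    """Return (word, frequency, rank, rank * frequency) rows.

    Hypothetical helper: the three methods are common tie-handling
    conventions, not necessarily those used in the paper.
      ordinal: every word gets a distinct rank (1, 2, 3, 4, ...)
      min:     tied words share the lowest rank, with gaps (1, 2, 2, 4, ...)
      dense:   tied words share a rank, without gaps (1, 2, 2, 3, ...)
    """
    if method not in ("ordinal", "min", "dense"):
        raise ValueError(f"unknown ranking method: {method}")

    counts = Counter(words)
    # Sort by descending frequency, then alphabetically so output is deterministic.
    items = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    rows = []
    rank = 0
    prev_freq = None
    for position, (word, freq) in enumerate(items, start=1):
        if method == "ordinal":
            rank = position
        elif freq != prev_freq:
            # A new frequency class starts here.
            rank = position if method == "min" else rank + 1
        prev_freq = freq
        rows.append((word, freq, rank, rank * freq))
    return rows


if __name__ == "__main__":
    sample = "a a a b b c c d".split()
    for method in ("ordinal", "min", "dense"):
        products = [row[3] for row in rank_frequency_products(sample, method)]
        print(method, products)
```

On this toy input the products are [3, 4, 6, 4] for ordinal ranking, [3, 4, 4, 4] for min ranking, and [3, 4, 4, 3] for dense ranking. The differences appear only where frequencies tie, which in real text is mostly the low-frequency tail of the curve.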


Related Articles

  • An evaluation of Bradfordizing Effects. Mayr, Philipp // Collnet Journal of Scientometrics & Information Management;2008, Vol. 2 Issue 2, p21 

    The purpose of this paper is to apply and evaluate the bibliometric method Bradfordizing for information retrieval (IR) experiments. Bradfordizing is used for generating core document sets for subject-specific questions and to reorder result sets from distributed searches. The method will be...

  • Information Retrieval und Informetrie: Zur Anwendung informetrischer Methoden in digitalen Bibliotheken. Schaer, Philipp // Historical Social Research;2013, Vol. 38 Issue 3, p282 

    »Information Retrieval and Informetrics: The Application of Informetric Methods in Digital Libraries«. The search for scientific literature in scientific information systems is a discipline at the intersection between information retrieval and digital libraries. Recent user studies show...

  • Lotkaian informetrics and applications to social networks. Egghe, Leo // Bulletin of the Belgian Mathematical Society - Simon Stevin;Nov2009, Vol. 16 Issue 4, p689 

    Two-dimensional informetrics is defined in the general context of sources that produce items and examples are given. These systems are called "Information Production Processes" (IPPs). They can be described by a size-frequency function ƒ or, equivalently, by a rank-frequency function g. If...

  • Applied Informetrics for Digital Libraries: An Overview of Foundations, Problems and Current Approaches. Schaer, Philipp // Historical Social Research;2013, Vol. 38 Issue 4, p267 

    The foundation of every research project is a comprehensive literature review. The search for scientific literature in information systems is a discipline at the intersection of information retrieval and digital libraries; recent user studies in both fields show two typical weaknesses of the...

  • Extending Zipf’s law to n-grams for large corpora. Ha, Le; Hanna, Philip; Ming, Ji; Smith, F. // Artificial Intelligence Review;Dec2009, Vol. 32 Issue 1-4, p101 

    Experiments show that for a large corpus, Zipf’s law does not hold for all ranks of words: the frequencies fall below those predicted by Zipf’s law for ranks greater than about 5,000 word types in the English language and about 30,000 word types in the inflected languages Irish and...

  • A Corpus-Based Study on the Vocabulary Errors in CET-4 Writing and Its Pedagogical Implications. HUO Yanjuan // Studies in Literature & Language;8/31/2014, Vol. 9 Issue 1, p38 

    With the rapid development of computer science today, our teaching has been changed from traditional education with a single means or method to modern classroom teaching with scientific way of imparting knowledge. As the enormous corpora have been introduced, the investigation and analysis of...

  • Concreteness ratings for 40 thousand generally known English word lemmas. Brysbaert, Marc; Warriner, Amy; Kuperman, Victor // Behavior Research Methods;Sep2014, Vol. 46 Issue 3, p904 

    Concreteness ratings are presented for 37,058 English words and 2,896 two-word expressions (such as zebra crossing and zoom in), obtained from over 4,000 participants by means of a norming study using Internet crowdsourcing for data collection. Although the instructions stressed that the...

  • TRANSITIONS IN FRICATIVE NOISE. Uldall, Elizabeth // Language & Speech;Jan-Mar64, Vol. 7 Issue 1, p13 

    Focuses on the synthesis of English consonant clusters /sps/, /sts/ and /sks/ calling for the use of transition type frequency changes in the fricative noise. Formant transitions between consonants and vowels; Addition of bursts of hiss of changing frequency; Evidence from broad band spectrograms.

  • DIFFERENCES IN THE F0 PATTERNS OF SPEECH: TONE LANGUAGE VERSUS STRESS LANGUAGE. Eady, Stephen J. // Language & Speech;Jan-Mar82, Vol. 25 Issue 1, p29 

    Focuses on the comparison between the fundamental frequency patterns of continuous speech in Mandarin Chinese and American English. Oral interpretation of an unemotional narrative text written in native language; Amount of dynamic movement; Function of the number of syllables.

