Text_Statistics

Class: Text_Statistics

Source Location: /Text_Statistics-1.0.1/Text/Statistics.php

Class Details

[line 51]
Text_Statistics calculates some basic readability metrics on a block of text. The number of words, the number of sentences, and the total number of syllables are counted. These statistics can be used to calculate the Flesch score for the text, which is a number (usually between 0 and 100) that represents the readability of the text. A basic breakdown of scores is:

90 to 100   5th grade
80 to 90    6th grade
70 to 80    7th grade
60 to 70    8th and 9th grade
50 to 60    10th to 12th grade (high school)
30 to 50    college
0 to 30     college graduate

More information is available at http://www.mang.canterbury.ac.nz/courseinfo/AcademicWriting/Flesch.htm

    require 'Text/Statistics.php';
    $block = new Text_Statistics($sometext);
    $block->flesch; // holds the Flesch score for $sometext

See the unit tests for additional examples.
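The score breakdown above can be read as a simple lookup. The following is a minimal sketch, not part of Text_Statistics: the function name fleschBand() is hypothetical, and because the listed ranges share their end points, assigning a boundary value (such as exactly 90) to the easier band is an assumption.

```php
<?php
// Hypothetical helper (not part of Text_Statistics): maps a Flesch score
// to the grade bands from the class description.
function fleschBand($score)
{
    if ($score === false) {
        return null; // the class documents FALSE when the text has no words
    }
    // Assumption: a boundary value belongs to the easier (higher-score) band.
    if ($score >= 90) return '5th grade';
    if ($score >= 80) return '6th grade';
    if ($score >= 70) return '7th grade';
    if ($score >= 60) return '8th and 9th grade';
    if ($score >= 50) return '10th to 12th grade (high school)';
    if ($score >= 30) return 'college';
    return 'college graduate';
}

echo fleschBand(65.2), "\n"; // prints "8th and 9th grade"
```

Scores outside the usual 0 to 100 range fall into the outermost bands in this sketch.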





Class Variables

$flesch =  0

[line 100]

The Flesch score of the document.

It is FALSE if there were no words in the document.

  • Access: public

Type:   float
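For orientation, the widely published Flesch Reading Ease formula can be sketched from the three counts this class collects. Whether Text_Statistics uses exactly these constants or applies any rounding is an assumption; the function name fleschReadingEase() is hypothetical.

```php
<?php
// Standard Flesch Reading Ease formula. This is a sketch, not the
// class's own code; the function name is hypothetical.
function fleschReadingEase($numWords, $numSentences, $numSyllables)
{
    if ($numWords == 0 || $numSentences == 0) {
        return false; // mirrors the documented FALSE for a document with no words
    }
    return 206.835
        - 1.015 * ($numWords / $numSentences)   // average sentence length
        - 84.6 * ($numSyllables / $numWords);   // average syllables per word
}

// 100 words, 5 sentences, 150 syllables
printf("%.3f\n", fleschReadingEase(100, 5, 150)); // prints 59.635
```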



$gradeLevel =  0

[line 109]

The Flesch-Kincaid grade level.

It is FALSE if there were no words in the document.
  • Access: public

Type:   float
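The commonly cited Flesch-Kincaid grade level formula uses the same two ratios as the Flesch score. The sketch below assumes the standard constants; the function name fleschKincaidGrade() is hypothetical and the class may compute or round the value differently.

```php
<?php
// Standard Flesch-Kincaid grade level formula. This is a sketch, not the
// class's own code; the function name is hypothetical.
function fleschKincaidGrade($numWords, $numSentences, $numSyllables)
{
    if ($numWords == 0 || $numSentences == 0) {
        return false; // mirrors the documented FALSE for a document with no words
    }
    return 0.39 * ($numWords / $numSentences)
        + 11.8 * ($numSyllables / $numWords)
        - 15.59;
}

// 100 words, 5 sentences, 150 syllables
printf("%.2f\n", fleschKincaidGrade(100, 5, 150)); // prints 9.91
```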



$numSentences =  0

[line 91]

The number of sentences in the document.
  • Access: public

Type:   int



$numSyllables =  0

[line 67]

The number of syllables in the document.
  • Access: public

Type:   int



$numWords =  0

[line 75]

The number of words in the document.
  • Access: public

Type:   int



$text =  ''

[line 59]

The document text.
  • Access: public

Type:   string



$uniqWords =  0

[line 83]

The number of unique words in the document.
  • Access: public

Type:   int
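One way to arrive at a unique-word count is sketched below with PHP built-ins. How Text_Statistics tokenizes words, and whether it folds case before comparing, is not stated here, so both choices are assumptions; countUniqueWords() is a hypothetical name.

```php
<?php
// Hypothetical helper: counts distinct words in a text.
// Assumptions: words are whatever str_word_count() recognizes, and
// comparison is case-insensitive ("The" and "the" count once).
function countUniqueWords($text)
{
    $words = str_word_count(strtolower($text), 1); // format 1: array of words
    return count(array_unique($words));
}

echo countUniqueWords('The cat and the hat.'), "\n"; // prints 4
```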



$_abbreviations = array('/Mr\./'   => 'Mister',
                        '/Mrs\./i' => 'Misses', // Phonetic
                        '/etc\./i' => 'etcetera',
                        '/Dr\./i'  => 'Doctor',
                        '/Jr\./i'  => 'Junior',
                        '/Sr\./i'  => 'Senior',
                       )

[line 118]

Some abbreviations we should expand. This list could/should be much larger.
  • Access: protected

Type:   array
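A pattern-to-replacement map like this one can be applied in a single preg_replace() call. The sketch below copies the documented default values into a local variable; the function name expandAbbreviations() is hypothetical, and whether the class applies the map this way is an assumption.

```php
<?php
// Copy of the documented defaults. Only '/Mr\./' lacks the
// case-insensitive "i" flag.
$abbreviations = array(
    '/Mr\./'   => 'Mister',
    '/Mrs\./i' => 'Misses', // Phonetic
    '/etc\./i' => 'etcetera',
    '/Dr\./i'  => 'Doctor',
    '/Jr\./i'  => 'Junior',
    '/Sr\./i'  => 'Senior',
);

// Hypothetical helper: applies each pattern in order to the text.
function expandAbbreviations($text, array $abbreviations)
{
    return preg_replace(
        array_keys($abbreviations),
        array_values($abbreviations),
        $text
    );
}

echo expandAbbreviations('Dr. Smith met Mr. Jones, Jr. at noon.', $abbreviations), "\n";
// prints "Doctor Smith met Mister Jones, Junior at noon."
```

Expanding the abbreviations removes their trailing periods, which would otherwise be easy to mistake for sentence ends.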





Method Detail

Text_Statistics (Constructor)   [line 140]

Text_Statistics Text_Statistics( string $block)

Constructor.
  • Access: public

Parameters:

string   $block   The block of text to analyze.


getCharFreq   [line 156]

array getCharFreq( )

Returns the character frequencies.
  • Return: array of frequencies, where the index is the ASCII byte value of the character
  • Author: Jesus M. Castagnetto <jmcastagnetto@php.net>
  • Access: public
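An array of this shape (byte value as index, count as value) is what PHP's built-in count_chars() produces in mode 1. The sketch below shows that shape on a short string; whether getCharFreq() is implemented with count_chars() is an assumption, and charFrequencies() is a hypothetical name.

```php
<?php
// Hypothetical helper: byte value => number of occurrences,
// listing only the bytes that actually occur in the text.
function charFrequencies($text)
{
    return count_chars($text, 1);
}

$freq = charFrequencies('hello');
echo $freq[ord('l')], "\n"; // prints 2

// chr() turns the byte-value index back into a character for display.
foreach ($freq as $byte => $count) {
    printf("%s (%d): %d\n", chr($byte), $byte, $count);
}
```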


getNumParagraphs   [line 171]

long getNumParagraphs( )

Returns the number of paragraphs.

Paragraphs are defined as chunks of text separated by an empty line.
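The definition above can be sketched as a split on blank lines. This is an illustration only: countParagraphs() is a hypothetical name, and treating whitespace-only lines as empty and collapsing runs of blank lines are assumptions about edge cases the documentation does not cover.

```php
<?php
// Hypothetical helper: counts chunks of text separated by empty lines.
function countParagraphs($text)
{
    // A separator is a line break followed by one or more lines that
    // contain only spaces or tabs (assumption: those count as empty).
    $chunks = preg_split(
        '/\r?\n(?:[ \t]*\r?\n)+/',
        trim($text),
        -1,
        PREG_SPLIT_NO_EMPTY
    );
    return count($chunks);
}

echo countParagraphs("First.\nStill first.\n\nSecond.\n\n\nThird."), "\n"; // prints 3
```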



_analyze   [line 182]

void _analyze( )

Compute statistics for the document object.
  • Access: protected


_analyze_line   [line 237]

void _analyze_line( string $line)

Helper function, computes statistics on a given line.
  • Access: protected

Parameters:

string   $line     



Documentation generated on Mon, 08 Feb 2010 14:00:01 +0000 by phpDocumentor 1.4.3.