java.lang.Object
  edu.northwestern.at.utils.corpuslinguistics.Frequency
public class Frequency
Computes frequency-based statistics for comparing corpora.
Constructor Summary | |
---|---|
protected | Frequency() Don't allow instantiation but do allow overrides. |
Method Summary | |
---|---|
static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize) Compute log-likelihood statistic for comparing frequencies in two corpora. |
static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize, boolean computeLLSig) Compute log-likelihood statistic for comparing frequencies in two corpora. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
protected Frequency()
Method Detail |
---|
public static double[] logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize, boolean computeLLSig)
Parameters:
sampleCount - Count of word/lemma appearance in sample.
refCount - Count of word/lemma appearance in reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in reference corpus.
computeLLSig - If true, compute the significance of the log-likelihood.
Returns:
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
Any result that would require a division by zero is set to zero.
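The page does not show how the statistic is computed. As a rough illustration only, the sketch below uses the two-cell log-likelihood formulation common in corpus linguistics: expected counts are derived from the pooled frequency, and the statistic is twice the sum of observed * ln(observed / expected). The class name LogLikelihoodSketch and its helper methods are hypothetical and are not part of Frequency. Treating "percent" as 100 * count / size is an assumption, and the significance entry (5) is left out, so the sketch returns only five values.

```java
import java.util.Arrays;

// Hypothetical stand-in, NOT the Frequency class documented on this page.
final class LogLikelihoodSketch {
    private LogLikelihoodSketch() {}

    // Returns x / y, or 0 when y is 0 (mirrors "zero divides are set to zero").
    static double safeDivide(double x, double y) {
        return y == 0.0 ? 0.0 : x / y;
    }

    // Returns observed * ln(observed / expected); empty cells contribute 0.
    static double term(double observed, double expected) {
        if (observed <= 0.0 || expected <= 0.0) {
            return 0.0;
        }
        return observed * Math.log(observed / expected);
    }

    // Returns {sampleCount, samplePercent, refCount, refPercent, logLikelihood}.
    static double[] compare(int sampleCount, int refCount, int sampleSize, int refSize) {
        double totalCount = (double) sampleCount + refCount;
        double totalSize = (double) sampleSize + refSize;

        // Expected counts if the word were equally frequent in both corpora.
        double expectedSample = safeDivide(sampleSize * totalCount, totalSize);
        double expectedRef = safeDivide(refSize * totalCount, totalSize);

        double logLikelihood =
            2.0 * (term(sampleCount, expectedSample) + term(refCount, expectedRef));

        return new double[] {
            sampleCount,
            100.0 * safeDivide(sampleCount, sampleSize), // assumed meaning of "percent"
            refCount,
            100.0 * safeDivide(refCount, refSize),
            logLikelihood
        };
    }

    public static void main(String[] args) {
        // Illustrative counts only; not taken from any real corpus.
        System.out.println(Arrays.toString(compare(120, 80, 10000, 20000)));
    }
}
```

When the word is equally frequent in both corpora the observed and expected counts coincide, so the statistic in this sketch is zero; larger values indicate a larger difference in relative frequency.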
public static double[] logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize)
Parameters:
sampleCount - Count of word/lemma appearance in sample.
refCount - Count of word/lemma appearance in reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in reference corpus.
Returns:
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
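Because the result is a plain double[], callers have to remember which index holds which value. One possible way to keep that readable is to name the indices, as in the sketch below. The class FrequencyResultSketch, its constants, and the describe method are hypothetical helpers, not part of this API; the sketch only assumes the six-element layout listed above.

```java
import java.util.Locale;

// Hypothetical helper for reading the six-element result array; not part of Frequency.
final class FrequencyResultSketch {
    static final int SAMPLE_COUNT = 0;
    static final int SAMPLE_PERCENT = 1;
    static final int REF_COUNT = 2;
    static final int REF_PERCENT = 3;
    static final int LOG_LIKELIHOOD = 4;
    static final int SIGNIFICANCE = 5;

    private FrequencyResultSketch() {}

    // Formats a result array laid out as documented in entries (0) through (5).
    static String describe(double[] result) {
        if (result == null || result.length < 6) {
            throw new IllegalArgumentException("expected at least 6 elements");
        }
        return String.format(
            Locale.ROOT,
            "sample: %.0f (%.2f%%), reference: %.0f (%.2f%%), LL = %.2f, sig = %.4f",
            result[SAMPLE_COUNT], result[SAMPLE_PERCENT],
            result[REF_COUNT], result[REF_PERCENT],
            result[LOG_LIKELIHOOD], result[SIGNIFICANCE]);
    }

    public static void main(String[] args) {
        // Made-up values in the documented layout, for illustration only.
        double[] result = {10, 10.0, 5, 2.5, 3.4, 0.0652};
        System.out.println(describe(result));
    }
}
```

In real use the array would come from Frequency.logLikelihoodFrequencyComparison rather than being written by hand.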