Paper: “Latent Dirichlet allocation,” D. Blei,
A. Ng, and M. Jordan, Journal of Machine Learning Research, 3:993–1022, January
2003. Known as LDA.
This work extends pLSA, which has two serious disadvantages:
1. It is unclear how to assign a probability to a document outside the training set.
2. The number of parameters grows linearly with the size of the corpus, which causes serious overfitting (see the rough count below).
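A rough, hypothetical parameter count (my own illustration, not numbers from the paper) makes the second point concrete: with k topics, a vocabulary of size V, and M training documents, pLSA keeps a k-dimensional mixture for every document on top of the k topic-word distributions, so the count grows with M, while LDA's corpus-level parameters do not depend on M.

# Illustrative parameter counts; k, V, M and the function names are my own.
def plsa_param_count(k, V, M):
    # k*V topic-word probabilities + k*M per-document mixture weights
    return k * V + k * M

def lda_param_count(k, V):
    # k*V topic-word probabilities + k Dirichlet parameters (alpha)
    return k * V + k

for M in (1000, 10000, 100000):
    print(M, plsa_param_count(k=100, V=10000, M=M), lda_param_count(k=100, V=10000))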
To deal with these problems, the paper contributes the following solutions:
1. It provides a probabilistic model at the level of documents.
2. It uses a mixture model that captures the exchangeability of both words and documents, consistent with the "bag-of-words" assumption.
So the per-document model becomes:

p(w | α, β) = ∫ p(θ | α) [ ∏_{n=1..N} ∑_{z_n} p(z_n | θ) p(w_n | z_n, β) ] dθ

The first term on the right-hand side, p(θ | α), models the distribution over a document's topic mixture. LDA treats the mixture weights θ as a k-parameter random variable drawn from a Dirichlet, which makes the number of parameters independent of the number of documents.
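A minimal sketch of the generative process behind this equation, written with numpy (toy sizes and the function name generate_document are my own, not from the paper):

import numpy as np

rng = np.random.default_rng(0)

k, V = 3, 20                               # number of topics, vocabulary size (toy values)
alpha = np.ones(k)                         # k Dirichlet parameters over topic mixtures
beta = rng.dirichlet(np.ones(V), size=k)   # k topic-word distributions

def generate_document(n_words):
    theta = rng.dirichlet(alpha)           # theta ~ Dir(alpha): the k-parameter mixture weights
    words = []
    for _ in range(n_words):
        z = rng.choice(k, p=theta)         # topic assignment z_n ~ Multinomial(theta)
        w = rng.choice(V, p=beta[z])       # word w_n ~ Multinomial(beta_z)
        words.append(w)
    return words

print(generate_document(10))

Note that only alpha and beta are corpus-level parameters; theta is drawn per document rather than stored as a parameter.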
In the experiments reported in the paper, LDA performs well. However, according to Wikipedia and "On an Equivalence between PLSI and LDA" (SIGIR 2003), the pLSA model is equivalent to the LDA model under a uniform Dirichlet prior distribution.
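A brief way to see that connection (my paraphrase of the equivalence result, not a full derivation): with a uniform prior, p(θ | α) is constant over the simplex, so

argmax_θ p(θ | α) ∏_n p(w_n | θ, β) = argmax_θ ∏_n p(w_n | θ, β)

i.e. MAP estimation of the per-document mixture θ in LDA optimizes the same objective as fitting pLSA's mixture weights p(z | d); the two models then differ mainly in whether θ is integrated out or point-estimated.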