Thursday, October 20, 2005
Topic Modeling
In my work with the IRIS project, I have had the opportunity to meet and work with people doing topic modeling. Sounds like an exciting subject. It is. Take Latent Semantic Analysis, tweak it into something called Latent Dirichlet Allocation, add in Gibbs sampling and Chinese restaurant processes, and you've got a probabilistic means by which topics can be harvested from large bodies of information resources. Some interesting papers can be found at:
http://www.pnas.org/cgi/reprint/0307752101v1.pdf
http://eprints.pascal-network.org/archive/00000990/01/WS905BuntineW.pdf
http://cog.brown.edu/~gruffydd/papers/ncrp.pdf
http://cog.brown.edu/~gruffydd/papers/author_topics_kdd.pdf
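
To make the "Latent Dirichlet Allocation plus Gibbs sampling" recipe a bit more concrete, here is a minimal sketch of a collapsed Gibbs sampler for LDA in plain Python. It is only an illustration, not the implementation used in IRIS or in any of the papers above. The function name lda_gibbs, the symmetric priors alpha and beta, their default values, and the toy corpus are all assumptions made up for this example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids in
    range(vocab_size). alpha and beta are assumed symmetric Dirichlet priors.
    Returns (doc_topic, topic_word) count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d with topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w has topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic,
                # given every other token's assignment.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Draw a new topic and put the token back into the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


# Toy corpus (hypothetical): word ids index into this small vocabulary.
vocab = ["gene", "dna", "cell", "stock", "market", "trade"]
docs = [
    [0, 1, 2, 0, 1],
    [1, 2, 0, 2],
    [3, 4, 5, 3, 4],
    [4, 5, 3, 5],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
for k, counts in enumerate(topic_word):
    ranked = sorted(range(len(vocab)), key=lambda w: -counts[w])
    print("topic", k, [vocab[w] for w in ranked[:3]])
```

The sampler never represents the topic distributions directly. It only keeps count tables and resamples one token's topic at a time, which is what makes the approach workable on large collections. The Chinese restaurant process variants go a step further and let the number of topics grow with the data instead of fixing num_topics up front.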



