Electric Forest

thoughts about books, digital libraries, and stuff related to expressing and keeping track of our thoughts...

Thursday, October 20, 2005

Topic Modeling

In my work with the IRIS project, I have had the opportunity to meet and work with people doing topic modeling. Sounds like an exciting subject. It is. Take Latent Semantic Analysis, recast it as a probabilistic model called Latent Dirichlet Allocation, add in Gibbs sampling and the Chinese restaurant process, and you've got a probabilistic means of harvesting topics from large bodies of information resources. Some interesting papers can be found at:
http://www.pnas.org/cgi/reprint/0307752101v1.pdf
http://eprints.pascal-network.org/archive/00000990/01/WS905BuntineW.pdf
http://cog.brown.edu/~gruffydd/papers/ncrp.pdf
http://cog.brown.edu/~gruffydd/papers/author_topics_kdd.pdf
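
To give a feel for how the Gibbs sampling piece works, here is a minimal sketch of a collapsed Gibbs sampler for Latent Dirichlet Allocation. It is my own toy illustration, not code from the papers above or from IRIS. The function name lda_gibbs, the hyperparameter defaults (alpha, beta), and the tiny corpus of word ids are all assumptions made up for the example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    docs is a list of documents, each a list of integer word ids in
    range(vocab_size). Returns (doc_topic, topic_word) count tables.
    alpha and beta are symmetric Dirichlet hyperparameters (assumed values).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per doc per topic
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per topic per word
    topic_total = [0] * num_topics                              # tokens per topic

    # Start from a random topic assignment for every word token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take the current token out of the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized probability of each topic for this token,
                # given every other token's current assignment.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Draw a new topic and put the token back.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


# Hypothetical toy corpus: word ids 0-2 and 3-4 tend to appear together.
toy_docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4], [0, 1, 3, 4]]
doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2, vocab_size=5)
print("tokens per document per topic:", doc_topic)
print("tokens per topic per word:", topic_word)
```

Each row of topic_word, once normalized, is a rough estimate of one topic's word distribution, and each row of doc_topic shows how a document's tokens split across topics. Real systems add burn-in, averaging over samples, and much larger vocabularies, so treat this only as a picture of the sampling loop.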

1 Comment:

At December 19, 2005 2:32 PM, Jack Park said...

The latest URL for IRIS is http://www.openiris.org.

 
