CIS 8590: Topics in Computer Science -- Text Mining and Language Processing

Instructor: Alexander Yates

Course Description:

This course will give a broad overview of problems and techniques in natural language processing, and then move on to cover the latest research in selected topics. The overview part of the course will cover problems in:
  • Information Retrieval: Building indexes, data compression, representation of queries and documents, and similarity functions.
  • Information Extraction: Building hierarchies of knowledge (ontologies), determining the meaning of words, and determining the relationships that exist between entities referred to in text.
The in-depth part of the course will focus on the latest research in unsupervised information extraction. It will cover techniques such as stemming, pointwise mutual information, pattern-matching, bootstrapping, TF-IDF, n-gram models, Hidden Markov Models, Conditional Random Fields, statistical parsing, clustering, and language modeling.

Prerequisites:

Familiarity and a basic level of comfort with probability and statistics are essential and will be assumed. Any one of the following courses, or specific permission of the instructor, should be enough: CIS 8525, 8526, 8527, 9603, 9664.

Textbook:

None. We will read extensively from the research literature. Papers will either be handed out in class or linked online.




Announcements:

  • 8/25: Welcome to CIS 8590!