Access the full text.
Sign up today, get DeepDyve free for 14 days.
We discuss an approach to the automatic expansion of domain-specific lexicons , that is, to the problem of extending, for each c i in a predefined set C = { c 1 ,…, c m } of semantic domains , an initial lexicon L i 0 into a larger lexicon L i 1 . Our approach relies on term categorization , defined as the task of labeling previously unlabeled terms according to a predefined set of domains. We approach this as a supervised learning problem in which term classifiers are built using the initial lexicons as training data. Dually to classic text categorization tasks in which documents are represented as vectors in a space of terms, we represent terms as vectors in a space of documents. We present the results of a number of experiments in which we use a boosting-based learning device for training our term classifiers. We test the effectiveness of our method by using WordNetDomains, a well-known large set of domain-specific lexicons, as a benchmark. Our experiments are performed using the documents in the Reuters Corpus Volume 1 as implicit representations for our terms.
ACM Transactions on Speech and Language Processing (TSLP) – Association for Computing Machinery
Published: May 1, 2006
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.