Vector representation, which quantifies text, is one of the most important parts of document clustering and classification. In this paper, a novel Co-occurrence Latent Semantic Vector Space Model (CLSVSM) is presented and the co-occurrence distribution is studied further. The model builds on the Vector Space Model (VSM) and embeds the co-occurrence latent semantics of the documents' keywords into their vector representations. First, experiments were conducted to test the model's performance, using documents from the Chinese National Knowledge Infrastructure (CNKI). In the document clustering tests, the Entropy (E), Purity (P), and F1 values of CLSVSM were 20% better than those of VSM, which shows that CLSVSM can improve clustering accuracy while reducing the sparsity of the vectors. Second, which estimator of the latent semantics is best: maximum (MAX), minimum (MIN), average (AVE), or median (MED)? Further experiments were performed to compare the four estimators. The results indicate that MAX and AVE are the preferred methods, while MIN is the worst, which is consistent with the discussion. Some essential questions are discussed at the end. These questions concern the trends of co-occurrence frequency and the function of co-occurrence intensity and its distribution, and the answers reinforce the model.
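The abstract names Entropy and Purity as evaluation measures without defining them. The following is a minimal sketch using the common textbook definitions, which may differ from the paper's exact formulas (for example, some authors normalize entropy by the logarithm of the number of classes). The function name `purity_and_entropy` and the toy data are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, entropy) of a clustering against reference labels.

    Assumed definitions (not confirmed by the paper):
      purity  = sum over clusters of (majority-class count) / n
      entropy = size-weighted average of each cluster's base-2 Shannon entropy
    """
    n = len(class_labels)

    # Group the reference labels by the cluster each document was assigned to.
    clusters = {}
    for cid, label in zip(cluster_ids, class_labels):
        clusters.setdefault(cid, []).append(label)

    purity = 0.0
    entropy = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        # Documents that match the cluster's majority class count as correct.
        purity += max(counts.values()) / n
        # Shannon entropy of the class mix inside this cluster.
        h = -sum((c / size) * math.log2(c / size) for c in counts.values())
        entropy += (size / n) * h
    return purity, entropy


# Hypothetical toy data: six documents, two clusters, two reference classes.
cluster_ids = [0, 0, 0, 1, 1, 1]
class_labels = ["a", "a", "b", "b", "b", "b"]
purity, entropy = purity_and_entropy(cluster_ids, class_labels)
print(f"purity={purity:.4f} entropy={entropy:.4f}")
```

Under these definitions a higher purity and a lower entropy both indicate clusters that agree more closely with the reference classes.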
Journal of Classification – Springer Journals
Published: Nov 16, 2018