In deep text matching, the large amount of noise in long Chinese texts degrades matching quality. Most long-form text matching models use all of the text indiscriminately and therefore take in much of this noise. To filter it, this paper combines the PageRank algorithm with the Transformer. For sentence-level noise detection, sentence similarity is measured by word overlap rate, a sentence-level relationship graph is built from these similarities, and the graph is filtered with PageRank. For word-level noise detection, a word graph is built from the Transformer's attention scores; PageRank is then run on this graph and combined with the self-attention weights to select keywords that highlight topic relevance, and noisy words are filtered out layer by layer within the module. During training, PolyLoss replaces the traditional binary cross-entropy loss, which reduces the difficulty of hyperparameter tuning. Finally, an improved filtering strategy is proposed and evaluated on two Chinese long-form text matching datasets. The results show that a matching model using this noise filtering strategy filters noise more effectively and captures matching signals more accurately.
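The sentence-level filtering step and the loss replacement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the overlap rate is taken to be Jaccard similarity over word sets, PageRank uses a damping factor of 0.85 with a fixed number of power iterations, and PolyLoss is taken in its Poly-1 form (cross-entropy plus `epsilon * (1 - p_t)`). The names `overlap_rate`, `pagerank`, `filter_sentences`, `poly1_bce`, and the `keep_ratio` parameter are hypothetical and do not come from the paper.

```python
import math


def overlap_rate(a, b):
    """Word-overlap similarity of two tokenized sentences.
    Assumption: Jaccard similarity over the word sets."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pagerank(weights, damping=0.85, iters=50):
    """Power-iteration PageRank on a weighted graph.
    weights[j][i] is the edge weight from node j to node i."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    rank += weights[j][i] / out_sum[j] * scores[j]
                else:
                    # Dangling node: spread its score uniformly.
                    rank += scores[j] / n
            new_scores.append((1.0 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def filter_sentences(sentences, keep_ratio=0.5):
    """Keep the highest-ranked sentences, preserving document order.
    keep_ratio is a hypothetical hyperparameter for this sketch."""
    n = len(sentences)
    weights = [
        [overlap_rate(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    k = max(1, math.ceil(n * keep_ratio))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def poly1_bce(p, y, epsilon=1.0):
    """Poly-1 loss for one binary prediction (assumed PolyLoss variant).
    p is the predicted probability of class 1, y is the 0/1 label."""
    pt = p if y == 1 else 1.0 - p
    pt = min(max(pt, 1e-12), 1.0)  # guard against log(0)
    return -math.log(pt) + epsilon * (1.0 - pt)


if __name__ == "__main__":
    doc = [
        ["cat", "sat", "mat"],
        ["cat", "sat", "rug"],
        ["cat", "mat", "rug"],
        ["buy", "cheap", "pills"],  # shares no words with the rest
    ]
    print(filter_sentences(doc, keep_ratio=0.75))
    print(poly1_bce(0.5, 1))
```

In this sketch a sentence that shares no words with the others receives no incoming edge weight, so it ranks lowest and is dropped first. Setting `epsilon=0` in `poly1_bce` recovers ordinary binary cross-entropy, which shows how the Poly-1 term acts as a single extra tunable coefficient on top of the standard loss.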
Applied Intelligence – Springer Journals
Published: Oct 1, 2023
Keywords: Long text matching; Noise filtering; Transformer; PageRank