Analysis method and algorithm design of biological sequence problem based on generalized k-mer vector

Publisher: Springer Journals
Copyright: © Editorial Committee of Applied Mathematics 2021
ISSN: 1005-1031
eISSN: 1993-0445
DOI: 10.1007/s11766-021-4033-x

Abstract

K-mers can be used to describe biological sequences, and the k-mer distribution is a tool for solving sequence analysis problems in bioinformatics. A k-mer vector can serve as a representation of the k-mer distribution of a biological sequence. Problems such as similarity calculation or sequence assembly can then be described in the k-mer vector space. This helps to identify new features of long-standing sequence-based problems in bioinformatics and to develop new algorithms using concepts and methods from linear space theory. In this study, we define the k-mer vector space for generalized biological sequences and explain the meaning of the corresponding vector operations in the biological context. We present the vector/matrix form of several widely encountered sequence-based problems, including read quantification, sequence assembly, and pattern detection, and discuss the advantages and disadvantages of this formulation. We also implement a tool for the sequence assembly problem based on k-mer vector methods, which shows the practicability and convenience of this algorithm design strategy.
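The idea of representing a sequence by its k-mer distribution and comparing sequences in that vector space can be illustrated with a minimal sketch. The abstract does not give the paper's generalized definition, so this example assumes the plain case: a DNA alphabet, a count vector indexed by all length-k words in lexicographic order, and cosine similarity as the comparison. The names `kmer_vector` and `cosine_similarity` are hypothetical and are not taken from the paper or its tool.

```python
from collections import Counter
from itertools import product
import math

# Assumption: plain DNA alphabet; the paper's "generalized" sequences may differ.
ALPHABET = "ACGT"


def kmer_vector(seq, k):
    """Count every length-k word of seq.

    Returns a list of len(ALPHABET)**k counts, ordered lexicographically
    by k-mer (AA, AC, AG, AT, CA, ... for k = 2).
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(word)] for word in product(ALPHABET, repeat=k)]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    a = kmer_vector("ACGTACGT", 2)
    b = kmer_vector("ACGTTTGT", 2)
    print(cosine_similarity(a, b))
```

Because the vectors are ordinary lists of counts, operations such as addition or inner products carry over directly, which is the kind of linear-space view the abstract describes.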

Journal

Applied Mathematics-A Journal of Chinese Universities, Springer Journals

Published: Mar 10, 2021
