Newswise — Long non-coding RNAs (lncRNAs) are ubiquitous transcripts with crucial regulatory roles in various biological processes, including chromatin remodeling, post-transcriptional regulation, and epigenetic modifications. While accumulating evidence elucidates mechanisms by which plant lncRNAs modulate growth, root development, and seed dormancy, their accurate identification remains challenging due to a lack of plant-specific methods. Currently, the mainstream methods for plant lncRNA identification are largely developed based on human or animal datasets. Consequently, the accuracy and effectiveness of these methods in predicting plant lncRNAs have not been fully evaluated.
Recently, a research article titled "Plant-LncPipe: a computational pipeline providing significant improvement in plant lncRNA identification" by the group led by Jian-Feng Mao from Beijing Forestry University and Umeå University was published online in Horticulture Research. The study collected a large set of high-quality RNA-sequencing data from a range of plants and used these plant-specific data to retrain the models of three mainstream lncRNA prediction tools: CPAT, LncFinder, and PLEK. The retrained models were then evaluated against other popular lncRNA prediction tools, such as CPC2, CNCI, RNAplonc, and LncADeep. The results showed that retraining significantly improved prediction performance for plant lncRNAs. Two of the retrained models, LncFinder-plant and CPAT-plant, outperformed the others on multiple evaluation metrics, making them the most suitable tools for plant lncRNA identification.
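For readers who want to run a similar tool comparison on their own labeled transcripts, the following is a minimal Python sketch. The announcement does not list the evaluation metrics used in the study, so the ones computed here (accuracy, precision, recall, specificity, F1, and Matthews correlation coefficient) are assumed, commonly used choices, and the function name `binary_metrics` is hypothetical.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Compute common metrics for binary labels (1 = lncRNA, 0 = protein-coding).

    The metric set is an assumption for illustration; it is not taken from the study.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # lncRNAs called lncRNA
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # mRNAs called mRNA
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # mRNAs called lncRNA
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # lncRNAs called mRNA

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(
            tp * tn - fp * fn,
            sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        ),
    }


if __name__ == "__main__":
    # Toy labels, not data from the study.
    truth = [1, 1, 1, 0, 0, 0]
    calls = [1, 1, 0, 0, 0, 1]
    for name, value in binary_metrics(truth, calls).items():
        print(f"{name}: {value:.3f}")
```

Running the same function on the predictions of each tool against one shared test set gives directly comparable numbers per tool.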
The researchers also developed a computational pipeline, Plant-LncPipe, for the identification and analysis of plant lncRNAs. The pipeline integrates the two top-performing identification models, CPAT-plant and LncFinder-plant, and covers the full workflow: raw data preprocessing, transcript assembly, lncRNA identification, lncRNA classification, and analysis of lncRNA origins. It can be applied to a wide range of plant species. Plant-LncPipe is publicly available and can be downloaded from the following link: https://github.com/xuechantian/Plant-LncRNA-pipline.
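As an illustration of how calls from two identification models might be combined, here is a minimal Python sketch that keeps only transcripts both models label as non-coding. This consensus rule is an assumption; the announcement does not state how Plant-LncPipe merges the two models' outputs. The function name, the label string, and the transcript IDs are hypothetical.

```python
def consensus_lncrnas(calls_a, calls_b, noncoding_label="noncoding"):
    """Return sorted transcript IDs that both models call non-coding.

    calls_a, calls_b: dicts mapping transcript ID -> predicted label.
    The intersection rule is an illustrative assumption, not the
    documented behavior of Plant-LncPipe.
    """
    ids_a = {tid for tid, label in calls_a.items() if label == noncoding_label}
    ids_b = {tid for tid, label in calls_b.items() if label == noncoding_label}
    return sorted(ids_a & ids_b)


if __name__ == "__main__":
    # Hypothetical predictions from two models for the same transcripts.
    model_one = {"tx1": "noncoding", "tx2": "coding", "tx3": "noncoding"}
    model_two = {"tx1": "noncoding", "tx2": "noncoding", "tx3": "coding"}
    print(consensus_lncrnas(model_one, model_two))
```

Requiring agreement trades some sensitivity for fewer false positives; taking the union instead would do the opposite.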
The study demonstrates that retraining lncRNA prediction models on high-quality plant transcriptomic data lets them capture plant lncRNA features more accurately, significantly improving prediction precision and reliability. It underscores the importance of species-specific retraining for model accuracy. Retraining existing, mature models retains their accumulated experience and methodology while further improving their applicability and accuracy.
Ph.D. student Xue-Chan Tian and master's student Zhao-Yang Chen from Beijing Forestry University were the co-first authors. Ph.D. students Shuai Nie, Tian-Le Shi, Xue-Mei Yan, Yu-Tao Bao, Zhi-Chao Li, and Kai-Hua Jia, master's student Hai-Yao Ma, and postdoctoral researcher Wei Zhao participated in and assisted with the research. This research was supported by the National Key R&D Program of China (2022YFD2200103) and the National Natural Science Foundation of China (32171816).
###
References
DOI
10.1093/hr/uhae041
Original Source URL
https://doi.org/10.1093/hr/uhae041
Authors
Xue-Chan Tian1, Zhao-Yang Chen1, Shuai Nie2, Tian-Le Shi1, Xue-Mei Yan1, Yu-Tao Bao1, Zhi-Chao Li1, Hai-Yao Ma1, Kai-Hua Jia3, Wei Zhao4, and Jian-Feng Mao1,4,*
Affiliations
1 State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
2 Rice Research Institute, Guangdong Academy of Agricultural Sciences & Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs & Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
3 Key Laboratory of Crop Genetic Improvement & Ecology and Physiology, Institute of Crop Germplasm Resources, Shandong Academy of Agricultural Sciences, Jinan 250100, China
4 Department of Plant Physiology, Umeå Plant Science Centre (UPSC), Umeå University, Umeå 90187, Sweden