Journal of Agricultural Science and Technology (中国农业科技导报), 2016, Vol. 18, Issue (6): 31-43. DOI: 10.13304/j.nykjdb.2016.132

• Biotechnology and Life Sciences •

Analysis of SSR Information in the Transcriptome of the Mongolian Medicinal Plant Artemisia frigida

岳春江1§,陈川川1§,郭凤仙1,李华1,孙洪波1,裴丹宁1,马晓清1,陈富欣1,杨获莉1,李琴1,刘越1,2*   

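  1. College of Life and Environmental Sciences, Minzu University of China, Beijing 100081; 2. National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700
  • Received: 2016-03-11 Publication date: 2016-12-15 Release date: 2016-08-29
  • Corresponding author: LIU Yue, associate professor, Ph.D., whose research focuses on the genetic diversity and functional genomics of ethnic medicines. E-mail: liuyue_muc@163.com
  • About the authors: §YUE Chun-jiang and CHEN Chuan-chuan are co-first authors of this paper. YUE Chun-jiang, undergraduate student, research area: biological sciences, E-mail: yuechunjiang_1@126.com; CHEN Chuan-chuan, master's student, research area: ethnobotany, E-mail: 471267048@qq.com.
  • Supported by:
    National Natural Science Foundation of China (30801554, 81110108011, 81274185); Program for New Century Excellent Talents in University, Ministry of Education (NCET-12-0578, NCET-13-0624); 2013 Science and Technology Activities Program for Returned Overseas Scholars, Ministry of Human Resources and Social Security; Program of Introducing Talents of Discipline to Universities (2008-B08044); First-class University and First-class Discipline Construction Project of Minzu University of China (2015MOTD16C); Fundamental Research Funds for the Central Universities (2016SHXY04); Beijing Undergraduate Innovation Program (GCCX2014110021).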

Data Mining of Simple Sequence Repeats in Transcriptome Sequences of the Mongolian Medicinal Plant Artemisia frigida Willd.

YUE Chun-jiang1§, CHEN Chuan-chuan1§, GUO Feng-xian1, LI Hua1, SUN Hong-bo1, PEI Dan-ning1, MA Xiao-qing1, CHEN Fu-xin1, YANG Huo-li1, LI Qin1, LIU Yue1,2*   

  1. College of Life and Environmental Sciences, Minzu University of China, Beijing 100081; 2. National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
  • Received: 2016-03-11 Online: 2016-12-15 Published: 2016-08-29

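Abstract: MISA (MIcroSAtellite) software was used to mine simple sequence repeat (SSR) loci in 143 700 contigs from the sequenced transcriptome of the Mongolian medicinal plant Artemisia frigida. A total of 3 753 SSR loci were found in 3 614 sequences, an occurrence frequency of 2.51%, comprising 122 repeat motifs, with one SSR locus every 18.46 kb on average. The SSRs were dominated by trinucleotide repeats (56.12%), followed by dinucleotide repeats (31.60%). AC/TG, AT/TA, CA/GT, AAT/TTA and AAC/TTG were the dominant motifs among the dinucleotide and trinucleotide repeats. Most SSRs had 5 to 12 repeats, and SSR lengths fell mainly between 12 and 36 bp. In total, 43 415 contigs were annotated; 578 SSRs were located in coding regions, and most of these were trinucleotide repeats (397, 68.69%). This study describes the development and use of SSR information from the A. frigida transcriptome at the molecular and bioinformatic levels. These SSRs occur at high frequency with diverse repeat types and provide candidate sequences for marker-assisted breeding, genetic diversity analysis, genetic map construction and functional gene mining in A. frigida.

Key words: Mongolian medicinal plant Artemisia frigida, transcriptome, SSR information analysis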

Abstract: MISA (MIcroSAtellite) software was used to screen for SSRs in 143 700 contigs of the Artemisia frigida Willd. transcriptome. A total of 3 753 SSR sites were identified in 3 614 contigs, which accounted for 2.51% of the 143 700 contigs. There were 122 kinds of SSR motif in the A. frigida transcriptome, and on average one SSR occurred every 18.46 kb. Tri-nucleotide repeats were the most abundant (56.12%), followed by di-nucleotide repeats (31.60%). AC/TG, AT/TA, CA/GT, AAT/TTA and AAC/TTG were the main motif types among the di-nucleotide and tri-nucleotide repeats. Repeat numbers were mainly from 5 to 12, and SSR lengths mostly ranged from 12~32 bp. A total of 43 415 contigs were annotated, and only 578 SSRs occurred in protein-coding regions, where tri-nucleotide repeats were again the most abundant (397, 68.69%). This paper describes the development and utilization of SSR information from the A. frigida transcriptome at the molecular and bioinformatic levels. With their high frequency and diverse repeat types, these SSRs provide candidate sequences for marker-assisted breeding, genetic diversity analysis, genetic map construction and functional gene mining in A. frigida.

Key words: Artemisia frigida Willd., transcriptome, SSR information analysis
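
The SSR screening summarized in the abstract can be illustrated with a minimal Python sketch. This is not MISA and not the authors' pipeline: it is a regular-expression search for perfect di- to hexa-nucleotide repeats. The function names (`is_primitive`, `find_ssrs`) and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for illustration, since the abstract does not state the search parameters.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only;
# the abstract does not report the thresholds that were used).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs, sorted by start."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # One motif of length k followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Toy contig with one di-nucleotide and one tri-nucleotide SSR.
    contig = "GGC" + "AT" * 7 + "CCGTT" + "AAC" * 5 + "G"
    for start, motif, count in find_ssrs(contig):
        print(start, motif, count)

    # Ratios reported in the abstract, recomputed from its own counts.
    print(round(100 * 3614 / 143700, 2))  # share of contigs containing SSRs
    print(round(100 * 397 / 578, 2))      # tri-nucleotide share in coding regions
```

The primitive-motif check keeps a run such as ATATATAT from also being reported as a tetra-nucleotide repeat of ATAT, and it discards mononucleotide runs that would otherwise match as longer motifs. A real screen would also need to handle compound SSRs and reverse-complement motif grouping, which this sketch leaves out.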