CpG_MI: Identifying Functional CpG Islands Using Mutual Information

  CpG_MI is an information-theoretic tool for identifying functional CpG islands among the CpG clusters in bulk genomes. CpG islands identified by CpG_MI in six mammals and four fishes are available for download. Because CpG dinucleotide densities differ from organism to organism, first select the organism that matches your sequence. The CpG dinucleotide density of each genome is shown in the organism dialog box. You can then identify CpG islands in one of three ways: (I) entering the start and end coordinates on a chromosome, (II) pasting one sequence in FASTA format, or (III) uploading a FASTA sequence file. The output lists each CpG island identified, together with its genome coordinates, length, number of CpGs, G+C content and CpG O/E. For long sequences, you can also download the standalone version.
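
For readers who want to sanity-check the reported columns, the per-island statistics can be sketched in Python as below. This is only an illustration: it assumes the conventional Gardiner-Garden and Frommer definition of CpG O/E, (number of CpGs x length) / (number of C x number of G), which may differ in detail from what CpG_MI computes internally. The function name cpg_stats is hypothetical and is not part of CpG_MI.

```python
def cpg_stats(seq):
    """Return length, CpG count, G+C content and CpG O/E for a DNA sequence.

    Assumption: CpG O/E = (CpG count * length) / (C count * G count),
    the conventional definition; CpG_MI's own formula may differ.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # O/E is undefined when there are no C or no G; report 0.0 in that case
    cpg_oe = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "cpg": cpg, "gc_content": gc_content, "cpg_oe": cpg_oe}


if __name__ == "__main__":
    print(cpg_stats("ACGCGT"))
```

The sequence is upper-cased first so that soft-masked (lower-case) FASTA input is counted the same way as upper-case input.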

Submit a sequence by genomic coordinates:

Paste sequences in FASTA format:

Upload a sequence file (FASTA format):


You can fetch sequences by genomic coordinates (1-based) from a specific version of a genome assembly. You can also upload a FASTA sequence file or paste a sequence; uploaded files are limited to 1 Mb and must have the extension .fasta, .fa or .txt. Please select the proper genome assembly, because the cutoffs are refined for each genome. An executable Perl script is available for large sequences. Large sequences may take a little while, so please be patient (a 200 kb sequence may take up to 4 minutes). Enjoy yourself!
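
If you prepare files in a script, the upload restrictions above can be checked locally before submitting. The following Python sketch is an illustration only and is not the server's actual validation: the function name check_upload is hypothetical, the size limit is assumed to mean roughly one million bytes, and matching extensions case-insensitively is also an assumption.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".fasta", ".fa", ".txt"}
MAX_BYTES = 1_000_000  # assumption: the stated "1Mb" limit read as ~1,000,000 bytes


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Assumption: extensions are compared case-insensitively
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("extension must be .fasta, .fa or .txt")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than the assumed 1,000,000-byte limit")
    return problems


if __name__ == "__main__":
    print(check_upload("region.fa", 250_000))
    print(check_upload("region.gb", 2_500_000))
```

Files that fail the size check are candidates for the standalone Perl version instead.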

Links to UCSC Genome Browser:

We have predicted CpG islands for ten organisms. The CpG islands can be visualized in the UCSC Genome Browser.

Organism                  Link   Download (bed)
Human (hg18)              UCSC   40,926
Human (hg19)              UCSC   41,027
Chimpanzee (panTro2)      UCSC   39,796
Mouse (mm9)               UCSC   33,406
Rat (rn4)                 UCSC   42,366
Cow (bosTau4)             UCSC   50,641
Dog (canFam2)             UCSC   64,938
Zebrafish (danRer5)       UCSC   92,188
Medaka (oryLat2)          UCSC   53,721
Stickleback (gasAcu1)     UCSC   82,644
Tetraodon (tetNig1)       UCSC   37,831
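
When comparing coordinates entered in the submission form (1-based) with intervals in a BED file, keep in mind that the BED format defines the start as 0-based and the end as exclusive. The Python sketch below shows the conversion under the assumption that the 1-based coordinates are inclusive at both ends and that the downloadable files follow the standard BED convention; the function name to_bed_line is hypothetical.

```python
def to_bed_line(chrom, start_1based, end_1based, name="."):
    """Convert a 1-based, inclusive interval to a tab-separated BED line.

    BED uses a 0-based start and an exclusive end, so only the start shifts.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return "\t".join([chrom, str(start_1based - 1), str(end_1based), name])


if __name__ == "__main__":
    print(to_bed_line("chr1", 100, 200))
```

The interval length is unchanged by the conversion: end minus start in BED equals the number of bases in the 1-based inclusive interval.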

Developers and Contact

Jianzhong Su, project leader, algorithm development, Email: jianzhongsu@yahoo.cn

Jie Lv, middleware programming and frontend development, Email: lvjielvjie@yahoo.com.cn

Yan Zhang, deployment, web server administration, Email: yanyou1225@yahoo.com.cn

Please feel free to contact any of us if you have questions.

Please cite: "Jianzhong Su, Yan Zhang, Jie Lv, Hongbo Liu, Xiaoyan Tang, Fang Wang, Yunfeng Qi, Yujia Feng and Xia Li, CpG_MI: a novel approach for identifying functional CpG islands in mammalian genomes (Nucleic Acids Research, 2010, 38(1):e6)".

Copyright © 2009 College of Bioinformatics Science and Technology, Harbin Medical University, China