Mice are the premier model organism in human biomedical and mammalian genetic research. The genomes of dozens of inbred mouse strains have been sequenced and compared with the genome of C57BL/6J, which by convention serves as the reference genome. By comparing this reference genome with those of 36 other sequenced mouse strains, we generated an overview of all protein-coding genes in which the reference genome deviates from the consensus mouse protein-coding sequence. We provide PROVEAN scores, which reflect the likelihood that these C57BL/6J proteins have lost function. We thus identified numerous abnormal proteins and biological pathways specific to C57BL/6J, pointing to important caveats of this reference mouse strain and linking candidate genes to some of its best-known phenotypes.
Steven Timmermans, Claude Libert