
Scientific result | Mass spectrometry | Large-scale biology | Bioinformatics

ProMetIS: open bioinformatics data and tools for multi-omics phenotyping of mouse models

A consortium of researchers, led by the SPI/DMTS in association with the IRIG and LIST institutes of the CEA and four national infrastructures, presents ProMetIS, a pilot study for the deep phenotyping of murine models that combines proteomic and metabolomic approaches. The study is a significant advance for the functional characterization of genes and for the development of new integration approaches in bioinformatics and biostatistics.

Published on 4 April 2022

The availability of combined multi-omics data would provide the scientific community with a unique opportunity to better understand, on an integrated scale, the development and progression of pathophysiological mechanisms. The analysis of such data would allow the definition of signatures and the identification of specific biomarkers for a given pathology or dysfunction. However, to date, very few global data on large cohorts are available and accessible to researchers.


Large-scale analysis of gene function conducted within the International Mouse Phenotyping Consortium (IMPC) has confirmed the pleiotropic nature of genes in mammals, i.e., a single gene can be responsible for several apparently distant phenotypic traits. Phenogenomics alone therefore cannot explain the function of genes and their mutants, and complementary omics approaches are needed. The global study of gene products, namely proteins (proteomics) and metabolites (metabolomics), combined with phenogenomic approaches, should make it possible to understand the role of one or more genes and, from there, the full set of biological and metabolic functions under normal or pathological conditions.


In this study, published in Scientific Data, the reference journal for open data, the four French National Infrastructures in Biology and Health in mouse phenogenomics, proteomics, metabolomics and bioinformatics have joined forces to develop and make available data and procedures for the characterization of mutant murine lines using combined proteomics and metabolomics approaches. The researchers chose to generate these multilevel data from plasma and liver samples of two mutant mouse lines, generated at the Mouse Clinical Institute (Illkirch, France) in the framework of the IMPC*. The two lines lack the Lat (linker for activation of T cells) and Mx2 (MX dynamin-like GTPase 2) genes, respectively. All nine raw data sets (1 preclinical, 2 proteomic and 6 metabolomic) from the study of the two mouse lines are now available in the reference databases (IMPC, PRIDE and MetaboLights). In addition, the pre-processed data and the bioinformatics and biostatistics analysis pipeline are available as an open-access package in the R language.


The data undergo quality control for each modality, as detailed in the article, relying on the expertise of the CEA platforms and national infrastructures. A special effort has been made to standardize workflows and formats and to make them available to the community, in order to facilitate subsequent work on data integration (a study is underway at the CEA) and the comparison of methodologies.

The ProMetIS pilot study represents a significant advance towards molecular phenotyping of large cohorts. Here, the data provide unprecedented information on the functional characterization of the Lat and Mx2 genes. They are also intended to become a reference for accessibility, reproducibility and interoperability (FAIR criteria) in the field of multi-omics studies. These data will be particularly valuable for developing new approaches to bioinformatics and biostatistics integration.


-Phenomics is the systematic study of phenotypes, i.e. the set of physical and biochemical characteristics of an organism, which depend on genetics, environment and their interaction.
-A pleiotropic gene is a single gene responsible for several apparently distant phenotypic traits.
-R is a programming language and free software for statistics and data science.
-An analysis pipeline (or workflow) is a sequence of experimental or computational steps to process samples or data.

*In its large-scale phenogenomic characterization of mouse models, the IMPC individually deactivates ("turns off") each of the genes in the mouse genome. Mutant mice undergo standardized physiological tests (clinical biochemical markers, anatomy, behavior) across a series of biological systems to infer gene function, and the data are then made freely available to researchers on the IMPC website.
