
Matrix factorisation techniques for data integration

Authors: Lê Cao, Kim-Anh (Speaker)
CIRM (Publisher)


    Abstract: Gene module detection methods aim to group genes with similar expression profiles to shed light on functional relationships and co-regulation, and to infer gene regulatory networks. Methods proposed so far use clustering to group genes based on global similarity in their expression profiles (co-expression), bi-clustering to group genes and samples simultaneously, or network inference to model regulatory relationships between genes. In this talk I will focus on multivariate matrix decomposition techniques that enable dimension reduction and the identification of molecular signatures.
    We will consider two different types of assays: bulk and single cell assays. Bulk transcriptomics assays use RNA-sequencing techniques to monitor the average expression profile of all the constituent cells, but fail to identify the distinct transcriptional profiles of different cell types. Single cell assays (scRNA-seq) use RNA-seq techniques similar to those used for bulk cell populations, but provide unprecedented resolution at the cell level to understand cellular heterogeneity and uncover new biology. However, scRNA-seq data present new computational and analytical challenges because of their sheer size (100K to 500K cells are sequenced) and their zero-inflated distribution due to technical drop-outs.
    I will illustrate how we can use matrix factorisation techniques to mine these data and identify gene modules that underpin the molecular mechanisms of cell identity in scRNA-seq. I will also give further perspectives on how we could extend similar concepts to integrate different omics data types (e.g. bulk transcriptomics, proteomics, metabolomics) to identify tightly connected multi-omics signatures that holistically describe a biological system.
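
    As a rough illustration of the general idea in the abstract (not the speaker's own method), the sketch below applies plain PCA, computed through a singular value decomposition, to a simulated cells-by-genes matrix and reads off the genes with the largest absolute loadings on the first component as a candidate gene module. The simulated data, the helper name `top_loading_genes`, and all parameter values are assumptions made for this example.

```python
import numpy as np


def top_loading_genes(X, n_top, component=0):
    """Return the indices of the n_top genes with the largest absolute
    loading on one principal component, plus explained-variance ratios.

    X is a (cells x genes) expression matrix. Hypothetical helper,
    written for illustration only.
    """
    Xc = X - X.mean(axis=0)                      # centre each gene
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # variance ratio per component
    loadings = Vt[component]                     # gene weights on the component
    top = np.argsort(-np.abs(loadings))[:n_top]  # largest |loading| first
    return np.sort(top), explained


# Simulated data (an assumption, not data from the talk):
# the first 10 genes share one latent factor, the rest are noise.
rng = np.random.default_rng(0)
n_cells, n_genes, module_size = 200, 50, 10
latent = rng.normal(size=n_cells)
X = rng.normal(scale=0.5, size=(n_cells, n_genes))
X[:, :module_size] += 3.0 * latent[:, None]

module, explained = top_loading_genes(X, n_top=module_size)
print("candidate module genes:", module.tolist())
print("variance explained by component 1:", round(float(explained[0]), 2))
```

    Real scRNA-seq matrices are far larger and zero-inflated, so in practice one would likely normalise the counts first and use a truncated or sparse decomposition rather than a full SVD.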

    Keywords: biomathematics; dimension reduction; data integration

    MSC codes:
    15A23 - Factorization of matrices
    92B15 - General biostatistics (see also 62P10)

      Video information

      Language: English
      Publication date: 23/03/2020
      Recording date: 05/03/2020
      Collection: Research talks; Probability and Statistics
      Format: MP4
      Duration: 01:26:46
      Subject area: Probability & Statistics
      Audience: Researchers; PhD students, postdocs
      Download : https://videos.cirm-math.fr/2020-03-05_Le Cao.mp4

    Event information

    Event name: Thematic Month Week 5: Networks and Molecular Biology / Mois thématique Semaine 5 : Réseaux et biologie moléculaire
    Event organisers: Baudot, Anais; Hubert, Florence; Moss, Brigitte; Rémy, Elisabeth; Tichit, Laurent; Vignes, Matthieu
    Dates: 02/03/2020 - 06/03/2020
    Event year: 2020
    Conference URL: https://conferences.cirm-math.fr/2305.html

    Citation Data

    DOI: 10.24350/CIRM.V.19620803
    Cite this video as: Lê Cao, Kim-Anh (2020). Matrix factorisation techniques for data integration. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19620803
    URI: http://dx.doi.org/10.24350/CIRM.V.19620803

    See also


    1. DRIER, Yotam, SHEFFER, Michal, and DOMANY, Eytan. Pathway-based personalized analysis of cancer. Proceedings of the National Academy of Sciences, 2013, vol. 110, no. 16, p. 6388-6393. - https://doi.org/10.1073/pnas.1219651110

    2. LIU, Chao, SRIHARI, Sriganesh, LÊ CAO, Kim-Anh, et al. A fine-scale dissection of the DNA double-strand break repair machinery and its implications for breast cancer therapy. Nucleic acids research, 2014, vol. 42, no. 10, p. 6106-6127. - https://doi.org/10.1093/nar/gku284

    3. LIU, Chao, SRIHARI, Sriganesh, LAL, Samir, et al. Personalised pathway analysis reveals association between DNA repair pathway dysregulation and chromosomal instability in sporadic breast cancer. Molecular oncology, 2016, vol. 10, no. 1, p. 179-193. - https://doi.org/10.1016/j.molonc.2015.09.007

    4. HASTIE, Trevor and STUETZLE, Werner. Principal curves. Journal of the American Statistical Association, 1989, vol. 84, no. 406, p. 502-516. - https://www.tandfonline.com/doi/abs/10.1080/01621459.1989.10478797

    5. SAELENS, Wouter, CANNOODT, Robrecht, and SAEYS, Yvan. A comprehensive evaluation of module detection methods for gene expression data. Nature communications, 2018, vol. 9, no. 1, p. 1-12. - https://doi.org/10.1038/s41467-018-03424-4

    6. COMON, Pierre. Independent component analysis, a new concept? Signal processing, 1994, vol. 36, no. 3, p. 287-314. - https://doi.org/10.1016/0165-1684(94)90029-9

    7. YAO, Fangzhou, COQUERY, Jeff, and LÊ CAO, Kim-Anh. Independent principal component analysis for biologically meaningful dimension reduction of large biological data sets. BMC bioinformatics, 2012, vol. 13, no. 1, p. 24. - http://dx.doi.org/10.1186/1471-2105-13-24

    8. SCHAUM, Nicholas, KARKANIAS, Jim, NEFF, Norma F., et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris: The Tabula Muris Consortium. Nature, 2018, vol. 562, no. 7727, p. 367. - https://dx.doi.org/10.1038%2Fs41586-018-0590-4

    9. LÊ CAO, Kim-Anh, ROSSOUW, Debra, ROBERT-GRANIÉ, Christèle, et al. A sparse PLS for variable selection when integrating omics data. Statistical Applications in Genetics & Molecular Biology, 2008, vol. 7, no. 1, p. 1-29. - https://doi.org/10.2202/1544-6115.1390

    10. LÊ CAO, Kim-Anh, BOITARD, Simon, and BESSE, Philippe. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics, 2011, vol. 12, article 253. - https://doi.org/10.1186/1471-2105-12-253

    11. TENENHAUS, Arthur, PHILIPPE, Cathy, GUILLEMOT, Vincent, et al. Variable selection for generalized canonical correlation analysis. Biostatistics, 2014, vol. 15, no. 3, p. 569-583. - https://doi.org/10.1093/biostatistics/kxu001

    12. SINGH, Amrit, SHANNON, Casey P., GAUTIER, Benoît, et al. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. Bioinformatics, 2019, vol. 35, no. 17, p. 3055-3062. - https://doi.org/10.1093/bioinformatics/bty1054

    13. ROHART, Florian, GAUTIER, Benoit, SINGH, Amrit, et al. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS computational biology, 2017, vol. 13, no. 11, p. e1005752. - https://doi.org/10.1371/journal.pcbi.1005752

    14. LEE, Amy H., SHANNON, Casey P., AMENYOGBE, Nelly, et al. Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature communications, 2019, vol. 10, no. 1, p. 1-14. - https://doi.org/10.1038/s41467-019-08794-x

    15. LÊ CAO, Kim-Anh, COSTELLO, Mary-Ellen, LAKIS, Vanessa Anne, et al. MixMC: a multivariate statistical framework to gain insight into microbial communities. PloS one, 2016, vol. 11, no. 8. - https://dx.doi.org/10.1371%2Fjournal.pone.0160169

    16. WANG, Yiwen and LÊ CAO, Kim-Anh. Managing batch effects in microbiome data. Briefings in bioinformatics, 2019. - https://doi.org/10.1093/bib/bbz105

    17. BODEIN, Antoine, CHAPLEUR, Olivier, DROIT, Arnaud, et al. A generic multivariate framework for the integration of microbiome longitudinal studies with other data types. Frontiers in Genetics, 2019, vol. 10. - https://dx.doi.org/10.3389%2Ffgene.2019.00963
