Our methodology for addressing data integration challenges builds mainly on two recent advances in machine learning research: (i) multi-view learning and (ii) learning with privileged information.
Multi-view learning algorithms learn one model from each view while jointly optimizing these view-specific models to improve generalization performance. Recently, we developed two novel multi-view feature selection algorithms and evaluated them on a challenging translational bioinformatics task: predicting survival in ovarian cancer and in kidney renal clear cell carcinoma (KIRC), using multi-omics data as multiple views of the same set of patients.
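One common way to couple view-specific models is co-regularization, which adds a penalty on disagreement between the predictions made from each view. The sketch below is illustrative only and is not the feature selection algorithms described above: the synthetic two-view data, the helper names (`predict`, `co_regularized_fit`, `mse`), and the hyperparameters (`lam`, `lr`, `epochs`) are all assumptions chosen to keep the example small and dependency-free.

```python
import random


def predict(X, w):
    """Linear predictions for a list of feature rows."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]


def mse(pred, y):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def co_regularized_fit(X1, X2, y, lam=1.0, lr=0.01, epochs=2000):
    """Fit one linear model per view, coupled by a disagreement penalty.

    Minimizes, by gradient descent, the mean over samples of
        (f1 - y)^2 + (f2 - y)^2 + lam * (f1 - f2)^2
    where f1 = X1 w1 and f2 = X2 w2 are the view-specific predictions.
    """
    n = len(y)
    w1 = [0.0] * len(X1[0])
    w2 = [0.0] * len(X2[0])
    for _ in range(epochs):
        f1, f2 = predict(X1, w1), predict(X2, w2)
        g1 = [0.0] * len(w1)
        g2 = [0.0] * len(w2)
        for i in range(n):
            # Derivative of the per-sample loss w.r.t. each view's prediction.
            d1 = 2 * (f1[i] - y[i]) + 2 * lam * (f1[i] - f2[i])
            d2 = 2 * (f2[i] - y[i]) - 2 * lam * (f1[i] - f2[i])
            for j, x in enumerate(X1[i]):
                g1[j] += d1 * x / n
            for j, x in enumerate(X2[i]):
                g2[j] += d2 * x / n
        w1 = [w - lr * g for w, g in zip(w1, g1)]
        w2 = [w - lr * g for w, g in zip(w2, g2)]
    return w1, w2


# Synthetic data (an assumption for illustration): both views observe the
# same latent signal with independent noise, plus one irrelevant feature.
random.seed(0)
n = 60
latent = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * z for z in latent]
X1 = [[z + random.gauss(0, 0.3), random.gauss(0, 1)] for z in latent]
X2 = [[z + random.gauss(0, 0.3), random.gauss(0, 1)] for z in latent]

w1, w2 = co_regularized_fit(X1, X2, y, lam=1.0)

# Combine the two view-specific models by averaging their predictions.
fused = [(a + b) / 2 for a, b in zip(predict(X1, w1), predict(X2, w2))]

print("view 1 MSE:", mse(predict(X1, w1), y))
print("view 2 MSE:", mse(predict(X2, w2), y))
print("fused  MSE:", mse(fused, y))
```

Setting `lam=0` decouples the two models into independent single-view regressions, which gives a simple baseline for judging whether joint optimization helps on a given dataset.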
Our research focuses on the following questions:
- Under what conditions is integrating multiple views better than using single-view models?
- How can we develop multi-view learning methods that learn from incomplete views?
- How can we learn from views that are available only during the training phase (the learning with privileged information problem)?
Our multi-view data integration algorithms (e.g., algorithms for multi-view feature selection) and our multi-view supervised and unsupervised learning algorithms will be applied to build and analyze predictive models for the following tasks:
- Prediction of cancer survival using multi-omics data
- Prediction of drug sensitivity using multi-omics data
- Prediction of metagenomics biomarkers
- Prediction of protein-RNA and protein-DNA interfaces
- Physical activity recognition using multi-sensor data
El-Manzalawy Y (2018) CCA based multi-view feature selection for multi-omics data integration. IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) (Accepted).
El-Manzalawy Y, Hsieh T-Y, Shivakumar M, Kim D, Honavar V (2017) Min-Redundancy and Max-Relevance Multi-view Feature Selection for Predicting Ovarian Cancer Survival using Multi-omics Data. Presented at the 7th Annual Translational Bioinformatics Conference.