The COMET model uses deep learning to improve disease prediction.

EMR Industry

A new machine learning framework called COMET uses transfer learning to combine EHR data with omics analysis, greatly improving predictive modeling and revealing biological insights from small cohorts.

In a recent study published in the journal Nature Machine Intelligence, researchers introduced clinical and omics multimodal analysis enhanced with transfer learning (COMET), a deep learning and transfer learning protocol.

Technological developments in omics have transformed our understanding of biology. Proteomic, metabolomic, transcriptomic, and other assays now make it affordable to quantify many analytes in the same sample. Although these assays produce high-dimensional data, the size of omics cohorts is constrained by clinical and financial factors. As a result, new methods are required to enhance high-dimensional data analysis.

Statistical techniques exist to control false positives, but machine learning (ML) approaches to this problem are less common. Some strategies use transfer learning, a method in which an ML model is first trained on a large pre-training dataset and then applied to a smaller dataset. Although more recent deep learning techniques have been used within statistical frameworks, they mostly rely on learning from omics data or relevant metadata.

By combining early and late fusion techniques and pre-training on sizable electronic health record (EHR) datasets, the COMET architecture overcomes these limitations and enables better biological discovery and prediction performance.
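
The difference between early and late fusion can be illustrated with a minimal sketch. This shows the two general techniques only, not how COMET combines them, which the article does not describe. The linear models, weight values, and the mixing parameter `alpha` are all hypothetical.

```python
def linear(weights, features):
    """Dot product of a weight vector and a feature vector."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def early_fusion_predict(w_joint, ehr, omics):
    # Early fusion: concatenate the modalities, then apply a single model.
    return linear(w_joint, ehr + omics)


def late_fusion_predict(w_ehr, w_omics, ehr, omics, alpha=0.5):
    # Late fusion: one model per modality, then combine their outputs.
    # alpha is a hypothetical mixing weight between the two predictions.
    return alpha * linear(w_ehr, ehr) + (1 - alpha) * linear(w_omics, omics)


# Made-up feature vectors and weights, for illustration only.
ehr = [1.0, 2.0]
omics = [0.5, 0.5, 1.0]
early = early_fusion_predict([0.1, 0.2, 0.3, 0.4, 0.5], ehr, omics)
late = late_fusion_predict([0.1, 0.2], [0.3, 0.4, 0.5], ehr, omics)
```

In early fusion the model can learn interactions between EHR and omics features directly, while late fusion keeps the modalities separate until their predictions are combined.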

The study and findings
In this work, researchers presented COMET, a deep learning and transfer learning technique that enhances omics analyses. COMET can be used when EHR data are available for a large cohort and both omics and EHR data are available for a smaller one. COMET includes pre-training, multimodal modeling, and a technique for embedding longitudinal EHR data.

In COMET, an ML model is first trained exclusively on EHR data. Its weights are then transferred to a multimodal architecture that is trained and evaluated on a smaller sample using both omics and EHR data. First, COMET was used to predict days to labor onset in a Stanford Healthcare pregnancy cohort of 30,904 people. A proteomics dataset of 1,317 proteins was created from multiple plasma samples taken from 61 pregnant people (the omics cohort) during the final days of pregnancy.
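
The weight-transfer step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the layer names, layer sizes, and helper functions are hypothetical, and models are represented as plain dictionaries of weight matrices rather than real neural networks. Only the 1,317-protein input size comes from the article.

```python
import copy
import random


def init_layer(n_in, n_out, rng):
    """Return a small random weight matrix (list of rows) for a dense layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def shape(matrix):
    """Return (rows, columns) of a weight matrix."""
    return (len(matrix), len(matrix[0]))


def build_ehr_model(n_ehr, n_hidden, rng):
    # Hypothetical EHR-only network: an encoder plus a prediction head.
    return {
        "ehr_encoder": init_layer(n_ehr, n_hidden, rng),
        "ehr_head": init_layer(n_hidden, 1, rng),
    }


def build_multimodal_model(n_ehr, n_omics, n_hidden, rng):
    # Hypothetical multimodal network: the same EHR encoder shape, plus an
    # omics encoder and a joint head over both hidden representations.
    return {
        "ehr_encoder": init_layer(n_ehr, n_hidden, rng),
        "omics_encoder": init_layer(n_omics, n_hidden, rng),
        "joint_head": init_layer(2 * n_hidden, 1, rng),
    }


def transfer_weights(source, target):
    """Copy every layer present in both models with an identical shape."""
    transferred = []
    for name, weights in source.items():
        if name in target and shape(weights) == shape(target[name]):
            target[name] = copy.deepcopy(weights)
            transferred.append(name)
    return transferred


rng = random.Random(0)
ehr_model = build_ehr_model(n_ehr=8, n_hidden=4, rng=rng)
# Pre-training of ehr_model on the large EHR-only cohort would happen here.
multimodal = build_multimodal_model(n_ehr=8, n_omics=1317, n_hidden=4, rng=rng)
transferred = transfer_weights(ehr_model, multimodal)
```

After the transfer, only the shared EHR encoder starts from pre-trained values. The omics encoder and joint head keep their random initialization and are learned on the small omics cohort.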

Days to labor onset were predicted using EHR data from blood sampling at the beginning of pregnancy. After pre-training on EHR-only data (from 30,843 people), the weights were transferred to a multimodal network trained to generate predictions on the omics cohort. The model's strong predictive power was demonstrated by a Pearson correlation coefficient of 0.868 (95% CI [0.825, 0.900]). The actual and predicted numbers of days to labor onset were strongly correlated, suggesting that COMET was quite accurate in small cohorts with high-dimensional data.
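
For readers unfamiliar with the metric, the Pearson correlation coefficient between actual and predicted values can be computed as in the sketch below. The function name and the example values are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only, not data from the study.
actual_days = [10, 25, 40, 60, 80]
predicted_days = [14, 22, 45, 55, 83]
r = pearson_r(actual_days, predicted_days)
```

A value near 1 means the predictions rise and fall with the true values almost linearly. A value near 0 means there is no linear relationship.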

Next, COMET was compared with baseline models that used proteomics data, EHR data, or both. These baseline models had no pre-training and used only omics cohort data. The EHR-only baseline performed worst, with a correlation of 0.768, and the proteomics-only model did somewhat better at 0.796. The combined baseline outperformed both single-modality baselines, with a correlation of 0.815, but it was still less effective than COMET.

To gain deeper insights, the researchers used t-distributed stochastic neighbor embedding (t-SNE) to project the correlation matrix into two dimensions, visualizing the multimodal data and uncovering feature clusters based on correlation patterns. Features that lie close together in this space have similar correlations with every other variable. These clusters were annotated with the medical concepts represented by the EHR or protein features within them. Significant relationships between different proteins and EHR factors were found.

The team calculated the feature importance of each protein. Consistent with established biological knowledge, proteins with high importance in COMET models were associated with gestational age, pregnancy complications, and fetal development. COMET was then used to predict three-year cancer mortality in a cancer cohort from the UK Biobank. All of the participants had received a cancer diagnosis within five years after their enrollment.

Blood samples from a subset of participants were available and subjected to proteomic analysis. Samples taken within a year of the cancer diagnosis were added to the omics cohort. With an area under the receiver operating characteristic curve (AUROC) of 0.842, COMET consistently outperformed all baselines in predicting three-year cancer mortality, exceeding both the single-modality and joint baseline models (AUROC 0.786). In the omics cohort, the three-year mortality rate was 5.5%.
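
AUROC can be read as the probability that a randomly chosen positive case (here, a participant who died within three years) receives a higher predicted risk than a randomly chosen negative case. The sketch below computes it directly from that definition. The function name and the example labels and scores are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0 or 1; scores are predicted risks. Ties count as half.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and risk scores for illustration only.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
auc = auroc(labels, scores)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a small omics cohort. A rank-based formulation would be preferable for large datasets. With a low event rate such as the one reported here, AUROC is more informative than raw accuracy because it does not depend on a classification threshold.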

Furthermore, the t-SNE visualization of the correlation matrix showed less overlap between the EHR and proteomics modalities than in the labor onset data. However, when the correlation network was displayed with each modality projected into two dimensions separately, there were notable correlations between the proteomics and EHR modalities. Mortality factor 4-like protein 2 showed the highest associations with EHR parameters, especially medication prescriptions, highlighting its potential as a predictive biomarker.

Sixty-six percent of proteins from cancer patients did not correlate with any EHR feature. Additionally, for each EHR feature, the researchers calculated its correlation with every protein as well as the highest correlation across all proteins. This revealed numerous EHR variables with weak associations to proteins in cancer patients, underscoring the importance of including several data modalities.

Proteins with greater feature importance in COMET models corresponded to established biomarkers for cancer prognosis. Crucially, nine proteins that were more important in COMET models were statistically associated with mortality status, further confirming the biological relevance of the model.

Conclusions
Overall, the study demonstrated how COMET can enhance predictive modeling for a variety of tasks by using pre-training and transfer learning. COMET produced better-regularized models that more closely mirrored known biology. Furthermore, COMET models identified biologically significant proteins for particular health outcomes.

In labor onset models, COMET identified proteins essential for immune regulation, placental development, and pregnancy complications, and its predictive power was corroborated by Pearson correlation values. Proteins implicated in tumor growth and microenvironment modification were found to be associated with cancer mortality. All things considered, COMET offers a framework for defining intricate connections between biological pathways and clinical manifestations.