70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V.
Investigating host dependency factor screening data across hemorrhagic fever and respiratory viruses to identify commonalities and differences using machine learning
Text
Since its emergence in 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a global pandemic [1]. However, it is not the only viral threat to human health. Influenza A virus (IAV) has demonstrated its pandemic potential through several outbreaks in recent decades [2]. Additionally, the number of reported infections caused by Dengue virus and West Nile virus is rising, with a considerable number of cases now being reported in Europe [3], [4]. Besides vaccines, host-directed antivirals (HDA) are promising therapeutic options [5]. While CRISPR/Cas9 or RNA interference (RNAi) screens have been employed in numerous studies to identify relevant host dependency factors (HDFs), robust and consistent identification remains a challenge. Results are inconsistent across studies, making them difficult to interpret in a broad-spectrum context.
In the present work, host dependency factor screening data from more than 60 published research articles were analyzed in an integrated approach. We focused on studies of viruses causing either hemorrhagic fever or respiratory infections, i.e., Dengue virus, Zika virus, influenza virus, and SARS-CoV-2. A boosted random forest model was employed to detect conserved patterns within each viral group. The model was trained on more than 4,500 features derived from screening data and gene set enrichment analysis. Hierarchical clustering was applied to the features with the highest importance scores.
The model achieved a balanced accuracy of 90%. Among the top-ranking features, several well-known HDFs, such as ACE2, EMC3, STT3A, NXF1, and SLC35A1, were identified. This highlights the model's capability to detect viral group-specific patterns within heterogeneous screening datasets. As expected, screens of SARS-CoV-2-infected cells showed the highest similarity to one another, as revealed by clustering analysis. Comparably high similarity was observed among screens of flavivirus-infected cells, such as those infected with Dengue and Zika viruses. Unexpectedly, profiles from IAV-infected cells were more closely related to those of hemorrhagic fever-causing viruses than to those of SARS-CoV-2. HDFs of both hemorrhagic fever viruses and IAV were enriched in gene sets related to vesicle-mediated transport, proton transport, and acidification.
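For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, which keeps a large class from dominating the score when screens are unevenly distributed across viral groups. The following minimal sketch illustrates the calculation only; the class labels and the toy predictions are hypothetical and are not taken from the study data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recalls over all classes in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy example (labels are assumptions for illustration).
y_true = ["respiratory"] * 8 + ["hemorrhagic"] * 2
y_pred = ["respiratory"] * 8 + ["respiratory", "hemorrhagic"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(f"accuracy={plain_accuracy:.2f}, balanced accuracy={score:.2f}")
```

In this toy case the plain accuracy is 0.90, whereas the balanced accuracy is 0.75 because only one of the two minority-class samples is recovered.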
These findings provide first insights into cross-viral pathways and may serve as a basis for developing broad-spectrum host-directed antiviral strategies.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
[1] Li J, Lai S, Gao GF, Shi W. The emergence, genomic diversity and global spread of SARS-CoV-2. Nature. 2021;600(7889):408-418. DOI: 10.1038/s41586-021-04188-6
[2] Morens DM, et al. Many potential pathways to future pandemic influenza. Science translational medicine. 2023;15(718):eadj2379. DOI: 10.1126/scitranslmed.adj2379
[3] Buchs A, Conde A, Frank A, et al. The threat of dengue in Europe. New Microbes New Infect. 2022;49-50:101061. DOI: 10.1016/j.nmni.2022.101061
[4] Erazo D, et al. Contribution of climate change to the spatial expansion of West Nile virus in Europe. Nature communications. 2024;15(1):1196. DOI: 10.1038/s41467-024-45290-3
[5] Edinger TO, et al. Entry of influenza A virus: host factors and antiviral targets. The Journal of general virology. 2014;95(Pt 2):263-277. DOI: 10.1099/vir.0.059477-0



