70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V.
Efficient Heritability Estimation in Multivariate Generalized Linear Models via Low-Rank Approximations
2 Braunschweig Integrated Centre of Systems Biology, TU Braunschweig, Braunschweig, Germany
3 The Danish Twin Registry and Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Odense, Denmark
4 Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
5 Paraná Federal University, Curitiba, Brazil
Text
Introduction: Linear mixed models are widely used in genetics to estimate genetic variance components from SNP data. These models typically assume that phenotypes follow a (multivariate) normal distribution. However, in many practical settings, phenotype distributions deviate from normality, limiting the applicability of standard linear mixed models.
State of the art: Phenotypes such as binary disease status, gene expression counts, or DNA methylation levels are better modeled using distributions like Bernoulli, Poisson, or Beta, respectively. Generalized linear mixed models (GLMMs) are more appropriate in these cases. However, most implementations focus on univariate outcomes, and extensions to multivariate phenotypes – especially when each phenotype follows a different distribution – remain computationally challenging.
Concept: We address this issue using the multivariate covariance generalized linear model (McGLM) framework introduced by Bonat & Jørgensen [1], which accounts for non-normality by explicitly modeling the relationship between the mean and the variance of each phenotype. Within this unified and flexible framework, genetic variance components can be estimated from genetic relationship matrices (GRMs).
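To illustrate how a mean-variance relationship and a GRM can enter one covariance matrix, the following is a minimal sketch for a single phenotype. It assumes a covariance of the form Sigma = V(mu)^(1/2) Omega V(mu)^(1/2) with Omega = tau_env * I + tau_gen * GRM and a power variance function V(mu) = mu^power. The function names, the toy genotype data, and the heritability ratio on the dispersion scale are illustrative assumptions and are not taken from the abstract.

```python
import numpy as np

def mcglm_covariance(mu, grm, tau_env, tau_gen, power=1.0):
    """Assumed form: Sigma = V(mu)^(1/2) @ Omega @ V(mu)^(1/2).

    Omega = tau_env * I + tau_gen * GRM (identity covariance link, assumed).
    V(mu) = diag(mu ** power); power=1 gives a Poisson-like variance function.
    """
    mu = np.asarray(mu, dtype=float)
    omega = tau_env * np.eye(mu.shape[0]) + tau_gen * grm
    sqrt_v = mu ** (power / 2.0)  # diagonal of V(mu)^(1/2)
    # Scale rows and columns of Omega by sqrt_v without forming diag matrices.
    return sqrt_v[:, None] * omega * sqrt_v[None, :]

def dispersion_heritability(tau_env, tau_gen):
    """Hypothetical heritability ratio on the dispersion-parameter scale."""
    return tau_gen / (tau_env + tau_gen)

# Toy data (assumption): 6 individuals, 40 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n, m = 6, 40
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)
z = geno - geno.mean(axis=0)   # column-centred genotypes
grm = z @ z.T / m              # simple GRM, no per-SNP variance scaling

mu = np.exp(rng.normal(size=n))  # positive means, e.g. from a log link
sigma = mcglm_covariance(mu, grm, tau_env=1.5, tau_gen=0.5)
h2 = dispersion_heritability(1.5, 0.5)
```

In this sketch the variance of each observation grows with its mean, while the GRM contributes the correlation between related individuals. A multivariate version with a different variance function per phenotype would need additional structure that is not shown here.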
Implementation: To make this approach scalable, especially with increasing sample sizes, we apply low-rank approximation techniques to the GRM. These approximations significantly reduce the computational cost of matrix operations involved in parameter estimation, enabling efficient heritability estimation under complex, mixed distributional settings.
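The abstract does not state which low-rank technique is used. As one possible illustration, the sketch below assumes a truncated eigendecomposition of the GRM combined with the Woodbury identity, so that linear systems in tau_env * I + tau_gen * GRM are solved with matrix products involving only the retained eigenvectors. All function names and the toy data are hypothetical.

```python
import numpy as np

def low_rank_grm(grm, rank):
    """Return the top-`rank` eigenvectors U and eigenvalues lam of a symmetric GRM."""
    eigvals, eigvecs = np.linalg.eigh(grm)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:rank]      # indices of the largest ones
    lam = np.clip(eigvals[top], 0.0, None)      # guard against tiny negatives
    return eigvecs[:, top], lam

def woodbury_solve(U, lam, tau_env, tau_gen, rhs):
    """Solve (tau_env * I + tau_gen * U diag(lam) U^T) x = rhs.

    With orthonormal columns in U, the Woodbury identity reduces to
    x = (rhs - U diag(w) U^T rhs) / tau_env,
    where w = tau_gen * lam / (tau_env + tau_gen * lam).
    """
    w = tau_gen * lam / (tau_env + tau_gen * lam)
    return (rhs - (U * w) @ (U.T @ rhs)) / tau_env

# Toy data (assumption): 200 individuals, 50 SNPs, so the GRM has rank <= 50.
rng = np.random.default_rng(1)
n, m = 200, 50
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)
z = geno - geno.mean(axis=0)
grm = z @ z.T / m

tau_env, tau_gen = 1.0, 0.8
rhs = rng.normal(size=n)

# Rank 50 reproduces this GRM up to rounding; a smaller rank is an approximation.
U, lam = low_rank_grm(grm, rank=50)
x_low_rank = woodbury_solve(U, lam, tau_env, tau_gen, rhs)
x_full = np.linalg.solve(tau_env * np.eye(n) + tau_gen * grm, rhs)

# A truncated version for comparison; its accuracy depends on the eigenvalue decay.
U10, lam10 = low_rank_grm(grm, rank=10)
x_rank10 = woodbury_solve(U10, lam10, tau_env, tau_gen, rhs)
```

Once U and lam are available, each solve costs on the order of n times the rank instead of the cubic cost of a dense solve. In practice the dense `eigh` call shown here would itself be replaced by a method that computes only the leading eigenpairs.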
Lessons learned: Our results show that low-rank approximations provide substantial computational speed-ups without sacrificing accuracy. This makes the McGLM framework practical for large-scale genetic studies involving diverse types of phenotypes. The approach opens the door to more flexible and computationally feasible heritability estimation across a wide range of real-world applications.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.



