Líffræðifélag Íslands - biologia.is
Líffræðiráðstefnan 2025
Erindi/veggspjald / Talk/poster E79
Höfundar / Authors: Rohit Goswami (1)
Starfsvettvangur / Affiliations: École Polytechnique Fédérale de Lausanne
Kynnir / Presenter: Rohit Goswami
High-performance computing is ubiquitous in the life sciences, from posterior sampling of Bayesian models in evolution to algorithms for synteny and alignment. With the advent of large language models and the concurrent rise of GPU-driven software, the barrier to entry for cutting-edge tools has risen sharply. This work introduces a workflow framework in which the level of abstraction shifts from individual scripts to a Domain-Specific Language (DSL) that expresses the logic of the entire analysis. We demonstrate this using Snakemake to construct reproducible pipelines spanning Python, R, and compiled programs. Diverse case studies highlight efficient HPC resource utilization, including concurrent CPU execution and targeted GPU acceleration. Examples include fitting Bayesian ecological models in Stan, visualizing phylogenies, and generating free energy surfaces for alanine dipeptide. This approach helps researchers use modern computational resources (such as LUMI) with confidence while preserving scientific rigor and reproducibility.
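
The shift from individual scripts to declared rules can be illustrated with a minimal Python sketch. It is not Snakemake and not the pipeline presented in this talk: the rule names, file paths, and resource hints are hypothetical, chosen only to echo the case studies above. Each rule declares its inputs and outputs, and the execution order is derived from those declarations rather than written by hand.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules: every name, path, and resource hint here is illustrative only.
RULES = {
    "fit_stan_model": {
        "inputs": ["data/counts.csv"],
        "outputs": ["results/posterior.csv"],
        "threads": 4, "gpu": False,
    },
    "plot_phylogeny": {
        "inputs": ["data/tree.nwk"],
        "outputs": ["figures/tree.pdf"],
        "threads": 1, "gpu": False,
    },
    "sample_dipeptide": {
        "inputs": ["data/alanine.pdb"],
        "outputs": ["results/trajectory.xtc"],
        "threads": 1, "gpu": True,
    },
    "free_energy_surface": {
        "inputs": ["results/trajectory.xtc"],
        "outputs": ["figures/fes.pdf"],
        "threads": 1, "gpu": False,
    },
    "report": {
        "inputs": ["results/posterior.csv", "figures/tree.pdf", "figures/fes.pdf"],
        "outputs": ["report.html"],
        "threads": 1, "gpu": False,
    },
}


def plan(target, rules):
    """Return rule names in an order that satisfies the file dependencies of target."""
    # Map each output file to the rule that produces it.
    producer = {out: name for name, rule in rules.items() for out in rule["outputs"]}
    graph = {}  # rule name -> set of rule names it depends on

    def visit(path):
        name = producer.get(path)
        if name is None or name in graph:
            return  # a source file no rule produces, or a rule already handled
        graph[name] = set()
        for inp in rules[name]["inputs"]:
            dep = producer.get(inp)
            if dep is not None:
                graph[name].add(dep)
            visit(inp)

    visit(target)
    return list(TopologicalSorter(graph).static_order())


def gpu_rules(order, rules):
    """Select the planned rules that request a GPU, e.g. to route them to a GPU queue."""
    return [name for name in order if rules[name]["gpu"]]


if __name__ == "__main__":
    order = plan("report.html", RULES)
    print(order)
    print(gpu_rules(order, RULES))
```

Requesting `report.html` pulls in only the rules needed to build it. Rules with no dependency on one another (here the Stan fit, the phylogeny plot, and the dipeptide sampling) are the ones a workflow engine could run concurrently, while the `gpu` flag marks the single step that would need an accelerator.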