Genomics Data Analyst

University of Helsinki

Helsinki, Finland

€4,500 - €5,500 per month (gross)

What you will work on

This role sits at the core of our data infrastructure. Your primary focus will be ensuring that our whole genome sequencing (WGS) and whole exome sequencing (WES) data are clean, harmonized, and analysis-ready. Over time, you will also have the opportunity to contribute to downstream genomic analyses and methods work.

Processing, quality control, and harmonization of large-scale Whole Genome and Whole Exome Sequencing datasets.
Building and maintaining scalable, reproducible workflows on HPC and cloud platforms, including Google Cloud.
Variant calling and QC for SNVs, indels, and structural variants.
Collaborating with FinnGen network researchers and international partners on joint analyses.
Over time: contributing to haplotype phasing, imputation, population genetics analyses, and integration of long-read sequencing and pangenome approaches.

What we are looking for

We welcome candidates from bioinformatics, computational biology, statistical genetics, computer science, or any related quantitative field. A PhD is preferred for the postdoctoral level, but strong candidates with equivalent research or industry experience are very welcome to apply. What matters most is your ability to handle and analyze large-scale genomic data and your enthusiasm for the work. You do not need to tick every box. If you have strong experience in WGS/WES data handling and are eager to learn, we want to hear from you.

Core skills (the focus of this role)

These are the skills most central to the day-to-day work. We expect candidates to have solid hands-on experience in most of these:

Processing and analyzing WGS and/or WES data, including alignment, duplicate marking, and base quality score recalibration (e.g. using BWA, GATK, Samtools).
Variant calling and quality control for SNVs and indels; familiarity with common QC metrics such as coverage, Ti/Tv ratios, and contamination checks.
Writing and maintaining reproducible analysis pipelines using workflow managers such as Nextflow, Snakemake, or WDL/Cromwell.
Working comfortably in Linux/HPC environments, including job schedulers (SLURM or similar) and shell scripting.
Scripting in Python and/or R for data processing, QC visualization, and analysis.
Experience with version control (Git) and containerization tools (Docker or Singularity) for reproducible environments.

Useful experience (good to have, not required)

Any of the following would be a bonus, but please do not be put off if you have only some of these or none at all:

Cloud computing, particularly Google Cloud Platform (GCP), including cloud storage and scalable compute.
Structural variant calling and quality control (e.g. using Manta, pbsv, or similar tools).
Haplotype phasing or genotype imputation (e.g. SHAPEIT, Beagle, or Michigan Imputation Server).
Population genetics concepts such as population stratification, relatedness inference, or GWAS.
Exposure to long-read sequencing data (ONT or PacBio) or emerging pangenome reference approaches.
Familiarity with large-scale genomic databases or biobank data (e.g. FinnGen, UK Biobank, gnomAD).
Knowledge of data governance and security practices for controlled-access human genomic data.
