I maintain a growing collection of open-source tools, analysis pipelines, and teaching materials on GitHub. The work spans genomic analysis, clinical modelling, data visualisation, and reproducible research workflows.
Models and pipelines developed for the Neotree project and related work in clinical AI, covering data preprocessing, model training, validation frameworks, and performance evaluation in low-resource settings.
Tools for working with genetic and sequencing data: population structure analysis, variant annotation, GWAS workflows, and integration with standard bioinformatics formats (VCF and PLINK BED/BIM/FAM).
Notebooks, tutorials, and worked examples covering statistical methods in R and Python. Designed for researchers learning to apply quantitative methods to their own data — from regression and survival analysis to dimensionality reduction and clustering.
R (ggplot2) and Python (matplotlib, seaborn, plotly) scripts and templates for producing publication-quality figures and exploratory dashboards from clinical and genomic data.
Python: pandas · NumPy · scikit-learn · matplotlib · seaborn · plotly · statsmodels · lifelines
R: tidyverse · ggplot2 · Bioconductor · survival · caret · randomForest · lme4
PostgreSQL · SQLite · data modelling for clinical and research databases
Git · GitHub Actions · R Markdown · Jupyter · Docker · Quarto
If you find any of my tools useful, have suggestions, or would like to collaborate on a project, please open an issue on the relevant repository or get in touch directly.