Data Science & Bioinformatics
Bridging biomedical science and computational analysis through Python, R, SQL, and genomic data science
About
Biomedical Scientist with 7+ years in clinical labs, now transitioning into data science.
I combine wet-lab expertise (flow cytometry, NGS library prep) with computational skills in Python, R, SQL, and Bash
to analyse biological data, streamline workflows, and turn results into actionable insight.
Python · R · SQL · Bash · NGS Data Analysis · Tableau · Git · Flow Cytometry · NGS Library Prep (Illumina) · Statistics · LIMS
Location: London, United Kingdom
Languages: Italian · English · Portuguese · Spanish
Open to: Data Science · Bioinformatics
Projects
A selection of work spanning genomics, analytics, and scripting. Full list on GitHub.
End-to-end workflows for genomic analysis: alignment, assembly, RNA-seq, variant calling, and DB querying.
Objective: Gain proficiency in computational genomics through applied projects spanning multiple bioinformatics subfields.
Approach: Completed a series of modules covering sequence alignment, genome assembly, RNA-seq analysis, variant calling, data visualisation, and database querying. Applied Python, R, and Biopython to process and analyse large genomic datasets.
Outcome: Built a diverse portfolio of scripts and workflows applicable to real-world genomics research, showcasing versatility across topics such as quality control, functional annotation, and statistical interpretation of biological data.
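As an illustration of the quality-control topic mentioned above, here is a minimal sketch of a read-level QC filter. It is not the project's code: the project used Biopython, while this version sticks to the Python standard library so it runs anywhere. The function names (`gc_content`, `mean_phred`, `parse_fastq`, `qc_filter`), the quality threshold, and the two toy reads are all assumptions made for the example.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 by default)."""
    return mean(ord(ch) - offset for ch in quality)


def parse_fastq(text: str):
    """Yield (read_id, sequence, quality) from simple 4-line FASTQ records."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual


# Toy records invented for illustration; not data from the project.
TOY_FASTQ = """\
@read1
GGCCAATT
+
IIIIIIII
@read2
ATATATGC
+
III!!!!!
"""


def qc_filter(text: str, min_quality: float = 20.0):
    """Keep reads whose mean Phred score is at least min_quality.

    Returns a list of (read_id, GC fraction rounded to 2 decimals).
    """
    return [
        (read_id, round(gc_content(seq), 2))
        for read_id, seq, qual in parse_fastq(text)
        if mean_phred(qual) >= min_quality
    ]


passed = qc_filter(TOY_FASTQ)
print(passed)
```

In a real workflow the parsing would normally be delegated to a dedicated library, and the threshold would be chosen per assay rather than hard-coded.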
Predictive modelling to understand drivers of employee turnover and inform retention strategy.
Objective: Build a predictive model to understand the factors influencing employee attrition.
Approach: Cleaned and transformed HR datasets with Pandas, performed exploratory analysis, engineered features, and implemented classification models using scikit-learn.
Outcome: Delivered actionable insights and recommendations to improve workforce retention strategies.
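To make the classification step above concrete, the sketch below fits a logistic regression by batch gradient descent. It is a stdlib-only stand-in, not the project's implementation (which used Pandas and scikit-learn). The two features (an overtime flag and scaled tenure), the records, the learning rate, and the function names are hypothetical and chosen only to keep the example small.

```python
import math

# Hypothetical toy records: (overtime 0/1, tenure scaled to 0-1, left 0/1).
# Invented for illustration; these are not the project's HR data.
RECORDS = [
    (1, 0.1, 1), (1, 0.2, 1), (1, 0.3, 1), (0, 0.1, 1),
    (0, 0.6, 0), (0, 0.8, 0), (1, 0.9, 0), (0, 0.9, 0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features) -> float:
    """Predicted probability of attrition for one feature vector."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(records, lr: float = 0.5, epochs: int = 2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n_features = len(records[0]) - 1
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for *features, label in records:
            # Gradient of the log-loss w.r.t. the linear score is (p - y).
            error = predict_proba(weights, bias, features) - label
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        bias -= lr * grad_b / len(records)
        weights = [w - lr * g / len(records) for w, g in zip(weights, grad_w)]
    return weights, bias


weights, bias = fit_logistic(RECORDS)
new_starter_overtime = predict_proba(weights, bias, (1, 0.1))
long_tenure_no_overtime = predict_proba(weights, bias, (0, 0.9))
print(f"overtime weight={weights[0]:.2f}, tenure weight={weights[1]:.2f}")
print(f"risk: {new_starter_overtime:.2f} vs {long_tenure_no_overtime:.2f}")
```

The sign and size of each fitted weight is what links a model like this back to "drivers" of turnover: on this toy data, longer tenure pushes the predicted risk down.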
Exploratory analysis of engagement metrics to uncover content trends and optimisation levers.
Objective: Analyse user engagement metrics to identify content trends on TikTok.
Approach: Processed raw CSV datasets, generated visualisations with Matplotlib, and applied descriptive statistics to uncover patterns in user behaviour.
Outcome: Produced data-driven recommendations to optimise content strategy and boost audience engagement.
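The descriptive-statistics step above can be sketched as follows. This is an illustrative example rather than the project's code: it uses only the standard library and leaves out the Matplotlib plotting. The column names, the engagement-rate definition, and the four rows are assumptions made up for the sketch.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, median

# Hypothetical CSV extract; column names and values are invented.
RAW_CSV = """\
video_id,category,views,likes,shares
v1,tutorial,1000,120,30
v2,tutorial,2000,340,60
v3,comedy,5000,300,100
v4,comedy,4000,200,40
"""


def engagement_rate(row: dict) -> float:
    """(likes + shares) / views for one video row."""
    return (int(row["likes"]) + int(row["shares"])) / int(row["views"])


def summarise_by_category(csv_text: str) -> dict:
    """Group videos by category and report count, mean and median rate."""
    rates = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rates[row["category"]].append(engagement_rate(row))
    return {
        category: {
            "n": len(values),
            "mean": round(mean(values), 3),
            "median": round(median(values), 3),
        }
        for category, values in rates.items()
    }


summary = summarise_by_category(RAW_CSV)
for category, stats in summary.items():
    print(category, stats)
```

Comparing a normalised rate rather than raw likes keeps high-view videos from dominating the summary, which is one reasonable way to surface content trends.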
High-level AWS architecture design for migrating on-premises workloads to a cloud-native, fully managed solution, ensuring scalability, fault-tolerance, and operational efficiency.
Objective: Migrate two on-prem workloads - a three-tier web application and a Hadoop-based analytics environment - into a modern AWS environment with managed services.
Approach: Designed an end-to-end cloud solution using AWS managed services:
Web Application Architecture:
Frontend hosted on Amazon S3 with CloudFront for global HTTPS delivery.
Backend containerised and run on Amazon ECS with Fargate behind an Application Load Balancer (ALB).
Database migrated to Amazon Aurora MySQL (Multi-AZ) with ElastiCache Redis for caching and SQS for asynchronous decoupling.
AWS Secrets Manager used for credential management and CloudWatch/X-Ray for monitoring.
Data Analytics Architecture:
Replaced the on-prem Hadoop cluster with Amazon EMR (Spark/Hive) for scalable distributed data processing.
Created an S3 Data Lake as the central repository with metadata managed by AWS Glue Data Catalog.
Used Athena for serverless queries, Redshift for data warehousing, and QuickSight for BI dashboards.
Data ingested from on-prem via AWS DataSync.
Integration Flow:
CloudFront routes traffic to ALB → ECS → Aurora, with ElastiCache and SQS optimising backend performance.
Simultaneously, DataSync moves data to S3 for analysis via Glue → EMR → Athena → Redshift → QuickSight.
Outcome: Achieved full migration with:
Decoupled, fault-tolerant architecture
Multi-AZ high availability
Cloud-native modernisation of both workloads
Reduced operational overhead with managed services
Integrated observability and security across layers
Final capstone for the Google Business Intelligence Certificate, showcasing stakeholder-driven dashboard design and data storytelling using Tableau.
Objective: Build a BI solution that communicates market performance and insights for Google Fiber's leadership team.
Approach: Merged multiple regional CSV datasets directly in Tableau, cleaned and standardised columns, and created interactive dashboards visualising KPIs such as revenue, margin, and customer trends.
Outcome: Delivered a clear, visually consistent dashboard highlighting business performance across regions and channels.
Analysed NGS run metrics and generated patient reports (>500/week).
Authored and reviewed SOPs across the workflow.
Biomedical Laboratory Assistant - Cytology - Leicester Royal Infirmary
Dec 2018 - Jun 2019 · Leicester, UK
Managed sample reception and prepared specimens for Papanicolaou staining.
Maintained reagents and ensured sample integrity end-to-end.
Education
MSc, Bioinformatics - Atlantic Technological University (Sep 2025 - Present)
MSc, Cell & Gene Therapy - University College London (2021-2023)
Dissertation: Expansion and Preservation of Haematopoietic Potential in Human Amniotic Fluid Stem Cells for Therapeutic Applications
BSc, Biomedical Science - University of Catania (2014-2017)
Dissertation: Cytotoxicity assays using SIRC, ARPE-19, and HRPE cells
Earlier Research Internships
Research internships completed during the BSc in Biomedical Science at the University of Catania,
building a foundation in genomics, molecular diagnostics, and analytical biochemistry.