🖥️ Work and Personal Projects
1. FluWarning (2025)
A Dockerized ETL pipeline for data cleaning, alignment, and outlier detection, plus a web app* showcasing the results with dynamic filters, tables, maps, and widgets.
* Access to the app is restricted because of the source data policy.
2. FluDataStratification (2025)
A data analysis pipeline that finds clusters of influenza genomes.
3. Job Orchestra (2025)
A pipeline orchestration framework for reproducible and memory-persistent data workflows in Python.
4. CodonLLM (2025)
A large language model (LLM) built from scratch that generates DNA/RNA sequences similar to the input sequences. Suitable for data augmentation.
5. RecombinHunt (2024)
A Python library and software tool that identifies mosaic patterns in genomic sequence data.
6. CoV2K (2021)
Example of Use and Analysis | API docs
A backend system composed of three stages:
- A web scraper based on JavaScript and HTML for retrieving data from dynamic websites
- An ETL pipeline processing and storing information into a NoSQL database
- A RESTful API enabling graph-like data retrieval and exploration
7. ViruSurf-downloader (2021)
GitHub | Supports the web apps ViruSurf and EpiSurf.
An ETL pipeline for the automated, continuous integration of multi-source genomic data into a database, with built-in data curation and periodic automated database backups and optimizations.
8. VarSum (2021)
Example of Use and Analysis (Jupyter Notebook) | GitHub | API docs
A RESTful API enabling user-personalized queries to a multi-source repository of genomic variants and annotations.
9. MTG Life Counter+ (2018)
Android app supporting the famous tabletop card game Magic: The Gathering.
10. FlashApp (2016)
Android app automating the exchange of fiscal receipts.
Programming languages used: Python, SQL, Bash, MongoDB Query Language, XPath, plus some core web technologies (HTML, CSS, JS)
Software platforms used: PostgreSQL, AWS EC2, Docker, MongoDB, Conda, uv