CV

Houman Rajabi

AI/ML Engineer

houmanrajabi@myyahoo.com
+39 351 9465199
Turin, IT

Summary

AI/ML Engineer with experience across NLP, search, recommendations, RAG systems, and AI agents, backed by a strong data science background. Specializes in bridging research and production and in turning raw data into input for strategic decisions, with a focus on MLOps and scalable ML systems built with Python, PySpark, and AWS. Currently completing an MSc in Language Technology & Digital Humanities at the University of Turin, with hands-on experience collaborating across Italy, Germany, and Iran.

Work Experience

  • AI/ML Engineer Intern
    2025-08 - 2026-01
    Trivago
    Member of the central team, contributing to the development of scalable ML systems for unstructured data.
    • Contributed to the development of semantic search engines using embedding models to improve query-to-accommodation matching accuracy.
    • Applied Generative AI to automate the creation of accommodation descriptions and Computer Vision to assess and optimize image quality.
    • Supported the design of end-to-end ETL pipelines that ingest unstructured data for these ML systems.
  • Researcher & Data Scientist
    2023-09 - 2025-05
    University of Turin (DeepHealth Project)
    Collaboration focused on Federated Learning and Privacy Engineering.
    • Federated Learning: Architected decentralized training workflows across hospital networks to avoid moving raw patient data.
    • Technical Liaison: Aligned clinical requirements with HPC infrastructure for industrial partners such as Philips and Thales.
    • Privacy Engineering: Deployed GDPR-compliant algorithms to ensure data sovereignty within strict hospital IT environments.
  • Data Scientist
    2021-03 - 2023-07
    Snapp!
    Data Scientist focusing on forecasting, algorithms, and real-time machine learning optimizations.
    • Forecasting: Developed models to balance driver supply and demand, reducing wait times by 2 minutes.
    • Algorithms: Designed dynamic pricing algorithms to maximize revenue (+3%) during high-demand periods.
    • Machine Learning: Used real-time data to improve ETA prediction accuracy by 2.5%.
    • Analytics: Monitored KPIs to identify and execute optimizations that improved driver utilization and completion rates.
  • Data Scientist
    2019-05 - 2021-03
    Digikala Group
    • Engineered hybrid recommendation engines (Association Rules, Collaborative Filtering) to redesign the 'Frequently Bought Together' module, boosting Average Order Value (AOV) by 5% via cross-selling.
    • Developed price optimization algorithms for Flash Sales ('Shegeftane') to automate candidate selection and discount depth, achieving ~95% sell-through without eroding margins.
    • Built robust demand forecasting pipelines (XGBoost, Prophet) capable of handling 4x traffic loads during Black Friday, reducing logistics bottlenecks by 10% and improving inventory accuracy.
    • Enhanced search relevance by integrating custom Persian NLP models for semantic matching and misspelling correction, reducing 'Zero Search Result' queries by 7%.
  • Project Contributor
    2018-09 - 2019-03
    Sharif University of Technology
    • Institutional Analytics: Created robust analytical dashboards to support institutional decision-making and enhance strategic academic planning.

Education

  • Master of Science in Language Technology & Digital Humanities (NLP)
    2026-06
    University of Turin
  • Bachelor of Science in Computer Science
    2019-03
    Sharif University of Technology

Skills

Languages & Core

  • Python (OOP)
  • SQL
  • Bash
  • Git

AI & GenAI

  • LLMs
  • RAG Pipelines
  • LangChain
  • Hugging Face
  • Vector Databases
  • Fine-tuning (PEFT/LoRA)

Machine Learning

  • PyTorch
  • Scikit-learn
  • XGBoost
  • Computer Vision
  • NLP
  • Search & Recommendation

MLOps & Cloud

  • Docker
  • Kubernetes
  • AWS (SageMaker)
  • CI/CD (GitHub Actions)
  • MLflow
  • Airflow

Big Data

  • PySpark
  • Kafka
  • Databricks

Publications

  • Identity, Toxicity, or Complexity? (EVALITA 2026)
    1st place (Italian task): Developed a hybrid fusion architecture combining BERT embeddings with engineered sociolinguistic features to detect reclaimed slurs; achieved a state-of-the-art F1 of 0.8981 by modeling language-specific syntactic patterns.

Languages

  • English
    Full professional proficiency
  • German
    Limited working proficiency
  • Italian
    Elementary proficiency