CV

Houman Rajabi

AI/ML Engineer

rajabi_houman@yahoo.com
+39 351 9465199
Turin, IT

Summary

AI/ML Engineer with a strong background in Machine Learning and MLOps across domains including information retrieval (search, recommendations, RAG systems, and chatbots), NLP, and AI agents.

Work Experience

  • AI/ML Engineer (NLP Research Intern)
    2025-08 - 2026-02
    Trivago
    Central ML team, unstructured data systems.
    • Built semantic search engines using embedding models, improving query-to-accommodation matching accuracy across millions of listings.
    • Applied Generative AI to automate accommodation description generation, and Computer Vision to score and optimise image quality at scale.
    • Designed end-to-end ETL pipelines for unstructured data ingestion powering downstream ML systems.
  • Machine Learning Researcher
    2023-09 - 2025-05
    University of Turin (DeepHealth Project)
    Federated clinical AI; industrial partners: Philips, Thales.
    • Assisted in designing federated learning workflows enabling decentralised model training across hospital networks without sharing raw patient data.
    • Participated in technical discussions aligning clinical requirements with HPC infrastructure for two major industrial partners.
    • Contributed to implementing GDPR-compliant differential-privacy algorithms ensuring data sovereignty within strict hospital IT environments.
  • Data Scientist
    2021-03 - 2023-07
    Snapp!
    Ride-hailing platform with ~1M daily trips.
    • Supply-demand forecasting: Developed ML models to balance driver supply and passenger demand, materially reducing average wait times.
    • Dynamic pricing: Designed real-time surge-pricing algorithms to maximise revenue during high-demand periods.
    • ETA prediction: Leveraged streaming data to improve arrival-time accuracy for millions of daily trips.
    • Analytics: Monitored operational KPIs and executed data-driven optimisations, improving driver completion rates.
  • Data Scientist
    2019-05 - 2021-03
    Digikala Group
    Largest e-commerce platform in MENA.
    • Recommendations: Engineered a hybrid engine (Association Rules + Collaborative Filtering) to redesign 'Frequently Bought Together', boosting Average Order Value via cross-selling.
    • Flash-sale pricing: Automated candidate selection and discount depth for 'Shegeftane' flash sales, achieving high sell-through without eroding margins.
    • Demand forecasting: Built XGBoost/Prophet pipelines capable of handling Black Friday traffic loads, reducing logistics bottlenecks and improving inventory accuracy.
    • Search relevance: Integrated custom Persian NLP models for semantic matching and misspelling correction, significantly reducing zero-result queries.
  • Project Contributor
    2018-09 - 2019-03
    Sharif University of Technology
    • Institutional Analytics: Created robust analytical dashboards to support institutional decision-making and enhance strategic academic planning.

Education

  • Master of Science in Language Technology & Digital Humanities (NLP)
    2026-06
    University of Turin
    GPA: 28.8/30
    Courses: Mechanistic interpretability, RAG systems, Multilingual NLP
  • Bachelor of Science in Computer Science
    2019-03
    Sharif University of Technology
    Courses: Algorithms, Data structures, Statistical learning, Distributed systems

Skills

Languages & Core

  • Python (OOP)
  • SQL
  • Bash
  • Git

AI & GenAI

  • LLMs
  • RAG
  • LangChain
  • LangGraph
  • HuggingFace
  • PEFT/LoRA
  • Vector DBs

Machine Learning

  • PyTorch
  • XGBoost
  • Scikit-learn
  • NLP
  • Computer Vision
  • Search & RecSys

MLOps & Cloud

  • AWS SageMaker
  • Docker
  • Kubernetes
  • CI/CD
  • MLflow
  • Airflow

Big Data

  • PySpark
  • Kafka
  • Databricks

Publications

  • Identity, Toxicity, or Complexity? A Language-Specific Feature Selection Approach to Reclamatory Intent Detection
    2026
    Proceedings of EVALITA 2026 (Task A: MultiPRIDE), Bari, Italy
    1st Place (Italian Task): Developed a hybrid fusion architecture combining BERT embeddings with engineered sociolinguistic features to detect reclaimed slurs; achieved a state-of-the-art F1 of 0.8981 by modelling language-specific syntactic patterns.
  • Parametric Stubbornness: Mechanistically Isolating the Layer Shift and Sparsity Gradient of RAG Knowledge Conflicts in Llama-3
    2026
    ACL Student Research Workshop 2026
    Two-phase activation patching on Meta-Llama-3-8B across 452 minimal-pair conflicts; introduces the Sparsity Gradient and Contextual Contamination phenomena in RAG knowledge conflict settings.

Languages

  • English
    Full professional proficiency
  • Persian
    Native
  • German
    Limited working proficiency
  • Italian
    Elementary proficiency