ITRoselloSignoris/README.md

Hi there! 👋 I'm Iñaki Rosello

Data Scientist | Computer Science Student | MLOps Practitioner

Transforming data into actionable insights through advanced analytics and machine learning

LinkedIn GitHub


🎯 About Me

Data Scientist passionate about solving complex business problems through data-driven approaches. My work spans the entire data science lifecycle: from exploratory analysis and feature engineering to model development, deployment, and monitoring in production environments.

class DataScientist:
    def __init__(self):
        self.name = "Iñaki Rosello"
        self.role = "Data Scientist & CS Student"
        self.location = "Buenos Aires, Argentina"
        
    def expertise(self):
        return {
            "analytics": ["EDA", "Statistical Analysis", "Data Visualization"],
            "modeling": ["Classification", "Clustering", "Recommendation Systems"],
            "mlops": ["Model Deployment", "Monitoring", "CI/CD Pipelines"],
            "business": ["Fraud Detection", "Churn Prediction", "Demand Forecasting"]
        }
    
    def currently_learning(self):
        return ["Deep Learning", "Advanced MLOps", "Price Optimization"]

🚀 Featured Projects

Status NoCountry Recall

Customer Churn Prediction with Full MLOps Pipeline

Production ML system for FinTech churn prediction optimized for Recall (0.76). Complete pipeline: SMOTE balancing, model comparison (RF, XGBoost, LogReg), hyperparameter tuning, SHAP explainability, MLflow tracking, Evidently drift detection, FastAPI + Docker deployment.

Stack: Scikit-learn, MLflow, Evidently, SHAP, FastAPI, Docker
Skills: Imbalanced Data, Model Selection, Explainable AI, Production MLOps

Status Competition

Retail Demand Forecasting & Price Optimization

Complete data science pipeline for demand estimation and optimal pricing strategy. Integrates econometric analysis (price elasticity) with predictive modeling.

Stack: Python, Pandas, Scikit-learn, Optimization
Skills: Feature Engineering, Time Series, Business Analytics
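
As a rough illustration of how a price-elasticity estimate can feed a pricing decision, here is a minimal sketch. It assumes a constant-elasticity demand curve `Q = A * P**e` fitted by ordinary least squares on logs, and a single unit cost; the data, function names, and cost figure are made up for the example and are not taken from the competition project.

```python
import math

def fit_log_log_demand(prices, quantities):
    """OLS fit of log(Q) = a + e * log(P); e is the price elasticity."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    elasticity = sxy / sxx
    intercept = mean_y - elasticity * mean_x
    return intercept, elasticity

def profit_maximizing_price(unit_cost, elasticity):
    """For Q = A * P**e with e < -1, profit (P - c) * Q peaks at P* = c * e / (e + 1)."""
    if elasticity >= -1:
        raise ValueError("demand must be elastic (e < -1) for a finite optimum")
    return unit_cost * elasticity / (elasticity + 1)

# Synthetic observations generated with a true elasticity of -2 (assumption).
prices = [8.0, 10.0, 12.0, 15.0, 20.0]
quantities = [1000.0 * p ** -2.0 for p in prices]

intercept, elasticity = fit_log_log_demand(prices, quantities)
best_price = profit_maximizing_price(unit_cost=10.0, elasticity=elasticity)
print(round(elasticity, 3), round(best_price, 2))
```

The closed-form optimum only holds under the constant-elasticity assumption; with a forecasting model in the loop, the optimal price would more likely be found by a numerical search over candidate prices.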

Status EDVAI

E-commerce Fraud Detection System

Final project for EDVAI Bootcamp. Production-ready fraud detection using clustering + classification. Includes API, Docker containerization, and Gradio UI.

Stack: Scikit-learn, FastAPI, Docker, Gradio
Skills: Unsupervised Learning, Model Deployment, API Design

Status Personal

Hybrid Recommendation Engine

Personal project exploring recommendation systems. Implements content-based + collaborative filtering with efficient similarity search using FAISS & ANNOY.

Stack: FAISS, ANNOY, FastAPI, Gradio
Skills: RecSys, Vector Search, Similarity Algorithms, API Development


🛠️ Tech Stack

💻 Programming Languages

Python C++ C SQL

📊 Data Science & Analytics

Core Libraries
Pandas NumPy SciPy

Visualization
Matplotlib Plotly Seaborn

🤖 Machine Learning

Frameworks & Libraries
scikit-learn XGBoost LightGBM

Specialized Tools
FAISS ANNOY

🚀 MLOps & Production

Experiment Tracking & Monitoring
MLflow Evidently

Deployment & APIs
FastAPI Gradio Docker

Cloud & Infrastructure
Azure

⚙️ Development Tools

Version Control
Git GitHub

Environment & IDE
Jupyter VS Code Linux

Productivity
Notion

Databases
MySQL


📊 GitHub Analytics

GitHub Stats

Top Languages

GitHub Streak


🎓 Education & Certifications

  • 🎓 Computer Science - Universidad de Buenos Aires (Currently Studying)
  • 📜 Data Science & MLOps Bootcamp - EDVAI (Completed)
  • 🏆 AlixPartners Case Competition 2025 - Participant
  • 💼 NoCountry Simulation - Data Science Team Member
  • 📚 Continuous Learning: Deep Learning, Advanced Analytics (EDA, Data Preparation), Production ML

🎯 Areas of Focus

Current Learning Path:

  • 🧠 Deep Learning fundamentals and advanced architectures
  • πŸ“ˆ Time series forecasting and demand prediction
  • πŸ”§ Production-grade MLOps practices and automation
  • πŸ’° Price optimization and econometric modeling
  • 🐳 Scalable deployment with Docker and cloud services

Core Competencies:

  • Analytics: Exploratory Analysis, Statistical Modeling, Business Intelligence
  • Machine Learning: Classification, Clustering, Recommendation Systems, Fraud Detection
  • MLOps: Model Deployment, Monitoring & Drift Detection, CI/CD Pipelines, Containerization
  • Business Applications: Churn Prediction, Demand Forecasting, Price Optimization

🔬 Technical Deep Dive: Key Projects

📊 FinTech Churn Prediction - Technical Overview

Business Context: Customer retention prediction for a FinTech platform
Challenge: Highly imbalanced dataset (80% no-churn, 20% churn)
Optimization Goal: Maximize Recall to minimize false negatives (missed churners)
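
To make the optimization goal concrete, the sketch below computes Recall, precision, and F1 from raw predictions. The labels are invented for illustration; they are not the project's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class (here: churn = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # missed churners
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Toy example: 4 churners, 3 of them caught, plus 2 false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall only penalizes the missed churner (the false negative), which is why it can be high while precision and F1 stay moderate.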

Data Pipeline:

  1. Preprocessing: Feature engineering on cleaned_data.csv
  2. Split: 80/20 train-test with stratification
  3. Scaling: StandardScaler for linear models
  4. Balancing: SMOTE oversampling on training set
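
The split and balancing steps can be sketched in plain Python as follows. This is a simplified stand-in written for illustration, not the project's code and not the imbalanced-learn implementation: `stratified_split` and `smote_like_oversample` are hypothetical helpers, the data is synthetic, and the oversampler assumes a binary target. The key point it shows is that synthetic minority rows are created only from the training split, so the test set stays untouched.

```python
import random

def stratified_split(X, y, test_fraction=0.2, seed=0):
    """Split class by class so both parts keep the original class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx += indices[:n_test]
        train_idx += indices[n_test:]

    def take(idx):
        return [X[i] for i in idx], [y[i] for i in idx]

    return take(train_idx), take(test_idx)

def smote_like_oversample(X, y, minority=1, k=3, seed=0):
    """Add interpolated minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of the chosen row (squared Euclidean)
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and target
        X_out.append([a + gap * (b - a) for a, b in zip(base, target)])
        y_out.append(minority)
    return X_out, y_out

# Synthetic 80/20 dataset with two numeric features (assumption).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(80)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]
y = [0] * 80 + [1] * 20

(X_train, y_train), (X_test, y_test) = stratified_split(X, y)
X_bal, y_bal = smote_like_oversample(X_train, y_train)
print(len(y_train), y_train.count(1), len(y_bal), y_bal.count(1))
```

In a scikit-learn workflow the same ordering applies to scaling: fitting the scaler on the training split only avoids leaking test-set statistics into the model.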

Model Development:

| Model | Hyperparameter Tuning | Recall | F1-Score | ROC AUC |
|---|---|---|---|---|
| Random Forest | RandomizedSearchCV → GridSearchCV | 0.66 | 0.57 | 0.83 |
| XGBoost | RandomizedSearchCV → GridSearchCV | 0.55 | 0.59 | 0.84 |
| Logistic Regression ✅ | RandomizedSearchCV → GridSearchCV | 0.76 | 0.56 | 0.84 |

Winner: Logistic Regression with class_weight='balanced'
Reason: Highest Recall (0.76), meeting business requirement of catching churners
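
For reference, scikit-learn documents the `class_weight='balanced'` heuristic as `n_samples / (n_classes * np.bincount(y))`. The small sketch below reproduces that formula in plain Python for an 80/20 split like the one described above; the label vector is synthetic.

```python
from collections import Counter

def balanced_class_weights(y):
    """Weights following the documented 'balanced' heuristic:
    n_samples / (n_classes * count_of_class)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# 80% no-churn (0), 20% churn (1)
y = [0] * 80 + [1] * 20
weights = balanced_class_weights(y)
print(weights)  # {0: 0.625, 1: 2.5}
```

Each churn example therefore counts four times as much as a no-churn example in the loss, which pushes the decision boundary toward higher Recall.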

MLOps Implementation:

  • Experiment Tracking: MLflow for all runs, parameters, and metrics
  • Model Explainability: SHAP for feature importance analysis
  • Model Artifacts: Serialized model + scaler with pickle
  • Monitoring: Evidently AI for drift detection
  • Deployment: FastAPI + Docker containerization
  • CI/CD: Automated retraining pipeline
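
To give a feel for what drift detection measures, here is a minimal sketch of the Population Stability Index (PSI) for one numeric feature. This is not Evidently's API, which ships its own drift reports and statistical tests; the function name, bin count, and data are assumptions made for the example.

```python
import math

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature.
    Bins are equal-width over the reference range; out-of-range values
    fall into the first or last bin."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps avoids log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

# Synthetic feature: training-time values vs. the same values shifted by +3.
reference = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in reference]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
print(psi_same, round(psi_shifted, 2))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift; a retraining pipeline could use a check like this as one of its triggers.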

Key Learnings:

  • SMOTE effectively handled class imbalance
  • Logistic Regression beat the tree-based models on Recall (0.76 vs. 0.66 and 0.55), with comparable F1-Score and ROC AUC
  • Optimizing for Recall was crucial for business impact
  • SHAP provided actionable insights for stakeholders

🔍 Fraud Detection System - Architecture & Approach

Project Type: EDVAI Bootcamp Final Project
Domain: E-commerce Transaction Fraud Detection

Methodology:

  1. Unsupervised Learning: Clustering to identify fraud patterns
  2. Supervised Learning: Classification on clustered features
  3. Ensemble Approach: Combined insights from both techniques
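
One way to read "classification on clustered features" is to append each transaction's cluster id and its distance to the nearest centroid as extra columns before training a classifier. The sketch below shows that idea with a small pure-Python k-means. The choice of k-means, the deterministic farthest-point initialization, and the toy data are all assumptions; the README does not say which clustering algorithm the project uses.

```python
import math

def nearest(point, centroids):
    """Return (index, distance) of the closest centroid."""
    distances = [math.dist(point, c) for c in centroids]
    j = min(range(len(centroids)), key=distances.__getitem__)
    return j, distances[j]

def init_centroids(points, k):
    """Deterministic init: start at the first point, then repeatedly take
    the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, n_iter=20):
    centroids = init_centroids(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)[0]].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def add_cluster_features(points, centroids):
    """Append cluster id and distance-to-centroid to every row."""
    return [list(p) + list(nearest(p, centroids)) for p in points]

# Two tight, well-separated groups of toy rows (assumption).
group_a = [[1.0 + 0.1 * (i % 5), 1.0 + 0.1 * (i // 5)] for i in range(20)]
group_b = [[9.0 + 0.1 * (i % 5), 9.0 + 0.1 * (i // 5)] for i in range(20)]
points = group_a + group_b

centroids = kmeans(points, k=2)
features = add_cluster_features(points, centroids)
outlier = add_cluster_features([[5.0, 1.0]], centroids)[0]
print(outlier)
```

The distance column is the useful part for fraud scoring: a row that sits far from every centroid, like the outlier above, stands out even before a supervised model sees it.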

Production Features:

  • RESTful API with FastAPI
  • Docker containerization for deployment
  • Interactive Gradio UI for demos
  • Real-time fraud scoring

Technologies: Scikit-learn, FastAPI, Docker, Gradio

🎬 Movie Recommendation System - Implementation Details

Approach: Hybrid Recommendation System

Components:

  1. Content-Based Filtering: TF-IDF on movie metadata
  2. Collaborative Filtering: User-item interaction matrix
  3. Similarity Search: FAISS & ANNOY for efficient retrieval
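
The three components can be sketched together in a few lines. This is an illustrative toy, not the project's code: the titles, metadata, and ratings are invented, the TF-IDF variant is the plain textbook one (term frequency times `log(N / df)`), and a brute-force cosine scan stands in for the FAISS/ANNOY index that a real catalogue would need.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Sparse TF-IDF vectors (dict term -> weight) for whitespace-tokenized text."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
         for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(query, titles, content_vecs, rating_vecs, alpha=0.5):
    """Rank the other titles by alpha * content + (1 - alpha) * collaborative."""
    q = titles.index(query)
    scored = []
    for i in range(len(titles)):
        if i == q:
            continue
        content = cosine(content_vecs[q], content_vecs[i])
        collab = cosine(rating_vecs[q], rating_vecs[i])
        scored.append((alpha * content + (1 - alpha) * collab, titles[i]))
    return [title for _, title in sorted(scored, reverse=True)]

# Hypothetical catalogue: metadata text and per-item user ratings.
titles = ["Space Quest", "Galaxy Wars", "Love Letters"]
metadata = ["space adventure aliens", "space battle aliens", "romance drama paris"]
rating_vecs = [
    {"ana": 5, "ben": 4},   # item column of the user-item matrix
    {"ana": 4, "ben": 5},
    {"ben": 1, "cara": 5},
]

content_vecs = tfidf_vectors(metadata)
ranking = recommend("Space Quest", titles, content_vecs, rating_vecs)
print(ranking)
```

The `alpha` weight controls how much the metadata signal counts relative to the rating signal; swapping the brute-force loop for an approximate nearest-neighbour index changes the retrieval cost, not the scoring idea.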

Performance:

  • Sub-second response time for recommendations
  • Scalable to millions of items
  • API-ready deployment

Technologies: FAISS, ANNOY, FastAPI, Gradio


💼 Project Highlights & Achievements

Data Science Projects:

  • ✅ Churn Prediction (FinTech) - Achieved 0.76 Recall through SMOTE + Logistic Regression optimization
  • ✅ Fraud Detection System - Built production-ready fraud classifier with clustering + classification approach
  • ✅ Recommendation Engine - Implemented hybrid RecSys with FAISS/ANNOY for sub-second retrieval
  • ✅ Demand Forecasting - Created econometric optimization model for retail pricing strategy

Technical Skills:

  • 📊 End-to-end data science workflows with imbalanced data handling
  • 🤖 Model selection and hyperparameter optimization (RandomizedSearchCV, GridSearchCV)
  • 🔍 Advanced feature engineering and SMOTE oversampling
  • 📈 Model interpretability with SHAP and explainable AI
  • 🚀 Production deployment with MLflow tracking and Evidently monitoring
  • 📉 Drift detection and automated model retraining pipelines

Bootcamp & Competitions:

  • 🎯 EDVAI Data Science & MLOps Bootcamp
  • πŸ† AlixPartners 2025 Case Competition - Demand Forecasting Hackaton Competition
  • πŸ’Ό NoCountry Job Simulation - FinTech Churn Prediction Project

📈 Data Science Workflow

# My typical approach to DS projects
# (pseudocode: the helper functions stand for project-specific steps)

def data_science_project(problem, sources):
    """
    End-to-end data science methodology
    """
    # 1. Problem Understanding
    business_context = understand_business_problem(problem)
    success_metrics = define_kpis(business_context)

    # 2. Data Collection & Exploration
    data = collect_data(sources)
    insights = exploratory_analysis(data)

    # 3. Data Preparation
    clean_data = handle_missing_values(data)
    features, target = feature_engineering(clean_data, insights)
    X_train, X_test, y_train, y_test = split_and_scale(features, target)

    # 4. Modeling
    models = train_multiple_models(X_train, y_train)
    best_model = optimize_hyperparameters(models, success_metrics)
    results = evaluate_model(best_model, X_test, y_test, success_metrics)

    # 5. Deployment & Monitoring
    api = deploy_model(best_model)
    monitor_performance(api, baseline=results)

    return api

🀝 Let's Connect!

I'm always open to collaborating on data science projects, discussing new techniques, or connecting with fellow data enthusiasts and professionals!

LinkedIn GitHub Portfolio

📧 Open to opportunities in Data Science, ML Engineering, and Analytics roles


💭 "From data chaos to business clarity - one model at a time"


Thanks for visiting! ⭐ If you find my projects interesting, consider giving them a star!

Pinned

  1. Data-Science-Clustering (Public)

    5th Week Exercise from Bootcamp Edvai - Clustering Models and Cluster Analysis

    Jupyter Notebook 1

  2. Data-Science-Data-Preparation (Public)

    3rd Week Exercise from Bootcamp Edvai - Data Preparation

    Jupyter Notebook 1

  3. GameOfLife (Public)

    Programming Project: Conway's Game of Life

    Python

  4. Fraud-Detection-and-Prevention-Model (Public)

    Final Project for Edvai's Data Science & MLOps Bootcamp

    Jupyter Notebook 1

  5. Movie-Recommendation-Model (Public)

    Movie Recommendation Model - Personal Project

    Jupyter Notebook

  6. Prediccion-de-Churn-y-Retencion-de-Clientes-para-FinTech (Public)

    Data Science project carried out during the NoCountry job simulation

    Jupyter Notebook 2