Transforming data into actionable insights through advanced analytics and machine learning
Data Scientist passionate about solving complex business problems through data-driven approaches. My work spans the entire data science lifecycle: from exploratory analysis and feature engineering to model development, deployment, and monitoring in production environments.
```python
class DataScientist:
    def __init__(self):
        self.name = "Iñaki Rosello"
        self.role = "Data Scientist & CS Student"
        self.location = "Buenos Aires, Argentina"

    def expertise(self):
        return {
            "analytics": ["EDA", "Statistical Analysis", "Data Visualization"],
            "modeling": ["Classification", "Clustering", "Recommendation Systems"],
            "mlops": ["Model Deployment", "Monitoring", "CI/CD Pipelines"],
            "business": ["Fraud Detection", "Churn Prediction", "Demand Forecasting"]
        }

    def currently_learning(self):
        return ["Deep Learning", "Advanced MLOps", "Price Optimization"]
```
- **Customer Churn Prediction with Full MLOps Pipeline**: Production ML system for FinTech churn prediction optimized for Recall (0.76). Complete pipeline: SMOTE balancing, model comparison (RF, XGBoost, LogReg), hyperparameter tuning, SHAP explainability, MLflow tracking, Evidently drift detection, FastAPI + Docker deployment. Stack: Scikit-learn, MLflow, Evidently, SHAP, FastAPI, Docker
- **Retail Demand Forecasting & Price Optimization**: Complete data science pipeline for demand estimation and optimal pricing strategy. Integrates econometric analysis (price elasticity) with predictive modeling. Stack: Python, Pandas, Scikit-learn, Optimization
- **E-commerce Fraud Detection System**: Final project for the EDVAI Bootcamp. Production-ready fraud detection using clustering + classification. Includes an API, Docker containerization, and a Gradio UI. Stack: Scikit-learn, FastAPI, Docker, Gradio
- **Hybrid Recommendation Engine**: Personal project exploring recommendation systems. Implements content-based + collaborative filtering with efficient similarity search using FAISS & ANNOY. Stack: FAISS, ANNOY, FastAPI, Gradio
- Computer Science - Universidad de Buenos Aires (currently studying)
- Data Science & MLOps Bootcamp - EDVAI (completed)
- AlixPartners Case Competition 2025 - Participant
- NoCountry Simulation - Data Science Team Member
- Continuous Learning: Deep Learning, Advanced Analytics (EDA, Data Preparation), Production ML
Current Learning Path:
- Deep Learning fundamentals and advanced architectures
- Time series forecasting and demand prediction
- Production-grade MLOps practices and automation
- Price optimization and econometric modeling
- Scalable deployment with Docker and cloud services
Core Competencies:
- Analytics: Exploratory Analysis, Statistical Modeling, Business Intelligence
- Machine Learning: Classification, Clustering, Recommendation Systems, Fraud Detection
- MLOps: Model Deployment, Monitoring & Drift Detection, CI/CD Pipelines, Containerization
- Business Applications: Churn Prediction, Demand Forecasting, Price Optimization
FinTech Churn Prediction - Technical Overview
Business Context: Customer retention prediction for a FinTech platform
Challenge: Highly imbalanced dataset (80% no-churn, 20% churn)
Optimization Goal: Maximize Recall to minimize false negatives (missed churners)
Data Pipeline:
- Preprocessing: Feature engineering on `cleaned_data.csv`
- Scaling: StandardScaler for linear models
- Balancing: SMOTE oversampling on training set
- Split: 80/20 train-test with stratification
Model Development:
| Model | Hyperparameter Tuning | Recall | F1-Score | ROC AUC |
|---|---|---|---|---|
| Random Forest | RandomizedSearchCV → GridSearchCV | 0.66 | 0.57 | 0.83 |
| XGBoost | RandomizedSearchCV → GridSearchCV | 0.55 | 0.59 | 0.84 |
| Logistic Regression (winner) | RandomizedSearchCV → GridSearchCV | 0.76 | 0.56 | 0.84 |
Winner: Logistic Regression with class_weight='balanced'
Reason: Highest Recall (0.76), meeting business requirement of catching churners
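The coarse-to-fine tuning described in the table (RandomizedSearchCV for a wide sweep, then GridSearchCV around the best candidate) can be sketched like this for the winning model; the data, parameter ranges, and `C` grid are illustrative assumptions, not the project's actual search space:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

# Coarse pass: random search over a wide C range, optimizing Recall
coarse = RandomizedSearchCV(
    LogisticRegression(class_weight="balanced", max_iter=1000),
    param_distributions={"C": loguniform(1e-3, 1e2)},
    n_iter=10, scoring="recall", cv=3, random_state=0)
coarse.fit(X, y)

# Fine pass: grid search in the neighborhood of the best coarse C
best_c = coarse.best_params_["C"]
fine = GridSearchCV(
    LogisticRegression(class_weight="balanced", max_iter=1000),
    param_grid={"C": [best_c / 2, best_c, best_c * 2]},
    scoring="recall", cv=3)
fine.fit(X, y)
print(fine.best_params_, round(fine.best_score_, 2))
```

Setting `scoring="recall"` makes both searches select for the business metric directly, rather than defaulting to accuracy.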
MLOps Implementation:
- Experiment Tracking: MLflow for all runs, parameters, and metrics
- Model Explainability: SHAP for feature importance analysis
- Model Artifacts: Serialized model + scaler with pickle
- Monitoring: Evidently AI for drift detection
- Deployment: FastAPI + Docker containerization
- CI/CD: Automated retraining pipeline
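Of the artifacts listed above, the pickle serialization step is the simplest to illustrate. A minimal sketch (synthetic data; the real project additionally logs runs to MLflow and monitors with Evidently): the scaler and model are bundled into one artifact so the serving API cannot load mismatched versions of the two.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, random_state=1)

scaler = StandardScaler().fit(X)
model = LogisticRegression(class_weight="balanced").fit(scaler.transform(X), y)

# Serialize scaler and model together as a single artifact
artifact = pickle.dumps({"scaler": scaler, "model": model})

# At serving time (e.g. inside the FastAPI app), restore and predict
restored = pickle.loads(artifact)
preds = restored["model"].predict(restored["scaler"].transform(X))
```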
Key Learnings:
- SMOTE effectively handled class imbalance
- Linear models outperformed tree-based models for this use case
- Recall optimization crucial for business impact
- SHAP provided actionable insights for stakeholders
Fraud Detection System - Architecture & Approach
Project Type: EDVAI Bootcamp Final Project
Domain: E-commerce Transaction Fraud Detection
Methodology:
- Unsupervised Learning: Clustering to identify fraud patterns
- Supervised Learning: Classification on clustered features
- Ensemble Approach: Combined insights from both techniques
Production Features:
- RESTful API with FastAPI
- Docker containerization for deployment
- Interactive Gradio UI for demos
- Real-time fraud scoring
Technologies: Scikit-learn, FastAPI, Docker, Gradio
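One common way to combine the unsupervised and supervised steps described above is to feed cluster assignments into the classifier as an extra feature. The sketch below assumes that pattern (KMeans cluster ids appended to the feature matrix) on synthetic transactions; the project's actual feature set and models may differ:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic transactions with rare fraud labels (~5%)
X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.95, 0.05], random_state=7)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=7)

# Unsupervised step: cluster transactions to surface behavioral patterns
kmeans = KMeans(n_clusters=5, n_init=10, random_state=7).fit(X_train)

# Supervised step: append each transaction's cluster id as a feature
X_train_aug = np.column_stack([X_train, kmeans.predict(X_train)])
X_test_aug = np.column_stack([X_test, kmeans.predict(X_test)])

clf = RandomForestClassifier(random_state=7).fit(X_train_aug, y_train)
fraud_scores = clf.predict_proba(X_test_aug)[:, 1]  # per-transaction score
```

The probability output doubles as the real-time fraud score served by the API.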
Movie Recommendation System - Implementation Details
Approach: Hybrid Recommendation System
Components:
- Content-Based Filtering: TF-IDF on movie metadata
- Collaborative Filtering: User-item interaction matrix
- Similarity Search: FAISS & ANNOY for efficient retrieval
Performance:
- Sub-second response time for recommendations
- Scalable to millions of items
- API-ready deployment
Technologies: FAISS, ANNOY, FastAPI, Gradio
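The content-based half of the hybrid can be sketched as TF-IDF vectors over movie metadata plus a cosine nearest-neighbor lookup. The titles below are made up, and scikit-learn's exact `NearestNeighbors` stands in here for the approximate-NN indexes (FAISS/ANNOY) the project uses at scale:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Toy movie metadata (hypothetical)
movies = [
    "sci-fi space adventure robots",
    "romantic comedy wedding friends",
    "space opera aliens starship",
    "crime thriller detective heist",
]

# Content-based filtering: TF-IDF vectors over metadata
tfidf = TfidfVectorizer().fit_transform(movies)

# FAISS/ANNOY fill this role in production; exact cosine search here
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(tfidf)
dist, idx = nn.kneighbors(tfidf[0])
print(idx[0])  # nearest items to movie 0: itself, then the other space movie
```

Swapping the exact search for a FAISS or ANNOY index changes only the retrieval step, which is what keeps response times sub-second as the catalog grows.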
Data Science Projects:
- Churn Prediction (FinTech) - Achieved 0.76 Recall through SMOTE + Logistic Regression optimization
- Fraud Detection System - Built production-ready fraud classifier with clustering + classification approach
- Recommendation Engine - Implemented hybrid RecSys with FAISS/ANNOY for sub-second retrieval
- Demand Forecasting - Created econometric optimization model for retail pricing strategy
Technical Skills:
- End-to-end data science workflows with imbalanced data handling
- Model selection and hyperparameter optimization (RandomizedSearchCV, GridSearchCV)
- Advanced feature engineering and SMOTE oversampling
- Model interpretability with SHAP and explainable AI
- Production deployment with MLflow tracking and Evidently monitoring
- Drift detection and automated model retraining pipelines
Bootcamp & Competitions:
- EDVAI Data Science & MLOps Bootcamp
- AlixPartners 2025 Case Competition - Demand Forecasting Hackathon
- NoCountry Job Simulation - FinTech Churn Prediction Project
```python
# My typical approach to DS projects
def data_science_project(problem):
    """End-to-end data science methodology."""
    # 1. Problem understanding
    business_context = understand_business_problem(problem)
    success_metrics = define_kpis(business_context)

    # 2. Data collection & exploration
    data = collect_data(sources)
    insights = exploratory_analysis(data)

    # 3. Data preparation
    clean_data = handle_missing_values(data)
    features = feature_engineering(clean_data)
    X_train, X_test = split_and_scale(features)

    # 4. Modeling
    models = train_multiple_models(X_train)
    best_model = optimize_hyperparameters(models)
    results = evaluate_model(best_model, X_test)

    # 5. Deployment & monitoring
    api = deploy_model(best_model)
    monitor_performance(api)

    return production_ready_solution
```

I'm always open to collaborating on data science projects, discussing new techniques, or connecting with fellow data enthusiasts and professionals!
