juniorcl/README.md

👋 Hi, I'm Clébio Júnior

Data Scientist with 4+ years of experience building machine learning and NLP solutions in production, focused on solving real-world business problems in risk, credit, legal, and customer analytics.

I specialize in working with structured and unstructured data, developing predictive models, extracting insights from text and documents, and translating data into actionable decisions with measurable business impact.

I have hands-on experience with libraries such as scikit-learn, spaCy, pdfplumber, pytesseract, and pandas, as well as techniques like NLP, clustering, and supervised learning.

Get my resume

Resume in Portuguese | Resume in English

Connect with me

LinkedIn | Medium | Kaggle | Gmail | DEV

👨‍💻 About me

  • 🎓 Master's degree in Natural Sciences (UENF) and Bachelor's degree in Physics (IFF)
  • 📊 Strong background in predictive modeling, NLP, and customer analytics
  • 🛠️ Skilled with Python, scikit-learn, spaCy, pandas, pdfplumber, pytesseract
  • 🗂️ Hands-on experience with unstructured data extraction from PDFs and images
  • ✍️ I share technical content on Medium

💼 Professional Experience

Data Scientist | Vert Analytics (Oct/2024 – Present, Remote)

  • Automated legal opinion defense recommendations using NLP and FAISS.
  • Developed solutions for unstructured data extraction (PDFs and images) using pdfplumber, Tesseract OCR, and regex.
  • Applied NLP (spaCy) to analyze social media comments, identifying customer concerns and dissatisfaction patterns.

Data Scientist | Datarisk (Jan/2022 – Aug/2024, Remote)

  • Built credit scoring, sales forecasting, and customer segmentation models using machine learning and clustering.
  • Developed predictive models for customer behavior (default risk, plan upgrade likelihood, job instability).
  • Delivered insights supporting strategic decision-making and risk reduction.

Data Scientist | Be.X! (Mar/2021 – Jan/2022, Remote)

  • Processed structured and unstructured data using regex and data cleaning techniques.
  • Implemented outlier detection algorithms based on business rules for risk mitigation.
  • Created ML models for delivery delay prediction, improving logistics operations.

📌 Featured Topics & Interests

  • Applied Machine Learning in production
  • Natural Language Processing (NLP)
  • Unstructured data extraction and analysis
  • Credit risk and customer behavior modeling
  • Turning data into actionable business insights

Pinned

  1. data-science-toolkit Public

    A repository of helper functions for data science projects.

    Jupyter Notebook

  2. webscraping-comment-tripadvisor Public

    A study project that builds a web scraper to collect information about comments on the TripAdvisor site.

    Python 2

  3. transaction-fraud-detection Public

    A data science project to predict whether a transaction is fraudulent.

    Jupyter Notebook 208 101

  4. churn-prediction Public

    A data science project focused on predicting whether a customer will churn. Churn refers to customers who stop using a company's product or service.

    Jupyter Notebook 3 4

  5. olist-delivery-forecast Public

    Jupyter Notebook 1

  6. lmaoclost/Machine-Health Public

    A study project that predicts disease using machine learning, Node.js, and data science.

    TypeScript 5