Data Scientist

Predictive modeling, explainable ML, and production analytics in Python and SQL. MS Data Science (GPA 3.83) — Eastern University, 2024. BS Mathematics (GPA 3.54) — Ramapo College, 2021.

Languages: Python · SQL · R
Modeling: XGBoost · GLMs · scikit-learn · statsmodels
Explainability / Tuning: SHAP · Optuna
BI / Reporting: Power BI · Tableau · Qlik Sense
Infra: pandas · NumPy · Git · AWS S3/EC2

About

I am a working data scientist with a strong applied mathematics background. My focus is building models and workflows that produce reliable, interpretable outputs in production settings — not just in notebooks.

I care about evaluation rigor, reproducibility, and making results legible to stakeholders who need to act on them. I work primarily in Python and SQL, with hands-on experience shipping analytics pipelines, dashboards, and predictive models into real business processes.

Projects

Dental Claim Cost Pricing with Explainable XGBoost

Python · XGBoost · SHAP · Optuna · Streamlit

End-to-end regression model predicting expected claim cost per member. Engineered features across age/sex bands, procedure code vectors, provider risk scoring, geographic cost indices, and prior-period utilization. The model reaches RMSE 187 and R² 0.89 on the holdout set. SHAP provides both global feature importance and local, per-quote explanations. Deployed as a live, interactive Streamlit app.
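To illustrate what a local, per-quote explanation means, here is a minimal sketch that computes exact Shapley values against a single baseline quote. It is not the project's code and does not use the shap library: `toy_cost_model`, the feature names, and all numbers are hypothetical, and SHAP itself typically averages over a background dataset rather than one baseline.

```python
from itertools import permutations


def toy_cost_model(x):
    # Hypothetical stand-in for a trained cost model (not the real XGBoost model).
    return (100.0 + 2.0 * x["age"] + 50.0 * x["prior_claims"]
            + 0.5 * x["age"] * x["prior_claims"])


def exact_shapley(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    features = list(instance)
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)      # start from the baseline quote
        prev = model(current)
        for f in order:
            current[f] = instance[f]  # switch one feature to the quoted value
            new = model(current)
            contrib[f] += new - prev
            prev = new
    return {f: total / len(orderings) for f, total in contrib.items()}


baseline = {"age": 40, "prior_claims": 1}  # hypothetical reference member
quote = {"age": 50, "prior_claims": 3}     # hypothetical member being priced
phi = exact_shapley(toy_cost_model, quote, baseline)
```

The contributions in `phi` sum to the difference between the quoted prediction and the baseline prediction, which is the additivity property that makes per-quote explanations easy to present to stakeholders.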

View on GitHub →

GLM Pricing — French Motor Third-Party Liability

Python · statsmodels · GLM · Poisson · Gamma

Actuarial-style ratemaking pipeline on the freMTPL2 dataset (678K policies, 26K claims). Poisson GLM for claim frequency, Gamma GLM for claim severity, combined into pure premium estimates. Full EDA, feature preparation, and model diagnostics. Structured into reusable src/ modules with reproducible notebooks.
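The frequency and severity combination can be sketched on a toy portfolio with one rating factor. This is an illustration under stated assumptions, not the repository's code: the rows and the `pure_premium_by_level` helper are hypothetical. With a single categorical factor, the fitted means of a log-link Poisson GLM (with an exposure offset) and of a Gamma GLM weighted by claim count reduce to the cell averages computed here, so no GLM library is needed for the sketch.

```python
from collections import defaultdict

# Hypothetical rows: (area, exposure in policy-years, claim count, total claim amount)
policies = [
    ("urban", 1.0, 1, 1200.0),
    ("urban", 0.5, 0, 0.0),
    ("urban", 1.0, 1, 800.0),
    ("rural", 1.0, 0, 0.0),
    ("rural", 1.0, 1, 500.0),
    ("rural", 0.5, 0, 0.0),
]


def pure_premium_by_level(rows):
    """Return frequency, severity, and pure premium per rating-factor level."""
    exposure = defaultdict(float)
    claims = defaultdict(int)
    amount = defaultdict(float)
    for level, expo, n_claims, total in rows:
        exposure[level] += expo
        claims[level] += n_claims
        amount[level] += total

    result = {}
    for level in exposure:
        frequency = claims[level] / exposure[level]  # claims per policy-year
        # Severity is the average cost per claim; defined as 0 if no claims occurred.
        severity = amount[level] / claims[level] if claims[level] else 0.0
        result[level] = {
            "frequency": frequency,
            "severity": severity,
            "pure_premium": frequency * severity,    # expected cost per policy-year
        }
    return result


premiums = pure_premium_by_level(policies)
```

In a full pipeline with several rating factors the cell averages no longer suffice, and the two components would be fitted as separate GLMs before multiplying their predictions.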

View on GitHub →

Experience

MetLife · Jun 2024 – Present

Analytics Consultant / Data Specialist, Dental Network Analytics — Bridgewater, NJ

  • Build Python and SQL pipelines that automate data ingestion, cleaning, and prep from claims and market sources, cutting manual reporting time by ~50%.
  • Maintain Power BI, Tableau, and Qlik Sense dashboards surfacing network trends and pricing insights for business stakeholders.
  • Transform raw claims and demographic data into structured inputs for pricing and actuarial review.
  • Replace manual Excel workflows with reusable Python/SQL automation.

MetLife · Oct 2022 – Jun 2024

Underwriter II, National Accounts

  • Analyzed claims, demographic, and plan data to price large group benefits accounts.
  • Automated rate calculation and data prep in SQL and Python, supporting ~100 seasonal renewals.
  • Collaborated with sales and actuarial teams on RFP strategy and alternative plan pricing.

Aetna / CVS Health · Jul 2021 – Oct 2022

Underwriting Analyst Associate — Remote

  • Performed risk and pricing analysis on group insurance policies using statistical methods and data tools.
  • Built SQL and pandas scripts to streamline quote processing, reducing cycle time by 1–2 days.

Contact

Open to data science roles, research collaborations, and interesting problems.