End-to-end regression model predicting expected claim cost per member. Features are engineered from age/sex bands, procedure code vectors, provider risk scoring, geographic cost indices, and prior-period utilization. Holdout RMSE is 187 and R² is 0.89. SHAP provides both global feature importance and local, per-quote explanations. Deployed as a live, interactive Streamlit app.
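For readers less familiar with the two holdout metrics quoted above, here is a minimal sketch of how RMSE and R² are usually computed. The function names and the toy numbers are illustrative assumptions, not the project's code or data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction miss, in target units."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(y_true, y_pred):
    """Share of target variance explained, relative to always predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy holdout values, made up purely for illustration.
actual = [100.0, 250.0, 400.0, 550.0]
predicted = [120.0, 230.0, 410.0, 540.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

RMSE is in the same units as the target, so it reads as a typical per-member miss, while R² is unitless and compares the model against a constant mean prediction.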
View on GitHub →
Data Scientist
Predictive modeling, explainable ML, and production analytics in Python and SQL. MS Data Science (GPA 3.83) — Eastern University, 2024. BS Mathematics (GPA 3.54) — Ramapo College, 2021.
About
I am a working data scientist with a strong applied mathematics background. My focus is building models and workflows that produce reliable, interpretable outputs in production settings — not just in notebooks.
I care about evaluation rigor, reproducibility, and making results legible to stakeholders who need to act on them. I work primarily in Python and SQL, with hands-on experience shipping analytics pipelines, dashboards, and predictive models into real business processes.
Projects
Actuarial-style ratemaking pipeline on the freMTPL2 dataset (678K policies, 26K claims).
Poisson GLM for claim frequency, Gamma GLM for claim severity, combined into pure premium estimates.
Full EDA, feature preparation, and model diagnostics.
Structured into reusable src/ modules with reproducible notebooks.
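The frequency-severity combination described above can be sketched as follows. This is a minimal illustration that assumes both GLMs use a log link. The coefficient values, the `young_driver` feature, and the function names are hypothetical and are not the fitted freMTPL2 results.

```python
import math

# Hypothetical fitted coefficients on the log scale, for illustration only.
FREQ_COEF = {"intercept": -2.0, "young_driver": 0.4}  # Poisson GLM: claims per policy-year
SEV_COEF = {"intercept": 7.0, "young_driver": 0.1}    # Gamma GLM: average cost per claim

def linear_predictor(coef, features):
    """Intercept plus the weighted sum of the supplied feature values."""
    return coef["intercept"] + sum(coef[name] * value for name, value in features.items())

def pure_premium(features, exposure=1.0):
    """Expected claim cost = exposure x expected frequency x expected severity."""
    frequency = math.exp(linear_predictor(FREQ_COEF, features))  # inverse of log link
    severity = math.exp(linear_predictor(SEV_COEF, features))    # inverse of log link
    return exposure * frequency * severity

base = pure_premium({"young_driver": 0})
young = pure_premium({"young_driver": 1})
print(base, young, young / base)
```

With log links in both models, the effects are multiplicative: a feature's relativity on pure premium is the product of its frequency and severity relativities.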
Experience
Analytics Consultant / Data Specialist, Dental Network Analytics — Bridgewater, NJ
- Build Python and SQL pipelines that automate data ingestion, cleaning, and prep from claims and market sources, cutting manual reporting time by ~50%.
- Maintain Power BI, Tableau, and Qlik Sense dashboards surfacing network trends and pricing insights for business stakeholders.
- Transform raw claims and demographic data into structured inputs for pricing and actuarial review.
- Replace manual Excel workflows with reusable Python/SQL automation.
Underwriter II, National Accounts
- Analyzed claims, demographic, and plan data to price large group benefits accounts.
- Automated rate calculation and data prep in SQL and Python, supporting ~100 seasonal renewals.
- Collaborated with sales and actuarial teams on RFP strategy and alternative plan pricing.
Underwriting Analyst Associate — Remote
- Performed risk and pricing analysis on group insurance policies using statistical methods and data tools.
- Built SQL and pandas scripts to streamline quote processing, reducing cycle time by 1–2 days.
Contact
Open to data science roles, research collaborations, and interesting problems.