Health Insurance Cost Prediction

This project covers data visualization, feature engineering, and regression modeling to predict insurance costs from personal medical expenses billed by a health insurance company.


Data visualization was used to explore the relationships between health insurance costs and factors such as age, health status (body mass index, smoking), gender, family size, and region.
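As a rough illustration of this kind of exploration, the sketch below summarizes costs by smoking status and checks how the numeric factors correlate with cost. It runs on synthetic stand-in data: the column names (`age`, `bmi`, `children`, `smoker`, `region`, `charges`) and the cost formula are assumptions made for the example, not the project's actual dataset or findings.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names and values are assumptions,
# not the project's real dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30, 6, size=n).round(1),
    "children": rng.integers(0, 5, size=n),
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),
    "region": rng.choice(
        ["northeast", "northwest", "southeast", "southwest"], size=n
    ),
})

# Made-up cost formula, used only so the summaries have something to show.
df["charges"] = (
    250 * df["age"]
    + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes")
    + rng.normal(0, 2000, size=n)
)

# Average cost per smoking status.
mean_by_smoker = df.groupby("smoker")["charges"].mean()

# Pearson correlation of each numeric factor with cost.
corr_with_charges = (
    df[["age", "bmi", "children", "charges"]].corr()["charges"].drop("charges")
)

print(mean_by_smoker)
print(corr_with_charges)
```

The same grouped summaries can be passed to a plotting library to produce bar charts or scatter plots.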

Based on the findings from data exploration, I built several regression models to predict insurance costs: Linear Regression, Random Forest, and Polynomial Regression. For each model I determined the intercept and coefficients where the model defines them, along with evaluation metrics that allowed comparing model fit.
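A minimal sketch of such a comparison, assuming scikit-learn and synthetic data, might look like the following. The features, the cost formula, the hyperparameters (`degree=2`, `n_estimators=100`), and the choice of R² and RMSE as metrics are all assumptions for illustration; they are not the project's actual setup or results. Note that a random forest has no intercept or coefficients, so those are printed only for the linear model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumed features: age, bmi, smoker flag).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 6, size=n)
smoker = rng.integers(0, 2, size=n)
X = np.column_stack([age, bmi, smoker])

# Made-up cost formula with a smoker-by-BMI interaction.
y = 250 * age + 300 * bmi + 800 * smoker * bmi + rng.normal(0, 2000, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear": LinearRegression(),
    # Polynomial regression: degree-2 feature expansion, then a linear fit.
    "polynomial": make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False), LinearRegression()
    ),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
    }
    print(f"{name}: R2={results[name]['r2']:.3f}, RMSE={results[name]['rmse']:.0f}")

# Intercept and coefficients exist only for the linear-type models.
linear = models["linear"]
print("intercept:", linear.intercept_)
print("coefficients:", linear.coef_)
```

Scoring every model on the same held-out split keeps the metrics directly comparable across models.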