About: An interactive Streamlit application that predicts the required premium amount for a default loan from a user's answers to a series of questions.

Saravanan9698/Smart_Premium

🏆 Smart Insurance Premium Prediction

This project predicts insurance premium amounts using Machine Learning models.

It processes raw insurance data, applies feature engineering, and trains multiple regression models to find the best-performing one.


🚀 Key Features

Data Preprocessing

  • Handles missing values in both numerical and categorical columns.
  • Removes outliers using IQR (Interquartile Range).
  • Encodes categorical data using Label Encoding.
  • Feature Engineering: Converts raw numerical data into meaningful groups (e.g., Age Groups, Credit Score Categories).
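The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the code in Preprocess.py: the column names (`Age`, `Annual Income`, `Gender`), the age bin edges, and the `preprocess` helper are all assumptions made for the example, and pandas category codes stand in for scikit-learn's `LabelEncoder`.

```python
import numpy as np
import pandas as pd


def preprocess(df, numeric_cols, categorical_cols):
    """Hypothetical helper: impute, drop IQR outliers, bin Age, encode labels."""
    df = df.copy()

    # 1. Missing values: median for numeric columns, most frequent value for categorical ones.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # 2. IQR outlier removal: keep rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] for every numeric column.
    keep = pd.Series(True, index=df.index)
    for col in numeric_cols:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        keep &= df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    df = df[keep].reset_index(drop=True)

    # 3. Feature engineering: bucket Age into groups (bin edges are assumed, not from the project).
    df["Age Group"] = pd.cut(
        df["Age"],
        bins=[0, 25, 40, 60, 120],
        labels=["Young", "Adult", "Middle-aged", "Senior"],
    )

    # 4. Label encoding: replace each category with an integer code.
    for col in list(categorical_cols) + ["Age Group"]:
        df[col] = df[col].astype("category").cat.codes
    return df


# Tiny made-up dataset with missing values and one extreme income.
raw = pd.DataFrame({
    "Age": [22, 35, np.nan, 50, 41, 30],
    "Annual Income": [45000, 52000, 48000, np.nan, 61000, 5000000],
    "Gender": ["Male", "Female", None, "Female", "Male", "Female"],
})
clean = preprocess(raw, ["Age", "Annual Income"], ["Gender"])
print(clean)
```

In this sketch the outlier bounds are computed on the full dataset before any row is dropped, so the order of the numeric columns does not affect which rows survive.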

Machine Learning Model Training

  • Trains multiple models, including:
    • Linear Regression
    • Decision Tree Regressor
    • Random Forest Regressor
    • XGBoost Regressor
  • Uses Bayesian Optimization for hyperparameter tuning.
  • Evaluates models using:
    • Root Mean Squared Log Error (RMSLE)
    • Root Mean Squared Error (RMSE)
    • Mean Absolute Error (MAE)
    • R² Score
  • Saves the best model as best_model.pkl.
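A rough sketch of the train, evaluate, and save loop described above, assuming synthetic data and default hyperparameters. It is not the code in Model_Building.py: XGBoost and the Bayesian Optimization step are left out to keep the example dependency-light, the `evaluate` helper is hypothetical, and selecting the best model by lowest RMSLE is an assumption about the selection rule.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    mean_squared_log_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def evaluate(y_true, y_pred):
    """Hypothetical helper returning the four metrics listed above."""
    # RMSLE is undefined for negative values, so clip predictions for that metric only.
    non_negative = np.clip(y_pred, 0, None)
    return {
        "RMSLE": float(np.sqrt(mean_squared_log_error(y_true, non_negative))),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        "R2": float(r2_score(y_true, y_pred)),
    }


# Synthetic stand-in for Encoded_data.csv: three features and a positive target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 500 + 1000 * X[:, 0] + 300 * X[:, 1] ** 2 + rng.normal(0, 20, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit every candidate and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = evaluate(y_test, model.predict(X_test))

# Assumed selection rule: lowest RMSLE wins.
best_name = min(scores, key=lambda name: scores[name]["RMSLE"])

# Persist the winner; a temporary directory is used here instead of the project root.
model_path = Path(tempfile.mkdtemp()) / "best_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(models[best_name], f)

print(best_name, scores[best_name])
```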

Model Deployment

  • Loads the trained model and predicts insurance premiums on new data.
  • Uses MLflow to track predictions and store model artifacts.
  • Saves final predictions to Test_Predictions.csv.
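The inference step might look something like the sketch below. It is a simplified illustration rather than the contents of ML_Pipeline.py: MLflow tracking and preprocessor.pkl are omitted, a small linear model is trained inline only so that the example has a pickle to load, and the column names and the `predict_to_csv` helper are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.linear_model import LinearRegression

workdir = Path(tempfile.mkdtemp())
features = ["Age", "Annual Income"]  # assumed feature columns

# Stand-in for the training step: fit a tiny model and save it as best_model.pkl.
train = pd.DataFrame({
    "Age": [25, 40, 60, 35],
    "Annual Income": [30000, 50000, 80000, 45000],
    "Premium Amount": [800, 1100, 1600, 1000],
})
with open(workdir / "best_model.pkl", "wb") as f:
    pickle.dump(LinearRegression().fit(train[features], train["Premium Amount"]), f)


def predict_to_csv(model_path, new_data, feature_cols, out_path):
    """Hypothetical helper: load a pickled model, predict, and write a CSV."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    result = new_data.copy()
    result["Predicted Premium"] = model.predict(new_data[feature_cols])
    result.to_csv(out_path, index=False)
    return result


# Made-up rows standing in for the preprocessed test set.
new_customers = pd.DataFrame({"Age": [30, 55], "Annual Income": [40000, 70000]})
predictions = predict_to_csv(
    workdir / "best_model.pkl",
    new_customers,
    features,
    workdir / "Test_Predictions.csv",
)
print(predictions)
```

Only load pickle files that come from a trusted source, since unpickling can execute arbitrary code.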

🚀 Interactive Streamlit Web App

  • A user-friendly web interface for entering customer details and predicting insurance premiums.
  • Displays raw input and preprocessed data for debugging.
  • Shows real-time predictions based on trained models.

💻 Technology Stack

Programming Language

  • Python

Machine Learning Libraries

  • scikit-learn
  • XGBoost
  • Bayesian Optimization

Data Processing & Storage

  • Pandas, NumPy (for data handling and preprocessing)
  • Pickle (for model storage)

Deployment & Tracking

  • MLflow (for model logging and tracking)
  • Streamlit (for the web application)

📂 Project Structure

    📚 train.csv -> Raw training dataset
    📚 test.csv -> Raw test dataset
    📚 Cleaned_data.csv -> Data after cleaning & transformation
    📚 Encoded_data.csv -> Fully processed dataset for training
    📚 best_model.pkl -> Saved best model after training
    📚 preprocessor.pkl -> Data preprocessor for prediction
    📚 Test_Predictions.csv -> Predictions on new test data
    📚 ML_Pipeline.py -> Model inference pipeline
    📚 Model_Building.py -> Script for model training
    📚 Preprocess.py -> Script for data preprocessing
    📚 Streamlit_App.py -> Streamlit UI for predictions
    📚 README.md -> Project documentation
    
