Employee Attrition Prediction using XGBoost and Flask with SHAP-based Feature Insights


Miriyala Veera Ganesh

student


Project Overview

This project predicts employee attrition, that is, whether an employee is likely to leave the organization, using XGBoost, a gradient boosting algorithm. The model is trained on an HR dataset, and feature selection is used to highlight the top factors influencing attrition. Key steps include data preprocessing, encoding categorical variables, feature importance extraction, and model training with XGBoost.

The project emphasizes explainability through SHAP (SHapley Additive exPlanations) values, which are used to rank and visualize the top 10 features impacting employee turnover, such as OverTime, JobLevel, MaritalStatus, and TotalWorkingYears.

To make it interactive, the prediction system is deployed using Flask, allowing users to input employee details (such as job level, income, and overtime status) and instantly receive an attrition prediction. The app also displays visual insights derived from SHAP, helping HR managers understand why a particular prediction was made.

Tech Stack: Python, Flask, XGBoost, SHAP, Pandas, Scikit-learn, HTML/CSS

Key Features:
- ML-based employee attrition prediction
- SHAP visual explanation for top 10 contributing factors
- Flask web interface for live predictions
- Clean, modular code structure for scalability

This project demonstrates a complete data science pipeline, from data analysis and feature engineering to model deployment and visualization, aligning with real-world HR analytics use cases.
