Skill–Salary Correlation Study

Personal 📊 Data-Driven

An analytical exploration of how professional experience and technical skills influence salary levels in the data and analytics industry. This project combines data cleaning, visualization, and regression modeling to uncover key insights that guide career and skill development strategies.

Rakesh Malash

Aspiring Data Scientist | Student | Tech Learner

Project Overview

This project investigates the correlation between data-related skills, years of experience, and salary using a structured analytical workflow. Key objectives included:

- Cleaning and normalizing salary datasets from multiple sources.
- Mapping experience levels to numeric values for quantitative analysis.
- Extracting and flagging in-demand skills such as Python, SQL, Tableau, and R.
- Conducting exploratory data analysis (EDA) to visualize salary patterns.
- Building a baseline regression model to estimate how experience and technical skills impact compensation levels.

The analysis found that experience has a strong positive correlation with salary, while proficiency in Python, SQL, and Tableau consistently aligns with higher pay. The project deliverables include a cleaned dataset, regression model outputs, visualizations, and an executive summary report, all packaged as a ZIP submission.
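The experience mapping and skill flagging described in the objectives could look roughly like the sketch below. The experience labels, their numeric values, and the helper names are hypothetical illustrations, not the project's actual code; only the skill names come from the overview.

```python
import re

# Hypothetical mapping from experience labels to approximate years;
# the labels and values are illustrative, not taken from the project.
EXPERIENCE_YEARS = {"entry": 1, "mid": 4, "senior": 8, "lead": 12}

# Skills named in the project overview.
SKILLS = ["Python", "SQL", "Tableau", "R"]


def experience_to_years(label):
    """Return a numeric value for an experience label, or None if unknown."""
    return EXPERIENCE_YEARS.get(label.strip().lower())


def flag_skills(text):
    """Return a 0/1 flag per skill using whole-word, case-insensitive matching."""
    flags = {}
    for skill in SKILLS:
        pattern = r"\b" + re.escape(skill) + r"\b"
        found = re.search(pattern, text, flags=re.IGNORECASE) is not None
        flags[skill] = int(found)
    return flags


row = {"experience": "Senior", "skills": "Python, SQL, Power BI"}
features = {"years": experience_to_years(row["experience"]), **flag_skills(row["skills"])}
print(features)
```

Whole-word matching matters most for a one-letter skill like R: a plain substring test would flag it in almost any text, for example in "Power BI".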

Key Features

1. Data Cleaning & Preparation

Removed inconsistencies, standardized salary to INR, and handled missing values for reliable analysis.
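As a minimal sketch of this step, the snippet below converts salaries to INR and fills missing values with the median. The exchange rates, field names, and the choice of median imputation are assumptions made for illustration; the write-up does not say which rates or imputation strategy were used.

```python
from statistics import median

# Hypothetical conversion rates to INR, for illustration only.
RATES_TO_INR = {"INR": 1.0, "USD": 83.0, "EUR": 90.0}


def to_inr(amount, currency):
    """Convert an amount to INR; return None if the value or rate is missing."""
    if amount is None or currency not in RATES_TO_INR:
        return None
    return amount * RATES_TO_INR[currency]


def clean_salaries(records):
    """Standardize salaries to INR and fill gaps with the median of known values."""
    converted = [to_inr(r.get("salary"), r.get("currency")) for r in records]
    known = [v for v in converted if v is not None]
    if not known:
        return converted  # nothing to impute from
    fill = median(known)
    return [v if v is not None else fill for v in converted]


records = [
    {"salary": 1200000, "currency": "INR"},
    {"salary": 20000, "currency": "USD"},
    {"salary": None, "currency": "INR"},
]
print(clean_salaries(records))  # [1200000.0, 1660000.0, 1430000.0]
```

Median imputation is less sensitive to a few very high salaries than the mean, which is one reason it is a common default for pay data.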

2. Exploratory Data Analysis (EDA)

Generated detailed visualizations, including average salary by skill, experience vs. salary, and correlation matrices.
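The numbers behind two of these views, average salary by skill and the experience–salary correlation, can be computed as in the sketch below. The rows are made-up toy data (salary in lakh INR) and the function names are hypothetical; they are not the project's dataset or code.

```python
from statistics import mean

# Toy rows for illustration only; salary is in lakh INR.
rows = [
    {"years": 1, "python": 0, "salary": 4.0},
    {"years": 3, "python": 1, "salary": 8.0},
    {"years": 5, "python": 0, "salary": 9.0},
    {"years": 8, "python": 1, "salary": 16.0},
]


def mean_salary_by_flag(rows, flag):
    """Average salary for rows without (0) and with (1) the given skill flag."""
    groups = {0: [], 1: []}
    for r in rows:
        groups[r[flag]].append(r["salary"])
    return {k: mean(v) for k, v in groups.items() if v}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


by_python = mean_salary_by_flag(rows, "python")
r_exp = pearson([r["years"] for r in rows], [r["salary"] for r in rows])
print(by_python)  # {0: 6.5, 1: 12.0}
print(round(r_exp, 3))
```

A correlation matrix is the same `pearson` calculation repeated for every pair of numeric columns.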

3. Regression Modeling

Built a baseline linear regression model to quantify the effect of experience and technical skills on salary levels.
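A baseline of this kind can be sketched with ordinary least squares in NumPy. The write-up does not name the library or the exact feature set, so the features (years, Python flag, SQL flag) and the synthetic data here are assumptions chosen so that the recovered coefficients are easy to check.

```python
import numpy as np

# Toy features: years of experience, Python flag, SQL flag (illustrative only).
X = np.array(
    [
        [1, 0, 0],
        [2, 1, 0],
        [3, 0, 1],
        [5, 1, 1],
        [7, 0, 0],
        [8, 1, 1],
    ],
    dtype=float,
)

# Synthetic salaries (lakh INR) generated from a known linear rule,
# so the fit should recover these coefficients exactly.
y = 3.0 + 1.5 * X[:, 0] + 2.0 * X[:, 1] + 1.0 * X[:, 2]

# Add an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

intercept, per_year, python_premium, sql_premium = coef
print(np.round(coef, 3))
```

Each coefficient reads as the estimated change in salary for a one-unit change in that feature with the others held fixed, for example the premium associated with a skill flag. On real data the fit will not be exact, and the coefficients describe association rather than causation.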

4. Visualization & Insights

Created insightful plots highlighting patterns and relationships between variables.

5. Final Deliverables

Packaged a professional report (PDF), annotated Jupyter notebook, and supporting datasets for submission.
