Project Overview
Let's go into detail
EDA:
In my latest project, built on the World_layoffs dataset, I went beyond data cleaning and performed an Exploratory Data Analysis (EDA) to uncover trends and patterns in the data. Here’s a detailed technical description of the EDA operations I conducted:
1. Initial Data Exploration:
I started by examining the entire cleaned dataset to get an overview of the data structure and content. This step involved inspecting all records in the layoffs_staging2 table to familiarize myself with the data.
2. Basic Aggregations and Summaries:
I used aggregate functions to find key statistics such as the maximum and minimum values of total_laid_off and percentage_laid_off. This helped identify the scale of layoffs and the extent to which companies were affected.
By selecting records where percentage_laid_off equaled 1, I identified companies that laid off their entire workforce. Further, I ordered these companies by their funds_raised_millions to understand the financial context of these complete layoffs.
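The two queries described above can be sketched roughly as follows. This is not the project's exact code: to keep it self-contained it runs the SQL through Python's built-in sqlite3 module rather than MySQL, the rows are made up for illustration, the company column name is an assumption, and the descending sort on funds_raised_millions is a guess at the ordering used.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE layoffs_staging2 (
        company TEXT,
        total_laid_off INTEGER,
        percentage_laid_off REAL,
        funds_raised_millions REAL
    )
""")

# Made-up rows for illustration only; not taken from the real dataset.
rows = [
    ("Alpha", 120, 0.10, 500.0),
    ("Beta", 40, 1.00, 30.0),
    ("Gamma", 900, 0.25, 1200.0),
    ("Delta", 75, 1.00, 250.0),
]
conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?, ?)", rows)

# Scale of layoffs: largest and smallest values of both measures.
summary = conn.execute("""
    SELECT MAX(total_laid_off), MIN(total_laid_off),
           MAX(percentage_laid_off), MIN(percentage_laid_off)
    FROM layoffs_staging2
""").fetchone()

# Companies that laid off their entire workforce (percentage_laid_off = 1),
# ordered by funding. DESC is an assumed sort direction.
full_layoffs = conn.execute("""
    SELECT company, funds_raised_millions
    FROM layoffs_staging2
    WHERE percentage_laid_off = 1
    ORDER BY funds_raised_millions DESC
""").fetchall()

print(summary)
print(full_layoffs)
```

The same SELECT statements should carry over to MySQL unchanged, since they use only standard aggregate functions, WHERE, and ORDER BY.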
3. Group By Aggregations:
Largest Single Layoff Events: I identified the five largest single-day layoff events by sorting the records by total_laid_off in descending order and keeping the top 5.
Total Layoffs by Company: I aggregated layoffs by company to determine which companies had the most total layoffs across the dataset.
Total Layoffs by Location: Similarly, I aggregated layoffs by location to see which areas were most affected.
Total Layoffs by Country: Aggregating layoffs by country provided a macro view of the global impact.
Annual Layoff Trends: I grouped layoffs by year to observe trends over time.
Industry-Specific Layoffs: By grouping layoffs by industry, I identified which sectors were most affected.
Company Stage Analysis: Grouping layoffs by the company’s stage provided insights into how company maturity (e.g., startup vs. established company) influenced layoff numbers.
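The aggregations in this section all follow the same pattern, which can be sketched as below. Again this is an illustration under assumptions rather than the project's own code: it uses Python's sqlite3 module instead of MySQL, made-up rows, assumed column names (company, country, industry, date), and a hypothetical helper total_by. In MySQL the year would more naturally come from YEAR(`date`); substr is used here because it works on the text dates in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE layoffs_staging2 (
        company TEXT, country TEXT, industry TEXT,
        `date` TEXT, total_laid_off INTEGER
    )
""")

# Made-up rows for illustration only.
rows = [
    ("Alpha", "United States", "Retail", "2022-03-10", 100),
    ("Alpha", "United States", "Retail", "2023-01-15", 300),
    ("Beta", "India", "Finance", "2022-11-02", 250),
    ("Gamma", "United States", "Finance", "2023-02-20", 60),
]
conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?, ?, ?)", rows)

# Largest single layoff events: sort records, keep the top 5.
top_events = conn.execute("""
    SELECT company, `date`, total_laid_off
    FROM layoffs_staging2
    ORDER BY total_laid_off DESC
    LIMIT 5
""").fetchall()


def total_by(column):
    """Sum total_laid_off per value of one grouping column (hypothetical helper)."""
    # The column name is interpolated into the SQL, so restrict it to a fixed list.
    if column not in {"company", "country", "industry"}:
        raise ValueError(f"unsupported grouping column: {column}")
    return conn.execute(
        f"SELECT {column}, SUM(total_laid_off) AS total "
        f"FROM layoffs_staging2 GROUP BY {column} ORDER BY total DESC"
    ).fetchall()


# Annual trend: group by the year part of the date.
by_year = conn.execute("""
    SELECT substr(`date`, 1, 4) AS yr, SUM(total_laid_off) AS total
    FROM layoffs_staging2
    GROUP BY yr
    ORDER BY yr
""").fetchall()

print(top_events)
print(total_by("company"))
print(by_year)
```

Grouping by location or stage would work the same way once those columns are added to the table and to the helper's allow-list.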
4. Advanced Analysis with CTEs:
Yearly Layoffs by Company: I used Common Table Expressions (CTEs) to analyze layoffs by company on a yearly basis. This involved creating a ranking within each year to identify the top companies with the highest layoffs annually.
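A minimal sketch of the per-year ranking follows, with two chained CTEs and a window function. Several details are assumptions on my part: DENSE_RANK is one reasonable choice of ranking function (RANK or ROW_NUMBER would also fit the description), the cutoff of 2 is arbitrary, the CTE names and rows are made up, and the SQL runs through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions) instead of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE layoffs_staging2 (company TEXT, `date` TEXT, total_laid_off INTEGER)"
)

# Made-up rows for illustration only.
rows = [
    ("Alpha", "2022-03-10", 100),
    ("Alpha", "2022-09-01", 150),
    ("Beta", "2022-11-02", 200),
    ("Gamma", "2022-05-05", 50),
    ("Alpha", "2023-01-15", 300),
    ("Beta", "2023-04-01", 400),
]
conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?)", rows)

ranked = conn.execute("""
    WITH company_year AS (
        -- Step 1: total layoffs per company per year.
        SELECT company,
               substr(`date`, 1, 4) AS yr,
               SUM(total_laid_off) AS total
        FROM layoffs_staging2
        GROUP BY company, yr
    ),
    company_year_rank AS (
        -- Step 2: rank companies within each year, largest total first.
        SELECT company, yr, total,
               DENSE_RANK() OVER (PARTITION BY yr ORDER BY total DESC) AS ranking
        FROM company_year
    )
    -- Step 3: keep only the top-ranked companies of each year.
    SELECT yr, company, total, ranking
    FROM company_year_rank
    WHERE ranking <= 2
    ORDER BY yr, ranking
""").fetchall()

for row in ranked:
    print(row)
```

The second CTE is needed because a window function's result cannot be filtered in the WHERE clause of the same SELECT that computes it.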
Rolling Total of Layoffs: I calculated the rolling total of layoffs per month. This required converting the date to a year-month format and then summing layoffs for each month. A CTE was used to facilitate the rolling sum calculation, providing a continuous view of layoff trends over time.
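The rolling total can be sketched along the same lines. As before, this is an illustration and not the project's exact query: the rows are made up, the CTE name monthly and the date column name are assumptions, and the SQL runs through Python's sqlite3 module (SQLite 3.25 or later) rather than MySQL. The year-month key is taken as the first seven characters of a YYYY-MM-DD date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE layoffs_staging2 (company TEXT, `date` TEXT, total_laid_off INTEGER)"
)

# Made-up rows for illustration only.
rows = [
    ("Alpha", "2022-03-10", 100),
    ("Beta", "2022-03-25", 50),
    ("Gamma", "2022-04-02", 200),
    ("Alpha", "2022-06-15", 25),
]
conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?)", rows)

rolling = conn.execute("""
    WITH monthly AS (
        -- Collapse each date to its year-month and sum layoffs per month.
        SELECT substr(`date`, 1, 7) AS month,
               SUM(total_laid_off) AS total
        FROM layoffs_staging2
        WHERE `date` IS NOT NULL
        GROUP BY month
    )
    -- Running sum over the months in chronological order.
    SELECT month, total,
           SUM(total) OVER (ORDER BY month) AS rolling_total
    FROM monthly
    ORDER BY month
""").fetchall()

for row in rolling:
    print(row)
```

Months with no layoff records simply do not appear in the output; the running total carries over them unchanged.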
Together, these operations gave a detailed picture of the layoff data and revealed its key trends and patterns. The project drew on window functions, CTEs, and a range of aggregation methods, and it demonstrates my proficiency in SQL and my ability to turn raw data into actionable insights.
In my latest project titled "World Layoffs Exploratory Data Analysis (EDA)," I extended my efforts beyond data cleaning to perform a thorough EDA on the World_layoffs dataset using MySQL. This involved an initial data exploration to understand the dataset's structure, followed by various aggregations to uncover key statistics and trends. I identified the largest single layoff events and used GROUP BY aggregations to total layoffs by company, location, and country, as well as by year, industry, and company stage. Advanced analysis using Common Table Expressions (CTEs) allowed me to rank companies by yearly layoffs and calculate rolling totals of layoffs per month. This analysis demonstrates my proficiency in SQL and my ability to extract actionable insights from complex datasets.