Projects
The following are some of the projects that I have done.
Greenhouse Gas (GHG) Air Emissions
Project Description:
This project assesses the rise in greenhouse gas emissions into the atmosphere, a driver of climate change. The primary objective is to investigate how emission levels vary across regions.
Project Summary:
The analysis presents the annual quantity of greenhouse gas emissions, highlighting the year-on-year increase that contributes to global warming and climate change. It also identifies regional disparities: emissions vary among regions and intensify over the years.
Tools Used:
Microsoft SQL, Power BI
Dataset Used
The dataset contained 16 rows and 514 columns. We reviewed emissions of all gas types over an 11-year period, from 2010 to 2021. The data was collected from the International Monetary Fund (IMF) website. Data source: https://climatedata.imf.org/datasets/c8579761f19740dfbe4418b205654ddf/explore
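As a rough illustration of the year-on-year comparison described in the summary, the sketch below computes the percentage change in emissions between consecutive years for each region. The region names and values are placeholders, not figures from the IMF dataset, and the project itself used Microsoft SQL and Power BI rather than Python.

```python
# Placeholder emissions per region and year (NOT real IMF figures).
emissions = {
    "Region A": {2010: 100.0, 2011: 110.0, 2012: 121.0},
    "Region B": {2010: 50.0, 2011: 49.0, 2012: 52.0},
}


def yearly_change(series):
    """Return the percentage change from each year to the next."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev]) / series[prev] * 100
        for prev, year in zip(years, years[1:])
    }


# One dictionary of year-on-year changes per region.
changes = {region: yearly_change(series) for region, series in emissions.items()}

for region, by_year in changes.items():
    print(region, by_year)
```

Comparing the resulting dictionaries side by side is one simple way to see both the yearly growth and the differences between regions.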
Date Published
23rd October, 2023
Project Locations


EU Report (PDF Document) Analysis
Project Description
An analysis of the European Union's PDF report on the 2023 Nigerian election.
Project Summary
This analysis examined the full content of the European Union report on the 2023 Nigerian election. The focus was on the details within the PDF report of the European Union Election Observation Mission (EU EOM), which has drawn significant attention on social media through hashtags, retweets, likes, and comments since its release on June 23, 2023. The EU EOM monitored the Nigerian election from January 1 to March 16, 2023, and presented a detailed 94-page report on the general election activities.
Tools Used:
RStudio was used for the analysis, with the pdftools, tidytext, and wordcloud libraries.
Document source
The PDF document was retrieved from the website using the “pdftools” library: https://www.eods.eu/library/EU%20EOM%20NGA%202023%20FR.pdf
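The word-frequency step behind a word cloud can be sketched as follows. This is a Python stand-in for illustration only; the project itself used R with tidytext and wordcloud. The sample text and the short stop-word list are hypothetical and are not taken from the report.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "on", "was", "were"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, and count the non-stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(token for token in tokens if token not in stop_words)


# Hypothetical sample sentence standing in for text extracted from the PDF.
sample_text = "The election was observed. The election results were published."
frequencies = word_frequencies(sample_text)

# The most frequent words are the ones a word cloud would draw largest.
print(frequencies.most_common(3))
```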
Date Published
16th August, 2023
Project Locations


EDA On Death Causes
Project Description
The aim of this project is to investigate the factors contributing to the rising death rates in various countries.
Project Summary
I analyzed death records from 204 countries, identifying the primary causes of death and examining factors that influence death rates. Several variables were used for country comparisons, including the yearly death rates of China versus Nigeria. Notably, malaria emerged as the leading cause of death in Nigeria. I also compared death rates between the USA and Nigeria and examined the top 10 countries affected by HIV-related deaths. The findings offer insights that affected countries can use to collaborate on reducing rising death rates.
Tools Used:
Python in Jupyter Notebook, with the Matplotlib, Seaborn, and pandas libraries.
Dataset Used
The dataset contained 6120 rows and 34 columns. It covers 204 countries over the period 1990-2019 (30 years). The data was collected from the International Monetary Fund (IMF) website. Data source: https://climatedata.imf.org/datasets/c8579761f19740dfbe4418b205654ddf/explore
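The row count above follows from one record per country per year, and the "leading cause" comparison amounts to summing deaths by cause for a country and taking the largest total. A minimal sketch, assuming that layout; the country names, cause names, and death counts are placeholders, not values from the dataset.

```python
from collections import defaultdict

# One row per country per year: 204 countries x 30 years (1990-2019).
n_countries = 204
n_years = len(range(1990, 2020))
expected_rows = n_countries * n_years

# Placeholder records: (country, year, cause, deaths). NOT real data.
records = [
    ("Country A", 1990, "Cause X", 120),
    ("Country A", 1990, "Cause Y", 80),
    ("Country A", 1991, "Cause X", 130),
    ("Country A", 1991, "Cause Y", 95),
    ("Country B", 1990, "Cause X", 10),
    ("Country B", 1990, "Cause Y", 40),
]


def leading_cause(rows, country):
    """Sum deaths per cause for one country and return the largest cause."""
    totals = defaultdict(int)
    for row_country, _year, cause, deaths in rows:
        if row_country == country:
            totals[cause] += deaths
    return max(totals, key=totals.get)


print(expected_rows)
print(leading_cause(records, "Country A"), leading_cause(records, "Country B"))
```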
Date Published
23rd October, 2023
Project Location



Amazon Customer Survey
Project Description
The task involves creating a dashboard to visualize customer dissatisfaction levels with Amazon services.
Project Summary
The analysis used 19 indices over a 3-month period to gauge customer behavior on the Amazon platform. Customers, aged 16 to 75, shopped across 15 purchase categories. By focusing on negative reviews, the project aims to guide Amazon in improving service quality and customer satisfaction, echoing Bill Gates' insight: "Your most unhappy customers are your greatest source of learning."
One key variable examined is:
Shopping_Satisfaction: Rated 1-5, with 5 being the highest satisfaction. Particular attention is given to customers who rated their satisfaction below 2. For this dissatisfied group, the analysis identifies purchased product categories, age groups, improvement areas, service appreciation, rating accuracy, purchase frequency, and gender.
Dataset Used
The data was downloaded from Kaggle and contains 603 rows and 24 variables: https://www.kaggle.com/datasets/swathiunnikrishnan/amazon-consumer-behaviour-dataset
Data Extraction
Microsoft SQL Server was used for data extraction. Several metrics were used as filters to narrow the data down to the most dissatisfied customers of Amazon services.
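The filtering idea can be sketched in a few lines. The project did this step in Microsoft SQL Server, so the Python below is only an illustration. Shopping_Satisfaction is the variable named above; the other column names and all row values are assumptions made for this sketch.

```python
from collections import Counter

# Hypothetical survey rows. Only "Shopping_Satisfaction" is named in the
# project text; "category" and "age" are assumed column names.
rows = [
    {"Shopping_Satisfaction": 1, "category": "Category A", "age": 23},
    {"Shopping_Satisfaction": 1, "category": "Category B", "age": 41},
    {"Shopping_Satisfaction": 2, "category": "Category A", "age": 35},
    {"Shopping_Satisfaction": 3, "category": "Category A", "age": 29},
    {"Shopping_Satisfaction": 5, "category": "Category C", "age": 52},
]


def dissatisfied(survey_rows, threshold=2):
    """Keep customers whose satisfaction rating is below the threshold."""
    return [row for row in survey_rows if row["Shopping_Satisfaction"] < threshold]


unhappy = dissatisfied(rows)

# Tally the dissatisfied group by purchase category.
by_category = Counter(row["category"] for row in unhappy)
print(len(unhappy), dict(by_category))
```

The same tally can be repeated for age group, gender, or purchase frequency by changing the key passed to Counter.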
Project Location


Decision Tree Analysis of Heart Disease
Project Description
The objective is to categorize chest pain types based on their propensity to lead to a heart attack.
Project Summary:
This project uses a decision tree to model chest pain types as potential precursors to heart attacks. The study covers 194 females and 726 males, aged 28 to 77, across four locations (Cleveland, Hungary, Switzerland, Long Beach) with sample sizes of 304, 293, 123, and 200 respectively. The analysis classifies chest pain types and shows that asymptomatic chest pain falls into the category that quickly leads to a heart attack. The categorization depends on the interplay of variables: predicted values greater than or equal to 1 indicate depression or exercise-induced pain, values less than 1 indicate asymptomatic pain, and so forth.
Model Used
A Classification Decision Tree Model
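A classification tree picks each split by looking for the threshold that leaves the purest groups on either side. The sketch below shows that idea with Gini impurity, which is the default splitting criterion for classification in R's rpart. It is a simplified single-feature illustration, not the project's model: the feature name st_depression, the values, and the labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting into value < threshold and value >= threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Try every observed value as a threshold and keep the purest split."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda t: split_impurity(values, labels, t))


# Hypothetical toy data (NOT from the Kaggle dataset).
st_depression = [0.0, 0.2, 0.5, 1.0, 1.4, 2.0]
chest_pain = ["asymptomatic"] * 3 + ["other"] * 3

threshold = best_threshold(st_depression, chest_pain)
print(threshold, split_impurity(st_depression, chest_pain, threshold))
```

A full tree repeats this search over every feature and then recurses into each side of the chosen split.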
Tools Used:
R programming language, with the party, rpart, rpart.plot, readr, and dplyr libraries.
Dataset Used
The data was downloaded from Kaggle and contains 920 rows and 16 columns: https://www.kaggle.com/datasets/redwankarimsony/heart-disease-data
Project Location
https://github.com/Li-Ndibe/Decision-Tree-Classification-on-Heart-Disease.git


ARIMA Application in Forecasting Monthly Room Occupancy of a Hotel
Project Description
This project is a time series analysis of hotel guest occupancy using the Autoregressive Integrated Moving Average (ARIMA) method.
Project Summary
The aim is to analyze sequential data, identify patterns, and predict future values to inform strategic business decisions. The process involves making the time series stationary, plotting the Autocorrelation Function (ACF) to help choose the model parameters, and forecasting occupancy trends up to 2026 with a 95% confidence interval. In the chart, the grey area between the lower and upper bounds marks that interval.
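Two of the steps above, differencing to remove trend and computing the ACF, can be sketched in plain Python. This is an illustration only; the project used the tseries and forecast packages in R. The occupancy values are placeholders, and the 1.96/sqrt(n) band is the usual approximate 95% significance bound for an ACF plot, not a forecast interval.

```python
import math


def difference(series, lag=1):
    """First-order differencing: subtract the value 'lag' steps earlier."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    result = []
    for k in range(max_lag + 1):
        num = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
        result.append(num / denom)
    return result


# Placeholder monthly occupancy values (NOT the hotel's data).
occupancy = [10, 13, 12, 16, 15, 19]

differenced = difference(occupancy)
correlations = acf(differenced, max_lag=2)

# Lags whose autocorrelation falls outside +/- bound are treated as significant.
bound = 1.96 / math.sqrt(len(differenced))
print(differenced, correlations, bound)
```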
Tools Used:
R programming language, with the readr, dplyr, tseries, and forecast libraries.
Dataset Used
The data is primary data from the organization. It contains the daily occupancy of the hotel from 2013 to 2021, a total of 108 months.
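Since the raw data is daily and the forecast is monthly, the series has to be aggregated by month first. The sketch below assumes a monthly mean (a sum would be an equally plausible choice) and uses a constant placeholder value for every day, so it only demonstrates the grouping and the 108-month count, not the hotel's figures.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_mean(daily):
    """Average daily values within each (year, month) pair."""
    sums = defaultdict(lambda: [0.0, 0])
    for day, value in daily.items():
        key = (day.year, day.month)
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Placeholder daily series covering 2013-01-01 to 2021-12-31.
daily = {}
day = date(2013, 1, 1)
while day <= date(2021, 12, 31):
    daily[day] = 50.0  # placeholder occupancy, NOT real data
    day += timedelta(days=1)

monthly = monthly_mean(daily)
print(len(monthly))
```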
Date Published
12th February, 2023
Project Location
https://github.com/Li-Ndibe/Forecasting-Hotel-Guest-Occupancy-Rate-using-ARIMA-Model.git



