
Projects

Below are some of the projects I have completed.

Greenhouse Gas (GHG) Air Emissions

Project Description:

This project assesses the rise in greenhouse gases released into the atmosphere, which contribute to climate change. The primary objective is to investigate how emission levels vary across regions.

​

Project Summary:
The analysis shows the annual quantity of greenhouse gas emissions and highlights the year-on-year increase that contributes to global warming and climate change. It also identifies regional disparities: emissions vary among regions and have intensified over the years.


Tools Used:
Microsoft SQL, Power BI


Dataset Used
The dataset contained 16 rows and 514 columns. We reviewed emissions for all gas types over an 11-year period, from 2010 to 2021. The data was collected from the International Monetary Fund (IMF) website. Data Source: https://climatedata.imf.org/datasets/c8579761f19740dfbe4418b205654ddf/explore


Date Published

23rd October, 2023


Project Locations​

Greenhouse-Gas-Emissions.jpeg
ghg Dashboard.png
  • Medium
  • GitHub

EU Report (PDF Document) Analysis

Project Description

An analysis of the European Union's PDF report on the 2023 Nigerian election.


Project Summary

This analysis examined the full content of the European Union report on the 2023 Nigerian election. The focus was on uncovering the details within the PDF report of the European Union Election Observation Mission (EU EOM), which has drawn significant attention on social media through hashtags, retweets, likes, and comments since its release on June 23, 2023. The EU EOM monitored the Nigerian election from January 1 to March 16, 2023, and presented a detailed 94-page report on the general election activities.


Tools Used:
RStudio was used for the analysis, with the pdftools, tidytext, and wordcloud libraries.
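The word-frequency step that typically sits behind a word cloud can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used in the project; the stop-word list and the sample sentence are made up for the example and are not taken from the report.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# real stop-word lists are much longer.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "on", "was", "were", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count the tokens that are not stop words."""
    return Counter(tok for tok in tokenize(text) if tok not in stop_words)

# Made-up sample sentence, not a quotation from the report.
sample = (
    "The election was observed. Observers noted the election results "
    "and the election process."
)
freqs = word_frequencies(sample)
top_words = freqs.most_common(3)  # most frequent words first
```

The resulting counts are what a word cloud scales its words by: the more frequent a term, the larger it is drawn.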


Document Source

The PDF document was retrieved from the website using the “pdftools” library: https://www.eods.eu/library/EU%20EOM%20NGA%202023%20FR.pdf


Date Published

16th August, 2023


Project Locations

​

​

  • Medium
  • GitHub
EU REport Front page.png
Eu pix.png

EDA On Death Causes

Project Description

The aim of this project is to investigate the factors contributing to the rising death rates in various countries.

 

Project Summary

I analyzed death records in 204 countries, identifying the primary causes of death and examining the factors that influence death rates. Several variables were used for country comparisons, including the yearly death rates of China versus Nigeria. Notably, malaria emerged as the leading cause of death in Nigeria. A comparative analysis of death rates between the USA and Nigeria was also conducted, along with a look at the top 10 countries affected by HIV-related deaths. The findings provide valuable insights that affected countries can use to collaborate on mitigating the increasing death rates.


Tools Used:
Python (Jupyter Notebook) with the Matplotlib, Seaborn, and pandas libraries.


Dataset Used
The dataset contained 6120 rows and 34 columns. It covers 204 countries over the 30 years under review, 1990-2019 (204 × 30 = 6120 rows). The data was collected from the International Monetary Fund (IMF) website. Data Source: https://climatedata.imf.org/datasets/c8579761f19740dfbe4418b205654ddf/explore


Date Published

23rd October, 2023


Project Location

​

  • Medium
  • GitHub
death cause.png
hiv.png
chart.png

Amazon Customer Survey

Problem Description
The task involves creating a dashboard to visualize customer dissatisfaction levels with Amazon services.

 

Project Summary

A comprehensive analysis was conducted using 19 indices over a 3-month period to gauge customer behavior on the Amazon platform. Customers, aged 16 to 75, shopped across 15 purchase categories. By focusing on negative reviews, the project aims to guide Amazon in enhancing service quality and customer satisfaction, echoing Bill Gates' insight: "Your most unhappy customers are your greatest source of learning."

One key variable examined is:

Shopping_Satisfaction: Rated 1-5, with 5 being the highest satisfaction. Particular attention is given to customers rating satisfaction below 2. Analysis includes identifying purchased product categories, age groups, improvement areas, service appreciation, rating accuracy, purchase frequency, and gender within this dissatisfaction category.

​

Dataset Used
The data was downloaded from Kaggle. It contains 603 rows and 24 variables: https://www.kaggle.com/datasets/swathiunnikrishnan/amazon-consumer-behaviour-dataset

Data Extraction
Microsoft SQL Server was used for data extraction. A number of metrics were used to narrow the data down to the most dissatisfied customers of Amazon services.
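As a rough sketch of this kind of extraction, the query below keeps only customers whose Shopping_Satisfaction is below 2 and counts them per purchase category. It runs against an in-memory SQLite database from Python rather than Microsoft SQL Server, and the table name `survey`, the columns `age` and `Purchase_Categories`, and all rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey ("
    "age INTEGER, Purchase_Categories TEXT, Shopping_Satisfaction INTEGER)"
)

# Made-up rows for illustration only; not taken from the Kaggle dataset.
rows = [
    (23, "Beauty", 1),
    (34, "Clothing", 4),
    (45, "Groceries", 1),
    (19, "Beauty", 3),
    (52, "Beauty", 1),
]
conn.executemany("INSERT INTO survey VALUES (?, ?, ?)", rows)

# Keep only ratings below 2, then count dissatisfied customers per category.
query = """
SELECT Purchase_Categories, COUNT(*) AS dissatisfied
FROM survey
WHERE Shopping_Satisfaction < 2
GROUP BY Purchase_Categories
ORDER BY dissatisfied DESC, Purchase_Categories
"""
dissatisfied_by_category = conn.execute(query).fetchall()
conn.close()
```

The same WHERE and GROUP BY pattern can be repeated for age group, gender, or purchase frequency to profile the dissatisfied segment.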

Project Location

https://github.com/Li-Ndibe/Amazon-Survey.git

Power BI Dashboard 1.png
Power BI dashboard 2.png

Decision Tree Analysis of Heart Disease

Project Description

The objective is to categorize chest pain types based on their propensity to lead to a heart attack.

​

Project Summary:
This project uses a decision tree to model various chest pain types as potential precursors to heart attacks. The study involves 194 females and 726 males, aged between 28 and 77, across four locations (Cleveland, Hungary, Switzerland, Long Beach) with sample sizes of 304, 293, 123, and 200 respectively, for a total of 920 patients. The analysis yields a classification of chest pain types and highlights that asymptomatic chest pain falls into the category that quickly leads to a heart attack. The interdependence of the variables dictates the categorization, with predicted values greater than or equal to 1 indicating depression or exercise-induced pain, values less than 1 signifying asymptomatic pain, and so forth.

​

Model Used

A Classification Decision Tree Model 
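To illustrate how a classification tree can choose a split point, the sketch below scores candidate thresholds by Gini impurity, a common splitting criterion. It is a simplified Python illustration, not the R model built in this project; the feature values and labels are made up, and the helper names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting on value < threshold."""
    left = [lab for v, lab in zip(values, labels) if v < threshold]
    right = [lab for v, lab in zip(values, labels) if v >= threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(values, labels):
    """Try each observed value as a threshold and keep the lowest impurity."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda t: split_impurity(values, labels, t))

# Made-up toy data: one numeric predictor and a chest pain label per patient.
feature = [0.0, 0.2, 0.5, 1.0, 1.5, 2.0]
chest_pain = [
    "asymptomatic", "asymptomatic", "asymptomatic",
    "typical angina", "typical angina", "typical angina",
]
threshold = best_threshold(feature, chest_pain)
```

A full tree repeats this search on each resulting subset until the nodes are pure enough or a stopping rule is reached.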

​

Tools Used:
R programming language with the party, rpart, rpart.plot, readr, and dplyr libraries.

 

Dataset Used
The data was downloaded from Kaggle. It contains 920 rows and 16 columns.

https://www.kaggle.com/datasets/redwankarimsony/heart-disease-data

​

Project Location

https://github.com/Li-Ndibe/Decision-Tree-Classification-on-Heart-Disease.git

heart-disease.jpg
Decision tree.png

ARIMA Application in Forecasting Monthly Room Occupancy of a Hotel

Problem Description
This project focuses on conducting a Time Series Analysis of hotel guest occupancy utilizing the Autoregressive Integrated Moving Average (ARIMA) method.

​

Project Summary

The aim is to analyze sequential data, identify patterns, and predict future values to inform strategic business decisions. The process involves making the time series stationary, plotting the Autocorrelation Function (ACF) to determine suitable parameters, and forecasting occupancy trends up to 2026 with a 95% confidence interval. The interval's lower and upper bounds appear in the chart as the grey area.
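The two preparatory steps named here, differencing a series to remove its trend and computing the sample ACF, can be sketched in plain Python. This is an illustration only, not the R workflow used in the project; the occupancy values are made up and the function names are hypothetical.

```python
def difference(series, lag=1):
    """Difference the series once: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]

# Made-up monthly occupancy figures with an upward trend (not the hotel's data).
occupancy = [52, 55, 59, 62, 66, 69, 73, 76, 80, 83, 87, 90]

diffed = difference(occupancy)   # month-to-month changes, trend removed
acf_raw = acf(occupancy, 3)      # autocorrelation of the trending series
acf_diffed = acf(diffed, 3)      # autocorrelation after differencing
```

A strongly trending series shows high autocorrelation at short lags; after differencing, the autocorrelations that remain are the ones usually inspected when choosing ARIMA orders.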
 

Tools Used:
R programming language with the readr, dplyr, tseries, and forecast libraries.

Dataset Used
The data is primary data from the organization. It contains the daily occupancy of the hotel from 2013 to 2021, a total of 108 months.
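Since the records are daily and the forecast is monthly, the data has to be aggregated by month at some point. The sketch below shows one way to do that in Python, assuming a monthly mean (the aggregation actually used in the project is not stated). The daily values are constant placeholders, so only the month count is meaningful.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_mean(daily):
    """Average the daily values within each (year, month)."""
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Placeholder series: one constant value per day from 2013-01-01 to 2021-12-31.
start, end = date(2013, 1, 1), date(2021, 12, 31)
n_days = (end - start).days + 1
daily = [(start + timedelta(days=i), 50) for i in range(n_days)]

monthly = monthly_mean(daily)
n_months = len(monthly)  # 9 years x 12 months
```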

Date Published

12th February, 2023

Project Location

https://github.com/Li-Ndibe/Forecasting-Hotel-Guest-Occupancy-Rate-using-ARIMA-Model.git

Hotel.png
optimum.png
forecast.png