Natural Disasters Analysis - Number of People Affected

Authors: Women of the West Coast (WWC)

Date: Oct. 22, 2023

Table of Contents

  • Introduction
  • Loading the Data
  • Data Processing and Cleaning
  • Global Analysis
  • Canada-Specific Analysis
  • Conclusion

Introduction

This report analyzes the impact of several types of natural disasters, both globally and in Canada. The analysis is based on historical data from 1900 to 2010 and aims to offer insights that can inform the features of our team's mobile app for crisis response and management.

Loading the Data

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

# Filter warnings
from warnings import filterwarnings
filterwarnings('ignore')
In [2]:
# Load the dataset
raw_data = pd.read_csv('natural-disasters.csv')
In [3]:
# Set display options to show all columns and rows
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
In [4]:
# Display the first few rows of the dataset
raw_data.head()
Out[4]:
Entity Year Number of deaths from drought Number of people injured from drought Number of people affected from drought Number of people left homeless from drought Number of total people affected by drought Reconstruction costs from drought Insured damages against drought Total economic damages from drought Death rates from drought Injury rates from drought Number of people affected by drought per 100,000 Homelessness rate from drought Total number of people affected by drought per 100,000 Number of deaths from earthquakes Number of people injured from earthquakes Number of people affected by earthquakes Number of people left homeless from earthquakes Number of total people affected by earthquakes Reconstruction costs from earthquakes Insured damages against earthquakes Total economic damages from earthquakes Death rates from earthquakes Injury rates from earthquakes Number of people affected by earthquakes per 100,000 Homelessness rate from earthquakes Total number of people affected by earthquakes per 100,000 Number of deaths from disasters Number of people injured from disasters Number of people affected by disasters Number of people left homeless from disasters Number of total people affected by disasters Reconstruction costs from disasters Insured damages against disasters Total economic damages from disasters Death rates from disasters Injury rates from disasters Number of people affected by disasters per 100,000 Homelessness rate from disasters Total number of people affected by disasters per 100,000 Number of deaths from volcanic activity Number of people injured from volcanic activity Number of people affected by volcanic activity Number of people left homeless from volcanic activity Number of total people affected by volcanic activity Reconstruction costs from volcanic activity Insured damages against volcanic activity Total economic damages from volcanic activity Death rates from volcanic activity Injury rates from volcanic activity Number of people 
affected by volcanic activity per 100,000 Homelessness rate from volcanic activity Total number of people affected by volcanic activity per 100,000 Number of deaths from floods Number of people injured from floods Number of people affected by floods Number of people left homeless from floods Number of total people affected by floods Reconstruction costs from floods Insured damages against floods Total economic damages from floods Death rates from floods Injury rates from floods Number of people affected by floods per 100,000 Homelessness rate from floods Total number of people affected by floods per 100,000 Number of deaths from mass movements Number of people injured from mass movements Number of people affected by mass movements Number of people left homeless from mass movements Number of total people affected by mass movements Reconstruction costs from mass movements Insured damages against mass movements Total economic damages from mass movements Death rates from mass movements Injury rates from mass movements Number of people affected by mass movements per 100,000 Homelessness rate from mass movements Total number of people affected by mass movements per 100,000 Number of deaths from storms Number of people injured from storms Number of people affected by storms Number of people left homeless from storms Number of total people affected by storms Reconstruction costs from storms Insured damages against storms Total economic damages from storms Death rates from storms Injury rates from storms Number of people affected by storms per 100,000 Homelessness rate from storms Total number of people affected by storms per 100,000 Number of deaths from landslides Number of people injured from landslides Number of people affected by landslides Number of people left homeless from landslides Number of total people affected by landslides Reconstruction costs from landslides Insured damages against landslides Total economic damages from landslides Death rates from landslides 
Injury rates from landslides Number of people affected by landslides per 100,000 Homelessness rate from landslides Total number of people affected by landslides per 100,000 Number of deaths from fog Number of people injured from fog Number of people affected by fog Number of people left homeless from fog Number of total people affected by fog Reconstruction costs from fog Insured damages against fog Total economic damages from fog Death rates from fog Injury rates from fog Number of people affected by fog per 100,000 Homelessness rate from fog Total number of people affected by fog per 100,000 Number of deaths from wildfires Number of people injured from wildfires Number of people affected by wildfires Number of people left homeless from wildfires Number of total people affected by wildfires Reconstruction costs from wildfires Insured damages against wildfires Total economic damages from wildfires Death rates from wildfires Injury rates from wildfires Number of people affected by wildfires per 100,000 Homelessness rate from wildfires Total number of people affected by wildfires per 100,000 Number of deaths from extreme temperatures Number of people injured from extreme temperatures Number of people affected by extreme temperatures Number of people left homeless from extreme temperatures Number of total people affected by extreme temperatures Reconstruction costs from extreme temperatures Insured damages against extreme temperatures Total economic damages from extreme temperatures Death rates from extreme temperatures Injury rates from extreme temperatures Number of people affected by extreme temperatures per 100,000 Homelessness rate from extreme temperatures Total number of people affected by extreme temperatures per 100,000 Number of deaths from glacial lake outbursts Number of people injured from glacial lake outbursts Number of people affected by glacial lake outbursts Number of people left homeless from glacial lake outbursts Number of total people affected by 
glacial lake outbursts Reconstruction costs from glacial lake outbursts Insured damages against glacial lake outbursts Total economic damages from glacial lake outbursts Death rates from glacial lake outbursts Injury rates from glacial lake outbursts Number of people affected by glacial lake outbursts per 100,000 Homelessness rate from glacial lake outbursts Total number of people affected by glacial lake outbursts per 100,000 Total economic damages from disasters as a share of GDP Total economic damages from drought as a share of GDP Total economic damages from earthquakes as a share of GDP Total economic damages from extreme temperatures as a share of GDP Total economic damages from floods as a share of GDP Total economic damages from landslides as a share of GDP Total economic damages from mass movements as a share of GDP Total economic damages from storms as a share of GDP Total economic damages from volcanic activity as a share of GDP Total economic damages from volcanic activity as a share of GDP.1 deaths_rate_per_100k_storm injured_rate_per_100k_storm total_affected_rate_per_100k_all_disasters
0 Afghanistan 1950 0.0 0.0 0.0 0 0.0 NaN NaN NaN 0.0 0.0 0.000000 0 0.000000 210.0 200.0 0.0 0.0 200.0 NaN NaN NaN 2.572748 2.381236 0.000000 0.000000 2.381236 215.1 200.0 0.0 0.0 200.0 NaN NaN NaN 2.633470 2.381236 0.000000 0.000000 NaN 0.0 0.0 0.0 0 0.0 NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 5.1 0.0 0.0 0.0 0.0 NaN NaN NaN 0.060722 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0.000000 0.000000 0.0 0.000000 0.000000 0 0 0 0 0 NaN NaN NaN 0.0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0.000000 0.0 0.000000 0.0 0.000000 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.000000 0.0 2.381236
1 Afghanistan 1960 0.0 0.0 4800.0 0 4800.0 0.0 0.0 20.0 0.0 0.0 44.060951 0 44.060951 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 10.7 0.0 4800.0 0.0 4800.0 0.0 0.0 20.0 0.112124 0.000000 44.060951 0.000000 NaN 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.112124 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0 0 0 0 0 0.0 0.0 0.0 0.0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0 0 0 0 0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.001420 0.00142 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 44.060951
2 Afghanistan 1970 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0 0.000000 6.1 1.5 9000.0 0.0 9001.5 0.0 0.0 0.0 0.047960 0.012722 69.535656 0.000000 69.548378 48.2 15.5 68404.4 750.0 69169.9 0.0 0.0 5200.0 0.391674 0.117661 541.290447 5.621767 NaN 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 32.1 14.0 59404.4 750.0 60168.4 0.0 0.0 5200.0 0.256567 0.104940 471.754790 5.621767 477.481497 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN 0.0 0.0 0.0 10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.087146 0.000000 0.0 0.000000 0.000000 0 0 0 0 0 0.0 0.0 0.0 0.0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0 0 0 0 0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.157576 0.00000 0.0 0.0 0.157576 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 547.029875
3 Afghanistan 1980 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0 0.000000 51.3 351.8 6244.0 658.0 7253.8 0.0 0.0 900.0 0.398499 2.742558 49.053053 5.248046 57.043657 58.3 351.8 25344.0 658.0 26353.8 0.0 0.0 26900.0 0.458817 2.742558 210.091255 5.248046 NaN 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 19100.0 0.0 19100.0 0.0 0.0 26000.0 0.000000 0.000000 161.038202 0.000000 161.038202 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN 0.0 0.0 0.0 7.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.060319 0.000000 0.0 0.000000 0.000000 0 0 0 0 0 0.0 0.0 0.0 0.0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0 0 0 0 0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.000000 0.00000 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 218.081859
4 Afghanistan 1990 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0 0.000000 742.6 358.3 27168.5 7702.5 35229.3 0.0 0.0 2001.0 3.814559 1.835906 146.665685 38.701410 187.203001 1038.9 394.7 43624.0 9578.5 53597.2 0.0 20.0 8401.0 5.830222 2.023235 263.136415 50.991165 NaN 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 199.0 30.0 16435.5 1765.0 18230.5 0.0 20.0 6400.0 1.414652 0.151991 116.320342 11.596787 128.069120 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN 0.0 0.0 0.0 73.9 6.4 0.0 111.0 117.4 0.0 0.0 0.0 0.418517 0.035338 0.0 0.692968 0.728305 0 0 0 0 0 0.0 0.0 0.0 0.0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 22.4 0.0 20.0 0.0 20.0 0.0 0.0 0.0 0.176172 0.0 0.150387 0.0 0.150387 0 0 0 0 0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.000000 0.00000 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.006322 0.0 316.150814
In [5]:
# Check the shape of the dataframe
raw_data.shape
Out[5]:
(1604, 171)

We have 1604 rows and 171 columns in our dataset.

Data Processing and Cleaning

Below are the data cleaning steps performed on the dataset:

  1. Checking for missing values.
  2. Checking for duplicate rows.

Checking for missing values

In [6]:
# Check for missing values
missing_values = raw_data.isnull().sum()

# Calculate the percentage of missing values for each column
missing_percentage = (missing_values / len(raw_data)) * 100

# Display columns with missing values and their corresponding percentage
missing_data = pd.DataFrame({'Missing Values': missing_values, 'Percentage': missing_percentage})
missing_data = missing_data[missing_data['Missing Values'] > 0].sort_values(by='Percentage', ascending=False)

missing_data
Out[6]:
Missing Values Percentage
Number of people affected by glacial lake outbursts per 100,000 1604 100.000000
Death rates from glacial lake outbursts 1604 100.000000
Injury rates from storms 1604 100.000000
Death rates from storms 1604 100.000000
Total number of people affected by disasters per 100,000 1604 100.000000
Total number of people affected by glacial lake outbursts per 100,000 1604 100.000000
Homelessness rate from glacial lake outbursts 1604 100.000000
Injury rates from glacial lake outbursts 1604 100.000000
Insured damages against wildfires 480 29.925187
Total economic damages from wildfires 480 29.925187
Reconstruction costs from extreme temperatures 480 29.925187
Insured damages against extreme temperatures 480 29.925187
Total economic damages from extreme temperatures 480 29.925187
Reconstruction costs from glacial lake outbursts 480 29.925187
Insured damages against glacial lake outbursts 480 29.925187
Total economic damages from glacial lake outbursts 480 29.925187
Reconstruction costs from drought 480 29.925187
Total economic damages from fog 480 29.925187
Total economic damages from disasters as a share of GDP 480 29.925187
Total economic damages from drought as a share of GDP 480 29.925187
Total economic damages from earthquakes as a share of GDP 480 29.925187
Total economic damages from extreme temperatures as a share of GDP 480 29.925187
Total economic damages from floods as a share of GDP 480 29.925187
Total economic damages from landslides as a share of GDP 480 29.925187
Total economic damages from mass movements as a share of GDP 480 29.925187
Total economic damages from storms as a share of GDP 480 29.925187
Total economic damages from volcanic activity as a share of GDP 480 29.925187
Reconstruction costs from wildfires 480 29.925187
Reconstruction costs from fog 480 29.925187
Insured damages against fog 480 29.925187
Reconstruction costs from floods 480 29.925187
Total economic damages from drought 480 29.925187
Reconstruction costs from earthquakes 480 29.925187
Insured damages against earthquakes 480 29.925187
Total economic damages from earthquakes 480 29.925187
Reconstruction costs from disasters 480 29.925187
Insured damages against disasters 480 29.925187
Total economic damages from disasters 480 29.925187
Reconstruction costs from volcanic activity 480 29.925187
Insured damages against volcanic activity 480 29.925187
Total economic damages from volcanic activity 480 29.925187
Insured damages against floods 480 29.925187
Insured damages against drought 480 29.925187
Total economic damages from floods 480 29.925187
Reconstruction costs from mass movements 480 29.925187
Insured damages against mass movements 480 29.925187
Total economic damages from mass movements 480 29.925187
Reconstruction costs from storms 480 29.925187
Insured damages against storms 480 29.925187
Total economic damages from storms 480 29.925187
Reconstruction costs from landslides 480 29.925187
Insured damages against landslides 480 29.925187
Total economic damages from landslides 480 29.925187
Total economic damages from volcanic activity as a share of GDP.1 480 29.925187

For columns with 100% missing values, it's advisable to drop them as they don't add any value to the analysis.
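As a minimal sketch of the same idea, pandas can also drop all-empty columns directly with `DataFrame.dropna(axis=1, how='all')`. The small frame below uses made-up values for illustration and is not the disaster dataset.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): one fully empty column, one partly empty
toy = pd.DataFrame({
    "Entity": ["A", "B", "C"],
    "deaths": [1.0, np.nan, 3.0],
    "empty_rate": [np.nan, np.nan, np.nan],
})

# Drop only the columns in which every value is missing
cleaned = toy.dropna(axis=1, how="all")

print(list(cleaned.columns))
```

The partly empty `deaths` column is kept; only `empty_rate`, which has no values at all, is removed.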

Dropping columns with 100% missing values

In [7]:
# Drop columns with 100% missing values
data = raw_data.drop(columns=missing_data[missing_data['Percentage'] == 100].index)

Sanity check after dropping

In [8]:
# Re-check for missing values
missing_values = data.isnull().sum()

# Calculate the percentage of missing values for each column
missing_percentage = (missing_values / len(data)) * 100

# Display columns with missing values and their corresponding percentage
missing_data = pd.DataFrame({'Missing Values': missing_values, 'Percentage': missing_percentage})
missing_data = missing_data[missing_data['Missing Values'] > 0].sort_values(by='Percentage', ascending=False)

missing_data
Out[8]:
Missing Values Percentage
Reconstruction costs from drought 480 29.925187
Insured damages against glacial lake outbursts 480 29.925187
Insured damages against fog 480 29.925187
Total economic damages from fog 480 29.925187
Reconstruction costs from wildfires 480 29.925187
Insured damages against wildfires 480 29.925187
Total economic damages from wildfires 480 29.925187
Reconstruction costs from extreme temperatures 480 29.925187
Insured damages against extreme temperatures 480 29.925187
Total economic damages from extreme temperatures 480 29.925187
Reconstruction costs from glacial lake outbursts 480 29.925187
Total economic damages from glacial lake outbursts 480 29.925187
Insured damages against drought 480 29.925187
Total economic damages from disasters as a share of GDP 480 29.925187
Total economic damages from drought as a share of GDP 480 29.925187
Total economic damages from earthquakes as a share of GDP 480 29.925187
Total economic damages from extreme temperatures as a share of GDP 480 29.925187
Total economic damages from floods as a share of GDP 480 29.925187
Total economic damages from landslides as a share of GDP 480 29.925187
Total economic damages from mass movements as a share of GDP 480 29.925187
Total economic damages from storms as a share of GDP 480 29.925187
Total economic damages from volcanic activity as a share of GDP 480 29.925187
Reconstruction costs from fog 480 29.925187
Total economic damages from landslides 480 29.925187
Insured damages against landslides 480 29.925187
Reconstruction costs from landslides 480 29.925187
Total economic damages from drought 480 29.925187
Reconstruction costs from earthquakes 480 29.925187
Insured damages against earthquakes 480 29.925187
Total economic damages from earthquakes 480 29.925187
Reconstruction costs from disasters 480 29.925187
Insured damages against disasters 480 29.925187
Total economic damages from disasters 480 29.925187
Reconstruction costs from volcanic activity 480 29.925187
Insured damages against volcanic activity 480 29.925187
Total economic damages from volcanic activity 480 29.925187
Reconstruction costs from floods 480 29.925187
Insured damages against floods 480 29.925187
Total economic damages from floods 480 29.925187
Reconstruction costs from mass movements 480 29.925187
Insured damages against mass movements 480 29.925187
Total economic damages from mass movements 480 29.925187
Reconstruction costs from storms 480 29.925187
Insured damages against storms 480 29.925187
Total economic damages from storms 480 29.925187
Total economic damages from volcanic activity as a share of GDP.1 480 29.925187

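No columns with 100% missing values remain. The columns that still have missing values (480 rows each, about 29.9%) are all cost and economic-damage measures, which the analysis below does not use, so they are left as they are.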

Checking for duplicate rows

In [9]:
# Check for duplicate rows
duplicate_rows = data.duplicated().sum()

duplicate_rows
Out[9]:
0

The dataset doesn't have any duplicate rows.
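Note that `duplicated()` with no arguments only flags rows that match in every column. Since each row appears to describe one Entity in one Year, a stricter check is to look for repeated (Entity, Year) pairs. The sketch below runs on made-up rows, and `count_key_duplicates` is a hypothetical helper name, not part of the notebook.

```python
import pandas as pd

def count_key_duplicates(df, keys=("Entity", "Year")):
    """Count rows whose key columns repeat an earlier row."""
    return int(df.duplicated(subset=list(keys)).sum())

# Toy rows (hypothetical values): the last two share Entity and Year
toy = pd.DataFrame({
    "Entity": ["Canada", "Canada", "Canada"],
    "Year": [1990, 2000, 2000],
    "deaths": [1.0, 2.0, 2.5],
})

full_row_duplicates = int(toy.duplicated().sum())
key_duplicates = count_key_duplicates(toy)

print(full_row_duplicates, key_duplicates)
```

In the toy frame the full-row check finds nothing because the `deaths` values differ, while the key-based check reports one repeated pair.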

Global Analysis

We begin by examining the global impact of four major types of natural disasters: earthquakes, floods, storms, and wildfires.
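Summing a column over every row assumes that each row is a single country. If the file also carries aggregate rows (for example a "World" entity, which is an assumption to verify and not something visible in the output above), those rows would be counted on top of the countries they summarize. The following is a sketch of such a check; the labels in `SUSPECT_AGGREGATES` and the helper `split_aggregates` are hypothetical.

```python
import pandas as pd

# Hypothetical labels that would indicate aggregate rows; adjust them after
# inspecting raw_data['Entity'].unique() on the real file.
SUSPECT_AGGREGATES = {"World", "Africa", "Asia", "Europe"}

def split_aggregates(df, entity_col="Entity", suspects=SUSPECT_AGGREGATES):
    """Return (country_rows, aggregate_rows) based on the entity label."""
    is_aggregate = df[entity_col].isin(suspects)
    return df[~is_aggregate], df[is_aggregate]

# Toy rows (hypothetical values)
toy = pd.DataFrame({
    "Entity": ["Canada", "Japan", "World"],
    "affected": [10, 20, 30],
})
countries, aggregates = split_aggregates(toy)

print(countries["affected"].sum(), toy["affected"].sum())
```

In the toy frame the country-only sum is half of the naive sum, which is the kind of inflation this check is meant to catch.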

In [10]:
# Relevant columns for each disaster type
columns_of_interest = {
    "earthquake": [
        "Number of deaths from earthquakes",
        "Number of people injured from earthquakes",
        "Number of people left homeless from earthquakes",
        "Number of total people affected by earthquakes"
    ],
    "flood": [
        "Number of deaths from floods",
        "Number of people injured from floods",
        "Number of people left homeless from floods",
        "Number of total people affected by floods"
    ],
    "storm": [
        "Number of deaths from storms",
        "Number of people injured from storms",
        "Number of people left homeless from storms",
        "Number of total people affected by storms"
    ],
    "wildfire": [
        "Number of deaths from wildfires",
        "Number of people injured from wildfires",
        "Number of people left homeless from wildfires",
        "Number of total people affected by wildfires"
    ]
}


# Extract the relevant columns
filtered_data = data[["Entity", "Year"] + [col for sublist in columns_of_interest.values() for col in sublist]]

# Display the first few rows of the filtered dataset
filtered_data.head()
Out[10]:
Entity Year Number of deaths from earthquakes Number of people injured from earthquakes Number of people left homeless from earthquakes Number of total people affected by earthquakes Number of deaths from floods Number of people injured from floods Number of people left homeless from floods Number of total people affected by floods Number of deaths from storms Number of people injured from storms Number of people left homeless from storms Number of total people affected by storms Number of deaths from wildfires Number of people injured from wildfires Number of people left homeless from wildfires Number of total people affected by wildfires
0 Afghanistan 1950 210.0 200.0 0.0 200.0 5.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 Afghanistan 1960 0.0 0.0 0.0 0.0 10.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 Afghanistan 1970 6.1 1.5 0.0 9001.5 32.1 14.0 750.0 60168.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 Afghanistan 1980 51.3 351.8 658.0 7253.8 0.0 0.0 0.0 19100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 Afghanistan 1990 742.6 358.3 7702.5 35229.3 199.0 30.0 1765.0 18230.5 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
In [11]:
# Aggregate the data for each disaster type over the years
global_aggregated_data = {}

for disaster, columns in columns_of_interest.items():
    if columns:
        global_aggregated_data[disaster] = filtered_data[columns].sum()
        
# Transform the aggregated data to the desired format
transformed_data = {}

for disaster, stats in global_aggregated_data.items():
    # Use .iloc for positional access; stats[0] on a label-indexed Series
    # relies on a fallback that recent pandas versions deprecate
    transformed_data[disaster.capitalize()] = {
        "Deaths": f"{int(stats.iloc[0]):,}",
        "People injured": f"{int(stats.iloc[1]):,}",
        "People left homeless": f"{int(stats.iloc[2]):,}",
        "Total people affected": f"{int(stats.iloc[3]):,}"
    }

# Convert the dictionary to a DataFrame for display (disaster types as columns)
transformed_df = pd.DataFrame(transformed_data)

transformed_df
Out[11]:
Earthquake Flood Storm Wildfire
Deaths 905,683 2,793,814 559,132 1,750
People injured 1,117,359 545,990 556,190 4,455
People left homeless 9,954,741 36,963,619 21,588,466 96,016
Total people affected 80,044,254 1,536,563,304 474,296,923 6,860,637
In [12]:
# Metric names to match the column names
metrics = {
    "Number of deaths": [
        "Number of deaths from earthquakes",
        "Number of deaths from floods",
        "Number of deaths from storms",
        "Number of deaths from wildfires"
    ],
    "Number of people injured": [
        "Number of people injured from earthquakes",
        "Number of people injured from floods",
        "Number of people injured from storms",
        "Number of people injured from wildfires"
    ],
    "Number of people left homeless": [
        "Number of people left homeless from earthquakes",
        "Number of people left homeless from floods",
        "Number of people left homeless from storms",
        "Number of people left homeless from wildfires"
    ]
}

# Function to plot data for the specified columns and region (Global/Canada)
def plot_disaster_data(columns, region, title_suffix):
    plt.figure(figsize=(14, 7))

    # canada_data is created in the Canada section; it is only looked up when
    # region is not "Global", so that cell must run before such a call
    source = filtered_data if region == "Global" else canada_data

    # Sum only the plotted columns per year (skips the text 'Entity' column)
    data_to_plot = source.groupby('Year')[columns].sum()

    # Disaster types, in the same order as the entries of `columns`
    disasters = ["earthquake", "flood", "storm", "wildfire"]

    # Plotting data for each disaster type
    for disaster, column in zip(disasters, columns):
        plt.plot(data_to_plot.index, data_to_plot[column], label=disaster.capitalize())

    # Labeling the graph
    metric = columns[0].split(" from ")[0]
    plt.title(f"{metric} by Different Natural Disasters {title_suffix}", fontweight='bold')
    plt.ylabel(metric, fontweight='bold')
    plt.xlabel('Year', fontweight='bold')
    plt.legend()
    plt.grid(True, which='both', linestyle='--', linewidth=0.5)
    plt.tight_layout()
    plt.show()

# Plotting data globally
for metric, columns in metrics.items():
    plot_disaster_data(columns, "Global", "Globally Over the Years")
In [13]:
# Define the bars (natural disasters) and associated colors
bars = ["earthquake", "flood", "storm", "wildfire"]
colors = ["red", "blue", "green", "orange"]

# Aggregate the data for each disaster type globally
global_heights = [
    data['Number of total people affected by earthquakes'].sum(),
    data['Number of total people affected by floods'].sum(),
    data['Number of total people affected by storms'].sum(),
    data['Number of total people affected by wildfires'].sum()
]

# Sorting the data from most to least affected for global data
sorted_indices_global = sorted(range(len(global_heights)), key=lambda k: global_heights[k], reverse=True)
sorted_heights_global = [global_heights[i] for i in sorted_indices_global]
sorted_bars_global = [bars[i] for i in sorted_indices_global]
sorted_colors_global = [colors[i] for i in sorted_indices_global]

# Determine the timeframe from the global data
timeframe = f"{filtered_data['Year'].min()} - {filtered_data['Year'].max()}"

# Plotting the sorted data for global data with adjusted title and labels
plt.figure(figsize=(12, 6))
bars_plot_global = plt.bar(sorted_bars_global, sorted_heights_global, color=sorted_colors_global)

# Adding labels with rounded numbers to the bars for global data
for bar in bars_plot_global:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval, f"{int(yval):,}", ha='center', va='bottom', fontsize=10)

plt.title(f'Number of Total People Affected by Different Natural Disasters Globally ({timeframe})', fontweight='bold')
plt.ylabel('Number of Total People Affected', fontweight='bold')
plt.xlabel('Disaster Type', fontweight='bold')
plt.tight_layout()
plt.show()

Insights for Global Analysis:

  • Floods have affected by far the largest number of people globally, with over 1.5 billion people impacted over the specified timeframe. Floods often cover large areas and cause significant property and agricultural damage.
  • Storms have affected nearly half a billion people globally (about 474 million), which points to a very wide reach across large populations.
  • Earthquakes have affected just over 80 million people. That is far fewer than floods or storms, yet earthquakes account for the most injuries of the four types (about 1.1 million) and for more than 900,000 deaths, so their human toll is high relative to the number of people affected.
  • Wildfires have affected the fewest people globally, though still more than 6.8 million. Wildfires can also cause significant property damage and ecological impacts, even when fewer people are directly affected.
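To put these figures on a common scale, the totals from the global summary table can be turned into shares of the combined figure for the four disaster types. This is a small worked calculation; the numbers are copied from the "Total people affected" row of that table and nothing is re-read from the CSV.

```python
# "Total people affected" row of the global summary table
total_affected = {
    "Flood": 1_536_563_304,
    "Storm": 474_296_923,
    "Earthquake": 80_044_254,
    "Wildfire": 6_860_637,
}

combined = sum(total_affected.values())

# Share of the combined total for each disaster type, in percent
shares = {name: 100 * value / combined for name, value in total_affected.items()}

# Print from largest to smallest share
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {share:.1f}%")
```

On these figures floods account for roughly three quarters of the combined total, and wildfires for well under one percent.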

This global perspective is essential to understand the broader context of natural disasters. However, for specific applications, it might be vital to delve deeper into regional or country-specific data, as the intensity and frequency of these disasters can vary significantly based on geographical factors.

Therefore, with a global perspective established, we now turn our focus to Canada.

Canada-Specific Analysis

Given our project's emphasis on providing a mobile app for Canadians, understanding the impact of natural disasters in Canada is crucial. We examine the trends over time and the aggregated impact of each disaster type.

In [14]:
# Filter the data for Canada
canada_data = filtered_data[filtered_data['Entity'] == 'Canada']

# Aggregate the data for each disaster type for Canada only
canada_aggregated_data = {}

for disaster, columns in columns_of_interest.items():
    if columns:
        canada_aggregated_data[disaster] = canada_data[columns].sum()

# Transform the aggregated data to the desired format with disaster types in columns
transformed_canada_data = {}

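for disaster, stats in canada_aggregated_data.items():
    # Use .iloc for positional access; stats[0] on a label-indexed Series
    # relies on a fallback that recent pandas versions deprecate
    transformed_canada_data[disaster.capitalize()] = {
        "Deaths": f"{int(stats.iloc[0]):,}",
        "People injured": f"{int(stats.iloc[1]):,}",
        "People left homeless": f"{int(stats.iloc[2]):,}",
        "Total people affected": f"{int(stats.iloc[3]):,}"
    }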

# Convert the dictionary to a DataFrame for display
transformed_df_canada = pd.DataFrame(transformed_canada_data)

transformed_df_canada
Out[14]:
Earthquake Flood Storm Wildfire
Deaths 2 5 30 11
People injured 0 0 88 0
People left homeless 0 1,200 453 1,823
Total people affected 0 33,078 1,651 22,007
In [15]:
# Using the metrics and plot_disaster_data function defined in the previous code section for global data

# Plotting data for Canada
for metric, columns in metrics.items():
    plot_disaster_data(columns, "Canada", "in Canada Over the Years")
In [16]:
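# Build canada_aggregated_df from the full Canada rows; earthquakes are left out
# because their total affected is 0 for Canada in the summary table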
canada_data_filtered = data[data['Entity'] == 'Canada']
canada_aggregated_data = {
    'flood': canada_data_filtered['Number of total people affected by floods'].sum(),
    'storm': canada_data_filtered['Number of total people affected by storms'].sum(),
    'wildfire': canada_data_filtered['Number of total people affected by wildfires'].sum()
}
canada_aggregated_df = pd.DataFrame.from_dict(canada_aggregated_data, orient='index', columns=['Total Affected'])

# Data for the bar graph
heights = [
    canada_aggregated_df['Total Affected']['flood'],
    canada_aggregated_df['Total Affected']['storm'],
    canada_aggregated_df['Total Affected']['wildfire']
]
bars = ['Floods', 'Storms', 'Wildfires']
colors = ['blue', 'green', 'orange']

# Sorting the data from most to least affected
sorted_indices = sorted(range(len(heights)), key=lambda k: heights[k], reverse=True)
sorted_heights = [heights[i] for i in sorted_indices]
sorted_bars = [bars[i] for i in sorted_indices]
sorted_colors = [colors[i] for i in sorted_indices]

# Plotting the sorted data
plt.figure(figsize=(12, 6))
bars_plot = plt.bar(sorted_bars, sorted_heights, color=sorted_colors)

# Adding labels with rounded numbers to the bars
for bar in bars_plot:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval, f"{int(yval):,}", ha='center', va='bottom', fontsize=10)

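# Use Canada's own year range for the title rather than the global `timeframe`
canada_timeframe = f"{canada_data_filtered['Year'].min()} - {canada_data_filtered['Year'].max()}"
plt.title(f'Number of Total People Affected by Different Natural Disasters in Canada ({canada_timeframe})', fontweight='bold')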
plt.ylabel('Number of Total People Affected', fontweight='bold')
plt.xlabel('Disaster Type', fontweight='bold')
plt.tight_layout()
plt.show()

Insights for Canada-Specific Analysis:

The bar chart provides an aggregated view of the total number of people affected by each disaster type in Canada:

  • Floods stand out as the most impactful natural disaster in Canada by total people affected, at about 33,000. This matches the global ranking, where floods also come first.
  • Wildfires are the second most impactful disaster type in Canada, with about 22,000 people affected, and they left the most people homeless (1,823). Globally, wildfires affected the fewest people of the four types, so their prominence in Canada is notable. The spatial extent and ecological damage of wildfires can also be significant.
  • Storms affected relatively few people in Canada (1,651). However, they account for the most recorded deaths (30) and injuries (88) in the summary table, so a small affected population does not mean a small human cost.
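Which disaster type ranks first depends on the metric. The sketch below rebuilds the Canadian summary table shown earlier (values copied from it, not recomputed from the CSV) and picks the leading disaster type for each row.

```python
import pandas as pd

# Values copied from the Canada summary table
canada_summary = pd.DataFrame(
    {
        "Earthquake": [2, 0, 0, 0],
        "Flood": [5, 0, 1_200, 33_078],
        "Storm": [30, 88, 453, 1_651],
        "Wildfire": [11, 0, 1_823, 22_007],
    },
    index=["Deaths", "People injured", "People left homeless", "Total people affected"],
)

# For each metric (row), the disaster type (column) with the largest value
top_by_metric = canada_summary.idxmax(axis=1)

print(top_by_metric)
```

Floods lead on total people affected, wildfires on people left homeless, and storms on deaths and injuries, which is worth keeping in mind when deciding which alerts the app should prioritize.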

These insights summarize the impact of natural disasters in Canada and inform the development and features of the mobile app.

Conclusion

The analysis provides insights into the impact of natural disasters, both globally and in Canada. Understanding these patterns and trends can inform strategies for crisis response and management. The data highlights floods and wildfires as the disaster types that affect the most people in Canada. This information is crucial for prioritizing resources, designing preventive measures, and developing responsive solutions such as our proposed mobile app.