Zombies Apocalypse
- Data Source: Kaggle
- Tasks: compare humans and zombies to identify differences in supplies
- Language: Julia
Context
News reports suggest that the impossible has become possible…zombies have appeared on the streets of the US! What should we do? The Centers for Disease Control and Prevention (CDC) zombie preparedness website recommends storing water, food, medication, tools, sanitation items, clothing, essential documents, and first aid supplies. Thankfully, we are CDC analysts and are prepared, but it may be too late for others!
Content
Our team decides to identify supplies that protect people and to coordinate supply distribution. A few brave data collectors volunteer to check on 200 randomly selected adults who were alive before the zombies appeared. For each of the 200, we have recent data on age, sex, household size, and location (rural, suburban, or urban). Our heroic volunteers visit each home and record zombie status and preparedness. Now it's our job to figure out which supplies are associated with safety!
File
Because every moment counts when dealing with life and (un)death, we want to get this right! The first task is to compare humans and zombies to identify differences in supplies. We review the data and find the following:
- zombieid: unique identifier
- zombie: human or zombie
- age: age in years
- sex: male or female
- rurality: rural, suburban, or urban
- household: number of people living in household
- water: gallons of clean water available
- food: food or no food
- medication: medication or no medication
- tools: tools or no tools
- firstaid: first aid or no first aid
- sanitation: sanitation or no sanitation
- clothing: clothing or no clothing
- documents: documents or no documents
Acknowledgements
DataCamp
ENV["COLUMNS"] = 1000; # print more columns of tables
using Random
Random.seed!(42)
"Andi Kerstin Chris Caro Jana" |> split |> shuffle |> x -> join(x," → ")
"Kerstin → Chris → Andi → Jana → Caro"
1. Data loading
using Dates
using CSV
using DataFrames
data = CSV.read("zombies.csv", DataFrame)
first(data, 5)
Row | zombieid | zombie | age | sex | rurality | household | water | food | medication | tools | firstaid | sanitation | clothing | documents |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String7 | Int64 | String7 | String15 | Int64 | Int64 | String7 | String15 | String15 | String31 | String15 | String15 | String15 |
1 | 1 | Human | 18 | Female | Rural | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
2 | 2 | Human | 18 | Male | Rural | 3 | 24 | Food | Medication | tools | First aid supplies | Sanitation | Clothing | NA |
3 | 3 | Human | 18 | Male | Rural | 4 | 16 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
4 | 4 | Human | 19 | Male | Rural | 1 | 0 | Food | Medication | tools | No first aid supplies | Sanitation | Clothing | NA |
5 | 5 | Human | 19 | Male | Urban | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | NA | NA |
unique(data.food)
2-element Vector{String7}: "Food" "No food"
unique(data.medication)
2-element Vector{String15}: "Medication" "No medication"
unique(data.tools)
2-element Vector{String15}: "No tools" "tools"
unique(data.firstaid)
2-element Vector{String31}: "First aid supplies" "No first aid supplies"
unique(data.sanitation)
2-element Vector{String15}: "Sanitation" "No sanitation"
unique(data.clothing)
2-element Vector{String15}: "Clothing" "NA"
unique(data.documents)
2-element Vector{String15}: "NA" "Documents"
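The `unique` calls above show that every supply column holds exactly two labels, and that `clothing` and `documents` use the string "NA" where the other columns use a "No ..." label. Below is a minimal sketch of turning such labels into Booleans, assuming "NA" means the item is absent (the variable list suggests this, but the source does not state it). `has_item` is a hypothetical helper, and the sample vectors are copied from the first five rows shown earlier.

```julia
# Values copied from the first five rows of the table shown above
clothing = ["Clothing", "Clothing", "Clothing", "Clothing", "NA"]
tools = ["No tools", "tools", "No tools", "tools", "No tools"]

# Hypothetical helper: true when the household has the item.
# Assumption: "NA" and any label starting with "No " both mean "absent".
has_item(label) = !(label == "NA" || startswith(label, "No "))

# Broadcast the helper over each column
has_clothing = has_item.(clothing)
has_tools = has_item.(tools)

println(has_clothing)
println(has_tools)
```

On the full table, the same helper could presumably be broadcast over a column, for example `has_item.(data.tools)`.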
using GLMakie
using DataFramesMeta
using Chain
using StatsBase
unique(data.age)
dict_age = sort(countmap(data.age))
OrderedCollections.OrderedDict{Int64, Int64} with 62 entries: 18 => 4 19 => 4 20 => 3 21 => 5 22 => 1 23 => 4 24 => 5 25 => 7 26 => 4 27 => 2 28 => 6 29 => 6 30 => 4 31 => 2 32 => 8 33 => 3 34 => 2 35 => 2 36 => 5 ⋮ => ⋮
collect(values(dict_age))
62-element Vector{Int64}: 4 4 3 5 1 4 5 7 4 2 ⋮ 1 2 3 1 2 1 2 1 1
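# Split by zombie status before plotting: group 1 is Human, group 2 is Zombie
data_grouped = groupby(data, :zombie)
f, ax, plt = hist(data_grouped[1].age, color = (:blue, 0.5), label = "Human")
hist!(ax, data_grouped[2].age, color = (:red, 0.5), label = "Zombie")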
axislegend(ax)
display(f)
GLMakie.Screen(...)
data_grouped = groupby(data, :zombie)
GroupedDataFrame with 2 groups based on key: zombie
Row | zombieid | zombie | age | sex | rurality | household | water | food | medication | tools | firstaid | sanitation | clothing | documents |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String7 | Int64 | String7 | String15 | Int64 | Int64 | String7 | String15 | String15 | String31 | String15 | String15 | String15 |
1 | 1 | Human | 18 | Female | Rural | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
2 | 2 | Human | 18 | Male | Rural | 3 | 24 | Food | Medication | tools | First aid supplies | Sanitation | Clothing | NA |
3 | 3 | Human | 18 | Male | Rural | 4 | 16 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
4 | 4 | Human | 19 | Male | Rural | 1 | 0 | Food | Medication | tools | No first aid supplies | Sanitation | Clothing | NA |
5 | 5 | Human | 19 | Male | Urban | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | NA | NA |
6 | 6 | Human | 19 | Female | Urban | 1 | 0 | Food | Medication | tools | First aid supplies | Sanitation | Clothing | NA |
7 | 7 | Human | 20 | Female | Suburban | 2 | 0 | No food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
8 | 8 | Human | 20 | Female | Rural | 2 | 0 | Food | No medication | No tools | No first aid supplies | Sanitation | Clothing | NA |
9 | 9 | Human | 21 | Female | Urban | 1 | 8 | No food | No medication | tools | First aid supplies | Sanitation | Clothing | Documents |
10 | 10 | Human | 21 | Female | Rural | 2 | 8 | No food | No medication | tools | First aid supplies | Sanitation | Clothing | Documents |
11 | 11 | Human | 21 | Male | Rural | 1 | 8 | Food | No medication | No tools | First aid supplies | No sanitation | NA | NA |
12 | 12 | Human | 21 | Male | Rural | 2 | 16 | No food | Medication | No tools | No first aid supplies | Sanitation | Clothing | Documents |
13 | 13 | Human | 22 | Male | Suburban | 2 | 16 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | Documents |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
110 | 110 | Human | 63 | Female | Rural | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | Documents |
111 | 111 | Human | 65 | Female | Rural | 2 | 16 | Food | No medication | tools | No first aid supplies | No sanitation | NA | NA |
112 | 112 | Human | 67 | Male | Rural | 2 | 16 | No food | No medication | tools | No first aid supplies | No sanitation | NA | NA |
113 | 113 | Human | 68 | Male | Rural | 2 | 8 | No food | Medication | No tools | First aid supplies | Sanitation | Clothing | Documents |
114 | 114 | Human | 69 | Female | Rural | 2 | 8 | Food | Medication | No tools | First aid supplies | No sanitation | NA | NA |
115 | 115 | Human | 71 | Female | Urban | 2 | 8 | Food | Medication | No tools | No first aid supplies | Sanitation | Clothing | Documents |
116 | 116 | Human | 72 | Male | Suburban | 2 | 0 | Food | Medication | tools | No first aid supplies | No sanitation | Clothing | NA |
117 | 117 | Human | 74 | Male | Suburban | 1 | 0 | Food | Medication | tools | First aid supplies | No sanitation | Clothing | NA |
118 | 118 | Human | 75 | Female | Rural | 1 | 8 | Food | No medication | tools | First aid supplies | No sanitation | NA | NA |
119 | 119 | Human | 77 | Female | Rural | 1 | 8 | Food | Medication | No tools | No first aid supplies | No sanitation | NA | NA |
120 | 120 | Human | 81 | Male | Rural | 1 | 8 | Food | Medication | tools | No first aid supplies | No sanitation | NA | NA |
121 | 121 | Human | 32 | Male | Rural | 2 | 8 | Food | No medication | No tools | First aid supplies | Sanitation | Clothing | Documents |
⋮
Row | zombieid | zombie | age | sex | rurality | household | water | food | medication | tools | firstaid | sanitation | clothing | documents |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String7 | Int64 | String7 | String15 | Int64 | Int64 | String7 | String15 | String15 | String31 | String15 | String15 | String15 |
1 | 122 | Zombie | 20 | Female | Urban | 2 | 0 | Food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
2 | 123 | Zombie | 23 | Male | Suburban | 3 | 0 | No food | No medication | No tools | No first aid supplies | No sanitation | Clothing | NA |
3 | 124 | Zombie | 25 | Female | Rural | 5 | 0 | No food | No medication | No tools | No first aid supplies | No sanitation | Clothing | NA |
4 | 125 | Zombie | 28 | Female | Suburban | 3 | 0 | No food | Medication | tools | First aid supplies | No sanitation | Clothing | NA |
5 | 126 | Zombie | 31 | Female | Rural | 4 | 0 | No food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
6 | 127 | Zombie | 32 | Male | Suburban | 4 | 0 | No food | No medication | No tools | No first aid supplies | Sanitation | NA | Documents |
7 | 128 | Zombie | 42 | Male | Rural | 4 | 8 | No food | No medication | No tools | No first aid supplies | Sanitation | Clothing | Documents |
8 | 129 | Zombie | 43 | Male | Urban | 5 | 8 | No food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
9 | 130 | Zombie | 44 | Male | Rural | 5 | 8 | Food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
10 | 131 | Zombie | 45 | Male | Urban | 4 | 0 | Food | Medication | No tools | No first aid supplies | No sanitation | Clothing | NA |
11 | 132 | Zombie | 47 | Female | Urban | 2 | 0 | No food | Medication | No tools | No first aid supplies | No sanitation | Clothing | NA |
12 | 133 | Zombie | 48 | Female | Suburban | 3 | 0 | No food | No medication | No tools | First aid supplies | No sanitation | Clothing | NA |
13 | 134 | Zombie | 48 | Female | Urban | 2 | 0 | No food | No medication | tools | First aid supplies | Sanitation | Clothing | Documents |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
68 | 189 | Zombie | 32 | Male | Urban | 3 | 0 | No food | No medication | No tools | First aid supplies | No sanitation | Clothing | NA |
69 | 190 | Zombie | 41 | Female | Rural | 5 | 0 | No food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
70 | 191 | Zombie | 43 | Female | Rural | 5 | 0 | No food | No medication | No tools | No first aid supplies | Sanitation | Clothing | Documents |
71 | 192 | Zombie | 48 | Female | Suburban | 4 | 8 | No food | No medication | No tools | No first aid supplies | No sanitation | NA | NA |
72 | 193 | Zombie | 58 | Male | Urban | 1 | 0 | Food | No medication | tools | First aid supplies | No sanitation | NA | NA |
73 | 194 | Zombie | 65 | Male | Urban | 1 | 0 | No food | No medication | tools | First aid supplies | No sanitation | NA | NA |
74 | 195 | Zombie | 67 | Female | Suburban | 2 | 0 | No food | No medication | No tools | No first aid supplies | No sanitation | NA | NA |
75 | 196 | Zombie | 68 | Male | Suburban | 1 | 0 | Food | No medication | No tools | No first aid supplies | Sanitation | Clothing | Documents |
76 | 197 | Zombie | 71 | Male | Suburban | 1 | 8 | No food | No medication | tools | First aid supplies | No sanitation | Clothing | NA |
77 | 198 | Zombie | 76 | Female | Urban | 1 | 0 | No food | No medication | tools | First aid supplies | Sanitation | Clothing | Documents |
78 | 199 | Zombie | 82 | Male | Urban | 1 | 0 | No food | No medication | No tools | No first aid supplies | No sanitation | NA | NA |
79 | 200 | Zombie | 85 | Male | Urban | 1 | 0 | No food | Medication | No tools | No first aid supplies | Sanitation | Clothing | NA |
countmap(data_grouped[1].sex)
Dict{String7, Int64} with 2 entries: "Female" => 62 "Male" => 59
first(data)
Row | zombieid | zombie | age | sex | rurality | household | water | food | medication | tools | firstaid | sanitation | clothing | documents |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String7 | Int64 | String7 | String15 | Int64 | Int64 | String7 | String15 | String15 | String31 | String15 | String15 | String15 |
1 | 1 | Human | 18 | Female | Rural | 1 | 0 | Food | Medication | No tools | First aid supplies | Sanitation | Clothing | NA |
data_count = @chain data begin
groupby(:zombie)
@combine(:sex = countmap(:sex),
:rurality = countmap(:rurality),
:food = countmap(:food),
:medication = countmap(:medication),
:tools = countmap(:tools),
:firstaid = countmap(:firstaid),
:sanitation = countmap(:sanitation),
:clothing = countmap(:clothing),
:documents = countmap(:documents))
end
Row | zombie | sex | rurality | food | medication | tools | firstaid | sanitation | clothing | documents |
---|---|---|---|---|---|---|---|---|---|---|
| | String7 | Dict… | Dict… | Dict… | Dict… | Dict… | Dict… | Dict… | Dict… | Dict… |
1 | Human | Dict{String7, Int64}("Female"=>62, "Male"=>59) | Dict{String15, Int64}("Urban"=>16, "Rural"=>80, "Suburban"=>25) | Dict{String7, Int64}("Food"=>91, "No food"=>30) | Dict{String15, Int64}("No medication"=>43, "Medication"=>78) | Dict{String15, Int64}("tools"=>60, "No tools"=>61) | Dict{String31, Int64}("First aid supplies"=>67, "No first aid supplies"=>54) | Dict{String15, Int64}("Sanitation"=>73, "No sanitation"=>48) | Dict{String15, Int64}("NA"=>47, "Clothing"=>74) | Dict{String15, Int64}("NA"=>77, "Documents"=>44) |
2 | Zombie | Dict{String7, Int64}("Female"=>37, "Male"=>42) | Dict{String15, Int64}("Urban"=>38, "Rural"=>18, "Suburban"=>23) | Dict{String7, Int64}("Food"=>19, "No food"=>60) | Dict{String15, Int64}("No medication"=>63, "Medication"=>16) | Dict{String15, Int64}("tools"=>39, "No tools"=>40) | Dict{String31, Int64}("First aid supplies"=>39, "No first aid supplies"=>40) | Dict{String15, Int64}("Sanitation"=>25, "No sanitation"=>54) | Dict{String15, Int64}("NA"=>27, "Clothing"=>52) | Dict{String15, Int64}("NA"=>57, "Documents"=>22) |
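Raw counts are hard to compare because the groups differ in size (121 humans, 79 zombies). The sketch below converts the counts from the `data_count` output above into within-group shares. It assumes that "NA" in `clothing` and `documents` means the item is absent. The names `supply_counts` and `share` are hypothetical, and the numbers are copied by hand rather than read from `data_count`.

```julia
# Counts copied from the data_count output above, as (has item, lacks item)
supply_counts = (
    Human = (food = (91, 30), medication = (78, 43), tools = (60, 61),
             firstaid = (67, 54), sanitation = (73, 48),
             clothing = (74, 47), documents = (44, 77)),
    Zombie = (food = (19, 60), medication = (16, 63), tools = (39, 40),
              firstaid = (39, 40), sanitation = (25, 54),
              clothing = (52, 27), documents = (22, 57)),
)

# Share of a group that has the item
share((present, absent)) = present / (present + absent)

# Print the share per group and the human-minus-zombie difference
for item in keys(supply_counts.Human)
    h = share(supply_counts.Human[item])
    z = share(supply_counts.Zombie[item])
    println(rpad(string(item), 12),
            " human: ", round(h, digits = 2),
            "  zombie: ", round(z, digits = 2),
            "  difference: ", round(h - z, digits = 2))
end
```

With these counts, food and medication show the largest gaps between the two groups. This is an association only and says nothing about cause.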
colors = [:red, :blue]
elem_1 = [PolyElement(color = :red, strokecolor = :blue, strokewidth = 1)]
elem_2 = [PolyElement(color = :blue, strokecolor = :blue, strokewidth = 1)]
1-element Vector{PolyElement}: PolyElement(Attributes with 3 entries: polycolor => blue polystrokecolor => blue polystrokewidth => 1)
f, ax, plt = pie(collect(values(data_count.sex[1])),
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5,
axis = ( autolimitaspect = 1, ))
ax2 = Axis(f[1,2], autolimitaspect = 1, )
pie!(ax2, collect(values(data_count.sex[2])),
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5)
Legend(f[1, 3],
[elem_1, elem_2],
["Female", "Male"],
patchsize = (35, 35), rowgap = 10)
display(f)
GLMakie.Screen(...)
f, ax, plt = pie(collect(values(data_count.food[1])),
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5,
axis = ( autolimitaspect = 1, ))
ax2 = Axis(f[1,2], autolimitaspect = 1, )
pie!(ax2, collect(values(data_count.food[2])),
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5)
Legend(f[1, 3],
[elem_1, elem_2],
["Food", "No Food"],
patchsize = (35, 35), rowgap = 10)
display(f)
GLMakie.Screen(...)
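# Look up the counts by label so the slice order matches the legend below.
# Dict iteration order is not guaranteed: in data_count, "No medication" comes first.
med_labels = ["Medication", "No medication"]
f, ax, plt = pie([data_count.medication[1][l] for l in med_labels],
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5,
axis = ( autolimitaspect = 1, ))
ax2 = Axis(f[1,2], autolimitaspect = 1, )
pie!(ax2, [data_count.medication[2][l] for l in med_labels],
color = colors,
radius = 4,
inner_radius = 2,
strokecolor = :white,
strokewidth = 5)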
Legend(f[1, 3],
[elem_1, elem_2],
["Medication", "No Medication"],
patchsize = (35, 35), rowgap = 10)
display(f)
GLMakie.Screen(...)
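Each donut chart passes `collect(values(...))` of a `Dict` to `pie`, so which slice gets which color depends on the dictionary's iteration order, which Julia does not guarantee. The sketch below is a hypothetical helper, `counts_in_order`, that returns the counts in the same order as the legend labels. It could be reused for the remaining supply columns. The example dictionary is copied from the human row of `data_count`.

```julia
# Hypothetical helper: counts for `labels`, in that order (0 if a label is missing)
counts_in_order(counts::AbstractDict, labels) = [get(counts, l, 0) for l in labels]

# Human medication counts, copied from the data_count output above
med_human = Dict("No medication" => 43, "Medication" => 78)

# Order the values to match a legend that lists "Medication" first
legend_labels = ["Medication", "No medication"]
ordered = counts_in_order(med_human, legend_labels)
println(ordered)
```

The resulting vector can be passed to `pie` in place of `collect(values(...))`, so the first color always belongs to the first legend entry.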