Men are 7 time more likely to die from a gun death than women in USA 11 Sep 2017

Also you are more likely to die by suicide if you are a white person with only High School or GED diploma.

I have always been fascinated with the lack of gun control in the United States. I was also curious to know what are the odds of dying by a gun if you are a man or a woman, or if race plays a role into it at all. Also I wanted to look at the places where is more likely that you will die of a mass shooting (I will do that in a second post). So I went online and started looking for gun deaths and mass shootings data available for USA. After some digging I found some CSV files that I could download on the CDC and started figuring out some things from it using Python.

I looked first at a file with gun deaths from 2012 to 2014, and explored how many gun deaths were there by year.

For this I am going to use Pandas (a free to use Python data analytics library). I open the file that you can download here and have a look at the structure of the data and how many lines of data we have here.

import pandas as pd

data = pd.read_csv('full_data.csv')

print data

print len(data.index)
  Unnamed: 0  year  month   intent  police sex   age                    race  \
0           1  2012      1  Suicide       0   M  34.0  Asian/Pacific Islander   
1           2  2012      1  Suicide       0   F  21.0                   White   
2           3  2012      1  Suicide       0   M  60.0                   White   
3           4  2012      2  Suicide       0   M  64.0                   White   
4           5  2012      2  Suicide       0   M  31.0                   White   

   hispanic            place     education  
0       100             Home           BA+  
1       100           Street  Some college  
2       100  Other specified           BA+  
3       100             Home           BA+  
4       100  Other specified        HS/GED 

As we can see the data has information about the year and month when the death happened, the intent (suicide, homicide etc), the sex of the victim, age, race, as well as the education of the victim, and the place where the crime happened.

Now let’s see how many of those crimes happened each year for the time period between 2012 and 2014.

print 'Death counts by year'
print '2012', data[(data['year'] == 2012)].count().head(1)[0]
print '2013', data[(data['year'] == 2013)].count().head(1)[0]
print '2014', data[(data['year'] == 2014)].count().head(1)[0]
Death counts by year
2012 33563
2013 33636
2014 33599

This doesn’t tell us much, except that the data collected seems to be pretty evenly split yearly.

Now lets see a break down on sex, is it more likely to die by a gun death if you are a man or if you are a woman.

print 'number of victims that are male in 2012 ', data[(data['year'] == 2012) & (data['sex'] == 'M')].count()[0]
print 'number of victims that are male in 2013 ', data[(data['year'] == 2013) & (data['sex'] == 'M')].count()[0]
print 'number of victims that are male in 2014 ', data[(data['year'] == 2014) & (data['sex'] == 'M')].count()[0]
print 'number of victims that are male in 2012 ', data[(data['year'] == 2012) & (data['sex'] == 'F')].count()[0]
print 'number of victims that are male in 2013 ', data[(data['year'] == 2013) & (data['sex'] == 'F')].count()[0]
print 'number of victims that are male in 2014 ', data[(data['year'] == 2014) & (data['sex'] == 'F')].count()[0]
number of victims that are male in 2012  28838
number of victims that are male in 2013  28794
number of victims that are male in 2014  28717
number of victims that are male in 2012  4725
number of victims that are male in 2013  4842
number of victims that are male in 2014  4882

So far we know that the number of deaths by gun has not raised year-to-year in the 2012-2014 period and that it is more likely to die like that if you are a man, than if you are a woman. Each year men are 7 times more likely to die from a gun death than women.

Next I wanted to see what was more prevalent suicides or homicides and what influenced them - race, education. First a racial breakdown of gun deaths and then a break down on intent.

# break down on race
print '\n\n Racial break down of gun deaths'
print data.groupby('race').count()['year']

# break down on intent
print '\n\n Intent break down of gun deaths'
print data.groupby('intent').count()['year']
Racial break down of gun deaths
race
Asian/Pacific Islander             1326
Black                             23296
Hispanic                           9022
Native American/Native Alaskan      917
White                             66237


Intent break down of gun deaths
intent
Accidental       1639
Homicide        35176
Suicide         63175
Undetermined      807

From this we can see that there are a lot of suicides and a lot of white people who are getting killed by guns. The rate of homicides is almost half of the one of suicides, that might be an indication that the biggest problems of the American people is not crime against others but crime against oneself. Let’s see who is more likely to commit suicide then.

# break down on race and intent
print '\n\nWhites'
d1 =  data[(data['race'] == 'White')]
d1.reset_index(inplace=True)
print d1.groupby('intent').count()['year']

print '\n\nBlacks'
d2 =  data[(data['race'] == 'Black')]
d2.reset_index(inplace=True)
print d2.groupby('intent').count()['year']
Whites
intent
Accidental       1132
Homicide         9147
Suicide         55372
Undetermined      585
Name: year, dtype: int64


Blacks
intent
Accidental        328
Homicide        19510
Suicide          3332
Undetermined      126

So, whites are more likely to commit suicide and are dying at a higher rate than blacks (which is normal, it’s a white dominated population). At the same time, it looks like most blacks’ deaths are caused by homicide. It would be interesting to know which are the zipcodes in which those things are more prevalent - where do people who commit suicide more often live?

One last thing that I wanted to check is if education had any influence on the gun death rate. So I looked at that data as well.

# break down on education

print data.groupby('education').count()['year']
BA+             12946
HS/GED          42927
Less than HS    21823
Some college    21680

So, the most vulnerable people in this data set are the ones considered as middle class, they have a High School or GED diploma. Let’s see how that correlates with race.

print '\n\nBreakdown on education and race'
d3 = data[(data['education'] == 'HS/GED') & (data['intent'] == 'Suicide')]
d3.reset_index(inplace=True)
print d3.groupby('race').count()['year']
Breakdown on education and race
race
Asian/Pacific Islander              171
Black                              1430
Hispanic                           1133
Native American/Native Alaskan      247
White                             23340

Whites with a HS/GED diploma are ~25 times more likely to kill themselves that are black or hispanic people. That’s quite depressing. I intend to continue this by adding in data for mass shootings and police shootings, I am sure it’s going to be very interesting.