Analyzing Geographical Distribution of Fatalities in ACLED Dataset

Apr 5, 2024, 11:06 AM

Where are the highest fatalities concentrated geographically?

We analyzed the geographical distribution of fatalities in the ACLED dataset using a scatter geo plot. The plot encodes the number of fatalities at each set of geographical coordinates, showing where the highest fatalities are concentrated.

Analyze trends in fatalities over time

We analyzed the trend in fatalities over time using the ACLED dataset, plotting the sum of fatalities for each month as a line chart. The plot shows how the number of fatalities has changed over the years, including periods of increased or decreased violence.
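The monthly aggregation behind a plot like this can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code that produced the plot above. The sample rows are hypothetical, and the (event_date, fatalities) layout is an assumption about how the data is stored.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows (event_date, fatalities); not real ACLED records.
events = [
    (date(2021, 1, 5), 3),
    (date(2021, 1, 20), 7),
    (date(2021, 2, 11), 0),
    (date(2021, 2, 28), 4),
    (date(2021, 3, 2), 12),
]


def monthly_fatalities(rows):
    """Sum fatalities per (year, month) and return them in chronological order."""
    totals = defaultdict(int)
    for event_date, fatalities in rows:
        totals[(event_date.year, event_date.month)] += fatalities
    return sorted(totals.items())


monthly = monthly_fatalities(events)
for (year, month), total in monthly:
    print(f"{year}-{month:02d}: {total}")
```

The sorted (year, month) totals are the series a line plot would draw, one point per month.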

Compare fatalities by event type

Compare fatalities by event type and test the null hypothesis that there will be no statistically significant difference in the fatalities by event type.

Fatalities by Event Type and Statistical Analysis

  • We visualized the comparison of fatalities by event type using a bar chart, which shows the number of fatalities associated with each event type.
  • To test the null hypothesis that there is no statistically significant difference in fatalities by event type, a one-way ANOVA test was performed.
  • ANOVA Test Results:
    • F-statistic: 199.01
    • P-value: 9.07e-211
  • The extremely low p-value lets us reject the null hypothesis: mean fatalities per event differ significantly across event types.
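To make the test statistic concrete, here is a minimal sketch of how a one-way ANOVA F-statistic is computed: the variance between group means divided by the variance within groups. The group values below are hypothetical and will not reproduce the 199.01 reported above. The sketch uses only the standard library and stops at the F-statistic; a routine such as scipy.stats.f_oneway would return the p-value as well.

```python
from statistics import mean

# Hypothetical fatalities per event, grouped by event type; not real ACLED data.
groups = {
    "Battles": [4, 6, 5, 9],
    "Protests": [0, 0, 1, 0],
    "Violence against civilians": [2, 3, 7, 4],
}


def one_way_anova_f(samples):
    """Return the one-way ANOVA F-statistic for a list of sample lists."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    k = len(samples)      # number of groups
    n = len(all_values)   # total number of observations
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_stat = one_way_anova_f(list(groups.values()))
print(f"F = {f_stat:.2f}")
```

A large F means the group means are far apart relative to the scatter inside each group, which is what drives a p-value as small as the one reported.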

Perform a regression analysis on fatalities

To perform a regression analysis on fatalities, we'll first need to prepare the data by selecting relevant features that could influence fatalities, such as event type, location, and year. Then, we'll split the data into training and testing sets, choose a regression model, train the model on the training set, and evaluate its performance on the test set. Let's start by preparing the data.

Regression Analysis on Fatalities

  • We performed a regression analysis to understand the relationship between fatalities and other factors such as year, location (latitude and longitude), and event type.
  • The dataset was prepared by selecting relevant features and converting categorical data (event type) to numeric using one-hot encoding.
  • The data was split into training and testing sets, and a Linear Regression model was trained on the training set.
  • Model Evaluation:
    • Mean Squared Error (MSE): 234.47
    • R^2 Score: 0.043
  • An R^2 score of 0.043 means the model explains only about 4% of the variance in fatalities. The MSE of 234.47 corresponds to a root mean squared error of about 15.3 fatalities, so the model's predictions are not very accurate. Other factors not included in the model, or a different model, might better capture the complexity of factors influencing fatalities.
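For reference, the two evaluation metrics can be computed from observed and predicted values as sketched below. This is an illustration of the metric definitions only, using the standard library. The observed and predicted values are hypothetical and are not output from the model described above.

```python
from math import sqrt
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for paired observed and predicted values."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r ** 2 for r in residuals)           # unexplained variation
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)    # total variation
    mse = ss_res / len(residuals)
    return mse, sqrt(mse), 1 - ss_res / ss_tot


# Hypothetical observed vs. predicted fatalities; not real model output.
y_true = [0, 2, 1, 10, 3]
y_pred = [3, 3, 3, 4, 3]

mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.2f}  RMSE={rmse:.2f}  R^2={r2:.3f}")
```

Because MSE is in squared units, its square root (RMSE) is usually easier to compare against typical fatality counts. R^2 near zero means the predictions do little better than always guessing the mean.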

Generate a 3d visualization of fatalities and location

Summary of Analysis

  • Trends in Fatalities Over Time: A line plot visualized the trend of fatalities over months, highlighting periods of increased or decreased violence.
  • Fatalities by Event Type: A bar chart compared fatalities across different event types. An ANOVA test revealed a statistically significant difference in fatalities by event type, suggesting variations in the impact of different event types on fatalities.
  • Regression Analysis: A regression analysis aimed to understand the relationship between fatalities and factors like year, location, and event type. The model showed a low R^2 score, indicating it explains a small portion of the variance in fatalities. The high MSE suggests the model's predictions may not be very accurate, indicating the complexity of factors influencing fatalities.
  • 3D Visualization of Fatalities and Location: A 3D scatter plot illustrated the relationship between fatalities, latitude, and longitude, providing a visual representation of how fatalities are distributed geographically.

Investigate the impact of location on fatalities

To investigate the impact of location on fatalities, we'll analyze fatalities by admin1 (first-level administrative division, like a state or province) and admin2 (second-level administrative division, like a city or district). This approach helps us understand how fatalities are distributed across different regions and identify areas with higher incidences of violence. We'll start by aggregating fatalities by these administrative divisions.

Impact of Location on Fatalities

  • We analyzed the impact of location on fatalities by grouping data by first-level (admin1) and second-level (admin2) administrative divisions and summing the fatalities.
  • The analysis revealed the regions with the highest number of fatalities, indicating areas with higher incidences of violence. For example, Borno state, particularly the Maiduguri Metro and Gwoza areas, showed notably high numbers of fatalities.
  • This analysis helps identify areas most affected by violence, guiding targeted interventions and policy decisions.
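The grouping step described above can be sketched as follows, assuming each record carries admin1, admin2, and fatalities values. The rows and division names are placeholders, not real ACLED records, and the sketch uses only the standard library.

```python
from collections import Counter

# Hypothetical rows (admin1, admin2, fatalities) with placeholder names.
rows = [
    ("State A", "District 1", 12),
    ("State A", "District 2", 30),
    ("State A", "District 1", 8),
    ("State B", "District 3", 5),
    ("State B", "District 3", 1),
]


def fatalities_by_division(records):
    """Sum fatalities per admin1 and per (admin1, admin2) pair."""
    by_admin1 = Counter()
    by_admin2 = Counter()
    for admin1, admin2, fatalities in records:
        by_admin1[admin1] += fatalities
        # Key on the pair so identically named districts in different
        # first-level divisions are not merged.
        by_admin2[(admin1, admin2)] += fatalities
    return by_admin1, by_admin2


by_admin1, by_admin2 = fatalities_by_division(rows)
print(by_admin1.most_common(1))   # first-level division with the most fatalities
print(by_admin2.most_common(2))   # top two second-level divisions
```

Ranking the totals with most_common is one simple way to surface the divisions with the highest fatalities.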

