PCA Analysis of Road Accident Data

Jul 16, 2024, 4:42 PM

Normalized_Monthly_Road_Accident_Final.xlsx

Perform PCA on the following parameters that the distress call data depends on:

  • Road Length(National Highway)
  • Road Length(State Highway)
  • Road Length(Other Roads)
  • Road Length(Village Roads)
  • Total Commercial Vehicles
  • Total Non-Commercial Vehicles
  • Two Wheelers(Motorcycles/Scooters/Mopeds)
  • Four Wheelers(Jeeps/motor cars/LCV/AV)
  • Three Wheelers(Auto Rikshaws/Tricycle Auto)
  • Heavy Vehicles(Lorries, Tractors,Road Rollers, Stage Carriage)
  • Service Vehicles(Ambulances/Fire Fighters/School Buses)
  • Public Transport(Motor cabs/Maxi cabs/Omni buses/ordinary Taxi/Mini Bus)

It appears that the column names in the dataset do not exactly match the ones provided. Let's list all the column names in the dataset to identify the correct names.
PCA ran successfully on the selected road accident parameters. The scatter plot of the first two principal components shows how the observations are distributed in the reduced two-dimensional space. It makes it easier to spot clusters and to see how widely the data spread along the two directions of greatest variance.
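
For readers who want to reproduce this kind of analysis, here is a minimal sketch of PCA using NumPy alone. It is not the code that produced the results in this conversation. The function name `pca`, the standardization step, and the synthetic `X_demo` matrix are all assumptions made for illustration, since the original spreadsheet is not available here.

```python
import numpy as np

def pca(X, n_components=2):
    """Hypothetical helper: PCA via SVD on standardized columns."""
    X = np.asarray(X, dtype=float)
    # Standardize each feature (assumed preprocessing; zero mean, unit variance).
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions (the loadings of each component).
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    # Project the observations onto the leading components.
    scores = Z @ components.T
    # Share of total variance captured by each component.
    variance = S ** 2 / (len(Z) - 1)
    ratio = (variance / variance.sum())[:n_components]
    return scores, components, ratio

# Synthetic stand-in data: 60 observations, 5 features, 3 of them correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(60, 1))
correlated = [base + 0.1 * rng.normal(size=(60, 1)) for _ in range(3)]
X_demo = np.hstack(correlated + [rng.normal(size=(60, 2))])

scores, components, ratio = pca(X_demo)
print(scores.shape, components.shape)
print("explained variance ratio:", ratio)
```

The two columns of `scores` are what a scatter plot of the first two principal components would display, and each row of `components` holds one loading per input feature.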

What are the top contributing features for each principal component?

The top contributing features for each principal component are as follows:

Principal Component 1

  • Total Commercial Vehicles: 0.364187
  • Total Non-Commercial Vehicles: 0.364096
  • Two Wheelers (Motercycles/Scooters/Mopeds): 0.363680
  • Public Transport (Motor cabs/Maxi cabs/Omni buses/ordinary Taxi/Mini Bus): 0.356239
  • Three Wheelers (Auto Rikshaws/Tricycle Auto): 0.352518
  • Service Vehicles (Ambulances/Fire Fighters/School Buses): 0.332910
  • Heavy Vehicles (Lorries, Tractors, Road Rollers, Stage Carriage): 0.317969

Principal Component 2

  • Road Length (Other Roads): 0.537688
  • Road Length (State Highway): 0.530596
  • Road Length (Village Roads): 0.529671
  • Road Length (National Highway): 0.272597
  • Heavy Vehicles (Lorries, Tractors, Road Rollers, Stage Carriage): 0.199443
  • Service Vehicles (Ambulances/Fire Fighters/School Buses): 0.140057
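These are the features with the largest absolute loadings on each principal component, so they carry the most weight in defining that component's direction. Principal Component 1 is dominated by the vehicle counts, while Principal Component 2 is dominated by the road lengths.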
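
The ranking step can be sketched in a few lines of Python. The loadings below are copied from the Principal Component 2 list above; the helper name `top_features` and the cutoff `k` are hypothetical and chosen only for this illustration.

```python
# Loadings copied from the Principal Component 2 list above.
pc2_loadings = {
    "Road Length (State Highway)": 0.530596,
    "Road Length (Other Roads)": 0.537688,
    "Road Length (Village Roads)": 0.529671,
    "Road Length (National Highway)": 0.272597,
    "Heavy Vehicles (Lorries, Tractors, Road Rollers, Stage Carriage)": 0.199443,
    "Service Vehicles (Ambulances/Fire Fighters/School Buses)": 0.140057,
}

def top_features(loadings, k=3):
    """Hypothetical helper: return the k features with the largest |loading|."""
    ranked = sorted(loadings.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]

top3 = top_features(pc2_loadings)
for name, value in top3:
    print(f"{name}: {value:.6f}")
```

Ranking by absolute value matters because a large negative loading contributes as strongly as a large positive one.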

Perform PCA on the following parameters that the distress call data depends on: Sex Ratio, Literacy rate, Employment percentage, Poverty rate, Mobile Phone Availability

PCA ran successfully on the selected parameters that the distress call data depends on. The scatter plot of the first two principal components shows how the observations are distributed in the reduced two-dimensional space. It makes it easier to spot clusters and to see how widely the data spread along the two directions of greatest variance.
The top contributing features for each principal component based on the distress call data are as follows:

Principal Component 1

  • Employment percentage: 0.623573
  • Literacy rate: 0.543155
  • Poverty rate: 0.502832
  • Sex Ratio: 0.210003
  • Mobile Phone Availability: 0.138557

Principal Component 2

  • Sex Ratio: -0.718349
  • Literacy rate: -0.405498
  • Mobile Phone Availability: 0.404086
  • Employment percentage: 0.330659
  • Poverty rate: 0.216623
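The features are ranked by the absolute value of their loadings, so those at the top of each list carry the most weight in defining that component's direction. The sign of a loading only shows whether a feature moves with or against the component: on Principal Component 2, Sex Ratio and Literacy rate pull in the opposite direction from Mobile Phone Availability, Employment percentage, and Poverty rate.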
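
One way to make these loadings easier to compare is to square them. A component's loading vector has unit length, so the squared loadings sum to about 1 and each one can be read as that feature's share of the component's direction. The sketch below uses the Principal Component 1 values reported above; the variable names are assumptions for illustration.

```python
# Loadings copied from the Principal Component 1 list above.
pc1_loadings = {
    "Employment percentage": 0.623573,
    "Literacy rate": 0.543155,
    "Poverty rate": 0.502832,
    "Sex Ratio": 0.210003,
    "Mobile Phone Availability": 0.138557,
}

# Squared loadings of a unit-length vector sum to (approximately) 1.
total = sum(value ** 2 for value in pc1_loadings.values())

# Each feature's share of the component's direction.
shares = {name: value ** 2 / total for name, value in pc1_loadings.items()}

print(f"sum of squared loadings: {total:.4f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

By this measure, Employment percentage accounts for roughly 39% of Principal Component 1, and the first three features together account for most of it.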