Notebook

Aug 12, 2024, 4:51 PM

give me the eda for the above csv file

Exploratory Data Analysis (EDA) for Titanic Dataset

The dataset contains the following columns:
  • PassengerId: Unique ID for each passenger
  • Survived: Survival (0 = No; 1 = Yes)
  • Pclass: Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
  • Name: Name of the passenger
  • Sex: Sex of the passenger
  • Age: Age of the passenger
  • SibSp: Number of siblings/spouses aboard the Titanic
  • Parch: Number of parents/children aboard the Titanic
  • Ticket: Ticket number
  • Fare: Passenger fare
  • Cabin: Cabin number
  • Embarked: Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)

Let's start by understanding the basic statistics and structure of the dataset.
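As a minimal sketch of that first structural check, the snippet below loads a CSV with the columns listed above and counts missing cells per column. The rows in `SAMPLE_CSV` are made up for illustration and are not taken from the uploaded file; the helper names `load_rows` and `missing_counts` are hypothetical. It uses only the standard library to stay dependency-free, though pandas would be a common choice for the same task.

```python
import csv
import io

# Hypothetical stand-in for the uploaded file: a few made-up rows that use
# the column names listed above. For a real file, pass
# open("your_file.csv", newline="") to load_rows instead.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,S
4,0,2,Passenger D,male,35,0,0,T4,13.00,,Q
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def missing_counts(rows):
    """Count empty cells per column; an empty string marks a missing value."""
    columns = rows[0].keys() if rows else []
    return {col: sum(1 for row in rows if not row[col].strip()) for col in columns}


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(f"{len(rows)} rows, {len(rows[0])} columns")
for column, n_missing in missing_counts(rows).items():
    print(f"{column:<12} missing: {n_missing}")
```

Checking missing values first matters because columns such as Age and Cabin can be partly empty, which affects how averages and percentages should be read.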

Summary Statistics

  • Survived: 38.4% of passengers survived.
  • Pclass: Most passengers were in 3rd class (55.1%).
  • Sex: There were more males (64.8%) than females (35.2%).
  • Age: The average age of passengers was approximately 29.7 years.
  • SibSp: Most passengers had no siblings/spouses aboard (68.7%).
  • Parch: Most passengers had no parents/children aboard (76.1%).
  • Fare: The average fare was approximately 32.20 (the data does not specify a currency).
  • Cabin: Only 23% of passengers had a recorded cabin number.
  • Embarked: Most passengers embarked from Southampton (72.4%).
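Figures like the ones above can be computed along the following lines. This is a sketch under stated assumptions, not the method used to produce the numbers in this notebook: the rows are made up for illustration, and the helpers `shares`, `column_mean`, and `recorded_share` are hypothetical names. Only the standard library is used.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows (made up for illustration, not taken from the file).
# Values are strings, as csv.DictReader would return them.
rows = [
    {"Survived": "0", "Sex": "male", "Age": "22", "Fare": "7.25", "Cabin": ""},
    {"Survived": "1", "Sex": "female", "Age": "38", "Fare": "71.25", "Cabin": "C85"},
    {"Survived": "1", "Sex": "female", "Age": "", "Fare": "8.00", "Cabin": ""},
    {"Survived": "0", "Sex": "male", "Age": "36", "Fare": "13.50", "Cabin": ""},
]


def shares(rows, column):
    """Percentage of rows per value, computed over non-missing cells only."""
    values = [row[column] for row in rows if row[column] != ""]
    counts = Counter(values)
    return {value: 100 * n / len(values) for value, n in counts.items()}


def column_mean(rows, column):
    """Mean of a numeric column, skipping missing (empty) cells."""
    return mean(float(row[column]) for row in rows if row[column] != "")


def recorded_share(rows, column):
    """Percentage of all rows that have a non-empty value in the column."""
    return 100 * sum(1 for row in rows if row[column] != "") / len(rows)


print("Survived shares:", shares(rows, "Survived"))
print("Sex shares:", shares(rows, "Sex"))
print("Mean age:", column_mean(rows, "Age"))
print("Mean fare:", column_mean(rows, "Fare"))
print("Cabin recorded (%):", recorded_share(rows, "Cabin"))
```

Note the two denominators: `shares` divides by the number of non-missing cells, while `recorded_share` divides by all rows. For a column with missing values, the choice shifts the percentage slightly, and the summary above does not say which convention was used.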

Next, let's visualize some of these statistics to get a better understanding of the data.