The largest number of passengers were in third class; third class accounted for over 54% of all passengers.
Titanic: data analysis
This project demonstrates a data analysis of the people who survived and those who perished in the Titanic tragedy of 1912. The analysis is done in Python, using the pandas library in JupyterLab.
1. Data import
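A minimal sketch of the import step, assuming the data is a CSV file with the column names of the widely used `titanic3` dataset (`pclass`, `survived`, `age`, `ticket`, `cabin`, `body`, `home.dest`). The rows below are synthetic placeholders, not real passenger records; in the notebook the call would point at the actual file instead of an in-memory string.

```python
import io

import pandas as pd

# Synthetic stand-in for the real file (assumption: the real data has
# these columns, among others). Replace io.StringIO(...) with the path
# to the actual CSV file when running against the full dataset.
SAMPLE_CSV = """pclass,survived,sex,age,ticket,cabin,body,home.dest
1,1,female,29.0,T1,B5,,City A
1,0,male,,T2,C22,135,City B
2,1,female,,T3,,,
3,0,male,,T4,,,
3,0,male,4.0,T4,,,
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Quick sanity checks right after loading: size and column types.
print(df.shape)
print(df.dtypes)
```

Empty fields are read as `NaN`, which is what the missing-data step relies on.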


2. Definition of missing data
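One way to get an overview of the gaps is to count the missing values per column. This is only a sketch on a small synthetic frame; the column names `age`, `cabin` and `body` are assumed to match the real dataset.

```python
import pandas as pd

# Synthetic example data (not real passenger records).
df = pd.DataFrame(
    {
        "age": [29.0, None, None, 4.0],
        "cabin": ["B5", None, None, None],
        "body": [None, 135.0, None, None],
    }
)

# Number and share of missing values in each column.
missing = df.isna().sum()
missing_pct = (df.isna().mean() * 100).round(1)

summary = pd.DataFrame({"missing": missing, "percent": missing_pct})
print(summary.sort_values("missing", ascending=False))
```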

Observations:
a) A large number of passengers do not have an assigned age. This data will be partially filled in with synthetic values: for each individual ticket, I assume it belongs to an adult and assign an age of 30. This reduces the number of missing values from 263 to 84.
b) Many passengers also do not have assigned cabins – this data will be omitted.
c) Body numbers will also be omitted, as this data does not contribute anything to the analysis.
d) The destination will also not be analysed, because it has no relevance to survivability or the remaining analyses.
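The decisions above could be sketched roughly as follows. Two things are assumptions here: that "individual ticket" means a ticket number that appears for only one passenger, and that the columns are named `ticket`, `age`, `cabin`, `body` and `home.dest`. The data is synthetic.

```python
import pandas as pd

# Synthetic example data (not real passenger records).
df = pd.DataFrame(
    {
        "ticket": ["T1", "T2", "T3", "T3", "T4"],
        "age": [None, 40.0, None, 5.0, None],
        "cabin": [None, "C22", None, None, None],
        "body": [None, None, None, None, 12.0],
        "home.dest": ["City A", None, None, None, "City B"],
    }
)

ASSUMED_ADULT_AGE = 30  # value stated in observation a)

# How many passengers share each ticket number.
passengers_per_ticket = df["ticket"].map(df["ticket"].value_counts())

# Fill the age only where it is missing AND the ticket is held by one person.
individual_without_age = df["age"].isna() & (passengers_per_ticket == 1)
df.loc[individual_without_age, "age"] = ASSUMED_ADULT_AGE

# Leave cabin, body number and destination out of the main analysis frame.
# The original df keeps them in case they are needed for a separate look.
analysis_df = df.drop(columns=["cabin", "body", "home.dest"])

print(analysis_df)
print("ages still missing:", analysis_df["age"].isna().sum())
```

Passengers on shared tickets keep their missing age, because a shared ticket may include children.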
3. Assignment of passengers to cabins by class:


Observation:
From Wikipedia: first class had 416 cabins, second class had 162 cabins, and third class had 262 cabins.
Despite the large number of cabins in second and third class, the dataset records only 9 cabins assigned to passengers in each of these classes.
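A count like this can be obtained by grouping on the class and counting distinct cabin values. The sketch below uses synthetic data and assumes the columns are called `pclass` and `cabin`; a field that lists several cabins (for example `"C22 C26"`) is treated as a single value here.

```python
import pandas as pd

# Synthetic example data (not real passenger records).
df = pd.DataFrame(
    {
        "pclass": [1, 1, 1, 2, 2, 3, 3],
        "cabin": ["B5", "C22 C26", "B5", None, "D56", None, "F G73"],
    }
)

# Keep only passengers with a recorded cabin, then count per class:
#   count   = passengers that have a cabin value
#   nunique = distinct cabin values among them
cabins_per_class = (
    df.dropna(subset=["cabin"])
    .groupby("pclass")["cabin"]
    .agg(["count", "nunique"])
)

print(cabins_per_class)
```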
4. Passenger statistics:


Pie chart: Percentage of passengers in each class
Bar chart: Number of passengers in each class
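The two charts can be produced from a simple count per class. This is a sketch on synthetic data, assuming the class column is named `pclass`; the helper `plot_class_distribution` is a hypothetical name introduced for illustration.

```python
import pandas as pd

# Synthetic example data (not real passenger records).
df = pd.DataFrame({"pclass": [1, 1, 2, 2, 3, 3, 3, 3]})

# Number of passengers per class and their share of the total.
counts = df["pclass"].value_counts().sort_index()
percent = (counts / counts.sum() * 100).round(1)

print(pd.DataFrame({"passengers": counts, "percent": percent}))


def plot_class_distribution(class_counts):
    """Draw a pie chart (shares) and a bar chart (counts) side by side."""
    # Imported here so the counting code above runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

    class_counts.plot.pie(ax=ax_pie, autopct="%.1f%%")
    ax_pie.set_ylabel("")
    ax_pie.set_title("Percentage of passengers in each class")

    class_counts.plot.bar(ax=ax_bar, rot=0)
    ax_bar.set_xlabel("Class")
    ax_bar.set_ylabel("Passengers")
    ax_bar.set_title("Number of passengers in each class")

    fig.tight_layout()
    return fig
```

In a notebook, calling `plot_class_distribution(counts)` in a cell displays both charts.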
Observation: