Titanic Survival Rate Prediction
A legendary example of data analysis, hosted by Kaggle.
Official page: https://www.kaggle.com/c/titanic
Feature extraction (feature engineering)
pd.get_dummies(df)
One-hot encodes df; well suited to classification models such as random forest.
Ref: https://blog.csdn.net/maymay_/article/details/80198468
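As a minimal sketch of what pd.get_dummies does, the snippet below encodes a small made-up frame. The column names follow the Titanic dataset, but the rows are hypothetical and are not taken from the competition data.

```python
import pandas as pd

# Hypothetical toy frame with Titanic-style column names (values are made up).
df = pd.DataFrame({
    "Pclass": [1, 3, 2],
    "Sex": ["female", "male", "female"],
    "Embarked": ["S", "C", "S"],
})

# Numeric columns pass through unchanged; each string column is expanded
# into one indicator column per distinct category (e.g. Sex_female, Sex_male).
encoded = pd.get_dummies(df)
print(encoded.columns.tolist())
print(encoded)
```

Each categorical column becomes several indicator columns, so a model that expects numeric input can consume the frame directly.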
How the random forest model works
A ‘forest’ of decision trees is generated. Given a sample, each tree in the forest makes a series of binary decisions on the specified features. Both the order in which the features are considered and the decision criteria differ from tree to tree, because each tree is built with randomness (a random sample of the training data and random subsets of the features). The results from all trees are then gathered into a vote, and the outcome with the most votes wins.
[The following figures are provided by Kaggle]
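The voting idea described above can be tried out with scikit-learn's RandomForestClassifier. This is only a sketch under stated assumptions: the training rows are made up, only two features are used, and the hyperparameters (100 trees, depth 5) are illustrative choices rather than tuned values. Note that scikit-learn combines the trees by averaging their predicted class probabilities, which is closely related to, but not identical to, a plain majority vote.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data with Titanic-style columns (values are made up).
train = pd.DataFrame({
    "Pclass": [1, 3, 3, 2, 1, 3, 2, 3],
    "Sex": ["female", "male", "male", "female",
            "male", "female", "male", "male"],
    "Survived": [1, 0, 0, 1, 1, 1, 0, 0],
})

# One-hot encode the categorical feature so the model sees numeric input.
X = pd.get_dummies(train[["Pclass", "Sex"]])
y = train["Survived"]

# Illustrative hyperparameters; random_state makes the run repeatable.
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
model.fit(X, y)

# Encode new samples the same way and align their columns with training.
new_passengers = pd.DataFrame({"Pclass": [1, 3], "Sex": ["female", "male"]})
X_new = pd.get_dummies(new_passengers).reindex(columns=X.columns, fill_value=0)

predictions = model.predict(X_new)
print(predictions)
```

The reindex call keeps the prediction frame's columns in the same order as the training frame, which matters whenever a category is missing from the new data.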
Ref
- https://www.kaggle.com/alexisbcook/titanic-tutorial
- https://www.kaggle.com/c/titanic/overview
- Comparison of multiple models: https://www.cnblogs.com/ly803744/articles/9603343.html