Data preprocessing is complete. The dataset of 4,240 samples has been split 80/20 into training and testing sets with the following shapes:
- Training set: 3392 samples, 15 features
- Testing set: 848 samples, 15 features
Next, we will train several classification models: logistic regression, a decision tree, a random forest, a support vector machine, and a neural network. For each model, we will tune the hyperparameters with GridSearchCV, using the F1 score as the selection metric.
Let's start with training the models.
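
As a rough starting point, the plan above could be sketched as follows. This is a minimal sketch and not the final implementation: the data here is a synthetic placeholder generated with make_classification to match the shapes reported above, the target is assumed to be binary (for a multiclass target, `scoring="f1_macro"` or `scoring="f1_weighted"` would be needed instead of `"f1"`), and the parameter grids, the 3-fold cross-validation, and the `scaled` helper are illustrative assumptions rather than tuned choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic binary-classification data standing in for the real
# preprocessed dataset; only the shapes (4240 samples, 15 features) match.
X, y = make_classification(
    n_samples=4240, n_features=15, n_informative=8, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=848, stratify=y, random_state=42
)


def scaled(estimator):
    """Hypothetical helper: standardize features before fitting the estimator."""
    return Pipeline([("scale", StandardScaler()), ("clf", estimator)])


# ASSUMPTION: small illustrative grids; real grids would likely be wider.
candidates = {
    "logistic_regression": (
        scaled(LogisticRegression(max_iter=1000)),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, 10, None], "min_samples_leaf": [1, 5]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [100, 200], "max_depth": [5, None]},
    ),
    "svm": (
        scaled(SVC()),
        {"clf__C": [0.1, 1.0, 10.0], "clf__gamma": ["scale", 0.01]},
    ),
    "neural_network": (
        scaled(MLPClassifier(max_iter=500, early_stopping=True, random_state=42)),
        {"clf__hidden_layer_sizes": [(32,), (64,)], "clf__alpha": [1e-4, 1e-2]},
    ),
}

# Run one grid search per model, selecting hyperparameters by cross-validated F1.
results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, scoring="f1", cv=3)
    search.fit(X_train, y_train)
    results[name] = search
    print(f"{name}: best CV F1 = {search.best_score_:.3f}, params = {search.best_params_}")

# Name of the model with the highest cross-validated F1 score.
best_name = max(results, key=lambda n: results[n].best_score_)
print("Best model by CV F1:", best_name)
```

The scale-sensitive models (logistic regression, SVM, neural network) are wrapped in a pipeline so that scaling is fit only on the training folds inside each cross-validation split. The testing set is left untouched here, so it can be used once at the end for an unbiased comparison of the tuned models.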