Implement Logistic Regression, Support Vector Machine, and Random Forest Classifier models on a lung cancer dataset to predict, after feature engineering, whether each patient has lung cancer based on their characteristics. Verify the models' results using cross-validation. Analyze how relevant each feature in the dataset is to the lung cancer diagnosis. Lastly, conduct t-tests to identify the most effective model for the dataset. The dataset was obtained from Kaggle.
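
As a rough illustration of the model comparison described above, the sketch below cross-validates the three classifiers on the same folds and then runs paired t-tests on the per-fold accuracies. It is not taken from the notebook: the synthetic data from `make_classification`, the 10-fold split, the accuracy metric, the scaling step, and all hyperparameters are assumptions chosen only to make the example self-contained.

```python
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the engineered features and diagnosis labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# The three model families named in the project; hyperparameters are illustrative.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# A fixed random_state gives every model the same folds, so scores are paired.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")

# Paired t-test on per-fold accuracies for every pair of models.
names = list(scores)
ttests = {}
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        t_stat, p_value = ttest_rel(scores[names[i]], scores[names[j]])
        ttests[(names[i], names[j])] = (t_stat, p_value)
        print(f"{names[i]} vs {names[j]}: t={t_stat:.3f}, p={p_value:.3f}")
```

Because cross-validation folds share training data, the per-fold scores are not fully independent, so p-values from a test like this are best read as approximate guidance rather than a strict significance result.
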
- Download the Lung_Cancer.csv dataset and upload it to a platform like Jupyter Notebook or Google Colab.
- Download the 09_615_Project.ipynb notebook and upload it to the same platform.
- Run the 09_615_Project.ipynb notebook to perform the analysis, including feature engineering, model evaluation, cross-validation, feature relevance assessment, and identification of the best model.
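
The feature relevance assessment mentioned in the steps above could be approached in several ways; the notebook's actual method is not described here. One common option, shown as a hedged sketch, is to rank features by a random forest's impurity-based importances. The synthetic data and the `feature_0` ... `feature_7` names are hypothetical placeholders, not columns of Lung_Cancer.csv.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumption: synthetic data standing in for the real dataset's features.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
# Hypothetical feature names used only for display.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means more relevant to the model.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances describe what this particular forest relied on, so they are a starting point for judging relevance rather than a definitive ranking.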