Detecting Phishing Emails Using a Machine Learning Model
Summary
Learning Goals
Data Preprocessing: Techniques for cleaning and preparing the data, such as handling missing values, normalizing text data, and feature extraction.
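The preprocessing steps named above can be sketched in base MATLAB as follows. This is a minimal illustration, not the lab's actual script: the three sample emails, the keyword list, and the chosen features (numUrls, numExclaim, hasUrgent, bodyLength) are all hypothetical and would be replaced by the course dataset and a feature set chosen by the instructor.

```matlab
% Hypothetical sample data: three email bodies, one of them missing.
emails = ["URGENT: verify your account at http://example.test/login NOW!";
          missing;
          "Hi team,   the meeting notes are attached."];

% Handle missing values: replace missing bodies with an empty string.
emails(ismissing(emails)) = "";

% Normalize text: lowercase, collapse runs of whitespace, trim the ends.
clean = strtrim(regexprep(lower(emails), '\s+', ' '));

% Extract simple numeric features (illustrative choices, not a vetted set).
numUrls    = count(clean, "http://") + count(clean, "https://");
numExclaim = count(clean, "!");
hasUrgent  = double(contains(clean, ["urgent", "verify", "password"]));
bodyLength = strlength(clean);

% Collect the features in a table, one row per email.
features = table(numUrls, numExclaim, hasUrgent, bodyLength);
disp(features)
```

Replacing a missing body with an empty string keeps every row in the dataset; dropping such rows instead is an equally reasonable choice when the labels of those rows are not needed.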
Context for Use
This hands-on lab is intended for graduate students in cybersecurity or AI classes.
Model Selection: Discussion of various machine learning algorithms suitable for classification tasks, including logistic regression, decision trees, and support vector machines.
Training and Testing: Step-by-step instructions on splitting the dataset into training and testing sets, followed by training the selected model.
Evaluation Metrics: Introduction to metrics such as accuracy, precision, recall, and F1-score to assess model performance.
Visualization: Utilizing MATLAB's visualization tools to display results and insights from the model's predictions.
By the end of the lab, participants will have hands-on experience in developing a machine learning model to classify phishing emails, equipping them with practical skills applicable in cybersecurity and data science fields.
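As a rough sketch of the training, testing, and evaluation steps listed above, the following MATLAB fragment splits a dataset, fits a logistic regression model, and computes the four metrics. It assumes the Statistics and Machine Learning Toolbox is available (for cvpartition and fitglm), and it uses synthetic two-feature data as a stand-in for real email features; the 30% holdout fraction and the 0.5 decision threshold are illustrative choices, not values taken from the lab.

```matlab
% Requires Statistics and Machine Learning Toolbox (cvpartition, fitglm).
rng(42);  % fixed seed so the data and the split are repeatable

% Synthetic stand-in for extracted email features (hypothetical data).
n = 200;
X = [randn(n/2, 2) + 1; randn(n/2, 2) - 1];  % two overlapping clusters
y = [ones(n/2, 1); zeros(n/2, 1)];           % 1 = phishing, 0 = legitimate

% Hold out 30% of the rows for testing, stratified by class label.
c = cvpartition(y, 'HoldOut', 0.3);
XTrain = X(training(c), :);
yTrain = y(training(c));
XTest  = X(test(c), :);
yTest  = y(test(c));

% Fit a logistic regression model on the training rows only.
mdl = fitglm(XTrain, yTrain, 'Distribution', 'binomial');

% Predicted probabilities on the held-out rows, thresholded at 0.5.
pTest = predict(mdl, XTest);
yPred = double(pTest >= 0.5);

% Confusion-matrix counts with phishing as the positive class.
tp = sum(yPred == 1 & yTest == 1);
fp = sum(yPred == 1 & yTest == 0);
fn = sum(yPred == 0 & yTest == 1);
tn = sum(yPred == 0 & yTest == 0);

accuracy  = (tp + tn) / numel(yTest);
precision = tp / (tp + fp);   % NaN if nothing was predicted as phishing
recall    = tp / (tp + fn);
f1        = 2 * precision * recall / (precision + recall);

fprintf('Accuracy %.3f, precision %.3f, recall %.3f, F1 %.3f\n', ...
    accuracy, precision, recall, f1);
```

In a phishing setting recall is often the metric to watch, since a missed phishing email (a false negative) usually costs more than a legitimate email flagged for review.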
Description and Teaching Materials
[2] Tangkere, B. B. (2024). Analisis Performa Logistic Regression dan Support Vector Classification untuk Klasifikasi Email Phising. Jurnal Ekonomi Manajemen Sistem Informasi, 5(4), 442-450.
Teaching Notes and Tips
Data preprocessing: Handling missing data, normalizing features, and splitting data for training and testing.
Model evaluation: Using accuracy metrics and interpreting confusion matrices to assess model performance.
Data visualization: Creating bar charts and heatmaps for clearer data insights.
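The three skills above can be demonstrated to students with a short base-MATLAB fragment such as the one below. It is only a sketch: the feature table, the label vectors yTrue and yPred, and the class names are made up for illustration, and the confusion matrix is built with accumarray so that no toolbox is required (confusionmat from the Statistics and Machine Learning Toolbox would be the usual alternative).

```matlab
% Hypothetical feature table with one missing entry in numUrls.
T = table([3; 0; NaN; 5; 1], [120; 45; 300; 80; 60], [1; 0; 1; 1; 0], ...
    'VariableNames', {'numUrls', 'bodyLength', 'isPhishing'});

% Preprocessing: drop rows with missing values, then z-score each feature.
T = rmmissing(T);
featureNames = {'numUrls', 'bodyLength'};
T{:, featureNames} = normalize(T{:, featureNames});

% Hypothetical true and predicted labels (1 = phishing, 0 = legitimate).
yTrue = [1 0 1 1 0 0 1 0]';
yPred = [1 0 0 1 0 1 1 0]';

% Confusion matrix: rows are true classes, columns are predicted classes.
C = accumarray([yTrue yPred] + 1, 1, [2 2]);

% Metrics read off the confusion matrix (phishing is the positive class).
accuracy  = trace(C) / sum(C(:));
precision = C(2, 2) / sum(C(:, 2));
recall    = C(2, 2) / sum(C(2, :));

% Heatmap of the confusion matrix.
classNames = {'Legitimate', 'Phishing'};
figure;
h = heatmap(classNames, classNames, C);
h.XLabel = 'Predicted class';
h.YLabel = 'True class';
h.Title  = 'Confusion matrix';

% Bar chart of the summary metrics.
figure;
bar(categorical({'Accuracy', 'Precision', 'Recall'}), ...
    [accuracy, precision, recall]);
ylim([0 1]);
ylabel('Score');
```

Asking students to read the off-diagonal cells of the heatmap aloud (false positives and false negatives) tends to make the difference between precision and recall concrete.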
MATLAB simplifies tasks such as data processing, model building, and visualization.
Assessment
2. A MATLAB script (M-file) that successfully loads and preprocesses the dataset, builds the logistic regression model, and visualizes the results.
Each deliverable is manually graded based on code functionality, accuracy, and the quality of explanations provided.