Adversarial Example Generation and Model Fine-tuning with ResNet18

Overview

This project explores the generation of adversarial examples and the fine-tuning of the ResNet18 model on the MNIST dataset. The main focus was on evaluating model performance under normal and adversarial conditions using techniques like FGSM (Fast Gradient Sign Method) and Projected Gradient Descent (PGD).

Adversarial Example Techniques

Adversarial examples are inputs designed to deceive machine learning models by introducing small, imperceptible changes to the data. These changes cause models to misclassify the input, exposing weaknesses in model robustness.

  • FGSM (Fast Gradient Sign Method): This method generates adversarial examples by perturbing the input in the direction of the sign of the gradient of the loss function with respect to the input. The perturbation is scaled by a factor (epsilon), which controls the magnitude of the attack (see the sketch after this list).
  • Projected Gradient Descent (PGD): This is an iterative version of FGSM. It applies the gradient-sign step multiple times and projects the perturbation back into the allowed epsilon-ball after each step, making it a stronger attack.
  • Other Techniques: Other methods for generating adversarial examples include the Carlini-Wagner attack and DeepFool. These can be more sophisticated in their ability to bypass defenses, but this project focuses on FGSM and PGD.
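
To make the FGSM step concrete, here is a minimal PyTorch sketch; the function name fgsm_attack and the default epsilon are illustrative choices, not code taken from this repository.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, images, labels, epsilon=0.1):
        """Generate FGSM adversarial examples (illustrative sketch)."""
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        # Step in the direction of the sign of the input gradient, scaled by epsilon.
        adv_images = images + epsilon * images.grad.sign()
        # Keep pixel values in the valid [0, 1] range.
        return adv_images.clamp(0, 1).detach()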

Model Fine-tuning with ResNet18 on MNIST

The ResNet18 model was fine-tuned on the MNIST dataset to evaluate performance on clean data and under adversarial conditions. Below are the results after fine-tuning for 1 epoch:

  • Clean Accuracy: 97.91%
  • FGSM Accuracy: 12.51%
  • FGSM + Gaussian Accuracy: 97.92%

As seen in the results, FGSM significantly reduced the accuracy of the model, showing how vulnerable the fine-tuned ResNet18 is to adversarial examples. However, when Gaussian noise was added to the FGSM examples before classification, accuracy returned to roughly the clean level, suggesting that in this setting the random noise largely washes out the adversarial perturbation.
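
A rough sketch of how the FGSM + Gaussian evaluation could be implemented is shown below; the noise standard deviation sigma and the helper name are assumptions rather than values from this project.

    import torch

    def accuracy_with_gaussian(model, adv_images, labels, sigma=0.3):
        """Classify FGSM examples after adding Gaussian noise (illustrative sketch)."""
        noisy = (adv_images + sigma * torch.randn_like(adv_images)).clamp(0, 1)
        with torch.no_grad():
            preds = model(noisy).argmax(dim=1)
        return (preds == labels).float().mean().item()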

Projected Gradient Descent (PGD)

PGD is an advanced adversarial attack method that refines adversarial examples iteratively. By applying small perturbations multiple times and ensuring that the perturbation remains within a feasible range, PGD generates more challenging adversarial examples compared to FGSM. This method is effective in evaluating model robustness and improving the adversarial training process.
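
A compact PGD sketch in PyTorch follows; the epsilon, step size alpha, and number of steps are placeholder values, not the settings used in this project.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, images, labels, epsilon=0.3, alpha=0.01, steps=40):
        """Iterative FGSM with projection back into the epsilon-ball (illustrative sketch)."""
        orig = images.clone().detach()
        adv = images.clone().detach()
        for _ in range(steps):
            adv.requires_grad_(True)
            loss = F.cross_entropy(model(adv), labels)
            grad = torch.autograd.grad(loss, adv)[0]
            # Gradient-sign step, then project back into the L-infinity ball around the original input.
            adv = adv.detach() + alpha * grad.sign()
            adv = orig + (adv - orig).clamp(-epsilon, epsilon)
            adv = adv.clamp(0, 1)
        return adv.detach()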

Fine-tuning Results

The ResNet18 model was fine-tuned on the MNIST dataset and achieved the following performance metrics:

    Fine-tuning pretrained ResNet on MNIST for 1 epoch...
    Epoch 1, Loss: 0.06693653437562896
    Fine-tuning complete.
    Model saved as 'finetuned_resnet18_mnist.pth'.
    Evaluation on 10000 MNIST samples (ResNet18):
    Clean Accuracy           : 9791/10000 = 97.91%
    FGSM Accuracy            : 1251/10000 = 12.51%
    FGSM + Gaussian Accuracy : 9792/10000 = 97.92%
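
A fine-tuning loop along these lines would produce a log like the one above; note that the 1-channel input layer, 10-class head, optimizer, and hyperparameters below are assumptions about how ResNet18 was adapted to MNIST, not the repository's exact code.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    # Adapt a pretrained ResNet18 to 1-channel, 10-class MNIST (illustrative sketch).
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    model.fc = nn.Linear(model.fc.in_features, 10)

    train_loader = DataLoader(
        datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()),
        batch_size=64, shuffle=True)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    model.train()
    running_loss = 0.0
    for images, labels in train_loader:  # one epoch of fine-tuning
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    print(f"Epoch 1, Loss: {running_loss / len(train_loader)}")
    torch.save(model.state_dict(), "finetuned_resnet18_mnist.pth")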

Model Evaluation

  • ResNet18 Performance: The model achieved a clean accuracy of 97.91% on the MNIST test set, but its accuracy dropped significantly to 12.51% under FGSM adversarial attack. The model's performance improved to 97.92% when Gaussian noise was added to the adversarial examples.
  • Adversarial Defense: Incorporating noise at inference time or using defenses such as adversarial training can help improve the model's resilience to attacks like FGSM and PGD (a minimal adversarial-training sketch follows this list).
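
As a minimal sketch of the adversarial-training idea, the loop below mixes clean and FGSM examples in each batch; it reuses the hypothetical fgsm_attack helper and the model, loader, optimizer, and criterion from the earlier sketches, and the 50/50 loss weighting is purely illustrative.

    # Adversarial training sketch: train on a mix of clean and FGSM examples.
    # Assumes `model`, `train_loader`, `optimizer`, `criterion`, and the
    # hypothetical `fgsm_attack` helper defined in the earlier sketches.
    model.train()
    for images, labels in train_loader:
        adv_images = fgsm_attack(model, images, labels, epsilon=0.1)
        optimizer.zero_grad()
        # Weight clean and adversarial losses equally (illustrative choice).
        loss = 0.5 * criterion(model(images), labels) + \
               0.5 * criterion(model(adv_images), labels)
        loss.backward()
        optimizer.step()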

Technologies Used

  • ResNet18: Pretrained model used for classification tasks on MNIST.
  • FGSM and PGD: Adversarial attack techniques for generating adversarial examples.
  • Python: Programming language used for model development and optimization.
  • PyTorch: Framework used for training the ResNet18 model and implementing the adversarial attacks.
  • TensorFlow: For training and evaluating deep learning models.
  • NumPy: For numerical computations during the model training and evaluation process.

Future Improvements

  • Integrate more sophisticated adversarial defense mechanisms like adversarial training and defensive distillation to improve model robustness.
  • Experiment with more complex models (e.g., ResNet50, DenseNet) to test their resistance to adversarial attacks.
  • Explore other adversarial attack methods like DeepFool or Carlini-Wagner to better understand model vulnerabilities and enhance defenses.
