"

Reference List

Aafaq, N., Akhtar, N., Liu, W., Shah, M., & Mian, A. (2022). Language model agnostic gray-box adversarial attack on image captioning. IEEE Transactions on Information Forensics and Security, 18, 626–638. https://doi.org/10.1109/TIFS.2022.3226905

Abdukhamidov, E., Abuhamad, M., Thiruvathukal, G. K., Kim, H., & Abuhmed, T. (2023). Single-class target-specific attack against interpretable deep learning systems. arXiv Preprint, arXiv:2307.06484. https://arxiv.org/abs/2307.06484

Agnihotri, S., Jung, S., & Keuper, M. (2023). CosPGD: A unified white-box adversarial attack for pixel-wise prediction tasks. arXiv Preprint, arXiv:2302.02213. https://arxiv.org/abs/2302.02213

Ateniese, G., Mancini, L. V., Spognardi, A., Villani, A., Vitali, D., & Felici, G. (2015). Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers. International Journal of Security and Networks, 10(3), 137–150.

Athalye, A., & Sutskever, I. (2017). Synthesizing robust adversarial examples. arXiv Preprint, arXiv:1707.07397. https://arxiv.org/abs/1707.07397

Ayub, M. A., Johnson, W. A., Talbert, D. A., & Siraj, A. (2020). Model evasion attack on intrusion detection systems using adversarial machine learning. In 2020 54th Annual Conference on Information Sciences and Systems (CISS) (pp. 1–6). IEEE. https://ieeexplore.ieee.org/document/9086268

Bai, Y., Wang, Y., Zeng, Y., Jiang, Y., & Xia, S. T. (2023). Query efficient black-box adversarial attack on deep neural networks. Pattern Recognition, 133, 109037. https://doi.org/10.1016/j.patcog.2022.109037

Balle, B., Cherubin, G., & Hayes, J. (2021). Reconstructing training data with informed adversaries. In NeurIPS 2021 Workshop on Privacy in Machine Learning (PRIML).

Baracaldo, N., Chen, B., Ludwig, H., & Safavi, J. A. (2017). QUASAR: Quantitative attack space analysis and reasoning. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (pp. 103–110). ACM. https://doi.org/10.1145/3134600.3134633

Barreno, M., Nelson, B., Joseph, A. D., & Tygar, J. D. (2010). The security of machine learning. Machine Learning, 81(2), 121–148. https://link.springer.com/article/10.1007/s10994-010-5188-5

Biggio, B., Corona, I., Fumera, G., Giacinto, G., & Roli, F. (2011). Bagging classifiers for fighting poisoning attacks in adversarial classification tasks. In Proceedings of the 10th International Conference on Multiple Classifier Systems (MCS’11) (pp. 350–359). Springer. https://doi.org/10.1007/978-3-642-21587-2_36

Biggio, B., Fumera, G., Pillai, I., & Roli, F. (2011). A survey and experimental evaluation of image spam filtering techniques. Pattern Recognition Letters, 32(10), 1436–1446. https://doi.org/10.1016/j.patrec.2011.03.022

Brown, T. B., Mané, D., Roy, A., Abadi, M., & Gilmer, J. (2017). Adversarial patch. arXiv Preprint, arXiv:1712.09665. https://arxiv.org/abs/1712.09665

Carlini, N., Chien, S., Nasr, M., Song, S., Terzis, A., & Tramer, F. (2022, May). Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society.

Carlini, N., & Wagner, D. (2017). Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP) (pp. 39–57). IEEE. https://doi.org/10.48550/arXiv.1608.04644

Chandrasekaran, M., Sornam, M. S., & Gamage, T. (2020). Evolution of phishing attacks and countermeasures. arXiv Preprint, arXiv:2003.09384. https://doi.org/10.48550/arXiv.2003.09384

Chen, B., Feng, Y., Dai, T., Bai, J., Jiang, Y., Xia, S. T., & Wang, X. (2023). Adversarial examples generation for deep product quantization networks on image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2), 1388–1404. https://doi.org/10.1109/TPAMI.2022.3165024

Chen, J., Wang, W. H., & Shi, X. (2020). Differential privacy protection against membership inference attack on machine learning for genomic data. In Biocomputing 2021: Proceedings of the Pacific Symposium (pp. 26–37). World Scientific Publishing Company. https://www.proceedings.com/58564.html

Chen, P. Y., Zhang, H., Sharma, Y., Yi, J., & Hsieh, C. J. (2017, November). ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (pp. 15–26). ACM. https://doi.org/10.48550/arXiv.1708.03999

Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning. arXiv Preprint, arXiv:1712.05526. https://arxiv.org/abs/1712.05526

Dinur, I., & Nissim, K. (2003). Revealing information while preserving privacy. In Proceedings of the 22nd ACM Symposium on Principles of Database Systems (PODS ’03) (pp. 202–210). ACM.

Dwork, C. (2006). Differential privacy. In Automata, Languages and Programming: 33rd International Colloquium, ICALP 2006, Venice, Italy, July 10–14, 2006, Proceedings, Part II (pp. 1–12). Springer.

Dwork, C., McSherry, F., Nissim, K., & Smith, A. (2006). Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, TCC ’06 (pp. 265–284). Springer.

Dwork, C., & Yekhanin, S. (2008). New efficient attacks on statistical disclosure control mechanisms. In Annual International Cryptology Conference (pp. 469–480). Springer.

Ebrahimi, M., Zhang, N., Hu, J., Raza, M. T., & Chen, H. (2021). Binary black-box evasion attacks against deep learning-based static malware detectors with adversarial byte-level language model. In 2021 AAAI Workshop on Robust, Secure and Efficient Machine Learning (RSEML). The AAAI Press. https://aaai.org/conference/aaai/aaai21/ws21workshops/

Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati, A., Xiao, C., Prakash, A., Kohno, T., & Song, D. (2017). Robust physical-world attacks on deep learning visual classification. arXiv Preprint, arXiv:1707.08945. https://doi.org/10.48550/arXiv.1707.08945

Feng, W., Xu, N., Zhang, T., & Zhang, Y. (2023). Dynamic generative targeted attacks with pattern injection. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 16404–16414). https://doi.org/10.1109/CVPR52729.2023.01574

Garfinkel, S., Abowd, J., & Martindale, C. (2019). Understanding database reconstruction attacks on public data. Communications of the ACM, 62(2), 46–53.

Gong, X., Chen, Y., Yang, W., Huang, H., & Wang, Q. (2023). B3: Backdoor attacks against black-box machine learning models. ACM Transactions on Privacy and Security, 26(1), 1–24. https://dl.acm.org/doi/10.1145/3605212

Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. In International Conference on Learning Representations. arXiv Preprint, arXiv:1412.6572. https://arxiv.org/abs/1412.6572

Gu, T., Liu, K., Dolan-Gavitt, B., & Garg, S. (2019). BadNets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, 47230–47244. https://doi.org/10.1109/ACCESS.2019.2909068

Guesmi, A., Khasawneh, K. N., Abu-Ghazaleh, N., & Alouani, I. (2022). ROOM: Adversarial machine learning attacks under real-time constraints. In 2022 International Joint Conference on Neural Networks (IJCNN) (pp. 1–10). https://doi.org/10.1109/IJCNN55064.2022.9892437

Gupta, P., Yadav, K., Gupta, B. B., Alazab, M., & Gadekallu, T. R. (2023). A novel data poisoning attack in federated learning based on inverted loss function. Computers & Security, 130, 103270. https://doi.org/10.1016/j.cose.2023.103270

Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv Preprint, arXiv:1503.02531. https://arxiv.org/abs/1503.02531

Imam, N. H., & Vassilakis, V. G. (2019). A survey of attacks against Twitter spam detectors in an adversarial environment. Robotics, 8(3), 50. https://doi.org/10.3390/robotics8030050

Jagielski, M., Severi, G., Harger, N. P., & Oprea, A. (2021). Subpopulation data poisoning attacks. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (pp. 3104–3122). Association for Computing Machinery. https://dl.acm.org/doi/proceedings/10.1145/3460120

Kurakin, A., Goodfellow, I. J., & Bengio, S. (2018). Adversarial examples in the physical world. In Artificial Intelligence Safety and Security (pp. 99–112). Chapman and Hall/CRC. arXiv Preprint, arXiv:1607.02533. https://arxiv.org/abs/1607.02533

Lapid, R., & Sipper, M. (2023). I see dead people: Gray-box adversarial attack on image-to-text models. arXiv Preprint, arXiv:2306.07591. https://doi.org/10.48550/arXiv.2306.07591

Levine, A., & Feizi, S. (2021). Deep partition aggregation: Provable defenses against general poisoning attacks. In Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). OpenReview.net. https://openreview.net/forum?id=3xQDj3v7zO

Li, Y., Li, Z., Zeng, L., Long, S., Huang, F., & Ren, K. (2022). Compound adversarial examples in deep neural networks. Information Sciences, 613, 50–68. https://doi.org/10.1016/j.ins.2022.08.031

Liao, F., Liang, M., Dong, Y., Pang, T., Hu, X., & Zhu, J. (2018). Defense against adversarial attacks using high-level representation guided denoiser. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1778–1787). https://arxiv.org/abs/1712.02976

Liu, Q., Li, P., Zhao, W., Cai, W., Yu, S., & Leung, V. C. (2018). A survey on security threats and defensive techniques of machine learning: A data-driven view. IEEE Access, 6, 12103–12117. https://doi.org/10.1109/ACCESS.2018.2805680

Liu, T. Y., Yang, Y., & Mirzasoleiman, B. (2022). Friendly noise against adversarial noise: A powerful defense against data poisoning attack. Advances in Neural Information Processing Systems, 35, 11947–11959.

Liu, X., Cheng, M., Zhang, H., & Hsieh, C. J. (2018). Towards robust neural networks via random self-ensemble. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 369–385). https://doi.org/10.48550/arXiv.1712.00673

Ma, Y., Zhu, X., & Hsu, J. (2019). Data poisoning against differentially-private learners: Attacks and defenses. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 4732–4738). https://doi.org/10.24963/ijcai.2019/656

Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2018). Towards deep learning models resistant to adversarial attacks. In 6th International Conference on Learning Representations (ICLR 2018), Conference Track Proceedings, Vancouver, BC, Canada. https://openreview.net/forum?id=rJzIBfZAb

Murakonda, S. K., & Shokri, R. (2020). ML Privacy Meter: Aiding regulatory compliance by quantifying the privacy risks of machine learning. arXiv Preprint, arXiv:2007.07789. https://arxiv.org/abs/2007.07789

Nelson, B., Barreno, M., Chi, F. J., Joseph, A. D., Rubinstein, B. I. P., Saini, U., Sutton, C., & Xia, K. (2008, April). Exploiting machine learning to subvert your spam filter. In Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET 08). USENIX Association. https://www.usenix.org/legacy/event/leet08/tech/full_papers/nelson/nelson.pdf

Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 427–436). IEEE. https://doi.org/10.1109/CVPR.2015.7298640

Papernot, N., McDaniel, P. D., Jha, S., Fredrikson, M., Celik, Z. B., & Swami, A. (2015). The limitations of deep learning in adversarial settings. arXiv Preprint, arXiv:1511.07528. https://arxiv.org/abs/1511.07528

Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016, May). Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP) (pp. 582–597). IEEE. https://ieeexplore.ieee.org/document/7546524

Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., & Swami, A. (2017). Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (pp. 506–519). ACM. https://doi.org/10.1145/3052973.3053009

Patterson, W., Fernandez, I., Neupane, S., Parmar, M., Mittal, S., & Rahimi, S. (2022). A white-box adversarial attack against a digital twin. arXiv Preprint, arXiv:2210.14018. https://arxiv.org/abs/2210.14018

Paudice, A., Muñoz-González, L., & Lupu, E. C. (2018). Label sanitization against label flipping poisoning attacks. In ECML PKDD 2018 Workshops: Nemesis 2018, UrbReas 2018, SoGood 2018, IWAISe 2018, and Green Data Mining 2018 (pp. 5–15). Springer. https://link.springer.com/book/10.1007/978-3-030-13453-2

Peng, J., & Chan, P. P. (2013). In 2013 International Conference on Machine Learning and Cybernetics (Vol. 2, pp. 610–614). IEEE. https://journals.scholarsportal.info/browse/21601348

Puttagunta, M. K., Ravi, S., & Nelson Kennedy Babu, C. (2023). Adversarial examples: Attacks and defences on medical deep learning systems. Multimedia Tools and Applications, 82, 1–37. https://doi.org/10.1007/s11042-023-14702-9

Rigaki, M., & Garcia, S. (2020). A survey of privacy attacks in machine learning. arXiv Preprint, arXiv:2007.07646. https://doi.org/10.48550/arXiv.2007.07646

Sagar, R., Jhaveri, R., & Borrego, C. (2020). Applications in security and evasions in machine learning: A survey. Electronics, 9(1), 97. https://doi.org/10.3390/electronics9010097

Sherman, M. (2020, April 1). Influence attacks on machine learning. AI4.io. https://ai4.io/blog/2020/04/01/influence-attacks-on-machine-learning/

Shokri, R., Stronati, M., Song, C., & Shmatikov, V. (2017). Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP) (pp. 3–18). IEEE. https://doi.org/10.1109/SP.2017.41

Siddiqi, A. (2019). Adversarial security attacks and perturbations on machine learning and deep learning methods. arXiv Preprint, arXiv:1907.07291. https://doi.org/10.48550/arXiv.1907.07291

Song, S., & Marn, D. (2020, July). Introducing a new privacy testing library in TensorFlow. TensorFlow Blog. https://blog.tensorflow.org/2020/07/introducing-new-privacy-testing-library.html

Steinhardt, J., Koh, P. W., & Liang, P. S. (2017). Certified defenses for data poisoning attacks. In I. Guyon, U. von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems (Vol. 30). Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/2017/hash/ba4a7eaefe6790fc10970aeb9665a90a-Abstract.html

Sun, C., Zhang, Y., Chaoqun, W., Wang, Q., Li, Y., Liu, T., Han, B., & Tian, X. (2022). Towards lightweight black-box attack against deep neural networks. In Proceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS ’22) (Article 1404, pp. 19319–19331). Curran Associates Inc. https://dl.acm.org/doi/10.5555/3600270.3601674

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv Preprint, arXiv:1312.6199. https://arxiv.org/abs/1312.6199

Tran, B., Li, J., & Madry, A. (2018). Spectral signatures in backdoor attacks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 31). Curran Associates, Inc.

Usynin, D., Rueckert, D., & Kaissis, G. (2023). Beyond gradients: Exploiting adversarial priors in model inversion attacks. ACM Transactions on Privacy and Security, 26(3), 1–30. https://doi.org/10.1145/3580788

Wang, B., Yao, Y., Shan, S., Li, H., & Viswanath, B. (2021). Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks. In Proceedings of the IEEE Symposium on Security and Privacy (pp. 707–723). IEEE.

Wang, H., Wang, S., Jin, Z., Wang, Y., Chen, C., & Tistarelli, M. (2021). Similarity-based gray-box adversarial attack against deep face recognition. In 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021) (pp. 1–8). IEEE. https://ieeexplore.ieee.org/document/9667076

Wang, W., Levine, A., & Feizi, S. (2022). Improved certified defenses against data poisoning with (deterministic) finite aggregation. In K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvári, G. Niu, & S. Sabato (Eds.), Proceedings of the 39th International Conference on Machine Learning (ICML 2022) (Vol. 162, pp. 22769–22783). PMLR.

Wu, D., Qi, S., Qi, Y., Li, Q., Cai, B., Guo, Q., & Cheng, J. (2023). Understanding and defending against white-box membership inference attack in deep learning. Knowledge-Based Systems, 259, 110014. https://doi.org/10.1016/j.knosys.2022.110014

Ye, J., Maddi, A., Murakonda, S. K., Bindschaedler, V., & Shokri, R. (2022). Enhanced membership inference attacks against machine learning models. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (CCS ’22) (pp. 3093–3106). Association for Computing Machinery.

Yeom, S., Giacomelli, I., Fredrikson, M., & Jha, S. (2018). Privacy risk in machine learning: Analyzing the connection to overfitting. In IEEE Computer Security Foundations Symposium (CSF ’18) (pp. 268–282). IEEE. https://arxiv.org/abs/1709.01604

Yu, M., & Sun, S. (2022). Natural black-box adversarial examples against deep reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 8936–8944. https://doi.org/10.1609/aaai.v36i8.20876

Yuan, X., He, P., Zhu, Q., & Li, X. (2019, September). Adversarial examples: Attacks and defenses for deep learning. IEEE Transactions on Neural Networks and Learning Systems, 30(9), 2805–2824. https://doi.org/10.1109/TNNLS.2018.2886017

Zafar, A., et al. (2023). Untargeted white-box adversarial attack to break into deep learning-based COVID-19 monitoring face mask detection system. Multimedia Tools and Applications, 83, 1–27. https://doi.org/10.1007/s11042-023-15405-x

Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2021). Understanding deep learning (still) requires rethinking generalization. Communications of the ACM, 64(3), 107–115.

Zhao, B., & Lao, Y. (2022). CLPA: Clean-label poisoning availability attacks using generative adversarial nets. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 9162–9170. https://doi.org/10.1609/aaai.v36i8.20902

Zou, A., Wang, Z., Carlini, N., Nasr, M., Kolter, J. Z., & Fredrikson, M. (2023). Universal and transferable adversarial attacks on aligned language models. arXiv Preprint, arXiv:2307.15043. https://arxiv.org/abs/2307.15043

License


Winning the Battle for Secure ML Copyright © 2025 by Bestan Maaroof is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.