The definitive guide to adversarial machine learning
TechTalks, January 23, 2023
Machine learning is becoming an increasingly critical part of daily life, performing tasks such as facial and voice recognition, image labeling, content search, code writing, and even autonomous driving. With this growing reliance on ML and deep learning models, however, comes mounting concern over their security. One such threat is adversarial examples: small modifications to input data that can change the output of a machine learning model.
Adversarial examples have become a topic of increasing interest in the field of machine learning, with numerous papers being published on the subject. To help readers navigate this complex field, Pin-Yu Chen and Cho-Jui Hsieh have written a comprehensive book, “Adversarial Robustness for Machine Learning”. The book covers the key components of adversarial machine learning, including attacks, defenses, certification, and applications.
Adversarial attacks are carried out by finding vulnerabilities in machine learning systems. The most commonly known type is the evasion attack on computer vision systems, where an attacker adds a subtle layer of noise to an image, causing the ML model to misclassify it. Adversarial attacks can be categorized by the attacker’s access to and knowledge of the target ML model: white-box attacks, where the attacker has full access to the model’s architecture and parameters, and black-box attacks, where the attacker can only query the model through an intermediate system such as an API. There are also transfer attacks, where the attacker uses a local white-box model to craft adversarial examples that are then used against a remote black-box model.
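To make the evasion idea concrete, below is a minimal sketch of the well-known fast gradient sign method (FGSM), one common way such noise is computed. The tiny classifier, epsilon value, and random input are illustrative placeholders, not anything taken from the book.

```python
# Minimal FGSM-style evasion sketch. The stand-in classifier, epsilon, and
# random "image" are placeholders; in practice a trained model is attacked.
import torch
import torch.nn.functional as F

# Stand-in classifier (untrained); any differentiable image model works here.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

def fgsm(image, label, epsilon=0.03):
    """Add a small, gradient-aligned perturbation that pushes the model toward error."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, then keep pixel values valid.
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

x = torch.rand(1, 3, 32, 32)       # placeholder "image"
y = model(x).argmax(dim=1)         # current prediction, used as the label to move away from
x_adv = fgsm(x, y)
print("clean:", model(x).argmax(dim=1).item(), "adversarial:", model(x_adv).argmax(dim=1).item())
```

With a trained model and a small enough epsilon, such a perturbation is typically imperceptible to a human observer yet changes the predicted class.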
The authors delve into each type of attack in detail and provide references to related papers. They also explore other types of adversarial attacks, including poisoning attacks, membership inference attacks, and more. The book serves as a comprehensive guide to the field of adversarial machine learning and is a must-read for those interested in understanding the security implications of these powerful models.
NIST AI Risk Management Framework Aims to Improve Trustworthiness of Artificial Intelligence
NIST, January 26, 2023
The U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) has recently released the Artificial Intelligence Risk Management Framework (AI RMF 1.0), to which Adversa AI experts also contributed feedback. The framework is a guidance document that helps organizations manage the risks associated with AI technologies. It is intended to be used by organizations in varying capacities so they can benefit from AI technologies while protecting society from potential harms.
The framework equips organizations to approach AI and risk differently. It promotes a change in institutional culture and encourages organizations to approach AI from a new perspective, including how they think about, communicate, measure, and monitor AI risks and AI’s potential positive and negative impacts. It provides a flexible, structured, and measurable process for addressing AI risks and maximizing the benefits of AI technologies while reducing the likelihood of negative impacts.
The AI RMF is divided into two parts. The first part discusses how organizations can frame the risks related to AI and outlines the characteristics of trustworthy AI systems. The second part, the core of the framework, describes four specific functions – govern, map, measure, and manage – to help organizations address the risks of AI systems in practice. These functions can be applied in context-specific use cases and at any stage of the AI life cycle. NIST developed the framework over 18 months in close collaboration with the private and public sectors, and it is expected to help drive the development of best practices and standards for responsible AI.
A Taxonomy of Trustworthiness for Artificial Intelligence
Berkeley CLTC
A report by the UC Berkeley Center for Long-Term Cybersecurity (CLTC) aims to help organizations create trustworthy AI technologies. The report, titled “A Taxonomy of Trustworthiness for Artificial Intelligence: Connecting Properties of Trustworthiness with Risk Management and the AI Lifecycle”, was written by Jessica Newman, Director of CLTC’s AI Security Initiative and Co-Director of the UC Berkeley AI Policy Hub. It provides a comprehensive resource for organizations and teams developing AI technologies, systems, and applications, and is designed specifically to help users of the NIST AI Risk Management Framework.
The NIST AI Risk Management Framework outlines seven key traits that AI systems should possess in order to be considered trustworthy. These characteristics include being valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful biases managed. The CLTC report builds on these seven characteristics and names 150 properties of trustworthiness. Each property is linked to a specific part of the AI lifecycle and is mapped to relevant sections of the NIST framework.
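As a rough, hypothetical illustration of how such a mapping could be represented in practice (the property name, lifecycle stage, and function references below are invented for this sketch, not taken from the report):

```python
# Hypothetical sketch of a single taxonomy entry; the property, lifecycle stage,
# and framework references are invented for illustration, not from the report.
from dataclasses import dataclass, field

@dataclass
class TrustworthinessProperty:
    name: str                      # a property of trustworthiness
    characteristic: str            # one of NIST's seven trustworthiness characteristics
    lifecycle_stage: str           # stage of the AI lifecycle it applies to
    rmf_functions: list = field(default_factory=list)  # related AI RMF core functions

entry = TrustworthinessProperty(
    name="robustness to adversarial input perturbations",   # hypothetical example
    characteristic="secure and resilient",
    lifecycle_stage="verification and validation",
    rmf_functions=["measure", "manage"],
)
print(entry)
```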
The purpose of the report is to help developers of AI systems ask the right questions so they deploy their technologies responsibly. It provides guidance on questions such as how to verify that an AI system is functioning as expected, how to ensure that a human remains in control of the AI system’s decision-making process, and how to prevent the AI system from presenting inaccurate or deceptive outputs. The report is the result of a year-long collaboration with AI researchers and experts from various stakeholder groups and considers the full spectrum of AI systems, including those that operate without direct human interaction. It can be used by organizations developing AI, as well as by standards-setting bodies, policymakers, independent auditors, and civil society organizations working to promote trustworthy AI.