Towards Secure AI Week 41 – AI Security Skills Shortage

Secure AI Weekly + Trusted AI Blog admin October 16, 2024 106

How to enable secure use of AI

The Register, October 10, 2024

The UK National Cyber Security Centre (NCSC) highlights several areas where AI can be exploited, but organizations need practical solutions that enable them to adopt AI safely and responsibly. This is where the SANS AI Toolkit comes in, offering a set of free resources to help organizations implement AI tools without increasing security risks.

The toolkit classifies users based on their familiarity with AI tools and whether they are using them with management approval, a crucial factor given the proliferation of easily accessible AI platforms like ChatGPT. To help organizations maintain security, the SANS AI Toolkit includes a pre-built Acceptable Use Policy to guide employees on responsible AI use while aligning with company goals. The package also contains educational factsheets on key topics, such as how generative AI functions and how to craft effective prompts. Additionally, the toolkit helps users identify where they already engage with AI in their workflows, enabling them to extract maximum value from these tools while ensuring their operations remain secure. By equipping businesses with these resources, SANS aims to foster the secure and ethical adoption of AI in the workplace.

Global Survey Reveals Critical AI Security Skills Shortage

Technology Magazine, October 7, 2024

O’Reilly’s 2024 State of Security Survey, which gathered responses from over 1,300 technology professionals worldwide, reveals a critical gap in AI and cloud security expertise. The survey found that 33.9% of professionals lacked skills in AI security, particularly in addressing vulnerabilities like prompt injection attacks, where malicious actors manipulate AI systems to produce harmful outputs. This shortage stems from the rapid adoption of AI, which has outpaced the development of security knowledge. Additionally, 38.9% of respondents reported a lack of cloud security expertise, indicating that organizations continue to struggle with securing their cloud-based infrastructures. As these technologies continue to evolve, the need to bridge this skills gap is crucial for safeguarding sensitive data and systems.

Despite the focus on emerging technologies like AI, traditional cybersecurity threats such as phishing, network intrusions, and ransomware remain significant concerns, with phishing identified by 55.4% of respondents as the top threat. The report highlights the importance of continuous learning, with 80.7% of employers mandating ongoing education for security professionals. Laura Baldwin, President of O’Reilly, stresses that cybersecurity is no longer just an IT issue but a company-wide imperative, requiring all employees to be equipped with the necessary skills to defend against evolving threats. This growing emphasis on continuous education and certification reflects the industry’s acknowledgment that AI and cloud security require not only advanced tools but also a workforce capable of managing and mitigating these risks.

How to Evaluate Jailbreak Methods: A Case Study With the StrongREJECT Benchmark

The Good Men Project, October 13, 2024

The StrongREJECT benchmark is a new tool designed to evaluate the effectiveness of jailbreak methods on language models. Previous research suggested that translating forbidden prompts into obscure languages, like Scots Gaelic, could bypass GPT-4’s safety filters. However, when researchers at BAIR replicated these experiments, they found the responses generated were vague and not particularly harmful, casting doubt on the original study’s claims. This prompted a deeper investigation into how jailbreak methods are evaluated, revealing that existing benchmarks often have significant flaws, such as repetitive prompts and simplistic binary evaluations that fail to measure the quality or harmfulness of the responses.

To address these issues, StrongREJECT was created with a more refined evaluation process. It uses a high-quality dataset of forbidden prompts and a state-of-the-art automated evaluator to assess both a model’s willingness to respond and the quality of its responses. The benchmark showed that many previously reported jailbreaks are far less effective than claimed, often reducing the capabilities of the model rather than extracting dangerous information. This suggests that most jailbreaks, while capable of bypassing safety measures, degrade the model’s overall performance, raising questions about the true impact of these methods.