How OpenAI is trying to make ChatGPT safer and less biased
MIT Technology Review, February 21, 2023
Over the past week, news outlets have reported on Microsoft’s Bing AI search, which uses a chatbot that has been generating strange and creepy responses. To address the issue, Microsoft has limited Bing to five replies per session to prevent it from going off-track. However, the chatbot’s behavior has also caused controversy among conservatives who claim that Microsoft’s ChatGPT has a biased agenda. In response, OpenAI, the company behind ChatGPT, has issued guidelines on how the chatbot should respond to certain prompts related to US “culture wars.”
To make ChatGPT safer and more reliable, OpenAI has been working on improving its datasets and removing examples where the model has had a preference for false information. They have also been monitoring and selecting prompts that have led to unwanted content to prevent the model from repeating these issues. Furthermore, OpenAI is exploring using surveys or setting up citizens assemblies to gather feedback from the public and shape its models. The company is also working on a more experimental project dubbed the “consensus project,” where researchers are studying how people agree or disagree with different responses generated by AI models.
While OpenAI’s efforts to make its models more reliable and inclusive are commendable, there are concerns about how far the company will go in allowing political customization. It is essential to involve public participation in determining what is acceptable for a tool used by millions of people worldwide in different cultures and political contexts. Nevertheless, OpenAI must also be aware of the complexities and challenges involved in content moderation, particularly when it comes to generating content that represents extreme political ideologies. As AI technology continues to develop, it is crucial to ensure that it aligns with ethical principles and societal values.
How I Broke Into a Bank Account With an AI-Generated Voice
Vice, February 23, 2023
The use of synthetic voices generated by artificial intelligence (AI) to gain unauthorized access to bank accounts is becoming a growing concern. In a recent experiment, an individual successfully used an AI-generated voice to bypass voice authentication measures used by banks to allow customers to access their accounts over the phone. While banks tout voice identification as a secure and convenient alternative to traditional authentication methods, the ease and accessibility of AI-powered voice cloning technology could render voice-based biometric security less effective in the long run.
Experts are calling for banks to switch to more secure identity verification methods, such as multi-factor authentication, to reduce the risk of fraud and hacking attempts. While synthetic voice fraud is currently rare, the potential for abuse is high, especially as anyone with access to a few minutes of an individual’s voice publicly available online could clone it using readily available AI voice technology. As such, public figures, social media influencers, politicians, and journalists could be at risk of synthetic voice cloning.
Despite assurances from banks that their voice authentication measures are secure and effective, the rise of AI-powered voice cloning technology could pose a significant threat to the integrity of voice-based biometric security measures. Banks need to deploy countermeasures to mitigate the risk of synthetic voices and explore alternative authentication methods to ensure the security of customer accounts. As AI-powered voice cloning technology continues to develop, it is critical that banks keep pace with these changes to ensure that their security measures remain robust and effective.
Can AI really be protected from text-based attacks?
TechCrunch, February 24, 2023
Microsoft’s AI-powered chatbot, Bing Chat, which was co-developed with OpenAI, has fallen victim to malicious prompt engineering. Users have discovered ways to break the chatbot by using carefully tailored inputs to get it to write neo-Nazi propaganda, invent conspiracy theories and reveal sensitive information. Prompt engineering is a type of escalation of privilege attack, where an AI that uses text-based instructions, called prompts, to accomplish tasks is tricked by malicious, adversarial prompts to perform tasks that were not a part of its objective.
The behaviour of Bing Chat is not well understood as it was trained on vast amounts of text from the internet, some of which was toxic. The response of the large language model to text input is being exploited by users, who have found ways to prompt it to divulge its hidden initial instructions and to say wildly offensive things. As AI becomes more embedded in the apps and websites we use every day, these attacks are expected to become more common, and there are currently no tools to fully model an LLM’s behaviour to prevent them.
To mitigate the effects of ill-intentioned prompts, researchers suggest manually-created filters for generated content and prompt-level filters. Companies such as Microsoft and OpenAI are already using filters to attempt to prevent their AI from responding in undesirable ways, and at the model level, they’re exploring methods like reinforcement learning from human feedback. However, there’s only so much filters can do as users make an effort to discover new exploits. This has led to an arms race, similar to cybersecurity, where users try to break the AI and the creators of the AI patch them to prevent the attacks they’ve seen.
Securing AI Systems — Researching on Security Threats
DataDrivenInvestor, February 22, 2023
Research plays a vital role in advancing our understanding and knowledge of different domains, including the development of artificial intelligence (AI) technologies. With the rapid growth of AI, it has become imperative to recognize the importance of recognizing the work done by researchers, exploring further to innovate for better insights, and getting more creative. The extensive learning from the existing knowledge base helps many of the researchers and scientists to improve on the works of others and build new knowledge. This impact is created by research, and it is an integral element of evolving technologies like AI, where many innovations are happening.
Advancements in algorithms, architectures, and computing abilities in AI have not only helped in developing cutting-edge solutions but also introduced new patterns of risks to everyone involved in the complete chain. In particular, Adversarial AI has introduced new patterns of risks, which are helpful in launching security threats. Researching on these security threats has become a crucial aspect of the development and implementation of AI technologies. Specifically, the innovations and advancements in security threats could be better learned from the research papers written on the topic of Adversarial attacks in ML, where the mindset of the adversaries and attacking patterns are investigated and experimented from different dimensions.
To better understand the length and breadth of the research involved in learning security threats, we can categorize them into two categories: Completeness and Comprehensiveness. Completeness refers to going deeper and doing everything possible within the practitioner’s scope while assessing potential threats. It means that practitioners should give more attention to implementation logics, such as the possibility of exploiting the serialization process, as well as other attack methods that we are aware of but do not give attention to the depth involved. On the other hand, Comprehensiveness refers to completeness over a broader scope. As AI practitioners, we need to think about the many components in the complete AI lifecycle and the potential security threats to each of them, such as the possibility of weight poisoning attacks when implementing Transfer Learning.
These experts are racing to protect AI from Hackers. Time is running out
ZDNet
As artificial intelligence (AI) becomes increasingly integrated into our daily lives, there are growing concerns about the potential risks and dangers associated with these systems. From driverless cars to critical infrastructure and healthcare, AI algorithms are used to manage a range of automated systems that have the potential for significant consequences if they fail. The fear is that these systems could be broken, fooled or manipulated, leading to a range of potential problems.
One of the biggest concerns is that AI systems could be hacked or manipulated, leading to incorrect or dangerous decisions. For example, a driverless car could be fooled into driving through stop signs or an AI-powered medical scanner could be tricked into making the wrong diagnosis. These failures could lead to city-wide gridlock or interruptions to essential services. The risks associated with AI-powered systems are becoming more visible and experts predict that attacks on AI will become more common as the technology becomes more widespread.
Despite these risks, the benefits of AI technology are real and cannot be ignored. As AI becomes more commonplace in all kinds of industries and settings, it becomes increasingly important to ensure that these systems are developed in a way that shields them from attempts at manipulation or attack. To address this issue, the Defense Advanced Research Projects Agency (DARPA) has launched the Guaranteeing AI Robustness Against Deception (GARD) project, which aims to develop AI and algorithms in a way that protects them from manipulation and attack. As we continue to rely on AI-powered systems to make important decisions, it is essential that we take steps to defend against potential threats and ensure that these systems can’t be fooled into making bad or dangerous decisions.
Subscribe for updates
Stay up to date with what is happening! Get a first look at news, noteworthy research and worst attacks on AI delivered right in your inbox.