The Perils and Promise of Large Language Models: A Roadmap to Responsible Use

Featured
Written by
Tommy Perniciaro
Published on
May 9, 2023

The rapid advances in artificial intelligence (AI) have brought about many benefits, including increased productivity and more efficient workflows. However, the use of large language models (LLMs) has also raised serious concerns about how they are created and used, and the potential for their abuse.

LLMs are the driving force behind text and code generation tools such as ChatGPT, Bard, and GitHub’s Copilot. However, these models can also be used to generate harmful content such as phishing emails and malware, accelerate the workflow of malicious actors, and even enable rapid access to illicit information.  

In the wrong hands, the barrier to entry to carry out these activities is as low as constructing a well-crafted chatbot prompt.

Implementation Challenges

One of the most significant challenges posed by LLMs is their susceptibility to prompt injection. By crafting specially designed prompts, attackers can bypass content filters and exploit an LLM's capabilities to produce illicit output.  

As these models increasingly connect with the outside world, such as through plugins for ChatGPT, they could enable chatbots to "eval" user-generated code, leading to arbitrary code execution. From a security perspective, this is a significant problem.

To mitigate this risk, organizations must understand their LLM-based solutions' capabilities and how they interact with external endpoints. They must evaluate their threat model, determine if the models have connected to an API, are running a social media account, or are interacting with customers without supervision.  

Additionally, implementing strong content filters and restricting access to dangerous content is imperative but not always effective.

Another area of concern is data privacy and copyright violation. The training of LLMs requires enormous amounts of data, and at this scale, understanding provenance, authorship, and copyright status is a gargantuan task.  

An unvetted training set can result in a model that leaks private data, misattributes citations, or plagiarizes copyrighted content. Moreover, data privacy laws surrounding LLMs are murky, and sending sensitive data to a third party for model training could lead to data leakage and harm in business settings.

As LLM-based services become integrated with workplace productivity tools such as Slack and Teams, it's essential to read providers' privacy policies carefully. Users must understand how AI prompts can be used and regulate the use of LLMs in the workplace accordingly.  

Concerning copyright protections, data acquisition and use must be regulated through opt-ins or special licensing without hindering the largely free and open internet as we have today.

Some Cautions

LLMs' output must always be taken with a grain of salt, as these models do not understand the context of what they produce. Their currency is the probabilistic relationships between words, and they cannot distinguish between fact and fiction.  

As such, LLM tools must be verified for accuracy and overall sanity by human beings, as ChatGPT falsifying citations and even entire papers has shown.

A real-world example of LLM misuse occurred in August 2022 when an attacker used an LLM-based tool to create deepfake content. The attacker used OpenAI's GPT-3 language model to create a fake blog post about a security researcher's experience of being kidnapped and forced to install malware on his employer's network.  

The post was designed to defame the security researcher and spread false information about the cybersecurity industry. This incident demonstrates the potential for LLM-based tools to be used for malicious purposes and highlights the need for proper regulations to be in place.

Finally, the use of LLMs in healthcare and other sensitive applications must be closely regulated to prevent any risk of harm to users. LLM-based service providers must always inform users of the scope of AI's contribution to the service, and interacting with a bot should always be a choice rather than the default.

Takeaway

There must be regulatory oversight to ensure that companies cannot utilize AI in a manner that poses ethical concerns, with or without the end user's explicit consent. The widespread implementation of LLMs must be regulated and secured to prevent significant downfalls. These models have the potential to bring about significant benefits, but their unregulated use can lead to harmful consequences.  

Organizations must evaluate their LLM-based solutions' capabilities and interactions, implement strong content filters, and regulate the use of LLMs in sensitive applications. AI service providers must be transparent about their data acquisition and use, and there must be regulatory oversight to ensure that the use of LLMs is ethical and responsible.

As we continue to rely on AI more heavily in our daily lives, it's crucial to acknowledge the risks associated with its use and take steps to mitigate them. By doing so, we can reap the benefits of these powerful tools while also ensuring that they are used safely and responsibly.

Halcyon.ai is the industry’s first dedicated, adaptive security platform that combines multiple advanced proprietary prevention engines along with AI models focused specifically on stopping ransomware – talk to a Halcyon expert today to find out more. And check out the Recent Ransomware Attacks resource site to get near real-time tracking of ransomware attacks, threat actor groups and their victims.

See Halcyon in action

Interested in getting a demo?
Fill out the form to meet with a Halcyon Anti-Ransomware Expert!

1
2
3
Let's get started
1
1
2
3
1
1
2
2
3
Back
Next
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.