Blog

Inside the SOC

Exploring AI Threats: Package Hallucination Attacks

Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
30
Oct 2023
30
Oct 2023
Learn how malicious actors exploit errors in generative AI tools to launch packet attacks. Read how Darktrace products detect and prevent these threats!

AI tools open doors for threat actors

On November 30, 2022, the free conversational language generation model ChatGPT was launched by OpenAI, an artificial intelligence (AI) research and development company. The launch of ChatGPT was the culmination of development ongoing since 2018 and represented the latest innovation in the ongoing generative AI boom and made the use of generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to currently have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments have employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the impacts of the increased linguistic complexity in social engineering and phishing attacks resulting from generative AI tool use, with Darktrace observing a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Less overall consideration, however, has been given to impacts stemming from errors intrinsic to generative AI tools. One of these errors is AI hallucinations.

What is an AI hallucination?

AI “hallucination” is a term which refers to the predictive elements of generative AI and LLMs’ AI model gives an unexpected or factually incorrect response which does not align with its machine learning training data [4]. This differs from regular and intended behavior for an AI model, which should provide a response based on the data it was trained upon.  

Why are AI hallucinations a problem?

Despite the term indicating it might be a rare phenomenon, hallucinations are far more likely than accurate or factual results as the AI models used in LLMs are merely predictive and focus on the most probable text or outcome, rather than factual accuracy.

Given the widespread use of generative AI tools in the workplace employees are becoming significantly more likely to encounter an AI hallucination. Furthermore, if these fabricated hallucination responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. Software developers may ask LLMs for recommendations on specific pieces of code or how to solve a specific problem, which will likely lead to a third-party package. It is possible that packages recommended by generative AI tools could represent AI hallucinations and the packages may not have been published, or, more accurately, the packages may not have been published prior to the date at which the training data for the model halts. If these hallucinations result in common suggestions of a non-existent package, and the developer copies the code snippet wholesale, this may leave the exchanges vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. In 20% of the responses provided by ChatGPT pertaining to Node.js at least one un-published package was included, whilst the figure sat at around 35% for Python [4].

Hallucinations can be unpredictable, but would-be attackers are able to find packages to create by asking generative AI tools generic questions and checking whether the suggested packages exist already. As such, attacks using this vector are unlikely to target specific organizations, instead posing more of a widespread threat to users of generative AI tools.

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and responses given by generative AI tools may not always be consistent, malicious actors are able to discover AI hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors. Once a malicious actor has discovered a hallucination of an un-published package, they are able to create a package with the same name and include a malicious payload, before publishing it. This is known as a malicious package.

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that had previously been confirmed to contain malicious content, or a package that is no longer maintained and, therefore, is more vulnerable to hijack by malicious actors.

In such scenarios it is not necessary to manipulate the training data (data poisoning) to achieve the desired outcome for the malicious actor, thus a complex and time-consuming attack phase can easily be bypassed.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments, and to autonomously perform inhibitive actions in response to such detections. These models will trigger based on connections to endpoints associated with generative AI tools, as such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with the breaching of one of the following DETECT models:

  • Compliance / Anomalous Upload to Generative AI
  • Compliance / Beaconing to Rare Generative AI and Generative AI
  • Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

  • Device / Anomalous GitHub Download
  • Device / Anomalous Script Download Followed By Additional Packages

Whatever goal the malicious package has been designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a large variety of different malware types.

As GitHub is a commonly used service by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such GitHub downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operation Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As the use of generative AI becomes more accessible and an increasingly prevalent tool in an attacker’s toolbox, organizations will benefit from implementing company-wide policies to define expectations surrounding the use of such tools. It is simple, yet critical, for example, for employees to fact check responses provided to them by generative AI tools. All packages recommended by generative AI should also be checked by reviewing non-generated data from either external third-party or internal sources. It is also good practice to adopt caution when downloading packages with very few downloads as it could indicate the package is untrustworthy or malicious.

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that a greater embrace of AI tools in the workplace will be seen in the coming years as the AI technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, Tiana Kelly, Analyst Team Lead, London, Cyber Analyst

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk?nab=1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/

INSIDE THE SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
AUTHOR
ABOUT ThE AUTHOR
Charlotte Thompson
Cyber Analyst
Tiana Kelly
Deputy Team Lead, London & Cyber Analyst
Book a 1-1 meeting with one of our experts
share this article
USE CASES
No items found.
COre coverage

More in this series

No items found.

Blog

Inside the SOC

Lost in Translation: Darktrace Blocks Non-English Phishing Campaign Concealing Hidden Payloads

Default blog imageDefault blog image
15
May 2024

Email – the vector of choice for threat actors

In times of unprecedented globalization and internationalization, the enormous number of emails sent and received by organizations every day has opened the door for threat actors looking to gain unauthorized access to target networks.

Now, increasingly global organizations not only need to safeguard their email environments against phishing campaigns targeting their employees in their own language, but they also need to be able to detect malicious emails sent in foreign languages too [1].

Why are non-English language phishing emails more popular?

Many traditional email security vendors rely on pre-trained English language models which, while function adequately against malicious emails composed in English, would struggle in the face of emails composed in other languages. It should, therefore, come as no surprise that this limitation is becoming increasingly taken advantage of by attackers.  

Darktrace/Email™, on the other hand, focuses on behavioral analysis and its Self-Learning AI understands what is considered ‘normal’ for every user within an organization’s email environment, bypassing any limitations that would come from relying on language-trained models [1].

In March 2024, Darktrace observed anomalous emails on a customer’s network that were sent from email addresses belonging to an international fast-food chain. Despite this seeming legitimacy, Darktrace promptly identified them as phishing emails that contained malicious payloads, preventing a potentially disruptive network compromise.

Attack Overview and Darktrace Coverage

On March 3, 2024, Darktrace observed one of the customer’s employees receiving an email which would turn out to be the first of more than 50 malicious emails sent by attackers over the course of three days.

The Sender

Darktrace/Email immediately understood that the sender never had any previous correspondence with the organization or its employees, and therefore treated the emails with caution from the onset. Not only was Darktrace able to detect this new sender, but it also identified that the emails had been sent from a domain located in China and contained an attachment with a Chinese file name.

The phishing emails detected by Darktrace sent from a domain in China and containing an attachment with a Chinese file name.
Figure 1: The phishing emails detected by Darktrace sent from a domain in China and containing an attachment with a Chinese file name.

Darktrace further detected that the phishing emails had been sent in a synchronized fashion between March 3 and March 5. Eight unique senders were observed sending a total of 55 emails to 55 separate recipients within the customer’s email environment. The format of the addresses used to send these suspicious emails was “12345@fastflavor-shack[.]cn”*. The domain “fastflavor-shack[.]cn” is the legitimate domain of the Chinese division of an international fast-food company, and the numerical username contained five numbers, with the final three digits changing which likely represented different stores.

*(To maintain anonymity, the pseudonym “Fast Flavor Shack” and its fictitious domain, “fastflavor-shack[.]cn”, have been used in this blog to represent the actual fast-food company and the domains identified by Darktrace throughout this incident.)

The use of legitimate domains for malicious activities become commonplace in recent years, with attackers attempting to leverage the trust endpoint users have for reputable organizations or services, in order to achieve their nefarious goals. One similar example was observed when Darktrace detected an attacker attempting to carry out a phishing attack using the cloud storage service Dropbox.

As these emails were sent from a legitimate domain associated with a trusted organization and seemed to be coming from the correct connection source, they were verified by Sender Policy Framework (SPF) and were able to evade the customer’s native email security measures. Darktrace/Email; however, recognized that these emails were actually sent from a user located in Singapore, not China.

Darktrace/Email identified that the email had been sent by a user who had logged in from Singapore, despite the connection source being in China.
Figure 2: Darktrace/Email identified that the email had been sent by a user who had logged in from Singapore, despite the connection source being in China.

The Emails

Darktrace/Email autonomously analyzed the suspicious emails and identified that they were likely phishing emails containing a malicious multistage payload.

Darktrace/Email identifying the presence of a malicious phishing link and a multistage payload.
Figure 3: Darktrace/Email identifying the presence of a malicious phishing link and a multistage payload.

There has been a significant increase in multistage payload attacks in recent years, whereby a malicious email attempts to elicit recipients to follow a series of steps, such as clicking a link or scanning a QR code, before delivering a malicious payload or attempting to harvest credentials [2].

In this case, the malicious actor had embedded a suspicious link into a QR code inside a Microsoft Word document which was then attached to the email in order to direct targets to a malicious domain. While this attempt to utilize a malicious QR code may have bypassed traditional email security tools that do not scan for QR codes, Darktrace was able to identify the presence of the QR code and scan its destination, revealing it to be a suspicious domain that had never previously been seen on the network, “sssafjeuihiolsw[.]bond”.

Suspicious link embedded in QR Code, which was detected and extracted by Darktrace.
Figure 4: Suspicious link embedded in QR Code, which was detected and extracted by Darktrace.

At the time of the attack, there was no open-source intelligence (OSINT) on the domain in question as it had only been registered earlier the same day. This is significant as newly registered domains are typically much more likely to bypass gateways until traditional security tools have enough intelligence to determine that these domains are malicious, by which point a malicious actor may likely have already gained access to internal systems [4]. Despite this, Darktrace’s Self-Learning AI enabled it to recognize the activity surrounding these unusual emails as suspicious and indicative of a malicious phishing campaign, without needing to rely on existing threat intelligence.

The most commonly used sender name line for the observed phishing emails was “财务部”, meaning “finance department”, and Darktrace observed subject lines including “The document has been delivered”, “Income Tax Return Notice” and “The file has been released”, all written in Chinese.  The emails also contained an attachment named “通知文件.docx” (“Notification document”), further indicating that they had been crafted to pass for emails related to financial transaction documents.

 Darktrace/Email took autonomous mitigative action against the suspicious emails by holding the message from recipient inboxes.
Figure 5: Darktrace/Email took autonomous mitigative action against the suspicious emails by holding the message from recipient inboxes.

Conclusion

Although this phishing attack was ultimately thwarted by Darktrace/Email, it serves to demonstrate the potential risks of relying on solely language-trained models to detect suspicious email activity. Darktrace’s behavioral and contextual learning-based detection ensures that any deviations in expected email activity, be that a new sender, unusual locations or unexpected attachments or link, are promptly identified and actioned to disrupt the attacks at the earliest opportunity.

In this example, attackers attempted to use non-English language phishing emails containing a multistage payload hidden behind a QR code. As traditional email security measures typically rely on pre-trained language models or the signature-based detection of blacklisted senders or known malicious endpoints, this multistage approach would likely bypass native protection.  

Darktrace/Email, meanwhile, is able to autonomously scan attachments and detect QR codes within them, whilst also identifying the embedded links. This ensured that the customer’s email environment was protected against this phishing threat, preventing potential financial and reputation damage.

Credit to: Rajendra Rushanth, Cyber Analyst, Steven Haworth, Head of Threat Modelling, Email

Appendices  

List of Indicators of Compromise (IoCs)  

IoC – Type – Description

sssafjeuihiolsw[.]bond – Domain Name – Suspicious Link Domain

通知文件.docx – File - Payload  

References

[1] https://darktrace.com/blog/stopping-phishing-attacks-in-enter-language  

[2] https://darktrace.com/blog/attacks-are-getting-personal

[3] https://darktrace.com/blog/phishing-with-qr-codes-how-darktrace-detected-and-blocked-the-bait

[4] https://darktrace.com/blog/the-domain-game-how-email-attackers-are-buying-their-way-into-inboxes

Continue reading
About the author
Rajendra Rushanth
Cyber Analyst

Blog

No items found.

The State of AI in Cybersecurity: The Impact of AI on Cybersecurity Solutions

Default blog imageDefault blog image
13
May 2024

About the AI Cybersecurity Report

Darktrace surveyed 1,800 CISOs, security leaders, administrators, and practitioners from industries around the globe. Our research was conducted to understand how the adoption of new AI-powered offensive and defensive cybersecurity technologies are being managed by organizations.

This blog continues the conversation from “The State of AI in Cybersecurity: Unveiling Global Insights from 1,800 Security Practitioners” which was an overview of the entire report. This blog will focus on one aspect of the overarching report, the impact of AI on cybersecurity solutions.

To access the full report, click here.

The effects of AI on cybersecurity solutions

Overwhelming alert volumes, high false positive rates, and endlessly innovative threat actors keep security teams scrambling. Defenders have been forced to take a reactive approach, struggling to keep pace with an ever-evolving threat landscape. It is hard to find time to address long-term objectives or revamp operational processes when you are always engaged in hand-to-hand combat.                  

The impact of AI on the threat landscape will soon make yesterday’s approaches untenable. Cybersecurity vendors are racing to capitalize on buyer interest in AI by supplying solutions that promise to meet the need. But not all AI is created equal, and not all these solutions live up to the widespread hype.  

Do security professionals believe AI will impact their security operations?

Yes! 95% of cybersecurity professionals agree that AI-powered solutions will level up their organization’s defenses.                                                                

Not only is there strong agreement about the ability of AI-powered cybersecurity solutions to improve the speed and efficiency of prevention, detection, response, and recovery, but that agreement is nearly universal, with more than 95% alignment.

This AI-powered future is about much more than generative AI. While generative AI can help accelerate the data retrieval process within threat detection, create quick incident summaries, automate low-level tasks in security operations, and simulate phishing emails and other attack tactics, most of these use cases were ranked lower in their impact to security operations by survey participants.

There are many other types of AI, which can be applied to many other use cases:

Supervised machine learning: Applied more often than any other type of AI in cybersecurity. Trained on attack patterns and historical threat intelligence to recognize known attacks.

Natural language processing (NLP): Applies computational techniques to process and understand human language. It can be used in threat intelligence, incident investigation, and summarization.

Large language models (LLMs): Used in generative AI tools, this type of AI applies deep learning models trained on massively large data sets to understand, summarize, and generate new content. The integrity of the output depends upon the quality of the data on which the AI was trained.

Unsupervised machine learning: Continuously learns from raw, unstructured data to identify deviations that represent true anomalies. With the correct models, this AI can use anomaly-based detections to identify all kinds of cyber-attacks, including entirely unknown and novel ones.

What are the areas of cybersecurity AI will impact the most?

Improving threat detection is the #1 area within cybersecurity where AI is expected to have an impact.                                                                                  

The most frequent response to this question, improving threat detection capabilities in general, was top ranked by slightly more than half (57%) of respondents. This suggests security professionals hope that AI will rapidly analyze enormous numbers of validated threats within huge volumes of fast-flowing events and signals. And that it will ultimately prove a boon to front-line security analysts. They are not wrong.

Identifying exploitable vulnerabilities (mentioned by 50% of respondents) is also important. Strengthening vulnerability management by applying AI to continuously monitor the exposed attack surface for risks and high-impact vulnerabilities can give defenders an edge. If it prevents threats from ever reaching the network, AI will have a major downstream impact on incident prevalence and breach risk.

Where will defensive AI have the greatest impact on cybersecurity?

Cloud security (61%), data security (50%), and network security (46%) are the domains where defensive AI is expected to have the greatest impact.        

Respondents selected broader domains over specific technologies. In particular, they chose the areas experiencing a renaissance. Cloud is the future for most organizations,
and the effects of cloud adoption on data and networks are intertwined. All three domains are increasingly central to business operations, impacting everything everywhere.

Responses were remarkably consistent across demographics, geographies, and organization sizes, suggesting that nearly all survey participants are thinking about this similarly—that AI will likely have far-reaching applications across the broadest fields, as well as fewer, more specific applications within narrower categories.

Going forward, it will be paramount for organizations to augment their cloud and SaaS security with AI-powered anomaly detection, as threat actors sharpen their focus on these targets.

How will security teams stop AI-powered threats?            

Most security stakeholders (71%) are confident that AI-powered security solutions are better able to block AI-powered threats than traditional tools.

There is strong agreement that AI-powered solutions will be better at stopping AI-powered threats (71% of respondents are confident in this), and there’s also agreement (66%) that AI-powered solutions will be able to do so automatically. This implies significant faith in the ability of AI to detect threats both precisely and accurately, and also orchestrate the correct response actions.

There is also a high degree of confidence in the ability of security teams to implement and operate AI-powered solutions, with only 30% of respondents expressing doubt. This bodes well for the acceptance of AI-powered solutions, with stakeholders saying they’re prepared for the shift.

On the one hand, it is positive that cybersecurity stakeholders are beginning to understand the terms of this contest—that is, that only AI can be used to fight AI. On the other hand, there are persistent misunderstandings about what AI is, what it can do, and why choosing the right type of AI is so important. Only when those popular misconceptions have become far less widespread can our industry advance its effectiveness.  

To access the full report, click here.

Continue reading
About the author
The Darktrace Community
Our ai. Your data.

Elevate your cyber defenses with Darktrace AI

Start your free trial
Darktrace AI protecting a business from cyber threats.