AI Conversations on GPUs at Risk of Eavesdropping

Researchers at Trail of Bits have disclosed a vulnerability that can allow unauthorized access to GPU local memory on certain Apple, Qualcomm, AMD, and Imagination GPUs. The flaw, dubbed LeftoverLocals, can expose conversations held with large language models, as well as data processed by other machine learning models, on affected GPUs. All four affected vendors have issued or announced remediations.

Affected GPUs and Patches

GPUs from Apple, Qualcomm, AMD, and Imagination are all affected by the LeftoverLocals vulnerability. Here is the remediation status for each vendor:

Apple has issued fixes for the A17 and M3 series processors, as well as for specific devices such as the iPad Air 3rd generation (A12). However, Apple has not provided a comprehensive list of secured devices. As of January 16th, the Apple MacBook Air (M2) was still vulnerable, according to Trail of Bits.

AMD plans to roll out a new mode to address the problem by March 2024. Additionally, AMD has released a detailed list of affected products.

Imagination has updated drivers and firmware to prevent the vulnerability in affected DDK Releases up to and including 23.2.

Qualcomm has released a patch for some devices, but has not supplied a complete list of affected and unaffected devices.

Understanding the LeftoverLocals Vulnerability

In essence, LeftoverLocals turns a GPU memory region known as local memory into an unintended channel between two GPU kernels, even when they do not belong to the same application or the same user. Using a GPU compute framework such as OpenCL, Vulkan, or Metal, an attacker can write a GPU kernel that dumps uninitialized local memory on the target device, capturing whatever data a previous kernel left behind. While CPUs typically isolate memory in a way that prevents exploits like this, GPUs may not have the same level of protection.
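The leak can be pictured with a small toy model. The sketch below is plain Python, not real GPU code: ToyComputeUnit, victim_kernel, and listener_kernel are hypothetical names, and the only assumption modeled is that local memory is handed from one kernel to the next without being cleared.

```python
# Toy model only: plain Python standing in for a GPU compute unit.
# All names here are hypothetical and exist purely for illustration.

class ToyComputeUnit:
    """Models one compute unit whose local memory is NOT cleared between kernels."""

    def __init__(self, local_mem_size=16):
        self.local_mem = [0] * local_mem_size

    def launch(self, kernel):
        # The same local memory is handed to every kernel in turn,
        # regardless of which application or user launched it.
        return kernel(self.local_mem)


def victim_kernel(local_mem):
    # The victim stages intermediate values in local memory
    # and exits without clearing them.
    secret = [ord(ch) for ch in "hello"]
    local_mem[:len(secret)] = secret
    return sum(secret)


def listener_kernel(local_mem):
    # The attacker never writes to local memory; it only copies out
    # whatever values are already there.
    return list(local_mem)


cu = ToyComputeUnit()
cu.launch(victim_kernel)
leaked = cu.launch(listener_kernel)

# Non-zero leftovers decode back to the victim's data.
recovered = "".join(chr(v) for v in leaked if v)
print(recovered)  # prints "hello"
```

On a real device the listener would be an OpenCL, Vulkan, or Metal kernel, and the leftover data would be whatever the previous kernel happened to stage in local memory.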

Implications of LeftoverLocals

The LeftoverLocals attack can be used to intercept the data behind the linear algebra operations performed by open-source large language models, allowing attackers to eavesdrop on interactive conversations. The recovered text is not always exact: researchers at Trail of Bits found that the attacker sometimes extracts incorrect tokens or other errors, such as words that are semantically similar to the correct ones. The flaw, tracked by NIST as CVE-2023-4969, poses a significant security risk.
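One way to picture why an eavesdropper can end up with a near-miss word is the toy decoder below. It is an illustrative sketch, not the researchers' method: the three-word vocabulary, the weight values, and the decode_token helper are all made up, and it assumes the attacker knows the output weights (plausible for an open-source model) but recovers the hidden vector with one value missing.

```python
# Toy illustration only: vocabulary, weights, and vectors are invented.

VOCAB = ["cat", "kitten", "car"]

# One row of output weights per vocabulary word. "cat" and "kitten"
# have similar rows, standing in for semantically similar embeddings.
OUTPUT_WEIGHTS = [
    [10, 9, 0],   # cat
    [9, 10, 0],   # kitten
    [0, 1, 10],   # car
]


def decode_token(hidden):
    """Multiply the hidden vector by the output weights and pick the top-scoring word."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in OUTPUT_WEIGHTS]
    return VOCAB[logits.index(max(logits))]


true_hidden = [10, 5, 0]    # what the model actually computed
leaked_hidden = [0, 5, 0]   # what the listener recovered (first value lost)

print(decode_token(true_hidden))    # prints "cat"
print(decode_token(leaked_hidden))  # prints "kitten"
```

Because the two similar words score almost the same, a small error in the leaked vector is enough to flip the result to the neighbor rather than to an unrelated word.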

Defending Against LeftoverLocals

Aside from applying the updates from the GPU vendors mentioned above, Tyler Sorensen and Heidy Khlaaf of Trail of Bits caution that mitigating the vulnerability, and verifying the mitigation, on individual devices may prove challenging. Programmers would need to modify the source code of every GPU kernel that uses local memory so that, before the kernel ends, GPU threads clear (for example, store zeros to) the local memory locations the kernel used. They would also need to verify that the compiler does not remove these memory-clearing instructions. Additionally, developers working in machine learning and application owners using ML apps should exercise caution, as many parts of the ML development stack have not been thoroughly reviewed by security experts.
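The kernel-level mitigation can be sketched with the same kind of toy model. This is plain Python with hypothetical names (run_kernels, patched_victim_kernel, listener_kernel), not a real GPU API. For simplicity the patched kernel zeroes its entire local memory allocation before returning, and the compiler-optimization concern cannot be modeled here at all.

```python
# Toy model only: hypothetical names, plain Python instead of a real GPU API.

def run_kernels(kernels, local_mem_size=8):
    """Run kernels back to back on one shared local memory that is never reset."""
    local_mem = [0] * local_mem_size
    return [kernel(local_mem) for kernel in kernels]


def patched_victim_kernel(local_mem):
    # Stage sensitive intermediate values in local memory.
    secret = [ord(ch) for ch in "hello"]
    local_mem[:len(secret)] = secret
    result = sum(local_mem[:len(secret)])

    # Mitigation: store zeros to local memory before the kernel ends.
    for i in range(len(local_mem)):
        local_mem[i] = 0
    return result


def listener_kernel(local_mem):
    # The attacker only reads whatever the previous kernel left behind.
    return list(local_mem)


victim_result, leaked = run_kernels([patched_victim_kernel, listener_kernel])

print(victim_result)  # prints 532, so the victim's computation is unaffected
print(leaked)         # prints [0, 0, 0, 0, 0, 0, 0, 0]
```

In a real kernel written in a compiled shading or compute language, the clearing loop runs after the results are final, which is why an optimizing compiler may treat it as dead code. That is the reason the advice includes checking the compiled output.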

Looking Ahead

Trail of Bits views this vulnerability as an opportunity for the GPU systems community to fortify the GPU system stack and associated specifications. As they work to strengthen the security of GPU systems, it is crucial for businesses and developers to remain vigilant and implement the necessary updates and precautions to safeguard sensitive data and conversations from potential breaches.

The discovery of LeftoverLocals underscores the importance of regularly evaluating and addressing security risks in GPU systems. By understanding how the vulnerability works and applying vendor fixes and mitigations promptly, businesses and developers can limit the impact of a breach and strengthen the overall security of their GPU systems.