Microsoft accidentally released 38TB of private data in a major leak

It’s just been revealed that Microsoft researchers accidentally leaked 38TB of confidential information onto the company’s GitHub page, where potentially anyone could see it. Among the data trove were backups of two former employees’ workstations, which contained keys, passwords, secrets, and more than 30,000 private Teams messages.

According to cloud security firm Wiz, the leak was published on Microsoft’s artificial intelligence (AI) GitHub repository and was accidentally included in a tranche of open-source training data. Because visitors were actively encouraged to download that data, it could have fallen into the wrong hands again and again.

Data breaches can come from all kinds of sources, but it will be particularly embarrassing for Microsoft that this one originated with its own AI researchers. The Wiz report states that Microsoft uploaded the data using Shared Access Signature (SAS) tokens, an Azure feature that lets users share data from Azure Storage accounts.

Visitors to the repository were told to download the training data from a provided URL. However, the web address granted access to much more than just the planned training data, and allowed users to browse files and folders that were not intended to be publicly accessible.
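
To see why the URL over-shared, it helps to know that a SAS link’s reach is fixed at the moment the token is generated. Below is a minimal sketch using the azure-storage-blob Python SDK; the account name, key, and expiry are hypothetical, and Wiz’s report doesn’t disclose the exact parameters Microsoft used. An account-level token like this one covers the entire storage account, so the URL it produces exposes every container the account holds, not just the folder someone meant to share:

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    AccountSasPermissions,
    ResourceTypes,
    generate_account_sas,
)

# Hypothetical account details, for illustration only.
ACCOUNT_NAME = "exampleresearchstore"
ACCOUNT_KEY = "<storage-account-key>"

# An account-level SAS applies to the whole storage account: every
# container, every blob, not just the data intended for sharing.
sas_token = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=365),
)

# Anyone holding this URL can browse the entire account until expiry.
share_url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/?{sas_token}"
print(share_url)
```

Scoping the token to a single container with generate_container_sas, and giving it a short expiry, would have confined visitors to the intended training data.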

Full control

It gets worse. The access token that allowed all this was misconfigured to grant full-control permissions, Wiz reported, rather than more restrictive read-only permissions. In practice, that meant anyone who visited the URL could delete or overwrite the files they found, not merely view them.
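
The difference comes down to which permission flags the token is minted with. A short sketch using the same SDK (again illustrative, not the actual configuration) shows a least-privilege token next to an over-permissive one:

```python
from azure.storage.blob import AccountSasPermissions

# Least privilege: token holders can list and download blobs, nothing else.
read_only = AccountSasPermissions(read=True, list=True)

# Over-permissive: the token also allows creating, overwriting, and
# deleting blobs -- the kind of "full control" Wiz describes.
full_control = AccountSasPermissions(
    read=True,
    write=True,
    delete=True,
    list=True,
    add=True,
    create=True,
)
```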

Wiz explains that this could have had dire consequences. As the repository was full of AI training data, the intention was for users to download it and feed it into a script, thereby improving their own AI models.

Yet because it was open to manipulation thanks to its wrongly configured permissions, “an attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it,” Wiz explains.

Potential disaster

The report also noted that creating SAS tokens, which grant access to Azure Storage folders such as this one, does not leave any kind of paper trail, meaning “there is no way for an administrator to know this token exists and where it circulates.” When a token has full-access permissions like this one did, the results can be potentially disastrous.
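
Azure does offer a way to make tokens visible and revocable: a stored access policy, which lives on the container itself rather than inside the token. The report doesn’t say whether this was part of Microsoft’s fix, so the sketch below (with hypothetical names) is a general mitigation rather than a description of what actually happened:

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    AccessPolicy,
    BlobServiceClient,
    ContainerSasPermissions,
    generate_container_sas,
)

service = BlobServiceClient(
    account_url="https://exampleresearchstore.blob.core.windows.net",
    credential="<storage-account-key>",  # hypothetical
)
container = service.get_container_client("training-data")

# The policy is stored on the container, so administrators can see it,
# audit it, and revoke it -- unlike an ad hoc SAS token.
policy = AccessPolicy(
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=7),
)
container.set_container_access_policy(signed_identifiers={"research-share": policy})

# Tokens minted against the policy inherit its permissions and expiry;
# deleting or editing "research-share" revokes them all at once.
sas_token = generate_container_sas(
    account_name="exampleresearchstore",
    container_name="training-data",
    account_key="<storage-account-key>",
    policy_id="research-share",
)
```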

Fortunately, Wiz explains that it reported the issue to Microsoft in June 2023. The leaky SAS token was replaced in July, and Microsoft completed its internal investigation in August. The security lapse has only just been disclosed to the public, to allow time to fully fix it.

It’s a reminder that even seemingly innocent actions can potentially lead to data breaches. Luckily the issue has been fixed, but it’s unknown whether hackers gained access to any of the sensitive data before the token was revoked.

Alex Blake