2/8/2024
Research by:  
Tobias Mueller, Jakob Rieck, Florian Wilkens, Luca Glockow, Nicholas Farnham, Jannes Quer, Matthias Marx, Dominik Oepen

Black Basta Buster: Decrypting files without paying the ransom

Key Takeaways

  • Files encrypted by the Black Basta ransomware between November 2022 and December 2023 can be decrypted under certain circumstances with our tool provided on GitHub
  • We have shared our findings with victims, CERTs, DFIR providers, and law enforcement
  • After Black Basta changed their encryption routine in early December 2023, thereby fixing the vulnerability, we presented our findings to the public at the 37C3 in Hamburg, Germany
Figure 1. AI-generated image of a broken padlock with a sad skull on it
Image credit: Bing Image Creator

High-level Summary

We are happy to announce a decryptor for files encrypted with the Black Basta ransomware between November 2022 and December 2023. We shared our findings with victims, CERTs, DFIR providers, and law enforcement agencies before presenting them to the public at the 37C3 in Hamburg. You can watch the video on media.ccc.de and get the code to decrypt affected files from github.com/srlabs.

The Black Basta ransomware encrypts Windows computers and ESXi hosts running virtual machine workloads. We analysed the behaviour of a sample of Black Basta collected in April 2023. As described by Zscaler, the Black Basta malware started using ECC and XChaCha20 over RSA in November 2022. However, the malware misuses the XChaCha20 keystream, rendering the encryption vulnerable to a known-plaintext attack that allows partial or full recovery of affected files. The cryptographic implementation repeatedly uses the same 64 bytes for encrypting parts of the file rather than advancing the keystream to obtain fresh 64 bytes.

The files can be recovered if the plaintext of 64 encrypted bytes is known, which is indeed feasible for many files. The recoverability of a given file depends on its size: if a file is smaller than 5000 bytes, recovery is not possible. For files larger than 1GB, the first 5000 bytes are lost. Files between 5000 bytes and 1GB can be fully recovered. This article presents an analysis and recovery method for files encrypted by the Black Basta ransomware.

As of December 2023, Black Basta has changed their encryption and is not vulnerable anymore, rendering encrypted files unrecoverable by the presented method.

Introduction

This article presents our investigation into a Black Basta malware sample from April 2023. We obtained the sample, reverse engineered the malware, and ran experiments with it to analyse its behaviour and investigate recovery options.

This analysis is structured as follows. First, background information on the malware and the cryptography it uses is given. Second, the cryptographic weakness in the ransomware’s usage of the XChaCha20 keystream is explained. Third, methods to recover encrypted files are presented. Fourth, limitations of the recovery options are discussed. Finally, the findings of this report are summarised.

This article makes the following contributions:

  1. An analysis of the encryption technique used by Black Basta ransomware until December 2023;
  2. a demonstration of a weakness in the ransomware's use of an XChaCha20 keystream, which may result in full or partial recovery of encrypted files;
  3. an implementation of an automatic decryption tool that detects encrypted null bytes and attempts to recover files automatically.

Background

This section presents the encryption technique used by the Black Basta ransomware and establishes the cryptographic background necessary to understand the implementation's weakness described in the next section.

Samples of the Black Basta malware from November 2022 were already analysed by Zscaler. According to their description, the ransomware generates an ephemeral asymmetric key-pair for each file to be encrypted. In addition, the malware embeds a unique, per-victim asymmetric public key in its binary.

The malware uses the embedded public key and the per-file ephemeral private key for a Diffie-Hellman-style key agreement to obtain secret key material, which secures an additional per-file randomly generated symmetric encryption key. The symmetric key is used to initialise an XChaCha20 stream cipher. The cipher is then used to encrypt the file.

To allow the ransomware operator to decrypt the files, the ephemeral per-file public key and the encrypted symmetric key are embedded in the encrypted file. Together with the asymmetric per-victim private key held by the ransomware operator, the result of the Diffie-Hellman key agreement allows for decrypting the embedded encrypted symmetric key.

The XChaCha20 stream cipher (based on ChaCha20) used by the malware to encrypt the file is typically initialised with a key to generate an arbitrary number of random bytes, which are then XORed with the plaintext (see Figure 2). Decrypting XChaCha20 follows the same process except that the keystream is XORed with the cipher text to obtain the plaintext. In particular, the cipher is initialised with the same (symmetric) key. This causes the cipher to produce the same pseudo-random bytes (keystream). Because of the commutative and self-inverse nature of the XOR operation, XORing the cipher text with the same bytes used for encryption will yield the plaintext.

 

Figure 2. Schematic overview of stream cipher operation.

In other words: Initialising the cipher multiple times with the same symmetric key yields the same bytes each time.

If used properly, the XChaCha20 stream cipher produces different random bytes each time it is invoked. Those random bytes (R) can then be applied to the plaintext (P) with the XOR (⊕) function to produce cipher text (C): C = (P ⊕ R). If the plaintext is known, the random bytes can be inferred by rearranging the equation: R = (C ⊕ P). Inferring these random bytes is not interesting if the stream cipher is used properly, because these random bytes are used only once and are never re-used for encrypting other parts.

If the stream is not used properly and the same random bytes are re-used for encryption more than once, the random bytes can be used to recover the other encrypted blocks by simply XORing the random bytes with the encrypted parts: Pₙ = (Cₙ ⊕ R). Hence, re-using the same keystream for multiple encryptions is detrimental to the security of the cipher text and may result in full recovery of the plaintext.
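The two equations above can be checked with a few lines of Python. This is a minimal sketch with made-up data: the keystream block `R` is simply random bytes, and the names `P1`, `P2`, `C1`, and `C2` are hypothetical and do not come from the malware.

```python
import os

BLOCK = 64  # size of the re-used keystream block in bytes


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical setup: one keystream block R is wrongly used for two blocks.
R = os.urandom(BLOCK)
P1 = bytes(BLOCK)                                # known plaintext: 64 zero bytes
P2 = b"unknown file content".ljust(BLOCK, b".")  # not known to the recoverer

C1 = xor(P1, R)
C2 = xor(P2, R)

# Recovery uses only P1, C1, and C2.
recovered_R = xor(C1, P1)             # R = C ⊕ P
recovered_P2 = xor(C2, recovered_R)   # Pn = Cn ⊕ R

print(recovered_P2)
```

Note that with an all-zero known plaintext, `C1` is already identical to `R`, which is why encrypted zero bytes are so convenient for recovery.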

Weakness observed in the malware

This section describes the weakness of the Black Basta malware that ultimately allows for decrypting a file without knowledge of the private key.

When experimenting with the malware, we encrypted multiple files containing various patterns of plaintext. When analysing encrypted files containing only zero bytes, we could observe a pattern in the encrypted blocks as depicted in Figure 3.

Figure 3. Schematic overview of stream cipher operation. 

The image shows an encrypted file which contained zeros only. The encrypted file contains 64 random (encrypted) bytes followed by 128 plaintext bytes. This pattern repeats until the end of the file.

Our first observation is that the file is not fully encrypted. Instead, only parts of the file are encrypted. We discuss the logic for the selection of which parts of the file get encrypted in the following section.

Our second observation is that the encrypted blocks are made up of the same bytes. In other words, the encryption is not randomised.

We formed the hypothesis that the malware generates 64 bytes and XORs those with the plaintext to produce the cipher text as described in the previous section.

Figure 4. Excerpt from the decompiled EncryptFile routine. The ransomware generates the same 64-byte keystream for every chunk to be encrypted.

We tested our hypothesis by letting the malware encrypt a series of zero bytes followed by a series of one bytes. If the same keystream (k) is used for both, we expect the difference (byte-wise XOR) of the encrypted bytes to be one (0x01) due to the commutative and self-inverse properties of the XOR (⊕) operator: (0x00 ⊕ k) ⊕ (0x01 ⊕ k) == (0x00 ⊕ 0x01) ⊕ (k ⊕ k) == (0x00 ⊕ 0x01) ⊕ 0x00 == 0x01. We observed that the difference of the encrypted zero bytes and the encrypted one bytes is indeed one, supporting our hypothesis.

We finally confirmed our hypothesis through reverse engineering the malware. The routine responsible for the encryption, aptly named "EncryptFile", loops over the to-be-encrypted file and, for every 64-byte chunk to be encrypted, (re-)generates the deterministic keystream and uses that for the encryption of the chunk (see Figure 4).
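To illustrate the difference between re-generating and advancing the keystream, the following sketch contrasts the two loop shapes. It is not the decompiled routine. It assumes a toy keystream built from SHAKE-256 (chosen only because it ships with Python, it is not XChaCha20), it encrypts every chunk instead of skipping parts of the file, and all function names are hypothetical.

```python
import hashlib

CHUNK = 64  # bytes encrypted per step, as observed in our experiments


def toy_keystream(key: bytes, offset: int, length: int) -> bytes:
    """Return `length` deterministic bytes starting at `offset` of a toy stream.

    Stand-in only: SHAKE-256 is NOT XChaCha20. Like a stream cipher, it yields
    a long deterministic byte stream for a given key.
    """
    return hashlib.shake_256(key).digest(offset + length)[offset:]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR, truncated to the shorter input."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_buggy(key: bytes, data: bytes) -> bytes:
    """Re-generate the keystream from position 0 for every chunk (the flaw)."""
    out = bytearray()
    for start in range(0, len(data), CHUNK):
        chunk = data[start:start + CHUNK]
        out += xor(chunk, toy_keystream(key, 0, len(chunk)))
    return bytes(out)


def encrypt_advancing(key: bytes, data: bytes) -> bytes:
    """Advance the keystream position with every chunk (presumed intent)."""
    out = bytearray()
    for start in range(0, len(data), CHUNK):
        chunk = data[start:start + CHUNK]
        out += xor(chunk, toy_keystream(key, start, len(chunk)))
    return bytes(out)


# The zero-byte / one-byte experiment from above, run against both variants.
key = b"hypothetical per-file key"
plain = bytes(CHUNK) + b"\x01" * CHUNK

buggy = encrypt_buggy(key, plain)
fixed = encrypt_advancing(key, plain)

diff_buggy = xor(buggy[:CHUNK], buggy[CHUNK:])
diff_fixed = xor(fixed[:CHUNK], fixed[CHUNK:])
print(diff_buggy == b"\x01" * CHUNK, diff_fixed == b"\x01" * CHUNK)
```

With the buggy loop, the difference of the two encrypted chunks consists of 0x01 bytes only, matching our observation. With the advancing loop, the difference depends on two distinct keystream blocks and reveals nothing about the plaintext.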

It could be a deliberate choice of the malware authors to re-use the same block of 64 bytes because not advancing the keystream trivially makes encryption significantly faster. However, we do not believe that this is the case here for two reasons.

Firstly, we observed that the keystream is newly initialised for each block rather than computed once, held in memory, and then used for the rest of the file. Hence, we believe that the malware authors intended to advance the keystream but forgot to actually do so.

Secondly, older variants of the malware did make proper use of the keystream. The older variants used RSA instead of ECC, so we speculate that the malware authors introduced a regression when refactoring their code to change to the new cryptographic algorithms. Finally, newer strains of the malware that have surfaced at the beginning of December 2023 have indeed rectified the identified behaviour to generate the keystream properly.

File Recovery

After having reviewed the weak encryption used by the malware in the last section, this section describes how to recover files encrypted by the Black Basta malware.

In order to exploit the weakness described in the previous section and recover an encrypted file, we: 1) find the position and size of the encrypted chunks within the file, and 2) obtain the 64-byte keystream that is re-used for the encryption of the file and apply it to the affected parts of the file.

Position and size of encrypted chunks

In our experiments we observed that the malware encrypts chunks of 64 bytes at a time. We noticed different behaviours based on the size of the file to be encrypted. To understand the malware's behaviour, we examined the decompiled executable.

The decompiled code (see Figure 5) suggests that the malware distinguishes three cases:

  1. The file is smaller than 5000 bytes: the malware encrypts the full file.
  2. The file is at least 5000 bytes but smaller than 1GB: the malware encrypts 64 bytes and skips 128 bytes, repeating until the end of the file has been reached.
  3. The file is at least 1GB in size: the first 5000 bytes are encrypted. Then, the malware encrypts 64 bytes and skips the following 6336 bytes, repeating until the end of the file has been reached.

These insights were already shared by Zscaler in an article from December 2022 describing Black Basta malware samples from November 2022.

We note that there is a fourth case in the code for files smaller than 0x0ccccccc bytes, roughly 214MB. But that case is guarded by a check for the literal 0x05. We inferred that this is a check for the version of the encryption used. However, in our experiments, we only found version 0x06 so the condition was not met. We wondered whether there is another use for the version field but have not investigated its use further.

Figure 5. Excerpt from decompiled ransomware code. The code shows that the ransomware encrypts files of different sizes differently.
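The three size cases listed above can be sketched as a small generator that yields the byte ranges expected to be encrypted. This is an illustration under stated assumptions rather than a transcription of the decompiled code: "1GB" is taken to mean 10^9 bytes (it could equally be 2^30), the repeating pattern for large files is assumed to start directly after the first 5000 bytes, and a trailing chunk shorter than 64 bytes is assumed to be encrypted as well. The function name `encrypted_ranges` is hypothetical.

```python
from typing import Iterator, Tuple

SMALL_LIMIT = 5000     # bytes, threshold for full encryption
LARGE_LIMIT = 10**9    # "1GB"; assumption, might also be 2**30
CHUNK = 64             # bytes encrypted per step


def encrypted_ranges(size: int,
                     large_limit: int = LARGE_LIMIT) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) for every range expected to be encrypted."""
    if size < SMALL_LIMIT:
        # Case 1: the whole file is encrypted.
        if size > 0:
            yield (0, size)
        return

    if size < large_limit:
        # Case 2: encrypt 64 bytes, skip 128 bytes, repeat.
        offset, skip = 0, 128
    else:
        # Case 3: first 5000 bytes, then encrypt 64 bytes, skip 6336 bytes.
        yield (0, SMALL_LIMIT)
        offset, skip = SMALL_LIMIT, 6336  # assumed start of the pattern

    while offset < size:
        yield (offset, min(CHUNK, size - offset))
        offset += CHUNK + skip


# Example: the first three encrypted chunks of a 1 MB file.
print(list(encrypted_ranges(1_000_000))[:3])
```

The `large_limit` parameter exists only so that the large-file case can be exercised with small test inputs.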

Determining the re-used keystream

Knowing where the encrypted blocks are, we now investigate methods to determine what bytes the keystream is composed of. The goal is to find the 64-byte keystream and XOR it onto the affected parts of the file.

Since we know the position and the length of encrypted blocks within the file, one technique to determine the 64-byte keystream is to enumerate the encrypted blocks and check whether the plaintext is known. For example, those bytes might be recoverable from a backup. However, if a backup exists, it is likely that the file can be recovered from the backup and does not need to be decrypted. The plaintext might also be known because the file format requires certain bytes to be in place or because the content is known to the file owner.

One particular case of known plaintext bytes is a fully provisioned virtual machine disk image because such disk images are likely to contain long stretches of zero bytes. With those files, finding the keystream is a matter of extracting an encrypted chunk of the file. Generally, given a 64-byte plaintext block and the corresponding encrypted 64-byte block, the keystream can be recovered by simply XORing the two blocks.

We have developed a tool to extract a chunk of a file which can then be used for decrypting the file or parts thereof. We have further developed a tool for finding encrypted zero bytes. The logic boils down to detecting a series of plaintext zero blocks surrounding a non-zero block. The core of the logic is shown in Figure 6. Once the keystream is known, it can be applied to the file by XORing it onto the encrypted parts of the file.

Figure 6. Python code to detect encrypted zero-block
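The idea of detecting a non-zero block surrounded by plaintext zero blocks can be sketched as follows. This is an independent, minimal illustration and not the code of the published tool. It assumes the 64/128 layout for files between 5000 bytes and 1GB, works on data held in memory (real disk images would need to be streamed), and requires a candidate to be seen at least twice to reduce false positives. The names `find_keystream` and `apply_keystream` are hypothetical.

```python
from collections import Counter
from typing import Optional

CHUNK, SKIP = 64, 128      # layout for files between 5000 bytes and 1GB
STRIDE = CHUNK + SKIP


def find_keystream(data: bytes, min_hits: int = 2) -> Optional[bytes]:
    """Guess the keystream from encrypted chunks that sit between zero runs.

    If the 128 plaintext bytes before and after an encrypted chunk are all
    zero, the chunk itself was probably zero too, so the encrypted chunk
    equals the keystream (0 XOR k == k).
    """
    zeros = bytes(SKIP)
    hits = Counter()
    for off in range(STRIDE, len(data) - STRIDE + 1, STRIDE):
        block = data[off:off + CHUNK]
        before = data[off - SKIP:off]
        after = data[off + CHUNK:off + STRIDE]
        if before == zeros and after == zeros and any(block):
            hits[block] += 1
    if not hits:
        return None
    candidate, count = hits.most_common(1)[0]
    return candidate if count >= min_hits else None


def apply_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR the keystream onto every chunk that the layout marks as encrypted."""
    out = bytearray(data)
    for off in range(0, len(out), STRIDE):
        for i in range(min(CHUNK, len(out) - off)):
            out[off + i] ^= keystream[i]
    return bytes(out)


# Self-test with made-up data: a header, a long zero run, and a trailer.
demo_keystream = bytes(range(1, 65))            # hypothetical keystream
plain = b"A" * 192 + bytes(192 * 10) + b"B" * 192
encrypted = apply_keystream(plain, demo_keystream)   # XOR is its own inverse

found = find_keystream(encrypted)
print(found == demo_keystream)
```

A plaintext block that is genuinely non-zero between two zero runs would also be counted as a candidate, which is why the sketch picks the most frequent one and why the result should be checked manually before a file is overwritten.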

Caveats

This section discusses the limitations of the recovery approach presented in the previous section.

1.   Plaintext required

The first requirement for recovering files with the presented approach is the knowledge of the plaintext of 64 encrypted bytes of the file. Knowing the plaintext is trivially possible for encrypted zero bytes. Certain file types expose long sequences of zeros. For example, hard drive backing files used for virtual machines tend to be initialised with zeros and are only populated with content once the guest in the virtual machine actually writes files to the virtual disk drive. When recovering encrypted virtual machine disk images, there is a good chance of finding a sufficiently long sequence of zero bytes. Other file types may not expose such a way to determine known plaintext.

2.   Large files only

The second requirement for successfully recovering a file is its size. If the file is smaller than 5000 bytes, then recovery is not possible. Similarly, if the file is larger than 1GB, then the first 5000 bytes, except the very first 64 bytes, are lost. However, certain file types can be fully recovered despite the lost bytes. For example, the first partition on virtual disk images is frequently used as a system partition to hold generic operating system files or swap memory rather than functionally important files. More importantly, the partition table is located within the first 5000 bytes of the disk and losing the partition table would render the disk unusable. Fortunately, the partition table, be it GPT or MBR, can be recovered either from backup copies in the case of GPT or reconstructed in the case of MBR. A popular tool for recovering partition tables is testdisk.

3.   Multiple encryptions

When running our experiments, we have noticed that the malware is capable of encrypting the same file multiple times. In that case, file recovery is more involved and will likely require manual review of the encryption structure. In particular, the first pass of the malware may have encrypted the file fully, but the second pass only encrypted the first half of the file. A layered decryption approach with manual supervision is required for such cases.

Conclusion

This article examined how the Black Basta ransomware (from April 2023 until December 2023) encrypts files and how to recover such encrypted files. In particular, we analysed the malware's use of the XChaCha20 keystream and exposed a weakness in its use of the keystream.

Our results suggest that files can be recovered if the plaintext of 64 encrypted bytes is known. Whether a file is fully or partially recoverable depends on the size of the file. Files below the size of 5000 bytes cannot be recovered. For files between 5000 bytes and 1GB in size, full recovery is possible. For files larger than 1GB, the first 5000 bytes will be lost but the remainder can be recovered.

The recovery hinges on knowing the plaintext of 64 encrypted bytes of the file. In other words, knowing an arbitrary 64 bytes is not sufficient: the known plaintext bytes need to be in a location of the file that is subject to encryption based on the malware's logic of determining which parts of the file to encrypt. However, since the malware specifically targets VMware ESXi hosts running virtual machines, we argue that knowing 64 bytes of the plaintext in the right position is feasible because virtual machine disk images are likely to contain long sequences of zero bytes.

Our tools may recover files containing encrypted zero bytes. Depending on how many times and to what extent the malware encrypted the file, manual review may be required to fully recover a file.

Finally, this article described a weakness in the Black Basta malware from November 2022 until December 2023. The newer strains which have surfaced at the beginning of December 2023 have fixed this particular weakness, so our tool can no longer recover files encrypted by them. However, if your organisation has backed up encrypted data, our decryption tool may help to recover files.
