In a new cyber threat exploiting ZIP file concatenation, attackers use a Trojan embedded in concatenated ZIP files to target Windows users, evading standard detection methods. This technique takes advantage of how different ZIP file readers interpret concatenated ZIP structures, allowing malicious content to remain undetected in certain programs while becoming visible in others.
Understanding ZIP File Structure
The ZIP file format, a widely used method for data compression, organizes and bundles multiple files into a single archive, making it ideal for efficient file transfers. However, the structure of ZIP files introduces potential vulnerabilities, which attackers can exploit for evasion purposes. Here’s a breakdown of the key structural components that are critical for both functionality and security:
- File Entries: These represent the actual files or folders compressed within the ZIP archive. Each entry contains essential metadata, including the file name, size, and modification date. This metadata helps the ZIP reader identify and handle each file within the archive, allowing users to retrieve individual files.
- Central Directory: The central directory acts as an index for the entire ZIP archive. Located near the end of the ZIP file, just before the EOCD record, it contains a list of all the file entries along with their offsets (locations) within the archive. This structure allows ZIP readers to quickly locate and extract files without scanning the entire ZIP file sequentially. The central directory thus improves both file access speed and efficiency, making it easier to add or modify entries without impacting the overall ZIP structure.
- EOCD (End of Central Directory): The EOCD marks the end of the central directory and includes essential metadata about the entire ZIP archive, such as the total number of file entries and the starting position of the central directory. ZIP readers rely on the EOCD record to determine where the central directory begins, which facilitates quick access to the list of files within the archive.
Together, these components are crucial for enabling ZIP files to function as compact, easily accessible archives. However, the flexibility in this structure also presents potential vulnerabilities, which threat actors exploit through techniques like concatenation. By understanding these components, we gain insight into how attackers use ZIP files to evade detection and hide malicious content.
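To make the role of the EOCD concrete, here is a minimal Python sketch that locates the record and decodes its fixed fields. It assumes a plain archive (no Zip64 extensions) and takes the last occurrence of the signature as the record; the helper names `read_eocd` and `make_zip` are illustrative, not part of any library.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # End of Central Directory signature
EOCD_FIXED_SIZE = 22            # record size without its optional trailing comment


def read_eocd(data: bytes) -> dict:
    """Locate the last EOCD record in `data` and decode its fixed fields."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_FIXED_SIZE > len(data):
        raise ValueError("no End of Central Directory record found")
    # Layout: signature, disk number, disk holding the central directory,
    # entries on this disk, total entries, directory size, directory offset,
    # comment length (all little-endian).
    (_sig, _disk, _cd_disk, _entries_here, total_entries,
     cd_size, cd_offset, comment_len) = struct.unpack_from("<4s4H2LH", data, pos)
    return {
        "eocd_position": pos,
        "total_entries": total_entries,
        "central_directory_size": cd_size,
        "central_directory_offset": cd_offset,
        "comment_length": comment_len,
    }


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


archive = make_zip({"invoice.txt": "amount due: 100", "contract.txt": "terms"})
info = read_eocd(archive)
print(info)
```

The reader only needs the last 22 bytes (plus any comment) to learn how many entries exist and where the central directory starts, which is why it never has to scan the archive from the beginning.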
Understanding ZIP Concatenation and the Attack Technique
ZIP files, widely used for data compression, consist of structural elements like the Central Directory and EOCD (End of Central Directory) to organize file entries efficiently. However, attackers exploit these structural elements by concatenating multiple ZIP files into a single archive, creating multiple central directories. This tactic lets them hide malicious files from detection tools or programs that only read the first directory, ensuring that the Trojan is only visible in select tools like WinRAR.
Imagine you have a ZIP file named documents.zip containing two text files:
- invoice.txt
- contract.txt
Standard ZIP Structure
In a typical ZIP file structure:
- File Entries: Each file (invoice.txt and contract.txt) is stored with metadata such as the file name, size, and modification date.
- Central Directory: This directory is at the end of the ZIP file and includes a list of the files along with their locations within the ZIP. When you open documents.zip, the ZIP reader consults the central directory to quickly locate and display the two files.
- EOCD (End of Central Directory): This record is located at the very end of the ZIP file and indicates where the central directory begins, making it possible for ZIP readers to efficiently find and display files without scanning the entire archive.
Exploitation via Concatenation
Attackers can exploit this structure through concatenation by appending a second ZIP archive to documents.zip. Here’s how:
- They create a new, separate ZIP file, malware.zip, containing a hidden executable file named virus.exe.
- Using concatenation, they append malware.zip to the end of documents.zip, creating a combined file that appears to be a single archive but actually has two central directories (one for documents.zip and one for malware.zip).
Example in Command Line:
zip documents.zip invoice.txt contract.txt # Create initial ZIP with harmless files
zip malware.zip virus.exe # Create malicious ZIP with a hidden file
cat documents.zip malware.zip > combined.zip # Concatenate both into a single ZIP
How Different ZIP Readers Handle the Combined ZIP
Now, let’s see what happens when different programs open combined.zip:
- 7zip: When opening combined.zip with 7zip, only the first ZIP’s central directory (documents.zip) is read, so 7zip displays only invoice.txt and contract.txt. A minor warning might appear, but the hidden virus.exe file is not displayed.
- WinRAR: Unlike 7zip, WinRAR reads the second central directory (malware.zip) and reveals virus.exe, the content of the appended archive. This makes WinRAR a tool that could potentially expose the hidden threat.
- Windows File Explorer: File Explorer may struggle with combined.zip. It may only show virus.exe if it detects the second archive, but it sometimes fails to open concatenated ZIPs altogether, making it unreliable in security scenarios.
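The same discrepancy can be reproduced in a few lines of Python. This is a sketch, not a model of how any of the tools above is implemented: Python's `zipfile` module searches for the EOCD from the end of the data, so it behaves like a reader that follows the last central directory, while a reader that stops at the first archive is approximated here (as an assumption) by truncating the data to the first archive's length. The file contents are harmless placeholders and `make_zip` is a hypothetical helper.

```python
import io
import zipfile


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


benign = make_zip({"invoice.txt": "amount due: 100", "contract.txt": "terms"})
payload = make_zip({"virus.exe": "placeholder bytes, not a real executable"})

# Same effect as: cat documents.zip malware.zip > combined.zip
combined = benign + payload

# Reader A: finds the EOCD by searching backwards from the end of the data,
# so it follows the central directory of the appended archive.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names_from_end = zf.namelist()

# Reader B (modelled): stops at the end of the first archive and ignores
# everything after it.
with zipfile.ZipFile(io.BytesIO(combined[:len(benign)])) as zf:
    names_from_start = zf.namelist()

print("reader following the last EOCD :", names_from_end)
print("reader stopping at first archive:", names_from_start)
```

Neither listing contains all three files, which is the core of the problem: a scanner and the program the victim later uses can each be "correct" and still disagree about what the archive contains.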
Why This Matters
The discrepancy in how ZIP readers interpret concatenated archives allows attackers to disguise malware in ZIP files. Security tools relying on ZIP readers like 7zip might miss the hidden virus.exe, allowing the malware to bypass initial detection and later infect the system if opened in a program like WinRAR.
Evasion Techniques Exploited by Threat Actors
Cybercriminals often use sophisticated techniques to bypass security systems and conceal their malicious payloads. One of these techniques, ZIP concatenation, takes advantage of the structural flexibility of ZIP files to hide malware from detection tools. Here’s how threat actors exploit this technique:
1. ZIP Concatenation
- What It Is: ZIP concatenation involves appending multiple ZIP files into one single file, so it appears as a single archive but actually contains multiple central directories and file entries.
- How It Works: Attackers create two separate ZIP files — one benign and one malicious. They concatenate these files, resulting in a single archive that many ZIP readers interpret inconsistently.
- Effect: By placing the malicious file in the second archive, threat actors can make it undetectable to many security tools that only read the first archive, effectively hiding malware like Trojans or ransomware within the ZIP file.
2. Targeting ZIP Reader Discrepancies
- Different Interpretations: ZIP readers such as 7zip, WinRAR, and Windows File Explorer process concatenated ZIP files differently. This discrepancy allows attackers to exploit these inconsistencies:
- 7zip: Often only reads the first central directory, ignoring the second archive that contains the malicious payload.
- WinRAR: Reads the second central directory and displays the contents of the appended archive, exposing hidden malicious content.
- Windows File Explorer: Inconsistent, sometimes failing to open concatenated ZIP files, or only displaying the second archive if renamed.
- Impact: Attackers rely on users or systems using ZIP readers like 7zip to overlook the malicious content. Only when the file is opened with a more thorough reader, like WinRAR, might the malware be exposed — but by then, the system may already be compromised.
3. Disguising File Extensions and Names
- Changing Extensions: Threat actors often rename ZIP files to extensions like .rar or .pdf to appear as legitimate documents or compressed files in emails.
- Using Familiar Names: Malicious files within the ZIP are frequently named after commonly used files, such as “invoice.pdf” or “shipping_details.txt,” to reduce suspicion. Attackers might append a hidden executable, such as malware.exe, to bypass detection if the archive is opened in ZIP readers that miss the second directory.
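A renamed archive still begins with ZIP magic bytes, so a filter can compare content against the claimed extension. The sketch below is illustrative only: the function name and the allow-list of extensions are assumptions, and the list would need tuning in practice because many legitimate formats (Office documents, JAR files) are ZIP containers.

```python
from pathlib import PurePath

# A ZIP normally starts with a local file header; an empty archive starts
# directly with the EOCD record.
ZIP_MAGIC = (b"PK\x03\x04", b"PK\x05\x06")

# Illustrative allow-list of extensions that are expected to hold ZIP data.
ZIP_BASED_EXTENSIONS = {".zip", ".jar", ".docx", ".xlsx", ".pptx"}


def extension_mismatch(filename: str, head: bytes) -> bool:
    """Return True when the content starts like a ZIP but the name says otherwise."""
    looks_like_zip = head.startswith(ZIP_MAGIC)
    claims_zip = PurePath(filename).suffix.lower() in ZIP_BASED_EXTENSIONS
    return looks_like_zip and not claims_zip


print(extension_mismatch("invoice.rar", b"PK\x03\x04" + b"\x00" * 26))
print(extension_mismatch("documents.zip", b"PK\x03\x04" + b"\x00" * 26))
```

A mismatch is only a signal worth a closer look; it does not by itself prove that the attachment is malicious.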
4. Phishing Emails with High Importance
- Phishing Tactics: These attacks are typically launched through phishing emails marked as “high importance” to create urgency. The email content often urges users to open attached files under the guise of critical business information, like shipping documents or invoices.
- Targeted Recipients: These emails are crafted to appear from familiar sources (e.g., “shipping company” or “billing department”) to increase the likelihood of the recipient opening the ZIP attachment without caution.
5. Using Malicious Scripts (e.g., AutoIt) for Further Evasion
- Scripted Malware: Once the malicious payload is extracted, attackers often use scripting languages like AutoIt to automate the deployment of further threats. These scripts can perform additional tasks, such as:
- Downloading additional malware.
- Stealing sensitive data.
- Propagating within networks.
- Evasion Benefit: Since scripting languages can rapidly execute complex tasks, this adds another layer of difficulty for detection tools that may struggle to identify and isolate malicious script-based activities embedded within the ZIP file.
6. Avoiding Detection by Security Tools
- Security Tool Limitations: Many security tools rely on popular ZIP handlers like 7zip or OS-native readers to scan and parse ZIP files. Threat actors are aware of this and deliberately construct ZIP files to exploit these tools’ blind spots.
- Recursive Extraction Defenses: Traditional detection solutions may lack recursive unpacking capabilities, which means they do not parse every layer of a concatenated ZIP file. Threat actors leverage this gap to keep malicious content hidden in nested or concatenated layers that security software may overlook.
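One simple signal a scanner could use to notice this blind spot is the presence of more than one plausible EOCD record in a single file. The following is a heuristic sketch under stated assumptions (no Zip64, hypothetical function names). A ZIP stored uncompressed inside another ZIP will also trigger it, so the result should be read as "inspect more deeply", not as a verdict.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22


def find_eocd_records(data: bytes) -> list:
    """Return offsets of byte sequences that parse as plausible EOCD records."""
    hits = []
    pos = data.find(EOCD_SIGNATURE)
    while pos != -1:
        if pos + EOCD_FIXED_SIZE <= len(data):
            # Fields at offset 10: total entries, directory size,
            # directory offset, comment length.
            _total, cd_size, _cd_offset, comment_len = struct.unpack_from(
                "<H2LH", data, pos + 10)
            # Plausibility: the central directory must fit in front of the
            # record, and the comment must fit behind it.
            fits_before = cd_size <= pos
            fits_after = pos + EOCD_FIXED_SIZE + comment_len <= len(data)
            if fits_before and fits_after:
                hits.append(pos)
        pos = data.find(EOCD_SIGNATURE, pos + 1)
    return hits


def looks_concatenated(data: bytes) -> bool:
    """Heuristic: more than one plausible EOCD suggests appended archives."""
    return len(find_eocd_records(data)) > 1


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


benign = make_zip({"invoice.txt": "amount due: 100", "contract.txt": "terms"})
payload = make_zip({"virus.exe": "placeholder bytes, not a real executable"})

print(looks_concatenated(benign))
print(looks_concatenated(benign + payload))
```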
Why ZIP Concatenation Evasion Works
This method is particularly effective because it exploits fundamental inconsistencies in ZIP file interpretation across different readers and tools. By strategically placing malicious payloads in parts of the archive that some ZIP readers cannot access, attackers bypass standard detection methods and target users more likely to overlook the hidden threat.
The Countermeasure: Recursive Unpacking Technology
To combat this technique, security researchers are now developing recursive unpacking algorithms that fully parse concatenated ZIP files by examining each layer independently. This approach helps detect deeply hidden threats, reducing the chances of evasion.
In summary, ZIP concatenation is an effective evasion technique, enabling threat actors to bypass standard detection tools and deliver malware hidden within seemingly innocuous files.
Recursive Unpacker: A Solution to Unmask Evasive Malware
As attackers increasingly use techniques like ZIP concatenation to evade detection, security researchers have developed recursive unpacking technology to thoroughly analyze complex, multi-layered archives. Recursive unpacking systematically dissects concatenated or deeply nested files to reveal hidden malicious payloads that traditional detection tools may miss. Here’s how the Recursive Unpacker functions and why it’s a powerful defense against evasive threats.
1. What is a Recursive Unpacker?
- Purpose: A Recursive Unpacker is a security tool designed to break down complex file structures, including concatenated ZIP files and deeply nested archives, to expose every layer of content, whether benign or malicious.
- Function: It goes beyond single-layer extraction by recursively (repeatedly) unpacking each layer of an archive until it reaches the final files. Each layer is individually examined to ensure no hidden content remains unchecked.
2. How It Works
- Layer-by-Layer Extraction: The Recursive Unpacker opens an archive and extracts its contents. For each extracted file, if it detects additional compressed layers (such as a ZIP or RAR within another ZIP), it repeats the unpacking process for every inner layer.
- Detection of Malformed or Concatenated Files: It identifies concatenated ZIP files, where multiple central directories may contain hidden payloads. By detecting and unpacking each central directory separately, the tool ensures that no segment of the file remains uninspected.
- Dynamic Analysis Integration: After extracting all contents, the Recursive Unpacker may integrate with dynamic analysis systems that observe how the files behave when executed. This enables detection of advanced malware behaviors that might not be evident through static analysis alone.
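The "unpack each central directory separately" step can be sketched as a backwards walk over the file: find the last EOCD, use its size and offset fields to compute where that archive starts, cut it off, and repeat on what remains. This is a simplified illustration, not the implementation of any particular product. It assumes archives placed back to back with no Zip64 records and no stray bytes between them, and `split_concatenated` is a hypothetical name.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22


def split_concatenated(data: bytes) -> list:
    """Split back-to-back ZIP archives into separate byte strings, first to last."""
    segments = []
    end = len(data)
    while end > 0:
        pos = data.rfind(EOCD_SIGNATURE, 0, end)
        if pos < 0 or pos + EOCD_FIXED_SIZE > end:
            break
        # Directory size and offset sit at bytes 12..19 of the EOCD record.
        cd_size, cd_offset = struct.unpack_from("<2L", data, pos + 12)
        # The offset is relative to the start of its own archive, so the
        # archive begins this many bytes before the central directory.
        start = pos - cd_size - cd_offset
        if start < 0:
            break
        segments.append(data[start:end])
        end = start
    segments.reverse()
    return segments


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


benign = make_zip({"invoice.txt": "amount due: 100", "contract.txt": "terms"})
payload = make_zip({"virus.exe": "placeholder bytes, not a real executable"})

listings = []
for segment in split_concatenated(benign + payload):
    with zipfile.ZipFile(io.BytesIO(segment)) as zf:
        listings.append(zf.namelist())
print(listings)
```

Each segment can then be handed to whatever scanner or sandbox follows in the pipeline, so that no central directory is skipped.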
3. Example of Recursive Unpacking in Action
Imagine an attacker has sent a ZIP file with the following structure:
- Layer 1: invoice.zip containing document.pdf (benign) and hidden.zip (a nested ZIP file)
- Layer 2: hidden.zip containing malware.exe (a malicious executable) and data.txt (benign text file)
When a Recursive Unpacker analyzes invoice.zip, it first extracts document.pdf and hidden.zip. Upon detecting that hidden.zip is itself an archive, it unpacks this nested layer as well, revealing malware.exe and data.txt. Without recursive unpacking, security tools may have missed malware.exe, which could contain the actual payload.
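The walk described above can be sketched with Python's standard `zipfile` module. This is a minimal illustration that only descends into nested ZIPs (not RAR or other formats), holds everything in memory, and uses placeholder contents; `walk_archive` and its `max_depth` parameter are hypothetical names.

```python
import io
import zipfile


def walk_archive(data: bytes, prefix: str = "", depth: int = 0, max_depth: int = 5):
    """Yield (path, content) for every file, descending into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            content = zf.read(name)
            path = prefix + name
            is_nested = zipfile.is_zipfile(io.BytesIO(content))
            if is_nested and depth < max_depth:
                # Recurse: treat the extracted file as a new archive.
                yield from walk_archive(content, path + "/", depth + 1, max_depth)
            else:
                yield path, content


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Rebuild the two-layer structure from the example with harmless placeholders.
hidden = make_zip({"malware.exe": "placeholder, not a real executable",
                   "data.txt": "notes"})
invoice = make_zip({"document.pdf": "placeholder pdf text",
                    "hidden.zip": hidden})

found = [path for path, _content in walk_archive(invoice)]
print(found)
```

A single-layer extraction would stop at hidden.zip; the recursive walk reports the files inside it, with their path through the layers.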
4. Advantages of Recursive Unpacking
- Full Visibility: Recursive Unpackers ensure every layer of an archive is exposed, leaving no hidden files undetected, regardless of how deeply nested they are.
- Handling Evasive Techniques: By unpacking concatenated and nested files, Recursive Unpackers effectively counter ZIP concatenation evasion, where hidden payloads are deliberately placed in overlooked layers.
- Integration with Advanced Malware Detection: After extraction, files can be passed on for behavioral analysis to detect sophisticated malware that may attempt to execute or download additional payloads only under certain conditions.
5. Use Cases in Cybersecurity
- Detecting Phishing Payloads: Recursive Unpackers are particularly valuable in identifying malicious payloads hidden within email attachments, such as Trojanized ZIP files disguised as invoices or shipping documents.
- Protecting Endpoint Security: On corporate networks, Recursive Unpackers embedded in security software can prevent employees from inadvertently executing hidden malware embedded within ZIP files.
- Malware Research and Forensics: Security analysts can use Recursive Unpackers to thoroughly analyze suspected malicious files, ensuring comprehensive insights into an attack’s structure and methods.
6. Limitations and Challenges
- False Positives: Because they are so thorough, Recursive Unpackers may flag benign nested files as suspicious, requiring further analysis to validate the findings.
- Resource Intensity: Recursive unpacking can be resource-intensive, as it requires processing every layer of large files, which can be time-consuming.
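One common way to keep the resource cost bounded is to check an archive's declared sizes before extracting a layer. The sketch below is an assumption about how such a guard might look, not a description of any specific product. The limits are arbitrary example values, and because the declared sizes come from the archive itself and can be forged, a real unpacker would also need to cap the bytes it actually reads.

```python
import io
import zipfile


def within_budget(data: bytes,
                  max_total_bytes: int = 50 * 1024 * 1024,
                  max_entries: int = 1000) -> bool:
    """Check declared entry count and uncompressed size before extracting."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > max_entries:
            return False
        declared_total = sum(info.file_size for info in infos)
        return declared_total <= max_total_bytes


def make_zip(files: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


sample = make_zip({"invoice.txt": "amount due: 100", "contract.txt": "terms"})
print(within_budget(sample))
print(within_budget(sample, max_total_bytes=10))
```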
For full details and a technical breakdown of the attack, read the original research here.