About code emulation technology
The code emulation method of malware detection scans a file’s behavior by emulating its execution in a virtual (emulated) environment. In general, this approach is similar to that of malware detection in a sandbox, but emulation and full-featured sandboxing differ in details of design and application. Let’s look at the difference.
A full-featured sandbox, unlike an emulator, is a “heavyweight” method. It emulates the whole environment and runs a scanned sample in a virtual machine with a real operating system (OS) and applications installed. As a result, this method requires high computing power and imposes compatibility constraints on the host system. For this reason, a sandbox is most effective in centralized on-premises and in-cloud solutions. It is not suitable for malware detection on user hosts and other regular computers.
An emulator emulates only the execution of the sample itself. It temporarily creates objects that the sample interacts with: passwords a piece of malware will want to steal, antiviruses it will attempt to stop, memory, the system registry and so on. These objects are not real parts of the OS or software, but imitations made by the emulator. Its control over the emulated environment lets the emulator fast-forward time, observe a file’s future behavior and prevent malware from evading detection through time delays.
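The fast-forward idea can be illustrated with a minimal sketch. It assumes a hypothetical `VirtualClock` class and a toy `delayed_sample` function, neither of which comes from any real product. The point is only that a sleep request inside an emulated environment can advance a counter instead of actually waiting.

```python
class VirtualClock:
    """Hypothetical emulated time source: sleeping advances a counter."""

    def __init__(self, start_ms: int = 0) -> None:
        self.now_ms = start_ms

    def sleep(self, duration_ms: int) -> None:
        # No real waiting happens; the emulator just moves its clock forward.
        if duration_ms < 0:
            raise ValueError("duration must be non-negative")
        self.now_ms += duration_ms

    def tick_count(self) -> int:
        return self.now_ms


def delayed_sample(clock: VirtualClock) -> str:
    """Toy stand-in for a sample that stalls for 10 minutes before acting."""
    ten_minutes_ms = 10 * 60 * 1000
    start = clock.tick_count()
    clock.sleep(ten_minutes_ms)
    # The sample checks that the delay "really" elapsed before continuing.
    if clock.tick_count() - start >= ten_minutes_ms:
        return "payload executed"
    return "stayed dormant"


clock = VirtualClock()
result = delayed_sample(clock)
print(result, clock.tick_count())
```

Because the sample reads time only through the emulated clock, its delay completes immediately in real time and the later behavior becomes observable.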
An emulator determines essential behavior features of a scanned file while using far fewer resources than a sandbox, and it is suitable for user hosts. Execution of unknown files is usually postponed until they are scanned with an emulator. The emulation approach is not new, but some emulators are very advanced, and their share in malware detection is substantial. Today’s emulators are empowered with cloud-based reputation services, and their efficacy is boosted by machine learning.
Kaspersky Lab emulator
Kaspersky Lab solutions include an emulator as one line of defense in a multi-layered approach to protection. It emulates binary files and scripts, and the importance of the latter is growing with the increasing popularity of script-based fileless attacks.
Emulation is optimized for limited computer resources. It takes much less RAM per object than a sandbox and can scan many objects simultaneously without substantially loading the system. Thanks to hardware acceleration, emulation safely uses the processor to speed up scanning about 20 times.
Kaspersky Lab solutions start emulation scanning “on demand,” when a user requests a disk scan, or “on access,” when an object is automatically scanned before it is accessed or executed. Emulation may start in parallel with other detection methods such as requests for process reputation in the cloud.
Emulators are implemented in Kaspersky Lab endpoint solutions, gateway-level solutions (e.g. proxy and email servers) and in virtualization environment protection. In Kaspersky Lab infrastructure, powerful emulators are part of the object classification pipeline.
- Emulates execution of any executable files (PE): *.exe, *.dll, *.sys and others in a Windows environment.
- Scans scripts received via a web link (on a web page, in an email, in a message) or embedded in PDF and MS Office files.
The emulation technology is implemented with an emulation core and detection records, which analyze the data provided by the core. The records are created by Kaspersky Lab, and solutions download updates hourly. One detection record can detect many different malware samples that differ in binary content but behave similarly.
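To illustrate how one behavior-based record can cover several samples, here is a minimal sketch. The `DetectionRecord` class, its fields and the record name are hypothetical and are not the vendor's record format. The sketch assumes a record is simply an ordered list of API names that must appear, in that order, somewhere in the API call log.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DetectionRecord:
    """Hypothetical record: API names that must occur in this order."""
    name: str
    required_calls: Tuple[str, ...]

    def matches(self, api_log: List[str]) -> bool:
        # Subsequence test: each required call must appear after the
        # previous one, with any number of other calls in between.
        remaining = iter(api_log)
        return all(call in remaining for call in self.required_calls)


record = DetectionRecord(
    name="Example.Injector",  # illustrative name only
    required_calls=("OpenProcess", "WriteProcessMemory", "CreateRemoteThread"),
)

# Two samples with different surrounding calls but the same core behavior.
log_a = ["GetTickCount", "OpenProcess", "VirtualAllocEx",
         "WriteProcessMemory", "CreateRemoteThread"]
log_b = ["OpenProcess", "Sleep", "WriteProcessMemory",
         "GetLastError", "CreateRemoteThread", "ExitProcess"]
benign_log = ["CreateFileW", "ReadFile", "CloseHandle"]

print(record.matches(log_a), record.matches(log_b), record.matches(benign_log))
```

The match depends only on what the samples do, so their binary content can differ arbitrarily.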
Malware detection workflow
- The emulator receives a request to scan an object (an executable file or a script) from another component of a security solution.
- The emulator safely executes the object’s instructions one by one in a virtual environment, starting from the object’s entry point. If an instruction interacts with the environment (OS, registry, other files, web, memory etc.), the emulator imitates a response from these objects.
- The emulator collects artifacts and passes them to the heuristic analyzer. The analyzer passes a verdict based on these artifacts to the component that requested the analysis.
- The emulation stops when there are enough artifacts to detect malware or due to a timeout.
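The workflow above can be sketched as a small loop. Everything here is a simplifying assumption made for illustration: the toy instruction format (`reg_read`, `reg_write`, `nop`), the `emulate` and `analyzer` functions, the imitated registry contents, and the step budget standing in for a timeout. The single heuristic rule is also invented for the example.

```python
RUN_KEY = "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run"


def analyzer(artifacts):
    """Toy heuristic: flag any registry write under the autorun key."""
    for kind, target in artifacts:
        if kind == "reg_write" and target.startswith(RUN_KEY):
            return "detected"
    return None  # not enough evidence yet


def emulate(instructions, analyze, max_steps=100):
    """Execute toy instructions one by one against an imitated registry."""
    fake_registry = {"HKLM\\Example\\InstallDate": "2020-01-01"}
    artifacts = []
    for step, (op, arg) in enumerate(instructions):
        if step >= max_steps:
            return "timeout", artifacts
        if op == "reg_read":
            # The imitated value is what a real emulator would hand back.
            _response = fake_registry.get(arg, "")
            artifacts.append(("reg_read", arg))
        elif op == "reg_write":
            key, value = arg
            fake_registry[key] = value  # only the imitation changes
            artifacts.append(("reg_write", key))
        # Any other op (e.g. "nop") does not touch the environment.
        verdict = analyze(artifacts)
        if verdict is not None:
            return verdict, artifacts  # enough artifacts: stop early
    return "no detection", artifacts


sample = [
    ("nop", None),
    ("reg_read", "HKLM\\Example\\InstallDate"),
    ("reg_write", (RUN_KEY + "\\updater", "C:\\Temp\\u.exe")),
    ("nop", None),
]
verdict, artifacts = emulate(sample, analyzer)
stall_verdict, stall_artifacts = emulate([("nop", None)] * 500, analyzer)
print(verdict, len(artifacts), stall_verdict)
```

The first run stops as soon as the analyzer has enough artifacts, and the second run ends on the step budget, mirroring the two stop conditions in the list.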
Artifacts collected by the emulator
For executable files (binaries):
- API call log
- All changes in the file system and system registry
- Memory dumps
- Arguments and returns of string operations
- Calls of embedded functions and functions provided by the environment
- Drops to file system and child scripts
Advanced malware writers equip their malware with features to prevent detection in emulation. Kaspersky Lab tracks and counteracts these new evasion techniques. Examples:
Evasion A: Before it can run, the malware has to unpack itself. Unpacking takes a lot of computation time, usually enough for the emulation to time out before the malicious behavior appears.
Counter evasion A: The emulator recognizes packed files and adjusts the emulation depth accordingly. Hardware acceleration gives the emulator enough power to get through the unpacking stage.
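The text does not say how packed files are recognized. One commonly used heuristic, shown here purely as an assumption, is byte entropy: compressed or encrypted data looks close to random. The function names, the 7.0 bits-per-byte threshold and the budget numbers below are all illustrative choices.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def step_budget(data: bytes, base: int = 100_000,
                packed_multiplier: int = 10, threshold: float = 7.0) -> int:
    """Grant a deeper emulation budget to content that looks packed."""
    if shannon_entropy(data) > threshold:
        return base * packed_multiplier
    return base


plain_like = b"A" * 4096             # highly repetitive, entropy 0
packed_like = bytes(range(256)) * 16  # uniform byte distribution, entropy 8

plain_budget = step_budget(plain_like)
packed_budget = step_budget(packed_like)
print(plain_budget, packed_budget)
```

A larger budget only helps if the emulator is fast enough to spend it, which is where the hardware acceleration mentioned above comes in.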
Evasion B: Before executing its malicious payload, malware may access web resources or parameters of its environment (e.g. computer name, disk size) and check whether they are available and meaningful. If it sees no meaningful response, the malware does not execute its payload and evades detection.
Counter evasion B: In response to the scanned file’s requests, the emulator imitates information about the environment and system resources, making it as meaningful as it can. For example, it randomizes computer names, so the malware cannot use a specific computer name as a sign that it is running in an emulator.
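A minimal sketch of the randomization idea follows. The `fake_computer_name` and `fake_environment` helpers, the name format and the list of disk sizes are hypothetical. The sketch only shows that each emulation run can present a different but plausible-looking environment.

```python
import random
import string


def fake_computer_name(rng: random.Random) -> str:
    """Build a plausible-looking host name with a random 7-character suffix."""
    alphabet = string.ascii_uppercase + string.digits
    suffix = "".join(rng.choices(alphabet, k=7))
    return "DESKTOP-" + suffix


def fake_environment(rng: random.Random) -> dict:
    """Imitated answers to typical environment queries (illustrative values)."""
    return {
        "computer_name": fake_computer_name(rng),
        "disk_size_gb": rng.choice([256, 512, 1024]),
    }


# A seeded generator is used here only to make the example reproducible;
# an unseeded random.Random() would give a fresh environment per run.
env = fake_environment(random.Random(42))
env_again = fake_environment(random.Random(42))
print(env["computer_name"], env["disk_size_gb"])
```

With no fixed name to look for, a check such as "exit if the computer name equals a known emulator name" has nothing stable to match against.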