Illustration of "Modular Worm project: Early Deliverable"

Abstract

The worm architecture is divided into two parts: the core and the modules. The only goal of the core is to provide a means of communication for the modules. Each module represents a feature. It can be an exploit or an interface, such as a network interface or a file system interface. Modules are shared objects, which means we can load and unload them at runtime. Several instances of the worm can be present on a single device. It can be useful to have an active instance and a harmless one that only waits until the first one is compromised to activate itself.

Core design

The core part is an executable binary. The modules are shared libraries. The core provides an API to the modules. This API allows a module to send and receive messages to and from other modules, establish a list of the available features, and control its own usage.

Presentation

The core is an executable file written in C, a compiled and fast language that makes it easy to load shared objects with the dlopen, dlsym and dlclose functions. Using shared libraries makes the worm very modular and easy to modify at runtime, because the code is not stored in a single file.

Core versioning and compatibility

The core has a versioning system of the form X.Y (a major version X for compatibility and a minor version Y for bug fixes and features). The version number is encoded on 15 bits: 7 bits for X and 8 bits for Y.
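As an illustration of the 15-bit encoding, here is a minimal sketch. The bit layout (X in the upper 7 bits, Y in the lower 8 bits) and the names version_pack, version_major and version_minor are assumptions made for this example; the text above only fixes the field widths.

```c
#include <stdint.h>

/* Assumed layout (not specified in the text): X in bits 14..8, Y in bits 7..0. */
#define VERSION_MINOR_BITS 8u
#define VERSION_MAJOR_MAX  0x7Fu  /* 7 bits: 0..127 */
#define VERSION_MINOR_MAX  0xFFu  /* 8 bits: 0..255 */

/* Pack X.Y into the low 15 bits of a 16-bit value; out-of-range inputs are masked. */
static uint16_t version_pack(unsigned major, unsigned minor)
{
    return (uint16_t)(((major & VERSION_MAJOR_MAX) << VERSION_MINOR_BITS)
                      | (minor & VERSION_MINOR_MAX));
}

/* Extract the major number X (7 bits). */
static unsigned version_major(uint16_t v)
{
    return (v >> VERSION_MINOR_BITS) & VERSION_MAJOR_MAX;
}

/* Extract the minor number Y (8 bits). */
static unsigned version_minor(uint16_t v)
{
    return v & VERSION_MINOR_MAX;
}
```

With this layout the top bit of a 16-bit word stays free, and comparing two packed values as integers orders them by major version first, then by minor version.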

Internal communication API for modules

Each module has a message queue, which is managed by the core. When a module sends a message to another module, the message is put in the recipient’s queue. All memory allocation is managed by the core, which means that messages do not persist once they have been read. To store data and retrieve it later, a database module will be available. This system confines each module to its own abstraction and protects the modules against each other. It also makes concurrent and parallel programming easier.

On launch, the core has to load a set of available modules. The core then runs an infinite loop that calls the executable part of each module. It provides a set of functions (the core API) that are available to the modules:

  • send_message, used to send an asynchronous message to another module. It allows modules to exchange data.
  • send_message_instant, used to send a synchronous message to another module, and get an instant response from it. It allows faster procedures with some limitations (stack limit, timeout).
  • receive_message, used to retrieve messages from a specified module.
  • receive_message_instant, which is called when another module sends a synchronous message to this module.
  • list_messages, to get a list of metadata about the messages present in the module’s message queue.
  • list_modules, to get the list of available and loaded modules. This serves as a feature discovery mechanism.
  • disable, to disable the current module for a specified amount of time (depending on the duration, the core either simply stops calling the module or detaches it). The goal is to make the modules stealthier and speed up the main loop.
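The per-module queue behind the asynchronous calls can be pictured as a small FIFO owned by the core. The following is only a sketch under stated assumptions: the types message_t and message_queue_t, the functions queue_push and queue_pop, and the fixed capacities are all hypothetical, and filtering by sender (as receive_message allows) is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_CAPACITY 8   /* assumed limit, not from the text */
#define PAYLOAD_MAX    64  /* assumed limit, not from the text */

typedef struct {
    uint16_t sender_id;            /* id of the sending module */
    size_t   length;               /* number of valid payload bytes */
    uint8_t  payload[PAYLOAD_MAX];
} message_t;

typedef struct {
    message_t slots[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest message */
    size_t count;  /* number of queued messages */
} message_queue_t;

/* Copy a message into the recipient's queue (the send_message side).
 * Returns 0 on success, -1 if the queue is full or the payload is too large. */
static int queue_push(message_queue_t *q, uint16_t sender_id,
                      const void *data, size_t length)
{
    if (q->count == QUEUE_CAPACITY || length > PAYLOAD_MAX)
        return -1;
    message_t *m = &q->slots[(q->head + q->count) % QUEUE_CAPACITY];
    m->sender_id = sender_id;
    m->length = length;
    memcpy(m->payload, data, length);
    q->count++;
    return 0;
}

/* Remove the oldest message and copy it to *out (the receive_message side).
 * Returns 0 on success, -1 if the queue is empty. */
static int queue_pop(message_queue_t *q, message_t *out)
{
    if (q->count == 0)
        return -1;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 0;
}
```

Because the storage belongs to the queue and messages are copied in and out, a slot is reusable as soon as its message has been read, which matches the statement that messages do not persist after reading.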

Module design

Meta-data and identification

Each module is described by several pieces of information:

  • its id, which is unique to each module. The minimum value is 1.
  • its version number, on 15 bits, like that of the core.
  • the core version it requires, on 15 bits plus 1 flag bit that carries additional requirement information.
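These fields fit naturally into 16-bit words. The sketch below is illustrative only: the struct module_info_t, the width of the id, the position of the flag (bit 15), the X/Y layout inside the 15 bits, and the compatibility rule (same major version, core minor at least the required minor) are all assumptions, since the text does not define them.

```c
#include <stdbool.h>
#include <stdint.h>

#define VERSION_MASK 0x7FFFu  /* low 15 bits: X.Y (assumed: X in bits 14..8, Y in bits 7..0) */
#define REQ_FLAG     0x8000u  /* bit 15: requirement flag (assumed position) */

typedef struct {
    uint16_t id;            /* unique per module, minimum value 1 (width assumed) */
    uint16_t version;       /* module version, 15 bits used */
    uint16_t required_core; /* required core version (15 bits) + flag (1 bit) */
} module_info_t;

/* True when the requirement flag bit is set. */
static bool module_has_flag(const module_info_t *m)
{
    return (m->required_core & REQ_FLAG) != 0;
}

/* Assumed rule: majors must match and the core minor must be at least
 * the minor the module asks for. */
static bool module_is_compatible(const module_info_t *m, uint16_t core_version)
{
    unsigned required   = m->required_core & VERSION_MASK;
    unsigned core       = core_version & VERSION_MASK;
    unsigned req_major  = required >> 8, req_minor  = required & 0xFFu;
    unsigned core_major = core >> 8,     core_minor = core & 0xFFu;
    return m->id >= 1 && core_major == req_major && core_minor >= req_minor;
}
```

Keeping the flag in the otherwise unused top bit means the required version and the flag travel in a single 16-bit field, as the "15 bits + 1 bit" description suggests.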

Stealth and obfuscated code

Two main issues have been found for modules:

  • Avoid the detection of the malicious modules
  • Avoid the detection of the worm because of a known signature of one module

Signature and behaviour obfuscation

Each module is written following some rules. These are implemented with C macros, which generate junk code that is regenerated randomly at each compilation. This junk code obfuscates the traces of the module (injected syscalls, blocks, logic, etc.) at every step of its execution. This could be written:

for (int i = 0; i < 10; i++) {
  call_some_internal_function(i); // some code
  JUNK_MACRO(i) // random code generated at each iteration, with optional variadic parameters
  // ...
}

The goal is to avoid:

  • static signatures: as the junk code is randomly regenerated at each compilation, the byte code of two versions is very different

  • behaviour traces: the junk code may be more or less complex, with anchors in the real code (variables from the real code passed to the junk) and between the junk blocks themselves (each junk block might retain variables that are invisible from the real code). It creates a complex but useless operation that drowns out the real goal of the module. The junk code might have no effect (just perform some operations), use system calls (write files, …) or even run other programs (system(ls), etc.)

Depending on each module’s implementation, the junk code might differ (for a complex system exploit on memory it could only insert short junk between each operation; in other cases it could be a set of algorithm evaluations, such as a sort, etc.).

Junk injection requires careful programming and is a generic tool that must be evaluated each time it is used.

Shared Library injection

Injecting a module’s bytecode into a legitimate shared library could be a way to avoid anti-virus detection, because the legitimate files might already be in some whitelist.

Potential issues:

  • Anti-virus software may detect modifications of executable files or shared libraries. However, this is not a given, because a lot of software uses update systems and is therefore modified by an updater or updates itself.

Worm features

Command center

The first characteristic of a worm is its capability of self-spreading, independently of the user. The worm should have deep control of itself. In other words, it should not depend too much on how the user interacts with it or on the environment (network, operating system). In practice, this means being able to obtain by itself the information it needs to continue to spread, and to take decisions based on that information or on its absence.

Scanner

The first requirement for the worm to spread is to gather data about its environment, in order to identify vulnerabilities and devices that could be used. The scanner module gathers this information by interacting with the system and the network, and uses a database to make it available to other modules, especially the Command center.

Vulnerability exploitation

Exploiting vulnerabilities is the main attack vector of the worm. By using weaknesses in the code of programs, the worm should be able to attack those programs and use them to execute arbitrary code on remote devices. The first exploit developed in the worm uses the “Shellshock” bug, which is a bug related to the use of bash by “Apache HTTPD” and other types of servers that rely on bash.

Privilege escalation

Some vulnerabilities, most of the time trickier and harder to exploit, can be used to extend the permissions of the worm. The final goal here is to become administrator (root) of the current device, by exploiting weaknesses in a given system. The first exploit we want to develop for that purpose is “DirtyCow”, which is a bug in the Linux kernel related to the “copy on write” feature.

Module Discovery

In order to improve stealth, the modules will be located in different places in the system of the infected machine. A tool to find and load the available modules, wherever they are located, has been developed for this purpose.

Database

Some modules may need to store data on the infected device for future use. This database is shared between all the modules. This might be the case for the Scanner module, which will store information about the scanned hosts, based on their IPs. It is a key-value system. It will be reworked soon, because the way the names are stored is not efficient enough. It will also include a permission system so that data can be made available to a specified module instead of any module, to enforce secrecy.

Future and potential work

  • For now, modules are files that contain only the module code. However, injecting the code of the modules into legitimate software and shared libraries might be a good way to avoid some security measures.
  • In order to avoid having a static signature that is easy to identify, we developed a core that must be as tiny and as harmless as possible. Even so, it could be possible to identify it as part of the worm. Furthermore, some modules (for example the exploits) can easily be identified as malicious. Creating a system able to encrypt and decrypt the modules and the core could be a way to avoid most malware detection systems. This system could be implemented as a module that blindly encrypts the other modules and the core, or it could be coupled with the other modules, because some modules may have a greater need for stealth than others.
  • Anti-malware software often includes a feature called “dynamic analysis” that uses the behaviour of a program to classify it as malicious or not. A system able to change the behaviour of the worm should be able to trick this analysis by mutating the graphs generated by the anti-malware system and misleading the classifier. Like the encryption module, it could be “blind” or coupled with the modules that need protection.