Sunday, 6 December 2015

Nivdort Code Obfuscation and DGA

The Nivdort malware family has recently been actively spread through spam campaigns. You can find more details about the campaign here: https://techhelplist.com/index.php/spam-list/791-your-facebook-login-has-been-blocked-malware

This malware has certain interesting characteristics which I wanted to share.

Note: Nivdort is also known as BayRob. Decrypting the strings gives us some more indicators.

Some of the reasons this malware is interesting are:

1. It uses a lot of code obfuscation. Even though the binary is not packed, the code is obfuscated to make the process of reversing it difficult.
2. It uses a Domain Generation Algorithm. The algorithm used is interesting because the domains are generated from a dictionary. So, they do not look random.
3. The email address of the recipient of the spam email is hardcoded and stored as an encrypted string in the binary.

Below are the topics I will cover in this article:

1. String Decryption - An in-depth look at how the strings are encrypted in this binary and how we can efficiently decrypt them.
2. Anti Memory Dumping Techniques - How the malware removes all traces of decrypted strings from memory after they are used.
3. Code Obfuscation - The Nivdort variants use a large number of FPU and SSE2 instructions for code obfuscation. One of the variants also creates a lot of threads as an anti-debugging technique.
4. Seed Generation Algorithm - Details about how the seed for the DGA is generated.
5. Domain Generation Algorithm - Details of the algorithm used to generate domains.
6. Domain Name Analysis - Details of the information we can gather from the domain names.

So, let's start with the analysis of this interesting malware :)

String Decryption

All the strings corresponding to the file system, Windows registry and network callbacks are encrypted in this malware. Let us look at how the strings are decrypted at run time.
It generates a Decryption Table from a single DWORD. The size of the decryption table is 0xe32 DWORDs. This means the total size is: 0xe32 * 4 = 0x38c8 bytes.

The process of generating the Decryption Table is as follows. It passes the key and the address of output buffer to the subroutine:

GenerateTable(DWORD key, DWORD *buffer)


In the screenshot above, you can see the Decryption Table returned by the subroutine.

The screenshot below shows the GenerateTable subroutine:


We can see that by performing different arithmetic and logical operations on the single DWORD key, it generates a table of size 0x38c8 bytes.

This subroutine can be reversed. I rewrote the code in C:

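#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
Generate the Table using an initial seed
c0d3inj3cT
*/

#define TABLE_DWORDS 0xe32

int main(void)
{
    uint32_t *buffer;
    uint32_t seed = 0xf8c5a1b8;
    uint32_t t1, t2;
    int i;

    /* calloc zero-fills the whole table (0xe32 DWORDs = 0x38c8 bytes) */
    buffer = calloc(TABLE_DWORDS, sizeof(uint32_t));
    if (buffer == NULL)
        return 1;

    for (i = 0; i < TABLE_DWORDS; i++)
    {
        buffer[i] ^= seed;
        /* rotate the seed right by one bit (unsigned, so no undefined shifts) */
        seed = (seed >> 1) | (seed << 31);
        /* mix the two low-order bytes; the upper 16 bits are kept as-is */
        t1 = ((seed & 0x000000ff) + ((seed & 0x0000ff00) >> 8)) & 0xff;
        t2 = ((seed & 0x000000ff) + t1) & 0xff;
        seed = (seed & 0xffff0000) + (t1 << 8) + t2;
    }

    /* print the first few entries so they can be compared with the debugger */
    for (i = 0; i < 4; i++)
        printf("%08x\n", (unsigned)buffer[i]);

    free(buffer);
    return 0;
}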

Once the Decryption Table is generated, it calls the String Decryption Function.

The String Decryption Function takes two arguments as input. The first is the size of the decrypted string and the second is the offset into the Decryption Table.

DecryptString(DWORD size, DWORD offset)

The binary stores an encrypted blob at a fixed address. The offset above is the same for both the Decryption Table and the Encrypted Data. An XOR operation is performed between the corresponding bytes to get the decrypted strings.
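
To make the XOR step concrete, here is a minimal sketch of what DecryptString appears to do. The function name decrypt_string, its extra pointer parameters and the output buffer are my own assumptions for illustration; I am also assuming that the offset is a byte offset. The real routine works on fixed addresses inside the binary.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical re-implementation of the XOR step described above.
 * table  : decryption table viewed as bytes (0x38c8 bytes in this sample)
 * blob   : encrypted data blob from the binary
 * offset : byte offset shared by both buffers (assumed to be in bytes)
 * size   : number of bytes to decrypt
 * out    : caller-supplied buffer of at least size + 1 bytes
 */
static void decrypt_string(const uint8_t *table, const uint8_t *blob,
                           size_t offset, size_t size, char *out)
{
    for (size_t i = 0; i < size; i++) {
        out[i] = (char)(blob[offset + i] ^ table[offset + i]);
    }
    out[size] = '\0'; /* terminator added for convenience, not part of the blob */
}
```

A caller would pass the table produced by GenerateTable, the encrypted blob, and the same size and offset pair that the binary passes to DecryptString.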

In the screenshot below we can see that the Decryption Subroutine is called several times during the execution of the program:


In order to decrypt all the strings at once, we could write some assembly code in the Debugger and then do an in memory execution to get the decrypted strings. Since we know the address of both the Decryption Table and the Encrypted Data, the decryption routine can be written as follows:

xor ecx, ecx
mov esi, DecryptionTable
mov edi, EncryptedData
decryption:
movzx eax, byte ptr ds:[esi+ecx]
xor byte ptr ds:[edi+ecx], al
inc ecx
cmp ecx, 0x38c8
jnz decryption

Once we have written the decryption subroutine in the debugger, we need to set the EIP to our subroutine and execute the above loop. In the memory dump we can see the list of decrypted strings. Because the loop decrypts the data in place, the memory page holding the Encrypted Data must be writable, so enable write access on it before running the loop.

In the screenshot below we can see the in memory decryption subroutine and the decrypted strings in memory dump:

Before Decryption:

After Decryption:


In this way, we can decrypt all the strings at once.
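
The same bulk decryption can also be done offline on dumped copies of the two buffers. The sketch below is only an illustration: the helper names xor_in_place and next_printable_run are mine, and scanning for runs of printable ASCII (similar to what the strings utility does) is an assumption about how to locate the strings, since the binary itself tracks them by size and offset.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* XOR `len` bytes of `data` in place with the matching bytes of `table`. */
static void xor_in_place(uint8_t *data, const uint8_t *table, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        data[i] ^= table[i];
    }
}

/*
 * Find the next run of at least `min_len` (>= 1) printable ASCII bytes at or
 * after `start`. Returns the offset of the run and stores its length in
 * *run_len, or returns `len` with *run_len set to 0 when no run is left.
 */
static size_t next_printable_run(const uint8_t *buf, size_t len, size_t start,
                                 size_t min_len, size_t *run_len)
{
    size_t i = start;
    while (i < len) {
        size_t j = i;
        while (j < len && isprint(buf[j])) {
            j++;
        }
        if (j - i >= min_len) {
            *run_len = j - i;
            return i;
        }
        i = (j == i) ? i + 1 : j; /* skip the byte or the too-short run */
    }
    *run_len = 0;
    return len;
}
```

Calling next_printable_run repeatedly, each time starting at the end of the previous run, walks through every candidate string in the decrypted buffer.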

Now, let us look at some of the decrypted strings and how they are related to the file system artifacts dropped by this malware.

The screenshot below shows the different files dropped by this malware on the file system:

In the screenshot below you can see that three of the dropped binaries are copies of the malware itself under different filenames:



And below we can see the corresponding strings decrypted in memory:


It is important to note that even though the filenames of dropped files look randomly generated, they were actually hardcoded in the binary and stored as encrypted strings.

Anti Memory Dumping

It is common for analysts and also automated analysis systems to take a memory dump of the binary during execution and get the decrypted strings from there. The decrypted strings often give us an indication about the malware family name and are very useful.

Nivdort ensures that the strings are cleared from the memory after they are used.

In the screenshot below we can see that the array of strings used to generate the domain is passed as a parameter to the subroutine RemoveTrace, which clears it from memory. This is done after the domain name generation process has completed.


And in the screenshot below we can see that the strings from memory have been cleared:

As a result of this, it is not possible to get all the decrypted strings from the memory dump.
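
The general idea behind this kind of cleanup can be sketched as follows. This is not the code of RemoveTrace, whose internals I have not listed here; the names wipe_buffer and wipe_strings and the assumption of NUL-terminated strings are mine, purely for illustration.

```c
#include <stddef.h>

/*
 * Overwrite `len` bytes at `buf` with zeros. The volatile pointer keeps the
 * compiler from optimising the stores away when the buffer is not read again.
 */
static void wipe_buffer(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;
    while (len--) {
        *p++ = 0;
    }
}

/* Wipe each string in an array of `count` NUL-terminated, writable strings. */
static void wipe_strings(char **strings, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t n = 0;
        while (strings[i][n] != '\0') {
            n++;
        }
        wipe_buffer(strings[i], n);
    }
}
```

Once a routine like this has run, a memory dump taken afterwards contains only zero bytes where the decrypted strings used to be.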

Code Obfuscation

Even though the Nivdort binaries are not packed, they use a lot of code obfuscation by adding FPU and SSE2 instructions. The large amount of junk code added in the binary makes the process of reversing it difficult.

Below are a few examples of code sections which contain a lot of FPU/SSE2 instructions. The relevant code actually used by the binary sits in between these junk sections.

FPU instructions:


SSE2 instructions:


Multiple Thread Creation: Some variants of Nivdort also use an anti debugging technique to make the process of reversing it difficult. They create a lot of threads and all these threads have the same thread function. The parameters are passed to these threads in an incremental order.

To log all the calls to the threads, we can use the following command in windbg:

bu kernel32!CreateThread ".printf \"Thread Address: %x, Parameter:%x \", poi(esp+c), poi(esp+10); .echo; gc"

When we run the binary, we can see the log of thread creation.

The screenshot below shows the output in windbg:


We can see that the binary creates a total of 213 threads. There are some variants of Nivdort which use the same technique and create more than 700 threads as well.

Seed Generation Routine

Now, let us look at the seed generation algorithm. Every Domain Generation Algorithm requires a certain level of randomness. To introduce that randomness, we need a seed. So, let's see how the seed is generated by Nivdort.
Nivdort uses GetSystemTimeAsFileTime() to generate the seed. GetSystemTimeAsFileTime does not return a value; it fills in the FILETIME structure whose pointer is passed to it as a parameter.


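It treats the high and low order parts of the file time as a single 64-bit value and divides it by the constant 0x989680 (10,000,000, the number of 100-nanosecond intervals in one second) to generate the seed as shown below: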


seed = dwHighDateTime:dwLowDateTime/0x989680

The resulting value of seed is shifted right by 9 before being used in the DGA.

seed = seed >> 9
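
The two steps above can be sketched in C as follows. The function name nivdort_seed is mine, and I am assuming an unsigned 64-bit division with the shift applied to the full quotient. Since FILETIME counts 100-nanosecond intervals, the division yields whole seconds, and the shift by 9 means the seed only changes once every 512 seconds.

```c
#include <stdint.h>

/* 0x989680 == 10,000,000: the number of 100 ns FILETIME ticks in one second. */
#define TICKS_PER_SECOND 0x989680ULL

/*
 * Hypothetical helper: combine the two FILETIME DWORDs into a 64-bit tick
 * count, convert it to seconds, and drop the low 9 bits.
 */
static uint64_t nivdort_seed(uint32_t high, uint32_t low)
{
    uint64_t filetime = ((uint64_t)high << 32) | low;
    uint64_t seconds = filetime / TICKS_PER_SECOND;
    return seconds >> 9; /* changes once every 512 seconds */
}
```

On Windows, the two arguments would be the dwHighDateTime and dwLowDateTime members of the FILETIME structure filled in by GetSystemTimeAsFileTime.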

Domain Generation Algorithm

Now, let us look at the Domain Generation Algorithm. Before the DGA subroutine is called, it decrypts two more blocks of data which will be used in the process of generating the domains as shown below: 

HelperTable: This table is used to generate single byte constants which are used as an offset into the array of strings.


Array of Strings: The array of strings is used to generate the domain name. Each domain name consists of two strings from this array. The offsets used in the array of strings are randomly generated using the HelperTable and the seed.


Now, the DGA is called as shown below:

It takes 4 parameters as shown below:

DGA(output_buffer, array_of_strings, HelperTable, seed)

The domain name is generated as follows:

1. Using the seed, a bitmap of size 0xf is calculated.
2. The bitmap generated above is used along with the HelperTable to generate two constants (const1 and const2). Both these constants are single byte.
3. Domain name = array_of_strings[const1] + array_of_strings[const2]
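
Step 3 can be sketched as below. How const1 and const2 are derived from the bitmap and the HelperTable is not reproduced here, so the two indices are simply passed in as parameters. The function name build_domain is mine, and appending ".net" is an assumption based on the domains observed in this analysis.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: join two dictionary words and a TLD into `out`.
 * idx1 and idx2 stand in for const1 and const2 from the steps above.
 * Returns 0 on success, -1 if an index is out of range or `out` is too small.
 */
static int build_domain(const char *const *words, size_t word_count,
                        size_t idx1, size_t idx2,
                        char *out, size_t out_size)
{
    if (idx1 >= word_count || idx2 >= word_count) {
        return -1;
    }
    /* ".net" is assumed from the observed domains */
    int n = snprintf(out, out_size, "%s%s.net", words[idx1], words[idx2]);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With a word list containing "within" and "report" at the chosen indices, this produces a name of the same shape as the domains seen in the network traffic.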

Below we can see the domain name generated:


Domain Name Analysis

The interesting fact about the DGA used in Nivdort is that even though the domain names are generated randomly, they do not look random, because each one is built from dictionary words. As a result, detection algorithms that flag domain names based on how random they look will not work here. Most domains generated by the DGAs of other malware families look random and are categorized as suspicious by domain name analysis algorithms, but Nivdort bypasses that check.
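
To illustrate why such checks fail, consider a deliberately simple randomness heuristic: the longest run of consecutive consonants in a name. This is a toy metric I picked for illustration, not one taken from any particular detection product, and the function name longest_consonant_run is mine.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the longest run of consecutive consonant letters in `label`. */
static size_t longest_consonant_run(const char *label)
{
    size_t best = 0, run = 0;
    for (; *label != '\0'; label++) {
        int c = tolower((unsigned char)*label);
        int is_consonant = isalpha(c) && c != 'a' && c != 'e' &&
                           c != 'i' && c != 'o' && c != 'u';
        run = is_consonant ? run + 1 : 0; /* any non-consonant resets the run */
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

For a Nivdort style name such as withinreport the longest run is 2, just like ordinary English text, whereas a made-up random label such as xkqzvbnmtr scores 10. A threshold on this kind of score would flag the latter and miss the former.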

I have posted the list of all the strings which are used to generate the domains on pastebin here: http://pastebin.com/EDKGMn7g

In our case, the domain generated is: withinreport.net

Let us perform a whois lookup on it. We see that the domain name is not registered yet.

Now, let's execute the malware in a virtualized environment and capture the network traffic.

We can see the multiple DNS queries performed by it as shown below:


Below we can see that one of the domain names was resolved:


Let us perform a whois lookup on the resolved domain name, orderreason.net. We can see that this domain was registered on 5th Dec 2015. At the time of writing this article, the date is 6th Dec 2015.

It is using Yahoo's nameservers:

Name Server: YNS1.YAHOO.COM
Name Server: YNS2.YAHOO.COM

Below is the HTTP POST request sent to the domain:


Similarly, on another execution of the malware, we find one more live domain.


Let us perform a whois lookup on the resolved domain name, heavenalmost.net. We can see that this domain was registered on 6th December 2015.

The name servers corresponding to this domain are:

Name Server: NS4.CSOF.NET
Name Server: NS1.CSOF.NET

The name and email address corresponding to registration information are:

Registrant Name: Matthew Pynhas
Registrant Email: jgou.veia@gmail.com

There are 2363 other domains associated with this registrant. Let us perform a reverse NS lookup on the DNS Servers mentioned above.

In the list of domains associated with these nameservers, we can see a lot of domains which follow the same pattern as the domains generated by Nivdort's DGA.

Some of them are:

againstairplane.net
againstanimal.net
againstcentury.net
againstclear.net
againstclothes.net
againstcontinue.net
againstcontrol.net
againstmanner.net
againstquestion.net
againstwelcome.net

So, we can conclude that one of the key name servers associated with Nivdort's botnet is csof.net
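
The pattern check done by eye above could be automated along these lines. This is a minimal sketch: the function names is_word and looks_like_nivdort are mine, the word list is whatever dictionary you supply (for example the strings from the pastebin list), and restricting the check to ".net" is an assumption based on the domains seen so far.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the `len` bytes at `s` equal one of the `count` words. */
static int is_word(const char *s, size_t len,
                   const char *const *words, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (strlen(words[i]) == len && memcmp(words[i], s, len) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Return 1 if `domain` has the form <word><word>.net for words in the list. */
static int looks_like_nivdort(const char *domain,
                              const char *const *words, size_t count)
{
    const char *suffix = ".net"; /* assumed from the observed domains */
    size_t dlen = strlen(domain), slen = strlen(suffix);
    if (dlen <= slen || strcmp(domain + dlen - slen, suffix) != 0) {
        return 0;
    }
    size_t label_len = dlen - slen;
    /* try every split point of the label into two non-empty halves */
    for (size_t split = 1; split < label_len; split++) {
        if (is_word(domain, split, words, count) &&
            is_word(domain + split, label_len - split, words, count)) {
            return 1;
        }
    }
    return 0;
}
```

Running a check like this over the results of a reverse NS lookup would separate the dictionary-pair domains from unrelated ones hosted on the same name servers.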

Nivdort is an interesting malware family and some of its characteristics are specific to this malware family.
