Wednesday, 9 December 2015

Reversing Encrypted Callbacks of Trojan Dynamer

A DNS Changer malware has recently been spreading in the wild. It is hosted on several websites, listed below:

http://buffer-control.com/1.exe
http://healthy-control.com/1.exe
http://catnew4u.work/1.exe 
http://95.211.210.167/1.exe
http://catnew4u.link/1.exe
http://catnew4u.info/1.exe
http://format-control.com/1.exe

This malware modifies the name server settings on the machine and then performs network callbacks to domains configured in the binary. The data is encrypted before it is sent over the network, using both HTTP HEAD and POST requests.
We will cover the following topics:
1. Name Server modifications made to the Operating System. 
2. Encryption method used for network callbacks.
3. Data sent in the callbacks.
4. Domain Name Analysis.

Name Server Modifications

As an example, I will consider the binary with MD5 hash: e789b3ef034427bf09676f522512858f. This binary makes the following modifications to the Windows Registry to change the name servers used by the operating system.


HKLM\SYSTEM\ControlSet001\Services\Tcpip\Parameters\Interfaces\{4C90BF29-49F1-4284-9837-2AC5F324A4B0}\DHCPNameServer: "199.203.131.151"
HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{4C90BF29-49F1-4284-9837-2AC5F324A4B0}\DHCPNameServer: "199.203.131.151"
HKLM\SYSTEM\ControlSet001\Services\Tcpip\Parameters\NameServer: "199.203.131.151 82.163.143.181"

As we can see, two new name servers are added, with 199.203.131.151 as the primary. After making the above Windows Registry modifications, the malware calls the DhcpNotifyConfigChange API for the changes to take effect.
This means that all the DNS queries will be routed through the name servers configured above.
In the screenshot below, we can see the DNS queries performed by the malware after execution. As can be seen, the DNS queries are sent to the resolver with IP address: 199.203.131.151
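To check a machine for this modification, the NameServer and DHCPNameServer values can be compared against the two resolvers seen in this sample. Below is a minimal Python sketch that works on the value strings after they have been exported from the registry. The function name and the idea of feeding it exported values are my own assumptions for illustration; they are not part of the malware.

```python
# Resolver addresses written to the registry by the sample analyzed above.
ROGUE_NAME_SERVERS = {"199.203.131.151", "82.163.143.181"}


def rogue_name_servers(value):
    """Return the rogue resolvers found in a NameServer/DHCPNameServer value.

    The value is treated as one string with the servers separated by
    spaces or commas, so both separators are handled here.
    """
    servers = value.replace(",", " ").split()
    return [server for server in servers if server in ROGUE_NAME_SERVERS]


if __name__ == "__main__":
    # Value taken from the registry modification listed above.
    print(rogue_name_servers("199.203.131.151 82.163.143.181"))
```

An empty list means none of the known rogue resolvers are configured. It does not prove the machine is clean, since other samples may use other resolvers.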

Encryption Method for Network Callbacks

Now, let us look at what data is exfiltrated from the machine and how it is encrypted.

Below are the different steps used by the binary for encrypting the data.

1. It generates a seed. The seed is generated using GetSystemTimeAsFileTime(). The seed generation routine is similar to what we observed in Nivdort malware.
2. It then decrypts a base key. This base key is later used along with the data to be encrypted.

The screenshot below shows the decryption routine and the corresponding encrypted key.

3. After the decryption is completed, we can see the 64 byte key.
4. Now, data is collected from the system. This includes information such as the OS version, admin rights and a timestamp. The data is collected in JSON format as shown below:

{"dns_setter":{"activity_type":16,"args":{},"bits":{"file_type":2,"job_id":"8842746664399823806"},"build":128,"exception_id":0,"hardware_id":"5978409460789182924","is_admin":true,"major":1,"minor":0,"os_id":501,"register_date":"1449641943","register_dsrc":"1","service_pack":3,"source_id":"201","status": true,"user_time":1449661744,"version":16777344,"x64":false}}

5. Let us look at the encryption routine now. Below is the call to the encryption routine.

We can see that one of the parameters passed to the subroutine above is the 64 byte decrypted base key.

6. The encryption routine first calculates a CRC32 hash of the data to be encrypted. It then takes the previously generated seed and performs the following computations on it to calculate a one-byte offset:

seed = (seed * 0x343fd) + 0x269ec3
t = (seed >> 10) & 0x3f
offset = (t + 0xf) & 0x3f


The final calculated value is used as an offset into the 64 byte base key. In this way, the first 28 bytes of the encrypted data are calculated.
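The three lines of arithmetic above can be reproduced in Python as a quick check. This is only a sketch of the offset calculation. It assumes the seed is a 32-bit value that wraps on overflow (as a DWORD register would), and the function name is mine. It does not cover how the 28-byte header or the rest of the encrypted data is built.

```python
MASK32 = 0xFFFFFFFF  # assumption: the seed is held in a 32-bit register


def next_offset(seed):
    """Advance the seed once and derive the offset into the 64-byte base key.

    Returns a (new_seed, offset) tuple so that the call can be chained.
    """
    seed = (seed * 0x343FD + 0x269EC3) & MASK32
    t = (seed >> 10) & 0x3F
    offset = (t + 0xF) & 0x3F
    return seed, offset


if __name__ == "__main__":
    seed = 1  # arbitrary starting value, for demonstration only
    for _ in range(4):
        seed, offset = next_offset(seed)
        print(hex(seed), offset)
```

Because of the final `& 0x3f`, the offset always stays within the 64-byte key, whatever the seed is.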

After this, it uses the seed along with the data to be encrypted. When we return from this subroutine, we can see the encrypted data as shown below:


It then sends the encrypted data to the callback server in an HTTP HEAD request.
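Before encryption, the JSON collected in step 4 is easy to read programmatically. The sketch below decodes the sample shown earlier. Reading os_id as major*100 + minor (501 = 5.1) and version as a packed major/minor/build value are guesses based on this single sample, although the packed value does match the separate major, minor and build fields. The function names are mine.

```python
import json
from datetime import datetime, timezone

# The JSON from step 4, split over several lines for readability.
BEACON = (
    '{"dns_setter":{"activity_type":16,"args":{},'
    '"bits":{"file_type":2,"job_id":"8842746664399823806"},'
    '"build":128,"exception_id":0,"hardware_id":"5978409460789182924",'
    '"is_admin":true,"major":1,"minor":0,"os_id":501,'
    '"register_date":"1449641943","register_dsrc":"1","service_pack":3,'
    '"source_id":"201","status": true,"user_time":1449661744,'
    '"version":16777344,"x64":false}}'
)


def to_utc(timestamp):
    """Convert a Unix timestamp (int or numeric string) to an ISO UTC string."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc).isoformat()


def summarize_beacon(raw):
    """Decode the fields of a dns_setter record that matter for triage."""
    info = json.loads(raw)["dns_setter"]
    os_id = info["os_id"]
    version = info["version"]
    return {
        "is_admin": info["is_admin"],
        "x64": info["x64"],
        # Assumption: os_id is major * 100 + minor.
        "os": "%d.%d SP%d" % (os_id // 100, os_id % 100, info["service_pack"]),
        # Assumption: version packs major, minor and build into one DWORD.
        "version": (version >> 24, (version >> 16) & 0xFF, version & 0xFFFF),
        "register_date": to_utc(info["register_date"]),
        "user_time": to_utc(info["user_time"]),
    }


if __name__ == "__main__":
    for key, value in summarize_beacon(BEACON).items():
        print(key, "=", value)
```

Both timestamps decode to 9 December 2015, which matches the day this sample was analyzed.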


Domain Name Analysis

The domain names are present in plain text in the binary. Below are some of the domain names.

legco.info
ough.info
heato.info
yelts.net
deris.info
big4u.org
listcool.net
listcool.info
monoset.info

Below are the observations about the callback domains:

1. All the domains were registered after Oct 2015.
2. The name servers corresponding to these domains are from cloudflare.com
3. All the domains are hosted on a dedicated server with IP address: 185.17.184.10.

This malware is interesting because of the way it encrypts the data before sending it to the callback server.

Sunday, 6 December 2015

Nivdort Code Obfuscation and DGA

Recently, the Nivdort malware family has been actively spread in spam campaigns. You can find more details about the campaign here: https://techhelplist.com/index.php/spam-list/791-your-facebook-login-has-been-blocked-malware

This malware has certain interesting characteristics which I wanted to share.

Note: It is also known as BayRob. After decrypting the strings, we can get some more indicators.

Some of the reasons this malware is interesting are:

1. It uses a lot of code obfuscation. Even though the binary is not packed, the code is obfuscated to make the process of reversing it difficult.
2. It uses a Domain Generation Algorithm. The algorithm used is interesting because the domains are generated from a dictionary. So, they do not look random.
3. The email address of the recipient of the spam email is hardcoded and stored as an encrypted string in the binary.

Below are the topics I will cover in this article:

1. String Decryption - An in depth look at how the Strings are encrypted in this binary and how we can efficiently decrypt them.
2. Anti Memory Dumping Techniques - It removes all traces of decrypted strings from the memory after they are used.
3. Code Obfuscation - The Nivdort variants use a large amount of FPU and SSE2 instructions for code obfuscation. One of their variants also creates a lot of threads as an anti debugging technique.
4. Seed Generation Algorithm - Details about how the seed for DGA is generated.
5. Domain Generation Algorithm - The details of algorithm used to generate domains.
6. Domain Name analysis - Details of information we can gather from the domain names.

So, let's start with the analysis of this interesting malware :)

String Decryption

All the strings corresponding to the file system, Windows registry and network callbacks are encrypted in this malware. Let us look at how the strings are decrypted at run time.
It generates a Decryption Table from a single DWORD. The size of the decryption table is 0xe32 DWORDs. This means, the total size is: 0xe32 *4 = 0x38c8 bytes.

The process of generating the Decryption Table is as follows. It passes the key and the address of output buffer to the subroutine:

GenerateTable(DWORD key, DWORD *buffer)


In the screenshot above, you can see the Decryption Table returned by the subroutine.

The screenshot below shows the GenerateTable subroutine:


We can see that by performing different arithmetic and logical operations on the single DWORD key, it generates a table of size 0x38c8 bytes.

This subroutine can be reversed. I rewrote the code in C:

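#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
Generate the Table using an initial seed
c0d3inj3cT
*/

#define TABLE_DWORDS 0xe32

int main(int argc, char **argv)
{
    int i;
    uint32_t *buffer;
    uint32_t seed = 0xf8c5a1b8;   /* the single DWORD key */
    uint32_t t1, t2;

    /* calloc zeroes the whole table (0xe32 * 4 = 0x38c8 bytes) */
    buffer = calloc(TABLE_DWORDS, sizeof(uint32_t));
    if (buffer == NULL)
        return 1;

    for (i = 0; i < TABLE_DWORDS; i++)
    {
        buffer[i] ^= seed;
        /* rotate the seed right by one bit */
        seed = (seed >> 1) | (seed << 31);
        /* mix the two low bytes; the upper 16 bits are kept unchanged */
        t1 = ((seed & 0xff) + ((seed >> 8) & 0xff)) & 0xff;
        t2 = ((seed & 0xff) + t1) & 0xff;
        seed = (seed & 0xffff0000) | (t1 << 8) | t2;
    }

    /* print the first few DWORDs of the generated table */
    for (i = 0; i < 4; i++)
        printf("%08x\n", (unsigned)buffer[i]);

    free(buffer);
    return 0;
}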

Once the Decryption Table is generated, it calls the String Decryption Function.

The String Decryption Function takes two arguments as input: the size of the decrypted string and the offset into the Decryption Table.

DecryptString(DWORD size, DWORD offset)

The binary stores an encrypted blob at a fixed address. The offset above is the same for both the Decryption Table and the Encrypted Data. An XOR operation is performed between the corresponding bytes to get the decrypted strings.
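The table generation and the XOR step can also be combined offline, without a debugger. Below is a Python sketch that ports the C rewrite of GenerateTable and adds a decrypt_string helper that mirrors DecryptString(size, offset). It assumes the table DWORDs are laid out in memory in little-endian order (as on x86) and that the encrypted blob has already been dumped from the binary into a bytes object. The helper names are mine.

```python
import struct

TABLE_DWORDS = 0xE32  # table size in DWORDs (0x38c8 bytes)


def generate_table(key):
    """Build the Decryption Table from a single DWORD key.

    Returns the table as bytes so it can be XORed with the encrypted blob.
    """
    table = []
    seed = key & 0xFFFFFFFF
    for _ in range(TABLE_DWORDS):
        table.append(seed)
        # Rotate the seed right by one bit.
        seed = ((seed >> 1) | (seed << 31)) & 0xFFFFFFFF
        # Mix the two low bytes; the upper 16 bits are kept unchanged.
        t1 = ((seed & 0xFF) + ((seed >> 8) & 0xFF)) & 0xFF
        t2 = ((seed & 0xFF) + t1) & 0xFF
        seed = (seed & 0xFFFF0000) | (t1 << 8) | t2
    # Assumption: DWORDs are stored little-endian, as on x86.
    return struct.pack("<%dI" % TABLE_DWORDS, *table)


def decrypt_string(table, blob, size, offset):
    """XOR `size` bytes of the blob with the table, both at the same offset."""
    return bytes(blob[offset + i] ^ table[offset + i] for i in range(size))


if __name__ == "__main__":
    table = generate_table(0xF8C5A1B8)
    print(len(table), table[:12].hex())
```

Since XOR is its own inverse, the same decrypt_string call can be used to re-encrypt a string when testing the helper against known data.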

In the screenshot below we can see that the Decryption Subroutine is called several times during the execution of the program:


In order to decrypt all the strings at once, we could write some assembly code in the Debugger and then do an in memory execution to get the decrypted strings. Since we know the address of both the Decryption Table and the Encrypted Data, the decryption routine can be written as follows:

xor ecx, ecx
mov esi, DecryptionTable
mov edi, EncryptedData
decryption:
movzx eax, byte ptr ds:[esi+ecx]
xor byte ptr ds:[edi+ecx], al
inc ecx
cmp ecx, 0x38c8
jnz decryption

Once we have written the decryption loop in the debugger, we need to set EIP to its first instruction and execute it. In the memory dump we can then see the list of decrypted strings. Since the loop decrypts the data in place, it is important that the memory page containing the Encrypted Data is writable.

In the screenshot below we can see the in memory decryption subroutine and the decrypted strings in memory dump:

Before Decryption:

After Decryption:


In this way, we can decrypt all the strings at once.

Now, let us look at some of the decrypted strings and how they are related to the file system artifacts dropped by this malware.

The screenshot below shows the different files dropped by this malware on the file system:

In the screenshot below you can see that 3 of the dropped binaries are copies of the malware itself with different filenames:



And below we can see the corresponding strings decrypted in memory:


It is important to note that even though the filenames of dropped files look randomly generated, they were actually hardcoded in the binary and stored as encrypted strings.

Anti Memory Dumping

It is common for analysts and also automated analysis systems to take a memory dump of the binary during execution and get the decrypted strings from there. The decrypted strings often give us an indication about the malware family name and are very useful.

Nivdort ensures that the strings are cleared from the memory after they are used.

In the screenshot below we can see that the array of strings used to generate the domain is passed as a parameter to the subroutine RemoveTrace, which clears it from memory. This is done after the domain name generation process has completed.


And in the screenshot below we can see that the strings from memory have been cleared:

As a result of this, it is not possible to get all the decrypted strings from the memory dump.

Code Obfuscation

Even though the Nivdort binaries are not packed, they use a lot of code obfuscation by adding FPU and SSE2 instructions. The large amount of junk code added in the binary makes the process of reversing it difficult.

Below are a few examples of the code sections which contain a lot of FPU/SSE2 instructions. The relevant code used by the binary sits in between these sections.

FPU instructions:


SSE2 instructions:


Multiple Thread Creation: Some variants of Nivdort also use an anti debugging technique to make the process of reversing it difficult. They create a lot of threads and all these threads have the same thread function. The parameters are passed to these threads in an incremental order.

To log all the calls to the threads, we can use the following command in windbg:

bu kernel32!CreateThread ".printf \"Thread Address: %x, Parameter:%x \", poi(esp+c), poi(esp+10); .echo; gc"

When we run the binary, we can see the log of thread creation.

The screenshot below shows the output in windbg:


We can see that the binary creates a total of 213 threads. There are some variants of Nivdort which use the same technique and create more than 700 threads as well.

Seed Generation Routine

Now, let us look at the seed generation algorithm. Every Domain Generation Algorithm needs some source of variation, and this is provided by a seed. So, let's see how the seed is generated by Nivdort.
Nivdort uses GetSystemTimeAsFileTime() to generate the seed. This API does not return a value; it fills in the FILETIME structure whose pointer is passed to it.


It uses the low and high order parts of the file time along with the constant 0x989680 (10,000,000, the number of 100-nanosecond intervals in one second) to generate the seed as shown below:


seed = dwHighDateTime:dwLowDateTime/0x989680

The resulting value of seed is shifted right by 9 before being used in the DGA.

seed = seed >> 9
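The two formulas above can be sketched in Python to predict the seed for a given point in time. I am assuming here that the division is a plain unsigned 64-bit division and that the quotient is not truncated before the shift; neither detail is confirmed beyond what is shown above. The function names are mine.

```python
from datetime import datetime, timezone

# FILETIME counts 100-nanosecond intervals since 1 January 1601 (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_from_datetime(moment):
    """Return (dwHighDateTime, dwLowDateTime) for a timezone-aware datetime."""
    delta = moment - FILETIME_EPOCH
    ticks = (delta.days * 86400 + delta.seconds) * 10_000_000
    ticks += delta.microseconds * 10
    return ticks >> 32, ticks & 0xFFFFFFFF


def nivdort_seed(high, low):
    """seed = (dwHighDateTime:dwLowDateTime / 0x989680) >> 9"""
    filetime = (high << 32) | low
    seconds = filetime // 0x989680  # 100 ns ticks -> whole seconds
    return seconds >> 9


if __name__ == "__main__":
    high, low = filetime_from_datetime(datetime.now(timezone.utc))
    print(nivdort_seed(high, low))
```

Under these assumptions the seed only changes once every 512 seconds (about eight and a half minutes), so every machine that runs the malware within the same window would start from the same seed.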

Domain Generation Algorithm

Now, let us look at the Domain Generation Algorithm. Before the DGA subroutine is called, it decrypts two more blocks of data which will be used in the process of generating the domains as shown below: 

HelperTable: This table is used to generate single byte constants which are used as an offset into the array of strings.


Array of Strings: The array of strings are used to generate the domain name. Each domain name consists of two strings from this array. The offsets used in the array of strings are randomly generated using the HelperTable and the seed.


Now, the DGA is called as shown below:

It takes 4 parameters as shown below:

DGA(output_buffer, array_of_strings, HelperTable, seed)

The domain name is generated as follows:

1. Using the seed, a bitmap of size 0xf is calculated.
2. The bitmap generated above is used along with the HelperTable to generate two constants (const1 and const2). Both these constants are single byte.
3. Domain name = array_of_strings[const1] + array_of_strings[const2]

Below we can see the domain name generated:


Domain Name Analysis

The interesting fact about the DGA used in Nivdort is that even though the domain names are generated randomly, they do not look random. As a result, algorithms which test domain names for randomness to decide whether they were algorithmically generated will not work. Most domains generated by the DGAs of other malware families look random and are flagged as suspicious by such algorithms; Nivdort bypasses that.

I have posted the list of all the strings which are used to generate the domains on pastebin here: http://pastebin.com/EDKGMn7g

In our case, the domain generated is: withinreport.net

Let us perform a whois lookup on it. We see that the domain name is not registered yet.

Now, let's execute the malware in a virtualized environment and capture the network traffic.

We can see the multiple DNS queries performed by it as shown below:


Below we can see that one of the domain names was resolved:


Let us perform a whois lookup on the resolved domain name, orderreason.net. We can see that this domain was registered on 5th Dec 2015. At the time of writing this article, the date is 6th Dec 2015.

It is using Yahoo's nameservers:

Name Server: YNS1.YAHOO.COM
Name Server: YNS2.YAHOO.COM

Below is the HTTP POST request sent to the domain:


Similarly, on another execution of the malware, we find one more live domain.


Let us perform a whois lookup on the resolved domain name, heavenalmost.net. We can see that this domain was registered on 6th December 2015.

The name servers corresponding to this domain are:

Name Server: NS4.CSOF.NET
Name Server: NS1.CSOF.NET

The name and email address corresponding to registration information are:

Registrant Name: Matthew Pynhas
Registrant Email: jgou.veia@gmail.com

There are 2363 other domains associated with this registrant. Let us perform a reverse NS lookup on the DNS Servers mentioned above.

In the list of domains associated with these nameservers, we can see a lot of domains which follow the same pattern as the domains generated by Nivdort's DGA.

Some of them are:

againstairplane.net
againstanimal.net
againstcentury.net
againstclear.net
againstclothes.net
againstcontinue.net
againstcontrol.net
againstmanner.net
againstquestion.net
againstwelcome.net

So, we can conclude that one of the key name servers associated with Nivdort's botnet is csof.net
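Since every generated domain is a concatenation of two dictionary words, a simple way to spot candidates in a list of domains (for example, the output of a reverse NS lookup) is to try to split the label into two known words. The sketch below uses only the words visible in the domains mentioned in this article; a real check would load the full word list from the pastebin link above. The function name is mine.

```python
# Words taken from the domains mentioned in this article. This is only a
# small subset of the dictionary used by the DGA.
WORDS = {
    "against", "airplane", "animal", "century", "clear", "clothes",
    "continue", "control", "manner", "question", "welcome",
    "within", "report", "order", "reason", "heaven", "almost",
}


def split_dictionary_domain(domain, words=WORDS):
    """Return (word1, word2) if the domain label is two dictionary words.

    Returns None when no such split exists.
    """
    label = domain.lower().rsplit(".", 1)[0]  # drop the TLD
    for i in range(1, len(label)):
        first, second = label[:i], label[i:]
        if first in words and second in words:
            return first, second
    return None


if __name__ == "__main__":
    for name in ("orderreason.net", "heavenalmost.net", "relaylinks.net"):
        print(name, split_dictionary_domain(name))
```

A match is only a hint: two ordinary English words can also form a legitimate domain name, so the result is best combined with registration date and name server information as done above.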

Nivdort is an interesting malware family and some of its characteristics are specific to this malware family.

Tuesday, 1 December 2015

Nymaim Malware Obfuscation and System Time Check

In Nov 2015, the Nymaim malware was observed being distributed in a spam campaign related to Intuit QuickBooks.

More details of the spam campaign can be found at the following links:

https://techhelplist.com/spam-list/974-intuit-browsers-update-malware
http://blog.dynamoo.com/2015/11/malware-spam-intuit-qb-quickbooks.html

I checked the binary downloaded in these spam campaigns. It uses some interesting code obfuscation techniques and even has a system time check.

System Time Check

Let us first discuss the system time check.

Each binary which is distributed in these spam campaigns is configured with a time limit. Based on the day this binary is sent in the spam campaign, it will execute completely only if the current system time is less than a configured time limit.

For instance, for the binary with MD5 hash: 563a1f54b9d90965951db0d469ecea6d which was sent in spam campaigns on Nov 18th, the system time check is configured so that it will run only on or before 20th Nov.

Below is the relevant code section which shows the system time check:


As we can see in this code section, the system time check is obfuscated as well. Instead of doing a simple check on the system time returned by GetSystemTime(), it uses obfuscation to make it difficult to understand the time check.

I have added comments in the code section above to show the time check performed.

The binary checks the Year, Month and Day fields of the system time returned by GetSystemTime and compares them with the following configured time limit:

Year - 0x7df (2015)
Month - 0xb (11)
Day - 0x14 (20)

It will execute completely only if the current system time is on or before 20th Nov 2015.
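Stripped of its obfuscation, the check boils down to a date comparison. Below is a small Python sketch of the logic, using the three configured values from this sample. The function name and the tuple-based comparison are mine; the malware itself compares the fields through obfuscated code.

```python
# Configured limit of this sample: (Year, Month, Day) = (2015, 11, 20).
TIME_LIMIT = (0x7DF, 0xB, 0x14)


def passes_time_check(year, month, day, limit=TIME_LIMIT):
    """Return True if the given date is on or before the configured limit.

    Tuples compare field by field, so (year, month, day) ordering matches
    calendar ordering.
    """
    return (year, month, day) <= limit


if __name__ == "__main__":
    print(passes_time_check(2015, 11, 18))  # day the spam was sent
    print(passes_time_check(2015, 11, 21))  # one day after the limit
```

For dynamic analysis, this means the sandbox clock has to be set to a date on or before the limit before the sample is executed.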

As a result of this system time check, if we analyze this binary on a date after this, it will not execute completely. This is an anti analysis technique and not an evasion technique.

Code Obfuscation

Nymaim malware also uses interesting code obfuscation techniques. Below are a few examples from the malware variants observed in this campaign.

SEH Anti Debugging: In another variant of Nymaim which was distributed in this campaign with MD5 hash: 60b2009138d1b21c1b93b7093bc66109, I noticed that it uses an SEH based anti debugging technique. It is used to make the process of tracing the code in debugger more difficult.

Below are the details:

It registers an exception handler at address, 0x0040116b. It then triggers an exception by executing a privileged instruction like the OUT instruction. Below is the relevant code section which shows the execution of Privileged Instruction.



We can see the usage of junk instructions in the code above. Below are a few examples of such instructions:

mov dh, dh
dec esi
inc esi
cmp esi, esi
cmp esi, esi
cmp esi, esi
and dword ptr ds:[4657b7], ffffffd5

These junk instructions do not impact the behavior of the malware because most of them simply act as synthesized NOPs. In the x86 instruction set, a NOP can be synthesized in multiple ways, and the malware uses this to generate junk instructions.

Now, let us look at the Structured Exception Handler which is invoked when an exception is triggered in the above code section.


We can see in the exception handler code the following things:

1. It uses a lot of junk code as well.
2. It adjusts the value of EIP in the exception record by adding 2 to it. This is done so that when the exception is handled, the execution is resumed after the OUT instruction. Since the size of OUT instruction is 0x2 bytes, the above code adds 2 to the address where exception occurred.
3. It also uses the EDX register as a counter variable which is set in the exception record as well. When we return from the exception handler, we can see that the value of EDX is checked. If it is not 0, then the loop is repeated. The counter value is set to 0x1023b. As a result, even if we pass the exception on from the debugger, the loop is triggered 0x1023b (66,107) times. The code can of course be patched as shown below:

or edx, edx
jnz 00403aea            ; Patch this conditional jump instruction
pop dword ptr fs:[edx]        ; value of EDX should be 0, else an exception will be thrown
sub byte ptr ds:[4667f2], 3d
add esp, 4
push esi
retn

While patching the code above, we must also make sure that EDX is set to 0, since the next instruction uses it as an offset into the FS segment (fs:[0] -> head of exception list).

API Call Obfuscation

Nymaim also uses an interesting way to perform API call obfuscation. There are multiple reasons to do this. It makes the process of tracing the code in debugger difficult. For instance, if you set a breakpoint at an API, when you follow the return address, instead of returning to the malware code, it returns to a location in ntdll.dll.

For malware analysis, some automation systems use API tracing. They generate a log of different APIs called by the malware. From the API, they backtrace and find the address from where the API was called. The heuristics used in analysis systems are based on API calls coming from malware code. If the API call is coming from a system DLL, it is usually whitelisted. In this way, Nymaim can also bypass some heuristics used in analysis systems.

Let us look at how Nymaim calls the APIs indirectly.

We saw in the code sections above that Nymaim uses a System Time Check. Let us set a breakpoint at GetSystemTime() and observe the state of registers and stack.


Once we hit the breakpoint in debugger, we can see on the call stack that the return address is: 0x7c95ee2d. This address points to ntdll.dll instead of pointing to the malware code.

Now, let us follow the return address in the debugger:

We see that it returns to a code section in ntdll.dll which corresponds to the x86 instruction CALL EBX. The value of EBX is crafted in such a way that it points to the malware code.
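To make the effect on API tracing concrete, here is a small Python sketch of the kind of backtrace heuristic described earlier: it looks up which module owns a return address and whitelists the call if that module is a system DLL. The module names, base addresses and sizes in the map are hypothetical values chosen for illustration (only the return address 0x7c95ee2d comes from the debugger session above), and the function names are mine.

```python
# Hypothetical module map: (name, base address, size, is a system DLL).
MODULES = [
    ("sample.exe", 0x00400000, 0x00080000, False),
    ("ntdll.dll", 0x7C900000, 0x000B2000, True),
]


def owning_module(address, modules=MODULES):
    """Return (name, is_system) for the module containing the address."""
    for name, base, size, is_system in modules:
        if base <= address < base + size:
            return name, is_system
    return None


def naive_verdict(return_address, modules=MODULES):
    """Mimic a tracer that only logs API calls made from non-system code."""
    owner = owning_module(return_address, modules)
    if owner is None:
        return "unknown caller"
    name, is_system = owner
    if is_system:
        return "whitelisted (%s)" % name
    return "logged (%s)" % name


if __name__ == "__main__":
    # Return address seen on the stack at the GetSystemTime breakpoint.
    print(naive_verdict(0x7C95EE2D))
```

A tracer built this way would attribute the GetSystemTime call to ntdll.dll and skip it, which is exactly what the indirect CALL EBX trick is aiming for. Walking one more frame up the call stack would reveal the real caller.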

Let us follow this subroutine:

Now, we have returned to the malware code. We can also see usage of junk instructions in above code sections.

Some of the examples are:

1. The value of a local variable is overwritten twice.

mov dword ptr ss:[ebp-c0], eax
...
mov dword ptr ss:[ebp-c0], 71b700

2. Adding, subtracting or performing an XOR operation on a local variable with 0 does not modify the value of local variable. These instructions are equivalent to a NOP instruction:

sub dword ptr ss:[ebp-dc], 0
...
add dword ptr ss:[ebp-c4], 0
...
xor dword ptr ss:[ebp-c4], 0

3. We can also see a few comparison instructions that are not followed by any conditional jump. These are effectively equivalent to NOP instructions as well and are added to make the process of reversing the binary difficult.

In this way, we can see that the Nymaim malware uses some interesting code obfuscation techniques.

Thursday, 3 September 2015

PayPal Phishing using Obfuscated HTML Attachments

A PayPal phishing campaign which uses obfuscated HTML attachments has been active for the past few months.

In most cases, the HTML attachments which are sent for phishing the credentials are not obfuscated. This allows easy static analysis and they are detected successfully by products. However, obfuscated HTML pages make it difficult to detect them.

In this article we will look at Obfuscated HTML attachments used in PayPal Phishing campaigns and how they can be effectively analyzed.

The obfuscated HTML attachment looks as shown in Figure 1.

Figure 1: Obfuscated HTML Page

The PayPal Phishing page when opened with a Browser, looks as shown in Figure 2. We can see that it requests personal information from the user.


Figure 2: PayPal Phishing Page

Now, let us look at how the HTML page can be deobfuscated. We can see in Figure 3 that after the code is deobfuscated, it is displayed in the Browser using document.write()


Figure 3: Document.write in Obfuscated HTML page.


To see the deobfuscated code, we will display the value in a textarea box. This can be done by modifying the document.write() statement in the original obfuscated HTML page as shown in Figure 4.




Figure 4: Modify document.write() in Obfuscated HTML page 

When we open the modified HTML page with Browser, we can see the complete deobfuscated code as shown in Figure 5.


Figure 5: Deobfuscated HTML Page 

If we look at the HTML form in the deobfuscated page, we can see that the HTML form's action points to http://paypal.com as shown in Figure 6. In most phishing HTML pages, the form's action field points to a web server controlled by the attacker. Here it points to the legitimate site, which is what makes this instance of PayPal phishing interesting.


Figure 6: HTML Form's action field in deobfuscated HTML page. 

So, how is the HTML Form's action field modified at runtime? It is done using the JavaScript function, initsub(). In Figure 7, we can see that the JavaScript function, initsub() is called when the HTML form is submitted.


Figure 7: JavaScript function called when HTML form is submitted.

The JavaScript function, initsub() is shown in Figure 8.

Figure 8: JavaScript function, initsub()

We can see an obfuscated array in the above JavaScript. The array variable name is: _0xf7b4. This array contains strings which are represented using the ASCII value of corresponding bytes. To display the actual strings, we can use Mozilla Firefox's Web Console as shown in Figure 9.




Figure 9: Decode the Array using Firefox's Web Console

When we enter the array name in the Web Console, all the strings will be printed.
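The same decoding can be done outside the browser, which is handy when processing many attachments. The Python sketch below assumes the array entries are written as \xNN hex escapes, which is the usual form for this style of obfuscation. The function name and the sample input are mine and are not taken from the attachment.

```python
import re

# Matches JavaScript hex escapes such as \x68.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_hex_escapes(text):
    """Replace every \\xNN escape in the text with the character it encodes."""
    return _HEX_ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), text)


if __name__ == "__main__":
    # Hypothetical array entry, written the way the obfuscator would emit it.
    print(decode_hex_escapes(r"\x68\x74\x74\x70\x3a\x2f\x2f"))
```

Running this over the whole script leaves the rest of the JavaScript untouched and only turns the escaped strings into readable text.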

We can identify the HTML form's action URL as shown below:

var _0x13632f = "fb60d411a5c5b72b2e7d3527cfc84fd0.php";
var _0x1a7d65 = _0xf7b4[2] ;
_0x1a7d65+=_0x13632f;

The complete URL is: http://relaylinks.net/fb60d411a5c5b72b2e7d3527cfc84fd0.php

Whois lookup information for the domain relaylinks.net shows that it was registered on 17th August 2015.

Most of the domains used in this PayPal phishing campaign have been registered recently by the attackers.

I will soon post a list of different domains and the complete URLs used for phishing.

Thanks.

Monday, 31 August 2015

Dropbox Credential Phishing Campaign

The Dropbox credential phishing campaign has been active for quite some time. In this article I will share some interesting details related to this campaign.

Attackers will send an email with a URL that requests you to enter your username and password to view a Shared Document.

Figure 1: Login using your Email address and Password to view the Shared Document:

As shown in Figure 1, there is a static HTML login page which is used to phish the credentials. There are a few variants of this login page which I will discuss later.

Now, here is the interesting thing. The attacker did not modify the Apache Web Server settings to prevent Directory Listing. We often see this in credential phishing campaigns.

We can browse the directories and find the zip archive as shown below:

Figure 2: Dropbox Phishing Archive found on the Server:

We can download the Phishing Archive and view the contents as shown in Figure 3.

Figure 3: Dropbox Phishing Archive contents:

If we view the HTML form login page on this Phishing Site, we can see that the Form's action field points to finish.php as shown in Figure 4.

Figure 4: HTML Form Login page:



Using Mozilla Firefox's Web Console we can quickly check the HTML form login page and see the HTML form's action field as shown in Figure 4. Now, let us find this file, finish.php in the Dropbox archive we found and view its source code. We can see that the credentials are obtained from the HTML form and an email is sent to: instantfundtransfer@inbox.com
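When checking many phishing pages, the form's action field can also be pulled out with a few lines of Python instead of the Web Console. The sketch below uses the standard library HTML parser. The class and function names are mine, and the sample page in the usage example is a hypothetical stand-in for the real login page.

```python
from html.parser import HTMLParser


class FormActionParser(HTMLParser):
    """Collect the action attribute of every <form> tag in a page."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # attrs is a list of (name, value) pairs; a missing action
            # attribute is recorded as None.
            self.actions.append(dict(attrs).get("action"))


def form_actions(html):
    """Return the list of form action values found in the HTML text."""
    parser = FormActionParser()
    parser.feed(html)
    parser.close()
    return parser.actions


if __name__ == "__main__":
    # Hypothetical page, modeled on the login form described above.
    page = '<form method="post" action="finish.php"><input name="email"></form>'
    print(form_actions(page))
```

The returned script name tells us which file to look for in the phishing archive.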

Figure 5: Server Side code of Phishing:


We also notice one extra thing done in the PHP code above in addition to sending the phished credentials. It also collects the geographic information of the victim using the Geo IP lookup service: http://www.geoplugin.net. This information is sent to the attacker in the email along with the phished credentials. The reason is that popular mail services like Gmail and Yahoo have a security feature which detects logins from an unusual geographic location. With the victim's geographic location, the attacker can use a proxy or VPN in the corresponding country to log in and view their emails.

Now, let us take this one step further and collect different links from the Internet which are related to the same phishing campaign. I found a total of 751 phishing URLs, which are posted here: http://pastebin.com/zUuuYhAg

We can apply the method I have discussed in this article to get the list of attacker's email addresses. I will soon post a list of all the email addresses.

Thanks.