# Crack! - watch me amble through learning how to reverse engineer C 2020-05-17 Originally I thought that this challenge wouldn't really interest me. I don't really have much reverse engineering experience, but I figured that I was doing this to learn, and I may as well jump into something new! ## The Challenge el_l33t: > Hey n00b, you know what would really impress me? Cracking stuff. You > know, basic stuff. I'll send a few brain teasers your way to challenge > you. They won’t be hard, because you’re smart, right? > > Level 0 > Level 1 > Level 2 > Level 3 > Level 4 > Level 5 (all of the above are available at https://taowa.ca/ctf-writeups/nsec2020/crack.tar.gz) Well, first off, rude. But let's get started. ## Level 0 - strings crackme0 was an easy one. `strings`, part of binutils, finds strings of printable characters in binaries. No need to even execute it. ``` $ strings crackme0 /lib64/ld-linux-x86-64.so.2 libc.so.6 puts rand getuid [72 lines removed] ``` Well, that's small enough that you can see the flag if you scroll through, but let's try grep. ``` $ strings crackme0 | grep FLAG FLAG-EwryMNYLi0fFvsM3 $ askgod submit FLAG-EwryMNYLi0fFvsM3 Congratulations, you score your team 1 points! ``` Yay! My first ever CTF flag! ## Level 1 - Not Quite Strings Let's try running it. ``` $ chmod u+x crackme1 $ ./crackme1 Enter passcode: I don't know the passcode. Unauthorized ``` Well, that didn't work. I've never really done reverse engineering before, but my teammates pointed me to the NSA's Ghidra. I was happy to be CTFing in a VM. I thought watching software automatically process a binary would be boring, but Ghidra's animations were thoroughly... captivating. Once you've downloaded Ghidra, you can run it by extracting it then running ./ghidraRun in your terminal. Create a new non-shared project, then do `File -> Import File` and select crackme1. Accept the defaults for the import, and open the binary from the `Active Project` window. Ghidra will helpfully offer to automatically analyze the binary. Accept, and click Analyze. I encourage you to take a look at what it does, but the defaults seem sane, so click Analyze. If you don't have a `Functions` tab somewhere, go to `Window -> Functions` and open it. You can also find functions under the `Symbol Tree`, but I like the interface of `Functions`. For this crackme, there's a function generously named `print_flag`. Select it, then go to the `Decompile: print_flag` tab. ``` void print_flag(void) { long in_FS_OFFSET; int local_70; [...] local_68[0] = 0x46; local_68[1] = 0x4c; local_68[2] = 0x41; local_68[3] = 0x47; local_58 = 0x2d; local_54 = 0x4e; [...] while (local_70 < 0x14) { putchar((int)(char)local_68[local_70]); local_70 = local_70 + 1; } ``` Hmm, a long list of bytes. I wonder whether those are ASCII? 0x46: F | 0x4c: L | 0x41: A | 0x47: G | 0x47: - | 0x43: N For those following along at home, a web search for `ASCII chart` will get you what you need. Well, that's promissing. It all decodes to FLAG-NotQuiteStrings. ``` $ askgod submit FLAG-NotQuiteStrings Congratulations, you score your team 1 points! ``` It wasn't quite `strings` after all. ## Level 2 - base sixty no Running Level 2 does the same thing, but prompting for a `secret` instead of a `passcode` this time. Let's get it loaded into Ghidra. Two interesting functions pop out at me: base64_encode and compute_flag. ### compute_flag This function has similar code to the last one, with ASCII bytes. ``` local_68[0] = 99; local_68[1] = 0x46; local_68[2] = 0x6c; ``` There's also this: ``` if (bVar1) { printf("FLAG-%s\n",data); } ``` local_68 is not in hex, but in decimal. No worry, the table has values for that too! Decoding, you get: cFlBYXVUaDhmcFJwdQ==. With the `base64_encode` function in mind, I wonder what happens when I base64 decode it? ``` $ echo "cFlBYXVUaDhmcFJwdQ==" | base64 -d pYAauTh8fpRpu ``` Hmm, that looks odd, but I guess I'll try to submit it with FLAG-. ``` $ askgod submit FLAG-pYAauTh8fpRpu Congratulations, you score your team 1 points! ``` Woot! ## Level 3 - MIPS Trying to run this one failed for me. Why? Because it's compiled for MIPS, an architecture different from the one on my machine. I wonder if Ghidra will open it? It does! It automatically detects that it's MIPS. The most interesting function that pops out at me is one called `main`. It's the largest one (this is why I like `Functions`, it shows you the size of each function), over double the size of the runner-up. ``` puts("Computing secret..."); memset(&cStack64,0x41,0x11); cStack64 = cStack64 + '\x05'; cStack63 = cStack63 + '\v'; cStack61 = cStack61 + '\x06'; cStack60 = cStack60 + -0x14; cStack59 = cStack59 + '\f'; cStack58 = cStack58 + '\b'; [...] cStack55 = cStack55 + '('; cStack54 = cStack54 + '2'; cStack53 = cStack53 + '\x06'; cStack52 = cStack52 + '\x11'; ``` That looks like a flag to me, but something's off. That doesn't look like what I had before. ### A tangent on my programming ability I've never been good at/enjoyed programming. I can read trivial functions and roughly follow along with some programs, it's effectively a requirement of my job, but I don't really write code when I can avoid it. If you're in the same boat as me, don't worry. There's lots of CTF stuff that requires no or minimal programming experience :). ### Back to MIPS Okay, some of those look pretty odd to my non-programmer brain, but as the many instances of "Google" in our team chat will prove, a significant part of CTF is using search engines. Let's see what \x05 and \v mean to C. They're escape sequences, and the `Escape sequences in C` wiki page was infinitely useful. So, \x05 is easy, it's just 5 in hex. \v is a tab, represented by 0B in hex. But what about -0x14? I figured we weren't doing odd things with signed and unsigned bytes (I don't even know if that's a thing), and I also noticed that 0x05 and 0x0B were odd codes in ASCII. So when I saw the negative value, I wondered whether there was an offset of some sort. There was a memset between printing `Computing secret...` and the characters, so I looked that up. `memset`, well, sets mem(ory) values. In this case, it sets 17 bytes at the cStack64 memory location to 0x41, which is coincidentally (or not) A in ASCII. So, that makes \x05 cStack64 equal to its starting value (0x41, because of the memset) plus 0x05 (because of \x05), for a grand total of 0x46, or F. That's promissing. Repeating this for all the characters, including the parenthesis, which is an ASCII 0x29 (giving 0x41 + 0x29, or i), you get: `FLG-MIPSiCGRET!`. That looks a bit weird, but then I noticed the absense of a cStack62 and a cStack50. Those would be equal to 0x42, which is A. So now we have `FLAG-MIPSiCGREAT!`. Clearly I shouldn't be adding 2 to 0x41 as if it's hex. So what could it be? Admittedly, during the CTF I just substituted it for an s and called it a day, but I went looking when I wrote this writeup and figured out that 2 is 0x32 in ASCII, so now I get 0x73, or s. C was interpreting it as a string, because it was in single quotes, like the parenthesis. ``` $ askgod submit FLAG-MIPSisGREAT! Congratulations, you score your team 1 points! ``` EDIT: I incorrectly stated that C was interpreting 2 as a string because it's in single quotes, it's been pointed out to me that it's probably interpreting it as a char, which would make doing arithmetic to it actually possible. Thanks Tristan :). ## Level 4 - Oo, long You know the drill by now, open the binary in Ghidra, find the interesting function (main, again). ``` memset(&uStack64,0,0x32); uStack64 = 0x47414c46; uStack60 = 0x5774482d; uStack56 = 0x4b6f7779; uStack52 = 0x59617251; uStack48 = 0x57625769; ``` If you break those into byte-sized pieces (I feel no remorse for this pun), you'll notice they all hover around the right range for ASCII. Let's try one of them: 0x47: G | 0x41: A | 0x4c: L | 0x46: F Looks like I'm reading them backwards. The flag here, once I decode them all, looks like FLAG-HtWywoKQraYiWbW. ``` $ askgod submit FLAG-HtWywoKQraYiWbW Congratulations, you score your team 1 points! ``` I'm starting to figure this out! EDIT: Jayjay pointed out to me that this one can be solved by running `strings` on the binary. Moral of the story? Just because you've progressed in a challenge doesn't mean you should forget the tools that you've used to get there. ## Level 5 - aaaaaaa math This one was the hardest. It involves doing math a bit harder than adding and subtracting. I'm essentially failing Cal II, so I'm going to leave the hard math to my computer. One of my teammates was talking about z3, Microsoft Research's Theorem Prover. For my use, that means that you define a fancy equation and it finds numbers to satisfy it. There's a python wrapper in the Debian package `python3-z3`. So let's see what we have to prove. Loading up the binary in Ghidra there's some functions that pop out: `main`, `validate`, `f`, `g` and `h`. `main` takes in a "passcode", runs it through `validate`, then prints it, prepended by "FLAG-". `validate` takes in a single integer value, let's call it x (Ghidra calls it param_1), then runs `f`, `g` and `h` on it, and retuns true if the output of `f(x)` << `g(x & 1)` is equal to `h(x)`. `f` takes the input and does `^ 0x1a75` to its input. `g` adds 2 (the integer kind, not 0x32) to its input. `h` adds 0x21a3937 to its input. Okay, so I need to write some python to get z3 to find numbers that make the above true. ### My z3 code The code below seems simple, and, to be honest, it probably is. But it took me at least an hour (possibly closer to 1.5 - 2) to learn about z3, figure out what I needed to do, debug my code, figure out whether `^` and `&` meant the same thing to C and z3, and realize that `<<` was a bitshift and not less than (I felt kind of silly after that one). It was a bit demoralizing to have it constantly fail, but by the time I got it to work and submitted the flag I was fairly proud of myself for having finally figured it out. So, here's my code, comments indicated by #: ``` from z3 import * # importing z3 helps to use z3 :p x = BitVec('x',32) #|z3 treats C Ints differently than f2 = BitVecVal(0x1a75,32) #|"normal" integers, so I've used g1 = BitVecVal(1,32) #|BitVec for the unknown (with a 32 bit) g2 = BitVecVal(2,32) #|length, and BitVecVal for the thing h1 = BitVecVal(0x21a3937, 32) #|whose value I know. # EDIT to clarify BitVec and BitVecVal: z3 has a concept of an integer, # but that doesn't really mesh with the C concept of an integer. C # treats an integer as a series of bits, and a BitVec is that: a vector # full of bits. BitVec allows for things like signed integers whereas # Int would abstract the underlying bits away and just call it a # negative number. Thanks Tristan for asking me to clarify this. I'd # kind of forgotten how much figuring that out was a pain. f = x ^ f2 # This implements f() gPre = x & g1 #|validate doesn't call g directly, it binary ANDs it g = gPre + g2 #|first. After that I define g() using the value out. h = x + h1 # This implements h() solve(f << g == h) # This magic function finds a solution that makes # that true. ``` Running this, I get: ``` $ python3 maaaaaaaaaaath.py [x = 5031337] $ askgod submit FLAG-5031337 Congratulations, you score your team 1 points! ``` So el_l33t indeed. (c) Taowa - Licensed under CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0/