So last week I checked out Season 1 of Mr. Robot from the library. It’s a series I’ve been meaning to watch for some time now, because some of the clips I’ve seen suggest that it’s really good inspirational hacker material, while others suggest that it’s just fodder for dumb normies who want to think they know tech. As with most things, I suspect the truth is somewhere in the middle, but I had to see at least a few full episodes to know for sure. This post is inspired by the exploits of Elliot the psychotic computer hacker, and I will be talking about some of the technical aspects behind those exploits.
No, actually I won’t. Sorry, it looks like I won’t be doing an examination of Mr. Robot hacking anytime soon. Because, as it turns out, I wasn’t even able to watch the DVDs in the first place. I just returned the thing today, completely unwatched. The reason for this was a message that I got when I tried to load Disc 1 in Windows 7’s default DVD playing program: “This medium can’t be read because it’s protected by CODD technology.” I can only assume that CODD is some new kind of DRM (possibly based on SGX) that can only be decrypted by Skylake-based PCs running Windows 10, or something like that. How fitting that a DVD for a show about hacking has to be hacked to be watched (if you’re using outdated software anyway).
I tell this story to illustrate a key point: Much as I may like to think of myself as a 1337 hacker, grey hat, cyber-warrior, what-have-you, I’m actually very much a novice when it comes to security – either offensive or defensive. I have my fair share of accomplishments, feats of cyber-security violation that many would consider impressive, but it’s not because I know security; it’s because I’m good at thinking outside the box. I’ve done my fair share of dirty work – removing DRM from my iTunes files, cracking trial software by resetting it back to Day 1, getting kicked off a LAN by the network admins and then being back online within five minutes… These are all great war stories and I’m proud of them all in turn, but I didn’t accomplish these feats through some grand, sophisticated knowledge of computer security. I did it by using stupid loopholes in software systems that allowed me to do things I wasn’t supposed to be able to do. These were loopholes that no one thought to patch because it simply never occurred to anyone that they could be used maliciously. That’s really the secret to my success: finding creative uses for mundane and seemingly innocuous features.
Imagine what I could do if I could combine that creativity and ability to think outside the box with real security knowledge. I would be formidable, unstoppable. I would be the next Robert Morris. “But Robert Morris got arrested,” you might say. True, but he also went on to have an illustrious career as a computer scientist and security researcher, and that’s one of my ambitions as well. But right now I don’t even feel confident enough in my abilities to pass the HackTheBox entry challenge. There are some serious knowledge gaps that need filling. So I’m making a concerted effort to learn about security vulnerabilities and pen-testing, and I’m attacking it on two fronts. One is booting into Kali Linux, going through all the pen-testing tools, and looking them up on the Internet to see what I can learn. That’s a topic for another post, so I won’t cover it here. What I will talk about is the other method I’m using: reading vulnerability reports in the CVE database and trying to extract as much general security knowledge from them as I can.
I thought I’d start with shared memory exploits. This was sort of a research project that I embarked on, and the rest of this post documents that project and my conclusions. There’s a whole class of security exploits that focus on low-level memory vulnerabilities, and these include some of the most notorious malware programs and bugs in the history of computing: the Morris Worm, the Blaster Worm, the Code Red Worm, and the Heartbleed Bug. There are various ways that malicious programs and users can exploit these low-level memory vulnerabilities. One is by taking advantage of unsafe I/O functions in C/C++ that don’t perform sufficient bounds checking – this is where we get stack smashing and format string attacks. Another is to take advantage of inconsistencies between the reported size and the actual size of a memory segment so that bounds checking fails, or in some other way to gain simultaneous access to memory from multiple processes when one process thinks it has that memory all to itself. It is the latter of the two that I will be focusing on here.
I was first introduced to shared memory exploits about a year ago, when I read about a VM escape vulnerability in VirtualBox. A VM escape vulnerability is a vulnerability that allows a VM to access data from the host that it shouldn’t have access to. If I remember correctly, this particular exploit took advantage of a segment of shared memory that was used for communication between the host and the guest. Of course this is intended to be a technically-oriented post and it just wouldn’t be complete without a closer look at the exploit that inspired this whole project, so please excuse me while I retrieve the vulnerability report from DuckDuckGo… Ah, here it is.
This exploit takes advantage of a segment of shared memory which is shared between the host and the guest and allows for things like mouse pointer integration and seamless windows. There are two similar vulnerabilities which allow for out-of-bounds reads and writes, possibly between the guest’s section of memory and the memory that is supposed to be exclusively owned by the host. I have to admit I don’t completely understand how this works, as I don’t actually know the exact mechanics of shared memory – what is and isn’t allowed on a shared memory segment and so forth. My reading in that area is very limited, and this is something I will need to look into further. The reading/writing of memory is done using the memcpy() function, which is given parameters that are arbitrarily controlled by the guest. This is what allows for out-of-bounds reads and writes. To fix this vulnerability, we would need better controls on what values the guest is allowed to set for these parameters.
Reading about this exploit got me thinking: I’m willing to bet that shared memory represents not just a couple vulnerabilities in VirtualBox, but a whole class of exploits across many different types of software and hardware. The idea is fairly simple: Process A and Process B are supposed to be in some sense separate; one or both have data that they don’t want the other to read or write. However, since some memory is shared between the two, this segregation can be tricky, and small mistakes in the code can lead to data leaks from one process to the other, via the shared memory segment.
I have now set out to find more exploits similar to this one. I did a search on Startpage (or maybe it was DuckDuckGo) and came across another exploit that involves shared memory, but in a completely different way. This vulnerability concerns the Android operating system and its use of the Binder framework, a Linux IPC mechanism fairly similar to D-Bus but written specifically for Android. Binder uses shared memory for larger data transfers. The vulnerability basically works like this: when a process accesses data in a shared memory segment, the Binder framework uses a size_t type to pass the size of the memory segment. If a 32-bit process is accessing the memory, only the lower 32 bits of the size are used, even though the size is supposed to be represented by a 64-bit number. Thus an attacker can inject a value into the upper 32 bits, producing a 64-bit size that is out of bounds while its lower 32 bits still look valid – so the access passes bounds checking even though it is out-of-bounds.
At its core, this exploit is the same idea as the other one: a shared memory segment being used as a conduit for a data leak. However, the bug behind it is very different. In the first case, the vulnerability resulted from allowing a process to arbitrarily set the bounds passed to a memory function. In this one, the vulnerability results from the reported memory size differing from the actual size. Both stem from a lack of clarity regarding memory bounds, but the fix for the first is to add restrictions on the input, while the fix for the second is simply to use the full address size consistently.
It has now become clear to me that in order to fully understand these exploits, and possibly come up with some of my own, I first have to gain a more detailed understanding of how shared memory actually works. How does the address resolution work, exactly? Does a shared memory segment map into both processes’ address spaces in such a way that one can access the private memory of the other if certain safeguards aren’t in place? Or does this unauthorized memory access refer only to enclaves within the shared memory segment that are supposed to be protected? It would have to be one of those two, because even with faulty bounds checking it’s not possible for a process to directly access memory that’s completely outside of its address space, simply due to the way virtual addresses are mapped to physical addresses on modern computers. Since all memory access except at the very lowest level is done with virtual addresses, an attempt to access, say, physical address 0xffffffff will simply access address 0xffffffff within that process’s own virtual address space. So any out-of-bounds memory access would have to happen within the context of the process’s own address space, which means the other process’s private memory would somehow have to be reachable from the mapping of the shared segment.
I will need a more detailed low-level understanding of memory to fully comprehend these vulnerabilities, and this is true of a lot of things in offensive security. Understanding buffer overflows, for example, requires a detailed understanding of the layout of a process’s stack and of the function prologue that sets up the stack frame for a function call. Similarly, carrying out side-channel attacks on cryptographic ciphers often requires a sophisticated understanding of the physical principles underpinning the storage or transmission medium used for encrypted communication. I think that’s part of why I find offensive security and penetration testing so fascinating – the real stuff anyway, not the script kiddie stuff with Metasploit or whatever – it forces you to actually learn about the system you’re attacking in a multifaceted way.