I started this research trying to simplify the techniques used during a WIFI pen-test. The idea was to play with WIWO, a tool released by Core Security last year, in order to make a transparent channel between a network interface located in my host and the WIFI interface located in a router.
The first step in research like this is to define the components that will be used during the process. In this particular case, I focused on finding the appropriate router.
After a period of investigation, I chose the TP-Link TL-WA5210G because it supports installing customized firmware (allowing me to install the tool mentioned before) and it's also an external device (which was the original idea).
The first thing I noticed after getting my hands on the device was that the hardware version (2.0) was different from the version that could allow me to customize, or hack, the firmware so I could install WIWO. I read some blog posts and found that sometimes the hardware version changes while the firmware stays the same as on older revisions. Unfortunately, that was not the case this time; the device firmware version (TL-WA5210G_V2_140523) was the expected one for this hardware version, and there wasn't any documentation explaining how to downgrade it. Trying the downgrade through the Web Application triggered a generic error message.
Connecting to the UART port
Given the initial results, I decided to look for a different way to control the device that would allow me to install the desired software. My first approach was to solder the UART pins and try to connect to the device using a Bus Pirate (my choice) or something similar to see what I could do (I learned this at hacking school, open the device to see the green board :P – so haxor bro!).
Once connected to the UART interface, I found the following console with a limited set of commands available to run. Some options shown in this menu were disabled, like the first and second ones:
I tried different numbers and letters, hoping to find additional commands beyond the ones shown in the menu. Although I found some hidden ones, none of them were really useful for my objectives.
The next option was to analyze the Web Application, and I found that most of the commands present in the UART console could also be triggered from a hidden menu of the Web Application Interface. This menu is documented in this OpenWRT wiki.
So, in the device I was analyzing, the hidden menu was located at:
Finding the Bug
The hidden menu found in the previous step contained different buttons that would display a lot of interesting information from the device. However, there was one strange UDP port open by default that really caught my attention when I first saw it.
After doing some research, I found a patent explaining the protocol associated with that UDP port.
TDDP is a simple protocol used for “debugging” purposes. The protocol uses a single UDP packet to send requests or commands by specifying different message types in the payload. The following picture shows the format of a TDDP packet:
I also found that some message types are documented in the patent. A debugging protocol was a very interesting thing to use while conducting this research in order to get some information about the device's state. I wanted to know what exactly I could potentially do.
After downloading the firmware running on the device (from the official website), I fired up IDA and started to search for the protocol handlers with the intention of matching what the patent describes with the real implementation inside this firmware. The patent I had found described version 2 of the protocol, but after analyzing the firmware I found references to a different version (ver 1) too.
There was no documentation available for version 1 of the protocol, so I decided to reverse engineer the handlers to understand the main differences from version 2. Although the packet structure is very similar, I found an important difference between these two versions: version 1 supports neither authentication nor encryption of the packet's payload, while version 2 requires authentication and encrypted packets.
Analyzing the handlers of v1 messages, I found that some of the handlers in v2 were being used in v1 too. Some examples of these handlers are set_configuration, get_configuration and set_macaddr, where:
- set_configuration is the handler for setting the device configuration
- get_configuration is the handler for getting the device configuration
Based on what I had learned so far, I was ready to start writing a Python script with a minimal implementation of the TDDP v1 protocol. I focused first on the get_configuration request trying to gather the information that was returned back to me.
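A minimal sketch of what such a script might look like follows. The header layout here is an assumption, loosely modeled on the kind of fields a version 2 header carries (version, type, code, reply info, payload length, packet ID, sub type, a reserved byte and a 16-byte digest). The field order, the message type values and the UDP port are all placeholders that would have to be confirmed against the firmware; none of them come from this write-up.

```python
import socket
import struct

TDDP_VERSION_1 = 0x01

# ASSUMED header layout, not confirmed by the analysis in this post:
#   version, type, code, reply_info : 1 byte each
#   payload length                  : 4 bytes
#   packet id                       : 2 bytes
#   sub type, reserved              : 1 byte each
#   digest                          : 16 bytes (zeroed: v1 has no authentication)
HEADER_FORMAT = ">BBBBIHBB16s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_request(msg_type: int, payload: bytes = b"", packet_id: int = 1) -> bytes:
    """Build a single TDDP v1 request under the assumed header layout."""
    header = struct.pack(
        HEADER_FORMAT,
        TDDP_VERSION_1,  # protocol version
        msg_type,        # message type (value is a placeholder chosen by the caller)
        0,               # code
        0,               # reply info
        len(payload),    # payload length
        packet_id,       # packet id
        0,               # sub type
        0,               # reserved
        b"\x00" * 16,    # unused digest field
    )
    return header + payload


def send_request(host: str, port: int, packet: bytes, timeout: float = 5.0) -> bytes:
    """Send one UDP datagram and return the raw reply (port is left to the caller)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (host, port))
        response, _ = sock.recvfrom(65535)
        return response
```

With the real message type and port filled in, a get_configuration request would be something like `send_request(host, port, build_request(msg_type))`, with the reply body starting after the first `HEADER_SIZE` bytes.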
After sending the packet using that script, the response looked like a key-value configuration file. While reading the output of this configuration, credentials were found in plain text!!! (Yes! Authentication was *not* required to get the device configuration!)
The outcome of this vulnerability was just getting the device configuration. Although that was interesting, I was still far away from installing a custom firmware, which was my original objective. Reading the documentation again, I thought to myself that a protocol intended for debugging purposes should probably have more issues, so I started to statically reverse engineer the rest of the message handlers. A couple of hours later, I went deeper into the set_configuration request, and I found a strcpy-style vulnerability, leading to a simple stack-based buffer overflow.
After the initial analysis, I found that in order to exploit this vulnerability the shellcode had to satisfy a couple of restrictions; for example, it couldn't contain zero bytes or spaces because of various checks in the code.
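A restriction like this is easy to check mechanically before sending anything to the device. The helper below is a small sketch that only knows about the two forbidden bytes named above (zero and space); the real checks in the firmware may reject more, and the function name is made up for illustration.

```python
# Bytes the payload must avoid, per the restrictions described above.
# The firmware may reject others; extend this set as they are discovered.
BAD_BYTES = {0x00: "zero byte", 0x20: "space"}


def find_bad_bytes(shellcode: bytes, bad_bytes=BAD_BYTES):
    """Return (offset, byte value) for every forbidden byte in the shellcode."""
    return [
        (offset, value)
        for offset, value in enumerate(shellcode)
        if value in bad_bytes
    ]


if __name__ == "__main__":
    # The canonical MIPS nop is the all-zero word, so it fails the check.
    for offset, value in find_bad_bytes(b"\x00\x00\x00\x00"):
        print(f"offset {offset}: 0x{value:02x} ({BAD_BYTES[value]})")
```

One practical consequence on MIPS is that the usual nop (the all-zero instruction word) cannot appear in the payload, so any padding has to use a different harmless instruction.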
With this vulnerability I could take over the TDDP service execution flow, pointing it to my own code, and then patch the verification mechanism in the update function, allowing me to install any firmware I wanted (including older versions).
The main problem with this strategy was that there was no debugger available on the device. (For some strange reason, the UART pins I had soldered before were no longer working and I couldn’t determine why :-|). The only data I could get from the device were the values of the PC and SP registers of the processes, shown on the hidden web page found before.
The following picture shows what the process list looks like. As can be seen, the PC and SP registers are detailed in there.
My strategy was to develop an exploit using a “jump debugging” technique, which consists of using a jump instruction to jump to different addresses depending on the success or failure of the executed code.
The following picture shows an example manipulating the PC register using this technique:
In this case, the value of a file descriptor (0xa) was loaded into a register and then a jump to this register was executed. The resulting PC value confirmed that the register contained the expected value.
Writing the Exploit
After a couple of days, and using the technique described in the previous section, I could get control of the PC register using some ROP gadgets; but when I was ready to execute my own code (with the restrictions explained previously) … it didn’t work :(
I had never coded an exploit for the MIPS architecture before, and after reading this I understood why it didn’t work. The problem was that the MIPS caches were not being flushed, so the shellcode I had written was not yet available in the instruction cache. The strategy described in the referenced blog post was to flush the cache by calling the sleep() function. In my case, the firmware didn’t have symbols, hence identifying the sleep() function wasn’t an easy task. For this reason, I started to learn how the MIPS cache works and how I could flush it.
Reading this blog post helped me to figure out that the MIPS cache is split in two.
On one side we have the data cache (D-cache) and on the other the instruction cache (I-cache). The procedure to flush these caches is to set the coprocessor registers TagLo and TagHi to zero and then execute the instruction “cache 8, 0($a0)” (to flush the I-cache) and “cache 9, 0($a0)” (to flush the D-cache).
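To see why 8 and 9 are the right values, it helps to split the cache instruction's 5-bit op field the way the MIPS32 instruction set defines it: the low two bits select the cache and the upper three select the operation. The sketch below decodes both values and also encodes the full instruction word; the helper names are made up for illustration.

```python
# Low two bits of the op field: which cache the instruction targets.
CACHE_TARGETS = {
    0b00: "I-cache",
    0b01: "D-cache",
    0b10: "tertiary cache",
    0b11: "secondary cache",
}

# Upper three bits of the op field: operation code.
# 0b010 is Index Store Tag, which writes TagLo/TagHi into the indexed line.
INDEX_STORE_TAG = 0b010

CACHE_OPCODE = 0x2F  # major opcode of the MIPS32 "cache" instruction
REG_A0 = 4           # $a0 is general purpose register 4


def decode_cache_op(op: int):
    """Split a 5-bit cache op into (operation code, target cache name)."""
    if not 0 <= op <= 0x1F:
        raise ValueError("cache op must fit in 5 bits")
    return op >> 2, CACHE_TARGETS[op & 0b11]


def encode_cache(op: int, base_reg: int, offset: int = 0) -> int:
    """Encode `cache op, offset(base_reg)` as a 32-bit instruction word."""
    return (CACHE_OPCODE << 26) | (base_reg << 21) | (op << 16) | (offset & 0xFFFF)


if __name__ == "__main__":
    for op in (8, 9):
        operation, target = decode_cache_op(op)
        word = encode_cache(op, REG_A0)
        print(f"cache {op}, 0($a0): operation {operation:#05b} on {target}, word {word:#010x}")
```

Both values decode to Index Store Tag, on the I-cache for 8 and the D-cache for 9. With TagLo and TagHi zeroed, storing that tag into every line index marks each line invalid.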
Inspecting the firmware, I found a function doing exactly what I needed (probably an initialization routine):
However, I found a slight issue with this code:
Like all other function epilogs in the firmware, this function ends by calling jr $ra. The $ra register is set when a function is called with a jalr instruction. For instance, if you call the function foo like:
The $ra register is set in order to know the return address. In this specific example, the return instruction is implemented using jr because, contrary to the jalr instruction, the jr instruction doesn’t set the $ra register.
The common solution is to build a ROP chain (or, more accurately for the MIPS architecture, a JOP chain) that sets $ra to a value (for instance 0x12345678) and then jumps to 0x8016B910 (the flushing function), so that the flushing function runs and then returns to 0x12345678.
The problem is that $ra is only set in function epilogs, like this:
The previous picture shows the epilog. Here you can see that after getting the value from the stack, the $ra register is used to return to the caller. If I set the $ra register to 0x8016B910 (the flushing function), the epilog would jump there, and the flushing function's own jr $ra would then jump straight back to 0x8016B910; I would lose control because an infinite loop is created.
But, what can we do?
I realized that after flushing the I-Cache, newly written instructions take effect immediately. That means that I can modify the epilog of the flushing function (0x8016B910) and replace the instruction jr $ra with jr $fp (or similar); with this trick I can flush the cache and remain in control, jumping to my shellcode.
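The patch itself is a single 32-bit word. In the MIPS32 encoding, jr places the register number in bits 25..21 and the function code 0x08 in the low bits, so swapping $ra for $fp changes only one bit of the instruction. The sketch below computes both words; the big-endian byte order used when printing is an assumption about this device, not something stated above.

```python
import struct

# General purpose register numbers from the standard MIPS register naming.
REG_FP = 30
REG_RA = 31

JR_FUNCT = 0x08  # function code of "jr" under the SPECIAL opcode (0)


def encode_jr(reg: int) -> int:
    """Encode `jr $reg` as a 32-bit MIPS32 instruction word."""
    if not 0 <= reg <= 31:
        raise ValueError("register number must be in 0..31")
    return (reg << 21) | JR_FUNCT


if __name__ == "__main__":
    original = encode_jr(REG_RA)  # instruction found in the epilog
    patched = encode_jr(REG_FP)   # replacement that keeps control
    # Byte order is ASSUMED big-endian here; use "<I" for a little-endian target.
    print("jr $ra:", struct.pack(">I", original).hex())
    print("jr $fp:", struct.pack(">I", patched).hex())
```

Whichever register is chosen has to hold the shellcode address by the time the flushing function reaches its epilog.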
The following picture shows the layout of the function instructions before and after exploitation:
Finally, the shellcode should just patch the firmware checks in the running code and, again, call the cache invalidation instruction so that these changes take effect and the firmware checks are disabled.
Talking with @_topo, he suggested that I read about the segment layout on MIPS, and I found that the MIPS cache can be avoided if you use the kseg1 segment instead of kseg0 (http://cdn.imgtec.com/mips-training/mips-basic-training-course/slides/M…). I couldn’t test it.
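For reference, on 32-bit MIPS kseg0 (0x80000000 to 0x9FFFFFFF) and kseg1 (0xA0000000 to 0xBFFFFFFF) are two windows onto the same low 512 MB of physical memory, with kseg0 cached and kseg1 uncached. The sketch below computes the kseg1 alias of a kseg0 address. It only illustrates the address arithmetic; like the idea itself, it is untested against this device.

```python
KSEG0_BASE = 0x80000000  # cached, unmapped window
KSEG1_BASE = 0xA0000000  # uncached, unmapped window
PHYS_MASK = 0x1FFFFFFF   # low 512 MB of physical address space


def kseg0_to_kseg1(addr: int) -> int:
    """Return the uncached kseg1 alias of a kseg0 virtual address."""
    if not KSEG0_BASE <= addr < KSEG1_BASE:
        raise ValueError("address is not in kseg0")
    return (addr & PHYS_MASK) | KSEG1_BASE


if __name__ == "__main__":
    flush_function = 0x8016B910  # address of the flushing function from this post
    print(hex(kseg0_to_kseg1(flush_function)))
```

In principle, writing or executing code through the kseg1 alias bypasses the caches entirely, which would remove the need for the flushing trick.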
First of all, the good news: the service is not reachable from the WAN interface. In fact, it can't even be reached if connected to WiFi, so one needs to be using a wired connection. Second, it's important to point out that the firmware version evaluated was shipped in 2014. However, at the same time, that's the latest available version for this particular device. Sadly, people running this particular hardware don't have a solution at the moment.
There are newer versions of firmware for other TP-Link devices. We installed the latest version of the firmware available for the TL-MR3020, from 2015, and the service was still running and listening by default. Code for version 1 of the protocol can be seen in the firmware, though there were fewer handlers for version 1 (the handlers for getting or setting the device configuration were removed, for example), so we can't say at this time whether or not these devices are vulnerable to the issues described above. More research is needed to determine that. Static code analysis of firmware from 2016, for which we didn't have devices to test, seems to reveal pretty much the same situation.
Depending on the attack scenario and device configuration (again, it's not exposed to the WAN or WiFi), the biggest outcome of these vulnerabilities is allowing an attacker to replace the device firmware with a custom one (including a backdoor for persistence purposes) that would allow her access to the internal network from the Internet.
The issues with security in embedded devices have been known for a while now, and we are not going to spend a lot of time going through all of the common pitfalls. But leaving a debugging protocol listening by default, while not requiring authentication, is a bad idea, and vendors should take notice of it. Memory corruption vulnerabilities are also pretty common, as hopefully can be seen from the research described above. Secure development practices that have matured among operating system vendors need to be adopted by embedded device vendors. MIPS, for example, now supports XI (Execute Inhibit), something similar to the NX bit in Intel, to prevent user data in memory from being executed. But really, a secure development process should be adopted, including source code auditing and pen-testing.
The hidden UART commands were found and documented by other researchers, and that information can be found on the Internet.