Conspiracy Theory: Intel's AMT Vulnerability & The Ken Thompson Hack

2017-05-14

Around two weeks ago, Intel announced a critical privilege escalation bug that had been lurking in its Active Management Technology (AMT) login page for the past seven years. The exploit allows a remote attacker to take control of vulnerable devices with ease.

I’ve read many posts that mock the programmer who introduced it, as well as the (lacking) testing framework and processes that should make sure such things never happen.

But what if no one made a mistake, and the whole thing is the result of an elaborate hack?

How much can you trust software?
Have you ever checked the validity of the sources you acquire your software from?
Can you trust your own code? Have you ever checked the tooling that compiles or runs it?

In 1984, Ken Thompson, a well-known figure in the hacker community and one of the authors of UNIX, argued that we can’t. In his remarkable paper, Reflections On Trusting Trust, Ken outlines a hack that many consider the worst hack imaginable: The Ken Thompson Hack.

This blog post is a bit long (but worth it!) and is made up of three parts: the AMT vulnerability, the Ken Thompson hack, and the mega conspiracy.

Disclaimer: The conspiracy theory is completely made up.

Interested? Awesome. Start by reading about the AMT vulnerability.

The AMT Vulnerability (CVE-2017-5689)

I won’t go into too much detail, because that’s not the purpose of this post.

In short, the login code for the AMT web interface used the strncmp function incorrectly, which let anyone gain access by submitting an empty password at the login screen.

What does incorrect mean? Let’s go back to the docs:

int strncmp (const char* str1, const char* str2, size_t num);

Compare characters of two strings
Compares up to num characters of the C string str1 to those of the C string str2.

This function starts comparing the first character of each string. If they are equal to each other, it continues with the following pairs until the characters differ, until a terminating null-character is reached, or until num characters match in both strings, whichever happens first.

Parameter	Explanation
str1	C string to be compared
str2	C string to be compared
num	Maximum number of characters to compare

The bug was fairly simple. Instead of this:

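#include <cstdio>
#include <cstring>
#include <string>
using std::string;

int main () {
  string realpass = "secret";       // the stored password
  string userpass = "user-secret";  // what the user typed
  // The comparison length comes from the stored password.
  int equal = std::strncmp(realpass.c_str(), userpass.c_str(), realpass.size());
  if (equal == 0) {
     std::printf("'%s' equals '%s'\n", realpass.c_str(), userpass.c_str());
  }
  return equal * equal; // make sure it's positive
}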

The code was compiled like this:

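#include <cstdio>
#include <cstring>
#include <string>
using std::string;

int main () {
  string realpass = "secret";       // the stored password
  string userpass = "user-secret";  // what the user typed
  // The comparison length comes from the user's input.
  int equal = std::strncmp(realpass.c_str(), userpass.c_str(), userpass.size());
  if (equal == 0) {
     std::printf("'%s' equals '%s'\n", realpass.c_str(), userpass.c_str());
  }
  return equal * equal; // make sure it's positive
}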

See the difference? The maximum number of characters to compare is realpass.size() in the first snippet and userpass.size() in the second. That means that if the user submits an empty password, userpass.size() is 0, strncmp compares zero characters and returns 0, and the program reports that two non-matching strings match. That’s basically the AMT vulnerability.
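To see the effect in isolation, here is a minimal C sketch of both variants as standalone functions. The names check_flawed and check_strict are made up for this post and are not taken from Intel’s code. The extra length check in check_strict is my own assumption about what a safer comparison could look like, since bounding the comparison by the stored password’s length alone would still accept input that merely starts with the right password.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: mirrors the buggy pattern. The number of characters
   to compare comes from the user's input, so an empty input compares
   zero characters and strncmp reports "equal". */
bool check_flawed(const char *realpass, const char *userpass) {
    return strncmp(realpass, userpass, strlen(userpass)) == 0;
}

/* Hypothetical helper: the length is taken from the stored password, and
   the input must have exactly that length (an extra check, my assumption). */
bool check_strict(const char *realpass, const char *userpass) {
    size_t n = strlen(realpass);
    return strlen(userpass) == n && strncmp(realpass, userpass, n) == 0;
}
```

With these, check_flawed("secret", "") returns true, and so does any prefix of the password such as "sec", while check_strict rejects both.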

The following video explains what I’ve just said, and shows the vulnerability in action:

Anyhow, was that a programmer mistake? Probably. But what if someone attacked Intel a few years ago and, using an elaborate technique, inserted a backdoor that is almost impossible to find?

The Ken Thompson Hack

Ken describes how he injected a backdoor into a compiler that allowed him to bypass the UNIX login command. Not only did his compiler know when it was compiling the login command and inject a backdoor, but it also knew when it was compiling itself and injected the backdoor generation code into the compiler it was creating.

Ken divided his paper into three parts (“stages”) and explained each stage thoroughly. I’ve summarized them for you, but if you find it interesting, I recommend reading the original paper as well: Reflections On Trusting Trust.

Stage One

Write a Quine program:

A quine is a non-empty computer program which takes no input and produces a copy of its own source code as its only output. The standard terms for these programs in the computability theory and computer science literature are “self-replicating programs”, “self-reproducing programs”, and “self-copying programs”. - Wikipedia

The following snippet shows a self-reproducing program in C, or more precisely a program that produces a self-reproducing program.

This program can be easily written by another program.
This program can contain an arbitrary amount of excess baggage that will be reproduced along with the main algorithm. In the example, even the comment is reproduced.

#include <stdio.h>

const char * SOURCE = "#include <stdio.h>%c%cconst char * SOURCE = %c%s%c;%c%cint main(){%c%c//Prints own source code and injects newlines(10), horizontal tabs(9) and apostrophes(34)%c%cprintf(SOURCE, 10, 10, 34, SOURCE, 34, 10, 10, 10, 9, 10, 9, 10, 9, 10, 10);%c%creturn 0;%c}%c";

int main(){
	//Prints own source code and injects newlines(10), horizontal tabs(9) and apostrophes(34)
	printf(SOURCE, 10, 10, 34, SOURCE, 34, 10, 10, 10, 9, 10, 9, 10, 9, 10, 10);
	return 0;
}

Stage Two - Self-learning code

Once certain code has been compiled into a binary, that code can be removed from the source and the binary will still know what to do.

For example, for a compiler to know what \n means, we have to teach it. We do that by first telling the compiler that when it sees \n, it should emit 10 instead. In the ASCII chart, decimal 10 is the newline character.

Once that code is compiled, we can replace the 10 with \n in our source code, because the compiler binary now knows what it means. The literal 10 is gone from the source without a trace, and the only way to find it is to examine the binary.
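Here is a rough C sketch of those two steps, loosely modeled on the escape-handling example in Ken’s paper. The function names are invented for illustration, and the code assumes an ASCII system where the newline character is 10.

```c
/* Step 1 (hypothetical bootstrap source): the compiler that builds this
   code has never heard of \n, so the value 10 is spelled out by hand. */
int escape_value_bootstrap(int c) {
    if (c == 'n') return 10;   /* teach the compiler: \n means 10 */
    return c;                  /* any other character stands for itself */
}

/* Step 2 (hypothetical later source): built with the step 1 binary, which
   already maps \n to 10, so the magic number can disappear from the source. */
int escape_value(int c) {
    if (c == 'n') return '\n'; /* the meaning now lives only in the binary */
    return c;
}
```

Both functions return the same value for 'n', but only the first one tells a human reader what that value is.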

Stage Three - Inserting a backdoor

Say we have access to Windows’s source code, and we inject a backdoor in the login screen to always accept a specific password. This would work, but you’ll get caught pretty quickly once someone looks at your commit.

Instead, what if we put a quine in the compiler that replicates itself, including the backdoor?

Add code that injects the backdoor when compiling the login executable.
Add replication code that ensures that every time we compile the compiler, that code will be added.
Delete all traces from the source (or, better yet, replace the compiler binary)

Now all traces are gone from the source, but they still exist in the binary. The backdoor remains undetectable unless someone reverse-engineers the binary!

Of course the whole thing is a lot more complex: you’ll probably have to replace the build image Microsoft uses, and find a way to remove any traces of your actions.

Sounds crazy, right? But in August 2009 a virus utilizing the Ken Thompson hack was seen in the wild. W32/Induc-A infected Delphi’s compiler with code that helped it spread across machines. It is believed to have been propagating for at least a year before it was discovered by SophosLabs. You can read more about it on Naked Security.

The Mega Conspiracy

What if someone hacked into Intel’s servers a few years ago, and updated their compiler to replace this:

strncmp(realpass.c_str(), userpass.c_str(), realpass.size())

with this:

strncmp(realpass.c_str(), userpass.c_str(), userpass.size())

Essentially adding a backdoor? What if the same attacker added code that turned off the attack when test runners were used? Or when the compiler was running inside Intel’s LAN?

This might sound crazy and far-fetched, but there are threat actors out there with the skill-set to pull this off. But hey, I’m not that paranoid. I do believe the vulnerability was introduced as a result of a human mistake, but what if it wasn’t?