The XZ Compromise and Open-Source Security

The XZ compromise is not your typical security vulnerability. This security flaw was deliberately introduced by a malicious actor, over the course of many months, in open-source software that is foundational to the internet.

The backdoor subverts the normal SSH key verification checks and grants remote access to anyone in possession of a specific private key. On many modern Linux distributions, OpenSSH is patched to link against libsystemd, which in turn depends on the compromised xz/liblzma library; had the backdoored releases shipped widely, the blast radius would have been unprecedented.

The fact that "Jia Tan", the persona behind this campaign to backdoor XZ, managed to go from anonymous contributor to maintainer of the project over the course of ~12 months reflects the hostile environment open-source developers face: users report bugs, demand fixes, and submit contributions of varying quality, whilst expecting maintainers to put in the time, effort, and financial resources to adequately maintain the software. It is a complete misalignment of incentives.

In the case of XZ, its lead developer and maintainer was clearly the target of an intelligence campaign to subvert him - he had mentioned he was struggling to maintain the project due to burnout, and just as he was looking for a new maintainer, Jia Tan was diligently "fixing" bugs and available to step up and help out! By 2023, Jia Tan was already signing release tarballs consumed by a multitude of downstream projects, notably Debian, Ubuntu, Fedora, and, surely, in time, expensive proprietary enterprise Linux distros. We were lucky that XZ 5.6.0 and 5.6.1 had not yet been widely integrated into Linux distributions.
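For anyone wanting to check a host against the two releases named above, here is a minimal sketch. The helper name `is_affected_xz_version` is hypothetical, and how you obtain the version string (from your package manager or the tool's own version output) is left as an assumption. It only compares the leading upstream `x.y.z` part, so treat it as a first-pass filter rather than a verdict, since distro packages can carry version strings that do not map cleanly onto upstream releases.

```python
import re

# Upstream releases named in this post as carrying the backdoor.
AFFECTED_VERSIONS = {(5, 6, 0), (5, 6, 1)}


def is_affected_xz_version(version: str) -> bool:
    """Return True if `version` starts with an affected upstream x.y.z release.

    Hypothetical helper: accepts plain strings such as "5.6.1" as well as
    distro-style strings such as "5.6.1-1". Anything that does not start
    with three dot-separated numbers yields False.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        return False
    return tuple(int(part) for part in match.groups()) in AFFECTED_VERSIONS


if __name__ == "__main__":
    for candidate in ["5.4.6", "5.6.0", "5.6.1-1"]:
        print(candidate, is_affected_xz_version(candidate))
```

Matching on the parsed tuple rather than on a string prefix avoids flagging an unrelated version such as a hypothetical 5.6.10.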

Another gem in this whole saga is how close the backdoor came to being found earlier. Valgrind, a tool for detecting memory errors in C/C++ programs, complained and crashed when run against the backdoored code, yet a number of agents in GitHub pull requests and on Debian mailing lists introduced noise to legitimize the backdoor, along with workarounds to "fix" the Valgrind issue.

The campaign fell apart under the scrutiny of Andres Freund, a PostgreSQL developer working at Microsoft. As Andres was working on some micro-benchmarking, he noticed the timings of SSH connections became "a lot slower". I'm not sure most of us would have noticed those extra 0.508s, but Andres did, fired up his profiler, started looking into OpenSSH, and painstakingly debugged until he found the root cause - a piece of obfuscated malicious code pretending to be part of the test suite, but whose true purpose was to backdoor xz/liblzma.

Even though contributions to open-source software don't require the disclosure of one's identity (thankfully), there are some clues about the identity of Jia Tan: the majority of their commits have been manipulated to appear to come from a UTC+8 timezone (China, Indonesia, Philippines, Western Australia), but on some occasions they forgot to change timezones and committed from UTC+2 and UTC+3. These match perfectly with the daylight saving time switchover in Eastern Europe. If all other timestamps are adjusted to this timezone, we can infer they usually worked 9am-6pm, which makes more sense than the non-adjusted 4pm-1am working hours.
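The kind of timestamp analysis described above can be sketched in a few lines. This is an illustration under stated assumptions, not the method used in the original analysis: the sample timestamps are invented purely to exercise the code and are not real commit data, and the function names are hypothetical. It tallies the UTC offsets recorded in a set of timestamps, then re-expresses a timestamp in a candidate offset to see what local hour it corresponds to.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Invented sample timestamps (ISO 8601 with UTC offset). NOT real commit data.
SAMPLE_TIMESTAMPS = [
    "2023-01-10T16:30:00+08:00",
    "2023-01-11T23:05:00+08:00",
    "2023-02-02T11:15:00+02:00",
    "2023-07-03T12:40:00+03:00",
]


def tally_offsets(stamps):
    """Count how many timestamps were recorded at each UTC offset (in hours)."""
    parsed = [datetime.fromisoformat(stamp) for stamp in stamps]
    return Counter(ts.utcoffset() / timedelta(hours=1) for ts in parsed)


def local_hour_in(stamp: str, utc_offset_hours: int) -> int:
    """Return the wall-clock hour of `stamp` as seen at a fixed UTC offset."""
    target = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromisoformat(stamp).astimezone(target).hour


if __name__ == "__main__":
    # Outlier offsets stand out immediately in the tally.
    print(tally_offsets(SAMPLE_TIMESTAMPS))
    # Re-express every sample as if the author were sitting at UTC+2.
    for stamp in SAMPLE_TIMESTAMPS:
        print(stamp, "->", local_hour_in(stamp, 2))
```

Fixed offsets are used here instead of named zones so the sketch has no dependency on a timezone database; a real analysis of daylight saving transitions would want named zones.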

My key takeaways from this:

  • We take our internet plumbing for granted, and most of us don't appreciate the subtle web of composable protocols, standards, and software required to do seemingly trivial things like browsing the web. Despite the internet being such a momentous achievement for humanity, we sometimes forget how fragile some of our foundations are.

  • Lasse Collin, the original author and maintainer of xz, has an impeccable, decades-long track record. He is blameless in all this, and I imagine he must be feeling pretty heartbroken right now.

  • Anonymity allows anyone to contribute to open-source projects without fear of discrimination, but it also enables bad actors to attack with impunity. While large projects have robust governance models, smaller but critical projects often lack the resources to protect themselves adequately. The community must find ways to preserve anonymity's benefits while safeguarding smaller projects from targeted attacks.

  • Open-source projects are public goods, but they lack a sustainable funding model. This leads to a misalignment of incentives, where maintainers are expected to invest time and effort without proper compensation. Many smaller but critical projects struggle to survive, leaving them vulnerable. The community must explore new ways to fund and sustain these projects to ensure their long-term health and security.

  • Andres noticing that SSH connections were 0.508s slower than usual speaks volumes about his proficiency as a performance engineer, but as he stated, it did require "a lot of coincidences" from past experience for him to become suspicious in the first place. Did our security tooling fail us? Valgrind crashed because the stack layout differed from what the xz backdoor was expecting; would those crashes alone have been sufficient to properly detect the backdoor?

This is a stark reminder of why you should never, under any circumstances, expose the management interfaces of production servers to the internet. Even though OpenSSH and public-key cryptography remain sound, you generally have no idea how secure your supply chain is. Your servers will get compromised if you only have a single layer of protection. The security community has known this for a long time, and it is why we always promote defense-in-depth - a security principle which states that you should always use multiple layers of security controls.

The next time you boot up your Linux distro, open your web browser, or download a file, take a moment to reflect on the immense value of the open-source software that makes it all possible.