flaws in OpenSSL's random number generator
Feb. 19, 2016
Update 2018-05-01: This post (and the paper) refers to the "md_rand" implementation of OpenSSL's RNG. As you can read in this post, the OpenSSL 1.1.1 prereleases already feature a new RNG approach. I hope to be able to write something about it in the near future...
In the following we give an overview of our recently published paper [ paper | slides ] presented at Eurocrypt 2016. It reveals a number of cryptographic flaws in the random number generator (RNG) that is used by the OpenSSL library since about two decades. These flaws by far do not affect all systems using OpenSSL, but only those which are at risk of initially seeding the RNG with insufficient entropy and rely on its subsequent reseeding through the corresponding API functions, for instance using a seed file. The most realistic attack variants, entirely passive in nature, incur a computational effort above 280 binary operations and are thus out of reach for any adversary today, but principally allow also later attacks on the RNG state and thus previously generated keys. Accordingly, it is rather safe to say that from this issue, there is no immediate threat leading to system compromise, but possibly a significant reduction of the assured security level of generated cryptographic keys. Other issues incur a potentially arbitratily small attack complexity, but offer a comparatively small attack surface.
overview of the identified issues
In order for users of the library to be able to judge to which extend their own applications might be affected without having to read the entire paper, we provide an overview of the practically relevant identified issues, point out under which conditions they arise, and give an estimation of the resulting impact.
- description. "LESLI" is the abbreviation of "low entropy secret leakage issue" and refers to a problem in the management of the RNG rather than in its cryptographic design. The main reason for this problem is that the RNG features a function for outputting random numbers before being sufficiently seeded, namely RAND_pseudo_bytes, which uses the normal RNG state for output production. This is already sufficient to put low entropy secret seed data like user passwords at risk to be leaked through the RNG's output. However, the problem is aggrevated by the fact that OpenSSL also silently feeds other data to the RNG as seed data, such as the initial contents of the buffers to be filled with random data through RAND_pseudo_bytes. If these buffers contain low entropy secrets, these are also at risk to be leaked through subsequent RNG output. This is a rather realistic assumption, since overwriting secrets in memory with random values is an established practice.
- attack result. The attacker learns low entropy secrets, usually the previous contents of the buffers overwritten with random data in a call to RAND_pseudo_bytes.
- attack complexity. The complexity of an attack exploiting LESLI is arbitrarily low. It is derived from the combined entropy of the RNG prior to the feeding of the attacked seed data and that of the seed data itself.
- conditions for the issue to arise. The RNG is operated in a low entropy state, RAND_pseudo_bytes is used for non-cryptographic purposes and an attacker has access to RNG output.
- estimation of exploitability. For a system to be actually exploitable, it must operate with the RNG being in a low entropy state. If it does so unintentionally, and uses the RNG for cryptographic purposes, then far greater problems than LESLI must be expected. Systems operating intentionally, possibly temporarily, in a low entropy state, can be assumed to be rare, and whether or not low entropy secrets are fed to the RNG completely depends on the application. Accordingly, we have to assume that actually vulnerable applications will be quite rare.
- description. Most of the issues reported in the paper (ELO-80, ELO-160, ELO-240, DEJA-SEED, and DEJA-STATE) are caused by a combination of shortcomings in the management of the RNG and in its core cryptographic construction. The RNG manages a huge internal state and relies on what is referred to as the "stirring" operation for the distribution of the entropy within that state. The problem is that this operation is only performed once within the RNG's life cycle, namely when for the first time output is drawn from it after having been sufficiently seeded. When new seed data is fed to the RNG after the initial seeding, the stirring operation is not carried out any more. Obviously there are only two possibilities: either the operation is not necessary for the RNG's secure operation after a reseeding, and should thus be removed completely, or it is indeed necessary, and the reseeding does not achieve the same security level as the initial seeding when performed in a low entropy state. Clearly, if the operating system's RNG provides insufficient entropy for the initial seeding, then the RNG's being "sufficiently seeded" will be in false believe and effective subsequent reseeding is important. The analysis of the cryptographic design revealed that in this case, the lack of the execution of the stirring function has the consequence that the RNG only reaches a guaranteed security level of 80 bits.
- attack result. The attacker recovers previous RNG output, or the complete RNG state and possibly even the high entropy seed data used for reseeding, and can also predict future RNG output.
- attack complexity. The complexity of the attacks can be as low as 280 hash evaluations with a few tens of bytes of hash input.
- conditions for the issue to arise. The initial seeding of the RNG, which is usually performed automatically on standard operating systems by the library when output is requested for the first time, fails in the sense that a seed of considerably lower entropy is used than expected. At least one call to RAND_pseudo_bytes or RAND_bytes is performed before subsequently reseeding the RNG with a high entropy seed, for instance from a seed file. Afterwards, the attacker either receives RNG output at a specific offset in the RNG's output stream or a part of the RNG output from a call to RAND_bytes where the remaining portion should be secret to him.
- estimation of exploitability. Due to the sufficiently high attack complexity, there is no risk of actual attacks being carried out today. The problem remains that an adversary, who recorded RNG output after a high entropy seeding that was conducted in a low entropy state, might break subsequently generated keys when the necessary computational power becomes available. But for quite a long time, such a computational effort will incur considerable costs. For these types of attack the probability of finding vulnerable systems is presumably not negligible.
- description. The notion of forward security of seed data is a new security notion of RNGs introduced in our work. The established notion of forward security means that an attacker gaining access to an RNG's internal state at some point in time, does not gain any information about the RNG's previous output. However, in this scenario, also the security of seed data within the RNG state is a relevant notion. An application might feed secret low entropy data, such as passwords entered by users, to the RNG as seed data (the OpenSSL manual pages even encourage this). Accordingly it is desirable that the RNG achieves their maximally possible protection against state-disclosure attacks as described above. However, OpenSSL's RNG fails to achieve this: When low entropy seed data is fed to the RNG, it remains potentially recoverable from the state. This is due to the fact that the RAND_add function does not effectively combine the new seed data with the previous RNG state and furthermore only hashes the input seed data block-wise, so that even high entropy seed data having a structure that includes known respectively fixed parts, can be affected.
- attack result. The attacker learns the values of seed data previously fed to the RNG by gaining access to the RNG's state (through a break-in into the system or application).
- attack complexity. The attack complexity can be arbitrarily low and depends on the seed data's entropy, its structure and the previous entropy level of the RNG state.
- conditions for the issue to arise. The RNG is in a low entropy state. Secret seed data is added which either is of low entropy or of high entropy and at the same time exhibiting a certain structure. Afterwards the RNG-state disclosure happens.
- estimation of exploitability. The issue only shows up when seed data is added in a low entropy state. All applications with sufficient initial seeding are safe, but on systems failing to achieve this, the issue is a realistic threat.
effect on embedded systems
The main category of systems affected by the above issues must be assumed to be embedded systems. As it is well known, such platforms often feature insufficient entropy levels of their operating system RNGs at least at boot time, without this being detectable by the RNG implementation. Accordingly, the use of a seed file, which stores entropy for applications across application restarts or system reboots, is a common mitigating measure under these circumstances. As we have learned from the above summary, this approach is at risk to be affected by the reported issues. Consequently, it is safe to say that mainly embedded or mobile applications using the OpenSSL RNG deserve being checked with respect to the reported issues.
a note on the approach of the paper
For those who explore the attacks described in the paper in detail in order to determine the vulnerability of their own applications, we wish to point out that the intent behind the attacks developed in the paper was to show that RNG suffers from considerable flaws and not to determine the optimal attacks. The developed attacks serve as examples that demonstrate the flaws, and for the ease of analysis and description are not necessarily optimal in all aspects. Accordingly, one should for instance not rely on the amount of RNG output necessary for an attacker in the DEJA-SEED and DEJA-STATE attacks as specified in the paper and simply exclude the vulnerability of one's own application because it only generates smaller output. What is undisputed is that the RNG's security does not fall under 80 bits (provided that it is seeded with at least 80 bits of entropy).
It is currently unknown to us whether or when OpenSSL will provide a patch for these issues. For those who want to implement countermeasures within their own application we provide secure wrapper functions for the RNG's API functions. Refer to the research paper for details.
a note on the forks LibreSSL and BoringSSL
On the background of the revelation of these shortcomings of OpenSSL's RNG, some users might consider alternative implementations as offered by the OpenSSL forks LibreSSL or BoringSSL. We wish to point out that both libraries' RNG implementations, while formally retaining the RNG API from OpenSSL, break the API's contract in a highly problematic way. They do so by implementing the functions necessary for reseeding the RNG from client code as "NOPs", i.e. as functions with empty bodies. Their RNGs entirely rely on the operating system's entropy sources. Consequently, a program performing well in terms of RNG security using OpenSSL on a system with a bad entropy level from the operating system RNG, for instance by managing a seed file, would fail when ported to use one of the forks. Accordingly, it is not possible to recommend the use of either of the forks to circumvent the shortcomings of OpenSSL's RNG.