You are currently viewing SemiWiki as a guest which gives you limited access to the site. To view blog comments and experience other SemiWiki features you must be a registered member. Registration is fast, simple, and absolutely free so please, join our community today!

  • Meltdown, Spectre and Formal

    Once again Oski delivered in their most recent Decoding Formal session, kicking off with a talk on the infamous Meltdown and Spectre bugs and possible relevance of formal methods in finding these and related problems. So far I haven’t invested much effort in understanding these beyond a hand-waving “cache and speculative execution” sort of view so I found the talk educational (given that I’m a non-processor type); I hope you will find my even more condensed summary an acceptable intro to the more detailed video (which you can find HERE).

    Article: TSMC Apple Rumors Debunked!-spectre-meltdown-min.jpg

    The speaker was Mike Hamburg of Rambus who, among other teams such as Google ProjectZero, were involved in the research on these vulnerabilities (summarized in the image above). I’ll only consider the Meltdown part of his talk, starting with a simple example he used. Consider this line of code:

    result = foo[ bar [ i ] * 256 ];

    In the simple world, to execute this, you look up a value referenced by bar[i], multiply by 256 and lookup what that references in the array foo. In an even modest OS with a distinction between kernel mode and user mode, a user-mode process will error on a bounds-check if it tries to access kernel-only memory. This is part of how you implement secure domains in memory. Only trusted processes can access privileged regions. Particularly for virtual machines running in a cloud, you expect these types of wall between processes. I can’t look at your stuff and you can’t look at mine.

    But we want to run really fast, so the simple world-view isn’t quite right. Instead processors work on many instructions per clock-cycle, allowing for 100 or more operations executed simultaneously in flight, including instructions ahead of the current instruction. These speculative executions work with whatever values the processor reasonably assumes they may have – with data values already in cache or triggering a fetch if a needed value is not yet in cache or making a prediction about whether a branch may be taken or a variety of other guesses. When the current program counter catches up, if the guess was correct, the result is ready, if not the processor has to unwind that stage and re-execute. Despite the misses and consequent rework, overall this works well enough that most programs run much faster.

    The problem, as Mike put it, is that the unwinding is architectural but not micro-architectural. When a speculatively-executed instruction turns out to have been wrong, the processor winds back the appropriate instructions to re-execute. But other stuff that happened during the speculative execution doesn’t get wound back. Data was fetched into cache and branch predictors were updated. Who cares? It doesn’t affect the results after all. Cache fetches in these cases even prove beneficial apparently – more often than not they save time later on.

    This is where Meltdown happens but in a fairly subtle way, through cache-timing side-channel attacks. Daniel Bernstein showed in 2005 that it is possible to extract secret data for the AES encryption algorithm through such attacks, simply by measuring encryption times with a precision timer for a series of known plaintext samples and running a simple analysis on those values. In a system with cache these times vary, which is what ultimately leaks information; you run enough data through the system and you can reconstruct the key.

    In case you hoped this might be an isolated problem for AES, these kinds of attacks are more broadly applicable and not just for getting encryption keys. Mike made the point that without mitigation, a Meltdown-enabled attack can effectively read all user memory. A defense especially in the cloud is to use separate (memory) page tables per process which may not be a big deal on modern server-class CPUs but can have a 30% performance penalty on older CPUs.

    Another approach would be to make all memory accesses take the same time, at least within certain domains, or disallow speculation without bounds checks on critical operations or … In fact given the complexity of modern processor architectures, it’s not easy to forestall or even anticipate all possible ways an attack might be launched. In Mike’s view formal can play a role in detecting potential issues, though he thinks this would be limited to small cores today. So not yet server-class cores, but I’m sure the big processor guys are working hard on the problem.

    This would start by writing a contract for covert channels – what should and should not be possible. Mike feels this can’t be a blanket attempt to make attacks impossible – maybe that could only happen if we forbid speculation. But we could on small machines define contracts to bound general execution versus execution around privileged/secret operations and/or characterize those cases where such guarantees cannot be given. Then a careful crypto-programmer for example could write code in such a way that it would be not susceptible, or at least less so, to this class of attack.

    Mike wrapped up with some general observations. Absent banning speculative execution, we’re likely to need careful analysis of covert channels. Perhaps we need to rely on slower but more resilient secure code (allowing for memory access checks even in speculation - I think I heard that AMD already does this). We also need to plan more for secure enclaves inaccessible by any form of external code, privileged or not. It was unclear in his mind whether TrustZone or similar systems could rise to this need today (maybe they can, but that’s not yet known). Certainly it seems more and more desirable to run crypto in a separate core or (if still run in software) supported by dedicated instructions hardened against side-channel attacks. I suspect there will be a lot of interest in further advancing proofs of resilience to such attacks. In formal at least this won't be available anytime soon in apps - this is going to take serious hands-on property-development and checking.