Meltdown and Spectre: Intel’s Seismic IT Disaster and a Look at Some Implications for Banks

The press has greatly under-reported the two security holes, called Meltdown and Spectre, that can without exaggeration be characterized as affecting just about every computing device in use today (with very rare exceptions, like the Apple Watch). And because the media has so badly dropped the ball, your humble blogger will start with a high-level introductory piece, in the hopes that the IT and security experts in our readership will chime in, ideally in comments, with more information and ideas. Lambert has more posts planned, and they will be more technical in nature.

One of the most obvious points, that cannot be made often enough, is that these security holes exist at the most foundational hardware level, the processors. Initial reports were that they could be fixed only via Very Extreme Measures, like getting hardware without the dodgy Intel chips. That was quickly scaled back to “oh, patches are being launched.”

The wee problem is that with a flaw this fundamental and widespread, these patches aren’t just any patches. Given the severity of the flaws (and Spectre is more recalcitrant than Meltdown), the industry’s incentive is to declare whatever it can throw at the problem adequate, whether it really addresses the underlying flaws or not. These fixes are also said to slow down performance by 5% to 30% per process. That is a massive haircut, particularly in a high-volume setting. Perhaps later optimizations can cut the performance cost, but the flip side is that later patches that do a better job could just as well increase the performance hit.

Moreover, it isn’t just that virtually everyone who has a computer (and that means smartphones too) is faced with what will feel like a big hardware downgrade in remedying these vulnerabilities. Even more important, it isn’t clear that any device with these flawed chips can ever be made secure again. While there was reason to assume that the NSA had managed to get back doors installed in every device, it’s one thing to have the NSA snooping on you. We now have the possibility of a much larger range of actors getting at your data. As our Clive put it:

And certainly for me, I’ve moved from a position of being fairly sure that most data I have either in the cloud or locally on my devices is secure and confidential to being totally convinced it’s been compromised already or could easily be by anyone who wants to.

We’ll provide links to some good overviews on Meltdown and Spectre and then give some initial examples from the financial services arena of their implications.

Some Primers on Meltdown and Spectre

We’ll quote at length from the report that broke the story, which gave a good overview:

It is understood the bug is present in modern Intel processors produced in the past decade. It allows normal user programs – from database applications to JavaScript in web browsers – to discern to some extent the layout or contents of protected kernel memory areas.

The fix is to separate the kernel’s memory completely from user processes using what’s called Kernel Page Table Isolation, or KPTI. At one point, Forcefully Unmap Complete Kernel With Interrupt Trampolines, aka FUCKWIT, was mulled by the Linux kernel team, giving you an idea of how annoying this has been for the developers.

Whenever a running program needs to do anything useful – such as write to a file or open a network connection – it has to temporarily hand control of the processor to the kernel to carry out the job. To make the transition from user mode to kernel mode and back to user mode as fast and efficient as possible, the kernel is present in all processes’ virtual memory address spaces, although it is invisible to these programs. When the kernel is needed, the program makes a system call, the processor switches to kernel mode and enters the kernel. When it is done, the CPU is told to switch back to user mode, and reenter the process. While in user mode, the kernel’s code and data remains out of sight but present in the process’s page tables.

Think of the kernel as God sitting on a cloud, looking down on Earth. It’s there, and no normal being can see it, yet they can pray to it.

These KPTI patches move the kernel into a completely separate address space, so it’s not just invisible to a running process, it’s not even there at all. Really, this shouldn’t be needed, but clearly there is a flaw in Intel’s silicon that allows kernel access protections to be bypassed in some way…

At worst, the hole could be abused by programs and logged-in users to read the contents of the kernel’s memory. Suffice to say, this is not great. The kernel’s memory space is hidden from user processes and programs because it may contain all sorts of secrets, such as passwords, login keys, files cached from disk, and so on. Imagine a piece of JavaScript running in a browser, or malicious software running on a shared public cloud server, able to sniff sensitive kernel-protected data….

It appears, from what AMD software engineer Tom Lendacky was suggesting above, that Intel’s CPUs speculatively execute code potentially without performing security checks. It seems it may be possible to craft software in such a way that the processor starts executing an instruction that would normally be blocked – such as reading kernel memory from user mode – and completes that instruction before the privilege level check occurs.

That would allow ring-3-level user code to read ring-0-level kernel data. And that is not good.

The specifics of the vulnerability have yet to be confirmed, but consider this: the changes to Linux and Windows are significant and are being pushed out at high speed. That suggests it’s more serious than a KASLR bypass.
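The user-mode/kernel-mode transition described above is exactly what the KPTI fix makes more expensive: with the kernel unmapped from the process’s address space, every system call now pays for a page-table switch on the way in and out. As a rough illustration (a sketch, not a benchmark; absolute numbers vary wildly by machine and patch level), you can compare a loop that crosses into the kernel on every iteration against one that stays in user mode:

```python
import os
import time

def time_loop(fn, n=100_000):
    """Seconds elapsed running fn() n times."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start

# Each os.write() is a system call: user mode -> kernel mode -> user mode.
fd = os.open(os.devnull, os.O_WRONLY)
syscall_secs = time_loop(lambda: os.write(fd, b"x"))

# This stays entirely in user mode: no kernel transition at all.
user_secs = time_loop(lambda: 1 + 1)
os.close(fd)

print(f"syscall loop: {syscall_secs:.3f}s  user-mode loop: {user_secs:.3f}s")
```

On a KPTI-patched kernel the gap between the two loops widens, which is why syscall-heavy workloads (databases, network servers) are hit hardest while compute-bound code barely notices.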

Richard Smith provided this simplification:

It is perhaps like a scam, with the processor as the uninformed front man, and the various user processes that share the processor as the ultimate victims.

The processor is hardwired to make *good* guesses [“speculative execution”] about what to do next, and makes a good job of it. However, the quality of the guess is based on the fundamental and inevitable assumption that the instruction stream is, as it were, honest, about what it is trying to do. CPUs can’t help making this assumption; they have no insight into what smells wrong and what doesn’t.

So: subvert these good intentions by presenting the processor with a dishonest and, to insightful human eyes, wildly improbable instruction stream, that is cunningly engineered to extract information about the carefully hidden inner workings of the host. The processor will now haplessly & obliviously leak info about stuff it’s meant to keep secret, thus compromising the security of all the other user processes.
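To make Richard’s “scam” a bit more concrete, here is a deliberately simplified toy model in Python (all names invented; this illustrates the cache-timing idea, it is not a working exploit). The attacker never reads the secret directly; it only observes which memory access is fast afterwards:

```python
# Toy model of a cache side channel: the attacker recovers a secret
# purely from the *timing* differences the victim leaves behind.

CACHE_LINE_FAST = 1    # simulated access time for cached data
CACHE_LINE_SLOW = 100  # simulated access time for uncached data

def victim_touches(secret_byte, cache):
    """Speculative access: the secret selects which probe-array
    entry gets pulled into the (simulated) cache."""
    cache.add(secret_byte)

def attacker_recovers(cache):
    """Time every probe-array slot; the fast one reveals the secret."""
    timings = {i: (CACHE_LINE_FAST if i in cache else CACHE_LINE_SLOW)
               for i in range(256)}
    return min(timings, key=timings.get)

cache = set()                    # cache starts flushed (empty)
victim_touches(ord("K"), cache)  # the "dishonest instruction stream"
recovered = attacker_recovers(cache)
print(chr(recovered))            # -> K
```

The real attacks do the same thing with actual cache lines and hardware timers, which is why coarsening the timers (discussed in comments below) blunts them.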

Needless to say, cryptocurrency owners, this includes your holdings.

This post provides another good layperson-accessible description of the flaws (hat tip EM, who it turns out is such a hard-core geek that he runs something Raspberry Pi-like).

Lambert liked the description in this tweet, because it provides a layperson-friendly discussion of how to make an instruction stream “dishonest,” as Richard puts it, but I was less enamored of it by virtue of not being able to relate it to actual computer operations. But if you are more computer-savvy, the analogy may seem more obvious:

Here’s my layman’s not-totally-accurate-but-gets-the-point-across story about how  & type attacks work:

Let’s say you go to a library that has a ‘special collection’ you’re not allowed access to, but you want to read one of the books. 1/10

— Joe Fitz (@securelyfitz)


Some Examples of Why This Is a Big Deal From Banking

Recall that the financial services industry is one of the most demanding IT environments: extremely high transaction volumes, many of which are mission critical, and very low tolerance for errors. The industry has made this bad situation worse by regularly under-investing in IT, so it is running with little headroom in many activities.

Consider some possible ramifications of a 5% to 30% increase in processing time:

Many large international banks run their big batch processes overnight, in New York time terms. Those need to complete execution before the start of the trading day in the US. What happens if the Meltdown and Spectre patches slow processing time so much that they can’t complete the overnight jobs by the opening of the trading day? As our Clive noted:

My TBTF — as per standard industry practices — rolls out security patches without much, if any, testing (and such testing that is performed is to deploy to a test PC and test server in the company’s test domain that, so long as it doesn’t fall over in a heap after perfunctory checks, is deemed to be a pass; this is acceptable because security patches shouldn’t touch functionality and shouldn’t make substantial and fundamental changes to wide ranging components like the kernel as this fix does).

But this fix inevitably kills some machine benchmarks by 30% or so. For some services that are already running “hot” (limited headroom available at peak processing times, for example) due to sweating assets and starvation-level budgets for upgrades which are now the norm, this will be more than sufficient to push them over the edge and into outages. No performance and capacity testing — which is among the most long winded and resource-intensive to do — will be scheduled because, realistically, it would take months to do properly and this fix needs to be rolled out now because it is exploitable and if successfully exploited can compromise all other security measures.
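To put illustrative numbers on that overnight-batch worry (the run time and window below are hypothetical; the 5% and 30% figures are the ends of the reported slowdown range):

```python
# Does a hypothetical overnight batch still fit its window after the patch?
batch_hours  = 6.0             # pre-patch run time (assumed)
window_hours = 7.0             # time until the trading day opens (assumed)

for slowdown in (0.05, 0.30):  # the reported 5%-30% range
    patched = batch_hours * (1 + slowdown)
    fits = patched <= window_hours
    print(f"{slowdown:.0%} slowdown: {patched:.1f}h "
          f"({'fits' if fits else 'blows the window'})")
# -> 5% slowdown: 6.3h (fits)
# -> 30% slowdown: 7.8h (blows the window)
```

The point: a bank already running near the edge of its batch window has no headroom at the top of the reported range.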

And recall, at the top of the post, we expressed our doubts that anything other than getting on entirely new chips would fully remedy the bugs. Clive highlights the implication of system security being in doubt:

A big problem for financial services is that, when a customer’s on-line banking facility (and usually then their account) is compromised ‎and the customer suffers a loss — via money transmission (wire transfer) to a fraudster’s bank account — the bank will invariably, at least initially, try to claim their systems are infallible and if a customer user ID and password was used alongside whatever two-factor authentication is adopted then the customer has been negligent in some way.

I don’t see how with this, alongside other similar security flaws, any bank can now claim their security systems are, well, secure.

I expect this will be increasingly tested both in regulator-run dispute resolution, mandatory arbitration or, eventually, the courts. ‎That will be a fairly seismic shock — financial services are used to showing up and simply saying “hey, we’re a bank, *of course* our systems are secure and fool-proof” without needing to supply serious evidence to show they’ve not fallen foul of the myriad of glitches that are now out there, in the public records.

One of the few upsides is that the increased processing time cost is a mini-transaction tax on high-frequency trading, which as we have discussed, is an entirely parasitic activity that should have been regulated or taxed out of existence long ago (among other things, it creates the worst possible market structure and drains market liquidity when it is most needed).

Needless to say, even the experts are only beginning to get their arms around what it will take to remedy these epic security flaws and what the costs will be. And the incumbents have incentives to minimize how bad things are. So reader sightings, both in the trade press and from their own experience will be very helpful in making a more accurate diagnosis.


142 comments

  1. Clive

    One point which I’ve not seen adequately reflected in mainstream media coverage is just how unnecessarily sprawling the patching job which will be needed has been made. To illustrate how prevalent and stealthy the infiltration of these CPUs into our everyday existence has been, I did a stock-take of everything which I could find just in my own house. I was stunned at the number of devices I found. The most surprising thing was all the potential areas which I now needed to check. I ended up with the following list:

    Laptop PC — Intel Core i3
    “Smart” TV (late model Panasonic with internet access, browser, apps such as Netflix and presumably Linux O/S built in — Cortex-A9 MPCore based, from Panasonic published specifications)
    Satellite receiver (Sky Q — as per smart TV, hardware confidential but Intel Core i5 base is assumed by most industry experts)
    Nest 2nd Generation — ARM Cortex A8 as shown from iFixIt teardown results
    iPad Mini 3rd Generation (Apple A7 chipset)
    Blackberry Classic (ARM 9 chipset)
    iPhone 5 (Apple A6 chipset)

    The last is merely something I keep in a drawer as a standby, but I don’t want to stop using it as it’s my backup phone. I do not habitually run Apple’s updates on it, so am not even sure it will allow a brand-new update without borking itself.

    And I don’t consider myself hugely into tech. This is simply the technodetritus that you end up with, without really trying.

    So far, only the Windows PC has been patched. The cost to the hardware vendors merely of compiling, packaging, testing and distributing the patches alone will be non-trivial. And I don’t have exactly unshakable faith that all the hardware vendors I am relying on will make the patches available. Panasonic has only rolled out one update to my TV, and that’s despite user forums having reported a myriad of annoying bugs in the TV’s features which Panasonic has refused to fix. Luckily I don’t use the TV as anything other than a TV so I don’t care, but Panasonic has a long history, in my experience of owning their previous TVs, of not fixing software issues.

    The Nest (which I had hoped to simply rip out and replace with a dumb thermostat) is more troubling. Installing the Nest necessitated a complete re-wire of my HVAC (in the European market, which uses line voltages, Nest relies on a “heat link” gateway to handle the interface between line-voltage switched equipment or the OpenTherm control protocols which are widely used here), so I was super-dumb in getting it put in in the first place, because now I need to undo the wiring changes, which I’m loath to attempt myself because you’re not just dealing with 24VAC low-voltage systems. Nest does roll out security updates, but performance of the device was already sluggish and stuttering: you’d walk up to the Nest and it wouldn’t be able to process the infrared detection of a person approaching, so you have to stand there waving your hands at it. Any hit to device performance is going to be seriously inconvenient and might tip the balance from being a pain to use to being completely unusable without huge frustration.

    The satellite receiver similarly is, seemingly, designed to rely on every last CPU cycle and has been value-engineered so the processing capacity just about keeps up with the demands of the user interface most of the time. Like the Nest, it enjoys its fair share of stutter moments now and again (I don’t, though).

    The laptop PC was already at the useless end of the performance curve, and because it was so slow and creaky to begin with, I can’t tell if the fix has made an appreciable difference because the difference between “very bad” performance and “awful” performance is too fine to notice, certainly for the kinds of tasks I use it for.

    As for the Apple devices and the Blackberry smartphone, we’ll have to wait and see.

    But I am shocked at just how far all this tech has silently crept into our lives. Most of it, like the “smart” TV, the “smart” thermostat and the unnecessarily complex satellite receiver, is just complexity for the sake of complexity.

    Making sure that everything is patched is something that I should do, just for the sake of my own data protection. But it is a tax on time. And if any of the above vendors don’t put out a fix, do I really want to be chasing them up and trying to get them to make one available?

    1. vlade

      Linux kernels were patched too, I believe. I have also seen claims that OS X was patched some time ago, but those were come-and-go.

      Meltdown is bad – but it’s Intel-only (from all I could see). Spectre is actually worse, as there’s no good fix*, and it can run on anything. Heck, you can do it in JavaScript (so probably someone already does).

      *Well, there actually is a fix: stop providing low-level, high-precision timers to anyone and everyone. The whole idea behind Spectre is that it’s a side-channel attack, so you need to measure and time your executions. If you can’t do that precisely (because the timer adds some random noise, for example), the attack becomes much harder (in theory you can still do it, but you’d have to run for much, much longer, which is more likely to get noticed). Which is also the reason why I don’t buy that it’s an NSA backdoor. The design of this is decades old, when such a side-channel attack would have been entirely impractical, and not only because of timers: even now the data extraction rate is something like 2,000 bytes/sec. If your hardware is 100x slower (give or take), that translates to 20 bytes/sec, which is too little to be of any use, especially when it would take pretty much a full core to do it (and so would have been very noticeable in the 1990s). The NSA would have had to plan well over a decade ahead, and there are simpler ways to get the same result (for a smaller population, though).
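vlade’s point about coarsening timers can be sketched in a few lines (a toy model with invented numbers, not a real browser mitigation): round every timestamp to a coarse granularity and add jitter, and two events whose true latencies differ by far less than the granularity become indistinguishable in a single reading.

```python
import random

def coarse_timer(true_ns, granularity_ns=1000, jitter_ns=500):
    """Simulate a deliberately degraded timer: add random noise,
    then round to the granularity, as browser vendors began doing."""
    noisy = true_ns + random.uniform(-jitter_ns, jitter_ns)
    return round(noisy / granularity_ns) * granularity_ns

random.seed(0)                # fixed seed so the sketch is reproducible
cached, uncached = 40, 200    # ns: trivially separable with a sharp timer
a = coarse_timer(cached)
b = coarse_timer(uncached)
print(a, b)                   # -> 0 0: the 160 ns gap has vanished
```

The attacker can still average over many repeated measurements, which is exactly why vlade says the attack only gets slower and more conspicuous, not impossible.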

      1. Carolinian

        There are browsers where JavaScript can supposedly be turned off, although it would cripple the functionality of most webpages including, no doubt, the one we are reading. Still, I think I would rather take steps to sandbox anything critical from the internet than slow down my computer. With hard drive storage now so cheap I’ve never seen much point, on a personal level, to the Cloud. However it seems many businesses have come to depend on it as well as the web itself. The whole setup may need a re-think rather than this ongoing process of applying band-aids.

        1. JuneZ

          Two reasons for using a Cloud service: backup in case a computer dies, and syncing between computers.

          1. Carolinian

            I have all my critical data on multiple hard drives (which like I say are no longer very expensive and portable versions do not depend on a particular computer).

            It has long been obvious that the business model of Google and other web companies revolves around keeping you on the internet as much a possible. This is why they offer free services like email and voip. And, IMO, this is also the thinking behind “the cloud.” Personally I’m not at all happy about the idea of Google holding my data no matter how much they promise not to abuse the privilege. Indeed one reason the current flaw is so bad is that you have no control over other people’s servers.

            1. Lord Koos

              “The cloud” is just someone else’s computer. I’m also a fan of cheap external drives, which have become very reliable in recent years.

        2. Grumpy Engineer

          @Carolinian:

          You can’t sandbox applications to get around Spectre. That’s the scary thing about the vulnerability. It bypasses all of the normal memory protection functions that exist in the CPU. Sandboxing can be defeated with Spectre. As can per-VM memory restrictions in the cloud.

          1. Carolinian

            By sandbox I mean keep computing devices away from the internet and dodgy attachments and software altogether. Stuxnet was apparently installed from an infected thumb drive.

      2. blennylips

        >Stop providing low-level, high-precision timers

        Indeed. Pale Moon was on it last October:

        Pale Moon isn’t vulnerable

        Pale Moon already set the granularity for the performance timers sufficiently coarse in Oct 2016 when it became clear that this could be used to perform hardware-timing based attacks and fingerprinting.
        Pale Moon also, by design, doesn’t allow buffer memory to be shared between threads in JavaScript, so the “SharedArrayBuffer” attack is not possible.

        Even so, we will be adding some additional defense-in-depth changes to the upcoming version 27.7 to be absolutely sure there is no further room for any of these sorts of hardware-timing based attacks in the future.

        Pale Moon is an Open Source, Goanna-based web browser available for Microsoft Windows and Linux (with other operating systems in development), focusing on efficiency and customization. Make sure to get the most out of your browser!

        Pale Moon offers you a browsing experience in a browser completely built from its own, independently developed source that has been forked off from Firefox/Mozilla code a number of years ago, with carefully selected features and optimizations to improve the browser’s stability and user experience, while offering full customization and a growing collection of extensions and themes to make the browser truly your own.

        I use it and like it.

      1. Clive

        Plain old fashioned honest-to-goodness stupidity. I naïvely put my faith in brand loyalty thinking, after the poor support with my previous Panasonic TV (which was about 10 years old when I changed it), they’d learnt their lesson and mended their ways. Which they hadn’t. All I can say is, belatedly, “never again”.

        1. Chris

          Thank you, Clive, for your insights. Here on an aging Zenbook with an i7 chip running Linux.

          For me, my main fear has been bad actors getting access to my internet banking or my credit cards. So I always say No to ‘do you want … to save your password’. (They’re all written down in a notebook, as my memory can’t store them all efficiently these days.)

          Yet, at the same time, I know that AI is listening to my Samsung S8, just so that it can enhance my user experience and customise my adverts, you know. But I still use the phone, as Apple or Alphabet are the only real choices in today’s world.

          I guess we have allowed corporations into our lives and accept some sort of loss of privacy as a result.

          However, these flaws really are shocking to learn about, and I hope that the financial cost to the chip makers is large and will incentivise the companies to do better in the future.

          I will keep my dumb TV until it breaks, and hope that some of the companies will continue to offer them, but I’m not confident of that from what I see in the stores.

          Great post, Yves.

    2. Scott

      I’m just curious why you are concerned about your Nest thermostat. What is there that is “secret” or so irreplaceable that you care so much about protecting it?

      1. Clive

        Whether I’m in the house or not (presence data). If you were of a mind to burglarise property, knowing if the property is unoccupied would be, ah-hem, a big help to you.

        I could just disable the internet connectivity. But then how far do I trust the UI-based configuration to tell the truth about what it is doing under the hood?

        I could set up traces on my router to monitor the LAN side of my home network to check that it really has disabled the internet connectivity. But then I have to a) do the set up and b) review the logs. More taxes on time.

        And I paid £250 to purchase a device which promised to achieve something I wanted to do. I could simply throw the thing away (after paying some more money again to get an electrician to remove its tentacles in my HVAC system). And I’m supposed to just reconcile myself to writing that off to experience? Like some sort of idiot tax? Or do you think I expect too much to expect it to work reliably, safely and as advertised?

        1. Scott

          There are much easier ways to find out if you are home than hacking a Nest.

          “But then how far do I trust the UI-based configuration to tell the truth about what it is doing under the hood?”

          You could make this statement about any application running on any device, regardless of this hardware flaw. How many apps on your cell phone broadcast location data already? How many access your contact list? Or have access to your camera? Or microphone? Does the average person even know or care? If you truly are that concerned about everything, I’d suggest you unplug.

          “Or do you think I expect too much to expect it to work reliably, safely and as advertised?”

          How can you hold Nest responsible for a hardware flaw they could not possibly have known existed?

          1. Clive

            You asked me what data I valued and I told you. Now you’re making a different argument, that the same data is available by other means. Yes, someone could theoretically watch my house and work out if I was at home or not. But the layout of my street makes that hard to do unnoticed. The retired people and the stay-at-home mums keep a watch out for cars which aren’t owned by the residents, and it’s a no-through road so they stick out like sore thumbs. There’s a high degree of probability someone would notice the registration number.

            So I am not sure how being able to compromise potentially tens of thousands of Nests and, at the miscreants’ leisure, poll a precise listing of which properties are vacant, for how long, at what times of the day, on a regular or irregular schedule, with the data being conveniently mineable on an industrialised basis, is the same thing at all as being able to sit outside my house in full view of everyone else on the street.

            And yes, I could and do make the exact same statement about a myriad of other devices. That is precisely my point. It is all far too ubiquitous and a normalisation of breaching data security. Especially when manufacturers go out of their way to conceal what their products are really doing, even when it is contrary to users’ instructions.

            It sounds like you’re happy with this state of affairs. Good for you. But I’m not. And for Nest to have not issued a single word on whether their products are susceptible to these bugs (one presumes they are, but there’s no information either way from them on the subject) or if / when they’ll be fixed is dismal. When did it become acceptable to scam £250 out of people for a product which has a serious security flaw? And what’s happened to our culture where people then think it’s alright to show up and say, in effect, tough luck matey, that’s just too bad, you’d be better off chucking it in the trash?

    3. Oregoncharles

      Maybe this is a naive question, but: how many are under warranty?

      Sounds like some of them should have been sent back a while ago (I know: tax on time.)

      Apparently, not being able to afford all that stuff has unexpected benefits.

      1. Clive

        It’s certainly taught me some lessons. Never again will I believe in Silicon Valley flimflam. And I will use every means at my disposal, such as Cfdtrade kindly provides, to dis wholeheartedly the woeful state of the industry, its abysmal products and its bald-faced dishonesty. And any unicorns will be shot on sight. It might not do any good, but I’ll certainly feel better.

        As for warranty, here in the EU we get two years pretty much no-questions-asked warranty without limitations (thanks, EU !) but unfortunately the items in question exceed even that fairly generous allowance.

    4. southern appalachian

      This all made me think of Le Carre’s Smiley’s People; Moscow rules, there, Clive. Funny to think we are bugging our own domiciles.

  2. Bill Smith

    The patches are out. Has anyone actually seen a big hit to speed? So far I have just heard about it. Amazon, Google and Microsoft say they see no noticeable difference to customers on their cloud computing platforms. The companies I deal with that do simulation modeling haven’t seen much of a change from weekend testing.

    On the other hand some AMD (Athlon) machines have been bricked by the fix?

    If there was a really big hit to performance, high-frequency traders that use dedicated machines might not bother to install the patch. Or maybe they might be using FPGA- or ASIC-based machines to do the trading? What is the cost these days to create an ASIC per design?

    1. Grumpy Engineer

      I haven’t noticed any slowdown on my Linux system, which has been patched for Meltdown, but not yet Spectre. Of course, I’m also not running programs that make tons of system calls to the kernel at very high frequency. Basically, unless I run my computer as a high-performance network router or a high-performance file server for databases or small files, it should have minimal impact.

      Hmmm… The high-frequency traders probably make LOTS of network calls to the kernel. They’ll likely see an impact from the fixes.

    2. oliverks

      ASICs can be pretty pricey to create. To give some sense of a minor cost, you need to get the masks made. On a small-geometry part, say 28nm, that might set you back $4M.

      Then there is the cost of actually developing the chip. That is much higher than $4M. Really, outside of some more common areas (like cryptocurrency mining), I’ve got to believe most custom trading platforms are going to use FPGAs or GPUs. They are quicker to deploy, and you can ramp with new generations probably quicker than you can deploy your own chips.

      Unless you are a big, big player, it might take 9 months to get into production for your chip once you release the mask set with someone like TSMC.

    3. ChrisFromGeorgia

      Amazon, Google and Microsoft say they see no noticeable difference to customers on their cloud computing platforms

      Well they would say that, they have a lot riding on cloud adoption, i.e. motive to lie.

      Just sayin’

    4. Yves Smith Post author

      I didn’t link to it, but a post on Hacker News has tons of people screaming about how awful the patches are in terms of impact on performance.

    5. vlade

      The impact of the patches (assuming the “patch” is to move the kernel address space out entirely, as with KPTI on Linux) depends on what your applications do – basically, how often they need to call kernel services. Which is pretty much all IO, for example.

      So, if your applications access disk or network a lot (which is for example the case for databases), expect quite a bit of slowdown. The same goes for HFT, as they do a lot of network traffic – although one could argue that it’d actually be pretty hard to get malware onto machines running HFT, as they tend to be locked up and co-located near the exchange, and in general the control of them is very, very paranoid (because any unwanted stuff can generate interrupts and problems at the worst of times, which can cost HFT tons of money).

      If what you do is use your PC as an office machine (Word, a bit of browsing, Excel…), it’s not going to be very noticeable (unless MS screwed up, as they are wont to do).

      For large computational jobs, or for games, the slowdown is likely to be nonexistent, as most of that work does not use the kernel much.

  3. The Rev Kev

    ‘The (financial) industry has made this bad situation worse by regularly under-investing in IT’

    For the love of God, why would they do that? Information technology must serve as the lifeblood of modern finance. You would think that Wall Street would have an office at Intel, Microsoft and other major tech corporations as they underpin everything that they do. From what Clive says, there is the devil to pay – and him out to lunch! Does anybody know if the next generation of chips will eliminate these faults? And how many years it will be before any such chips will be in full production?

    1. Clive

      The management elites in big finance are the archetypal kids in a sweetshop. They’ve been enticed by all the goodies on show (automation; self-service, so long as your customers continue putting up with doing the businesses’ work for them; cashless societies and so on). But they don’t want to see the downsides.

      There’s the price to pay up-front for all those tasty treats. But they also make you fat — adding more and more gloop on top of existing legacy systems means you need a larger and more complex hardware estate to run it all. And a moment on the lips (a one-time cost reduction payoff when you implement the IT-based enhancements) but a lifetime on the hips (those systems need to be kept updated which is an ongoing cost).

      Hence the impetus to just do nothing and keep your fingers crossed. Which works, until (as with these issues) it doesn’t.

      1. FriarTuck

        There’s an old adage in programmer land: once you build something, your boss will likely never allow time for you to go back and build it the “proper way.” As long as it works, there’s no point in going back and re-doing something that is already done.

        In this mindset, new code gets layered on top of old until you end up with what we have now.

        It is understandable that this happens, though, as not all programmers have a complete understanding of all current methodologies, nor do managers (and the money men) see the ROI of investing in systems that are already “performing” as expected – even when new methodologies arrive that do things slightly better. Rebuilding things carries the risk that the result won’t work properly. On top of that, modularization is difficult to put into practice correctly and typically adds overhead to already-stretched resources.

        I can only imagine how it is at the TBTF banks.

        1. The Rev Kev

          Years ago I read of a guy being given a tour of a computer facility. Being shown one set of servers, the guy asked how often they had to be maintained, as they looked a tad old. The site manager replied that those servers were never shut down for maintenance, as there was no guarantee that they would ever boot up again, so it was deemed better to just keep using them until they finally died. True story, that.

            1. Peter Phillips

              Commentators (and Yves on many occasions) have expressed concerns regarding “legacy software” in large financial corporations.

              From first hand experience I can add the following. The “depth” and “breadth” and “impact” of this legacy software is enormous.

              To give an example from my own experience as a Business Project Manager in a TBTF bank: a business unit requested that capacity for an additional 300 financial advisers be added to an existing platform that could only support 1200 users – the user cap was a “feature” of the legacy software.

              The request ran into the following “problems”.

              The platform was built on architecture constructed in the late 70s. The vast majority of bank personnel with an understanding of the original architecture and code were no longer in the bank. (There were some, but finding them and extracting the information was a strenuous undertaking, as the personnel had usually moved onwards and upwards… and did not see providing input on legacy software as germane to their current role.)

              The program – or key outputs from the program – was utilised by 30 separate “business units” within the bank. A major bunfight blew up over the changes required between two business units with responsibility for fraud and security: one dominated by what I would call “old school” cautious operatives, and the other by young whippersnappers with an attitude of “just do it and damn the consequences”. Luckily “old school” won the argument (the fact that there had been a spate of in-house frauds based on the platform’s lack of maintenance and oversight was a very strong counter-argument).

              Changing the program required (a) co-ordination between all 30 business units, (b) re-writes of the program that required understanding of the original code, (c) extensive testing, (d) involvement of personnel who understood the full ramifications of the program, the code and the security implications of a “failed” upgrade, and (e) a “clean-up” project to identify and remove access to the program for personnel who had “inadvertently” remained in the system (some with loan approval levels of tens of millions of dollars)

              In the end, the additional financial advisers did get added. But the business unit that requested it was totally pissed off that it took 9 months from the request being made to the eventual successful implementation. Their mantra was: “How can we make increased profits and keep up with competitors with these sorts of implementation time lags, and why did it cost so much?”

        2. JerryB

          It’s not just computer programming. As someone who has worked in engineering and manufacturing for 20+ years, I can say it’s very common in most industries. It is basically the “throw crap against the wall, most of it sticks, okay we’re good” view. Examples include the space shuttle Challenger disaster, the GM ignition switch issue, the Ford Pinto gas tank fiasco (I threw that in there for all of us over 50), the Samsung Galaxy Note 7 catching fire, and the list could go on and on.

          Good references for what we have lost are the recent book Shop Class as Soulcraft and the older Zen and the Art of Motorcycle Maintenance. In my mind it’s about caring, the idea of craft, attentiveness, virtue, ethics, the loss of values, etc.

          On today’s links page is a link to an article that mentions the ideas of garbage, consumption, and waste. We make poor quality stuff, but it’s cheap, so it does not last long and breaks, and we buy another one, fill the landfills, and corporate America is happy. This is one of the sicknesses of modern (last 50 years) capitalism.

          To be simplistic about it: living in a society with a very high cost of living, if we made things of high quality and built to last, then at some point demand for our products would fall because the products last so long, and we would lose money and jobs. Probably one reason for the concept of planned obsolescence.

          Back in the 50s and 60s, when people could live on a modest salary, a person could make fine furniture of high quality in a small workshop, live on the small profit he or she made, and take pride in the work. But to afford $200,000 houses, $30,000 cars, extreme healthcare costs, and $800 telephones, we can’t do that anymore. Enough – I am starting to sound like cranky Walter, the old man from the Jeff Dunham skits.

          I feel for new engineers who have huge student loans from top engineering schools. Many will never be allowed to design or build something the “proper way”.

        3. Martin Oline

          “Once you build something, your boss will likely never allow time for you to go back and build it in the ‘proper way.’”
          I learned that lesson in hardware too. Build a prototype plastic injection mold and it turns into a production mold, with no provision for quality or life expectancy.

          1. JerryB

            Completely understand. Most of my career in plastics engineering was in injection molding. I worked in engineering and on the manufacturing floor, so I have dealt with the scenario you describe many times, as well as the fallout. Prototype molds are typically made of aluminum, and aluminum is a prototype material that cannot withstand the stresses of running thousands or millions of cycles/parts in an injection molding machine. Over time the aluminum will wear and/or wash out, and QA will say the parts are not to engineering specs/dimensions. Then a higher-up/manager writes an override so the dimensional tolerances get widened, or someone will say “That’s close enough! Ship it!” See the Challenger disaster for the consequences of management overriding engineering! And the beat goes on.

            1. foghorn longhorn

              If you can’t find the time to do it right the first time, where are you going to find the time to do it again?
              – Some old dude somewhere.

              1. Kilgore Trout

                A variation on one of “Murphy’s Laws”, I believe: “There’s never time to do it right, but there’s always time to do it over.” Not so much anymore, evidently.

      2. Lambert Strether

        And if you strip away all the layers of gloop you come to a bunch of COBOL nobody knows how to maintain, and that’s on top of some assembly stuff. And nobody knows what that does…

    2. Pat

      Not having seen the financial industry act in a sane and rational manner regarding anything except political protection in almost two decades, I would say this is SOP.

      If it doesn’t enrich the select few it is not worth doing. Spending money on tech resources adequate for increased demand not only doesn’t enrich them, it costs them.

  4. Lost in OR

    From a non-techie perspective, this appears to be the type of failure expected from any mono-crop. A single bug can wipe out the whole crop and everything dependent on it. Perhaps this will be the potato famine of the digital world.

    1. Fraibert

      Honestly, I would expect at most a very small number of CPU design firms. The designs are complex (and therefore expensive), and due to the lead time in preparing a new design, it’s hard for me to see a new competitor entering the market, unless it was state sponsored.

      So even if Intel was a lot smaller, I suspect it’d still be a big deal.

      1. Lambert Strether

        Yes, and fab plants are enormously expensive.

        However, that we cannot have the kind of computing “we” “want” (big assumptions there) without surrendering ourselves (and our attention) to enormous monopolies isn’t something I like very much.

        Remind me why innovation is good?

  5. Self Affine

    The issue is somewhat more nuanced in financial services. There is still substantial mainframe usage, as well as IBM PowerPC clusters. For example, Bank of NY Mellon runs the world’s largest clearing system on top of HA PowerPC clusters.

    Having said that, anyone that stores data in the cloud is at risk, since cloud computing is based on the cheapest hardware possible (i.e. Intel, AMD), as well as large-scale use of virtual environments and a computing strategy built on stochastic load balancing and scheduling. I.e., cheap is the preferred approach, since failure is part of the process.

    This could get very expensive if all these devices have to be replaced. Also, cloud computing is tuned to provide response time at the expense of absolute consistency. Any performance haircut is a real problem.

    1. Amateur socialist

      Trying to find the link, but I read a report over the weekend stating that IBM’s POWER8 and POWER9 are affected, as is System z (mainframes).

        1. Amateur socialist

          But they often share architectural features, e.g. speculative execution without enforcement of hardware privilege mechanisms. The designers and architects are drawn (and recruited) from a fairly small pool of specialists.

          1. D

            Turns out mainframes already worked around the problem… long, long ago. But IBM made its own chips, so it knew it had to take the hit to validate authorization.

        2. Altandmain

          The underlying architecture is the same. For example, the Haswell Xeons use the same cores as the desktop Haswell CPUs, just with more cores, memory, and PCIe lanes.

          Same with Skylake, although they have introduced AVX-512 and the new mesh topology for the Purley platform.

    2. SpaceMtn

      I believe even IBM processors such as those developed on the Power and Z (mainframe) architectures are affected – essentially any processor that takes advantage of OOO (out-of-order) execution where transient state introduced by speculative execution is not cleaned out of the cache. Normally, when speculative execution is deemed ‘wrong-path’, i.e. architected state should not be updated, certain micro-architectural components such as the re-order buffer, execution pipelines, etc. are flushed out, but not the caches (both I and D). I do not know of one architecture out there at present that performs OOO execution and flushes the D/I-cache when the speculative instructions get flushed due to, say, an exception, e.g. trying to access privileged memory when you are a non-privileged user application. And in these attacks the privileged (secret) data never gets into the cache data cells, but rather is ingeniously embedded in the cache address, where the cache data itself is moot.
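      A toy simulation can make that key point concrete: the secret never appears in any data the attacker reads; it leaks through which cache line the squashed speculative load happens to warm up. This is pure illustrative Python (a real attack uses native code, cache flushes, and hardware timestamp counters), and every name here is invented for the sketch.

```python
# Toy model of the cache side channel: the secret value is never read
# directly by the attacker; it is leaked by WHICH probe-array line the
# speculative access pulls into the "cache".

CACHE_LINE = 64          # bytes per cache line (a typical size)
cached_lines = set()     # our stand-in for the data cache

def speculative_victim(secret_byte, probe_array_base=0):
    # The doomed speculative load: index the probe array by the secret.
    # The architectural result is squashed, but the cache fill survives.
    addr = probe_array_base + secret_byte * CACHE_LINE
    cached_lines.add(addr // CACHE_LINE)

def attacker_recover():
    # Flush+Reload-style probe: the one "hot" line reveals the secret.
    for guess in range(256):
        if guess in cached_lines:   # in reality: a fast (cached) reload
            return guess

cached_lines.clear()
speculative_victim(secret_byte=0x2A)
print(hex(attacker_recover()))  # prints 0x2a, recovered without reading it
```

      The one-cache-line-per-possible-byte-value layout is exactly why the cache data itself is moot: the address, not the contents, carries the information.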

      1. Amateur socialist

        Z series also has hardware encryption of memory which (should) slow down a successful attack. Recent models anyway.

        1. Fraibert

          I don’t believe that’s correct. My understanding is that the kernel data yielded by the “Meltdown” vulnerability is ultimately extracted from the CPU cache. If that’s the case, the fact that RAM is encrypted doesn’t matter.

      2. D

        Don’t think they are; it’s a lot harder to crack them with the same tools that you would use on a PC or Mac, or tablets.

  6. Croatoan

    If it was not clear that we cannot trust banks or tech companies before these flaws were exposed, it should be now. It scares me that 98% of the “save everything in my life on my phone and cloud” crowd have no idea of the horror of this feature-flaw.

    I have been telling a few of my (very wealthy) friends: disable online access to your bank account, have as few online accounts as you can manage, do not autosave your passwords – memorize them, get off all social media, get a new email address on a service that allows disposable aliases, and buy a separate computer to look at porn. And if they can, run Qubes OS on an ARM-based system.

    There is a reason they have been working in silence trying to fix this feature-flaw for the last 9 months.

    I told my bank that I did not want online access to my bank account to mitigate my risk, that I was fine checking by phone or going in person. They did not know how to accommodate me. I went to a bank that did.

    1. vlade

      You can be as safe as you can be on your side if, for your online banking, you boot into a clean Linux distro (i.e. hoping no trojans are already there) from read-only media, with all other media on your PC disabled (in fact, if you want to be paranoid, you’d have a laptop with nothing but the DVD drive and no other media). Technically, even that could be attacked via the side channel of the electronic noise emitted, unless you shield it all, but then…

      Of course, that does not mitigate against problems on the bank’s side, but for those the bank is responsible, not you – and, more importantly, if someone penetrates the bank’s systems, the fact that you had no online banking is irrelevant.

  7. Matthew G. Saroff

    Raspberry Pi is not affected, because its chips do not use speculative execution to get that last iota of performance.

    Security is generally the last consideration of major vendors, but because of the initial focus of the RPi – education and the third world – the bullet missed them.

  8. drfrank

    Let us count the ways in which a 30-50% slowdown in processor speeds might actually be a benefit. Let us admit that peace of mind about security is well worth it. Let us regulate the adequacy of compute power in the TBTF, systemically critical, etc., etc. For the rest of us, we might, like Thoreau, find it more efficient to walk rather than take the train.

    1. Self Affine

      This is more serious if true. It’s always possible to boot from a Linux CD or Flash and recover data, but how does one re-install the environment to a well known state if the checkpoints are missing?

  9. Enrico Malatesta

    I don’t even pretend to understand the technical elements to this CPU problem, but I understand this is just another nail in the coffin of American Empire.

    The commodification, corruption, and crappification of the American Chip Giants gets added to what the rest of the world knows about American Software already.

    Like the weaponized US Dollar, when will the alternatives to American technology products take over, the way the nascent SWIFT-system alternatives bear down on the PetroDollar?

    1. vlade

      ARM is a UK company (well, it was until SoftBank bought it; who knows what it is now). The underlying problem here is not company-specific; it’s really a weakness of a design that was shared across multitudes of silicon. It’s likely the design was actually first done at some university in a theoretical paper quite a while back.

      1. Amateur socialist

        Or there are actually a fairly small number of specialized designers/architects in this area, who have been laid off from/recruited among multiple chip vendors. (Maybe more likely.)

        1. nat

          Naw, it’s just that the exploits take advantage of a concept that became an industry standard a decade or so before people imagined there would be a vulnerability in it.

          In an outrageous, very hypothetical analogy, it would be like ISIS discovering a way to cause cars with anti-lock brakes to explode on command. Anti-lock brakes were a great idea, so pretty much the whole industry adopted them and they became standard – your car company would have been at a huge disadvantage if you weren’t adding anti-lock brakes to your cars, so you pretty much had to in order to stay competitive. And things were great with anti-lock brakes for the longest time… until the (hypothetical) ISIS exploit that no one saw coming, years after it all became an industry standard.

          1. Other JL

            This is my take also. I think the chip designers weren’t devious enough to consider a side channel attack. It’s possible side channel attacks weren’t well-known when out of order execution was designed.

            1. Lambert Strether

              Doesn’t it make sense that “never the twain shall meet” for user space and kernel space should have been architected in from the beginning?

              IOW, KPTI (“Kernel Page Table Isolation”) should have been a design requirement?

              I mean, at some point, somebody in front of a whiteboard with a big diagram on it decided this was a good idea…

              1. vlade

                The need for speed. Switching the kernel in/out is expensive(ish) [to the point that it would not surprise me if it could be timed and used for some other sort of side-channel attack].

                But as I wrote above, a lot of these side-channel attacks rely on high-resolution, low-level timers, which really not that many other applications need. The timer problem is often dealt with by making “the operation take X time no matter what”, but of course, if speed is your goal, you can’t do that. You can obfuscate the timer, though. That does not stop the attacks, but it definitely makes them much, much harder (at current hardware performance; it may not work so well if that improves by another two orders of magnitude).
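                Timer coarsening of this kind (which browser vendors did in fact apply to their high-resolution timers after disclosure) can be sketched in a few lines. The time units and granularity below are made up for illustration:

```python
# Toy sketch of timer coarsening: a high-resolution clock lets an
# attacker distinguish a cached load (fast) from an uncached one
# (slow); a coarsened clock rounds both down to the same tick.

GRANULARITY = 1000  # coarse tick size, in the same (fake) time units

def coarse_time(raw_time):
    """What sandboxed code sees instead of the raw counter."""
    return (raw_time // GRANULARITY) * GRANULARITY

# Two loads starting at t=5000: a cache hit finishes at 5001,
# a miss at 5100. The raw clock tells them apart...
hit_raw, miss_raw = 5001 - 5000, 5100 - 5000
print(hit_raw != miss_raw)  # True

# ...while the coarsened clock cannot.
hit_coarse = coarse_time(5001) - coarse_time(5000)
miss_coarse = coarse_time(5100) - coarse_time(5000)
print(hit_coarse == miss_coarse == 0)  # True
```

                As the comment notes, this is mitigation rather than prevention: an attacker can amplify the signal (e.g. by repeating the measurement many times), which is why obfuscated timers only raise the bar.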

  10. cm

    Lots of Intel sockpuppets on Slashdot since this came out, trying to underplay the damage and drag AMD into it…

      1. Jon S

        +1. A simple need to minimize context switches. This is as much an operating system design issue as it is a processor design issue. OS and chip vendors should have resolved this issue 20 years ago.

      2. Amateur socialist

        Interlocks between privilege checking and speculative execution are nontrivial redesigns. Expect multiple processor releases in 2018 to be pushed out as all the vendors revise designs already under development.

  11. Pavel

    Anyone else wonder what this is going to do to the already energy-intensive bitcoin mining farms? Even say a 10% hit on a PC’s efficiency will make mining the coins that much more expensive. And how secure are the wallets?

    1. vlade

      The second issue is actually more problematic. Basically, given nothing is secure now, it just might be a time of fun for all the bitcoin exchanges.

    2. Croatoan

      They use GPUs (for the most part) to mine, not CPUs.

      The risk is more on the platforms that store the coins.

  12. Steve H.

    > So: subvert these good intentions by presenting the processor with a dishonest and, to insightful human eyes, wildly improbable instruction stream

    I didn’t realize this was a post about politics…

    > 5% to 30% increase in processing time

    Mulling this in our current conditional environment. Not patching gives a competitive advantage, with a downside of potential ruin. But the status quo has been, à la Equifax, to dump the downside downstream while being protected by the PTB. There’s a similarity to the Weinstein trigger: assume a worst case, that all Hollywood agencies are corrupt; then it’s a massive investment in self-protection and leaking info on rivals. If you last two years, your competition is cleared out and you are now a monopoly.

    1. Clive

      Not patching is simply not an option — you’d be out of vendor support if you didn’t deploy what were deemed critical fixes, certainly as far as thereafter trying to hold the vendor to the terms of any licence agreement’s security clauses. And you’d probably void your professional indemnity insurance, or equivalent, in the event of a data breach — most policies require you to take all practicable steps to secure your systems. The regulators would take a dim view of it, too, if they found out (and it is normally a notifiable event that you’re knowingly running systems not protected against known threats).

      It’s one thing to try to use a “we did our best, what more could we have been expected to do?” defence. It’s quite another to say “we knew, but we didn’t care, we just wanted to make as much money as we could and didn’t want to kibosh a chance to get one over on the competition”.

      1. Joel

        Clive, if your take is accurate, why is Equifax still in business? They did not apply patches in a timely manner at all.

          1. Steve H.

            D, that’s a light-bulb going off on the downside of open-source.

            The question remains, Equifax is a regulated company that did not promptly patch. Joel’s question is still valid, and Clive is a trusted source. Is there an inconsistency, or am I missing something?

            1. Clive

              They’ve opened themselves up to all manner of regulatory and consumer-level legal action as a result of their cavalier attitude. Hopefully they’ll become a poster-child of what happens when you mess these things up big time.

              On the flipside, they are called “Too Big To Fail” for a reason — before the GFC it was official UK policy for the regulator of the time not to go after the banks for misconduct, precisely because doing so would destabilise the financial system. So it’s perfectly possible Equifax will escape with the usual beating with a wet noodle.

              1. Steve H.

                Thank you, Clive, ‘consumer-level legal action’ does provide a glint in the kindling for me.

              2. Fraibert

                With two other significant competitors that also basically have the same business model, it’s hard to see why Equifax would get TBTF treatment, unless TBTF is now extended to any big financial-related firm. If it died, Experian and TransUnion would gladly take up any slack.

                1. D

                  Best guess: because there are a total of four, and it’s extremely expensive to get into that business. Lots of finance companies depend on them; just one failing could break lots of companies, and we might see a repeat of 2008.

  13. Bronzed ragon

    Finally a NC topic where I can comment usefully after years as a lurker!

    These flaws are enormous, and do go right to the heart of modern computing environments, but I think there are a few things that need to be kept in mind:
    – The flaws (at least at the moment) allow read only access to memory, not write access (so you can’t change or corrupt data directly using these). Although if someone uses them to get your root / admin credentials they can make merry.

    – In order to be able to exploit the flaws you need to be able to execute arbitrary code locally on the target device. That’s obviously a big deal for general purpose computing devices like laptops and shared use environments like clouds. I’m not sure it is so much of an issue for fixed purpose devices like Nest thermostats and SkyQ boxes (to take Clive’s examples). If you can break those devices to the point where you can execute arbitrary code (and exfiltrate the data) then odds are you already have everything that these exploits can give you, or there are other less complex exploits you can leverage to get it.

    – The extent of the slowdown from the Meltdown patch (which is the one causing the big slowdowns) is apparently terribly workload-specific. If you are doing mainly in-memory number crunching with a bit of light networking, you might not see any difference at all. But if you are hammering the IO subsystem hard, you start to get into the 30%+ realm. It is impossible to say without proper performance testing, which, as others have said, very few people will do. But the early indications I’ve seen online are that home laptop users may well not notice anything has changed, whereas big database or virtualisation servers can be hit badly.

    I think this is a huge issue for the big shared cloud providers (AWS, MSoft and the like). The kind of systems they run are the most likely to be badly hit by the Meltdown patch (Intel are almost ubiquitous in that environment) – which means their resource planning / capital allocations just got knocked out of whack by possibly 30%. It also cuts right to the very heart of their security model – that you don’t have to worry about who is running code next to you because they can’t see your stuff. Except that they can, and you’ve no way of knowing who they are or what they are doing.

    The Cloud is almost entirely built on the assumption and trust that your data can be kept securely separate from everyone else’s data while running on shared hardware. If the perception takes root that that isn’t true the justification for using cloud for anything even slightly sensitive just evaporates.

  14. Fraibert

    Question: Are the overnight banking processes run on machines connected to the internet, and do they run anything but custom software designed by the bank for its own purposes? I’m curious to know, because I don’t see Meltdown as such an immediate threat to the banks if there is no connectivity and the software is custom (i.e., less likely to have deliberate means of exploiting the vulnerability).

    1. D

      Some are, but the nightly batch processes aren’t. I’ve always wondered if the online banking systems we have aren’t just an upgrade of what ATM networks use.

  15. duffolonious

    TL;DR: Meltdown is really nasty, because practically all infrastructure is on compromised Intel CPUs. The performance hit can be worse than advertised. Not sure if PCIDs, one of the better mitigations, are being backported, OTOH.

    On the technical side, lwn.net has some very good articles (the last is paywalled, so Google the headline, or be nice and pay for a month’s subscription – it’s cheap and they are a small shop, like this blog :):
    KAISER: hiding the kernel from user space –
    The current state of kernel page-table isolation –
    Addressing Meltdown and Spectre in the kernel – – the current Meltdown backports of KPTI (Kernel Page Table Isolation) sound rough – need to wait and test more… Again, a 51-patch set.

    The performance hit is on _every_ switch from userspace to kernelspace – this means applications that pull packets off a socket get hit (one figure I saw was 9% for 600Mb/s UDP traffic). A lot of well-implemented production setups will have their Intel NICs (Network Interface Cards) batch the packet processing so as not to issue as many interrupts (probably where we’ll see continued work – and more exploits). The other thing that will hit a lot of people is the filesystem hit – filesystem work often involves a lot of syscalls (hence the “du -s” 50% perf hit mentioned below). The 5-30% is just to give people a ballpark figure, but when you start actually looking into specific workloads, well, there are some nasties.

    Also good to follow is grsecurity (Brad Spengler – whatever you think about him, he toots his own horn a lot, but he gives good insight):

    Performance:

    Note some of the perf numbers (like the “du -s” 50% drop). This may be mitigated on 2013+ Intel processors with PCIDs.
    These were only recently added (and probably not backported, so not in most production infrastructure):
    “The performance concerns that drove the use of a single set of page tables have not gone away, of course. More recent processors offer some help, though, in the form of process-context identifiers (PCIDs). These identifiers tag entries in the TLB; lookups in the TLB will only succeed if the associated PCID matches that of the thread running in the processor at the time. Use of PCIDs eliminates the need to flush the TLB at context switches; that reduces the cost of switching page tables during system calls considerably. Happily, the kernel got support for PCIDs during the 4.14 development cycle.” – having to flush the TLB (translation lookaside buffer) is the main source of the performance hit; avoiding the flush is what keeps switching from kernelspace to userspace and back fast.
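    The PCID mechanism in the quote can be pictured as tagging each TLB entry with an address-space ID, so a context switch just changes the active tag instead of wiping the table. A toy model (not how the silicon actually works; the class and field names are invented for illustration):

```python
# Toy TLB model: without PCIDs, every context switch must flush the
# whole TLB; with PCIDs, entries are tagged and survive the switch.

class TaggedTLB:
    def __init__(self):
        self.entries = {}       # (pcid, virtual_page) -> physical_page
        self.active_pcid = None

    def switch_to(self, pcid):
        # No flush needed: stale entries are simply ignored by the tag.
        self.active_pcid = pcid

    def insert(self, vpage, ppage):
        self.entries[(self.active_pcid, vpage)] = ppage

    def lookup(self, vpage):
        # Hit only if the tag matches the currently running context.
        return self.entries.get((self.active_pcid, vpage))

tlb = TaggedTLB()
tlb.switch_to(pcid=1)                # user process runs
tlb.insert(vpage=0x10, ppage=0xAA)
tlb.switch_to(pcid=0)                # enter the kernel: no TLB flush
tlb.insert(vpage=0x10, ppage=0xBB)   # kernel's own mapping, same vpage
tlb.switch_to(pcid=1)                # back to the user process
print(hex(tlb.lookup(0x10)))         # prints 0xaa: the old entry is still warm
```

    With the tag, the user process’s translations stay resident across the kernel round-trip, which is exactly the flush cost the quote says PCIDs eliminate.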

    On the grsecurity post mentioned above:
    “So if your guest is reporting a PCID-capable CPU, but you have no PCID support, chances are you’re running under a host with an old upstream kernel or distro kernel without backported PCID support.” – this is the kind of thing (and there are so many others) that will hit small-time cloud providers (or people who run private clouds), and mistakes will be made, leaving vulnerabilities open for some time. Honestly, if I could short these companies… This is one of those exploits that is complex and nasty and great for the big players (Amazon, Google) and even mid-sized players (like Rackspace [probably]).

    And more from grsecurity:
    “I’m seeing ~9% on a actual real IPTV workload. No virtualization, Broadwell Xeon D. About 600Mbps UDP in, 600Mbps out, remuxing UDP IPTV transport streams. A syscall and network heavy workload.”

    If this post looks incoherent and a bit frantic, well …

  16. Skip Intro

    This is one of those moments that I really wish the Snowden documents weren’t being suppressed by their new sole owner, The Intercept. Who knows whether the NSA has tracked and exploited this vulnerability, and for how long? They may have even introduced it. Were the documents available, even only to trusted but savvy journos, we might have learned of this much earlier. Conversely, the unlikely but reassuring scenario where the NSA had no clue about this could also be verified, or at least strongly supported by the absence of any info on these exploits in the NSA trove.

    1. Lambert Strether

      Interesting article. I agree that it’s up to every user to accept as much risk as they like, but the article is inaccurate in one regard. Re Spectre:

      For a typical user, the browser presents the highest risk, but we have yet to see proof of concept code that exploits this vulnerability through JavaScript

      Quoting the paper, whose authors include members of :

      In addition to violating process isolation boundaries using native code, Spectre attacks can also be used to violate browser sandboxing, by mounting them via portable JavaScript code. We wrote a JavaScript program that successfully reads data from the address space of the browser process running it.

      Of course, the original statement is qualified with “we have yet to see,” so perhaps their review of the literature doesn’t include this paper.

  17. Eustache De Saint Pierre

    Hmmm, fascinating stuff within which I am almost totally lost.

    It appears that dumb could be the new smart, or vice versa. As for the banks – is this not an extension of moral hazard, in the sense that it will be up to the state to put anything right if something upsets the trough?

    Having already been cleared out by the TBTF, I personally have little to lose, & perhaps real change will only come when, by whatever means, these shysters hurt many others, particularly further up the food chain. After all… God forbid that all those beloved gadgets would turn on their loving owners.

    It all somehow reminds me of this.

  18. Duke of Prunes

    I think there’s some hysteria going on here. To be impacted, you must be tricked into running malicious code on your computer(s) – OK, with the JavaScript exploit, this isn’t that hard – and then someone needs to make sense of whatever secrets it pulls from your computer’s memory (your computer’s memory doesn’t say “top secret password to my bank account, whose URL is xyz:”). Then someone needs to act on these secrets.

    If you use 2-factor authentication (as you should with any account that is important), having the password is not enough. Now the bad actors need to coordinate an attack on your phone as well as your computer, assuming they’ve passed the prior steps. Unless you are a high-value target, I think this is all too much work for most. Yes, patch your systems. Yes, practice good security hygiene. Only connect systems to the internet that need to be connected, etc. But please don’t lose any sleep. As far as IoT devices go, they probably already contain bigger security holes than these.

    The real danger (IMHO) is in cloud computing where virtualization means it is trivial for a bad actor’s program to share a machine with another, and there are targets valuable enough to spend the resources to jump through all the other hoops. Then again, corporate IT is in love with cloud computing so I don’t know that a “little speed bump” like this is going to change anything.

    1. Clive

      I hate to dispel comforting illusions but two factor authentication is no magic bullet and neither is it immune from being defeated — SMS based authentication has already been the subject of malware compromises (not by any means the only example).

      And the financial services industry is just as vulnerable to internal network security breaches through phishing emails as anyone else. Here in the U.K. the main regulator, the FCA (Financial Conduct Authority), has issued a mandatory enforcement (Indian coverage, the best I can find on the subject due to crappified search engines, and way better than any domestic reporting I could unearth) that regulated entities must take proactive measures to address their vulnerabilities to phishing email hacks. Even simple preventative measures are not in place in all institutions, and adherence to rules saying that employees should not open suspicious email attachments is astonishingly patchy.

      The US is little better, with the likes of the DHS being exposed to very simple email spoofing techniques, so actually detecting a phishing email is not as easy as it sounds.

      Needless to say, once a payload on a phishing email gets activated behind the corporate firewall, the entire internal network is vulnerable. Just because they’re big doesn’t mean they are not dumb or badly managed.
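
      To make concrete why a second factor is no magic bullet: a TOTP/HOTP code is just a deterministic function of a shared secret, so malware that exfiltrates the secret (or intercepts the delivered code) defeats it. A minimal sketch of RFC 4226 HOTP, the algorithm behind most authenticator-app codes (TOTP simply derives the counter from the clock), using the RFC’s own published test secret:

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # RFC 4226: the one-time code is HMAC-SHA1(secret, counter), truncated.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 4226 Appendix D test secret. An authenticator app stores exactly this
# kind of shared secret -- which is why stealing it defeats the second factor.
print(hotp(b"12345678901234567890", 0))  # prints "755224"
```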

      1. Duke of Prunes

        I did not mean to minimize these problems. They are an enormous problem for the IT industry, but I just don’t see them being that big a problem for the everyday home user. If a well-funded, dedicated hacker wants to get into your computer, they will (and there are easier means than leveraging these bugs – such means existed long before and will exist long after).

        However, why would a dedicated hacker want to get into my computer? I may be naive, but I don’t see myself as a target (and I doubt many readers here are either). I’m not wealthy. I don’t possess state secrets. I don’t think anyone would view me as a threat to the government. Yves and company may have to worry since they’re publicly bucking the status quo…

        1. Clive

          No one is, unfortunately, immune to identity theft no matter how little they have. A wrecked credit history will make your day-to-day existence harder forever regardless of your station in life. Someone who does a little but doesn’t do that much by way of online activity is, perhaps surprisingly, a fairly desirable target — you have some online footprint, but you don’t do enough activity often enough to necessarily spot something that is not quite right.

          As you say, if it’s a costly exercise it’ll not be a widespread problem. But a bug which potentially affects the vast majority of internet connected devices opens up a whole new world of possibilities for hackers to develop industrialised mass compromises — a big target pool warrants a big investment in attempts at attacking it.

        2. Yves Smith Post author

          Someone just hacked my e-mail account over the weekend and sent tons of spam before my mail host locked it. There are plenty of bad actors with too much time on their hands.

          1. The Rev Kev

            Commiserations. And some of those bad actors aren’t even being paid to do this sort of stuff.

        3. vlade

          Your computer time is valuable precisely because it’s invisible-in-the-mass. Hosts of zombies that carry out all sorts of attacks are sold for a pretty penny. That makes any and all computers valuable, if they can be had for a marginal cost. And since the cost of a marginal attack is almost zero…

          Or to put it another way: most hackers don’t want to get into _your_ computer. They want to get into many different computers, of which yours is likely to be “a computer” – so that they can then carry out an attack on a specific target sometime in the future.

          There is security in the mob – but only if the mob is sufficiently heterogeneous (which makes the marginal cost of attacking a random computer there possibly quite high). If ALL the machines in the mob have the same security flaw, then there is no security in the mob, because the marginal cost of attack is close to nil.

  19. Synoia

    One must consider whether a “server” is used by the “public” or is internal, and whose code is allowed to run on the “server”.

    For example, bank back-end processing is generally performed on in-house equipment, with in-house code, run by employees. There are no “injected scripts.”

    Thus, for those systems, the risk is small.

    For a bank’s web server, where the general public accesses the server for transactions, the risks are higher. I’d also point out that the web, with its 3rd-party scripts – yes, Google and Facebook, I’m looking at you – is particularly vulnerable to malicious actions.

    I’d bet that our beloved TLAs are all over these vulnerabilities and have been exploiting them for some months, with their code carried by (cough) large corporate good citizens.

    Risk assessment: (abbreviated)

    1. In-house “batch” servers running overnight in-house programs – risk low or very low.
    2. In-house database servers, with in-house SQL scripts, for customer access – risk medium to low.
    3. In-house transaction servers, with in-house programs (e.g. IBM CICS) – risk medium to low.
    4. In-house web servers, with 3rd-party script libraries – risk high to medium.
    5. Social media sites – you are already public; all privacy was lost years ago.

    Malicious intent is not enough. One has to be able to inject scripts, which requires a platform willing to accept the scripts (sound of stable door slamming closed), access to load the scripts, and permissions to run the scripts.

    Steps – I’d do these in parallel as much as possible.
    0. Tell the boss you need a large budget.
    1. Assess the risk to each “server” and its software.
    2. Eliminate 3rd party “free” code.
    3. Audit the installed software for provenance.
    4. Quarantine vulnerable servers.
    5. Replace vulnerable servers (probably with AMD).
    6. Make warranty claims on all servers under warranty.
    7. Repeat to the boss you need a larger budget.
    8. Evaluate protections under Virtual Machines (I read mixed signals here).
    9. Emphasize that you need an even larger budget.
    10. Talk to your insurer.
    11. Talk to your Legal department.
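
    The risk triage above can be sketched as a simple lookup table to drive patch ordering. The numeric scale (1 = very low … 5 = highest risk) is my own illustrative assumption; the categories and their relative ratings come from the list above:

```python
# Synoia's server categories with illustrative numeric risk scores
# (1 = very low ... 5 = highest; the scale itself is an assumption).
RISK = {
    "in-house overnight batch":       1,  # risk low or very low
    "in-house database servers":      2,  # risk medium to low
    "in-house transaction servers":   2,  # risk medium to low (e.g. IBM CICS)
    "web servers with 3rd-party JS":  4,  # risk high to medium
    "social media":                   5,  # privacy already lost
}

def patch_priority(systems):
    # Order systems so the riskiest are patched (or quarantined) first.
    # Python's sort is stable, so ties keep their original order.
    return sorted(systems, key=RISK.get, reverse=True)

print(patch_priority(list(RISK)))
# ['social media', 'web servers with 3rd-party JS', ...]
```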

    1. Fraibert

      Thank you for the analysis.

      That was my feeling regarding overnight “batch” processing – that if the performance penalty proved too significant in the requisite workloads, it wasn’t even clear it would be that important to implement the patch for those specific machines.

      Security is, after all, a trade off. Sometimes you will accept a little more risk for more performance.

    2. Oregoncharles

      ” Make warranty claims on all servers under warranty.”
      Yes, I wondered about that one. Do you suppose there are enough warranties out there to bankrupt their issuers? Like Intel?

  20. Mikerw

    My view is that one needs to look at this from a slightly different perspective, that of the engineer. Previously when we built things we understood what we were doing, could design and test sub components, understand them in the real world and if necessary repair them.

    Everything in today’s economy is built on IT that is incomprehensible to the human brain. Coders, programmers, hardware designers, etc. cannot and do not understand their creations. As a result, there has been a proliferation of system failures (airlines, airports, credit agencies, and on and on). This will only continue and get worse.

    If you haven’t read this excellent piece from the Atlantic on The Coming Software Apocalypse I encourage you to.

    1. JerryB

      In the late 90’s I was interviewing for a job as a plastics engineer. The company was in the midst of a crisis with one of its products due to a throw-crap-against-the-wall mentality (a ready-fire-aim culture), and one theme I stressed throughout the interview was that if you cannot understand something, then you cannot control it. Hence I got the job.

      I used to have a book called Developing Managerial Skills in Scientists and Engineers, and some of its criticisms of engineers were paralysis by analysis, too much information, and a bias towards objective measurement. Yes, guilty as charged, but those “flaws” are necessary if you want to understand something deeply.

    2. Bobby Gladd

      Thanks for that Atlantic article cite. Yeah. Spot-on. I used to write code in a radiation lab in Oak Ridge in the 80’s. We had to have a “Software QA” documentation set finished, reviewed, and signed off prior to putting any of my programs (nowadays called “apps,” I guess) into production.

      Our joke was “the [logic] flowchart comes last.”

  21. Knifecatcher

    A couple of additional points that I haven’t seen covered above:

    – With a high percentage of corporate computing workloads now running either on cloud infrastructure from Amazon, Google, Microsoft, etc. or in true SaaS products, there is a direct impact on operating expenses from even a small performance hit. Even a few percent will amount to a massive amount of money being transferred from businesses to Intel to buy more capacity, even though it’s Intel’s fault. I believe that’s called adding insult to injury.

    – As I understand it, unpatched systems run the risk of an attacker escaping the virtual machine “sandbox” and getting into the host system. This is catastrophic for the virtualized hardware security model.
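
    On the opex point, the arithmetic is straightforward: doing the same work at (1 − slowdown) speed requires 1 / (1 − slowdown) times the capacity. A back-of-envelope sketch, where the monthly spend figure is purely hypothetical and the 5%–30% range is the one quoted in the post:

```python
def extra_cost(monthly_spend, slowdown):
    # To do the same work at (1 - slowdown) speed you need
    # 1 / (1 - slowdown) times the capacity, hence that much more spend.
    return monthly_spend * (1 / (1 - slowdown) - 1)

monthly_spend = 100_000  # USD: a hypothetical pre-patch cloud bill

for slowdown in (0.05, 0.30):  # the 5%-30% range quoted in the post
    print(f"{slowdown:.0%} slowdown -> "
          f"${extra_cost(monthly_spend, slowdown):,.0f}/month extra")
# 5% slowdown -> $5,263/month extra
# 30% slowdown -> $42,857/month extra
```

    Note the asymmetry: a 30% slowdown forces nearly 43% more capacity, not 30%, because the shortfall compounds against the reduced speed.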

    1. Fraibert

      I imagine that for the “Cloud” providers there will be financial mitigation of the issue. It’s easy for those firms to demonstrate financial damage from their mitigation of these vulnerabilities, so I imagine Intel at a minimum will probably end up rebating a portion of the cost of the new Intel CPUs in any additional hardware required. If Intel plays hardball (probably inadvisable given AMD is waiting in the wings with actually competitive hardware), the providers will definitely evaluate AMD and, in the worst case, could file suit.

      I also suspect that Microsoft and other OS vendors, as well as computer hardware vendors who now have to prepare additional firmware updates, might have a decent case for compensation. Intel’s own error is now resulting in these third parties having to expend significant time (and therefore money).

  22. Tom_Doak

    There was an article in Links just yesterday about the Michigan food stamp program going down at peak hours for several days in a row. I wonder if they’ve applied the software patch and the slowdown is causing them to overload at peak usage times?

  23. D

    (Note, not the same “D” as the D above, I’m the D unable to nest comments due to my Dial Up (can’t afford DSL in my neck of the woods, even if I wanted it) and IE8 browser)

    One might hope that this would bring about effective penalties and regulation – which should have been instituted decades ago – regarding technologies the public ends up ultimately having no power of refusing which can make their lives a living hell.

    The US is particularly notorious for this utter lack of regulation and meaningful penalties (likely since the US Government has been entrenched in those technologies since their inception; and because California is the 5th or 6th largest economy in the world, with an inverse rate of obscene POVERTY). For all of its faults, at least the EU has far more stringent privacy laws.

  24. Oregoncharles

    I’m going to thank you ahead of time for giving it the ol’ college try, because the explanations I’ve seen so far left me with little more understanding than “Oh, (family blog).” NC is a rare trusted source!

  25. Oregoncharles

    I wonder whether Yves remembers a book called “Fleecing the Lambs”? (Christopher Elias, 1971 – holy family blog, Amazon has it. No info beyond the date, though.) I believe it would have come out about the time she started working on Wall St. It was an expose, as the title implies, from the period when computers were first taking over business.

    And it had a hair-raising story, something of a warning: In the enthusiasm over the “paperless office”, one brokerage (nameless, IIRC) was convinced to put everything in electronics (tape, then) and, ahem, throw away the paper.

    You can predict the ending: the computer crashed, of course, and the brokerage, expensively, ceased to exist.

  26. JBird

    Blast. Yet another something for a paper. I can see this being a marvelous political and social tipping point, or adding to it. People do not understand how much a regime’s authority and power, be it social, economic, legal, or political, depends on its legitimacy vis-à-vis the people under it. Once the people no longer accept its legitimacy, the only real way the system maintains its power is by using force to make the population obey.

    Okay, what does this have to do with political and economic upheavals? The system works because people just accept that it works. One of the main reasons the murder rate is so high in some areas is that the police have no legitimacy. If the police cannot be depended on not to arbitrarily hassle, beat, steal from, arrest, frame, or even kill, and cannot be depended on to investigate whatever crime they were called about, who will call them? If the entire legal system is not legitimate, how does one get justice? One can cite poverty, guns, culture, but the lack of the rule of law, of an actual justice system, is the reason. At least a reason. People protect themselves and get justice themselves rather than through the police and the courts.

    The ever-growing list of examples of economic, legal, and political regimes not working in very big ways is chipping away at the American people’s internal acceptance of the whole system:

    Civil asset forfeitures, TSA, the 2008 bailout of TBTF banks, the massive illegal foreclosures with the courts’ acceptance, NSA, the militarization of the police, the approximately eleven hundred deaths and who knows how many injuries each year caused by the police (this does not include jails or prisons), the Fergusonization of many police departments for tax collection, the 2017 elections, Wells Fargo, Puerto Rico, Congress, and now our entire business and banking systems.

    Just as people are withdrawing legitimacy from the legal system, they will start withdrawing it from the whole economic system. I am on a cellphone and this is a comments section, not a paper or my blog, so I will conclude with this question. If the increase in violence and the growing fear of an authoritarian police state are caused by a growing lack of trust in, and respect for, the police, what do you think happens when people at most economic, political, and social positions also develop an increasing distrust of the economic system itself? When even the courts start to?

  27. D

    Re another D’s response to me:

    Not likely to happen with Trump in charge

    I want to clarify I was not continuing my above comment, another “D” posted that response to my comment, with no clarification that they were not me (why?????, that’s really not okay).

    Further, I despise Trump (particularly as a female), but this has been an utterly BIPARTISAN betrayal of the public, for decades; it is not likely to happen anytime soon, period.

  28. Hacker

    Now for an unpopular opinion: This won’t affect banking servers much at all.

    Why not? Because exploiting the Meltdown and Spectre flaws requires running untrusted code on the system. Banks aren’t in the habit of running untrusted code on their servers. At least, the bank where I am a Security Architect isn’t.

    Now endpoints, that’s another deal, as the average browser loads a couple of dozen untrusted scripts on each web page. You might call endpoints desktops, laptops, smartphones, etc. – things the users run and mess up.

    So my employer the Bank will be patching endpoints as fast as possible. The Bank will patch servers during their next scheduled patch cycle, if it looks like there are no negatives. A 30% performance hit would get a patch postponed real quick. That’s real business impact vs a risk that someone might get some malicious code running on the server.

    Other banks might do things differently. Not all of them have Security Architects with 20 years of infosec experience who can stand in the face of a news induced panic and rationalize the risk.

    1. Clive

      Please read my earlier comment — banks may not be in the habit of running untrusted code, but it does happen sufficiently often for the regulators to be taking concrete steps to address the problem.

      And if you really believe that big organisations with sensitive data are always willing to put data security before cost savings, there are plenty of examples to say otherwise. If the U.K.’s NHS got caught out and pretty much incapacitated itself for a couple of days as a target of a not-exactly-sophisticated hack, then any organisation can end up in the same position.

  29. none

    Meltdown is a more serious problem than Spectre, but it’s a straightforward CPU bug that can be fixed in the next chip (no serious rearchitecture needed, just fix the mistake) and it can be remediated by software patches at the cost of the slowdown that you mention.

    Spectre is more inherent in how CPUs work, so it’s harder to fix, but also much harder to exploit, and at the end of the day it’s really nothing new. The best fix is non-technical: just don’t let your adversaries run code on your computer! Unfortunately the geniuses in charge of the current Worldwide Web have seen fit to inflict JavaScript on all of us, which puts that code on all our machines.

    Anyway, lots of us with smaller web sites (including maybe NC) run our servers in what are called virtual machines (VMs), which are simulated small computers where the simulation runs on a big computer. A midsized dedicated computer costs $50 to $100 a month, so server operators will get a pile of these, run a few dozen VMs on each one, and make money by renting out the VMs for perhaps $5 each depending on their specs. The VMs are vulnerable to Spectre because an attacker can run code in them that basically breaks the barrier of the simulation to get info about what’s happening in someone else’s VM.

    In the case of a bank, this is a big nothing. They can just spend the extra money on a dedicated computer instead of VMs on a shared one, which they’re probably doing anyway. Someone running a small web site on a $5/month VM could also get a dedicated machine (it’s not THAT expensive), but it’s probably not worth it to them since they have no big secrets to protect. The biggest pain is for the VM hosts like Amazon, since when the VMs do start attacking each other, with the obvious hilarity ensuing, the VM host is to some extent on the hook. So they’re all doing semi-panicked software updates right now.
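
    The hosting economics sketched above can be put in rough numbers. All figures are the comment’s own ballparks, with “a few dozen” taken as 24 VMs per machine:

```python
# Rough economics of shared VM hosting, per the figures in the comment:
# $50-$100/month per physical machine, ~$5/month per rented VM.
machine_cost = 75        # USD/month: midpoint of the quoted $50-$100 range
vms_per_machine = 24     # "a few dozen" VMs packed onto one physical box
vm_price = 5             # USD/month per rented VM

revenue = vms_per_machine * vm_price
margin = revenue - machine_cost
print(f"revenue ${revenue}/mo, margin ${margin}/mo per machine")
# The margin exists only because tenants share hardware --
# exactly the co-tenancy that Spectre turns into an attack surface.
```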

    But, Spectre is what’s known as a timing attack, similar to “wallbanging” (a term describing a similar operation when the parties are cooperating rather than attacking each other, like prisoners communicating between jail cells by banging on their cell walls). Versions of it have been known since the 1990s, and highly sensitive security code since then has been written with mitigations in place as a matter of course. That code usually isn’t intensely performance sensitive, so the developers just accept the slowdown from the mitigation. Spectre tells us that a wider variety of software has to deal with the issue. At least at a technical level though, it’s not a big shock.
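
    The “wallbanging” idea can be demonstrated with an ordinary timing covert channel: the two parties share no data at all, only time. A toy Python sketch, with delays wildly exaggerated for reliability (a cache-based channel works the same way at nanosecond scale):

```python
import time

THRESHOLD = 0.02   # seconds; anything slower than this decodes as a 1

def send_bit(bit):
    # The sender "bangs on the wall" by burning time for a 1, not for a 0.
    if bit:
        time.sleep(0.05)

def transmit(bit):
    # The receiver recovers the bit purely by measuring elapsed time --
    # no shared memory, no message passing, just a clock.
    start = time.perf_counter()
    send_bit(bit)
    return int(time.perf_counter() - start > THRESHOLD)

message = [1, 0, 1, 1, 0]
print([transmit(b) for b in message])  # prints [1, 0, 1, 1, 0]
```

    In Spectre, the role of `send_bit` is played by speculative memory accesses that leave cache lines hot or cold, and the role of the clock is played by a high-resolution timer, which is why browser vendors responded by coarsening JavaScript timers.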

    Tl;dr: don’t panic unless you’re a VM host, in which case you’re already putting in extra hours dealing with this.

    1. vlade

      Most of the big corporates, banks included, use VM hosts extensively (run in-house), especially to run their employees’ virtual desktops. Employees browse stuff, which often includes third-party JS of dubious quality. Even worse, I know of not a few TBTFs that still run IE as their PRIMARY browser!
      You can say that it’s just employees, no critical systems…

      “Just” I’d say then?

      Employees are the people who have the access (and passwords) to all those nice and “safe” systems.

      The only “safe” system is one in a locked, shielded room, with no access for anyone or anything. Which is a pretty useless system. The attack is NOT on the end system; it’s to get data from the employees – either directly, such as passwords etc., or indirectly, to get enough data to do a good job of using humans in the organization to get what you want. How many employees will say “no” when their boss asks them to provide some information via email or OCS, even though they technically should not?

    2. flora

      Thanks. This link has a short useful list of what the Meltdown/Spectre dangers ‘are not’. Defining the negative as well as the positive of a problem – is/is not – as they say.

    3. Self Affine

      A straightforward CPU bug is a rather optimistic way to describe this. Changing OOO (out-of-order) execution structure and logic will require a chip re-architecture, new layouts, testing, etc., not to speak of fabrication and distribution.

      Personally I think this is a very “big deal”, and software patches to cloud computing infrastructure that slow down performance by, say, 30% are not exactly something Amazon is looking forward to.

      Here is a link to a recent Bloomberg article which lays out the history so far.

      Note the last paragraph – CPU hacking may well be the new frontier.

      1. none

        Changing OOO execution structure and logic will require a chip re-architecture, new layouts, testing, etc., not to speak of fabrication and distribution.

        That’s only needed for Spectre. Meltdown is much easier to fix.

    4. Self Affine

      But not to worry – pretty soon we will have quantum computers and all this fussing about cyber security in old silicon computers will become irrelevant and things will really get interesting.

  30. Foppe

    Note that the design choice that allows Meltdown (and secondarily Spectre-like attacks) to be a thing in CPUs (mainly Intel’s – AMD’s approach in Ryzen is architecturally superior, partly because AMD chose to include transparent encryption of VM memory in its Pro lineup, which mitigates the seriousness) was already flagged as undesirable in 1995: https://s.semanticscholar.org/2209/42809262c17b6631c0f6536c91aaf7756857 (the paper in turn quotes a 1992 source).
    So it wasn’t as if this only became dangerous in hindsight, with the advent of New Tech.

  31. Jean

    I’m late to this and am a technovirgin without any hardware or software knowledge.

    How am I affected if
    I have no smart devices of any kind; only my MacBook is connected to a router,
    I never store anything in the cloud,
    Memorize passwords and type them every time without using keychain on a Mac,
    I never do online banking,
    Do not use Gmail,
    Use a voice only flip phone,
    Do not use Java?

      1. John Zelnicker

        @Yves – I don’t know which one (or both) is used on the MacBook, but Java and JavaScript are two different programming languages.

    1. flora

      Here’s more info. Answers a lot of questions. Just scroll down to the Questions & Answers section.

      1. flora

        adding: in the above-linked article, in the line
        “our proof-of-concept exploit can read the memory content of your computer”,
        “memory” refers to the computer’s working memory (RAM), including protected kernel memory. It does not refer to hard disk drive data storage (sometimes also loosely called memory).

        1. flora

          adding: Apple has just released a security update to mitigate the Spectre CPU flaw. They released a security update in December to mitigate Meltdown. Quoting “The Hitchhiker’s Guide to the Galaxy”: Don’t Panic. This is another computer security problem that needs attention and mitigation, like other viruses, malware, and exploits.

  32. alex mal

    In case you don’t use cloud services and have old version of Windows. You can use Adcontroller.co to protect your PC from Meltdown.

Comments are closed.