The feeling is less like an ending than just another starting point.
― Chuck Palahniuk
This is another reply to my own critique of our code modernization program, which is really creating a whole new generation of legacy codes. Here I propose a solution: acceptance of the inevitable.
The path toward better performance in modeling and simulation has focused to an unhealthy degree on hardware for the past quarter century. This focus has been driven to a very large degree by a reliance on Moore’s law. It is a rather pathetic risk-avoidance strategy. Moore’s law is not really a law, but rather an empirical observation that computing power (or other equivalent measures) doubles roughly every 18 months. This observation has held since 1965, although its demise is now rapidly upon us. The reality is that Moore’s law has held for far longer than anyone could reasonably have expected, and its demise is probably overdue.
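To make the compounding concrete, here is a minimal back-of-the-envelope sketch in Python using only the figure quoted above (a doubling every 18 months); the specific time spans are illustrative assumptions, not measurements.

```python
# A back-of-the-envelope sketch of the growth rate quoted above:
# computing power doubling roughly every 18 months.
def moores_law_factor(years, doubling_months=18):
    """Cumulative performance factor after `years` of steady doubling."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings

# Fifty years of doubling (roughly 1965 onward) versus one forgone decade.
print(f"50 years: ~{moores_law_factor(50):.1e}x")
print(f"10 years: ~{moores_law_factor(10):.0f}x")
```

Even a single decade of that compounding is a factor of roughly a hundred, which is the scale of gain (or loss) the rest of this post is concerned with.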
For microprocessors, Moore’s law died around 2007, and now lives on only through increasing reliance upon parallelism (i.e., lots of processors). Getting the performance out of such massive parallelism is enormously difficult and practically unachievable for an increasingly large span of methods, procedures and algorithms. Our failure to get the advertised performance out of computers has been a large and growing problem, systematically ignored for the same quarter century. It is papered over by measuring performance with a benchmark that bears virtually no resemblance to any useful application and is basically immaterial to real progress.
We can be almost certain that Moore’s law will be completely and unequivocally dead by 2020. For most of us its death has already been a fact of life for nearly a decade. Its death during the last decade was actually a good thing, and benefited the computing industry. Vendors stopped trying to sell us new computers every year and unleashed the immense power of mobile computing and unparalleled connectivity. Could it actually be a good thing for scientific computing? Could its demise unleash innovation and positive change that we are currently denying ourselves?
Yes!
What if the death of Moore’s law is actually an opportunity and not a problem? What if accepting the death of Moore’s law is a virtue that the high performance computing community is denying itself? What might be gained through embracing this reality?
Each of these questions can be answered in a deeply affirmative way, but doing so requires a rather complete and well-structured alteration of our current path. The opportunity relies upon the recognition that activities in modeling and simulation that have been under-emphasized for decades provide even greater benefits than advances in hardware. During the quarter century of reliance on hardware for advancing modeling and simulation, we have failed to reap the benefits of these other activities. These neglected activities are modeling, solution methods and algorithms. Each of them entails far higher risk than relying upon hardware, but also produces far greater benefits when breakthroughs are made. I’m a believer in humanity’s capacity for creation and the inevitability of progress if we remove the artificial barriers to creation we have placed upon ourselves.
The reasons for not emphasizing these other opportunities can be chalked up to the tendency to avoid high-risk work in favor of low-risk work with seeming guarantees. Such guarantees come from Moore’s law, hence the systematic over-reliance on its returns. Advances in models, methods and algorithms tend to be extremely episodic and require many outright failures, with the breakthroughs happening in unpredictable ways. The breakthroughs also depend upon creative, innovative work, which is difficult to manage with the project management techniques so popular today. So under the spell of Moore’s law, why take the chance of having to explain research failures when you can bet on a sure thing?
If we look at the lost opportunities and performance from our failure to invest in these areas, we can easily see how much has been sacrificed. In a nutshell we have (in all probability) lost as much performance as Moore’s law could have given us, and likely more. If we acknowledge that Moore’s law’s gains are not actually seen in real applications, we have lost even more. Our lack of appetite for failure and unpredictable research outcomes is costing us a huge amount of capability. More troublingly, the outcomes from research in each of these areas can enable things that are completely different in character from the legacy applications. There are wonderful things we can’t do today because of a lack of courage and vision. Instead, the hardware path we are on almost assures that applications only evolve in incremental, non-revolutionary ways.
If we finally accept that Moore’s law is dead, can we finally stop shooting ourselves in the foot? Can we start to support these activities with a proven track record and undeniable benefits? If we do not, the attempts to utilize the hardware to produce exascale computers will siphon all the energy from the system. The starvation of effort toward models, methods and algorithms will only grow. The gulf between what we might have produced and what we actually have will only grow larger and more extreme. This is an archetypal opportunity cost. Moreover, we need to admit to ourselves that for any application we really care about, the term exascale is complete bullshit. If the press release says we have an exascale computer, for an actual application of real interest we might have one-hundredth of that speed. Even this might be optimistic.
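As a rough illustration of the one-hundredth claim (the fraction below is simply the figure from the paragraph above, not a measurement of any real machine):

```python
# Advertised peak versus plausibly sustained performance for a real
# application, using the "one-hundredth" fraction from the text.
peak_flops = 1.0e18          # an "exascale" press-release number
sustained_fraction = 0.01    # one-hundredth, possibly optimistic
print(f"sustained: {peak_flops * sustained_fraction:.0e} FLOP/s")  # ~1e16, i.e. 10 petaflops
```

In other words, the marketing headline and the number an application actually sees differ by a couple of orders of magnitude.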
To make matters worse, the imbalance in research and effort is poisoning the future. The hardware path has laid waste to an entire generation of modeling and simulation scientists who might have conducted groundbreaking work. Instead they have been marshaled onto the foolhardy hardware path. The only reasons for choosing hardware are the belief that it is easier to fund and yields guaranteed (lower) returns. Our management needs to stop embracing this low bar and begin to practice effective management of the future. The depth of my worry is that we do not have the capacity to manage a creative environment in a manner that accepts the failure necessary for success. We have become completely addicted to the “easy” progress of Moore’s law, and forgotten how to do hard work.
Perhaps the real end of Moore’s law can provide the scientific computing community with a much needed crisis. I believe that the way out of the crisis is simple and easy. The path has been trod before, but we have lost the ability to walk it. We need to allow, if not encourage, risk and failure in forgotten areas of endeavor. We need to balance the work appropriately, recognizing the value of each activity, and focus on the work where we have need and opportunity. A quarter century of reduced effort in models, methods and algorithms probably means that renewed effort will yield an avalanche of breakthroughs, if only that effort is marshaled. To unleash this creative tsunami, the steady march toward exascale needs to halt, because it swallows all effort. We are creating computers so completely ill-suited to scientific computation that simply using them is catastrophic to every other aspect of computing.
Another beneficial aspect of changing our perspective and accepting Moore’s law’s death concerns how we compute. Just as the death of Moore’s law in processors unleashed computing into the mobile market and an era of massive innovation in the use of computing, the same can happen for scientific computing. We only need to change our perspective. Today the way we use computing is stuck in the past. The way computing is managed is stuck in the past. Scientists and engineers still solve their problems as they did 25 years ago, and the exascale focus does little to push things forward. We still exist in the mainframe era, with binding policies from IT departments choking innovation and progress.
A pessimist sees the difficulty in every opportunity; an optimist sees the opportunity in every difficulty.
― Winston S. Churchill
Earlier this week I gave a talk on modernizing codes to a rather large audience at work. The abstract for the talk was based on the very first draft of my Christmas blog post. It was pointed and fiery enough to almost guarantee me a great audience. I can only hope that the talk didn’t disappoint. A valid critique of the talk was my general lack of solutions to the problems I articulate. I countered that the solutions are dramatically more controversial than the statement of the problems. Nonetheless the critique is valid and I will attempt to provide the start of a response here.
ASC is a prime example of failing to label and learn from failures. As a result we make the same mistakes over and over again. We are currently doing this in ASC in the march toward exascale. The national exascale initiative is doing the same thing. This tendency to relabel failures as success was the topic of my recent “bullshit” post. We need failure to be seen as such so that we do better things in the place of repeating our mistakes. Today the mistakes simply go unacknowledged and become the foundation of a lie. Such lies then become the truth and we lose all contact with reality. Loss of contact with reality is the hallmark of today’s programs.
One of the serious problems with the science programs is their capacity to masquerade as applied programs. For example, ASC is sold as an applied program doing stockpile stewardship. It is not. It is a computer hardware program. Ditto for the exascale initiative, which is also just a computing hardware program. Science and the stockpile stewardship mission are mere afterthoughts. The hardware focus persists regardless of any actual need for the hardware. Other activities that do not resonate with the hardware focus simply get shortchanged, even when they have the greatest leverage in the real world.
The beginning of the year is a prime time for such a discussion. Too often the question of importance is simply ignored in favor of simple and thoughtless subservience to others’ judgment. If I listen to my training at work, the guidance is simple: do what you’re paid to do as directed by your customer. This is an ethos of obedience, and your own judgment and prioritization are not really a guide. This is a rather depressing state of affairs for someone trained to do independent research; let someone else decide for you what is important and what is a priority. This seems to be what the government wants to do to the Labs: destroy them as independent entities and replace them with an obedient workforce doing whatever it is directed to do.
An important, but depressing observation about my work currently is that I do what I am supposed to be doing, but it isn’t what is important to be doing. Instead of some degree of autonomy and judgment being regularly exercised in my choice of daily activities, I do what I’m supposed to do. Part of the current milieu at work is the concept of accountability to customers. If a customer pays you to do something, you’re supposed to do it. Even if the thing you’re being paid for is a complete waste of time. The truth is most of what we are tasked to do at the Labs these days is wasteful and nigh on useless. It’s the rule of the day, so we just chug along doing our useless work, collecting a regular paycheck, and following the rules.
The real world is important. Things in the real world are important. This is an important thing to keep in mind at all times with modeling and simulation. We are supposed to be modeling the real world for the purpose of solving real world problems. Too often in the programs I work on, this seemingly obvious maxim gets lost. Sometimes it is completely absent from the modeling and simulation narrative. Its absence is palpable in today’s efforts in high performance computing. All the energy is going into producing the “fastest” computers. The United States must have the fastest computer in the World, and if it doesn’t, it is a calamity. That this fastest computer will allow us to simulate reality better is treated as a foregone conclusion.
This is a rather faulty assumption. Not just a little bit faulty, but deeply and completely flawed. It holds only under a set of conditions that are increasingly under threat. If the model of reality is flawed, no computer, no matter how fast, can rescue the model. A whole host of other activities can provide an equal or greater impact on the effectiveness of modeling than a faster computer. Moreover, the title of fastest computer has less and less to do with having the fastest simulation. The benchmark that crowns the fastest computer is becoming ever less relevant to measuring the speed of simulations. So in summary, efforts geared toward the fastest computer are not very important. Nonetheless they are the priority for my customer.
The reason for the lack of progress is simple: high performance computing is still acting as if it were in the mainframe era. We still have the same sort of painful IT departments that typified that era. High performance computing is more Mad Men than Star Trek. The control of computing resources, the policy-based use and the culture of specialization all contribute to this community-wide failing. We still rely upon centralized, massive computing resources as the principal delivery mechanism. Instead we should focus energy on getting computing for modeling and simulation to run seamlessly from the mobile computer all the way to the supercomputer without all the barriers we impose on ourselves. We are doing far too little to simply put this capability at our collective fingertips, and without that access high performance computing will continue to be a niche activity and will not fulfill its potential.
It goes without saying that we want to have modern things. A modern car is generally better functionally than its predecessors. Classic cars primarily provide the benefit of nostalgia rather than performance, safety or functionality. Modern things are even more favored in computing. We see computers, cell phones and tablets replaced on an approximately annual basis with hardware having far greater capability. Software (or apps) gets replaced even more frequently. Research programs are supposed to be the epitome of modernity and pave the road to the future. In high-end computing no program has applied more resources (i.e., lots of money!! $$) to scientific computing than the DoE’s Advanced Simulation & Computing (ASC) program and its predecessor ASCI. This program is part of a broader set of science campaigns to support the USA’s nuclear weapons stockpile in the absence of full-scale testing. It is referred to as “science-based” stockpile stewardship, and is generally a commendable idea. It’s been going on for nearly 25 years now, and perhaps the time is ripe (over-ripe?) for assessing our progress.
My judgment is that ASC has succeeded in replacing the old generation of legacy codes with a new generation of legacy codes. This is now marketed to the unwitting masses as “preserving the code base.” This is a terrible reason to spend a lot of money, and it fails to recognize the real role of a code, which is to encode the expertise and knowledge of scientists into a working recipe. Legacy codes make this an intellectually empty exercise, rendering the intellect of current scientists subservient to the past. The codes of today have the same intellectual core as the codes of a quarter century ago. The lack of progress in developing new ideas into working code is palpable and hangs heavy around the entire modeling and simulation program like a noose.
A modern version of a legacy code is not modernization; it is surrender. We have surrendered to fear and risk aversion. We have surrendered to the belief that we already know enough. We have surrendered to the belief that the current scientists aren’t good enough to create something better than what already exists. As I will outline, this modernization is largely an effort to avoid engaging in risky or innovative work. It places all of the innovation in an inevitable change in computing platforms. The complexity of these new platforms makes programming so difficult that it swallows all the effort that could be going into more useful endeavors.
Is a code modern if it executes on the newest computing platforms? Is a code modern if it is implemented using a new computer language? Is a code modern if it utilizes new software libraries in its construction and execution? Is a code modern if it has embedded uncertainty quantification? Is a code modern if it does not solve today’s problems? Is a code modern if it uses methods developed decades ago? Is a code modern if it runs on my iPhone?
The conventional wisdom would have us believe that we are presently modernizing our codes in preparation for the next generation of supercomputers. This is certainly a positive take on the current efforts in code development, but not a terribly accurate characterization either. The modernization program is certainly limited to the aspects of the code that have the least impact on the results, and avoids modernizing the aspects of a code most responsible for its utility. To understand this rather bold statement requires a detailed explanation.
So this is where we are at, stuck in the past, trapped by our own cowardice and lack of imagination. Instead of simply creating modern codes, we should be creating the codes of the future, applications for tomorrow. We should be trailblazers, but this requires risk and taking bold chances. Our current system cannot tolerate risk because it entails the distinct chance of failure, or unintended consequence. If we had a functioning research program there is the distinct chance that we would create something unintended and unplanned. It would be wonderful and disruptive in a wonderful way, but it would require the sort of courage that is in woefully short supply today. Instead we want to have certain outcomes and control, which means that our chance of discovering anything unintended disappears from the realm of the possible.
The core of the issue is the difficulty of using the next generation of computers. These machines are monstrous in character. They raise parallelism to a level that makes the implementation of codes incredibly difficult. We are already in a massive deficit in terms of performance on computers. For the last 25 years we have steadily lost ground in accessing the potential performance of our computers. Our lack of evolution in algorithms and methods plays a clear role here. By choosing to follow our legacy code path we are locked into methods and algorithms that are suboptimal in terms of performance, accuracy and utility on modern and future computing architectures. The amount of technical debt is mounting and is magnified by acute technical inflation.
acting as the rich (aside from Trump) would like. The rich may get the last laugh and put one of their own puppets in control of the mob. Overall it is simply another element pointing toward a chaotic outcome.
The sexual freedom that provided one of the key elements of the 1960s is undergoing a new renaissance revolving around mobile computing. Mobile apps and new forms of dating like Tinder and OKCupid are the most innocuous expressions of these changes in the relationship landscape. The forces of conservatism are horrified and fearful of these changes, and are mobilizing to push back against them. All of this is ripe for deep and wrenching conflict on new social battlefields. Just as the forces of liberalization utilize modern technology, the forces arrayed against change organize themselves with modern technology too (ISIS is an example!).


On the other hand, the downsides to the current milieu with regard to science are stark and obvious. Science is in complete disarray and we are to blame.
As the political climate has become a festering and poisonous environment, the conduct of science has taken on a similar air. Part of this issue comes from the public funding of science, which is an absolute necessity considering the utter wasting away of corporate science. The imposition of short-term thinking as the principal organizing principle for industry has obliterated basic research at the corporate level. Unfortunately the corporate mantra has been adopted by the government as a way of improving the quality of governance. This has simply multiplied the harm done by short-term thinking and its ability to ravage any long-term accomplishments.
The truth is that short-term thinking is bad for business, bad for science, bad for careers, bad for everyone except those at the top. It only benefits activities like finance as a way of powering their moneymaking shell game. It benefits the very rich and their rent-seeking behavior. We get sold a complete line of bullshit in calling all this finance “investment” when it is simply moving money around to make more money. The middle class has been sold on this strategy through their retirement accounts, but this is the equivalent of a bribe, as it only buys their acceptance of a system that harms the middle class in the long term. The only interest that truly benefits is the status quo that gets locked in by this focus. For science, short-term thinking is simply destructive and a recipe for mediocrity and lack of progress. Again, the problems only appear in the long term, when the lack of scientific progress will harm our descendants.
For me, I see my life and career playing out in this shitty time. I could have been part of something bold and wonderful, with science providing the backbone of an important societal endeavor. Instead we are destroying research institutions through utter neglect. We are wasting careers and lives by pushing down risk due to irrational fears. All of this is done in service to short-term thinking that benefits the rich and powerful. No one else benefits from short-term thinking, no one; it is simply a vehicle for empowering the status quo and assuring that the identities of those on top do not change. It is the straitjacket through which a lack of social mobility arises.
All one has to do is look at the political environment today. Americans, and perhaps the entire Western world, have never been more fearful and afraid. At the same time the World has never been more peaceful. Our society and so-called leaders amp up every little fear and problem into a giant boogeyman when the actual reality is completely the opposite. We have never been safer than we are today. This is true even with the orgy of gun violence in the United States. The powers that be use the tiny danger of terrorism to drive the forces of the status quo while utterly ignoring larger dangers (like firearms). The truth is that we have never had less to fear. Yet fear is the driving force politically and is used to sell fear-spewing candidates to a weak, shivering populace. Fear sells products to people, whether it is drugs, cars, media content, guns, or almost anything else. Our mass media is simply a tool of its corporate overlords with the ultimate design of enslaving us to the status quo. Our society runs on fear, and fear is used to enslave the populace. Science is simply an innocent bystander slain in the societal drive-by.
This thinking infests the approach to high performance computing, where the tried and true path of relying upon Moore’s law has powered modest improvements for decades. At the same time we have avoided progress in other areas of computing with greater benefits, but also greater risks and higher probabilities of failure. A prime example of this disservice can be found in numerical linear algebra, where the solution of sparse systems has stagnated for decades. All the effort has been consumed by moving the existing methods to new computing platforms, with little or nothing spent on improving the methods themselves. Orders of magnitude in performance improvements have been sacrificed to fear and risk avoidance. Let’s not forget that the principal beneficiaries of the current supercomputing program are computer vendors who will receive great sums of money to produce the monstrous computers being contemplated. These horrible machines will sap the resources left over to actually use them and simply compound the stasis already evident in the field.
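To make concrete what “existing methods” means here, the sketch below is the textbook conjugate gradient iteration, a method dating to the early 1950s (Hestenes and Stiefel) that remains the workhorse for sparse symmetric systems. It is an illustrative implementation on a toy problem, not a description of any particular production solver.

```python
# Textbook (unpreconditioned) conjugate gradient for a sparse SPD system Ax = b.
import numpy as np
from scipy.sparse import diags

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x              # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Classic test problem: the 1D Laplacian, a sparse SPD matrix.
n = 100
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = conjugate_gradient(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))
```

Porting a loop like this to a GPU or to a million MPI ranks changes the implementation, but not the mathematics; that is the distinction this paragraph is drawing.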
In a very clear way we are taking enormous risks with our future. We are accumulating massive long-term risk by consistently taking the low-risk, short-term path. This is clearest when examining the state of careers in science. Once we allowed people to aggressively pursue research with a high chance of failure, but the possibility of massive payoffs. Today we timidly pursue incremental progress, yet view even this as enormously risky. The greatest risk of the continued pursuit of a hardware-driven path in high performance computing is the destruction of promising scientific careers, and the destruction of a balanced program for advancing modeling and simulation. Make no mistake, the current approach to modeling and simulation is completely unbalanced. It is timid. It lacks creativity. It lacks vitality. It is not science-based; it is fear-based. It is the result of an unhealthy fixation on short-term thinking about progress.
The uncertainty in our knowledge is far larger than we are willing to admit. The sort of uncertainty that is present cannot be meaningfully addressed through a focus on more computing hardware (its assessment could be helped, but not solved). This uncertainty can only be addressed through a systematic effort to improve models and engage in broad experimental and observational science and engineering. If we work hard to actively understand reality better, the knobs can be reduced or even removed as knowledge grows. This sort of work is exactly the sort of risky thing our current research culture eschews as a matter of course.
Calibration knobs are often introduced to account for multiple effects (turbulence, mixing, plasma physics, radiation and numerical resolution are common). In these cases the knobs may account for a multitude of poorly understood physical phenomena, mystery physics and lack of numerical resolution. This creates a massive opportunity for severe cognitive dissonance, which is reflected in an over-confidence in simulation quality. Scientists using simulations like to give those funding their work greater confidence than the work should carry, because the actual uncertainty would trouble those paying for it. Moreover, the range of validity of such calculations is not well understood or explicitly stated. One of the key consequences of the calibration being necessary is that the calculation cannot reflect a real-world situation without it. The model simply misses key aspects of reality without the knobs (climate modeling is an essential example).
In essence there are two uncertainties that matter: the calibrated uncertainty, where data keeps the model reasonable, and the actual predictive uncertainty, which is much larger and reflects the lack of knowledge that makes the calibration necessary in the first place. Another aspect of modeling in the calibrated setting is the proper use of the model for computing quantities. If the quantity coming from the simulation can be tied to the data used for calibration, the calibrated uncertainty is a reasonable thing to use. If the quantity from the simulation is inferred and not directly calibrated, the larger uncertainty is appropriate. Thus we see that the calibrated model has intrinsic limitations, and cannot be used for predictions that go beyond the data’s physical implications. For example, climate modeling is certainly reasonable for examining the mean temperature of the Earth. On the other hand, the data associated with extreme weather events like flooding rains are not calibrated, and the uncertainty regarding their prediction under climate change is more problematic.
In modeling and simulation nothing comes for free. If a model needs to be calibrated to accurately simulate a system, the modeling is limited in an essential way. The limitations in the model are uncertainties about aspects of the system tied to the modeling inadequacies. Any predictions of the details associated with these aspects of the model are intrinsically uncertain. The key is the acknowledgement of the limitations associated with calibration. Calibration is needed to deal with uncertainty about modeling, and this lack of knowledge limits the applicability of the simulation. One applies the model cautiously, if one is being rational. Unfortunately, people are not rational and tend to put far too much faith in these calibrated models. In these cases they engage in wishful thinking, and fail to account for the uncertainty in applying the simulations for prediction.
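Here is a toy sketch of the distinction, with entirely synthetic data invented for illustration: a deliberately inadequate (linear) model is calibrated against a hypothetical “truth” over a narrow range, and its small in-sample uncertainty says little about its error once we ask it about conditions outside the calibration data.

```python
# Toy illustration of calibrated versus predictive uncertainty.
# All data and models here are made up purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

def truth(x):
    # Hypothetical "reality": mildly nonlinear, unknown to the model.
    return 1.0 + 0.5 * x + 0.05 * x**2

# Calibration data only covers x in [0, 5].
x_cal = np.linspace(0.0, 5.0, 50)
y_cal = truth(x_cal) + rng.normal(0.0, 0.05, x_cal.size)

# The model being calibrated is linear: a stand-in for "missing physics".
model = np.poly1d(np.polyfit(x_cal, y_cal, deg=1))

calibrated_rms = np.sqrt(np.mean((model(x_cal) - y_cal) ** 2))
x_new = 12.0                 # a prediction well outside the data
predictive_err = abs(model(x_new) - truth(x_new))

print(f"calibrated (in-sample) RMS error: {calibrated_rms:.3f}")
print(f"error at x = {x_new} (extrapolation): {predictive_err:.3f}")
```

The in-sample error stays near the noise level, while the extrapolated error is many times larger; that gap is the difference between the two uncertainties described above.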

national security issue to sell it without a scintilla of comprehension of what makes these computers useful in the first place. The speed of the computer is one of the least important aspects of the real transformative power of supercomputing, and the most distant from its capacity to influence the real world.
immense and truly transformative. We have deep scientific, engineering and societal questions that will be unanswered, or answered poorly due to our risk aversion. For example, how does climate change impact the prevalence of extreme weather events? Our existing models can only infer this rather than simulate it directly. Other questions related to material failure, extremes of response for engineered systems, and numerous scientific challenges will remain beyond our collective grasp. All of this

