The Regularized Singularity

~ The Eyes of a citizen; the voice of the silent

Monthly Archives: January 2017

Let’s Discover Some Magic

27 Friday Jan 2017

Posted by Bill Rider in Uncategorized

≈ Leave a comment

 

 

Magic’s just science that we don’t understand yet.

― Arthur C. Clarke

Scientific discovery and wonder can often be viewed as magic. Some things we can do with our knowledge of the universe can seem magical until you understand them. We commonly use technology that would seem magical to people only a few decades ago. Our ability to discover, innovate and build upon our knowledge creates the opportunity for better, happier and longer, healthier lives for humanity. In many ways technology is the most human of endeavors and sets us apart from the animal kingdom through its ability to harness, control and shape our World to our benefit. Scientific knowledge and discovery are the foundation of all technology, and from this foundation we can produce magical results. I’m increasingly aware of our tendency to shy away from doing the very work that yields magic.

The world is full of magic things, patiently waiting for our senses to grow sharper.

― W.B. Yeats

Today I’ll talk about a couple things: the magical power of models, methods, and algorithms, and what it takes to create the magic.

What do I mean by magic with abstract things like models, methods and algorithms in the first place? As I mentioned, these things are all basically ideas, and these ideas take shape through mathematics and gain power through computational simulation. Ultimately the combination of mathematical structure and computer code can produce almost magical capabilities in understanding and explaining the World around us, allowing us to tame reality in new, innovative ways. One little correction is immediately in order; models themselves can be useful without computers. Simple models can be solved via analytical means, and these solutions provided classical physics with many breakthroughs in the era before computers. Computers offered the ability to expand the scope of these solutions to far more difficult and general models of reality.

This then takes us to the magic from methods and algorithms, which are similar, but differ in character. The method is the means of taking a model and solving it. The method determines whether a model can be solved, the nature of that solution, and the basic efficiency of the solution. Ultimately the methods power what is possible to achieve with computers. All our modeling and simulation codes depend upon these methods for their core abilities. Without innovative methods to solve models, the computers would be far less powerful for science. Many great methods have been devised over the past few decades, and advances with methods open the door to new models, or simply greater accuracy or efficiency in their solution. Some methods are magical in their ability to open new models to solution and, with those, new perspectives on our reality.

Any sufficiently advanced technology is indistinguishable from magic.

― Arthur C. Clarke

Despite their centrality and essential nature in scientific computing, emphasis and focus on method creation is waning badly. Research into new or better methods has little priority today and the simple transfer (or porting) of existing methods onto new computers is the preferred choice. The blunt truth is that porting a method onto a new computer will produce progress, but no magic. The magic of methods can be more than simply enabling; the best methods bridge a divide between modeling and methods by containing elements of physical modeling. The key example of this character is shock capturing. Shock capturing magically created the ability to solve discontinuous problems in a general way, and paved the way for many if not most of our general application codes.

The magic isn’t limited to just making solutions possible; the means of making the solution possible also added important physical modeling to the equations. The core methodology used for shock capturing is the addition of subgrid dissipative physics (i.e., artificial viscosity). The foundation of shock capturing led directly to large eddy simulation and the ability to simulate turbulence. Improved shock capturing developed in the 1970s and 1980s created implicit large eddy simulation. To many this seemed completely magical; the modeling simply came for free. In reality this magic was predictable. The basic method of shock capturing was the same as the basic subgrid modeling in LES. Finding out that improved shock capturing gives automatic LES modeling is actually quite logical. In essence the connection is due to the model leaving key physics out of the equations. Nature doesn’t allow this to go unpunished.
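For concreteness, here is the classical von Neumann-Richtmyer artificial viscosity, the archetype of this subgrid dissipative physics (a standard textbook form, not something spelled out in the post itself). It adds a pressure-like term

q = c_q \rho (\Delta x)^2 \left( \frac{\partial u}{\partial x} \right)^2 when \partial u / \partial x < 0 (compression), and q = 0 otherwise,

where c_q is an order-one constant. The (\Delta x)^2, gradient-squared scaling is exactly the kind of grid-dependent dissipation that later reappears as subgrid modeling in large eddy simulation.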

One of the aspects of modern science is that understanding the magic provides a proverbial two-edged sword. In the understanding we lose the magic, but open the door for new, more miraculous capabilities. For implicit LES we have begun to unveil the secrets of its seemingly magical success. The core of the success is simply the same as original shock capturing: producing viable solutions on finite grids requires physically relevant solutions, which by definition means a dissipative (vanishing viscosity) solution. The new improved shock capturing methods extended the basic ability to solve problems. If one were cognizant of the connection between LES and shock capturing, the magic of implicit LES should have been foreseen.

The real key is the movement to physically admissible second-order accurate methods. Before the advent of modern shock capturing methods, guarantees of physical admissibility were limited to first-order accuracy. The first-order accuracy brings with it large numerical errors that look just like physical viscosity, which renders all solutions effectively laminar in character. This intrinsic laminar character disappears with second-order accuracy. The trick is that the classical second-order results are oscillatory and prone to being unphysical. Modern shock capturing methods solve this issue and make solutions realizable. It turns out that the fundamental and leading truncation error in a second-order finite volume method produces the same form of dissipation as many models produce in the limit of vanishing viscosity. In other words, the second-order solutions match the asymptotic structure of the solutions to the inviscid equations in a deep manner. This structural matching is the basis of the seemingly magic ability of second-order methods to produce convincingly turbulent calculations.
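To make the structural matching concrete with standard LES background (my comparison, not a claim drawn from the original analysis): the classical Smagorinsky subgrid model supplies an eddy viscosity

\nu_t = (C_s \Delta)^2 \left| \bar{S} \right|,

where \Delta is the grid spacing, \left| \bar{S} \right| is the magnitude of the resolved strain rate, and C_s is a modeling constant. Its dissipation therefore scales like the square of the mesh spacing times a local velocity gradient, which is the same form the leading truncation error of a nonlinear second-order finite volume method takes; that shared form is the structural matching described above.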

This magic is the tip of the iceberg, and science is about understanding the magic as a route to even greater wizardry. One of the great tragedies of the modern age is the disconnect between these magical results and what we are allowed to do.

We can also get magical results from algorithms. The algorithms are important mathematical tools that enable methods to work. In some cases algorithmic limitations significantly limit the efficiency of numerical methods. One of the clearest areas of algorithmic magic is numerical linear algebra. Breakthroughs in numerical linear algebra have produced immense and enabling capabilities for methods. If the linear algebra is inefficient it can limit the capacity for solving problems. Conversely a breakthrough in linear algebra scaling (like multigrid) can allow solutions with a speed, magnitude and efficiency that seems positively magical in nature.
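To illustrate that scaling magic, here is a minimal sketch in Python (my own toy example, not production code) of a geometric multigrid V-cycle for the 1D Poisson problem -u'' = f with zero boundary values. The point is the behavior: the residual falls by a roughly constant factor per cycle regardless of the grid size, which is what gives multigrid its nearly optimal O(N) character.

    import numpy as np

    def jacobi(u, f, h, sweeps, omega=2.0 / 3.0):
        """Weighted-Jacobi smoothing for -u'' = f on a uniform grid (interior points only)."""
        for _ in range(sweeps):
            u[1:-1] = (1 - omega) * u[1:-1] + omega * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1])
        return u

    def residual(u, f, h):
        r = np.zeros_like(u)
        r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
        return r

    def restrict(r):
        """Full-weighting restriction onto a grid with half as many intervals."""
        rc = np.zeros((r.size - 1) // 2 + 1)
        rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
        return rc

    def prolong(ec):
        """Linear interpolation of a coarse-grid correction back to the fine grid."""
        ef = np.zeros(2 * (ec.size - 1) + 1)
        ef[0::2] = ec
        ef[1::2] = 0.5 * (ec[:-1] + ec[1:])
        return ef

    def v_cycle(u, f, h):
        if u.size <= 3:  # coarsest grid: solve the single interior unknown directly
            u[1] = 0.5 * (h * h * f[1] + u[0] + u[2])
            return u
        u = jacobi(u, f, h, sweeps=2)                    # pre-smooth
        ec = v_cycle(np.zeros((u.size - 1) // 2 + 1), restrict(residual(u, f, h)), 2 * h)
        u += prolong(ec)                                 # coarse-grid correction
        return jacobi(u, f, h, sweeps=2)                 # post-smooth

    n = 256
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    f = np.pi**2 * np.sin(np.pi * x)                     # exact solution is sin(pi x)
    u = np.zeros(n + 1)
    for cycle in range(8):
        u = v_cycle(u, f, h)
        print(f"cycle {cycle + 1}: max residual = {np.max(np.abs(residual(u, f, h))):.2e}")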

Numerous algorithms have been developed that endow codes with seemingly magical abilities. A recent breakthrough to which magical power can be ascribed is compressed sensing. This methodology has seeded a number of related algorithmic capabilities that defy normal rules. The biggest element of compressed sensing is its appetite for sparsity, and sparsity drives good scaling properties. We see a magical ability to recover clear images from noisy signals. The key to all of this capability is the marriage of deep mathematical theory to applied mathematical practice and algorithmic implementation. We should want as much of this sort of magical capability as possible. These algorithms do seemingly impossible things, providing new, unforeseen abilities.
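As a toy sketch of that seemingly impossible recovery (my illustration, using the standard iterative soft-thresholding algorithm rather than any particular library): an 8-sparse signal of length 400 is reconstructed from only 100 noisy random measurements by exploiting sparsity through an L1 penalty. The dimensions and penalty value are arbitrary choices for the demonstration.

    import numpy as np

    def ista(A, y, lam, iters=500):
        """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
        L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth term's gradient
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            z = x - A.T @ (A @ x - y) / L        # gradient step on the least-squares term
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold promotes sparsity
        return x

    rng = np.random.default_rng(0)
    n, m, k = 400, 100, 8                        # signal length, measurements, nonzeros
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m) # random sensing matrix
    y = A @ x_true + 0.01 * rng.standard_normal(m)
    x_hat = ista(A, y, lam=0.02)
    print("relative recovery error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))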

In the republic of mediocrity, genius is dangerous.

― Robert G. Ingersoll

We don’t do much of this these days. Model, method and algorithm advancement is difficult and risky. Unfortunately our modern management programs don’t do difficult things well anymore. We do risky things even less. A risky, failure-prone research program is unlikely to be funded. Our management is incapable of taking risks, and progress in all of these areas is very risky. We must be able to absorb many failures in attempting to achieve breakthroughs. Without accepting and managing through these failures, the breakthroughs will not occur. If the breakthroughs occur massive benefits will arise, but these benefits, while doubtless, are hard to estimate. We are living in the lunacy of the scheduled breakthrough. Our insistence on seeking success without the possibility of failure is nothing but unbridled bullshit and a recipe for systematic failure.

There is always danger for those who are afraid.

― George Bernard Shaw

The truly unfortunate aspect of today’s world is the systematic lack of trust in people, expertise, institutions and facts in general. These trustworthiness crises are getting worse, not better, and may be approaching a critical fracture. The end result of the lack of trust is a lack of effective execution of work because people’s hands are tied. The level of control placed on how work is executed is incompatible with serendipitous breakthroughs and the adaptation of complex efforts. Instead we tend to have highly controlled and scripted work lacking any innovation and discovery. In other words the control and lack of trust conspire to remove magic as a potential result. Over the years this leads to a lessening of the wonderful things we can accomplish.

If we expect to continue discovering wonderful things we need to change how we manage our programs. We need to start trusting people, expertise, and institutions again. Trust is a wonderful thing. Trust is an empowering thing. Trust drives greater efficiency and allows people to learn and adapt. If we trust people they will discover serendipitous results. Most discoveries are not completely new ideas. A much more common occurrence is for old mature ideas to combine into entirely new ideas. This is a common source of magical and new capabilities. Currently the controls placed on work driven by lack of trust remove most of the potential for a marriage of new ideas. The new ideas simply never meet and never have a chance to become something new and amazing. We need to give trust and relinquish some control if we want great things to happen.

The problem with releasing control and giving trust is the acceptance of risk. Anything new, wonderful, even magical will also entail great risk of failure. If one desires the magic, one must also accept the possibility of failure. The two things are intrinsically linked and utterly dependent. Without risks the reward will not materialize. The ability to take large risks, highly prone to failure is necessary to expose discoveries. The magic is out there waiting to be uncovered by those with the courage to take the risks.

Breaking Bad: Priorities, Intentions and Responsibility in High Performance Computing

20 Friday Jan 2017

Posted by Bill Rider in Uncategorized

≈ 1 Comment

 

Action expresses priorities.

― Mahatma Gandhi

High performance computing (HPC) is an essential enabling technology for supporting many scientific, military and industrial activities. It plays an important role in national defense, economics, cyber-everything and serves as a measure of national competence. So it is important. Being the top nation in high performance computers is an important benchmark in defining national power. It does not measure overall success or competence, but rather a component of those things. Success and competence in high performance computing depend on a number of things including physics modeling and experimentation, applied mathematics, many types of engineering including software engineering, and computer hardware. In the list of these things computing hardware is among the least important aspects of competence. It is generally enabling for everything else, but hardly defines competence. In other words, hardware is necessary and far from sufficient.

Claiming that you are what you are not will obscure the strengths you do have while destroying your credibility.

― Tom Hayes

Being a necessity for competence, hardware must receive some support for national success. Being insufficient, it cannot be the only thing supported, and it is not the determining factor for HPC supremacy. In other words, we could have the very best hardware and still be inferior to the competition. Indeed success in HPC has always been a multidisciplinary endeavor, predicated on a high degree of balance across the spectrum of activities needed for competence. If one examines the state of affairs in HPC, we can easily see that all this experience and previous success has been ignored and forgotten. Instead of following a path blazed by previous funding success (i.e., ASCI), we have chosen a road to success solely focused on computing hardware and its direct implications. Worse, the lessons of the past are plain, yet ignored by the current management. Excellence in other areas has been eschewed, left to trail in the hardware’s wake. The danger in the current approach is dampening progress in a host of essential disciplines in favor of a success completely dependent on hardware.

The fundamental cause of the trouble is that in the modern world the stupid are cocksure while the intelligent are full of doubt.

― Bertrand Russell

Unfortunately, the situation is far worse than this. If computer hardware were in an era where huge advances in performance were primed to take place, the focus might be forgivable. Instead we are in an era where advances in hardware are incredibly strained. It is easy to see that huge advances in hardware are grinding to a halt, at least relative to the past half century. The focus of the current programs, the “exascale” initiatives, is actually the opposite of what opportunity would dictate. We are attempting to continue growth in computing power at tremendous cost where the very physics of computers is working against us. The focus on hardware is actually completely illogical; if opportunity were the guide hardware would be a side-show instead of the main event. The core of the problem is the complete addiction of the field to Moore’s law for approximately 50 years, and like all addicts, kicking the habit is hard. In a sense under Moore’s law computer performance skyrocketed for free, and people are not ready to see it go.

Most of us spend too much time on what is urgent and not enough time on what is important.

― Stephen R. Covey

Moore’s law is dead and HPC is suffering from the effects of withdrawal. Instead of accepting the death of Moore’s law and shifting the focus to other areas for advancements, we are holding onto it like a junkie’s last fix. In other words, the current programs in HPC are putting an immense amount of focus and resources into keeping Moore’s law alive. It is not unlike the sort of heroic measures taken to extend the life of a terminal patient. Much like the terminal patient whose death is only delayed by the heroic measures, the quality of life is usually terrible. In the same way the performance of HPC is more zombie-like than robust. Achieving the performance comes at the cost of utility and general ease of use for the computers. Moreover the nature of the hardware inhibits advances in other areas due to its difficulty of use. This goes above and beyond the vast resource sink the hardware is.

The core truth of HPC is that we’ve been losing this war for twenty years, and the current effort is simply the final apocalyptic battle in a war that is about to end. The bottom line is that we are in a terrible place where all progress is threatened by supporting a dying trend that has benefitted HPC for decades.

I work on this program and quietly make all these points. They fall on deaf ears because the people committed to hardware dominate the national and international conversations. Hardware is an easier sell to the political class who are not sophisticated enough to smell the bullshit they are being fed. Hardware has worked to get funding before, so we go back to the well. Hardware advances are easy to understand and sell politically. The more naïve and superficial the argument, the better fit it is for our increasingly elite-unfriendly body politic. All the other things needed for HPC competence and advances are supported largely by pro bono work. They are simply added effort that comes down to doing the right thing. There is a rub that puts all this good faith effort at risk. The balance and all the other work are not a priority or emphasis of the program. Generally they are not important or measured in the success of the program, or defined in the tasking from the funding agencies.

We live in an era where we are driven to be unwaveringly compliant to rules and regulations. In other words you work on what you’re paid to work on, and you’re paid to complete the tasks spelled out in the work orders. As a result all of the things you do out of good faith and responsibility can be viewed as violating these rules. Success might depend on doing all of these unfunded and unstated things, but the defined success from the work contracts is missing these elements. As a result the things that need to be done do not get done. More often than not, you receive little credit or personal success from pursuing the right thing. You do not get management or institutional support either. Expecting these unprioritized, unintentional things to happen is simply magical thinking.

We have the situation where the priorities of the program are arrayed toward success in a single area that puts other areas needed for success at risk. Management then asks people to do good faith pro bono work to make up the difference. This good faith work violates the letter of the law in compliance toward contracted work. There appears to be no intention of supporting all of the other disciplines needed for success. We rely upon people’s sense of responsibility for closing this gap even as we cultivate a sense of duty that pushes against doing any extra work. In addition, the hardware focus levies an immense tax on all other work because the hardware is so incredibly user-unfriendly. The bottom line is a systematic abdication of responsibility by those charged with leading our efforts. Moreover we exist within a time and system where grass roots dissent and negative feedback are squashed. Our tepid and incompetent leadership can rest assured that their decisions will not be questioned.

Before getting to my conclusion, one might reasonably ask, “what should we be doing instead?” First we need an HPC program with balance between the impact on reality and the stream of enabling technology. The single most contemptible aspect of current programs is the nature of the hardware focus. The computers we are building are monstrosities, largely unfit for scientific use and vomitously inefficient. They are chasing a meaningless summit of performance measured through an antiquated and empty benchmark. We would be better served through building computers tailored to scientific computation that solve real, important problems with efficiency. We should be building computers and software that spur our productivity and are easy to use. Instead we levy an enormous penalty toward any useful application of these machines because of their monstrous nature. A refocus away from the meaningless summit defined by an outdated benchmark could have vast benefits for science.

We could then free up resources to provide the holistic value stream from computing that we know from experience. A real applied focus on modeling and solution methods produces the greatest possible benefit. These immensely valuable activities are completely and utterly unsupported by the current HPC program and paid little more than lip service. Hand-in-hand with the lack of focus on applications and answers is no focus on verification or validation. Verification deals with the overall quality of the calculations, which is just assumed by the magnitude of the calculations (it used so much computer power, it has to be awesome, right?). The lack of validation underpins a generic lack of interest in the quality of the work in terms of real world congruence and impact.

Next down the line of unsupported activities is algorithmic research. The sort of algorithmic research that yields game-changing breakthroughs is unsupported. Algorithmic breakthroughs make the impossible possible and create capabilities undreamed of. They create a better future we couldn’t even dream of. We are putting no effort into this. Instead we have the new buzzword of “co-design” where we focus on figuring out how to put existing algorithms on the monstrous hardware we are pursuing. The benefits are hardly game changing, but rather simply fighting the tidal wave of entropy of the horrific hardware. Finally we get to the one place where funding exists: code development that ports existing models, methods and algorithms onto the hardware. Because little or no effort is put into making this hardware scientifically productive (in fact it’s the opposite), the code can barely be developed and its quality suffers mightily.

A huge tell in the actions of those constructing current HPC programs is their inability to learn from the past (or care about the underlying issues). If one looks at the program for pursuing exascale, it is structured almost identically to the original ASCI program, except being even more relentlessly hardware obsessed. The original ASCI program needed to add significant efforts in support of physical modeling, algorithm research and V&V on top of the hardware focus. This reflected a desire and necessity to produce high quality results with high confidence. All of these elements are conspicuously absent from the current HPC efforts. This sends two clear and unambiguous messages to anyone paying attention. The first message is a steadfast belief that the only quality needed is the knowledge that a really big expensive computer did the calculation at great cost. Somehow the mere utilization of such exotic and expensive hardware will endow the calculations with legitimacy. The second message is that no other advances other than computer power are needed.

The true message is that connection to credibility and physical reality has no importance whatsoever to those running these programs. The actions and focus of the work, spelled out plainly in the activities funded, make their plans clear. The current HPC efforts make no serious attempt to make sure calculations are high quality or impactful in the real world. If the calculations are high quality there will be scant evidence to prove this, and any demonstration will be done via authority. We are at the point where proof is granted by immensely expensive calculations rather than convincing evidence. There will be no focused or funded activity to demonstrate quality. There will be no focused activity to improve the physical, mathematical or algorithmic basis of the codes either. In other words all the application code related work in the program is little more than a giant porting exercise. The priorities and intents regarding quality are clear to those of us working on the project, namely quality is not important and not valued.

I’ve been told to assume that the leadership supports the important things to do that are ignored by our current programs. Seeing how our current programs operate, this is hardly plausible. Every single act by the leadership constructs an ever-tightening noose of planning, reporting and constraint about our collective necks. Quality, knowledge and expertise are all seriously devalued in the current era, and we can expect the results to reflect our priorities. We see a system put in place that will punish any attempt to do the right thing. The “right thing” is to do exactly what you’re told to do. Of course, one might argue that the chickens will eventually come home to roost, and the failures of the leadership will be laid bare. I’d like to think this is inevitable, but recent events seem to indicate that all facts are negotiable, and any problems can be spun through innovative marketing and propaganda into success. I have a great deal of faith that the Chinese will mop the floor with us in HPC, and our current leadership should shoulder the blame. I also believe the blame will not fall to the guilty. It never does today; the innocent will be scapegoated for the mistakes of the guilty.

Nothing in this World is Static…Everything is Kinetic..

If there is no ‘progression’…there is bound to be ‘regression’…

― Abha Maryada Banerjee

I am left with the feeling that an important opportunity for reshaping the future is being missed. Rather than admit the technological limitations we are laboring under and transform HPC towards a new focus, we continue along a path that appears to be completely nostalgic. The acceptance of the limitations in the growth of computer power in the commercial computing industry led to a wonderful result. Computer hardware shifted to mobile computing and unleashed a level of impact and power far beyond what existed at the turn of the century. Mobile computing is vastly more important and pervasive than the computing that preceded it. The same sort of innovation could unleash HPC to produce real value far beyond anything conceivable today. Instead we have built a program devoted to nostalgia and largely divorced from objective reality.

Doing better would be simple, at least at a conceptual level. One would need to commit to a balanced program where driving modeling and simulation to impact the real world is a priority. The funded and prioritized activities would need to reflect this focus. Those leading and managing the program would need to ask the right questions and demand progress in the right areas. Success would need to be predicated on the same holistic balanced philosophy. The people working on these programs are smart enough to infer the intent of the programs. This is patently obvious by examining the funding profiles.

Programs are funded around their priorities. The results that matter are connected to money. If something is not being paid for, it is not important. If one couples steadfast compliance with only working on what you’re funded to do, any call to do the right thing despite funding is simply comical. The right thing becomes complying, and the important thing in this environment is funding the right things. As we work to account for every dime of spending in ever finer increments, the importance of sensible and visionary leadership becomes greater. The very nature of this accounting tsunami is to blunt and deny visionary leadership’s ability to exist. The end result is spending every dime as intended and wasting the vast majority of it on shitty, useless results. Any other outcome in the modern world is implausible.

You never change things by fighting the existing reality.

To change something, build a new model that makes the existing model obsolete.

― R. Buckminster Fuller

Are We Doing Numerical Error Bars Right?

13 Friday Jan 2017

Posted by Bill Rider in Uncategorized

≈ Leave a comment

No. I don’t think so, but I’ll give my argument.

If you reject feedback, you also reject the choice of acting in a way that may bring you abundant success.

― John Mattone

Despite a relatively obvious path to fulfillment, the estimation of numerical error in modeling and simulation appears to be worryingly difficult to achieve. A big part of the problem is outright laziness, inattention, and poor standards. A secondary issue is the mismatch between theory and practice. If we maintain reasonable pressure on the modeling and simulation community we can overcome the first problem, but it does require not accepting substandard work. The second problem requires some focused research, along with a more pragmatic approach to practical problems. Along with these systemic issues we can deal with a simpler question: where to put the error bars on simulations, and whether they should show a bias or a symmetric error. I strongly favor a bias.

Implicit in this discussion is an assumption of convergence for a local sequence of calculations. I suspect the assumption is generally a good one, but also prone to failure. One of the key realities is the relative rarity of calculations in the asymptotic range of convergence for methods and problems of interest. The biggest issue is how problems are modeled. The usual way of modeling problems or creating models for physics in problems produces technical issues that inhibit asymptotic convergence (various discontinuities, other singularities, degenerate cases, etc.). Our convergence theory is predicated on smoothness that rarely exists in realistic problems. This gets to the core of the shortcomings of the theory: we don’t know what to expect in these cases. In the end we need to either make some assumptions, collect data and do our best, or do some focused research to find a way.

The basic recipe for verification is simple: make an assumption about the form of the error, collect calculations and use the assumed error model to estimate errors. The assumed error form is S_k = A + C h_k^\alpha where A is the mesh-converged solution, S_k is the solution on a grid k, h_k is the mesh spacing, C is a constant of proportionality and \alpha is the convergence rate. We see three unknowns in this assumed form, A, C and \alpha. Thus we need at least three solutions to solve for these values, or more if we are willing to solve an over-determined problem. At this point the hard part is done, and verification is just algebra and a few very key decisions. It is these key decisions that I’m going to ask some questions about.
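Here is a minimal sketch of that recipe in Python, assuming three grids with a constant refinement ratio so the algebra reduces to a closed form; the function name and the manufactured test values are mine, purely for illustration.

    import numpy as np

    def solution_verification(S, h):
        """Fit S_k = A + C*h_k**alpha to three grid solutions.

        S, h are ordered coarse to fine (h[0] > h[1] > h[2]) with a constant
        refinement ratio r = h[0]/h[1] = h[1]/h[2]."""
        S1, S2, S3 = S
        r = h[0] / h[1]
        assert np.isclose(r, h[1] / h[2]), "constant refinement ratio assumed here"
        alpha = np.log((S1 - S2) / (S2 - S3)) / np.log(r)   # observed convergence rate
        A = S3 + (S3 - S2) / (r**alpha - 1.0)               # extrapolated h -> 0 estimate
        C = (S3 - A) / h[2]**alpha
        return A, C, alpha

    # manufactured check: data built with A = 1.0, C = 0.5, alpha = 2 is recovered exactly
    h = np.array([0.04, 0.02, 0.01])
    S = 1.0 + 0.5 * h**2
    A, C, alpha = solution_verification(S, h)
    print(f"A = {A:.6f}, C = {C:.3f}, alpha = {alpha:.3f}")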

The first thing to note is the basic yardstick for the error estimate is the difference between A and the grid solution S_k, which we will call \Delta A.  Notice that this whole error model assumes that the sequence of solutions S_k approaches A monotonically as h_k becomes smaller. In other words all the evidence supports the solution going to A. Therefore the error is actually signed, or biased by this fact. In a sense we should consider A to be the most likely, or best estimate of the true solution as h \rightarrow 0. There is also no evidence at all that the solution is moving the opposite direction. The problem I’m highlighting today is that the standard in solution verification does not apply these rather obvious conclusions in setting the numerical error bar.

The standard way of setting error bars takes the basic measure of error, multiplies it by an engineering safety factor C_s \ge 1, and then centers it about the mesh solution, S_k. The numerical uncertainty estimate is simple,  U_s = C_s \left| \Delta A  \right| . So half the error bar is consistent with all the evidence, but the other half is not. This is easy to fix by ridding ourselves of the inconsistent piece.

The core issue I’m talking about is the position of the numerical error bar. Current approaches center the error bar on the finite grid solution of interest, usually the finest mesh used. This has the effect of giving the impression that this solution is the most likely answer, and the true answer could be either direction from that answer. Neither of these suggestions is supported by the data used to construct the error bar. For this reason the standard practice today is problematic and should be changed to something supportable by the evidence. The current error bars suggest incorrectly that the most likely error is zero. This is completely and utterly unsupported by evidence.

Instead of this impression, the evidence is pointing to the extrapolated solution as the most likely answer, and the difference between that solution, A, and the mesh of interest S_k is the most likely error. For this reason the error bar should be centered on the extrapolated solution. The most likely error is non-zero. This would make the error biased, and consistent with the evidence. If we padded our error estimate with a safety factor, C_s, the error bar would include the mesh solution, S_k and the potential for zero numerical error, but only as a low probability event. It would present the best estimate of the error as the best estimate.
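A small sketch of the two conventions side by side, reusing the S_k and A notation from above; the safety factor value here is only a placeholder, not a recommendation.

    def numerical_error_bars(S_k, A, C_s=1.25):
        """Contrast the standard error bar with the biased one argued for above.

        S_k : finest-grid solution
        A   : extrapolated (h -> 0) estimate from solution verification
        C_s : engineering safety factor >= 1 (the value 1.25 is a placeholder)"""
        U = C_s * abs(A - S_k)            # padded magnitude of the estimated numerical error
        standard = (S_k - U, S_k + U)     # centered on S_k: implies zero error is most likely
        biased = (A - U, A + U)           # centered on A: includes S_k only as a low-probability edge
        return standard, biased

    # with the manufactured example above (finest grid S_k = 1.00005, extrapolated A = 1.0)
    print(numerical_error_bars(S_k=1.00005, A=1.0))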

There is a secondary impact of this bias that is no less important. The current standard approach also significantly discounts the potential for the numerical error to be much larger than the best estimate (where the current centering makes the best estimate appear to be low probability!). By centering the error bar on the best estimate we then present larger error as being equally as likely as smaller error, which is utterly and completely reasonable.

The man of science has learned to believe in justification, not by faith, but by verification.

― Thomas Henry Huxley

Why has this happened?

Part of the problem is the origin of error bars in common practice, and a serious technical difference in their derivation. The most common setting for error bars is measurement error. Here a number of measurements are taken and then analyzed to provide a single value (or values). In the most common use the mean value is presented as the measurement (i.e., the central tendency). Scientists then assume that the error bar is centered about the mean through assuming normal (i.e., Gaussian) statistics. This could be done differently with various biases in the data being presented, but truth be told this is rare, as is using any other statistical basis for computing the central tendency and deviations. This point of view is the standard way of viewing an error bar and implicitly plays in the mind of those viewing numerical error. This implicit view is dangerous because it imposes a technical perspective that does not fit numerical error.
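For contrast, this is the familiar measurement-error recipe implied above, sketched with made-up data: repeated observations are reduced to a mean, and the bar is centered symmetrically on that mean under a Gaussian assumption.

    import numpy as np

    rng = np.random.default_rng(1)
    samples = 5.0 + 0.3 * rng.standard_normal(50)        # hypothetical repeated measurements
    mean = samples.mean()
    sem = samples.std(ddof=1) / np.sqrt(samples.size)    # standard error of the mean
    print(f"measurement: {mean:.3f} +/- {1.96 * sem:.3f} (95% interval, Gaussian assumption)")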

The problem is that the basic structure of uncertainty is completely different with numerical error. A resolved numerical solution is definitely biased in its error. An under-resolved numerical solution is almost certainly inherently biased. The term under-resolved is simply a matter of how exacting a solution one desires, so for the purposes of this conversation, all numerical solutions are under-resolved. The numerical error is always finite and if the calculation is well behaved, the error is always a bias. As such the process is utterly different than measurement error. With measurements there is an objective reality that is trying to be sensed. Observations can be biased, but generally are assumed to be unbiased unless otherwise noted. We have fluctuations in the observation or errors in the measurement itself. These both can have a distinct statistical nature. Numerical error is deterministic and structured, and has a basic bias through the leading order truncation error. As a result error bars from the two sources should be structurally different. They are simply not the same thing and should not be treated as such.

The importance of this distinction in perspective is the proper accounting for sources and impact of uncertainty in modeling and simulation. Today we suffer most greatly from some degree of willful ignorance of uncertainty, and when it is acknowledged, too narrow a perspective. Numerical error is rarely estimated, often assumed away, and misrepresented even when it is computed. In the best work available, uncertainty is tackled as being dominantly epistemic uncertainty associated with modeling parameters (nominally subgrid or closure models). Large sources of uncertainty are defined by numerical error, problem modeling assumptions, model form error, and experimental uncertainty, to name the big ones. All of these sources of uncertainty are commonly ignored by the community without much negative feedback, and this needs to change for progress to occur.

Science is a system of statements based on direct experience, and controlled by experimental verification. Verification in science is not, however, of single statements but of the entire system or a sub-system of such statements.

― Rudolf Carnap

 

Dealing with Bias and Calibration in Uncertainty Quantification

06 Friday Jan 2017

Posted by Bill Rider in Uncategorized

≈ Leave a comment

It is useless to attempt to reason a man out of a thing he was never reasoned into.

― Jonathan Swift

Most of the computer modeling and simulation examples in existence are subject to bias in the solutions. This bias comes from numerical solution, modeling inadequacy, and bad assumptions, to name a few of the sources. In contrast uncertainty quantification is usually applied in a statistical and clearly unbiased manner. This is a serious difference in perspective. The differences are clear. With bias the difference between simulation and reality is one-sided and the deviation can be cured by calibrating parts of the model to compensate. Unbiased uncertainty is common in measurement error and ends up dominating the approach to UQ in simulations. The result is a mismatch between the dominant mode of uncertainty and how it is modeled. Coming up with a more nuanced and appropriate model that acknowledges and deals with bias appropriately would be great progress.

One of the archetypes of modern modeling and simulation are climate simulations (and their brethren, weather). These simulations carry with them significant bias associated with lack of computational resolution. The computational mesh is always far too coarse for comfort, and the numerical errors are significant. There are also issues associated with initial conditions, energy balance and representing physics at and below the level of the grid. In both cases the models are invariably calibrated heavily. This calibration compensates for the lack of mesh resolution, lack of knowledge of initial data and physics, as well as problems with representing the energy balance essential to the simulation (especially climate). A serious modeling deficiency is the merging of all of these uncertainties into the calibration with an associated loss of information.

We all see only that which we are trained to see.

― Robert Anton Wilson

The issues with calibration are profound. Without calibration the models are effectively useless. For these models to contribute to our societal knowledge and decision-making or raw scientific investigation, the calibration is an absolute necessity. Calibration depends entirely on existing data, and this carries a burden of applicability. How valid is the calibration when the simulation is probing outside the range of the data used to calibrate? We commonly include the intrinsic numerical bias in the calibration, and most commonly a turbulence or mixing model is adjusted to account for the numerical bias. A colleague familiar with ocean models quipped that if the ocean were as viscous as we modeled it, one could drive to London from New York. It is well known that numerical viscosity stabilizes calculations, and we can use numerical methods to model turbulence (implicit large eddy simulation), but this practice should at the very least make people uncomfortable. We are also left with the difficult matter of how to validate models that have been calibrated.

I just touched on large eddy simulation, which is a particularly difficult topic because numerical effects are always in play. The mesh itself is part of the model with classical LES. With implicit LES the numerical method itself provides the physical modeling, or some part of the model. This issue plays out in weather and climate modeling where the mesh is part of the model rather than an independent aspect of it. It should surprise no one that LES was born from weather-climate modeling (at a time when the distinction didn’t exist). In other words the chosen mesh and the model are intimately linked. If the mesh is modified, the modeling must also be modified (recalibrated) to get the balancing of the solution correct. This tends to happen in simulations where an intimate balance is essential to the phenomena. In these cases there is a system that in one respect or another is in a nearly equilibrium state, and the deviations from this equilibrium are essential. Aspects of the modeling related to the scales of interest, including the grid itself, impact the equilibrium to a degree that an un-calibrated model is nearly useless.

If numerical methods are being used correctly and at a resolution where the solution can be considered remotely mesh converged, the numerical error is a pure bias error. A significant problem is the standard approach to solution verification that treats numerical error as unbiased. This is applied in the case where no evidence exists for the error being unbiased! Well-behaved numerical error is intrinsically biased. This is a significant issue because treating a biased error as unbiased represents a significant loss of information. Those who either must or do calibrate their models to account for numerical error rarely explicitly estimate numerical error, but account for the bias as a matter of course. Ultimately the failure of the V&V community to correctly apply well-behaved numerical error as a one-sided bias is counter-productive. It is particularly problematic in the endeavor to deal proactively with the issues associated with calibration.

Science is about recognizing patterns. […] Everything depends on the ground rules of the observer: if someone refuses to look at obvious patterns because they consider a pattern should not be there, then they will see nothing but the reflection of their own prejudices.

― Christopher Knight

Let me outline how we should be dealing with well-behaved numerical error below. If one has a quantity of interest where a sequence of meshes produces a monotonic approach to a value (assuming the rest of the model is held fixed), then the error is well behaved. The sequence of solutions on the meshes can then be used to estimate the solution to the mathematical problem, that is, the solution where the mesh resolution is infinite (absurd as it might be). Along with this estimate of the “perfect” solution, the error can be estimated for any of the meshes. For this well-behaved case the error is one-sided, a bias between the ideal solution and the one with a mesh. Any fuzz in the estimate would be applied to the bias. In other words any uncertainty in the error estimate is centered about the extrapolated “perfect” solution, not the finite grid solutions. The problem with the currently accepted methodology is that the error is given as a standard two-sided error bar that is appropriate for statistical errors. In other words we use a two-sided accounting for this error even though there is no evidence for it. This is a problem that should be corrected. I should note that many models (e.g., climate or weather) invariably recalibrate after all mesh changes, which invalidates the entire verification exercise, where the model aside from the grid should be fixed across the mesh sequence.
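A tiny sketch of what “well behaved” can mean operationally under the assumptions above: the grid-to-grid increments of the quantity of interest all share one sign and shrink in magnitude. The numbers below are invented for illustration.

    import numpy as np

    def monotone_approach(S):
        """True if successive grid solutions approach a limit from one side:
        increments share a sign and shrink in magnitude."""
        d = np.diff(np.asarray(S, dtype=float))
        one_sided = np.all(d > 0) or np.all(d < 0)
        shrinking = np.all(np.abs(d[1:]) < np.abs(d[:-1]))
        return bool(one_sided and shrinking)

    print(monotone_approach([1.030, 1.008, 1.002]))   # well behaved -> True
    print(monotone_approach([1.030, 0.995, 1.002]))   # oscillatory  -> False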

I plan to talk more about this issue next week along with a concrete suggestion about how to do better.

When we get to the heart of the matter at hand, dealing with uncertainty in calibrated models, we rapidly come to the conclusion that we need to keep two sets of books. If the first thing that comes to mind is, “that’s what criminals do,” you’re on the right track. You should feel uneasy about this conclusion, and we should all feel a sense of unease regarding this outcome. What do we put in these two books? In one case we have calibrated models, and we can rely upon such a model to reliably interpolate the data it is calibrated with. So for quantities of interest used to calibrate a model, the model is basically useless as an independent prediction; at best it unveils uncertainty and inconsistency within the data used for calibration.

A model is valuable for inferring other things from simulation. It is good for looking at quantities that cannot be measured. In this case the uncertainty must be approached carefully. The uncertainty in these values must almost invariably be larger than for the quantities used for calibration. One needs to look at the modeling connections for these values and take a reasonable approach to treating the quantities with an appropriate “grain of salt”. This includes numerical error, which I talked about above too. In the best case there is data available that was not used to calibrate the model. Maybe these are values that are not as highly prized or as important as those used to calibrate. The difference between these measured data values and the simulation gives very strong indications regarding the uncertainty in the simulation. In other cases some of the data potentially available for calibration has been left out, and can be used for validating the calibrated model. This assumes that the hold-out data is sufficiently independent of the data used.

A truly massive issue with simulations is extrapolation of results beyond the data used for calibration. This is a common and important use of simulations. One should expect the uncertainty to grow substantially with the degree of extrapolation from data. A common and pedestrian place to see what this looks like is least squares fitting of data. The variation and uncertainty in the calibrated range is the basis of the estimates, but depending on the nature of the calibrated range of the data and the degree of extrapolation, the uncertainty can grow to be very large. This makes perfectly reasonable sense: as one departs from our knowledge and experience, we should expect the uncertainty in our knowledge to grow.
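Here is a pedestrian least-squares sketch of exactly that effect, with invented data: the standard deviation of the fitted prediction is modest inside the calibrated range (x between 0 and 1) and grows rapidly as we extrapolate beyond it.

    import numpy as np

    rng = np.random.default_rng(2)
    x = np.linspace(0.0, 1.0, 20)                       # calibration range
    y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(x.size)

    X = np.column_stack([np.ones_like(x), x])           # ordinary least-squares line fit
    beta, res, _, _ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = float(res[0]) / (X.shape[0] - X.shape[1])  # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)               # covariance of the fitted coefficients

    def prediction_sd(x0):
        """Standard deviation of the fitted mean at x0; grows with distance from the data."""
        v = np.array([1.0, x0])
        return float(np.sqrt(v @ cov @ v))

    for x0 in (0.5, 1.0, 2.0, 4.0):
        print(f"x = {x0}: prediction sd = {prediction_sd(x0):.3f}")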

A second issue to consider is our second set of books, where the calibration is not taken quite so generously. In this case the most honest approach to uncertainty is to apply significant variation to the parameters used to calibrate the model. In addition we should include the numerical error in the uncertainty. In the case of deeply calibrated models these sources of uncertainty can be quite large and generally paint an overly pessimistic picture of the uncertainty. Conversely we have an extremely optimistic picture of uncertainty with calibration. The hope and best possible outcome is that these two views bound reality, and the true uncertainty lies between these extremes. For decision-making using simulation this bounding approach to uncertainty quantification should serve us well.

There are three types of lies — lies, damn lies, and statistics.

― Benjamin Disraeli

 
