In our attempt to make conservation easy, we have made it trivial.
― Aldo Leopold
The reality is that it may or may not matter that the Chinese are cleaning the floor with us in supercomputing. Whether it matters comes down to details dutifully ignored by the news articles. Unfortunately, those executing our supercomputing programs ignore these same details too. In these details we can find the determining factors of whether Chinese hardware supremacy is a sign that they are really beating the USA where it matters. Supercomputing is only important because of modeling & simulation, which depends on a massive swath of scientific endeavor to faithfully achieve. The USA's national supercomputing program is starving this swath of scientific competence in order to build the fastest supercomputer. If the Chinese are simultaneously investing in computing hardware and the underlying science of modeling & simulation, they will win. Without the underlying science no amount of achievement in computing power will provide victory. The source of the ignorance in our execution of a supercomputing program is a combination of willful ignorance and outright incompetence. In terms of damage the two are virtually indistinguishable.
There is nothing more frightful than ignorance in action.
― Johann Wolfgang von Goethe
The news last week was full of the USA's continued losing streak to Chinese supercomputers. Their degree of supremacy is only growing, and now the Chinese have more machines on the list of top high performance computers than the USA. Perhaps as importantly, the Chinese relied on homegrown computer hardware rather than on the USA's. One could argue that American export law cost them money, but also encouraged them to build their own. So is this a failure or success of the policy, or a bit of both? Rather than panic, maybe it's time to admit that it doesn't really matter. Rather than offer a lot of concern we should start a discussion about how meaningful it actually is. If the truth is told, it isn't very important at all.
If we too aggressively pursue regaining the summit of this list we may damage the part of supercomputing that actually does matter. Furthermore, what we aren't discussing is the relative position of the Chinese to the Americans (or the Europeans for that matter) in the part of computing that does matter: modeling & simulation. Modeling & simulation is the real reason we do computing, and its value is not measured or defined by computing hardware. Granted, computer hardware plays a significant role in the overall capacity to conduct simulations, but it is not even the dominant factor in modeling effectiveness. Worse yet, in the process of throwing all our effort behind getting the fastest hardware, we are systematically undermining the parts of supercomputing that add real value and have far greater importance. Underlying this assessment is the conclusion that our current policy isn't based on what is important, and instead supports a focus on less important aspects of the field that are simply more explicable to lay people.
Let’s take this argument apart and get to the heart of why the current narrative in supercomputing is so completely and utterly off base. Worse than a bad narrative, the current mindset is leading to bad decisions and terrible investments that will set back science and diminish the power and scope of modeling & simulation to be transformative in today’s world. It is a reasonable concern that the reaction of the powers that be to the Chinese “success” in hardware will be to amplify the already terrible decisions in research and investment strategy.
The real heart of the matter is that the Chinese victory in supercomputing power is pretty meaningless. The computer they've put at the top of the list is close to practically useless, and wins at computing a rather poorly chosen benchmark problem. This benchmark was never a particularly good measure of supercomputers, and the gap between the real reasons for buying such computers and what the benchmark measures has only grown more vast over the years. It is the absolute height of ridiculousness that anyone even cares, but care we do. The benchmark computes the LU decomposition of a dense matrix, a classical linear algebra problem, at a huge scale. It is simply wildly uncharacteristic of the problems and codes we buy supercomputers to solve. It skews the point of view in directions that are unremittingly negative, and ends up helping to drive lousy investments.
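To make the point concrete, here is a minimal sketch (in Python, with toy sizes) of what the benchmark essentially times: factoring a dense random matrix with LU and back-solving, then converting the operation count into a FLOP rate. A real HPL run does this at enormous scale with a distributed solver, but the kernel is the same; the sizes and names here are purely illustrative.

```python
# Minimal sketch of what the HPL (LINPACK) benchmark essentially times:
# LU-factor a large dense matrix, back-solve, and report a FLOP rate.
# Sizes here are toy values; a real HPL run fills most of machine memory.
import time
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 2000                                 # toy problem size
A = np.random.rand(n, n)                 # dense random matrix
b = np.random.rand(n)

t0 = time.perf_counter()
lu, piv = lu_factor(A)                   # LU decomposition (the expensive part)
x = lu_solve((lu, piv), b)               # triangular back-substitution
elapsed = time.perf_counter() - t0

flops = (2.0 / 3.0) * n**3               # leading-order LU operation count
print(f"residual : {np.linalg.norm(A @ x - b):.2e}")
print(f"GFLOP/s  : {flops / elapsed / 1e9:.2f}")
```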
We have set about a serious program to recapture the supercomputing throne. In its wake we will do untold amounts of damage to the future of modeling & simulation. We are in the process of spending huge sums of money chasing a summit that does not matter at all. In the process we will starve the very efforts that could allow us to unleash the full power of the science and engineering capability we should be striving for, a capability that would have massively positive impacts on all of the things supercomputing is supposed to contribute toward. The entire situation is patently absurd, and tragic. It is ironic that those who act to promote high performance computing are killing it. They are killing it because they fail to understand it or how science actually works.
Let’s get a couple of truths out there to consider:
- No amount of computer speed, solution accuracy, or algorithmic efficiency can rescue a model that is incorrect.
- The best methods produce the ability to solve problems that were impossible before. Sometimes methods & models blur together, producing a merged capability that transcends the normal distinctions and provides the ability to solve problems in ways not previously imagined.
- Algorithmic efficiency coupled with improvements in models and methods allows solutions to proceed with gains that go well beyond mere constant multiples; the gains are geometric or exponential in character.
- The improvements in these three arenas have all historically vastly out-paced the improvements associated with computer hardware, as the sketch after this list illustrates.
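To make that last point concrete, here is a toy arithmetic sketch in Python; the problem size, the assumed hardware factor, and the complexities are illustrative assumptions, not measurements.

```python
# Toy arithmetic comparing a hardware speedup (a constant factor) with an
# algorithmic speedup (a change in complexity).  All numbers are illustrative.
import math

N = 10**8                      # illustrative problem size (unknowns)
hardware_factor = 100.0        # assume a generous 100x from a new machine

# work for a hypothetical O(N^2) solver vs an O(N log N) replacement
work_old = N**2
work_new = N * math.log2(N)

algorithmic_factor = work_old / work_new
print(f"hardware speedup   : {hardware_factor:.1e}x")
print(f"algorithmic speedup: {algorithmic_factor:.1e}x")
# The algorithmic gain also grows with N, while the hardware factor does not.
```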
The tragedy is that these four areas are almost completely ignored by our current high-performance computing program. The current program is basically porting legacy code and doing little else. Again, we have to acknowledge that a great deal of work is being done to enable the porting of codes to succeed. This is in the area of genuinely innovative computer science research and methodology to allow easier programming of these monstrous computers we are building.
On the other hand, we have so little faith in the power of our imaginations and capacity for innovation that we will not take the risk of relying on these forces for progress. Declaring that the current approach is cowardly is probably a bit over the top, but not by much. I will declare that it is unequivocally anti-intellectual and based upon simple-minded conclusions about the power, value and source of computing’s impact on science and society. Simple is good and elegant, but simple-minded is not. The simple-minded approach does a great disservice to all of us.
Any darn fool can make something complex; it takes a genius to make something simple.
― Pete Seeger
The single most important thing in modeling & simulation is the nature of the model itself. The model contains the entirety of the capacity of the rest of the simulation to reproduce reality. Looking at HPC today, any effort to improve models is utterly and completely lacking. Next in importance are the methods that solve those models, and again we see no effort at all toward developing better methods. Next we have algorithms, whose character determines the efficiency of solution. With algorithms we start to see some effort, but only in the service of implementing existing ones on the new computers; efforts to improve their fundamental character are absent. Next in importance come code and system software, and here we see significant effort: moving old codes onto new computers, and producing software systems that turn the power of new computers to some utility. Last and furthest from importance is the hardware. Here, we see the greatest degree of focus. In the final analysis we see the greatest focus, money and energy on the things that matter least. It is the makings of a complete disaster, and a disaster of our own making.
…the commercially available and laboratory technologies are inadequate for the stockpile stewardship tasks we will face in the future. Another hundred-to-thousand-fold increase in capability from hardware and software combined will be required… Some aspects of nuclear explosive design are still not understood at the level of physical principles…
–C. Paul Robinson, in testimony to the House National Security Committee on May 12, 1996
It is instructive to examine where these decisions came from, whether they made sense at one time, and whether they still do today. The harbinger of the current exascale initiative was the ASC(I) program, created in the wake of the end of nuclear testing in the early 1990s. The ASC(I) program was based on a couple of predicates: that computational simulation would replace full scale testing, and that computational speed was central to success. When nuclear testing ceased, the Cray vector supercomputers provided the core of modeling & simulation computing capability. The prevailing wisdom was that the combination of massively parallel computing and Moore's law for commodity chips would win. This was the "Attack of the Killer Micros" (http://www.nytimes.com/1991/05/06/business/the-attack-of-the-killer-micros.html?pagewanted=all). I would submit that these premises were correct to some degree, but also woefully incomplete. Some of these shortcomings have been addressed in the broader stockpile stewardship program, albeit poorly. Major experimental facilities and work were added to the original program, which has provided vastly more balanced scientific content to the endeavor. While more balanced, the overall program is still unbalanced, with too much emphasis on computing as defined by raw computational power, too little emphasis on theory (models), methods and algorithms, and an overly cautious and generally underfunded experimental program.
On the modeling & simulation side many of the balance issues were attacked by rather tepid efforts adding some modeling, and methods work. Algorithmic innovation has been largely limited to the monumental task of moving existing algorithms to massively parallel computers. The result has been a relative dearth of algorithmic innovations and increasingly high levels of technical debt and inflation permeating the code base. The modeling work is almost entirely based on closing existing systems of equations used to create models without any real focus on the propriety of the equations themselves. This is exacerbated by the cautious and underfunded experimental efforts that fail to challenge the equations in any fundamental way. We can look at public failures like NIF as being emblematic of the sort of problems we are creating for ourselves. We will not compute our way out of a problem like NIF!
The important thing about ASC(I) is that it created a model of successfully funding a program. As such, it became the go-to model for developing a new program, and the blueprint for our exascale program. Judging from the exascale program’s design and execution, it has learned little or nothing from the ASC(I) program’s history aside from the lesson of how to get something funded. The overall effort actually represents a sharp step backward by failing to implement any of the course corrections ASC(I) made relatively early in its life. The entirety of the lessons from ASC(I) were simply the recipe for getting money, which to put it plainly is “focus all the attention on the computers”. Everything else is a poorly thought through afterthought, largely consisting of a wealth of intellectually empty activities that look like relatively sophisticated code porting exercises (and some aren’t that sophisticated).
The price for putting such priority and importance on the trivial is rather high. We are creating an ecosystem that will strangle progress in its crib. As an ecosystem we have failed to care for the essential “apex” predators. By working steadfastly toward supremacy in the trivial we starve the efforts that would yield true gains and actual progress. Yet this is what we do, it has become a matter of policy. We have gotten here because those we have entrusted with creating the path forward don’t know what they are doing. They have allowed their passion for computer hardware to overwhelm their duties and their training as scientists. They have allowed a trivial and superficial understanding of modeling & simulation to rule the creation of the programs for sustaining progress. They have allowed the needs of marketing a program to overwhelm the duty to make those programs coherent and successful. It is a tragedy we see playing out society-wide.
Our current approach to advancing high performance computing is completely superficial. Rather than focus on the true measure of value, modeling & simulation, we focus on computing hardware to an unhealthy degree. Modeling & simulation is the activity where we provide a capability to change our reality, or our knowledge of reality. It is a holistic activity relying upon the full chain of competencies elaborated upon above. If anything in this chain fails, the entire activity fails. Computing hardware is absolutely essential and necessary. Progress via faster computers is welcome, but we fail to acknowledge its relative importance in the chain. In almost every respect it is the furthest away from impacting reality, while more essential parts of the overall capability are starved for resources and ignored. Perhaps more pointedly, the time for focus on hardware has passed as Moore's law is dead and buried. Much of the current hardware focus is a futile attempt to breathe life into this increasingly cold and rigid corpse. Instead we should be looking at this moment in time as an opportunity (https://williamjrider.wordpress.com/2016/01/15/could-the-demise-of-moores-law-be-a-blessing-in-disguise/). The demise of Moore's law could actually be a blessing, unleashing a torrent of innovative energy.
It is time to focus our collective creative and innovative energy where opportunity exists. For example, the arena of algorithmic innovation has been a more fertile and productive route to improving the performance of modeling & simulation than hardware. The evidence for this source of progress is vast and varied; I've written on it on several occasions (https://williamjrider.wordpress.com/2014/02/28/why-algorithms-and-modeling-beat-moores-law/, https://williamjrider.wordpress.com/2015/02/14/not-all-algorithm-research-is-created-equal/, https://williamjrider.wordpress.com/2015/05/29/focusing-on-the-right-scaling-is-essential/). Despite this, we put little or no effort into this demonstrably productive direction. For example, many codes depend critically on numerical linear algebra algorithms. These algorithms have a massive impact on the efficiency of modeling & simulation, often being the enabling factor. Yet we have seen no fundamental breakthroughs in the field for over 30 years. Perhaps there aren't any further breakthroughs to be made, but this is a far too pessimistic outlook on the capacity of humans to create, innovate and discover solutions to important problems.
There is a cult of ignorance in the United States, and there has always been. The strain of anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that 'my ignorance is just as good as your knowledge.'
― Isaac Asimov
All of this gets at a far more widespread and dangerous societal issue: we are incapable of dealing with any issue that is complex and technical. Increasingly we have no tolerance for anything where expert judgment is necessary. Science and scientists are increasingly dismissed when their message is unpleasant or difficult to take. Examples abound, with last week's Brexit easily coming to mind and climate change providing an ongoing example of willful ignorance. Talking about how well we or anyone else does modeling & simulation is subtle and technical. As everyone should be well aware, subtle and technical arguments and discussions are not possible in the public sphere. As a result we are left with horrifically superficial measures such as raw supercomputing power as measured by a meaningless, but accepted, benchmark. The result is a harvest of immense damage to actual capability and investment in a strategy that does very little to improve matters. We are left with a hopelessly ineffective high performance computing program that wastes vast sums of money and entire careers, and will in all likelihood result in a real loss of national supremacy in modeling & simulation.
All of this to avoid having a conversation of substance about what might constitute our actual capability. All to avoid having to invest effort in a holistic strategy that might produce actual advances. It is a problem repeated over and over in our national conversations due to the widespread distrust of expertise and intellect. In the end the business interests in computing hijack the dialog and we default to a "business" friendly program that is basically a giveaway. Worse yet, we buy into the short-term time scales of business and starve the long-term health of the entire field.
One of the early comments on this post took note of the display of "healthy skepticism". While I believe that healthy skepticism is warranted, to call my point of view skeptical understates my position rather severely. I am severely critical. I am appalled by the utter and complete lack of thought going into this program. I realize that what I feel is not skepticism, but rather contempt. I see the damage being done by our approach every single day. The basis of the current push for exascale lacks virtually any rigor or depth of thought. It is based on a misunderstanding of science, and most egregiously of the science it is supposed to be supporting. We have defaulted to a simple-minded argument because we think we can get money using it. We are not relying on a simple argument, but a simple-minded argument. The simple argument is that modeling & simulation is vital to our security, economic and military. The simple-minded argument we are following is a real and present threat to what really matters.
More posts related to this topic:
https://williamjrider.wordpress.com/2014/12/08/the-unfinished-business-of-modeling-simulation/
A key part of the first generation’s methods success was the systematic justification for the form of limiters via a deep mathematical theory. This was introduced by Ami Harten with his total variation diminishing (TVD) methods. This nice structure for limiters and proofs of the non-oscillatory property really allowed these methods to take off. To amp things up even more, Sweby did some analysis and introduced a handy diagram to visualize the limiter and determine simply whether it fit the bill for being a monotone limiter. The theory builds upon some earlier work of Harten that showed how upwind methods provided a systematic basis for reliable computation through vanishing viscosity.
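For the curious, here is a small Python sketch of two classical limiters of the kind Sweby's diagram is used to analyze; the limiter functions and the region check are the standard textbook forms, while the variable names are illustrative.

```python
# A sketch of two classical flux limiters, phi(r), of the kind Sweby's diagram
# is used to analyze.  r is the ratio of consecutive solution gradients.
# The TVD region requires 0 <= phi(r) <= min(2r, 2) for r > 0 (phi = 0 for
# r <= 0); second-order accuracy additionally needs phi(1) = 1.
import numpy as np

def minmod(r):
    return np.maximum(0.0, np.minimum(1.0, r))

def van_leer(r):
    return (r + np.abs(r)) / (1.0 + np.abs(r))

# quick check that both limiters sit inside the TVD region for r >= 0
r = np.linspace(0.0, 5.0, 501)
for name, phi in (("minmod", minmod(r)), ("van Leer", van_leer(r))):
    inside = np.all((phi >= 0.0) & (phi <= np.minimum(2.0 * r, 2.0)))
    print(f"{name:8s}: inside the TVD region = {inside}")
```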
The TVD and monotonicity-preserving methods, while reliable and tunable through these functional relations, have serious shortcomings. These methods have serious limitations in accuracy, especially near detailed features in solutions. These detailed structures are significantly degraded because methods based on these principles are only first-order accurate in these areas, basically reducing to upwind differencing. In mathematical terms this means that the solutions are first-order in the L-infinity norm, approximately order one-and-a-half in the L-2 (energy) norm, and second-order in the L-1 norm. The L-1 norm is natural for shocks, so it isn't a complete disaster. While this class of method is vastly better than their linear predecessors, effectively providing solutions completely impossible before, the methods are imperfect. The desire is to remove the accuracy limitations of these methods, especially to avoid degradation of features in the solution.
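Restating those rates in formula form (h is the mesh spacing and e the error near a smooth extremum), as quoted above:

```latex
% Accuracy of TVD / monotonicity-preserving methods near smooth extrema,
% restating the rates quoted above:
\[
  \|e\|_{L_\infty} = \mathcal{O}(h), \qquad
  \|e\|_{L_2} \approx \mathcal{O}(h^{3/2}), \qquad
  \|e\|_{L_1} = \mathcal{O}(h^{2}).
\]
```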
A big part of understanding the issues with ENO comes down to how the method works. The approximation is made by hierarchically selecting the smoothest approximation (in some sense) at each order and working one's way up to high order. In this way the selection of the second-order term is dependent on the first-order term, the third-order term is dependent on the selected second-order term, the fourth-order term on the third-order one, and so on. This makes the ENO method sensitive to small variations in the data, and prey to pathological data. For example, the data could be chosen so that the linearly unstable stencil is preferentially chosen, leading to nasty solutions. The safety of the relatively smooth and dissipative method makes up for these dangers. Still, these problems resulted in the creation of the weighted ENO (WENO) methods, which have basically replaced ENO for all intents and purposes.
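Here is a sketch of the hierarchical selection idea, written for point values on a uniform grid; the implementation details (undivided differences, the anchoring, no boundary handling) are illustrative simplifications rather than the definitive ENO recipe.

```python
# A sketch of ENO's hierarchical stencil selection for point values f on a
# uniform grid: start from node i and at each order extend the stencil to the
# side with the smaller undivided difference.  Illustrative only; no boundary
# handling is included.
import numpy as np

def eno_stencil(f, i, order):
    """Return the indices of the ENO stencil of width `order` anchored at i."""
    left, right = i, i           # current stencil is [left, right]
    for _ in range(order - 1):
        width = right - left + 1
        # highest-order differences if we extend left vs right
        d_left  = np.diff(f[left - 1 : right + 1], n=width)[0]
        d_right = np.diff(f[left : right + 2],     n=width)[0]
        if abs(d_left) < abs(d_right):
            left -= 1            # the left extension looks smoother
        else:
            right += 1           # the right extension looks smoother
    return np.arange(left, right + 1)

# a step in the data: the stencil should avoid crossing the jump at index 5-6
f = np.array([0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1.])
print(eno_stencil(f, i=4, order=3))   # stays on the smooth side: [3 4 5]
```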
WENO gets rid of the biggest issues with ENO by making the approximation much less sensitive to small variations in the data, and by biasing the selection toward more numerically stable approximations. Moreover, the solution to these issues allows an even higher-order approximation to be used if the solution is smooth enough to allow it. Part of the issue is the nature of ENO's approach to approximation. ENO is focused on diminishing the dissipation at extrema, but systematically creates lower fidelity approximations everywhere else. If one looks at TVD approximations, the "limiters" have a lot of room for non-oscillatory solutions if the solution is locally monotone. This is part of the power of these methods, allowing great fidelity in regions away from extrema. By not capitalizing on this, the ENO methods failed to build successfully on the TVD foundation. WENO methods come closer, but still produce far too much dissipation and too little fidelity to fully replace monotonicity-preservation in practical calculations. Nonetheless, ENO and WENO methods are pragmatically and empirically known to be nonlinearly stable.
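For reference, a sketch of the classical fifth-order (Jiang-Shu) WENO reconstruction showing how the smoothness indicators drive the nonlinear weights; the coefficients follow the standard published scheme, while the function name and test values are just illustrative.

```python
# A sketch of the classical (Jiang-Shu) fifth-order WENO reconstruction of the
# left-biased interface value from five cell averages v[i-2..i+2].  The
# smoothness indicators beta_k drive the nonlinear weights; eps is the usual
# small regularization.
import numpy as np

def weno5_reconstruct(vm2, vm1, v0, vp1, vp2, eps=1e-6):
    # three third-order candidate reconstructions at x_{i+1/2}
    p0 = ( 2*vm2 - 7*vm1 + 11*v0 ) / 6.0
    p1 = (  -vm1 + 5*v0  +  2*vp1) / 6.0
    p2 = ( 2*v0  + 5*vp1 -   vp2 ) / 6.0

    # smoothness indicators
    b0 = 13/12*(vm2 - 2*vm1 + v0 )**2 + 0.25*(vm2 - 4*vm1 + 3*v0)**2
    b1 = 13/12*(vm1 - 2*v0  + vp1)**2 + 0.25*(vm1 - vp1)**2
    b2 = 13/12*(v0  - 2*vp1 + vp2)**2 + 0.25*(3*v0 - 4*vp1 + vp2)**2

    # ideal (linear) weights and their nonlinear counterparts
    d = np.array([0.1, 0.6, 0.3])
    alpha = d / (eps + np.array([b0, b1, b2]))**2
    w = alpha / alpha.sum()
    return w @ np.array([p0, p1, p2])

# smooth (linear) data recovers the high-order value; a jump in the data biases
# the weights toward the smooth stencils instead of oscillating
print(weno5_reconstruct(1.0, 2.0, 3.0, 4.0, 5.0))   # 3.5 for linear data
```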
Given the lack of success of the second generation of methods, we have a great need for a third generation of nonlinear methods that might displace the TVD methods. Part of the issue with the ENO methods is the lack of a strict set of constraints to guide their development. While WENO methods also lack such constraints, they do have a framework that allows for design through the construction of the smoothness sensors, which help determine the weights of the scheme. It allows for a lot of creativity, but the basic framework still leads to too much loss of fidelity compared to TVD methods. When one measures the accuracy of real solutions to practical problems, WENO still has difficulty being competitive with TVD. Don't be fooled by comparisons that show WENO being vastly better than TVD; the TVD method chosen for comparison is absolute shit. A good TVD method is very competitive, so the comparison is frankly disingenuous. For this reason something better is needed or progress will continue to stall.
The median is what a lot of people think of when you say "household income". It is a statistical measure of central tendency that is a harder-to-compute alternative to the friendly, common mean value. For large data sets the median is tedious and difficult to compute while the mean is straightforward and easy. The median is the middle entry of the ordered list of numerical data, thus the data being studied needs to be ordered, which is expensive to perform. The mean of the data is simple to compute. On the other hand, by almost any rational measure the median is better than the mean. For one thing, the median is not easily corrupted by outliers in the data (data that is inconsistent or corrupted); it quite effectively ignores them. The same outliers immediately and completely corrupt the mean. The median is strongly associated with the one-norm, which is completely awesome and magical.
This connection is explored through the amazing and useful field of compressed sensing, which uses the one norm to do some really sweet (cool, awesome, spectacular, …) stuff.
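A tiny demonstration of the robustness claim above; the numbers are made up purely for illustration.

```python
# One corrupted value drags the mean far from the bulk of the data while the
# median barely moves.
import numpy as np

incomes = np.array([42., 45., 47., 51., 55., 58., 61.])   # illustrative data
corrupted = incomes.copy()
corrupted[-1] = 1.0e6                                      # a single wild outlier

print("mean   clean -> corrupted:", incomes.mean(), corrupted.mean())
print("median clean -> corrupted:", np.median(incomes), np.median(corrupted))

# The one-norm connection: the median minimizes sum |x - m| while the mean
# minimizes sum (x - m)^2, which is why the outlier dominates the mean.
```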
The same basic recipe of bounding allows the parabolic limiter for the PPM method to be expressed quite concisely and clearly. The classic version of the PPM limiter involves "if" statements and seems rather unintuitive. With the median it is very clear and concise. The first step simply assures that the chosen (high-order) edge values lie within the bounds set by the neighboring cell values.
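Here is a sketch of that bounding step using the median of three values; the function names and test values are illustrative, not code lifted from any production PPM implementation.

```python
# The median of three values gives a one-line way to force a high-order edge
# value to lie between the neighboring cell values.
import numpy as np

def median3(a, b, c):
    # median(a, b, c) = max(min(a, b), min(max(a, b), c))
    return np.maximum(np.minimum(a, b), np.minimum(np.maximum(a, b), c))

def limit_edge(u_left, u_edge_high_order, u_right):
    """Return an edge value bounded by the adjacent cell values."""
    return median3(u_left, u_edge_high_order, u_right)

print(limit_edge(1.0, 1.4, 2.0))   # in bounds: unchanged (1.4)
print(limit_edge(1.0, 2.7, 2.0))   # out of bounds: clipped to 2.0
```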
This produced a concept for nonlinear schemes that I originally described as a "playoff" system. The most accurate stable approximation would be the champion of the playoffs and would inherit the properties of the set of schemes used to construct the playoff. This works as long as the playoff is properly orchestrated to produce stable and high-order approximations in a careful manner, with the median used to settle each competition. I had discussed this as a way to produce better ENO-like schemes. Perhaps I should have taken the biological analogy further in defining the scheme, in terms of what sort of offspring each decision point in the scheme breeds. Here the median is used to breed the schemes and produce more fit offspring. The goal at the end of the scheme is to produce an approximation with a well-chosen order of accuracy and a stability that can be proven.
One of the most important things about modern computational methods is their nonlinear approach to solving problems. These methods are easily far more important to the utility of modeling and simulation in the modern world than high performance computing. The sad thing is that little or no effort is going into extending and improving these approaches despite the evidence of their primacy. Our current investments in hardware are unlikely to yield much improvement whereas these methods were utterly revolutionary in their impact. The lack of perspective regarding this reality is leading to vast investment in computing technology that will provide minimal returns.
It is these nonlinear methods that have powered computational physics into a powerful technology. Without these methods we would not have the capacity to reliably simulate many scientifically interesting and important problems, or utilize simulation in the conduct of engineering. The epitome of modeling and simulation success is computational fluid dynamics (CFD), where these nonlinear methods have provided robust stability and stunning results that capture imaginations. In CFD these methods were utterly game changing, and the success of the entire field is predicated on their power. The key to this power is poorly understood, and too often credited to computing hardware instead of the real source of progress: models, methods and algorithms, with the use of nonlinear discretizations being primal.
Next in our tour of basic foundational theorems is Godunov's theorem, which tells us a lot about what is needed. In its original form it's a bit of a downer: you can't have a linear method that is both higher than first-order accurate and non-oscillatory (monotonicity preserving). The key is to turn the barrier on its head: you can have a nonlinear method that is higher than first-order accurate and non-oscillatory. This is the instigation for the topic of this post, the need for and power of nonlinear methods. I'll posit that the concept may actually go beyond hyperbolic equations, but nonlinear discretizations are primarily applied to hyperbolic equations.
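For reference, the usual textbook statement of the theorem:

```latex
% Godunov's theorem in its usual discrete form.  For a linear scheme with
% constant coefficients c_k,
\[
  u_j^{n+1} = \sum_k c_k\, u_{j+k}^{n},
\]
% the scheme is monotonicity preserving precisely when c_k >= 0 for all k,
% and any such scheme is at most first-order accurate (apart from the
% trivial case of an exact translation of the grid data).
```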
A few more theorems help to flesh out the basic principles we bring to bear. A key result is the theorem of Lax and Wendroff, which shows the value of discrete conservation: if one has conservation, then you can show that you are computing weak solutions. This must be combined with picking the right weak solution, as there are actually infinitely many weak solutions, all of them wrong save one. Getting the right weak solution requires sufficient dissipation, which produces entropy. Key results are due to Osher, who attached the character of (approximate) Riemann solvers to the production of sufficient dissipation to ensure physically relevant solutions. As we will describe, there are other means of introducing dissipation aside from Riemann solvers, but these lack some degree of theoretical support unless we can tie them directly to the Riemann problem. Of course most of us want physically relevant solutions, although lots of mathematicians act like this is not a primal concern! There is a vast phalanx of other theorems of practical interest, but I will end this survey with a last one, by Osher and Majda, with lots of practical import. Simply stated, this theorem limits the numerical accuracy we can achieve in regions affected by a discontinuous solution to first-order accuracy. The impacted region is bounded by the characteristics emanating from the discontinuity. This puts a damper on the zeal for formally high-order accurate methods, which needs to be considered in the context of this theorem.
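For reference, the standard statement of the Lax and Wendroff theorem mentioned above:

```latex
% The Lax-Wendroff theorem in its standard form.
If a consistent scheme in conservation form,
\[
  u_j^{n+1} = u_j^{n}
  - \frac{\Delta t}{\Delta x}\left( F_{j+1/2} - F_{j-1/2} \right),
\]
converges boundedly almost everywhere as $\Delta t, \Delta x \to 0$, then the
limit is a weak solution of the conservation law $u_t + f(u)_x = 0$.  An
entropy condition (sufficient dissipation) is still required to select the
physically relevant weak solution.
```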
Let's return to the idea of limiters and act to dissuade the common view of limiters and their intrinsic connection to dissipation. The connection is real, but less direct than commonly acknowledged. A limiter is really a means of adaptive stencil selection, separate from dissipation. They may be combined, but usually not with a good comprehension of the consequences. Another way to view the limiter is as a way of selecting the appropriate bias in the stencil used to difference an equation, based upon the application of a principle. The principle most often used for limiting is some sort of boundedness in the representation, which may equivalently be associated with selecting a smoother (nicer) neighborhood on which to execute the discretization. The way an equation is differenced certainly impacts the nature of and need for dissipation in the solution, but the effect is indirect. Put differently, the amount of dissipation needed with the application of a limiter varies both with the limiter itself and with the dissipation mechanism, but the two are independent. This gets into the whole difference between flux limiters and geometric limiters, a topic worth some digestion.
The geometric approach separates these effects into independent steps: it puts bounds on the variables being solved, and then relies on a variable-agnostic mechanism for stabilization in the form of a Riemann solver. Both approaches are successful in solving complex systems of equations and have their rabid adherents (I favor the reconstruct-Riemann approach). The flux form can be quite effective and produces better extensions to multiple dimensions, but can also involve heavy-handed dissipation mechanisms.
The first stabilization of numerical methods was found with the original artificial viscosity (developed by Robert Richtmyer to stabilize, and make useful, John von Neumann's numerical method for shock waves). The name "artificial viscosity" is vastly unfortunate because the dissipation is utterly physical. Without its presence the basic numerical method was utterly and completely useless, leading to a catastrophic instability (this almost certainly helped instigate the investigation of numerical stability, along with instabilities in integrating parabolic equations). Physically, most interesting nonlinear systems produce dissipation even in the limit where the explicit dissipation can be regarded as vanishingly small. This is true for shock waves and turbulence, where the dissipation in the inviscid limit has remarkably similar forms structurally. Given this basic need, the application of some sort of stabilization is an absolute necessity to produce meaningful results, both from a purely numerical respect and through the implicit connection to the physical world. I've written recently on the use of hyperviscosity as yet another mechanism for producing dissipation. Here the artificial viscosity is the archetype of hyperviscosity and its simplest form. As I've mentioned before, the original turbulent subgrid model was also based directly upon the artificial viscosity devised by Richtmyer (often misattributed to von Neumann, although their collaboration clearly was important).
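For reference, the von Neumann-Richtmyer artificial viscosity in its usual textbook form, a pressure-like term active only in compression ($C_q$ is an order-one constant and $\Delta x$ the cell width):

```latex
% The von Neumann-Richtmyer artificial viscosity in its usual textbook form.
\[
  q =
  \begin{cases}
    C_q\, \rho\, (\Delta x)^2 \left( \dfrac{\partial u}{\partial x} \right)^{2},
      & \dfrac{\partial u}{\partial x} < 0 \quad \text{(compression)}, \\[1.2em]
    0, & \text{otherwise.}
  \end{cases}
\]
```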