
The Regularized Singularity

~ The Eyes of a citizen; the voice of the silent


Author Archives: Bill Rider

My Job Should Be Awesome; It's Not, Why?

29 Friday Jul 2016

Posted by Bill Rider in Uncategorized

≈ Leave a comment

There is nothing quite so useless, as doing with great efficiency, something that should not be done at all.

― Peter F. Drucker

This post is going to be a bit more personal than usual; I'm trying to get my head around why work is deeply unsatisfying, and how the current system seems to conspire to destroy all the awesome potential it should have. My job should have all the things we desire: meaning, empowerment and a quest for mastery. At some level we seem to be in an era that belittles all dreams and robs work of the meaning it should have, disempowers most, and undermines mastery of anything. Worse yet, mastery of things seems to invoke outright suspicion, being regarded more as a threat than a resource. On the other hand, I freely admit that I'm lucky and have a good, well-paying, modestly empowering job compared to the average Joe or Jane. So many have it so much worse. The flipside of this point of view is that we need to improve work across the board; if the best jobs are this crappy, one can scarcely imagine how bad things are for normal or genuinely shitty jobs.

In my overall quest for quality and mastery I refuse to settle for this, and that only makes my dilemma all the more confounding. As I state in the title, my job has almost everything going for it and should be damn close to unambiguously awesome. In fact my job used to be awesome, and lots of forces beyond my control have worked hard to completely fuck that awesomeness up. Getting back to the confounding aspects of the situation, the powers that be seem to be absolutely hell bent on continuing to fuck things up, and turn awesome jobs into genuinely shitty ones. I'm sure the shitty jobs are almost unbearable. Nonetheless, I know that, grading on a curve, my job is still awesome, but I don't settle for success being defined by being less shitty than other people. That's just a recipe for things to get shittier! If I have these issues with my work, what the hell is the average person going through?

So before getting to all the things fucking everything up, let's talk about why the job should be so incredibly fucking awesome. I get to be a scientist! I get to solve problems, and do math and work with incredible phenomena (some of which I've tattooed on my body). I get to invent things like new ways of solving problems. I get to learn and grow and develop new skills, hone old skills and work with a bunch of super smart people who love to share their wealth of knowledge. I get to write papers that other people read and build on, I get to read papers written by a bunch of people who are way smarter than me, and if I understand them I learn something. I get to speak at conferences (which can be in nice places to visit) and listen at them too on interesting topics, and get involved in deep debates over the boundaries of knowledge. I get to contribute to solving important problems for mankind, or my nation, or simply for the joy of solving them. I work with incredible technology that is literally at the very bleeding edge of what we know. I get to do all of this and provide a reasonably comfortable living for my loved ones.

If failure is not an option, then neither is success.

― Seth Godin

All of the above is true, and here we get to the crux of the problem. When I look at each day I spend at work almost nothing in that day supports any of this. In a very real sense all the things that are awesome about my job are side projects or activities that only exist in the “white space” of my job. The actual job duties that anyone actually gives a shit about don’t involve anything from the above list of awesomeness. Everything I focus on and drive toward is the opposite of awesome; it is pure mediocre drudgery, a slog that starts on Monday and ends on Friday, only to start all over again. In a very deep and real sense, the work has evolved into a state where all the awesome things about being a scientist are not supported at all, and every fucking thing done by society at large undermines it. As a result we are steadily and completely hollowing out value, meaning, and joy from the work of being a scientist. This hollowing is systematic, but serves no higher purpose that I can see other than to place a sense of safety and control over things.

The Cul-de-Sac (French for "dead end") … is a situation where you work and work and work and nothing much changes

― Seth Godin


So the real question to answer is how did we get to this point? How did we create systems whose sole purpose seems to be robbing life of meaning and value? How are formerly great institutions being converted into giant steaming piles of shit? Why is work becoming such a universal shit show? Work with meaning and purpose should be something society values, both from the standpoint of pure productivity and out of respect for humanity. Instead we are turning away from making work meaningful, and making steady choices that destroy the meaning in work. The forces at play are a combination of fear, greed, and power. Each of these forces has a role to play in the widespread and deep destruction of a potentially better future. These forces provide short-term comfort, but long-term damage that ultimately leaves us poorer both materially and spiritually.

Men go to far greater lengths to avoid what they fear than to obtain what they desire.

― Dan Brown

Of these forces, fear is the most acute and widespread. Fear is harnessed by the rich and powerful to hold onto and grow their power, their stranglehold on society. Across society we see people looking at the world and saying "Oh shit, this is scary, make it stop!" The rich and powerful can harness this chorus of fear to hold onto and enhance their power. The fear comes from the unknown and from change, and it drives people into attempting to control things, which also suits the needs of the rich and powerful. This control gives people a false sense of safety and security at the cost of empowerment and meaning. For those at the top of the food chain, control is what they want because it allows them to hold onto their largesse. The fear is basically used to enslave the population and cause them to willingly surrender in exchange for promises of safety and security against a myriad of fears. In most cases we don't fear the thing that threatens us most: the forces that work steadfastly to rob our lives of meaning. At work, fear is the great enemy of all that is good, killing meaning, empowerment and mastery in one fell swoop.

Power does not corrupt. Fear corrupts… perhaps the fear of a loss of power.

― John Steinbeck

How does this manifest itself in my day-to-day work? A key mechanism in undermining meaning in work is the ever more intrusive micromanagement of the money running research. The control comes under the guise of accountability (who can argue with that, right?). The accountability leads to a systematic diminishment in achievement and has much more to do with a lack of societal trust (which embodies part of the mechanics of fear). Instead of ensuring better results and money well spent, the whole dynamic creates a virtual straitjacket for everyone in the system that assures they actually create, learn and produce far less. We see research micromanaged, and projectized in ways that are utterly incongruent with how science can be conducted. The lack of trust translates to a lack of risk, and the lack of risk equates to a lack of achievement (with empowerment and mastery sacrificed at the altar of accountability). This is only one aspect of how the control works to undermine work. There are so many more.

Our greatest fear should not be of failure but of succeeding at things in life that don’t really matter.

― Francis Chan

Part of these systematic control mechanisms at play is the growth of the management culture in all these institutions. Instead of valuing the top scientists and engineers who produce discovery, innovation and progress, we now value the management class above all else. The managers manage the people, money and projects that have come to define everything. This is as true at the Labs as it is at universities, where the actual mission of both has been sacrificed to money and power. Neither the Labs nor the universities are producing what they were designed to create (weapons, students, knowledge). Instead they have become money-laundering operations whose primary service is the careers of managers. All one has to do is see who the headline grabbers are from any of these places; it's the managers (who by and large show no leadership). These managers are measured in dollars and people, not any actual achievements. All of this is enabled by control, and control makes people feel safe and in charge. As long as reality doesn't intrude we will keep going down this deep death spiral.

We have priorities and emphases in our work and money that have nothing to do with the reason our Labs, universities or even companies exist. We have useless training that serves absolutely no purpose other than to check off a box. The excellence or quality of the work done has no priority at all. We have gotten to the point where peer review is a complete sham, and any honest assessment of the quality of the work is met with hostility. We should all wrap our heads collectively around this maxim of the modern workplace: it can be far worse for your career to demand technical quality as part of what you do than to do shoddy work. We are heading headlong into a mode of operation where mediocrity is enshrined as a key organizational value to be defended against potential assaults by competence. All of this can be viewed as the ultimate victory of form over substance. If it looks good, it must be good. The result is that the appearances are managed, and anything of substance is rejected.

The result of the emphasis on everything except the core mission of our organizations is the systematic devaluation of those missions, along with a requisite creeping incompetence and mediocrity. In the process the meaning and value of the work takes a fatal hit. Actually expressing a value system of quality and excellence is now seen as a threat and becomes a career-limiting perspective. A key aspect of the dynamic to recognize is the relative simplicity of running mediocre operations without any drive for excellence. It's great work if you can get it! If your standards are complete shit, almost anything goes, and you avoid the need for conflict almost entirely. In fact the only source of conflict becomes the need to drive away any sense of excellence. Any hint of quality or excellence has the potential to overturn this entire operation and the sweet deal of running it. So any quality ideas are attacked and driven out as surely as the immune system attacks a virus. While this might be a tad hyperbolic, it's not too far off at all, and it is the actual bull's-eye for an ever growing swath of our modern world.

The key value is money and its continued flow. The whole system runs on a business model of getting money regardless of what that entails doing or how it is done. Of course having no standards makes this so much easier; if you'll do any shitty thing as long as they pay you for it, management is easier. Without standards of quality this whole operation becomes self-replicating. In a very direct way the worst thing one can do is take on hard work, so the system is wired to drive good work away. You're actually better off doing shitty work held to shitty standards. Doing the right thing, or doing the thing right, is viewed as a direct threat to the flow of money and generates an attack. The prime directive is money to fund people and measure the success of the managers. Whether or not the money generates excellent, meaningful work or focuses on something of value simply does not matter. It becomes a completely vicious cycle where money breeds more money, and more money can be bred by doing simple shoddy work than by asking hard questions and demanding correct answers. In this way we can see how mediocrity becomes the value that is tolerated and excellence is reviled. Excellence is a threat to power; mediocrity simply accepts being lorded over by the incompetent.


At some level it is impossible to disconnect what is happening in science from the broader cultural trends. Everything happening in the political climate today is part of the trends I see at work. The political climate is utterly and completely corrosive, and the work environment is the same. In the United States we have had 20 years of government that has been engineered not to function. This supports the argument that government is bad and doesn't work (and should be smaller). The fact is that it is engineered not to work by the proponents of this philosophy. The result is a self-fulfilling prophecy: government doesn't work if you don't try to make it work. If we actually put effort into making it work, and valued expertise and excellence, it would work just fine. We get shit because that's what we ask for. If we demanded excellence and performance, and actually held people accountable for it, we might actually get it, but it would be hard, it would be demanding. The problem is that success would disprove the maxim that government is bad and doesn't work.

One of my friends recently pointed out that the people managing and running the programs that fund the work at the Labs in Washington actually make less than our Postdocs at the Labs. The result is that we get what we pay for, incompetence, which grows more manifestly obvious with each passing year. If we want things to work we need to hire talented people and hold them to high standards, which means we need to pay them what they are worth.

We see a large body of people in society who are completely governed by fear above all else. The fear is driving people to make horrendous and destructive decisions politically. The fear is driving the workplace into the same set of horrendous and destructive decisions. It's not clear whether we will turn away from this mass fear before things get even worse. I worry that both work and politics will be governed by these fears until they push us over the precipice to disaster. Put differently, the shit show we see in public through politics mirrors the private shit show in our workplaces. The shit is likely to get much worse before it gets better.

There are two basic motivating forces: fear and love. When we are afraid, we pull back from life. When we are in love, we open to all that life has to offer with passion, excitement, and acceptance. We need to learn to love ourselves first, in all our glory and our imperfections. If we cannot love ourselves, we cannot fully open to our ability to love others or our potential to create. Evolution and all hopes for a better world rest in the fearlessness and open-hearted vision of people who embrace life.

― John Lennon

A More Robust, Less Fragile Stability for Numerical Methods

25 Monday Jul 2016

Posted by Bill Rider in Uncategorized

≈ 2 Comments

 

Science is the process that takes us from confusion to understanding…

― Brian Greene

Stability is essential for computation to succeed. Better stability principles can pave the way for greater computational success. We are in dire need of new, expanded concepts of stability that provide paths forward toward uncharted vistas of simulation.

Without stability numerical methods are completely useless. Even a modest amount of instability can completely undermine and destroy the best simulation intentions. Stability became a thing right after computers became a thing. Early work on ordinary differential equations encountered instability, but because the computations were done by hand the results were always suspect. The availability of automatic computation via computers ended the speculation, and it became clear that numerical methods could become unstable. With proof of a clear issue in hand, great minds went to work to put this potential chaos to order. This is the kind of great work we should be asking applied math to do today, and sadly we are not because of our over-reliance on raw computing power.

Von Neumann devised the first technique for stability analysis after encountering instability at Los Alamos during World War 2. This method is still the gold standard for analysis today in spite of rather profound limitations on its applicability. In the early 1950's Lax came up with the equivalence theorem (interestingly both Von Neumann and Lax worked with Robert Richtmyer, https://williamjrider.wordpress.com/2016/05/20/the-lax-equivalence-theorem-its-importance-and-limitations/), which only highlighted the importance of stability more boldly. Remarkably, stability for ordinary differential equation methods came later than for partial differential equations, in Dahlquist's groundbreaking work. He produced a stability theory and equivalence theory that paralleled the work of Von Neumann and Lax for PDEs. All he needed were computers to drive the need for the work. We will note that the PDE theory is all for linear methods and linear equations, while the ODE theory is for linear methods but applies to nonlinear ODEs.

Once a theory for stability was established, computations could proceed with enough guarantee of a solution to make progress. For a very long time this stability work was all that was needed. Numerical methods, algorithms and general techniques galore came into being and application, covering a broad swath of the physics and engineering world. Gradually, over time, we started to see computation spoken of as a new, complementary practice in science that might stand shoulder to shoulder with theory and experiment. These views are a bit on the grandiose side of things, where a more balanced perspective might rationally note that numerical methods allow complex nonlinear models to be solved where classical analytical approaches are quite limited. At this point it's wise to confront the issue that might be creeping into your thinking: our theory is mostly linear while the utility of computation is almost all nonlinear. We have a massive gap between theory and utility with virtually no emphasis, focus or effort to close it.

This is so super important that I've written about it before: doing the basic Von Neumann methodology using Mathematica, https://williamjrider.wordpress.com/2014/07/15/conducting-von-neumann-stability-analysis/ & https://williamjrider.wordpress.com/2014/07/21/von-neumann-analysis-of-finite-difference-methods-for-first-order-hyperbolic-equations/, in the guise of thoughts about robustness, https://williamjrider.wordpress.com/2014/12/03/robustness-is-stability-stability-is-robustness-almost/, and practical considerations for hyperbolic PDEs, https://williamjrider.wordpress.com/2014/01/11/practical-nonlinear-stability-considerations/.

Running headlong through this arc of thought are lessons learned from hyperbolic PDEs. Hyperbolic PDEs have always been at the leading edge of computation because they are important to applications and difficult, which has attracted a lot of real, unambiguous genius to the problem. I've mentioned the cadre of geniuses who blazed the trails 60 to 70 years ago (Von Neumann, Lax, Richtmyer; https://williamjrider.wordpress.com/2014/05/30/lessons-from-the-history-of-cfd-computational-fluid-dynamics/, https://williamjrider.wordpress.com/2015/06/25/peter-laxs-philosophy-about-mathematics/). We are in dire need of new geniuses to slay the nonlinear dragons that stand in the way of progress. Unfortunately there is little or no appetite or desire for progress, and the general environment stands squarely in opposition. The status quo is viewed as all we need, https://williamjrider.wordpress.com/2015/07/10/cfd-codes-should-improve-but-wont-why/, and progress in improving basic capabilities and functionality has disappeared except for utilizing ever more complex and complicated hardware (with ever more vanishing practical returns).
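For readers who want to see the mechanics without firing up Mathematica, here is a minimal sketch of Von Neumann analysis in Python/NumPy (my choice of tooling for illustration, not what the linked posts use). It evaluates the amplification factor G(θ) for two classic schemes for linear advection and reports the maximum of |G| over all Fourier modes; anything above one is linearly unstable.

import numpy as np

def max_amplification(G, C, ntheta=721):
    # Scan the Fourier symbol G(C, theta) over all wave angles and return max |G|;
    # a maximum above 1 signals Von Neumann (linear) instability.
    theta = np.linspace(-np.pi, np.pi, ntheta)
    return np.abs(G(C, theta)).max()

# Amplification factors for u_t + a u_x = 0 with Courant number C = a*dt/dx.
ftcs = lambda C, th: 1.0 - 1j * C * np.sin(th)             # forward-time, centered-space
upwind = lambda C, th: 1.0 - C * (1.0 - np.exp(-1j * th))  # first-order upwind

for C in (0.5, 1.0, 1.5):
    print(C, max_amplification(ftcs, C), max_amplification(upwind, C))

The centered scheme exceeds one for any positive Courant number (unconditionally unstable), while upwind stays at or below one for C ≤ 1, recovering the familiar CFL condition.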

Antifragility is beyond resilience or robustness. The resilient resists shocks and stays the same; the antifragile gets better.

― Nassim Nicholas Taleb

I've been thinking about concepts of nonlinear stability for a while, wondering if we can move past the simple and time-honored concepts developed so long ago. Recently I've taken a look at Taleb's "anti-fragile" concept and realized that it might have some traction in this arena. In a sense the nonlinear stability concepts developed heretofore are akin to anti-fragility, where the methodology is developed for and works best in the worst-case scenario. In the case of nonlinear stability for hyperbolic PDEs the worst-case scenario is a linear discontinuity, where the solution has a jump and the linear solution is utterly unforgiving of any errors. In this crucible, all the bad things that can happen to a solution arise: either overly diffusive, low-accuracy solutions, or high-accuracy solutions with oscillations producing demonstrably unphysical results.

No structure, even an artificial one, enjoys the process of entropy. It is the ultimate fate of everything, and everything resists it.

― Philip K. Dick

When the discontinuity is associated with a real physical solution, these twin and opposing maladies are both unacceptable. The diffusion leads to a complete waste of computational effort that dramatically inhibits any practical utility of numerical methods. For more complex systems of equations where turbulent, chaotic solutions would naturally arise, the diffusive methods drive all solutions to be laminar and boring (not matching physical reality in essential aspects!). On the plus side of the ledger, the diffusive solution is epically robust and reliable, a monument to stability. On the other hand, high-order methods based on the premise that the solution is smooth and differentiable (i.e., nice and utterly ideal) are the epitome of fragility. The oscillations can easily put the solution into unphysical states that render it physically absurd.

Difficulty is what wakes up the genius

― Nassim Nicholas Taleb

Now we get to the absolute genius of nonlinear stability that arose from this challenge. Rather than forcing us to accept one or the other, we introduce a concept by which we can have the best of both, using whatever discretization is most appropriate for the local circumstances. Thus we have a solution adaptive method that chooses the right approximation for the solution locally. Therefore a different method may be used for every place in space and time. The key concept is the rejection of the use of a linear discretization where the same method is applied everywhere in the solution, which caused the entire problem elucidated above. Instead we introduce a mechanism to analyze the solution and introduce an approximation appropriate for the local structure of the solution.

The desired outcome is to use the high-order solution as much as possible, but without inducing the dangerous oscillations. The key is to build upon the foundation of the very stable, but low-accuracy, dissipative method. The theory that can be utilized makes the dissipative structure of the solution a nonlinear relationship. This produces a test of the local structure of the solution, which tells us when it is safe to be high-order, and when the solution is so discontinuous that the low-order solution must be used. The result is a solution that is high-order as much as possible, and inherits the stability of the low-order solution, gaining purchase on its essential properties (asymptotic dissipation and entropy principles). These methods are so stable and powerful that one might utilize a completely unstable method as one of the options with very little negative consequence. This class of methods revolutionized computational fluid dynamics, and allowed relative confidence in the use of these methods to solve practical problems.
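To make the low-order/high-order blend concrete, here is a minimal Python sketch of my own (a generic minmod-limited scheme for linear advection, not any particular production method). The limiter tests the local solution structure and dials the accurate Lax-Wendroff correction back to robust first-order upwinding wherever the data looks discontinuous.

import numpy as np

def minmod(a, b):
    # Take the smaller-magnitude difference when the two agree in sign, zero otherwise;
    # this is the nonlinear test of local solution structure.
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def limited_step(u, C):
    # One periodic update of u_t + u_x = 0 with Courant number 0 < C <= 1.
    du = np.roll(u, -1) - u               # forward difference u_{j+1} - u_j
    slope = minmod(du, np.roll(du, 1))    # limited slope in cell j
    # Upwind flux plus a limited anti-diffusive (Lax-Wendroff) correction at face j+1/2.
    flux = u + 0.5 * (1.0 - C) * slope
    return u - C * (flux - np.roll(flux, 1))

When the limiter returns the full forward difference this is exactly Lax-Wendroff; when it returns zero the scheme collapses to upwind, so the method is second-order where the solution is smooth and falls back to the dissipative, stable scheme at jumps.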

Instead of building on the lessons learned with these methods, we seem to have entered an era governed by the belief that no further progress on this front is needed. We have lost sight of the benefits of looking to produce better methods. Better methods, like those that are nonlinearly stable, open new vistas of simulation that presently are closed to systematic and confident exploration. The methods described above did just this and produced the current (over) confidence in CFD codes.

A good question is why haven't these sorts of ideas spread to a wider swath of the numerical world? The core answer is "good enough" thinking, which is far too pervasive today. Part of the issue is that the immediacy of need seen with hyperbolic PDEs isn't there in other areas. Take time integration methods, where the consensus view is that ODE integration is good enough, thank you very much, and doesn't need the effort. To be honest, it's the same idea as with CFD codes: they are good enough and we don't need the effort. Other areas of simulation like parabolic and elliptic PDEs might also use such methods, but again the need is far less obvious than for hyperbolic PDEs. The truth is that we have made rather stunning progress in both areas and the breakthroughs have put forth the illusion that the methods of today are good enough. We need to recognize that this is awesome for those who developed the status quo, but a very bad thing if there are other breakthroughs ripe for the taking. In my view we are at such a point and missing the opportunity to make the "good enough" into "great" or even "awesome". Nonlinear stability is deeply associated with adaptivity and ultimately with more optimal and appropriate approximations for the problem at hand.

If you’re any good at all, you know you can be better.
― Lindsey Buckingham

So what might a more general principle look like as applied to ODE integration? Let’s explore this along with some ideas of how to extend the analysis to support such paths. The acknowledged bullet-proof method is backwards Euler, u^{n+1} = u^n + h f(u^{n+1}), which is A-stable and the ODE gold standard of robustness. It is also first-order accurate. One might like to use something higher order with equal reliability, but alas this is exactly the issue we have with hyperbolic PDEs. What might a nonlinear approximation look like?

Let's assume we will stick with the BDF (backwards differentiation formula) methods, and we have an ODE whose solution should remain positive definite (or preserve some other sign or property). We will stick with positivity for simplicity's sake. The fact is that perfectly linearly stable methods may well produce solutions that lose positivity. Staying with simplicity of exposition, the second-order BDF method is \frac{3}{2} u^{n+1} = 2 u^n - \frac{1}{2} u^{n-1} + h f(u^{n+1}). This can be usefully rearranged to \frac{3}{2} u^{n+1} - h f(u^{n+1}) = 2 u^n -\frac{1}{2} u^{n-1}. If the right hand side of this expression is positive, 2 u^n -\frac{1}{2} u^{n-1} > 0, and the eigenvalue of f at u^{n+1} is negative, we have faith that u^{n+1} > 0. If the right hand side is negative, it would be wise to switch to the backwards Euler scheme for this time step. We could easily envision taking this approach to higher and higher order.
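Here is a minimal sketch of that switch for the scalar linear test problem u' = \lambda u, where the implicit solves are available in closed form (a simplification on my part; a general nonlinear f would require a Newton iteration inside each branch).

def positivity_step(u_n, u_nm1, h, lam):
    # Use BDF2 when the explicit combination 2*u^n - u^{n-1}/2 is positive,
    # otherwise fall back to backward Euler for this step.
    rhs = 2.0 * u_n - 0.5 * u_nm1
    if rhs > 0.0:
        return rhs / (1.5 - h * lam)   # BDF2: (3/2 - h*lam) u^{n+1} = 2 u^n - u^{n-1}/2
    return u_n / (1.0 - h * lam)       # backward Euler: (1 - h*lam) u^{n+1} = u^n

For a decaying problem (\lambda < 0) both denominators are positive, so the fallback branch guarantees u^{n+1} > 0 whenever u^n > 0.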

Further extensions of nonlinear stability would be useful for parabolic PDEs. Generally parabolic equations are fantastically forgiving, so doing anything more complicated is not prized unless it produces better accuracy at the same time. Accuracy is eminently achievable because parabolic equations generate smooth solutions. Nonetheless these accurate solutions can still produce unphysical effects that violate other principles. Positivity is rarely threatened, although it would be a reasonable property to demand. It is more likely that the solutions will violate some sort of entropy inequality in a mild manner. Instead of producing something demonstrably unphysical, the solution would simply not generate enough entropy to be physical. As such we can see solutions approaching the right solution, but in a sense from the wrong direction, which threatens to produce non-physically-admissible solutions. One potential way to think about this might be in an application to heat conduction. One can examine whether or not the computed flow of heat matches the proper direction of heat flow locally, and if a high-order approximation does not, either choose another high-order approximation that does, or limit to a lower-order method with unambiguous satisfaction of the proper direction. The intrinsically non-fatal impact of these flaws means they are not really addressed.
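As an illustration of that direction check (entirely my own sketch, with hypothetical precomputed flux arrays), one could filter a high-order heat flux face by face and revert to a robust low-order flux wherever the candidate would move heat up the temperature gradient:

import numpy as np

def admissible_heat_flux(q_high, q_low, dT):
    # q_high, q_low: candidate fluxes at each face; dT: temperature jump across the face.
    # Since physically q = -k dT/dx, an admissible flux satisfies q * dT <= 0
    # (heat flows from hot to cold); keep the high-order flux only where it does.
    return np.where(q_high * dT <= 0.0, q_high, q_low)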

Mediocrity will never do. You are capable of something better.
― Gordon B. Hinckley

Thinking about the marvelous magical median (https://williamjrider.wordpress.com/2016/06/07/the-marvelous-magical-median/) spurred these thoughts to mind. I've been taken with this function's ability to produce accurate approximations from approximations that are lower order, and I have been pondering whether this property can extend to stability, either linear or nonlinear (the empirical evidence is strongly in favor). Could the same functionality be applied to other schemes or approximation methods? In the case of the median, if two of the three arguments have a particular property, the result inherits that property whenever those two arguments bound it. This is immensely powerful, and you can be off to the races by simply having two schemes that possess a given desirable property. Using this approach more broadly than it has been applied thus far would be an interesting avenue to explore.
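For concreteness, the median of three values takes only a couple of comparisons, and the comments note why the bounding property does the work (this little sketch is mine, not from the linked post):

def median3(a, b, c):
    # The median discards the single outlier, so if two of the three arguments
    # share a property (positivity, a bound, monotonicity of the local data)
    # and bracket the result, the median inherits that property.
    return max(min(a, b), min(max(a, b), c))

# Hypothetical usage: keep a fragile high-order value only when two robust
# estimates bracket it, e.g. u_new = median3(u_high_order, u_low_order, u_upwind).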

If you don't have time to do it right, when will you have the time to do it over?
― John Wooden

 

The Death of Peer Review

16 Saturday Jul 2016

Posted by Bill Rider in Uncategorized

≈ 6 Comments

 

If you want to write a negative review, don’t tickle me gently with your aesthetic displeasure about my work. Unleash the goddamn Kraken.

― Scott Lynch

Sadly, this spirit is not what we see today either on the giving or receiving end of peer review, and we are all poorer for it.

As surely as the sun rises in the East, peer review, that treasured and vital process for the health of science, is dying or dead. In many cases we still try to conduct meaningful peer review, but increasingly it is a mere animated zombie form of peer review. The zombie peer review of today is a shadow of the living soul of science it once was. Its death is merely a manifestation of bigger, broader societal trends such as those unraveling our political processes or transforming our economies. We have allowed the quality of the work being done to become an assumption that we do not actively interrogate through a critical process (e.g., peer review). If we examine the emphasis in how money is spent in science and engineering, everything but the quality of the technical work is focused on and made the subject of demands. There is instead an inherent assumption that the quality of the technical work is excellent, while the focus stays organizational or institutional. With sufficient time this lack of emphasis is eroding the presumptions of quality to the point where they no longer hold sway.

Being the best is rarely within our reach. Doing our best is always within our reach.

― Charles F. Glassman

In science, peer review takes many forms, each vital to the healthy functioning of productive work. Peer review forms an essential check and balance on the quality of work, a wellspring of ideas and a vital communication mechanism. Whether in the service of publishing cutting-edge research, or providing quality checks for laboratory research or engineering design, its primal function is the same: quality, defensibility and clarity are derived through its proper application. In each of its fashions peer review has an irreplaceable core of community wisdom, culture and self-policing. With its demise, each of these is at risk of dying too. Rebuilding everything we are tearing down is going to be expensive, time-consuming and painful.

Let's get to the first conclusion of this thought process: peer review is healthiest in its classic form, the academic publishing review, and it is in crisis there. The scientific community widely acknowledges that the classic anonymous peer review is absolutely riddled with problems and abuses. The worst bit is that this is where it works the best. So at its best, peer review is terrible. The critiques are many and valid. For example, there is widespread abuse of the process by the powerful and established. The system is driven by a corrupt academic system that feeds the overall dysfunction (i.e., publish or perish). Corruption and abuse by the journals themselves is deep and getting worse, never mind the exploding costs. Then we have issues about teaming conflicts of interest and deeply passive-aggressive behavior veiled behind the anonymity. Despite all these problems, peer review here still largely works, albeit in a deeply suboptimal manner.

Another complaint is the time and effort that these reviews take, along with suggestions to make things better with modern technology. Online publishing and the ubiquity of the Internet are capable of radically reducing the time and effort (equals money) of publishing a paper. I will say that the time and effort issue for peer review is barking up the wrong tree. The real problem is that peer review isn't valued sufficiently. Doing peer review isn't given much weight professionally, whether you're a professor or working in a private or government lab. Peer review won't give you tenure, or pay raises or other benefits; it is simply a moral act as part of the community. This character as an unrewarded moral act gets to the issue at the heart of things. Moral acts and "doing the right thing" are not valued today, nor are there definable norms of behavior that drive things. It simply takes the form of an unregulated professional tax, pro bono work. The way to fix this is to change the system to value and reward good peer review (and by the same token punish bad reviewing in some way). This would be a positive side of modern technology, and good to see, since the demise of peer review is driven to some extent by the negative aspects of modernity, as I will discuss at the end of this essay.

Honest differences are often a healthy sign of progress.

― Mahatma Gandhi

Let's step away from the ideal context of the classical academic peer review of a paper to an equally common practice, the peer review of organizations, programs and projects. This is a practice of equal or greater importance as it pertains to the execution of technical work across the world. We see it in action in the form of design reviews for software, engineered products and analyses used to inform decisions. In my experience peer review in these venues is in complete free-fall, collapsing under the weight of societal pressures that cannot support the proper execution of the necessary practices. My argument is that we are living within a profoundly low-trust world, and peer review relies upon implicit expectations of trust to be executed with any competence. This lack of trust is present on both ends of the peer review system. When trust is low, honesty cannot be present; instead, honesty will be punished.

First let's talk about the source of critique, the reviewers. Reviewers have little trust and faith that their efforts will be taken seriously if they find problems, and if they do raise an issue it is just as likely that they, the messengers, will be punished instead. As a result reviewers rarely do a complete or good job of reviewing things, since they understand what the expected result is. Thus the review gets hollowed out from its foundation, because the recipient of the review expects to get more than just a passing grade; they expect to get a giant pat on the back. If they don't get their expected result, the reaction is often swift and punishing to those finding the problems. Often those looking over the shoulder are equally unaccepting of problems being found. Those overseeing work are highly political and worried about appearances or potential scandal. The reviewers know this too, and know that a bad review won't result in better work; it will just be trouble for those being reviewed. The end result is that the peer review is broken by the review itself being hollow, the reviewers going easy on the work because of the explicit expectations, the implicit punishments, and the lack of follow-through on any problems that might be found.

Those who can create, will create. Those who cannot create, will compete.

― Michael F. Bruyn

If we look to those being reviewed, the system only gets worse. The people being reviewed see only downside to engaging in peer review, no upside at all. Increasingly any sort of spirit or implied expectation of technical quality has been left behind. Peer review is done to provide the veneer of quality regardless of its actual presence. As a consequence any result from peer review that doesn't say the work is the best and executed perfectly is ripe to be ignored (or used to punish those not complying with the implied directive). Those being reviewed have no desire or intent to take any corrective action or address any issue that might be surfaced. As a result the peer review is simply window dressing and serves no purpose other than marketing. The reasons for the evolution to this dysfunctional state are many and clear. The key to the problem is the political inability to confront problems. A problem is often taken as a death sentence rather than a call to action. Since no issues or actual challenges will be confronted, much less solved, the only course of action is to ignore and bury them.

We then get to the level above those being reviewed, and closer to the source of the problem, the political system. Our political systems are poisonous to everything and everyone. We do not have a political system, perhaps anywhere in the world, that is functioning to govern. The result is a collective inability to deal with issues, problems and challenges at a massive scale. We see nothing but stagnation and blockage. We have a complete lack of will to deal with anything that is imperfect. Politics is always present and important because science and engineering are still intrinsically human activities, and humans need politics. The problem is that truth and reality must play some normative role in decisions. The rejection of effective peer review is a rejection of reality as being germane and important in decisions. This rejection is ultimately unstable and unsustainable. The only question is when and how reality will impose itself, but it will happen, and in all likelihood through some sort of calamity.

To get to a better state vis-à-vis peer review, trust and honesty need to become a priority. This is a piece of a broader rubric for progress toward a system that values work that is high in quality. We are not talking about excellence as superficially declared by the current branding exercise peer review has become, but the actual achievement of unambiguous excellence. The combination of honesty, trust and the search for excellence and achievement is needed to begin to fix our system. Much of the basic structure of our modern society is arrayed against this sort of change. We need to recognize the stakes in this struggle and prepare ourselves for difficult times. Producing a system that supports something that looks like peer review will be a monumental struggle. We have become accustomed to a system that feeds on false excellence and achievement and celebrates scandal as an opiate for the masses.

One only needs to look to the nature of the current public discourse and political climate. We are rapidly moving into a state where the discourse is utterly absent of any substance and the poisonous climate is teetering over into the destructive. Reality is beginning to fight back against the flaws in the system. Socially we are seeing increased fear, violence and outright conflict. The problems with peer review pale in comparison to the tide rolling in, but reflect many of the same issues. Peer review is an introverted view of our numerous ills, where the violence and damaging environment evident in our mass media is the extroverted side of the same coin. In this analysis peer review is simply another side effect of the massive issues confronting our entire world, projected into the environment of science and engineering. Fixing all these issues is in the best interests of humanity, but it's going to be hard and unpleasant. Because of the difficulty of fixing any of this, we will avoid it until the problems become unbearable for a large enough segment of humanity. Right now it is easier and simpler to just accept an intrinsically uncritical perspective and lie to ourselves about how good everything is and how excellent all of us are.

If one doesn't have the stomach to examine things through such a social lens, one might consider the impact of money on the system. In many ways the critical review of research can now be measured almost entirely in monetary terms. This is especially true in the organizational or laboratory environment, where most people managing view money and its continued flow as the only form of review they care about. In such a system a critical peer review becomes a threat instead of a source of renewal. Gradually, over time, the drive for technical excellence is replaced by the drive for financial stability. We have allowed financial stability to become disconnected from technical achievement, and in doing so killed peer review. When technical excellence and achievement become immaterial to any measure of success, and only money matters, peer review is something to be ignored, avoided and managed, because no perceived good can come from it.

Self-consciousness kills communication.

― Rick Steves

Worse than having no perceived good associated with it, peer review, if done properly, becomes evidence of problems. The problems exposed in peer review represent calls to action that today's systems cannot handle because they are an affront to planning, schedules, milestones and budgetary allocations. Problems also expose flaws in the fundamental assumption of today's world that the technical work is high quality and does not need active, focused (appropriate) management to succeed. As a result any problems induce a "shoot the messenger" mentality that acts to destroy the critique and send a clear message that peer review should not be done honestly or seriously. The result has been a continual erosion of the technical quality so often assumed to be present a priori. This is a vicious cycle where technical problems remain unexposed, or hidden by a lack of vigorous, effective peer review. The problems then fester and grow because problems like these do not cure themselves, and the resistance to peer review or any form of critique only becomes further reinforced. This doesn't end well, and the end results are perfectly predictable. Nothing stems the decay, and its encroachment ultimately ends the ability to conduct a peer review at all. Moreover, the culture that is arising in science acts as a further inhibition to effective review by removing the attitudes necessary for success from the basic repertoire of behaviors.

Some men are born mediocre, some men achieve mediocrity, and some men have mediocrity thrust upon them.

― Joseph Heller


10 Big Things For the Future of (Computational) Science

10 Sunday Jul 2016

Posted by Bill Rider in Uncategorized

≈ 1 Comment

 

The future depends on what you do today.

― Mahatma Gandhi

The future is already here – it’s just not evenly distributed.

― William Gibson

When did the future switch from being a promise to being a threat?

― Chuck Palahniuk

It has been a long time since I wrote a list post, and it seemed a good time to do one. They're always really popular online, and it's a good way to survey something. Looking forward into the future is always a nice thing when you need to be cheered up. There are lots of important things to do, and lots of massive opportunities. Maybe if we can muster our courage and vision we can solve some important problems and make a better world. I will cover science in general, and hedge the conversation toward computational science, because that's what I do and know the most about.

Mediocrity will never do. You are capable of something better.

― Gordon B. Hinckley

Here is the list:

  1. Fixing the research environment and encouraging risk taking, innovation and tolerance for failure
  2. CRISPR
  3. Additive manufacturing
  4. Exascale computing
  5. Nontraditional computing paradigms
  6. Big data
  7. Reproducibility of results
  8. Algorithmic breakthroughs
  9. The upcoming robotic revolution (driverless cars)
  10. Cyber-security and cyber-privacy
  1. Fixing the research environment and encouraging risk taking, innovation and tolerance for failure. I put this first because it impacts everything else so deeply. There are many wonderful things that the future holds for all of us, but the overall research environment is holding us back from the future we could be having. The environment for conducting good, innovative, game-changing research is terrible, and needs serious attention. We live in a time where all risk is shunned and any failure is punished. As a result innovation is crippled before it has a chance to breathe. The truth is that it is a symptom of a host of larger societal issues revolving around our collective governance and capacity for change and progress. Somehow we have gotten the idea that research can be managed like a construction project, and that such management is a mark of quality. Science absolutely needs great management, but the current brand of scheduled breakthroughs, milestones and micromanagement is choking the science away. We have lost the capacity to recognize that current management is only good for leeching money out of the economy for personal enrichment, and terrible for the organizations being managed, whether it's a business, laboratory or university. These current fads are oozing their way into every crevice of research, including higher education where so much research happens. The result is a headlong march toward mediocrity and the destruction of the most fertile sources of innovation in the society. We are living off the basic research results of 30-50 years past, and creating an environment that will assure a less prosperous future. This plague is the biggest problem to solve, but it is truly reflective of a broader cultural milieu and may simply need to run its disastrous course. Over the weekend I read about the difference between first- and second-level thinking. First-level thinking looks for the obvious and superficial as a way of examining problems, issues and potential solutions. It is dealing with things in an obvious and completely intellectually unengaged manner. Let's just say that science today is governed by first-level thinking, and it's a very bad thing. This is contrasted with second-level thinking, which teases problems apart, analyzes them, and looks beyond the obvious and superficial. It is the source of innovation, serendipity and inspiration. Second-level thinking is the realm of expertise and depth of thought, and we all should know that in today's world the expert is shunned and reviled as being dangerous. We will all suffer the ill effects of devaluing expert judgment and thought as applied to our very real problems.
  2. CRISPR What can I really say here; this technology is huge, enormous and an absolute game changer. When I learned about it the first time it literally stopped me in my tracks and I said "holy shit this could change everything!" If you don't know, CRISPR is the first easy-to-use and flexibly programmable method for manipulating the genetic code of living beings, as well as short-circuiting the rules of natural selection. Just like nuclear energy, CRISPR could be a massive force for good or evil. It has the potential to change the rules of how we deal with a host of diseases and plagues upon mankind. It also has the capacity to produce weapons of mass destruction and unleash carnage upon the world. We must use far more wisdom than we typically show in wielding its power. How we do this will shape the coming decades in ways we can scarcely imagine. It also emerges in the current era where great ideas are allowed to wither and die. It seems reasonable to say that we don't know how to wield the very discoveries we make, and CRISPR seems like the epitome of this.
  3. Additive manufacturing In engineering circles this is a massive opportunity for innovation and a challenge to a host of existing practices and knowledge. It will both impact and draw upon other issues from this list in how it plays out. It is often associated with the term 3-D printing, where we can produce full three-dimensional objects in a process free of classic manufacturing processes like molds and production lines. The promise is to break free of the tyranny of traditional manufacturing approaches and limitations, and to design small lots of designer, custom parts. Making the entire process work well enough for customers to rely upon it and have faith in the manufacturing quality and process is a huge aspect of the challenge. This is especially true for high-performance parts where the requirements on quality are very high. The other end of the problem is the opportunity to break free of traditional issues in design and open up the possibility of truly innovative approaches to optimality. Additional problems are associated with the quality and character of the material used in the design, since its use in the creation of the part is substantially different from the materials of traditionally manufactured parts. Many of these challenges will be partially attacked using modeling & simulation drawing upon cutting-edge computing platforms.
  4. Exascale computing The push for more powerful computers for conducting societally important work is as misguided as it is a big deal. I've written so much recently about this that I find little need to say more. It is the epitome of the first item on my list, a wrongly managed, risk-intolerant solution to a real issue, which will end up doing more harm than good in the long run. It is truly the victory of first-level thinking over the deeper and more powerful second-level thinking we need. Perhaps I'm being a bit haughty in my contention that what I've laid out in my blog constitutes second-level thinking about high performance computing, but I stand by it, and by the summary that the first-level thinking governing our computing efforts today is hopelessly superficial. Really solving problems and winning at scientific computing requires a sea change toward applying the fruits of in-depth thinking about how to succeed at using computing as a means for societal good, including the conduct of science.
  5. Nontraditional computing paradigms We stand at the brink of a deep change in computing one way or another. We are seeing the end of Moore's law (actually it's done already, at all scales), which has powered computing into a central role societally, whether in business or science. The only way the power of computing will continue to grow is through a systematic change in the principles by which computers are built. There are two potential routes being explored, both being rather questionable in their capability to deliver the sort of power necessary to succeed. The most commonly discussed route is quantum computing, which promises incredible (almost limitless) power for a very limited set of applications. It also features hardware that is difficult to impossible to manage, among other problems limiting its transition to reality. The second approach is neuromorphic, or brain-inspired, computing, which may be more tangible and possible than quantum, but is a longer shot at being a truly game-changing technology. The jury is out on both technology paths, and we may just have to live with the end of Moore's law for a long time.
  6. Big data The Internet brought computing to the masses, and mobile computing brought computing to everyone in every aspect of our lives. Along with this ubiquity of computing came a wealth of data on virtually every aspect of everyone's lives. This data is enormous and wildly varied in its structure, teeming with possibility for uses of all stripes, good and bad. Big data is the route toward wealth beyond measure, and the embodiment of the Orwellian Big Brother we should all fear. Taming big data is the combination of computing, algorithms, statistics and business all rolled into one. It is one of the places where scientific computing is actually alive, with holistic energy driving innovation all the way from models of data (reality), to algorithms for taming the data, to hardware to handle the load. New sensors and measurement devices are only adding to the wealth as the Internet of things moves forward. In science, medicine and engineering new instruments and sensors are flooding the world with huge data sets that must be navigated, understood and utilized. The potential for discovery and progress is immense, as is the challenge of grappling with the magnitude of the problem.
  7. Reproducibility of results The trust of science and expertise is seemingly at an all-time low. Part of this is caused by the information (and misinformation) deluge we live in. It feeds on and is fed by the lack of trust in expertise within society. As such there has been some substantial focus on being able to reproduce the results of research. Some fields of study are having veritable crises driven by the failures of studies to be reproducible. In other cases the stakes are high enough that the public is genuinely worried about the issue. One such common situation is a drug trial, which has massive stakes for anyone who might be treated with or need to be treated by a drug. Other areas of science such as computation have fallen under the same suspicion, but may have the capacity to provide greater substance and faith in the reproducibility of their work. Nonetheless, this is literally a devil-is-in-the-details area, and getting right all the details that contribute to a research finding is really hard. The less often spoken subtext to this discussion is the general societal lack of faith in science that is driving this issue. A more troubling thought about how replicable research actually is comes from considering how uncommon replication actually is. It is uncommon to see actual replication, and difficult to fund or prioritize such work. Seeing how commonly such replication fails under these circumstances only heightens the sense of the magnitude of this problem.
  8. Algorithmic breakthroughs One way of accelerating the progress in computers and the work they do is to focus on innovations in algorithms. Instead of relying on computational hardware to increase our throughput, we rely on innovation in how we use those computers or implement our methods on those computers. Over time, improvements in methods and algorithms have outpaced improvements in hardware. Recently this bit of wisdom has been lost to the sort of first-level thinking so common today. In big data we see the needs of algorithm development overcoming the small-minded focus people rely upon. In scientific computing the benefits and potential are there for breakthroughs, but the vision and will to put effort into this is lacking. So I'm going to hedge toward the optimistic and hope that we see through the errors in our thinking and put faith in algorithms to unleash their power on our problems in the very near future!
  9. The upcoming robotic revolution (driverless cars) The fact is that robots are among us already, but their scope and presence are going to grow. Part of the key issue with robots is the lack of brainpower to really replace human decision-making in tasks. Computing power, and the ubiquity of the Internet in all its coupled glory, is making problems like this tractable. It would seem that driverless robot cars are solving this problem in one huge area of human activity. Multiple huge entities are working this problem and by all accounts making enormous progress. The standard for the robot cars would seem to be very much higher than for humans, and the system is biased against this sort of risk. Nonetheless, it would seem we are very close to seeing driverless cars on a road near you in the not too distant future. If we can see the use of robot cars on our roads, with all the attendant complexity, risks and issues associated with driving, it is only a matter of time before robots begin to take their place in many other activities.
  1. Cyber-security and cyber-privacy With the advent of computing at such an enormous societal scale, particularly with mobile computing penetrating every aspect of our lives, comes the twin security-privacy dilemma. On the one hand, we are potentially victimized by cyber-criminals as more and more commerce and finance takes place online, driving a demand for security. The government-police-military-intelligence apparatus also sees the potential for incredible security issues and possible avenues through the virtual records being created. At the same time the ability to have privacy or be anonymous is shrinking away. People have the desire to not have every detail of their lives exposed to the authorities (employers, neighbors, parents, children, spouses,…) meaning that cyber-privacy will become a big issue too. This will lead to immense technical-legal-social problems and conflict over how to balance the needs-demands-desires for security and privacy. How we deal with these issues will shape our society in huge ways over the coming years.

The fantastic advances in the field of electronic communication constitute a greater danger to the privacy of the individual.

― Earl Warren

How to Win at Supercomputing

04 Monday Jul 2016

Posted by Bill Rider in Uncategorized

≈ 3 Comments

The best dividends on the labor invested have invariably come from seeking more knowledge rather than more power.

— Wilbur Wright

Here is a hint; it’s not how we are approaching it today. The approach today is ultimately doomed to fail and potentially take a generation of progress with it. We need to emphasize the true differentiating factors and embrace the actual sources of progress. Computer hardware is certainly a part of the success, but by no means the dominant factor in true progress. As a result we are starving key aspects of scientific computing of the intellectual lifeblood needed for advancing the state of the art. Even if we “win” following our current trajectory, the end result will be a loss because of the opportunity cost incurred in pursuing the path we are on today. Supercomputing is a holistic activity embedded in a broader scientific enterprise. As such it needs to fully embrace the scientific method and structure its approach more effectively.

The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.

― George Bernard Shaw

The news of the Chinese success in solidifying their lead in supercomputer performance “shocked” the high performance-computing World a couple of weeks ago. To make things even more troubling to the United States, the Chinese achievement was accomplished with homegrown hardware (a real testament to the USA’s export control law!). It comes as a blow to the American efforts to retake the lead in computing power. It wouldn’t matter if the USA, or anyone else for that matter, were doing things differently. Of course the subtext of the entire discussion around supercomputer speed is the supposition that raw computer power measures the broader capability in computing, which defines an important body of expertise for National economic and military security. A large part of winning in supercomputing is recognizing the degree to which this supposition is patently false. As falsehoods go, this one is not ironclad; it is a matter of debate over lots of subtle details that I elaborated upon last week. The truth depends on how idiotic the discussion needs to be and one’s tolerance for subtle technical arguments. In today’s world arguments can only be simple, verging on moronic, and technical discussions are suspect as a matter of course.

Instead of concentrating just on finding good answers to questions, it’s more important to learn how to find good questions!

― Donald E. Knuth

If you read that post you might guess the answer of how we might win the quest for supercomputing supremacy. In a sense we need to do a number of things better than today. First, we need to stop measuring computer power with meaningless and misleading benchmarks. These do nothing but damage the entire field by markedly skewing the overall articulation of both the successes and the challenges of building useful computers. Secondly, we need to invest our resources in the most effective areas for success: modeling, methods and algorithms, all of which are far greater sources of innovation and true performance for the accomplishment of modeling & simulation. The last thing is to change the focus of supercomputing to modeling & simulation because that is where the societal value of computing is delivered. If these three things were effectively executed upon, victory would be assured to whoever made the choices. The option of taking more effective action is there for the taking.

Discovery consists of looking at the same thing as everyone else and thinking something different.

― Albert Szent-Györgyi

The first place to look for effort that might dramatically tilt the fortunes of supercomputing is modeling. Our models of the World are all wrong to some degree; they are all based on various limiting assumptions, and all may be improved. None of these limitations can be ameliorated by supercomputing power, accuracy of discretization, or algorithmic efficiency. Modeling limitations are utterly impervious to anything but modeling improvement. The subtext to the entire discussion of supercomputing power is the supposition that our models today are completely adequate and only in need of faster computers to fully explain reality. This is an utterly specious point-of-view that basically offends the foundational principles of science itself. Modeling is the key to understanding, and irreplaceable in its power and scope to transform our capability.

And a step backward, after making a wrong turn, is a step in the right direction.

― Kurt Vonnegut

We might take a single example to illustrate the issues associated with modeling: gradient diffusion closures for turbulence. The diffusive closure of the fluid equations for the effects of turbulence is ubiquitous, useful and a dead end without evolution. It is truly a marvel of science going back to the work of Prandtl’s mixing length theory. Virtually all the modeling of fluids done with supercomputing is reliant on its fundamental assumptions and intrinsic limitations. The only place its reach does not extend to is direct numerical simulation, where the flows are computed without the aid of modeling, i.e., a priori (which for the purposes here I will take as a given although it actually needs a lot of conversation itself). All of this said, the ability of direct numerical simulation to answer our scientific and technical questions is limited because turbulence is such a vigorous and difficult multiscale problem that even an exascale computer cannot slay it.
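To make the closure concrete: in its textbook Prandtl mixing-length form, the unresolved turbulent flux is modeled as an enhanced diffusion down the mean gradient, -\overline{u'v'} = \nu_t \frac{\partial \bar{u}}{\partial y}, with the eddy viscosity \nu_t = \ell_m^2 \left| \frac{\partial \bar{u}}{\partial y} \right| for a mixing length \ell_m. Everything unresolved is thus represented as extra, strictly positive diffusion acting down the mean gradient, which is precisely the assumption whose limits are taken up next.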

So let’s return to what we need to do to advance the serious business of turbulence modeling. In a broad sense one of the biggest limitations of diffusion as a subgrid closure is its inability to describe behavior that is not diffusive. While turbulence is a decisively dissipative phenomenon, it is not always and only dissipative locally. The diffusive subgrid closure makes this assumption and hence carries deep limitations. In key areas of a flow field the proper subgrid model is actually non-dissipative or even anti-dissipative. The problem is that diffusion is a very stable and simple way to model phenomena, which in many ways exaggerates its success. We need to develop non-diffusive models that extend the capacity to model flows not fully or well described by diffusive closure approaches.

Once a model is conceived of in theory we need to solve it. If the improved model cannot yield solutions, its utility is limited. Methods for computing solutions to models beyond the capability of analytical tools were the transformative aspect of modeling & simulation. Before this, many models were only solvable in very limited cases through applying a number of even more limiting assumptions and simplifications. Beyond just solving the model, we need to solve it correctly, accurately and efficiently. This is where methods come in. Some models are nigh on impossible to solve, or entail connections and terms that evade tractability. Thus coming up with a method to solve the model is a necessary element in the success of computing. In the early years of scientific computing many methods came into use that tamed models into ease of use. Today’s work on methods has slowed to a crawl, and in a sense our methods development research is a victim of its own success.

Arthur C. Clarke’s third law: Any sufficiently advanced technology is indistinguishable from magic.

An example of this success is the nonlinear stabilization methods I’ve written about recently. These methods are the lifeblood of the success computational fluid dynamics (CFD) codes have had. Without their invention the current turnkey utility of CFD codes would be unthinkable. Before their development CFD codes were far more art and far less science than today. Unfortunately, we have lost much of the appreciation for the power and scope of these methods. We have little understanding of what came before them and the full breadth of their magical powers. Before these methods came to the fore one faced the daunting task of choosing between an overly diffusive stable method (i.e., donor cell or upwind differencing) and a more accurate, but unphysically oscillatory method. These methods allowed one to have both, adaptively using whatever was necessary under the locally determined circumstances, but they can do much more. While their power to allow efficient solutions was absolutely immense, these methods actually opened doors to physically reasonable solutions to a host of problems. One could have both accuracy and physical admissibility in the same calculation.

This is where the tale turns back toward modeling. These methods actually provide some modeling capability for “free”. As such the modeling under the simplest circumstances is completely equivalent to Prandtl’s mixing length approach, but with the added benefit of computability. More modern stabilized differencing actually provides modeling that goes beyond the simple diffusive closure. Because of the robust stability properties of the method one can compute solutions with backscatter stably. This stability is granted by the numerical approach, but it provides the ability to solve the non-dissipative model with the asymptotic stability needed for physically admissible modeling. If one had devised a model with the right physical effect of local backscatter, these methods provide the stable implementation. In this way these methods are magical and make the seemingly impossible, possible.

This naturally takes us to the next activity in the chain of activities that add value to computing, algorithm development. This is the development of new algorithms with greater efficiency, as distinct from the focus of algorithm work today: simply implementing old algorithms on new computers, which comes down to dealing with the increasingly enormous amount of parallelism demanded. The sad thing is that no implementation can overcome the power of algorithmic scaling, and this power is something we are systematically denying ourselves. Indeed we have lost massive true gains in computational performance because of the failure to invest in this area, and the inability to recognize the opportunity cost of a focus on implementing the old.

A useful place to look in examining the sort of gains coming from algorithms is numerical linear algebra. The state of the art here comes from multigrid, and it came to the fore over 30 years ago. Since then we have had no breakthroughs, whereas before a genuine breakthrough occurred about every decade. It is no coincidence that 30 years ago is when parallel computing began its eventual takeover of high performance computing. Making multigrid or virtually any other “real” algorithm work at a massive parallel scale is very difficult, incredibly challenging work. This difficulty has swallowed up all the effort and energy in the system, effectively starving out the invention of new algorithms. What is the cost? We might understand the potential cost of these choices by looking back at what previous breakthroughs have gained.

We can look at the classical example of solving Poisson’s equation (\nabla^2 u = f) on the unit square or cube to instruct us on how incredibly massive the algorithmic gains might be. The crossover point between a relaxation method (Gauss-Seidel, GS, or Jacobi) and an incomplete Cholesky conjugate gradient (ICCG) is at approximately 100 unknowns. For a multigrid algorithm the crossover point in cost occurs at around 1000 unknowns. Problems of 100 or 1000 unknowns can now be accomplished on something far less capable than a cell phone. For problems associated with supercomputers the differences in the cost of these different algorithms are utterly breathtaking to behold.

Consider a relatively small problem today of solving Poisson’s equation on a unit cube of 1000 unknowns in each direction (10^9 unknowns). If we take the cost of multigrid as “one”, GS now takes ten million times more effort, and ICCG almost 1000 times the effort. Scale up the problem to something we might dream of doing on an exascale computer, a cube of 10,000 on a side with a trillion unknowns, and we easily see the tyranny of scaling and the opportunity of algorithmic breakthroughs we are denying ourselves. For this larger problem, GS now costs ten billion times the effort of multigrid, and ICCG is now 30,000 times the expense. Imagine the power of being able to solve something more efficiently than multigrid! Moreover multigrid can withstand incredible levels of inefficiency in its implementation and still win compared to the older algorithms. The truth is that a parallel computing implementation drives the constant in front of the scaling up to a much larger value than on a serial computer, so these gains are partly offset by the lousy hardware we have to work with.
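To make the scaling argument easy to play with, here is a back-of-the-envelope sketch using the textbook complexity estimates for the 3D Poisson problem: roughly O(N^{5/3}) work for Gauss-Seidel relaxation, O(N^{4/3}) for ICCG, and O(N) for multigrid. The exponents are the standard ones, but every constant is set to one, so the printed ratios show the trend rather than reproducing the exact figures above (which fold in the constants implied by the crossover points).

# Back-of-the-envelope cost ratios for a 3D Poisson solve with N unknowns.
# Assumed textbook scalings: Gauss-Seidel ~ N**(5/3), ICCG ~ N**(4/3), multigrid ~ N,
# with every constant set to one (purely illustrative).
def solver_work(N):
    return N**(5.0/3.0), N**(4.0/3.0), float(N)   # (GS, ICCG, multigrid)

for N in (10**9, 10**12):                         # 1000^3 and 10000^3 unknowns
    gs, iccg, mg = solver_work(N)
    print(f"N = {N:.0e}: GS/MG = {gs/mg:.1e}, ICCG/MG = {iccg/mg:.1e}")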

Here is the punch line to this discussion. Algorithmic power is massive, almost to a degree that defies belief. Yet algorithmic power is vanishingly small compared to methods, which themselves are dwarfed by modeling. Modeling connects the whole simulation endeavor to the scientific method and is irreplaceable. Methods make these models solvable and open the doors of capability. All of these activities are receiving little tangible priority or support in the current high performance computing push, resulting in the loss of incredible opportunities for societal benefit. Moreover we have placed our faith in the false hope that mere computing power is transformative.

Never underestimate the power of thought; it is the greatest path to discovery.

― Idowu Koyenikan

Both models and methods transcend the sort of gains computing hardware produces, and hardware can never replace them. Algorithmic advances can be translated into the language of efficiency via scaling arguments, but provide gains that go far beyond hardware’s capacity for improvement. The problem is that all of these rely upon faith in humanity’s ability to innovate, think and produce things that had previously been beyond the imagination. This is an inherently risky endeavor that is prone to many failures or false hopes. This is something today’s World seems to lack tolerance for, and as such the serendipity and marvel of discovery is sacrificed at the altar of fear.

We have to continually be jumping off cliffs and developing our wings on the way down.

― Kurt Vonnegut

The case for changing the focus of our current approach is airtight and completely defensible. Despite the facts, the science and the benefits of following rational thinking, there is precious little chance of seeing change. The global effort in supercomputing is utterly and completely devoted to the foolish hardware path. It wins by a combination of brutal simplicity, and eagerness to push money toward industry. So what we have is basically a cash-driven funeral pyre for Moore’s law. The risk-taking, innovation-driven approach necessary for success is seemingly beyond the capability of our society to execute today. The reasons why are hard to completely grasp; we have seemingly lost our nerve and taste for subtlety. Much of the case for doing the right things, and those things that lead to success, is bound to a change of mindset. Today the power, if not the value, of computing is measured in the superficial form of hardware. The reality is that the power is bound to our ability to model, simulate and ultimately understand or harness reality. Instead we blindly put our faith in computing hardware instead of the intellectual strength of humanity.

The discussion gets at a number of misconceptions and inconsistencies that pervade the field of supercomputing. The biggest issue is the disconnect between the needs of science and engineering and the definition of success in supercomputing (i.e., what constitutes a win). Winning in supercomputing programs is tied to being able to put an (American) machine at the top of the list. Increasingly, success at having the top computer on the increasingly useless Top500 list is completely at odds with acquiring machines useful for conducting science. A great deal of the uselessness of the list is the benchmark used to define its rankings, LINPACK, which is less relevant to applications with every passing day. It has come to the point where it is hurting progress in a very real way.

The science and engineering needs are varied, all the way from QCD, MD and DNS to climate modeling and integrated weapons calculations. The pure science needs of QCD, MD and DNS are better met by the machines being built today, but even in this idealized circumstance the machines we buy to top the computing list are fairly suboptimal for these pure science applications. The degree of suboptimality for running our big integrated calculations has become absolutely massive over time and the gap is only growing larger with each passing year. Like most things, inattention to this condition is only allowing it to become worse. The machines being designed for winning the supercomputing contest are actual monstrosities that are genuinely unusable for scientific computing. Worse yet, the execution of the exascale program is acting to make this worse in every way, not better.

Compounding the damaging execution of the supercomputing program is the systematic hollowing out of the science and engineering content from our programs. We are systematically diminishing our efforts in experimentation, theory, modeling, and mathematics despite their greater importance and impact on the entire enterprise. The end result will be a lost generation of computational scientists who are left using computers completely ill-suited to the conduct of science. If National security is a concern, the damage we are doing is real and vast in scope.

We need supercomputing to be a fully complementary part of the scientific enterprise, used and relied upon only as appropriate, with limits rationally chosen based on evidence. Instead we have created supercomputing as a prop and marketing stunt. There is a certain political correctness about how it contributes to our national security, and our increasingly compliant Labs offer no resistance to the misuse of taxpayer money. The mantra is “don’t rock the boat”; we are getting money to do this. Whether it’s sensible or not is immaterial. The current programs are ineffective and poorly executed and do a poor job of providing the sorts of capability claimed. It is yet another example of, and evidence for, the culture of bullshit and pseudo-science that pervades our modern condition.

The biggest issue is the death of Moore’s law and our impending failure to produce the results promised. Rather than reform our programs to achieve real benefits for science and national security, we will see a catastrophic failure. This will be viewed through the usual lens of scandal. It is totally foreseeable and predictable. It would be advisable to fix this before disaster, but my guess is we don’t have the intellect, foresight, bravery or leadership to pull this off. The end is in sight and it won’t be pretty. Instead there is a different path that would be every bit as glorious and successful. Does anyone have the ability to turn away from the disastrous path and consciously choose success?

An expert is someone who knows some of the worst mistakes that can be made in his subject, and how to avoid them.

― Werner Heisenberg

Some Background reading on the Top500 list and benchmarks that define it:

https://en.wikipedia.org/wiki/TOP500

https://en.wikipedia.org/wiki/LINPACK_benchmarks

https://en.wikipedia.org/wiki/HPCG_benchmark

A sample of prior posts on topics related to this one:

https://williamjrider.wordpress.com/2016/06/27/we-have-already-lost-to-the-chinese-in-supercomputing-good-thing-it-doesnt-matter/

https://williamjrider.wordpress.com/2016/05/04/hpc-is-just-a-tool-modeling-simulation-is-what-is-important/

https://williamjrider.wordpress.com/2016/01/15/could-the-demise-of-moores-law-be-a-blessing-in-disguise/

https://williamjrider.wordpress.com/2016/01/01/are-we-really-modernizing-our-codes/

https://williamjrider.wordpress.com/2015/11/19/supercomputing-is-defined-by-big-money-chasing-small-ideas-draft/

https://williamjrider.wordpress.com/2015/10/30/preserve-the-code-base-is-an-awful-reason-for-anything/

https://williamjrider.wordpress.com/2015/10/16/whats-the-point-of-all-this-stuff/

https://williamjrider.wordpress.com/2015/07/24/its-really-important-to-have-the-fastest-computer/

https://williamjrider.wordpress.com/2015/07/03/modeling-issues-for-exascale-computation/

https://williamjrider.wordpress.com/2015/06/05/the-best-computer/

https://williamjrider.wordpress.com/2015/05/29/focusing-on-the-right-scaling-is-essential/

https://williamjrider.wordpress.com/2015/04/10/the-profound-costs-of-end-of-life-care-for-moores-law/

https://williamjrider.wordpress.com/2015/03/06/science-requires-that-modeling-be-challenged/

https://williamjrider.wordpress.com/2015/02/14/not-all-algorithm-research-is-created-equal/

https://williamjrider.wordpress.com/2015/02/12/why-is-scientific-computing-still-in-the-mainframe-era/

https://williamjrider.wordpress.com/2015/02/06/no-amount-of-genius-can-overcome-a-preoccupation-with-detail/

https://williamjrider.wordpress.com/2015/02/02/why-havent-models-of-reality-changed-more/

https://williamjrider.wordpress.com/2015/01/05/what-is-the-essence-of-computational-science/

https://williamjrider.wordpress.com/2015/01/01/2015-time-for-a-new-era-in-scientific-computing/

We have already lost to the Chinese in Supercomputing; Good thing it doesn’t matter

27 Monday Jun 2016

Posted by Bill Rider in Uncategorized

≈ 3 Comments

 

In our attempt to make conservation easy, we have made it trivial.

― Aldo Leopold

The reality is that it may or may not matter that the Chinese are cleaning the floor with us in supercomputing. Whether it matters comes down to some details dutifully ignored by the news articles. Unfortunately, those executing our supercomputing programs ignorantly ignore these same details. In these details we can find the determining factors of whether the Chinese hardware supremacy is a sign that they are really beating the USA where it matters. Supercomputing is only important because of modeling & simulation, which depends on a massive swath of scientific endeavor to faithfully achieve. The USA’s national supercomputing program is starving this swath of scientific competence in order to build the fastest supercomputer. If the Chinese are simultaneously investing in computing hardware and the underlying science of modeling & simulation, they will win. Without the underlying science no amount of achievement in computing power will provide victory. The source of the ignorance in our execution of a supercomputing program is a combination of willful ignorance, and outright incompetence. In terms of damage the two are virtually indistinguishable.

There is nothing more frightful than ignorance in action.

― Johann Wolfgang von Goethe

The news last week was full of the USA’s continued losing streak to Chinese supercomputers. Their degree of supremacy is only growing, and now the Chinese have more machines on the list of top high performance computers than the USA. Perhaps as importantly, the Chinese relied on homegrown computer hardware rather than on the USA’s. One could argue that American export law cost them money, but also encouraged them to build their own. So is this a failure or a success of the policy, or a bit of both? Rather than panic, maybe it’s time to admit that it doesn’t really matter. Rather than offer a lot of concern we should start a discussion about how meaningful it actually is. If the truth is told it isn’t very important at all.

If we too aggressively pursue regaining the summit of this list we may damage the part of supercomputing that actually does matter. Furthermore, what we aren’t discussing is the relative position of the Chinese to the Americans (or the Europeans for that matter) in the part of computing that does matter: modeling & simulation. Modeling and simulation is the real reason we do computing, and the value in it is not measured or defined by computing hardware. Granted, computer hardware plays a significant role in the overall capacity to conduct simulations, but it is not even the dominant player in modeling effectiveness. Worse yet, in the process of throwing all our effort behind getting the fastest hardware, we are systematically undermining the parts of supercomputing that add real value and have far greater importance. Underlying this assessment is the conclusion that our current policy isn’t based on what is important, and instead supports a focus on less important aspects of the field simply because they are more explicable to lay people.

Let’s take this argument apart and get to the heart of why the current narrative in supercomputing is so completely and utterly off base. Worse than a bad narrative, the current mindset is leading to bad decisions and terrible investments that will set back science and diminish the power and scope of modeling & simulation to be transformative in today’s world. It is a reasonable concern that the reaction of the powers that be to the Chinese “success” in hardware will be to amplify the already terrible decisions in research and investment strategy.

The real heart of the matter is that the Chinese victory in supercomputing power is pretty meaningless. The computer they’ve put at the top of the list is close to practically useless, and wins at computing a rather poorly chosen benchmark problem. This benchmark problem was never a particularly good measure of supercomputers, and the gap between the real reasons for buying such computers and what the benchmark measures has only grown more vast over the years. It is the absolute height of ridiculousness that anyone even cares, but care we do. The benchmark computes the LU decomposition of a dense matrix, a classical linear algebra problem computed at a huge scale. It is simply wildly uncharacteristic of the problems and codes we buy supercomputers to solve. It skews the point of view in directions that are unremittingly negative, and ends up helping to drive lousy investments.
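A rough way to see just how uncharacteristic it is: compare the arithmetic intensity of dense LU with that of the sparse operations that dominate real simulation codes. The operation counts below are the standard ones; the numbers are only illustrative.

# Dense LU (the LINPACK benchmark) does ~(2/3)*n**3 flops on n**2 matrix entries,
# while a sparse matrix-vector product does ~2 flops per stored nonzero.
# The flops available per piece of data touched differ by orders of magnitude.
n = 100_000                                       # a modest dense matrix dimension
lu_flops_per_entry = (2.0 / 3.0) * n**3 / n**2    # ~ 2n/3, about 67,000 here
spmv_flops_per_entry = 2.0                        # one multiply and one add per nonzero
print(lu_flops_per_entry, spmv_flops_per_entry)
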
We have set about a serious program to recapture the supercomputing throne. In its wake we will do untold amounts of damage to the future of modeling & simulation. We are in the process of spending huge sums of money chasing a summit that does not matter at all. In the process we will starve the very efforts that could allow us to unleash the full power of the science and engineering capability we should be striving for, a capability that would have massively positive impacts on all of the things supercomputing is supposed to contribute toward. The entire situation is patently absurd, and tragic. It is ironic that those who act to promote high performance computing are killing it. They are killing it because they fail to understand it or how science actually works.

Let’s get a couple of truths out there to consider:

  1. There is no amount of computer speed, solution accuracy or algorithmic efficiency that can rescue a model that is incorrect.
  2. The best methods produce the ability to solve problems that were impossible before. Sometimes methods & models blur together, producing a merged capability that transcends the normal distinctions and provides the ability to solve problems in ways not previously imagined.
  3. Algorithmic efficiency coupled with improvements in modeling and methods allows solutions to proceed with gains that go well beyond mere multiples in efficiency; the gains are geometric or exponential in character.
  4. The improvements in these three arenas have all historically vastly outpaced the improvements associated with computer hardware.

The tragedy is that these four areas are almost completely ignored by our current high-performance computing program. The current program is basically porting legacy code and doing little else. Again, we have to acknowledge that a great deal of work is being done to enable the porting of codes to succeed. This is in the area of genuinely innovative computer science research and methodology to allow easier programming of these monstrous computers we are building.

On the other hand, we have so little faith in the power of our imaginations and capacity for innovation that we will not take the risk of relying on these forces for progress. Declaring that the current approach is cowardly is probably a bit over the top, but not by much. I will declare that it is unequivocally anti-intellectual and based upon simple-minded conclusions about the power, value and source of computing’s impact on science and society. Simple is good and elegant, but simple-minded is not. The simple-minded approach does a great disservice to all of us.

Any darn fool can make something complex; it takes a genius to make something simple.

― Pete Seeger

The single most important thing in modeling & simulation is the nature of the model itself. The model contains the entirety of the capacity of the rest of the simulation to reproduce reality. In looking at HPC today, any effort to improve models is utterly and completely lacking. Next in importance are the methods that solve those models, and again we see no effort at all in developing better methods. Next we have algorithms, whose character determines the efficiency of solution; here we start to see some effort, but only in the service of implementing existing algorithms on the new computers, with the efforts to improve their fundamental character completely absent. Next in importance comes code and system software, and here we see significant effort. The effort is to move old codes onto new computers, and produce software systems that unveil the power of new computers to some utility. Last and furthest from importance is the hardware. Here, we see the greatest degree of focus. In the final analysis we see the greatest focus, money and energy on those things that matter least. It is the makings of a complete disaster, and a disaster of our own making.

…the commercially available and laboratory technologies are inadequate for the stockpile stewardship tasks we will face in the future. Another hundred-to-thousand-fold increase in capability from hardware and software combined will be required… Some aspects of nuclear explosive design are still not understood at the level of physical principles…

–C. Paul Robinson, in testimony to the House National Security Committee on May 12, 1996

It is instructive to examine where these decisions came from, whether they made sense at one time, and whether they still do today. The harbinger of the current exascale initiative was the ASC(I) program created in the wake of the end of nuclear testing in the early 1990’s. The ASC(I) program was based on a couple of predicates: computational simulation would replace full scale testing, and computational speed was central to success. When nuclear testing ceased, the Cray vector supercomputers provided the core of modeling & simulation computer capability. The prevailing wisdom was that the combination of massively parallel computing and Moore’s law for commodity chips would win. This was the “Attack of the Killer Micros” (http://www.nytimes.com/1991/05/06/business/the-attack-of-the-killer-micros.html?pagewanted=all). I would submit that these premises were correct to some degree, but also woefully incomplete. Some of these shortcomings have been addressed in the broader stockpile stewardship program, albeit poorly. Major experimental facilities and work were added to the original program, which has provided vastly more balanced scientific content to the endeavor. While more balanced, the overall program is still unbalanced, with too much emphasis on computing as defined by raw computational power, too little emphasis on theory (models), methods and algorithms, and an overly cautious and generally underfunded experimental program.

On the modeling & simulation side many of the balance issues were attacked by rather tepid efforts adding some modeling and methods work. Algorithmic innovation has been largely limited to the monumental task of moving existing algorithms to massively parallel computers. The result has been a relative dearth of algorithmic innovations and increasingly high levels of technical debt and inflation permeating the code base. The modeling work is almost entirely based on closing existing systems of equations used to create models, without any real focus on the propriety of the equations themselves. This is exacerbated by the cautious and underfunded experimental efforts that fail to challenge the equations in any fundamental way. We can look at public failures like NIF as being emblematic of the sort of problems we are creating for ourselves. We will not compute our way out of a problem like NIF!

The important thing about ASC(I) is that it created a model of successfully funding a program. As such, it became the go-to model for developing a new program, and the blueprint for our exascale program. Judging from the exascale program’s design and execution, it has learned little or nothing from the ASC(I) program’s history aside from the lesson of how to get something funded. The overall effort actually represents a sharp step backward by failing to implement any of the course corrections ASC(I) made relatively early in its life. The entirety of the lessons from ASC(I) were simply the recipe for getting money, which to put it plainly is “focus all the attention on the computers”. Everything else is a poorly thought through afterthought, largely consisting of a wealth of intellectually empty activities that look like relatively sophisticated code porting exercises (and some aren’t that sophisticated).

The price for putting such priority and importance on the trivial is rather high. We are creating an ecosystem that will strangle progress in its crib. As an ecosystem we have failed to care for the essential “apex” predators. By working steadfastly toward supremacy in the trivial we starve the efforts that would yield true gains and actual progress. Yet this is what we do, it has become a matter of policy. We have gotten here because those we have entrusted with creating the path forward don’t know what they are doing. They have allowed their passion for computer hardware to overwhelm their duties and their training as scientists. They have allowed a trivial and superficial understanding of modeling & simulation to rule the creation of the programs for sustaining progress. They have allowed the needs of marketing a program to overwhelm the duty to make those programs coherent and successful. It is a tragedy we see playing out society-wide.

Our current approach to advancing high performance computing is completely superficial. Rather than focus on the true measure of value, modeling & simulation, we focus on computing hardware to an unhealthy degree. Modeling & simulation is the activity where we provide a capability to change our reality, or our knowledge of reality. It is a holistic activity relying upon the full chain of competencies elaborated upon above. If anything in this chain fails, the entire activity fails. Computing hardware is absolutely essential and necessary. Progress via faster computers is welcome, but we fail to acknowledge its relative importance in the chain. In almost every respect it is the furthest away from impacting reality, while more essential parts of the overall capability are starved for resources and ignored. Perhaps more pointedly, the time for focus on hardware has passed as Moore’s law is dead and buried. Much of the current hardware focus is a futile attempt to breathe life into this increasingly cold and rigid corpse. Instead we should be looking at this moment in time as an opportunity (https://williamjrider.wordpress.com/2016/01/15/could-the-demise-of-moores-law-be-a-blessing-in-disguise/). The demise of Moore’s law is actually a blessing that could unleash a torrent of innovative energy.

It is time to focus our collective creative and innovative energy where opportunity exists. For example, the arena of algorithmic innovation has been a more fertile and productive route to improving the performance of modeling & simulation than hardware. The evidence for this source of progress is vast and varied; I’ve written on it on several occasions (https://williamjrider.wordpress.com/2014/02/28/why-algorithms-and-modeling-beat-moores-law/, https://williamjrider.wordpress.com/2015/02/14/not-all-algorithm-research-is-created-equal/, https://williamjrider.wordpress.com/2015/05/29/focusing-on-the-right-scaling-is-essential/). Despite this, we put little or no effort in this demonstrably productive direction. For example many codes rely desperately on numerical linear algebra algorithms. These algorithms have a massive impact on the efficiency of the modeling & simulation, often being enabling. Yet we have seen no fundamental breakthroughs in the field for over 30 years. Perhaps there aren’t any further breakthroughs to be made, but this is a far too pessimistic outlook on the capacity of humans to create, innovate and discover solutions to important problems.

There is a cult of ignorance in the United States, and there has always been. The strain of anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that ‘my ignorance is just as good as your knowledge.’

― Isaac Asimov

All of this gets at a far more widespread and dangerous societal issue; we are incapable of dealing with any issue that is complex and technical. Increasingly we have no tolerance for anything where expert judgment is necessary. Science and scientists are increasingly being dismissed when their messaging is unpleasant or difficult to take. Examples abound, with last week’s Brexit easily coming to mind, and climate change providing an ongoing example of willful ignorance. Talking about how well we or anyone else does modeling & simulation is subtle and technical. As everyone should be well aware, subtle and technical arguments and discussions are not possible in the public sphere. As a result we are left with horrifically superficial measures such as raw supercomputing power as measured by a meaningless, but accepted, benchmark. The result is a harvest of immense damage to actual capability and investment in a strategy that does very little to improve matters. We are left with a hopelessly ineffective high performance-computing program that wastes money and entire careers, and will in all likelihood result in a real loss of National supremacy in modeling & simulation.

All of this to avoid having a conversation of substance about what might constitute our actual capability. All to avoid having to invest effort in a holistic strategy that might produce actual advances. It is a problem that is repeated over and over in our National conversations due to the widespread distrust of expertise and intellect. In the end the business interests in computing hijack the dialog and we default to a “business” friendly program that is basically a giveaway. Worse yet, we buy into the short-term time scales of business and starve the long-term health of the entire field.

One of the early comments on this post took note of the display of “healthy skepticism”. While I believe that healthy skepticism is warranted, to call my point-of-view skeptical understates my position rather severely. I am severely critical. I am appalled by the utter and complete lack of thought going into this program. I realize that what I feel is not skepticism, but rather contempt. I see the damage being done by our approach every single day. The basis of the current push for exascale lacks virtually any rigor or depth of thought. It is based on misunderstanding science, and most egregiously the science it is supposed to be supporting. We have defaulted to a simple-minded argument because we think we can get money using it. We are not relying on a simple argument, but on a simple-minded argument. The simple argument is that modeling & simulation is vital to our security, economic or military. The simple-minded argument we are following is a real and present threat to what really matters.

More posts related to this topic:

https://williamjrider.wordpress.com/2016/05/04/hpc-is-just-a-tool-modeling-simulation-is-what-is-important/

https://williamjrider.wordpress.com/2014/12/08/the-unfinished-business-of-modeling-simulation/

 

 

 

Limiters are powerful; how to make them even more powerful

22 Wednesday Jun 2016

Posted by Bill Rider in Uncategorized

≈ 3 Comments

Progress is discovering what you can do without.

― Marty Rubin

Note: WordPress LaTeX woes continue so equations that wouldn’t parse are separated by “$” in the TeX “code”.

After last week’s orgy of words I was a bit more restrained this week; after all, I am on vacation! It’s a topic that means a lot to me and actually represents a fun topic, so I’m continuing the thematic direction. Go figure! Still, some more thoughts on the general recent theme are presented, where improvements would mean much more to computational science than a bunch of expensive, hard to use, energy wasting, hard-to-impossible to program computers; alternatively put, improvements in methods would make those precious computers radically more useful! None of the press regarding “exascale” computing acknowledges that supercomputing is an intrinsically holistic endeavor where the models, methods and algorithmic advances do far more to empower the future than any hardware investments.

Off the soapbox, and on to the chalkboard!

I’ve been advancing the theme of methods for improving the solution of hyperbolic PDEs for several weeks now. At the end of the last post I posited the proposition that looking to produce high order accurate approximations isn’t always the best idea. Quite often we must contend with solutions that are very under-resolved and rough in character. Experience seems to dictate that this eventuality is so important that it should always be present in the method’s design. One of the issues with the ENO and WENO methods is the absence of these ideas from these high-order methods, which may be limiting their utility for broad use. Under appropriate conditions abandoning the pursuit of accuracy may be the advisable thing to do. From our experience with first generation monotonicity-preserving methods we can see that under the conditions where accuracy cannot be maintained, the discrete conditions are incredibly useful and powerful. Perhaps there is a route to improvement in methods that may use this experience productively. Limiters seem to endow the methods with a greater and stronger sense of nonlinear stability (i.e., robustness), to the extent of allowing one to use unstable approximations locally in a completely stable, robust manner.

The first thing to do is review the relations for providing monotonicity-preserving methods, which can be easily stated by using TVD theory. This was discussed last week. First, the key fact that upwind differencing is a positive scheme is the foundation for everything, $ U(j,n+1) = U(j,n) - \nu \left[ U(j,n) - U(j-1,n) \right] $ where \nu = \frac{a \Delta t}{\Delta x} is the CFL number. Next, apply this basic thought process to a second-order extension (or a high-order extension more generally), U(j+1/2,n) = U(j,n) + \frac{1}{2} S(j,n). Plugging this into our update equation, $ U(j,n+1) = U(j,n) - \nu \left[ U(j+1/2,n) - U(j-1/2,n) \right] $, produces conditions on limiters for S(j,n). These are functional relationships defined by dividing the slope by the upwind difference on the mesh. These produce the famous diagrams introduced by Sweby under the proper CFL stability limits.
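For reference, here is a tiny sketch of a few classical limiter functions that live inside Sweby’s diagram, written (in one common convention, assumed here) as functions of the ratio of consecutive differences r = (U(j) - U(j-1)) / (U(j+1) - U(j)), with the limited slope S(j) = \phi(r) (U(j+1) - U(j)):

# Classical TVD limiter functions phi(r); staying inside Sweby's region
# (0 <= phi(r) <= min(2r, 2), with phi(1) = 1) keeps the scheme non-oscillatory.
def minmod(r):
    return max(0.0, min(1.0, r))

def superbee(r):
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))

def van_leer(r):
    return (r + abs(r)) / (1.0 + abs(r))

for r in (-1.0, 0.5, 1.0, 2.0, 4.0):
    print(r, minmod(r), superbee(r), van_leer(r))

All three vanish for r \le 0 (a local extremum) and pass through one at r = 1, which is the condition for second-order accuracy.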

Secondly, it is instructive to look at the differing approach taken by Suresh and Huynh. Here the limiting is done on the high-order edge values with two applications of the median function. The first step assures that the edge is bounded by its neighboring cell values, \mbox{median}\left( U(j), U(j+1/2), U(j+1) \right). The final step requires a bit more explanation and discussion because it differs in philosophy from the TVD limiters (or at least seemingly so; actually it is completely congruent). The idea of the second application is to prevent the transport equation from producing any new extrema in the flow. Since the first test checks that the value of the edge is bounded by the anti-upwind value, we need to keep things bounded for the upwind value. To find the boundary where things go south, we find the value where the update equals the upwind value, $ U(j-1) = U(j) - \nu \left[ U(j+1/2) - U(j-1/2) \right] $, and then take U(j-1/2) \approx U(j) so as not to help matters at all. Then we rearrange to find the bounding value, $ U(j+1/2) = U(j) + \frac{U(j) - U(j-1)}{\nu} $, which is a bit unsatisfying because of the Courant number dependence, but it’s based on transport, so what else did we expect! The second test is then $ \mbox{median}\left( U(j), U(j+1/2), U(j) + \frac{U(j) - U(j-1)}{\nu} \right) $. The result will be monotonicity-preserving up to the CFL limit \Delta t = \frac{\nu \Delta x}{a}.

Every solution to every problem is simple. It’s the distance between the two where the mystery lies.

― Derek Landy

These two views are utterly complementary in nature. One revolves around extending the positivity of coefficients that naturally comes from the use of upwind differencing. The second uses the combination of bounding reconstructions of a function and the direct impact of transport on the development of new extrema in flows. We could easily use either one, and more importantly these methods produce the “same” limiters under the right conditions. The ideas I’m developing here are a bit of an amalgam of my own ideas with those of Suresh & Huynh and Colella & Sekora. In particular, they might work extremely well within a framework that is built around the PPM scheme, which I believe is both the most powerful and flexible method to build modern methods upon.

With this basic background in hand let’s throw out the idea that will be developed for the rest of the post. If one looks at the simplest hyperbolic PDE, the one-way wave equation, U_t + a U_x = 0, we can see that the same equation can be derived to describe the evolution of any derivative of the solution: (U_x)_t + a(U_x)_x = 0 \rightarrow V_t + a V_x = 0 and (U_{xx})_t + a(U_{xx})_x = 0 \rightarrow W_t + a W_x = 0. As a result a limiter on the value, the derivative, the second derivative, and so on of the solution should all take basically the same form. We can use basically the same ideas over and over again with appropriate accommodation of the fact that we are bounding derivatives. The question we need to answer is when we should do this. Here is the major proposition we are making in this blog post: if the derivative in question takes the same sign locally then the limiter will be governed by either a high-order approximation or the limit defined, but if the sign changes, the limiter will move to evaluation of the next derivative, and this may be done recursively.

In its simplest form we will simply have a monotonicity preserving method. We can take a high-order approximation U_H (fifth-order upwinding suffices) and see whether it is monotone. If the local differences are different in sign we simply use first-order upwinding. If the signs are the same we see whether the high-order approximation is bounded by the monotone limit and upwinding, U_f = \mbox{median}\left( U(j), U_H, U(j) + \frac{1}{2}S_M \right) where S_M = 2\mbox{minmod}\left( U(j)-U(j-1), U(j+1)-U(j) \right). Note that if S_M=0 there is an extremum. Doing something with this knowledge would seem to be a step in the right direction instead of just giving up.
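A minimal transcription of this step in code, assuming the usual minmod and median definitions and taking the high-order edge value U_H as given (a sketch, not a production scheme):

# Monotonicity-preserving edge value, as described above.
def minmod(a, b):
    # zero if the signs differ, otherwise the argument of smallest magnitude
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def median(a, b, c):
    # the middle value of the three
    return a + minmod(b - a, c - a)

def mp_edge(Ujm1, Uj, Ujp1, U_H):
    # U_H is an unlimited high-order estimate of U(j+1/2), e.g. fifth-order upwind
    S_M = 2.0 * minmod(Uj - Ujm1, Ujp1 - Uj)
    if S_M == 0.0:
        return Uj                  # local extremum: fall back to first-order upwinding
    return median(Uj, U_H, Uj + 0.5 * S_M)

The fallback to first order at extrema is exactly the behavior the extrema-aware extension sketched further below tries to improve.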

It takes a long time to learn how to do something simple.

― Marty Rubin

In the vein of keeping things simple, I will stick with the bounding approach for now, and avoid the complications of the transport mindset. We basically go through a set of bounding arguments starting with basic monotonicity-preserving limiters, and graduating to higher order forms if we detect an extremum. This idea can then be applied recursively to whatever degree you’d like or can tolerate. The concept is that one can do something if a high-order structure is detected (some sort of critical point where a derivative of the solution goes through zero), but if the structure doesn’t have a critical point and is simply under-resolved, higher order doesn’t make sense to pursue. This is the key to the entire dialog here; sometimes high-order simply doesn’t make sense, and the pragmatic approach is to stop.

The basic idea is that we can produce a sequence of approximations using the same basic conceptualizations as the wildly popular and powerful monotonicity-preserving methods, in a manner that will promote improved accuracy to the extent we are willing to exercise it. More importantly the methods acknowledge the impact of under-resolved solutions and the utter lunacy of trying to achieve high-order accuracy under those circumstances. If we find an extremum in the solution, we look at whether the extremum is smooth enough to continue the use of the high-order approximation. The first derivative is approximately zero, but may need to have a non-zero value to promote accuracy. We start by using the safest value for the first derivative using the mineno limiter introduced two posts ago, which is the local gradient with the smallest absolute magnitude, U1(j) = \mbox{mineno}\left( U(j+1)-U(j), U(j)-U(j-1) \right). Step two would produce a set of estimates of the second derivative locally; we could easily come up with two or three centered in cell j. We will suggest using three different approximations, U2c(j) = U(j+1) - 2 U(j) + U(j-1), U2L(j) = U(j+1) - U(j) - U(j-1) + U(j-2) and U2R(j) = U(j+2) - U(j+1) - U(j) + U(j-1). Alternatively we might simply use the classic second-order derivative in the adjacent cells instead of at the edges. We then take the limiting value U2(j) = 2\mbox{minmod}\left(U2L(j), U2c(j), U2R(j) \right). We can then define the bounding value for the edge as \mbox{median}\left( U(j) + \frac{1}{2} U1(j), U(j+1/2), U(j) + \frac{1}{2} U1(j) + \frac{1}{6} U2(j) \right). Like the monotonicity-preserving method, this estimate should provide a lot of room if the solution is well enough resolved. This will be at least a second-order approximation by the properties of the median function and the accuracy of the mineno-based linear extrapolation.
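Continuing the sketch from above, here is a literal transcription of this extremum branch; the stencils and factors follow the text exactly, and the mineno definition (smallest magnitude regardless of sign) is my reading of the earlier post:

# Extrema-aware edge bound, intended to replace the first-order fallback at extrema.
def mineno(a, b):
    # the argument of smallest absolute magnitude, with no sign requirement
    return a if abs(a) < abs(b) else b

def minmod3(a, b, c):
    # zero unless all three agree in sign; otherwise the smallest magnitude
    if a > 0.0 and b > 0.0 and c > 0.0:
        return min(a, b, c)
    if a < 0.0 and b < 0.0 and c < 0.0:
        return max(a, b, c)
    return 0.0

def extrema_edge(Ujm2, Ujm1, Uj, Ujp1, Ujp2, U_H):
    U1 = mineno(Ujp1 - Uj, Uj - Ujm1)              # safe first derivative
    U2c = Ujp1 - 2.0 * Uj + Ujm1                   # centered second difference
    U2L = Ujp1 - Uj - Ujm1 + Ujm2                  # left-shifted estimate
    U2R = Ujp2 - Ujp1 - Uj + Ujm1                  # right-shifted estimate
    U2 = 2.0 * minmod3(U2L, U2c, U2R)
    safe = Uj + 0.5 * U1                           # linear (safe) extrapolation
    bound = safe + U2 / 6.0                        # parabolic bound
    return median(safe, U_H, bound)                # median() from the earlier sketch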

This should allow one to get at least second-order accuracy in the infinity norm. It may make sense to use the broader stencils for the first derivative to improve the solutions a bit. To repeat, the general concept is to make sure any evolution of the derivative of the solution does not create a new extremum in the derivatives (instead of in the cell values, as in basic monotonicity-preserving methods). Now, I will sketch out what the next order higher in the limiter hierarchy would look like. This limiter would only be used in places where there are twin critical points in the value and derivative of the solution. A final flourish for these methods would be to add a positivity limiter that helps ensure that no unphysical values are generated (i.e., negative densities or pressures etc.).

The next higher order in the hierarchy would require us to make a safe approximation to the first and second derivatives, then use a bounding approximation for the third derivative. In addition, if the third derivative has a sign change in the cell (or nearby vicinity), we either go to the next order in the hierarchy or truncate it with the limiter based on the safe extrapolation. This highlights one of the characteristics of these limiters: the approximation will truncate to one order lower than the attempt (i.e., a safe limit). These lower limits are all within the ENO family of schemes and should provide entropy-stable approximations.

The key to this entire line of thinking is the manner of encoding pragmatic thinking into extending limiters. TVD and monotonicity-preserving limiters are dynamically pragmatic in their makeup. If a profile is too steep, it is under-resolved and cannot be evolved without inducing oscillations. Furthermore, the whole idea of a convergent Taylor series expansion, based on intrinsic smoothness, is not a sensible approximation strategy there. It makes perfect sense to stop before making a bad decision. This pragmatism extends to local extrema where we basically reject high-order ideas and use a first-order method to be safe. This safety is also built into the whole foundation of the method. The idea I am exploring here is to consider carefully whether these extrema are capable of being represented with a smoothness-based approximation, to see whether a Taylor series holds and can be used profitably, but eventually to surrender to the reality of under-resolution. For problems with discontinuities and/or shocks, this is always in play.

It is notable that with the median limiters and ENO-like schemes, the first-order approximation is always in play via the TVD-like comparison scheme. These might be equally pursued in the case of the extrema-preserving method. In other words instead of using a TVD comparison scheme, the extrema-preserving method may be used to good measure in creating an even more powerful high-resolution method.

The role of genius is not to complicate the simple, but to simplify the complicated.

― Criss Jami

Everything has boundaries. The same holds true with thought. You shouldn’t fear boundaries, but you should not be afraid of destroying them. That’s what is most important if you want to be free: respect for and exasperation with boundaries.

― Haruki Murakami

A Recipe for Improved Methods

14 Tuesday Jun 2016

Posted by Bill Rider in Uncategorized

≈ 4 Comments

 

A couple of notes:

  1. This is the longest post I’ve written, and it’s way too long (4500 words! TL;DR)! It’s a topic that inspires me, so sue me! The blog is for me, and if others find it useful then it’s all the better.
  2. The WordPress LaTeX capability is an awesome thing, but it’s buggy as hell and things sometimes just don’t work (it pisses me off because a couple of the equations are important and they never worked)! In these cases I’ve left the equations in their naked LaTeX form since the parser is too stupid to render them properly. At least you’ll know what I’m intending to say; these cases are offset by a pair of “$”.

Life is really simple, but we insist on making it complicated.

― Confucius

The last couple of weeks have featured a general-purpose discussion of nonlinearity in approximation, and a clever way of introducing it. The original wave of nonlinear differencing that overcame Godunov’s theorem has been massively successful in transforming computational physics. The flux and slope limiting schemes provided both practical and pragmatic methods for solving problems that vexed earlier generations. The trick is continuing the forward progress and overcoming the shortcomings of these successful methods. While successful and useful, the first generation of nonlinear methods is not without faults and problems. Their replacements have never been entirely successful for a variety of reasons that I’ll elaborate upon below. Along with the failure to replace the first generation we have also seen a relative demise of further progress, for reasons I’ve written about extensively. It is high time for some new principles to be introduced, and for a new generation of methods that make things better.

Just for the sake of completeness, linear methods are generically inadequate for solving many modern or real problems. As Godunov’s theorem dictates, a linear method cannot be both non-oscillatory and better than first-order accurate; you have to choose one or the other, and you need both. These methods may be resurrected via a variety of heavy-handed mechanisms that almost always either render a solution first-order accurate, or rely upon excessive mesh resolution to resolve everything. Problems that can be fully resolved are useless idealizations anyway! Sometimes the lousy solution is simply accepted, being either overly diffuse or numerically corrupted. In either case these methods were not remotely the path to success for modeling and simulation. Their replacement was transformative and opened the door to success far more than any advances in computing hardware or power. The lack of acknowledgment that improved methods, rather than reliance on hardware, drove this success is vexing, intellectually dishonest, and shallow. The truth is that the nonlinear methods are an absolute and primal necessity for modeling and simulation to have its current measure of success. Computers, rather than being enabling, are just the “icing on the cake”.

Like all magnificent things, it’s very simple.

― Natalie Babbitt

As noted before, linear methods are only limited (pun intended) in their utility. Godunov’s theorem gives the reason for the limitations, and the path forward to nonlinear methods. In an “explosion” of innovation in the early 1970’s four different researchers came up with various nonlinear (adaptive) differencing methods to overcome Godunov’s theorem (in the USA, Russia, the Netherlands, and Israel). In the fullness of time, these approaches have proven to be fundamentally similar and all intimately related. Of course in the heat of the moment, the fighting and turmoil over the development was bitter and tumultuous. Each method took the path of blending first-order methods together with high-order differencing by detecting where the high-order method would get into trouble (cause oscillations), and adding enough of the first-order method to get the solution out of trouble. Useful first-order methods are overly dissipative, but they also produce physically relevant solutions. The detection mechanisms became widely known as limiters, acting to determine the mix of methods via analysis of the underlying local solution.

An important side note is that nonlinear differencing was used for shock wave solutions almost from the start. The first useful shock capturing method was Von Neumann and Richtmyer’s method using artificial viscosity, which is a nonlinear method. This work includes three major contributions, each having great traction in the tale unfolding below. The first is the underlying high-order method using a mesh staggered in time and space. It is second-order accurate, but catastrophically unstable without the next contribution, the artificial viscosity, which stabilizes the solution in the vicinity of shock waves by adding a shock-selective viscosity. This is a nonlinear differencing method because the dissipation is only added where there is a shock and not everywhere. Finally they used a stability analysis method that has become the standard for analyzing the stability of methods for PDEs.

A key part of the first-generation methods’ success was the systematic justification for the form of limiters via a deep mathematical theory. This was introduced by Ami Harten with his total variation diminishing (TVD) methods. This nice structure for limiters and proofs of the non-oscillatory property really allowed these methods to take off. To amp things up even more, Sweby did some analysis and introduced a handy diagram to visualize the limiter and determine simply whether it fit the bill for being a monotone limiter. The theory builds upon some earlier work of Harten that showed how upwind methods provided a systematic basis for reliable computation through vanishing viscosity.
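
To make the Sweby diagram concrete, here is a minimal sketch (Python, with names of my own choosing, nothing lifted from Sweby’s paper) of a few classic limiter functions and a check that each sits inside the TVD region, 0 \le \phi(r) \le \min(2r, 2) for r > 0 and \phi(r) = 0 otherwise, while passing through \phi(1) = 1 as second-order accuracy requires.

```python
import numpy as np

# A few classic limiter functions phi(r), written as they appear on a Sweby diagram.
def phi_minmod(r):   return np.maximum(0.0, np.minimum(1.0, r))
def phi_superbee(r): return np.maximum.reduce([np.zeros_like(r),
                                               np.minimum(2*r, 1.0),
                                               np.minimum(r, 2.0)])
def phi_vanleer(r):  return (r + np.abs(r)) / (1.0 + np.abs(r))

def in_tvd_region(phi, r):
    """Sweby's TVD region: phi(r) = 0 for r <= 0 and 0 <= phi(r) <= min(2r, 2) otherwise."""
    p = phi(r)
    upper = np.where(r > 0, np.minimum(2*r, 2.0), 0.0)
    return bool(np.all((p >= 0.0) & (p <= upper + 1e-12)))

r = np.linspace(-2.0, 4.0, 601)
for phi in (phi_minmod, phi_superbee, phi_vanleer):
    # each limiter stays in the TVD region and passes through phi(1) = 1 (second order)
    print(phi.__name__, in_tvd_region(phi, r),
          np.isclose(phi(np.array([1.0]))[0], 1.0))
```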

Simplicity is the ultimate sophistication.

― Clare Boothe Luce

The ground state method is upwind (donor cell) differencing, U(j,n+1) = U(j,n) - \nu \left(U(j,n) - U(j-1,n) \right) , which upon rearrangement shows that the new value, U(j,n+1) , is a combination of positive definite values as long as \nu < 1 . If the update to the new value is fully determined by positive coefficients, the result will be monotonicity preserving without any new extrema (new highs or lows not in the data). The TVD methods work in the same way except that the coefficients for the update are now nonlinear, and requiring those coefficients to be positive produces the conditions under which the update will be positive. I’ll focus on the simplest schemes possible using a forward Euler method in time, U(j,n+1) = U(j,n) - \nu \left(U(j+1/2,n) - U(j-1/2,n) \right) . For the upwind method this update is defined by U(j+1/2,n) = U(j,n) using a positive direction for transport. For a generic second-order method one uses U(j+1/2,n) = U(j,n) + \frac{1}{2} S(j,n) where the use of linear extrapolation gives accuracy. Plugging this into the forward Euler update, and writing the extrapolation in terms of a nonlinear function Q(j,n) , the update becomes U(j,n+1) = U(j,n) - \nu \left(1 + \frac{1}{2} Q(j,n) - \frac{1}{2} Q(j-1,n)\right) D(j-1/2,n) where D(j-1/2,n) = U(j,n) - U(j-1,n) . Now we can do some math using the assumption that Q(j-1,n) does nothing to help matters insofar as keeping coefficients positive. We are left with a pretty simple condition for positivity, 0 \le Q(j,n) \le \frac{2(1-\nu)}{\nu} . If 0 \le Q(j,n) \le 1 , the method is positive for \nu \le \frac{2}{3} ; if 0 \le Q(j,n) \le 2 , the method is positive for \nu \le \frac{1}{2} . The point is that we can now concretely design schemes based on these inequalities that have reliable and well-defined properties.
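
As a quick sanity check on these inequalities, here is a minimal sketch (my own illustrative Python, not anything from a production code) of the forward Euler update above with a minmod-limited slope, so that 0 \le Q \le 1 , run at \nu = 0.5 \le 2/3 ; the total variation of an advected square wave should not grow.

```python
import numpy as np

def minmod(a, b):
    return 0.5*(np.sign(a) + np.sign(b))*np.minimum(np.abs(a), np.abs(b))

def step(u, nu):
    """One forward Euler step of the limited scheme with U(j+1/2) = U(j) + S(j)/2."""
    dm = u - np.roll(u, 1)          # backward difference D(j-1/2)
    dp = np.roll(u, -1) - u         # forward difference  D(j+1/2)
    s  = minmod(dm, dp)             # limited slope, so 0 <= Q <= 1
    uh = u + 0.5*s                  # edge value U(j+1/2)
    return u - nu*(uh - np.roll(uh, 1))

def total_variation(u):
    return np.sum(np.abs(u - np.roll(u, 1)))

x  = np.linspace(0.0, 1.0, 200, endpoint=False)
u  = np.where((x > 0.3) & (x < 0.6), 1.0, 0.0)    # square wave on a periodic mesh
nu = 0.5                                          # inside the nu <= 2/3 bound derived above
tv0 = total_variation(u)
for _ in range(200):
    u = step(u, nu)
assert total_variation(u) <= tv0 + 1e-12          # no new extrema, variation does not grow
```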

The TVD and monotonicity-preserving methods, while reliable and tunable through these functional relations, have serious shortcomings. These methods have serious limitations in accuracy, especially near detailed features in solutions. These detailed structures are significantly degraded because the methods based on these principles are only first-order accurate in these areas, basically being upwind differencing. In mathematical terms this means that the solutions are first-order in the L-infinity norm, approximately order one-and-a-half in the L2 (energy) norm, and second-order in the L1 norm. The L1 norm is natural for shocks so it isn’t a complete disaster. While this class of methods is vastly better than its linear predecessors, effectively providing solutions completely impossible before, the methods are imperfect. The desire is to remove the accuracy limitations of these methods, especially to avoid degradation of features in the solution.

The next generation of methods (the second nonlinear generation) sought to overcome these problems. The thing to avoid in the approximation is the utilization of first-order differencing at extreme values. While avoiding it is sensible, it is also dangerous. Extrema are where new extrema in the solutions are created, where unphysical oscillations come from, and where the door opens for all the things these methods are supposed to cure. The upshot of this is that getting rid of the detriments of the TVD methods requires being really careful. This care went into the creation of the next class of methods, which were called essentially non-oscillatory (ENO) methods. The gist of the design was to try to select the locally smoothest differencing option, the one with the smallest magnitude of variation.

Because of this the methods tend to be very dissipative. This might seem odd at first glance, but consider that upwind differencing is based on a piecewise constant approximation within each cell. This is the lowest magnitude of variation possible; as such, the small-variation approximation within the cell used in ENO results in enhanced dissipation compared to TVD methods. This dissipative nature is one reason why these methods have not really been successful. The dissipative nature of ENO has recently been proven. It is a good thing in the sense that entropy satisfaction can be shown, but in some cases the dissipation can exceed that produced by upwinding. This is the second reason why ENO is not successful; exceeding upwind dissipation is actually a threat to the stability of the scheme. Thus we have a method that is both low fidelity and lacking in robustness. This ends up being the price to pay for not degrading to a low-order approximation at extrema and maintaining formal high-order accuracy there. Do we always have to pay this price? I don’t think so, and the median I talked about last week might allow us to get around the problem.

A big part of understanding the issues with ENO comes down to how the method works. The approximation is made by hierarchically selecting the smoothest approximation (in some sense) at each order and working one’s way up to high order. In this way the selection of the second-order term is dependent on the first-order term, the third-order term is dependent on the selected second-order term, the fourth-order term on the third-order one, and so on. This makes the ENO method subject to small variations in the data, and prey to pathological data. For example the data could be chosen so that the linearly unstable stencil is preferentially chosen, leading to nasty solutions. The safety of the relatively smooth and dissipative method makes up for these dangers. Still, these problems resulted in the creation of the weighted ENO (WENO) methods, which have basically replaced ENO for all intents and purposes.
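
To make the hierarchy concrete, here is a minimal sketch (my own illustrative Python) of the classic third-order ENO selection for the right-edge value of cell i: start from the cell itself and grow the stencil toward whichever side has the smaller undivided difference at each level.

```python
import numpy as np

# Right-edge reconstructions u_{i+1/2} on the three candidate third-order stencils,
# keyed by the left shift of the stencil (these are the same coefficients quoted
# later in the post as U_{3,1}, U_{3,2}, U_{3,3}).
COEF = {2: ( 1/3, -7/6, 11/6),   # stencil {i-2, i-1, i}
        1: (-1/6,  5/6,  1/3),   # stencil {i-1, i,   i+1}
        0: ( 1/3,  5/6, -1/6)}   # stencil {i,   i+1, i+2}

def eno3_edge(u, i):
    """Hierarchical ENO-3: grow the stencil toward the smoother side at each level."""
    left = i
    # level 1: compare the two candidate first differences
    if abs(u[left] - u[left-1]) < abs(u[left+1] - u[left]):
        left -= 1
    # level 2: compare the second differences of the two candidate extensions
    d2_left  = u[left+1] - 2*u[left]   + u[left-1]
    d2_right = u[left+2] - 2*u[left+1] + u[left]
    if abs(d2_left) < abs(d2_right):
        left -= 1
    a, b, c = COEF[i - left]
    return a*u[left] + b*u[left+1] + c*u[left+2]

# On smooth data the upstream-biased stencil tends to win; near a jump the
# selection shifts away from the discontinuity, avoiding the oscillatory stencil.
u = np.where(np.arange(12) < 6, 0.0, 1.0)
print(eno3_edge(u, 3), eno3_edge(u, 5))
```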

WENO gets rid of the biggest issues with ENO by making the approximation much less sensitive to small variations in the data, and by biasing the selection toward more numerically stable approximations. Moreover the solution to these issues allows an even higher-order approximation to be used if the solution is smooth enough to allow it. Part of the issue is the nature of ENO’s approach to approximation. ENO is focused on diminishing the dissipation at extrema, but systematically creates lower fidelity approximations everywhere else. If one looks at TVD approximations, the “limiters” have a lot of room for non-oscillatory solutions if the solution is locally monotone. This is part of the power of these methods, allowing great fidelity in regions away from extrema. By not capitalizing on this, the ENO methods failed to build successfully on the TVD foundation. WENO methods come closer, but still produce far too much dissipation and too little fidelity to fully replace monotonicity-preservation in practical calculations. Nonetheless ENO and WENO methods are pragmatically and empirically known to be nonlinearly stable methods.
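
For contrast, here is a minimal sketch (my own illustrative Python) of the standard Jiang-Shu fifth-order WENO edge value: the same three third-order candidates, but blended with smooth nonlinear weights built from smoothness indicators rather than selected outright.

```python
import numpy as np

def weno5_edge(u, i, eps=1e-6):
    """Jiang-Shu fifth-order WENO value at the right edge x_{i+1/2} (positive wind)."""
    # the three third-order candidate values (same stencils as classic ENO)
    q0 =  (1/3)*u[i-2] - (7/6)*u[i-1] + (11/6)*u[i]
    q1 = -(1/6)*u[i-1] + (5/6)*u[i]   +  (1/3)*u[i+1]
    q2 =  (1/3)*u[i]   + (5/6)*u[i+1] -  (1/6)*u[i+2]
    # smoothness indicators for each candidate stencil
    b0 = (13/12)*(u[i-2] - 2*u[i-1] + u[i])**2 + (1/4)*(u[i-2] - 4*u[i-1] + 3*u[i])**2
    b1 = (13/12)*(u[i-1] - 2*u[i] + u[i+1])**2 + (1/4)*(u[i-1] - u[i+1])**2
    b2 = (13/12)*(u[i] - 2*u[i+1] + u[i+2])**2 + (1/4)*(3*u[i] - 4*u[i+1] + u[i+2])**2
    # nonlinear weights built from the optimal linear weights (1/10, 6/10, 3/10)
    a = np.array([0.1/(eps + b0)**2, 0.6/(eps + b1)**2, 0.3/(eps + b2)**2])
    w = a / a.sum()
    return w[0]*q0 + w[1]*q1 + w[2]*q2
```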

Given the lack of success of the second generation of methods, we have a great need for a third generation of nonlinear methods that might displace the TVD methods. Part of the issue with the ENO methods is the lack of a strict set of constraints to guide the development of the methods. While WENO methods also lack such constraints, they do have a framework that allows for design through the construction of the smoothness sensors, which help to determine the weights of the scheme. It allows for a lot of creativity, but the basic framework still leads to too much loss of fidelity compared to TVD methods. When one measures the accuracy of real solutions to practical problems, WENO still has difficulty being competitive with TVD. Don’t be fooled by comparisons that show WENO being vastly better than TVD; the TVD method chosen for the comparison is absolute shit. A good TVD method is very competitive, so the comparison is frankly disingenuous. For this reason something better is needed or progress will continue to stall.

My philosophy is to use the first two generations of nonlinear methods as the foundation. The median function is a means of combining their properties to get the best of both! This approach was taken in my 2007 paper with Jeff Greenough and Jim Kamm. As I stated then, the objective is to get the best of both worlds (TVD/monotonicity-preserving and ENO/WENO). Today I will explore whether we can go even further.

The median comes in as a selection mechanism that allows beneficial properties to be built into methods. The three properties we will focus upon are listed here

  • Numerical accuracy, both resolution/fidelity and order of accuracy
  • Linear stability, or L2-energy stability
  • Nonlinear stability, L1, variation or non-oscillatory

These properties, if brought together in a provable manner, should allow new methods to be crafted that may replace TVD methods in practical codes: accurate, high-fidelity, stable, non-oscillatory methods for producing better solutions efficiently.

The approach to take in producing better methods is to start as before with TVD methods, U(j,n+1) = U(j,n) - \nu \left(U(j+1/2,n) - U(j-1/2,n) \right) , defining U(j+1/2,n) = U(j,n) + \delta U(j,n) where this edge value is defined as being high-order accurate. If we can define inequalities that limit the values that \delta U(j,n) may take, we can have productive methods. This is very challenging because we cannot have purely positive definite coefficients if we want high-order accuracy, and we want to proceed beyond monotonicity-preservation. The issue that continues to withstand strict analysis is the determination of how negative the coefficients may become without creating serious problems.

Satisfaction is the biggest hurdle to success

― Pushkar Saraf

The approach that I am advocating was described in last week’s post where the median function was used to “breed” new high-order approximations with appropriate linear and nonlinear stability properties. The “seed” for any given scheme will be a monotonicity-preserving–TVD method to act as the selector for the median. A good reason to use these methods is their nonlinear stability properties. Remember that the manner in which the median works requires at least two of the three arguments to have a property in order to assure that the evaluation of the function retains this property.

Dealing with this serious deficiency in a rigorous manner is extremely difficult. A philosophy for dealing with it may be to ignore the problem and introduce other mechanisms to make the solution robust. For example, use a positivity limiter approach to keep positive-definite quantities positive during the integration of the equation. These limiters are devilishly simple and drop into this numerical framework almost seamlessly. The limiter takes a minimum acceptable value U_{\min} and modifies the approximation to remove violations of that value in the evolution. The form is U(j+1/2,n) := U(j,n) + \theta\left( U(j+1/2,n) - U(j,n) \right) with \theta = \min \left( 1,\frac{ U_{\min} - U(j,n) }{ U(j+1/2,n) - U(j,n) } \right) . The really awesome aspect of this limiter is its preservation of the accuracy of the original approximation. This could be extended to the enforcement of other bounds without difficulty.
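
Here is a minimal sketch of this kind of rescaling, written for a pair of reconstructed edge values; the function name and the exact arrangement of the ratio are my own, a slight rearrangement of the expression above so the scaling factor stays in [0, 1] whenever the cell mean itself respects the bound.

```python
def positivity_limit_edges(u_bar, u_left, u_right, u_min=0.0, tiny=1e-14):
    """Rescale both reconstructed edge values toward the cell mean u_bar so that
    neither falls below u_min.  The cell mean is assumed to satisfy the bound
    already; where the reconstruction is admissible nothing changes, which is
    what preserves the accuracy of the original approximation."""
    m = min(u_left, u_right)
    if m >= u_min:
        return u_left, u_right                     # nothing to do
    theta = min(1.0, (u_bar - u_min) / max(u_bar - m, tiny))
    return (u_bar + theta*(u_left - u_bar),
            u_bar + theta*(u_right - u_bar))

# example: a cell mean of 0.1 with an undershooting left edge gets pulled back to >= 0
print(positivity_limit_edges(0.1, -0.05, 0.3))
```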

The various extrema-preserving limiters of Suresh and Huynh, Colella and Sekora, or from myself along with Greenough and Kamm may be employed to good effect to keep the integration fully nonlinearly stable while allowing high-order accuracy. In every case these “limiters” or strategies are used to provide accurate approximations throughout the flow field being solved. This contrasts with the previous limiter, where a single non-local bound is enforced; by being more local the problem is more difficult and accuracy is more likely to be disturbed. The approaches taken before all work to detect extrema in the solutions and produce approximations that preserve valid extrema without producing new, invalid ones. A trap this may fall into is providing only the capacity to produce solutions that are uniformly second-order accurate, instead of the first-order accuracy that is standard with monotonicity-preservation. The problem with these approaches is their lack of rigorous and provable bounds or properties of the sort that helped drive the success of TVD methods. We will end by discussing an approach that might go to arbitrarily high order without inducing extreme dissipation in the bulk of the solution.

This approach is based on the properties of the median discussed last week. We are going to gang up on bad approximations using the median, pairing approximations that share a desirable property to make sure any bad properties are removed. A really cool thing is getting rid of two bad properties in one fell swoop. Take the following example: argument one is low-order, but linearly and nonlinearly stable (upwind perhaps, or TVD); the second is linearly unstable (nonlinearly too), but high-order; and the third is both high-order and nonlinearly stable. A median evaluation will produce a new approximation that is both nonlinearly stable and high-order, even if the value returned is not the original argument carrying that property. This is a really powerful tool for getting what we are looking for!

More typically, we will focus on a single property in the evaluation of the median. In structuring and evaluating the tests one should carefully look at what common properties can be provably produced through any given test. The key is to use the evaluations to build up approximations with uniformly desirable properties. I gave several examples last week, and I will close with a new example that brings us closer to the goals set forth for a new generation of methods.

Let’s lay out the building blocks for our approximation scheme.

  1. Have a TVD method, either a “classic” one, or a monotonicity-preserving high-order scheme using the stencil described next.
  2. Have an extra high-order method using a full stencil. This is analogous to the fifth-order WENO. Use a linear combination of three third-order stencils to get a linear fifth-order scheme.
  3. One might also evaluate the fifth-order WENO method.
  4. Finally have the three base third-order stencils recognizing that one of them is the “classic” ENO scheme.
  5. Take inventory. You have three nonlinearly stable stencils, two fifth-order stencils, and three third-order stencils. You also have three or four linearly stable stencils.

Now let the median magic happen right in front of you! Could one orchestrate the evaluations to give you a fifth-order, linearly and nonlinearly stable approximation? Can you really have it all? The answer, dear readers, is yes.

Here is how one might take on this challenge to systematically produce the desired result. Start with the playoff system discussed last week. This would use three median tests of the TVD scheme and the three third-order stencils. One of the third-order stencils is linearly unstable, another is the classic ENO stencil (they might be the same one, or they might not, such is the nature of nonlinear stability), and U_{3,2} is the third-order upstream-centered scheme, which is known to be very desirable. At the end of the three tests we have a third-order scheme that is linearly and nonlinearly stable. Don’t take my word for it, let’s take it all apart. The semi-finals would consist of two choices, U_{3,A} = \mbox{median}\left(U_{3,1}, U_{3,2}, U_{2,M}\right) and U_{3,B} = \mbox{median}\left(U_{3,2}, U_{3,3}, U_{2,M}\right) . Both U_{3,A} and U_{3,B} are third-order accurate, one of them is also nonlinearly stable to the extent that the ENO approximation would be, and one of them, U_{3,B} in this case, is linearly stable. Now we have the finals for step one, U_{3,C} = \mbox{median}\left(U_{3,A}, U_{3,B}, U_{2,M}\right) ; this is nonlinearly stable and third-order, but linear stability isn’t automatically given unless U_{2,M} is itself linearly stable. For example, the first-order method is linearly and nonlinearly stable, and the second-order monotone Fromm scheme is linearly and nonlinearly stable. Either of these choices gives the desired outcome: a guaranteed third-order, linearly and nonlinearly stable scheme.

Now we can use a single additional test to promote things to a fifth-order version of the same. Our final test is U_{5,LN} = \mbox{median}\left(U_{3,C}, U_{5,L}, U_{5,N}\right) . Here the third-order approximation has both desired stability properties, the fifth-order linear approximation (upstream-centered) is linearly stable, and the fifth-order WENO is nonlinearly stable. The final evaluation has it all! Fifth-order! Linear stability! Nonlinear stability! All using the magical mystical median!
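
A minimal sketch of that final promotion step (Python, with names of my own choosing; U3C stands for the playoff winner constructed as described above and U5N for a fifth-order WENO edge value such as the one sketched earlier):

```python
import numpy as np

def minmod(a, b):      return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))
def median3(a, b, c):  return a + minmod(b - a, c - a)

def upstream5_edge(u, i):
    """Linear fifth-order upstream-centered edge value U_{5,L} at x_{i+1/2}."""
    return ((1/30)*u[i-2] - (13/60)*u[i-1] + (47/60)*u[i]
            + (27/60)*u[i+1] - (1/20)*u[i+2])

def promote_to_fifth(U3C, U5L, U5N):
    """Final test: two arguments are fifth-order, two are linearly stable and two
    are nonlinearly stable, so the median inherits all three properties."""
    return median3(U3C, U5L, U5N)
```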

Cherish those who seek the truth but beware of those who find it.

― Voltaire

There is still a lot to do. A few things stand out insofar as coming to more optimal methods. For example, is it a good idea to first test the high-order approximation you’d like to use to see if it violates monotonicity, and if it doesn’t, dispense with the whole survey of the other choices? What gives a better answer? In previous work this can yield big savings. In addition, the major issues with the monotone methods appear at extrema, where almost all of the methods degrade to first-order accuracy. Getting safe, high-fidelity approximations at extrema benefits from the sort of things done in the set of papers by myself and others trying to extend limiters. New extrema, were they created, would manifest themselves as negative coefficients in the update instead of the positive-definite coefficients of monotonicity-preserving approaches, and the allowable magnitude of those negative coefficients remains undefined by rigorous analysis. Similarly, if the approximations are steeper than non-oscillation would allow in monotone regions of the flow, this would also manifest itself as negative coefficients. The magnitude of the negativity and the size of the resulting oscillations are difficult to define. So far they do not appear to match the desired expectation of ENO schemes of being O(h^n) where n is the order of the method, which might be as high as one likes. Evidence appears to go in the opposite direction insofar as WENO goes, where the very high order WENO methods (higher than seventh order) appear to have larger oscillations and generic robustness issues.

What I will say is that the community of scientists building shock capturing methods do not as a rule measure the accuracy of the methods they work on under practical circumstances. Typically accuracy is only measured on problems where the design order of accuracy may be achieved. This characterizes none of the problems we develop such methods to solve! NONE! ZERO! NADA!

Thinking something does not make it true. Wanting something does not make it real.

― Michelle Hodkin

If we want continued improvement of methods we need to measure our progress! We need to get in the business of measuring the overall efficiency of the methods. The goal of producing a new generation of methods to replace the very successful first generation of nonlinear methods should be focused on innovation with the purpose of producing better solutions for less effort, along with deeply nonlinear stability, high resolution, and a reliable and robust framework allowing the method to adapt to the problem at hand. Today we avoid this like the proverbial plague and it serves no one but a small cadre of insiders. It serves them poorly by maintaining an increasingly antiquated status quo, which only stands in the way of actual transformative progress.

While I’m dishing out dollops of cruel reality I should acknowledge that for practical problems there is a certain futility in seeking formally high-order accurate solutions. Almost everything necessary to model reality conspires against us. At every turn we are confronted with a brutal reality that offers us rough, and nearly singular behavior. To some extent the first generation methods reflect this brutish reality offering a pragmatic methodology. These methods simply reject any attempt at accuracy if the solution is too under-resolved or rough. Perhaps we must accept a degree of imperfection in the face of this truth?

There are two ways to get enough. One is to continue to accumulate more and more. The other is to desire less.

― G.K. Chesterton

The attempts to produce methods that robustly allow extrema in the solution to be evolved with less error effectively extend this principled submission to difficulty. They offer rather involved analysis of the extrema including explicit limits on their structure. The problem is that these methods have rather distinct limits on the accuracy that may be achieved with them. Maybe this is really OK? For too long we have pursued methods whose design and energy is driven by problems that are largely meaningless idealizations of the reality we should focus on. We might actually progress more effectively by accepting the intrinsic limitations of our models, and produce pragmatic improvements that manage the reality of their solution.

We all die. The goal isn’t to live forever, the goal is to create something that will.

― Chuck Palahniuk

References

Harten, Ami. “High resolution schemes for hyperbolic conservation laws.” Journal of Computational Physics 49.3 (1983): 357-393. Also: Harten, Ami. “High resolution schemes for hyperbolic conservation laws.” Journal of Computational Physics 135.2 (1997): 260-278. (See also Harten, Ami. “On a class of high resolution total-variation-stable finite-difference schemes.” SIAM Journal on Numerical Analysis 21.1 (1984): 1-23.)

Sweby, Peter K. “High resolution schemes using flux limiters for hyperbolic conservation laws.” SIAM journal on numerical analysis 21.5 (1984): 995-1011.

Harten, Ami, Bjorn Engquist, Stanley Osher, and Sukumar R. Chakravarthy. “Uniformly high order accurate essentially non-oscillatory schemes, III.” Upwind and High-Resolution Schemes. Springer Berlin Heidelberg, 1987. 218-290. (See also Harten, Ami, Bjorn Engquist, Stanley Osher, and Sukumar R. Chakravarthy. “Uniformly high order accurate essentially non-oscillatory schemes, III.” Journal of Computational Physics 131, no. 1 (1997): 3-47.)

Harten, Ami, and Stanley Osher. “Uniformly high-order accurate nonoscillatory schemes. I.” SIAM Journal on Numerical Analysis 24.2 (1987): 279-309.

Fjordholm, Ulrik S., Siddhartha Mishra, and Eitan Tadmor. “Arbitrarily high-order accurate entropy stable essentially nonoscillatory schemes for systems of conservation laws.” SIAM Journal on Numerical Analysis 50.2 (2012): 544-573.

Fjordholm, Ulrik S., Siddhartha Mishra, and Eitan Tadmor. “ENO reconstruction and ENO interpolation are stable.” Foundations of Computational Mathematics 13.2 (2013): 139-159.

Jiang, Guang-Shan, and Chi-Wang Shu. “Efficient Implementation of Weighted ENO Schemes.” Journal of Computational Physics 126 (1996): 202-228.

Suresh, A., and H. T. Huynh. “Accurate monotonicity-preserving schemes with Runge–Kutta time stepping.” Journal of Computational Physics 136.1 (1997): 83-99.

Balsara, Dinshaw S., and Chi-Wang Shu. “Monotonicity preserving weighted essentially non-oscillatory schemes with increasingly high order of accuracy.” Journal of Computational Physics 160.2 (2000): 405-452.

Zhang, Xiangxiong, and Chi-Wang Shu. “Maximum-principle-satisfying and positivity-preserving high-order schemes for conservation laws: survey and new developments.” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. Vol. 467. No. 2134. The Royal Society, 2011.

Rider, William J., Jeffrey A. Greenough, and James R. Kamm. “Accurate monotonicity- and extrema-preserving methods through adaptive nonlinear hybridizations.” Journal of Computational Physics 225.2 (2007): 1827-1848.

Colella, Phillip, and Michael D. Sekora. “A limiter for PPM that preserves accuracy at smooth extrema.” Journal of Computational Physics 227.15 (2008): 7069-7076. (see also Sekora, Michael, and Phillip Colella. “Extremum-preserving limiters for MUSCL and PPM.” arXiv preprint arXiv:0903.4200 (2009).)

 

The Marvelous Magical Median

07 Tuesday Jun 2016

Posted by Bill Rider in Uncategorized

≈ 4 Comments

Every solution to every problem is simple. It’s the distance between the two where the mystery lies.

― Derek Landy

The median is what a lot of people think of when you say “household income”. It is a statistical measure of the central tendency of data, a harder-to-compute alternative to the friendly, common mean value. For large data sets the median is tedious and difficult to compute while the mean is straightforward and easy. The median is the middle entry of the ordered list of numerical data; thus the data being studied needs to be ordered, which is hard and expensive to do. The mean of the data is simple to compute. On the other hand, by almost any rational measure, the median is better than the mean. For one thing, the median is not easily corrupted by any outliers in the data (data that is inconsistent or corrupted); it quite effectively ignores them. The same outliers immediately and completely corrupt the mean. The median is strongly associated with the one-norm, which is completely awesome and magical. This connection is explored through the amazing and useful field of compressed sensing, which uses the one norm to do some really sweet (cool, awesome, spectacular, …) stuff.

About 15 years ago I found myself quite taken with a mathematical tool associated with constructing “limiters” for solving hyperbolic conservation laws numerically. This mathematical tool is the “median” function for three arguments. It simply returns the value that lies in the middle of the triad. It’s simple and powerful, but hasn’t really caught on more generally. It seems to be a boutique method rather than in common use. It’s a true shame, because the median is pretty fucking awesome. Given the median’s power and utility in building numerical methods, I find its lack of “popularity” puzzling. Since I like it so much, I will wax a bit on its greatness and what I’ve done with it.

It’s easy to attack and destroy an act of creation. It’s a lot more difficult to perform one.

― Chuck Palahniuk

I came across the function and its use with limiters reading the work of HT Huynh. HT works at NASA in Cleveland and has done some really great work over the years. The median function made its first appearance in work on Hermitian polynomial interpolation (used a lot with tabular lookups). He wrote several of my favorite papers on solving hyperbolic PDEs, the first on slope-limited second-order schemes (a real gem covering a wealth of material and presenting it in a wonderful manner; it deserves many more citations!) and a second with Suresh on limiting edge values (another tour de force!). The median function appears prominently in both and allows some cool stuff to be done in a very clear and literal fashion. Like the median function, HT’s work is significantly under-appreciated.

If you want something new, you have to stop doing something old

― Peter F. Drucker

History is filled with brilliant people who wanted to fix things and just made them worse.

― Chuck Palahniuk

The function itself is quite elegant. If you’re in the business of limiters, you can implement it directly from another function you certainly have on hand, the minmod (minimum modulus) function. Jay Boris invented the minmod as part of the flux corrected transport method (arguably the first limiter). It returns the argument with the minimum magnitude if the arguments to the function have the same sign, and zero otherwise. It can be implemented in several ways of varying elegance. The classic way of implementing minmod is \mbox{minmod}\left(a, b \right) = \mbox{sign}[a] \max\left[0, \min\left(|a|, \mbox{sign}[a]\, b \right)\right] . HT Huynh introduced a really cool way to write minmod that was much better since you could use it with Taylor series expansions straightaway, \mbox{minmod}\left(a, b \right) =\frac{1}{4} \left(\mbox{sign}[a]+\mbox{sign}[b]\right) \left(|a+b| - |a-b| \right) .

Once the minmod is in hand, the median is a piece of cake to implement, \mbox{median}\left(a, b, c \right) = a + \mbox{minmod} \left( b-a,c-a\right) . In papers by Huynh and Huynh & Suresh, the median is used to enforce bounds. This mindset is perfectly in keeping with the thinking about monotonicity-preserving methods, but the function has other properties that can take it much further. A lack of recognition of these other properties may limit its penetration into further use. In a general sense it is used in defining the limits that an approximate value can take while still producing a monotonicity-preserving answer. The median is used to repeatedly bring an approximate value inside well-defined bounding values, although for monotonicity it requires just two applications. Huynh first applied this to linearly interpolated second-order methods. In that case the median could be used to implement the classical TVD limiters or slope limiters. Suresh and Huynh took the same approach to higher-order accuracy by limiting edge value/flux approximations that might have an arbitrary order of accuracy, integrated via a method of lines approach, while using many other applications of the median in the process.
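
For anyone who wants to play along, here is a minimal sketch of the two functions (Python, my own transcription of the algebraic forms in this post):

```python
import numpy as np

def minmod(a, b):
    """Argument of minimum modulus when a and b share a sign, zero otherwise."""
    return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))

def median3(a, b, c):
    """Middle value of the triad, written through minmod as Huynh does."""
    return a + minmod(b - a, c - a)

# quick check: the middle argument comes back regardless of ordering
assert median3(3.0, -1.0, 2.0) == 2.0 and median3(-1.0, 3.0, 2.0) == 2.0
```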

Do not get obsolete like an old technology, keep innovating yourself.

― Sukant Ratnakar

Once the median function is in hand it may be used to write limiters in what I think is a rather elegant fashion. For example, take the standard monotonicity or TVD limiter for a “good” second-order method based on Fromm’s scheme, which can be written in a relatively hideous fashion as S_M = \mbox{sign}[S_2] \max\left[0, \min\left(|S_2|, 2\, \mbox{sign}[S_2] \Delta_- U, 2\, \mbox{sign}[S_2] \Delta_+ U \right)\right] . Here S_2 = \frac{1}{2}(\Delta_- U + \Delta_+ U) is the unlimited slope for the linear Fromm scheme, with \Delta_- U = U(j) - U(j-1) and \Delta_+ U = U(j+1) - U(j) being the backward and forward differences of the variables on the mesh. With the median we can write the limiter in two compact statements, S_m = \mbox{median}\left( 0, \Delta_- U , \Delta_+ U \right) and S_M = \mbox{median}\left( 0, S_2, 2 S_m \right) . More importantly, as Huynh shows, this format allows significant and profitable extensions of the approach leading to higher-accuracy approximations. It is this extensibility that may be exploited as a platform to innovate limiters and remove some of their detrimental qualities.
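
Here is a minimal sketch of both forms side by side (my own code, not Huynh’s), which can be used to check numerically that the two-median version reproduces the classic max/min expression:

```python
import numpy as np

def minmod(a, b):      return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))
def median3(a, b, c):  return a + minmod(b - a, c - a)

def slope_classic(dm, dp):
    """The 'hideous' form of the limited Fromm (monotonized-central) slope."""
    s2 = 0.5*(dm + dp)
    return np.sign(s2)*max(0.0, min(abs(s2), 2*np.sign(s2)*dm, 2*np.sign(s2)*dp))

def slope_median(dm, dp):
    """The same limiter written as two median statements."""
    sm = median3(0.0, dm, dp)              # this is just minmod(dm, dp)
    s2 = 0.5*(dm + dp)                     # unlimited Fromm slope
    return median3(0.0, s2, 2*sm)

rng = np.random.default_rng(0)
for dm, dp in rng.normal(size=(1000, 2)):
    assert np.isclose(slope_classic(dm, dp), slope_median(dm, dp))
```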

The same basic recipe of bounding allows the parabolic limiter for the PPM method to be expressed quite concisely and clearly. The classic version of the PPM limiter involves “if” statements and seems rather unintuitive. With the median it is very clear and concise. The first step simply assures that the chosen (high-order) edge values, U_l and U_r , are enclosed by the neighboring mesh values, U_r := \mbox{median}\left( U(j), U_r, U(j+1) \right) and U_l := \mbox{median}\left( U(j), U_l, U(j-1) \right) . The second step assures that the parabola does not contain a zero derivative (a local minimum or maximum) inside the cell, U_R = \mbox{median}\left( U(j), U_r, 3 U(j) - 2 U_l \right) and U_L = \mbox{median}\left( U(j), U_l, 3U(j) - 2 U_r \right) . With this version of PPM there is significant liberty in defining the left and right initial edge values by whatever formula, accuracy and nature you might like. This makes PPM an immensely flexible and powerful method (the original PPM is very powerful to begin with).
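
A minimal sketch of the two-step median form of the PPM edge limiting (my own code, following the formulas above; both step-two tests use the step-one values):

```python
import numpy as np

def minmod(a, b):      return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))
def median3(a, b, c):  return a + minmod(b - a, c - a)

def ppm_limit_edges(U, j, Ul, Ur):
    """Median form of the PPM limiting: box each edge value by the neighboring
    cell values, then remove any interior extremum of the resulting parabola."""
    # step 1: enclose each edge value by the adjacent mesh values
    Ur = median3(U[j], Ur, U[j+1])
    Ul = median3(U[j], Ul, U[j-1])
    # step 2: forbid a zero derivative (interior extremum) inside the cell
    Ur_new = median3(U[j], Ur, 3*U[j] - 2*Ul)
    Ul_new = median3(U[j], Ul, 3*U[j] - 2*Ur)
    return Ul_new, Ur_new
```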

The median preserves accuracy and allows one to create new nonlinearly accurate approximations. Huynh delivered the median along with a theorem regarding its accuracy properties. If two of the three arguments in the median function have a certain order of accuracy in approximation, then the evaluated median function will have that order of accuracy. The proof is simple and intuitive. If one of the values with the defined order of accuracy is returned there isn’t a question; if the returned value does not have that order of accuracy a priori, its accuracy is in question. By being bounded by the two values with well-defined order of accuracy, the median value is a convex combination of the two accurate values, which locally means it has the same order of accuracy. Voilà! This means the median has the ability to “breed” new high-order approximations from existing high-order expressions. If you’re interested in producing well-behaved approximations that are also high-order, it is a super useful property.

Here is a supposition that extends the accuracy property of the median to make it more powerful. The median can also “breed” new nonlinearly stable approximations. This would work as follows: if two of the arguments are stable, the evaluation of the median would produce a stable value even if the argument returned is linearly unstable. Thus we can take various stable approximations and, through the median, use them to produce new stable approximations even from approximations that are themselves unstable. This should work exactly like the order-of-accuracy property: the evaluated median produces a value that is a convex combination of the two provably stable approximations. If this property can be proven then all you need to do is keep track of the properties as you work defining your method. If you are careful you can end up with provable methods having well-defined accuracy and stability even when some parts of the method have neither to begin with. This is the essence of the nonlinear method. It does something the linear method cannot.

This produced a concept for constructing nonlinear schemes that I originally defined as a “playoff” system. The best accurate and stable approximation would be the champion of the playoffs and would inherit the properties of the set of schemes used to construct the playoff. This works as long as the playoff is properly orchestrated, with the median used to settle each competition, to produce stable and high-order approximations in a careful manner. I had discussed this as a way to produce better ENO-like schemes. Perhaps I should have taken the biological analogy further in the definition of the scheme, in terms of what sort of offspring each decision point in the scheme bred. Here the median is used to breed the schemes and produce more fit offspring. The goal at the end of the scheme is to produce an approximation with a well-chosen order of accuracy and stability that can be proven.

The method I show here was developed with a goal in mind: produce an ENO(-like) method with all of the stability, but with better accuracy and fidelity. It would have the same order of accuracy as ENO methods, but would produce solutions of greater fidelity. One of the big issues with ENO (and WENO) methods is that they are relatively dissipative as well as being complex. For applied problems with shocks and discontinuous features a well-designed second-order method will produce a much better, higher fidelity solution than an ENO method. WENO is more competitive, but still not better than a method such as PPM until the solution has an immense amount of high-frequency structure. The goal is to produce ENO methods that inherit the sort of fidelity properties that second-order methods produce, but also provide bona fide high-order accuracy.

Here is a relatively simple example of the sort of scheme I suggested and how the “playoff” system would work. In this case, the goal is to produce a third-order approximation with well-defined stability properties. We start with three third-order approximations, U_{3,1} = \frac{1}{3}U(j-2) - \frac{7}{6}U(j-1) + \frac{11}{6}U(j) , U_{3,2} = -\frac{1}{6}U(j-1) + \frac{5}{6}U(j) + \frac{1}{3}U(j+1) and U_{3,3} =\frac{1}{3}U(j) + \frac{5}{6}U(j+1) -\frac{1}{6}U(j+2) (all written for the right edge of cell j ), plus a second-order monotone approximation U_{2,M} . For example one might use the second-order minmod based scheme, U(j) +\frac{1}{2}\mbox{minmod}\left( \Delta_- U, \Delta_+ U \right) . One of the three third-order methods is the upstream-centered scheme, which has favorable properties from an accuracy and stability point-of-view ( U_{3,2} for this example).

The second thing to note is that one of the three third-order approximations is the nonlinearly stable classical ENO approximation (we don’t know which one, and don’t need to either! and that is cool). The playoff would consist of two rounds and three applications of the median in the process. The semi-finals would consist of two choices, U_{3,A} = \mbox{median}\left(U_{3,1}, U_{3,2}, U_{2,M}\right) and U_{3,B} = \mbox{median}\left(U_{3,2}, U_{3,3}, U_{2,M}\right) . Both U_{3,A} and U_{3,B} are third-order accurate and one of them is also nonlinearly stable to the extent that the ENO approximation would be (the second stable scheme is U_{2,M} ). The finals then would be defined by the following test, U_{3,C} = \mbox{median}\left(U_{3,A}, U_{3,B}, U_{2,M}\right) and would produce a third-order and nonlinearly stable approximation. This is a direct consequence of the median function’s use. Two of the approximations are third-order, and two of the approximations are nonlinearly stable, so the result of the playoff final is an approximation that gives us both. The results from my paper strongly indicate that this approach is a success, and can be used to construct methods of much higher order accuracy.
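
A minimal sketch of this playoff for the right-edge value (my own code; the stencils are the ones written above and U_{2,M} is the minmod-based second-order edge value):

```python
import numpy as np

def minmod(a, b):      return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))
def median3(a, b, c):  return a + minmod(b - a, c - a)

def playoff3_edge(u, j):
    """Third-order, nonlinearly stable right-edge value u_{j+1/2} via the median playoff."""
    # the three third-order candidates
    U31 =  (1/3)*u[j-2] - (7/6)*u[j-1] + (11/6)*u[j]
    U32 = -(1/6)*u[j-1] + (5/6)*u[j]   +  (1/3)*u[j+1]   # upstream-centered
    U33 =  (1/3)*u[j]   + (5/6)*u[j+1] -  (1/6)*u[j+2]
    # the second-order monotone comparison value
    U2M = u[j] + 0.5*minmod(u[j] - u[j-1], u[j+1] - u[j])
    # semi-finals and final
    U3A = median3(U31, U32, U2M)
    U3B = median3(U32, U33, U2M)
    return median3(U3A, U3B, U2M)
```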

If one were feeling more bold about solving problems of this ilk, one might throw the fifth-order upstream-centered approximation into the mix. This fifth-order approximation is the asymptotic limit of the fifth-order WENO method when the solution is (very) smooth and is a linear combination of the three third-order approximations, U_5 = \frac{1}{30}U(j-2) - \frac{13}{60}U(j-1) + \frac{47}{60}U(j) + \frac{27}{60}U(j+1) - \frac{1}{20}U(j+2) . Make the fifth-order approximation the returning champion and have it play the winner of the playoff system in a final match. This would look like the following, U_{3,D} = \mbox{median}\left(U_{3,C}, U_5, U_{2,M}\right) . It would still only be third-order accurate in a formal manner, but would have better accuracy where the solution is well-resolved. The other thing that is intrinsically different from the classic ENO approach is the guidance of the choice for U_{2,M} . To a very large extent the result of the playoff is determined by this choice and the result inherits the properties of this method. There is a wealth of monotonicity-preserving second-order methods to choose from for determining the fitness of the high-order approximations.

ENO methods traditionally use a hierarchical stencil selection approach that introduces serious practical problems. Their weighted version (WENO) is a far better method, both because it is higher order and because it has better stability properties. The high-order nature of these schemes can be analyzed via the elegant continuous nature of the weights. The largest issue with both methods is their great cost, either in floating point or logical expression evaluation. Using the median function can provide a route analogous to the playoff scheme above to produce an ENO method where the chosen scheme is the one closest to some “comparison” method taken from the wealth of monotonicity-preserving methods. This ENO scheme would then inherit the basic properties of the base scheme, but also have the high-order nature of ENO along with being non-oscillatory. A close relative of the median function is the xmedian, which delivers whichever argument is closer in an absolute value sense to the comparison value. For the form \mbox{xmedian}\left(a, b, c \right) the function will provide either b or c depending on their distance from a . The function is written down at the end of the post along with all the other functions used (I came up with these to allow the analysis of this class of methods via modified equation analysis).

Here is a third-order ENO-like method based on either the median or xmedian procedure. Both procedures will produce a bona fide method with third-order accuracy (confirmed via numerical tests). The method starts with the selection of the comparison scheme, U_c . This scheme may be first- or second-order accurate, but needs to have strong nonlinear stability properties. For the third-order (or fourth-order) method, the scheme has two rounds and three total tests (i.e., semi-finals and finals). The semi-finals are U_{3,A} = \mbox{xmedian}\left( U_c, U_{3,1}, U_{3,2} \right) and U_{3,B} = \mbox{xmedian}\left( U_c, U_{3,2}, U_{3,3} \right) . The finals are then easy to predict from the previous method, U_{3,C} = \mbox{xmedian}\left( U_c, U_{3,A}, U_{3,B} \right) . With the properties of the xmedian we can see that the scheme will select one of the third-order methods. The issue is that this playoff scheme does not produce a selection that would obviously be stable using the logic from earlier in the post. The question is whether the non-oscillatory method chosen by its relative closeness to the comparison scheme is itself nonlinearly stable.

The main question is the nature of the stability. The proposition is that the comparison scheme, drawn from the category of nonlinearly stable, monotonicity-preserving methods, endows the result with stability. With the median there is a very strong argument for stability. For the xmedian this argumentation is absent. Stability in both cases is borne out through numerical testing, but theoretical support does not seem to have an obvious path. I note that ENO and WENO methods don’t have a deep stability result aside from the notion that classic ENO will produce an entropy-stable flux (but WENO does not). Practical success would seem to favor WENO to a very large degree as it has been applied to a vast number of practical cases.

If you’re not prepared to be wrong, you’ll never come up with anything original.

― Ken Robinson

The use of the median and related functions starts to raise some questions desperately in need of answers. There is a broad and deep understanding of what numerical stability is and implies, but that understanding is for linear schemes; the concept of nonlinear stability needs deeper thought. The archetypes of nonlinear stability are TVD/monotonicity-preserving schemes and ENO or WENO methods. The nature of these methods allows the local use of intrinsically unstable approximations without the usual path to catastrophe this implies. Where is the dividing line between a method being non-oscillatory (nonlinearly stable) and not? Using the median or its brother the xmedian, I can come up with a host of methods that perform in a fashion that implies that they are nonlinearly stable. Can we come up with a technically defensible definition that can be used to take these methods even further?

The possession of knowledge does not kill the sense of wonder and mystery. There is always more mystery.

― Anaïs Nin

References

Huynh, Hung T. “Accurate monotone cubic interpolation.” SIAM Journal on Numerical Analysis 30.1 (1993): 57-100.

Huynh, Hung T. “Accurate upwind methods for the Euler equations.” SIAM Journal on Numerical Analysis 32.5 (1995): 1565-1619.

Suresh, A., and H. T. Huynh. “Accurate monotonicity-preserving schemes with Runge–Kutta time stepping.” Journal of Computational Physics 136.1 (1997): 83-99.

Boris, Jay P., and David L. Book. “Flux-corrected transport. I. SHASTA, A fluid transport algorithm that works.” Journal of computational physics 11.1 (1973): 38-69. (Kuzmin, D., R. Löhner, and S. Turek, eds. Flux-Corrected Transport: Principles, Algorithms, and Applications. Scientific Computation. Springer, 2005.)

Balsara, Dinshaw S., and Chi-Wang Shu. “Monotonicity preserving weighted essentially non-oscillatory schemes with increasingly high order of accuracy.” Journal of Computational Physics 160.2 (2000): 405-452.

Colella, Phillip, and Paul R. Woodward. “The piecewise parabolic method (PPM) for gas-dynamical simulations.” Journal of computational physics 54.1 (1984): 174-201.

Rider, William J., Jeffrey A. Greenough, and James R. Kamm. “Accurate monotonicity- and extrema-preserving methods through adaptive nonlinear hybridizations.” Journal of Computational Physics 225.2 (2007): 1827-1848.

Harten, A., Engquist, B., Osher, S., & Chakravarthy, S. R. (1997). Uniformly high order accurate essentially non-oscillatory schemes, III. Journal of Computational Physics, 131(1), 3-47. (Harten, Ami, et al. “Uniformly high order accurate essentially non-oscillatory schemes, III.” Upwind and High-Resolution Schemes. Springer Berlin Heidelberg, 1987. 218-290.)

Liu, Xu-Dong, Stanley Osher, and Tony Chan. “Weighted essentially non-oscillatory schemes.” Journal of computational physics 115.1 (1994): 200-212.

Shu, Chi-Wang. Essentially non-oscillatory and weighted essentially non-oscillatory schemes for hyperbolic conservation laws. Springer Berlin Heidelberg, 1998.

Henrick, Andrew K., Tariq D. Aslam, and Joseph M. Powers. “Mapped weighted essentially non-oscillatory schemes: achieving optimal order near critical points.” Journal of Computational Physics 207.2 (2005): 542-567.

Greenough, J. A., and W. J. Rider. “A quantitative comparison of numerical methods for the compressible Euler equations: fifth-order WENO and piecewise-linear Godunov.” Journal of Computational Physics 196.1 (2004): 259-281.

Rider, William J. “Building Better (Weighted) ENO Methods.” Computational Fluid Dynamics 2006. Springer Berlin Heidelberg, 2009. 161-166.

Fjordholm, Ulrik S., Siddhartha Mishra, and Eitan Tadmor. “Arbitrarily high-order accurate entropy stable essentially nonoscillatory schemes for systems of conservation laws.” SIAM Journal on Numerical Analysis 50.2 (2012): 544-573.

Algebraically defined functions:

\min\left(a,b \right) =\frac{1}{2}(a+b)- \frac{1}{2}|a-b|

\max\left(a,b \right) = \frac{1}{2} (a+b)+ \frac{1}{2}|a-b|

\mbox{minmod}\left(a, b \right) =\frac{1}{4} \left(\mbox{sign}[a]+\mbox{sign}[b]\right) \left(|a+b| - |a-b| \right)

\mbox{minmod}\left(a, b \right) = \mbox{sign}[a] \max\left[0, \min\left(|a|, \mbox{sign}[a]\, b \right)\right]

\mbox{maxmod}\left(a, b \right) =\frac{1}{4} \left(\mbox{sign}[a]+\mbox{sign}[b]\right)\left(|a+b| + |a-b|\right)

\mbox{mineno}\left(a, b \right) =\frac{1}{2} \mbox{sign}(a+b)\left( |a + b| - |a-b|\right)

\mbox{median}\left(a, b, c \right) = a + \mbox{minmod} \left( b-a,c-a\right)

\mbox{xmedian}\left(a, b, c \right) = a + \mbox{mineno}\left(b-a,c-a \right)
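
For convenience, here is a direct transcription of these definitions into Python (my own code; min and max are renamed to avoid shadowing Python built-ins):

```python
import numpy as np

def amin(a, b):       return 0.5*(a + b) - 0.5*abs(a - b)
def amax(a, b):       return 0.5*(a + b) + 0.5*abs(a - b)
def minmod(a, b):     return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) - abs(a - b))
def maxmod(a, b):     return 0.25*(np.sign(a) + np.sign(b))*(abs(a + b) + abs(a - b))
def mineno(a, b):     return 0.5*np.sign(a + b)*(abs(a + b) - abs(a - b))
def median(a, b, c):  return a + minmod(b - a, c - a)
def xmedian(a, b, c): return a + mineno(b - a, c - a)

# quick checks of the selection behavior
assert median(0.0, 3.0, -1.0) == 0.0      # middle value of the triad
assert xmedian(0.0, 3.0, -1.0) == -1.0    # argument closest to the comparison value
```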

 

 

 

Nonlinear methods, a key to modern modeling and simulation

03 Friday Jun 2016

Posted by Bill Rider in Uncategorized

≈ 3 Comments

Clarke’s Third Law: Any sufficiently advanced technology is indistinguishable from magic.

– Arthur C. Clarke

One of the most important things about modern computational methods is their nonlinear approach to solving problems. These methods are easily far more important to the utility of modeling and simulation in the modern world than high performance computing. The sad thing is that little or no effort is going into extending and improving these approaches despite the evidence of their primacy. Our current investments in hardware are unlikely to yield much improvement whereas these methods were utterly revolutionary in their impact. The lack of perspective regarding this reality is leading to vast investment in computing technology that will provide minimal returns.

In this case the nonlinearity is the use of solution adaptive methods that apply the “best” method adjusted to the local structure of the solution itself. In many cases the methods are well grounded in theory, and provided the essential push of improvement in results that have powered computational physics into a powerful technology. Without these methods we would not have the capacity to reliably simulate many scientifically interesting and important problems, or utilize simulation in the conduct of engineering. The epitome of modeling and simulation success is computational fluid dynamics (CFD) where these nonlinear methods have provided the robust stability and stunning results capturing imaginations. In CFD these methods were utterly game changing and the success of the entire field is predicated on their power. The key to this power is poorly understood, and too often credited to computing hardware instead of the real source of progress: models, methods and algorithms with the use of nonlinear discretizations being primal.

And a step backward, after making a wrong turn, is a step in the right direction.

― Kurt Vonnegut

Unfortunately, we live in an era where the power of these methods and algorithms is under-utilized. Instead we see an almost complete emphasis on computing hardware as the route toward progress. This emphasis is naïve, and ultimately limits the capacity of computing, modeling and simulation to become a boon to society. Our current efforts are utterly shallow intellectually and almost certainly bound to fail. This failure wouldn’t be the good kind of failure since the upcoming crash is absolutely predictable. Computing hardware is actually the least important and most distant aspect of computing technology necessary for the success of simulation. High performing computers are absolutely necessary, but woefully insufficient for success. A deeper and more appreciative understanding of nonlinear methods for solving our models would go a long way to harnessing the engine of progress.

When one thinks about these things, what almost immediately comes to mind with nonlinear methods is the term “limiters”. Often limiters are viewed as being a rather heavy-handed and crude approach to things. A limiter is a term inherited from the physics community where physical effects in modeling and simulation need to be limited to stay within some prescribed physical bounds. For example, keeping transport within the limit of the speed of light in radiation transport. Another example would be a limit keeping positive-definite quantities, such as density or pressure, positive. In nonlinear methods the term limiter often can be interpreted as limiting the amount of high-order method used in solving an equation. It has been found that without limits high-order methods produce anomalous effects (e.g., oscillations, solutions outside bounds, non-positivity). The limiter becomes a mixing parameter between the high-order and low-order methods for solving a transport equation. Usually the proviso is that the low-order method is guaranteed to produce a physically admissible solution.
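
In its most distilled form the idea is nothing more than a convex blend of a low-order and a high-order flux, with the limiter supplying the mixing parameter; a minimal sketch (my own illustrative code, with theta standing in for whatever limiter is chosen):

```python
def blended_flux(flux_low, flux_high, theta):
    """Limiter as a mixing parameter: theta = 0 falls back to the safe, admissible
    low-order flux, theta = 1 keeps the full high-order flux, and values in
    between blend the two."""
    assert 0.0 <= theta <= 1.0
    return flux_low + theta*(flux_high - flux_low)
```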

All theories are legitimate, no matter. What matters is what you do with them.

― Jorge Luis Borges

To keep from falling into the old traps of how to talk about things, let’s take a deeper look at how these things work, and why. Let’s also explore the alternatives to limiters and the ways everything needs to work together to be effective. There is more than one way to make it all work, but the recipe ends up being quite the same at the high end.

Ideally, we would like to have highly accurate, very stable, and completely reliable computations. The fundamental nature of our models of the universe conspires to make this far from a trivial task. Our models, being comprised of mathematical expressions, are governed by mathematical realities. These realities in the form of fundamental theorems provide a foundation upon which excellent numerical methods may be built. Last week I discussed the Lax equivalence theorem and this is a good place to start. There the combination of consistency and stability implies convergence of solutions. Consistency means having numerical accuracy of at least first order, though higher than first-order accuracy will be more efficient. Stability is the nature of just getting a solution that hangs together (doesn’t blow up as we compute it). Stability is something so essential to success that it supersedes anything else when it is absent. Next in our tour of basic foundational theorems is Godunov’s theorem, which tells us a lot about what is needed. In its original form it’s a bit of a downer: you can’t have a linear method be higher than first-order accurate and non-oscillatory (monotonicity preserving). The key concept is to turn the barrier on its head: you can have a nonlinear method be higher than first-order accurate and non-oscillatory. This is then the instigation for the topic of this post, the need for and power of nonlinear methods. I’ll posit the idea that the concept may actually go beyond hyperbolic equations, but the whole concept of nonlinear discretizations is primarily applied to hyperbolic equations.

A few more theorems help to flesh out the basic principles we bring to bear. A key result is the theorem of Lax and Wendroff that shows the value of discrete conservation. If one has conservation then you can show that you are converging to weak solutions. This must be combined with picking the right weak solution, as there are actually infinitely many weak solutions, all of them wrong save one. The task of getting the right weak solution is accomplished with sufficient dissipation, which produces entropy. Key results are due to Osher, who attached the character of (approximate) Riemann solvers to the production of sufficient dissipation to ensure physically relevant solutions. As we will describe, there are other means of introducing dissipation aside from Riemann solvers, but these lack some degree of theoretical support unless we can tie them directly to the Riemann problem. Of course most of us want physically relevant solutions although lots of mathematicians act like this is not a primal concern! There is a vast phalanx of other theorems of practical interest, but I will end this survey with a last one by Osher and Majda with lots of practical import. Simply stated, this theorem limits the numerical accuracy we can achieve in regions affected by a discontinuous solution to first-order accuracy. The impacted region is bounded by the characteristics emanating from the discontinuity. This puts a damper on the zeal for formally high-order accurate methods, which needs to be considered in the context of this theorem.

Whenever a theory appears to you as the only possible one, take this as a sign that you have neither understood the theory nor the problem which it was intended to solve.

― Karl Popper

I will note that we are focused on hyperbolic equations because they are so much harder to solve. It remains an open question how much of this applies to parabolic or elliptic equations. I suspect it does, albeit to a lesser degree, because those equations are so much more forgiving. Hyperbolic equations are deeply unforgiving and make you pay for all sins many times over. For this reason hyperbolic equations have been a pacing aspect of computational modeling, where nonlinear methods have allowed progress and, to some degree, tamed the demons.

Let’s return to the idea of limiters and push back against the common view of limiters as intrinsically connected to dissipation. The connection is real, but less direct than commonly acknowledged. A limiter is really an adaptive means of stencil selection, separate from dissipation. The two may be combined, but usually without good comprehension of the consequences. Another way to view a limiter is as a way of selecting the appropriate bias in the stencil used to difference an equation, based upon the application of a principle. The principle most often used for limiting is some sort of boundedness in the representation, which may equivalently be associated with selecting a smoother (nicer) neighborhood on which to execute the discretization. The way an equation is differenced certainly affects the nature of and need for dissipation in the solution, but the effect is indirect. Put differently, the amount of dissipation needed with a limiter varies both with the limiter itself and with the dissipation mechanism, but the two are independent. This gets into the whole difference between flux limiters and geometric limiters, a topic worth some digestion.
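
To make the stencil-selection view concrete, here is a small sketch in the classic flux-limiter notation (my own choice of data and normalization, for advection with positive speed): the face value is written as u_{i+1/2} = u_i + 0.5*phi(r)*(u_{i+1} - u_i), with r the ratio of the upwind difference to the local one. Setting phi = 0 recovers the first-order upwind stencil, phi = 1 the centered stencil, and phi = r the fully upwind-biased second-order stencil; the limiter adaptively blends these stencils and carries no explicit dissipation of its own.

```python
import numpy as np

def phi_minmod(r):
    return np.maximum(0.0, np.minimum(1.0, r))

def phi_vanleer(r):
    return (r + np.abs(r)) / (1.0 + np.abs(r))

def phi_superbee(r):
    return np.maximum.reduce([np.zeros_like(r),
                              np.minimum(2.0 * r, np.ones_like(r)),
                              np.minimum(r, 2.0 * np.ones_like(r))])

def face_values(u, phi):
    """u_{i+1/2} = u_i + 0.5 * phi(r) * (u_{i+1} - u_i) for advection with a > 0."""
    d_up = u - np.roll(u, 1)                          # upwind difference  (i-1 -> i)
    d_dn = np.roll(u, -1) - u                         # downwind difference (i -> i+1)
    r = d_up / np.where(d_dn == 0.0, 1.0e-30, d_dn)   # guard the division
    return u + 0.5 * phi(r) * d_dn

u = np.array([0.0, 0.0, 0.2, 1.0, 1.0, 0.6, 0.1, 0.0])
for name, phi in [("upwind (phi=0)", lambda r: np.zeros_like(r)),
                  ("minmod", phi_minmod),
                  ("van Leer", phi_vanleer),
                  ("superbee", phi_superbee)]:
    print(f"{name:15s}", np.round(face_values(u, phi), 3))
```

For the same data each limiter picks a different blend of neighboring stencils; dissipation only enters later, through whatever upwinding or Riemann solution acts on these face values.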

In examining the topic of limiters and nonlinear stabilization we find an immediate and significant line of demarcation between two philosophies. The first limiters came from limiting fluxes in the method of flux-corrected transport. An alternative approach comes from extending Godunov’s method to higher-order accuracy and involves limiting the geometric reconstruction of the variables. In flux limiting, the accuracy and dissipation (i.e., stabilization) are mixed together, while the geometric approach separates these effects into independent steps. The geometric approach puts bounds on the variables being solved, and then relies on a variable-agnostic mechanism, the Riemann solver, for stabilization. Both approaches are successful in solving complex systems of equations and have their rabid adherents (I favor the reconstruct-Riemann approach). The flux form can be quite effective and produces better extensions to multiple dimensions, but it can also involve heavy-handed dissipation mechanisms.
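
For concreteness, here is a compact sketch of the flux-limiting philosophy in its original Boris-Book form, transcribed by me for linear advection with positive speed (the grid, CFL number, and pulse are illustrative). The low-order donor-cell flux does the stabilizing, and the limiter restores as much of the high-order (Lax-Wendroff) flux as it can without creating new extrema, so accuracy and dissipation are entangled in a single corrected flux; in the geometric alternative the same monotonicity idea is applied to the reconstruction before any flux is computed.

```python
import numpy as np

def fct_step(u, nu):
    """One flux-corrected transport step for u_t + a u_x = 0, a > 0, periodic."""
    # 1. Low-order (donor-cell) flux and the transported-diffused solution.
    F_low = nu * u                                   # flux at face i+1/2, as a u-increment
    u_td = u - (F_low - np.roll(F_low, 1))
    # 2. Antidiffusive flux: high-order (Lax-Wendroff) flux minus the low-order flux.
    A = 0.5 * nu * (1.0 - nu) * (np.roll(u, -1) - u)
    # 3. Boris-Book limiter: the corrected flux may not create new extrema in u_td.
    s = np.sign(A)
    d_right = np.roll(u_td, -2) - np.roll(u_td, -1)  # u_td_{i+2} - u_td_{i+1}
    d_left = u_td - np.roll(u_td, 1)                 # u_td_{i}   - u_td_{i-1}
    A_c = s * np.maximum(0.0, np.minimum.reduce([s * d_right, np.abs(A), s * d_left]))
    # 4. Apply the limited antidiffusion.
    return u_td - (A_c - np.roll(A_c, 1))

x = np.linspace(0.0, 1.0, 200, endpoint=False)
u = np.where((x > 0.3) & (x < 0.6), 1.0, 0.0)
for _ in range(100):
    u = fct_step(u, nu=0.5)
print("min, max after 100 steps:", u.min(), u.max())  # stays within [0, 1]
```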

If one follows the path using geometric reconstruction, several aspects of nonlinear stabilization may be examined more effectively. Many of the key issues associated with nonlinear stabilization are crystal clear in this context. The clearest example is monotonicity of the reconstruction. It is easy to see whether the reconstructed solution is bounded (or not) by the local bounds on the variable. It is generally appreciated that enforcing monotonicity renders solutions relatively low order in accuracy. This is most acute at extrema in the solution, where the accuracy intrinsically degrades to first order. What the geometric view enables even more is moving beyond monotonicity to stabilization that provides better accuracy. The derivation of methods that do not degrade accuracy at extrema, yet extend the concept of boundedness appropriately, is much clearer with the geometric approach. Solving this problem effectively is a key to producing methods with greater numerical accuracy and better computational efficiency. While this may be achieved via flux limiters with some success, there are serious issues with conflating geometric fidelity with dissipative stabilization mechanisms.
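
A tiny sketch of both points in the geometric setting, with data and grid of my own choosing: the minmod-limited reconstruction keeps every face value within the range of the neighboring cell averages, and at a smooth extremum the one-sided differences change sign, the limited slope collapses to zero, and the method becomes locally first order even though the data are perfectly smooth.

```python
import numpy as np

def minmod(a, b):
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

dx = 0.05
x = 0.03 + dx * np.arange(20)                     # cell centers
u = np.sin(np.pi * x)                             # smooth data with one interior maximum
slope = minmod(u - np.roll(u, 1), np.roll(u, -1) - u)

i = int(np.argmax(u))                             # cell containing the smooth maximum
print("limited slope in the extremum cell:", slope[i])                       # 0.0: clipped
print("exact slope * dx in that cell     :", np.pi * np.cos(np.pi * x[i]) * dx)

# Boundedness: reconstructed face values u_i -/+ slope_i / 2 stay inside the
# interval spanned by the neighboring cell averages.
lo = np.minimum.reduce([np.roll(u, 1), u, np.roll(u, -1)])
hi = np.maximum.reduce([np.roll(u, 1), u, np.roll(u, -1)])
left, right = u - 0.5 * slope, u + 0.5 * slope
print("all faces within local bounds     :",
      bool(np.all((left >= lo) & (left <= hi) & (right >= lo) & (right <= hi))))
```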

The need for dissipation in schemes is often viewed with significant despair by parts of the physics community. Much of the conventional wisdom dictates that dissipation-free solutions are to be preferred. This is rather unfortunate, because dissipation is the encoding of powerful physical principles into methods rather than some sort of crude numerical artifice. Most complex nonlinear systems are governed by a specific and bounded dissipative character. If a solution has too little dissipation, it is not physical. As such, dissipation is necessary and its removal can be catastrophic.

The first stabilization of numerical methods came with the original artificial viscosity (developed by Robert Richtmyer to stabilize and make useful John von Neumann’s numerical method for shock waves). The name “artificial viscosity” is vastly unfortunate because the dissipation is utterly physical. Without it the basic numerical method was utterly and completely useless, leading to a catastrophic instability (which almost certainly helped instigate the investigation of numerical stability, along with the instabilities found in integrating parabolic equations). Physically, most interesting nonlinear systems produce dissipation even in the limit where the explicit dissipation can be regarded as vanishingly small. This is true for shock waves and turbulence, where the dissipation in the inviscid limit has remarkably similar structural forms. Given this basic need, some sort of stabilization is an absolute necessity for producing meaningful results, both in a purely numerical respect and through the implicit connection to the physical world. I’ve written recently on the use of hyperviscosity as yet another mechanism for producing dissipation. Here the artificial viscosity is the archetype of hyperviscosity and its simplest form. As I’ve mentioned before, the original turbulent subgrid model was also based directly upon the artificial viscosity devised by Richtmyer (often misattributed to von Neumann, although their collaboration clearly was important).
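
For reference, the quadratic artificial viscosity of the Von Neumann-Richtmyer paper cited below adds a pressure-like term only where the flow is compressing. The sketch here is a bare transcription of that idea; the coefficient value, the array layout, and the example data are my own illustrative choices.

```python
import numpy as np

def vnr_artificial_viscosity(rho, du, c_q=2.0):
    """Quadratic (Von Neumann-Richtmyer) artificial viscosity per cell.

    rho : cell density
    du  : velocity difference across the cell, u_{i+1/2} - u_{i-1/2}
    c_q : dimensionless coefficient, typically of order one to a few

    Returns q = c_q * rho * du**2 where the cell is compressing (du < 0) and
    zero elsewhere; q is then simply added to the pressure wherever the
    pressure appears in the momentum and energy updates.
    """
    return np.where(du < 0.0, c_q * rho * du**2, 0.0)

# Example: only the compressing cells pick up a nonzero q.
rho = np.array([1.0, 1.0, 1.0, 1.0])
du = np.array([0.1, -0.3, -0.05, 0.0])
print(vnr_artificial_viscosity(rho, du))   # [0.    0.18  0.005 0.   ]
```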

The other approach to stabilization is, at first blush, utterly different, but upon deeper reflection it is quite similar and complementary. For other numerical methods, Riemann solvers can be used to provide stability. A Riemann solver uses a local exact (or approximate) physical solution to resolve the flow locally. In this way the physical propagation of information provides a more stable representation of the solution. It turns out that the Riemann solution produces an implicit dissipative effect that impacts the solution much like artificial viscosity would. The real power lies in exploiting this implicit dissipative effect to create synergy, where each approach provides a complementary view of stabilization. For example, a Riemann solver produces a complete picture of which modes in a solution must be stabilized, in a manner that might inform the construction of an improved artificial viscosity. On the other hand, artificial viscosity applies to nonlinearities often ignored by approximate Riemann solvers. Finding an equivalence between the two can produce more robust and complete Riemann solvers. As is true with most things, true innovation can be achieved through an open mind and a two-way street.
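
The implicit dissipation inside an approximate Riemann solver can be read off directly. The sketch below is my own illustration using a Roe-type flux for Burgers’ equation: the flux splits into a centered part plus a term proportional to the jump at the face, which is exactly what a centered scheme plus an artificial viscosity with coefficient 0.5*|a_roe|*dx acting on du/dx would contribute.

```python
import numpy as np

def roe_flux_burgers(uL, uR):
    """Roe-type flux for Burgers' equation, f(u) = u**2 / 2.

    Returns the flux and its dissipative part separately, so the implied
    'artificial viscosity' of the Riemann solver is visible.
    """
    a_roe = 0.5 * (uL + uR)                        # Roe-averaged wave speed for Burgers
    central = 0.25 * (uL**2 + uR**2)               # 0.5 * (f(uL) + f(uR))
    dissipative = -0.5 * np.abs(a_roe) * (uR - uL)
    return central + dissipative, dissipative

uL, uR = 1.2, 0.4                                  # a compressive, shock-like jump
dx = 0.01
flux, diss = roe_flux_burgers(uL, uR)
print("Roe flux                    :", flux)
print("centered part               :", flux - diss)
print("dissipative part            :", diss)
print("equivalent viscosity coeff. :", 0.5 * abs(0.5 * (uL + uR)) * dx)
```

Reading the dissipation off this way is the two-way street in action: the Riemann solver says how much viscosity each wave needs, while a viscosity-based view can patch the nonlinearities a linearized solver misses.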

Perhaps I will attend to a deeper discussion of how high-order accuracy connects to these ideas in next week’s post. I’ll close with a bit of speculation. In my view, one of the keys to the future is figuring out how to efficiently harness high-order methods for more accurate solutions with sufficient robustness. New computing platforms dearly need the computational intensity offered by high-order methods to keep themselves busy in a useful manner, because of exorbitant memory access costs. Because practical problems are ugly, with all sorts of horrible nonlinearities and effectively singular features, the actual attainment of high-order convergence of solutions is unachievable. Making high-order methods both robust enough and efficient enough under these conditions has thus far eluded our existing practice.

Extending the ideas of nonlinear methods to other physics is another needed direction. For example, do any of these ideas apply to parabolic equations with high-order methods? There are some key ideas about physically relevant solutions, but parabolic equations are generally so forgiving that they receive little focus. The question of whether something like Godunov’s theorem applies to the high-order solution of parabolic equations is worth asking (I think a modified version of it might well). I will note that some non-oscillatory schemes do apply to extremely nonlinear parabolic equations exhibiting wave-like solutions.

The last bit of speculation concerns the application of these ideas to common methods for solid mechanics. Generally it has been found that artificial viscosity is needed for stably integrating solid mechanics models. Unfortunately, this uses the crudest and least justified form of artificial viscosity. Solid mechanics could apply the principles of nonlinear methods to very great effect. Solid mechanics is basically a system of hyperbolic equations abiding by the full set of foundational conditions laid out earlier in this post. As such, everything should follow as a matter of course. Today this is not the case, and my belief is that the codes suffer greatly from the lack of robustness and of a deeper mathematical foundation for the modeling and simulation capability.

The success of modeling and simulation as a technological tool and engine of progress would be well served by a focus on nonlinear discretizations, a focus almost entirely lacking from our current emphasis on hardware.

The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.

― George Bernard Shaw

References

Von Neumann, John, and Robert D. Richtmyer. “A method for the numerical calculation of hydrodynamic shocks.” Journal of Applied Physics 21.3 (1950): 232-237.

Lax, Peter D., and Robert D. Richtmyer. “Survey of the stability of linear finite difference equations.” Communications on Pure and Applied Mathematics 9.2 (1956): 267-293.

Godunov, Sergei Konstantinovich. “A difference method for numerical calculation of discontinuous solutions of the equations of hydrodynamics.” Matematicheskii Sbornik 89.3 (1959): 271-306.

Lax, Peter, and Burton Wendroff. “Systems of conservation laws.” Communications on Pure and Applied Mathematics 13.2 (1960): 217-237.

Osher, Stanley. “Riemann solvers, the entropy condition, and difference approximations.” SIAM Journal on Numerical Analysis 21.2 (1984): 217-235.

Majda, Andrew, and Stanley Osher. “Propagation of error into regions of smoothness for accurate difference approximations to hyperbolic equations.” Communications on Pure and Applied Mathematics 30.6 (1977): 671-705.

Harten, Amiram, James M. Hyman, Peter D. Lax, and Barbara Keyfitz. “On finite-difference approximations and entropy conditions for shocks.” Communications on Pure and Applied Mathematics 29.3 (1976): 297-322.

Boris, Jay P., and David L. Book. “Flux-corrected transport. I. SHASTA, a fluid transport algorithm that works.” Journal of Computational Physics 11.1 (1973): 38-69.

Zalesak, Steven T. “Fully multidimensional flux-corrected transport algorithms for fluids.” Journal of Computational Physics 31.3 (1979): 335-362.

Van Leer, Bram. “Towards the ultimate conservative difference scheme. V. A second-order sequel to Godunov’s method.” Journal of Computational Physics 32.1 (1979): 101-136.

Van Leer, Bram. “Towards the ultimate conservative difference scheme. IV. A new approach to numerical convection.” Journal of Computational Physics 23.3 (1977): 276-299.

Liu, Yuanyuan, Chi-Wang Shu, and Mengping Zhang. “High order finite difference WENO schemes for nonlinear degenerate parabolic equations.” SIAM Journal on Scientific Computing 33.2 (2011): 939-965.