13 good reasons to verify your code and calculations

26 Saturday Oct 2013

Code and calculation verification are an overlooked step for assuring the quality of codes and calculations. Perhaps people don’t have enough good reasons to do this work. It can often be labor intensive, frustrating and disappointing. That is a real awful sales pitch! I really believe in doing this sort of work, but the need to provide better reasoning is clear. I’ll get at the core of the issue right away, and then expound on ancillary benefits.

In the final analysis, not doing code verification is simply asking for trouble. Doing code verification well is another matter altogether, but doing it well is not as necessary as simply doing it. Many developers, students, and researchers simply compare the solutions to benchmarks using visual means (i.e., comparing to the benchmark solution in the infamous eyeball norm, or the “viewgraph” norm if its presented). This is better than nothing, but not by much at all. It is certainly easier and more convenient to simply not verify calculations.

Very few of us actually create codes that are free of bugs (in fact I would posit none of us). To not verify is to commit an act of extreme hubris. Nevertheless, admitting one’s intrinsic fallibility is difficult; dealing with the impact of one’s fallibility is inevitable.

So, without further philosophizing, here is my list:

Don’t assert that your code is correct; prove it’s correct. This is the scientific method; respect it, and apply it appropriately. Treat a code as you would an experiment, and apply many of the same procedures and measures to ensure quality results.
Mistakes found later in the code development process are harder and more expensive to fix. There is vast evidence of this and I recommend reading the material on the Capability Maturity Model for Software, or better yet Steve McConnell’s book, “Code Complete,” which is a masterpiece.
Once completed (and you aren’t ever really done), you will be confident in how your code will preform. You will be confident that you’ve done things correctly. You can project this confidence to others and the results you and your handiwork produce.
You will find the problems with your code instead of someone else like a code customer or user. As much as we dislike finding problems, some one else finding our problems is more painful.
Verification will allow you to firmly establish the accuracy properties of your method if you look at error and convergence rates. You might have to confront the theory behind you method and problem, and this might help you learn something. All of this is really good for you.
Doing the same thing with your calculations will allow you to understand the error associated with solving the equation approximately. Again, confront the available theory, or its lack of availability. It will provide you much needed humility.
It is embarrassing when a numerical error is influencing or hiding a model deficiency. Worse yet, it is badly conducted science and engineering. Don’t be responsible for more embarrassments.
When you are calibrating your model (admit it, you do it!), you might just be calibrating the model to the numerical error. You want to model physics, not truncation error, right?
Verification results will force you to confront really deep questions that will ultimately make you a better scientist or engineer. Science is about asking good questions, and verification is about asking good numerical questions.
You are a professional, right? Doing verification is part of due diligence, it is being the best you can be. Adopting a personal quality mentality is important in one’s development. If you are in the business of numerical solutions, verification is a key part of the quality arsenal.
You won’t really understand your code until you look really hard at the results, and verification helps you understand the details you are examining. You will looks at your work more deeply than before and that is a good thing.
Conducting real error analysis can help you make sure and prove that your mesh is adequate for the problem you are solving. Just because a mesh looks good, or looks like the thing you’re simulating isn’t evidence that it actually allows you to simulate that thing.
It is 2013, so come on! Please do things in a way that reflects a modern view of how to do computational science.

Robust Verification

18 Friday Oct 2013

Posted by Bill Rider in Uncategorized

≈ Leave a comment

The excuses for not doing verification of numerical solutions are myriad. One of the best although it is unstated, is that verification just doesn’t work all the time. The results are not “robust” even though the scientist “knows” the results are OK. A couple things happen to produce this outcome:

The results are actually not OK,
The mesh isn’t in the “asymptotic” range of convergence
The analysis of verification is susceptible to numerical problems.

I’ll comment on each of these in turn and suggest some changes to mindset and analysis approach.

First, I’d like to rant about a few more pernicious ways people avoid verification. These are especially bad because they often think they are doing it. For example, let’s say you have an adaptive mesh code (this is done without AMR, but more common with AMR because changing resolution is often so easy). You get solutions at several resolutions, and you “eyeball” the solutions. The quantity you or your customer cares about isn’t changing much as you refine the mesh. You declare victory.

What have you done? It is not verification, it is mesh sensitivity. You could actually have a convergent result, but you actually don’t have proof. The solution could be staying the same because it isn’t changing, and is in fact, mesh-insensitive. What would verification bring to the table? It would provide a convergence rate, and an error estimate. In fact, error estimates are the true core of verification. The error estimate is a much stronger statement about the influence of mesh on your solution, and you almost never see it.

Why? Quite often the act of computing the error estimate actually undermines your faith in the seemingly wonderful calculation and leaves you with questions you can’t answer. It is much easier to simply exist in comfortable ignorance and believe that your calculations are awesome. This state of ignorance is the most common way that people almost do verification of calculations, but fail at the final hurtle. Even at the highest level of achievement in computational science, the last two paragraphs describe the state of affairs. I think its rather pathetic.

OK, rant off.

Let’s get the zeroth point out of the way first. Verification is first and foremost about estimating numerical errors. This counters the oft-stated purpose associated with “order” verification where the rate of convergence for a numerical method is computed. Order verification is used exclusively with code verification where a solution is known to be sufficiently differentiable to provide proof that a method achieves the right rate of convergence as a function of the mesh parameters. It is an essential part of the verification repertoire, and it squashes a more important reason to verify, error quantification whether true or estimated.

The first point is the reason for doing verification in the first place. You want to make sure that you understand how large the impact of numerical error is on your numerical solution. If the numerical errors are large they can overwhelm the modeling you are interested in doing. If the numerical errors grow as a function of the mesh parameters, something is wrong. It could be the code, or it could be the model, or the mesh, but something is amiss and the solution isn’t trustworthy. If it isn’t checked, you don’t know.

The second point is much more subtle. So let’s get the elephant in the room identified, meshes used in practical numerical calculations are almost never asymptotic. This is true even in the case of what is generously called “direct numerical simulation (DNS)” where it is claimed that the numerical effects are small. Rarely is there an error estimate in sight. I’ve actually looked into this and the errors are much larger than scientists would have you believe, and the rates of convergence are clearly not asymptotic.

All of this is bad enough, but there is more and it is not easy to understand. Unless the mesh parameters are small the rate of convergence should systematically deviate from the theoretical rate in a non-trivial way. Depending on the size of the parameter and the nature of the equation being solved, the correct convergence rate could be smaller or larger than expected. All of this can be analyzed for ideal equations such as a linear ordinary differential equation. Depending on the details of the ODE method, and the solution one can get radically different rates of convergence.

The third point is the analysis of numerical solutions. Usually we just take our sequence of solution and apply standard regression to solve for the convergence rate and estimated converged solution. This simple approach is the heart of many unstated assumptions that we shouldn’t be making without at least thinking about them. Standard least squares relies a strong assumption about the data and its errors to begin with. It assumes that the errors from the regression are normally distributed (i.e., Gaussian). Very little about numerical error leads one to believe this is true. Perhaps in the case where the errors are dominated by dissipative dynamics a Gaussian would be plausible, but again this analysis itself only holds in the limit where the mesh is asymptotic. If one is granted the luxury of analyzing such data, the analysis methodology, frankly, matters little.

What would I suggest as an alternative?

One of the problems that plagues verification is bogus results associated with either bad data, changes in convergence behavior, or outright failure of the (nonlinear) regression. Any of these should be treated as an outlier and disregarded. Most common outlier analysis itself relies on the assumption of Gaussian statistics. Again, making use of this assumption is unwarranted. Standard statistics using the mean, and standard deviation is the same thing. Instead one should use median statistics, which can withstand the presence of up to half the data being outliers without problems. This is the definition of robust and this is what you should use.

Do not use a single regression to analyze data, but instead do many regressions using different formulations of the regression problem, and apply constraints to the solution using your knowledge. If you have the luxury of many mesh solutions run the regression over various subsets of your data. For example, you know that a certain quantity is positive, or better yet must take on values between certain limits. The same applies to convergence rates, you generally have an idea what would be reasonable from analysis; i.e., a first-order method might converge at a rate between one-half and two. Use these constraints to make your regression fits better and more guaranteed to produce results you can use. Make sure you throw out results that show that your second-order method is producing 25^th order convergence. This is simply numerical garbage and there is no excuse.

At the end of this you will have a set of numerical estimates for the error and convergence rate. Use median statistics to choose the best result and the variation from the best so that outliers along the way are disregarded naturally.

Verification should (almost) always produce a useful result and injecting ideas from robust statistics can do this.

Going beyond this point leads us to some really beautiful mathematics that is hot property now (L1 norms leading to compressed sensing, robust statistics, …). The key is not using the standard statistical toolbox without at least thinking about it and justifying its use. Generally in verification work it is not justifiable. For general use a constrained L1 regression would be the starting point, maybe it will be more generally available soon.

Why do verification?

10 Thursday Oct 2013

Posted by Bill Rider in Uncategorized

≈ 1 Comment

Verification is an important activity for numerical computation that is too seldom done. Almost anybody with a lick of common sense will acknowledge how important it is and vital to achieve. On the other hand very few actually do it.

Why?

It is where arrogant hubris intersects laziness. It is where theory goes to die. Verification produces an unequivocal measure of your work. Sometimes the result you get is unpleasant and sends you back to the drawing board. It basically checks whether you are full of shit or not. Most people who are full of shit would rather not admit it.

What is actually worse is to do verification of someone else’s work. This puts one in the position of telling someone else that they are full of shit. This never makes you popular or welcome. For this reason I’ve compared V&V work to the “dark side” of the force, and by virtue of this analogy, myself to a Sith lord. At least it’s a pretty cool bad guy.

Areas where I’ve worked a good bit are methods for shock physics. There is a standard problem almost everyone working on shock physics uses to test their method, Sod’s shock tube. Gary Sod introduced his problem is a 1978 paper in the Journal of Computational Physics to compare existing methods. By modern standards the methods available in 1978 were all bad. This was only presented as comparisons between the numerical solution, and the exact solution graphically. This is important to be sure, but so much more could be done. Sod’s problem is a Riemann problem, which can be solved and evaluated exactly (actually solved accurately using Newton’s method). The cool thing with an exact solution is that you can compute the errors in your numerical solution. Unfortunately Sod didn’t do this.

Again, why?

Instead he reported run time for the methods, as if to say, “all of these suck equally, so what matters is getting the bad answer the fastest”. A more technical response to this query is that shock captured solutions to problems with discontinuous solutions (Sod’s problem, has a rarefaction, contact discontinuity and a shock wave) are intrinsically limited to first-order accuracy. At a detail level the solution is to Sod’s shock tube is typically less than first-order accurate because the contact discontinuity’s solution convergences at less than first-order and eventually this dominates the numerical error under mesh refinement (covered in a paper by Banks, Aslam, and Rider(me)). The mindset is “well its going to be first order anyway so what’s the point of computing convergence rates?”

I’m going to tell you what the point is. The point is that the magnitude of the errors in the solutions is actually a lot different even among a whole bunch of methods of similar construction and accuracy. Not every method is created equal and there is a good bit of difference between the performance of Godunov’s original method, Van Leer’s method, Colella’s refined PLM method, WENO, PPM, etc. Ultimately what someone really cares about is the overall accuracy of the solution for amount of computational resource, as well as scheme robustness and extensibility.

So going back to the reality, people continued to use Sod’s problem and presented results with a graphical comparison usually at a single grid, and a comparison at runtime. I never saw anyone present the numerical error. People would only provide numerical errors for problems where they expected their method to achieve its full order of accuracy, i.e., smooth problems like advecting a Gaussian function or Sine wave. The only exception came from the flux-corrected transport world where they did present numerical accuracy for the propagation of square waves, but not systems of equations.

In 2004-2005 Jeff Greenough and I attacked this low hanging piece of fruit. Jeff’s Lab was using a weighted ENO (WENO) code developed at Brown University (where the inventor of WENO is a professor). Jeff worked on a code that used Colella’s PLM method (basically an improved version of Van Leer’s second-order Godunov method). I had my own implementations of both methods in a common code base. We undertook a comparison of the two methods head-to-head, with the following question in mind: “what method is the most efficient for different classes of problems?” We published the answer in the Journal of Computational Physics.

Before giving the answer to the question a little more background is needed. WENO methods are enormously popular in the research community with the fifth-order method being the archetype. I would say that the implementation of a WENO method is much more elegant and beautiful than the second-order Godunov method. It is compact and modular, and actually fun to implement. It is also expensive.

We used a set of 1-D problems: Sod’s shock tube, a stronger shock tube, a popular blast wave problem, and a shock hitting smooth density perturbations. For the first three problems, the 2^nd order method either out performed or was even with the 5^th order method at the same mesh. On the final problem WENO was better, until we asked the efficiency question. Accounting for runtime to achieve the same accuracy, the 2^nd order method outperformed WENO on every problem. If the problem was shock dominated, the difference in efficiency wasn’t even close. WENO was only competitive on the problem with lots of smooth structure. On the advection of a Gaussian pulse WENO wins at almost any level of error.

This asks a deeper question, why would a second-order method be more efficient?

This answer gets to the heart of method design. The 2^nd order method cheaply introduces some elements of high-order methods, i.e., it uses 4^th order approximations to the slope. Its nonlinear stability mechanism is cheap, and it uses a single step time integrator rather than a multistep integrator. The single step integrator provides a more generous time step size. The combination of the larger time step, single step integrator and simpler nonlinear stability mechanism equates to a factor of six in runtime. In addition to accommodate the fifth-order formal accuracy, the WENO method actually increases the dissipation used near shocks, which increases the error.

The bottom line is that verification is about asking good questions, answering these questions, and then going back to the well… it is in essence the scientific method.

There is nothing artificial about “Artificial Viscosity”

04 Friday Oct 2013

Posted by Bill Rider in Uncategorized

≈ 3 Comments

The name of the original successful shock capturing method is “artificial viscosity,” and it is terrible. John Von Neumann and Robert Richtmyer were both mathematical geniuses, perhaps, at least in this case, poor at marketing. To be fair, they struggled with the name, and artificial viscosity was better than “mock” or “fictitious” viscosity, which Richtmyer considered in his earlier Los Alamos reports (LA-671 and LA-699). Nonetheless, we are left with this less than ideal name.

I’ve often wished I could replace this name with “shock viscosity” or better “shock dissipation” because the impact of artificial viscosity is utterly real. It is physical and necessary to compute the solutions of shocks automatically without resolving the ridiculously small length and time scales associated with the physical dissipation. In a very true sense, the magnitude of viscosity used in artificial viscosity is the correct amount of dissipation for the manner in which the shock is being represented.

The same issue comes about in the models of turbulence. Again, the effects of nonlinearity enormously augment the impact of physical dissipation. The nonlinearity is associated with complex small-scale structures whose details are unimportant to the large-scale evolution of the flow. When simulating these circumstances numerically we are often only interested in the large-scale flow field and we can use an enhanced (nonlinear) numerical viscosity to stably compute the flow. This approach in both shocks and turbulence has been ridiculously successful. In turbulence this approach goes by the name implicit large eddy simulation. It has been extremely controversial, but in the last decade this approach has grown in acceptance.

The key to this whole line of thinking is that dissipation is essential for physical behavior of numerical simulations. While too much dissipation is undesirable, too little dissipation is a disaster. The most dangerous situation is when a simulation is stable (that is, runs to completion), and produces seemingly plausible results, but has less dissipation than physically called for. In this case the simulation will produce an “entropy-violating” solution. In other words, the result will be unphysical, that is, not achievable in reality. This is truly dangerous and far less desirable than the physical, but overly dissipated result (IMHO). I’ve often applied a maxim to simulations, “when in doubt, diffuse it out”. In other words, dissipation while not ideal (a play on words!) is better than too little dissipation, which allows physically unachievable solutions to persist.

Too often numerical practitioners seek to remove numerical dissipation without being mindful of the delicate balance between, the dissipation that is excessive, and the necessity of dissipation for guaranteeing physically relevant (or admissible) results. It is a poorly appreciated aspect of nonlinear solutions for physical systems that the truly physical solutions are not those that have no dissipation, but rather produce a finite amount of dissipation. This finite amount is determined by the large-scale variations in the flow, and is proportional to the third power in these variations.

Very similar scaling laws exist for ideal incompressible and compressible flows. In the incompressible case, Kolmogorov discovered the scaling law in the 1940’s in the form of his refined similarity hypothesis, or the “4/5ths” law. Remarkably, Kolmogorov did this work in the middle of the Nazi invasion of the Soviet Union, and during the darkest days of the war for the Soviets. At nearly the same time in the United States, Hans Bethe discovered a similar scaling law for shock waves. Both can be written in stunningly similar forms where the time rate of change of kinetic energy due to dissipative processes (or change in entropy), is proportional to the third power of large-scale velocity differences, and completely independent of the precise value of viscosity.

These scaling laws are responsible for the success of artificial viscosity, and large eddy simulation. In fact, the origin of large eddy simulation is artificial viscosity. The original suggestion that led to the development of the first large eddy simulation by Smagorinsky was made by Von Neumann’s collaborator, Jules Charney to remove numerical oscillations from early weather simulations in 1956. Smagorinsky implemented this approach in three dimensions in what became the first large eddy simulation, and the first global circulation model. This work was the origin of a major theme in turbulence modeling, and climate modeling. The depth of this common origin has rarely been elaborated upon, but I believe the commonality has profound implications.

At the very least, it is important to realize that these different fields have a common origin. Dissipation isn’t something to be avoided at all costs, but rather something to manage carefully. Better yet, dissipation is something to be modeled carefully, and controlled. In numerical simulations, stability is the most important thing because without stability, the simulation is worthless. The second priority is producing a physically meaningful (or realizable) simulation. In other words, we want a simulation that produces a result that matches a situation achievable in the real world. The last condition is accuracy. We want a solution that is as accurate as possible, without sacrificing the previous two conditions.

Too often, the researcher gets hung up on these conditions in the wrong order (like prizing accuracy above all else). The practitioner applying simulations in an engineering context often does not prize accuracy enough. The goal is to apply these constraints in balance and in the right order (stability then realizability then accuracy). Getting the order and balance right is the key to high quality simulations.

The Regularized Singularity

~ The Eyes of a citizen; the voice of the silent

Monthly Archives: October 2013

13 good reasons to verify your code and calculations

Robust Verification

Why do verification?

There is nothing artificial about “Artificial Viscosity”