tl;dr

Verification and validation (V&V) is receding from focus. This has been happening in traditional computational science for the past 15-20 years. Now, attention is focused on AI. The problem with V&V is its attention to quality. Quality is something we don’t care about anymore. More importantly, V&V is finding failure, and failure is the engine of progress. We’ve learned that success is all marketing and messaging. Success can simply be declared. Why do the work? AI has none of the quality advantages of traditional computational science. AI hallucinates regularly. These hallucinations are actually lies or bullshitting. AI confidently provides answers when it knows nothing. V&V could detect this, but that wouldn’t look like success, so no V&V.

“We can judge our progress by the courage of our questions and the depth of our answers, our willingness to embrace what is true rather than what feels good.” ― Carl Sagan

V&V for AI?

This was not my first title. My first title was actually “Where the fuck is the V&V?” I decided this was probably too jarring even for me. The reduced readership would be bad, but that title captures my thoughts about what’s going on. V&V is a truly important part of the quality of anything computational. It has been disappearing, first slowly, then rapidly. The consequences of this are profound and quite poor for the quality of anything computational and for progress in science. It is bad enough for “classical” computational science; for AI, it is malpractice. That same lack of quality and progress is now being embedded in work on AI.

I’ve written many times on the concept that V&V is really just applying the scientific method to computations. It follows that a lack of V&V is a lack of science. This is what is going on. Part of science is progress and the advancement of knowledge. Progress depends on failure. Failure is how we learn. V&V is about looking for failures and gaps. It is up to the programs to respond to these failures by improving and closing these gaps. For science and progress to proceed, this is not optional. It is the essence of what we are investing in. Somehow, we have decided that all the failure and learning is optional. Computational science still needs progress. It seems we’ve missed that. AI is even more immature. Somehow, we have decided that success can be had without the hard part. This is not objectively supportable.

Our leaders are truly fucked in the head.

“It is hard to fail, but it is worse never to have tried to succeed.” ― Theodore Roosevelt

They don’t want actual success; it has become too easy to just declare it. Somehow, we’ve all bought into this fantasy. We have become comfortable rewarding outright incompetence. I have seen it in person. That is where the money is, and money is all that matters. This is true for business and science. Reality will not look kindly on our choices. AI in particular will be harmed by the approach we are taking. For it to evolve and grow successfully, the critical feedback that V&V produces is essential.

The core of the problem is the incentives the leaders respond to. None of the incentives connect to quality and progress. Our leaders have learned that quality and progress are expensive and uneven. We can have success on the cheap if we simply declare it. Any practice of legitimate V&V is a threat to the declaration of success. The warning signs for AI are clear and obvious. The leadership benefits from avoiding the work of V&V (and saving $$$ and complexity). The simple route is defining victory without quality assurance. Who’s going to check? No one! That shit is expensive. We already have the most awesome technology, no progress needed. This is a recipe for mediocrity and decline.

In AI, we see a technology that needs checking and should be treated with significant doubt. V&V is a practice that reveals this doubt. In many cases, V&V offers assurance of quality, too. AI needs this, especially if used for consequential purposes. In science and engineering, we can map a path to a more reliable AI. To get to reliability, we need to work out the faults and remove them. Instead, we simply see a technology that is being implemented and executed as if it were perfect (and almost magical). The reality is that it is far less perfect than what it claims to replace.

This is nuts. This is shortsighted. This is completely and utterly irresponsible. Judging by the way the leaders of our society act, we should expect irresponsibility. Irresponsibility is exactly what we are getting. We as a society will suffer over the long run from today’s foolish decisions.

“Have no fear of perfection – you’ll never reach it.” ― Salvador Dali

V&V is Regressing in Plain Sight

I’ve seen two successive major programs at the Department of Energy, both neglecting V&V. One was the Exascale program. Now we have the Genesis program. They would claim V&V was “baked in.” This claim is bullshit. In the current incentive system, V&V produces friction. The response to friction is rejection, and V&V gets jettisoned. It is only supported if the results confirm the success narrative. As soon as V&V needs to confirm a narrative, it ceases to be V&V. It becomes a bullshit factory. In the precursor program, ASC, this is what V&V has evolved into. Over the past decade, V&V has become defanged and toothless. All assessments now need to be positive. Those that are not are buried. One such burial led me to retire.

“Life is full of screwups. You’re supposed to fail sometimes. It’s a required part of the human existence.” ― Sarah Dessen

The role of V&V is to determine the correctness of the computational work that’s done. Part of determining the correctness is that, when there are issues or problems (and there always are), you have a plan to fix those things. Fixing shit is progress. Fixing shit is what scientists and engineers are good at. What the leadership has discovered is that the cheapest and easiest way to succeed is to simply declare it. Part of this is avoiding the whole V&V step altogether. This, in our money-driven world, is the most efficient way to do things. It’s really a stupid way to fuck everything up.

“We are all failures- at least the best of us are.” ― J.M. Barrie

The Cost of No V&V

The real process of science, which V&V is at its core, is messy, expensive, and time-consuming. These are all things the leaders would rather do without. It’s become much more efficient and cost-effective to simply declare success and be excellent by definition. I saw this in spades during the Exascale program, where V&V was nodded and winked at but rarely done. With that program declared a “success,” we see the same approach with Genesis, and I fear exactly the same will be done. The leaders will declare and bullshit their way to glorious success. AI is too immature and dangerous for this level of abject irresponsibility. The deeper cost will be a stagnation of this important technology. Basically, we are giving up on the future.

As I’ve noted, the role of V&V is more important with AI. It was still important for the sort of computational science done with exascale computing. That program failed to make progress aside from fast computers. With AI, there are no decided-upon models or methods to solve things that have rigorous connections back to physical and mathematical theories. There, it is the data that drives things, along with algorithms that manipulate that data. These are algorithms that converge, but only in some nebulous, unspecified sense. They come without any of the guarantees that we’re used to in computational science. This makes everything harder with AI. The freedom it provides is an illusion. The result is something that can often give answers that are called hallucinations. This is the most generous way to put it. In many cases, they could be called outright lies or bullshitting. Our current AI practice is to give answers with complete confidence and without guardrails or warnings.

Now we are trying to do science without the V&V that would unveil the bullshit. For AI to do science successfully, it needs to have guardrails and come with a “bullshit” detector. V&V is that bullshit detector. The program that has been unveiled is not science. Maybe if I’m generous, I could say it’s bad or weak science. Somehow, we have gotten comfortable with programs that are called science while being unscientific. This needs to stop. By avoiding failure in small ways as part of this program, we are failing massively.

“Success is not final, failure is not fatal: it is the courage to continue that counts.” ― Winston S. Churchill

Disappearing V&V

This has become an almost ever-present thought in my mind. I watched as the V&V programs became funded and important back in the late 90s, and then went through a zenith and decline. Now I’m watching the V&V programs dissipate, fall into uselessness, and become utterly feckless. In recent years, it has turned into a rubber stamp. Whenever V&V found a problem, management would just bury it and ignore it. Only positive news was accepted. The essence of toxic positivity. We are allowing management to act like children who live in a fantasy world.

V&V for modeling is essential. This is where we are solving well-known, well-understood, and well-accepted models. These models are solved with well-known, well-accepted discretizations using real geometries. Even there, V&V is still essential; it still finds problems. V&V is needed to manifest the full scientific method in these domains. Now we see AI coming to the fore. With AI, we see even less V&V, even though it has none of the advantages that classical computational science has, and even though AI has a known propensity to hallucinate (lie, BS, ...). V&V is more needed with AI, and yet there is less.
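To make the contrast concrete, here is a minimal sketch of the simplest kind of verification test in classical computational science: checking that a discretization converges at its theoretical rate as the mesh is refined. The test problem (the derivative of sin at x = 1), the step sizes, and the function names are illustrative assumptions, not taken from any program discussed here. A second-order central difference should show an observed order close to 2; if it does not, the test has found a failure worth fixing.

```python
import math


def central_difference(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def observed_orders(f, exact_derivative, x, step_sizes):
    """Return the errors and the observed convergence order between
    each pair of successive step sizes.

    The observed order is p = log(e_coarse / e_fine) / log(h_coarse / h_fine).
    """
    errors = [
        abs(central_difference(f, x, h) - exact_derivative)
        for h in step_sizes
    ]
    orders = []
    for i in range(len(step_sizes) - 1):
        error_ratio = errors[i] / errors[i + 1]
        step_ratio = step_sizes[i] / step_sizes[i + 1]
        orders.append(math.log(error_ratio) / math.log(step_ratio))
    return errors, orders


# Hypothetical test problem: d/dx sin(x) at x = 1, exact answer cos(1).
step_sizes = [0.1, 0.05, 0.025, 0.0125]
errors, orders = observed_orders(math.sin, math.cos(1.0), 1.0, step_sizes)

for h, e in zip(step_sizes, errors):
    print(f"h = {h:<7} error = {e:.3e}")
print("observed orders:", [round(p, 3) for p in orders])
```

The point of the sketch is that the pass/fail criterion comes from theory and is known before the code is run. Nothing comparable is generally available for a trained AI model, which is why the checking there is harder and matters more.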

“You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.” ― Buckminster Fuller

The real tension is the notion that we have already created the technology breakthrough. Now it is simply a matter of showing what it can do (and making a shitload of money). The current programs have declared victory and eschewed the need for progress. We just need to demo the awesomeness. This is a self-own. This is why there is no V&V, and it is a symptom of extreme weakness. They cannot stand to have any hard look at the current state of things. That is what V&V would do. It would do a great deal more. V&V would power progress by pointing to problems and weaknesses. These problems and weaknesses are where research and advances are needed. This is the very heart of the flywheel of progress.

Without the introspection that V&V provides, we have stagnation. My belief is that the technology (AI and computational science) is quite far from where it should be and needs serious attention. Certainly, we need to demonstrate the current state of AI. The key to progress is a stern examination of the quality of this work with a critical eye. V&V provides this and a map of where progress is needed. Without V&V, there are false claims of victory and stalled progress. Progress is grounded in failure because failure is learning. The sort of victory being planned now is hollow and feeble-minded. It is not how science is done, and it is a recipe for mediocrity. Science in the USA is already quite far along the road to mediocrity. Our current approach widens that road into a superhighway.

“Those who make peaceful revolution impossible will make violent revolution inevitable.” ― John F. Kennedy