# Floating away

Every computer science student is told that floating-point arithmetic is fraught. There be dragons. Stack Overflow and other Q&A sites have a mountain of “is floating-point math broken?” questions. But I was today years old, approximately 51.07 (floating-point) years, before I ran into a situation where I actually met the dragons.

As a general rule in mathematics (for sensible algebraic structures), `a + b + c` is the same as `c + a + b`. Not so in floating-point math.

At work-work we have a codebase that goes back 30 years. Parts of it are C++23 and cutting edge, and parts of it are … not so much. Sometimes my job is figuring out what an undocumented function does, so we can add documentation and unit-tests and so nudge it towards modernity.

Let me take a moment to recommend Clare Macrae as a source of good talks on testing – tips, techniques, and the process of acceptance testing. She’s been an inspiration for the things I do these days.

My general approach to this kind of problem is “read the code,
try to understand it, then write a **re**implementation of it
in modern terms.” The implementation and reimplementation
live side by side. The new code has documentation
and, of course, unit-tests that express what it does.

But the litmus test of the new code is not what it does,
but *does-it-do-what-the-old-code-does*. If it isn’t
a faithful, and possibly bug-for-bug-compatible, reimplementation,
then it’s not all that useful. The equivalence testing
is done with a bunch more unit-tests and a lot of random data.

For a reimplementation of a function that sums a bunch of `float`s
and does an awful lot of memory-thrashing, I figured out it
was doing a sliding window over a buffer of floats.
I wrote a little class that implements the sliding-window
part. It did the obvious thing,
and my silly tests like “compute the average of integers 1..100”
worked.

Naturally, it failed the litmus tests. Given a buffer filled with random
data, it would always fail to compute the same values as the existing code,
although the differences were in the 6th decimal digit.
Much `fmt::print()` later, I boiled it down to differences in
floating-point summation order. Here’s a tiny demo program that
shows the problem as well:

```cpp
#include <fmt/core.h>

int main() {
    float inputs[4] = { 10.17f, -11.01f, 3.114677f, -3.109f };

    // Sum front-to-back …
    float sum0 = 0.0f;
    for (int i = 0; i < 4; ++i) sum0 += inputs[i];

    // … and back-to-front.
    float sum1 = 0.0f;
    for (int i = 3; 0 <= i; --i) sum1 += inputs[i];

    fmt::print("sum0={}\nsum1={}\n", sum0, sum1);
    return 0;
}
```

On my amd64 machine with clang 15, summing these four numbers one way yields -0.83432317 and in another way, -0.8343229. And with optimization enabled, other values.

So I had to dig into my reimplementation and change the “obvious” summation order to match the order in which values were added together in the original implementation. Floating-point rounding errors pile up, and in my mind the image of Dr. Ben Polman, who taught me numerical computing in 1997 or so, appeared with a disappointed tut-tut frown on his face.

Floating-point. It is fraught. There are dragons. Personally, I think I prefer Konqi.