# Square roots and fixed points

math programming 02 June 2014

There’s an old HN comment in which a developer whines bitterly and eloquently about his experience being stumped by a common interview question: “write an algorithm for computing the square root of a number.”

I tried a couple more times to answer this question that is completely unrelated to the job I’m interviewing for. Finally, I get so fed up with this moron that my internal ticker clicks over and I realize, “Even if I get this job, I’m going to be dealing with these kinds of nazel-gazing engineers every single day. Not an environment I want to be in.”

The most obvious solution is binary search, but that’s not very interesting. How else might we solve the problem of calculating $\sqrt{a}$? Let’s first introduce the concept of a fixed point. A fixed point is a value which is unchanged by the function - that is, $f(p) = p$. For example, a fixed point of the sine function is 0 because $\sin(0)=0$. A fixed point of the cosine function is located around 0.739085133 because $\cos(0.739085133)\approx0.739085133$. In fact, if we plot the cosine function on top of $f(p) = p$ then we can see that they intersect at exactly that point:

One interesting fact which might be of use is that $\sqrt{a}$ is a fixed point of the function $f(x)=\frac{a}{x}$:

We can therefore graph $f(x)=\frac{a}{x}$ to find its fixed point $\sqrt{a}$. Here’s a plot for various values of a:

By subtracting x from the function it becomes a root finding problem:

This we can solve with Newton’s method:

Of course, we could have just used Newton’s method from the beginning. If we’re trying to calculate $\sqrt{a}$ then the value we’re looking for is a zero of the function $f(x)=x^{2}-a$. In this case we have:

These are different functions but they have the same fixed point. This is a plot of $x_{n+1} - x_n$ for the two different functions (with a=7) showing that they have the same zeros:

Newton’s method corresponds closely to a concept called fixed point iteration. In fixed point iteration, we repeatedly evaluate a function and feed its output back into its input. Eventually we hope it will converge on a fixed point. We mentioned the cosine function earlier; this is one such function which always converges on a particular fixed point. Enter any value into a scientific calculator and repeatedly press the cosine button:

-1.0000, 0.5403, 0.8575, 0.6542, 0.7934, 0.7013, 0.7639, 0.7221, 0.7504, 0.7314, 0.7442, 0.7356

It will eventually converge to a value near 0.7391:

Another function which converges to a fixed point is the expression given by Newton’s method applied to the function $f(x)=x^2-a$, which we derived before:

If we rearrange the right side we obtain:

This is called the Babylonian method, or Heron’s method, and it was actually discovered long before Newton’s method. The idea was that if $x_n$ is an overestimate to $\sqrt{a}$ then $\frac{a}{x_n}$ is an underestimate, and you can average them to get a better approximation. Repeated iteration improves this approximation until it converges on $\sqrt{x}$, which we know is a fixed point.

Remember that before we said $\sqrt{a}$ is a fixed point of the function $f(x)=\frac{a}{x_n}$. Unfortunately if you iterate that function, you will not approach $\sqrt{a}$. Fixed point iteration doesn’t always work and this is one such case. The math behind being able to tell whether an arbitrary function will converge to a fixed point under fixed point iteration is complicated.

Now it would be very simple to wrap the Babylonian method in a loop and perform a couple steps of fixed point iteration to get a decent sqrt(a). But since we’re finding a fixed point, this seems like a nice time to break out something called a fixed point combinator. The best-known fixed point combinator is the Y combinator. You’ve probably heard of it due to the eponymous startup incubator founded by Lisp greybeard Paul Graham, of the famous Paul Graham essays.

This is the lambda calculus definition of the Y combinator, due to Haskell Curry:

The reason y is called a fixed-point combinator is because of what happens when you apply it to a function and reduce. By following the lambda calculus reduction rules you can find that the Y combinator satisfies the equation $y\ f = f\ (y\ f)$. This matches the form $something = f\ (something)$ - the definition of a fixed point. So, $y\ f$ is a fixed point of $f$! Therefore all we have to do is apply the Y combinator to obtain a fixed point of $f$, right?

Well, not really. $f$ takes and returns a function so the fixed point of $f$ isn’t a number at all, it’s the function $f'$ which $f$ maps to $f'$. In other words, all the Y combinator is doing is facilitating fixed-point iteration:

It appears that this expansion will continue forever and never terminate. But we can build a termination condition into the function $f$ so that it stops expanding. Let’s see how that would work with the sqrt example. Our $f$ could look like this in JavaScript:

function step (callback) {
return function (originalValue, approxSqrt) {
// Babylonian method formula to improve our approximation of the square root
// of originalValue.
var improvedApproxSqrt = (approxSqrt + (originalValue / approxSqrt)) / 2;

// How far off the mark we are.
var discrepancy = Math.abs(originalValue -
(improvedApproxSqrt * improvedApproxSqrt));

// Termination condition
if(discrepancy < 0.00001) {
return improvedApproxSqrt;
}

return callback(originalValue, improvedApproxSqrt);
};
}


This looks a lot like recursion, except that we’ve never referred to an free variable name. This is called anonymous recursion, and it’s useful in systems (notably lambda calculus) where functions cannot refer to themselves by name.

Now we need the magic which will repeatedly invoke step and feed its output back into its input: the Y combinator. How would it look in JavaScript? Here’s the Y combinator in lambda calculus again:

This corresponds pretty directly to a JavaScript function:

function Y(f) {
return (function (x) {
return f(x(x));
})(function (x) {
return f(x(x));
});
}


Unfortunately, when we try to use this version of the Y combinator, we get a stack overflow. If you trace out the execution you’ll see that x(x) must be evaluated in order to get a final return value for Y, and this causes infinite recursion. The reason this works at all in lambda calculus is that lambda calculus is call by name so $f (x\ x)$ is evaluated by expanding the definition of $x\ x$ and passing that function to f. JavaScript, on the other hand, is call by value, so the x function is actually evaluated with x as an argument.

If we η-reduce the Y combinator then we obtain an alternate fixed-point combinator, called the Z combinator, which contains an extra layer of indirection and prevents runaway recursion:

Here’s the JavaScript code corresponding to the Z combinator:

function Z(f) {
return (function (x) {
return f(function (v) { return x(x)(v); });
})(function (x) {
return f(function (v) { return x(x)(v); });
});
}


We still have a bit of a problem: our step function takes two variables, while Z only calls it with one. So let’s modify Z:

function Z(f) {
return (function (x) {
return f(function (v1, v2) { return x(x)(v1, v2); });
})(function (x) {
return f(function (v1, v2) { return x(x)(v1, v2); });
});
}


Now we can see it working:

> var sqrt = Z(step);
> sqrt(2, 1); // sqrt of 2, with a starting estimate of 1
1.4142156862745097


It’s rather awkward to always have to provide a starting estimate, so we can add a wrapper which always guesses 1 to start:

> function sqrt (num) { return Z(step)(num, 1); }
> sqrt(2)
1.4142156862745097