Fork me on GitHub
Brian Gordon blog

Square roots and fixed points

math programming 02 June 2014

There’s an old comment on HN that annoys me. In it, a developer whines bitterly and eloquently about his experience being stumped by a common interview question: “write an algorithm for computing the square root of a number.”

I tried a couple more times to answer this question that is completely unrelated to the job I’m interviewing for. Finally, I get so fed up with this moron that my internal ticker clicks over and I realize, “Even if I get this job, I’m going to be dealing with these kinds of nazel-gazing engineers every single day. Not an environment I want to be in.”

“Completely unrelated?” The problem is easily solved with binary search; is it inconceivable that a frontend developer would ever use binary search? For that matter, is this interview for a contract to build a single front-end, or for a full-time developer position which may continue indefinitely? Because if it’s the latter, general problem solving questions are completely fair game. The company is going to need to make an investment of institutional knowledge in a new developer. If the developer can pivot to a new role as needed then the company may be able to avoid hiring someone new in the future and save the cost of bringing another employee up to speed. There is therefore legitimate value added by being able to tackle problems outside of your comfort zone. Petulantly hanging up on the interviewer and deciding he’s not worth your time for asking such a question is childish.

Anyway, other than the obvious solution of binary search, how might we solve the problem of calculating ? Let’s first introduce the concept of a fixed point. A fixed point is a value which is unchanged by the function - that is, . For example, a fixed point of the sine function is 0 because . A fixed point of the cosine function is located around 0.739085133 because . In fact, if we plot the cosine function on top of then we can see that they intersect at exactly that point:

Plot 1

One interesting fact which might be of use is that is a fixed point of the function :

We can therefore graph to find its fixed point . Here’s a plot for various values of a:

Plot 2

By subtracting x from the function it becomes a root finding problem:

Plot 3

This we can solve with Newton’s method:

Of course, we could have just used Newton’s method from the beginning. If we’re trying to calculate then the value we’re looking for is a zero of the function . In this case we have:

These are different functions but they have the same fixed point. This is a plot of for the two different functions (with a=7) showing that they have the same zeros:

Plot 4

Newton’s method corresponds closely to a concept called fixed point iteration. In fixed point iteration, we repeatedly evaluate a function and feed its output back into its input. Eventually we hope it will converge on a fixed point. We mentioned the cosine function earlier; this is one such function which always converges on a particular fixed point. Enter any value into a scientific calculator and repeatedly press the cosine button:

-1.0000, 0.5403, 0.8575, 0.6542, 0.7934, 0.7013, 0.7639, 0.7221, 0.7504, 0.7314, 0.7442, 0.7356

It will eventually converge to a value near 0.7391:

Plot 5

Another function which converges to a fixed point is the expression given by Newton’s method applied to the function , which we derived before:

If we rearrange the right side we obtain:

This is called the Babylonian method, or Heron’s method, and it was actually discovered long before Newton’s method. The idea was that if is an overestimate to then is an underestimate, and you can average them to get a better approximation. Repeated iteration improves this approximation until it converges on , which we know is a fixed point.

Remember that before we said is a fixed point of the function . Unfortunately if you iterate that function, you will not approach . Fixed point iteration doesn’t always work and this is one such case. The math behind being able to tell whether an arbitrary function will converge to a fixed point under fixed point iteration is complicated.

Now it would be very simple to wrap the Babylonian method in a loop and perform a couple steps of fixed point iteration to get a decent sqrt(a). But since we’re finding a fixed point, this seems like a nice time to break out something called a fixed point combinator. The best-known fixed point combinator is the Y combinator. You’ve probably heard of it due to the eponymous startup incubator founded by Lisp greybeard Paul Graham, of the famous Paul Graham essays.

YCombinator logo

This is the lambda calculus definition of the Y combinator, due to Haskell Curry:

The reason y is called a fixed-point combinator is because of what happens when you apply it to a function and reduce. By following the lambda calculus reduction rules you can find that the Y combinator satisfies the equation . This matches the form - the definition of a fixed point. So, is a fixed point of ! Therefore all we have to do is apply the Y combinator to obtain a fixed point of , right?

Well, not really. takes and returns a function so the fixed point of isn’t a number at all, it’s the function which maps to . In other words, all the Y combinator is doing is facilitating fixed-point iteration:

It appears that this expansion will continue forever and never terminate. But we can build a termination condition into the function so that it stops expanding. Let’s see how that would work with the sqrt example. Our could look like this in JavaScript:

function step (callback) {
    return function (originalValue, approxSqrt) {
        // Babylonian method formula to improve our approximation of the square root 
        // of originalValue.
        var improvedApproxSqrt = (approxSqrt + (originalValue / approxSqrt)) / 2;

        // How far off the mark we are.
        var discrepancy = Math.abs(originalValue - 
                                (improvedApproxSqrt * improvedApproxSqrt));

        // Termination condition
        if(discrepancy < 0.00001) {
            return improvedApproxSqrt;
        }

        return callback(originalValue, improvedApproxSqrt);
    };
}

This looks a lot like recursion, except that we’ve never referred to an free variable name. This is called anonymous recursion, and it’s useful in systems (notably lambda calculus) where functions cannot refer to themselves by name.

Now we need the magic which will repeatedly invoke step and feed its output back into its input: the Y combinator. How would it look in JavaScript? Here’s the Y combinator in lambda calculus again:

This corresponds pretty directly to a JavaScript function:

function Y(f) {
    return (function (x) {
        return f(x(x));
    })(function (x) {
        return f(x(x));
    });
}

Unfortunately, when we try to use this version of the Y combinator, we get a stack overflow. If you trace out the execution you’ll see that x(x) must be evaluated in order to get a final return value for Y, and this causes infinite recursion. The reason this works at all in lambda calculus is that lambda calculus is call by name so is evaluated by expanding the definition of and passing that function to f. JavaScript, on the other hand, is call by value, so the x function is actually evaluated with x as an argument.

If we η-reduce the Y combinator then we obtain an alternate fixed-point combinator, called the Z combinator, which contains an extra layer of indirection and prevents runaway recursion:

Here’s the JavaScript code corresponding to the Z combinator:

function Z(f) {
    return (function (x) { 
        return f(function (v) { return x(x)(v); }); 
    })(function (x) {
        return f(function (v) { return x(x)(v); });
    });
}

We still have a bit of a problem: our step function takes two variables, while Z only calls it with one. So let’s modify Z:

function Z(f) {
    return (function (x) { 
        return f(function (v1, v2) { return x(x)(v1, v2); }); 
    })(function (x) {
        return f(function (v1, v2) { return x(x)(v1, v2); });
    });
}

Now we can see it working:

> var sqrt = Z(step);
> sqrt(2, 1); // sqrt of 2, with a starting estimate of 1
1.4142156862745097

It’s rather awkward to always have to provide a starting estimate, so we can add a wrapper which always guesses 1 to start:

> function sqrt (num) { return Z(step)(num, 1); }
> sqrt(2)
1.4142156862745097