Brian Gordon blog

Covariance and contravariance rules in Java

programming 05 September 2014

This post is a go-to quick reference for exactly how covariance and contravariance work in Java. Different programming languages do this in different ways, so sometimes it can be tricky to switch between languages and keep the rules straight. When you switch over to Java, use this guide to refresh your memory.

Casting

Casting in Java is covariant. That means that if SubClazz is a subtype of Clazz then a SubClazz reference can be cast to a Clazz.

public class Clazz { }
public class SubClazz extends Clazz { }
(Clazz)new SubClazz(); // OK
(SubClazz)new Clazz(); // Compiles, but throws ClassCastException at runtime

Casting can occur implicitly during assignment:

Clazz instance = new SubClazz();

Casting can also occur implicitly when returning from a method or when passing arguments.

public Clazz makeClazz() {
    return new SubClazz();
}

public Clazz takeClazz(Clazz foo) { return foo; }
takeClazz(new SubClazz());

Arrays

Arrays in Java are covariant in the type of the objects they hold. In other words, Clazz[] can hold SubClazz objects.

Clazz[] array = new Clazz[10];
array[0] = new SubClazz();

They are also covariant in the type of the array itself. You can directly assign a SubClazz[] type to a Clazz[].

Clazz[] array = new SubClazz[10];

Be careful though; the above line is dangerous. Although the type of the array variable is Clazz[], the actual array object on the heap is a SubClazz[]. For that reason, the following code compiles fine but throws a java.lang.ArrayStoreException at runtime:

Clazz[] array = new SubClazz[10];
array[0] = new Clazz();

Overriding methods

The overriding method is covariant in the return type and invariant in the argument types. That means that the return type of the overriding method can be a subclass of the return type of the overridden method, but the argument types must match exactly.

public interface Parent {
    public Clazz act(Clazz argument);
}

public interface Child extends Parent {
    @Override
    public SubClazz act(Clazz argument);
}

If the argument types aren’t identical in the subclass then the method will be overloaded instead of overridden. You should always use the @Override annotation to ensure that this doesn’t happen accidentally.
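
For example, narrowing the parameter type in the child produces a new overload rather than an override, and @Override turns the mistake into a compile error (Child2 here is just another hypothetical interface):

public interface Child2 extends Parent {
    @Override
    public SubClazz act(SubClazz argument); // Error: does not override act(Clazz)
}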

Generics

Unless wildcards are involved, generic types are invariant with respect to their type parameter. So you can’t do covariant ArrayLists like this:

ArrayList<Clazz> ary = new ArrayList<SubClazz>(); // Error!
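
The restriction exists for type safety. If that assignment were allowed, you could sneak a plain Clazz into a list that is supposed to contain only SubClazz objects, and - unlike with arrays - there is no runtime check to catch it. A sketch of what would go wrong:

ArrayList<SubClazz> subs = new ArrayList<SubClazz>();
// ArrayList<Clazz> ary = subs; // if this were legal...
// ary.add(new Clazz());        // ...this would put a plain Clazz into subs
// SubClazz s = subs.get(0);    // ...and this would blow up at runtime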

The normal rules apply to the type being parameterized:

List<Clazz> list = new ArrayList<Clazz>();

Unbounded wildcards allow assignment with any type parameter:

List<?> list = new ArrayList<Clazz>();

Bounded wildcards affect assignment like you might expect:

List<? extends Clazz> list = new ArrayList<SubClazz>();
List<? super Clazz> list2 = new ArrayList<Object>();
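
The intuition behind the two bounds: a ? extends Clazz list is safe to read Clazz values from but not to add to, while a ? super Clazz list is safe to add Clazz values to but only promises Object when you read. A quick sketch:

List<? extends Clazz> producer = new ArrayList<SubClazz>();
Clazz fromProducer = producer.get(0);  // OK: every element is at least a Clazz
// producer.add(new SubClazz());       // Error: the list might hold some other subtype

List<? super Clazz> consumer = new ArrayList<Object>();
consumer.add(new SubClazz());          // OK: a SubClazz is a Clazz
Object fromConsumer = consumer.get(0); // get() can only promise Object here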

You can add or remove the type parameters from the return type of an overriding method and it will still compile (dropping them, as in the second example below, at most costs an unchecked warning):

public interface Parent {
    public List echo();
}

public interface Child extends Parent {
    @Override
    public List<String> echo();
}

public interface Parent {
    public List<String> echo();
}

public interface Child extends Parent {
    @Override
    public List echo();
}

Wildcards can be present in the types of method arguments. If you want to override a method with a wildcard-typed argument, the overriding method must have an identical type parameter. You cannot be “more specific” with the overriding method:

public interface Parent {
    public void act(List<? extends List> a);
}

public interface Child extends Parent {
    @Override
    public void act(List<? extends ArrayList> a); // Error!
}

Also, you can replace any type-parameterized method argument with a non-type-parameterized method argument in the subclass and it will still be considered an override:

public interface Parent {
    public void act(List<? extends Number> a);
}

public interface Child extends Parent {
    @Override
    public void act(List a);
}

The skyline problem

programming 31 August 2014

You are given a set of rectangles in no particular order. They have varying widths and heights, but their bottom edges are collinear, so that they look like buildings on a skyline. For each rectangle, you’re given the x position of the left edge, the x position of the right edge, and the height. Your task is to draw an outline around the set of rectangles so that you can see what the skyline would look like when silhouetted at night.

How shall we proceed? If you’re drawing a blank, it’s always good to get a really awful solution on the table right away so that you have something to think about and improve upon.

The first thing I thought of was to construct a 1-dimensional heightmap. The idea is to create an array of height values and write each rectangle onto it. Without worrying about the details of mapping rectangle coordinates to pixel array indices, the code will look something like this:

for each rectangle r:
    for each heightmap cell c starting at r.left and ending at r.right:
        c gets the max of r.height and the previous value of c

OK, so this works as a first attempt at a solution. What, specifically, is wrong with it?

You can see from the animation that the skyline constructed from the heightmap isn’t quite correct. The edges of the rectangles don’t line up perfectly with the array cells, so there is a small amount of error in the shape of the skyline. In fact, it is easily shown that the only way to guarantee zero error is to use a heightmap with the same resolution as the final rendered image. This means that the running time of your algorithm depends not only on the number of given rectangles, but also on the resolution of the output image.

Of course, unless you’re using a vector display, it’s inevitable that at some point you will have code looping on the order of the resolution of the output image, just to draw the line segments one pixel at a time. I’m inclined to not worry about this cost. If you’re writing code in Perl, for example, your concern is to do as little as possible in Perl and offload as much work as possible to the drawing library, which is likely compiled and heavily optimized. Only when the line-drawing code isn’t much faster than your application logic does it start to make sense to use a raster approach like a heightmap.

What now? If the chief weakness of the heightmap approach is the sheer number of points to deal with in application code, maybe we can reduce the number of points in play. Now that we think about it, the skyline is made up of horizontal line segments, interrupted in only a few places. In fact, the only time the skyline can change its y position is at the left or right side of a rectangle. It’s clear now that if we find the height of the skyline at each of these “critical points” on the x axis then we will have completely determined the shape of the skyline. At each critical point, you just go up or down to the new height and draw a line segment to the right until you reach the next critical point.

So how do we find the true height of the skyline at each of these critical points? Well, we already have this heightmap approach, so let’s try doing something similar. This time, instead of printing the rectangles onto a heightmap array with an entry for each pixel, let’s print the rectangles onto an array with an entry for each critical point! This will eliminate the problem of dealing with too many points, because we’re now dealing with the minimum number of points necessary to determine the skyline.

Here’s a first stab at what the code would look like:

for each rectangle r:
    for each critical point c:
        if c.x >= r.left && c.x < r.right:
            c.y gets the max of r.height and the previous value of c.y

Looks good! So now we have a working solution to exactly the problem we were trying to solve, with no error. Can we achieve a better running time? It occurs to us that we don’t really need to look at every critical point when printing a rectangle, but rather only those critical points below the rectangle in question.

for each rectangle r:
    for each critical point c below r (except the one at r.right):
        c.y gets the max of r.height and the previous value of c.y

This optimization depends, of course, on being able to efficiently find which critical points are subtended by each rectangle. This is easily done by sorting the critical points. For example, if we want to find the critical points subtended by the magenta rectangle, we start at the left side of the magenta rectangle and scan to the right, accumulating critical points until we reach the right side.

Unfortunately, this isn’t an asymptotic improvement in the worst case. It’s still O(n^2) given something like the following configuration:

The worst case.

At this point perhaps no ideas jump out at you about how to improve the algorithm’s performance further. Let’s try perturbing the solution we have in order to see what might present itself. What if, instead of looking at each critical point for each rectangle, we look at each rectangle for each critical point? This is the same code as before, with the loops switched:

for each critical point c:
    for each rectangle r:
        if c.x >= r.left && c.x < r.right:
            c.y gets the max of r.height and the previous value of c.y

Again, we don’t really need to consider all rectangles, only the ones above the critical point in question:

for each critical point c:
    for each rectangle r above c (not including the right edge of rectangles):
        c.y gets the max of r.height and the previous value of c.y

So, given a critical point, how do we efficiently find all of the rectangles above it? This requires a different strategy than before. Before we turned the problem inside out, we needed to find all of the critical points between the left and right sides of the given rectangle. Now, we need to find all of the rectangles with a left edge to the left of the given critical point and a right edge to the right of the given critical point.

What if we begin at the critical point and go left looking for left edges, and also go right looking for right edges, and then intersect the two sets of rectangles? That would work, but, again, it’s O(n^2) in total to do this for every critical point.

A better approach is to simply scan across the skyline’s sorted critical points from left to right, keeping track of an active set of rectangles as you go. When you reach a critical point, the active set is updated and then the critical point gets assigned a copy of the current active set of rectangles. By the end of the pass, each critical point will know about all of the rectangles above it.

Now that we’re able to scan through the critical points and consider only the “active” set of rectangles at each critical point, an interesting opportunity presents itself. Our current solution can be written as:

for each critical point c
    c.y gets the height of the tallest rectangle over c

This is no longer obviously O(n^2). If we can somehow calculate the height of the tallest rectangle over c in faster than O(n) time, we have beaten our O(n^2) algorithm. Fortunately, we know about a data structure which can keep track of an active set of integer-keyed objects and return the highest one in O(log n) time: a heap.

Our final solution, then, in O(n log n) time, is as follows. First, sort the critical points. Then scan across the critical points from left to right. When we encounter the left edge of a rectangle, we add that rectangle to the heap with its height as the key. When we encounter the right edge of a rectangle, we remove that rectangle from the heap. (This requires keeping external pointers into the heap.) Finally, any time we encounter a critical point, after updating the heap we set the height of that critical point to the value peeked from the top of the heap.
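
Here’s a compact sketch of that final algorithm in Java. Instead of a heap with external pointers into it, this version uses a TreeMap as a multiset of active heights, which gives the same O(log n) add/remove/max bounds; the class and method names are mine.

import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class Skyline {
    // Each building is {left, right, height}. Returns the skyline as a list of
    // {x, newHeight} pairs - one entry per critical point where the height changes.
    public static List<int[]> outline(int[][] buildings) {
        // Critical points: {x, height, +1 for a left edge / -1 for a right edge}.
        List<int[]> events = new ArrayList<int[]>();
        for (int[] b : buildings) {
            events.add(new int[] { b[0], b[2], +1 });
            events.add(new int[] { b[1], b[2], -1 });
        }
        // Sort by x. At equal x: left edges before right edges, taller left edges
        // first, shorter right edges first. This avoids spurious output points.
        events.sort((e1, e2) -> {
            if (e1[0] != e2[0]) return Integer.compare(e1[0], e2[0]);
            if (e1[2] != e2[2]) return Integer.compare(e2[2], e1[2]);
            return e1[2] == +1 ? Integer.compare(e2[1], e1[1])
                               : Integer.compare(e1[1], e2[1]);
        });

        TreeMap<Integer, Integer> active = new TreeMap<>(); // height -> count
        active.put(0, 1);                                   // ground level
        List<int[]> result = new ArrayList<int[]>();
        int prevHeight = 0;
        for (int[] e : events) {
            if (e[2] == +1) {
                active.merge(e[1], 1, Integer::sum);        // rectangle enters the active set
            } else {
                active.merge(e[1], -1, Integer::sum);       // rectangle leaves the active set
                if (active.get(e[1]) == 0) active.remove(e[1]);
            }
            int height = active.lastKey();                  // tallest active rectangle
            if (height != prevHeight) {
                result.add(new int[] { e[0], height });
                prevHeight = height;
            }
        }
        return result;
    }
}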

Square roots and fixed points

math programming 02 June 2014

There’s an old comment on HN that annoys me. In it, a developer whines bitterly and eloquently about his experience being stumped by a common interview question: “write an algorithm for computing the square root of a number.”

I tried a couple more times to answer this question that is completely unrelated to the job I’m interviewing for. Finally, I get so fed up with this moron that my internal ticker clicks over and I realize, “Even if I get this job, I’m going to be dealing with these kinds of navel-gazing engineers every single day. Not an environment I want to be in.”

“Completely unrelated?” The problem is easily solved with binary search; is it inconceivable that a frontend developer would ever use binary search? For that matter, is this interview for a contract to build a single front-end, or for a full-time developer position which may continue indefinitely? Because if it’s the latter, general problem solving questions are completely fair game. The company is going to need to make an investment of institutional knowledge in a new developer. If the developer can pivot to a new role as needed then the company may be able to avoid hiring someone new in the future and save the cost of bringing another employee up to speed. There is therefore legitimate value added by being able to tackle problems outside of your comfort zone. Petulantly hanging up on the interviewer and deciding he’s not worth your time for asking such a question is childish.

Anyway, other than the obvious solution of binary search, how might we solve the problem of calculating sqrt(a)? Let’s first introduce the concept of a fixed point. A fixed point of a function f is a value which is unchanged by the function - that is, f(x) = x. For example, a fixed point of the sine function is 0 because sin(0) = 0. A fixed point of the cosine function is located around 0.739085133 because cos(0.739085133) ≈ 0.739085133. In fact, if we plot the cosine function on top of the line y = x then we can see that they intersect at exactly that point:

Plot 1

One interesting fact which might be of use is that sqrt(a) is a fixed point of the function f(x) = a/x:

f(sqrt(a)) = a / sqrt(a) = sqrt(a)

We can therefore graph f(x) = a/x to find its fixed point sqrt(a). Here’s a plot for various values of a:

Plot 2

By subtracting x from the function it becomes a root finding problem: we are now looking for the zero of g(x) = a/x - x.

Plot 3

This we can solve with Newton’s method:

x_{n+1} = x_n - g(x_n) / g'(x_n)

Of course, we could have just used Newton’s method from the beginning. If we’re trying to calculate sqrt(a) then the value we’re looking for is a zero of the function f(x) = x^2 - a. In this case we have:

x_{n+1} = x_n - (x_n^2 - a) / (2 x_n)

These are different functions, but the Newton iteration has the same fixed point, sqrt(a), in both cases. This is a plot of the two functions (with a=7) showing that they also have the same zeros:

Plot 4

Newton’s method corresponds closely to a concept called fixed point iteration. In fixed point iteration, we repeatedly evaluate a function and feed its output back into its input. Eventually we hope it will converge on a fixed point. We mentioned the cosine function earlier; this is one such function which always converges on a particular fixed point. Enter any value into a scientific calculator and repeatedly press the cosine button:

-1.0000, 0.5403, 0.8575, 0.6542, 0.7934, 0.7013, 0.7639, 0.7221, 0.7504, 0.7314, 0.7442, 0.7356

It will eventually converge to a value near 0.7391:

Plot 5

Another function which converges to a fixed point under iteration is the one given by Newton’s method applied to the function f(x) = x^2 - a, which we derived before:

x_{n+1} = x_n - (x_n^2 - a) / (2 x_n)

If we rearrange the right side we obtain:

x_{n+1} = (x_n + a / x_n) / 2

This is called the Babylonian method, or Heron’s method, and it was actually discovered long before Newton’s method. The idea was that if x is an overestimate of sqrt(a) then a/x is an underestimate, and you can average them to get a better approximation. Repeated iteration improves this approximation until it converges on sqrt(a), which we know is a fixed point.
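
Spelling out that rearrangement, it really is just algebra:

x_{n+1} = x_n - (x_n^2 - a) / (2 x_n)
        = (2 x_n^2 - (x_n^2 - a)) / (2 x_n)
        = (x_n^2 + a) / (2 x_n)
        = (x_n + a / x_n) / 2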

Remember that before we said sqrt(a) is a fixed point of the function f(x) = a/x. Unfortunately, if you iterate that function you will not approach sqrt(a): the iterates just bounce back and forth between x_0 and a/x_0 forever. Fixed point iteration doesn’t always work, and this is one such case. The math behind being able to tell whether an arbitrary function will converge to a fixed point under fixed point iteration is complicated.

Now it would be very simple to wrap the Babylonian method in a loop and perform a couple of steps of fixed point iteration to get a decent approximation of sqrt(a). But since we’re finding a fixed point, this seems like a nice time to break out something called a fixed point combinator. The best-known fixed point combinator is the Y combinator. You’ve probably heard of it due to the eponymous startup incubator founded by Lisp greybeard Paul Graham, of the famous Paul Graham essays.

YCombinator logo

This is the lambda calculus definition of the Y combinator, due to Haskell Curry:

Y = λf.(λx.f (x x)) (λx.f (x x))

The reason Y is called a fixed-point combinator is because of what happens when you apply it to a function and reduce. By following the lambda calculus reduction rules you can find that the Y combinator satisfies the equation Y f = f (Y f). This matches the form x = f(x) - the definition of a fixed point. So, Y f is a fixed point of f! Therefore all we have to do is apply the Y combinator to our square-root-improving function to obtain a fixed point of it, right?

Well, not really. Our improvement function takes a function and returns a function, so its fixed point isn’t a number at all; it’s the function which maps a to sqrt(a). In other words, all the Y combinator is doing is facilitating fixed-point iteration:

Y f = f (Y f) = f (f (Y f)) = f (f (f (Y f))) = …

It appears that this expansion will continue forever and never terminate. But we can build a termination condition into the function so that it stops expanding. Let’s see how that would work with the sqrt example. Our step function could look like this in JavaScript:

function step (callback) {
    return function (originalValue, approxSqrt) {
        // Babylonian method formula to improve our approximation of the square root 
        // of originalValue.
        var improvedApproxSqrt = (approxSqrt + (originalValue / approxSqrt)) / 2;

        // How far off the mark we are.
        var discrepancy = Math.abs(originalValue - 
                                (improvedApproxSqrt * improvedApproxSqrt));

        // Termination condition
        if(discrepancy < 0.00001) {
            return improvedApproxSqrt;
        }

        return callback(originalValue, improvedApproxSqrt);
    };
}

This looks a lot like recursion, except that we’ve never referred to the function by name - the only name used inside is the bound callback parameter, not a free variable. This is called anonymous recursion, and it’s useful in systems (notably lambda calculus) where functions cannot refer to themselves by name.

Now we need the magic which will repeatedly invoke step and feed its output back into its input: the Y combinator. How would it look in JavaScript? Here’s the Y combinator in lambda calculus again:

Y = λf.(λx.f (x x)) (λx.f (x x))
This corresponds pretty directly to a JavaScript function:

function Y(f) {
    return (function (x) {
        return f(x(x));
    })(function (x) {
        return f(x(x));
    });
}

Unfortunately, when we try to use this version of the Y combinator, we get a stack overflow. If you trace out the execution you’ll see that x(x) must be evaluated in order to get a final return value for Y, and this causes infinite recursion. The reason this works at all in lambda calculus is that lambda calculus is call by name, so the argument x x is never evaluated up front; it is passed to f as an unevaluated expression and only expanded if f actually uses it. JavaScript, on the other hand, is call by value, so the x function is actually evaluated with x as an argument before f is ever called.

If we η-expand the self-application inside the Y combinator then we obtain an alternate fixed-point combinator, called the Z combinator, which contains an extra layer of indirection and prevents runaway recursion:

Z = λf.(λx.f (λv.x x v)) (λx.f (λv.x x v))
Here’s the JavaScript code corresponding to the Z combinator:

function Z(f) {
    return (function (x) { 
        return f(function (v) { return x(x)(v); }); 
    })(function (x) {
        return f(function (v) { return x(x)(v); });
    });
}

We still have a bit of a problem: our step function takes two variables, while Z only calls it with one. So let’s modify Z:

function Z(f) {
    return (function (x) { 
        return f(function (v1, v2) { return x(x)(v1, v2); }); 
    })(function (x) {
        return f(function (v1, v2) { return x(x)(v1, v2); });
    });
}

Now we can see it working:

> var sqrt = Z(step);
> sqrt(2, 1); // sqrt of 2, with a starting estimate of 1
1.4142156862745097

It’s rather awkward to always have to provide a starting estimate, so we can add a wrapper which always guesses 1 to start:

> function sqrt (num) { return Z(step)(num, 1); }
> sqrt(2)
1.4142156862745097

MagicalTux's EVE Online pathfinder

programming 05 March 2014

The personal blog of Mark Karpeles, aka MagicalTux, CEO of the recently-bankrupt Bitcoin exchange Mt. Gox, received attention on Hacker News recently. Given that Gox’s spectacular $473 million downfall was supposedly caused by a bug in Karpeles’s custom implementation of the Bitcoin protocol, people were understandably interested in checking the quality of his public source code.

What we saw was not reassuring. In one post, Karpeles describes a custom implementation of SSH2 which he wrote for production use at his web hosting company.

With PHP I could write a fully working SSH server in only 3 days. … My goal when writing this was to provide a replacement for the FTP protocol for the customers of my hosting service.

As I was missing some functions to properly handle the initial Diffie-Hellman key exchange (and later to implement publickey authentication) I had to re-implement those in PHP.

This is horrifying to see from the guy who wrote the server which once handled 70 percent of all Bitcoin trades. If I had seen this post before Mt. Gox’s failure, I would never have deposited my Bitcoin with them. The SSH server source code has since been taken down, but one HN user remembers it:

The code for the sshd does not seem to be there anymore, but from memory: it did not check if the number sent by Bob was 0, 1, or any any other groups that would make it easy solve the discrete logarithm problem. I don’t think it bothered to check the primes either. [1] I think there was also something wrong with the signature checking (padding not checked maybe?).

Altogether it seemed like you could easily MITM connections made to the server, but I don’t think I ever tried. It was a perfect example–to me at least–of why you should not spend a trivial amount of time reading about crypto on Wikipedia and then writing crypto code.

I absolutely agree. The main lesson that you should take away from the crypto class is that you should be afraid of rolling your own crypto code. There are a thousand ways to screw up, and it only takes one mistake for your cryptosystem to fall apart. The recent Apple and GnuTLS vulns show that even the serious players get this wrong.

So best practices were apparently not followed at Mt. Gox. In fact, later it was alleged that developers at Mt. Gox would push changes directly to production, and didn’t even use version control for the site’s source code.

Another of his blog posts is about a tool which he wrote - in PHP of course - to compute routes between star systems in the MMORPG EVE Online. In EVE, solar systems are connected to each other with portals called Stargates. The result is a big graph. Savvy players will try to take the shortest possible path to get from point A to point B by using a shortest-path algorithm to automate navigation.

Systems map in EVE Online

I thought I’d hop on the bandwagon by criticizing his EVE pathfinder. Although I could nitpick on matters of style, I’d rather focus on the core- that is, the all-pairs shortest path solver which is the basis of his algorithm. The general idea is to generate an index which contains, for each system, the next hop to take in order to reach any given system. This uses O(n^2) space but allows efficient pathfinding between any two arbitrary systems in the universe. All of this is perfectly fine so far. The problem is how he goes about constructing this index.

Based on my reading of his code, this appears to be his algorithm:

  1. Inform each system how to reach its adjacent systems.
  2. From each system s, collect the best known paths to systems 1 hop away and advertise them to s’s adjacent systems.
  3. From each system s, collect the best known paths to systems 2 hops away and advertise them to s’s adjacent systems.
  4. From each system s, collect the best known paths to systems 3 hops away and advertise them to s’s adjacent systems.
  5. From each system s, collect the best known paths to systems 4 hops away and advertise them to s’s adjacent systems.

This continues for 256 steps, by which point MagicalTux hopes all systems have been informed of the shortest path to all other systems. Judging by a cursory search of the EVE forums, this appears to be a valid assumption.

This algorithm is essentially the Bellman-Ford algorithm run once for each system in the universe. Although this solution is asymptotically optimal, in real life it will perform poorly compared to a real all-pairs shortest path algorithm. I thought it would be interesting to see how badly I could beat his code’s performance. MagicalTux claims that it takes about 3 hours to construct the index using his PHP code. I’m going to see how much faster I can do it by using a purpose-built all-pairs algorithm (Floyd-Warshall) in compiled, parallelized C++ instead of PHP.

I downloaded the jump connection data helpfully provided by MagicalTux - the official data comes in an unusable binary MSSQL format - and put it in a CSV file for easy parsing.

My complete EVE pathfinder implementation can be found here. It doesn’t require any extra space (asymptotically) to construct the index, which is nice. This is the core of the code where almost 100% of the running time is spent:

// Floyd's algorithm
for(uint32_t k=0; k<NUM_SYSTEMS; k++) {
    #pragma omp parallel for shared(k, cost, next) schedule(dynamic)
    for(uint32_t i=0; i<NUM_SYSTEMS; i++) {
        if(i == k) {
            continue;
        }
        for(uint32_t j=0; j<NUM_SYSTEMS; j++) {
            uint32_t prev = cost[(NUM_SYSTEMS * i) + j];
            uint32_t candidate = cost[(NUM_SYSTEMS * i) + k] + cost[(NUM_SYSTEMS * k) + j];
            if(candidate < prev) {
                cost[(NUM_SYSTEMS * i) + j] = candidate;
                next[(NUM_SYSTEMS * i) + j] = next[(NUM_SYSTEMS * i) + k];
            }
        }
    }
}
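
For reference, once cost and next are filled in, reading an actual route back out of the index is just a walk along next-hop entries. A rough sketch of that lookup (in Java with made-up names; it assumes next[i * n + j] was initialized to j for directly connected systems, which the snippet above doesn’t show):

import java.util.ArrayList;
import java.util.List;

public class RouteLookup {
    // next[i * n + j] holds the system to jump to from i when heading toward j.
    // Assumes dst is reachable from src (the EVE jump graph is connected).
    public static List<Integer> route(int[] next, int n, int src, int dst) {
        List<Integer> path = new ArrayList<Integer>();
        path.add(src);
        int current = src;
        while (current != dst) {
            current = next[n * current + dst];
            path.add(current);
        }
        return path;
    }
}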

I picked two random system IDs (30000029 and 30000050) for a pathfinding demo. My test program constructed the all-pairs index and then used it to find the shortest path between the two systems. You can see the output of the test below:

brian@mint ~/eve $ ./eve 
Indexing... done. Elapsed time 38.99 seconds.
Calculating the quickest route from 30000029 to 30000050... done.
Built a 14 hop route in 0.002 ms.

Note that, indeed, the shortest route between those two systems according to DOTLAN is 14 hops! And the index was built in less than 40 seconds- about 0.4% of the time that it took MagicalTux’s PHP version.

Optimizations performed by javac

programming 12 January 2014

I’ve been a full-time Java developer for about a year now, and in that time I’ve seen a lot of incorrect information out there about the behavior of the Java compiler (javac). With this post I hope to clear up some misconceptions about compile-time optimization in Java. Note that this information is correct for my version of javac but is not necessarily true for all JDKs.

brian@mint ~ $ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

Boolean expressions

Let’s start with an easy case. Is the compiler smart enough to spot a redundant logical NOT in a boolean condition? Let’s compile the following blocks of code:

if(!booleanExpression) {
    doSomething(1);
} else {
    doSomething(0);
}

if(booleanExpression) {
    doSomething(0);
} else {
    doSomething(1);
}

The compiler produces pretty much identical bytecode for both of these. Of course, the cost of an added logical NOT operation would be insignificantly small. But in fact there is no extra operation. Let’s examine the bytecode:

13: ifne          23
16: iconst_1      
17: invokestatic  #5                  // doSomething(1)
20: goto          27
23: iconst_0      
24: invokestatic  #5                  // doSomething(0)
27: return

13: ifeq          23
16: iconst_0      
17: invokestatic  #5                  // doSomething(0)
20: goto          27
23: iconst_1      
24: invokestatic  #5                  // doSomething(1)
27: return

The only difference is an ifeq instead of an ifne as the condition for the jump, so you can feel free to use either of the two forms depending on which makes more intuitive sense in context.

Now for a more complex case. You might be tempted to apply your knowledge of boolean algebra to transform the former of these blocks into the latter in order to improve performance in a tight loop:

if(!a || !b) {
    doSomething();
}

if(!(a && b)) {
    doSomething();
}

Would you be justified in doing so?

19: iload_2       
20: ifeq          27
23: iload_3       
24: ifne          33
30: invokevirtual #6

19: iload_2       
20: ifeq          27
23: iload_3       
24: ifne          33
30: invokevirtual #6

Nope! Once again, the compiler is smarter than you are. In fact, in this case the compiler literally generates the exact same bytecode for the two functions.

Constant arithmetic

private static int SECONDS_IN_30_DAYS = 60*60*24*30;

This is clearly more readable than embedding the magic constant 2592000 in the source code. But does it incur a runtime performance cost? Here is the bytecode for the class’s static initializer:

0: ldc           #5                  // int 2592000
2: putstatic     #3                  // Field SECONDS_IN_30_DAYS:I
5: return

The ldc instruction indicates that the compiler has precomputed the value 2592000 and stored it in the class’s constant pool. No multiplication occurs at run-time.

OK, so that works for integer arithmetic. How about constant boolean expressions?

private static boolean troo = true && (false || false || (true && true));

0: iconst_1      
1: putstatic     #3                  // Field troo:Z
4: return

This one is even leaner! Since there’s a special bytecode instruction for loading the integer 1, you don’t even have to go to the constant pool.

Note that in the two previous examples we have forced the compiler to create a static field within the class, along with static-initializer code that assigns it at class-initialization time. That initialization code is unnecessary for constant expressions. If we add the final keyword, the compiler treats the field as a compile-time constant and inlines its value wherever it’s actually used within the class.

Without final:

private static int SECONDS_IN_30_DAYS = 60*60*24*30;

public static void main(String[] args) {
    System.out.println(SECONDS_IN_30_DAYS);
}
public static void main(java.lang.String[]);
  Code:
     0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: getstatic     #3                  // Field SECONDS_IN_30_DAYS:I
     6: invokevirtual #4                  // Method java/io/PrintStream.println:(I)V
     9: return        

static {};
  Code:
     0: ldc           #5                  // int 2592000
     2: putstatic     #3                  // Field SECONDS_IN_30_DAYS:I
     5: return

With final there is no separate static initializer section, and instead of getstatic we can simply ldc to get the constant value:

private static final int SECONDS_IN_30_DAYS = 60*60*24*30;

public static void main(String[] args) {
    System.out.println(SECONDS_IN_30_DAYS);
}
public static void main(java.lang.String[]);
  Code:
     0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #3                  // int 2592000
     5: invokevirtual #4                  // Method java/io/PrintStream.println:(I)V
     8: return

String concatenation

Some people will tell you that String concatenation with + is a performance killer on Java. They’ll tell you to use StringBuilder instead. But javac is actually pretty smart about converting + into StringBuilder appends.

return str1 + " : " + str2;
11: new           #3                  // class java/lang/StringBuilder
14: dup           
15: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
18: aload_1       
19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
22: ldc           #6                  // String  : 
24: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
27: aload_2       
28: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
31: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;

However, it’s not perfect. In cases like this you need to construct a StringBuilder yourself, because the automatic StringBuilder won’t be reused for the second line:

String cat = str1 + " : " + str2;
return cat + " 123";
 8: new           #2                  // class java/lang/StringBuilder
11: dup           
12: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
15: aload_1       
16: invokevirtual #4                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: ldc           #5                  // String  : 
21: invokevirtual #4                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: aload_2       
25: invokevirtual #4                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: invokevirtual #6                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
31: astore_3      
35: new           #2                  // class java/lang/StringBuilder
38: dup           
39: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
42: aload_3       
43: invokevirtual #4                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
46: ldc           #8                  // String  123
48: invokevirtual #4                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
51: invokevirtual #6                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;

Note that the StringBuilder constructor is called twice (lines 8 and 35).
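
If that matters in a hot path and you don’t actually need cat as a separate variable, you can manage the builder yourself so that only one is allocated; a sketch of the same method body:

StringBuilder sb = new StringBuilder();
sb.append(str1).append(" : ").append(str2); // what the first line compiled to
sb.append(" 123");                          // reuse the same builder for the suffix
return sb.toString();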

Constant String concatenation

IntelliJ IDEA breaks lengthy String constants into multiple lines:

return "We the People of the United States, in Order to form a more perfect "
     + "Union, establish Justice, insure domestic Tranquility, provide for the "
     + "common defence, promote the general Welfare, and secure the Blessings "
     + "of Liberty to ourselves and our Posterity, do ordain and establish this "
     + "Constitution for the United States of America.";

As you might expect, the compiler deals:

0: ldc           #2                  // String We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.

Dead code elimination

If code is unreachable, it will be eliminated from the class file:

public static void main(String[] args) {
    if(false) {
        System.out.println("The universe is broken.");
    }
}
public static void main(java.lang.String[]);
  Code:
     0: return

Advanced optimizations

Believe it or not, that’s about all of the optimizations that are performed at compile time. The really advanced optimizations are going to be deferred until the code is just-in-time compiled at runtime. This allows the JIT to write machine code which is optimized for your specific processor architecture.

Something you need to be aware of, though, is that in order to employ certain optimizations the JIT cannot compile loaded bytecode immediately:

A warm-up period provides the HotSpot VM’s JIT compiler the opportunity to collect information about a running program and make intelligent dynamic optimization decisions based on the “hot” code paths taken by the executing program. By default, the HotSpot Server VM executes a block of Java byte code 10,000 times before the HotSpot Server JIT compiler produces native machine code for that block of Java bytecode.

Java Performance, by Charlie Hunt and Binu John.

This can really play hell with your efforts to profile code, because the performance profile of your application is constantly changing as more methods are JIT compiled. One thing you can try is setting the runtime flag -XX:CompileThreshold to a lower value, for example 1. This will cause the JIT to run immediately; however, its optimizations will be less effective and this will subvert the Java code cache mechanism if you don’t have enough memory available.
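
For example, to ask for near-immediate compilation while profiling (the jar name here is just a placeholder):

java -XX:CompileThreshold=1 -jar my-app.jar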

A brain teaser

riddles 28 December 2012

These two brain teasers posted on /r/math impressed me greatly, so I decided to present them here.

You have been abducted by terrorists and blindfolded. They sit you down at a table and tell you that if you can solve two puzzles then they will let you go.

There is a deck of 52 cards on the table in front of you. Ten of them are face-up and the rest are face-down. Split the deck such that each new stack has an equal number of face-up cards.

You sit thinking in darkness for a few minutes. The terrorists are growing impatient around you when you finally smile: you’ve got the solution! You count out 10 cards from the deck and flip them over. Then you push the two stacks across the table.

The terrorists count the face-up cards in each stack suspiciously. When they find that the number of face-up cards is identical, they’re impressed, but you have a yet harder challenge to face before they will let you go.

On one side of the table there are five dice which add up to 15. On the other side of the table there are another five dice which add up to 13. Move some dice around so that the two sides of the table have equal sums.

You furrow your brow and sit thinking in darkness for over an hour. The terrorists’ new respect for you is wearing off quickly, and they won’t wait much longer. Finally, you get it. You take one die from the group on the right and move it to the group on the left. Then you flip the remaining 4 dice of the right group upside-down. “You see,” you explain carefully. “If the left group sums to 15, then the right group sums to 13; this was given in the problem. If you take a die showing some value d from the right group and move it to the left, then the new sums are 15 + d on the left and 13 - d on the right. Now, because the opposite sides of a single die always add to 7, any time you flip a set of 4 dice summing to s, you end up with a set of 4 dice summing to 28 - s. Therefore, if we flip all of the dice remaining in the right group, its sum becomes 28 - (13 - d) = 15 + d, which is the same value that we have on the left side.”
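
To make the arithmetic concrete, suppose the die you moved happened to be showing a 4:

left side:  15 + 4 = 19
right side: the four remaining dice sum to 13 - 4 = 9, and flipping them gives 4 * 7 - 9 = 19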

Looping a YouTube video

tips 30 November 2012

Last night I was trying to figure out a balisong trick by watching 15 seconds of a YouTube video over and over while trying to duplicate the movements. Switching back and forth between my mouse and the knife was getting annoying, so I took a few minutes to figure out how to control the player with JavaScript. This is how I did it.

First, get the <embed> DOM object. The best way to do this is by using the DOM inspector in your favorite browser to find the ID string of the element. Right now that ID is movie_player, but YouTube might change it in the future.

A screenshot of Firefox's DOM inspector

Once you have the ID string you can get the DOM object using getElementById in your browser’s JavaScript console.

var ytplayer = document.getElementById("movie_player");

Now you can use any of the functions of the YouTube player API, which are properties of the DOM object we just found. For example, seekTo will take you to a given point in the video. So if you want to go back to 1:01 in the video every ten seconds then you can use setInterval:

setInterval(function () { ytplayer.seekTo(61, true); }, 10000);

A tricky JavaScript closure problem

programming 21 November 2012

Say you have ten images on a page, with IDs img0 to img9. You’d like the number of the image to pop up when you click on one. You might be tempted to write the following JQuery code:

for(var i=0; i<10; i++) {
    $("#img" + i).click(
        function () { alert(i); }
    );
}

The event handlers will successfully bind to every image’s onClick event. You may be surprised, however, to find that when you click on any of the images, 10 always pops up! You can try it for yourself at this jsfiddle.

The problem, of course, is that the closure captures the variable i itself, rather than the value i had when the handler was bound. Since there is only one i variable (var is function-scoped), every single click handler will refer to the same one, which has the value 10 by the time the user gets a chance to trigger the handler.

Here’s a workaround:

for(var i=0; i<10; i++) {
    (function (i) {
        $("#img" + i).click(
            function () { alert(i); }
        );
    })(i);
}

So what happened? Lest you think that the extra execution context has something to do with it, observe that this version without a function argument doesn’t work:

for(var i=0; i<10; i++) {
    (function () {
        $("#img" + i).click(
            function () { alert(i); }
        );
    })();
}

This demonstrates that the function argument preserves the passed value. Indeed, primitives are copied by value in function calls:

function toFive(x) {
    x = 5;
}
var a = 4;
toFive(a);
a == 4 // true

What about the case where we’re dealing with real objects? Here’s an alternative example, which toggles each image to a different image on click. This version doesn’t work (the last image is always toggled):

for(var i=0; i<5; i++) {
    var toggler = $("<img/>", { "src": "http://www.famfamfam.com/lab/icons/silk/icons/cross.png" });
    toggler.click(function () { toggler.attr("src", "http://www.famfamfam.com/lab/icons/silk/icons/tick.png"); });
    $("#container").append(toggler);
}

First of all, it’s important to note that variable and function declarations are hoisted to the top of their scope, so the above code is equivalent to:

var toggler;
for(var i=0; i<5; i++) {
    toggler = $("<img/>", { "src": "http://www.famfamfam.com/lab/icons/silk/icons/cross.png" });
    toggler.click(function () { toggler.attr("src", "http://www.famfamfam.com/lab/icons/silk/icons/tick.png"); });
    $("#container").append(toggler);
}

When you assign a new <img> to toggler, the reference is updated to point to the new object. And it’s the reference, not the value, which is bound into the click closure. So the same reference is being updated over and over, and the same reference is being bound into the click closure each time. If we want to fix this problem, we can create a new reference each time by using a self-executing function declaration like before:

for(var i=0; i<5; i++) {
    var toggler = $("<img/>", { "src": "http://www.famfamfam.com/lab/icons/silk/icons/cross.png" });
    (function (t) {
        t.click(function () { t.attr("src", "http://www.famfamfam.com/lab/icons/silk/icons/tick.png"); });
        $("#container").append(t);
    })(toggler);
}

Counter-Strike: Source ladder climbing

gaming 24 August 2012

I have been playing Counter-Strike: Source off and on since its release in late 2004. Gameplay largely consists of playing the same short scenarios over and over against different opponents. Over the course of hundreds of hours of gameplay, everything from tiny obstructions in the game levels to the feel of the recoil of each weapon is inevitably memorized and incorporated into players’ muscle memory. Players continually search out better tactics, tune their game client configurations, and generally try to develop a better feel for the Source engine.

I say all of that in the hope that with the above introduction, the next sentence won’t seem overly bizarre: I’ve been wondering for years about the most efficient way to climb a ladder.

Typically to climb a ladder in a first-person shooter, the player (1) walks forward into the ladder while looking up. But what happens if you (2) back into the ladder while looking down? What happens if you (3) strafe sideways into a ladder while facing off to the side? What happens if you (4) face 45 degrees to the right of the ladder and then walk diagonally forward and left? In the case of the Source engine, 2 produces the same upward movement as 1, 3 produces no movement, and - I found - 4 produces up to 142% of the upward movement of 1 and 2 depending on where you’re looking vertically.

To get exact times for each possibility, I created this test map (screenshot) with the Source SDK. The idea was to climb the ladders with each technique, record the climbs, and examine the video to get the exact time for each climb.

A screenshot of the Hammer editor.

Unfortunately, the second variable in play - the vertical angle that the player is facing when they climb the ladder- has a huge influence on the speed of the climb. What I really want is to rank the times using the optimal vertical angle for each technique. It would be extremely difficult to determine the optimal angle and control it experimentally, so I cheated!

Because of the game’s physics model, if the player gets up to speed and reaches the top of the ladder, then he or she will sail upward a short distance past the top before falling back down. By trial and error, I determined that it’s impossible to clear more than 53 units using techniques 1 and 2, and it’s impossible to clear more than 105 units using technique 4. Those maximum distances can only be cleared at very close to top speed. So I jump around trying to find the optimal vertical angle which will clear the first height at the bottom of the track. Then I let go of the mouse and record a climb the rest of the way up at top speed!

Here are some recordings of the forward and diagonal methods. It takes about a minute to find the exact angle where it works, but I cut that part out.

The final results: technique 1 and its variant 2 climb at about 281 ups (units per second), and technique 4 and its variants climb at about 399 ups. That’s 42% faster!

A Fibonacci mnemonic

math 24 May 2012

The Fibonacci sequence is:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …

Each term is the sum of the previous two terms in the sequence. Terms in the Fibonacci sequence are pretty easy to spot. An interesting property of this sequence is that it works as a mnemonic for converting miles to kilometers. For example, 21 is followed by 34 in the Fibonacci sequence, and 21 miles is about 34 kilometers. Likewise, 34 is followed by 55 in the Fibonacci sequence, and 34 miles is about 55 kilometers.

Why does this work? The short answer is that the Fibonacci numbers grow like powers of the golden ratio φ, and φ is approximately the conversion constant from miles to kilometers. More exactly,

F(n+1) / F(n) → φ = (1 + sqrt(5)) / 2 as n grows

The value of φ is about 1.62, and 1 mile is about 1.61 kilometers. So successive terms in the Fibonacci sequence are approximately converting miles to kilometers! And, of course, to convert in the other direction you simply look at the preceding term instead.

A couple of techniques are useful for when the number you want to convert isn’t exactly a Fibonacci number. If you want to convert 17 miles to kilometers, you could look at the conversions for Fibonacci numbers 13 and 21 and try to interpolate the answer. However, since 34 is a Fibonacci number you can get a much better answer by converting 34 miles to 55 kilometers and dividing by 2.

You could also change the initial conditions of the recurrence. A Fibonacci sequence with general starting conditions is called a Lucas sequence. Because when solving a linear homogeneous recurrence relation the starting conditions don’t affect the relation’s characteristic equation, a Lucas sequence also grows like powers of φ, and the mnemonic applies. So you have more data points to work with if you also memorize, for example, the Lucas numbers:

2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, …

For more mathematical coincidences, see the Wikipedia article.

The pipe optimization problem

puzzle 03 February 2012

GitHub provides a great service called GitHub pages, which allows you to host static web sites from GitHub servers by simply pushing a repository containing the content. GitHub founder Tom Preston-Werner wrote a fantastic blog platform called Jekyll to fit this hosting model. Blog authors can write their posts in Markdown and Jekyll will render the posts to HTML and generate an entire site structure of static pages suitable for hosting with GitHub Pages. It makes sense as an architecture - if not for user comments there would be no reason at all for having a whole scripting language and database engine behind an otherwise static blog. Of course you’ve probably guessed that this post is published through that system. The complete source code and some comments about setting up your own blog are available here.

Besides playing with GitHub, I’ve also been working on my reading list over the break. Other than fiction, I’ve been reading through some textbooks which I have been fortunate enough to receive for free thanks to financial aid. One of these books is Introduction to Algorithms (in my classes we used Manber and Kleinberg & Tardos instead), which I have discovered has some excellent exercises - perfect little puzzlers for a blog format! Today’s puzzle is based on exercise 9.3-9 in the 3rd edition.

The problem

A number of well pumps have been built in arbitrary positions throughout a region. An east-west pipeline is to be built through the area. Connections will run north-south from each pump to the nearest point on the main pipeline. Your task is to decide where to place the pipeline so that the total length of connections that need to be laid is minimized. An example region map is below; click the image to expand it.

An overhead map showing four pump stations. A pipe runs east-west through the center of the area. Shorter pipes running north-south connect each pump station to its closest point on the main east-west pipe.

The example above is, in fact, a solution- you cannot find a better place to put the east-west pipeline. So how do you find an optimal position in general, given the coordinates of each pump?

You might be tempted to suggest that the pipeline be placed along the mean y-coordinate of all of the pump stations. It seems to be clear that the pipeline should be laid roughly through the center of the constellation of pumps, since we’re trying to minimize the distance to each pump. This geometrical intuition turns out to be wrong in rather the same way that our as-the-crow-flies pathfinding intuition is wrong in cities- the unusual way of accounting for distance foils the obvious solution. I hope that examining an example of using the mean will convince you that the mean doesn’t work; click the image to expand it.

An overhead map showing five pump stations. A pipe runs east-west through approximately the mean of the y-coordinates of the pumps. Shorter pipes running north-south connect each pump station to its closest point on the main east-west pipe.

The correct answer is to use the median of the y-coordinates. The easiest way to see why this works is by considering the case where there are an odd number of pumps, for example 5. The median, then, is the latitude line through the third pump sorted by y-coordinate. Say we move the pipeline from the median to just one meter north of the median. Then the lengths of two vertical connectors (to pumps 4 and 5) will decrease by one meter, and the lengths of three vertical connectors (to pumps 1, 2, and 3) will increase by one meter. In sum, the total length of pipe will increase. The same thing will happen if you move the pipeline just one meter south.

If we have an even number of pumps, say 6, the solution is that the pipeline can go anywhere vertically between pumps 3 and 4. It should be clear that you can move the pipeline up and down within that range without changing the overall total length, because three vertical connectors increase and three decrease in lockstep. It’s only when you move outside of that range that an asymmetry appears.
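
A direct implementation of this solution is tiny; here is a sketch (the class and method names are mine) that returns a y-coordinate for the pipeline minimizing the total connector length:

import java.util.Arrays;

public class Pipeline {
    // Any median of the pump y-coordinates minimizes the sum of vertical distances;
    // sorting and taking the lower median is the simplest way to get one.
    public static double pipelineY(double[] pumpY) {
        double[] ys = pumpY.clone();
        Arrays.sort(ys);
        return ys[(ys.length - 1) / 2];
    }
}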