Brian Gordon blog

I set up an available-anywhere development environment, and so can you.

docker programming 22 December 2019

I’ve been keeping my eyes open for a new development workstation powerful enough for Scala compilation in a laptop form factor. When I saw that Lenovo had a Black Friday sale on an extraordinarily inexpensive laptop ($129!) with good build quality but anemic specs, I had an idea: why not use this cheap laptop as a thin client for a remote development environment? I have a relatively snappy little server at home that can do all of the heavy lifting, which should mean that the laptop gets great performance without sacrificing on cost, weight, or battery life.

My Linux server runs headless Debian, which is great for a server but not really ideal for my development workflow. So the plan will be to use Docker to run Arch in a container. I’ll expose the container to the outside internet so that I can access it wherever I am in the world. It would be possible to do all of this with vagrant — there’s even an archlinux box - but I’m looking for something a little lighter-weight than a dedicated virtual machine. systemd-nspawn is perhaps more fit-for-purpose when running a whole guest OS in a container, but it lacks some of the convenience features provided by Docker, especially around automation and network configuration. Besides, this is a good opportunity to get some hands-on experience with Docker. It’s worth pointing out, though, that we’re accepting some risk by using containers because if there are ever any forward-incompatible kernel ABI changes then our Arch guest’s libc might try to make system calls that the Debian host kernel doesn’t understand. I’ve also found that security features in Docker mean that not everything in the guest OS works right out of the box, but that’s just something I’m going to deal with as it comes.

Before we move on, let’s discuss the different options available for how to remotely control our development environment from the laptop. LTSP and Thinstation don’t seem to exactly align with what I’m trying to do. I know from experience that basic X forwarding over SSH won’t give me anywhere near adequate performance (low latency, compatability with low-bandwidth WAN links) for comfortable use. Xpra adds some tricks and H.264 video compression to make X forwarding more bearable, so that might work — it even has an HTML5 client that works over SSL, in case you only have access to a web browser. TigerVNC/Xvnc over SSH would probably work okay. From my research, though, it looks like NoMachine has the best technology out there for remote desktop on Linux, and fortunately the freeware version of their product supports a virtual framebuffer via an embedded X server. If FOSS is a hard requirement for you, an old version (c. 2013) of their technology was open source, and it lives on in the X2Go project. From reports I see on the web, it seems that X2Go provides better performance than VNC and Xpra, so that would be my second choice. But I’m going with NoMachine.

Step zero is to make sure that Docker is installed on the host server. The version of Docker in the Debian repository is pretty old (even in Debian testing) so we’ll want to get it directly from Docker’s APT repo. The official instructions are fine but this is what I did instead:

sudo -i
curl -fs | gpg --dearmor > /etc/apt/trusted.gpg.d/docker.asc.gpg
echo "deb [arch=amd64] buster stable" > /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install docker-ce docker-ce-cli

Now we need a Dockerfile specifying how to build the image for our development environment. This is what I came up with; feel free to customize it for your needs.

FROM archlinux/base
ARG username=brian

# The default mirrors are sometimes not synchronized. Updating the mirrorlist first is necessary for the Dockerfile to reliably run without failures.
RUN pacman -Syu --noconfirm \
 && pacman -S --noconfirm reflector \
 && reflector --verbose --fastest 5 --age 6 --save /etc/pacman.d/mirrorlist \
 && pacman -Syu --noconfirm \
 && pacman -S --noconfirm --needed man man-pages nano openssh iputils procps-ng base-devel git systemd-sysvcompat \
 && pacman -S --noconfirm --needed xfce4 xfce4-goodies gvfs pulseaudio ttf-roboto ttf-ubuntu-font-family ttf-dejavu

# Set the system time zone.
RUN ln -sf /usr/share/zoneinfo/America/Los_Angeles /etc/localtime

# Create our user *before* mounting the home directory as a volume so that it won't be owned by root.
RUN useradd --create-home --user-group --groups wheel --shell /bin/bash $username
RUN echo '%wheel ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/wheel

# Enable OpenSSH daemon with password-based authentication disabled. You can remove some of this if you don't have authorized_keys set up.
RUN systemctl enable sshd
RUN sed -i 's/UsePAM yes/UsePAM no/' /etc/ssh/sshd_config
RUN sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
COPY --chown=$username:$username authorized_keys /home/brian/.ssh/authorized_keys

# Set up key-based auth for NX, though as we'll see it doesn't actually work. You can remove this if you don't have authorized_keys set up.
RUN mkdir -p /home/brian/.nx/config
RUN cp /home/brian/.ssh/authorized_keys /home/brian/.nx/config/authorized.crt
RUN chown -R brian:brian /home/brian/.nx
RUN chmod 0600 /home/brian/.nx/config/authorized.crt

# Patch makepkg so that it will let us run it as root. 
# Evil, but not as dumb as switching to a user only for makepkg to prompt for a nonexistent sudo password to install dependencies.
RUN cp --preserve=all /usr/bin/makepkg /usr/bin/makepkg.old
RUN sed -i 's/exit $E_ROOT/#exit $E_ROOT/g' /usr/bin/makepkg

# Build/install NoMachine. The vanilla installer doesn't support Arch but there's an AUR package which patches the Fedora support to work.
# The installer script checks /proc/1 to see if we're running systemd so if we want it to install the `.service` file we have to rename the shell.
# You can remove the DefaultDesktopCommand line if NoMachine autodetects your desktop environment. See
RUN git clone
WORKDIR nomachine
RUN makepkg -sric --noconfirm
RUN cp /bin/bash systemd-sh
RUN sed -i 's_#!/bin/bash_#!/root/nomachine/systemd-sh_' /usr/NX/scripts/setup/nxserver
RUN /usr/NX/scripts/setup/nxserver --enableservices
RUN sed -i 's_#!/root/nomachine/systemd-sh_#!/bin/bash_' /usr/NX/scripts/setup/nxserver
RUN sed -i 's_DefaultDesktopCommand /usr/bin/startxfce4_DefaultDesktopCommand "/usr/bin/dbus-launch /usr/bin/startxfce4"_' /usr/NX/etc/node.cfg
RUN sed -i 's/EnableUPnP NX/#EnableUPnP NX/' /usr/NX/etc/server.cfg
RUN echo "UDPPort 4000" >> /usr/NX/etc/server.cfg
RUN echo "CreateDisplay 1" >> /usr/NX/etc/server.cfg
RUN echo "DisplayOwner \"$username\"" >> /usr/NX/etc/server.cfg
RUN echo "#EnableNXClientAuthentication 1" >> /usr/NX/etc/server.cfg
RUN echo "#AcceptedAuthenticationMethods NX-private-key" >> /usr/NX/etc/server.cfg
RUN rm -rf nomachine

# Build/install yay, an AUR helper. You can remove this section if you use a different helper or prefer to run makepkg manually.
RUN git clone
RUN makepkg -sric --noconfirm
RUN rm -rf yay

# Put makepkg back the way it was.
RUN mv -f /usr/bin/makepkg.old /usr/bin/makepkg

# Since we're running systemd as our PID1 we need Docker to send it the SIGRTMIN+3 stop signal that systemd expects rather than the default SIGTERM.

VOLUME /home/$username
CMD [ "/sbin/init" ]

Note that we’re using systemd as our entry point. Typically you would not have an init system running in a container at all. It conventionally suffices to set up the filesystem and start your containerized application as PID1 directly. In our case, however, it would be nice to have systemd running so that we can install multiple services and start/stop them with systemctl. We want a full operating system, not just a single application, after all. Running systemd also helps us by reaping zombie processes which might otherwise be ignored by an application-specific entry point.

Put our Dockerfile in a directory and build an image from it. Change brian to your own username:

sudo -i
mkdir dev && cd dev
cp ~brian/.ssh/authorized_keys .
nano Dockerfile
docker build -t dev --build-arg username=brian .

Now we need to create a new container and start it up in the background. We need a couple of extra options so that systemd can run as PID1 in a non-privileged container. The SYS_PTRACE capability seems to be needed for NoMachine to start up its embedded X server and is useful for debugging, but be careful - if your kernel is older than version 4.8 then programs can use ptrace to bypass seccomp filters. SYS_RESOURCE is useful and relatively harmless, and stops sudo from complaining at us when it tries to call setrlimit. Setting --shm-size is required to prevent Chrome from running out of memory, because it makes use of /dev/shm which is only 64MB by default. We restate the /home/$username volume mount here so that Docker gives it a name and will reuse the same volume across docker run invocations; if you want to start completely fresh, remove the -v devvol:/home/brian option or run docker volume rm devvol.

docker run -d -v devvol:/home/brian -p 2222:22 -p 4000:4000 --name dev --hostname dev --cap-add=SYS_PTRACE --cap-add=SYS_RESOURCE --shm-size=1g --restart always --tmpfs /tmp:exec --tmpfs /run -v /sys/fs/cgroup:/sys/fs/cgroup:ro dev

For now we can start a new root shell in the container so that we can set a password for our user. It’s a good idea to do this interactively so that your password isn’t recorded into docker history.

docker exec -it dev bash
passwd brian

I had one additional step, which was to update my firewall configuration on the host server to permit IPv4 packets to pass through to the container via the Docker bridge network device.

sudo -i
cat > /etc/ufw/applications.d/nxserver <<-"ENDOFMESSAGE"
	title=NX protocol server
	description=Part of NoMachine, a remote desktop solution.
ufw app update nxserver
ufw allow from to app nxserver
ufw limit log proto tcp from to port 2222

If you’re a Chrome user, running it in a container is a bit tricky because it uses user namespaces for sandboxing. This feature is disabled by default in Debian for security reasons so I had to run this in order to enable it on the host kernel (note that this specific sysctl knob will not work in other distros):

echo 'kernel.unprivileged_userns_clone=1' > /etc/sysctl.d/00-local-userns.conf
service procps restart

The other thing you’ll need in order to run Chrome in your container is Jess Frazelle’s Chrome seccomp profile. Add --security-opt seccomp=/path/to/chrome.json to your docker run command in order to apply it to your container. Because this will replace the default seccomp profile completely, it can actually be more restrictive than the default profile in some ways. In particular, syscalls needed to use strace are allowed by the default profile but not by Jess’s Chrome profile. Her profile also doesn’t allow statx, which can break Qt apps. I created a merged profile that includes both the syscalls needed to run Chrome and the syscalls granted by the default profile.


In order to remote into your server, download and install the freeware NoMachine Enterprise Client. Create a new connection, selecting the NX protocol (the freeware version of the server software doesn’t support SSH) and entering the relevant hostname or IP address. You can select “Use UDP communication for multimedia data” if you change the port to 4000.

The Dockerfile copies your server’s .ssh/authorized_keys into ~/.nx/config/authorized.crt but I was unable to get key-based authentication to work (connections fail if EnableNXClientAuthentication is 1 and this guy’s fix doesn’t work for me) so password-based authentication is left enabled. Use a strong password if you expose this to the internet!

Docker tips

This section is a collection of pointers for Docker beginners to help with common tasks that you might want to perform with your container.

NoMachine screenshot

Who creates money?

economics investing 24 July 2018

Inflation has been around since antiquity. The Roman Empire was particularly known for debasing its currency (by minting their coins with increasingly impure precious metals) to cover state expenditures, especially in times of war. The increasing nominal money supply and pressure from imports paid for with silver resulted in steady inflation of prices and wages.

The invention of paper money made it easy for governments, particularly those facing a balance of payments crisis, to monetize their debt and get caught in a vicious cycle of inflation. This occurred most famously in the Weimar Republic (prices doubling about every two days) and in Zimbabwe (prices doubling about every day). The most extreme example in recorded history was the Hungarian pengő (prices doubling about every 15 hours): “…by the time the pengo was replaced by the forint in August 1946 the dollar value of all Hungarian banknotes in circulation was just one-thousandth of one [US] cent.”

Some small level of inflation is seen by present-day central banks as desirable for various reasons. Most obviously, it discourages hoarding of capital and therefore fuels economic growth. The ability to freely print money also gives central banks a greater level of control over the economy through setting monetary policy. The Federal Open Market Committee (FOMC), which sets monetary policy in the United States, has a varying inflation target but it’s generally around 2% per annum.

The equation of exchange states:

where M represents the supply of money, V represents the velocity of money, P represents prices, and Q represents the real value of transactions. This equation relates the “real” economic factors V and Q with the “nominal” factors M and P particular to the monetary system. The monetarist school of thought holds that because V and Q are determined by the “real economy” they are out of our direct control, so inflation in prices P should be managed by controlling the size of the money supply M. This is by no means an uncontentious assertion, but the preference for monetary rather than fiscal intervention is deeply ingrained into today’s economic orthodoxy so it will serve as a useful model for us. So then how does the money supply expand under our current system, and what controls are in place to control this expansion?

The conventional view is that banks take money from depositors (in exchange for safety and account services) and lend some of that money out to borrowers (in exchange for interest). The original depositors still have their money on paper (as long as not too many people try to withdraw their money at once), and the borrowers now also have money to spend. The M1 money supply has therefore increased. Some of this borrowed money is spent and deposited back into the banking system, where it can be loaned out again. The main obstacle said to prevent this lending cycle from going on forever and producing infinite money is the reserve requirement imposed on banks by regulators to ensure that sufficient reserves (say, 10% of deposits) are on hand at all times to satisfy withdrawals of demand deposits. The inverse of the reserve requirement is called the money multiplier; a 10x money multiplier means that a deposit of ten dollars results in nine of them being lent out to borrowers, with one dollar held in reserve. Supposedly, by adjusting reserve requirements a central bank can control how quickly the money supply grows.

Numerous economists and financial commentators have pointed out that this conventional textbook view bears little resemblance to how banks lend money in the real world. Banks do not and can not lend out their reserves to anyone except for other banks and a few other types of entities involved in short-term money markets. Instead, when a commercial loan is disbursed, a new asset (the value of the loan to the bank) and a new deposit liability (the money placed in the borrower’s account) are added ex nihilo to the bank’s balance sheet. This newly-created “commercial bank money” does require corresponding “reserve money” (physical bank notes in the vault or reserves on deposit with the central bank) in case the borrower wants to withdraw cash, but any solvent bank has no difficulty in acquiring sufficient reserves by borrowing them from other banks or, in an emergency, the lender of last resort. Reserve requirements mostly serve to ensure that sufficient liquidity is present to settle transactions between banks at the end of each business day. Reserve requirements play little role in limiting commercial lending and the Federal Reserve seldom changes reserve requirements to effect changes in monetary policy. Some countries, including Canada, the United Kingdom, New Zealand, Australia, Sweden, and Hong Kong have abolished traditional reserve requirements altogether. Post-Basel III, large banks are required to hold sufficient so-called high-quality liquid assets to cover 30 days of outflows (the liquidity coverage ratio) but, again, this is meant to ensure sufficient liquidity to settle flows between banks, not to limit commercial lending. The factor primarily responsible for limiting excessive lending is not a bank’s reserve requirements, but a bank’s capital requirements: a minimum ratio of equity to debt meant to ensure that the bank can remain solvent if some of its loans fail. The equity of a commercial bank is at best only distantly connected to the strength of its reserves.

Given adequate capitalization, a commercial bank can create new money as it pleases, and the money market will ultimately provide it with whatever reserves it needs to meet its reserve requirements. So then how does the central bank control the money supply in order to hit its inflation target? It has to incentivize or disincentivize lending. It turns out to be very difficult to anticipate and control the supply and demand of money, so in modern times the Fed instead tries to control the interest rates of loans throughout the banking system. In a low interest rate environment, money for investment is easily obtained and the economy “heats up,” increasing prices. By raising interest rates, the central bank makes loans less attractive for borrowers and the economy “cools down,” decreasing prices (or increasing them more slowly than before). This is called the monetary transmission mechanism.

In order to influence commercial lending rates, the Federal Reserve manipulates the federal funds rate, the average interest rate that banks charge each other for overnight loans to cover their reserve requirements.

Fed funds rate graph

This rate perhaps doesn’t seem prima facie particularly significant, but the short maturity of the loan (overnight) and extreme creditworthiness of the borrowers (banks) makes the federal funds rate an essential indicator for the theoretical interest rate of an uncollateralized loan with close to zero financial risk. This risk free interest rate is used throughout the financial system as a base for determining rates on all kinds of debt instruments. For example, the prime rate used for home loans, student loans, credit cards, and other lines of credit runs about 3% above the federal funds rate. The Fed has the most direct impact on short-term, low-risk interest rates because reserve lending competes most directly with short-term, low-risk debt. As you go farther out on the yield curve (or look at higher-yield debt), the Fed’s impact is more muted. Would-be arbitrageurs will tend to narrow the spread between the fed funds rate and other forms of debt, but they have to assume some risk to do so, so there will be a premium involved which depends on the macroeconomic outlook of the market. Thus, in practice the Fed has a difficult time controlling long-term rates given the tools they have.

So how does the Fed manipulate the fed funds rate? Individual banks determine amongst themselves the rates that they charge each other for overnight loans, but the Fed can constrain and influence these rates. It sets a ceiling on the federal funds rate by offering at all times to lend to member banks at the so-called discount rate, which is set higher than the target federal funds rate, in exchange for suitable collateral. Banks with excess reserves can’t charge other banks much more than the discount rate because they’re competing with the discount window at the Fed. Similarly, the Fed sets a floor on the federal funds rate by paying interest on excess reserves. Banks typically won’t loan reserves to each other at less than the IOER rate because they would make more money by simply depositing the reserves with the Fed. The Fed also operates a repo facility where the Fed can take out or provide what are essentially collateralized loans, typically just for the night. The rates on these loans serve to constrain the fed funds rate just like the discount rate and IOER rate, albeit only up to a limited volume determined daily by the Fed.

Within these bounds, the Fed has various tools for pressuring the federal funds rate in one direction or the other, though it mainly engages in open market operations, wherein the Fed trades securities (especially short-term Treasury debt) with its member banks in exchange for reserves, at market prices. Banks with excess or insufficient reserves are often happy to take such a deal, particularly with Treasury bills because they compete directly with reserve lending in the short-term risk-free money market. When the Fed buys Treasurys from the banks in exchange for reserves, the amount of excess reserves in the system goes up, driving down the cost of borrowing reserves. Conveniently, buying large quantities of Treasurys also creates a scarcity in the market for government debt, allowing the Department of the Treasury to get away with offering lower rates on new issues of debt. Conversely, when the Fed sells Treasurys to banks in exchange for reserves, the amount of excess reserves in the system decreases and the cost of borrowing reserves goes up. The extra Treasurys on the market make it harder for the government to sell new debt, and they have to raise rates on new debt to attract buyers.

So, to summarize, who creates money? The central bank creates some money to fund government operations, manage financial crises, and ensure sufficient reserves. But mostly commercial banks do, creating it out of thin air on their balance sheets. Curbing this ability was the focus of the failed sovereign money movement in Switzerland earlier this year. The ability of commercial banks to lend in the United States is primarily limited by capital requirements to ensure that lenders don’t go bust when borrowers default. But there’s a lot of wiggle room within that basic safety limit. If the Fed wants to speed up or slow down commercial money creation to manage inflation, they have to incentivize lenders by manipulating interest rates.

If you’re interested in learning more, this Creative Commons textbook has more details. There are also numerous reports on monetary policy at If you’re interested in the mechanics of the hyperinflation mentioned at the beginning of this post, I highly recommend picking up the fantastic and highly accessible Big Debt Crises by Ray Dalio of Bridgewater Associates fame. (Also available for free as a PDF.)

Developer interview questions

math programming 16 July 2016

The following is my attempt to write down some of the stuff that I’ve read and thought about when trying to come up with interview questions to ask software engineer applicants. Some of these questions are pretty hard and inappropriate for most interviews. You may nevertheless find it interesting to challenge yourself with them.

Prompt: Say you’re working for a shipping operation that moves bulk commodities across the ocean. You can select from an exchange which goods you want to accept for delivery on a given ship. Each lot on the exchange is for a certain amount of money to ship a certain number of cubic feet of a given good. How would you decide what lots to carry on a ship with a known displacement and volume capacity?

Answer: If you only had to consider volume capacity then you might be tempted to take the lots which yield the most money per cubic foot, in order, until your ship is full. Even in this simplified case, that wouldn’t be an optimal solution. You might greedily take a single lot which is slightly more profitable than the others per cubic foot, but leaves 4% of your ship empty, which you cannot fill since the other lots are for at least 5% of your ship. It could be more profitable overall to accept smaller lots so that your ship can be packed more completely. This sounds like a bin-packing problem, so we know at this point that our algorithm is probably going to be exponential in the worst case.

The prompt mentions displacement, which adds another constraint to our choice of lots. It might be very profitable to ship gold bullion, but if you filled your entire ship with gold then it would sink! This is an elementary integer programming problem. Your variables are booleans signifying whether or not to accept each lot on the exchange. Your objective function is the total money made by shipping the lots that you accept. You are constrained by the total weight and volume of those lots. And once you’ve found the optimal shipment you should make sure that it’s actually profitable given your costs!

The literature on solving integer programming problems is rich and there are many sophisticated optimizations to be made. It should be sufficient to pull an existing library off the shelf once you have the problem set up.

Note that it’s probably infeasible to solve the very large systems that you could construct if there are many available lots on the exchange. Integer programming has no known polynomial-time solution. To improve performance you can relax your boolean constraints to continuous ones and consider the problem as a linear programming problem, rounding your answer for each variable. This wouldn’t give a perfectly optimal solution but it could be solved in polynomial time. You could also eliminate some lots from consideration due to other lots being obviously more profitable.

Prompt: Given a number’s prime factorization, how many distinct divisors does that number have? Why do only perfect squares have an odd number of distinct divisors?

Answer: To count the number of distinct divisors, consider a number’s prime factorization:

We know that every divisor of can be written uniquely as:

In other words, we have choices (zero through ) for the first factor’s exponent, choices for the second factor’s exponent, and so on. The total number of divisors is therefore:

To see why only perfect squares have an odd number of divisors, think about a regular old number that’s not a perfect square. You can pair all of its divisors with another divisor . For example, if then there are three pairs of divisors: , , and . Since all of the divisors are paired up, the number of divisors must be even. However, if is a perfect square, then one of the divisors can’t be paired. The other divisors can be paired like before, but “pairs” with itself, making an odd number of distinct divisors.

Prompt: You’re tasked with creating a machine for automatically making change. Your machine should take as an input the amount of change to make (e.g. 16.00). It should output the number of each coin to dispense so that the fewest possible coins are dispensed. Your machine should work with any currency’s coin denominations.

Answer: The change-making algorithm that cashiers in the United States follow is simple. You take as many quarters as you can without going over the desired amount, then you do the same for dimes, then nickels, then pennies. You consider each coin in descending order of value. However, this greedy algorithm doesn’t work in general. If there’s a 6-cent coin, 18 cents should be given as three 6-cent coins, not a dime, a 6-cent coin, and two pennies.

This is a classic dynamic programming problem. The first step in the dynamic programming approach is to find a recurrence. Let’s define to be the number of coins that it takes to make cents. We can define this recursively with a case for each coin. Say the denominations are 1, 4, and 5.

For example, if we have to make change for 55 cents, then we can either 1) make 54 cents and add a 1 cent coin, 2) make 51 cents and add a 4 cent coin, or 3) make 50 cents and add a 5 cent coin. Our base case is that 0 cents can be made with 0 coins.

However, writing this code as a recursive function would be a terrible idea. There’s an enormous amount of overlap in the computation required to evaluate each case of the min. If we tried to make change for just 50 cents, we’d end up evaluating our function 818,598,723 times.

We could cache the result for each x after it’s evaluated and check the cache before recursing (this is called memoization) but it would be better to write the algorithm iteratively.

Say we need to make change for 100 cents. Then we’re going to fill in an array of 100 entries. The first entry will represent how many coins it takes to make 1 cent, the second entry will represent how many coins it takes to make 2 cents, and so on. We can evaluate each entry in sequence using the recurrence and the entries that we’ve already evaluated. After 100 steps we’ll have evaluated how many coins it takes to make 100 cents.

This only gets us the number of coins, not the exact coins themselves. We can rectify this by adding a corresponding “last coin used” entry for each entry in the array. The “last coin used” entry for x will store which case in the recurrence “won” for that x. Now once we reach 100 cents, we can backtrack by repeatedly subtracting the value of the last coin used. The sequence of last coins used is the set of change that the machine should dispense. This technique is often useful in dynamic programming algorithms.

Prompt: An inversion is a pair of two elements in a permutation (the integers arranged in some order) which are “out of order” with respect to each other; that is, . For example, the permutation has three inversions: , , and . Given a permutation, determine how many inversions it contains.

Answer: The most obvious solution is to try every possible pair of distinct elements using a nested loop, incrementing a counter each time you find an inversion. This would run in quadratic time. Interestingly, you can also perform a bubble sort and count the number of swaps you must perform– that number is exactly the number of inversions. Again, that’s quadratic time.

The key insight is that this problem can be solved with the divide-and-conquer technique. In this technique you divide the problem into parts, solve each part recursively, and then do some work to combine the answers into an answer for the whole problem. In this case, break the permutation into halves. Assume that you can recusively obtain the number of inversions in the left half and in the right half separately, and assume that you have sorted both halves. For the recursion to work, the output you need to produce from these values is the number of inversions in the whole permutation combined, and the whole sorted permutation.

You can accomplish this easily by modifying merge sort, another divide-and-conquer algorithm. Merge sort works by breaking the input list into halves, recursively sorting each half, and then merging them by repeatedly taking the first element from one or the other sorted halves, depending on which element is smaller. We can use this exact algorithm for counting inversions, except that the merge operation needs to additionally maintain an inversion counter which is incremented by the remaining size of the left half any time an element from the right half is merged. Like mergesort, this has a running time.

Prompt: Arbitrageurs make money by buying an asset and then immediately selling it in a different market to profit from a difference in price. One example of this is in the foreign exchange (forex) markets. Each currency pair (e.g. Euro / USD) has an exchange rate. If you can find a way to exchange your money between currency pairs such that you’re able to return to your original currency with more money than you started with, you’ve found an arbitrage opportunity, which means risk-free profit! Given a list of currency pairs and their exchange rates, how would you determine whether there’s an arbitrage opportunity?

Answer: The structure of this problem is begging for it to be turned into a graph. You can represent currencies as nodes and traded currency pairs as edges, each with a weight corresponding to the exchange rate between its incident currencies. An arbitrage opportunity would then look like a cycle in the graph where the product of the edge weights is greater than one.

It’s not obvious how to find such a cycle efficiently. We might like to take advantage of the existing corpus of graph algorithms for cracking this problem, but generally paths through a weighted graph have a length which is the sum of the constituent edge weights. In this case, we’re applying exchange rates multiplicatively so we want a path through the graph to have a length of the product of the constituent edge weights.

We can translate the product into a sum by taking the log of both sides:

So now we’ve formulated an equivalent problem: finding a positive cycle in the graph where the edge weights are the log of the exchange rates. However, positive cycles aren’t as easy to find as negative cycles. Multiply both sides by :

Now the problem is to find a negative cycle in the graph where the edge weights are the negative log of the exchange rates. The Bellman-Ford algorithm can find negative cycles in a graph.

The recurrence for Bellman-Ford defines as the length of the shortest path from the starting node to , using at most edges. This can be defined recursively:

Finding for all nodes would take time if you build a table from the recurrence. Standard dynamic programming optimization techniques can bring this down. You can also use an adjacency list to only consider incident nodes in the , rather than all nodes. In the end, Bellman-Ford can be optimized down to :

repeat |V|-1 times:
	for each edge (j, k)
		B[k] <- min(B[k], B[j] + w[j, k])

Now that we have the shortest distance from our starting node (our starting currency) to every other node, we can check for negative cycles. If there’s a negative cycle, then there’s going to be no true shortest path to all nodes: for the nodes reachable from the negative cycle, you can go through the cycle as many times as you want to make the shortest distance as low as you want. Since we only iterated the B-F algorithm times, those very meandering shortest paths won’t be accounted for. We can detect negative cycles by checking each edge and the “shortest” path lengths to its incident vertices:

for each edge (j, k)
	if (B[j] + w[j, k] < B[k])
		There is a negative cycle through (j, k)

Note: This is a bad interview question. Please don’t ask it. It’s so tricky in an interview setting that it basically only selects for people who’ve seen the problem before.

Prompt: How would you reduce lock contention in multithreaded code?

Answer: One technique that often helps to reduce contention is to use separate reader-writer locks. You can also introduce finer-grained locking so that you obtain a lock on only part of your data structure from each thread. Finally, you might want to try lock-free programming techniques (e.g. a compare-and-swap loop).

Prompt: How do you deal with deadlock?

Answer: A good practice for avoiding deadlock within an application is to impose a total ordering on locks and always acquire them in the same order in different places in your code. When potential deadlock is unavoidable, you can keep track of a wait graph and abort one of your processes/transactions if there’s a cycle (but be careful of livelock). Two-phase locking: acquire all of the locks you need before mutating the protected resource so that you can easily roll back by releasing the locks if there’s an issue acquiring all of the locks you need.

Prompt: What’s the difference between covariant and contravariant type parameters? When would you use each? What are your restrictions when designing a covariant collection class?

Answer: When you define a generic interface, covariant type parameters appear as the types of method arguments. They say “I don’t care what type you give me, as long as it extends T so that I can treat it as a T.” These are useful when you’re defining consumers, like Comparators. Contravariant type parameters, on the other hand, appear as method return types and say: “I don’t care what type you expect, as long as it’s a superclass of T, because I’m going to give you a T.” These are useful when you’re defining producers, like Iterators. Effective Java puts it succinctly: “producers extend, consumers super” or “PECS.”

A problem occurs when you want to consume an object of a covariant type, and then turn around and produce an object of the same type contravariantly. Languages like C# and Scala will actually prevent you from doing this at compile time, and with good reason. It turns out that allowing code like that to compile will break the type system. For example, in Java, this issue frequently confounds beginners:

List<List<? extends Number>> a = null;
List<List<Integer>> b = null;
a = b; // Does NOT compile.

List<? extends List<? extends Number>> a = null;
List<List<Integer>> b = null;
a = b; // You have to do this instead.

Here’s how the type system would break down if that were not the case:

List<List<? extends Number>> a = null;
List<List<Integer>> b = new ArrayList<>();
a = b; // This won't compile, but assume it does.

Integer c = b.get(0).get(0);
// Now c is an Integer containing 2.718

(That’s not to say that you can’t have both covariant and contravariant type parameters on the same class, or even on the same method. For example, a Function class with an apply method should have a covariant argument type and a contravariant return type, but – critically – they’re not the same type.)

trait Function1[-T, +R] {
    def apply(arg1: T): R

Since you can’t have the same type parameter in both a covariant argument position and a contravariant return position, it turns out to be impossible to safely implement a mutable collection with a variant type parameter. Indeed, the mutable collections in Scala all have invariant type parameters. The immutable collections, however, have covariant type parameters, because of the “one weird trick” that immutable collections have up their sleeve. Here’s how Scala defines the immutable List interface:

package scala.collection.immutable
class List[+A] {
    def head: A
    def tail: List[A]
    def prepend[B >: A] (x: B): List[B]

The list produces values of type A through its head method, so A is a contravariant type parameter. However, when it consumes values through its prepend method, it can accept anything which is a superclass of A. This would appear to violate our rule, but it doesn’t– because prepend returns a whole new collection of the new, less specific type. Immutable collections will always return a new instance rather than modifying state within themselves, which is the key to achieving both covariance and contravariance.

Covariance and contravariance rules in Java

programming 05 September 2014

This post is a go-to quick reference for exactly how covariance and contravariance work in Java. Different programming languages do this in different ways, so sometimes it can be tricky to switch between languages and keep the rules straight. When you switch over to Java, use this guide to refresh your memory.

Type conversion

Type conversion in Java is covariant (unlike Dart). That means that if SubClazz is a subtype of Clazz then a SubClazz reference can be cast to a Clazz.

public class Clazz { }
public class SubClazz extends Clazz { }
(Clazz)new SubClazz(); // OK
(SubClazz)new Clazz(); // Error

Conversion can occur implictly during assignment:

Clazz instance = new SubClazz();

Conversion can also occur implicitly when returning from a method or when passing arguments.

public Clazz makeClazz() {
    return new SubClazz();

public Clazz takeClazz(Clazz foo) { }
takeClazz(new SubClazz());


Arrays in Java are covariant in the type of the objects they hold. In other words, Clazz[] can hold SubClazz objects.

Clazz[] array = new Clazz[10];
array[0] = new SubClazz();

They are also covariant in the type of the array itself. You can directly assign a SubClazz[] type to a Clazz[].

Clazz[] array = new SubClazz[10];

Be careful though; the above line is dangerous. Although the type of the array variable is Clazz[], the actual array object on the heap is a SubClazz[]. For that reason, the following code compiles fine but throws a java.lang.ArrayStoreException at runtime:

Clazz[] array = new SubClazz[10];
array[0] = new Clazz();

Overriding methods

The overriding method is covariant in the return type and invariant in the argument types. That means that the return type of the overriding method can be a subclass of the return type of the overridden method, but the argument types must match exactly.

public interface Parent {
    public Clazz act(Clazz argument);

public interface Child extends Parent {
    public SubClazz act(Clazz argument);

If the argument types aren’t identical in the subclass then the method will be overloaded instead of overridden. You should always use the @Override annotation to ensure that this doesn’t happen accidentally.


Unless bounds are involved, generic types are invariant with respect to the parameterized type. So you can’t do covariant ArrayLists like this:

ArrayList<Clazz> ary = new ArrayList<SubClazz>(); // Error!

The normal rules apply to the type being parameterized:

List<Clazz> list = new ArrayList<Clazz>();

Unbounded wildcards allow assignment with any type parameter:

List<?> list = new ArrayList<Clazz>();

Bounded wildcards affect assignment like you might expect:

List<? extends Clazz> list = new ArrayList<SubClazz>();
List<? super Clazz> list2 = new ArrayList<Object>();

Java is smart enough that more restrictive type bounds are commensurable with less restrictive type bounds when appropriate:

List<? super Clazz> clazzList;
List<? super SubClazz> subClazzList;
subClazzList = clazzList;

Type parameter bounds work the same way, although they cannot be lower-bounded. If you have multiple upper bounds on a type parameter, you can upcast to any of them, as expected:

interface A {}
interface B {}
interface C extends A, B {}
public class Holder<T extends A & B> {
    T member;
A member1 = new Holder<C>().member;
B member2 = new Holder<C>().member;
C member3 = new Holder<C>().member;

You can add or remove the type parameters from the return type of an overriding method and it will still compile:

public interface Parent {
    public List echo();

public interface Child extends Parent {
    public List<String> echo();
public interface Parent {
    public List<String> echo();

public interface Child extends Parent {
    public List echo();

Wildcards can be present in the types of method arguments. If you want to override a method with a wildcard-typed argument, the overriding method must have an identical type parameter. You cannot be “more specific” with the overriding method:

public interface Parent {
    public void act(List<? extends List> a);

public interface Child extends Parent {
    public void act(List<? extends ArrayList> a); // Error!

Also, you can replace any type-parameterized method argument with a non-type-parameterized method argument in the subclass and it will still be considered an override:

public interface Parent {
    public void act(List<? extends Number> a);

public interface Child extends Parent {
    public void act(List a);

The skyline problem

programming 31 August 2014

You are given a set of rectangles in no particular order. They have varying widths and heights, but their bottom edges are collinear, so that they look like buildings on a skyline. For each rectangle, you’re given the x position of the left edge, the x position of the right edge, and the height. Your task is to draw an outline around the set of rectangles so that you can see what the skyline would look like when silhouetted at night.

How shall we proceed? If you’re drawing a blank, it’s always good to get a really awful solution on the table right away so that you have something to think about and improve upon.

The first thing I thought of was to construct a 1-dimensional heightmap. The idea is to create an array of height values and write each rectangle onto it. Without worrying about the details of mapping rectangle coordinates to pixel array indicies, the code will look something like this:

for each rectangle r:
    for each heightmap cell c starting at r.left and ending at r.right:
        c gets the max of r.height and the previous value of c

OK, so this works as a first attempt at a solution. What, specifically, is wrong with it?

You can see from the animation that the skyline constructed from the heightmap isn’t quite correct. The edges of the rectangles don’t line up perfectly with the array cells, so there is a small amount of error in the shape of the skyline. In fact, it is easily shown that the only way to guarantee zero error is to use a heightmap with the same resolution as the final rendered image. This means that the running time of your algorithm depends not only on the number of given rectangles, but also on the resolution of the output image.

Of course, unless you’re using a vector display, it’s inevitable that at some point you will have code looping on the order of the resolution of the output image, just to draw the line segments one pixel at a time. I’m inclined to not worry about this cost. If you’re writing code in Perl, for example, your concern is to do as little as possible in Perl and offload as much work as possible to the drawing library, which is likely compiled and heavily optimized. Only when the line-drawing code isn’t much faster than your application logic does it start to make sense to use a raster approach like a heightmap.

What now? If the chief weakness of the heightmap approach is the sheer number of points to deal with in application code, maybe we can reduce the number of points in play. Now that we think about it, the skyline is made up of horizontal line segments, interrupted in only a few places. In fact, the only time the skyline can change its y position is at the left or right side of a rectangle. It’s clear now that if we find the height of the skyline at each of these “critical points” on the x axis then we will have completely determined the shape of the skyline. At each critical point, you just go up or down to the new height and draw a line segment to the right until you reach the next critical point.

So how do we find the true height of the skyline at each of these critical points? Well, we already have this heightmap approach, so let’s try doing something similar. This time, instead of printing the rectangles onto a heightmap array with an entry for each pixel, let’s print the rectangles onto an array with an entry for each critical point! This will eliminate the problem of dealing with too many points, because we’re now dealing with the minimum number of points necessary to determine the skyline.

Here’s a first stab at what the code would look like:

for each rectangle r:
    for each critical point c:
        if c.x >= r.left && c.x < r.right:
            c.y gets the max of r.height and the previous value of c.y

Looks good! So now we have a working solution to exactly the problem we were trying to solve, with no error. Can we achieve a better running time? It occurs to us that we don’t really need to look at every critical point when printing a rectangle, but rather only those critical points below the rectangle in question.

for each rectangle r:
    for each critical point c below r (except the one at r.right):
        c.y gets the max of r.height and the previous value of c.y

This optimization depends, of course, on being able to efficiently find which critical points are subtended by each rectangle. This is easily done by sorting the critical points. For example, if we want to find the critical points subtended by the magenta rectangle, we start at the left side of the magenta rectangle and scan to the right, accumulating critical points until we reach the right side.

Unfortunately, this isn’t an asymptotic improvement in the worst case. It’s still given something like the following configuration:

The worst case.

At this point perhaps no ideas jump out at you about how to improve the algorithm’s performance further. Let’s try perturbing the solution we have in order to see what might present itself. What if, instead of looking at each critical point for each rectangle, we look at each rectangle for each critical point? This is the same code as before, with the loops switched:

for each critical point c:
    for each rectangle r:
        if c.x >= r.left && c.x < r.right:
            c.y gets the max of r.height and the previous value of c.y

Again, we don’t really need to consider all rectangles, only the ones above the critical point in question:

for each critical point c:
    for each rectangle r above c (not including the right edge of rectangles):
        c.y gets the max of r.height and the previous value of c.y

So, given a critical point, how do we efficiently find all of the rectangles above it? This requires a different strategy than before. Before we turned the problem inside out, we needed to find all of the critical points between the left and right sides of the given rectangle. Now, we need to find all of the rectangles with a left edge to the left of the given critical point and a right edge to the right of the given critical point.

What if we begin at the critical point and go left looking for left edges, and also go right looking for right edges, and then intersect the two sets of rectangles? That would work, but, again, it’s in total to do this for every critical point.

A better approach is to simply scan across the skyline’s sorted critical points from left to right, keeping track of an active set of rectangles as you go. When you reach a critical point, the active set is updated and then the critical point gets assigned a copy of the current active set of rectangles. By the end of the pass, each critical point will know about all of the rectangles above it.

Now that we’re able to scan through the critical points and consider only the “active” set of rectangles at each critical point, an interesting opportunity presents itself. Our current solution can be written as:

for each critical point c
    c.y gets the height of the tallest rectangle over c

This is no longer obviously . If we can somehow calculate the height of the tallest rectangle over c in faster than time, we have beaten our algorithm. Fortunately, we know about a data structure which can keep track of an active set of integer-keyed objects and return the highest one in time: a (max) heap.

Our final solution, then, in time, is as follows. First, sort the critical points. Then scan across the critical points from left to right. When we encounter the left edge of a rectangle, we add that rectangle to the heap with its height as the key. When we encounter the right edge of a rectangle, we remove that rectangle from the heap. (This requires keeping external pointers into the heap.) Finally, any time we encounter a critical point, after updating the heap we set the height of that critical point to the value peeked from the top of the heap.

Update (Feb 2019): Ivan Malison points out that you don’t actually need to keep external pointers into the heap. Instead, as you scan, when you hit the left edge of a building you add it to the heap, and when you hit the right edge of a building you pop nodes off the top of the heap repeatedly until the top node is a building whose right edge is still ahead. With this strategy, your heap may contain buildings which have already ended, but it doesn’t matter because you’ll discard them as soon as they’re at the top of the heap.

Square roots and fixed points

math programming 02 June 2014

There’s an old HN comment in which a developer whines bitterly and eloquently about his experience being stumped by a common interview question: “write an algorithm for computing the square root of a number.”

I tried a couple more times to answer this question that is completely unrelated to the job I’m interviewing for. Finally, I get so fed up with this moron that my internal ticker clicks over and I realize, “Even if I get this job, I’m going to be dealing with these kinds of nazel-gazing engineers every single day. Not an environment I want to be in.”


The most obvious solution is binary search, but that’s not very interesting. How else might we solve the problem of calculating ? Let’s first introduce the concept of a fixed point. A fixed point is a value which is unchanged by the function - that is, . For example, a fixed point of the sine function is 0 because . A fixed point of the cosine function is located around 0.739085133 because . In fact, if we plot the cosine function on top of then we can see that they intersect at exactly that point:

Plot 1

One interesting fact which might be of use is that is a fixed point of the function :

We can therefore graph to find its fixed point . Here’s a plot for various values of a:

Plot 2

By subtracting x from the function it becomes a root finding problem:

Plot 3

This we can solve with Newton’s method:

Of course, we could have just used Newton’s method from the beginning. If we’re trying to calculate then the value we’re looking for is a zero of the function . In this case we have:

These are different functions but they have the same fixed point. This is a plot of for the two different functions (with a=7) showing that they have the same zeros:

Plot 4

Newton’s method corresponds closely to a concept called fixed point iteration. In fixed point iteration, we repeatedly evaluate a function and feed its output back into its input. Eventually we hope it will converge on a fixed point. We mentioned the cosine function earlier; this is one such function which always converges on a particular fixed point. Enter any value into a scientific calculator and repeatedly press the cosine button:

-1.0000, 0.5403, 0.8575, 0.6542, 0.7934, 0.7013, 0.7639, 0.7221, 0.7504, 0.7314, 0.7442, 0.7356

It will eventually converge to a value near 0.7391:

Plot 5

Another function which converges to a fixed point is the expression given by Newton’s method applied to the function , which we derived before:

If we rearrange the right side we obtain:

This is called the Babylonian method, or Heron’s method, and it was actually discovered long before Newton’s method. The idea was that if is an overestimate to then is an underestimate, and you can average them to get a better approximation. Repeated iteration improves this approximation until it converges on , which we know is a fixed point.

Remember that before we said is a fixed point of the function . Unfortunately if you iterate that function, you will not approach . Fixed point iteration doesn’t always work and this is one such case. The math behind being able to tell whether an arbitrary function will converge to a fixed point under fixed point iteration is complicated.

Now it would be very simple to wrap the Babylonian method in a loop and perform a couple steps of fixed point iteration to get a decent sqrt(a). But since we’re finding a fixed point, this seems like a nice time to break out something called a fixed point combinator. The best-known fixed point combinator is the Y combinator. You’ve probably heard of it due to the eponymous startup incubator founded by Lisp greybeard Paul Graham, of the famous Paul Graham essays.

This is the lambda calculus definition of the Y combinator, due to Haskell Curry:

The reason y is called a fixed-point combinator is because of what happens when you apply it to a function and reduce. By following the lambda calculus reduction rules you can find that the Y combinator satisfies the equation . This matches the form - the definition of a fixed point. So, is a fixed point of ! Therefore all we have to do is apply the Y combinator to obtain a fixed point of , right?

Well, not really. takes and returns a function so the fixed point of isn’t a number at all, it’s the function which maps to . In other words, all the Y combinator is doing is facilitating fixed-point iteration:

It appears that this expansion will continue forever and never terminate. But we can build a termination condition into the function so that it stops expanding. Let’s see how that would work with the sqrt example. Our could look like this in JavaScript:

function step (callback) {
    return function (originalValue, approxSqrt) {
        // Babylonian method formula to improve our approximation of the square root 
        // of originalValue.
        var improvedApproxSqrt = (approxSqrt + (originalValue / approxSqrt)) / 2;

        // How far off the mark we are.
        var discrepancy = Math.abs(originalValue - 
                                (improvedApproxSqrt * improvedApproxSqrt));

        // Termination condition
        if(discrepancy < 0.00001) {
            return improvedApproxSqrt;

        return callback(originalValue, improvedApproxSqrt);

This looks a lot like recursion, except that we’ve never referred to an free variable name. This is called anonymous recursion, and it’s useful in systems (notably lambda calculus) where functions cannot refer to themselves by name.

Now we need the magic which will repeatedly invoke step and feed its output back into its input: the Y combinator. How would it look in JavaScript? Here’s the Y combinator in lambda calculus again:

This corresponds pretty directly to a JavaScript function:

function Y(f) {
    return (function (x) {
        return f(x(x));
    })(function (x) {
        return f(x(x));

Unfortunately, when we try to use this version of the Y combinator, we get a stack overflow. If you trace out the execution you’ll see that x(x) must be evaluated in order to get a final return value for Y, and this causes infinite recursion. The reason this works at all in lambda calculus is that lambda calculus is call by name so is evaluated by expanding the definition of and passing that function to f. JavaScript, on the other hand, is call by value, so the x function is actually evaluated with x as an argument.

If we η-reduce the Y combinator then we obtain an alternate fixed-point combinator, called the Z combinator, which contains an extra layer of indirection and prevents runaway recursion:

Here’s the JavaScript code corresponding to the Z combinator:

function Z(f) {
    return (function (x) { 
        return f(function (v) { return x(x)(v); }); 
    })(function (x) {
        return f(function (v) { return x(x)(v); });

We still have a bit of a problem: our step function takes two variables, while Z only calls it with one. So let’s modify Z:

function Z(f) {
    return (function (x) { 
        return f(function (v1, v2) { return x(x)(v1, v2); }); 
    })(function (x) {
        return f(function (v1, v2) { return x(x)(v1, v2); });

Now we can see it working:

> var sqrt = Z(step);
> sqrt(2, 1); // sqrt of 2, with a starting estimate of 1

It’s rather awkward to always have to provide a starting estimate, so we can add a wrapper which always guesses 1 to start:

> function sqrt (num) { return Z(step)(num, 1); }
> sqrt(2)

MagicalTux's EVE Online pathfinder

programming 05 March 2014

The personal blog of Mark Karpeles, aka MagicalTux, CEO of the recently-bankrupt Bitcoin exchange Mt. Gox, received attention on Hacker News recently. Given that Gox’s spectacular $473 million downfall was supposedly caused by a bug in Karpeles’s custom implementation of the Bitcoin protocol, people were understandably interested in checking the quality of his public source code.

What we saw was not reassuring. In one post, Karpeles describes a custom implementation of SSH2 which he wrote for production use at his web hosting company.

With PHP I could write a fully working SSH server in only 3 days. … My goal when writing this was to provide a replacement for the FTP protocol for the customers of my hosting service.

As I was missing some functions to properly handle the initial Diffie-Hellman key exchange (and later to implement publickey authentication) I had to re-implement those in PHP.

This is horrifying to see from the guy who wrote the server which once handled 70 percent of all Bitcoin trades. If I had seen this post before Mt. Gox’s failure, I would never have deposited my Bitcoin with them. The SSH server source code has since been taken down, but one HN user remembers it:

The code for the sshd does not seem to be there anymore, but from memory: it did not check if the number sent by Bob was 0, 1, or any any other groups that would make it easy solve the discrete logarithm problem. I don’t think it bothered to check the primes either. [1] I think there was also something wrong with the signature checking (padding not checked maybe?).

Altogether it seemed like you could easily MITM connections made to the server, but I don’t think I ever tried. It was a perfect example–to me at least–of why you should not spend a trivial amount of time reading about crypto on Wikipedia and then writing crypto code.

I absolutely agree. The main lesson that you should take away from the crypto class is that you should be afraid of rolling your own crypto code. There are a thousand ways to screw up, and it only takes one mistake for your cryptosystem to fall apart. The recent Apple and GnuTLS vulns show that even the serious players get this wrong.

So best practices were apparently not followed at Mt. Gox. In fact, later it was alleged that developers at Mt. Gox would push changes directly to production, and didn’t even use version control for the site’s source code.

Another of his blog posts is about a tool which he wrote - in PHP of course - to compute routes between star systems in the MMORPG EVE Online. In EVE, solar systems are connected to each other with portals called Stargates. The result is a big graph. Savvy players will try to take the shortest possible path to get from point A to point B by using a shortest-path algorithm to automate navigation.

Systems map in EVE Online

I thought I’d hop on the bandwagon by criticizing his EVE pathfinder. Although I could nitpick on matters of style, I’d rather focus on the core- that is, the all-pairs shortest path solver which is the basis of his algorithm. The general idea is to generate an index which contains, for each system, the next hop to take in order to reach any given system. This uses O(n^2) space but allows efficient pathfinding between any two arbitrary systems in the universe. All of this is perfectly fine so far. The problem is how he goes about constructing this index.

Based on my reading of his code, this appears to be his algorithm:

  1. Inform each system how to reach its adjacent systems.
  2. From each system s, collect the best known paths to systems 1 hop away and advertise them to s’s adjacent systems.
  3. From each system s, collect the best known paths to systems 2 hops away and advertise them to s’s adjacent systems.
  4. From each system s, collect the best known paths to systems 3 hops away and advertise them to s’s adjacent systems.
  5. From each system s, collect the best known paths to systems 4 hops away and advertise them to s’s adjacent systems.

This continues for 256 steps, by which point MagicalTux hopes all systems have been informed of the shortest path to all other systems. Judging by a cusory search of the EVE forums, this appears to be a valid assumption.

Although his solution seems asymptotically optimal (maybe off by a factor of the number of edges?), in real life it will perform poorly compared to a real all-pairs shortest path algorithm. I thought it would be interesting to see how badly I could beat his code’s performance. MagicalTux claims that it takes about 3 hours to construct the index using his PHP code. I’m going to see how much faster I can do it, by doing these things differently:

I downloaded the jump connection data helpfully provided by MagicalTux - the official data comes in an unusable binary MSSQL format - and put it in a CSV file for easy parsing.

My complete EVE pathfinder implementation can be found here. It doesn’t require any extra space (asymptotically) to construct the index, which is nice. This is the core of the code where almost 100% of the running time is spent:

// Floyd's algorithm
for(uint32_t k=0; k<NUM_SYSTEMS; k++) {
    #pragma omp parallel for shared(k, cost, next) schedule(dynamic)
    for(uint32_t i=0; i<NUM_SYSTEMS; i++) {
        if(i == k) {
        for(uint32_t j=0; j<NUM_SYSTEMS; j++) {
            uint32_t prev = cost[(NUM_SYSTEMS * i) + j];
            uint32_t candidate = cost[(NUM_SYSTEMS * i) + k] + cost[(NUM_SYSTEMS * k) + j];
            if(candidate < prev) {
                cost[(NUM_SYSTEMS * i) + j] = candidate;
                next[(NUM_SYSTEMS * i) + j] = next[(NUM_SYSTEMS * i) + k];

I picked two random system IDs (30000029 and 30000050) for a pathfinding demo. My test program constructed the all-pairs index and then used it to find the shortest path between the two systems. You can see the output of the test below:

brian@mint ~/eve $ ./eve 
Indexing... done. Elapsed time 38.99 seconds.
Calculating the quickest route from 30000029 to 30000050... done.
Built a 14 hop route in 0.002 ms.

Note that, indeed, the shortest route between those two systems according to DOTLAN is 14 hops! And the index was built in less than 40 seconds- about 0.4% of the time that it took MagicalTux’s PHP version.