2015 Architecture of Octave Online

I maintain Octave Online, a web UI for the GNU Octave computational engine. The site is enjoyed by tens of thousands of users from all around the world, and well over a million commands have been executed since the beginning of this year. What kind of architecture and infrastructure supports a web site of this scale?

Over the past two years, Octave Online has gone through several different iterations of back-end architecture, with each update improving the performance and decreasing the bounce rate. In this blog post, I'm going to give an overview of the current architecture that makes Octave Online robust and scalable to meet the needs of a growing audience.

Note: The code for Octave Online is in the long-term process of being prepared for open sourcing. When the code is released, I will publish more material on this blog about it.

The Objectives

The very first alpha version of Octave Online had the simplest architecture imaginable: a single cloud instance that would spawn GNU Octave processes on itself and read/write to their standard input/output streams. This extreme example helps illustrate some of the problems that a good back-end architecture needs to solve.

  1. Scalability: The architecture should be able to scale to support any number of simultaneous users.
  2. Performance: The architecture should ensure that the user is able to get fast I/O throughput, from any part of the world.
  3. Reliability: If any piece of the architecture goes offline, the application should pick up where it left off when the piece comes back online again.

Objective 1 means that as traffic on the site increases, I should be able to dynamically increase the number of servers to support the added load. Objective 2 means that the components should be able to interconnect across multiple data centers. Objective 3 means that each component needs to maintain state as best it can during unexpected outages, and also that it can return to its previous state efficiently.

The Architecture

Okay, so with the objectives in mind, let's cut to the chase. The following diagram illustrates the architecture of Octave Online, which I will explain in more detail below.


When an end user connects to Octave Online, they are actually creating a web socket connection to a front server, which is a relatively simple Node.js application. Upon connection, the front server assigns the client an ID and adds that ID to a queue in a Redis cache. Within a short amount of time, one of several back servers will pop the ID from the queue and locally spawn an Octave process. If the connection is from a returning user, the front server looks them up in a Mongo database (not shown), and the back server clones an instance of the user's repository from a Git server (not shown). We're now ready to start running commands.
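The handshake above can be sketched with plain in-memory data structures. In the real system the queue lives in Redis and the servers are separate Node.js processes; everything below (names, the deque, the session table) is a hypothetical stand-in for illustration.

```python
# In-memory sketch of the connection handshake: the front server pushes a new
# session ID onto a shared queue, and whichever back server pops it spawns the
# Octave process. A Redis list would play the role of the deque.

from collections import deque

session_queue = deque()          # stands in for the Redis queue
sessions = {}                    # a back server's local table of processes

def front_server_connect(session_id):
    # Front server enqueues the new client's ID (LPUSH equivalent).
    session_queue.append(session_id)

def back_server_poll():
    # A back server pops an ID (RPOP equivalent) and "spawns" Octave for it.
    if session_queue:
        sid = session_queue.popleft()
        sessions[sid] = "octave-process"   # pretend we spawned a process
        return sid
    return None

front_server_connect("sess42")
print(back_server_poll())        # 'sess42'
print(sessions)                  # {'sess42': 'octave-process'}
```

The point of the indirection is that the front server never needs to know which back server picked up the job.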

When the end user issues a command and sends it over the web socket, the front server will publish that command into a Redis channel identified by the user's session ID. The back server that adopted the corresponding ID will have subscribed to that channel, so it receives the message, and sends the command down to the Octave process. When the Octave process produces output, the back server publishes the output to another Redis channel to which the front server is subscribed, and finally the front server sends the output back to the end user over the web socket connection. The total round-trip time through the stack never stops impressing me.
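The command round trip can be illustrated with a toy pub/sub bus. Again, the real system routes through actual Redis channels between Node.js processes; the `FakePubSub` class and channel names here are invented for the sketch.

```python
# Minimal in-memory model of the session-ID pub/sub routing described above.

class FakePubSub:
    """Stand-in for Redis pub/sub: maps channel name -> list of callbacks."""
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        for cb in self.subscribers.get(channel, []):
            cb(message)

bus = FakePubSub()
outputs = []   # stands in for the web socket back to the browser

def back_server(session_id):
    # Back server listens for commands on the session's input channel and
    # publishes whatever "Octave" produces on the output channel.
    def on_command(cmd):
        result = f"ans = {cmd}"                    # pretend Octave ran it
        bus.publish(f"output:{session_id}", result)
    bus.subscribe(f"input:{session_id}", on_command)

def front_server(session_id):
    # Front server forwards anything on the output channel to the client.
    bus.subscribe(f"output:{session_id}", outputs.append)

back_server("sess42")
front_server("sess42")
bus.publish("input:sess42", "1 + 1")   # the user types a command
print(outputs)                          # ['ans = 1 + 1']
```

Because routing is keyed only by session ID, neither side ever addresses the other by hostname, which is exactly what makes the reconnection story below work.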

When the end user disconnects or reloads their browser window, the front server sends a message over Redis to inform the back server that it can shut down the corresponding Octave process. The back server then kills the Octave process and, if necessary, commits and pushes any changes to the user's repository back to the Git server.

Why the Redis cache?

One of the most common questions I get is, why the Redis cache? Why not have the front servers connect directly to the back servers with some sort of socket connection? The answer to this question is twofold, and arises from all three objectives.

Reason 1: Load Balancing

High performance and scalability mean that we need to load-balance the work across multiple back server instances. How should the load balancing be performed? One could use a traditional approach, like round-robin, for assigning the jobs. But what if the back servers could decide on their own when they were ready to accept new jobs?

This is one of my favorite parts of the Octave Online architecture. Each back server is treated as its own agent and controls its own destiny. If the back server is fresh and can handle more connections, it goes and pulls from the queue. If the back server is busy with many different jobs, it ignores the queue and lets other servers pull from it instead. A back server may also elect to go offline entirely and perform maintenance chores, like cleaning up local storage and killing orphaned processes. In order to ensure that there is at least one back server online and accepting connections at all times, the back servers talk to one another, and when one wants to go offline, it needs to get approval from at least half of the other back servers.
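The two rules above — pull only when fresh, and go offline only with majority approval — can be captured in a couple of tiny functions. The threshold and function names are invented for illustration; the real back servers negotiate this over Redis.

```python
# Hypothetical sketch of a back server's self-scheduling rules.

MAX_SESSIONS = 20   # invented capacity threshold

def should_pull(active_sessions):
    """A fresh server keeps pulling from the queue; a busy one ignores it."""
    return active_sessions < MAX_SESSIONS

def may_go_offline(peer_votes):
    """Going offline for maintenance requires approval from at least half
    of the other back servers."""
    approvals = sum(1 for v in peer_votes if v)
    return approvals * 2 >= len(peer_votes)

print(should_pull(3))                        # True: capacity to spare
print(should_pull(20))                       # False: leave the queue alone
print(may_go_offline([True, True, False]))   # True: 2 of 3 peers approve
print(may_go_offline([True, False, False]))  # False: only 1 of 3 approve
```

The appeal of this design is that no central scheduler needs to model each server's load; each server applies its own local rule against shared state.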

Redis is an important piece of this design, providing constructs like a distributed priority queue and pub-sub channels. Redis is the medium through which the back servers talk with each other and with the front server that holds the connection to the end user.

Reason 2: State Recovery

If a user briefly loses their internet connection, or if the front server goes offline for a bit, we would like the user to be able to reconnect to a front server, which may be different from the one to which they were previously connected, and rejoin their same Octave process if it still exists.

Redis makes this possible through the pub-sub framework. If a user re-connects and provides a pre-existing ID, the front server simply sets up a Redis client listening on that ID for output from the Octave process. It is completely oblivious to the identity of the back server that is actually running the process.

Note: Each front server continuously touches a key in the cache for each of its IDs. If the front server goes offline, those keys expire after 15 seconds. When they expire, the back server gets a notification and destroys the corresponding Octave processes. However, if the user reconnects to a different front server, or if the original front server comes back online, the keys are touched again and the Octave process stays alive.
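The 15-second keepalive in the note can be modeled with simulated timestamps. In production this is a Redis key with a TTL plus expiry notifications; the `SessionKey` class here is a hypothetical stand-in.

```python
# Sketch of the keepalive: the front server repeatedly touches a per-session
# key, and the back server kills the Octave process only if the key expires.

TTL = 15  # seconds, matching the 15-second expiration described above

class SessionKey:
    def __init__(self, now):
        self.expires_at = now + TTL

    def touch(self, now):
        # Front server refreshes the key while the user is connected.
        self.expires_at = now + TTL

    def expired(self, now):
        # Back server treats expiry as "destroy the Octave process".
        return now >= self.expires_at

key = SessionKey(now=0)
key.touch(now=10)            # user still connected at t=10
print(key.expired(now=20))   # False: the touch extended life to t=25
print(key.expired(now=26))   # True: the process would now be destroyed
```

A brief outage simply means a missed touch or two; as long as someone touches the key within the window, the session survives.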

Collaborative Editing

The newest version of Octave Online brings support for real-time collaborative editing and sharing of the same Octave session. This was relatively easy to implement in the current architecture. Although the details of how this works are beyond the scope of this blog post, the high-level idea is that each end user is subscribed to the same session ID, so they all receive the same notifications when the Octave process produces output, for example. I'm hoping to write another blog post detailing some of the difficulties of implementing collaborative editing and how I overcame them in my code.

The Infrastructure

In principle, each component of the architecture could be running on different machines, the same machine, or some mixture of the two.

Right now, I have Octave Online hosted in the Rackspace cloud. I have four cloud VMs. One of them holds a front server instance and is also home to the Redis cache, Mongo database, and Git server. The other three VMs each run an instance of the back server and are designed to handle the demanding job of dynamically creating and destroying thousands of Octave sessions each day, in a fast and secure manner (another topic worthy of its own blog post). These four servers have proven sufficient for handling the typical load I currently get on Octave Online. The beauty is that if I ever need to add more front servers or back servers, it's as easy as going into the Rackspace control panel and cloning more instances.


I hope that you found this blog post interesting. Feel free to leave comments/questions below!

Mushroom Monster: 3-Line Solution in Julia

The first question in Round 1A of the 2015 Google Code Jam is a great example of a problem that can be solved very easily in just a few lines of Julia.

The question goes like this:

Kaylin loves mushrooms. Put them on her plate and she'll eat them up! In this problem she's eating a plate of mushrooms, and Bartholomew is putting more pieces on her plate.

In this problem, we'll look at how many pieces of mushroom are on her plate at 10-second intervals. Bartholomew could put any non-negative integer number of mushroom pieces down at any time, and the only way they can leave the plate is by being eaten.

Figure out the minimum number of mushrooms that Kaylin could have eaten using two different methods of computation:

1) Assume Kaylin could eat any number of mushroom pieces at any time.

2) Assume that, starting with the first time we look at the plate, Kaylin eats mushrooms at a constant rate whenever there are mushrooms on her plate.

For example, if the input is 10 5 15 5: With the first method, Kaylin must have eaten at least 15 mushroom pieces: first she eats 5, then 10 more are put on her plate, then she eats another 10. There's no way she could have eaten fewer pieces. With the second method, Kaylin must have eaten at least 25 mushroom pieces. We can determine that she must eat mushrooms at a rate of at least 1 piece per second. She starts with 10 pieces on her plate. In the first 10 seconds, she eats 10 pieces, and 5 more are put on her plate. In the next 5 seconds, she eats 5 pieces, then her plate stays empty for 5 seconds, and then Bartholomew puts 15 more pieces on her plate. Then she eats 10 pieces in the last 10 seconds.

The key observations are that for the first method, the solution is simply the sum of the negative differences across Kaylin's plate between each time step. For the second method, you need to determine the time step that has the greatest decrease, and then let Kaylin eat either that number of mushrooms or the number of mushrooms on her plate, whichever is fewer.
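The two observations above can be sanity-checked with a short Python sketch (Python rather than Julia here simply so readers without Julia can try it; the function names are mine), using the sample readings 10 5 15 5:

```python
# Method 1: sum every decrease between consecutive plate readings.
def method1(M):
    return sum(max(M[i] - M[i + 1], 0) for i in range(len(M) - 1))

# Method 2: the constant rate must be at least the largest per-interval
# decrease; in each interval she eats at that rate, capped by what is on
# the plate at the start of the interval.
def method2(M):
    rate = max(max(M[i] - M[i + 1], 0) for i in range(len(M) - 1))
    return sum(min(rate, M[i]) for i in range(len(M) - 1))

print(method1([10, 5, 15, 5]))  # 15
print(method2([10, 5, 15, 5]))  # 25
```

Both values match the worked example from the problem statement.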

A solution in a traditional programming language like C++ or Java takes 10-20 messy lines of for loops. You can download such implementations from the contest scoreboard. However, I present a 3-line solution using Julia (14 lines if you include my comments and I/O).

# Read input
T = int(readline(STDIN))
for t=1:T
    N = int(readline(STDIN))
    M = int(split(readline(STDIN)))

    # Compute solution
    deltas = M[2:end] - M[1:end-1]
    n1 = -sum(deltas[deltas .< 0])
    n2 = sum(min(-minimum(deltas), M[1:end-1]))

    # Print result
    println("Case #$t: $n1 $n2")
end

The implementation can be run from the command line as julia mushroom.jl < mushroom.in where mushroom.in is your input file.

For those who don't know, Julia is a new programming language, originally designed for scientific computing. Its syntax is similar to MATLAB/Octave, but under the hood it uses type inference and just-in-time compilation to speed up execution while keeping the flexibility of fast scripting. Like Romeo, I've fallen in love with Julia and don't expect to be switching to a different language any time soon. :)

Installing Cilk Plus on OS X

Cilk Plus is a language extension to C and C++ that provides convenient parallel programming constructs, like cilk_spawn and cilk_sync. Installing it on OS X is fairly simple.

These instructions are very similar to those found on the official web site.


You need to have the Xcode Command Line Tools installed before you can continue. Follow steps 1-3 of this tutorial for more information.


First, I recommend making a clean directory to save your Cilk Plus source, build files, and binaries.

$ cd ~
$ mkdir cilkplus
$ cd cilkplus
$ mkdir build
$ mkdir install

Next, we download the source code. I had no luck downloading the SVN and Git repos for the Cilk Plus extension; however, downloading a snapshot worked on the first try.

Once the download completes, you need to extract the tarball. My ref was named gcc-6bca773, but yours might be slightly different.

$ cd ~/cilkplus
$ tar -zxf gcc-6bca773.tar.gz
$ mv gcc-6bca773 source
$ cd source
$ ./contrib/download_prerequisites

Now you can start the build process. Warning: This involves building the whole freaking GNU C Compiler. It can take several hours.

$ cd ../build
$ ../source/configure --prefix=/absolute/path/to/cilkplus/install --enable-languages="c,c++"
$ make
$ make install

Download Cilk Plus Tools

If you need to use the cilkview and cilkscreen commands, download them from here and save them in your cilkplus directory.

$ cd ~/cilkplus
$ tar -zxf cilktools-mac-004277.tgz
$ mv cilktools-mac-004277 tools

Creating a Cilk Plus Environment

Once the build process is complete, you will have a GCC binary with support for the Cilk Plus extension. However, you probably don't want to use this binary as your default GCC. You can therefore create a custom Bash environment to only use Cilk Plus when you need it.

First, create a file ~/.bashrc.cilk with the following contents.

# Use the Cilk Plus gcc binaries
export PATH=/path/to/cilkplus/install/bin:$PATH

# Cilk Tools
export PATH=$PATH:/path/to/cilkplus/tools/bin

# Customize prompt
export PS1="\h:cilk\$ "

You can add the following alias to your ~/.bash_profile file.

alias cilk="bash --rcfile ~/.bashrc.cilk"

Using Cilk Plus

You can now use Cilk Plus.

$ gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
$ cilk
cilk$ gcc --version
gcc (GCC) 4.9.0 20130520 (experimental)

Did this help? Let me know by commenting below.

GitHub – vote539