Installing git-http-backend with CentOS, NGINX, and Custom Authentication

The docs for git-http-backend have instructions on how to set up the service in Apache, but they stop short of giving an example for an Nginx setup. Other bloggers have posted bits and pieces of the installation instructions, but nothing that works end-to-end. The goal of this blog post is to share instructions on getting git-http-backend up and running from a clean slate. More specifically, this post covers:

  1. Installing the necessary software on CentOS
  2. Creating an authentication/authorization service for customized HTTP authentication
  3. Putting all the pieces together in your Nginx config

Step 1: Installing the Software

There are a number of tools we need to install:

  • git-http-backend, which is the CGI utility that performs actions on your Git repos.
  • fcgiwrap, which listens for FastCGI requests on a UNIX socket and passes them to git-http-backend.
  • spawn-fcgi, which daemonizes fcgiwrap.
  • Optional: openresty, which bundles Nginx with the bindings required for customized user authentication.

git-http-backend (part of the git package), spawn-fcgi, and openresty are all available in yum.

$ sudo yum install git spawn-fcgi openresty

Unfortunately, fcgiwrap is not currently available in yum, so we need to build it ourselves. Here is how I did it:

$ sudo yum install fcgi-devel
$ sudo yum groupinstall 'Development Tools'
$ sudo su root
# cd /root
# git clone https://github.com/gnosek/fcgiwrap.git
# cd fcgiwrap
# autoreconf -i
# ./configure
# make
# make install

This installs the binary into /usr/local/sbin/fcgiwrap.

If you haven't already done so, create a new unprivileged user named git for handling requests to the Git repositories. Also add the nginx user to the git group so that Nginx can use the FCGI socket.

$ sudo useradd git
$ sudo usermod -a -G git nginx

We now need to configure spawn-fcgi. Edit the file /etc/sysconfig/spawn-fcgi and add the following content:

SOCKET=/var/run/fcgi-git.sock
OPTIONS="-u git -g git -s $SOCKET -S -M 0660 -P /var/run/fcgi-git.pid -- /usr/local/sbin/fcgiwrap -f"

Most of the options are standard. -u git -g git runs the FCGI process as the git user so that it can read and write your repositories. -s $SOCKET -S -M 0660 creates the socket with permissions 0660 so that nginx, which is in the git group, can communicate over the socket. -P /var/run/fcgi-git.pid saves the FCGI process ID in the stated file. -- /usr/local/sbin/fcgiwrap runs fcgiwrap, and the -f tells fcgiwrap to print errors on stderr so that CGI errors show up in the Nginx error log file.

Now enable and start spawn-fcgi so that it runs now and at startup:

$ sudo systemctl enable spawn-fcgi
$ sudo systemctl start spawn-fcgi

If everything went correctly, you should have a new UNIX socket located at /var/run/fcgi-git.sock which is owned by user git and is ready to start accepting requests.
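
You can confirm this with a quick listing:

$ ls -l /var/run/fcgi-git.sock

The entry should be a socket owned by user git with mode srw-rw---- (0660).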

Step 2: Creating an Authentication Service

If you want custom authentication (for example, against a user database), and you want to have certain users able to access certain repositories, you need to set up a custom authentication service. You can do this using the auth_request module, which is available in OpenResty (see above) but not in the default Nginx distribution in yum. (Note: you can also do this using a custom Lua script, but that is harder to maintain because it requires building Nginx from source.)

In order to use the auth_request module, you need to create an HTTP server that responds with a 401 or 403 status code if a user is not authenticated and a 2xx status code (for example, 204) if the user is authenticated. Here is an example server in Node.JS (it depends on the npm package "basic-auth"):

#!/usr/bin/env node
"use strict";

const basicAuth = require("basic-auth");
const fs = require("fs");
const http = require("http");
const path = require("path");

// Listen on a UNIX Socket:
const SOCKET_PATH = "/var/run/nginx_socks/auth.sock";
try { fs.unlinkSync(SOCKET_PATH); } catch(err) {}
console.log("Listening on UNIX socket " + SOCKET_PATH);

const server = http.createServer((req, res) => {
    const originalUri = path.normalize(req.headers["x-original-uri"] || "");
    // The user is requesting access to originalUri.  Perform user authentication.
    const creds = basicAuth.parse(req.headers["authorization"] || "");
    if (creds) {
        // The username is in creds.name and the password is in creds.pass.
        // Check them against your user database and repository permissions
        // here.  If the user may access originalUri, signal success with 204:
        res.writeHead(204);
        return res.end();
    }
    // Otherwise, prompt for credentials by displaying a login box on the client side:
    res.writeHead(401, { "WWW-Authenticate": 'Basic realm="Git Repositories"' });
    return res.end();
});
server.listen(SOCKET_PATH);

Create the directory /var/run/nginx_socks with owner nginx so that Nginx can communicate with your authentication service:

$ sudo mkdir /var/run/nginx_socks
$ sudo chown nginx:nginx /var/run/nginx_socks

To daemonize the authentication server, create /usr/lib/systemd/system/git-auth.service with the following content (substitute the path to your app.js in the second line):

[Service]
ExecStart=/path/to/your/nodejs/app.js
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=git-auth
User=nginx
Group=nginx
Environment=NODE_ENV=production
Restart=always

[Install]
WantedBy=multi-user.target

Then run the following commands to enable and start the service.

$ sudo systemctl daemon-reload
$ sudo systemctl enable git-auth
$ sudo systemctl start git-auth

Step 3: Putting It Together in the Nginx Config

Here is an example Nginx configuration to listen on port 3000 with HTTPS. Note that I have a file /etc/nginx/default.d/ssl.conf with my SSL configuration.

server {
    listen 3000 ssl;
    server_name _;
    include /etc/nginx/default.d/*;

    location / {
        auth_request      /auth;
        auth_request_set  $auth_status $upstream_status;

        # Git HTTP Backend
        client_max_body_size 0;
        fastcgi_param SCRIPT_FILENAME /usr/libexec/git-core/git-http-backend;
        fastcgi_param GIT_HTTP_EXPORT_ALL "";
        fastcgi_param REMOTE_USER $remote_user;
        fastcgi_param GIT_PROJECT_ROOT /home/git;
        fastcgi_param PATH_INFO $uri;
        fastcgi_pass unix:/var/run/fcgi-git.sock;
        include fastcgi_params;
    }

    location = /auth {
        internal;
        proxy_pass               http://unix:/var/run/nginx_socks/auth.sock:/;
        proxy_pass_request_body  off;
        proxy_set_header         Content-Length "";
        proxy_set_header         X-Original-URI $request_uri;
    }
}

Note how Nginx will send requests to /var/run/nginx_socks/auth.sock for authentication and then to /var/run/fcgi-git.sock for git manipulation.
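
At this point, an end-to-end test should work. Assuming a bare repository at /home/git/myrepo.git and a server name of example.com (both hypothetical), a client can clone with:

$ git clone https://example.com:3000/myrepo.git

Git prompts for a username and password and sends them as HTTP Basic credentials, which Nginx forwards to your authentication service.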

Conclusion

There are a lot of moving pieces! Let me know in the comments if this post helped you and suggestions for improvements.

Operational Transformation in Redis

Operational Transformation (OT) is the most widely used algorithm for merging changes to the same file from two simultaneous users. It is used as a base for Google Docs and countless other real-time text editing programs.

There are only a few open-source implementations of OT, though, which leaves us as developers with limited options. The most popular choice is ot.js, which is designed to run in the browser and in Node.JS. Unfortunately, it stores the documents being edited in the memory of the server process, so ot.js will not work out of the box across server restarts or with a clustered server setup.

In this blog post, I explain an approach to using OT through Redis, the popular high-performance server-side key-value store, including an implementation of OT in Redis's scripting language, Lua. The methods illustrated here are used in practice by thousands of Octave Online users every day.

Background: How ot.js Works

In ot.js, every change made by a user corresponds to a list of operations. That list of operations is designed to transform a base string into the new string. There are three types of operations: insert, delete, and retain. When reconciling a list of operations, OT creates a new, empty string and prepares to read the base string with a "caret" at the first character. For each operation it reads, it does the following:

  1. delete (negative integer): Some number of characters at the caret are deleted. The caret is moved forward that many characters without copying.
  2. retain (positive integer): Some number of characters at the caret are retained. The caret is moved forward the corresponding number of characters, with each character copied into the new string.
  3. insert (string): A string of text is inserted. The inserted string is appended to the new string. The caret in the base string is not moved.

As an example, suppose we start with the string Hello, world! and that our list of operations is:

{ 7, "Earth", -5, 1}

We start with an empty string. The first operation says, "keep the first 7 characters". We move the caret ahead 7 places, and our new string becomes Hello, (including the trailing space).

The second operation says, "insert 'Earth' at the current location". The caret stays in place, and our new string becomes Hello, Earth.

The third operation says, "remove the next 5 characters". The caret moves ahead 5 places (past world), and our new string remains the same.

The fourth operation says, "keep the next character". The caret, which, in case you haven't been keeping track, is pointing to the exclamation mark, is moved ahead one more character, and the exclamation mark is copied into the new string, making Hello, Earth!. Since this was the last operation, we are now finished.
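
If you have ot.js available in Node.JS, you can replay this example programmatically. Here is a minimal sketch, assuming the ot package from npm:

"use strict";

const TextOperation = require("ot").TextOperation;

// Build the operations list {7, "Earth", -5, 1} from the example above
const op = new TextOperation()
    .retain(7)
    .insert("Earth")
    .delete(5)
    .retain(1);

// Apply it to the base string
console.log(op.apply("Hello, world!")); // prints "Hello, Earth!"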

Where things get interesting is when two people change the document at the same time. How do you merge their changes together? The way that ot.js handles this is by associating a "document version" with each operations list. If two operations lists A and B reference the same document version, then ot.js performs some math to efficiently transform list B to reference the document version after list A has been applied, hence the name operational transformation. The magic behind the transformation is beyond the scope of this post, but it's rather easy to understand if you read through the source code.

Redis Architecture Overview

You have two users, Alice and Bob, editing a text file together. Alice is in New York, and is connected to a datacenter in US-East. Bob is in Paris, and is connected to a datacenter in Europe. Each datacenter is running a copy of your server application. However, both copies of the application query the same Redis database.

When Alice makes a change to the file, her change gets sent to the US-East datacenter, which promptly forwards it to the Redis database. Redis performs the OT magic to merge Alice's change with any changes Bob may have recently made. Then, Redis broadcasts Alice's transformed change to the Europe datacenter, which forwards it to Bob.

The Code

I'm going to assume that you have ot.js set up on the client side and attached to some sort of text editor, either a bare textarea or something more sophisticated like an ACE Editor. I'm also going to assume that you are transmitting operations over some sort of socket connection to your server on ot.js's "sendOperation" callback.

In this example, I present Node.JS code, but your server doesn't need to be running Node.JS; it could be anything (Tornado, PHP, Rails, …) as long as it supports Redis and some way to quickly send messages back and forth to the client.

Below is a function that should be run on the server whenever a user makes a change to the document. It calls a pseudo-function called "runRedisScript", which should perform an "EVAL" operation on the Redis server. You could use redis-scripto, for example, to manage your Redis scripts.

function onOtChange(docId, opsJson, revision) {
	runRedisScript(
		4,
		"ot:" + docId + ":ops",
		"ot:" + docId + ":doc",
		"ot:" + docId + ":cnt",
		"ot:" + docId + ":sub",
		opsJson,
		revision
	);
}

So, what we're doing is running our Redis script (which I will show you soon). The script uses four keys:

  1. ops: A List containing JSON strings of every operation performed on the document.
  2. doc: An up-to-date copy of the document, useful if you need to persist the document across sessions.
  3. cnt: The latest revision number.
  4. sub: A channel for Redis pub-sub notifications of new operations against the document.

Here is the code for the Redis script that will be run.

local ops = cjson.decode(ARGV[1])
local rev = tonumber(ARGV[2])
local ops_key = KEYS[1]
local doc_key = KEYS[2]
local cnt_key = KEYS[3]
local sub_key = KEYS[4]

-- Load any concurrent operations from the cache
local concurrent = redis.call("LRANGE", ops_key, rev, -1)

-- Transform the new operation against all the concurrent operations.
-- (LRANGE always returns a table, so no nil check is needed; ipairs
-- preserves the order in which the concurrent operations were applied.)
for i, cops in ipairs(concurrent) do
	ops = transform(ops, cjson.decode(cops))
end

-- Save the operation
redis.call("RPUSH", ops_key, cjson.encode(ops))
redis.call("INCR", cnt_key)

-- Load and apply to the document
local doc = redis.call("GET", doc_key)
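-- (GET returns the boolean false when the key does not exist yet)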
if type(doc)=="boolean" then doc="" end
doc = apply(doc, ops)
redis.call("SET", doc_key, doc)

-- Publish to the subscribe channel
redis.call("PUBLISH", sub_key, cjson.encode(ops))

First, we read the arguments. Then we load the concurrent operations lists from the ops key. Then we perform the OT magic. Then we save the new operation into the ops key, update the other keys, and publish the operation to the sub channel.

Where is the implementation of transform and apply, you ask? You can find it in my Gist in the file ot.lua.

Back in Node.JS, all we need to do is broadcast the operation to all clients. We can create a Redis client subscribed to the "sub" channel, and whenever something comes through that channel, we broadcast it through all the websocket connections. When the operation gets to the client, we can apply it to the document by calling ot.js's "applyServer" method (or, if applicable, "serverAck" on the client that first produced the operation).
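
Here is a minimal sketch of that broadcast step, assuming the node_redis package; broadcastToClients is a hypothetical function that writes a message to every websocket attached to a document:

const redis = require("redis");

// A Redis connection in subscriber mode cannot issue other commands,
// so dedicate one client to subscriptions.
const subClient = redis.createClient();

function listenForOps(docId) {
	subClient.subscribe("ot:" + docId + ":sub");
}

subClient.on("message", (channel, message) => {
	// message is the JSON-encoded, transformed operations list
	broadcastToClients(channel, message);
});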

Caveat: UTF-8 Support

For the most part, my ot.lua is a transcription of ot.js. However, one thing that I discovered through that process is that Lua has really crappy support for Unicode! Lua, which only knows about single-byte characters, would do ugly things like split multibyte characters in half. To solve this problem, I had to include some UTF-8 code that is capable of correctly calculating string lengths and substrings.

Caveat: Expiring Keys

In addition to the transform and apply operations, my Gist also contains the rest of the Lua code from this post, with the bonus feature of supporting ops lists that expire after a certain amount of time. This keeps the amount of data stored in Redis roughly constant over time. When running the scripts, you should pass in additional arguments corresponding to the number of seconds to keep operations alive in the cache. A couple of minutes should be more than enough time.
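
For example, one way to implement the expiration is to add a single line after the RPUSH in the script above (a sketch; ARGV[3] is an assumed extra argument giving the time-to-live in seconds):

-- Expire the ops list after the configured number of seconds
redis.call("EXPIRE", ops_key, tonumber(ARGV[3]))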

Conclusion

I hope that this approach to using a Redis cache as an Operational Transformation server proves useful. OT sounds scary, but it really isn't.

If you like this post, let me know by commenting below!

2015 Architecture of Octave Online

I maintain Octave Online, a web UI for the GNU Octave computational engine. The site is enjoyed by tens of thousands of users from all around the world, and well over a million commands have been executed since the beginning of this year. What kind of architecture and infrastructure supports a web site of this scale?

Over the past two years, Octave Online has gone through several different iterations of back-end architecture, with each update improving the performance and decreasing the bounce rate. In this blog post, I'm going to give an overview of the current architecture that makes Octave Online robust and scalable to meet the needs of a growing audience.

Note: The code for Octave Online is in the long-term process of being prepared for open sourcing. When the code is released, I will publish more material on this blog about it.

The Objectives

The very first alpha version of Octave Online had the simplest architecture imaginable: a single cloud instance that would spawn GNU Octave processes on itself and read/write to their standard input/output streams. This is an extreme example that helps illustrate some of the problems that a good back-end architecture needs to solve.

  1. Scalability: The architecture should be able to scale to support any number of simultaneous users.
  2. Performance: The architecture should ensure that the user is able to get fast I/O throughput, from any part of the world.
  3. Reliability: If any piece of the architecture goes offline, the application should pick up where it left off when the piece comes back online again.

Objective 1 means that as traffic increases on the site, I should be able to dynamically increase the number of servers to support the increased load. Objective 2 means that the architecture should be able to support an interconnected architecture across multiple data centers. Objective 3 means that each component needs to maintain state as best it can during unexpected outages, and also that it can return to its previous state efficiently.

The Architecture

Okay, so with the objectives in mind, let's cut to the chase. The following diagram illustrates the architecture of Octave Online, which I will explain in more detail below.

[Diagram: oo-arch.png, the architecture of Octave Online]

When an end user connects to Octave Online, they are actually creating a web socket connection to a front server, which is a relatively simple Node.JS application. Upon connection, the front server assigns the client an ID, and adds that ID to a queue in a Redis cache. Within a short amount of time, one of multiple back servers will pop the ID from the queue and locally spawn an Octave process. If the connection is from a returning user, the front server looks them up in a Mongo database (not shown) and the back server will clone an instance of the user's repository from a Git server (not shown). We're now ready to start running commands.
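
As an illustrative sketch (not the actual Octave Online source, which is not yet released), the queue handoff might look like this with the node_redis package; the key name and spawnOctaveProcess are assumptions:

const redis = require("redis");

// Front server: enqueue the new client's session ID.
const frontClient = redis.createClient();
function enqueueSession(sessionId) {
	frontClient.lpush("session-queue", sessionId);
}

// Back server: block on a dedicated connection until an ID is available,
// then spawn an Octave process for it and wait for the next job.
const backClient = redis.createClient();
function waitForSession() {
	backClient.brpop("session-queue", 0, (err, reply) => {
		if (err) throw err;
		const sessionId = reply[1]; // reply is [key, value]
		spawnOctaveProcess(sessionId); // hypothetical
		waitForSession();
	});
}
waitForSession();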

When the end user issues a command and sends it over the web socket, the front server will publish that command into a Redis channel identified by the user's session ID. The back server that adopted the corresponding ID will have subscribed to that channel, so it receives the message, and sends the command down to the Octave process. When the Octave process produces output, the back server publishes the output to another Redis channel to which the front server is subscribed, and finally the front server sends the output back to the end user over the web socket connection. The total round-trip time through the stack never stops impressing me.

When the end user disconnects or reloads their browser window, the front server sends a message over Redis to inform the back server that it can shut down the corresponding Octave process. The back server then kills the Octave process and, if necessary, commits and pushes any changes to the user's repository back to the Git server.

Why the Redis cache?

One of the most common questions I get is, why the Redis cache? Why not have the front servers connect directly to the back servers with some sort of socket connection? The answer to this question is twofold, and arises from all three objectives.

Reason 1: Load Balancing

High performance and scalability mean that we need to load-balance the work across multiple back server instances. How do you perform the load balancing? One could use a traditional approach, like round-robin, for assigning the jobs. But what if the back servers could decide on their own when they were ready to accept new jobs?

This is one of my favorite parts of the Octave Online architecture. Each back server is treated as its own agent and controls its own destiny. If the back server is fresh and can handle more connections, it goes and pulls from the queue. If the back server is busy with many different jobs, it ignores the queue and lets other servers pull from it instead. A back server may also elect to go offline entirely and perform maintenance chores, like cleaning up local storage and killing orphaned processes. In order to ensure that there is at least one back server online and accepting connections at all times, the back servers talk to one another, and when one wants to go offline, it needs to get approval from at least half of the other back servers.

Redis is an important piece of this design, providing constructs like a distributed priority queue and pub-sub channels. Redis is the framework through which the back servers talk with each other and with the front server that holds the connection to the end user.

Reason 2: State Recovery

If a user briefly loses their internet connection, or if the front server goes offline for a bit, we would like for the user to be able to reconnect to a front server, which may be different than the one to which they were previously connected, and connect to their same Octave process if it still exists.

Redis makes this possible through the pub-sub framework. If a user re-connects and provides a pre-existing ID, the front server simply sets up a Redis client listening on that ID for output from the Octave process. It is completely oblivious to the identity of the back server that is actually running the process.

Note: Each front server continuously touches a key in the cache for each of its IDs. Those keys have a 15-second expiration. If the front server goes offline and the keys expire, the back server gets a notification and destroys the corresponding Octave processes. However, if the user reconnects to a different front server, or if the front server comes back online, the keys will be touched again, and the Octave process remains alive.
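
A sketch of that keep-alive pattern (the key names and intervals here are assumptions, not the actual Octave Online source):

const redis = require("redis");

// Front server: touch the session key every few seconds with a
// 15-second expiration.  While the front server is alive, the key
// never expires.
const client = redis.createClient();
function keepAlive(sessionId) {
	setInterval(() => {
		client.setex("session:" + sessionId + ":alive", 15, "1");
	}, 5000);
}

// Back server: listen for expirations through Redis keyspace
// notifications (requires notify-keyspace-events "Ex" in redis.conf).
const subClient = redis.createClient();
subClient.subscribe("__keyevent@0__:expired");
subClient.on("message", (channel, key) => {
	if (key.indexOf("session:") === 0) {
		// The front server stopped touching this key; tear down the
		// corresponding Octave process (hypothetical helper).
		destroyOctaveProcess(key);
	}
});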

Collaborative Editing

The newest version of Octave Online brings support for real-time collaborative editing and sharing of the same Octave session. This was relatively easy to implement in the current architecture. Although the details of how this works are beyond the scope of this blog post, the high-level idea is that each end user is subscribed to the same session ID, so they receive the same notifications when, for example, the Octave process produces output. I'm hoping to write another blog post detailing some of the difficulties of implementing collaborative editing and how I overcame them in my code.

The Infrastructure

In principle, each component of the architecture could be running on different machines, the same machine, or some mixture of the two.

Right now, I have Octave Online hosted in the Rackspace cloud. I have four cloud VMs. One of them holds a front server instance and is also home to the Redis cache, Mongo database, and Git server. The other three VMs each run an instance of the back server and are designed to handle the demanding job of dynamically creating and destroying thousands of Octave sessions each day, in a fast and secure manner (another topic worthy of its own blog post). These four servers have proven sufficient for handling the typical load I currently get on Octave Online. The beauty is that if I ever need to add more front servers or back servers, it's as easy as going into the Rackspace control panel and cloning more instances.

Conclusion

I hope that you found this blog post interesting. Feel free to leave comments/questions below!
