SpawnFest Retrospective

This year I participated in SpawnFest 2017, a two-day hackathon focused on the BEAM virtual machine. With judging finished, it felt like an appropriate time to reflect on the architecture and process behind my submission.

My submission was a server for Classic Minecraft written in Elixir. Unlike the current version of the game, Classic Minecraft is very limited in feature set, supporting only chat, placing blocks, and moving fluids. Because of the simple protocol and the availability of open source clients, it seemed like a perfect project for a two-day contest.

The Server

Unlike Ruby, C, or Java, Elixir is an actor-based language. In actor-based languages, communication across boundaries happens via message passing instead of function calls. It's roughly equivalent to a service-oriented architecture, but with the network boundaries between services replaced by in-memory communication. The pattern leads to highly concurrent software, but it requires a different way of thinking.

At a high level, the server I submitted had three kinds of actors: the client protocol, the game server, and the map server. Ranch, a socket acceptor pool, would spin up instances of the client protocol to manage the connection between the server and each client. Those instances decoded messages from the client and passed them to the game server, and the game server sent updates back to the clients. The game server kept track of all of the connected clients, ensuring that changes to the map, chat messages, and player connections and disconnections were seen by everyone on the server. Finally, the map server kept track of updates to the game map.
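
As a rough sketch of how a client connection flowed through those actors (this assumes Ranch 1.x, which was current at the time, and the decode/1 body plus the messages sent to GameServer are placeholders of my own rather than the submission's exact API):

defmodule ClientProtocol do
  @behaviour :ranch_protocol

  # Ranch calls start_link/4 for every accepted connection.
  def start_link(ref, socket, transport, opts) do
    pid = spawn_link(__MODULE__, :init, [ref, socket, transport, opts])
    {:ok, pid}
  end

  def init(ref, socket, transport, _opts) do
    # Take ownership of the socket from the acceptor pool.
    :ok = :ranch.accept_ack(ref)
    loop(socket, transport)
  end

  defp loop(socket, transport) do
    case transport.recv(socket, 0, :infinity) do
      {:ok, packet} ->
        # Decode the raw bytes and forward them to the (registered) game server.
        GenServer.cast(GameServer, {:packet, self(), decode(packet)})
        loop(socket, transport)

      {:error, _reason} ->
        GenServer.cast(GameServer, {:disconnected, self()})
        transport.close(socket)
    end
  end

  # Placeholder for the real packet decoding.
  defp decode(packet), do: packet
end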

What Went Well: Communication Between Clients

The actor model was very nice to use for this project. It helped cleanly separate the encoding and decoding logic and provided process IDs to make tracking connected clients simple. It also helped us avoid the concurrency issues that some traditional approaches to multithreading suffer from.

Since there was an actor for each connected client, all the game server needed to do was validate that the input was correct and then send a message to each client actor. The logic for speaking the protocol was contained in the client actor, which made the game server much cleaner: all of the work to construct the binary messages lived apart from the game logic.
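
For illustration, here is a minimal sketch of that broadcast pattern (the module, message shapes, and function names are my own, not the submission's exact API): the game server tracks client pids and fans a validated update out to each one, leaving the wire encoding to the client actors.

defmodule GameServer do
  use GenServer

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  # Public API used by the client actors.
  def client_connected(pid), do: GenServer.cast(__MODULE__, {:client_connected, pid})
  def broadcast(message), do: GenServer.cast(__MODULE__, {:broadcast, message})

  @impl true
  def init(:ok), do: {:ok, MapSet.new()}

  @impl true
  def handle_cast({:client_connected, pid}, clients) do
    {:noreply, MapSet.put(clients, pid)}
  end

  def handle_cast({:broadcast, message}, clients) do
    # Each client actor turns the message into the binary the protocol expects.
    Enum.each(clients, fn pid -> send(pid, {:game_event, message}) end)
    {:noreply, clients}
  end
end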

We also got a way to keep track of each connected client for free. Since each actor is assigned a process id, the actor running the game server just needed to keep a mapping from usernames and player ids to process ids. That made implementing a “whisper” chat command and despawning logged-off players trivial. For “whisper”, the server maps the target player’s username to their pid and sends the chat message along. To despawn a player that logged off, the server takes the process id of the disconnecting player, releases their player id back into the pool, then broadcasts the despawn message to the rest of the connected clients.
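
A hedged sketch of that bookkeeping (the state shape and message names are my own illustration, and it assumes the game server monitors each client process when it joins):

# Illustrative GenServer callbacks. `state.usernames` maps usernames to client
# pids, `state.players` maps client pids to {username, player_id}, and
# `state.free_ids` is the pool of player ids available for reuse.
def handle_cast({:whisper, from, to_username, text}, state) do
  case Map.fetch(state.usernames, to_username) do
    {:ok, pid} -> send(pid, {:chat, from, text})
    :error -> :ok
  end

  {:noreply, state}
end

# Fires because the game server monitors each client process on join.
def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
  {{username, player_id}, players} = Map.pop(state.players, pid)

  # Despawn the player on every remaining client, then return the id to the pool.
  Enum.each(Map.keys(players), fn client -> send(client, {:despawn, player_id}) end)

  {:noreply,
   %{state |
     players: players,
     usernames: Map.delete(state.usernames, username),
     free_ids: [player_id | state.free_ids]}}
end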

Most importantly, using actor-based concurrency removed the cognitive burden associated with a more traditional threading approach. There was no need to manage mutexes or deal with deadlocks. The only thing to focus on was message structure; the runtime took care of the rest.

What Didn’t Go Well: The Map

Getting a performant representation of the map proved to be a challenge. With a little understanding of how these maps are built and sent to the client, I think you will see why.

A Minecraft classic map is made up of blocks, stored in an x * y * z matrix of unsigned 8-bit integers. Each byte in the matrix corresponds to the type of a block, and its index is the block's position in the world. When this data is sent to the client, the matrix is flattened into a one-dimensional array, a 4-byte header (the number of elements in the array) is prepended, and the result is gzipped and sent to the client in chunks of 1024 bytes.
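
Sketched in code (the function names are hypothetical), that serialization step amounts to prepending a big-endian 4-byte length, gzipping, and splitting the result into 1024-byte chunks:

# `blocks` is the flattened map: one byte per block.
def encode_level_data(blocks) when is_binary(blocks) do
  # 4-byte big-endian header holding the number of elements in the array.
  payload = <<byte_size(blocks)::unsigned-big-integer-size(32), blocks::binary>>

  payload
  |> :zlib.gzip()
  |> chunk_bytes()
end

# Split the gzipped payload into 1024-byte chunks for the wire.
defp chunk_bytes(<<chunk::binary-size(1024), rest::binary>>), do: [chunk | chunk_bytes(rest)]
defp chunk_bytes(<<>>), do: []
defp chunk_bytes(last), do: [last]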

The largest map is 512 * 512 * 64 blocks, which works out to 16.78 MB of memory for the map data if you store one byte per block. We also needed the ability to update an arbitrary block, which meant that if we treated the map as one big binary we would have a lot of work to do just to access or update a single block. To overcome this, I implemented two versions of an Erlang Term Storage based map, each with its own issues.

Attempt One

Erlang Term Storage (ETS) was a really appealing way to deal with map updates. It provides a key-value table which can hold arbitrary terms, has multiple storage options, and offers constant-time lookups (or logarithmic, for ordered tables). If each block in the map was stored as a tuple like {{x,y,z}, block_type}, we could do our updates quickly with little overhead. There were two issues with this approach: it was very large in memory, and we still needed to build the array to send to the client.
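
A minimal sketch of that first layout (the module and function names are mine, not the submission's):

defmodule BlockTable do
  # One ETS entry per block, keyed by its coordinates.
  def new do
    :ets.new(:map_blocks, [:set, :public])
  end

  def put_block(table, x, y, z, block_type) do
    :ets.insert(table, {{x, y, z}, block_type})
  end

  # Missing keys default to air (block type 0).
  def get_block(table, x, y, z) do
    case :ets.lookup(table, {x, y, z}) do
      [{_key, block_type}] -> block_type
      [] -> 0
    end
  end
end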

For a 512 * 512 * 64 block map, it required an 800 MB table. That didn't feel great, since if we were doing this with malloc'd C we could get by with roughly a fiftieth of that. The other, more pressing, issue was building the array of map data to send to the client. By default, ETS tables are unsorted sets, so building the array meant pulling every entry out of the table, sorting the entries by coordinate, and concatenating the block types into a single binary.
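
Roughly, that build step looked like the following sketch (not the submission's exact code):

# Dump every {{x, y, z}, block_type} entry, sort by coordinate (in whatever
# order the client indexes the flat array), then glue the bytes back together.
def build_level_binary(table) do
  table
  |> :ets.tab2list()
  |> Enum.sort_by(fn {coords, _block_type} -> coords end)
  |> Enum.map(fn {_coords, block_type} -> block_type end)
  |> :erlang.list_to_binary()
end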

For the largest map size we ended up sorting and concatenating 16,777,216 items. This proved to be very slow. Switching storage options didn’t provide any appreciable speed boost; another solution was needed.

Attempt Two

After a good night's sleep and a bit of Googling, I decided to add run-length encoding to our map's internal structure. This helped tremendously with both performance and size.

If you are unfamiliar, run-length encoding (RLE) is a lossless form of compression which stores each run of consecutive identical elements as a repetition count paired with the value. This means:

[1,2,2,2,5]

Could become

[{1, 1}, {3, 2}, {1, 5}]

Since our map data has a lot of consecutive identical elements, compressing a row or column with RLE saves a lot of space (800 MB down to 16 MB) and cuts down on the amount of sorting we need to do in memory. Instead of storing {{x,y,z}, block_type}, we store {{y,z}, x_row_compressed}. This means we only sort 32,768 items, then decompress and concatenate entire rows at a time. The RLE approach proved to be good enough given the time constraints, and it's the one that made it into the final submission.
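
As a concrete sketch of the row compression (the function names are hypothetical), encoding a row of block bytes into {count, block_type} pairs and expanding it back looks roughly like this:

# rle_encode([1, 2, 2, 2, 5]) returns [{1, 1}, {3, 2}, {1, 5}].
def rle_encode(blocks) do
  blocks
  |> Enum.chunk_by(& &1)
  |> Enum.map(fn run -> {length(run), hd(run)} end)
end

# rle_decode reverses the compression, rebuilding the full row of block bytes.
def rle_decode(runs) do
  Enum.flat_map(runs, fn {count, block_type} -> List.duplicate(block_type, count) end)
end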

Still Not Perfect

While I’m quite proud of what we were able to get done in two days, there are still a few areas I would make better given some more time:

- GameServer ended up being too large for my tastes. I feel it would be cleaner if there was a split between tracking players, passing player changes around, and handling chat messages.
- We currently just use send to pass messages back to clients. It would be more intention-revealing if there was a public API, provided by ClientProtocol, to encapsulate this behavior.
- The map should be built as a gzip stream when we read it from the ETS table. The current list-to-binary-to-gzip process is not performant.
- I’m unhappy with the way fluids work. It feels like they want to be treated as a player-like thing and live inside their own simple_one_for_one supervisor.

Outside of a few small issues and some areas that I want to correct, the contest was an enjoyable time. Big thanks to Brujo and the rest of the organizers, judges and sponsors for putting on such a wonderful event.

by Hunter Madison