Challenge #5a: Single-Node Kafka-Style Log
In this challenge, you’ll need to implement a replicated log service similar to Kafka. Replicated logs are often used as a message bus or an event stream.
This challenge is broken up into multiple sections so that you can build out your system incrementally. First, we’ll start out with a single-node log system and then we’ll distribute it in later challenges.
Specification
Your nodes will need to store an append-only log in order to handle the
"kafka" workload. Each log is identified by a string key (e.g. "k1") and
these logs contain a series of messages which are identified by an integer
offset. These offsets can be sparse in that not every offset must contain a
message.
Maelstrom will check to make sure several anomalies do not occur:
- Lost writes: for example, a client sees offset 10 but not offset 5.
- Non-monotonic offsets: the offsets in a log should always be increasing.
There are no recency requirements, so messages acknowledged by a send do not
need to show up in poll responses immediately.
RPC: send
This message requests that a "msg" value be appended to a log identified
by "key". Your node will receive a request message body that looks like this:
{
"type": "send",
"key": "k1",
"msg": 123
}
In response, it should acknowledge the request with a send_ok message that
contains the unique offset for the message in the log:
{
"type": "send_ok",
"offset": 1000
}
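As a rough illustration of the append path described above, here is a minimal in-memory sketch in Go. The names logStore, entry, newLogStore, and Append are hypothetical and not part of Maelstrom. The sketch assumes msg values are integers, as in the example body, and that handing out offsets from a per-key counter starting at 0 is acceptable. The mutex is there because a node may handle requests concurrently.

```go
package main

import "sync"

// entry pairs an offset with the message stored at that offset.
type entry struct {
	offset int
	msg    int
}

// logStore is a hypothetical in-memory store with one append-only log per key.
type logStore struct {
	mu   sync.Mutex
	logs map[string][]entry
	next map[string]int // next offset to hand out for each key
}

func newLogStore() *logStore {
	return &logStore{logs: map[string][]entry{}, next: map[string]int{}}
}

// Append stores msg in the log for key and returns the offset assigned to it.
// Offsets for a given key are unique and strictly increasing.
func (s *logStore) Append(key string, msg int) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	off := s.next[key]
	s.next[key] = off + 1
	s.logs[key] = append(s.logs[key], entry{offset: off, msg: msg})
	return off
}
```

A send handler could call Append(key, msg) and place the returned value in the "offset" field of its send_ok reply.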
RPC: poll
This message requests that a node return messages from a set of logs starting from the given offset in each log. Your node will receive a request message body that looks like this:
{
"type": "poll",
"offsets": {
"k1": 1000,
"k2": 2000
}
}
In response, it should return a poll_ok message with messages starting from
the given offset for each log. Your server can return as many messages for
each log as it chooses:
{
"type": "poll_ok",
"msgs": {
"k1": [[1000, 9], [1001, 5], [1002, 15]],
"k2": [[2000, 7], [2001, 2]]
}
}
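One possible way to build the "msgs" map is sketched below in Go. The function names pollLog and poll and the limit parameter are assumptions made for illustration. Each log is assumed to be kept as a slice of [offset, msg] pairs sorted by offset, with integer messages as in the example. Because offsets may be sparse, the sketch searches for the first entry at or after the requested offset rather than indexing directly.

```go
package main

import "sort"

// pollLog returns up to limit [offset, msg] pairs whose offset is >= from.
// entries must be sorted by offset in ascending order.
func pollLog(entries [][2]int, from, limit int) [][2]int {
	if limit < 0 {
		limit = 0
	}
	// Binary search for the first entry at or after the requested offset.
	i := sort.Search(len(entries), func(i int) bool { return entries[i][0] >= from })
	j := i + limit
	if j > len(entries) {
		j = len(entries)
	}
	out := make([][2]int, j-i)
	copy(out, entries[i:j])
	return out
}

// poll builds the "msgs" map of a poll_ok body for the requested offsets.
// Keys with no stored log map to an empty list.
func poll(logs map[string][][2]int, offsets map[string]int, limit int) map[string][][2]int {
	msgs := make(map[string][][2]int, len(offsets))
	for key, from := range offsets {
		msgs[key] = pollLog(logs[key], from, limit)
	}
	return msgs
}
```

A [2]int value is encoded by encoding/json as a two-element array, which matches the [offset, msg] pairs shown in the example response.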
RPC: commit_offsets
This message informs the node that messages have been successfully processed up to and including the given offset. Your node will receive a request message body that looks like this:
{
"type": "commit_offsets",
"offsets": {
"k1": 1000,
"k2": 2000
}
}
In this example, the messages have been processed up to and including offset
1000 for log k1 and all messages up to and including offset 2000 for k2.
In response, your node should return a commit_offsets_ok message body to
acknowledge the request:
{
"type": "commit_offsets_ok"
}
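A commit handler mostly needs to remember the latest offset per key. The Go sketch below uses a hypothetical commitOffsets helper over a plain map. Keeping the larger of the stored and incoming offsets is a design choice assumed here, so that a late or duplicate request cannot move a committed offset backwards. The specification above does not state this rule.

```go
package main

// commitOffsets records the offsets from a commit_offsets request in the
// committed map. An existing committed offset is only ever moved forward.
func commitOffsets(committed, offsets map[string]int) {
	for key, off := range offsets {
		if cur, ok := committed[key]; !ok || off > cur {
			committed[key] = off
		}
	}
}
```

If the map is shared between concurrently running handlers, it would also need a lock around this call.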
RPC: list_committed_offsets
This message returns a map of committed offsets for a given set of logs. Clients use this to figure out where to start consuming from in a given log.
Your node will receive a request message body that looks like this:
{
"type": "list_committed_offsets",
"keys": ["k1", "k2"]
}
In response, your node should return a list_committed_offsets_ok message body
containing a map of offsets for each requested key. Keys that do not exist on
the node can be omitted.
{
"type": "list_committed_offsets_ok",
"offsets": {
"k1": 1000,
"k2": 2000
}
}
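The lookup side can be sketched in Go along the same lines. The helper name listCommittedOffsets is hypothetical, and the committed offsets are assumed to live in a plain map from key to offset. Keys with no committed offset are left out of the result, which the specification allows.

```go
package main

// listCommittedOffsets returns the committed offset for each requested key.
// Keys that have no committed offset on this node are omitted.
func listCommittedOffsets(committed map[string]int, keys []string) map[string]int {
	out := make(map[string]int, len(keys))
	for _, key := range keys {
		if off, ok := committed[key]; ok {
			out[key] = off
		}
	}
	return out
}
```

The returned map can be used directly as the "offsets" field of the list_committed_offsets_ok body.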
Evaluation
Build your Go binary as maelstrom-kafka and run it against Maelstrom with the
following command:
./maelstrom test -w kafka --bin ~/go/bin/maelstrom-kafka --node-count 1 --concurrency 2n --time-limit 20 --rate 1000
This will run a single node for 20 seconds with two clients. It will validate that messages are queued and committed properly.
If you’re successful, wahoo! Continue on to the Multi-Node Kafka challenge. If you’re having trouble, jump over to the Fly.io Community forum for help.