Building a Simple Distributed Task Scheduling System
In this post, I will describe a simple, horizontally scalable distributed task scheduling system using just Go and MongoDB. For what purpose, I can’t say. But it works.
System requirements:
- Unpredictable Workloads: All tasks must be executed. A task can take anywhere from 2 seconds to 10 minutes.
- Horizontal Scaling: The system scales simply by adding more worker nodes.
- Simplicity: Adding new nodes or tasks must be straightforward.
The system has three components:
Task Handler:
This acts as the controller and performs the following tasks:
- Fetches new tasks from external sources.
- Inserts them into MongoDB with a `pending` status.
- Scans MongoDB for tasks with a `done` status and sends the results to external services.
- Scans MongoDB for tasks with a `working` status to check if they are overdue (say, 15 minutes). If they are, it resets their status to `pending`.
Worker:
This component does the heavy lifting and performs the following tasks:
- Polls MongoDB for tasks with a `pending` status.
- Claims a task by atomically changing its status to `working` using MongoDB’s `findOneAndUpdate`.
- Finishes the work and saves the result to the document, updating its status to `done`.
MongoDB:
It’s MongoDB 😏. It handles atomic operations and storage.
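Putting the three components together, each task document moves through a small set of status changes. The sketch below is one possible way to write that down. Only the three status values come from the post; the `Task` field names and `bson` tags are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Task is a hypothetical shape for the stored document.
// Every field name here is an assumption.
type Task struct {
	ID        string    `bson:"_id"`
	Status    string    `bson:"status"`
	Payload   string    `bson:"payload"`
	Result    string    `bson:"result,omitempty"`
	StartedAt time.Time `bson:"started_at,omitempty"`
}

// allowed lists the status changes the post describes.
var allowed = map[string][]string{
	"pending": {"working"},         // a worker claims the task
	"working": {"done", "pending"}, // the worker finishes, or the handler resets an overdue task
	"done":    {},                  // results are sent out; no further change is described
}

// canTransition reports whether a task may move from one status to another.
func canTransition(from, to string) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	t := Task{ID: "task-1", Status: "pending", Payload: "example"}
	for _, next := range []string{"working", "done", "pending"} {
		if canTransition(t.Status, next) {
			t.Status = next
			fmt.Println("moved to", next)
		} else {
			fmt.Println("rejected move to", next)
		}
	}
}
```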
AI was used to help refine and polish this article based on factual information