HLD - Job Scheduler | Krishankant Ray

Problem Statement:

Run tasks at a specified time or at a repeating interval.

Functional Requirements:

Register a task along with its execution pattern (one_time, repeated)
Run each task at its next scheduled execution time
Notify the user when a task fails
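
The second requirement depends on knowing when a task is due again. A minimal sketch of that calculation follows, assuming the period is given in minutes (as in the API example in this write-up) and that the two patterns are spelled "ONE_TIME" and "REPEATED". The function name and the "REPEATED" spelling are hypothetical, not taken from the design.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_execution_time(
    task_type: str, scheduled_at: datetime, period_minutes: int
) -> Optional[datetime]:
    """Return the next run time after a run scheduled at `scheduled_at`.

    A ONE_TIME task has no further runs, so None is returned.
    A REPEATED task runs again one period after the previous scheduled time.
    """
    if task_type == "ONE_TIME":
        return None
    if task_type == "REPEATED":
        if period_minutes <= 0:
            raise ValueError("period_minutes must be positive for REPEATED tasks")
        return scheduled_at + timedelta(minutes=period_minutes)
    raise ValueError(f"unknown task type: {task_type}")
```

Computing the next run from the scheduled time rather than the actual finish time keeps a repeated task from drifting when a run starts late; that choice is an assumption of this sketch.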

Non-Functional Requirements:

Scalable
Available
Gracefully handle failures (retry)
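
One way to read "retry" together with the "notify the user when a task fails" requirement is sketched below. It assumes retry_count means the number of extra attempts after the first one, and that notification happens only once all attempts are exhausted. The names `run_with_retry` and `notify` are hypothetical.

```python
from typing import Callable, Optional


def run_with_retry(
    task: Callable[[], None],
    retry_count: int,
    notify: Callable[[str], None],
) -> bool:
    """Run `task`, retrying up to `retry_count` times after the first failure.

    Returns True on success. If every attempt fails, calls `notify` once
    with the last error message and returns False.
    """
    attempts = 1 + retry_count  # assumption: retry_count excludes the first try
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            task()
            return True
        except Exception as exc:  # any failure counts as a failed attempt
            last_error = exc
    notify(f"task failed after {attempts} attempts: {last_error}")
    return False
```

A production scheduler would probably wait between attempts (for example with exponential backoff); that is left out here to keep the sketch short.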

Capacity Estimation:

Run 1 M tasks a day. Each task takes 30 s to run.

1,000,000 tasks / ~100,000 s per day (86,400 rounded up) ≈ 10 tasks / sec

10 x 30 = 300 concurrent tasks

1 CPU with 16 cores = 32 threads (2 threads per core)

CPUs required = 300 / 32 ≈ 10 CPUs

1 M tasks/day -> 10 CPUs
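
The estimate above can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the text (one task per thread, 32 threads per CPU); the function name `estimate_cpus` is hypothetical.

```python
import math


def estimate_cpus(
    tasks_per_day: int,
    task_duration_s: float,
    threads_per_cpu: int,
    seconds_per_day: int = 100_000,  # 86,400 rounded up, as in the estimate
) -> int:
    """Estimate CPUs needed, assuming each running task occupies one thread."""
    tasks_per_sec = tasks_per_day / seconds_per_day
    concurrent_tasks = tasks_per_sec * task_duration_s
    return math.ceil(concurrent_tasks / threads_per_cpu)


print(estimate_cpus(1_000_000, 30, 32))          # rounded day length
print(estimate_cpus(1_000_000, 30, 32, 86_400))  # exact day length
```

With the rounded day length this gives 10 CPUs; with the exact 86,400 seconds it gives 11, so the rounding slightly underestimates the load.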

APIs Required

POST: /task/

        Request:
        {
            "sourceFile": "s3://",
            "name": "data-fix job",
            "desc": "To fix data",
            "type": "ONE_TIME",
            "retry_count": 3,
            "executionPeriod": 500 // minutes
        }

        Response:
        200
        {
            "taskId": 121212
        }
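
As an illustration of how a server might check the registration payload before storing it, here is a minimal sketch. The field names come from the request above; the class `TaskRequest`, the function `parse_task_request`, the "REPEATED" type value, and the validation rules are assumptions, not part of the design.

```python
from dataclasses import dataclass

# Assumption: the two execution patterns are spelled like this on the wire.
VALID_TYPES = {"ONE_TIME", "REPEATED"}


@dataclass(frozen=True)
class TaskRequest:
    source_file: str
    name: str
    desc: str
    type: str
    retry_count: int
    execution_period_minutes: int


def parse_task_request(payload: dict) -> TaskRequest:
    """Validate a POST /task/ body and convert it to a TaskRequest."""
    missing = [k for k in ("sourceFile", "name", "type") if k not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if payload["type"] not in VALID_TYPES:
        raise ValueError(f"unknown type: {payload['type']}")
    retry_count = int(payload.get("retry_count", 0))
    if retry_count < 0:
        raise ValueError("retry_count must not be negative")
    period = int(payload.get("executionPeriod", 0))
    if period < 0:
        raise ValueError("executionPeriod must not be negative")
    return TaskRequest(
        source_file=payload["sourceFile"],
        name=payload["name"],
        desc=payload.get("desc", ""),
        type=payload["type"],
        retry_count=retry_count,
        execution_period_minutes=period,
    )
```

Note that strict JSON does not allow comments, so the `// minutes` annotation in the sample request would have to be dropped before the body is sent.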

POST: /task/cancel/id={id}
Response: 200 // this API sets the task's status to "Inactive"

GET: /task/id={id}
Response:
{
    "name": "",
    "status": "",
    "message": ""
}

DB Schema

Tables

Tasks

- id 
- name 
- desc 
- type 
- status (Active, Inactive) 
- retry_count 
- execution_period 
- lastExecutedAt 
- lastExecutionStatus 
- lastExecutionMessage 
- createdBy

Execution

- id <PK> 
- taskId <foreign key> 
- nextExecutionTime <index> 
- status ( QUEUED, PROCESSING, COMPLETE ) 
- failureReason 
- executedAt
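
To make the schema concrete, the sketch below creates both tables in an in-memory SQLite database and shows one possible way a scheduler could pick up due work using the index on nextExecutionTime. The column names follow the lists above; the column types, the use of epoch seconds for timestamps, and the helper names `open_db` and `claim_due_executions` are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    "desc" TEXT,
    type TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'Active',
    retry_count INTEGER NOT NULL DEFAULT 0,
    execution_period INTEGER,
    lastExecutedAt INTEGER,
    lastExecutionStatus TEXT,
    lastExecutionMessage TEXT,
    createdBy TEXT
);
CREATE TABLE execution (
    id INTEGER PRIMARY KEY,
    taskId INTEGER NOT NULL REFERENCES tasks(id),
    nextExecutionTime INTEGER NOT NULL,  -- epoch seconds (assumption)
    status TEXT NOT NULL DEFAULT 'QUEUED',
    failureReason TEXT,
    executedAt INTEGER
);
CREATE INDEX idx_execution_next ON execution (nextExecutionTime);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def claim_due_executions(conn: sqlite3.Connection, now: int) -> list:
    """Mark QUEUED executions of Active tasks that are due as PROCESSING.

    Returns the ids of the claimed executions, earliest first.
    """
    with conn:  # commits on success, rolls back on error
        rows = conn.execute(
            "SELECT e.id FROM execution e "
            "JOIN tasks t ON t.id = e.taskId "
            "WHERE e.status = 'QUEUED' AND e.nextExecutionTime <= ? "
            "AND t.status = 'Active' "
            "ORDER BY e.nextExecutionTime",
            (now,),
        ).fetchall()
        ids = [row[0] for row in rows]
        conn.executemany(
            "UPDATE execution SET status = 'PROCESSING' WHERE id = ?",
            [(i,) for i in ids],
        )
    return ids
```

Filtering on the task's status means a task cancelled through the cancel API (status "Inactive") is skipped even if it still has queued executions. With several scheduler instances, a real database would also need row locking or an atomic update so that two instances do not claim the same execution.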

High Level Architecture Diagram
