Home assignment - HTTP file server

Implement an HTTP server with the following API:

POST /blobs/{id}

Payload: binary

If the blob already exists - overwrite it ("upsert")

Some headers, if sent, should also be stored. These headers are:

  • Content-Type
  • Any header that starts with x-rebase- (case insensitive)

Possible errors

  • missing Content-Length header
  • sum(Binary length, stored headers length) exceeds MAX_LENGTH
  • Total disk usage would exceed MAX_DISK_QUOTA if this blob were stored
  • Any stored header (key and value combined) exceeds MAX_HEADER_LENGTH
  • count(stored-headers) exceeds MAX_HEADER_COUNT
  • id contains a character other than a-z, A-Z, 0-9, dot (.), underscore (_), or minus (-)
  • id is longer than MAX_ID_LENGTH
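
The checks above could be collected into one validation step that runs before any bytes are written. The sketch below is one possible reading, not a reference solution. ValidationError, select_stored_headers and validate_upload are hypothetical names, the status codes (400, 411, 413) are assumptions because the assignment does not prescribe them, and MAX_LENGTH assumes 1 MB = 1024 * 1024 bytes. The disk quota check is left out because it needs the current disk usage.

```python
import re

# Limits from the assignment; MAX_LENGTH assumes 1 MB = 1024 * 1024 bytes.
MAX_LENGTH = 10 * 1024 * 1024
MAX_HEADER_LENGTH = 50
MAX_HEADER_COUNT = 20
MAX_ID_LENGTH = 200

ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


class ValidationError(Exception):
    """Hypothetical error type carrying an assumed HTTP status code."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def select_stored_headers(headers):
    """Keep Content-Type and x-rebase-* headers, matching names case-insensitively."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() == "content-type" or name.lower().startswith("x-rebase-")
    }


def validate_upload(blob_id, headers):
    """Return (content_length, stored_headers) or raise ValidationError."""
    if len(blob_id) > MAX_ID_LENGTH or not ID_PATTERN.fullmatch(blob_id):
        raise ValidationError(400, "invalid id")

    lowered = {name.lower(): value for name, value in headers.items()}
    raw_length = lowered.get("content-length")
    if raw_length is None:
        raise ValidationError(411, "missing Content-Length")
    if not (raw_length.isascii() and raw_length.isdigit()):
        raise ValidationError(400, "invalid Content-Length")
    content_length = int(raw_length)

    stored = select_stored_headers(headers)
    if len(stored) > MAX_HEADER_COUNT:
        raise ValidationError(400, "too many stored headers")
    # Headers are pure ASCII, so character counts equal byte counts.
    sizes = [len(name) + len(value) for name, value in stored.items()]
    if any(size > MAX_HEADER_LENGTH for size in sizes):
        raise ValidationError(400, "stored header too long")
    if content_length + sum(sizes) > MAX_LENGTH:
        raise ValidationError(413, "blob too large")
    return content_length, stored
```

Note that "." and ".." satisfy the character rule, so a storage layer probably should not use the raw id as a bare path component.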

GET /blobs/{id}

Return the relevant blob and its stored headers, if they exist.

If the Content-Type header does not exist in the stored headers, either use the value application/octet-stream or try to infer it using an external lib (e.g. mime-types)
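
A minimal sketch of this fallback, using Python's standard mimetypes module in place of the mime-types library named above. The function name response_content_type is hypothetical, and the sketch assumes stored header names were lowercased when saved and that an id may end in a file extension worth guessing from.

```python
import mimetypes

DEFAULT_CONTENT_TYPE = "application/octet-stream"


def response_content_type(blob_id, stored_headers):
    """Pick the Content-Type for a GET response.

    Assumes stored header names were lowercased when they were saved.
    """
    stored = stored_headers.get("content-type")
    if stored:
        return stored
    # Assumption: the id may carry a file extension, e.g. "report.pdf".
    guessed, _encoding = mimetypes.guess_type(blob_id)
    return guessed or DEFAULT_CONTENT_TYPE
```

A stored Content-Type always wins here; guessing is attempted only when none was stored, and application/octet-stream is the last resort.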

Possible errors

  • Nonexistent blob (404)

DELETE /blobs/{id}

Delete an existing blob

There's no need to return 404 if the blob doesn't exist
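
One way to make the delete tolerant of a missing blob is sketched below. It assumes, hypothetically, that each blob is kept as one data file plus one file holding its stored headers; delete_blob and both path parameters are illustrative names. The returned byte count could be used to keep a running disk usage figure up to date.

```python
from pathlib import Path


def delete_blob(blob_path: Path, meta_path: Path) -> int:
    """Remove a blob and its stored-headers file; return the bytes freed.

    A missing file is not treated as an error, so the caller can answer with
    the same success status whether or not the blob existed.
    """
    freed = 0
    for path in (blob_path, meta_path):
        try:
            size = path.stat().st_size
            path.unlink()
            freed += size
        except FileNotFoundError:
            pass
    return freed
```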

MAX_LENGTH = 10 MB
MAX_DISK_QUOTA = 1 GB
MAX_HEADER_LENGTH = 50
MAX_HEADER_COUNT = 20
MAX_ID_LENGTH = 200
MAX_BLOBS_IN_FOLDER = 10000

Limitations

  • The server should be able to run on a machine with modest RAM, so you should not read the entire input stream into memory and then store it as a file. Instead, read it chunk by chunk and write each chunk to disk. The same applies to serving a blob. (In Node.js, streams and pipes serve this purpose.)
  • For performance reasons, you should not store more than MAX_BLOBS_IN_FOLDER blobs directly under any single folder.
  • The solution should stay consistent even when requests are interrupted. E.g. the disk usage counted against the quota should be recalculated if a POST request breaks while it is being processed.
  • You can use an HTTP framework/lib of your choice.
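
The first three limitations can be sketched together, under stated assumptions. blob_path spreads blobs over hashed subfolders, and store_stream copies the body chunk by chunk into a temporary file that is renamed into place only when complete. Both names, the ".blob" and ".tmp" suffixes, the two-level folder layout, and CHUNK_SIZE are illustrative choices, not requirements of the assignment. Hashing spreads blobs evenly only in expectation; it does not strictly enforce MAX_BLOBS_IN_FOLDER.

```python
import hashlib
import os
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed chunk size; the assignment does not fix one


def blob_path(root: Path, blob_id: str) -> Path:
    """Map an id to root/<xx>/<yy>/<id>.blob using a hash of the id.

    Two hex characters per level give 256 * 256 = 65,536 leaf folders.
    The ".blob" suffix keeps ids such as "." or ".." from being path-special.
    """
    digest = hashlib.sha256(blob_id.encode("ascii")).hexdigest()
    return root / digest[:2] / digest[2:4] / f"{blob_id}.blob"


def store_stream(stream, dest: Path, expected_length: int) -> int:
    """Copy exactly expected_length bytes from stream to dest, chunk by chunk.

    Data goes to a temporary sibling file first and is renamed into place only
    when complete, so a broken request never leaves a partial blob at dest.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".tmp")
    written = 0
    try:
        with open(tmp, "wb") as out:
            while written < expected_length:
                chunk = stream.read(min(CHUNK_SIZE, expected_length - written))
                if not chunk:
                    raise ConnectionError("request body ended early")
                out.write(chunk)
                written += len(chunk)
        os.replace(tmp, dest)  # atomic when tmp and dest share a filesystem
    except BaseException:
        tmp.unlink(missing_ok=True)  # requires Python 3.8+
        raise
    return written
```

Because at most CHUNK_SIZE bytes are held at a time, memory use stays flat regardless of blob size. Serving a blob could mirror the same loop in the other direction.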

Notes

  • You may assume that there are no concurrent requests to the same id
  • It's a common practice to perform some calculations and cleanups when the server is launched. However, while warming up, the server should not listen to incoming requests
  • Headers are in pure ASCII
  • Storing data can be done only on the file system
  • You can assume that you have at least 1.5 * MAX_DISK_QUOTA available on the local disk
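
A possible warm-up step is sketched below. It assumes, hypothetically, that unfinished uploads are written with a ".tmp" suffix that finished files never carry, and that 1 GB = 1024**3 bytes. warm_up and fits_quota are illustrative names. The idea is to run warm_up to completion before the server opens its listening socket, then keep the returned figure up to date as blobs are stored and deleted.

```python
import os
from pathlib import Path

MAX_DISK_QUOTA = 1024 ** 3  # assumes 1 GB = 1024**3 bytes


def warm_up(root: Path) -> int:
    """Delete leftover temporary files and return the bytes still in use."""
    used = 0
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = Path(folder) / name
            if name.endswith(".tmp"):  # assumed marker for unfinished uploads
                path.unlink()
            else:
                used += path.stat().st_size
    return used


def fits_quota(used: int, old_size: int, new_size: int) -> bool:
    """Check the quota as if the new blob had already replaced the old one.

    old_size is 0 for a new id; for an overwrite it is the size being replaced.
    """
    return used - old_size + new_size <= MAX_DISK_QUOTA
```

Recomputing usage from the files on disk at startup means a crash in the middle of a POST cannot leave the counter permanently wrong.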