Hi! I have recently been working on ETag header support for my own HTTP server implementation, and I thought it might also be a nice fit for the Codecrafters platform. I am posting an initial draft of how I think the extension could be structured, and I would love to receive your feedback on it.
To begin with, I would like the ETag & Caching extension to be implemented as three challenges:
-
ETag Header Generation: Easy
-
Handling If-None-Match header: Medium
-
Supporting Weak ETags: Easy
The detailed descriptions of the challenges (as they might appear on the platform) can be something like this:
Challenge 1: ETag Header Generation
-
Marketing Description: In this stage, you’ll add support for ETags to your HTTP server, allowing clients to recognize when files change and enabling efficient caching.
-
Difficulty: Easy
Welcome to the ETag Caching Extension! In this extension, you’ll learn about ETags and how they are used for efficient HTTP caching.
In this stage, you’ll add support for sending an ETag
header in your responses to GET /files/<filename>
requests.
ETag Header
The ETag
header is a string that represents a particular version of the resource. When a client requests a resource, the server includes the ETag
header in the response. The client can then use this ETag
value in subsequent requests to check if the resource has changed.
In this stage, your server will need to calculate the ETag for the file contents and include it in the response headers. The ETag should be calculated using the MD5 hash of the file contents and should be quoted. For example, if the file contents are hello world
, the ETag would be "5eb63bbbe01eeed093cb22bb8f5acdc3"
.
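As a minimal sketch of the calculation described above (Python is used purely for illustration, and the function name `compute_etag` is hypothetical, not part of the proposal):

```python
import hashlib


def compute_etag(content: bytes) -> str:
    """Return a strong ETag: the MD5 hex digest of the content, wrapped in double quotes."""
    digest = hashlib.md5(content).hexdigest()
    return f'"{digest}"'


print(compute_etag(b"hello world"))  # "5eb63bbbe01eeed093cb22bb8f5acdc3"
```

The surrounding double quotes are part of the header value itself, so they must be sent on the wire.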
Tests
The tester will execute your program like this:
$ ./your_server.sh --directory <directory>
The tester will also create a file called hello
in the directory with contents hello world
.
It will then send a GET /files/hello
request:
$ curl -i http://localhost:4221/files/hello
Your server must respond with a 200 OK
response. The response should have the ETag
header set to the MD5 hash of the file contents, and should be quoted. The response body should contain the file contents, and other previously required headers like Content-Type
and Content-Length
should be present as well.
Here’s the expected response:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 11
ETag: "5eb63bbbe01eeed093cb22bb8f5acdc3"
hello world
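One possible way to assemble that response is sketched below. The helper name `build_file_response` is an assumption made for this example; the only point it illustrates is that the ETag header sits alongside the existing headers and that an empty line separates the headers from the body.

```python
import hashlib


def build_file_response(content: bytes) -> bytes:
    """Build a raw HTTP/1.1 200 response for a file, including its ETag header."""
    etag = '"' + hashlib.md5(content).hexdigest() + '"'
    header_lines = [
        "HTTP/1.1 200 OK",
        "Content-Type: application/octet-stream",
        f"Content-Length: {len(content)}",
        f"ETag: {etag}",
    ]
    # Each header line ends with CRLF; a blank line marks the end of the headers.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + content


print(build_file_response(b"hello world").decode("ascii"))
```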
Challenge 2: Handling If-None-Match header
-
Marketing Description: In this stage, you’ll implement full ETag caching, letting clients avoid unnecessary downloads when files haven’t changed!
-
Difficulty: Medium
In this stage, you’ll complete the ETag mechanism by handling the If-None-Match
request header.
How If-None-Match Works
Once the client has the ETag for a resource, it can use the If-None-Match
header in subsequent requests to check if the resource has changed. The server will compare the ETag in the If-None-Match
header with the current ETag for the resource.
If the ETags match, the server will respond with a 304 Not Modified
response, indicating that the resource has not changed and the client can use its cached version. If the ETags do not match, the server will respond with a 200 OK
response, including the new ETag and the updated resource.
In this stage, your server will need to handle the If-None-Match
header in the request and respond accordingly.
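The decision described above could look roughly like this. It is a sketch under simplifying assumptions: the function name `handle_get` and its return shape are hypothetical, and only a single ETag in If-None-Match is handled (the real header may also carry a comma-separated list or `*`).

```python
import hashlib
from typing import Dict, Optional, Tuple


def handle_get(content: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on a file with the given content."""
    # Always compute the ETag of the file as it is right now.
    current = '"' + hashlib.md5(content).hexdigest() + '"'

    if if_none_match is not None and if_none_match.strip() == current:
        # Cache hit: no body, but the current ETag is still sent.
        return 304, {"ETag": current}, b""

    # Cache miss (or no conditional header): send the full resource.
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": str(len(content)),
        "ETag": current,
    }
    return 200, headers, content
```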
Tests
The tester will run your server like this:
$ ./your_server.sh --directory <directory>
It will also create a file called hello
in the directory with contents hello world
.
Test 1: Cache Hit
The tester will send a GET /files/hello
request with the If-None-Match
header set to the correct ETag for the file (to simulate a cache hit):
$ curl -i http://localhost:4221/files/hello --header "If-None-Match: \"5eb63bbbe01eeed093cb22bb8f5acdc3\""
Your server must respond with a 304 Not Modified
response:
HTTP/1.1 304 Not Modified
ETag: "5eb63bbbe01eeed093cb22bb8f5acdc3"
The body of the response should be empty, so no Content-Type
or Content-Length
headers should be present. The ETag
header must be present in the response.
Test 2: Cache Miss
The tester will update the contents of the file hello
to hello universe
(to show a new version of the resource) and then send a GET /files/hello
with the If-None-Match
header set to the old, now outdated ETag (to simulate a cache miss):
$ curl -i http://localhost:4221/files/hello --header "If-None-Match: \"5eb63bbbe01eeed093cb22bb8f5acdc3\""
This time, your server must respond with a 200 OK
response that carries the new ETag in its headers and the updated file contents in its body:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 14
ETag: "342a4db326c2b213840c7d4967cb183e"
hello universe
Notes
-
Always calculate the current ETag before comparing it with the If-None-Match header.
-
Header names (such as If-None-Match) are case-insensitive.
-
Always include the current ETag, even in 304 responses.
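To illustrate the note about case-insensitive header names, here is a small sketch that normalizes names to lowercase while parsing. The function name `parse_headers` and the input shape (a list of already-split header lines) are assumptions made for this example.

```python
from typing import Dict, List


def parse_headers(lines: List[str]) -> Dict[str, str]:
    """Parse 'Name: value' lines into a dict keyed by lowercase header name."""
    headers: Dict[str, str] = {}
    for line in lines:
        # Split on the first colon only, since values may contain colons too.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # Not a header line; skip it.
        headers[name.strip().lower()] = value.strip()
    return headers


parsed = parse_headers(["Host: localhost:4221", 'IF-NONE-MATCH: "abc"'])
print(parsed.get("if-none-match"))  # "abc"
```

With this approach, a lookup by `"if-none-match"` works regardless of how the client capitalized the header name.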
Challenge 3: Weak ETags
-
Marketing Description: Learn about weak ETags and how they are used to provide weaker guarantees about a resource's version.
-
Difficulty: Easy
In this stage, you will learn about weak ETags and how they provide weaker caching guarantees.
Weak ETags
In HTTP, ETags can be strong or weak:
-
Strong ETags: Match only when the two versions of the resource are byte-for-byte identical (default behavior).
-
Weak ETags: Allow slight differences between the cached and current versions of the resource (like timestamps).
Weak ETags are prefixed with W/
like:
ETag: W/"etag-value"
They are useful when strong ETags (which require identical content) are impractical to generate efficiently, or for large resources that have not undergone significant revisions between requests.
In this stage, you will add support for returning Weak ETags for large files.
For the purpose of this challenge, we will assume that:
-
All files larger than 1KB (1024 bytes) are “large” files, and thus computing a strong ETag for them is wasteful.
-
The weak ETag of a large file is computed as the MD5 hash of the first 1024 bytes of the file content, quoted and prefixed with W/.
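The two assumptions above can be sketched as follows. The names `LARGE_FILE_THRESHOLD` and `etag_for` are hypothetical, and the sketch reads "longer than 1KB" as strictly more than 1024 bytes.

```python
import hashlib

# Assumption for this sketch: files strictly larger than 1024 bytes count as "large".
LARGE_FILE_THRESHOLD = 1024


def etag_for(content: bytes) -> str:
    """Return a weak ETag for large files and a strong ETag otherwise."""
    if len(content) > LARGE_FILE_THRESHOLD:
        # Weak ETag: hash only the first 1024 bytes and add the W/ prefix.
        digest = hashlib.md5(content[:LARGE_FILE_THRESHOLD]).hexdigest()
        return f'W/"{digest}"'
    # Strong ETag: hash the full content.
    return '"' + hashlib.md5(content).hexdigest() + '"'


print(etag_for(b"hello world"))  # "5eb63bbbe01eeed093cb22bb8f5acdc3"
print(etag_for(b"hi" * 1024))    # a W/-prefixed value
```

Because only the first 1024 bytes are hashed, two large files that differ only after that point get the same weak ETag, which is the trade-off this stage accepts.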
Tests
The tester will run your server like this:
$ ./your_server.sh --directory <directory>
It will also create a file called hello
in the directory whose contents are the word hi
repeated 1024 times (the file size would be 2KB
and the content hihihi...
).
The tester will then send a GET /files/hello
request:
$ curl -i http://localhost:4221/files/hello
Your server must respond with a 200 OK
response and return the weak ETag for the file along with the file content:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 2048
ETag: W/"7231c8b7ea44acd0562363f248982abc"
hihihi...
It will then also test the cache hit and cache miss cases (as in the last challenge) to ensure the If-None-Match
functionality has no regressions.
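For those cache-hit checks, note that HTTP specifies weak comparison for If-None-Match: two ETags match if their quoted values are identical, regardless of a W/ prefix on either side. A minimal sketch is below. The helper names `opaque_tag` and `weak_match` are hypothetical, and splitting on commas assumes the ETag values contain no commas (true for MD5 hex digests).

```python
def opaque_tag(etag: str) -> str:
    """Strip surrounding whitespace and an optional W/ prefix, keeping the quoted value."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def weak_match(if_none_match: str, current_etag: str) -> bool:
    """Return True if any ETag listed in If-None-Match weakly matches the current one."""
    candidates = [opaque_tag(part) for part in if_none_match.split(",")]
    return opaque_tag(current_etag) in candidates


print(weak_match('W/"abc"', 'W/"abc"'))  # True
print(weak_match('W/"old"', 'W/"abc"'))  # False
```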