Skip to content

🐛(backend) manage race condition between GET and PATCH content#2271

Open
lunika wants to merge 1 commit intomainfrom
fix/content-race-condition
Open

🐛(backend) manage race condition between GET and PATCH content#2271
lunika wants to merge 1 commit intomainfrom
fix/content-race-condition

Conversation

@lunika
Copy link
Copy Markdown
Member

@lunika lunika commented May 5, 2026

Purpose

When a PATCH and a GET on the content endpoint are made at the same time
for different users a race condition can happen and the metadata returned
by the S3 head_object can be outdated when the object is fetched leading
to an error raised because the Content-Length header does not match the
size of the response body. To avoid this, we no longer used head_object
followed bu get_object, we have to manage
everything in one call with the get_object. The get_object also accepts
as parameters an etag or last-modified header and will return a 304 if
the content has not changed, so we can use this to not return the entire
body if this one has not changed.

Proposal

  • 🐛(backend) manage race condition between GET and PATCH content

@lunika lunika requested a review from jmaupetit May 5, 2026 13:38
@lunika lunika self-assigned this May 5, 2026
@lunika lunika added bug Something isn't working enhancement improve an existing feature labels May 5, 2026
When a PATCH and a GET on the content endpoint are made at the same time
for different users a race condition can happen and the metadata returned
by the S3 head_object can be outdated when the object is fetched leading
to an error raised because the Content-Length header does not match the
size of the response body. To avoid this, we no longer used head_object
followed bu get_object, we have to manage
everything in one call with the get_object. The get_object also accepts
as parameters an etag or last-modified header and will return a 304 if
the content has not changed, so we can use this to not return the entire
body if this one has not changed.
@lunika lunika force-pushed the fix/content-race-condition branch from 7bde43e to bae3015 Compare May 5, 2026 13:42
Copy link
Copy Markdown

@jmaupetit jmaupetit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have minor comments, else 👏

Comment thread src/backend/core/api/utils.py
Comment thread src/backend/core/api/viewsets.py
Comment thread src/backend/core/api/viewsets.py
Comment thread src/backend/core/api/viewsets.py
Comment on lines +1992 to +1995
if code in ("304", "PreconditionFailed", "NotModified"):
return drf_response.Response(status=status.HTTP_304_NOT_MODIFIED)
if code in ("NoSuchKey", "404"):
return StreamingHttpResponse(b"", content_type="text/plain", status=200)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using the amazing match/case syntax? 😍

Suggested change
if code in ("304", "PreconditionFailed", "NotModified"):
return drf_response.Response(status=status.HTTP_304_NOT_MODIFIED)
if code in ("NoSuchKey", "404"):
return StreamingHttpResponse(b"", content_type="text/plain", status=200)
match code:
case "304" | "PreconditionFailed" | "NotModified":
return drf_response.Response(status=status.HTTP_304_NOT_MODIFIED)
case "NoSuchKey" | "404":
return StreamingHttpResponse(b"", content_type="text/plain", status=200)
case _:
raise

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was not allowed by the Geneva Convention to use a switch in Python.

Comment on lines +2002 to +2003
# Refresh the metadata cache so future conditional requests can
# check them earlier
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this would be good enough?

Suggested change
# Refresh the metadata cache so future conditional requests can
# check them earlier
# Refresh the metadata cache

def _stream(file_key):
with default_storage.open(file_key, "rb") as f:
while chunk := f.read(8192):
while chunk := body.read(8192):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think the chunk size should be configurable?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also to avoid using a while loop, you can use iter_chunks:

Suggested change
while chunk := body.read(8192):
for chunk in body.iter_chunks(chunk_size):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the iter_chunks is doing the same thing. I have no strong opinion. I would prefer to have a solution to have an async generator.

while chunk := body.read(8192):
yield chunk
finally:
body.close()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the try/finally pattern is required here to close the http response stream (as you return a generator)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nop I don't really know. I keep it from the previous implementation.


def _request(**headers):
"""Build a request with the given HTTP headers."""
return APIRequestFactory().get("/", headers=headers)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not declaring a pytest fixture instead?

assert if_none_match == '"deadbeef"'
assert if_modified_since_dt == dt.datetime(
2015, 10, 21, 7, 28, 0, tzinfo=dt.timezone.utc
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests here could be configured as one single test using indirect parametrization of the proposed fixture.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement improve an existing feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants