fix fastapi search.json discrepancies#11517
Conversation
…ditionally skip legacy WSGI app initialization during pytest.
…rameterized test.
…g a new helper function.
…arameters to the search endpoint, including a new test.
…AllowedParams` to handle underscore prefixes with aliases.
cdrini
left a comment
There was a problem hiding this comment.
Greeeeaaattt work!! This is becoming so much tidier 😊
…pe ignore for field splitting
|
For posterity sake, here's the set of "tests" I wrote to track my progress trying to get the pydantic model working with list queries. Spoiler##### The tests here are to show that it's hard to get the lists working for query params
def test_multi_key(): # noqa: PLR0915
app = FastAPI()
# This doesn't work because it expects the author keys to be in the body
@app.get("/search.json")
async def search_works(
author_key: list[str],
):
return {'author_key': author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 422
assert response.json() != {'author_key': ['OL1A', 'OL2A']}
assert response.json()['detail'][0]['type'] == 'missing'
assert response.json()['detail'][0]['loc'] == ['body']
# This test does work because we're explicitly using Query but we want it moved into a Pydantic model
app = FastAPI()
@app.get("/search.json")
async def search_works2(
author_key: Annotated[list[str], Query()],
):
return {'author_key': author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# This test does work because we're explicitly using query but we don't want None
app = FastAPI()
@app.get("/search.json")
async def search_works3(
author_key: Annotated[list[str] | None, Query()] = None,
):
return {'author_key': author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# This this says body is missing again ok
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str]
@app.get("/search.json")
async def search_works4(
params: SearchParams,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 422
assert response.json() != {'author_key': ['OL1A', 'OL2A']}
assert response.json()['detail'][0]['type'] == 'missing'
assert response.json()['detail'][0]['loc'] == ['body']
# Ok so now this works. Yay!
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str]
@app.get("/search.json")
async def search_works5(
params: Annotated[SearchParams, Query()],
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# But what if there are other params? Uh oh then they're missing...
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str]
@app.get("/search.json")
async def search_works6(
params: Annotated[SearchParams, Query()],
q: str | None = None,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 422
assert response.json()['detail'][0]['type'] == 'missing'
assert response.json()['detail'][0]['loc'] == ['query', 'params']
# So Gemini says it'll work if we use Depends instead of query! But then we get a body missing :(
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str]
@app.get("/search.json")
async def search_works7(
params: Annotated[SearchParams, Depends()],
q: str | None = None,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 422
# assert response.json() == {'author_key': ['OL1A', 'OL2A']}
assert response.json()['detail'][0]['type'] == 'missing'
assert response.json()['detail'][0]['loc'] == ['body']
# So what if we make it clearer that it's a query param? Woah that works!
"""
It seems to work because:
1. Depends(): Tells FastAPI to explode the Pydantic model into individual arguments (dependency injection).
2. Field(Query([])): Overrides the default behavior for lists. It forces FastAPI to look for ?author_key=...
in the URL query string instead of expecting a JSON array in the request body.
The Field part is needed because FastAPI's default guess for lists inside Pydantic models is wrong for your use case.
It guesses "JSON Body," and you have to manually correct it to "Query String."
"""
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str] = Field(Query([]))
@app.get("/search.json")
async def search_works8(
params: Annotated[SearchParams, Depends()],
q: str | None = None,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# A quick check to make sure it's ok with no params
response = client.get('/search.json')
assert response.status_code == 200
assert response.json() == {'author_key': []}
# But wait I think doing Query([]) is not great to put a mutable class in the default.
# However, pydantic said don't worry about it.
# https://docs.pydantic.dev/latest/concepts/fields/#mutable-default-values
# Lets try to use the "proper way" just in case
# And it works great! But it's ugly so lets not do it
app = FastAPI()
class SearchParams(BaseModel):
author_key: list[str] = Field(Query(default_factory=list))
@app.get("/search.json")
async def search_works9(
params: Annotated[SearchParams, Depends()],
q: str | None = None,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# A quick check to make sure it's ok with no params
response = client.get('/search.json')
assert response.status_code == 200
assert response.json() == {'author_key': []}
# But wait AI says there's a modern standard of Annotated
# However, after looking at the fullstack fastapi template https://github.com/fastapi/full-stack-fastapi-template
# It seems they don't do it so we shouldn't have to either
# So in summary we should use search_works8 I think!
app = FastAPI()
class SearchParams(BaseModel):
author_key: Annotated[list[str], Field(Query([]))]
@app.get("/search.json")
async def search_works10(
params: Annotated[SearchParams, Depends()],
q: str | None = None,
):
return {'author_key': params.author_key}
client = TestClient(app)
response = client.get('/search.json?author_key=OL1A&author_key=OL2A')
assert response.status_code == 200
assert response.json() == {'author_key': ['OL1A', 'OL2A']}
# A quick check to make sure it's ok with no params
response = client.get('/search.json')
assert response.status_code == 200
assert response.json() == {'author_key': []} |
cdrini
left a comment
There was a problem hiding this comment.
This looks great! I tested on testing and everything seemed to work!
- Facets worked: https://testing.openlibrary.org/search.json?q=sherlock+holmes&mode=everything&language=ger&fields=key,title,editions,language (editions are in german)
- The
queryjson parameter works - Multiple subjects work https://testing.openlibrary.org/search.json?q=sherlock+holmes&mode=everything&language=ger&subject_facet=English+Detective+and+mystery+stories&subject_facet=Fiction&fields=key,title,editions
None of the comments here are blockers, but curious mainly to the question about the unit tests. Otherwise lgtm and will merge up tomorrow!
| """ | ||
| query: dict[str, Any] = {} | ||
| if query_str: | ||
| query = json.loads(query_str) |
There was a problem hiding this comment.
In the previous version, sort/page/etc were also in the ?query={...} parameter
There was a problem hiding this comment.
If you mean the previous version of this PR then I think it was just a mistake.
If you mean the web.py version of this endpoint then I'm not sure where you see that can you elaborate?
They certainly can specify those things in the ?query={...} param but we don't really check/validate that as of now. Would be good idea for the future though.
There was a problem hiding this comment.
Oh, I mean in the web py version you can specify sort inside the query json object, and it will be used instead of the sort parameter. Here, you can only specify sort as a URL parameter, not in the query Json object. But it's not super high priority, since this feature isn't really used.
…ng for author_key
When we first put out the fastapi search.json we thought passing along all the parameters was good enough for testing.
Turns out there were a few discrepancies with how we were handling lists and pagination.
This PR makes the final changes to have everything be a 1:1 match with the old search endpoint.
Technical
Testing
Lots of tests have been added.
Screenshot
Stakeholders
@cdrini