-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-45456 Move from Flask to FastAPI +424-488 #32
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #32 +/- ##
==========================================
+ Coverage 81.90% 84.90% +3.00%
==========================================
Files 4 4
Lines 503 550 +47
==========================================
+ Hits 412 467 +55
+ Misses 91 83 -8 ☔ View full report in Codecov by Sentry. |
c365fe9
to
8999914
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks alright -- the only issue that is functionally incorrect is that all of those async def
route handlers should be def
.
Some resources that might help you out as you start your FastAPI journey:
- Russ's recommended FastAPI pattern technote: https://sqr-072.lsst.io/ . None of this is mandatory and lots of apps vary from it for different reasons. But the closer you stay to this, the more people will be able to quickly jump in and help with the app.
- SQRE just put on a boot camp that covered a lot of about FastAPI and how it is deployed in Rubin. Recordings of all the talks are here: https://confluence.lsstcorp.org/display/DM/SQuaRE+Bootcamp+-+May+6-10+2024
- You should probably stop in to #dm-square on Slack and use the link at the top to sign up for an "Ask SQuaRE" slot to stop by their weekly Q&A, just to introduce yourself. SQRE has sort of paved the way for doing this kind of work and will be helpful folks to talk to if you get stuck. (This is Frossie's team. Frossie, Adam, Angelo, and Stelios are in Tucson -- have you met them yet? If not you should come to lunch with us tomorrow. A couple of the key players are remote -- Russ and Jonathan.)
- I'm in room M-15 all day every day if you need anything.
python/lsst/consdb/pqserver.py
Outdated
|
||
Returns | ||
------- | ||
json_dict: `dict` [ `str`, `Any` ] | ||
JSON response with a list of instruments, observation types, and | ||
data types. | ||
""" | ||
logger.info("GET /") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FastAPI by default logs the paths that are accessed, so you don't need to manually. (Or it might be one of the layers under it -- asgi or uvicorn -- but something puts it in the logs anyway.)
python/lsst/consdb/pqserver.py
Outdated
"/", | ||
description="Metadata and health check endpoint.", | ||
include_in_schema=False, | ||
response_model=IndexResponseModel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to manually declare the response_model here -- FastAPI already knows what it is from the type annotation on the function.
python/lsst/consdb/pqserver.py
Outdated
|
||
message: str = Field(..., title="Human-readable response message") | ||
key: str = Field(..., title="The name of the added key") | ||
instrument: str = (Depends(validate_instrument_name),) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think FastAPI will do anything with a Depends
if it is anywhere other than a function parameter to a route handler.
Generally speaking Depends
isn't used for validation logic -- it is for pulling in things like a database connection, user authentication info, etc. Generally things that are external resources that are part of your system, not data from the user. See the docs at https://fastapi.tiangolo.com/tutorial/dependencies/. (They're also really helpful for unit testing because it gives you a way to inject mocks easily.)
Since it looks like you want this instrument name validation logic in a few places, one good way to write this might be something like:
# Declare a reusable, validated string data type
InstrumentName = Annotated[str, AfterValidator(validate_instrument_name)]
class AddKeyResponseModel(BaseModel):
...
instrument : InstrumentName
...
async def add_flexible_metadata_key(
instrument: InstrumentName
...
Note that you will need to modify validate_instrument_name
so it raises ValueError
instead of HttpException
.
If it's a one-off, you can declare validation logic directly on the field in a few different ways as well, see the docs at https://docs.pydantic.dev/latest/concepts/validators/
python/lsst/consdb/pqserver.py
Outdated
|
||
|
||
class AddKeyRequestModel(BaseModel): | ||
key: str = Field(..., title="The name of the added key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just omit the default value instead of explicitly writing ...
for required fields.
python/lsst/consdb/pqserver.py
Outdated
key: str = Field(..., title="The name of the added key") | ||
dtype: AllowedFlexTypeEnum = Field(..., title="Data type for the added key") | ||
doc: Optional[str] = Field("", title="Documentation string for the new key") | ||
unit: Optional[str | None] = Field(None, title="Unit for value") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some type-related nitpicks here. Optional[T]
is exactly equivalent to T | None
, so saying Optional[str | None]
is redundant -- you can just say str | None
or Optional[str]
.
For cases like doc
where you have provided a default value and don't want to let it be None
, you can just declare it as
doc: str = Field("", title=...)
From the end-user perspective it is still "optional" because if they don't provide it, it will default to the empty string. If you had declared it as Optional[str]
they could pass in a null
in the JSON and it would pass validation, turning into None
in your model.
python/lsst/consdb/pqserver.py
Outdated
response_model=AddKeyResponseModel, | ||
description="Add a flexible metadata key for the specified instrument and obs_type.", | ||
) | ||
async def add_flexible_metadata_key( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to be EXTREMELY careful about the difference between async def
and def
in FastAPI.
If you use a blocking, synchronous function inside an async def
handler, no other requests will be handled until the function returns. Unless there is an await
statement in your function to return control back to the event loop, the function will prevent everything else in FastAPI from running. (async/await
is basically cooperative multi-tasking.)
Generally speaking, the two places it is correct to use async def
in FastAPI handlers (or dependency functions, exception handlers, etc.) are:
- When your function body contains an
await
statement to do background I/O - Short, fast functions that do no I/O or long-running computations.
In this function you are doing a synchronous call to sqlalchemy (there is no await
), so this will block. (sqlalchemy does provide an async/await
version of its API if its configured with the right driver -- you might consider switching to that so you can use async
throughout.)
If you instead declare the function with plain def
, FastAPI will run this on a thread pool for you so it does not block other requests.
See https://fastapi.tiangolo.com/async/ for more info.
python/lsst/consdb/pqserver.py
Outdated
description="Flex schema for the given instrument and observation type.", | ||
) | ||
async def get_flexible_metadata_keys( | ||
instrument: Annotated[str, Depends(validate_instrument_name)] = Path(..., title="Instrument name"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the discussion above, it doesn't make sense to use Depends
and Path
in the same declaration -- either the data is coming from the path, or it's coming from the dependency function -- it can't be both.
(This looks like it is working because validate_instrument_name
is separately declaring a Path
for pulling in the instrument name. The value is coming entirely from validate_instrument_name
, not the Path
declaration here.)
python/lsst/consdb/pqserver.py
Outdated
table: str, | ||
data: InsertMultipleRequestModel = Body(..., title="Data to insert or update"), | ||
u: Optional[int] = Query(0, title="Update if data already exist"), | ||
) -> dict[str, Any] | tuple[dict[str, str], int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This return value doesn't appear to be correct (it's actually returning a GenericResponse
).
Mypy flags this in my editor -- it looks like this repo might not be running typechecks in the GitHub actions.
python/lsst/consdb/pqserver.py
Outdated
return m[0](v) | ||
|
||
|
||
class ObsIdColname(str, Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it do when you subclass from both str
and Enum
? I've never seen that before and I don't see anything about it in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know what I was thinking there, but I was going for StrEnum.
One other thing is that you may want to preserve the case of the instrument names and observation types unless you have a good reason to do otherwise. These show up in many different contexts and they typically have a very specific casing. (Ask Tim about this maybe.) |
Dockerfile.pqserver
Outdated
# Environment variables that must be set: | ||
# DB_HOST DB_PASS DB_USER DB_NAME or POSTGRES_URL | ||
|
||
# Expose the port. | ||
EXPOSE 8080 | ||
|
||
ENTRYPOINT [ "gunicorn", "-b", "0.0.0.0:8080", "-w", "2", "consdb-pq.pqserver:app" ] | ||
ENTRYPOINT [ "gunicorn", "-b", "0.0.0.0:8080", "-w", "2", "-k", "uvicorn.workers.UvicornWorker", "consdb_pq.pqserver:app" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually people just invoke uvicorn
directly e.g.
uvicorn consdb_pq.pqserver:app --host 0.0.0.0 --port 8080
and I think that would let you drop the dependency on gunicorn
as well.
Small changes in API behavior:
Astropy is added as a dependency for validation of UCD and units.