Making and Deploying an AI Web App in 2023 (Part 6)
Containerize an App with Docker
Now that your web app is working locally, it’s time to think about deploying it somewhere. The easiest way to do that is to build a Docker image and ship it to wherever you want to deploy. A Docker image guarantees that your environment is reproducible wherever you take it.
Build a Docker Image
In our app, we have a two-stage process: first we build our Python package into a wheel file, and then we run the web server. We can therefore use Docker’s multi-stage build feature. Splitting the process into two stages lets us keep a minimal Python environment in the final image: we install the full dev environment in the first stage, but only the dependencies we actually need in the final one.
This is our `Dockerfile`, which should be in the root of our project:
```dockerfile
FROM python:3.10 AS builder

# install hatch
RUN pip install --no-cache-dir --upgrade hatch

COPY . /code

# build python package
WORKDIR /code
RUN hatch build -t wheel

FROM python:3.10

# copy wheel package from stage 1
COPY --from=builder /code/dist /code/dist
RUN pip install --no-cache-dir --upgrade /code/dist/*

# copy the serving code and our database
COPY app.py /code
COPY articles.sqlite /code

# run web server
WORKDIR /code
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]
```
We should also add a `.dockerignore` file, to avoid having Docker copy all your files into the build context. See more about this in Docker’s docs.
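As an example, a minimal `.dockerignore` for a project like this might look as follows (a sketch; adjust the entries to your own repo layout):

```
# local virtualenvs and caches
.venv/
__pycache__/
*.pyc

# local build artifacts (the wheel is rebuilt inside the image)
dist/

# version control
.git/
```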
To simplify our workflow, we should also add a couple of scripts to our `pyproject.toml`, to build and run our Docker image:
```toml
[tool.hatch.envs.default.scripts]
build = "docker buildx build . -t ai-web-app:latest"
serve-docker = ["docker run -p 5000:8080 ai-web-app:latest"]
```
You can then run:

```shell
hatch run build
```

and the Docker image will be built. Afterwards, you should run:

```shell
hatch run serve-docker
```

which will serve the app on your local port 5000.
You can again test with curl, by running this command in a new terminal window:
```shell
curl -X GET "http://127.0.0.1:5000/search?query=symptoms%20of%20covid"
```
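If you prefer, you can build the same request URL from Python instead of spelling out the percent-encoding by hand (a sketch; actually sending the request requires the container from the previous step to be running):

```python
from urllib.parse import urlencode

# build the same search request URL that the curl command uses;
# urlencode handles escaping the spaces in the query for us
base_url = "http://127.0.0.1:5000/search"
query_string = urlencode({"query": "symptoms of covid"})
url = f"{base_url}?{query_string}"
print(url)  # → http://127.0.0.1:5000/search?query=symptoms+of+covid
```

Note that `urlencode` encodes spaces as `+` rather than `%20`; both are valid in a query string.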
This request should return the same result as the one in Part 5.
[Optional] Optimize the Image
In this case, the container startup takes around two minutes on my machine. That could be fine if we were deploying the image on our own premises, where we would start it once and keep it running. However, we want to deploy it to a serverless provider, which means the container will need to start up from scratch for more or less every request.
Therefore, we should have a look at what we can do to minimize this startup time.
In our app, every time the container starts, the database is re-indexed. Another thing that takes time is downloading the `all-MiniLM-L6-v2` model (see Part 2), which also happens every time the container starts.
There is an easy fix in this case: since the model and the database will be the same for every container we launch, we can simply move these two steps into the build process. The build will then take a bit longer, but the startup will be almost instant.
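For the model download, one way to do this is to add a `RUN` step to the final stage of the `Dockerfile` (a sketch: it assumes the app loads the model through the `sentence-transformers` library, so adapt the call to however your code actually loads it):

```dockerfile
# pre-download the embedding model at build time, so containers
# start with it already in the image's local cache
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('all-MiniLM-L6-v2')"
```

The same idea applies to the indexing step: a build-time script that builds the index over `articles.sqlite` and saves it to disk would let the server load a prebuilt index at startup instead of re-indexing.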
Indeed, this is exactly what we did in this commit.
To continue this tutorial, go to Part 7.
For comments or questions, use the Reddit discussion or reach out to me directly via email.