Making and Deploying an AI Web App in 2023 (Part 7)
Build a CI/CD Pipeline for an AI App
This is part of a multi-part blogpost about how to build an AI Web App. Please refer to Part 1 for more context.
This post uses GitHub Actions. Possible alternatives are: GitLab Pipelines, Jenkins, pypyr, and many others.
Now that we have a first version of our app working, it’s time to set up a CI pipeline. In this post, we’ll build a very simple pipeline that runs linting and unit/integration tests.
Set Up the CI Pipeline
The first thing we need to do is create the file .github/workflows/backend.yaml in the root of our repository:
name: Backend CI
# pull_request is listed so that the integration job's
# `github.event_name == 'pull_request'` condition can actually match
on: [push, pull_request]
jobs:
  run:
    name: Run on Python ${{ matrix.python-version }} (${{ matrix.os }})
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        python-version: ["3.10"] # can use other python versions
        os: ["ubuntu-latest"] # can test in other OS
    steps:
      - uses: actions/checkout@v3
      - name: Check for CRLF endings
        uses: erclu/check-crlf@v1
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install hatch
        run: python -m pip install -U pip hatch
      - name: Lint
        run: hatch run lint
      - name: Test
        run: hatch run cov
The file above will run a job that clones our repo, checks if all files have LF line endings (if you’re using Windows to develop, that can be a common issue), installs Python and hatch, and runs our linter and unit tests.
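If you want to catch line-ending problems before pushing, the same kind of check can be approximated locally. The snippet below is only a minimal sketch in Python, not how erclu/check-crlf is implemented; the function names has_crlf and find_crlf_files, the skipped .git directory, and the NUL-byte test for binary files are all assumptions made for illustration.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the bytes contain a Windows-style (CRLF) line ending."""
    return b"\r\n" in data


def find_crlf_files(root: str, skip_dirs=(".git",)) -> list[str]:
    """Return the files under root (as relative paths) that contain CRLF endings."""
    offenders = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Ignore anything inside a skipped directory such as .git
        if any(part in skip_dirs for part in relative.parts):
            continue
        data = path.read_bytes()
        # Crude heuristic (assumption): files containing NUL bytes are binary
        if b"\0" in data:
            continue
        if has_crlf(data):
            offenders.append(relative.as_posix())
    return offenders
```

Calling find_crlf_files(".") from the repository root returns the offending files, so an empty list means the CRLF step in the pipeline should have nothing to complain about.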
Using the hatch scripts (see Part 3 for details) greatly simplifies the pipeline. With the scripts, it’s easy to make sure that we run the same command locally as on the CI pipeline, since all the parameters are in the config file.
We also want to run the integration tests. For that, we will need to download our database (or, if it’s too big, a subset of our real database). This download will take some time, and the integration tests themselves are also heavy. Therefore, we don’t want to run this job for every commit. In this case, we’ll limit the integration tests job to run only on commits in Pull Requests, or in the main branch.
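To make that rule concrete, it can be written as a small predicate over the two values GitHub Actions exposes for a run, github.event_name and github.ref. This is only an illustrative model in Python, not how GitHub evaluates workflow expressions; the function name should_run_integration, the branch name my-feature, and the pull request number 12 are made up for the example.

```python
def should_run_integration(event_name: str, ref: str) -> bool:
    """Model of the rule: run for pull request events, or for the main branch."""
    return event_name == "pull_request" or ref == "refs/heads/main"


# A few representative runs and whether the integration job would be included.
examples = [
    ("push", "refs/heads/main"),             # commit pushed to main
    ("push", "refs/heads/my-feature"),       # commit pushed to a feature branch
    ("pull_request", "refs/pull/12/merge"),  # commit in an open pull request
]
for event_name, ref in examples:
    decision = should_run_integration(event_name, ref)
    print(f"{event_name:13} {ref:24} -> {decision}")
```

Note that the event name is only ever pull_request if the workflow is actually triggered by pull request events; a workflow that only listens to pushes will never satisfy the first half of the rule.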
This is our job configuration (just append it to .github/workflows/backend.yaml):
  integration:
    runs-on: ubuntu-latest
    # only run on pull requests and main branch
    if: ${{ github.event_name == 'pull_request' || github.ref == 'refs/heads/main' }}
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.10
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Download database
        run: |
          wget https://github.com/neuml/txtai/releases/download/v1.1.0/tests.gz
          mv tests.gz articles.sqlite.gz
          gunzip articles.sqlite.gz
      - name: Install hatch
        run: python -m pip install hatch
      - name: Run integration tests
        run: hatch run integration
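To run the integration tests locally with the same data, especially on a machine without wget or gunzip, the "Download database" step can be mirrored in Python. This is a sketch under assumptions rather than part of the original project: the helper names decompress_gzip and fetch_database are hypothetical, and the file is assumed to belong in the current directory as articles.sqlite, as in the workflow.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# URL taken from the "Download database" step of the workflow
DB_URL = "https://github.com/neuml/txtai/releases/download/v1.1.0/tests.gz"


def decompress_gzip(src: Path, dest: Path) -> Path:
    """Unpack the gzip file src into dest, streaming to keep memory use low."""
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest


def fetch_database(url: str = DB_URL, dest: Path = Path("articles.sqlite")) -> Path:
    """Download and unpack the database, doing nothing if dest already exists."""
    if dest.exists():
        return dest
    archive = dest.with_name(dest.name + ".gz")
    urllib.request.urlretrieve(url, archive)  # saved as articles.sqlite.gz
    decompress_gzip(archive, dest)
    archive.unlink()  # like gunzip, drop the archive once it is unpacked
    return dest
```

Calling fetch_database() once before hatch run integration leaves articles.sqlite in place, and later calls return immediately because the file already exists.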
The first time you set up a CI pipeline, it’s very common for something not to behave as you expected, since the code is running on a different machine for the first time (even though the code itself is the same). In my case, I had to make some small changes to my code to get it working.
Having this pipeline always running is a good start, as it ensures that tests are run regularly and that you’re notified if they fail. As a next step, you can look at building and pushing Docker images from the pipeline (such as the image we built in Part 6).
To continue this tutorial, go to Part 8.
For comments or questions, use the Reddit discussion or reach out to me directly via email.