Tuesday, May 21, 2019

Patrick Kennedy: Setting Up GitLab CI for a Python Application

Introduction

This blog post describes how to configure a Continuous Integration (CI) process on GitLab for a python application.  This blog post utilizes one of my python applications (bild) to show how to setup the CI process:

In this blog post, I’ll show how I setup a GitLab CI process to run the following jobs on a python application:

  • Unit and functional testing using pytest
  • Linting using flake8
  • Static analysis using pylint
  • Type checking using mypy

What is CI?

To me, Continuous Integration (CI) means frequently testing your application in an integrated state.  However, the term ‘testing’ should be interpreted loosely as this can mean:

  • Integration testing
  • Unit testing
  • Functional testing
  • Static analysis
  • Style checking (linting)
  • Dynamic analysis

To facilitate running these tests, it’s best to have these tests run automatically as part of your configuration management (git) process.  This is where GitLab CI is awesome!

In my experience, I’ve found it really beneficial to develop a test script locally and then add it to the CI process that gets automatically run on GitLab CI.

Getting Started with GitLab CI

Before jumping into GitLab CI, here are a few definitions:

 – pipeline: a set of tests to run against a single git commit.

  – runner: GitLab uses runners on different servers to actually execute the tests in a pipeline; GitLab provides runners to use, but you can also spin up your own servers as runners.

  – job: a single test being run in a pipeline.

  – stage: a group of related tests being run in a pipeline.

Here’s a screenshot from GitLab CI that helps illustrate these terms:

GitLab utilizes the ‘.gitlab-ci.yml’ file to run the CI pipeline for each project.  The ‘.gitlab-ci.yml’ file should be found in the top-level directory of your project.

While there are different methods of running a test in GitLab CI, I prefer to utilize a Docker container to run each test.  I’ve found the overhead in spinning up a Docker container to be trivial (in terms of execution time) when doing CI testing.

Creating a Single Job in GitLab CI

The first job that I want to add to GitLab CI for my project is to run a linter (flake8).  In my local development environment, I would run this command:

$ flake8 --max-line-length=120 bild/*.py

This command can be transformed into a job on GitLab CI in the ‘.gitlab-ci.yml’ file:

image: "python:3.7"

before_script:
  - python --version
  - pip install -r requirements.txt

stages:
  - Static Analysis

flake8:
  stage: Static Analysis
  script:
  - flake8 --max-line-length=120 bild/*.py

This YAML file tells GitLab CI what to run on each commit pushed up to the repository. Let’s break down each section…

The first line (image: “python: 3.7”) instructs GitLab CI to utilize Docker for performing ALL of the tests for this project, specifically to use the ‘python:3.7‘ image that is found on DockerHub.

The second section (before_script) is the set of commands to run in the Docker container before starting each job. This is really beneficial for getting the Docker container in the correct state by installing all the python packages needed by the application.

The third section (stages) defines the different stages in the pipeline. There is only a single stage (Static Analysis) at this point, but later a second stage (Test) will be added. I like to think of stages as a way to group together related jobs.

The fourth section (flake8) defines the job; it specifies the stage (Static Analysis) that the job should be part of and the commands to run in the Docker container for this job. For this job, the flake8 linter is run against the python files in the application.

At this point, the updates to ‘.gitlab-ci.yml’ file should be commited to git and then pushed up to GitLab:

git add .gitlab_ci.yml
git commit -m "Updated .gitlab_ci.yml"
git push origin master

GitLab Ci will see that there is a CI configuration file (.gitlab-ci.yml) and use this to run the pipeline:

This is the start of a CI process for a python project!  GitLab CI will run a linter (flake8) on every commit that is pushed up to GitLab for this project.

Running Tests with pytest on GitLab CI

When I run my unit and functional tests with pytest in my development environment, I run the following command in my top-level directory:

$ pytest

My initial attempt at creating a new job to run pytest in ‘.gitlab-ci.yml’ file was:

image: "python:3.7"

before_script:
  - python --version
  - pip install -r requirements.txt

stages:
  - Static Analysis
  - Test

...

pytest:
  stage: Test
  script:
  - pytest 

However, this did not work as pytest was unable to find the ‘bild’ module (ie. the source code) to test:

$ pytest
========================= test session starts ==========================
platform linux -- Python 3.7.3, pytest-4.5.0, py-1.5.4, pluggy-0.11.0
rootdir: /builds/patkennedy79/bild, inifile: pytest.ini
plugins: datafiles-2.0
collected 0 items / 3 errors
============================ ERRORS ====================================
___________ ERROR collecting tests/functional/test_bild.py _____________
ImportError while importing test module '/builds/patkennedy79/bild/tests/functional/test_bild.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/functional/test_bild.py:4: in <module>
    from bild.directory import Directory
E   ModuleNotFoundError: No module named 'bild'
...
==================== 3 error in 0.24 seconds ======================
ERROR: Job failed: exit code 1

The problem encountered here is that the ‘bild’ module is not able to be found by the test_*.py files, as the top-level directory of the project was not being specified in the system path:

$ python -c "import sys;print(sys.path)"
['', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

The solution that I came up with was to add the top-level directory to the system path within the Docker container for this job:

pytest:
  stage: Test
  script:
  - pwd
  - ls -l
  - export PYTHONPATH="$PYTHONPATH:."
  - python -c "import sys;print(sys.path)"
  - pytest

With the updated system path, this job was able to run successfully:

$ pwd
/builds/patkennedy79/bild
$ export PYTHONPATH="$PYTHONPATH:."
$ python -c "import sys;print(sys.path)"
['', '/builds/patkennedy79/bild', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

Final GitLab CI Configuration

Here is the final .gitlab-ci.yml file that runs the static analysis jobs (flake8, mypy, pylint) and the tests (pytest):

image: "python:3.7"

before_script:
- python --version
- pip install -r requirements.txt

stages:
- Static Analysis
- Test

mypy:
stage: Static Analysis
script:
- pwd
- ls -l
- python -m mypy bild/file.py
- python -m mypy bild/directory.py

flake8:
stage: Static Analysis
script:
- flake8 --max-line-length=120 bild/*.py

pylint:
stage: Static Analysis
allow_failure: true
script:
- pylint -d C0301 bild/*.py

unit_test:
stage: Test
script:
- pwd
- ls -l
- export PYTHONPATH="$PYTHONPATH:."
- python -c "import sys;print(sys.path)"
- pytest

Here is the resulting output from GitLab CI:

One item that I’d like to point out is that pylint is reporting some warnings, but I find this to be acceptable. However, I still want to have pylint running in my CI process, but I don’t care if it has failures. I’m more concerned with trends over time (are there warnings being created). Therefore, I set the pylint job to be allowed to fail via the ‘allow_failure’ setting:

pylint:
  stage: Static Analysis
  allow_failure: true
  script:
  - pylint -d C0301 bild/*.py


from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...