Sunday, October 27, 2019

Python Diary: Python and Docker development

As I will soon be starting a new full-time job that uses Docker as its primary deployment technology, I thought this would be the best time to jump back into Docker and see what has changed since I last tried it. It has definitely evolved into a much more useful product, and used correctly, it can be extremely powerful. Here are some use-cases for Docker I can think of:

  • Server software isolation in a powerful cgroup-enabled chroot jail
  • Self-contained development environments which can be easily shared
  • Supporting legacy Linux server software in a safely contained environment
  • Running programs using a completely different Linux distribution
  • Scaling out a large microservice stack using either Swarm or Kubernetes

Now, Docker isn't without its security flaws, as I recently found out. Basically, if an attacker gains access to an account which is part of the docker group on Linux, they can easily gain full root access to that system. Some may argue that once an attacker gets in, it is best to completely wipe that machine and reinstall from scratch anyway, as there is no telling what they may have done or exploited. However, it is still best to have proper barriers in place to prevent an unauthorized person from gaining root in the first place. This way, you can still recover whatever data is on that machine with at least some certainty that it has not been tampered with. You will of course want to scrub said data, and perhaps compare it against a recent backup, to confirm it can be imported into the newly created machine.

The underlying problem is that any user in the docker group has full access to the Docker daemon to manage images and containers. With this ability, you can easily use a bind mount to map the entire root file system into a container, which then gives you full read/write access to the host. An attacker can proceed to change the root password, modify various system files, and generally cause damage which could have been avoided if Docker wasn't exposed. When I normally configure a Linux server, I do not install sudo, or any packages which could otherwise be abused to gain root. However, I am soon planning on using Docker on my current cloud servers, partially as an experiment, and partially because I do not have enough time to upgrade my software from an aging Debian server. I will not be placing any user in the docker group, and will instead use the stellar SaltStack and its Docker states to configure the containers on the cloud server. This provides a nice isolation between user and system processes.
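To make the risk concrete, here is a minimal sketch of the escalation described above; any member of the docker group can run it, no sudo required:

# Bind-mount the host's root file system into a container, then chroot
# into it; the shell that opens is effectively a root shell on the host.
docker run --rm -it -v /:/host debian chroot /host /bin/bash

From that shell an attacker can edit /etc/shadow, plant SSH keys, and so on, which is exactly why access to the Docker daemon needs to be guarded as carefully as root itself.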

However, you aren't here to read about the use-cases of Docker, or to hear me ramble on about its potential security flaws when configured incorrectly. You are here to read about how you can use Docker with a local Python development environment. If you are in a position where you are putting together Docker images for an organization, here is a word of advice: I would not recommend using the various software images on Docker Hub. You may be asking, why's that? There are plenty of pre-built images you can easily pull in and use. The problem is efficiency. If you do not have a very large stack, or are just playing around with various software trying to learn it, then pulling images from Docker Hub is perfectly fine. However, for production servers, you always want to aim for the most efficient setup, so your application stack can scale without worry. If you pull images from Docker Hub, you will end up with a large number of incompatible base images, leading to a large amount of downloads and a large amount of customization on your part. As time goes on, you may find it very difficult to manage all these various images.

My suggestion is to use a single base operating system image with lots to offer. I personally use Debian, so I spin my images from a base Debian image, which I do pull from Docker Hub. From this, I make one simple customization: pointing the Debian mirror at my local mirror, which drastically speeds up image building. That mirror-customized Debian image is my absolute base image. I then build on top of it with the various base software packages I will be needing, and create several custom images from that.

The reason this is so much more efficient than pulling random images from Docker Hub is simple: local disk and memory caches. Since every image shares the same base layers, the shared libraries from your ultimate base image can be cached and managed once by the host kernel, leading to lower memory and disk usage. If you opt for ZFS, the savings in both space and overall performance can be huge, as all your software shares the same sectors on the underlying disk media. As your application stack grows, memory and disk will start to matter more, and will have a real effect on the application's overall performance.
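As an illustration, that mirror customization can be a two-instruction Dockerfile. This is just a sketch: "apt.example.lan" is a placeholder for whatever local mirror you actually run, and the tag reflects the current stable release.

# Absolute base image: stock Debian with APT pointed at a local mirror.
FROM debian:buster

# Replace the default mirror; "apt.example.lan" is a placeholder hostname.
RUN sed -i 's|deb.debian.org|apt.example.lan|g' /etc/apt/sources.list

Every other image in the stack then begins with a FROM line naming this image, so they all share the same cached base layers.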

And there I go rambling again... Let's get started with a custom Dockerfile I created to enable easier development with Django. Both this Dockerfile and the entrypoint script below can be adapted to your specific use-cases, so feel free to use them as you please.

FROM python:2.7.9

RUN pip install Django==1.11.17

ADD entrypoint.py /

VOLUME /app

EXPOSE 8000

ENTRYPOINT ["/entrypoint.py"]
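Note that entrypoint.py must be marked executable (chmod +x entrypoint.py) before building, since ADD preserves the file's permissions from the build context. To build the image under the tag the helper script further below expects, run this from the directory containing both files:

# Build the development image; the django:dev tag matches the run script below.
docker build -t django:dev .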

Yes, I know, Python 2.7.x loses support as of January 2020. As stated above, please customize this for your own use-cases. The above Python image may no longer be available on Docker Hub, so either roll your own base image, or update that line to point to an acceptable image. Next, let's take a look at the entrypoint script:

#!/usr/bin/python

import sys, os, django

print "Kevin's Django Development Docker Image\n\n"
print "Django Version: %s" % django.get_version()
print "PID: %s" % os.getpid()

# Auto-detect the Django project inside /app and install any requirements.
proj_dir = None
app_dir = os.listdir('/app')

if len(app_dir) == 0:
    print " *** No Django Project currently exists! ***"
else:
    for d in app_dir:
        if d == 'requirements.txt':
            os.system('pip install -r /app/requirements.txt')
        else:
            try:
                os.stat('/app/%s/manage.py' % d)
                print "Found Project: %s" % d
                proj_dir = '/app/%s' % d
            except OSError:
                pass

if proj_dir:
    os.chdir(proj_dir)
else:
    os.chdir('/app')

params = len(sys.argv)

if params == 1:
    # No arguments: run the development server for the detected project.
    if proj_dir:
        args = ['/usr/bin/python', '%s/manage.py' % proj_dir, 'runserver', '0.0.0.0:8000']
    else:
        sys.exit(2)
else:
    if sys.argv[1] == 'bash':
        # Drop into an interactive shell inside the container.
        os.execl('/bin/bash', 'bash')
    elif sys.argv[1] == 'startproject':
        if proj_dir:
            print " *** The project directory '%s' already exists! ***" % proj_dir
            sys.exit(2)
        if params != 3:
            print " *** Missing parameter to startproject! ***"
            sys.exit(2)
        args = ['django-admin', '/usr/local/bin/django-admin.py', 'startproject', sys.argv[2]]
    else:
        # Pass any other arguments straight through to manage.py.
        if not proj_dir:
            print " *** No Django Project currently exists! ***"
            sys.exit(2)
        args = ['manage.py', '%s/manage.py' % proj_dir] + sys.argv[1:]

os.execv('/usr/bin/python', args)

If you are using Python 3, update the print statements and anything else that needs it. This entrypoint script is very tailored towards Django development, as you can see. It does quite a few checks within the /app directory: it checks for a requirements.txt file and installs those packages into the container, and it attempts to auto-detect the Django project directory by searching for manage.py. Depending on whether a Django project is found, it will allow the use of startproject, or let you call sub-commands of the manage.py command effortlessly. The idea is that you use a bind mount to map in the directory on your host file system where your Django project lives. This allows you to easily develop using your favorite IDE on your host machine, while keeping a consistent, reproducible development environment within the container itself. I even created a nice shell script to allow starting this development environment with zero effort.

#!/bin/sh

# First argument: host directory containing (or that will contain) the project.
PROJ_DIR=$1
shift

docker run -it -v "$PROJ_DIR:/app" -p 8000:8000 --rm django:dev "$@"

This could be updated to allow the port to be changed, and so on. You can run it from within the project directory and pass `pwd`, for example, to have it use the current directory. Because the container is removed on exit, you get a pristine start every time, so you know the packages in the environment always match requirements.txt.
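For example, assuming the script above is saved as run-django.sh (the name is just my choice here) and made executable, a typical session looks like this:

# Create a new project in the current directory, then run the dev server:
./run-django.sh `pwd` startproject mysite
./run-django.sh `pwd`

# Or open a shell inside the container for debugging:
./run-django.sh `pwd` bash

The development server is then reachable at http://localhost:8000 on the host.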

In the next article, which I will be publishing shortly, I will provide another Dockerfile and entrypoint.py which will allow you to run any Python web framework under the powerful uWSGI application server within a Docker container.


