CSpace-Django webapps - setting up a production environment using Apache

Introduction

The Python-based Django framework has been chosen at UC Berkeley to provide additional functionality within deployed instances of CollectionSpace. Uses include reporting and other collection administration tasks, as well as outward facing, public web-sites for presentation of the data stored in a CollectionSpace instance.

Normally, you may assume that this environment already exists and is operating on UC Berkeley VMs used for CollectionSpace. Indeed, if you have a choice, you should choose a VM that already has the required packages installed.

But if you are unlucky enough to have to do it yourself, here are the steps we have used at UC Berkeley to feather the bed for CSpace-Django webapps.

Below are the steps necessary for installing and running a Django web application, using the Apache Webserver and WSGI as the front-end. Much of the configuration of a Django application will be project-specific, so this is not intended as a one-size-fits-all approach.

Relates to: /wiki/spaces/collectionspace/pages/666276118
Relates to: CSpace-Django webapps - setting up configuration for your project
Relates to: CSpace-Django webapps - setting up a code repository

This document will cover the installation steps only for Linux platforms. Because UC Berkeley deployments run on RedHat Linux machines, the main focus will be on yum-managed servers. Some attention will be given to aptitude- (or apt-) based systems.

Installation

Install Python

Django requires any Python version from 2.6.5 to 2.7.x. To ensure Python is installed and is the required version, type into the shell:

python -V
Python 2.7.1

Linux systems are highly dependent on the default installed version of Python. If the default version of Python is not adequate, DO NOT try to upgrade it unless you fully understand the consequences. Rather, install a usable version alongside it. In the later steps, keep in mind that you are then not using the default version, particularly when installing mod_wsgi.
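As a complement to python -V, here is a minimal sketch of a version check that can be run with whichever interpreter you intend to use for the webapps. The range limits are taken from the text above; the names is_supported, MIN_VERSION and MAX_EXCLUSIVE are made up for this example and are not part of Django.

```python
import sys

# Range taken from the text above: 2.6.5 up to (but not including) 3.0.
MIN_VERSION = (2, 6, 5)
MAX_EXCLUSIVE = (3, 0, 0)


def is_supported(version):
    """Return True if a (major, minor, micro) tuple lies in the supported range."""
    return MIN_VERSION <= tuple(version[:3]) < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    status = "OK" if is_supported(current) else "outside the supported range"
    # sys.executable shows which interpreter was actually checked.
    print("%s: Python %d.%d.%d is %s" % ((sys.executable,) + current + (status,)))
```

If you have installed a second Python alongside the system one, run the script with that interpreter's full path (for example /usr/local/bin/python2.7, assuming that is where it lives) rather than with plain python.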

Ed. note - Add instructions for installing Python?

Install Apache

It is assumed that the Apache server is already installed. Apache comes pre-installed on most Linux distributions.

On systems with the Yum package manager (RedHat, Fedora, CentOS ...), you can check if it's installed via

yum list installed | grep httpd

Be sure that httpd-tools are installed, as well:

httpd.x86_64                       2.2.15-29.el6_4
httpd-tools.x86_64                 2.2.15-29.el6_4

And on systems with the Aptitude package manager (Debian, Ubuntu ...):

dpkg --get-selections | grep apache2

Be sure that apache2-utils are installed, as well.

If not installed, on Yum systems, it can be installed with the command:

yum install httpd
yum install httpd-tools

And on Aptitude systems:

apt-get install apache2
apt-get install apache2-utils

It may turn out that some components need to be compiled from source (see below), in which case the full Apache development package will be needed. We will come back to that.

Install mod_wsgi (for connecting Django to Apache)

In order to run under Apache, Django requires an adapter that connects Apache to a Django project. The two best known adapters are mod_python and mod_wsgi. The recommended adapter is mod_wsgi, unless mod_python is already installed and in use on your server. The two cannot be used together.

To check if mod_wsgi is installed:

On Yum systems:

yum list installed | grep mod_wsgi

On Aptitude systems:

dpkg --get-selections | grep -v deinstall | grep libapache2-mod-wsgi

To install mod_wsgi:

The next step works if the default system version of Python is the one that you're using for CSpace-Django webapps. If you've had to install another version of Python, you will need to configure mod_wsgi to use that version, which requires compiling mod_wsgi from source.

On Yum systems:

yum install mod_wsgi

On Aptitude systems:

apt-get install libapache2-mod-wsgi

Usually, mod_wsgi will be installed and automatically enabled in Apache. To verify that this has been done correctly, check that a directive beginning with LoadModule wsgi_module appears somewhere in the Apache configuration, and add that directive if it does not.

On a RedHat system, for example, that directive should be present in a file within the /etc/httpd/conf.d/ directory, such as django.conf or wsgi.conf, and should read:

LoadModule wsgi_module modules/mod_wsgi.so

On an Ubuntu system using apt, that directive should be present in a file within the /etc/apache2/mods-enabled directory (typically wsgi.load) and should read:

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so
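The check for the LoadModule directive can also be scripted. The following is a rough sketch, not an official tool: the function name find_wsgi_module and the script name used below are invented for illustration, and the script only looks at the files you pass to it (it does not follow Include directives).

```python
import re
import sys

# Matches an uncommented "LoadModule wsgi_module <path>" directive.
LOAD_WSGI = re.compile(
    r"^\s*LoadModule\s+wsgi_module\s+(\S+)", re.IGNORECASE | re.MULTILINE
)


def find_wsgi_module(config_text):
    """Return the module path from the first LoadModule wsgi_module line, or None."""
    match = LOAD_WSGI.search(config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Print the result for each configuration file named on the command line.
    for path in sys.argv[1:]:
        with open(path) as handle:
            print("%s: %s" % (path, find_wsgi_module(handle.read())))
```

For example, on a RedHat system you might run it as python check_wsgi.py /etc/httpd/conf.d/*.conf (check_wsgi.py being a hypothetical file name). A result of None for every file suggests that the directive still needs to be added.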

There is one more step to getting Apache hooked up with Django, but it doesn't make much sense at this stage, so we will return to it after getting the Django application framework installed.

mod_wsgi Issues

mod_wsgi must be compiled with the same version of python that will be used in the Django applications. If mod_wsgi is installed with yum or apt, the version installed will have been compiled with the current system default version of python (which cannot be upgraded). If that is not the version of python that will be used by Django, then that version of mod_wsgi will not work.

Alternate versions of mod_wsgi may be available through apt and yum: look for packages with names like "mod_wsgi26" or "mod_wsgi27". There are also a limited number of pre-compiled binaries available at code.google.com. Unfortunately, none of them are for Linux platforms.

The last resort is to compile mod_wsgi from source, which is available at http://code.google.com/p/modwsgi/downloads/detail?name=mod_wsgi-3.4.tar.gz&can=2&q=.

Compiling and Installing mod_wsgi

If no appropriate binary version of mod_wsgi is available, download the source code and unpack it in a convenient place:

tar -zxf mod_wsgi-3.4.tar.gz
cd mod_wsgi-3.4
./configure
make

The default configuration will probably not be good enough, since you will most likely want to compile against a different version of Python (why else would you be compiling from source?), so the configure command might look more like:

./configure --with-python=/usr/local/bin/python2.7

If configure reports an error about not finding the Apache apxs tool, add yet another option:

./configure --with-python=/usr/local/bin/python2.7  --with-apxs=/usr/local/apache/bin/apxs

Apache apxs is part of the apache2-dev package on Debian (apt based) systems or the httpd-devel package on yum based systems. These are not installed by default, so if you do not find apxs then the appropriate development package must be installed.

When the mod_wsgi source code has been configured and compiled successfully, run

make install

which will copy the compiled mod_wsgi.so file into the apache2 or httpd modules directory.

Install Pip

Pip is a tool for installing and managing Python packages.

  • Begin by checking whether Pip is installed, and if so, which version:

    sudo yum list installed | grep -i pip
    

    or

    which pip
    
  • If Pip is not installed, install it.
    • On RedHat systems, this is a slightly roundabout process:
      1. Install the python-devel package:

        sudo yum install python-devel
        
      2. Install python-setuptools:

        sudo yum install python-setuptools
        
      3. Install Pip:

        sudo easy_install pip
        
      4. Update setuptools using Pip:

        sudo pip install setuptools --upgrade
        
    • For apt-based systems, the process is straight-forward:

      sudo apt-get install python-pip python-dev build-essential
      

      Installing the python-dev and build-essential packages alongside Pip is recommended, because without them it will not be possible later on to install any Python module that ships with a C extension.

  • Upgrade Pip (same process for apt- and yum-compatible systems)

    sudo pip install --upgrade pip
    

    Please install any Python libraries that you might need.

Install Virtualenv

  • Check that virtualenv is installed

    which virtualenv
    
  • If it isn't already present, install virtualenv:

    sudo pip install virtualenv
    

Install Virtualenvwrapper

Virtualenvwrapper is a wrapper for Virtualenv that allows for easier management of multiple environments on the same server. Its post-activate and pre-deactivate hooks might create an easy way to, for example, point to configuration files with passwords that one would like to manage outside of a Git environment.

  • Check that virtualenvwrapper is installed

    which virtualenvwrapper.sh
    
  • If it isn't already present, install Virtualenvwrapper (same for apt- and yum-compatible systems):

    sudo pip install virtualenvwrapper
    

Install Django

Django 1.5 requires a 2.x Python version higher than 2.5. Python 2.6 or higher will work, while 2.7 or higher is recommended. Do not (yet) use Python 3.x, as Django 1.5 still has only experimental support for that version of Python.

Django 1.6 likewise runs on Python 2.6 or 2.7, and is also compatible with Python 3.x.

Make a Django directory

  • Make a Django directory. This directory will contain all of your virtualenvs and Django projects:

    cd /usr/local/share
    sudo mkdir django
    
  • Give read/write access to the users who will need to create and edit Django webapps. The following instructions assume that you have previously created a Unix group called developers that includes these users.

    sudo chgrp developers django
    
  • The entire directory tree leading to a Django app must be readable by whatever account is running the http server, typically apache or, on Ubuntu, www-data. To facilitate this, make the Django directory readable and executable by everyone. This will ensure that Apache can read the entire directory path down to the level of each of your Django projects. (By default, the parent /usr/local/share directory already is readable and executable by everyone.)

    sudo chmod 775 django
    

    Ed. note - this makes the django directory writable by group. Explain why we do this?
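To see where the path to a Django project is not readable and executable by everyone, the directory chain can be walked with a short script. This is only a sketch under simplifying assumptions: it looks solely at the "other" permission bits, so it ignores access that Apache might have as owner or group member, and it knows nothing about ACLs or SELinux. The function names are made up for this example.

```python
import os
import stat

WORLD_RX = stat.S_IROTH | stat.S_IXOTH  # o+r and o+x permission bits


def world_rx(mode):
    """Return True if the mode grants read and execute to 'other' users."""
    return (mode & WORLD_RX) == WORLD_RX


def blocked_dirs(path):
    """List each directory from the filesystem root down to path that lacks o+rx."""
    path = os.path.abspath(path)
    blocked = []
    while True:
        if not world_rx(os.stat(path).st_mode):
            blocked.append(path)
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return list(reversed(blocked))
```

Calling blocked_dirs("/usr/local/share/django") should return an empty list once the chmod 775 step above has been applied and the parent directories have their usual permissions.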

Create a virtual environment

  • Create a virtual environment in a subdirectory within the Django directory. (The virtual environment in this example happens to be named "venv26", referring to the use of a Python 2.6.x version within that environment.)

    cd django
    sudo virtualenv --no-site-packages venv26
    sudo chgrp developers venv26/
    sudo chmod g+w venv26/
    
  • Make the virtual environment active:

    cd venv26
    source bin/activate
    

    You should see the virtual environment name (venv26) displayed in your shell prompt once you've activated the environment.

Install Django

  • Install Django within the virtual environment:

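    sudo sh -c ". ../venv26/bin/activate; pip install django"
    

    A note about the syntax used above: the whole sequence runs in a single root shell, so the virtual environment is activated in the same shell in which pip runs, and Django is installed within that environment. The command uses "." rather than "source" because sh on some systems (for example dash on Debian and Ubuntu) does not provide the source command. If run as two separate steps, the command sudo pip install django would start a new root shell without the activated environment and install Django in a system directory.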
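To confirm which interpreter a command is really using, you can ask Python itself whether it is running inside a virtual environment. The sketch below relies on two interpreter attributes: sys.real_prefix, which the classic virtualenv tool sets, and sys.base_prefix, which newer Python versions and newer virtualenv releases provide. The helper name in_virtualenv is invented for this example.

```python
import sys


def in_virtualenv(prefix, base_prefix=None, real_prefix=None):
    """Guess whether an interpreter with these prefixes runs in a virtual environment."""
    if real_prefix is not None:  # set by the classic virtualenv tool
        return True
    # Newer interpreters: base_prefix differs from prefix inside an environment.
    return base_prefix is not None and base_prefix != prefix


if __name__ == "__main__":
    active = in_virtualenv(
        sys.prefix,
        getattr(sys, "base_prefix", None),
        getattr(sys, "real_prefix", None),
    )
    print("sys.prefix = %s (virtual environment: %s)" % (sys.prefix, active))
```

Run inside the activated environment, the printed prefix should point into the venv26 directory; run through a bare sudo python, it will normally show a system directory instead.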

(Optional) Install Python module to support Postgres bindings:

  • Install psycopg2

    export PATH=$PATH:/usr/pgsql-9.2/bin
    pip install psycopg2
    

    Ed. note: not yet tested.
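Once the packages are installed, a quick import test run with the virtual environment's interpreter shows whether they are actually visible there. This is a minimal sketch; missing_modules is a name made up for this example, and the module list should be adjusted to whatever your project needs.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'django' and 'psycopg2' are the packages installed in the steps above.
    print("Missing: %s" % (missing_modules(["django", "psycopg2"]) or "none"))
```

If psycopg2 is reported as missing although pip claimed success, it may have been installed for a different interpreter than the one running the script.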

Configuration

This section covers configuring Django to run under Apache.

From the Django website: "Deploying Django with Apache and mod_wsgi is a tried and tested way to get Django into production. mod_wsgi is an Apache module which can host any Python WSGI application, including Django. Django will work with any version of Apache which supports mod_wsgi."

Before continuing onto the next steps, it's strongly recommended that you read the information at How to use Django with Apache and mod_wsgi.

At this point, if you already have your Django code repository in place, you may jump to CSpace-Django webapps - setting up configuration for your project. That document provides instructions for installing and configuring the Django project code, then configuring Apache to work with that code, rather than a sample 'hello world' project. Be sure to review the sections on SELinux issues, sqlite3 issues, etc. towards the bottom of this page.

Create a 'hello world' project to verify configuration

It's a good idea at this point to create a 'hello world' Django project, to verify that Apache and Django are configured correctly to talk to one another. To create a "hello world" project, change to your Django directory and enter the Django startproject command:

cd /usr/local/share/django
django-admin.py startproject hello_world

This will make a sub-directory named hello_world, and populate it with a standard set of project files.

At some place inside each of your Django project directories, Apache Web Server will need to be able to write files, so make another directory inside of the hello_world directory named, for instance, apache. (There's nothing special about the name, and you can call it anything that seems appropriate.)

cd hello_world
mkdir apache

Make whatever account is running the http server, such as apache or (on Ubuntu) www-data, the owner of that directory with full access rights; e.g.

sudo chown apache:apache apache
sudo chmod u+rwx apache

Continue with Apache configuration

Change to Apache's directory for user-created configuration files, conf.d. Depending on your system, this directory might be located at /etc/httpd/conf.d or /etc/apache2/conf.d; e.g.

cd /etc/httpd/conf.d

Configure Apache directives

Create a file named django.conf in which to configure Apache directives for your Django projects.

Add the following directives to that file:

  • WSGISocketPrefix

    WSGISocketPrefix run/wsgi
  • Directory

    Ed. note: Is it necessary to grant Apache permissions beyond the django/ directory?

    <Directory "/usr/local/share/django/hello_world">
            Order allow,deny
            Allow from all
    </Directory>
    
  • WSGIScriptAlias

    WSGIScriptAlias /hello_world /usr/local/share/django/hello_world/hello_world/wsgi.py
    

You can later add multiple WSGIScriptAlias entries, one for each of your Django projects. There are also a number of other options that can be set here, in particular WSGIPythonPath, if you need additional python modules from non-default locations.
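The file named in WSGIScriptAlias must expose a WSGI callable, which mod_wsgi looks up under the name application by default. If the hello_world project does not respond, it can help to separate Apache and mod_wsgi problems from Django problems by temporarily pointing WSGIScriptAlias at a framework-free script. The following is a minimal sketch of such a script; it is not the wsgi.py that startproject generates, and the response text is arbitrary.

```python
def application(environ, start_response):
    """Minimal WSGI callable; mod_wsgi looks for the name 'application' by default."""
    body = b"Hello from mod_wsgi\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable.
    start_response("200 OK", headers)
    return [body]
```

If this script is served correctly but the Django project is not, the problem most likely lies in the project's settings or Python path rather than in the Apache configuration.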

Configure Daemon mode (WSGIDaemonProcess and WSGIProcessGroup)

“Daemon mode” is the recommended mode for running mod_wsgi (on non-Windows platforms). To create the required daemon process group and delegate the Django instance to run in it, you will need to add appropriate WSGIDaemonProcess and WSGIProcessGroup directives to django.conf:

Ed. note: Wrapping WSGIProcessGroup with a Location tag might not be necessary if there's only one project.

WSGIDaemonProcess hello_world_wsgi
<Location /hello_world>
   WSGIProcessGroup hello_world_wsgi
</Location>

A further change required to the above configuration, if you use daemon mode, is that you can’t use WSGIPythonPath; instead you should use the python-path option to WSGIDaemonProcess, for example:

WSGIDaemonProcess hello_world_wsgi python-path=/path/to/hello_world:/path/to/venv/lib/python2.7/site-packages

For more details see: https://docs.djangoproject.com/en/1.5/howto/deployment/wsgi/modwsgi/#using-mod-wsgi-daemon-mode
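The python-path value is simply the project directory and the virtual environment's site-packages directory joined by a colon, with no spaces. When several projects are configured, it can be less error-prone to generate the line than to type it. Here is a small sketch; the function name daemon_process_directive and its parameters are invented for this example, and the site-packages layout assumes a standard virtualenv on Linux.

```python
import posixpath


def daemon_process_directive(group, project_dir, venv_dir, python_version="2.7"):
    """Build a WSGIDaemonProcess line whose python-path covers project and virtualenv."""
    site_packages = posixpath.join(
        venv_dir, "lib", "python" + python_version, "site-packages"
    )
    return "WSGIDaemonProcess %s python-path=%s:%s" % (group, project_dir, site_packages)


if __name__ == "__main__":
    # Paths follow the hello_world example used in this document.
    print(daemon_process_directive(
        "hello_world_wsgi",
        "/usr/local/share/django/hello_world",
        "/usr/local/share/django/venv26",
        "2.6",
    ))
```

The printed line can be pasted into django.conf in place of the bare WSGIDaemonProcess directive.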

Sample configuration for multiple projects running in Daemon mode

The following is a sample configuration for two Django projects, pahma_project and cspace_django_project, running in parallel in Daemon mode:

# Both projects use the same virtual environment, located at /usr/local/share/django/vir274, since their dependencies do not conflict.
WSGIDaemonProcess pahma_project_wsgi python-path=/usr/local/share/django/pahma_project:/usr/local/share/django/vir274/lib/python2.7/site-packages user=apache group=apache
WSGIDaemonProcess cspace_django_project_wsgi python-path=/usr/local/share/django/cspace_django_project:/usr/local/share/django/vir274/lib/python2.7/site-packages user=apache group=apache

# Each project has a wsgi.py file in it.
WSGIScriptAlias /pahma_project /usr/local/share/django/pahma_project/cspace_django_site/wsgi.py
WSGIScriptAlias /cspace_django_project /usr/local/share/django/cspace_django_project/cspace_django_site/wsgi.py

WSGISocketPrefix run/wsgi

# The location for each webapp should be unique, and should match the value of WSGI_BASE in the project's wsgi.py file.
<Location /pahma_project>
   WSGIProcessGroup  pahma_project_wsgi
</Location>
<Location /cspace_django_project>
   WSGIProcessGroup  cspace_django_project_wsgi
</Location>

# For running two cspace django projects alongside each other, you will have to give the static root a unique name here, and change the STATIC_URL in that project's settings.py file to match.
Alias /static  /usr/local/share/django/pahma_project/static_root
Alias /cspace_django_project_static  /usr/local/share/django/cspace_django_project/static_root

<Directory "/usr/local/share/django/pahma_project/static_root">
   Order deny,allow
   Deny from all
   Allow from all
</Directory>
<Directory "/usr/local/share/django/cspace_django_project/static_root">
   Order deny,allow
   Deny from all
   Allow from all
</Directory>

Notice that we have specifically defined the user and group in the configuration, even though that may not strictly be necessary:

WSGIDaemonProcess ... user=apache group=apache

(If you do include them, be sure not to introduce typos!)

Also, note that static files are .css and .js files, images, etc.
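On the Django side, the static-file aliases in the sample configuration correspond to two settings in each project's settings.py. The fragment below sketches what this might look like for cspace_django_project, using the paths from the sample; your project's actual values may differ.

```python
# settings.py fragment for cspace_django_project (illustrative values).

# STATIC_URL must match the Alias path in django.conf, with a trailing slash.
STATIC_URL = "/cspace_django_project_static/"

# STATIC_ROOT is the directory that the Alias points to; running
# "python manage.py collectstatic" copies the static files into it.
STATIC_ROOT = "/usr/local/share/django/cspace_django_project/static_root"
```

With these values, a template reference to a static file resolves to a URL under /cspace_django_project_static/, which Apache then serves directly from the static_root directory.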

See the official mod_wsgi documentation for details on setting up daemon mode. https://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process

Additional Configuration Considerations

SELinux issues

Most recent Linux distributions come with SELinux installed and activated, so it is an issue that is likely to come up when deploying a Django application under Apache. Since SELinux policy files differ between distributions, it is not possible to give a generic configuration solution, but here are a few useful tips.

  • Use ls -lZ to show SELinux labels
  • Use chcon to change SELinux labels, where needed; e.g. chcon -t httpd_sys_content_t db.sqlite3
  • Use setsebool to change SELinux boolean values, where needed; e.g. setsebool httpd_can_network_connect_db on (add the -P option to make the change persist across reboots)
    • Use getsebool to view pertinent boolean values; e.g. getsebool -a | grep httpd

(You might find chcon, setsebool, and getsebool at a path that typically might not be in your default executables path; e.g. /usr/sbin on some systems)

In overview, we need to grant permissions to Apache on anything it needs to read from or write to.

  • Set the SELinux label on the project directory to a type that Apache is allowed to access: httpd_sys_content_t for content it only reads, or httpd_sys_rw_content_t where it must also write
  • If you have chosen to use sqlite3, add the same SELinux label to the sqlite3 database file.
  • Give apache permission to connect to the database: setsebool httpd_can_network_connect_db on.
  • For the previous setting to work this is also needed: setsebool httpd_can_network_connect on.
  • WSGI processes create temporary files in /tmp but, under default SELinux policy, do not have permission to access those files, so we also need: setsebool httpd_tmp_exec on.

Sample SELinux permissions from a deployment of pahma_project:

pahma_project directory - apache:developers drwxrwxr-x - httpd_sys_rw_content_t

Apache needs write access to this folder because it needs write access to the folder containing the database file. If we move the sqlite db, we no longer have to give apache write access to this folder.

pahma_project/db.sqlite3 - apache:developers -rw-rw-r-- - httpd_sys_rw_content_t
pahma_project/logs - apache:developers drwxrwxr-x - httpd_sys_content_t
pahma_project/static_root - apache:developers drwxrwxr-x - httpd_sys_content_t
pahma_project/logs/logfile.txt - apache:developers -rw-rw-r-- - httpd_sys_content_t
pahma_project/logs/settings.log - apache:developers -rw-rw-r-- - httpd_sys_content_t
Everything within static_root must have SELinux label httpd_sys_content_t.
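When comparing many files against the expected labels, it can help to extract just the type field from the context string that ls -Z prints (user:role:type:level). The following sketch does only that string handling; the function names are invented for this example, and obtaining the context for each file is left to you.

```python
def selinux_type(context):
    """Return the type field of a context such as 'user:role:type:level'."""
    fields = context.strip().rstrip("\x00").split(":")
    if len(fields) < 3:
        raise ValueError("not a SELinux context: %r" % context)
    return fields[2]


def label_matches(context, expected_type):
    """Return True if the context carries the expected SELinux type."""
    return selinux_type(context) == expected_type
```

For instance, label_matches("system_u:object_r:httpd_sys_content_t:s0", "httpd_sys_content_t") is true, whereas a file under static_root that still carries a home-directory type would fail the check and need chcon.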

Ed. note - The sample permissions above come from a development server and thus are more lenient than desired. On a production server, the db.sqlite3 file and logs directory should be placed in the pahma_project/apache directory. The pahma_project directory should NOT be writable. Its SELinux label should read httpd_sys_content_t. The apache directory SHOULD be writable; its label should be httpd_sys_rw_content_t.

sqlite3 Issues

Apache must have read/write permission on the sqlite3 database file and on the file's parent directory. Therefore, in the example above, we set the SELinux label on these items to be httpd_sys_rw_content_t. (In that example, the parent directory is the pahma_project folder.)
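The ordinary Unix side of this requirement can be checked with a few lines of Python, run as the account that Apache uses. This is a sketch only: sqlite_write_problems is a made-up name, the check reflects the permissions of whoever runs the script, and it does not reproduce the SELinux confinement that applies to the httpd process itself.

```python
import os


def sqlite_write_problems(db_path):
    """Return the paths (database file, parent directory) the current user cannot write."""
    db_path = os.path.abspath(db_path)
    candidates = [db_path, os.path.dirname(db_path)]
    # os.access also returns False for a path that does not exist.
    return [p for p in candidates if not os.access(p, os.W_OK)]


if __name__ == "__main__":
    # Path follows the pahma_project example used in this document.
    print(sqlite_write_problems("/usr/local/share/django/pahma_project/db.sqlite3"))
```

An empty list means both the file and its parent directory are writable for the current user; any path that is listed needs its ownership or mode adjusted.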

Sqlite3 is used to support the CollectionSpace authentication webapp that is included within the cspace_django_project. See Installing CSpace-enabled Webapps.