Created by: gwideman, Dec 4, 2013 4:17 pm
Revised by: gwideman, Dec 5, 2013 8:53 pm (9 revisions)

  • Work in progress
  • Part of a series which starts here.

Scope/Overview of deployment and packaging topic

[Overview diagram of package - deploy - install: files arranged well for dev on dev machine --> Packaging --> Distribution package --> To user machine --> Install --> files in an arrangement that runs]
The job of packaging and deployment is to get the necessary files to the target machine and place them in an arrangement that works properly. For Python projects, the process on the developer's machine needs to assemble enough files and metadata in the distribution package for the installation process on the user's machine to handle the following:
  • copy and lay out files:
    • copy Python files to directories that make sense relative to sys.path
    • copy other functionality (if any) like DLLs or SOs to locations where the operating system can find them for loading
  • compile
    • compiled-language items (if any) that the distribution delivers as source code
  • fetch and install third-party items (if any)
    • that the project depends upon but doesn't supply in the distribution package (if not already on the target machine)
  • configure
    • take care of registry entries, PATH entries, links/shortcuts, etc. that may be needed for the user to be able to, for example, launch the application.
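
As a rough illustration of the "copy and lay out files" step, the standard-library sysconfig module reports the directories that the running interpreter treats as install targets. This is only a sketch for inspecting one machine; the labels in the dictionary are my own wording, and the actual paths differ between platforms and environments.

```python
import sys
import sysconfig


def install_locations():
    """Return the directories this interpreter uses as install targets."""
    paths = sysconfig.get_paths()
    return {
        "pure-Python modules": paths["purelib"],
        "compiled extension modules": paths["platlib"],
        "launcher scripts": paths["scripts"],
        "data files": paths["data"],
    }


for label, directory in install_locations().items():
    # A directory is only searched by "import" if it appears on sys.path.
    marker = "on sys.path" if directory in sys.path else "not on sys.path"
    print(f"{label:28} {directory}  ({marker})")
```

Comparing the printed directories with sys.path shows which locations are searched for imports and which (such as the scripts directory) matter to the operating system's PATH instead.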

Uninstall

The installation tools and procedure might also prepare for later uninstallation of the files and configuration changes that the installation process puts in place.

History and ongoing developments

Python's packaging landscape is currently a hodge-podge of tools and formats rather than a single coherent set of capabilities. One must understand the historical context and current developments in order to focus on the tools, formats and docs that are viable now and likely to remain so.
The following provide some essential background:

Minimal landscape orientation

Tools
  • distutils: The original packaging and installation library in the Python standard library. It is called via its "setup()" function, typically from a developer-created script called "setup.py". The setup.py is run with various command-line commands to perform different tasks, such as creating a distribution or installing a distribution.
  • setuptools: Maintained by the Python Packaging Authority (PyPA). Literally an extension to distutils, providing an enhanced set of packaging and installation functions. Uses the same setup()/setup.py calling convention. Includes an end-user-oriented installation tool called Easy Install, which is now somewhat deprecated due to its intrusive configuration when installing libraries.
    • Easy Install is not to be confused with ez_setup, a script that bootstraps the download and installation of setuptools.
  • pip: Also maintained by PyPA. A newer installation tool, which implements less-intrusive configuration.
  • zc.buildout: A popular third-party (maintained by Zope Corporation) approach to creating distribution packages and performing installation.
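
To make the setup()/setup.py convention concrete, here is a minimal sketch. The project name "example_project" and its version are hypothetical placeholders. The setup.py text is held in a string and written into a temporary project skeleton, so the example can be run safely without invoking any packaging command and without setuptools being installed.

```python
import tempfile
from pathlib import Path

# Contents of a minimal setup.py that calls setuptools.
# "example_project" and "0.1" are hypothetical placeholders.
SETUP_PY = '''\
from setuptools import setup, find_packages

setup(
    name="example_project",
    version="0.1",
    packages=find_packages(),
)
'''


def write_skeleton(root):
    """Create a tiny project tree: setup.py next to one importable package."""
    root = Path(root)
    package = root / "example_project"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    (root / "setup.py").write_text(SETUP_PY)
    # Return the created entries as sorted, relative, slash-separated paths.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


with tempfile.TemporaryDirectory() as tmp:
    for name in write_skeleton(Path(tmp) / "project"):
        print(name)
```

From inside such a project directory, the developer would typically run "python setup.py sdist" to create a source distribution, and a user could run "python setup.py install" to install it.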
Formats for distributions
  • Zip file, possibly with metadata
  • sdist: "Source distribution". Contains some arrangement of Python files and metadata files. (Do sdist files ever contain uncompiled C modules?) (What is "pure".)
  • bdist: "Binary distribution". Contains some arrangement of Python files, and compiled C (or other language) modules (extensions). (Do bdists ever contain uncompiled C?)
  • egg: Specially named zip file containing a binary distribution along with more-elaborate metadata.
  • wheel: A newer distribution format with more-elaborate metadata.
Repository
Things to ignore
  • distutils2: A fork of distutils, discontinued
  • Distribute: A fork of setuptools. Discontinued, and merged back into setuptools as of 2013-06
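  • distlib: A library of lower-level distribution-related functions that grew out of the distutils2 work. It is a building block for other tools rather than something to use directly. (The effort targeted for the Python 3.3 standard library and then withdrawn was distutils2, under the name "packaging".)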

The salient tools and formats

(Same layout as diagram in Scope.)
The following table is an attempt to elaborate all of the main alternatives at each of the phases in the packaging-deployment-install cycle. Which pieces can be used together is the topic of later sections.
Project components and layout
  • Python modules, packages, scripts
  • Non-python "extension" functionality like C modules
  • Data files
  • Third-party library
Developer machine: Packaging alternatives (possibly with build step)
  • Plain zip
  • Dev-written setup.py calling distutils (built-in library).
  • setup.py, calling setuptools (extension to distutils)
  • pip. Does it have a role in creating distributions?
  • buildout
Distribution package formats
  • Plain zip
  • Zip with __main__.py
  • sdist "source distrib"
  • bdist "binary distrib"
  • egg (several variants)
  • wheel
Delivery alternatives
  • Privately; via file sharing, web server etc
  • Via upload-->repository-->download. Eg. PyPI (especially for 3rd party libs)
User machine: Installation alternatives (possibly with build step)
  • Manual
  • Zip with __main__.py
  • Dev-supplied setup.py calling distutils
  • Dev-supplied setup.py calling distutils extended by setuptools.
  • "Easy Install", which is equivalent to dev-supplied setup.py calling setuptools
  • pip
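
The "Zip with __main__.py" alternative is the simplest of these to try out, because the Python interpreter can run a zip archive directly when the archive has a __main__.py at its top level. The sketch below builds such an archive with the standard-library zipfile module and runs it in a subprocess. The file names "app.zip" and "greeting.py" are made up for illustration.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def build_zip_app(zip_path):
    """Write a zip archive whose top-level __main__.py is the entry point."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        # An ordinary module shipped inside the archive.
        archive.writestr("greeting.py", "MESSAGE = 'hello from inside the zip'\n")
        # The interpreter runs this file when given the archive as its script.
        archive.writestr("__main__.py", "import greeting\nprint(greeting.MESSAGE)\n")
    return zip_path


def run_zip_app(zip_path):
    """Run the archive the way a user would: python app.zip"""
    result = subprocess.run(
        [sys.executable, str(zip_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    app = build_zip_app(Path(tmp) / "app.zip")
    print(run_zip_app(app))
```

"Installation" in this case is just copying the one file to the user's machine. The archive is placed on sys.path when it runs, which is why __main__.py can import the sibling module.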

"The" recommended solution

PyPA's Quick Recommendations page, as of this writing (2013-11-06), favors the following (bold = most-recommended):
Project components and layout
(Unknown what range of projects the recommendations are for.)
Developer machine: Packaging (possibly with build step)
  • Setuptools (extension to distutils)
Distribution package format
  • sdist "source distrib"?
  • bdist "binary distrib"?
  • wheel
Delivery
  • PyPI
User machine: Installation (possibly with build step)
  • for non-eggs: pip
  • for eggs: setuptools' Easy Install
Other PyPA recommendations:
1. For more elaborate installations: Consider setting up a Python virtual environment to allow libraries (third party, or part of the distribution) to be installed without interfering with (or interference from) already-installed libraries.
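
As a sketch of what setting up a virtual environment involves, the example below uses the venv module, which has been in the standard library since Python 3.3. This is my assumption about a convenient way to demonstrate the idea; the third-party virtualenv tool serves the same purpose. The directory name "env" is arbitrary.

```python
import tempfile
import venv
from pathlib import Path


def make_environment(env_dir):
    """Create an isolated environment and return the path of its config file."""
    # with_pip=False keeps this sketch fast and offline; a real environment
    # would normally include pip so that libraries can be installed into it.
    venv.create(env_dir, with_pip=False)
    return Path(env_dir) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = make_environment(Path(tmp) / "env")
    # pyvenv.cfg records which base interpreter the environment was made from.
    print(cfg.read_text())
```

Libraries installed while such an environment is active land in the environment's own site-packages directory rather than in the system-wide one, which is what provides the isolation.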
My notes:
In light of the PyPI recommendation for delivery, I think that the recommendations here are rather narrowly focused on distributing open-source projects that are intended to be broadly shared. I think they are not useful for private projects shared among a few users, where delivery would be better accomplished by plain old copying of some kind.
It is not clear to me whether this narrow focus also means that other aspects of the recommendation are inappropriate for other use cases. This highlights the need for recommendations to specify what use cases they relate to, and in turn the need for some succinct set of use cases which the developers and documenters of these facilities could reference.