114 lines
4.3 KiB
ReStructuredText
114 lines
4.3 KiB
ReStructuredText
NLTK c-i setup
|
||
==============
|
||
|
||
This is an overview of how our `continuous integration`_ setup works. It
|
||
includes a quick introduction to the tasks it runs, and the later sections
|
||
detail the process of setting up these tasks.
|
||
|
||
Our continuous integration is currently hosted at `Shining Panda`_, free thanks
|
||
to their FLOSS program. The setup is not specific to their solutions, it could
|
||
be moved to any `Jenkins`_ instance. The URL of our current instance is
|
||
https://jenkins.shiningpanda.com/nltk/
|
||
|
||
.. _`continuous integration`: http://en.wikipedia.org/wiki/Continuous_integration
|
||
.. _`Shining Panda`: http://shiningpanda.com
|
||
.. _`Jenkins`: http://jenkins-ci.org
|
||
|
||
|
||
Base tasks
|
||
----------
|
||
|
||
The base tasks of the c-i instance is as follows:
|
||
|
||
* Check out the NLTK project when VCS changes occur
|
||
* Build the project using setup.py
|
||
* Run our test suite
|
||
* Make packages for all platforms
|
||
* Build these web pages
|
||
|
||
Because the NLTK build environment is highly customized, we only run tests on
|
||
one configuration - the lowest version supported. NLTK 2 supports python down
|
||
to version 2.5, so all tests are run using a python2.5 virtualenv. The
|
||
virtualenv configuration is slightly simplified on ShiningPanda machines by
|
||
their having compiled all relevant python versions and making virtualenv use
|
||
these versions in their custom virtualenv builders.
|
||
|
||
|
||
VCS setup/integration
|
||
---------------------
|
||
|
||
All operations are done against the `NLTK repos on Github`_. The Jenkins
|
||
instance on ShiningPanda has a limit to the build time it can use each day.
|
||
Because of this, it only polls the main NLTK repo once a day, using the `Poll
|
||
SCM` option in Jenkins. Against the main code repo it uses public access only,
|
||
and for pushing to the nltk.github.com repo it uses the key of the user
|
||
nltk-webdeploy.
|
||
|
||
.. _`NLTK repos on Github`: https://github.com/nltk/
|
||
|
||
|
||
The base build
|
||
--------------
|
||
|
||
To build the project, the following tasks are run:
|
||
|
||
1. Create a VERSION file
|
||
A VERSION file is created using
|
||
``git describe --tags --match '*.*.*' > nltk/VERSION``.
|
||
This makes the most recent VCS tag available in nltk.__version__ etc.
|
||
2. ``python setup.py build``
|
||
This essentially copies the files that are required to run NLTK into build/
|
||
|
||
|
||
The test suite
|
||
--------------
|
||
|
||
The tests require that all dependencies be installed. These have all been
|
||
installed beforehand, and to make them run a series of extra environment
|
||
variables are initialized. These dependencies will not be detailed until the
|
||
last section.
|
||
|
||
The test suite itself consists of doctests. These are found in each module as
|
||
docstrings, and in all the .doctest files under the test folder in the nltk
|
||
repo. We run these tests using nose_, find code coverage using `coverage.py`_
|
||
and check for `PEP-8`_ etc. standard violations using `pylint`_.
|
||
|
||
All these tools are easily installable through pip your favourite OS' software
|
||
packaging system. For testing, only nose_ is really needed. This is also the
|
||
only software that does not work properly out of the box. To use the options
|
||
+ELLIPSIS and +NORMALIZE_WHITESPACE in our doctests, we have installed nose
|
||
from source with `a patch that allows this`_ applied.
|
||
|
||
The results of these programs are parsed and published by the jenkins instance,
|
||
giving us pretty graphs :)
|
||
|
||
.. _nose: http://readthedocs.org/docs/nose/
|
||
.. _`coverage.py`: http://nedbatchelder.com/code/coverage/
|
||
.. _`PEP-8`: http://www.python.org/dev/peps/pep-0008/
|
||
.. _`pylint`: http://www.logilab.org/project/pylint
|
||
.. _`a patch that allows this`: https://github.com/nose-devs/nose/issues/7
|
||
|
||
|
||
The builds
|
||
----------
|
||
|
||
The packages are built using ``make dist``. The outputted builds are all placed
|
||
`in our jenkins workspace`_ and should be safe to distribute. Builds
|
||
specifically for mac are not available. File names are made based on the
|
||
``__version__`` string, so they change every build.
|
||
|
||
.. _`in our jenkins workspace`: http://example.com/
|
||
|
||
|
||
Web page builder
|
||
----------------
|
||
|
||
The web page is built using Sphinx_. It fetches all code documentation directly
|
||
from the code's docstrings. After building the page using ``make web`` it
|
||
pushes it to the `nltk.github.com repo on github`_. To push it, it needs access
|
||
to the repo – because this cannot be done using a deploy key, it has the ssh
|
||
key of the ``nltk-webdeploy`` user.
|
||
|
||
.. _Sphinx: http://sphinx.pocoo.org
|
||
.. _`nltk.github.com repo on github`: https://github.com/nltk/nltk.github.com
|