Events Guide/Text Sprints
From Open Knowledge Foundation
An overview of how to run a text sprint -- a sprint to produce some piece of text such as a guide, a manual or a book. The OKFN has been involved in several such efforts, for example the Open Data Manual and the Data Driven Journalism Handbook.
Most text sprints involve three phases:
- Planning
- Text authoring -- usually during the sprint itself
- Text consolidation -- collating, proofing and formatting the material produced, usually so that it can be provided in consolidated form online or in PDF
Planning
- Clear statement of what is supposed to be produced
- Dividing up and allocating areas to work on
- It is usually more productive if different people focus on different areas of the text
- Deciding on the platform for sprint text authoring (NB: this does not have to be the same as the platform you use to produce the finished product)
- Deciding on a time, and if relevant, the place
Text Authoring during Sprint
The big recommendations here are:
- Allow authors to use whatever system they want (though require that export is available to some standard format such as plain text, .doc, .odt ...)
This is crucial: during the sprint phase you want people to write as much as they can unencumbered by having to learn new technologies or processes.
If someone wants to use Word, let them. If they want to use vim or emacs, let them. If they want to use newfangled technology X, let them. The only requirement is, as stated above, that they can then export the material produced in some standard format.
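As a rough illustration of the export requirement, the sketch below sorts submitted files into those already in an accepted format and those whose author should be asked to export again. The list of accepted suffixes and the function name `split_by_format` are assumptions made for this example, not part of any OKFN tooling; adjust the list to whatever your sprint agrees on.

```python
from pathlib import Path

# Assumption: the formats the sprint organisers agreed to accept.
ACCEPTED_SUFFIXES = {".txt", ".rst", ".md", ".doc", ".docx", ".odt", ".html"}


def split_by_format(filenames):
    """Return (accepted, needs_export) lists based on each file's suffix."""
    accepted, needs_export = [], []
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if suffix in ACCEPTED_SUFFIXES:
            accepted.append(name)
        else:
            needs_export.append(name)
    return accepted, needs_export


if __name__ == "__main__":
    ok, todo = split_by_format(["intro.docx", "licensing.txt", "chapter3.pages"])
    print("ready:", ok)
    print("ask author to export:", todo)
```

Checking only the file suffix is deliberately crude. It is enough to catch material that is still locked in an authoring tool before the consolidation phase starts.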
The second point (though it can be overridden by the first) is:
- Use a platform that allows for real-time collaborative authoring
- For example, Etherpad, Google Docs, etc.
If a recommendation is wanted for a particular technology, we suggest using Google Docs because it a) has good standard word-processing features, b) has good collaborative editing capabilities, and c) imports from and exports to the major common formats.
Text Authoring Post Sprint
If the sprint is successful, text authoring will likely continue post-sprint (or in subsequent sprints). In this case there will be a distinction between developing new material and extending or correcting existing material. New material can be developed as in the standard sprint setup, but work on existing material will need to take account of the system used for "text consolidation" (see next section).
Text Consolidation
At the text consolidation phase, priorities are very different from the sprint phase. While your requirements may vary, the following are common:
- Revisioning -- revisioning simplifies ongoing addition of content as well as more complex operations like translation
- Easy to produce HTML and to deploy online -- you will want the result to be online at a URL of your choice
- Styling / theming capabilities -- you want the final result to look good
- Exportable to other formats -- especially PDF for printing or offline use
- Translation support -- either directly or via integration with third party tools
There are various technology options that could be suitable here. Most of these are reviewed (albeit against somewhat different criteria) in Open_Data_Manual/Technology_Options. Here, we document our recommended method, based on our own experience (we have personally tried out several other options and have found this the best so far).
Recommended Solution
Technology
This solution combines a variety of best-of-breed tools into a coherent workflow.
- Sphinx - document preparation and generation system
- Source documents are plain text using reStructuredText formatting. This makes them very easy to manage, for example in version control systems (see below), and there are also a vast number of tools for working with them.
- Variety of output formats - including HTML and PDF
- Good support for theming
- Support for structural features found in longer material such as cross-references, table of contents, indices
- Built in (basic) search
- Integrated support for translation (aka i18n)
- Version control tool such as Mercurial or Git (plus online hosting). Our current preference is Git plus GitHub.
- This allows us to store the source texts, along with build instructions, in a versioned manner
- ReadTheDocs (or Amazon S3) -- for simple deployment and hosting of the resulting material online
- ReadTheDocs provides a simple way to have your Sphinx project automatically built as HTML and posted online. It has integrated hooks for both GitHub and Bitbucket, so that documentation can be automatically regenerated on every change.
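To make the Sphinx part of this list more concrete, here is a minimal sketch of a Sphinx `conf.py` touching the theming and translation features mentioned above. The project title, author line and directory names are placeholders chosen for illustration; they are not taken from the Open Data Manual repository.

```python
# conf.py -- minimal Sphinx configuration sketch (all values are placeholders)

project = "Sprint Handbook"        # hypothetical title
author = "Sprint contributors"     # hypothetical author line
release = "0.1"

# No extensions are needed for a plain, text-only book.
extensions = []

# Files and directories Sphinx should ignore when collecting sources.
exclude_patterns = ["_build"]

# Theming: "alabaster" is the default theme installed alongside Sphinx.
html_theme = "alabaster"

# Translation: source language and where translated message catalogs live.
language = "en"
locale_dirs = ["locale/"]
gettext_compact = False            # one catalog per source document
```

With a file like this next to the source texts, running `sphinx-build -b html . _build/html` should produce the HTML version locally, and `sphinx-build -b gettext . _build/gettext` should extract translatable strings. A hosting service such as ReadTheDocs runs the HTML build for you.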
Comments: Sphinx is a move away from a what-you-see-is-what-you-get (WYSIWYG) system such as a standard word processor. This brings some costs, but it is noteworthy that almost all systems for preparing longer material like books or manuals, especially if nice formatting is required, are of this form (i.e. are not WYSIWYG).
To see an example of how this is set up, see the Open Data Manual repository on GitHub: https://github.com/okfn/opendatamanual
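To give a feel for what working with markup instead of a WYSIWYG editor looks like, the sketch below generates the reStructuredText for a root page that lists the chapter files of a book. The function name `build_index` and the chapter names are hypothetical. The `toctree` directive and its `:maxdepth:` option are standard Sphinx markup.

```python
def build_index(title, chapters):
    """Return reStructuredText for a root page listing the chapter files."""
    underline = "=" * len(title)  # reST titles are underlined to the same length
    lines = [
        title,
        underline,
        "",
        ".. toctree::",
        "   :maxdepth: 2",
        "",
    ]
    # Each entry names a source file (without extension) relative to this page.
    lines += [f"   {chapter}" for chapter in chapters]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_index("Sprint Handbook", ["introduction", "licensing"]))
```

The output is ordinary plain text, which is why it works well with version control. In practice the page would usually be written by hand, so the generator is only a way of showing the structure.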
Workflow
See Open Data Manual. Main roles are:
- Open Data Manual#Editor
- Open Data Manual#Build_Manager -- note this role will not be relevant if you use ReadTheDocs as it will be done for you
Translation
See Open Data Manual for approach.
TODO: provide a template project on GitHub that you can clone.