tychoish, a wiki


A LaTeX Build System

I've been throwing around the idea of creating a LaTeX-based document production service for people and work groups that produce a lot of paper documents. I think lawyers, academics, and PR-types are ideal examples, but I think the basic approach could be extrapolated to other kinds of work. The core of this proposal is that it takes a traditionally unstructured domain (document editing and generation) and adds structure and infrastructure, delivered as software and a service.

LaTeX is a great system for making documents: they look great, and it's easy to build templates and styles so that your documents always look consistent. These features are the result of LaTeX's structure. The downside is that the LaTeX tool chain is pretty big, and editing happens in a source format that may be a bit complicated for the uninitiated.

Components

Each of the following sections explains one of the basic components of such a system, along with a basic idea for a solution and some rationale for why I propose each aspect of the "system."

To be fair, I envision this as a loose collection of associated tools, rather than a single application. "Unix-like" or "modular" as you please.

Build System

As a web service, you need something that will take input files (source files, bibliographies or citation lists, and auxiliary style specifications) and output PDFs (and possibly HTML). One of the ways to make LaTeX usable, I think, is to take out all of the idiosyncrasies around the build process and have a stable and consistent build environment. This gets rid of weird situations where you have to intuitively know when you have to run LaTeX four times. Between cutting out the lengthy installation and configuration process and simplifying the build workflow, the win is enormous.
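To make the "run LaTeX four times" problem concrete, here is a minimal sketch of how a build service might decide when to stop. It assumes the only signal is the "Rerun to get ..." warning that LaTeX and some packages write to the log; the names `needs_rerun`, `build`, and `pdflatex_pass` are mine, and a real service would also have to handle bibliography and index passes.

```python
import re
import subprocess
from typing import Callable

# LaTeX and several packages ask for another pass with a "Rerun to get ..." warning.
RERUN_PATTERN = re.compile(r"Rerun to get")


def needs_rerun(log_text: str) -> bool:
    """Return True when the log from the last pass asks for another pass."""
    return bool(RERUN_PATTERN.search(log_text))


def build(run_pass: Callable[[], str], max_passes: int = 5) -> int:
    """Call run_pass until its log is clean; return the number of passes used."""
    for count in range(1, max_passes + 1):
        if not needs_rerun(run_pass()):
            return count
    raise RuntimeError(f"document did not settle after {max_passes} passes")


def pdflatex_pass(tex_path: str) -> str:
    """One pdflatex pass over tex_path; returns what pdflatex printed (assumes pdflatex is installed)."""
    result = subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", tex_path],
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With something like this on the server, `build(lambda: pdflatex_pass("paper.tex"))` would replace the guesswork, and the cap on passes keeps a document that never settles from looping forever.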

While there's no reason that one couldn't develop such a tool to run on a single machine, the dependencies would be complicated. If you were deploying this for a company (say), it makes sense to centralize the build process (and templates). As long as the build service and infrastructure are freely available, there's no functional reason not to have it as a service.

In terms of implementation, I think there need to be two main interfaces. The first is a website that has a box that you drag files onto, which then generates a PDF and lets you download it as a file. The second is a RESTful-style interface that can be easily integrated with text editors, so that you can issue a command and have a PDF open a few moments later as if you had run pdflatex locally. I can imagine other nifty extensions of this basic "pipe" framework: LaTeX-by-email, document sharing, personal document libraries, and so forth.
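As a rough sketch of what the editor-facing side of that RESTful interface could look like, the snippet below posts LaTeX source and saves the response as a PDF. Everything specific here is an assumption for illustration: the URL, the single `/build` endpoint, the content types, and the function names are all hypothetical, since none of this is designed yet.

```python
import urllib.request

# Hypothetical endpoint; no URL scheme has been defined for the service.
BUILD_URL = "https://latex.example.com/build"


def make_build_request(source: str, url: str = BUILD_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the LaTeX source."""
    return urllib.request.Request(
        url,
        data=source.encode("utf-8"),
        headers={
            "Content-Type": "application/x-tex",  # assumed media type for the upload
            "Accept": "application/pdf",
        },
        method="POST",
    )


def fetch_pdf(source: str, out_path: str, url: str = BUILD_URL) -> None:
    """Send the source and write whatever the service returns to out_path."""
    with urllib.request.urlopen(make_build_request(source, url)) as response:
        pdf_bytes = response.read()
    with open(out_path, "wb") as handle:
        handle.write(pdf_bytes)
```

An editor command would then only need to call `fetch_pdf` with the current buffer and open the resulting file in a PDF viewer.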

Also, I think it ought to go without saying that the system should use pybtex and YAML formatting. The source files themselves should basically be standard LaTeX, but I think each document should have a YAML header that holds configuration information for the build process. With some optional flags in the header, one could also configure the server to provide some level of pre-processing to make the LaTeX a bit easier to manage.
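Here is a minimal sketch of how the server might peel the header off a document before handing the rest to LaTeX. It assumes the header sits at the top of the file between two `---` lines, which is a convention I'm borrowing rather than something decided; the keys in the usage example (`template`, `bibliography`) are hypothetical. The flat "key: value" loop is only a stand-in for a real YAML parser.

```python
from typing import Dict, Tuple


def split_header(document: str) -> Tuple[Dict[str, str], str]:
    """Separate a leading '---' delimited header from the LaTeX body."""
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document  # no header: treat the whole file as LaTeX
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, document  # unterminated header: leave the file alone
    config: Dict[str, str] = {}
    for line in lines[1:end]:
        # Stand-in for a YAML parser: flat "key: value" pairs only, '#' lines skipped.
        if ":" in line and not line.lstrip().startswith("#"):
            key, value = line.split(":", 1)
            config[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return config, body
```

For example, a file that starts with `---`, then `template: letter` and `bibliography: refs.bib`, then `---`, would yield those two settings as a dictionary and the untouched LaTeX as the body. Files without a header pass through unchanged, so plain LaTeX documents keep working.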

Editing Environment

A great editing environment for LaTeX (and really anything) can make all the difference in the world between a good editing experience and a bad one. While I envision the basic files to be basic LaTeX files with a YAML header (possibly providing a user-configurable level of pre-processing, to make some mundane aspects of LaTeX easier to handle), I think that it'd be best if the system included a pre-configured editing system.

My inclination is to use emacs as a basic framework or foundation for the editing platform. Take the basic emacs that we all know and love, and wrap it all up with an installer, some basic AUCTeX configuration, yasnippet, some glue for the build service, and a few commands and menu-bar items, and bundle it all together. The end result is an editing environment that is potentially really powerful and pretty easy to customize for specific uses. The best part is that all of the hard work of making an editor is done. How cool is that?

Templating Structure

The thing I learned from using LaTeX was how to create and use templates to achieve more consistent products with less work all around. The first document you produce with LaTeX takes forever and can be pretty painful. Every document thereafter is perfectly identical and delightfully simple to produce. By controlling the build system and the editing environment, it becomes crazy easy to do awesome things with templates. In addition to simplifying some syntactic forms with the build system, the build system can become a platform for offering access to templates. Providing and managing templates in this way is really the crown of the whole system, because:

Value

In general, the value can be summarized as better looking documents and more coherent workflows, with less work per document, after a reasonable learning curve. Information workers in some industries and professions will find more value in systems like these: media, law, public relations, and academia spring to mind as fertile ground for this kind of technology. At the same time, every industry produces documents and collateral of some kind, and having better publication tools is useful everywhere.

Resources

I've compiled a few links to projects that are doing complementary and related work.