preprocess.py -- a portable multi-language file preprocessor

Home: http://trentm.com/projects/preprocess/
License: MIT (more details at OSI)
Platforms: Windows, Linux, Mac OS X, Unix
Current Version: 1.0.3
Dev Status: Fairly mature; has been used in the Komodo build system for over 3 years.
Requirements: Python >= 2.3

What's new?

Support has been added for preprocessing TeX, Fortran, C#, Java, Shell script and PHP files. See the Change Log below for more.

Why preprocess.py?

There are millions of templating systems out there (most of them developed for the web). This isn't one of those, though it does share some basics: a markup syntax for templates that are processed to give resultant text output. The main difference with preprocess.py is that its syntax is hidden in comments (whatever the comment syntax may be in the target filetype) so that the unprocessed file still has valid syntax. A comparison with the C preprocessor is more apt.

preprocess.py is targeted at build systems that deal with many types of files. Languages for which it works include: C++, Python, Perl, Tcl, XML, JavaScript, CSS, IDL, TeX, Fortran, PHP, Java, Shell scripts (Bash, CSH, etc.) and C#. Preprocess is usable both as a command line app and as a Python module.

Here is how it works: All preprocessor statements are on their own line. A preprocessor statement is a comment (as appropriate for the language of the file being preprocessed). This way the preprocessor statements do not make an unpreprocessed file syntactically incorrect. For example:

preprocess -D FEATURES=macros,scc myapp.py

will yield this transformation:

...                                     ...
# #if "macros" in FEATURES
def do_work_with_macros():              def do_work_with_macros():
    pass                                    pass
# #else
def do_work_without_macros():
    pass 
# #endif
...                                     ...

or, with a JavaScript file:

...                                     ...
// #if "macros" in FEATURES
function do_work_with_macros() {        function do_work_with_macros() {
}                                       }
// #else
function do_work_without_macros() {
}
// #endif
...                                     ...

Despite these contrived examples, preprocess has proved useful for build-time code differentiation in the Komodo build system -- which includes source code in Python, JavaScript, XML, CSS, Perl, and C/C++.
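The mechanics behind the transformations above can be approximated in a few lines of Python. This is only an illustrative sketch, not the actual preprocess.py implementation: the name filter_lines is hypothetical, only '#'-style comments and the #if/#else/#endif statements are handled, and the condition is evaluated with Python's eval().

```python
import re

# Matches a preprocessor statement hidden in a '#' comment, e.g. "# #if X".
STATEMENT = re.compile(r"^\s*#\s*#(if|else|endif)\b\s*(.*?)\s*$")


def filter_lines(lines, defines):
    """Return only the lines whose enclosing #if/#else branches are active.

    Hypothetical helper for illustration; not part of preprocess.py.
    """
    output = []
    # One entry per open #if: True while the current branch is active.
    stack = []
    for line in lines:
        match = STATEMENT.match(line)
        if match is None:
            # Ordinary line: keep it only if every enclosing branch is active.
            if all(stack):
                output.append(line)
            continue
        keyword, expression = match.groups()
        if keyword == "if":
            # The condition is ordinary Python evaluated against the defines.
            active = eval(expression, {"__builtins__": {}}, dict(defines))
            stack.append(bool(active))
        elif keyword == "else":
            stack[-1] = not stack[-1]
        else:  # endif
            stack.pop()
    return output


source = [
    '# #if "macros" in FEATURES',
    "def do_work_with_macros():",
    "    pass",
    "# #else",
    "def do_work_without_macros():",
    "    pass",
    "# #endif",
]
result = filter_lines(source, {"FEATURES": "macros,scc"})
```

With FEATURES set to "macros,scc", only the do_work_with_macros() lines survive, matching the right-hand column of the Python example above. Statement lines are dropped entirely here; the -k option described in the usage text would emit empty lines instead.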

The #if expression ("macros" in FEATURES in the example) is Python code, so it has Python's full comparison richness. A number of preprocessor statements are implemented:

#define VAR [VALUE]
#undef VAR
#ifdef VAR
#ifndef VAR
#if EXPRESSION
#elif EXPRESSION
#else
#endif
#error ERROR_STRING
#include "FILE"
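Because #if and #elif expressions are Python, evaluating one amounts to evaluating it in a namespace built from the current defines. The following is a minimal sketch under that assumption; the evaluate function is hypothetical and the real module may do this differently. The defined() helper is the one listed in the usage text under "Preprocessor Syntax".

```python
def evaluate(expression, defines):
    """Evaluate a preprocessor condition against a dict of defines.

    Hypothetical helper for illustration; not part of preprocess.py.
    """
    namespace = dict(defines)
    # defined('NAME') reports whether NAME has been defined at all.
    namespace["defined"] = lambda name: name in defines
    # Builtins are emptied so the expression only sees the defines.
    return bool(eval(expression, {"__builtins__": {}}, namespace))


condition = "defined('FAV_COLOR') and FAV_COLOR == \"blue\""
is_blue = evaluate(condition, {"FAV_COLOR": "blue"})
is_blue_when_undefined = evaluate(condition, {})
```

Guarding with defined() matters: in this sketch, as in the tool's documented behaviour, referring to a variable that was never defined is an error (here a NameError), and Python's short-circuiting "and" avoids touching FAV_COLOR when it is absent.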

As well, preprocess can do in-line substitution of defined variables. This is currently off by default because substitution also occurs inside program strings, which is not ideal. When a future version of preprocess can lex the languages being preprocessed, it will NOT substitute into program strings and substitution will be turned ON by default.
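To see why substituting into program strings is a problem, consider a naive, lexer-free substitution. This sketch assumes whole-word replacement via a regular expression; naive_substitute is a hypothetical name and the matching rule is an assumption, not necessarily what preprocess.py does.

```python
import re


def naive_substitute(line, defines):
    """Replace every whole-word occurrence of a defined name with its value.

    Hypothetical helper for illustration; it knows nothing about the
    target language, so it cannot tell code from string literals.
    """
    for name, value in defines.items():
        pattern = r"\b%s\b" % re.escape(name)
        # A function replacement avoids backslash interpretation in the value.
        line = re.sub(pattern, lambda match, v=value: str(v), line)
    return line


code_line = 'print("MAX_ITEMS =", MAX_ITEMS)'
substituted = naive_substitute(code_line, {"MAX_ITEMS": 10})
```

Here substituted becomes print("10 =", 10): the occurrence inside the string literal was replaced along with the real one, which is the behaviour the paragraph above calls "not ideal".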

Please send any feedback to Trent Mick.

Install Notes

Download the latest preprocess source package, unzip it, and run python setup.py install:

unzip preprocess-1.0.3.zip
cd preprocess-1.0.3
python setup.py install

If your install fails then please visit the Troubleshooting FAQ.

A successful install puts preprocess.py into your Python site-packages and also into your Python bin directory. If you can now run preprocess and get a response then you are good to go; otherwise read on.

The problem is that on some operating systems, notably Mac OS X, the Python bin directory is not always on your PATH. To finish the install on OS X, either manually copy 'preprocess' to somewhere on your PATH:

cp preprocess.py /usr/local/bin/preprocess

or create a symlink to it (try one of these depending on your Python version):

ln -s /System/Library/Frameworks/Python.framework/Versions/2.3/bin/preprocess /usr/local/bin/preprocess
ln -s /Library/Frameworks/Python.framework/Versions/2.4/bin/preprocess /usr/local/bin/preprocess

(Note: You'll probably need to prefix those commands with sudo and the exact paths may differ on your system.)

Getting Started

Once you have it installed, run preprocess --help for full usage information:

$ preprocess --help
Preprocess a file.

Command Line Usage:
    preprocess [<options>...] <infile>

Options:
    -h, --help      Print this help and exit.
    -V, --version   Print the version info and exit.
    -v, --verbose   Give verbose output for errors.

    -o <outfile>    Write output to the given file instead of to stdout.
    -f, --force     Overwrite given output file. (Otherwise an IOError
                    will be raised if <outfile> already exists.
    -D <define>     Define a variable for preprocessing. <define>
                    can simply be a variable name (in which case it
                    will be true) or it can be of the form
                    <var>=<val>. An attempt will be made to convert
                    <val> to an integer so "-D FOO=0" will create a
                    false value.
    -I <dir>        Add an directory to the include path for
                    #include directives.

    -k, --keep-lines    Emit empty lines for preprocessor statement
                    lines and skipped output lines. This allows line
                    numbers to stay constant.
    -s, --substitute    Substitute defines into emitted lines. By
                    default substitution is NOT done because it
                    currently will substitute into program strings.

Module Usage:
    from preprocess import preprocess
    preprocess(infile, outfile=sys.stdout, defines={}, force=0,
               keepLines=0, includePath=[], substitute=0)

The <infile> can be marked up with special preprocessor statement lines
of the form:
    <comment-prefix> <preprocessor-statement> <comment-suffix>
where the <comment-prefix/suffix> are the native comment delimiters for
that file type. 


Examples
--------

HTML (*.htm, *.html) or XML (*.xml, *.kpf, *.xul) files:

    <!-- #if FOO -->
    ...
    <!-- #endif -->

Python (*.py), Perl (*.pl), Tcl (*.tcl), Ruby (*.rb), Bash (*.sh),
or make ([Mm]akefile*) files:

    # #if defined('FAV_COLOR') and FAV_COLOR == "blue"
    ...
    # #elif FAV_COLOR == "red"
    ...
    # #else
    ...
    # #endif

C (*.c, *.h), C++ (*.cpp, *.cxx, *.cc, *.h, *.hpp, *.hxx, *.hh),
Java (*.java), PHP (*.php) or C# (*.cs) files:

    // #define FAV_COLOR 'blue'
    ...
    /* #ifndef FAV_COLOR */
    ...
    // #endif

Fortran 77 (*.f) or 90/95 (*.f90) files:

    C     #if COEFF == 'var'
          ...
    C     #endif


Preprocessor Syntax
-------------------

- Valid statements:
    #define <var> [<value>]
    #undef <var>
    #ifdef <var>
    #ifndef <var>
    #if <expr>
    #elif <expr>
    #else
    #endif
    #error <error string>
    #include "<file>"
  where <expr> is any valid Python expression.
- The expression after #if/elif may be a Python statement. It is an
  error to refer to a variable that has not been defined by a -D
  option or by an in-content #define.
- Special built-in methods for expressions:
    defined(varName)    Return true if given variable is defined.  


Tips
----

A suggested file naming convention is to let input files to
preprocess be of the form <basename>.p.<ext> and direct the output
of preprocess to <basename>.<ext>, e.g.:
    preprocess -o foo.py foo.p.py
The advantage is that other tools (esp. editors) will still
recognize the unpreprocessed file as the original language.
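A build script that follows the <basename>.p.<ext> convention from the tip above needs to derive the output name from the input name. A minimal sketch, assuming the ".p" marker always sits immediately before the final extension (output_name is a hypothetical helper, not part of preprocess):

```python
import os


def output_name(input_path):
    """Map '<basename>.p.<ext>' to '<basename>.<ext>'.

    Hypothetical helper for illustration; raises ValueError for names
    that do not follow the convention.
    """
    directory, filename = os.path.split(input_path)
    parts = filename.split(".")
    if len(parts) < 3 or parts[-2] != "p":
        raise ValueError(
            "expected a name of the form <basename>.p.<ext>: %r" % input_path
        )
    # Drop the ".p" marker and keep everything else unchanged.
    del parts[-2]
    return os.path.join(directory, ".".join(parts))


target = output_name("foo.p.py")
```

The resulting name could then be passed to the -o option, as in the "preprocess -o foo.py foo.p.py" command shown in the tip.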

And, for module usage, read the preprocess.preprocess() docstring:

pydoc preprocess.preprocess
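When calling the module from a build script, the defines argument shown in the usage text is a dictionary. The sketch below builds such a dictionary from -D style strings, mirroring the integer conversion the usage text describes. The parse_define helper is hypothetical, and representing a bare name as True is an assumption (the usage text only says it "will be true").

```python
def parse_define(text):
    """Turn a '-D' style argument into a (name, value) pair.

    Hypothetical helper for illustration; not part of preprocess.py.
    """
    if "=" not in text:
        # A bare name is simply defined as true (assumed representation).
        return text, True
    name, value = text.split("=", 1)
    try:
        # Integers are converted, so "FOO=0" yields a false value.
        value = int(value)
    except ValueError:
        pass  # Anything that is not an integer stays a string.
    return name, value


arguments = ["DEBUG", "FOO=0", "FEATURES=macros,scc"]
defines = dict(parse_define(argument) for argument in arguments)
```

A dictionary built this way has the shape expected by the defines parameter in the "Module Usage" signature above, e.g. preprocess(infile, outfile, defines=defines).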

Change Log

v1.0.3

v1.0.2

v1.0.1

v1.0.0

v0.9.2

v0.9.1

v0.9.0

v0.8.1

v0.8.0

0.7.0:

0.6.1:

0.6.0:

0.5.0:

0.4.0:

0.3.2:

0.2.0:

0.1.0: