RCSChecker is a syntax checker for RCS (CVS) files.
Its main purpose is to find out if
To this end I reviewed the original rcsfile(5) grammar. Failing to find a current grammar for this file format, I took a look at JavaCC and was able to come up with a working grammar in a very short time. JavaCC has a very nice and expressive syntax.
The closest matches for an existing RCS file format parser were some Perl modules and a grammar for ANTLR (see jrcs.g). Unfortunately, this file is a bit outdated and does not work with the current ANTLR v3.2. An attempt at rewriting it to match v3.2 syntax worked up to the point where lookahead became a problem.
Using JavaCC was easier thanks to its built-in support for lexical states, with no code blocks required. Having to use code blocks for lexical state management in ANTLR (really?) appears to be a drawback compared with JavaCC.
java -jar RCSChecker.jar < <input file,v>
Of course, it's possible (and expected) to take the grammar and tweak it to suit whatever other uses you need it for, or to subclass the parser class and change things that way.
If binary data in your CVS files causes problems, you may need to specify a Java file encoding other than the default, such as
java -Dfile.encoding=ISO-8859-1 -jar RCSChecker.jar
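One plausible reason ISO-8859-1 helps with binary data: it maps every byte value to exactly one character, so decoding never fails or substitutes anything. The following is a minimal sketch of that property, not part of RCSChecker; the class and method names are made up for illustration, and it assumes the problem shows up when input bytes are decoded into characters with the default encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of RCSChecker.
class EncodingRoundTrip {

    // Decode the bytes with the given charset, then encode them back again.
    static byte[] roundTrip(byte[] input, Charset charset) {
        return new String(input, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Every possible byte value, as might occur in a binary revision.
        byte[] allBytes = new byte[256];
        for (int i = 0; i < allBytes.length; i++) {
            allBytes[i] = (byte) i;
        }

        boolean latin1Lossless =
            Arrays.equals(allBytes, roundTrip(allBytes, StandardCharsets.ISO_8859_1));
        boolean utf8Lossless =
            Arrays.equals(allBytes, roundTrip(allBytes, StandardCharsets.UTF_8));

        System.out.println("ISO-8859-1 round trip lossless: " + latin1Lossless);
        System.out.println("UTF-8 round trip lossless: " + utf8Lossless);
    }
}
```

With a multi-byte encoding such as UTF-8, byte sequences that are not valid text are replaced during decoding, which is why the round trip differs there.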
jar xf RCSChecker.jar sample,v
java -jar RCSChecker.jar < sample,v
ID: deltatype text
ID: kopt kv
ID: permissions 666
ID: commitid 1158467fa5ef0686
ID: filename sample
1) I think the original grammar is not perfect. In theory, newphrase IDs may look like revisions (1.2.3.4). This complicates parsing when a delta production (which starts with a revision) has to be told apart from a newphrase that happens to start with such a "revision". In practice this does not happen, as all newphrase IDs (that I have seen) look like nouns in the style of the other RCS file format tokens (head, strict and so on). So this grammar assumes that newphrase IDs start with a letter or an underscore, not with a dot or a digit.
2) The JAR file was generated with
mkdir src-generated
javacc -OUTPUT_DIRECTORY:src-generated RCS.jj
mkdir classes
javac -d classes src-generated\*.java
jar cvfm RCSChecker.jar MANIFEST.MF RCS.jj sample,v -C classes .
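The assumption in note 1 can be illustrated with a small sketch. This is not code from RCS.jj; the class and method names are hypothetical, "letter" is taken to mean an ASCII letter, and a "revision" is taken to be digits separated by dots. Both are simplifications chosen for the example.

```java
// Hypothetical helper, not taken from the generated parser.
class TokenKind {

    // Assumed shape of a revision: groups of digits separated by dots, e.g. 1.2 or 1.2.3.4.
    static boolean looksLikeRevision(String token) {
        return token.matches("[0-9]+(\\.[0-9]+)*");
    }

    // The simplification from note 1: a newphrase ID starts with a letter or an underscore.
    // Only the first character is checked; ASCII letters are an assumption of this sketch.
    static boolean startsLikeNewphraseId(String token) {
        if (token.isEmpty()) {
            return false;
        }
        char first = token.charAt(0);
        return first == '_'
            || (first >= 'a' && first <= 'z')
            || (first >= 'A' && first <= 'Z');
    }

    public static void main(String[] args) {
        String[] samples = {"1.2.3.4", "1.2", "commitid", "_private"};
        for (String token : samples) {
            System.out.println(token
                + " revision=" + looksLikeRevision(token)
                + " newphraseId=" + startsLikeNewphraseId(token));
        }
    }
}
```

Because the two checks can never both accept the same token, the first character of a token is enough to decide between a delta and a newphrase, which is what keeps the lookahead simple.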
This code is licensed under the Apache License, Version 2.0.
This page was last changed on July 29th, 2010. © Matthias Gärtner 2010