I started ebnf2yacc as a personal experiment and ended up using it as a tool at work (for OpenWBEM; see http://www.openwbem.org and http://www.sourceforge.net/projects/openwbem). Once I finally got it into a usable state, I decided to open source it.

The purpose of ebnf2yacc is to ease the creation of yacc parsers. Yacc input files must be written in BNF, but it is much easier to write a grammar in EBNF. This program is meant to take an input file in EBNF and convert it to a usable yacc file.

Caveat: Right now, it only accepts BNF input, basically the same as what you would feed to yacc. The main usefulness of ebnf2yacc right now is to create a C++ abstract syntax tree (AST). For a concrete example, see the WQL parser of OpenWBEM. I plan to support most EBNF features in the future.

ebnf2yacc generates a set of classes that represent the AST of the grammar. These AST classes support the visitor pattern. An abstract visitor base class is generated, as well as a sample concrete visitor that simply traverses the tree. ebnf2yacc also generates a yacc file that can be used (with slight modification if you need precedence or other yacc features) to build the AST. To build a complete parser, you will still need to provide the surrounding framework yourself (the lexer, for instance).

Some people learn best by example. There are two examples of ebnf2yacc input in the tests subdirectory: test1.e2y is the grammar for WQL that I created for OpenWBEM, and test2.e2y is the grammar for ebnf2yacc itself.

In order to implement certain features, ebnf2yacc relies on naming conventions for grammar rules. Any name that is ALL CAPS is assumed to be a terminal, that is, a token that comes from the lexer. If a rule name begins with "str" (e.g. strToken) or is ALL CAPS, it is stored as a string in the AST. No AST class is generated for rules that begin with "str". You should only use this for rules that are simple alternatives of a bunch of tokens.
    e.g.: strOp: PLUS | MINUS | TIMES | DIVIDE ;

If a rule name begins with "opt", code is generated in the sample traversal visitor to check the AST node for null.
    e.g.: optSemicolon: /* EMPTY */ | SEMICOLON ;

If a rule name ends with "List", the AST will contain a list of the first non-terminal of the first alternative of the rule.
    e.g.: varList: var | varList COMMA var ;

Nothing checks that a grammar follows these conventions, so "garbage in, garbage out" applies here.

Right now the names of the generated classes are fixed. I plan to make them configurable in the future, but have not yet decided on a good mechanism for that.

To build ebnf2yacc you need lex and yacc. I used flex and bison; it has not been tested with other implementations of lex or yacc. If you try it with a different one, I would like to know whether it works. I have tried to write the code in portable C++, and have compiled it with gcc 2.95.2.

The project uses autoconf/automake, so to build it you can simply run:
    ./configure
    make
and then to install it:
    make install

The binary is named ebnf2yacc. The command line arguments are:
Usage:

If you find any bugs or have any suggestions for improvements and features, I am eager to hear them. Please feel free to make use of the SourceForge facilities at http://sourceforge.net/projects/openwbem. There is also a mailing list for ebnf2yacc, hosted at SourceForge, that you can subscribe to.

--Dan Nuffer
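
For readers unfamiliar with the visitor pattern mentioned earlier in this file, here is a rough hand-written sketch of what a visitor-based tree for the varList rule might look like. It is not output from ebnf2yacc: every name in it (Visitor, Node, Var, VarList, NameCollector, collectNames) is made up for illustration, and the real generated classes will differ. It also assumes a C++11 compiler.

```cpp
#include <memory>
#include <string>
#include <vector>

// All names here are hypothetical; they are NOT the class names that
// ebnf2yacc actually generates.
struct Var;
struct VarList;

// Abstract visitor base class: one visit method per node type.
struct Visitor {
    virtual ~Visitor() {}
    virtual void visit(const Var& node) = 0;
    virtual void visit(const VarList& node) = 0;
};

// Common base class of all tree nodes.
struct Node {
    virtual ~Node() {}
    virtual void accept(Visitor& v) const = 0;
};

// Leaf node holding the text of a token as a string.
struct Var : Node {
    std::string name;
    explicit Var(const std::string& n) : name(n) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// varList: var | varList COMMA var ;
// The "List" suffix means the node holds a list of var nodes.
struct VarList : Node {
    std::vector<std::unique_ptr<Var>> vars;
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Concrete visitor that traverses the tree and records each name.
struct NameCollector : Visitor {
    std::vector<std::string> names;
    void visit(const Var& node) override { names.push_back(node.name); }
    void visit(const VarList& node) override {
        for (const auto& var : node.vars) {
            if (var) var->accept(*this);  // skip null children
        }
    }
};

// Builds a VarList by hand (a real parser would do this in its yacc
// actions) and returns the names found by the traversal.
std::vector<std::string> collectNames(const std::vector<std::string>& input) {
    VarList list;
    for (const auto& s : input) {
        list.vars.push_back(std::unique_ptr<Var>(new Var(s)));
    }
    NameCollector collector;
    list.accept(collector);
    return collector.names;
}
```

Calling collectNames({"a", "b", "c"}) walks the three Var nodes in order. The point of the pattern is that new operations on the tree (printing, evaluation, translation) can be added as new visitor classes without touching the node classes.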