This page describes the Python interface to DParser. Please see the DParser manual for more detailed information on DParser.
Grammar rules are input to DParser using Python function documentation strings. (A string placed as the first line of a Python function is the function's documentation string.) In order to let DParser know that you want it to use a specific function's documentation string as part of your grammar, begin that function's name with "d_". The function then becomes an action that is executed whenever the production defined in the documentation string reduces. For example,
    def d_action1(t):
        " sentence : noun 'runs' "
        print('found a sentence')
    #...
This function specifies an action, d_action1, and a production, sentence, to DParser. d_action1 will be called when DParser recognizes a sentence. The argument, t, to d_action1 is an array. The array consists of the return values of the elements making up the production, or, for terminal elements, the string the terminal matched. In the above example, the array t will contain the return value of noun's action as the first element and the Python string 'runs' as the second.
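To make the contents of t concrete, the sketch below calls two actions by hand with the lists DParser would be expected to pass them. DParser itself is not invoked here. The noun rule, the return values and the names d_noun and d_sentence are assumptions made up for illustration.

```python
def d_noun(t):
    " noun : 'dog' "
    # 'dog' is a terminal, so t[0] is the matched string
    return t[0].upper()

def d_sentence(t):
    " sentence : noun 'runs' "
    # t[0] is whatever noun's action returned, t[1] is the matched string 'runs'
    return {'subject': t[0], 'verb': t[1]}

# Imitate by hand the order in which the reductions would happen.
noun_value = d_noun(['dog'])
result = d_sentence([noun_value, 'runs'])
print(result)
```

The point is only the shape of t: one entry per element of the production, in order.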
Regular expressions are specified by enclosing the regular expression in double quotes:
    def d_number(t):
        ' number : "[0-9]+" '  # match a positive integer
        return int(t[0])       # turn the matched string into an integer
    #...
Make sure your documentation string is a Python raw string (precede it with the letter r) if it contains any Python escape sequences.
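The difference a raw string makes can be checked in plain Python, without DParser. In the sketch below (the tab rule and both function names are hypothetical), the plain docstring hands a literal tab character to the grammar, while the raw docstring hands over the two characters backslash and t.

```python
def d_tab_plain(t):
    ' tab : "\t" '

def d_tab_raw(t):
    r' tab : "\t" '

plain = d_tab_plain.__doc__   # Python has already turned \t into a tab character
raw = d_tab_raw.__doc__       # backslash and 't' are preserved as written

print(repr(plain))
print(repr(raw))
```

With the raw string, the escape sequence reaches the grammar exactly as typed.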
For more advanced features of productions, such as priorities and associativities, see the DParser manual.
For a simple, complete example to add integers, go back to the home page.
spec, spec_only: If an action takes spec, that action will be called for both speculative and final parses (otherwise, the action is only called for final parses). The value of spec indicates whether the parse is final or speculative (1 is speculative, 0 is final). To reject a speculative parse, return dparser.Reject. If an action takes spec_only, the action will be called only for speculative parses. The return value of the action for the final parse will be the same Python object that was returned for the speculative parse.
Complete example.
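As a rough sketch of an action that takes spec, the example below rejects out-of-range numbers during the speculative parse. So that it runs without DParser installed, REJECT is a hypothetical stand-in for dparser.Reject. The small_number rule and the 255 limit are also made up for illustration.

```python
REJECT = object()  # hypothetical stand-in; a real action would return dparser.Reject

def d_small_number(t, spec):
    ' small_number : "[0-9]+" '
    value = int(t[0])
    # spec is 1 for a speculative parse and 0 for the final parse
    if spec and value > 255:
        return REJECT
    return value

# Call the action by hand, imitating a speculative and a final call.
speculative_ok = d_small_number(['12'], 1)
speculative_bad = d_small_number(['999'], 1)
final_ok = d_small_number(['12'], 0)
```

In a real grammar, the rejection would presumably make DParser abandon that parse and try alternatives.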
g: the global state. g is actually an array, the first element of which is the global state. (Using a one-element array in this manner allows the action to change the global state.)
s: s is useful if the purpose of your parser is to alter some text, leaving it mostly intact. See here for a complete example.
nodes: D_ParseNodes. They contain information on line numbers and such. See here for useful fields.
this: the D_ParseNode for the current production ($$ in DParser). Again, see this example.
parser
modules:
file_prefix:
start_symbol:
print_debug_info:
dont_fixup_internal_productions, dont_merge_epsilon_trees, commit_actions_interval, error_recovery: see D_Parser in the DParser manual
initial_skip_space_fn: a function that skips whitespace (as an alternative to the special whitespace production, and instead of the built-in, C-like whitespace parser). Its argument is a d_loc_t structure. This structure's member, s, is an index into the string that is being parsed. Modify this index to skip whitespace:
    def whitespace(loc):  # no d_ prefix
        # make smiley face the whitespace
        while loc.s < len(loc.buf) and loc.buf[loc.s:loc.s+2] == ':)':
            loc.s = loc.s + 2
    #...
    Parser().parse('int:)var:)=:)2', initial_skip_space_fn = whitespace)
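A skip function like this can be tried on its own before handing it to the parser. The sketch below uses FakeLoc, a hypothetical stand-in for d_loc_t that provides only the buf and s members used above. It assumes buf behaves like an ordinary Python string.

```python
class FakeLoc(object):
    """Hypothetical stand-in for d_loc_t: only the buf and s members."""
    def __init__(self, buf, s=0):
        self.buf = buf
        self.s = s

def whitespace(loc):
    # advance the index past any run of ':)' pairs at the current position
    while loc.s < len(loc.buf) and loc.buf[loc.s:loc.s + 2] == ':)':
        loc.s = loc.s + 2

loc = FakeLoc('int:):)var', s=3)   # index 3 is just after 'int'
whitespace(loc)
remaining = loc.buf[loc.s:]        # text left after the skipped smileys

untouched = FakeLoc('abc')
whitespace(untouched)              # nothing to skip, index stays put
```

The function only moves the index forward, and leaves it alone when the text at the index is not whitespace.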
syntax_error_fn: its argument is a d_loc_t structure (see initial_skip_space_fn) indicating the location of the error. The function below will put '<--error' and a line break at the location of the error:
    def syntax_error(loc):
        mn = max(loc.s - 10, 0)
        mx = min(loc.s + 10, len(loc.buf))
        begin = loc.buf[mn:loc.s]
        end = loc.buf[loc.s:mx]
        space = ' '*len(begin)
        print(begin + '\n' + space + '<--error' + '\n' + space + end)
    #...
    Parser().parse('python is bad.', syntax_error_fn = syntax_error)
ambiguity_fn: takes D_ParseNodes and expects one of them to be returned. By default a dparser.AmbiguityException is raised.
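A minimal ambiguity function might simply return the first candidate. In the sketch below, FakeNode and its label attribute are hypothetical stand-ins for D_ParseNode, used only so the function can run without DParser. Presumably the function would be passed as ambiguity_fn = pick_first, following the keyword pattern shown for the other functions.

```python
class FakeNode(object):
    """Hypothetical stand-in for a D_ParseNode; 'label' is an invented attribute."""
    def __init__(self, label):
        self.label = label

def pick_first(nodes):
    # return one of the candidate nodes, as ambiguity_fn is expected to do
    return nodes[0]

candidates = [FakeNode('a'), FakeNode('b')]
chosen = pick_first(candidates)
```

Picking the first node silently hides the ambiguity, so the default exception may be the safer choice while a grammar is still being developed.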
Pass print_debug_info=1 to Parser.parse() to see a list of the actions that are being called (pass it 2 to see only final actions).
Also, try looking at the grammar file that is created, d_parser_mach_gen.g.
Also, make sure your documentation string is a Python raw string (precede it with the letter r) if it contains any Python escape sequences.
To change what counts as whitespace, pass initial_skip_space_fn, as shown above, or define the special whitespace production:
    def d_whitespace(t):
        r'whitespace : "[ \t\n]*" '
        # treat space, tab and newline as whitespace, but treat the # character normally
        print('found whitespace:' + t[0])
    from dparser import Parser
    def d_somefunc(t):
        '${declare longest_match}'
    #...
(See the DParser manual for an explanation of specifiers and declarations.)
    from dparser import Parser
    def d_grammar(t):
        '''sentence : noun verb;
           noun : 'dog' | 'cat';
           verb : 'run'
        '''
        print('this function gets called for every reduction')
    Parser().parse("dog run")