grok -m match -r reaction
DESCRIPTION
The grok utility watches files (or STDOUT) as specified in the configura-
tion file for patterns. Upon recognition of a pattern a pre-configured
action is taken.
The options are as follows:
-b Run in the background. Detaches grok and redirects STDOUT,
STDIN, and STDERR to /dev/null.
-d [1, 2, ...]
The value (default is 1 if no value is given) specifies the debug
level. Only messages with a debug value less than or equal to
the specified value will be printed. A debug value of 2 or
higher will output information about the patterns, file, and exec
definitions.
-f file
Use file as a configuration file. If this is not specified grok
will look in the current working directory for grok.conf and use
that.
-m match
Specify a match string. See COMMANDLINE below.
-r reaction
Specify a reaction string. See COMMANDLINE below.
CONFIGURATION
The grok.conf file follows this syntax:
patterns { <patternspec> };
filters { <filterspec> };
file "filename" { <filespec> };
exec "command" { <filespec> };
filelist "file1,file2,..." { <filespec> };
catlist "file1,file2,..." { <filespec> };
filecmd "command" { <filespec> };
patternspec
Pattern specifiers determine what to match and capture. These pat-
terns can be referred to as %PATTERNNAME% elsewhere in the configu-
ration file. Patterns can have other patterns nested within them.
Patterns have the following syntax:
PATTERNNAME = "regexp";
Pre-existing named patterns exist for those who do not know regular
expressions. The patterns are referenced by using %PATTERN%. grok
/FILTERNAME/ = { perl block };
The slash notation is important and can often be thought of as a
search and replace, as that is often the purpose of a filter. The
syntax for using a filter in a reaction statement is %NAME|FILTER%.
filespec
File specifiers have the following syntax:
type "description" {
match = "match pattern";
key = "some string";
threshold = "integer";
interval = "integer";
reaction = "reaction command(s)";
reaction = { reaction perl block };
};
Description A string used to identify the definition. This has
no significance other than debugging and clarifica-
tion for the user.
Match The string pattern to match. Multiple match entries
per type definition and named patterns are allowed.
Key The key to use in the hashtable when counting
occurrences of a specific event. The default key is
a concatenation of all captured patterns into a
string. As this may not be desirable, the key value
can be specified explicitly. Pattern names are valid
and are evaluated at runtime.
Threshold Specify the number of matches per key before the
reaction(s) take place. Once the threshold is
reached, the reaction(s) occur and the occurrence
count is reset to zero. The default threshold is 1
(each occurrence causes a reaction).
Interval The time, in seconds, between occurrence counter
resets. If the threshold is not reached within the
specified interval, the occurrence count is
reset and no reaction is triggered.
Reaction Any valid (string of) command(s) or block of perl
code. If using a perl block grok provides helper
variables. These are $v (a hashref containing all
named patterns matched, subnames are valid here) and
$d (a hashref to use for your own storage, keyed
with your key value).
'tail -0f'.
filecat has the same syntax as filelist, but instead of being
watched by tail, the files are simply catted.
filecmd takes a command which returns a list of files. The
returned list should be newline delimited (cf. the output of find
or ls). Under the hood, this essentially becomes a dynamically
generated filelist entry, so the listed output can contain globs
just as a filelist can.
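The threshold and interval rules described under filespec can be
sketched in a few lines of Python. This is only an illustration of the
documented behaviour, not grok's own (Perl) implementation; the class
name MatchCounter, its method record, and the explicit timestamps are
assumptions made for the example.

```python
class MatchCounter:
    """Counts matches per key and reports when a reaction would fire."""

    def __init__(self, threshold=1, interval=None):
        self.threshold = threshold   # matches needed before a reaction
        self.interval = interval     # seconds; None means counts never expire
        self.state = {}              # key -> (window_start, count)

    def record(self, key, now):
        """Record one match at time `now`; return True if a reaction fires."""
        start, count = self.state.get(key, (now, 0))
        # Interval elapsed without reaching the threshold: start over.
        if self.interval is not None and now - start >= self.interval:
            start, count = now, 0
        count += 1
        if count >= self.threshold:
            # Threshold reached: the reaction fires and the count resets.
            self.state[key] = (now, 0)
            return True
        self.state[key] = (start, count)
        return False


counter = MatchCounter(threshold=3, interval=5)
fired = [counter.record("10.0.0.1", t) for t in (0, 1, 2, 10, 11, 20)]
print(fired)  # [False, False, True, False, False, False]
```

Three matches within five seconds trigger the reaction at t=2. The
later matches at t=10, 11 and 20 never accumulate three hits inside one
interval, so nothing fires.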
BUILTIN PATTERNS
There are lots of builtin patterns at your disposal:
Name Regular Expression
USERNAME Match a username: /[a-zA-Z0-9_-]+/
USER Alias of USERNAME
NUMBER Any real number (including scientific notation).
INT Any integer
DATA Non-greedy wildcard.
GREEDYDATA Greedy wildcard.
QUOTEDSTRING Quoted string. (double or single)
QS Alias of QUOTEDSTRING
MAC MAC Address
IP IP Address
HOSTNAME Any RFC 1035 compliant hostname.
HOST Alias of HOSTNAME
IPORHOST IP or HOSTNAME
MONTH Any representation of a month: Jan, January, 01, 1
MONTHDAY 01-31 and 1-31
DAY Any form of English weekday: Mon, Monday, 1
YEAR Alias of INT
TIME /\d{2}:\d{2}:\d{2}/
HTTPDATE %MONTHDAY%/%MONTH%/%YEAR%:%TIME% %INT:ZONE%
PROG Alias of WORD
PID Alias of INT
SYSLOGDATE %MONTH% %MONTHDAY% %TIME%
SYSLOGPROG %PROG%([%PID%])?
SYSLOGBASE %SYSLOGDATE% %HOSTNAME% %SYSLOGPROG%:
APACHELOG %IPORHOST% %USER:IDENT% %USER:AUTH% [%HTTPDATE%] %QS:URL%
%NUMBER:RESPONSE% %NUMBER:BYTES% %QS:REFERRER% %QS:AGENT%
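Several of the patterns above are defined in terms of other patterns.
The following Python sketch shows one way such nested %NAME% and
%NAME:SUBNAME% references could be expanded into a single regular
expression. It is not grok's implementation. Only the TIME expression
is taken from the table; the INT, MONTH and MONTHDAY expressions are
simplified stand-ins, and the expand function and the NAME_SUBNAME
group naming are assumptions made for the example.

```python
import re

# Simplified stand-ins; grok's real definitions may differ.
PATTERNS = {
    "INT": r"\d+",
    "MONTH": r"[A-Z][a-z]{2}",
    "MONTHDAY": r"\d{1,2}",
    "TIME": r"\d{2}:\d{2}:\d{2}",
    "SYSLOGDATE": "%MONTH% %MONTHDAY% %TIME%",
}

# Matches %NAME% and %NAME:SUBNAME%.
REF = re.compile(r"%([A-Z]+)(?::([A-Z]+))?%")


def expand(pattern, top=True):
    """Recursively replace pattern references with their regular expressions.

    Top-level references become named groups; nested references become
    non-capturing groups so that group names never collide.
    """
    def repl(m):
        name, sub = m.group(1), m.group(2)
        body = expand(PATTERNS[name], top=False)
        if not top:
            return f"(?:{body})"
        group = f"{name}_{sub}" if sub else name
        return f"(?P<{group}>{body})"
    return REF.sub(repl, pattern)


regex = re.compile(expand("%SYSLOGDATE% %INT:PID%"))
m = regex.search("Feb 23 13:55:36 4242")
print(m.groupdict())
# {'SYSLOGDATE': 'Feb 23 13:55:36', 'INT_PID': '4242'}
```

Python group names cannot contain a colon, so the sketch joins name and
subname with an underscore.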
BUILTIN FILTERS
There are pre-defined filters available within a reaction statement. The
pre-defined filters and their descriptions are:
shnq Used to escape (){}[]$*?!|'"` characters.
shdq Used to escape $"` characters.
e[XYZ] Used to escape arbitrary (XYZ) characters.
stripquotes Strip leading and trailing quote characters ("`').
parsedate Convert a string containing a date to machine time.
strftime(fmt) Convert unix epoch to arbitrary time format using speci-
COMMANDLINE
When grok is invoked as:
grok -m MATCH -r REACTION
grok will create an in-memory config file that looks like this:
exec "cat" {
type "all" {
match = "MATCH";
reaction = { print meta2string("REACTION", $v); };
};
};
What does that mean? This means that you get to specify one match and one
reaction. The reaction is just a string that will be expanded and printed
when the match matches. See EXAMPLES for more information.
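To make the expansion step concrete, here is a small Python sketch of
how a reaction string containing %NAME% and %NAME|FILTER% references
might be expanded from the captured values. It is an illustration only,
not the code behind meta2string. The function names are hypothetical,
backslash escaping is assumed from the shdq output shown in EXAMPLES,
and only the shdq and stripquotes filters are modelled.

```python
import re


def escape_chars(chars):
    """Build a filter that backslash-escapes every character in `chars`."""
    def apply(value):
        return "".join("\\" + c if c in chars else c for c in value)
    return apply


# Two of the documented filters, approximated.
FILTERS = {
    "shdq": escape_chars('$"`'),
    "stripquotes": lambda v: v.strip("\"`'"),
}

# %NAME% optionally followed by one or more |FILTER parts.
TOKEN = re.compile(r"%([^%|]+)((?:\|[^%|]+)*)%")


def expand_reaction(reaction, captures):
    """Substitute captured values into `reaction`, applying filters in order."""
    def repl(m):
        value = captures[m.group(1)]
        for name in filter(None, m.group(2).split("|")):
            value = FILTERS[name](value)
        return value
    return TOKEN.sub(repl, reaction)


line = 'test:*:1002:1002:T"est?:/home/test:/bin/sh'
print(expand_reaction("Found: %=LINE|shdq%", {"=LINE": line}))
# Found: test:*:1002:1002:T\"est?:/home/test:/bin/sh
```

Filters are applied left to right, so %QS|stripquotes|shdq% would strip
the quotes first and then escape what remains.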
EXAMPLES
1. The following example defines %TTY%, watches /var/log/messages for
failed su(1) attempts and prints a message to STDOUT (notice the use
of sub-named pattern captures).
patterns {
TTY = "/dev/tty[qp][a-z0-9]";
};
file "/var/log/messages" {
type "failed su(1) attempt" {
match = "BAD SU %USER:FROM% to %USER:TO% on %TTY%";
reaction = "echo 'Failed su(1): %USER:FROM% -> %USER:TO% (%TTY%)'";
};
};
2. For the following example we are watching Apache logfiles and
replacing the quoted URL string with just the URL in the reaction
statement. The format of the file is:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif
HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08
[en] (Win98; I ;Nav)"
filters {
/httpfilter/ = { s/^\S+ (\S+) \S+$/$1/; };
};
exec "cat /var/log/http.access.log" {
type "http" {
match = "%APACHELOG%";
reaction = "echo '%IP%: %QUOTEDSTRING:URL|e[']|stripquotes|httpfilter%'";
};
};
3. Below is a rule for watching failed SSH login attempts and blocking
them using PF. Notice the multiple type entries for a single file.
reaction = "pfctl -t naughty -T add %IP%";
};
};
4. The following is an example of watching tcpdump output for SYN pack-
ets destined to port 22 and printing a message. The second type
statement is useful for watching portscans.
exec "tcpdump -li em0 -n 2> /dev/null" {
type "ssh-connect" {
match = "%IP:SRC%.\d+ > %IP:DST%.22: S";
reaction = "echo 'SSH connect(): %IP:SRC% -> %IP:DST%'";
};
type "port-scan" {
match = "%IP:SRC%.%PORT% > %IP:DST%.%PORT:DST%: S";
key = "%IP:SRC%";
threshold = 30;
interval = 5;
reaction = "echo 'Port scan from %IP:SRC%'";
};
};
5. The following example illustrates the optional filters available
when evaluating a variable in a reaction statement. Assume that
/etc/passwd contains the following line:
test:*:1002:1002:T"est?:/home/test:/bin/sh
exec "cat /etc/passwd" {
type "passwd" {
match = "^test";
reaction = "echo 'Found: %=LINE|shdq%'";
};
};
The output of this is:
Found: test:*:1002:1002:T\"est?:/home/test:/bin/sh
Using the same line in /etc/passwd but changing the example to look
like:
exec "cat /etc/passwd" {
type "passwd" {
match = "^test";
reaction = "echo 'Found: %=LINE|shnq%'";
};
};
results in:
8. The following example shows how to use -m and -r. Let's find out
what programs are logging to /var/log/messages:
% grok -m "%SYSLOGBASE%" -r "%PROG%" < /var/log/messages | sort | uniq
kernel
named
newsyslog
sshd
9. The following (somewhat silly) example shows how to use -m and -r.
Let's compare the forward and reverse dns entries for
www.freebsd.org:
% host -t A www.freebsd.org | grok -m "%IP%" -r "%IP|ip2host%"
www.freebsd.org
In the above example, we do a name resolution on www.freebsd.org and
grok for an IP, then filter the IP through a dns lookup and print
the output.
10. The following (advanced) example shows how to use -m and -r. Let's
break apart a log into files by IP. Note that the match below
matches the IP anywhere in the line.
% cat /var/log/messages \
| perl grok -m '%IP%' -r 'echo "%=LINE|shdq%" >> /tmp/log.%IP%' \
| sh
% ls /tmp/log.*
/tmp/log.192.168.0.254 /tmp/log.192.168.10.177
/tmp/log.192.168.10.175 /tmp/log.192.168.10.189
FILES
/usr/local/etc/grok.conf
AUTHOR
Jordan Sissel <jls@semicomplete.com> wrote and maintains grok. Wesley
Shields <wxs@csh.rit.edu> wrote the manual page.
CONTRIBUTORS
Canaan Silberberg contributed patches supporting filelist and filecmd.
BUGS
There are no known bugs at this time. Bugs can be reported to
<jls@semicomplete.com>.
BSD February 23, 2007 BSD