SYNOPSIS

       log_analysis [-h] [-r] [-g] [-f config_file] [-o file] [-O]
       [-n nodename] [-U] [-u unknownsdir] [-D var1,var2=value,...]
       [-d days_ago] [-a] [-F] [-i] [-m mail_address] [-M mail_prog] [-s]
       [-S] [-t forced_type] [required_files...]

       log_analysis -I info_type

DESCRIPTION

       log_analysis analyzes and summarizes system log files.  It also runs
       some other commands (e.g. w, df -k) to show the system state.  It's
       intended to be run on a daily basis out of cron.

       log_analysis supports several major modes.  The default mode is report
       mode, which scans through your logs, produces a text report, and exits.
       There is also real mode, which lets you monitor your logs continuously;
       gui mode, which is a gui sitting on top of real mode; and daemon mode,
       which is a daemonized variant of real mode.

OPTIONS

       -a all
           Show all logs, not just the ones from yesterday.

       -A daemon mode
           Start in daemon mode.  Daemon mode is like real mode, except that
           the process daemonizes, and there is no regular output, just
           actions.  Daemon mode is useful if you want to start log_analysis
           at system boot time to run actions.  It's also useful if you have
           actions configured, and you have multiple copies of log_analysis
           running in real/gui mode, and you only want the actions to happen
           once.

           See -r for more info on real mode.  In general, anything that
           applies to real mode applies to daemon mode unless it explicitly
           says otherwise.

           The variables specific to daemon mode are daemon_mode and
           daemon_mode_pid_file.  One variable that is not specific to daemon
           mode but is really useful with daemon mode is
           real_mode_no_actions_unless_is_daemon.

       -b real mode backlogs
           By default, real mode and gui mode ignore all existing log messages
           and only show new logs.  With this option, real mode shows logs as
           indicated by days_ago.  See -r for more info.

       -d days_ago
           Show logs from days_ago days ago.  Defaults to 1 (i.e. show
           yesterday's logs).  In -a mode, this option only affects the
           heading, and it defaults to 0.

           You can also provide an absolute date in the form YYYY_MM_DD, e.g.
           2001_03_02.  And you can provide the symbolic names today
           (equivalent to 0) and yesterday (equivalent to 1).
           stant to a particular value, say "constant=value".
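
           The three forms of days_ago accepted by -d can be pictured with a
           short Python sketch.  It is an illustration only, not the code
           log_analysis itself runs; the function name resolve_days_ago and
           the fixed value of "today" are assumptions made for the example,
           and it covers report mode only.

```python
from datetime import date, timedelta

def resolve_days_ago(value, today):
    """Map a -d argument to the calendar day it selects (report mode)."""
    symbolic = {"today": 0, "yesterday": 1}
    if value in symbolic:
        return today - timedelta(days=symbolic[value])
    if value.isdigit():
        # A plain number counts days back from today.
        return today - timedelta(days=int(value))
    # Otherwise expect an absolute date such as 2001_03_02.
    year, month, day = (int(part) for part in value.split("_"))
    return date(year, month, day)

today = date(2001, 3, 3)  # hypothetical "now" for the example
for arg in ("1", "yesterday", "2001_03_02"):
    print(arg, resolve_days_ago(arg, today))  # each selects 2001-03-02
```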

       -f config_file
           Read config_file in addition to the internal config and the inter-
           nal config files.  See "CONFIG FILE" for details.

       -F  Instead of loading the whole internal config, just use a minimal
           subset.

       -g  "Gui mode", i.e. monitor log files continuously.  Currently
           conflicts with many other modes and options.  Has built-in support
           for log file rollover.  This is basically real mode (see -r) with a
           GUI; variables that apply to real mode also apply to gui mode, but
           not vice versa.

           See variables gui_mode, gui_mode_modifier, and window_command for
           gui mode specifics.  See -r for many things that also apply to gui
           mode.

       -h help
           Show command summary and exit.

       -i includes suppress
           Don't include the standard include files, i.e.
           /etc/log_analysis.conf and the others listed in "FILES".  Note that
           this option does not stop the inclusion of $HOME/.log_analysis.conf
           in gui mode.

       -I info
           This option is used for obtaining internal information about
           log_analysis.  log_analysis exits immediately after outputting the
           information.

           If info is help, log_analysis outputs the list of things you can
           use for info.

           If info is categories, all categories (those mentioned in the vari-
           ous configs and implicit categories) will be listed.

           If info is colors, all colors that work for real_mode and gui_mode
           will be listed.

           If info is config_versions, all config files will be listed with
           their config_version and file_version (if defined).

           If info is evals, the evals built from the config (internal and
           local) are output.

           If info is internal_config, the internal config is output.

           If info is log_files, the log files that would have been read are
           output.

       -M mail_command
           Use mail_command to send the mail.  This can also be specified in
           the config; see mail_command in "VARIABLES" for more info, includ-
           ing the default.

       -n nodename
           Use nodename as the nodename (AKA hostname) instead of the default.
           This is more than just cosmetic: entries in syslogged files will be
           processed differently if they didn't come from this nodename.  This
           can also be specified in the config file; see nodename in "VARI-
           ABLES".

       -N process all nodenames
           If the logs contain entries for nodes other than nodename, (ie. if
           the host is a syslog server), analyze them anyway.

       -o file
           Output to file instead of to standard output.  Works with -m, so
           you can save to a file and send mail with one command.

       -O  With -o file, causes the output to go both to the file and to stan-
           dard output.  NB: this does not currently work with -m, so you
           can't output to a file, standard output, and to email.

       -p pgp_type
           Encrypts the mail output.  Uses pgp_type to determine the encryp-
           tion command.  For use with -m or mail_address.  See pgp_type in
           the list of global variables for info on encryption types.

       -r  "Real mode", i.e. monitor log files continuously.  Currently
           conflicts with many other modes and options.  Has built-in support
           for log file rollover.  See -g for a GUI that can sit on top of
           this mode, and -A to run real mode as a daemon.

           See variables real_mode, real_mode_output_format,
           real_mode_sleep_interval, real_mode_check_interval,
           real_mode_backlogs (or the -b option), and keep_all_raw_logs in the
           list of global variables for more configurables.

           WARNING: in real mode and gui mode, only the most recent file per
           glob in optional_log_files is monitored.  This means that you
           should set it to something like /var/log/messages* and
           /var/log/syslog* rather than /var/log/*.

           WARNING: in real mode and in gui mode, log_analysis treats days_ago
           differently; if it's a simple number, it is treated as the number
           of days ago to start looking at logs.  So, if days_ago is 7,
           log_analysis looks through the past 7 days' worth of logs.  HOW-
           EVER, even if -d is set, log_analysis doesn't actually show these
           logs unless -b is specified or the corresponding variable
           real_mode_backlogs is set.
           Usually, log_analysis will include its version number, the time it
           spent running, and its arguments at the end of the output.  This
           option suppresses that output.  The suppress_footer variable does
           the same thing as this option.

       -t forced_type
           log_analysis usually determines the type of logfiles by looking at
           the per-type log_filenames extension.  This option and the
           type_force variable let you bypass that check.

       -U unknowns-only
           Output logfile unknowns to stdout and exit.  If unknownsdir exists,
           also wipe it and then write out raw unknown lines to files in
           unknownsdir.  This exists to make writing custom rules easier.

       -u unknownsdir
           Use unknownsdir as the unknownsdir.  If unknownsdir already exists,
           and contains files, its files will be used as the input for
           log_analysis regardless of any other command line options.  If -U
           is also specified, after all processing unknownsdir will be wiped
           out and its files rewritten with the current unknowns.  This is
           useful for writing your own configs.

       -v version
           Output version and exit.

       required_files
           If files are specified on the command line, log_analysis ignores
           its built-in list of optional and required log files, and processes
           the files on the command line.  If one of the files doesn't exist,
           it's a fatal error.

CONFIG FILE

       The script has an embedded config file.  It will also read various
       external config files if they exist; see "FILES" for a list.  Later
       directives (from later in the file or from a file read later) override
       earlier directives.

       You can make comments with '#' at the beginning of a line.  If you want
       a '#' or '=' at the beginning of a line, you usually need to quote it
       with backslash.

       Some directives take a "block" as argument.  A block is a collection of
       lines that ends with a line that is empty or only contains whitespace.
       '#' at the beginning of a line still comments out the line.  Leading
       whitespace on a line is ignored.

       Before the config is parsed, it is passed through a preprocessor
       inspired by the aide(1) preprocessor.

       Pattern directives
       pattern: pattern
           pattern is a perl regex (see perlre(1)) that implicitly starts with
           ^ (beginning of the line) and implicitly ends with \s*$ (optional
           whitespace and the end of the line).  This should only be issued
           after a logtype: has been issued in the same config file.  Wildcard
           parts of the pattern should be surrounded with parentheses, to save
           these parts for later use in the format:.  Note that there are some
           tokens with special meanings that can be used here in the format
           $pat{something}, e.g. $pat{ip}, $pat{file}, etc. (see "pat" for
           details, and run log_analysis -I pats for the current list).
           Examples:

           pattern: popper: Stats: ($pat{mail_user}) (\d+) (\d+) (\d+) (\d+)

           pattern: login: LOGIN ON ($pat{file}) BY ($pat{user})

           The order of precedence for patterns is undefined, except that
           user-defined patterns always have precedence over the patterns of
           the internal config.
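
           The implicit anchoring and the $pat{...} tokens can be illustrated
           with a small Python sketch.  It only mimics the behaviour described
           above: compile_rule is a hypothetical helper, and the "file" and
           "user" subpatterns are stand-ins (assumptions), not the definitions
           shipped in the internal config.

```python
import re

# Hypothetical subpatterns; the real ones live in the "pat" array.
PAT = {"file": r"\S+", "user": r"\S+"}

def compile_rule(pattern):
    """Expand $pat{name} tokens, then add the implicit anchors."""
    expanded = re.sub(
        r"\$pat\{(\w+)\}",
        lambda m: "(?:" + PAT[m.group(1)] + ")",
        pattern,
    )
    # Implicit ^ at the start, optional whitespace and $ at the end.
    return re.compile("^" + expanded + r"\s*$")

rule = compile_rule(r"login: LOGIN ON ($pat{file}) BY ($pat{user})")
match = rule.match("login: LOGIN ON tty1 BY alice   ")
print(match.group(1), match.group(2))  # tty1 alice
print(rule.match("xlogin: LOGIN ON tty1 BY alice"))  # None
```

           The parenthesized parts become $1 and $2 for a later format:
           directive; trailing whitespace is tolerated, but leading junk is
           not.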

       format: format
           format is treated as a string that contains the useful information
           from a pattern.  Note that it should not actually be quoted.  A
           format is mandatory for category destinations, but should not be
           used with SKIP or LAST destinations.

           For example, if we had a pattern that was login: LOGIN ON
           ($pat{file}) BY ($pat{user}), we would probably just want $2, so we
           might say:

           format: $2

           Similarly, if we had a pattern that was kernel: deny (\d+) packets
           from ($pat{ip}) to ($pat{ip}), we might want to say:

           format: $2 => $3

       use_sprintf
           use_sprintf is optional.  If this directive is present for a given
           format, then instead of the format being treated as a string, it is
           treated as the arguments for sprintf(3).  For example, if you have
           a source IP address in $2 and a destination IP address in $3, you
           could just have format as $2 => $3, but you would have things
           lining up better if you did this:

           format: "%-15s => $3", $2

           use_sprintf
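
           To see why the sprintf form lines up better, here is a rough Python
           analogue using the % operator, which follows the same conversion
           rules as sprintf(3) for %-15s.  The function names and the sample
           addresses are made up for the illustration.

```python
def plain_format(src, dst):
    # Analogue of "format: $2 => $3" without use_sprintf.
    return src + " => " + dst

def sprintf_format(src, dst):
    # Analogue of 'format: "%-15s => $3", $2' with use_sprintf:
    # the source address is left-justified in a 15-character field.
    return "%-15s => %s" % (src, dst)

pairs = [("10.0.0.1", "192.168.1.10"), ("192.168.100.200", "10.0.0.2")]
for src, dst in pairs:
    print(plain_format(src, dst))    # arrows at different columns
for src, dst in pairs:
    print(sprintf_format(src, dst))  # arrows share one column
```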

       delete_if_unique
           delete_if_unique is optional.  This feature can be used when you
           have multiple dests for one pattern, one of which is a regular cat-
           count: $1

       color: colors
           A space-separated list of colors to display this message in when in
           real mode or gui mode.  For a list of colors that will work in both
           modes, run log_analysis -I colors.  Note that "bell" is among the
           available colors, because it didn't fit anywhere else.  See the
           colors entry for more info.

           NOTE: if multiple dest configs with conflicting color settings
           result in delivery to the same line in gui mode, the result is cur-
           rently undefined.  There is only one line to be displayed, after
           all.

       description: description_text
           This is a simple text description of the event, to explain the
           problem to your operators.  It can be accessed via gui mode.  The
           note above under color: also applies here.

       do_action: action
           Run "action" (described elsewhere in the config with the "action:"
           keyword) if this event is seen in real mode or gui mode.

       priority: priority
           Assign priority priority to action.  Currently, the only priority
           that does anything is "IGNORE".  It can be used to ignore events.

       dest: dest
           This describes what you want done with the data in a pattern.  If
           dest is the special token SKIP the data is discarded.  If dest is
           the special token LAST, the data is assumed to be of the form "last
           message repeated N times", and we pretend as though the last mes-
           sage we saw occurred, using count as a multiplier.  If dest starts
           with the special token UNIQUE, we do special "unique" handling,
           which is covered in "UNIQUE DESTINATION".  If dest starts with the
           special token CATEGORY or is any other string, it is treated as a
           category that the pattern data should be saved to.  For example, if
           pattern was login: LOGIN ON ($pat{file}) BY ($pat{user}), and
           format was $2, then one might set dest to login: successful local
           login.  You must have a format defined before the dest.

           You can have multiple dest directives for a single pattern, if all
           of the dests are category destinations.  Each one needs its own
           format.  Similarly, if you set count or use_sprintf, they are tied
           to the particular dest you set them with.

           Note that dest "closes" the description of a destination, so you
           need to have any other related directives (ie. format, count,
           use_sprintf, delete_if_unique) before the dest directive.  This
           ordering is necessary to avoid ambiguity in the multiple-destina-
           tion case.

       match hostname: value
           This event config applies when the "category" is "value", or the
           "data" is value, or the "hostname" is "value".  If multiple match
           lines are supplied, they are ANDed together.

       color: color
       description: description_text
       do_action: action
       priority: priority
           color, description, do_action, and priority work the same way as
           they do in a "dest" config or in an "event" config.

           If "event", "dest", and "category" configs all apply to a given
           event, then "event" has highest precedence, followed by "dest",
           followed by "category".

       Category directives

       Several patterns can lead to the same category, so category-specific
       directives are associated with the category, not with a pattern.  Here
       are the category directives:

       category: category
           Specifies which category subsequent directives will define.

       filter: filter commands
           By default, log_analysis will output all the data it finds in a
           category.  Filters let you specify, say, that only the top 10 items
           should be output, or that only the items that occurred fewer than 5
           times should be output.  If a category has data, but none of the
           data meet the filter rules, then the category will be completely
           skipped.  See "FILTERS" for more info.

       sort: sorting keywords
           Specifies how this category should be sorted in the output.  Exam-
           ples are "funky", "string", "value", "reverse value", etc.  The
           default is "funky".  See "SORTING" for more info.

       derive: derive commands
           The usual way to populate categories is via the pattern config.
           But sometimes, you want to combine two or more elemental categories
           to make a new category.  Any categories derived in this manner may
           not be a destination for simple patterns.

           There are currently three subcommands for this (the quotes are lit-
           eral):

           "category1" add "category2"
           "category1" subtract "category2"
                   These do what you expect: take the values for the items in
                   category2 and add or subtract them from the values for the
                   items in category1.  Any item defined in either category

       color: color
       description: description_text
       do_action: action
       priority: priority
           color, description, do_action, and priority work the same way as
           they do in a "dest" config or in an "event" config.

           If "event", "dest", and "category" configs all apply to a given
           event, then "event" has highest precedence, followed by "dest",
           followed by "category".

       Action directives

       In real mode and in gui mode, sometimes you want an "action" (like
       paging someone) to automatically happen when a particular message is
       seen.  And in gui mode, you might want to run a command on a message
       interactively (e.g. to telnet or ssh into the host it came from).  The
       directives to do that (inspired by swatch(1)) are:

       action: action_name
           Starts defining a new action named action_name.

       command: command
           The command to run for the current action.  command uses the same
           tags as real_mode_output_format.

           WARNING: you can potentially shoot yourself in the foot by passing
           data that has not been sanitized to a command on your system.  Be
           careful!

       window: title
           Performing the action will require creating a window using title as
           the title.  The title will be passed to window_command as the "%t"
           tag.  title itself uses the same tags as real_mode_output_format.
           This only makes sense for gui mode.

           WARNING: you can potentially shoot yourself in the foot by passing
           data that has not been sanitized to a command on your system.  Be
           careful!

       use_pipe:
           The data in the event will be sent to the command via standard
           input.  The format used will be that specified by the
           default_action_format variable, unless overridden locally by the
           action_format: directive.  These formats allow the same tags as
           real_mode_output_format.

       action_format: format
           See use_pipe above.

       throttle: throttle_time
           Automatically-triggered actions can potentially result in a slew of
           tle_format variable, which defaults to "%c\n%d".  It can be
           overridden on a per-action basis with the throttle_format:
           directive, which takes the same tags as real_mode_output_format.
           If you want the throttle to be global to the action (say, a pager
           action), set throttle_format to a simple scalar value (like 1).

       throttle_format: format
           See throttle: above.

       Other directives


       config_version version-number
           Declare that the config is compatible with version version-number.
           This is for version-control purposes.  Every config file should
           have one of these.  You can scan your config files' config versions
           with -I config_versions.

       file_version revision-information
           Your own version control information.  revision-information can be
           arbitrary text.  You can scan your config files' file versions with
           -I config_versions.

       include file
           Read in configuration from file.  Dies if file doesn't exist.  file
           is subject to usual tag substitutions; see "TAG SUBSTITUTION".

       include_if_exists file
           Just like include, but doesn't die if the file doesn't exist.

       include_dir dir
           Read in all files in dir, and include them.  Die if the directory
           doesn't exist, or if a file in the directory isn't readable.  dir
           is subject to the usual tag substitutions; see "TAG SUBSTITUTION".
           Any filenames that match a pattern in filename_ignore_patterns will
           be skipped.

       include_dir_if_exists dir
           Just like include_dir, but doesn't die if the directory doesn't
           exist.  Does still die if any of the files in dir isn't readable.

       block_comment
           Throws out the block immediately after it.

       set var varname =value
           Set scalar variable varname to value value.  If the variable
           already exists, this will overwrite it.

           See "VARIABLES" for the list of variables you can play with.

       add var varname =value
           If scalar variable varname already exists, append value to the end
           into an array, and set the array variable arrname to that array.

           See "VARIABLES" for the list of variables you can play with.

       add arr arrname =
           Read in the block that follows this declaration, make the lines
           into an array, and append that array to the array named arrname.

           See "VARIABLES" for the list of variables you can play with.

       prepend arr arrname =
           Read in the block that follows this declaration, make the lines
           into an array, and prepend that array to the array named arrname.

           See "VARIABLES" for the list of variables you can play with.

       remove arr arrname =
           Read in the block that follows this declaration, and for each line,
           look for and delete that line from array arrname.  If one of these
           lines cannot be found, the result is a warning, not death.

           See "VARIABLES" for the list of variables you can play with.
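
           The array directives above can be summarized with a small Python
           model.  This is only a sketch of the described semantics: the
           function names are invented, and removing just the first matching
           line is an assumption, since the text does not say what happens
           when a line occurs more than once.

```python
import warnings

def add_arr(arr, block):
    """add arr: append the block's lines to the array."""
    return arr + block

def prepend_arr(arr, block):
    """prepend arr: put the block's lines in front of the array."""
    return block + arr

def remove_arr(arr, block):
    """remove arr: delete each block line; a missing line only warns."""
    result = list(arr)
    for line in block:
        if line in result:
            result.remove(line)  # assumption: first occurrence only
        else:
            warnings.warn(f"remove arr: {line!r} not found")
    return result

files = ["/var/log/messages*"]
files = add_arr(files, ["/var/log/syslog*"])
files = prepend_arr(files, ["/var/adm/messages*"])
files = remove_arr(files, ["/var/log/syslog*"])
print(files)  # ['/var/adm/messages*', '/var/log/messages*']
```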

       local OTHER DIRECTIVE
           Putting "local" in front of another directive means that this
           directive should be saved when gui_mode_config_savelocal is in
           effect.

       nowarn OTHER DIRECTIVE
           Putting "nowarn" in front of another directive means that this
           directive should not generate a config warning, e.g. for redefining
           a category filter.

VARIABLES

       Some variables are scalar, which means they are strings or numbers.
       Some variables are arrays, which are lists of scalars.

       Some variables are mandatory, which means they must be defined some-
       where in one of the config files, while some variables are optional.

       Some variables are global, while some are per-log-type extensions.
       Some examples of per-log-type extensions are date_pattern and
       filenames.  Extensions should actually appear in the format
       "TYPE_EXTENSION", e.g. date_pattern would actually appear as
       syslog_date_pattern for the syslog log-type and sulog_date_pattern for
       sulog.

       To see examples of many of the possibilities, as well as the default
       values, run log_analysis -I internal_config.

       PER-LOG-TYPE VARIABLE EXTENSIONS


           diate data will be stored in a temp file unless pipe_decom-
           press_to_open is used.  See "pipe_decompress_to_open" for more
           info.

       pipe_decompress_to_open
           If both decompression_rules and open_command apply to a given file,
           the intermediate data will be stored in a temporary file by default
           to avoid problems with some commands that can't handle input from a
           pipe.  If this optional scalar extension is set to 1 (or any "true"
           value), then instead, the output of the decompression rule will be
           piped to the open command, and the open command's %f tag will be
           mapped to "-".

       open_command_is_continuous
           If an open_command has been specified and the command is the sort
           that never exits (e.g. tcpdump or the like) you should set this to
           let log_analysis know what to expect.  Such commands should only
           ever be used in real mode or gui mode.

       pre_date_hook
           This optional extension is an array of arbitrary perl commands that
           are run for each log line, before the date processing (or any other
           processing) is done.

       date_pattern
           This mandatory extension is a scalar that contains a pattern with
           at least one parenthesized subpattern.  Before any rules are
           applied to a log line, the engine strips off the date pattern.  If
           the engine is only looking at one day (ie. the default), it takes
           the part of the string that matched the parenthesized subpattern,
           and if it isn't equal to the right date, it skips the line.  The
           date_format extension (next) describes what the date should look
           like.
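
           The single-day check described above can be sketched in Python as
           follows.  The regular expression is an assumed syslog-style date
           pattern (the real syslog_date_pattern may differ), strip_date and
           format_b_e are hypothetical helpers, and returning None for a line
           that does not match at all is a choice made for the sketch.  The
           expected date is rendered the way a "%b %e" date format would
           render it.

```python
import re
from datetime import date

# Assumed date pattern: the parenthesized part is the day, e.g. "Mar  2".
DATE_PATTERN = re.compile(r"^(\w{3} [ \d]\d) \d\d:\d\d:\d\d ")

def format_b_e(day):
    """Render a date like strftime's "%b %e" (day padded with a space)."""
    return f"{day.strftime('%b')} {day.day:2d}"

def strip_date(line, wanted):
    """Return the line minus its date prefix, or None if the day differs."""
    m = DATE_PATTERN.match(line)
    if m is None or m.group(1) != format_b_e(wanted):
        return None  # wrong day (or no date found): skip the line
    return line[m.end():]

line = "Mar  2 04:05:06 host sshd[1]: ok"
print(strip_date(line, date(2001, 3, 2)))  # host sshd[1]: ok
print(strip_date(line, date(2001, 3, 3)))  # None
```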

       date_format
           This mandatory extension is a scalar that describes the date using
           the same format as strftime(3).  For example, syslog_date_format is
           "%b %e".

       nodename_pattern
           This optional extension is a pattern with at least one parenthe-
           sized subpattern.  If it exists, then after the date_pattern is
           stripped from the line, this pattern is stripped, and the part that
           matched the subpattern is compared to the nodename.  If they're not
           equal, then the relevant counter for the category named by the
           other_host_message variable is incremented.  Note that all node-
           names are subject to having the local domain stripped from them;
           see domain and leave_FQDNs_alone for details.

       pre_skip_list_hook
           This optional extension is an array of perl commands to be run
           after the nodename check, just before the skip_list check.
           This variable is a mandatory global array that contains the list of
           all known log-types, ie. syslog, sulog, wtmpx, etc.

       pat This variable is a mandatory global array that contains a list of
           subpattern names followed by a comma, optional whitespace, and a
           perl regex that represents that subpattern.  Some of the predefined
           patterns include "ip", "zone", "user", "mail_user", etc.  Run
           log_analysis -I pats for a list.

       host_pat
       file_pat
       ip_pat
       mail_user_pat
       user_pat
       word_pat
       zone_pat
           Legacy variables.  Please don't use them.

       other_host_message
       output_message_one_day
       output_message_all_days
       output_message_all_days_in_range
           Assorted mandatory scalars that are used for human-readable output.
           other_host_message defaults to "Other hosts syslogging to us",
           output_message_one_day defaults to "Logs for %n on %d",
           output_message_all_days defaults to "All logs for %n as of %d", and
           output_message_all_days_in_range defaults to "All logs for %n for
           %s through %e".

       date_format
           This variable is a mandatory global scalar that describes how you
           want the date printed in the output.  Uses the format of
           strftime(3).  Note that you probably shouldn't use characters that
           you wouldn't want in a filename (e.g. whitespace or '/') if you want
           to use the %d tag for output_file.

       output_file
           Equivalent to -o file.  This variable is an optional global scalar
           that names a file that will be output to instead of standard
           output.  Works with mail_address (if specified).  Note that this
           variable is subject to the usual tag substitutions (see "TAG
           SUBSTITUTIONS"), plus you can use the %d tag for the date, so you
           can set it to something like "/var/log_analysis/archive/%n-%d".
           See output_file_and_stdout.

       output_file_and_stdout
           Equivalent to -O.  This variable is an optional global scalar that
           changes the behavior of -o or output_file.  By default, -o or
           output_file causes output to go only to the named file.  With this
           variable, output also goes to standard output.  Note: this does not
           currently work with -m.

           ing in default config files.  Their values set the s and r tags,
           respectively.

       domain
           This variable is an optional global scalar.  If you don't set it,
           log_analysis will try to set it by looking for a domain line in
           /etc/resolv.conf.  If log_analysis has domain set, it will attempt
           to strip away the local domain name from all nodenames it encoun-
           ters, unless leave_FQDNs_alone is set.  See leave_FQDNs_alone for
           details.

       leave_FQDNs_alone
           This variable is an optional global scalar.  By default, if
           log_analysis has domain set (either explicitly or implicitly), it
           will attempt to strip away the domain name in domain, or
           "localdomain", from all nodenames it encounters.  If you set this
           to 1, or to some other true value, log_analysis will not attempt to
           strip the domain name in domain.

       PATH
           This variable is an optional global scalar that sets the PATH
           environment variable.  This doesn't help the initial setting of
           nodename, osname, or osrelease, which are set from uname(2).

       umask
           This variable is an optional global scalar that sets the umask.
           See umask(2).

       priority
           This variable is an optional global scalar that sets the priority,
           or "niceness."  See nice(1).  Setting this to zero means run
           unchanged from the current niceness.  Setting this negative is a
           bad idea unless you really know what you're doing, and is forbidden
           to non-root users.

       decompression_rules
           This variable is an optional global array of rules to decompress
           compressed files, in the format: compression-extension, comma,
           space, command to decompress to stdout.  The command is subject to
           the usual tag substitutions (see "TAG SUBSTITUTIONS"), plus %f
           stands for the filename.  For example, the rule for gzipped files
           is:

           "gz, gzip -dc %f"

           The default rules support: .gz .Z .bz2

           If both decompression_rules and open_command apply to a given file,
           the default is to use a temp file for the intermediate results
           unless pipe_decompress_to_open is used.  See
           "pipe_decompress_to_open" for more info.
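
           As a rough illustration of the rule format (extension, comma,
           space, command with %f), the following Python sketch picks the rule
           for a file and fills in the filename.  The helper names are
           hypothetical, the bz2 rule is an assumed example rather than the
           shipped default, and the other tag substitutions and any shell
           quoting of the filename are left out.

```python
def parse_rule(rule):
    """Split "ext, command" at the first comma-plus-space."""
    ext, command = rule.split(", ", 1)
    return ext, command

def decompress_command(filename, rules):
    """Return the command for the first rule whose extension matches."""
    for rule in rules:
        ext, command = parse_rule(rule)
        if filename.endswith("." + ext):
            return command.replace("%f", filename)
    return None  # not a compressed file as far as the rules know

rules = [
    "gz, gzip -dc %f",     # the gzip rule quoted above
    "bz2, bzip2 -dc %f",   # assumed rule, for illustration only
]
print(decompress_command("/var/log/messages.1.gz", rules))
# gzip -dc /var/log/messages.1.gz
```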

           the mail destination's key for this to work.  Make sure to test
           this before you put it in a cronjob.

       filename_ignore_patterns
           This variable is an optional global array of patterns that describe
           filenames to be skipped in an include_dir/include_dir_if_exists
           context, such as emacs backup files (".*~") or vim backup files
           ("\..*\.swp").  Only the file component of the path is examined,
           not the directory component.  Patterns implicitly begin with ^ and
           implicitly end with $.
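
           The implicit anchoring can be sketched in Python as follows.  This
           is an illustration only: log_analysis uses Perl regular expressions
           (see perlre(1)), and the function name is_ignored and the sample
           paths are hypothetical.

```python
import os
import re

def is_ignored(path, patterns):
    """True if the file component of path fully matches any pattern."""
    name = os.path.basename(path)
    # re.fullmatch supplies the implicit leading ^ and trailing $.
    return any(re.fullmatch(p, name) for p in patterns)

patterns = [r".*~", r"\..*\.swp"]
print(is_ignored("/etc/log_analysis.d/local.conf~", patterns))
print(is_ignored("/etc/log_analysis.d/.local.conf.swp", patterns))
print(is_ignored("/etc/log_analysis.d/local.conf", patterns))
```

           Because only the file component is examined, a "~" in a directory
           name does not cause the files below it to be skipped.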

       mail_address
           This variable is an optional global scalar that can consist of an
           email address.  If set, the output of the script will be mailed to
           the address it is set to.  The -m option does the same thing, and
           overrides this.

       mail_command
           This variable is an optional global scalar that is the command used
           to send mail if -m is used or mail_address is set.  The -M option
           does the same thing, and overrides this.  This variable is subject
           to the usual tag substitutions, plus %m stands for mail_address and
           %o stands for the relevant output message.  The default is:

           "Mail -s '%o' %m"

       memory_size_command
           This variable is an optional global scalar that is the command used
           to determine the process' memory size.  Subject to the usual tag
           substitutions, plus %p stands for the PID (process ID) in question.
           If set, the command is run at the end of the report, and the output
           is included in the footer.

           The default value for Linux is:

           "ps -p %p -o vsz | tail -n +2"

           The default value for Solaris/SunOS is:

           "ps -p %p -o vsz | tail -n +2"

       optional_log_files
           This variable is an optional array of file globs that are to be
           processed.  Note that, unlike required_log_files, these are globs
           rather than literal filenames, although literal filenames will also
           work.  [Globs are filenames with wildcards, ie. /var/adm/mes-
           sages*.]

           See -r for an issue specific to real mode and gui mode.

       commands_to_run
           This variable is an optional array of commands that are also sup-
           If set, the commands in commands_to_run are NOT run during report
           mode.  This is equivalent to the -s option.

       suppress_footer
           If set, the various report mode footers are not displayed.  This is
           equivalent to the -S option.

       ignore_categories
           This variable is an optional array of categories that you don't
           want to see.  Rather than try to remove all the rules for these
           categories, you can just list them here.

       priority_categories
           This variable is an optional array of categories that will be
           listed first in the output.

       days_ago
           This optional scalar variable is the config equivalent of the -d
           option.

       process_all_nodenames
           This optional scalar variable is the config equivalent of the -N
           option.

       type_force
           This optional scalar is the config equivalent of the -t option.

       allow_nodenames
           This variable is an optional array of nodenames that can log to
           this host.  Usually, logs labelled as being from another host will
           not be analyzed, and each such line will be listed in a special
           category; if you choose to allow some nodenames (or if you choose
           to process all nodenames by setting -N or setting
           process_all_nodenames) then these log messages will also be
           processed.

       real_mode
           This variable is the config equivalent of the -r option; see the -r
           option for more details.

       real_mode_output_format
           This is a required global scalar.  It describes the per-output
           format for real mode and gui mode.  It is subject to normal tag
           substitution (see "TAG SUBSTITUTIONS"); in addition to the normal
           tags, "%c" is replaced with the category, "%#" is replaced with the
           count, "%d" is replaced with the formatted data, "%h" is replaced
           with the nodename of the message, and "%R" is the raw, original log
           line without the trailing newline.  If keep_all_raw_logs is set,
           you also get "%A" for all the raw log lines.  WARNING: you usually
           want "%h" (nodename of the message), not "%n" (nodename of the host
           you're running on, which is one of the default tag substitutions).
           Defaults to "%c: (loghost %n, from host %h)\n%-10# %d\n\n".

       keep_all_raw_logs
           This optional global scalar is a boolean for use with real mode and
           gui mode.  It enables a %A tag that contains all the raw logs for a
           given entry.  That is, if you have multiple log lines that contain
           essentially the same data, only the first line shows up in %R, and
           the rest are thrown out.  This variable lets you keep them all.  It
           can eat up a lot of memory, so it's disabled by default.

       real_mode_backlogs
           This optional global scalar is equivalent to -b.

       colors
           This variable is an optional global array for use with real mode
           and gui mode.  It defines the colors available on console, using
           "name, string" pairs.  The usual tag substitution rules apply to
           the string, plus the special tag %a stands for octal character 007
           (ASCII BEL) and %e stands for octal character 033 (ASCII ESC).
           Some of the colors are actually mode changes (ie. "normal",
           "inverse", "reverse", "blink", etc.)  If you define any colors, you
           should also define a "normal" color.  Note that "bell" is among the
           colors; it didn't belong anywhere else.  You can list colors with
           log_analysis -I colors.
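
           As a rough Python sketch of how "name, string" pairs with %a and %e
           might expand: the pairs below are hypothetical examples using
           standard ANSI escape sequences, not the tool's built-in list (use
           log_analysis -I colors for that), and the other standard tags are
           deliberately left unexpanded here.

```python
import re

# Color-specific tags: %a is BEL (octal 007), %e is ESC (octal 033).
SPECIAL = {"a": "\007", "e": "\033", "%": "%"}

def expand_color(string):
    """Expand %a, %e and %% in a color string."""
    return re.sub(r"%([ae%])", lambda m: SPECIAL[m.group(1)], string)

def parse_colors(pairs):
    """Turn 'name, string' pairs into a name -> escape-sequence mapping."""
    colors = {}
    for pair in pairs:
        name, string = pair.split(", ", 1)
        colors[name] = expand_color(string)
    return colors

# Hypothetical definitions; note the accompanying "normal" entry.
colors = parse_colors(["red, %e[31m", "normal, %e[0m", "bell, %a"])
print(repr(colors["red"]))
```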

       gui_mode
           This variable is the config equivalent of the -g option; see the -g
           option for more details.  It is an optional scalar.

       gui_mode_modifier
           In gui mode, the default modifier to do things with the keyboard is
           "alt", ie. "alt-q" to exit.  This lets you change it.  It is an
           optional scalar.

       report_mode_output_node_per_category
       report_mode_combine_nodes
       report_mode_combine_shows_nodes
       report_mode_combine_is_partway
           These are assorted options for dealing with output for multiple
           node situations (ie. logservers.)  They are all optional scalars.
           See "LOGSERVER CONSIDERATIONS" for details.

       window_command
           In gui mode, if we need a window to run a command, say an action,
           this will be the command that is used.  The tags are the same as
           real_mode_output_format, plus we have "%t" as the title and "%C" as
           the command.  It is an optional scalar.

       login_action
           This optional array lets you specify what action should be used to
           login to a given host in gui mode, overriding default_login_action.
           Lines are in the format host, login_action.

       default_login_action
       gui_mode_config_save_does_rcs
       gui_mode_config_file
       gui_mode_print_all
       gui_mode_save_all
       gui_mode_save_events_file
           These are for GUI use.

       default_sort
           This variable is an optional global scalar that describes how cer-
           tain things will be sorted.  See "SORTING" for info on what this
           can be set to.  Defaults to funky.

       default_filter
           This variable is an optional global scalar that describes the
           default category filter.  See "FILTERS" for info on what this can
           be set to.

PREPROCESSOR DIRECTIVES

       NB: these get completely processed before all other directives, so they
       don't care about other syntax elements.  Except as noted, these should
       appear at the beginning of the line after optional whitespace.

       @@end
           End of config file.

       @@define var val
           Define var as value val.  var should contain only alphanumerics and
           underscores, and start with an alphanumeric.  val may contain no
           whitespace.

       @@undef var
           Undo any previous definition of var.

       @@ifdef var
       @@ifndef var
       @@else
       @@endif
           If variable var is defined, even defined as a false value, the
           lines after the @@ifdef are used, otherwise the lines are
           effectively commented out.  @@ifndef is the logical reverse.
           @@ifdef and @@ifndef must be terminated by an @@endif.  They may
           contain an @@else section that works in the usual way.

       @@ifhost name
       @@ifnhost name
           These are just like @@ifdef and @@ifndef above, except that they
           test if the variable nodename is equal to the value supplied for
           name.

       @@ifos name
       @@ifnos name
           These are just like @@ifdef and @@ifndef above, except that they

SORTING

       You can sort category items using several different criteria.  You can
       set the default_sort, and then on a per-category basis, you can use the
       sort: keyword to control things even closer.  If you don't override it,
       default_sort defaults to funky.

       Sorts stack, so you can use "reverse string" or "reverse value".  In
       theory, you can stack all of them, ie. "reverse value reverse funky",
       but there is no guarantee that sorts are stable.

       The available sorts are:

       string
           Simple string "lexicographical" sort.  Does not handle numbers
           well.

       numeric
           Sorts numbers, including decimal numbers, correctly, but cannot
           handle non-numeric characters, and cannot handle IPs correctly.

       funky
           Tries to do the right thing with mixed integers and strings.
           Handles IP addresses correctly.  It does not handle decimal numbers
           correctly.

       reverse
           Reverses the current order.  Can be used in conjunction with
           another sort, ie. "reverse string".

       value
           Sorts by count (ascending) instead of by item.

       none
           Does no additional sorting.
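
       To illustrate the difference between the string and funky sorts, here
       is a minimal Python sketch of a number-aware sort key.  It is one
       plausible way to get the behaviour described, not the algorithm
       log_analysis actually uses; the name funky_key is hypothetical.

```python
import re

def funky_key(item):
    """Split into digit and non-digit runs; compare digit runs as integers."""
    parts = re.split(r"(\d+)", item)
    # With a capturing group, odd positions hold the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

items = ["10.0.0.10", "10.0.0.9", "10.0.0.100"]
print(sorted(items))                 # plain string sort
print(sorted(items, key=funky_key))  # number-aware sort
```

       A key like this treats "1.5" and "1.10" as the integer pairs (1, 5) and
       (1, 10), which is why a sort of this kind cannot order decimal numbers
       correctly.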

FILTERS

       Sometimes, you don't want to see all the information in a category,
       just the top few items, or whatever.  Filters let you do this.  You can
       set a default filter using default_filter (defaults to "none") or you
       can set filters on a per-category basis using the filter: keyword.

       Some commands you can use:

       >= N
           Only show items whose count is greater than or equal to N.

       <= N
       > N
       < N
       = N, == N
       != N, <> N, >< N
           These are analogous to >=.

       bottom_strict N%
           Analogous to top.

       subfilter and subfilter
       subfilter or subfilter
           Lets you "and" or "or" two or more subfilters together (ie. "top 10
           and >= 4").
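
       The comparison filters and the and/or connectives can be sketched in
       Python as below.  This is an illustration under stated assumptions, not
       the tool's parser: compile_filter is a hypothetical name, only the
       comparison operators are handled, and connectives are evaluated
       strictly left to right because the precedence rules are not documented
       here.

```python
import operator
import re

OPS = {
    ">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt,
    "=": operator.eq, "==": operator.eq,
    "!=": operator.ne, "<>": operator.ne, "><": operator.ne,
}

def parse_subfilter(sub):
    """Turn one comparison such as '>= 4' into a predicate on a count."""
    m = re.fullmatch(r"(>=|<=|==|!=|<>|><|>|<|=)\s*(\d+)", sub.strip())
    if not m:
        raise ValueError("unsupported subfilter: %r" % sub)
    op, n = OPS[m.group(1)], int(m.group(2))
    return lambda count: op(count, n)

def compile_filter(text):
    """Combine subfilters joined by 'and'/'or', left to right (assumption)."""
    tokens = re.split(r"\s+(and|or)\s+", text.strip())
    preds = [parse_subfilter(t) for t in tokens[0::2]]
    joins = tokens[1::2]

    def predicate(count):
        result = preds[0](count)
        for join, pred in zip(joins, preds[1:]):
            if join == "and":
                result = result and pred(count)
            else:
                result = result or pred(count)
        return result

    return predicate

counts = {"a": 1, "b": 4, "c": 9}
keep = compile_filter(">= 4 and < 9")
print([item for item, count in counts.items() if keep(count)])
```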

UNIQUE DESTINATION

       log_analysis has a relatively simple counting mechanism that is usually
       effective.  One exception is when you want to track how often one value
       occurs in your log uniquely with another value.  For example, suppose
       you're watching firewall logs, $1 is the source IP, $2 is the
       destination IP, and you want to know if you're being scanned.  Tracking
       counts of "$1 $2" requires you to manually count how many times $1
       occurs.  Tracking just "$1" doesn't really tell you what you want,
       because you don't know if the source IP is really scanning a bunch of
       different hosts, or just has a renegade process that's banging away at
       a single destination.  What you want to track is how many times $1
       occurs with a unique $2.

       To do this sort of thing in a pattern config, set format: to value1,
       value2 and set dest: to "UNIQUE category-name".  In our example, we
       might say:

           format: $1, $2
           dest: UNIQUE scans

       The fields in format are not evaluated in a string context, and only
       the last comma acts as a separator.  So, if $3 contains the protocol
       information, you might say this:

           format: sprintf("%-15s %s", $1, $3), $2
           dest: UNIQUE scans

       When detecting scans in particular, it makes sense to specify an event
       filter, ie.:

           category: scans
           filter: >= 5

       Note that it's often useful to specify multiple dests with firewall
       patterns, ie. one regular category dest, one UNIQUE dest with a filter
       threshold to detect a scan.  If so, you might want to add
       delete_if_unique to the regular dest, so if it turns out you have a
       scan, you don't have to wade through lots of garbage.  Ie.:

           pattern: kernel: block from ($pat{ip}):($pat{port}) to ($pat{ip}):($pat{port})
           format: $1 => $3:$4
           delete_if_unique dest: kernel block

       for backslash).  Anything subject to tag substitutions will be listed
       as such.  Here are the standard tag sequences:

       %%  literal %
       %n  nodename (ie. the output of uname -n.)
       %r  OS release (ie. the output of uname -r.)
       %s  OS name (ie. the output of uname -s.)

       There are also other tag sequences that apply in special situations.
       They are listed where they apply.  If you try to use an undefined
       sequence (ie. "%Z" or something else), you'll get an error.
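
       The standard tag sequences can be illustrated with a short Python
       sketch.  It is not the tool's implementation: substitute_tags is a
       hypothetical name, and the tag values below are made-up stand-ins for
       what uname would report.

```python
import re

def substitute_tags(template, tags):
    """Expand %X sequences; '%%' becomes a literal percent sign."""
    def replace(match):
        tag = match.group(1)
        if tag == "%":
            return "%"
        if tag not in tags:
            # Undefined sequences are an error rather than passed through.
            raise ValueError("undefined tag sequence: %" + tag)
        return tags[tag]
    return re.sub(r"%(.)", replace, template, flags=re.DOTALL)

# Hypothetical values for %n, %s and %r.
tags = {"n": "loghost1", "s": "Linux", "r": "5.10"}
print(substitute_tags("/etc/log_analysis.conf-%s-%r", tags))
```

       Context-specific tags such as %f or %m would simply be extra entries in
       the mapping for the settings that support them.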

LOGSERVER CONSIDERATIONS

       log_analysis defaults to single host operation.  If you have a
       logserver that allows logs from multiple hosts (ie. centralized
       logging) then you potentially have two concerns: configuring what
       hostnames to allow, and how to display multi-node logs in report mode.

       By default, log_analysis will only allow logs from the nodename of the
       logserver, so if you want to allow other nodes, you need to tell
       log_analysis which hostnames it should allow logs from.  Either set
       allow_nodenames to a list of nodenames to allow logs from, or set
       process_all_nodenames (AKA option -N) to accept everything.  Another
       useful variable here is leave_FQDNs_alone.

       Once you've accepted multiple nodes, there are a number of ways
       log_analysis can display them.  Let's say I received two "Accepted
       publickey for morty from 192.168.1.1 port 50000 ssh2" events from
       "red-sonja" and three from "conan".  In the default mode, that would
       look like this:

           Logs found for other hosts.  For host conan:
           ...
           sshd: accepted publickey:
           3 morty from 192.168.1.1
           ...
           Logs found for other hosts.  For host red-sonja:
           ...
           sshd: accepted publickey:
           2 morty from 192.168.1.1
           ...

       You can get the categories listed together more compactly by setting
       report_mode_output_node_per_category.  Ie:

           ...

       If you set both report_mode_combine_nodes and
       report_mode_combine_shows_nodes, you get the combined messages along
       with a list of applicable hostnames.  Ie.:

           ...
           sshd: accepted publickey:
           5 morty from 192.168.1.1 (conan red-sonja)
           ...

       If you set both report_mode_combine_nodes and
       report_mode_combine_is_partway, the messages are listed like so:

           ...
           sshd: accepted publickey:
           3 morty from 192.168.1.1 (conan)
           2 morty from 192.168.1.1 (red-sonja)
           ...

       Other combinations of the variables
       report_mode_output_node_per_category, report_mode_combine_nodes,
       report_mode_combine_shows_nodes, and report_mode_combine_is_partway
       produce undefined results.
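
       The combined display with hostnames can be sketched in Python using the
       sshd example from this section.  This only illustrates the idea of
       summing counts across nodes while remembering the contributing hosts;
       the data layout and the names combine_nodes and report_combined are
       assumptions, not log_analysis internals.

```python
from collections import defaultdict

# (node, category, item, count) tuples for the sshd example.
events = [
    ("conan", "sshd: accepted publickey", "morty from 192.168.1.1", 3),
    ("red-sonja", "sshd: accepted publickey", "morty from 192.168.1.1", 2),
]

def combine_nodes(events):
    """Sum counts per (category, item) and record which nodes logged it."""
    totals = defaultdict(int)
    nodes = defaultdict(set)
    for node, category, item, count in events:
        totals[(category, item)] += count
        nodes[(category, item)].add(node)
    return totals, nodes

def report_combined(events):
    """Return report lines: a category header, then 'count item (hosts)'."""
    totals, nodes = combine_nodes(events)
    lines = []
    last_category = None
    for (category, item), total in sorted(totals.items()):
        if category != last_category:
            lines.append(category + ":")
            last_category = category
        hosts = " ".join(sorted(nodes[(category, item)]))
        lines.append("%d %s (%s)" % (total, item, hosts))
    return lines

print("\n".join(report_combined(events)))
```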

EXAMPLES

       log_analysis -m root@whatever
           Analyze yesterday's logs and mail the results to root@whatever.
           You might want to put this in a cronjob.

       log_analysis -p5 -m root@whatever
           Same as the last one, but PGP encrypt the logs using PGP 5 before
           mailing.

       log_analysis -a
           Look at all the logs, not just yesterday's.

       log_analysis -sa /var/adm/sulog
           Analyze all the contents of sulog, don't bother with local state.

       log_analysis -san otherhost syslog-file
           Analyze all the contents of syslog-file, which was created on
           "otherhost".  Don't run the local state commands.

       log_analysis -sd1 -f foo.conf -U
           This style of command is useful while developing local configs to
           handle log messages unknown to the internal config.

       makes the dates in such logfiles unambiguous.  log_analysis by default
       looks for log lines that match a particular day of the year, but does
       not even try to guess the year.

       If the OS you're using doesn't roll over some logfiles by default (ie.
       Solaris doesn't roll over /var/adm/wtmpx, /var/adm/wtmp, or
       /var/adm/sulog), you will need to roll over these files yourself to get
       valid output from this program.

       On some OSes, '%' (ie. the percent symbol) has a special meaning in
       crontabs, and needs to be escaped.  See crontab(1).

       When there are a lot of unknowns, log_analysis can take a lot longer to
       run.  This is particularly a problem when you're first running it,
       before you customize for your site.  To get around this problem, if you
       send log_analysis a SIGINT (ie. if you hit control-C), it will stop
       going through your logs and immediately output the results.

FILES

       /etc/log_analysis.conf
       /etc/log_analysis.conf-%n
       /etc/log_analysis.conf-%s-%r
       /etc/log_analysis.conf-%s
           Config files, in order of precedence.  "%n", "%s", and "%r" have
           the usual tag substitution meanings; see "TAG SUBSTITUTIONS".

       /etc/log_analysis.d
           Plug-in directory.  All files in this directory will be treated as
           config files and include'd.

       $HOME/.log_analysis.conf
           If you start log_analysis with the "-g" option, this file will be
           loaded as a config file after all other config files, except those
           specified by -f.  This is also the default file for the "save
           config" menu option.

AUTHOR

       Mordechai T. Abzug <morty@frakir.org>

SEE ALSO

       syslogd(8), last(1), perlre(1)

perl v5.8.8                       2008-01-13                   LOG_ANALYSIS(1)
