package VCP::Help;
%topics = (
########################################################################
'vcp' => <<'TOPIC',
NAME
vcp - Copy versions of files between repositories and/or RevML
SYNOPSIS
# interactive mode:
vcp [vcp_opts]
# scriptable command line mode:
vcp [vcp_opts] <source> <dest>
# getting options from a file:
vcp vcp:config.vcp
# help output:
vcp help
vcp help [topic]
DESCRIPTION
"vcp" ('version copy') copies versions of files from one repository
to another, translating as much metadata as possible along the way.
This allows you to copy and translate files and their histories
between revision storage systems.
Supported source and destination types are "cvs:", "p4:", and
"revml:".
Copying Versions
The general syntax of the vcp command line is:
vcp [<vcp options>] <source> <dest>
The three portions of the command line are:
"<vcp options>"
Command line options that control the operation of the "vcp"
command, like "-d" for debugging or "-h" for help. There are
very few global options; these are covered below. Note that they
must come before the "<source>" specification.
"<source>"
Where to extract versions from, including any command line
options needed to control what is extracted and how. See the
next section.
"<dest>"
Where to insert versions, including any command line options
needed to control how files are stored. See the next section.
Specifying Repositories
The "<source>" and "<dest>" specifications specify a repository and
provide any options needed for accessing that repository.
These specifications may be a simple filename for reading or writing
RevML files (if the requisite XML handling modules are installed),
or a full repository specification like "cvs:/home/cvs/root:module"
or "p4:user:password@server:port://depot/dir".
When using the long form to access a repository, the "<source>" and
"<dest>" specifications have several fields delimited by ":" and "@",
and may have trailing command line options. The full (rarely used)
syntax is:
scheme:user(view):password@repository:filespec [<options>]
where
"scheme:"
The repository type ("p4:", "cvs:", "revml:").
"user", "view", and "password"
Optional values for authenticating with the repository and
identifying which view to use. "cvs" does not use "view". For
"p4", "view" is the client setting (equivalent to setting
"P4CLIENT" or using "p4"'s "-c" option).
"repository"
The repository spec, CVSROOT for CVS or P4PORT for p4.
"filespec"
Which versions of what files to move. As much as possible, this
spec is similar to the native filespecs used by the repository
indicated by the scheme.
"<options>"
Command line options that usually mimic the options provided by
the underlying repositories' command line tools ("cvs", "p4",
etc).
Most of these fields are omitted in practice; only the "scheme"
field is required, though (in most cases) the "repository" field is
also needed unless you set the appropriate environment variables
("CVSROOT", "P4PORT", etc).
This can be a bit confusing, so here are some example specs:
cvs:server:/foo
p4:user@server://depot/foo/...
p4:user:password@public.perforce.com:1666://depot/foo/...
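The field layout above can be illustrated with a small Perl sketch. This is
not vcp's own parser: the helper name parse_spec_sketch() is hypothetical, and
it assumes that the optional authentication fields end at the first "@" and
that the last ":" in the remainder separates the repository from the filespec.
CVS roots that themselves contain "@" (pserver roots, for instance) are outside
what this sketch handles.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; this is NOT vcp's own parser.
# Assumes: an optional "user(view):password@" prefix ends at the first "@",
# and the last ":" in the remainder separates repository from filespec.
sub parse_spec_sketch {
    my ($spec) = @_;

    # The scheme is the only required field.
    my ( $scheme, $rest ) = $spec =~ /\A(\w+):(.*)\z/s
        or return;
    my %fields = ( scheme => $scheme );

    # Optional authentication fields: user(view):password@
    if ( $rest =~ s/\A([^@]*)@// ) {
        my $auth = $1;
        @fields{qw( user view password )} =
            $auth =~ /\A([^:(]*)(?:\(([^)]*)\))?(?::(.*))?\z/s;
    }

    # Greedy match, so "server:1666://depot/..." splits at the last ":".
    if ( $rest =~ /\A(.*):(.*)\z/s ) {
        @fields{qw( repository filespec )} = ( $1, $2 );
    }
    else {
        # Assumption: with no ":" left, the remainder is just a filespec.
        $fields{filespec} = $rest;
    }
    return \%fields;
}

my $f = parse_spec_sketch(
    'p4:user:password@public.perforce.com:1666://depot/foo/...'
);
printf "%s => %s\n", $_, $f->{$_} // "(unset)"
    for qw( scheme user view password repository filespec );
```

Fields that are omitted from a spec come back undefined here, which mirrors
the idea that the environment ("CVSROOT", "P4PORT", etc) supplies them.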
Options and formats for individual schemes can be found in the
relevant help topics, for instance:
vcp help source::cvs
Run "vcp help" for a list of topics.
When reading and writing RevML files, a simple filename will do
(although the long form may also be used). The special value "-"
means to read/write stdin and stdout when used as a source or
destination name, respectively. "-" is assumed if a specification is
not provided, so these invocations all accomplish the same thing,
reading and writing RevML:
vcp
vcp -
vcp revml:-
vcp revml:
vcp - -
vcp - revml:-
vcp - revml:
vcp revml:- revml:-
vcp revml: revml:
"vcp" Options
All general options to vcp must precede the "<source>".
Scheme-specific options must be placed immediately after the
"<source>" or "<dest>" spec and before the next one.
--debug, -d
Enables logging of debugging information.
--help, -h, -?
These are all equivalent to "vcp help".
--output-config-file=$filename
Write the settings (parsed from the UI, the command line, or a
config file) to a file. Useful for capturing settings or user
interface output. Does not affect running. Use "-" to emit to
STDOUT.
NOTE 1: This does *not* emit an "Options:" section containing
global options (those listed here). Almost all of these options
are not useful to emit; we can add an option to force their
emission if need be.
NOTE 2: When using the interactive user interface, this option
takes effect after the last interactive portion and, if vcp goes
on to run a conversion, before any conversion is run. This
occurs in addition to any configuration files the user may ask
the interactive interface to write. This may change in the
future (for instance, if the interactive dialog includes an
option to extract and analyze metadata).
--dont-convert
Do not run a conversion. Useful when you just want to emit a
.vcp file.
--versions
Emits the version numbers of bundled files.
--terse, -t
Suppress verbose explanations when running the interactive UI.
Has no effect on operation if all settings are read from the
command line or a .vcp file.
--quiet, -q
Suppresses the banner and progress bars.
Getting help
(See also Generating HTML Documentation, below).
There is a slightly different command line format for requesting
help:
vcp help [<topic>]
where "<topic>" is the optional name of a topic. "vcp help" without
a "<topic>" prints out a list of topics, and "vcp help vcp" emits
this page.
All help documents are also available as Unix "man" pages and using
the "perldoc" command, although the names are slightly different:
with vcp               via perldoc
====================   ========================
vcp help vcp           perldoc vcp
vcp help source::cvs   perldoc VCP::Source::cvs
vcp help dest::p4      perldoc VCP::Dest::p4
"vcp help" is case insensitive; "perldoc" and "man" may or may not
be, depending on your filesystem. The "man" commands look just like
the example "perldoc" commands except for the command name. Both
have the advantage that they use your system's configured pager if
possible.
Environment Variables
The environment is often used to set context for the source and
destination by way of variables like P4USER, P4CLIENT, CVSROOT, etc.
VCPDEBUG
The VCPDEBUG variable acts just as if "-d=$VCPDEBUG" were present
on the command line:
VCPDEBUG=1
(see "--debug, -d" for more info). This is useful when VCP is
embedded in another application, like a makefile or a test
suite.
SEE ALSO
VCP::Process, VCP::Newlines, VCP::Source::p4, VCP::Dest::p4,
VCP::Source::cvs, VCP::Dest::cvs, VCP::Source::revml,
VCP::Dest::revml. All are also available using "vcp help".
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'vcp usage' => <<'TOPIC',
Usage:
# interactive mode:
vcp [vcp_opts]
# scriptable command line mode:
vcp [vcp_opts] <source> <dest>
# getting options from a file:
vcp vcp:config.vcp
# help output:
vcp help
vcp help [topic]
TOPIC
########################################################################
'filter' => <<'TOPIC',
NAME
VCP::Filter - A base class for filters
SYNOPSIS
use VCP::Filter;
@ISA = qw( VCP::Filter );
...
DESCRIPTION
A VCP::Filter is a VCP::Plugin that is placed between the source and
the destination and allows the stream of revisions to be altered.
For instance, the Map: option in vcp files is implemented by
VCP::Filter::Map.
By default a filter is a pass-through.
SUBCLASSING
This class uses the fields pragma, so you'll need to use base and
possibly fields in any subclasses.
parse_rules_list
Used in VCP::Filter::*map and VCP::Filter::*edit to parse lists
of rules where every rule is a set of N "words". The value of N
is computed from the number of labels passed in and the labels
are used when printing an error message:
@rules = $self->parse_rules_list( $options, "Pattern", "Replacement" );
filter_name
Returns the StudlyCaps version of the filter name. By default,
assumes a single word name and uses ucfirst on it. Filters like
StringEdit should overload this to be more creative and
typographically appealing (heh).
sort_keys
my @output_sort_order = $filter->sort_keys( @input_sort_order );
Accepts a list of sort keys from the upstream filter and returns
a list of sort keys representing the order that records will be
emitted in.
This is a pass-through by default, but VCP::Filter::sort and
VCP::Filter::changesets return appropriate values.
config_file_section_as_string
last_rev_in_filebranch
(passthru; see VCP::Dest)
backfill
(passthru; see VCP::Dest)
handle_header
(passthru)
rev_count
$self->SUPER::rev_count( @_ );
passthru, see VCP::Dest.
handle_rev
$self->SUPER::handle_rev( @_ );
passthru, see VCP::Dest.
skip_rev
$self->SUPER::skip_rev( @_ );
passthru, see VCP::Dest
handle_footer
$self->SUPER::handle_footer( @_ );
passthru, see VCP::Dest
COPYRIGHT
Copyright 2000, Perforce Software, Inc. All Rights Reserved.
This module and the VCP package are licensed according to the terms
given in the file LICENSE accompanying this distribution, a copy of
which is included in vcp.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
TOPIC
########################################################################
'source' => <<'TOPIC',
NAME
VCP::Source - A base class for repository sources
SYNOPSIS
DESCRIPTION
OPTIONS
--bootstrap
--bootstrap=pattern
Forces all files matching the given shell regular expression
(may use wildcards like "*", "?", and "...") to have their first
revisions transferred as complete copies instead of deltas. This
is useful when you want to transfer a revision other than the
first revision as the first revision in the target repository.
It is also useful when you want to skip some revisions in the
target repository (although the Map filter has superseded this
use).
--continue
Tells VCP to continue where it left off from last time. This
will not detect new branches of already transferred revisions
(this limitation should be lifted, but results in an expensive
rescan of metadata), but will detect updates to already
transferred revisions.
--rev-root
Tells VCP to extract files relative to a directory in the source
repository other than the default directory. Ordinarily, VCP
looks at the source specification and deduces that the lowest
complete directory name is the "root" directory for all
revisions, or the "rev_root".
For instance, given the specification:
cvs:foo/bar/...
the rev_root will be "foo/bar" and the files under bar will be
extracted with a path relative to bar, so "foo/bar/baz/bat" will
be extracted with the value "baz/bat".
They will be inserted in the destination repository relative to
the rev_root for the destination, so if the destination spec is
like:
p4://depot/...
then "baz/bat" will be inserted in the destination as
"//depot/baz/bat".
If there is no target rev_root specified, as in the spec:
p4:
then the source's rev_root will be assumed, so "baz/bat" in our
example would be placed in "//foo/bar/baz/bat".
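The rev_root arithmetic described above can be sketched in a few lines of
Perl. The helper names below are hypothetical and the exact deduction rules
are an assumption based on the examples in this section; the filespecs are
shown with the "scheme:" prefix already removed.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names and the exact rules are assumptions.

# Keep leading path components until a wildcard or the final component.
sub rev_root_sketch {
    my ($filespec) = @_;
    ( my $path = $filespec ) =~ s{\A/+}{};
    my @parts = split m{/+}, $path;
    my @root;
    for my $i ( 0 .. $#parts ) {
        last if $parts[$i] =~ /[*?]|\.\.\./;    # wildcard component
        last if $i == $#parts;                  # final component names a file
        push @root, $parts[$i];
    }
    return join "/", @root;
}

# Strip the source rev_root, then graft the rest under the dest rev_root.
sub dest_name_sketch {
    my ( $source_spec, $dest_spec, $source_path ) = @_;
    my $source_root = rev_root_sketch($source_spec);
    my $dest_root   = rev_root_sketch($dest_spec);

    # No destination rev_root: assume the source's rev_root.
    $dest_root = $source_root unless length $dest_root;

    ( my $relative = $source_path ) =~ s{\A/*\Q$source_root\E/}{};
    return "//" . join( "/", $dest_root, $relative );
}

print dest_name_sketch( "foo/bar/...", "//depot/...", "foo/bar/baz/bat" ), "\n";
print dest_name_sketch( "foo/bar/...", "",            "foo/bar/baz/bat" ), "\n";
```

The two calls print "//depot/baz/bat" and "//foo/bar/baz/bat", matching the
two cases walked through above.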
SUBCLASSING
This class uses the fields pragma, so you'll need to use base and
possibly fields in any subclasses. See VCP::Plugin for methods often
needed in subclasses.
Subclass utility API
options_spec
Adds common VCP::Source options to whatever options VCP::Plugin
parses:
dest
Sets/Gets a reference to the VCP::Dest object. The source uses
this to call handle_header(), handle_rev(), and handle_footer()
methods.
continue
Sets/Gets the CONTINUE field (which the user sets via the
--continue flag)
real_source
Returns the reference to be used when sending revisions to the
destination.
Each revision has a pointer to the source that sends it so that
filters and destinations can call get_source_file().
Most sources return $self; Sources that spool data, such as
VCP::Source::metadb, need to specify a real source. They do so
by overloading this method. VCP::Source::revml does not do this,
as it supplies a get_source_file().
rev_mode
my $mode = $self->rev_mode( $filebranch_id, $rev_id );
Returns FALSE, "base", or "normal" as a function of the
filebranch and rev_id. Do not queue the revision if this returns
FALSE (you may also skip any preceding revisions). Queue it only
as a base revision if it returns "base", and queue it as a full
revision otherwise.
Not all base revs will be sent; base revs that have no child
revs will not be sent.
Always returns "normal" when not in continue mode.
queue_rev
Some revs can't be sent immediately. They get queued. Once
queued, the revision may not be altered. All revisions must be
queued before being sent. All revs from the source repository
should be queued; --continue processing is automatic.
Placeholders should be inserted for all branches, even empty
ones.
This updates last_rev and last_rev_for_filebranch.
Returns FALSE if the rev cannot be queued, for instance if it's
already been queued once.
rev_mode() should be called before creating a rev, or at least
before queue_rev()ing it in order to see if and in what form the
rev should be sent.
queued_rev
$self->queued_rev( $id );
Returns a queued rev by id.
Sources where revs can arrive willy-nilly, like
VCP::Source::revml, queue up all revs and need to randomly
access them.
last_rev_for_filebranch
$self->last_rev_for_filebranch( $filebranch_id );
Returns the last revision queued on the indicated filebranch.
set_last_rev_in_filebranch_previous_id
$self->set_last_rev_in_filebranch_previous_id( $r );
If there is a last_rev_for_filebranch for $r->filebranch_id,
sets its previous_id to point to $r. This is useful for sources
which scan in most-recent-first order.
queued_rev_count
Returns (does not set) the number of revs queued so far.
Replaces the deprecated function sent_rev_count().
store_cached_revs
$self->store_cached_revs;
For parsers which read history one file at a time and branch in
rev_id space, like VCP::Source::cvs, it's possible to flush all
revs to disk after each file is parsed. This method takes the
last VCP::Rev in each filebranch and stores it to disk, freeing
memory.
send_revs
$self->send_revs;
Removes and sends all revs accumulated so far. Called
automatically after scan_metadata().
SUBCLASS OVERLOADS
These methods should be overridden in any subclasses.
scan_metadata
This is called to scan the metadata for the source repository.
It should call rev_mode() for each revision found (including any
that need to be concocted to make up for collapsed metadata in
the source, like VSS or CVS deletes or CVS branch creation) and
if that returns TRUE, then queue_rev() should be called.
If rev_mode() returns "base", then the transfer is in --continue
mode and this rev should be built as or converted to a base
revision. The easiest way to do this is to build it normally and
then call $r->base_revify().
If the metadata source returns metadata from most recent to
oldest, as do most file history reports, the previous_id() need
not be set until the next revision in a filebranch is scanned.
The most recent rev passed to queue_rev() is available by
calling last_rev(), if the metadata is one branch at a time, and
the last rev in each filebranch is available by calling
last_rev_for_filebranch().
If the metadata is scanned one file or filebranch at a time and
branches are all created by the time the end of a file's
metadata arrives, calling store_cached_revs() will flush all
queued revs from the last_rev() and last_rev_for_filebranch()
in-memory caches to the disk cache (all other revs are flushed
as their successors arrive).
There is no easy way to handle randomly ordered metadata at this
time; typically a source will accumulate as little as it can in
memory and queue the rest. See VCP::Source::cvs for an example
of this.
Once scan_metadata() is complete, send_revs() will be called
automatically.
get_source_file
REQUIRED OVERLOAD.
All sources must provide a way for the destination to fetch a
revision.
handle_header
REQUIRED OVERLOAD.
Subclasses must add all repository-specific info to the $header,
at least including rep_type and rep_desc.
$header->{rep_type} = 'p4';
$self->p4( ['info'], \$header->{rep_desc} ) ;
The subclass must pass the $header on to the dest:
$self->dest->handle_header( $header )
if $self->dest;
This may be called when dest is null to allow the source to
initialize itself when it won't be scanning the real source. So
the if $self->dest is important.
That's not the case for copy_revs().
handle_footer
Not a required overload, as the footer carries no useful
information at this time. Overriding methods must call this
method to pass the $footer on:
$self->SUPER::handle_footer( $footer ) ;
parse_time
$time = $self->parse_time( $timestr ) ;
Parses "[cc]YY/MM/DD[ HH[:MM[:SS]]]".
Will add ability to use format strings in future. HH, MM, and SS
are assumed to be 0 if not present.
Returns a time suitable for feeding to localtime or gmtime.
Assumes local system time, so no good for parsing times in
revml, but that's not a common thing to need to do, so it's in
VCP::Source::revml.pm.
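As an illustration of the format accepted by parse_time(), here is a minimal
sketch built on the core Time::Local module. The helper name
parse_time_sketch() is hypothetical and this is not the module's own code; in
particular, how a two-digit year is mapped to a century is left to Time::Local
here, which is an assumption.

```perl
use strict;
use warnings;
use Time::Local qw( timelocal );

# Hypothetical stand-in for parse_time(); not the VCP implementation.
# Accepts "[cc]YY/MM/DD[ HH[:MM[:SS]]]" and returns an epoch time computed
# in the local time zone, or an empty list if the string does not match.
sub parse_time_sketch {
    my ($timestr) = @_;
    my ( $year, $mon, $day, $hour, $min, $sec ) = $timestr =~ m{
        \A (\d{2}|\d{4}) / (\d\d?) / (\d\d?)           # date
        (?: \s+ (\d\d?) (?: : (\d\d?) (?: : (\d\d?) )? )? )?   # optional time
        \z
    }x or return;

    # Missing HH, MM, and SS are taken to be 0. timelocal() wants a
    # zero-based month and dies on out-of-range field values.
    return timelocal( $sec // 0, $min // 0, $hour // 0, $day, $mon - 1, $year );
}

my $time = parse_time_sketch("2002/03/04 05:06");
print scalar( localtime $time ), "\n";
```

The returned value can be passed straight to localtime or gmtime, as the
description above says.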
bootstrap
Sets (and parses) or gets the bootstrap spec.
Can be called plain:
$self->bootstrap( $bootstrap_spec ) ;
See the command line documentation for the format of
$bootstrap_spec.
is_bootstrap_mode
... if $self->is_bootstrap_mode( $file ) ;
Compares the filename passed in against the list of bootstrap
regular expressions set by "bootstrap".
The file should be in a format similar to the command line spec
for whatever repository is passed in, and not relative to
rev_root, so "//depot/foo/bar" for p4, or "module/foo/bar" for
cvs.
This is typically called in the subbase class only after looking
at the revision number to see if it is a first revision (in
which case the subclass should automatically put it in bootstrap
mode).
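One way to picture the bootstrap check is as a translation from wildcard
patterns to regular expressions. The sketch below is illustrative only: the
helper names are hypothetical, and the wildcard meanings ("*" and "?" stay
within one path component, "..." spans components) are an assumption modelled
on p4-style filespecs rather than taken from the VCP source.

```perl
use strict;
use warnings;

# Hypothetical helpers; wildcard semantics here are an assumption.
sub wildcard_to_regex_sketch {
    my ($pattern) = @_;
    my $re = join "", map {
          $_ eq "..." ? ".*"        # any run of characters, including "/"
        : $_ eq "*"   ? "[^/]*"     # any run within one path component
        : $_ eq "?"   ? "[^/]"      # one character within a component
        :               quotemeta($_)
    } $pattern =~ /(\.\.\.|[*?]|[^*?.]+|\.)/g;
    return qr/\A$re\z/;
}

# True if $file matches any of the bootstrap patterns.
sub is_bootstrap_mode_sketch {
    my ( $file, @patterns ) = @_;
    for my $pattern (@patterns) {
        return 1 if $file =~ wildcard_to_regex_sketch($pattern);
    }
    return 0;
}

my @bootstrap = ( "//depot/foo/...", "//depot/*.txt" );
for my $file ( "//depot/foo/bar/baz.c", "//depot/README.txt", "//depot/x/y.txt" ) {
    printf "%-24s %s\n", $file,
        is_bootstrap_mode_sketch( $file, @bootstrap ) ? "bootstrap" : "delta";
}
```

Note that the file names are full repository paths, not paths relative to
rev_root, as required above.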
COPYRIGHT
Copyright 2000, Perforce Software, Inc. All Rights Reserved.
This module and the VCP package are licensed according to the terms
given in the file LICENSE accompanying this distribution, a copy of
which is included in vcp.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
TOPIC
########################################################################
'defaultfilters' => <<'TOPIC',
NAME
VCP::DefaultFilters - Class for determining default filters to
install for a given source and dest.
SYNOPSIS
require VCP::DefaultFilters;
my $df = VCP::DefaultFilters->new;
my @filter_args = $df->create_default_filters( $source, $dest );
DESCRIPTION
Given references to a vcp source and destination, determines the
default filters which would be appropriate, builds and returns a
list of arguments that should look like the portion of @ARGV
(command line arguments) that specify filters.
COPYRIGHT
Copyright 2000, Perforce Software, Inc. All Rights Reserved.
This module and the VCP package are licensed according to the terms
given in the file LICENSE accompanying this distribution, a copy of
which is included in vcp.
TOPIC
########################################################################
'dest' => <<'TOPIC',
NAME
VCP::Dest - A base class for VCP destinations
SYNOPSIS
DESCRIPTION
SUBCLASS API
These methods are intended to support subclasses.
digest
$self->digest( "/tmp/readers" ) ;
Returns the Base64 MD5 digest of the named file. Used to compare
a base rev (which is the revision *before* the first one we want
to transfer) of a file from the source repo to the existing head
rev of a dest repo.
The Base64 version is returned because that's what RevML uses
and we might want to cross-check with a .revml file when
debugging.
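For reference, a Base64 MD5 digest of a file can be computed with the core
Digest::MD5 module roughly as follows. This is a minimal sketch with a
hypothetical helper name, not the method's actual body.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical stand-in for digest(); returns the Base64 MD5 of a file.
sub digest_sketch {
    my ($path) = @_;
    open my $fh, "<", $path or die "$!: $path\n";
    binmode $fh;    # digest the raw bytes, with no newline translation
    my $digest = Digest::MD5->new->addfile($fh)->b64digest;
    close $fh;
    return $digest;
}

# Digest this script itself, just to have a file that certainly exists.
print digest_sketch($0), "\n";
```

Digest::MD5 returns the Base64 form without trailing "=" padding, so the
result is always 22 characters long.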
compare_base_revs
$self->compare_base_revs( $rev, $work_path ) ;
Checks out the indicated revision from the destination
repository and compares it (using digest()) to the file from the
source repository (as indicated by $work_path). Dies with an
error message if the base revisions do not match.
Calls $self->checkout_file( $rev ), which the subclass must
implement.
header
Gets/sets the $header data structure passed to handle_header().
rev_map
Returns a reference to the RevMapDB for this backend and
repository. Creates an empty one if need be.
head_revs
Returns a reference to the HeadRevsDB for this backend and
repository. Creates an empty one if need be.
main_branch_id
Returns a reference to the MainBranchIdDB for this backend and
repository. Creates an empty one if need be.
files
Returns a reference to the FilesDB for this backend and
repository. Creates an empty one if need be.
SUBCLASS OVERLOADS
These methods are overloaded by subclasses.
backfill
$dest->backfill( $rev ) ;
Checks the file indicated by VCP::Rev $rev out of the target
repository if this destination supports backfilling. Currently,
only the revml and the reporting & debugging destinations do not
support backfilling.
The $rev->workpath must be set to the filename the backfill was
put in.
This is used when doing an incremental update, where the first
revision of a file in the update is encoded as a delta from the
prior version. A digest of the prior version is sent along
before the first version delta to verify its presence in the
database.
So, the source calls backfill(), which returns TRUE on success,
FALSE if the destination doesn't support backfilling, and dies
if there's an error in procuring the right revision.
If FALSE is returned, then the revisions will be sent through
with no working path, but will have a delta record.
MUST BE OVERRIDDEN.
sort_filter
sub sort_filter {
my $self = shift;
my @sort_keys = @_;
return () if @sort_keys && $sort_keys[0] eq "change_id";
require VCP::Filter::changesets;
return ( VCP::Filter::changesets->new(), );
}
This is passed a sort specification string and returns any
filters needed to presort data for this destination. It may
return the empty list (the default), or one or more instantiated
filters.
require_change_id_sort
Destinations that care about the sort order usually want to use
the changesets filter, so they can overload the sort filter like
so:
sub sort_filters { shift->require_change_id_sort( @_ ) }
handle_footer
$dest->handle_footer( $footer ) ;
Does any cleanup necessary. Not required. Don't call this from
the override.
handle_header
$dest->handle_header( $header ) ;
Stows $header in $self->header. This should only rarely be
overridden, since the first call to handle_rev() should output
any header info.
rev_count
$dest->rev_count( $number_of_revs_forthcoming );
Sent by the last aggregating plugin in the filter chain just
before the first revision is sent to inform us of the number of
revs to expect.
skip_rev
Sent by filters that discard revisions in line.
handle_rev
$dest->handle_rev( $rev ) ;
Outputs the item referred to by VCP::Rev $rev. If this is the
first call, then $self->none_seen will be TRUE and any preamble
should be emitted.
MUST BE OVERRIDDEN. Don't call this from the override.
last_rev_in_filebranch
my $rev_id = $dest->last_rev_in_filebranch(
$source_repo_id,
$source_filebranch_id
);
Returns the last revision for the file and branch indicated by
$source_filebranch_id. This is used to support --continue.
Returns undef if not found.
NOTES
Several fields are jury rigged for "base revisions": these are fake
revisions used to start off incremental, non-bootstrap transfers
with the MD5 digest of the version that must be the last version in
the target repository. Since these are "faked", they don't contain
comments or timestamps, so the comment and timestamp fields are
treated as "" and 0 by the sort routines.
COPYRIGHT
Copyright 2000, Perforce Software, Inc. All Rights Reserved.
This module and the VCP package are licensed according to the terms
given in the file LICENSE accompanying this distribution, a copy of
which is included in vcp.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
TOPIC
########################################################################
'filter::csv_trace' => <<'TOPIC',
NAME
VCP::Filter::csv_trace - development logging filter
DESCRIPTION
Dumps fields of revisions in CSV format.
Not a supported module; API and behavior may change without warning.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::csv_trace usage' => <<'TOPIC',
TOPIC
########################################################################
'filter::csv_trace description' => <<'TOPIC',
Dumps fields of revisions in CSV format.
Not a supported module; API and behavior may change without
warning.
TOPIC
########################################################################
'filter::logmemsize' => <<'TOPIC',
NAME
VCP::Filter::logmemsize - development logging filter
DESCRIPTION
Watches memory size. Only works on Linux for now.
Not a supported module; API and behavior may change without warning.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::logmemsize usage' => <<'TOPIC',
TOPIC
########################################################################
'filter::logmemsize description' => <<'TOPIC',
Watches memory size. Only works on Linux for now.
Not a supported module; API and behavior may change without
warning.
TOPIC
########################################################################
'filter::changesets' => <<'TOPIC',
NAME
VCP::Filter::changesets - Group revs into changesets
SYNOPSIS
## From the command line:
vcp <source> changesets: ...options... -- <dest>
## In a .vcp file:
ChangeSets:
time <=60 ## seconds
user_id equal ## case-sensitive equality
comment equal ## case-sensitive equality
source_filebranch_id notequal ## case-sensitive inequality
DESCRIPTION
This filter is automatically loaded when there is no sort filter
loaded (both this and VCP::Filter::sort count as sort filters).
Sorting by change_id, etc.
When all revs from the source have change numbers, this filter sorts
by change_id, branch_id, and name, regardless of the rules set. The
name sort is case sensitive, though it should not be for Win32. This
sort by change_id is necessary for sources that supply change_id
because the order of scanning the revisions is not usually (ever, so
far :) in change set order.
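The ordering described above amounts to a three-level sort. Here is a minimal
sketch using plain hashes in place of VCP::Rev objects; the record layout and
the helper name are hypothetical, and change_id is assumed to be numeric, as
it is for p4.

```perl
use strict;
use warnings;

# Hypothetical revision records; real revisions are VCP::Rev objects.
my @revs = (
    { change_id => 12, branch_id => "main", name => "b.c" },
    { change_id => 7,  branch_id => "dev",  name => "a.c" },
    { change_id => 12, branch_id => "main", name => "B.c" },
    { change_id => 7,  branch_id => "",     name => "z.c" },
);

# Sort by change_id (numeric, by assumption), then branch_id, then name.
# "cmp" is case sensitive, like the name sort described above.
sub sort_by_change_sketch {
    return sort {
               $a->{change_id} <=> $b->{change_id}
            || $a->{branch_id} cmp $b->{branch_id}
            || $a->{name}      cmp $b->{name}
    } @_;
}

printf "%3d %-5s %s\n", @{$_}{qw( change_id branch_id name )}
    for sort_by_change_sketch(@revs);
```

Because the comparison is case sensitive, "B.c" sorts ahead of "b.c" within
change 12.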
Aggregating changes
If one or more revisions arrives from the source with an empty
change_id, the rules for this filter establish the conditions that
determine what revisions may be grouped into each change.
In this case, this filter rewrites all change_id fields so that the
(eventual) destination can use the change_id field to break the
revisions into changes. This is sometimes used by non-changeset
oriented destinations to aggregate "changes" as though a user were
performing them and to reduce the number of individual operations
the destination driver must perform (for instance: VCP::Dest::cvs
prefers to not call cvs commit all the time; cvs commit is slow).
Revisions are aggregated into changes using a set of rules that
determine what revisions may be combined. One rule is implicit in
the algorithm, the others are explicitly specified as a set of
defaults that may be altered by the user.
The Implicit Rule
The implicit rule is that no change may contain two revisions where
one is a descendant of another. The algorithm starts with the set of
revisions that have no parents in this transfer, chooses a set of
them to be a change according to the explicit conditions, and emits
it. Only when a revision is emitted does this filter consider its
offspring for emission. This cannot be changed.
(EXPERIMENTAL) The only time this implicit rule is not enough is in
a cloning situation. In CVS and VSS, it is possible to "share" files
between branches. VSS supports and promotes this model in its user
interface and documentation while CVS allows it more subtly by
allowing the same branch to have multiple branch tags. In either
case, there are multiple branches of a file that are changed
simultaneously. The CVS source recognizes this (and the VSS source
may by the time you read this) and chooses a master revision from
which to "clone" other revisions. These cloned revisions appear on
the child branch as children of the master revision, not as children
of the preceding revision on the child branch. This is confusing,
but it works. In order to prevent this from confusing the
destinations, however, it can be important to make sure that two
revisions to a given branch of a given file do not occur in the same
change; this is the purpose of the explicit rule
"source_filebranch_id notequal", covered below.
The Explicit Rules
Rules may be specified for the ChangeSets filter. If no rules are
specified, a set of default rules is used. If any rules are
specified, none of the default rules are used. The default rules are
explained after rule conditions are explained.
Each rule is a pair of words: a data field and a condition.
There are three conditions: "notequal", "equal" and "<=N" (where N
is a number; note that no spaces are allowed before the number
unless the spec is quoted somehow):
equal
The "equal" condition is valid for all fields and states that
all revisions in the same change must have identical values for
the indicated field. So:
user_id equal
states that all revisions in a change must be submitted by the
same user.
All "equal" conditions are used before any other conditions,
regardless of the order they are specified in to categorize
revisions into prototype changes. Once all revisions have been
categorized into prototype changes, the "<=N" and "notequal"
rules are applied in order to split the change prototypes into
as many changes as are needed to satisfy them.
notequal
The "notequal" condition is also valid for all fields and
specifies that no two revisions in a change may have equal
values for a field. It does not make sense to apply this to time
fields, and is usually only needed to ensure that two revisions
to the same file on the same branch do not get bundled into the
same change.
<=N The "<=N" specification is only available for the "time" field.
It specifies that no gaps larger than N seconds may exist in a
change.
The default rules are:
time <=60 ## seconds
user_id equal ## case-sensitive equality
comment equal ## case-sensitive equality
source_filebranch_id notequal ## case-sensitive inequality
These rules have the following effects:
The "time <=60" condition sets a maximum allowable difference
between two revisions; revisions that are more than this number of
seconds apart are considered to be in different changes.
The "user_id equal" and "comment equal" conditions assert that two
revisions must be by the same user and have the same comment in
order to be in the same change.
The "source_filebranch_id notequal" condition prevents cloned revs
of a file from appearing in the same change as each other (see the
discussion above for more details).
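To make the interplay of the default rules concrete, here is a simplified
sketch of the two phases: categorizing by the "equal" fields, then splitting
on the "<=N" and "notequal" rules. It uses plain hashes for revisions and a
hypothetical helper name, hard-codes the default rules, and ignores the
implicit parent/child rule entirely, so it is an approximation of the idea
rather than the filter's algorithm.

```perl
use strict;
use warnings;

my @equal_fields = qw( user_id comment );    # the "equal" rules
my $max_gap      = 60;                       # the "time <=60" rule

# Hypothetical helper: returns a list of changes, each an array ref of revs.
sub aggregate_sketch {
    my @revs = @_;

    # Phase 1: "equal" rules categorize revisions into prototype changes.
    my ( %proto, @key_order );
    for my $rev (@revs) {
        my $key = join "\0", map { $rev->{$_} // "" } @equal_fields;
        push @key_order, $key unless $proto{$key};
        push @{ $proto{$key} }, $rev;
    }

    # Phase 2: split each prototype on time gaps and repeated filebranches.
    my @changes;
    for my $key (@key_order) {
        my ( @current, %seen_filebranch, $last_time );
        for my $rev ( sort { $a->{time} <=> $b->{time} } @{ $proto{$key} } ) {
            my $fb = $rev->{source_filebranch_id};
            if ( @current
                && ( $rev->{time} - $last_time > $max_gap || $seen_filebranch{$fb} ) )
            {
                push @changes, [@current];
                @current         = ();
                %seen_filebranch = ();
            }
            push @current, $rev;
            $seen_filebranch{$fb} = 1;
            $last_time = $rev->{time};
        }
        push @changes, [@current] if @current;
    }
    return @changes;
}

my @changes = aggregate_sketch(
    { user_id => "u", comment => "fix", time => 0,   source_filebranch_id => "a<>" },
    { user_id => "u", comment => "fix", time => 30,  source_filebranch_id => "b<>" },
    { user_id => "u", comment => "fix", time => 200, source_filebranch_id => "a<>" },
    { user_id => "v", comment => "fix", time => 10,  source_filebranch_id => "c<>" },
);
printf "change %d: %d rev(s)\n", $_ + 1, scalar @{ $changes[$_] } for 0 .. $#changes;
```

The first two revisions share a user and comment and are 30 seconds apart, so
they form one change; the third is too late to join them and the fourth
belongs to a different user.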
ALGORITHM
handle_rev()
As revs are received by handle_rev(), they are stored on disk.
Several RAM-efficient (well, for Perl) data structures are built,
however, that describe each revision's children and its membership
in a changeset. Some or all of these structures may be moved to disk
when we need to handle truly large data sets.
The ALL_HAVE_CHANGE_IDS statistic
One statistic that handle_rev() gathers is whether or not all
revisions arrived with a non-empty change_id field.
The REV_COUNT statistic
How many revisions have been received. This is used only for UI
feedback; primarily it is to forewarn the downstream filter(s) and
destination of how many revisions will constitute a 100% complete
transfer.
The CHANGES list
As each rev arrives, it is placed in a "protochange" determined
solely by the revision's fields in the rules list with an "equal"
condition. Protochanges are likely to have too many revisions in
them, including revisions that descend from one another and
revisions that are too far apart in time.
The CHANGES_BY_KEY index
The categorization of each revision in to changes is done by forming
a key string from all the fields in the rules list with the "equal"
condition. This index maps unique keys to changes.
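A minimal sketch of the keying idea, in Python rather than VCP's
Perl: revisions are plain dicts here, and the NUL separator used to
join the key is an assumption of this example, not a documented
detail.

```python
from collections import defaultdict

def protochange_key(rev, rules):
    """Join the values of every field that carries an "equal" condition."""
    equal_fields = [field for field, cond in rules if cond == "equal"]
    return "\0".join(str(rev[field]) for field in equal_fields)

def build_protochanges(revs, rules):
    """Map each unique key to the list of revisions that share it."""
    changes_by_key = defaultdict(list)
    for rev in revs:
        changes_by_key[protochange_key(rev, rules)].append(rev)
    return dict(changes_by_key)

rules = [("time", "<=60"), ("user_id", "equal"), ("comment", "equal")]
revs = [
    {"id": "a#1", "user_id": "amy", "comment": "fix", "time": 0},
    {"id": "b#1", "user_id": "amy", "comment": "fix", "time": 5000},
    {"id": "c#1", "user_id": "bob", "comment": "fix", "time": 10},
]
protochanges = build_protochanges(revs, rules)
```

Note that a#1 and b#1 land in the same protochange even though they
are far apart in time; that is why protochanges must be split later.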
The CHILDREN index
This is an index of all revisions that are direct offspring of a
revision.
The PREDECESSOR_COUNT statistic
Counts the number of parents a revision has that haven't been
submitted yet. A revision may have a previous_id and, optionally,
also have a from_id (can't have a from_id without a previous_id,
however).
The REVS_BY_CHANGE_ID index
If all revs do indeed arrive with change_ids, they need to be sorted
and sent out in order. This index is gathered until the first rev
with an empty change_id arrives.
The ROOT_IDS list
This is a list of the IDs of all revisions that have no parent
revisions in this transfer. This is used as the starting point for
send_changes(), below.
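The CHILDREN, PREDECESSOR_COUNT and ROOT_IDS structures described
above might be pictured with the following Python sketch. It builds
them in one pass over a complete list for simplicity, whereas
handle_rev() sees revisions one at a time; the dict layout and the
name index_revs are assumptions of this example.

```python
from collections import defaultdict

def index_revs(revs):
    """Return (children, predecessor_count, root_ids) for a list of rev dicts."""
    known_ids = {rev["id"] for rev in revs}
    children = defaultdict(list)
    predecessor_count = {}
    root_ids = []
    for rev in revs:
        parents = []
        for pid in (rev.get("previous_id"), rev.get("from_id")):
            # Parents outside this transfer do not count.
            if pid in known_ids and pid not in parents:
                parents.append(pid)
        predecessor_count[rev["id"]] = len(parents)
        for pid in parents:
            children[pid].append(rev["id"])
        if not parents:
            root_ids.append(rev["id"])
    return dict(children), predecessor_count, root_ids

revs = [
    {"id": "a"},
    {"id": "b", "previous_id": "a"},
    {"id": "c", "previous_id": "b", "from_id": "a"},
    {"id": "d", "previous_id": "not-in-transfer"},
]
children, predecessor_count, root_ids = index_revs(revs)
```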
The CHANGES_BY_REV index
As the large protochanges are split in to smaller ones, the
resulting CHANGES list is indexed by, among other things, which revs
are in the change. This is so the algorithms can quickly find what
change a revision is in when it's time to consider sending that
revision.
handle_footer()
All the real work occurs when handle_footer() is called.
handle_footer() glances at the change_id statistic gathered by
handle_rev() and determines whether it can sort by change_id or
whether it has to perform change aggregation.
If all revisions arrived with a change_id, handle_footer() calls
sort_by_change_id_and_send(). If at least one revision didn't,
handle_footer() decides to perform change aggregation by calling
split_protochanges() and then send_changes().
Any source or upstream filter may perform change aggregation by
assigning change_ids to all revisions. VCP::Source::p4 does this. At
the time of this writing no others do.
Likewise, a filter like VCP::Filter::StringEdit may be used to clear
out all the change_ids and force change aggregation.
sort_by_change_id_and_send()
If all revisions arrived with a change_id, then they will be sorted
by the values of ( change_id, time, branch_id, name ) and sent on.
This filter has no option for ignoring change_id; the sort is
skipped only when at least one revision arrives with an empty
change_id.
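In Python terms, the sort described above could be sketched roughly
as follows. The dict-based revisions and integer change_ids are
assumptions of this example; VCP itself is written in Perl and works
on VCP::Rev objects.

```python
def sort_by_change_id(revs):
    """Return revs sorted by (change_id, time, branch_id, name).

    Returns None when any revision lacks a change_id, in which case
    change aggregation would be needed instead.
    """
    if any(rev.get("change_id") in (None, "") for rev in revs):
        return None
    return sorted(
        revs,
        key=lambda r: (r["change_id"], r["time"], r["branch_id"], r["name"]),
    )

revs = [
    {"change_id": 2, "time": 50, "branch_id": "main", "name": "a"},
    {"change_id": 1, "time": 10, "branch_id": "main", "name": "b"},
    {"change_id": 1, "time": 10, "branch_id": "main", "name": "a"},
]
ordered = sort_by_change_id(revs)
```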
split_and_send_changes()
Once all revisions have been placed in to protochanges, a change is
selected and sent like so:
1 Get an oldest change with no revs that can't yet be sent. If
none is found, then select one oldest change and remove any revs
that can't be sent yet.
2 Select as many revs as can legally be sent in a change by
sorting them in to time order and then using the <=N and
notequal rules to determine if each rev can be sent given the
revs that have already passed the rules. Delay all other revs
for a later change.
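Step 2 can be pictured with the following Python sketch, which
assumes the default "time <=60" and "source_filebranch_id notequal"
rules and dict-based revisions. The function name select_sendable is
hypothetical and the sketch ignores the parent/child bookkeeping.

```python
def select_sendable(candidates, max_gap=60, notequal_field="source_filebranch_id"):
    """Split candidates in to (selected, delayed) lists for one change."""
    selected, delayed, seen = [], [], set()
    for rev in sorted(candidates, key=lambda r: r["time"]):
        # "<=N": no gap larger than N seconds after the last accepted rev.
        gap_ok = not selected or rev["time"] - selected[-1]["time"] <= max_gap
        # "notequal": no two accepted revs may share this field's value.
        unique = rev[notequal_field] not in seen
        if gap_ok and unique:
            selected.append(rev)
            seen.add(rev[notequal_field])
        else:
            delayed.append(rev)
    return selected, delayed

candidates = [
    {"id": "r1", "time": 0, "source_filebranch_id": "foo<1>"},
    {"id": "r2", "time": 10, "source_filebranch_id": "foo<1>"},
    {"id": "r3", "time": 50, "source_filebranch_id": "bar<1>"},
    {"id": "r4", "time": 200, "source_filebranch_id": "baz<1>"},
]
selected, delayed = select_sendable(candidates)
```

Here r2 is delayed because it touches the same file branch as r1, and
r4 is delayed because it is more than 60 seconds after r3.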
LIMITATIONS
This filter does not take the source_repo_id in to account: if
somehow you are merging multiple repositories in to one and want to
interleave the commits/submits "properly", ask for advice.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::changesets usage' => <<'TOPIC',
Usage:
## From the command line:
vcp <source> changesets: ...options... -- <dest>
## In a .vcp file:
ChangeSets:
time <=60 ## seconds
user_id equal ## case-sensitive equality
comment equal ## case-sensitive equality
source_filebranch_id notequal ## case-sensitive inequality
TOPIC
########################################################################
'filter::changesets description' => <<'TOPIC',
This filter is automatically loaded when there is no sort
filter loaded (both this and VCP::Filter::sort count as
sort filters).
Sorting by change_id, etc.
==========================
When all revs from the source have change numbers, this
filter sorts by change_id, branch_id, and name, regardless
of the rules set. The name sort is case sensitive, though
it should not be for Win32. This sort by change_id is
necessary for sources that supply change_id because the
order of scanning the revisions is not usually (ever, so
far :) in change set order.
Aggregating changes
===================
If one or more revisions arrives from the source with an
empty change_id, the rules for this filter establish the
conditions that determine what revisions may be grouped in
to each change.
In this case, this filter rewrites all change_id fields so
that the (eventual) destination can use the change_id field
to break the revisions in to changes. This is sometimes
used by non-changeset oriented destinations to aggregate
"changes" as though a user were performing them and to
reduce the number of individual operations the destination
driver must perform (for instance: VCP::Dest::cvs prefers
to not call cvs commit all the time; cvs commit is slow).
Revisions are aggregated in to changes using a set of rules
that determine what revisions may be combined. One rule is
implicit in the algorithm, the others are explicitly
specified as a set of defaults that may be altered by the
user.
The Implicit Rule
=================
The implicit rule is that no change may contain two
revisions where one is a descendant of another. The
algorithm starts with the set of revisions that have no
parents in this transfer, chooses a set of them to be a
change according to the explicit conditions, and emits it.
Only when a revision is emitted does this filter consider
its offspring for emission. This cannot be changed.
(EXPERIMENTAL) The only time this implicit rule is not
enough is in a cloning situation. In CVS and VSS, it is
possible to "share" files between branches. VSS supports
and promotes this model in its user interface and
documentation while CVS allows it more subtly by allowing
the same branch to have multiple branch tags. In either
case, there are multiple branches of a file that are
changed simultaneously. The CVS source recognizes this (and
the VSS source may by the time you read this) and chooses a
master revision from which to "clone" other revisions.
These cloned revisions appear on the child branch as
children of the master revision, not as children of the
preceding revision on the child branch. This is confusing,
but it works. In order to prevent this from confusing the
destinations, however, it can be important to make sure
that two revisions to a given branch of a given file do not
occur in the same change; this is the purpose of the
explicit rule "source_filebranch_id notequal", covered
below.
The Explicit Rules
==================
Rules may be specified for the ChangeSets filter. If no
rules are specified, a set of default rules is used. If
any rules are specified, none of the default rules are
used. The default rules are explained after rule conditions
are explained.
Each rule is a pair of words: a data field and a condition.
There are three conditions: "notequal", "equal" and "<=N"
(where N is a number; note that no spaces are allowed
before the number unless the spec is quoted somehow):
equal
=====
The "equal" condition is valid for all fields and states
that all revisions in the same change must have identical
values for the indicated field. So:
user_id equal
states that all revisions in a change must be submitted by
the same user.
All "equal" conditions are applied before any other
conditions, regardless of the order in which they are
specified, to categorize revisions in to prototype changes.
Once all revisions have been categorized in to prototype
changes, the "<=N" and "notequal" rules are applied in order
to split the change prototypes in to as many changes as are
needed to satisfy them.
notequal
========
The "notequal" condition is also valid for all fields and
specifies that no two revisions in a change may have equal
values for a field. It does not make sense to apply this to
time fields, and is usually only needed to ensure that two
revisions to the same file on the same branch do not get
bundled in to the same change.
<=N
===
The "<=N" specification is only available for the "time"
field. It specifies that no gaps larger than N seconds may
exist in a change.
The default rules are:
time <=60 ## seconds
user_id equal ## case-sensitive equality
comment equal ## case-sensitive equality
source_filebranch_id notequal ## case-sensitive inequality
These rules work as follows:
The "time <=60" condition sets a maximum allowable
difference between two revisions; revisions that are more
than this number of seconds apart are considered to be in
different changes.
The "user_id equal" and "comment equal" conditions assert
that two revisions must be by the same user and have the
same comment in order to be in the same change.
foo
===
The "branched_rev_branch_id equal" condition is a special
case to handle repositories like CVS which don't record
branch creation times. This condition kicks in when a user
creates several branches before changing any files on any
of them; in this case all of the branches get created at
the same time. That leaves odd looking conversions. This
condition also kicks in when multiple CVS branches exist
with no changes on them. In this case, VCP::Source::cvs
groups all of the branch creations after the last "real"
edit. In both cases, the changeset filter splits branch
creations so that only one branch is created per change.
The "branched_rev_branch_id" condition only applies to
revisions branching from one branch in to another.
foo
===
The "source_filebranch_id notequal" condition prevents
cloned revs of a file from appearing in the same change as
each other (see the discussion above for more details).
TOPIC
########################################################################
'filter::sort' => <<'TOPIC',
NAME
VCP::Filter::sort - Sort revs by field, order
SYNOPSIS
## From the command line:
vcp <source> sort: name ascending rev_id ascending -- <dest>
## In a .vcp file:
Sort:
name ascending
rev_id ascending
DESCRIPTION
NOTE: this filter is primarily for development and testing, it is
not designed for large datasets (it can use a lot of RAM if fed
enough data).
Useful with the revml: destination to get RevML output in a desired
order. Otherwise the sorting built in to the change aggregator
should suffice.
The default sort spec is "name,rev_id" which is what is handy to
VCP's test suite as it puts all revisions in a predictable order so
the output revml can be compared to the input revml.
NOTE: this is primarily for development use; not all fields may work
right. All plain string fields should work right as well as name,
rev_id, change_id and their source_... equivalents (which are parsed
and compared piece-wise) and time and mod_time (which are stored as
integers internally).
Plain case sensitive string comparison is used for all fields other
than those mentioned in the preceding paragraphs.
This sort may be slow for extremely large data sets; it sorts things
by comparing revs to each other field by field instead of by
generating indexes and VCP::Rev is not designed to be super fast
when accessing fields one by one. This can be altered if need be.
How rev_id and change_id are sorted
"change_id" or "rev_id" are split in to segments suitable for
sorting.
The splits occur at the following points:
1. Before and after each substring of consecutive digits
2. Before and after each substring of consecutive letters
3. Before and after each non-alpha-numeric character
The substrings are greedy: each is as long as possible and
non-alphanumeric characters are discarded. So "11..22aa33" is split
in to 5 segments: ( 11, "", 22, "aa", 33 ).
If a segment is numeric, it is left padded with 10 NUL characters.
This algorithm makes 1.52 be treated like revision 1, minor revision
52, not like a floating point 1.52. So the following sort order is
maintained:
1.0
1.0b1
1.0b2
1.0b10
1.0c
1.1
1.2
1.10
1.11
1.12
The substring "pre" might be treated specially at some point.
(At least) the following cases are not handled by this algorithm:
1. floating point rev_ids: 1.0, 1.1, 1.11, 1.12, 1.2
2. letters as "prereleases": 1.0a, 1.0b, 1.0, 1.1a, 1.1
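The splitting and ordering described above can be approximated in
Python as follows. This is a sketch, not the filter's Perl code:
instead of padding numeric segments with NUL characters it tags each
segment so that numbers compare numerically and sort before letters,
which is an assumption of this example.

```python
import re

def split_segments(rev_id):
    """Split at digit runs, letter runs and non-alphanumeric characters."""
    segments = []
    for piece in re.split(r"[^0-9A-Za-z]", rev_id):
        runs = re.findall(r"[0-9]+|[A-Za-z]+", piece)
        # Two adjacent separators leave an empty segment between them.
        segments.extend(runs if runs else [""])
    return segments

def sort_key(rev_id):
    """Numeric segments compare as integers, all others as strings."""
    return [
        (0, int(seg), "") if seg.isdigit() else (1, 0, seg)
        for seg in split_segments(rev_id)
    ]

rev_ids = ["1.10", "1.0c", "1.2", "1.0b10", "1.0", "1.12",
           "1.0b2", "1.1", "1.11", "1.0b1"]
ordered = sorted(rev_ids, key=sort_key)
```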
LIMITATIONS
Stores all metadata in RAM.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::sort usage' => <<'TOPIC',
Usage:
## From the command line:
vcp <source> sort: name ascending rev_id ascending -- <dest>
## In a .vcp file:
Sort:
name ascending
rev_id ascending
TOPIC
########################################################################
'filter::sort description' => <<'TOPIC',
NOTE: this filter is primarily for development and testing,
it is not designed for large datasets (it can use a lot of
RAM if fed enough data).
Useful with the revml: destination to get RevML output in a
desired order. Otherwise the sorting built in to the change
aggregator should suffice.
The default sort spec is "name,rev_id" which is what is
handy to VCP's test suite as it puts all revisions in a
predictable order so the output revml can be compared to
the input revml.
NOTE: this is primarily for development use; not all fields
may work right. All plain string fields should work right
as well as name, rev_id, change_id and their source_...
equivalents (which are parsed and compared piece-wise) and
time and mod_time (which are stored as integers
internally).
Plain case sensitive string comparison is used for all
fields other than those mentioned in the preceding
paragraphs.
This sort may be slow for extremely large data sets; it
sorts things by comparing revs to each other field by field
instead of by generating indexes and VCP::Rev is not
designed to be super fast when accessing fields one by one.
This can be altered if need be.
TOPIC
########################################################################
'filter::identity' => <<'TOPIC',
NAME
VCP::Filter::identity - identity (ie noop)
SYNOPSIS
vcp <source> identity: <dest>
DESCRIPTION
A simple passthrough, used for testing to make sure that VCP::Filter
really is a pass through and that vcp can load filters.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::identity usage' => <<'TOPIC',
Usage:
vcp <source> identity: <dest>
TOPIC
########################################################################
'filter::identity description' => <<'TOPIC',
A simple passthrough, used for testing to make sure that
VCP::Filter really is a pass through and that vcp can load
filters.
test_script t/10vcp.t
=====================
TOPIC
########################################################################
'filter::dumpdata' => <<'TOPIC',
NAME
VCP::Filter::dumpdata - development output filter
DESCRIPTION
Dump all data structures. Requires the module BFD, which is not
installed automatically. Dumps to the log file.
Not a supported module; API and behavior may change without warning.
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::dumpdata usage' => <<'TOPIC',
TOPIC
########################################################################
'filter::dumpdata description' => <<'TOPIC',
Dump all data structures. Requires the module BFD, which is
not installed automatically. Dumps to the log file.
Not a supported module; API and behavior may change without
warning.
TOPIC
########################################################################
'filter::addlabels' => <<'TOPIC',
NAME
VCP::Filter::addlabels - Add labels to each revision
SYNOPSIS
## From the command line:
vcp <source> addlabels: "rev_$rev_id" "change_$change_id" -- <dest>
## In a .vcp file:
AddLabels:
rev_$rev_id
change_$change_id
# ... etc ...
DESCRIPTION
Used when you want to track the original rev_id, change_id,
branch_id, etc. each revision had in the source repository by adding
a label. Can be used to turn any piece of metadata in to a label.
Note that the fields
source_name, source_filebranch_id, source_branch_id,
source_rev_id, source_change_id
are set by VCP to be the same value as the corresponding fields
without the source prefix (except source_filebranch_id, which is
built from the file name, rooted in the repository, and for cvs
repositories, the branch number in angle brackets.) These source_*
fields (intended to be immutable in vcp) should be used to make
labels rather than their mutable equivalents which may be changed
via a vcp filter.
There is no way to add labels only to selected revisions at this
time, but if you try to add a label for metadata that is undefined
or empty, it will not be added.
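The label expansion described above, including the skipping of
undefined or empty metadata, might look like this in Python. It is a
sketch under assumptions: revisions are plain dicts and the helper
names make_label and add_labels are invented for this example.

```python
import re

def make_label(template, rev):
    """Expand $field references; return None if any field is undefined or empty."""
    fields = re.findall(r"\$(\w+)", template)
    if any(rev.get(field) in (None, "") for field in fields):
        return None
    return re.sub(r"\$(\w+)", lambda m: str(rev[m.group(1)]), template)

def add_labels(rev, templates):
    """Return the rev's labels plus one label per template that expands."""
    labels = list(rev.get("labels", []))
    for template in templates:
        label = make_label(template, rev)
        if label is not None:
            labels.append(label)
    return labels

rev = {"source_rev_id": "1.2", "source_change_id": "", "labels": ["beta"]}
labels = add_labels(rev, ["rev_$source_rev_id", "change_$source_change_id"])
```

The example uses the source_* fields, as recommended above, so that
the labels reflect the original metadata even if another filter has
rewritten rev_id or change_id.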
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::addlabels usage' => <<'TOPIC',
Usage:
## From the command line:
vcp <source> addlabels: "rev_$rev_id" "change_$change_id" -- <dest>
## In a .vcp file:
AddLabels:
rev_$rev_id
change_$change_id
# ... etc ...
TOPIC
########################################################################
'filter::addlabels description' => <<'TOPIC',
Used when you want to track the original rev_id, change_id,
branch_id, etc. each revision had in the source repository
by adding a label. Can be used to turn any piece of
metadata in to a label.
Note that the fields
source_name, source_filebranch_id, source_branch_id,
source_rev_id, source_change_id
are set by VCP to be the same value as the corresponding
fields without the source prefix (except
source_filebranch_id, which is built from the file name,
rooted in the repository, and for cvs repositories, the
branch number in angle brackets.) These source_* fields
(intended to be immutable in vcp) should be used to make
labels rather than their mutable equivalents which may be
changed via a vcp filter.
There is no way to add labels only to selected revisions at
this time, but if you try to add a label for metadata that
is undefined or empty, it will not be added.
test_script t/61addlabels.t
===========================
TOPIC
########################################################################
'filter::stringedit' => <<'TOPIC',
NAME
VCP::Filter::stringedit - alter any field character by character
SYNOPSIS
StringEdit:
## Convert illegal p4 characters to ^NN hex escapes and the
## p4 wildcard "..." to a safe string. The "^" is not an illegal
## char, it's replaced with an escape to allow us to use it as
## an escape character without the (extremely small) risk of
## running across a file name that actually uses it.
## Order is significant in this ruleset.
# field(s) match replacement
name,labels /([\s@#*%^])/ ^%02x
name,labels "..." ^___
StringEdit:
## underscorify each unwanted character to a single "_"
name,labels /[\s@#*%^]/ _
StringEdit:
## underscorify each run of unwanted characters to a single "_"
name,labels /[\s@#*%^]+/ _
StringEdit:
## prefix labels that don't start with a letter or underscore:
labels /([^a-zA-Z_])/ _%c
DESCRIPTION
Allows field by field string editing, using Perl regular expressions
to match characters and substrings and sprintf-like replacement
strings.
Rules
A rule is a triplet of expressions specifying a (1) set of fields to
match, (2) a pattern to match against those fields' contents
(matching contents are removed), and (3) a string to replace each of
the removed bits with.
NOTE 1: the "match" expression uses perl5 regular expressions, not
filename wildcards used in most other places in VCP configurations.
The list of rules is evaluated top down and all rules are applied to
each string.
NOTE 2: The all-rules-apply nature of this filter is different from
the behaviors of the ...Map: filters, which stop after the first
matching rule. This is because ...Map: filters are rewriting entire
strings and there can be only one result string, while the
StringEdit filter may be rewriting pieces of string and multiple
rewrites may be combined to good effect.
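The difference between the two behaviors can be sketched in Python.
Both functions are illustrations only: rules are (regex, replacement)
pairs here, and neither function mirrors the actual VCP
implementation.

```python
import re

def apply_all(value, rules):
    """StringEdit style: every rule is applied, top down."""
    for pattern, replacement in rules:
        value = re.sub(pattern, replacement, value)
    return value

def apply_first(value, rules):
    """Map style: the first rule matching the whole string wins."""
    for pattern, replacement in rules:
        if re.fullmatch(pattern, value):
            return replacement
    return value

edit_rules = [(r"\s", "_"), (r"@", "-at-")]
map_rules = [(r"a.*", "X"), (r".*c", "Y")]

edited = apply_all("a b@c", edit_rules)
mapped = apply_first("a b@c", map_rules)
```

With apply_all both edits show up in the result; with apply_first the
second rule is never consulted because the first one already matched.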
The Fields List
A comma separated list of field names. Any field may be edited
except those that begin with "source_".
The Match Expression
For each field, the match expression is run against the field and,
if it matches, causes all matching portions of string to be
replaced.
The match expression is a full perl5 regular expression enclosed in
/.../ delimiters or a plain string, either of which may be enclosed
in '' or "" delimiters if inline spaces are needed (rare, we hope).
The Replacement Expression
Each match is replaced by one instance of the replacement
expression, optionally enclosed in single or double quotation marks.
The replacement expression provides a limited list of C sprintf
style macros:
%d The decimal codes for each character in the match
%o The octal codes for each character in the match
%x The hex codes for each character in the match
Any non-letter preceded by a backslash "\" character is replaced by
itself. Some more or less useful examples:
\% \\ \" \' \` \{ \} \$ \* \+ \? \1
If a punctuation character other than a period (.) or slash "/"
follows a letter macro, it must be escaped using the backslash
character (this is to reserve room in the spec for postfix modifiers
like "*", "+", and "?"). So, to put a literal star (*) after a hex
code, you would do something like "%02x\*".
The "normal" perl5 letter abbreviations are also allowed:
\t tab (HT, TAB)
\n newline (NL)
\r return (CR)
\f form feed (FF)
\b backspace (BS)
\a alarm (bell) (BEL)
\e escape (ESC)
\033 octal char (ESC)
\x1b hex char (ESC)
\x{263a} wide hex char (SMILEY)
\c[ control char (ESC)
\N{name} named Unicode character
The following escape sequences, which modify what follows, are also
available:
\l lowercase next char
\u uppercase next char
\L lowercase till \E
\U uppercase till \E
\E end case modification
\Q quote non-word characters till \E
As shown above, normal sprintf-style options may be included (and
are recommended), so %02x produces results like "09" (if the match
was a single TAB character) or "20" (if the match was a SPACE
character). The dot precision modifiers (".3") are not supported,
just the leading 0 and the field width specifier.
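To make the macro expansion concrete, here is a small Python sketch
that expands %d, %o and %x (with an optional leading 0 and field
width) for a single matched character. The function name expand is
hypothetical, and backslash escapes and multi-character matches are
deliberately left out.

```python
import re

_MACRO = re.compile(r"%(\d*)([dox])")

def expand(template, matched_char):
    """Replace each %d/%o/%x macro with the code of the matched character."""
    code = ord(matched_char)

    def render(m):
        spec, conversion = m.groups()
        digits = format(code, conversion)
        width = int(spec) if spec else 0
        pad = "0" if spec.startswith("0") else " "
        return digits.rjust(width, pad)

    return _MACRO.sub(render, template)

tab_escape = expand("^%02x", "\t")
space_escape = expand("^%02x", " ")
```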
Case sensitivity
By default, all patterns are case sensitive. There is no way to
override this at present; one will be added.
Command Line Parsing
For large stringedits or repeated use, the stringedit is best
specified in a .vcp file. For quick one-offs or scripted situations,
however, the stringedit: scheme may be used on the command line. In
this case, each parameter is a "word" and every triple of words is a
( fields, pattern, result ) rule.
Because vcp command line parsing is performed incrementally and the
next filter or destination specifications can look exactly like a
pattern or result, the special token "--" is used to terminate the
list of patterns if StringEdit: is used on the command line. This
may also be the last word in the "StringEdit:" section of a .vcp
file, but that is superfluous. It is an error to use "--" before the
last word in a .vcp file.
LIMITATIONS
There is no way (yet) of telling the stringeditor to continue
processing the rules list. We could implement labels like
"<<label>>" to be allowed before pattern expressions (but not
between pattern and result), and we could then implement
"<<goto label>>". And a "<<next>>" could be used to fall through to
the next label. All of which is wonderful, but I want to gain some
real world experience with the current system and find a use case
for gotos and fallthroughs before I implement them. This comment is
here to solicit feedback :).
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::stringedit usage' => <<'TOPIC',
Usage:
StringEdit:
## Convert illegal p4 characters to ^NN hex escapes and the
## p4 wildcard "..." to a safe string. The "^" is not an illegal
## char, it's replaced with an escape to allow us to use it as
## an escape character without the (extremely small) risk of
## running across a file name that actually uses it.
## Order is significant in this ruleset.
# field(s) match replacement
name,labels /([\s@#*%^])/ ^%02x
name,labels "..." ^___
StringEdit:
## underscorify each unwanted character to a single "_"
name,labels /[\s@#*%^]/ _
StringEdit:
## underscorify each run of unwanted characters to a single "_"
name,labels /[\s@#*%^]+/ _
StringEdit:
## prefix labels that don't start with a letter or underscore:
labels /([^a-zA-Z_])/ _%c
TOPIC
########################################################################
'filter::stringedit description' => <<'TOPIC',
Allows field by field string editing, using Perl regular
expressions to match characters and substrings and
sprintf-like replacement strings.
Rules
=====
A rule is a triplet of expressions specifying a (1) set of
fields to match, (2) a pattern to match against those
fields' contents (matching contents are removed), and (3) a
string to replace each of the removed bits with.
NOTE 1: the "match" expression uses perl5 regular
expressions, not filename wildcards used in most other
places in VCP configurations.
The list of rules is evaluated top down and all rules are
applied to each string.
NOTE 2: The all-rules-apply nature of this filter is
different from the behaviors of the ...Map: filters, which
stop after the first matching rule. This is because ...Map:
filters are rewriting entire strings and there can be only
one result string, while the StringEdit filter may be
rewriting pieces of string and multiple rewrites may be
combined to good effect.
The Fields List
===============
A comma separated list of field names. Any field may be
edited except those that begin with "source_".
The Match Expression
====================
For each field, the match expression is run against the
field and, if it matches, causes all matching portions of
string to be replaced.
The match expression is a full perl5 regular expression
enclosed in /.../ delimiters or a plain string, either of
which may be enclosed in '' or "" delimiters if inline
spaces are needed (rare, we hope).
The Replacement Expression
==========================
Each match is replaced by one instance of the replacement
expression, optionally enclosed in single or double
quotation marks.
The replacement expression provides a limited list of C
sprintf style macros:
%d The decimal codes for each character in the match
%o The octal codes for each character in the match
%x The hex codes for each character in the match
Any non-letter preceded by a backslash "\" character is
replaced by itself. Some more or less useful examples:
\% \\ \" \' \` \{ \} \$ \* \+ \? \1
If a punctuation character other than a period (.) or slash
"/" follows a letter macro, it must be escaped using the
backslash character (this is to reserve room in the spec
for postfix modifiers like "*", "+", and "?"). So, to put a
literal star (*) after a hex code, you would do something
like "%02x\*".
the_future %x* %x{1} %x{1,} %x{,3} %x{1,3}
==========================================
The "normal" perl5 letter abbreviations are also allowed:
\t tab (HT, TAB)
\n newline (NL)
\r return (CR)
\f form feed (FF)
\b backspace (BS)
\a alarm (bell) (BEL)
\e escape (ESC)
\033 octal char (ESC)
\x1b hex char (ESC)
\x{263a} wide hex char (SMILEY)
\c[ control char (ESC)
\N{name} named Unicode character
The following escape sequences, which modify what follows,
are also available:
\l lowercase next char
\u uppercase next char
\L lowercase till \E
\U uppercase till \E
\E end case modification
\Q quote non-word characters till \E
As shown above, normal sprintf-style options may be
included (and are recommended), so %02x produces results
like "09" (if the match was a single TAB character) or
"20" (if the match was a SPACE character). The dot
precision modifiers (".3") are not supported, just the
leading 0 and the field width specifier.
Case sensitivity
================
By default, all patterns are case sensitive. There is no
way to override this at present; one will be added.
Command Line Parsing
====================
For large stringedits or repeated use, the stringedit is
best specified in a .vcp file. For quick one-offs or
scripted situations, however, the stringedit: scheme may be
used on the command line. In this case, each parameter is a
"word" and every triple of words is a ( fields, pattern,
result ) rule.
Because vcp command line parsing is performed incrementally
and the next filter or destination specifications can look
exactly like a pattern or result, the special token "--" is
used to terminate the list of patterns if StringEdit: is
used on the command line. This may also be the last word in
the "StringEdit:" section of a .vcp file, but that is
superfluous. It is an error to use "--" before the last
word in a .vcp file.
test_script t/61stringedit.t
============================
TOPIC
########################################################################
'filter::labelmap' => <<'TOPIC',
NAME
VCP::Filter::labelmap - Alter or remove labels from each revision
SYNOPSIS
## From the command line:
vcp <source> labelmap: "rev_$rev_id" "change_$change_id" -- <dest>
## In a .vcp file:
LabelMap:
foo-... <<delete>> # remove all labels beginning with foo-
F...R <<delete>> # remove all labels beginning with F and ending with R
v-(...) V-$1 # use uppercase v prefixes
DESCRIPTION
Allows labels to be altered or removed using a syntax similar to
VCP::Filter::map. This is being written for development use so more
documentation is needed. See VCP::Filter::map for more examples of
pattern matching (though VCP::Filter::labelmap does not use
<branch_id> syntax).
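As a rough picture of how the SYNOPSIS rules behave, the Python
sketch below treats "..." as "any run of characters", "(...)" as a
capturing version of the same, and stops at the first matching rule.
These semantics are assumptions drawn from the examples above, and
the helper names are invented; see VCP::Filter::map for the real
pattern syntax.

```python
import re

def compile_pattern(pattern):
    """Turn a '...' wildcard pattern in to a compiled regular expression."""
    regex = ""
    for part in re.split(r"(\(\.\.\.\)|\.\.\.)", pattern):
        if part == "(...)":
            regex += "(.*)"
        elif part == "...":
            regex += ".*"
        else:
            regex += re.escape(part)
    return re.compile(regex)

def map_label(label, rules):
    """Return the rewritten label, or None if it should be deleted."""
    for pattern, result in rules:
        m = compile_pattern(pattern).fullmatch(label)
        if m:
            if result == "<<delete>>":
                return None
            return re.sub(r"\$(\d+)", lambda r: m.group(int(r.group(1))), result)
    return label  # no rule matched: leave the label alone

rules = [
    ("foo-...", "<<delete>>"),
    ("F...R", "<<delete>>"),
    ("v-(...)", "V-$1"),
]
results = [map_label(label, rules) for label in ["foo-bar", "FooBaR", "v-1.0", "other"]]
```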
AUTHOR
Barrie Slaymaker <barries@slaysys.com>
COPYRIGHT
Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights
reserved.
See VCP::License ("vcp help license") for the terms of use.
TOPIC
########################################################################
'filter::labelmap usage' => <<'TOPIC',
Usage:
## From the command line:
vcp <source> labelmap: "rev_$rev_id" "change_$change_id" -- <dest>
## In a .vcp file:
LabelMap:
foo-... <<delete>> # remove all labels beginning with foo-
F...R <<delete>> # remove all labels beginning with F and ending with R
v-(...) V-$1 # use uppercase v prefixes
TOPIC
########################################################################
'filter::labelmap description' => <<'TOPIC',
Allows labels to be altered or removed using a syntax
similar to VCP::Filter::map. This is being written for
development use so more documentation is needed. See
VCP::Filter::map for more examples of pattern matching
(though VCP::Filter::labelmap does not use <branch_id>
syntax).
test_script t/61labelmap.t
==========================
TOPIC