Inter-Call State in libarch
Suppose that you write a libarch client which calls a function to set the default archive, overriding the setting in ~/.arch-params (for your client only). libarch returns from that call and your client makes another, this time to the librified equivalent of the categories command. The second call to libarch will have to "remember" the default archive setting recorded by the first call: there is inter-call state implied by some familiar arch primitives.
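A minimal sketch of that scenario in C. The type t_arch_sketch and both function names are hypothetical stand-ins, not the libarch API; the only point is that state written by one call is readable by the next because both calls receive the same handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-client state; not the real t_arch. */
typedef struct {
    char *default_archive;   /* NULL until some call sets it */
} t_arch_sketch;

/* First call: record the override for this client only.
   Returns 0 on success, -1 if memory could not be allocated. */
static int sketch_set_default_archive(t_arch_sketch *arch, const char *name)
{
    char *copy;

    assert(arch != NULL && name != NULL);
    copy = malloc(strlen(name) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, name);
    free(arch->default_archive);   /* free(NULL) is a no-op */
    arch->default_archive = copy;
    return 0;
}

/* A later call (for example a librified "categories") reads back
   whatever the earlier call recorded in the same handle. */
static const char *sketch_default_archive(const t_arch_sketch *arch)
{
    assert(arch != NULL);
    return arch->default_archive;
}
```

A client would create one t_arch_sketch, pass it to the setter, and pass the same handle to every later call; nothing is stored in process-wide globals.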
This note is about the data structures I'm coding for inter-call state in librified libarch. The description here is in terms of a pseudo-code language; the transcription into the C API is straightforward.
Scalars
String constants: "hello world"
Integers: -5, 10
Identifiers: xyzzy
Nil: #nil
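One way the four scalar kinds might be transcribed into C is a tagged union. This is only a sketch: the enum, struct, and function names are assumptions made for illustration, not names taken from libarch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tags for the four scalar kinds listed above. */
enum scalar_kind {
    SCALAR_NIL,
    SCALAR_STRING,
    SCALAR_INTEGER,
    SCALAR_IDENTIFIER
};

/* A scalar is a tag plus the payload that the tag selects. */
struct scalar {
    enum scalar_kind kind;
    union {
        const char *text;   /* SCALAR_STRING and SCALAR_IDENTIFIER */
        long integer;       /* SCALAR_INTEGER */
    } as;
};

static struct scalar scalar_nil(void)
{
    struct scalar s;
    s.kind = SCALAR_NIL;
    s.as.text = NULL;
    return s;
}

static struct scalar scalar_string(const char *text)
{
    struct scalar s;
    s.kind = SCALAR_STRING;
    s.as.text = text;
    return s;
}

static struct scalar scalar_integer(long n)
{
    struct scalar s;
    s.kind = SCALAR_INTEGER;
    s.as.integer = n;
    return s;
}

static struct scalar scalar_identifier(const char *name)
{
    struct scalar s;
    s.kind = SCALAR_IDENTIFIER;
    s.as.text = name;
    return s;
}

/* Two scalars are equal when both the kind and the payload match,
   so the string "xyzzy" differs from the identifier xyzzy. */
static int scalar_equal(struct scalar a, struct scalar b)
{
    if (a.kind != b.kind)
        return 0;
    switch (a.kind) {
    case SCALAR_NIL:
        return 1;
    case SCALAR_INTEGER:
        return a.as.integer == b.as.integer;
    case SCALAR_STRING:
    case SCALAR_IDENTIFIER:
        return strcmp(a.as.text, b.as.text) == 0;
    }
    return 0;
}
```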
Module Names and Module Variables
Each module has a unique (to the process) name.
A module can create "global" variables whose state persists across libarch calls:

    module user_params {
      var default_archive;
      var history_limit;
    }

    default_archive = "gnu@gnu.gnu--2001";
    history_limit = 5;
Lists, Arrays, and Associative Tables
The system supports lists, arrays, and associative tables of scalar values:
    module user_params {
      list library_path;
      array aliases;
      list new_alias;
      table aliases_map;
    }

    push library_path "~/{revlib}";
    push library_path "/usr2/gnu/{revlib}";

    aliases[0] = list "commit" "cmt";
    aliases[1] = list "update" "up";
    new_alias = list "replay" "r";
    push aliases new_alias;

    aliases_map["cmt"] = "commit";
    aliases_map["up"] = "update";

    --------------------------------

    library_path => ("~/{revlib}" "/usr2/gnu/{revlib}")
    aliases => (("commit" "cmt") ("update" "up") ("replay" "r"))
    new_alias => ("replay" "r")
    library_path[0] => "~/{revlib}"
    aliases[0][1] => "cmt"
    aliases_map["up"] => "update"
    aliases_map["frob"] => #nil
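The table behaviour in the example (a lookup of an unbound key yields #nil) can be sketched in C as follows. The struct, the fixed capacity, and the function names are assumptions for illustration; NULL stands in for #nil, and values are restricted to strings to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

#define TABLE_MAX 16   /* arbitrary capacity chosen for the sketch */

/* Hypothetical associative table from string keys to string values. */
struct table {
    const char *keys[TABLE_MAX];
    const char *values[TABLE_MAX];
    int count;
};

/* Bind key to value, replacing any existing binding for key.
   Returns 0 on success, -1 if the table is full. */
static int table_set(struct table *t, const char *key, const char *value)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            t->values[i] = value;
            return 0;
        }
    }
    if (t->count == TABLE_MAX)
        return -1;
    t->keys[t->count] = key;
    t->values[t->count] = value;
    t->count++;
    return 0;
}

/* Look up key. NULL plays the role of #nil when key has no binding. */
static const char *table_ref(const struct table *t, const char *key)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0)
            return t->values[i];
    }
    return NULL;
}
```

A linear scan is used only for brevity; a real table would more likely hash its keys.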
Namespace Variables
A namespace variable is similar to a table except that bindings in a namespace are not limited to being scalars (they may be of any type).
    module user_params {
      namespace perms_defaults;
    }

    list perms_defaults["gnu.org"];
    push perms_defaults["gnu.org"] 'rw-;
    push perms_defaults["gnu.org"] 'r--;
    push perms_defaults["gnu.org"] 'r--;

    --------------------------------

    perms_defaults["gnu.org"] => (rw- r-- r--)
    perms_defaults["gnu.org"][0] => rw-
Note: Namespaces are not values -- they are properties of variables. For instance, the example above could not contain the statement tmp = perms_defaults because that would be an attempt to use the entire namespace (perms_defaults) as a value (to be stored in tmp). Namespaces may be nested: the statement namespace perms_defaults["subdir"] creates a nested, initially-empty namespace named "subdir" in perms_defaults.
Records
A record variable is a nested namespace of variables. Records have a fixed structure which must be declared before use:
    record user_id {
      var full_id;
      var uid_part;
      table host_id_map;
    }

    module user_params {
      record user_id id;
    }

    id.full_id = "Tom Lord <lord@emf.net>";
    id.uid_part = "lord@emf.net";
    id.host_id_map["emf.net"] = "lord";
    id.host_id_map["gnu.org"] = "tomlord";
Note: Records, like namespaces, are not values but are properties of variables. A record can contain another record, a namespace, an array, a list, or a table, but records cannot be copied from variable to variable.
Example: A Database of Archive Locations
It's useful to make a namespace of records:
    record archive_location {
      var archive_name;
      var primary;
      var mirror;
    }

    module user_params {
      namespace archive_registry;
    }

    record archive_location archive_registry["lord@emf.net--2005"];
    archive_registry["lord@emf.net--2005"].archive_name = "lord@emf.net--2005";
    archive_registry["lord@emf.net--2005"].primary = "~/archives/CURRENT";
    archive_registry["lord@emf.net--2005"].mirror = " ... ";

    record archive_location archive_registry["lord@emf.net--2004"];
    archive_registry["lord@emf.net--2004"].archive_name = "lord@emf.net--2004";
    archive_registry["lord@emf.net--2004"].primary = "~/archives/PREV";
    archive_registry["lord@emf.net--2004"].mirror = " ... ";
Observation: It's A Tree (With Some Sharing)
The global state of an arch client (represented in C as a t_arch value) forms a tree. At the root is a namespace of records: the root namespace has one entry for each module; the private variables of a module are the fields of its corresponding record type.
Namespaces and records can contain namespaces and records and form a tree.
That tree contains subtrees, each of which is a list, array, table, or scalar. Arrays are subtrees whose elements are lists; lists and tables are subtrees whose elements are scalars. All leaf nodes are scalars.
(One pleasing consequence is that all of these data structures can be reference counted since a tree contains no cycles.)
A single list, array, or table may occur at multiple locations in the tree. This matters only when mutations are considered: changing a list at one place in the tree might change another part of the tree where that list also occurs. Module implementations should use this sharing with caution, since a mutation made through one location is visible at every other location that holds the same object.
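The interaction between reference counting and sharing can be sketched in C like this. The struct rc_list type, its fixed capacity, and the function names are hypothetical and chosen for illustration; the sketch shows one list reachable through two references, with a push through one visible through the other.

```c
#include <assert.h>
#include <stdlib.h>

#define LIST_MAX 8   /* arbitrary capacity chosen for the sketch */

/* Hypothetical reference-counted list of strings. */
struct rc_list {
    int refs;                      /* number of places holding this list */
    int count;                     /* number of items currently stored */
    const char *items[LIST_MAX];
};

/* Create an empty list owned by one reference; NULL on allocation failure. */
static struct rc_list *rc_list_new(void)
{
    struct rc_list *l = calloc(1, sizeof *l);

    if (l != NULL)
        l->refs = 1;
    return l;
}

/* Record that one more location in the tree holds this list. */
static struct rc_list *rc_list_ref(struct rc_list *l)
{
    assert(l != NULL && l->refs > 0);
    l->refs++;
    return l;
}

/* Drop one reference, freeing the list when the last one goes away.
   Returns the number of references that remain. */
static int rc_list_unref(struct rc_list *l)
{
    int remaining;

    assert(l != NULL && l->refs > 0);
    remaining = --l->refs;
    if (remaining == 0)
        free(l);
    return remaining;
}

/* Append an item. Returns 0 on success, -1 if the list is full. */
static int rc_list_push(struct rc_list *l, const char *item)
{
    assert(l != NULL);
    if (l->count == LIST_MAX)
        return -1;
    l->items[l->count++] = item;
    return 0;
}
```

Because the structure contains no cycles, a plain counter like this is enough to reclaim it; no cycle detection is needed.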
A Nice Debugging Aid
Among other benefits, organizing the inter-call state of libarch this way gives rise to a nice debugging aid: the entire state of a t_arch instance can be usefully printed by a generic printer:
    module user_params {
      var user_id = "Tom Lord <lord@emf.net>";
      var default = "lord@emf.net--2005";
    }

    module revision_libraries {
      list path = ("~/{revlib}" "/usr2/share/{revlib}");
      namespace libprops {
        record library_properties "~/{revlib}" {
          greedy = 1;
          sparse = 1;
        }
        record library_properties "/usr2/share/{revlib}" {
          greedy = 1;
          sparse = 0;
        }
      }
    }

    ...etc...
(The printer is what I'm currently working on.)
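To illustrate why a tree-shaped state makes a generic printer easy, here is a small recursive printer in C. It is not the printer described above: the node type, the output buffer, and all names are assumptions, and it covers only strings, integers, and lists rather than modules, namespaces, or records.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node kinds; a real tree would also have tables,
   namespaces, and records. */
enum node_kind { NODE_STRING, NODE_INTEGER, NODE_LIST };

struct node {
    enum node_kind kind;
    const char *text;            /* used by NODE_STRING */
    long integer;                /* used by NODE_INTEGER */
    const struct node **items;   /* used by NODE_LIST */
    size_t count;                /* number of entries in items */
};

/* Fixed-size output buffer so the result can be inspected. */
struct out {
    char buf[256];
    size_t len;
};

/* Append s to the buffer, silently truncating if it does not fit. */
static void out_puts(struct out *o, const char *s)
{
    size_t n = strlen(s);
    size_t room = sizeof o->buf - 1 - o->len;

    if (n > room)
        n = room;
    memcpy(o->buf + o->len, s, n);
    o->len += n;
    o->buf[o->len] = '\0';
}

/* Print one node; lists recurse into their elements. Recursion
   terminates because the structure has no cycles. */
static void print_node(struct out *o, const struct node *n)
{
    char digits[32];
    size_t i;

    switch (n->kind) {
    case NODE_STRING:
        out_puts(o, "\"");
        out_puts(o, n->text);
        out_puts(o, "\"");
        break;
    case NODE_INTEGER:
        snprintf(digits, sizeof digits, "%ld", n->integer);
        out_puts(o, digits);
        break;
    case NODE_LIST:
        out_puts(o, "(");
        for (i = 0; i < n->count; i++) {
            if (i > 0)
                out_puts(o, " ");
            print_node(o, n->items[i]);
        }
        out_puts(o, ")");
        break;
    }
}
```

The printer needs no knowledge of which module owns a value: one switch over the node kind is enough, which is the property the tree organization buys.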