Inter-Call State in `libarch`

Suppose that you write a libarch client which calls a function to set the default archive, overriding the setting in ~/.arch-params (for your client only).

libarch returns from that call and your client makes another, this time to the librified equivalent of the categories command.

The second call to libarch will have to "remember" the default archive setting recorded by the first call: there is inter-call state implied by some familiar arch primitives.
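A minimal sketch of that scenario, written in Python rather than the C API. The names `ArchSession`, `set_default_archive`, and `categories` are hypothetical stand-ins, not libarch functions; the only point is that the second call reads state left behind by the first.

```python
class ArchSession:
    """Hypothetical stand-in for one client's libarch state."""

    def __init__(self, params_default=None):
        # Value assumed to have been read from ~/.arch-params already.
        self._params_default = params_default
        # Per-client override; None means "not overridden".
        self._override = None

    def set_default_archive(self, name):
        # First call: record the override for this client only.
        self._override = name

    def default_archive(self):
        # The override, if any, wins over the ~/.arch-params setting.
        if self._override is not None:
            return self._override
        return self._params_default

    def categories(self):
        # Second call: a real implementation would list the categories of the
        # default archive. This sketch only reports which archive it would use.
        return "categories of " + str(self.default_archive())
```

Two sessions built from the same ~/.arch-params value stay independent: overriding the default in one does not affect the other.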

This note is about the data structures I'm coding for inter-call state in librified libarch. The description here is in terms of a pseudo-code language; the transcription into the C API is straightforward.

Scalars

String constants: "hello world"

Integers: -5, 10

Identifiers: xyzzy

Nil: #nil

Module Names and Module Variables

Each module has a unique (to the process) name.

A module can create "global" variables whose state persists across libarch calls:

        module user_params
        {
          var default_archive;
          var history_limit;
        }

        default_archive = "gnu@gnu.gnu--2001";
        history_limit = 5;
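One way to picture module variables is as a per-process registry keyed by module name. The sketch below is a Python model, not the C data structure; `ModuleRegistry` and its methods are hypothetical names, and two behaviors are assumptions: declaring the same module name twice is rejected, and an unassigned variable reads as `None` (standing in for #nil).

```python
class ModuleRegistry:
    """Hypothetical model: module name -> declared variables."""

    def __init__(self):
        self._modules = {}

    def declare(self, module, *names):
        # Assumption: module names are unique, so a repeat is an error.
        if module in self._modules:
            raise ValueError("module already declared: " + module)
        self._modules[module] = {name: None for name in names}

    def set(self, module, name, value):
        variables = self._modules[module]
        # Only variables named in the module declaration may be assigned.
        if name not in variables:
            raise KeyError(name)
        variables[name] = value

    def get(self, module, name):
        return self._modules[module][name]
```

Because the registry object outlives any single call, a value stored during one call is still there for the next.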

Lists, Arrays, and Associative Tables

The system supports lists, arrays, and associative tables of scalar values:

      module user_params
      {
        list library_path;
        array aliases;
        list new_alias;
        table aliases_map;
      }

      push library_path "~/{revlib}";
      push library_path "/usr2/gnu/{revlib}";

      aliases[0] = list "commit" "cmt";
      aliases[1] = list "update" "up";

      new_alias = list "replay" "r";
      push aliases new_alias;

      aliases_map["cmt"] = "commit";
      aliases_map["up"] = "update";


      --------------------------------

      library_path      =>      ("~/{revlib}" "/usr2/gnu/{revlib}")

      aliases           =>      (("commit" "cmt")
                                 ("update" "up")
                                 ("replay" "r"))

      new_alias         =>      ("replay" "r")

      library_path[0]   =>      "~/{revlib}"

      aliases[0][1]     =>      "cmt"

      aliases_map["up"] =>      "update"

      aliases_map["frob"] =>    #nil
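The same example can be replayed in Python as a rough model of the semantics. This is a sketch under assumptions: `NIL`, `array_set`, `table_ref`, and `build_example` are hypothetical helpers, assigning past the end of an array is assumed to grow it, and `push aliases new_alias` is assumed to store the list itself rather than a copy.

```python
NIL = None  # stands in for #nil


def array_set(array, index, value):
    """Assign array[index], growing the array with NIL as needed."""
    while len(array) <= index:
        array.append(NIL)
    array[index] = value


def table_ref(table, key):
    """Look up a key; a missing key yields NIL rather than an error."""
    return table.get(key, NIL)


def build_example():
    library_path = []
    library_path.append("~/{revlib}")
    library_path.append("/usr2/gnu/{revlib}")

    aliases = []
    array_set(aliases, 0, ["commit", "cmt"])
    array_set(aliases, 1, ["update", "up"])

    new_alias = ["replay", "r"]
    aliases.append(new_alias)  # assumed to share the list, not copy it

    aliases_map = {}
    aliases_map["cmt"] = "commit"
    aliases_map["up"] = "update"
    return library_path, aliases, new_alias, aliases_map
```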

Namespace Variables

A namespace variable is similar to a table except that bindings in a namespace are not limited to being scalars (they may be of any type).

        module user_params
        {
          namespace perms_defaults;
        }

        list perms_defaults["gnu.org"];
        push perms_defaults["gnu.org"] 'rw-;
        push perms_defaults["gnu.org"] 'r--;
        push perms_defaults["gnu.org"] 'r--;

        --------------------------------

        perms_defaults["gnu.org"] => (rw- r-- r--)

        perms_defaults["gnu.org"][0] => rw-

Note: Namespaces are not values -- they are properties of variables. For example, the above example could not contain the statement tmp = perms_defaults because that would be an attempt to use the entire namespace (perms_defaults) as a value (to be stored in tmp). Namespaces may be nested: the statement namespace perms_defaults["subdir"] creates a nested, initially-empty namespace named "subdir" in perms_defaults.
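A rough Python model of these rules, as a sketch only. The `Namespace` class and its `bind`, `nest`, and `ref` methods are hypothetical names; the model lets a binding hold any value (here a list of identifiers, written as strings), creates nested namespaces in place, and refuses to store a namespace as a value.

```python
class Namespace:
    """Hypothetical model of a namespace variable."""

    def __init__(self):
        self._bindings = {}

    def bind(self, key, value):
        # A namespace is a property of a variable, not a value, so it
        # cannot be stored into a binding the way a list or scalar can.
        if isinstance(value, Namespace):
            raise TypeError("namespaces are not values")
        self._bindings[key] = value

    def nest(self, key):
        # Create a nested, initially-empty namespace under this key.
        child = Namespace()
        self._bindings[key] = child
        return child

    def ref(self, key):
        return self._bindings[key]

    def is_empty(self):
        return not self._bindings
```

In this model the only way to get a namespace inside another one is `nest`, which mirrors the `namespace perms_defaults["subdir"]` statement.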

Records

A record variable is a nested namespace of variables. Records have a fixed structure which must be declared before use:

        record user_id
        {
          var full_id;
          var uid_part;
          table host_id_map;
        }

        module user_params
        {
          record user_id id;
        }

        id.full_id = "Tom Lord <lord@emf.net>";
        id.uid_part = "lord@emf.net";
        id.host_id_map["emf.net"] = "lord";
        id.host_id_map["gnu.org"] = "tomlord";

Note: Records, like namespaces, are not values but are properties of variables. A record can contain another record, a namespace, an array, a list, or a table -- but records cannot be copied from variable to variable.
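The "fixed structure" rule can be modeled in Python as follows. This is a sketch, not the C representation: the `Record` class is a hypothetical name, unassigned `var` fields are assumed to read as `None` (for #nil), and assigning a whole record into a field is rejected to reflect that records are not values.

```python
class Record:
    """Hypothetical model of a record with a fixed set of fields."""

    def __init__(self, type_name, var_fields, table_fields=()):
        self.type_name = type_name
        # var fields start out unset; table fields start out empty.
        self._fields = {name: None for name in var_fields}
        self._fields.update({name: {} for name in table_fields})

    def get(self, field):
        if field not in self._fields:
            raise KeyError(self.type_name + " has no field " + field)
        return self._fields[field]

    def set(self, field, value):
        # Fields not named in the record declaration cannot be created.
        if field not in self._fields:
            raise KeyError(self.type_name + " has no field " + field)
        # Records are properties of variables, so they cannot be copied in.
        if isinstance(value, Record):
            raise TypeError("records are not values")
        self._fields[field] = value
```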

Example: A Database of Archive Locations

It's useful to make a namespace of records:

        record archive_location
        {
          var archive_name;
          var primary;
          var mirror;
        }

        module user_params
        {
          namespace archive_registry;
        }


        record archive_location archive_registry["lord@emf.net--2005"];

        archive_registry["lord@emf.net--2005"].archive_name = "lord@emf.net--2005";
        archive_registry["lord@emf.net--2005"].primary = "~/archives/CURRENT";
        archive_registry["lord@emf.net--2005"].mirror = " ... ";

        record archive_location archive_registry["lord@emf.net--2004"];

        archive_registry["lord@emf.net--2004"].archive_name = "lord@emf.net--2004";
        archive_registry["lord@emf.net--2004"].primary = "~/archives/PREV";
        archive_registry["lord@emf.net--2004"].mirror = " ... ";
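In Python terms, a namespace of records behaves much like a dictionary of small structs. The sketch below is only a model: `ArchiveLocation` mirrors the `archive_location` record, `register` is a hypothetical helper, and the mirror field is left unset because the example above elides the mirror locations.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ArchiveLocation:
    """Mirrors the archive_location record; None stands in for #nil."""
    archive_name: Optional[str] = None
    primary: Optional[str] = None
    mirror: Optional[str] = None


def register(registry: Dict[str, ArchiveLocation], name: str,
             primary: str, mirror: Optional[str] = None) -> ArchiveLocation:
    # Create the record in place under its key, then fill in its fields.
    location = ArchiveLocation(archive_name=name, primary=primary, mirror=mirror)
    registry[name] = location
    return location


archive_registry: Dict[str, ArchiveLocation] = {}
register(archive_registry, "lord@emf.net--2005", "~/archives/CURRENT")
register(archive_registry, "lord@emf.net--2004", "~/archives/PREV")
```

A lookup such as `archive_registry["lord@emf.net--2005"].primary` then corresponds to the dotted references in the pseudo-code.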

Observation: It's A Tree (With Some Sharing)

The global state of an arch client (represented in C as a t_arch value) forms a tree. At the root is a namespace of records: the root namespace has one entry for each module; the private variables of a module are the fields of its corresponding record type.

Namespaces and records can contain namespaces and records and form a tree.

That tree contains subtrees, each of which is a list, array, table, or scalar. Arrays are subtrees made up of lists; lists and tables are subtrees made up of scalars. All leaf nodes are scalars.

(One pleasing consequence is that all of these data structures can be reference counted since a tree contains no cycles.)
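To illustrate why the absence of cycles makes reference counting sufficient, here is a toy model in Python. It is a sketch of the general technique, not the libarch implementation; `Node`, `retain`, and `release` are hypothetical names. A node is freed when its count drops to zero, and freeing a node releases its children.

```python
class Node:
    """Toy reference-counted tree node."""

    def __init__(self, label, children=()):
        self.label = label
        self.refs = 0
        self.freed = False
        self.children = list(children)
        # Each parent holds one reference to each of its children.
        for child in self.children:
            retain(child)


def retain(node):
    node.refs += 1


def release(node):
    node.refs -= 1
    if node.refs == 0:
        # With no cycles, counts along every path eventually reach zero.
        node.freed = True
        for child in node.children:
            release(child)
```

A list that occurs at two places in the tree simply has a count of two, and it is freed once both of its parents are gone.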

A single list, array, or table may occur at multiple locations in the tree. This matters only when mutations are considered: changing a list at one place in the tree might change another part of the tree where that list also occurs. Module implementations should use this facility for sharing with caution for obvious reasons.
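The hazard is easy to demonstrate in a Python model. In this sketch the nested dictionaries stand in for the state tree, and the idea that two modules hold the same path list is purely hypothetical; `build_state` either shares one list between two locations or stores a copy.

```python
import copy


def build_state(share):
    """Build a toy state tree; share=True puts one list at two locations."""
    path = ["~/{revlib}"]
    other = path if share else copy.copy(path)
    return {
        "user_params": {"library_path": path},
        "revision_libraries": {"path": other},
    }


shared_state = build_state(share=True)
# Mutating the list through one location changes the other location too.
shared_state["user_params"]["library_path"].append("/usr2/gnu/{revlib}")

copied_state = build_state(share=False)
# With a copy, the mutation stays local to the location that was changed.
copied_state["user_params"]["library_path"].append("/usr2/gnu/{revlib}")
```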

A Nice Debugging Aid

Among other benefits, organizing the inter-call state of libarch this way gives rise to a nice debugging aid: the entire state of a t_arch instance can be usefully printed by a generic printer:

    module user_params
      {
        var user_id = "Tom Lord <lord@emf.net>";
        var default = "lord@emf.net--2005";
      }

    module revision_libraries
      {
        list path = ("~/{revlib}" "/usr2/share/{revlib}");
        namespace libprops
          {
            record library_properties "~/{revlib}"
              {
                greedy = 1;
                sparse = 1;
              }
            record library_properties "/usr2/share/{revlib}"
              {
                greedy = 1;
                sparse = 0;
              }
          }
      }

      ...etc...

(The printer is what I'm currently working on.)
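As an illustration of why a generic printer falls out of the tree structure, here is a small recursive formatter over a Python model of the state (dictionaries for modules and namespaces, lists for lists, `None` for #nil). It is a sketch only: `format_value` and `format_state` are hypothetical names, and the output layout is a simplification, not the format shown above.

```python
def format_value(value):
    """Render a scalar or a (possibly nested) list on one line."""
    if value is None:
        return "#nil"
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, list):
        return "(" + " ".join(format_value(item) for item in value) + ")"
    return str(value)


def format_state(name, value, indent=0):
    """Return the lines describing one named node of the state tree."""
    pad = " " * indent
    if isinstance(value, dict):
        # Interior node: print its name, then recurse into each binding.
        lines = [pad + name, pad + "{"]
        for key in value:
            lines.extend(format_state(key, value[key], indent + 2))
        lines.append(pad + "}")
        return lines
    # Leaf (scalar or list): a single "name = value;" line.
    return [pad + name + " = " + format_value(value) + ";"]
```

Calling `"\n".join(format_state("user_params", state))` on a module's dictionary yields a printable dump; the printer needs no knowledge of any particular module.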

Tom Lord's Hackery
