Tla 1.3.1 Librification Experiment Progress Report
2005-02-21
This document refers to the tla development branch:
Archive: lord@emf.net--librify-tla-2005
Version: tla--factor-1--1.3.1
(See links on the home page for help finding my archives.)
Context
Last week I reported having just spent a week modifying the libawk part of tla to permit string-sharing between multiple relational and associative table entries. (The bug report describes that work.)
I was surprised at how quickly that work went. I had only meant to spend some time estimating how long the task would take. Working on the estimate, I couldn't resist working on the modification itself, and when the deadline for the estimate arrived I had already finished the work I was supposed to estimate! (I provided an estimate of "-3 days" -- undertaking the already-complete task should count as adding three days to our schedule. :-)
That got me thinking: the libawk cleanups were a small part of the long list of changes that would be necessary to incrementally transform libarch from its state in 1.3 into a more librified and more portable piece of code: friendly to scripting languages, GUIs, alternative front-ends, extension languages, and non-unix platforms.
I have long assumed (as described elsewhere) that an incremental librification starting from the 1.3 code base had poor chances for success.
But if the fixes to libawk took "-3 days", perhaps the rest of incremental librification wouldn't be so impractical.
A Two Week Experiment
Today marks the middle of what will be a two week experiment in 1.3
librification.
The form of the experiment is that I am spending these two weeks to librify as much of libarch as I can, subject to some constraints (see "Librification Experiment Constraints" below).
Results After Week 1
Executive Summary
The experiment tries to answer the yes or no question:
The Executive Question
Is pursuing the librification effort for 1.3.1 a practical strategy for pursuing the high-level objectives for GNU Arch (such as Windows support, Unicode support, scripting and extension language support, a demonstrably/visibly robust implementation, etc.)?
One week into the experiment I will leave my betting money at 90/10: there is a 90% chance that the answer to the executive question will be a clear "yes" at the end of the two week experiment (which will be 28-Feb-2005).
One modest but interesting demonstration of the benefits of librification might be the improvements to error reporting that it might lead to.
Technical Summary
See also ./tla-fn-anatomy.html.
Last Week
I spent most of the first week laying down a foundation for librification. That included:
Factoring the source tree. I set up a framework for splitting up files into multiple directories organized around modular and "modular cluster" boundaries. The old contents of libarch now live in a directory called libarch-compat. Those files will be incrementally deprecated: removed one by one and replaced by librified replacements in other libarch-* directories.
Setting up a new front end. My plan is to librify "from the top down" as much as possible. I'll work in a loop: pick an arch subcommand; rewrite it to use only librified code (no code from libarch-compat); repeat until there are no unlibrified commands. Therefore, I set up a new tla.c (the home of main). The new front end first looks for a librified version of the subcommand. If it doesn't find one, it runs the subcommand from libarch-compat.
Designing and implementing the error signalling mechanism. libarch needs to consistently and robustly signal errors rather than (in the manner of 1.3 and prior) often simply exiting on discovery of an error condition. Part of what I did this week was to install run-time system support (./src/tla/libach-errors) for error management.
Rebuilt libawk. The libawk cleanup modified all callers into libawk to be robust in the face of a libawk implementation that shared strings between multiple table entries. It also modified the existing libawk code to actually share strings opportunistically, resulting in, at the least, a significant run-time space savings. Many librified functions will need libawk-style functionality, but the existing libawk implementation does not provide for error signalling and recovery and, in other ways, does not conform to the requirements for a fully librified libarch. Last week, collecting ideas and code-scraps from both the existing libawk and the code base for tla 2.0, I built a new implementation of the functionality in libawk. The new libawk (now called libarch-values), in addition to being librified, adds support for table entries whose values are of types other than just string (e.g., integer-valued table entries).
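The string sharing described above can be pictured with a small reference-counted string. This is only a sketch under stated assumptions: the names shared_str, shared_str_new, shared_str_ref, and shared_str_unref are hypothetical and are not taken from libawk or libarch-values, whose actual representation is not described here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; not the actual libawk type. */
struct shared_str
{
  size_t refs;   /* number of table entries holding this string */
  char * text;   /* NUL-terminated contents, owned by this struct */
};

/* Allocate a new shared string with one reference.
 * Returns NULL on allocation failure instead of exiting. */
struct shared_str *
shared_str_new (const char * text)
{
  struct shared_str * s;
  size_t len;

  assert (text != NULL);
  s = malloc (sizeof *s);
  if (!s)
    return NULL;
  len = strlen (text);
  s->text = malloc (len + 1);
  if (!s->text)
    {
      free (s);
      return NULL;
    }
  memcpy (s->text, text, len + 1);
  s->refs = 1;
  return s;
}

/* A second table entry shares the string: bump the count, copy nothing. */
struct shared_str *
shared_str_ref (struct shared_str * s)
{
  assert (s != NULL);
  s->refs++;
  return s;
}

/* Drop one reference; free the storage when the last holder lets go.
 * Returns the number of references remaining. */
size_t
shared_str_unref (struct shared_str * s)
{
  size_t remaining;

  assert (s != NULL && s->refs > 0);
  remaining = --s->refs;
  if (remaining == 0)
    {
      free (s->text);
      free (s);
    }
  return remaining;
}
```

With a scheme like this, a caller can no longer assume that an entry owns a private copy of its string, so modifying text in place would affect every entry that shares it. That is presumably the kind of assumption the caller cleanup had to remove.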
I also started on librifying the my-id command. That involved working on revised support for option parsing, on the API for functions implementing tla sub-commands, and work on writing librified versions of the low-level functions for manipulating a user id.
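The front-end behaviour described above (try the librified version of a subcommand first, fall back to libarch-compat otherwise) can be sketched as a two-table lookup. This is an illustration, not the contents of tla.c: the cmd_fn signature, the table names, and the stand-in command functions are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for a function implementing a sub-command. */
typedef int (*cmd_fn) (int argc, char * argv[]);

struct cmd_entry
{
  const char * name;
  cmd_fn fn;
};

/* Stand-ins for real commands; each just returns an exit status. */
int librified_my_id (int argc, char * argv[]) { (void) argc; (void) argv; return 0; }
int compat_my_id (int argc, char * argv[]) { (void) argc; (void) argv; return 0; }
int compat_inventory (int argc, char * argv[]) { (void) argc; (void) argv; return 0; }

/* Commands already rewritten to use only librified code. */
static const struct cmd_entry librified_cmds[] =
{
  { "my-id", librified_my_id },
  { NULL, NULL }
};

/* The old implementations, kept until each one is replaced. */
static const struct cmd_entry compat_cmds[] =
{
  { "my-id", compat_my_id },
  { "inventory", compat_inventory },
  { NULL, NULL }
};

/* Linear search of a NULL-terminated table. */
static cmd_fn
lookup (const struct cmd_entry * table, const char * name)
{
  size_t i;

  for (i = 0; table[i].name != NULL; ++i)
    if (strcmp (table[i].name, name) == 0)
      return table[i].fn;
  return NULL;
}

/* Prefer the librified version; fall back to the compat version.
 * Returns NULL for an unknown command. *is_librified reports which
 * table supplied the result (1 = librified, 0 = compat or unknown). */
cmd_fn
find_command (const char * name, int * is_librified)
{
  cmd_fn fn;

  assert (name != NULL && is_librified != NULL);
  fn = lookup (librified_cmds, name);
  *is_librified = (fn != NULL);
  if (fn == NULL)
    fn = lookup (compat_cmds, name);
  return fn;
}
```

In this arrangement, librifying a command amounts to adding one entry to the first table. The compat entry can stay in place until the new version is trusted, which fits the requirement that tla remain fully operable throughout.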
This work went well in a few senses. I was able to cut-paste-edit a certain amount of code from both tla 1.3 and tla 2.0 to write what I needed in this context. A great deal of the new code I simply rewrote from scratch: this was code that is a minor variation on code that I've rewritten from scratch 3 or 4 times over the past few months. The resulting code seems to work well, although testing has been scattershot. I'm satisfied with the emerging calling conventions.
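The calling conventions themselves are not spelled out in this report. As a rough illustration of the general idea (signal errors to the caller rather than exiting, and release resources on every path), here is one common C pattern. Everything in it is an assumption made for the example: struct arch_error, the error codes, copy_user_id, and the placeholder validity check are hypothetical and are not the libarch API or the real my-id rules.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error record filled in by a failing function. */
struct arch_error
{
  int code;            /* ERR_NONE means no error */
  char message[128];   /* human-readable description */
};

enum { ERR_NONE = 0, ERR_NO_MEMORY = 1, ERR_BAD_ID = 2 };

/* Record an error for the caller; always returns -1 for convenience. */
static int
set_error (struct arch_error * err, int code, const char * message)
{
  if (err)
    {
      err->code = code;
      snprintf (err->message, sizeof err->message, "%s", message);
    }
  return -1;
}

/* Copy a user id into newly allocated storage after a placeholder
 * validity check (it must contain '<' and '>').
 * Returns 0 on success and -1 on error; it never calls exit. */
int
copy_user_id (char ** id_out, const char * id, struct arch_error * err)
{
  char * copy = NULL;
  int status = -1;

  assert (id_out != NULL && id != NULL);
  *id_out = NULL;

  copy = malloc (strlen (id) + 1);
  if (!copy)
    {
      set_error (err, ERR_NO_MEMORY, "out of memory");
      goto done;
    }
  strcpy (copy, id);

  if (!strchr (copy, '<') || !strchr (copy, '>'))
    {
      set_error (err, ERR_BAD_ID, "user id lacks an <address> part");
      goto done;
    }

  *id_out = copy;
  copy = NULL;          /* ownership moves to the caller */
  status = 0;

 done:
  free (copy);          /* no-op on success; frees the copy on error paths */
  return status;
}
```

The single exit label keeps cleanup in one place, so adding another failure case later does not add another chance to leak. The caller decides what to do with the error, which is what a GUI or a scripting-language binding would need.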
Next Week and Possibly Beyond
In week two I have a little more work to do on the foundation: string primitive operations; better option parsing; the beginnings of a more portable file system protocol stack.
Beyond that I want to librify as much as I can in the remaining time.
I'll consider the experiment to have produced a distinctly positive result (meaning that this approach to librification is worth pursuing) if I can get through librifying the file-id command and some commands that pertain to per-user (~/.arch-params) parameters. Such an outcome implies an efficient framework for reimplementing CLI parsers, progress on librifying namespace management, project tree file system access, project tree arch control file access, and ~/.arch-params access.
A positive outcome will warrant a follow-on series of three "wind sprints": one each to librify inventory, mkpatch, and dopatch. Past experience has shown that, once those commands are in place, implementing (in this case, librifying) the rest of tla is a relative cake walk.
Librification Experiment Constraints
This experiment asks how long it will take to make a "clean up pass" over libarch such that, at the end of the process, the constraints described below are satisfied throughout the implementation of tla.
Upward compatibility -- for several roughly one week intervals it is anticipated that only part of libarch will be librified. Nevertheless, tla must be fully operable at those intervals, passing both make test and changeset burn-in tests. The intent is that it should be possible (and ideally useful) to merge partially-complete librification work into the mainline early and often.
Perfect Error Handling -- Librified parts of libarch must have perfected error handling. That means that they do not exit the process except under truly uncontinuable conditions -- most errors are signalled to callers. Resource allocation and deallocation must be robustly handled across all paths, including error-signaling paths through the code.
Abstract String Handling -- No part of the libarch code should make presumptions about the internal representation of strings. Strings should be manipulated purely via procedural interfaces based on an ontology of code-point-index-addressable sequences of unicode codepoints. Where specific codepoint values must be presumed, only graphical and space ASCII characters should be referred to.
Reinforced On-disk Representation Abstractions -- libarch has long internally had a rough layering of its filesystem access. The vu layer, from hackerlab, provides a low-level indirection above Posix system calls; for each of project-trees, ~/.arch-params directories, and file-system archives arch includes a roughly procedural interface. Within those three primary disk formats are ad-hoc formats for specific subcomponents (e.g., for files in ~/.arch-params or for patch logs in ./{arch}). Two of these subsystems (project tree and archive formats) have proven to need major restructuring for a clean port to Windows-based platforms. Throughout the code, abstraction barriers are unevenly preserved with leaks across them exposing details of path names, descriptors, and so forth. A librified libarch needs to clarify the layering in these components and ensure that the API to them is sufficiently abstract that changes to them (such as for a Windows port) can be made easily.
Customizability, Extensibility, and Self-Documentation -- Third party developers have made very clear the demand for robust scripting language bindings to libarch. Work on arch GUIs, IDE bindings, and alternative front-ends suggests a similar demand. Some desirable capabilities in the core of arch, such as file-type-specific diff computation and patch application, suggest a demand for an arch which is not merely scriptable (callable as primitive routines from a scripting language) but extensible (can be configured to call out to extension language routines during core operations). The APIs, data types, error handling conventions, and available documentation used in a librified libarch must be scripting and extension language friendly.
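To make the Abstract String Handling constraint above more concrete, here is a minimal sketch of a procedural string interface addressed by code-point index. The ustr type and its functions are hypothetical names invented for this example; they are not part of libarch or hackerlab. The constructor accepts only ASCII to keep the sketch short, and a real implementation would be free to choose any internal encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Callers see only this opaque type and the functions below.
 * In a real library the struct body would live in the .c file. */
typedef struct ustr ustr;

struct ustr
{
  size_t length;       /* number of code points */
  uint32_t * points;   /* one array element per code point */
};

/* Build a string from ASCII text.
 * Returns NULL on allocation failure or on a non-ASCII byte. */
ustr *
ustr_from_ascii (const char * ascii)
{
  size_t n, i;
  ustr * s;

  assert (ascii != NULL);
  n = strlen (ascii);
  s = malloc (sizeof *s);
  if (!s)
    return NULL;
  s->points = malloc ((n ? n : 1) * sizeof *s->points);
  if (!s->points)
    {
      free (s);
      return NULL;
    }
  for (i = 0; i < n; ++i)
    {
      unsigned char c = (unsigned char) ascii[i];
      if (c > 127)
        {
          free (s->points);
          free (s);
          return NULL;
        }
      s->points[i] = c;
    }
  s->length = n;
  return s;
}

/* Number of code points in the string. */
size_t
ustr_length (const ustr * s)
{
  assert (s != NULL);
  return s->length;
}

/* Fetch the code point at INDEX into *OUT.
 * Returns 0 on success, -1 if INDEX is out of range. */
int
ustr_code_point_at (const ustr * s, size_t index, uint32_t * out)
{
  assert (s != NULL && out != NULL);
  if (index >= s->length)
    return -1;
  *out = s->points[index];
  return 0;
}

/* Release a string; accepts NULL. */
void
ustr_free (ustr * s)
{
  if (s)
    {
      free (s->points);
      free (s);
    }
}
```

Because callers never touch the array directly, the representation could later change (for example to UTF-8 with an index) without touching any code that uses the interface.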
Copyright
Copyright (C) 2004 Tom Lord
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
See the file COPYING for further information about the copyright and warranty status of this work.