Wednesday, December 23, 2009

Strict reversibility in XSugar

XSugar is a tool for bidirectional transformations between two file formats, one XML and one non-XML. This is particularly useful for providing a common API to configuration files under Linux. For example, here is the result of applying a stylesheet to the /etc/hosts file:

<hosts xmlns="http://usherbrooke.ca/">
<record>
<ipaddr>127.0.0.1</ipaddr>
<canonical>localhost</canonical>
</record>
</hosts>

This file can be converted back to its flat format. But, as you may notice, the spacing does not appear in the XML file, so it is lost: on the way back, it is reset to a default value. The round trip between the hosts file and the XML format preserves the semantics but loses the formatting. Even if nothing was modified, diff will show changes once the file is written back. Once the spacing has been reset, further round trips are the identity function, i.e. the strings will be exactly the same.
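
To make the loss concrete, here is a minimal Python sketch of that round trip. It does not use XSugar: the functions hosts_to_xml and xml_to_hosts are hypothetical stand-ins for the two directions of the stylesheet, the namespace is left out for brevity, and only the first two fields of each line are kept.

```python
import xml.etree.ElementTree as ET


def hosts_to_xml(text):
    """Parse 'ipaddr canonical' lines into XML; the spacing is discarded."""
    root = ET.Element("hosts")
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # this sketch ignores blank or malformed lines
        record = ET.SubElement(root, "record")
        ET.SubElement(record, "ipaddr").text = fields[0]
        ET.SubElement(record, "canonical").text = fields[1]
    return root


def xml_to_hosts(root, sep=" "):
    """Write the records back using one default separator."""
    lines = [
        record.findtext("ipaddr") + sep + record.findtext("canonical")
        for record in root.findall("record")
    ]
    return "\n".join(lines) + "\n"


original = "127.0.0.1      localhost\n"
once = xml_to_hosts(hosts_to_xml(original))
twice = xml_to_hosts(hosts_to_xml(once))

print(once == original)  # False: the spacing was reset
print(once == twice)     # True: later round trips are the identity
```

The first round trip changes the file even though its meaning is intact, which is exactly what diff would report.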

One way to overcome this problem is to add to the XML all the elements that would otherwise be lost. This can be done by labeling terminal elements and adding the corresponding nodes to the XML part of the stylesheet. For example, this rule loses the optional "a" header:

A = [a]*
X = [x]+
n : [A] [X x] "z" = <x> [X x] </x>

Providing the input "aaaaxxz" gives the following XML:
<x>xx</x>
Converting it back to non-XML yields the string "xxz". Since the empty string matches "[a]*", it is the default string returned for the unlabeled terminal.

Now, let's label the terminal "A":

A = [a]*
X = [x]+
n : [A a] [X x] "z" = <x> [X x] <a> [A a] </a> </x>

Now we get the XML
<x>
  xx
  <a>aaaa</a>
</x>

and converting it back to the non-XML format yields "aaaaxxz", exactly the same string as the original input.
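
The difference between the two rules can be imitated in a few lines of Python. This is only a sketch of the behavior, not of how XSugar works internally: the regular expression stands in for the terminals A and X, and the keep_a flag (a name invented here) switches between the unlabeled and the labeled version of the rule.

```python
import re
import xml.etree.ElementTree as ET

# (a*) plays the role of terminal A, (x+) the role of terminal X.
PATTERN = re.compile(r"(a*)(x+)z")


def to_xml(text, keep_a):
    """Non-XML to XML; the 'a' header is kept only when it is labeled."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("input does not match the rule")
    root = ET.Element("x")
    root.text = match.group(2)
    if keep_a:
        ET.SubElement(root, "a").text = match.group(1)
    return root


def from_xml(root):
    """XML to non-XML; a missing <a> falls back to the empty string."""
    header = root.findtext("a", default="")
    return header + (root.text or "") + "z"


print(from_xml(to_xml("aaaaxxz", keep_a=False)))  # xxz
print(from_xml(to_xml("aaaaxxz", keep_a=True)))   # aaaaxxz
```

With the label, everything that varies in the input has a place in the XML, so nothing has to be guessed on the way back.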

Preserving the semantics of the file is the simple bidirectionality property. If, in addition, the stylesheet preserves the concrete representation of an input, I call this strict bidirectionality.

Strict bidirectionality can be achieved by labeling every unlabeled terminal and adding the corresponding element to the XML part. I wrote a small prototype of this algorithm that augments a stylesheet in this way. Hence, any stylesheet can be made strictly bidirectional.
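
The prototype itself is not shown here, but the idea can be sketched in Python on a toy model of a rule. Everything below is an assumption made for illustration: a rule's non-XML side is a list of literal strings and Terminal objects, the XML side is reduced to the list of labels it uses, and fresh labels are derived from the terminal name.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Terminal:
    name: str                    # regex terminal, e.g. "A"
    label: Optional[str] = None  # None: the matched text is discarded


def augment(lhs, xml_labels):
    """Label every unlabeled terminal and add its label to the XML side."""
    used = set(xml_labels)
    used |= {t.label for t in lhs if isinstance(t, Terminal) and t.label}
    new_lhs, added = [], []
    for item in lhs:
        if isinstance(item, Terminal) and item.label is None:
            base = item.name.lower()
            label, n = base, 1
            while label in used:  # avoid clashing with existing labels
                n += 1
                label = f"{base}{n}"
            used.add(label)
            added.append(label)
            item = replace(item, label=label)
        new_lhs.append(item)
    return new_lhs, list(xml_labels) + added


# The lossy rule from above: [A] [X x] "z" = <x> [X x] </x>
lhs, labels = augment([Terminal("A"), Terminal("X", "x"), "z"], ["x"])
print(lhs[0])  # Terminal(name='A', label='a')
print(labels)  # ['x', 'a']
```

A real implementation would also have to emit the new XML elements in the stylesheet text; the sketch only tracks which labels must appear there.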

This raises the question: can we statically verify that a stylesheet is strictly bidirectional? Fortunately yes, and it is really simple. We do the basic check that the stylesheet is bidirectional, and then verify that all regular-expression terminals are labeled. This way, we are sure that every variable part of the concrete string is represented in the XML.
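
The second half of that check is easy to sketch in Python. The representation is a toy one chosen for this example (a literal is a plain string, a regex terminal is a (name, label) pair), and the basic bidirectionality check is assumed to have been done separately.

```python
def unlabeled_terminals(lhs, xml_labels):
    """Return the terminals whose matched text would not reach the XML side."""
    missing = []
    for item in lhs:
        if isinstance(item, str):
            continue  # literals are constant, there is nothing to preserve
        name, label = item
        if label is None or label not in xml_labels:
            missing.append(name)
    return missing


lossy = [("A", None), ("X", "x"), "z"]
strict = [("A", "a"), ("X", "x"), "z"]

print(unlabeled_terminals(lossy, {"x"}))        # ['A']
print(unlabeled_terminals(strict, {"x", "a"}))  # []
```

An empty result means the rule passes; a stylesheet would pass when all of its rules do.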

Automatic strict bidirectionality for stylesheets, and static validation of this property, will be useful to provide the behavior a system administrator would expect from a tool that modifies configuration files under Linux. Let's go on!

Friday, December 18, 2009

Bcfg2 vs. Puppet

Bcfg2 and Puppet are two tools for system configuration management, where we want to install configuration files and packages and adjust permissions. Put like this it sounds simple, but it is far from trivial when the target machines run a variety of operating systems and versions.

Bcfg2 and Puppet share the same goal, but they are completely different in their concepts, that is, in how the desired state is modeled. The fundamental concept matters more here than the actual implementation.

Puppet is like a programming language for system administration. It is yet another meta-language, with its own syntax, for stating which files, packages, users and so on should be present on a system. Its concepts are very close to those of Cfengine. I used Cfengine a lot, and even developed a small utility (svnengine) to automate the management of files under Subversion. But using this tool led me to one simple conclusion: what a mess for so little! I ended up doing everything in scripts, which are copied by Cfengine and run from cron. Big deal.

Bcfg2 is different because, unlike with Puppet, you don't program configuration elements; they are declared in an XML document. The client, which is specific to a target, encapsulates the actions on that target, their sequencing, error handling, etc. This enables advanced behavior, because the model can be manipulated programmatically, which is not the case for Puppet. If you want to generate a Puppet manifest, you have to output strings, which is a medieval practice!

The Bcfg2 implementation itself has some glitches that must be fixed, but the concept behind it is a building block for semantic system configuration management, which represents the future. Bcfg2 belongs to the scientific field, while I think Puppet has the exposure it has now because of a marketing wave.

But, as history reminds us, it is not always the best product or software that wins...