UNIX Password, Roles & Node Management: R API


The R API is an application interface into the active and inactive inventory underneath Password Management. It uses HTTP as its transport and CGI for argument passing. It is, in essence, a CGI acting as an abstraction layer.

R is designed with a REST-style model in mind. Arguments are obtained by examining the URL accessed; that URL will always be canonical. For example, http://servername/R/node/ will always be a valid location, and should always return the same information. (REST defines much more than that; see the See Also section for more details on REST.)

R can return data in a variety of formats: well-formed (but not truly valid) XML, plain text, or HTML. It can even return data as the output of a Data::Dumper call to Dumper(), which makes it very easy to re-eval the structure back into your own namespace (if your client is perl5, that is).

Basic Usage

R works within the CGI framework underneath an HTTP server. It takes the URL accessed as arguments, and invokes an action handler based on the URL to process the request. The handler returns its output to R, which then invokes an output format handler to correctly format the data and return it to the requestor.

     requestor       ----------           -----------
          |          |        |---------->| action  |------>{ does stuff }
       ^  +--------->|   R    |           | handler |             |
       |             |        |           ----------- <-----------+
       |             |        |                |
       |             ---------- <--------------+ 
       |                    |
       |                    |             -----------
       |                    +------------>| output  |
       |                                  | format  |
       |                                  | handler |
       |                                  -----------
       |                                      |
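
The flow in the diagram can be sketched in a few lines of perl5. This is only an illustration, not R's actual code: the two handler tables and the dispatch() routine are hypothetical, and the real R loads its handlers from disk and lets the output format handler write to the client rather than return a string.

```perl
use strict;
use warnings;

# Hypothetical in-memory handler tables; the real R loads handlers from disk.
my %action_handlers = (
    echo => sub {
        my (%arg) = @_;
        return { status => 1, result => [ { '00NODENAME' => $arg{params} } ] };
    },
);

my %output_handlers = (
    'text/plain' => sub {
        my (%arg) = @_;
        return join '', map { "nodename: $_->{'00NODENAME'}\n" } @{ $arg{result} || [] };
    },
);

# Run the action handler first, then hand its result to the output format handler.
sub dispatch {
    my ( $action, $params, $format ) = @_;
    my $act = $action_handlers{$action} or die "no such action '$action'\n";
    my $out = $output_handlers{ $format || 'text/plain' } || $output_handlers{'text/plain'};

    my $response = $act->( action => $action, params => $params );
    die( ( $response->{errmsg} || "Unknown Error from '$action'" ) . "\n" )
        unless $response->{status};

    return $out->( action => $action, result => $response->{result} );
}

print dispatch( 'echo', 'yoink', 'text/plain' );    # prints "nodename: yoink"
```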

The URL is key for selecting the correct action handler. R's syntax looks like this:

  URI_PREFIX/action/arguments;options



URI_PREFIX
This is the beginning part of the URL, from scheme:// to R. It's set by the pwman.r.url property. For example, if you have an R instance on, and your webserver for this instance is running on port 80, then your URI_PREFIX is

action
The action is what you want done. It corresponds directly to the actions found under the directory specified by pwman.r.dir.actions. Each action is a self-contained perl5 coderef, like a plugin, that can take a series of arguments and return a result. For example:
  $ GET -uUsSe
  nodename: testing%201%202%203

See also the section below on Action Handlers.

arguments
Most actions require arguments. An argument is everything after the last slash / and up to the options. Arguments may contain any character except the semi-colon ;. The syntax and content of an argument depend solely on the action being called, and will likely vary from action to action. For example, the node/ action's only argument is a nodename (e.g., URI_PREFIX/node/NODENAME), whereas the tags/info/ action's argument is a valid +TAG (e.g., URI_PREFIX/tags/info/CONSOLE).

You can get the syntax of arguments for an action handler with the capability handler:

  $ GET
  nodename: tags/info
  author: $Author: jgilbertsjc $
  description: given a tag name, return all known data about that tag
  syntax: 'URI_PREFIX/tags/info/TAGNAME', where TAGNAME is a supported tag name.
  version: $Id: pwman_docs_rapi.pod,v 1.2 2005/10/20 08:46:34 jgilbertsjc Exp $

options
Options are, well, optional, and can further modify an action handler's behavior. You can also use options to change how the output is returned, by setting the output_format. Options take the form name=value, and are separated by a semi-colon ;, much as a CGI query-string is formatted (you can also use the ampersand & if you're old-school), as in:

In the above example, the options passed are taglist, whose value is CONSOLE:LOCATION:TAGLIST, and output_format, whose value is text/plain.

All options are accepted, but it's up to the action to act on them. There is one option that is always acted upon: output_format. This option defines how response data is formatted before it is returned. If an output_format option is not specified (or if the requested format doesn't exist), R will default to text/plain.
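
To make the URL layout concrete, here is a small sketch of how the part of a URL after URI_PREFIX/ could be split into action, argument, and options. It follows the rules described above, but parse_r_path() is a hypothetical helper written for this example; it is not R's own parser.

```perl
use strict;
use warnings;

# Split the part of the URL that follows "URI_PREFIX/" into its pieces.
sub parse_r_path {
    my ($path) = @_;

    # Options start at the first semi-colon; arguments cannot contain one.
    my ( $target, $optstr ) = split /;/, $path, 2;
    $target = '' unless defined $target;

    # The argument is whatever follows the last slash; the rest is the action.
    my ( $action, $argument ) =
        $target =~ m{^(.*)/([^/]*)$} ? ( $1, $2 ) : ( $target, '' );

    my %options;
    for my $pair ( split /[;&]/, defined $optstr ? $optstr : '' ) {
        my ( $name, $value ) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $options{$name} = defined $value ? $value : '';
    }

    # R falls back to text/plain when no output_format is given.
    $options{output_format} = 'text/plain'
        unless defined $options{output_format};

    return { action => $action, argument => $argument, options => \%options };
}

my $req = parse_r_path(
    'tags/info/CONSOLE;taglist=CONSOLE:LOCATION:TAGLIST;output_format=text/xml');
print "$req->{action} $req->{argument} $req->{options}{output_format}\n";
```

Splitting off the options before looking for the last slash matters, because option values such as text/xml contain slashes of their own.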

See also the section below on Output Format Handlers.

R works with requests submitted via either GET or POST. However, data encoded via the POST method is placed into the option listing, so you still need to specify a valid URL to the action, with arguments. This, for example, won't work:

  $ POST
  Please enter content (application/x-www-form-urlencoded) to be POSTed:
  nodename: actions
  capability: Built-in handler to list capabilites for HANDLER.
  dns: current DNS information by node

But this will:

  $ POST
  Please enter content (application/x-www-form-urlencoded) to be POSTed:
   <node name="tags/info">
     <author>$Author: jgilbertsjc $</author>
     <description>given a tag name, return all known data about that tag</description>
     <syntax>'URI_PREFIX/tags/TAGNAME', where TAGNAME is a supported tag name.</syntax>
     <version>$Id: pwman_docs_rapi.pod,v 1.2 2005/10/20 08:46:34 jgilbertsjc Exp $</version>

Any option that is supported by an action handler can be transferred via POSTed data.
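
As a rough illustration of POSTed data landing in the option listing, the sketch below decodes an application/x-www-form-urlencoded body and merges it with options taken from the URL. Both helper names are made up for this example, and letting POSTed values override URL options is an assumption; the source does not say which one wins.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded component.
sub form_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Merge POSTed form fields into the options taken from the URL.
# Assumption: a POSTed field replaces a URL option of the same name.
sub merge_post_options {
    my ( $url_options, $post_body ) = @_;
    my %options = %{$url_options};
    for my $pair ( split /[&;]/, $post_body ) {
        my ( $name, $value ) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $options{ form_decode($name) } = form_decode( defined $value ? $value : '' );
    }
    return \%options;
}

my $merged = merge_post_options(
    { output_format => 'text/plain' },
    'taglist=CONSOLE%3ALOCATION&output_format=text%2Fxml',
);
print "$_=$merged->{$_}\n" for sort keys %{$merged};
```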

Quick Setup

R is ready for use as soon as Password Management is installed. However, you'll have to configure your webserver to start passing requests to it.

(Note: the only webserver configuration listed here is Apache's httpd, simply because it's the one R was written with in mind. Because R is pure CGI, there's nothing preventing it from being used by other webservers; they're not listed because Apache is pretty ubiquitous, and the author didn't have access to N webservers to test on.)

Sample httpd.conf configuration

The following is the bare bones of what you need to get your R instance working underneath Apache's httpd. Insert this into the appropriate <VirtualHost> directive, after changing the paths to match wherever you installed your Password Management instance.

  RewriteEngine on
  RewriteRule ^/R/(.*)  /opt/password_management/R/R?/$1 [T=application/x-httpd-cgi]    
  <Directory "/opt/password_management/R">
    DirectoryIndex      R
    AllowOverride       None
    Options     ExecCGI
    Order       allow,deny
    Allow       from all
  </Directory>

You can, of course, add in Auth directives inside the <Directory> block. There are situations (like when using R's 'remote' system) when that's a whole-heartedly good idea.

Once you've got this working, you should change the property value of pwman.r.url to the correct value.

Using R via the Covad::Pwman::R_Client module

Any application or programming framework that can make HTTP calls and interpret the result is capable of using R. To simplify some of the work involved in Perl, Password Management comes with a separate class, called Covad::Pwman::R_Client. This object class allows you to quickly write scripts that talk to a given R instance, a la:

  my $R = Covad::Pwman::R_Client->new( "" );
  my $node_information = $R->get_r( "node", "" );
  ## $node_information now looks like:
  ##   [
  ##     {
  ##       '00NODENAME'  => '',
  ##       'CONSOLE'     => '',
  ##       'DESCRIPTION' => 'first in a series of nodes for',
  ##     },
  ##   ]
  my $node_console_info = $R->get_node_tag_entry( "", "CONSOLE" );
  ## $node_console_info now holds the value of the CONSOLE tag: ""
  my($invalid_response, $error_message) = $R->get_r("unsupported/handler", "unknown.node");
  unless ($invalid_response) {
    die $error_message;
  }

You should check the inline POD documentation for Covad::Pwman::R_Client for more information.
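
If you can't use Covad::Pwman::R_Client, building request URLs by hand is not hard. The sketch below is not part of that module; build_r_url() is a hypothetical helper, and the servername host is simply the placeholder used earlier in this document.

```perl
use strict;
use warnings;

# Build a request URL from a prefix, an action, an argument and options.
sub build_r_url {
    my ( $prefix, $action, $argument, %options ) = @_;
    $prefix =~ s{/+$}{};    # tolerate a trailing slash on the prefix
    my $url = join '/', $prefix, $action, defined $argument ? $argument : '';
    for my $name ( sort keys %options ) {
        $url .= ";$name=$options{$name}";    # options are semi-colon separated
    }
    return $url;
}

my $url = build_r_url( 'http://servername/R', 'tags/info', 'CONSOLE',
    output_format => 'text/xml' );
print "$url\n";    # http://servername/R/tags/info/CONSOLE;output_format=text/xml
```

Any HTTP client can then fetch the resulting URL; the core HTTP::Tiny module, for instance, returns a hashref with success, status, and content keys from its get() method.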

R Remote

R comes with a series of handlers under the remote directory. These handlers permit R instances to act as satellite Password Management instances, taking new work and Node Control File (NCF) data from an authoritative ``build'' instance and transferring to nodes from themselves. Consider the following layout:

   (( authoritative user database ))           ----------
          |                                --> | node01 |   
          |  --------------               /    ----------      ----------
          +->| primary    |----- x ------/                /--> | node02 |  
             | Password   |----- f -----------------------     ----------
             | Management |----- e -----------------------     ----------     
             | instance   |----- r -----\                 \--> | node03 |     
             --------------              \     ----------      ----------
                 |                        -->  | node04 |
                 |                             ----------
              disparate, remote 
                 |      --------------               ----------
                 +----> | remote     |---- xfer ---> | node05 |
                        | Password   |               ----------
                        | Management |
                        | instance   |

In this scenario, a ``primary'' Password Management instance generates all required work files (/etc/passwd, /etc/shadow, /etc/sudoers, and the digest) for all five nodes. It only has the ability to transfer directly to nodes 01 through 04, however. node05 is pushed to by the ``remote'' Password Management instance. There may be several reasons for this:

ACL restrictions
node05 may live on a subnet that is not accessible via SSH on port 22 from the ``primary'', but does accept direct or proxied SSL traffic. A remote datacenter with no permanent, dedicated connection to the primary, for example.

Division of access
node05 may live in a special subnet that is required to have limited SSH access. Placing all your eggs in a single basket may not be permitted in your organization. By delegating responsibility, you can change SSH credentials on separate schedules. A decent example of this might be a DMZ subnet.

Division of labor
You may need to do this in order to keep up with frequent updates if you're running an exceptionally large site; it's just something to help you keep up with the scaling of your infrastructure. (I don't presume to define ``exceptionally large'', because it's a pretty subjective condition that depends on the OS and on the plugins needed to make a push successful. I will state, however, that I have yet to see an installation require the use of remote instances based on scaling.)

Whatever the reason, the primary is unable to communicate directly with node05. The remote Password Management instance can step in, receive the work files, and push them. The primary knows which R instance to talk to, because node05's NCF has it recorded in the +REMOTE tag. This tag contains a URL, as a scalar, of the remote R instance that should receive the work files. When the transfer script parses the NCF for node05, it detects the +REMOTE tag, and attempts to push the work files there via R.

To accomplish all this, the R remote handlers exist. They manage the acceptance and saving of the generated /etc/passwd, /etc/shadow, and /etc/sudoers files, as well as a modified NCF, and ensure that these files are up to date. The transfer script can then be automated at a fairly high frequency (once every 10 or 30 minutes); it will detect new work files that need to be transferred, and transfer them (while executing all the necessary plugins for transfer, and logging accordingly) just as if the remote instance were the primary. The handlers are:

Given a nodename, accepts new work and NCF files as POSTed data, and saves them to the locations specified by the remote Password Management configuration.

Returns a unique MD5 checksum that identifies this instance of Password Management. The ident checksum is used to ensure that the instance to which we are transferring work files is indeed the instance that should push for a given node. We can't rely on source or destination IP address, because there may be a NAT system in place in between the two instances. We also can't rely on the hostname in the R URI_PREFIX, because there may be an HTTP proxy in between the instances.

This checksum is then inserted into the NCF for the node being transferred, into the +REMOTE tag, where it replaces the R URL that was previously there. This signals to the generation script on the remote Password Management instance not to generate work files for this node, and further assures the transfer script that yes, this remote instance is supposed to push this node.
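
The ident comparison might look something like the sketch below. How R really derives its checksum is not documented here, so instance_ident() and its seed string are pure assumptions, as is the remote_should_push() helper; only the idea of comparing the +REMOTE value against the local ident comes from the text above.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical: derive an ident from some string unique to this install.
sub instance_ident {
    my ($seed) = @_;
    return md5_hex($seed);
}

# On a remote instance, push a node only when its +REMOTE value is our ident.
sub remote_should_push {
    my ( $remote_tag, $my_ident ) = @_;
    return 0 unless defined $remote_tag;
    return $remote_tag eq $my_ident ? 1 : 0;
}

my $ident = instance_ident('/opt/password_management on example-remote');
print remote_should_push( $ident, $ident ) ? "push\n" : "skip\n";
print remote_should_push( 'http://servername/R', $ident ) ? "push\n" : "skip\n";
```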

Given a nodename, returns file information and MD5 checksums of the current work files, as well as the NCF and the per-node log.
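
For a feel of what such a status report involves, here is a minimal sketch that gathers size, modification time, and MD5 checksum for a list of files using the core Digest::MD5 module. The file_status() routine and the keys it returns are invented for this example and are not the handler's real output format.

```perl
use strict;
use warnings;
use Digest::MD5;

# Return size, mtime and MD5 checksum for each readable file in the list.
sub file_status {
    my (@paths) = @_;
    my @report;
    for my $path (@paths) {
        open my $fh, '<', $path or next;    # skip files we cannot read
        binmode $fh;
        my @st = stat $fh;
        push @report, {
            file  => $path,
            size  => $st[7],
            mtime => $st[9],
            md5   => Digest::MD5->new->addfile($fh)->hexdigest,
        };
        close $fh;
    }
    return \@report;
}

for my $entry ( @{ file_status($0) } ) {
    print "$entry->{file}: $entry->{size} bytes, md5 $entry->{md5}\n";
}
```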

Given a nodename, returns current log data for that node. The logs are pulled directly from pwman.log_dir.

Given a nodename, removes all work and NCF data for that node. This is useful when you decommission a node, or want to ensure that all data for that node has been cleared. Note that the logs are not removed with remote/zap; you must clear those by hand.

Because the files are transferred via HTTP, any number of correctly configured HTTP proxies can sit in between the primary and the remote instances. You can also utilize SSL to encrypt the transfer (a very good idea). Additionally, you can use whatever authentication system is necessary to ensure that only authenticated instances can transfer new work files. (More work on this particular point is needed, as the current implementations of the transfer script and the Covad::Pwman::R_Client module only understand basic or digest authentication; eventually, authentication by remote could be as robust as client-side certificates or similar.)

Setting up a remote Password Management instance

Setting up a remote Password Management instance is identical to setting up a primary Password Management instance. However, you need to take particular care when configuring the webserver that services the R calls. Specifically:

Initial configuration
Because a remote Password Management instance is given all the configuration needed for transfer, you don't have to build environments or roles or even install NCFs onto the remote instance. You'll still have to generate NCFs for the nodes under this instance's control, but those files live on the primary instance.

Webserver configuration: Ownership and permissions
Whatever user the webserver runs as will need write access to the pwman.active_inventory and pwman.work_dir directories. You can opt to chown these directories to the webserver user (usually nobody) or to run the webserver as a separate user.

Webserver configuration: Access control
Add as many access restrictions as you can for the R remote instances. Authentication by username/password is a good start. If you can, consider limiting access to the R location by IP address (or at least, IP subnet).

Webserver configuration: SSL encryption
If you're not encrypting R data, then you really should think about doing so. Remember that HTTP doesn't encrypt data, it just encodes it; decoding it is a trivial operation. If your session is vulnerable and an R remote transfer is sniffed, then the contents of the /etc/passwd and /etc/shadow files would be exposed.

Setting up a node for remote push

Once you've got a functioning R remote instance, you'll need to configure the nodes' NCFs to point to that instance. Start with a basic NCF, and add the R URL into the +REMOTE tag, like so:


Note the lack of a trailing slash.

And that's it, really. The generation script will generate the node as normal; the transfer script will attempt to use the remote R when pushing.

Available configuration properties

The following configuration properties are used by R. You can set them via etc/ They're all prefixed with pwman.r..
The name of the API. There is no default for this. It's preset to R.

pwman.r.url
The absolute URL for the API. (I'm not sure that this is used anywhere in the API itself, but we may have other applications or scripts that need to know where to look, and so we set it as a globally accessible property.) There is no default for this property. It's preset to (the node on which R was originally developed).

The directory where the require file can be found. This is a deprecated option, as R now uses Covad::Pwman::Tags to parse tag information. There is no default for this property. It is preset to /opt/rcs/passwd_new/lib.

pwman.r.dir.actions
The location of the 'actions' handlers. Defaults to ``{location of R}/handlers/actions''. It is preset to %%pwman.base_dir%%/R/handlers/actions.

pwman.r.dir.output_format
The location of the 'output_format' handlers. Defaults to ``{location of R}/handlers/output_format''. It is preset to %%pwman.base_dir%%/R/handlers/output_format.

The name of the indexable element in datastructures as they're passed around from handler to handler. There are problems with this, so expect it to change at some point in the future. Defaults to '00NODENAME'. It is preset to 00NODENAME.

Require https for R 'remote' calls. If this is set to 'true', then R calls to remote/* will be refused (at the R level, before the handler is dispatched but after HTTP data is accepted) unless the HTTP instance is running under https (i.e., encrypted with SSL). Defaults to unset.

Require user authentication for R 'remote' calls. If this is set to a non-null value, then the value will be compared against the environment variable REMOTE_USER as set by the httpd server. If the two values don't match, then R calls to remote/* will be refused (at the R level, before the handler is dispatched but after HTTP data is accepted). Defaults to unset.
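
The two 'remote' restrictions can be pictured as a small gate that runs before a handler is dispatched. This is a sketch under assumptions: the property keys require_https and require_user are placeholders (the real property names are not given here), and it assumes the webserver signals SSL through an HTTPS environment variable set to 'on', as Apache's mod_ssl does.

```perl
use strict;
use warnings;

# Refuse remote/* calls unless the configured requirements are met.
# 'require_https' and 'require_user' are hypothetical property keys.
sub remote_call_allowed {
    my ( $action, $props, $env ) = @_;
    return 1 unless $action =~ m{^remote/};    # only remote/* is restricted

    if ( ( $props->{require_https} || '' ) eq 'true' ) {
        return 0 unless lc( $env->{HTTPS} || '' ) eq 'on';
    }

    my $user = $props->{require_user};
    if ( defined $user && length $user ) {
        return 0 unless ( $env->{REMOTE_USER} || '' ) eq $user;
    }
    return 1;
}

my %props = ( require_https => 'true', require_user => 'pwman' );
print remote_call_allowed( 'remote/zap', \%props, \%ENV ) ? "allowed\n" : "refused\n";
```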


Handlers are what make R functional. They exist as little perl5 coderefs that are require'd into place at runtime. There are two types of handlers: action and output_format. Both handler types are called with similar arguments; all handlers of the same type are invoked with the same arguments.

Action Handlers

Action handlers live underneath the pwman.r.dir.actions directory. Their job is to collect or process information based on the arguments passed to them, then return the results back to R. Action handlers should not communicate directly with the client; they only work within the framework of R itself.

The simplest action is echo, which does exactly what it sounds like: returns whatever you pass it.

  $ GET -uUsSe
  nodename: testing%201%202%203

Or, if you prefer a little more verbosity with your examples:

  $ GET -uUsSe
  User-Agent: lwp-request/2.06  
  GET --> 200 OK
  Connection: close
  Date: Sun, 16 Oct 2005 22:15:46 GMT
  Content-Type: text/plain; charset=ISO-8859-1
  Client-Date: Sun, 16 Oct 2005 22:15:46 GMT
  Client-Response-Num: 1
  Client-Transfer-Encoding: chunked
  nodename: testing%201%202%203

The echo action itself is little more than this:

  $handler = sub {
        ## simple echo test.
        my(%arg) = @_;
        if ($arg{_info}) {
                ## describe ourselves when asked via the capability handler
                return ({
                        author => '$Author: jgilbertsjc $',
                        description => "returns the value of the passed parameter in STRING",
                        syntax => "'URI_PREFIX/echo/STRING'",
                        version => '$Id: pwman_docs_rapi.pod,v 1.2 2005/10/20 08:46:34 jgilbertsjc Exp $',
                });
        } ## endif
        my %er = (
                status => 1,
                result => [ { $NODE_HEADER => $arg{params}, } ],
        );
        return ( \%er );
  }; ## endsub

Actions can exist in subdirectories that inspire functional grouping. For example, the node action handlers are all prefixed with node/, and live in the directory pwman.r.dir.actions/node. If you examine it, you'll see invalid, revisions, list, search, and default. The default handler is what is invoked for just node/.

For example, URI_PREFIX/node/NODENAME invokes pwman.r.dir.actions/node/default with NODENAME as an argument, which returns basic inventory information about that node. Alternatively, URI_PREFIX/node/revisions/NODENAME invokes pwman.r.dir.actions/node/revisions with NODENAME as an argument. You cannot invoke handlers called default via URI_PREFIX/whatever/default.

All available actions are found in the pwman.r.dir.actions directory. To get this list remotely, you can use a special, built-in handler called capability. This handler is inline to R itself, and returns information on other handlers, including availability, syntax, and whatnot. Calling capability with no arguments returns a list of all available action and output_format handlers, as well as a quick description of each handler.

  $ GET
  nodename: actions
  capability: Built-in handler to list capabilites for HANDLER.
  dns: current DNS information by node
  echo: returns the value of the passed parameter in STRING
  environments: returns a list of currently valid ENV environments
  node: returns all supported tag information for NODENAME. If 'tags' contains a valid TAGLIST, only those tags will be returned.
  node/invalid: returns entries who's TAGLISTING tags do not pass syntax checks or count checks

Actions can be as simple or complex as you want them to be.

Output Format Handlers

Output format handlers live underneath the pwman.r.dir.output_format directory. Their job is to reformat data passed from R and then return the data directly back to the client. It's the output format handler's job to set the correct outgoing MIME type, to restructure data so that it makes sense in the chosen MIME type (but not to change it), and to ensure that all the data is passed back.

A list of all active output format handlers is available with the capability/ action:

  $ GET    
  nodename: output_formats
  perl/struct: Output format handler - returns a Data::Dumper listref in a MIME type of 'application/x-perl'
  text/html: Output format handler - returns in a MIME type of 'text/html'
  text/plain: Output format handler - returns in a MIME type of 'text/plain'
  text/xml: Output format handler - returns in a MIME type of 'text/xml'

You can use any available output format handler with any action. For example, here is the output of URI_PREFIX/echo/yoink passed to text/xml:

  $ GET ';output_format=text/xml'
   <node name="yoink">
   </node>

...and as passed to text/plain:

  $ GET ';output_format=text/plain'
  nodename: yoink

...and as passed to perl/struct:

  $ GET ';output_format=perl/struct'
  $VAR1 = [
            {
              '00NODENAME' => 'yoink'
            }
          ];

Only the format is different; the data will always be the same.
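
A stripped-down formatter in the spirit of text/plain could look like the sketch below. It is a guess at the shape, not the shipped handler: format_text_plain() is a hypothetical name, it returns the response as a string instead of printing it, and sorting the tags alphabetically is an assumption based on the sample output shown in this document.

```perl
use strict;
use warnings;

my $NODE_HEADER = '00NODENAME';

# Render a result listref as "key: value" lines, node header first.
sub format_text_plain {
    my ($result) = @_;
    my $body = "Content-Type: text/plain\r\n\r\n";    # header written by hand
    for my $entry ( @{ $result || [] } ) {
        $body .= "nodename: $entry->{$NODE_HEADER}\n";
        for my $tag ( sort grep { $_ ne $NODE_HEADER } keys %{$entry} ) {
            $body .= "$tag: $entry->{$tag}\n";
        }
    }
    return $body;
}

print format_text_plain( [ { $NODE_HEADER => 'yoink', CONSOLE => 'ttyS0' } ] );
```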

Note that results returned in text/xml will, while well-formed, never be considered valid, because there is no DTD or XML schema associated with them. (This may change someday, to give action handlers the ability to set an xmlns or some such nonsense.)

Adding new handlers

Adding new handlers is fairly straightforward. There are a few rules specific for action handlers, and slightly different ones for output format handlers, as well as rules that apply to both groups.

Rules for both action and output format handlers

Clean compile
Every handler needs to cleanly compile under perl5. If you can't run
  perl -c ./handler_name

then you shouldn't install it. Remember that this handler is require'd into the main R namespace, so ensure that you include a true value at the end (mostly this is accomplished with a '1;' at the end of the file).

Argument passing
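Handlers get their arguments as a list of key/value pairs rather than as a single hashref. The values of the entries may themselves be quite complex. A simple
  my(%arg) = @_;

at the top of the handler is enough to collect them.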

Each handler will be passed at least the following arguments:

argument: action
A scalar defining the name of the action that was invoked. Note that if the action invoked ended up being a default handler (for example, node/), the value of this argument will be node, not node/default.

argument: tags
A hashref of all available and supported tags, whose values equal the keys.

argument: options
A hashref of all options passed to R from the client. This includes any data that might have been passed as part of a POST HTTP method.

argument: props
A hashref of all current properties as read and set up by Covad::Pwman::Properties.

The _info argument
Each handler must support the '_info' argument, which should return a hashref of the following information:
  $ref = $handler->( _info => 1 );
  $ref = {
    'author'  => 'Author information (think RCS $Author: jgilbertsjc $)',
    'description' => 'What this handler does, and/or notes for the syntax of the handler',
    'syntax' => q{'URI_PREFIX/handlername/ARGUMENTS' - syntax information (ie, "/HANDLER/PARAMS")},
    'version' => 'Versioning information (think RCS $Id: pwman_docs_rapi.pod,v 1.2 2005/10/20 08:46:34 jgilbertsjc Exp $)',
  };

This _info block is used when a '/capability/HANDLER_NAME' is issued, and when determining whether or not to advertise this handler's functionality. It may also be used programmatically to glean how requests should be made, so it's convention to structure the syntax value thusly:

  "'URI_PREFIX/handler/name/ARGUMENT', where ARGUMENT is a valid argument."

If an argument or part of an argument is a nodename, you should include the string ``NODENAME'' (yes, in all capital letters) within the first single-quoted part, as such:

  "'URI_PREFIX/handler/name/NODENAME', where NODENAME is a valid node."

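Following that convention makes the syntax value easy to inspect from a script. The snippet below is only a sketch of the idea; syntax_takes_nodename() is a hypothetical helper, not something R or Covad::Pwman::R_Client is known to provide.

```perl
use strict;
use warnings;

# True when the first single-quoted part of a syntax string mentions NODENAME.
sub syntax_takes_nodename {
    my ($syntax) = @_;
    return 0 unless defined $syntax && $syntax =~ /'([^']*)'/;
    return index( $1, 'NODENAME' ) >= 0 ? 1 : 0;
}

print syntax_takes_nodename(
    "'URI_PREFIX/handler/name/NODENAME', where NODENAME is a valid node.")
    ? "takes a nodename\n" : "no nodename\n";
```
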
Normal return
Each handler must return from itself with the following struct:
  $ref = {
    status => 1 || undef,   ## 1 for success, undef for error
    result => $listref,     ## listref of data (see the Data Format section below)
    errmsg => 'an error',   ## scalar of an error message, if status is undef
  };

Handlers that return data in any other method will be considered 'bad', and may not be included in the list of active handlers. Note that output_format handlers may elect to not return a 'result' entity (because there's no real reason to do so), and that this is not considered an error. Note also that

  return { status => 1, result => undef };

is not an error; it will produce a 200 OK HTTP status with no data.

Abnormal exiting
If for whatever reason your handler needs to cease functioning, use die, and set an appropriate message. That message (from $@) will be sent back to the client along with a 500 HTTP status code. Simply dying is OK, but it's a courtesy to let the client know why they're not getting back what they expected. Something like:
  die "The whizbangle won't frob the dohicky - is your constricter unlocked?\n";

should be fine. Ensure that all die messages end in a newline, please. An empty $@ scalar upon return of the handler will be replaced with

  Unknown Error from 'handler'

This includes scenarios when your handler may (accidentally) return undef.
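
From the calling side, that behaviour amounts to wrapping the handler in an eval. The sketch below shows one way it could be done; run_handler() is a made-up name, and treating an undef status as a 500 as well is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Call a handler, turning die() and bad returns into an HTTP-style status.
sub run_handler {
    my ( $name, $handler, %arg ) = @_;
    my $response = eval { $handler->(%arg) };
    my $error    = $@;

    # A die, or a return value that is not a hashref, is reported as a 500.
    if ( !defined $response || ref($response) ne 'HASH' ) {
        $error = "Unknown Error from '$name'\n" unless length $error;
        return ( 500, $error );
    }

    # Assumption: an undef status is also reported as a 500, using errmsg.
    unless ( $response->{status} ) {
        return ( 500, ( $response->{errmsg} || "Unknown Error from '$name'" ) . "\n" );
    }
    return ( 200, $response );
}

my ( $code, $detail ) = run_handler( 'frob', sub { die "constricter is locked\n" } );
print "$code $detail";
```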

Data Format
The ``result'' format must conform to the following: multiple hashrefs inside one listref. Output format handlers may elect to not sort their results; therefore action handlers must return the listref pre-sorted however they choose.
  $listref = [
    {
      $NODE_HEADER => 'fqdn',         ## required
      'TAG' => 'value',               ## optional
      'TAG' => 'value',               ## optional
    },
  ];

...where the value for $NODE_HEADER is the fully-qualified domainname or inventory name of the node in question; TAG is the name of the +TAG on which we performed an operation; value is the data for that +TAG.

An optional HEADER entry may be included, for actions such as 'list' and 'invalid'. It should look like:

  {
    $NODE_HEADER => 'HEADER',               ## required
    'TAG' => 1,                             ## optional
    'TAG' => 1,                             ## optional
  },

Note that HEADER entries must be the first element in the listref.
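
These rules are simple enough to check mechanically. The following sketch walks a result listref and reports anything that breaks them; check_result() is a hypothetical helper, it assumes the default 00NODENAME header, and it deliberately ignores the HTML-specific directives, which may not carry a node header.

```perl
use strict;
use warnings;

my $NODE_HEADER = '00NODENAME';

# Return a list of problems found in a result listref (empty list when valid).
sub check_result {
    my ($result) = @_;
    return ("result is not a listref") unless ref($result) eq 'ARRAY';
    my @problems;
    for my $i ( 0 .. $#{$result} ) {
        my $entry = $result->[$i];
        if ( ref($entry) ne 'HASH' ) {
            push @problems, "entry $i is not a hashref";
            next;
        }
        if ( !defined $entry->{$NODE_HEADER} ) {
            push @problems, "entry $i has no $NODE_HEADER";
        }
        elsif ( $entry->{$NODE_HEADER} eq 'HEADER' && $i > 0 ) {
            push @problems, "HEADER entry must be first (found at $i)";
        }
    }
    return @problems;
}

my @problems = check_result(
    [ { $NODE_HEADER => 'HEADER', CONSOLE => 1 }, { $NODE_HEADER => 'node01', CONSOLE => 'ttyS0' } ] );
print @problems ? join( "\n", @problems ) . "\n" : "result looks fine\n";
```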

There are three HTML-specific directives that can be returned in addition to or in lieu of hash entries. They are:

Defines a simple URL that can be followed to obtain more information on the data.

A URL interpreted to be image data, which will not be parsed and re-sent by R.

_html
Raw HTML that should remain unparsed and un-interpreted; it should be sent directly back to the client as-is.

You may have as many hashref entries inside of your listref as you need.

This format is valid for both action handlers (which return this data in {result}) and for output format handlers (which act on this data in {result}).

Rules for action handlers

R may be hit programmatically, and may be hit several times in a given time period based on usage. Try to limit what your handler does, or at least give thought to how it might be more efficient (note: obscure does not equate to efficient).

Action handlers should not use fork, exec, or the special fork-after-opening mode of open (a la open(FH, "|-")).

Action handlers are expected to clean up after themselves. If you create a temporary file, ensure that you remove it before any return points.

Action handlers may, at their discretion, modify options, handler name, or URL parameters. Try to limit doing this to times when you actually need to do so, however. Action handlers should not modify properties or supported tags (via the props and tags hashrefs, respectively).

action handler arguments
In addition to the arguments passed to all handlers, action handlers also get the following:
argument: params
A scalar containing the {arguments} portion of the URL that brought us here (which is everything after the {action} leading up to the first semi-colon ;).

Rules for output format handlers

HTTP headers
Output format handlers must provide a valid HTTP header. You can do this by hand, or you can use the CGI object passed as the q argument.

output format handler arguments
In addition to the arguments passed to all handlers, output format handlers also get the following:
argument: q
This is a CGI object, instantiated at runtime, which your handler can use to output HTTP headers. R does not do this for you, because it doesn't know what MIME type the output format will set things to. The output format must output a valid HTTP header, and it must be of a known (and expected) MIME type (don't do something weird like return a MIME type of 'text/plain' for XML content, OK?).

argument: result
This is a listref; the same listref, actually, that was returned from the action handler. See the Data Format section above for its format.

See Also

R, the Covad::Pwman::Properties manpage, the Covad::Pwman::R_Client manpage, the Covad::Pwman::Tags manpage, ``REST Web Services'' at, die, require


Jon Gilbert <>


$Id: pwman_docs_rapi.pod,v 1.2 2005/10/20 08:46:34 jgilbertsjc Exp $