1. Plumbing Protocol

1.1 Executive Summary

The netplumbing protocol allows running a program on a different machine in a secure manner. It functions somewhat like 'rsh', but encrypts all communications and allows replicating named pipes and Unix-domain sockets as well as the traditional stdin/stdout/stderr.

1.2 Introduction

A distributed backup program needs some way of running components on other systems. The netplumbing protocol was created in order to do this. We use Unix-style components (i.e., that communicate in a stream oriented manner and can be invoked and tested from the command line), and use 'plumb' and the 'netplumb' service to run components that happen to be on another system. We encrypt network communications in order to protect against hackers, and we replicate local domain sockets and named pipes in order to allow additional communications channels other than stdin/stdout/stderr.

1.3 References

Check out:

OpenSSL ( http://www.openssl.org ).

Some of the ideas involved originated with SSH and the OpenSSH project, and some are based on reading the 'rsh' source code (and barfing) and coming up with a better way of doing things.

1.4 Definitions

Definitions will go here.

1.5 Requirements

netplumb will accept connections only from explicitly authorized clients.
netplumb may have different ``gluepots'' (lists of commands that are authorized to be run) for each client.
netplumb will be secure. Only an RSA-authenticated central controller node can tell it what commands to run.
netplumb will be able to accept data from non-authorized nodes if properly told to do so by a central controller node. Thus netplumb can accept ``one time keys'' from a central controller node that can be used once then are no longer valid, along with a token dictating what command shall be run by that key.
netplumb will not require any infrastructure to be running in order to properly operate, so that it can be used on recovery disks etc. to do network-based recovery of data.

1.6 Assumptions

We assume certain things about the environment in which netplumb runs.

We assume that:

We are using IP-based networking. This doesn't work via other types of networking.
We have some way of getting a list of authorized control nodes to clients, along with (hopefully) their public keys (or at least the fingerprint of those public keys).
We can contact the clients from the control node(s).

We do *NOT* assume that:

We can contact the control node(s) from the clients. They may be on opposite sides of a firewall.

1.7 Relationships

The entire network backup pudding relies upon netplumb.

1.8 User Interface

1.8.1 plumb command

plumb machinename { -recipe=/some/path/name } { -user=username } { -infifo=/some/path/name } { -outfifo=/some/path/name } { -socket=/some/path/name } { -export=namelist } command

Where:

-recipe is an optional 'recipe' file which contains one or more commands to be executed. If you use a recipe file, you do not need to include 'command'.
-user is an optional user name. If omitted, the command will be run as user 'root'.
-infifo precedes a list of path names which will be input pipes (fifos) to the 'plumb' command (i.e., will also be input pipes to the command being run by 'plumb'). If there is more than one path name, they are separated by commas. We currently do not allow commas in path names.
-outfifo precedes a list of path names which will be output (written) pipes.
-socket precedes a list of (comma-delimited) path names for named (Unix domain) sockets that will be forwarded.
-export precedes a list of environment variables to propagate forward to the other end.
command - Name of a component in the gluepot. All other arguments are passed as-is. Note that shell substitutions are *NOT* done on the other end, and that the component *MUST* reside in the gluepot ( /var/lib/tapioca/gluepot/ for Linux, note that these may be symlinks to the actual executable ). This is because we use plain old 'exec' rather than using /bin/sh. Please note that if any of the command arguments contain ``-'' you may need to instead create a recipe and feed the command through a recipe file.
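
As an illustration of the option syntax above, here is a minimal Python sketch of how a 'plumb' argument list could be split into its parts. It is not the actual plumb parser. The function name parse_plumb_args, the host name 'backuphost', and the component name 'tarball' are hypothetical, and the sketch assumes that option parsing stops at the first argument that does not start with '-'.

```python
# Hypothetical sketch of splitting a 'plumb' argument list; not the real parser.
LIST_OPTIONS = {"infifo", "outfifo", "socket", "export"}  # comma-separated lists
SCALAR_OPTIONS = {"recipe", "user"}


def parse_plumb_args(argv):
    """Split ['machine', '-opt=value', ..., 'command', 'arg', ...] into parts."""
    if not argv:
        raise ValueError("machine name is required")
    machine, rest = argv[0], argv[1:]
    options = {"user": "root"}  # documented default when -user is omitted
    index = 0
    # Assumption: options end at the first argument that does not start with '-'.
    while index < len(rest) and rest[index].startswith("-"):
        name, sep, value = rest[index][1:].partition("=")
        if not sep:
            raise ValueError("option needs a value: " + rest[index])
        if name in LIST_OPTIONS:
            # Commas are not allowed inside path names, so a plain split is enough.
            options[name] = value.split(",")
        elif name in SCALAR_OPTIONS:
            options[name] = value
        else:
            raise ValueError("unknown option: -" + name)
        index += 1
    command = rest[index:]  # passed as-is; no shell substitution is done
    if not command and "recipe" not in options:
        raise ValueError("either a command or -recipe is required")
    return machine, options, command
```

For example, parse_plumb_args(["backuphost", "-infifo=/tmp/a,/tmp/b", "tarball", "-x"]) yields the machine name, an options dictionary with user 'root' and two input fifos, and the command list ['tarball', '-x'].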

1.8.2 Recipes

A recipe file is a (possibly labeled) list of commands. It has the format of:

[label:] command [>%label2:] [2>%label3:] [|] [&]

where >%label2: will send the command's output to the stdin of the command labeled with 'label2:', and 2>%label3: will send its stderr to the stdin of the command labeled with 'label3:'. A plain old | will pipe its output to the stdin of the next command. The normal >file, 2>file, >>file, 2>>file, and 2>&1 redirections are also supported, but this scheme allows rather complicated pipelines that are more like a pudding than a pipeline once you toss named pipes into the mix. Please note that the >file etc. default to being created in /var/lib/tapioca/tmp on Linux if you leave out the path name (which is recommended, because Windows path names look nothing like Unix path names and this is supposed to be cross-platform). In reality, about the only things of interest there are the named pipes.

All pipes are opened and readied for use prior to the recipe actually executing. The commands are then executed sequentially, one after the other. If a '&' or '|' is at the end of the command, it will spawn off that command then immediately spawn the next command without waiting for the first command to complete (what is normally expected in a Unix-style pipeline).
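
The recipe line format described above can be sketched in Python as follows. This is only an illustration under stated assumptions: tokens are separated by whitespace with no quoting, plain-file redirections such as >file are not interpreted, and the function name parse_recipe_line and the component names in the example are hypothetical.

```python
# Hypothetical sketch of parsing one recipe line; not the real recipe parser.
def parse_recipe_line(line):
    """Parse '[label:] command [>%label2:] [2>%label3:] [|] [&]' into a dict."""
    tokens = line.split()  # assumption: whitespace-separated, no quoting
    entry = {
        "label": None,        # this command's own label, if any
        "argv": [],           # command name and its arguments
        "stdout_to": None,    # label whose stdin receives our stdout
        "stderr_to": None,    # label whose stdin receives our stderr
        "pipe": False,        # trailing '|': pipe stdout to the next command
        "background": False,  # trailing '&': do not wait for completion
    }
    # A leading 'name:' token that is not a redirection is the command's label.
    if tokens and tokens[0].endswith(":") and not tokens[0].startswith((">", "2>")):
        entry["label"] = tokens.pop(0)[:-1]
    for token in tokens:
        if token.startswith("2>%") and token.endswith(":"):
            entry["stderr_to"] = token[3:-1]
        elif token.startswith(">%") and token.endswith(":"):
            entry["stdout_to"] = token[2:-1]
        elif token == "|":
            entry["pipe"] = True
        elif token == "&":
            entry["background"] = True
        else:
            # Everything else, including >file style redirections, is kept
            # verbatim in argv; this sketch does not interpret it.
            entry["argv"].append(token)
    return entry
```

For example, the line 'reader: dumpfs /home >%packer: 2>%logger: &' parses to a command labeled 'reader' whose stdout goes to the command labeled 'packer', whose stderr goes to the command labeled 'logger', and which is spawned without waiting for it to complete.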

1.9 Interfaces/Protocols

1.9.1 General packet format

All binary numbers in packet headers are in big-endian (Internet) format.

All packets start with the characters 'TAP1', then a 4-byte number which represents the size of the packet. This is followed by a 32-bit CRC checksum of the remainder of the packet, then a 4-byte sequence number (rolls over at 32767). This is followed by a 1-byte packet type, which currently has the following values:

1 - START packet (sets us up).
2 - PUBKEY packet (gets us a public key if we don't have one)
10 - SESSIONKEY packet (for further conversations).
20 - ONETIMEKEY packet (for client-client communications).
30 - RECIPE packet (tells us what commands to execute).
35 - STATUS packet (tells us exit status of a command).
50 - DATA packet (payload!).
99 - FAILED - if something failed.
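
The header layout above can be sketched in Python with the standard struct and zlib modules. Several details are assumptions not stated in this document: that the size field counts the whole packet including the header, that the CRC is the common CRC-32 computed by zlib.crc32, and that "the remainder of the packet" means everything after the CRC field (sequence number, type, and payload). The function names are hypothetical, and payload encryption is left out.

```python
import struct
import zlib

MAGIC = b"TAP1"
PREFIX = struct.Struct(">4sII")  # magic, size, CRC; big-endian, 12 bytes
TAIL = struct.Struct(">IB")      # sequence number, packet type; 5 bytes
PACKET_DATA = 50                 # DATA packet type from the list above


def build_packet(sequence, packet_type, payload=b""):
    """Assemble one packet (hypothetical helper; encryption not shown)."""
    body = TAIL.pack(sequence, packet_type) + payload
    size = PREFIX.size + len(body)        # assumption: size covers the whole packet
    crc = zlib.crc32(body) & 0xFFFFFFFF   # assumption: zlib's CRC-32 over the remainder
    return PREFIX.pack(MAGIC, size, crc) + body


def parse_packet(data):
    """Validate a packet and return (sequence, packet_type, payload)."""
    if len(data) < PREFIX.size + TAIL.size:
        raise ValueError("packet too short")
    magic, size, crc = PREFIX.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if size != len(data):
        raise ValueError("size field does not match packet length")
    body = data[PREFIX.size:]
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    sequence, packet_type = TAIL.unpack_from(body, 0)
    return sequence, packet_type, body[TAIL.size:]
```

Under these assumptions the fixed header is 17 bytes long, and parse_packet(build_packet(7, PACKET_DATA, b"hello")) returns the sequence number, type, and payload that were passed in.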

All packets from #10 upwards are encrypted. #10 uses signed encryption with the recipient's public key and the originator's private key (an MD5 signature encrypted with the originator's private key, then encrypted with the public key of the recipient), and #20 upwards use the symmetric key negotiated by the SESSIONKEY packet.

Some packets have a response packet in return. This response packet will use the same packet type, but may have a different payload, depending upon what the particular class processing the packet expects.

1.9.2 General format of non-DATA packets

Virtually all of these are a list of name/value pairs, of format

NAME=VALUE

with the last entry being:

==END==

All entries are null-terminated. There are no illegal values in entries, other than the null character (note: This means that binary data containing zeros should be sent in hex or some other encoded format such as base64). Trailing spaces are *NOT* stripped off, so be careful if you are encoding a packet by hand.
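
The name/value format above can be sketched in Python as follows. This is an illustration, not the actual implementation. The function names are hypothetical, and the sketch assumes UTF-8 text and that names never contain '=' (neither is stated in this document).

```python
END_MARKER = "==END=="


def encode_pairs(pairs):
    """Encode (name, value) pairs as null-terminated NAME=VALUE entries."""
    chunks = []
    for name, value in pairs:
        if "\0" in name or "\0" in value:
            raise ValueError("entries must not contain the null character")
        if "=" in name:
            raise ValueError("assumption: names do not contain '='")
        # Trailing spaces are kept exactly as given; nothing is stripped.
        chunks.append((name + "=" + value).encode("utf-8") + b"\0")
    chunks.append(END_MARKER.encode("utf-8") + b"\0")
    return b"".join(chunks)


def decode_pairs(payload):
    """Decode entries up to the ==END== marker into a list of (name, value)."""
    pairs = []
    for entry in payload.split(b"\0"):
        text = entry.decode("utf-8")
        if text == END_MARKER:
            return pairs
        name, sep, value = text.partition("=")
        if not sep:
            raise ValueError("entry is not NAME=VALUE: " + repr(text))
        pairs.append((name, value))
    raise ValueError("missing ==END== terminator")
```

Binary values would need to be hex or base64 encoded before being passed to encode_pairs, as noted above.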