

Intro to TapiocaStor Implementation

©2001 The TapiocaStor Group


1 Overview

TapiocaStor is a distributed network backup program. It consists of three major components:

  1. tapio - the backup and restore engine. This is a highly componentized set of 'black boxes' which are strung together in the appropriate order using 'glue' (which in turn uses various types of 'plumbing' to handle communications) in order to back up some system to some tape drive somewhere on the network. Sample component types are Sources, Tees, Funnels, and Sinks. See the architecture document for more (complete with pictures!).
  2. tapioca - the tapio central authority. This is responsible for deciding what glue to use to chain together tapio components to do the desired backup and restore operations, handling storage management, and handling the central data store (a MySQL database as of the time of this writing).
  3. tapigui, tapiweb, tapicli - the clients. These communicate with tapioca via the tapicom protocol in order to give it the proper instructions for performing backup and restore.
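
The way tapio components are strung together can be illustrated with a small sketch. This is not the TapiocaStor API: the interface names borrow the component types listed above, but every signature here (Source.open, Sink.open, Glue.connect, MemorySink) is an assumption made for illustration, and the sketch is in modern Java (8 or later) even though the low-level components themselves are written in C++.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical interfaces: the names mirror the component types mentioned
// in the text, but the signatures are assumptions made for this sketch.
interface Source { InputStream open() throws IOException; }
interface Sink { OutputStream open() throws IOException; }

/** A Tee copies everything written to it to several downstream sinks. */
final class Tee implements Sink {
    private final List<Sink> branches;

    Tee(Sink... branches) { this.branches = Arrays.asList(branches); }

    public OutputStream open() throws IOException {
        final OutputStream[] outs = new OutputStream[branches.size()];
        for (int i = 0; i < outs.length; i++) outs[i] = branches.get(i).open();
        return new OutputStream() {
            @Override public void write(int b) throws IOException {
                for (OutputStream o : outs) o.write(b);
            }
            @Override public void close() throws IOException {
                for (OutputStream o : outs) o.close();
            }
        };
    }
}

/** In-memory stand-in for a real tape or file sink (illustration only). */
final class MemorySink implements Sink {
    final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    public OutputStream open() { return buffer; }
}

/** Stand-in for the 'glue': pumps bytes from one source into one sink. */
final class Glue {
    static long connect(Source source, Sink sink) throws IOException {
        long total = 0;
        byte[] chunk = new byte[4096];
        try (InputStream in = source.open(); OutputStream out = sink.open()) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
        }
        return total;
    }
}

final class PipelineDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = "some files to back up".getBytes(StandardCharsets.UTF_8);
        MemorySink tapeA = new MemorySink();
        MemorySink tapeB = new MemorySink();
        // Source -> Tee -> two Sinks, chained by the glue.
        long copied = Glue.connect(() -> new ByteArrayInputStream(data), new Tee(tapeA, tapeB));
        System.out.println("bytes copied: " + copied);
    }
}
```

In this sketch the glue runs everything in one process. In the design described above, the plumbing underneath the glue would carry the same byte stream between machines.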

2 Development Environment

The low-level data sources/sinks and most of the processors for 'tapio' are written in C++, for portability and performance reasons. The standard GNU development tools are used (GNU Make, GNU autoconf, gcc 2.96+). On Windows, the free Dev-C++ development environment (http://www.bloodshed.net/devcpp/), which is a wrapper around various GNU software, is used.

'tapioca' and the various clients are written in Java, with occasional dips down to ``C'' or C++ for utilities such as 'aescrypt', 'mtx', 'tapiomt', and 'lzop'. The standard Java development environment is Sun's JDK 1.3.1 (http://www.javasoft.com/j2se/1.3/) using Sun's Forte development environment (http://www.sun.com/forte/ffj/index.cgi).

``Glue'' may be either Java or C++; this decision has not been made as of the time of this writing.

The ``Plumbing'' for ``Glue'' will probably be C++ on Unix. It has not been determined whether it will be C++ or Java on Windows.

GNU MP (http://www.swox.com/gmp/) is used to handle the bigint math needed for RSA public key encryption on Unix. It has not yet been determined what math library will be used to handle the bigint math on Windows. We may simply decide that the Windows implementation of Plumbing is Java, and use Java's own bigint math libraries, despite the performance limitations involved.
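
To give a feel for what ``Java's own bigint math libraries'' would involve, here is a minimal sketch of textbook RSA arithmetic using java.math.BigInteger. The class name ToyRsa and the toy primes are assumptions made for illustration; it uses no padding and tiny keys, so it only shows the modular arithmetic and is not how the Plumbing would (or should) implement encryption.

```java
import java.math.BigInteger;

/** Textbook RSA with toy numbers: no padding, not suitable for real use. */
final class ToyRsa {
    final BigInteger n; // public modulus
    final BigInteger e; // public exponent
    final BigInteger d; // private exponent

    ToyRsa(BigInteger p, BigInteger q, BigInteger e) {
        BigInteger phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
        this.n = p.multiply(q);
        this.e = e;
        this.d = e.modInverse(phi); // chosen so that e*d = 1 (mod phi)
    }

    BigInteger encrypt(BigInteger m) { return m.modPow(e, n); }

    BigInteger decrypt(BigInteger c) { return c.modPow(d, n); }

    public static void main(String[] args) {
        ToyRsa key = new ToyRsa(BigInteger.valueOf(61), BigInteger.valueOf(53), BigInteger.valueOf(17));
        BigInteger message = BigInteger.valueOf(65);
        BigInteger cipher = key.encrypt(message);
        System.out.println("cipher=" + cipher + " decrypted=" + key.decrypt(cipher));
    }
}
```

The same operations (modular exponentiation and modular inverse) are what GNU MP provides on the C++ side, so the choice between the two is mostly about performance and portability rather than capability.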

For the web server interface, the Tomcat (http://jakarta.apache.org/tomcat/) Java Server Pages server is used in conjunction with Apache (http://www.apache.org) to interface to 'tapioca' via the Tapicom service.

Red Hat Linux 7.1 is the de facto development environment for Unix, while Windows 98 Second Edition is the de facto development environment for Windows. Support for earlier versions of Windows is not foreseen at this time. Support for tape drives on Windows is not foreseen at this time either, though if some enterprising soul wants to port the appropriate Sinks and Sources components to Windows we certainly have no objection. Note that due to the lack of a standard tape drive API on Windows 9x, it will be much easier to port the low-level tape handling code to Windows NT or 2000 (which do have a standard tape drive API).

Documentation for these products is at:

3 Credits

The inspiration for TapiocaStor came from RandyK, who said ``hey, why don't we write an open source backup program?''. This is why the Java classes are in package ``net.nimitz.*'' (note: Java package names generally begin with the domain name of the author in reverse order, in order to prevent name collisions with other packages).
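
The reverse-domain convention mentioned in the note can be shown in a few lines. The helper below is purely illustrative (PackageNames and fromDomain are made-up names, not part of TapiocaStor), and the input domain is inferred from the package name rather than stated anywhere in this document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PackageNames {
    /** Turns a domain such as "nimitz.net" into the package prefix "net.nimitz". */
    static String fromDomain(String domain) {
        List<String> parts = new ArrayList<>(Arrays.asList(domain.toLowerCase().split("\\.")));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(fromDomain("nimitz.net"));
    }
}
```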

The name ``Tapioca'' came from RJF, who also came up with the slogan ``Your backups and restores will go smooth as pudding.''

This document and the SourceForge project were put together by EGreen, who is also drawing all the pretty pictures for the architecture document.

The idea for the Glue and the Plumbing came from Peter Buschman, who mentioned it as one of the strengths of Bakbone (i.e., its ability to glue lots of components into various orders to handle all sorts of weird situations). Any weirdness is solely the responsibility of EGreen, whose warped mind came up with the wonky terminology such as ``Gluepot'' and ``Pudding''.

The notion of using Unix components, rather than CORBA components or COM components or one of those other newcomer component models, was inspired by the 'xbru' program that was/is part of BRU, which did its work by calling command-line Unix programs such as 'mt' and 'bru'. Using the Glue and Plumbing allows those calls to be on entirely different machines.
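
As a rough illustration of the ``Unix components'' idea, the sketch below drives a command-line program from Java with ProcessBuilder. The class and method names (UnixComponent, mtCommand, run) and the device path /dev/nst0 are assumptions made for this example; it only runs the command on the local machine, whereas the Glue and Plumbing are meant to let such calls happen elsewhere on the network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

final class UnixComponent {
    /** Builds an 'mt' command line of the form: mt -f <device> <operation>. */
    static List<String> mtCommand(String device, String operation) {
        return Arrays.asList("mt", "-f", device, operation);
    }

    /** Runs a command, echoes its combined output, and returns its exit status. */
    static int run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // /dev/nst0 is an assumed device name; adjust for the local tape drive.
        int status = run(mtCommand("/dev/nst0", "status"));
        System.out.println("mt exited with status " + status);
    }
}
```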

The use of C++ was suggested by RJF, who said he was tired of spaghetti-like ``C'' code.

The session management and ticketing model for tapicom was inspired by Kerberos.

The use of Java was suggested by EGreen after RJF shot down the notion of using Perl as the Glue (too easy to spaghetti-ize it, he said). Python was not considered due to intellectual property problems which will not be discussed here but which are related to prior employment activities on the part of the core team. Other languages were not considered to have a large enough user community for people to be able to easily contribute to the project.

The Unknown Hacker 2001-07-01