$Id$

INTRODUCTION

The design of rpkid and friends assumes that certain tasks can be
thrown over the wall to the registry's back end operation. This was a
deliberate design decision to allow rpkid et al to remain independent
of existing database schema, business PKIs, and so forth that a
registry might already have. All very nice, but it leaves someone who
just wants to test the tools or who has no existing back end with a
fairly large programming project. The tools in this directory attempt
to fill that gap.

This is a basic implementation of what a registry back end would need
to use rpkid and friends. These tools do not use every available
option, nor are they necessarily as efficient as possible. Large
registries will almost certainly want to roll their own tools, perhaps
using these as a starting point. Nevertheless, we hope that these
tools will at least provide a useful example.

The primary tool here is a single command line Python program:
myrpki.py. myrpki has a number of commands, most of which are used
for initial setup, some of which are used on an ongoing basis. myrpki
can be run either in an interactive mode or by passing a single
command on the command line when starting the program; the former mode
is intended to be somewhat human-friendly, the latter mode is useful
in scripting, cron jobs, and automated testing.

myrpki use has two distinct phases: setup and data maintenance. The
setup phase is primarily about constructing the "business PKI" (BPKI)
certificates that the daemons use to authenticate CMS and HTTPS
messages and obtaining the service URLs needed to configure the
daemons. The data maintenance phase is about configuring local data
into the daemons.

myrpki uses the OpenSSL command line tool for almost all operations on
keys and certificates; the one exception to this is the command which
talks directly to the daemons, as this command uses the same
communication libraries as the daemons themselves do.
The intent behind using the OpenSSL command line tool for everything
else is to allow all the other commands to be run without requiring
all the auxiliary packages upon which the daemons depend; this can be
useful, eg, if one wants to run the back-end on a laptop while running
the daemons on a server, in which case one might prefer not to have to
install a bunch of unnecessary packages on the laptop.

During setup phase myrpki generates and processes small XML messages
which it expects the user to ship to and from its parents, children,
etc via some out-of-band means (email, perhaps with PGP signatures,
USB stick, we really don't care). During data maintenance phase,
myrpki does something similar with another XML file, to allow hosting
of RPKI services; in the degenerate case where an entity is just
self-hosting (ie, is running the daemons for itself, and only for
itself), this latter XML file need not be sent anywhere.

The basic idea here is that a user who has resources maintains a set
of .csv files containing a text representation of the data needed by
the back-end, along with a configuration file containing other
parameters. The intent is that these be very simple files that are
easy to generate either by hand or as a dump from a relational
database, spreadsheet, awk script, whatever works in your environment.
Given these files, the user then runs myrpki to extract the relevant
information and encode everything about its back end state into an XML
file, which can then be shipped to the appropriate other party.

Many of the myrpki commands which process XML input write out a new
XML file, either in place or as an entirely new file; in general,
these files need to be sent back to the party that sent the original
file. Think of all this as a very slow packet-based communication
channel, where each XML file is a single packet.
In setup phase, there's generally a single round-trip per setup
conversation; in the data maintenance phase, the same XML file keeps
bouncing back and forth between hosted entity and hosting entity.

Note that, as certificates and CRLs have expiration and nextUpdate
values, a low-level cycle of updates passing between resource holder
and rpkid operator will be necessary as a part of steady state
operation. [The current version of these tools does not yet
regenerate these expiring objects, but fixing this will be a
relatively minor matter.]

The third important kind of file in this system is the configuration
file for myrpki. This contains a number of sections, some of which
are for myrpki, others of which are for the OpenSSL command line tool,
still others of which are for the various RPKI daemon programs. The
examples/ subdirectory contains a commented version of the
configuration file that explains the various parameters.

The .csv files read by myrpki can be anything that the Python "csv"
library understands. By default, they're in tab-delimited format
(because the author finds this easier to read than the comma-delimited
format), but this can be changed to fit local needs.

Please note: tab-delimited CSV is a format defined by a certain
popular spreadsheet program, and is *not* the same as
whitespace-separated text. Tab characters are *punctuation*, and each
tab character indicates the division between two columns. Two tab
characters in a row indicate a separator, a blank cell, and another
separator, not one separator. The upshot of all this is that
attempting to make your columns line up prettily will not work as you
expect: you will end up with too many cells, some of them empty.

Keep reading, and don't panic.

The default configuration file name is myrpki.conf. You can change
this using the "-c" option when invoking myrpki, or by setting the
environment variable MYRPKI_CONF.
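
As a rough illustration of the tab-delimited rules described above,
the following sketch runs the Python "csv" library's standard
"excel-tab" dialect over two made-up rows. The sample data and the
parse_tab_delimited() helper are hypothetical and exist only for this
example, and we are assuming that myrpki's default dialect behaves
like "excel-tab" in this respect.

```python
import csv
import io

# Hypothetical sample rows: a handle column and a prefix column.
# The second row uses two tabs in a row in an attempt to "line up"
# the columns, which is exactly the mistake described in the text.
SAMPLE = "alice\t10.0.0.0/8\nbob\t\t192.0.2.0/24\n"


def parse_tab_delimited(text):
    """Parse tab-delimited text with the standard excel-tab dialect."""
    return list(csv.reader(io.StringIO(text), dialect="excel-tab"))


rows = parse_tab_delimited(SAMPLE)
for row in rows:
    # Show the cell count next to the parsed cells.
    print(len(row), row)
```

The second row comes back with three cells, the middle one empty,
which is the "too many cells" effect the note above warns about.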

See examples/myrpki.conf for details on the variables that you can
(and in some cases must) set.

See examples/*.csv for commented examples of the several CSV files.
Note that the comments themselves are not legal CSV, they're just
present to make it easier to understand the examples.

GETTING STARTED -- OVERVIEW

Which process you need to follow depends on whether you are running
rpkid yourself or will be hosted by somebody else. We call the first
case "self-hosted", because the software treats running rpkid to
handle resources that you yourself hold as if you are an rpkid
operator who is hosting an entity that happens to be yourself.

"$top" in the following refers to wherever you put the
subvert-rpki.hactrn.net code. Once we have autoconf and "make
install" targets, this will be some system directory or another; for
now, it's wherever you checked out a copy of the code from the
subversion repository or unpacked a tarball of the code.

Most of the setup process looks the same for any resource holder,
regardless of whether they are self-hosting or not. The differences
come in the data maintenance phase.

The steps needed during setup phase are:

0) Write a configuration file (copy $top/myrpki/examples/myrpki.conf
   and edit as needed). You need to configure the [myrpki] section;
   in theory, the rest of the file should be ok as it is, at least for
   simple use. You also need to create (either by hand or by dumping
   from a database, spreadsheet, whatever) the CSV files describing
   prefixes and ASNs you want to allocate to your children and ROAs
   you want created.

1) Initialization ("initialize" command). This creates the local BPKI
   and other data structures that can be constructed just based on
   local data such as the config file. Other than some internal data
   structures, the main output of this step is the "identity.xml"
   file, which is used as input to later stages.
   In theory it should be safe to run the "initialize" command more
   than once; in practice this has not (yet) been tested.

2) Send (email, USB stick, carrier pigeon) identity.xml to each of
   your parents. This tells each of your parents what you call
   yourself, and supplies each parent with a trust anchor for your
   resource-holding BPKI.

3) Each of your parents runs the "configure_child" command, giving the
   identity.xml you supplied as input. This registers your data with
   the parent, including BPKI cross-registration, and generates a
   return message containing your parent's BPKI trust anchors, a
   service URL for contacting your parent via the "up-down" protocol,
   and (usually) either an offer of publication service (if your
   parent operates a repository) or a referral from your parent to
   whatever publication service your parent does use. Referrals
   include a CMS-signed authorization token that the repository
   operator can use to determine that your parent has given you
   permission to home underneath your parent in the publication tree.

4) Each of your parents sends (...) back the response XML file
   generated by the "configure_child" command.

5) You feed the response message you just got into myrpki using the
   "configure_parent" command. This registers the parent's
   information in your database, including BPKI cross-certification,
   and processes the repository offer or referral to generate a
   publication request message.

6) You send (...) the publication request message to the repository.
   The element in the request message should (in theory) provide some
   clue as to where you should send this.

7) The repository operator processes your request using myrpki's
   "configure_publication_client" command. This registers your
   information, including BPKI cross-certification, and generates a
   response message containing the repository's BPKI trust anchor and
   service URL.

8) Repository operator sends (...) the publication confirmation
   message back to you.

9) You process the publication confirmation message using myrpki's
   "configure_repository" command.

At this point you should, in theory, have established relationships,
exchanged trust anchors, and obtained service URLs from all of your
parents and repositories. The last setup step is establishing a
relationship with your RPKI service host, if you're not self-hosted,
but as this is really just the first message of an ongoing exchange
with your host, it's handled by the data maintenance commands.

The two commands used in data maintenance phase are
"configure_resources" and "configure_daemons". The first is used by
the resource holder, the second is used by the host. In the
self-hosted case, it is not necessary to run "configure_resources" at
all; myrpki will run it for you automatically.

GETTING STARTED -- CONFIGURATION FILE

The current sample configuration file should, in theory, be much
simpler to use than in earlier versions of this code. The sample
configuration uses a simple macro-expansion mechanism to place all of
the configuration data you need to touch into the [myrpki] section;
the rest of the configuration file is for the various daemons and
other tools, and is entirely configured via references to the values
defined in the [myrpki] section.

GETTING STARTED -- HOSTED CASE

The basic steps involved in getting started for a resource holder who
is being hosted by somebody else are:

a) Run through steps (0)-(9), above.

b) Run the configure_resources command to generate myrpki.xml.

c) Send myrpki.xml to the rpkid operator who will be hosting you.

d) Wait for your rpkid operator to ship you back an updated XML file
   containing a PKCS #10 certificate request for the BPKI signing
   context (BSC) created by rpkid.

e) Run configure_resources again with the XML file received in step
   (d), to issue the BSC certificate and update the XML file again to
   contain the newly issued BSC certificate.

f) Send the updated XML file back to your rpkid operator.
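
The hosted-case exchange in steps (b) through (f) can be pictured
with a toy model. This is only a sketch: the dictionary stands in for
the XML file, its keys are invented for illustration and are not the
real XML schema, and the two functions merely mimic what the text
above says the configure_resources and configure_daemons commands do.

```python
# Toy model of the hosted-case XML round trip. Nothing here talks to
# rpkid or touches real files; all names and values are hypothetical.


def holder_configure_resources(packet):
    """Resource holder side: issue the BSC certificate if one is requested."""
    packet = dict(packet)  # work on a copy, like writing out a new file
    if "bsc_request" in packet and "bsc_certificate" not in packet:
        packet["bsc_certificate"] = "issued for " + packet["bsc_request"]
    return packet


def host_configure_daemons(packet):
    """Host side: add a PKCS #10 request for the BSC if none exists yet."""
    packet = dict(packet)
    if "bsc_request" not in packet:
        packet["bsc_request"] = "pkcs10-request"
    return packet


packet = holder_configure_resources({"handle": "alice"})  # step (b)
packet = host_configure_daemons(packet)                   # steps (c)-(d)
packet = holder_configure_resources(packet)               # step (e)
print(packet)                                             # sent in step (f)
```

Each function call corresponds to one "packet" crossing between
resource holder and host.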

At this point you're done with initial setup. You will need to run
configure_resources again whenever you make any changes to your
configuration file or CSV files. [Once myrpki knows how to update
BPKI CRLs, you will also need to run configure_resources periodically
to keep your BPKI CRLs up to date.] Any time you run
configure_resources, you should send the updated XML file to your
rpkid operator, who will [generally?] send you a further updated XML
file in response.

GETTING STARTED -- SELF-HOSTED CASE

The first few steps involved in getting started for a self-hosted
resource holder (that is, a resource holder that runs its own copy of
rpkid) are the same as in the hosted case above; after that the
process diverges.

The [current] steps are:

a) See rpkid/doc/Installation, and follow the basic installation
   instructions there to build the RFC-3779-aware OpenSSL code and
   associated Python extension module.

b) Run through steps (0)-(9), above.

c) Next, you need to set up the MySQL databases that rpkid et al will
   use. The MySQL database, username, and password values all need to
   match the ones you specified in myrpki.conf. There are two
   different ways you can do this:

   i)  You can use the setup-sql.py script, which prompts you for your
       MySQL root password then attempts to do everything else
       automatically using values from myrpki.conf; or

   ii) You can do it manually.

   The first approach is simple:

       $ python setup-sql.py
       Please enter your MySQL root password:

   The script should tell you what databases it creates. You can use
   the -v option if you want to see more details about what it's
   doing.

   If you'd prefer to do the SQL setup manually, perhaps because you
   have valuable data in other MySQL databases and you don't want to
   trust some random setup script with your MySQL root password,
   you'll need to use the MySQL command line tool, as follows:

       $ mysql -u root -p

       mysql> CREATE DATABASE irdb_database;
       mysql> GRANT all ON irdb_database.* TO irdb_user@localhost IDENTIFIED BY 'irdb_password';
       mysql> USE irdb_database;
       mysql> SOURCE $top/rpkid/irdbd.sql;
       mysql> CREATE DATABASE rpki_database;
       mysql> GRANT all ON rpki_database.* TO rpki_user@localhost IDENTIFIED BY 'rpki_password';
       mysql> USE rpki_database;
       mysql> SOURCE $top/rpkid/rpkid.sql;
       mysql> COMMIT;
       mysql> quit

   where "irdb_database", "irdb_user", "irdb_password",
   "rpki_database", "rpki_user", and "rpki_password" are the
   appropriate values from your configuration file.

   If you are running pubd and doing manual SQL setup, you'll also
   have to do:

       $ mysql -u root -p

       mysql> CREATE DATABASE pubd_database;
       mysql> GRANT all ON pubd_database.* TO pubd_user@localhost IDENTIFIED BY 'pubd_password';
       mysql> USE pubd_database;
       mysql> SOURCE $top/rpkid/pubd.sql;
       mysql> COMMIT;
       mysql> quit

d) If you are running your own publication repository (that is, if you
   are running pubd), you will also need to set up an rsyncd server or
   configure your existing one to serve pubd's output. There's a
   sample configuration file in $top/myrpki/examples/rsyncd.conf, but
   you may need to do something more complicated if you are already
   running rsyncd for other purposes. See the rsync(1) and
   rsyncd.conf(5) manual pages for more details.

e) Start the daemons. You can use $top/myrpki/start-servers.py to do
   this, or write your own script. If you intend to run pubd, you
   should make sure that the directory you specified as
   publication_base_directory exists and is writable by the userid
   that will be running pubd, and should also make sure to start
   rsyncd.

f) Run myrpki's configure_daemons command, twice, with no arguments.
   You need to run the command twice because myrpki has to ask rpkid
   to create a keypair and generate a certification request for the
   BSC. The first pass does this, the second processes the
   certification request, issues the BSC, and loads the result into
   rpkid. [Yes, we could automate this somehow, if necessary.]

At this point, if everything went well, rpkid should be up,
configured, and starting to obtain resource certificates from its
parents, generate CRLs and manifests, and so forth. At this point you
should go figure out how to use the relying party tool, rcynic: see
$top/rcynic/README if you haven't already done so.

If and when you change your CSV files, you should run
configure_daemons again to feed the changes into the daemons.

GETTING STARTED -- HOSTING CASE

If you are running rpkid not just for your own resources but also to
host other resource holders (see "HOSTED CASE" above), your setup will
be almost the same as in the self-hosted case (see "SELF-HOSTED CASE",
above), with one procedural change: you will need to tell
configure_daemons to process the XML files produced by the resource
holders you are hosting. You do this by specifying the names of all
those XML files as arguments to the configure_daemons command. So, if
you are hosting two friends, Alice and Bob, then, everywhere the
instructions for the self-hosted case say to run configure_daemons
with no arguments, you will instead run it with the names of Alice's
and Bob's XML files as arguments.

Note that configure_daemons sometimes modifies these XML files, in
which case it will write them back to the same filenames. While it is
possible to figure out the set of circumstances in which this will
happen (at present, only when myrpki has to ask rpkid to create a new
BSC keypair and PKCS #10 certificate request), it may be easiest just
to ship back an updated copy of the XML file every time you run
configure_daemons.
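
The difference between the self-hosted and hosting invocations of
configure_daemons can be sketched as follows. This assumes that
myrpki is started as "python myrpki.py COMMAND ARGUMENTS" from the
current directory; alice.xml and bob.xml are hypothetical file names,
and the configure_daemons_argv() helper is ours, not part of myrpki.
The sketch only builds the argument list and does not run anything.

```python
import sys


def configure_daemons_argv(hosted_xml_files=(), myrpki="myrpki.py"):
    """Build the argument list for one configure_daemons run.

    With no file names this is the self-hosted form; with file names
    it is the hosting form described above.
    """
    return [sys.executable, myrpki, "configure_daemons"] + list(hosted_xml_files)


# Self-hosted case: no arguments after the command name.
self_hosted = configure_daemons_argv()

# Hosting case: one argument per hosted resource holder's XML file.
hosting = configure_daemons_argv(["alice.xml", "bob.xml"])

print(" ".join(self_hosted[1:]))
print(" ".join(hosting[1:]))
```

Either list could be handed to subprocess.call() to run the command;
the files named in the hosting form may be rewritten in place, so
plan on shipping them back afterwards.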

GETTING STARTED -- "PURE" HOSTING CASE

In general we assume that anybody who bothers to run rpkid is also a
resource holder, but the software does not insist on this.

[Er, well, rpkid doesn't, but myrpki now does -- "pure" hosting was an
unused feature that fell by the wayside while simplifying the user
interface. It would be relatively straightforward to add it back if
we ever need it for anything, but the mechanism it used to use no
longer exists -- the old [myirbe] section of the config file has been
collapsed into the [myrpki] section, so testing for existence of the
[myrpki] section no longer works. So we'll need an explicit
configuration option, no big deal, just not worth chasing now.]

A (perhaps) plausible use for this capability would be if you are an
rpkid-running resource holder who wants for some reason to keep the
resource-holding side of your operation completely separate from the
rpkid-running side of your operation. This is essentially the
pure-hosting model, just with an internal hosted entity within a
different part of your own organization.

UPGRADING FROM OLD MYRPKI TOOLS

There's a script that attempts to upgrade from the previous version of
the myrpki tools (myirbe scripts, parents.csv file, etcetera). The
conversion script is not well tested, so taking a backup (including an
SQL dump) FIRST is STRONGLY recommended.

The script attempts to read all the necessary settings out of your old
myrpki.conf file and the obsolete {parents,children,pubclients}.csv
files, and writes out a new configuration file (myrpki.conf.new) and a
set of "entitydb" files (the local XML database used by the current
myrpki program).

To use the conversion script, just run

    $ python convert-from-csv-to-entitydb.py

with no arguments in the directory where your old myrpki.conf and .csv
files reside. See the script itself for available command line
options, most of which override various filenames.
Note that the conversion script will not rename existing BPKI
directories to the new convention (./bpki/{resources,servers}/);
instead it will write out myrpki.conf.new using the old directory
names (./bpki.{myrpki,myirbe}/). If you want to switch to the new
convention, move the directories yourself and edit the .conf file to
match.

The script does not delete any of the old files, so you'll want to
clean up yourself after you're sure the conversion worked.

Be warned that the old file format contains less information than the
new XML files do, so in some cases the conversion script is just
making stuff up as best it can. In theory, the cases where it has to
do this will not matter, but this has not been tested yet.

TROUBLESHOOTING

If you run into trouble setting up this package, the first thing to do
is categorize the kind of trouble you are having. If you've gotten
far enough to be running the daemons, check their log files. If
you're seeing Python exceptions, read the error messages. If you're
getting TLS errors, check to make sure that you're using all the right
BPKI certificates and service contact URLs.

TLS configuration errors are, unfortunately, notoriously difficult to
debug, because connection failures due to misconfiguration happen
early, deep in the guts of the OpenSSL TLS code, where there isn't
enough application context available to provide useful error messages.

If you've completed the steps above, everything appears to have gone
OK, but nothing seems to be happening, the first thing to do is check
the logs to confirm that nothing is actively broken. rpkid's log
should include messages telling you when it starts and finishes its
internal "cron" cycle. It can take several cron cycles for resources
to work their way down from your parent into a full set of
certificates and ROAs, so have a little patience. rpkid's log should
also include messages showing every time it contacts its parent(s) or
attempts to publish anything.

rcynic in fully verbose mode provides a fairly detailed explanation of
what it's doing and why objects that fail have failed.

You can use rsync (sic) to examine the contents of a publication
repository one directory at a time, without attempting validation, by
running rsync with just the URI of the directory on its command line:

    $ rsync rsync://rpki.example.org/where/ever/

[Maybe there should be something here explaining how to use
irbe_cli.py for debugging, but the syntax is fairly obscure as it's
just a command line interface to the left-right and publication
protocols -- almost certainly want a friendlier tool for
troubleshooting.]

KNOWN ISSUES

The lxml package provides a Python interface to the Gnome libxml2 and
libxslt C libraries. This code has been quite stable for several
years, but initial testing with lxml compiled and linked against a
newer version of libxml2 ran into problems (specifically, gratuitous
RelaxNG schema validation failures). libxml2 2.7.3 worked; libxml2
2.7.5 did not work on the test machine in question. Reverting to
libxml2 2.7.3 fixed the problem. Rewriting the two lines of Python
code that were triggering the lxml bug appears to have solved the
problem, so the code now works properly with libxml2 2.7.5, but if you
start seeing weird XML validation failures, it might be another
variation of this lxml bug.

An earlier version of this code ran into problems with what appears to
be an implementation restriction in the GNU linker ("ld") on 64-bit
hardware, resulting in obscure build failures. The workaround for
this required use of shared libraries and is somewhat less portable
than the original code, but without it the code simply would not build
in 64-bit environments with the GNU tools. The current workaround
appears to behave properly, but the workaround requires that the
pathname to the RFC-3779-aware OpenSSL shared libraries be built into
the _POW.so Python extension module.
At the moment, in the absence of "make install" targets for the Python
code and libraries, this means the build directory; eventually, once
we're using autoconf and installation targets, this will be the
installation directory. If necessary, you can override this by
setting the LD_LIBRARY_PATH environment variable; see the ld.so man
page for details. This is a relatively minor variation on the usual
build issues for shared libraries; it's just annoying because shared
libraries should not be needed here and would not be if not for this
GNU linker issue.