****** The myrpki tool ******

The design of rpkid and friends assumes that certain tasks can be thrown over
the wall to the registry's back end operation.

This was a deliberate design decision to allow rpkid et al. to remain
independent of existing database schema, business PKIs, and so forth that a
registry might already have. All very nice, but it leaves someone who just
wants to test the tools or who has no existing back end with a fairly large
programming project. The myrpki tool attempts to fill that gap.

myrpki is a basic implementation of what a registry back end would need to use
rpkid and friends. myrpki does not use every available option in the other
programs, nor is it necessarily as efficient as possible. Large registries will
almost certainly want to roll their own tools, perhaps using these as a
starting point. Nevertheless, we hope that myrpki will at least provide a
useful example, and may be adequate for simple use.

myrpki is (currently) implemented as a single command line Python program. It
has a number of commands, most of which are used for initial setup, some of
which are used on an ongoing basis. myrpki can be run either in an interactive
mode or by passing a single command on the command line when starting the
program; the former mode is intended to be somewhat human-friendly, the latter
mode is useful in scripting, cron jobs, and automated testing.
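
The two invocation styles can be sketched with Python's standard cmd module.
This is only an illustration of the pattern, not myrpki's actual code: the
class name DemoShell, the run() helper, and the command bodies (which just
report that they were called) are all hypothetical.

```python
import cmd


class DemoShell(cmd.Cmd):
    """Hypothetical stand-in for a command-driven tool; not myrpki itself."""

    prompt = "demo> "

    def do_initialize(self, arg):
        """Pretend to perform initial setup."""
        self.stdout.write("initialize called\n")

    def do_configure_resources(self, arg):
        """Pretend to process an optional XML file name."""
        self.stdout.write("configure_resources called with %r\n" % arg)

    def do_EOF(self, arg):
        """Leave interactive mode on end-of-file."""
        return True


def run(argv, stdout=None):
    """Run one command if argv is non-empty, else enter the interactive loop."""
    shell = DemoShell(stdout=stdout)
    if argv:
        # Single-command mode: suitable for scripts, cron jobs, and tests.
        shell.onecmd(" ".join(argv))
    else:
        # Interactive mode: keep prompting until end-of-file.
        shell.cmdloop()
```

Calling run(["initialize"]) executes one command and returns, while run([])
would prompt for commands until end-of-file.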

myrpki use has two distinct phases: setup and data maintenance. The setup phase
is primarily about constructing the "business PKI" (BPKI) certificates that the
daemons use to authenticate CMS and HTTPS messages and obtaining the service
URLs needed to configure the daemons. The data maintenance phase is about
configuring local data into the daemons.

myrpki uses the OpenSSL command line tool for almost all operations on keys and
certificates; the one exception to this is the command which talks directly to
the daemons, as this command uses the same communication libraries as the
daemons themselves do. The intent behind using the OpenSSL command line tool
for everything else is to allow all the other commands to be run without
requiring all the auxiliary packages upon which the daemons depend; this can be
useful, e.g., if one wants to run the back-end on a laptop while running the
daemons on a server, in which case one might prefer not to have to install a
bunch of unnecessary packages on the laptop.

During the setup phase, myrpki generates and processes small XML messages,
which it expects the user to ship to and from its parents, children, etc. via
some out-of-band means (email, perhaps with PGP signatures, USB stick, we
really don't care). During the data maintenance phase, myrpki does something
similar with another XML file, to allow hosting of RPKI services; in the
degenerate case where an entity is just self-hosting (i.e., is running the
daemons for itself, and only for itself), this latter XML file need not be
sent anywhere.

The basic idea here is that a user who has resources maintains a set of .csv
files containing a text representation of the data needed by the back-end,
along with a configuration file containing other parameters. The intent is that
these be very simple files that are easy to generate either by hand or as a
dump from a relational database, spreadsheet, awk script, whatever works in your
environment. Given these files, the user then runs myrpki to extract the
relevant information and encode everything about its back end state into an XML
file, which can then be shipped to the appropriate other party.

Many of the myrpki commands which process XML input write out a new XML file,
either in place or as an entirely new file; in general, these files need to be
sent back to the party that sent the original file. Think of all this as a very
slow packet-based communication channel, where each XML file is a single
packet. In setup phase, there's generally a single round-trip per setup
conversation; in the data maintenance phase, the same XML file keeps bouncing
back and forth between hosted entity and hosting entity.

Note that, as certificates and CRLs have expiration and nextUpdate values, a
low-level cycle of updates passing between resource holder and rpkid operator
will be necessary as a part of steady state operation. [The current version of
these tools does not yet regenerate these expiring objects, but fixing this
will be a relatively minor matter.]

The third important kind of file in this system is the configuration file for
myrpki. This contains a number of sections, some of which are for myrpki,
others of which are for the OpenSSL command line tool, still others of which
are for the various RPKI daemon programs. The examples/ subdirectory contains a
commented version of the configuration file that explains the various
parameters.

The .csv files read by myrpki are (now) misnamed: formerly, they used the
"excel-tab" format from the Python csv library, but early users kept trying to
make the columns line up, which didn't do what the users expected. So now these
files are just whitespace-delimited, such as a program like "awk" would
understand.
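
As a rough sketch of what "whitespace-delimited" means in practice, the
following splits each line on runs of spaces or tabs, so columns may be padded
to line up. The function name and the sample rows are hypothetical, and
skipping blank lines is an assumption; the real column layouts are described
in the commented files under examples/.

```python
def read_whitespace_delimited(lines):
    """Split each non-blank line into fields on runs of spaces or tabs."""
    rows = []
    for line in lines:
        # str.split() with no argument treats any run of whitespace as one
        # separator and ignores leading and trailing whitespace.
        fields = line.split()
        if fields:  # assumption: blank lines carry no data
            rows.append(fields)
    return rows


# Hypothetical sample: a child name followed by a prefix.
sample = [
    "alice     192.0.2.0/24\n",
    "bob\t198.51.100.0/24\n",
    "\n",
]
rows = read_whitespace_delimited(sample)
```

With this approach, lining up the columns by hand is harmless, which is
exactly what the old tab-delimited format did not tolerate.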

Keep reading, and don't panic.

The default configuration file name for myrpki is myrpki.conf. You can change
this using the "-c" option when invoking myrpki, or by setting the environment
variable MYRPKI_CONF.
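
The lookup can be sketched as follows. The function name is hypothetical, and
the precedence shown (a "-c" value wins over MYRPKI_CONF, which wins over the
default) is an assumption; the text above does not say which takes priority
when both are given.

```python
import os

DEFAULT_CONF = "myrpki.conf"


def config_filename(cli_value=None, environ=None):
    """Pick the configuration file name.

    Assumed precedence: value given with -c, then the MYRPKI_CONF
    environment variable, then the default name.
    """
    if environ is None:
        environ = os.environ
    if cli_value:
        return cli_value
    return environ.get("MYRPKI_CONF", DEFAULT_CONF)
```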

See examples/*.csv for commented examples of the several CSV files. Note that
the comments themselves are not legal CSV; they're just present to make it
easier to understand the examples.

 myrpki overview

Which process you need to follow depends on whether you are running rpkid
yourself or will be hosted by somebody else. We call the first case "self-
hosted", because the software treats running rpkid to handle resources that you
yourself hold as if you are an rpkid operator who is hosting an entity that
happens to be yourself.

"$top" in the following refers to wherever you put the subvert-rpki.hactrn.net
code. Once we have autoconf and "make install" targets, this will be some
system directory or another; for now, it's wherever you checked out a copy of
the code from the subversion repository or unpacked a tarball of the code.

Most of the setup process looks the same for any resource holder, regardless of
whether they are self-hosting or not. The differences come in the data
maintenance phase.

The steps needed during setup phase are:

* Write a configuration file (copy $top/rpkid/examples/myrpki.conf and edit as
  needed). You need to configure the [myrpki] section; in theory, the rest of
  the file should be ok as it is, at least for simple use. You also need to
  create (either by hand or by dumping from a database, spreadsheet, whatever)
  the CSV files describing prefixes and ASNs you want to allocate to your
  children and ROAs you want created.

* Initialization ("initialize" command). This creates the local BPKI and other
  data structures that can be constructed just based on local data such as the
  config file. Other than some internal data structures, the main output of
  this step is the "identity.xml" file, which is used as input to later stages.

In theory it should be safe to run the "initialize" command more than once; in
practice, this has not (yet) been tested.

* Send (email, USB stick, carrier pigeon) identity.xml to each of your parents.
  This tells each of your parents what you call yourself, and supplies each
  parent with a trust anchor for your resource-holding BPKI.

* Each of your parents runs the "configure_child" command, giving the
  identity.xml you supplied as input. This registers your data with the parent,
  including BPKI cross-registration, and generates a return message containing
  your parent's BPKI trust anchors, a service URL for contacting your parent
  via the "up-down" protocol, and (usually) either an offer of publication
  service (if your parent operates a repository) or a referral from your parent
  to whatever publication service your parent does use. Referrals include a
  CMS-signed authorization token that the repository operator can use to
  determine that your parent has given you permission to home underneath your
  parent in the publication tree.

* Each of your parents sends (...) back the response XML file generated by the
  "configure_child" command.

* You feed the response message you just got into myrpki using the
  "configure_parent" command. This registers the parent's information in your
  database, including BPKI cross-certification, and processes the repository
  offer or referral to generate a publication request message.

* You send (...) the publication request message to the repository. The
  contact_info element in the request message should (in theory) provide some
  clue as to where you should send this.

* The repository operator processes your request using myrpki's
  "configure_publication_client" command. This registers your information,
  including BPKI cross-certification, and generates a response message
  containing the repository's BPKI trust anchor and service URL.

* Repository operator sends (...) the publication confirmation message back to
  you.

* You process the publication confirmation message using myrpki's
  "configure_repository" command.

At this point you should, in theory, have established relationships, exchanged
trust anchors, and obtained service URLs from all of your parents and
repositories. The last setup step is establishing a relationship with your RPKI
service host, if you're not self-hosted, but as this is really just the first
message of an ongoing exchange with your host, it's handled by the data
maintenance commands.
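
For reference, the setup exchange described above can be written down as a
small table of who runs which command and what they then send onward. The
command names come from the steps above; the variable and function names, and
the actor labels, are just ours for illustration.

```python
# Each entry: (who acts, myrpki command they run, what they send afterwards).
SETUP_EXCHANGE = [
    ("you", "initialize", "identity.xml to each parent"),
    ("parent", "configure_child", "response message back to you"),
    ("you", "configure_parent", "publication request to the repository"),
    ("repository", "configure_publication_client",
     "publication confirmation back to you"),
    ("you", "configure_repository", None),  # nothing further to send
]


def commands_for(actor):
    """Return, in order, the commands that the given actor runs during setup."""
    return [command for who, command, _ in SETUP_EXCHANGE if who == actor]
```

Here commands_for("you") lists the three commands a resource holder runs
itself; the other two are run by the parent and by the repository operator.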

The two commands used in the data maintenance phase are "configure_resources"
and "configure_daemons". The first is used by the resource holder, the second
is used by the host. In the self-hosted case, it is not necessary to run
"configure_resources" at all; myrpki will run it for you automatically.

 Hosted case

The basic steps involved in getting started for a resource holder who is being
hosted by somebody else are:

* Run through the steps listed in the myrpki overview section.

* Run the configure_resources command to generate myrpki.xml.

* Send myrpki.xml to the rpkid operator who will be hosting you.

* Wait for your rpkid operator to ship you back an updated XML file containing
  a PKCS #10 certificate request for the BPKI signing context (BSC) created by
  rpkid.

* Run configure_resources again with the XML file you just received, to issue
  the BSC certificate and update the XML file again to contain the newly issued
  BSC certificate.

* Send the updated XML file back to your rpkid operator.

At this point you're done with initial setup. You will need to run
configure_resources again whenever you make any changes to your configuration
file or CSV files.

  Warning:
      Once myrpki knows how to update BPKI CRLs, you will also need to run
      configure_resources periodically to keep your BPKI CRLs up to date.

Any time you run configure_resources, you should send the updated XML
file to your rpkid operator, who should send you a further updated XML file in
response.

 Self-hosted case

The first few steps involved in getting started for a self-hosted resource
holder (that is, a resource holder that runs its own copy of rpkid) are the
same as in the hosted case above; after that the process diverges.

The [current] steps are:

* Follow the basic installation instructions in the Installation Guide to build
  the RFC-3779-aware OpenSSL code and associated Python extension module.

* Run through the steps listed in the myrpki overview section.

* Set up the MySQL databases that rpkid et al. will use. The package includes a
  tool to do this for you; you can use that or do the job by hand. See MySQL
  database setup for details.

* If you are running your own publication repository (that is, if you are
  running pubd), you will also need to set up an rsyncd server or configure
  your existing one to serve pubd's output. There's a sample configuration file
  in $top/rpkid/examples/rsyncd.conf, but you may need to do something more
  complicated if you are already running rsyncd for other purposes. See the
  rsync(1) and rsyncd.conf(5) manual pages for more details.

* Start the daemons. You can use $top/rpkid/start-servers.py to do this, or
  write your own script. If you intend to run pubd, you should make sure that
  the directory you specified as publication_base_directory exists and is
  writable by the userid that will be running pubd, and should also make sure
  to start rsyncd.

* Run myrpki's configure_daemons command, twice, with no arguments. You need to
  run the command twice because myrpki has to ask rpkid to create a keypair and
  generate a certification request for the BSC. The first pass does this, the
  second processes the certification request, issues the BSC, and loads the
  result into rpkid. [Yes, we could automate this somehow, if necessary.]
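
Why two passes are needed can be shown with a toy model. Nothing below talks
to a real daemon or touches real keys: ToyRpkid, configure_daemons_pass, and
the placeholder strings are all hypothetical, and the model only mirrors the
order of events described above.

```python
class ToyRpkid:
    """Toy stand-in for rpkid's BSC state; not the real daemon."""

    def __init__(self):
        self.pkcs10_request = None   # set once a keypair and request exist
        self.bsc_certificate = None  # set once the BSC has been issued


def configure_daemons_pass(rpkid):
    """One pass: request a BSC if there is none, else issue a pending one."""
    if rpkid.bsc_certificate is not None:
        return "nothing to do"
    if rpkid.pkcs10_request is None:
        # First pass: ask rpkid to create a keypair and certification request.
        rpkid.pkcs10_request = "toy-pkcs10-request"
        return "requested BSC keypair and certification request"
    # Second pass: issue the BSC certificate and load it into rpkid.
    rpkid.bsc_certificate = "toy-cert-for-" + rpkid.pkcs10_request
    return "issued BSC certificate and loaded it into rpkid"
```

The certificate cannot be issued until the request exists, and the request
only exists after rpkid has been asked for it, hence the second run.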

At this point, if everything went well, rpkid should be up, configured, and
starting to obtain resource certificates from its parents, generate CRLs and
manifests, and so forth. At this point you should go figure out how to use the
relying party tool, rcynic: see $top/rcynic/README if you haven't already done
so.

If and when you change your CSV files, you should run configure_daemons again
to feed the changes into the daemons.

 Hosting case

If you are running rpkid not just for your own resources but also to host other
resource holders (see hosted case above), your setup will be almost the same as
in the self-hosted case (see self-hosted case, above), with one procedural
change: you will need to tell configure_daemons to process the XML files
produced by the resource holders you are hosting. You do this by specifying the
names of all those XML files as arguments to the configure_daemons command.
So, if you are hosting two friends, Alice and Bob, then, everywhere the
instructions for the self-hosted case say to run configure_daemons with no
arguments, you will instead run it with the names of Alice's and Bob's XML
files as arguments.
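
For scripting, the difference between the two cases comes down to the argument
list. The sketch below only builds that list and does not run anything. The
program name "myrpki" is an assumption (use whatever path invokes myrpki in
your checkout), and alice.xml and bob.xml are hypothetical file names.

```python
def configure_daemons_argv(hosted_xml_files=(), program="myrpki"):
    """Build the command line for one configure_daemons run.

    With no files this is the self-hosted invocation; with files it is the
    hosting invocation, one argument per hosted resource holder.
    """
    return [program, "configure_daemons"] + list(hosted_xml_files)


self_hosted = configure_daemons_argv()
hosting = configure_daemons_argv(["alice.xml", "bob.xml"])
```

A list in this form could then be handed to something like subprocess.call
from a cron job or test script.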

Note that configure_daemons sometimes modifies these XML files, in which case
it will write them back to the same filenames. While it is possible to figure
out the set of circumstances in which this will happen (at present, only when
myrpki has to ask rpkid to create a new BSC keypair and PKCS #10 certificate
request), it may be easiest just to ship back an updated copy of the XML file
after every time you run configure_daemons.

 "Pure" hosting case

In general we assume that anybody who bothers to run rpkid is also a resource
holder, but the software does not insist on this.

  Todo:
      Er, well, rpkid doesn't, but myrpki now does -- "pure" hosting was an
      unused feature that fell by the wayside while simplifying the user
      interface. It would be relatively straightforward to add it back if we
      ever need it for anything, but the mechanism it used to use no longer
      exists -- the old [myirbe] section of the config file has been collapsed
      into the [myrpki] section, so testing for existence of the [myrpki]
      section no longer works. So we'll need an explicit configuration option,
      no big deal, just not worth chasing now.

A (perhaps) plausible use for this capability would be if you are an rpkid-
running resource holder who wants for some reason to keep the resource-holding
side of your operation completely separate from the rpkid-running side of your
operation. This is essentially the pure-hosting model, just with an internal
hosted entity within a different part of your own organization.

 Troubleshooting

If you run into trouble setting up this package, the first thing to do is
categorize the kind of trouble you are having. If you've gotten far enough to
be running the daemons, check their log files. If you're seeing Python
exceptions, read the error messages. If you're getting TLS errors, check to
make sure that you're using all the right BPKI certificates and service contact
URLs.

TLS configuration errors are, unfortunately, notoriously difficult to debug,
because connection failures due to misconfiguration happen early, deep in the
guts of the OpenSSL TLS code, where there isn't enough application context
available to provide useful error messages.

If you've completed the steps above, everything appears to have gone OK, but
nothing seems to be happening, the first thing to do is check the logs to
confirm that nothing is actively broken. rpkid's log should include messages
telling you when it starts and finishes its internal "cron" cycle. It can take
several cron cycles for resources to work their way down from your parent into
a full set of certificates and ROAs, so have a little patience. rpkid's log
should also include messages showing every time it contacts its parent(s) or
attempts to publish anything.

rcynic in fully verbose mode provides a fairly detailed explanation of what
it's doing and why objects that fail have failed.

You can use rsync (sic) to examine the contents of a publication repository one
directory at a time, without attempting validation, by running rsync with just
the URI of the directory on its command line:

     $ rsync rsync://rpki.example.org/where/ever/

 Known Issues

The lxml package provides a Python interface to the Gnome libxml2 and libxslt C
libraries. This code has been quite stable for several years, but initial
testing with lxml compiled and linked against a newer version of libxml2 ran
into problems (specifically, gratuitous RelaxNG schema validation failures).
libxml2 2.7.3 worked; libxml2 2.7.5 did not work on the test machine in
question. Reverting to libxml2 2.7.3 fixed the problem. Rewriting the two lines
of Python code that were triggering the lxml bug appears to have solved the
problem, so the code now works properly with libxml2 2.7.5, but if you start
seeing weird XML validation failures, it might be another variation of this
lxml bug.

An earlier version of this code ran into problems with what appears to be an
implementation restriction in the GNU linker ("ld") on 64-bit hardware,
resulting in obscure build failures. The workaround for this required use of
shared libraries and is somewhat less portable than the original code, but
without it the code simply would not build in 64-bit environments with the GNU
tools. The current workaround appears to behave properly, but the workaround
requires that the pathname to the RFC-3779-aware OpenSSL shared libraries be
built into the _POW.so Python extension module. At the moment, in the absence
of "make install" targets for the Python code and libraries, this means the
build directory; eventually, once we're using autoconf and installation
targets, this will be the installation directory. If necessary, you can
override this by setting the LD_LIBRARY_PATH environment variable; see the
ld.so man page for details. This is a relatively minor variation on the usual
build issues for shared libraries; it's just annoying because shared libraries
should not be needed here and would not be if not for this GNU linker issue.
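
If you do need the LD_LIBRARY_PATH override when launching the tools from a
script, one way to prepare the environment is sketched below. The function
name and the directory used in the example are hypothetical; substitute
wherever your RFC-3779-aware OpenSSL shared libraries actually live.

```python
import os


def env_with_library_path(lib_dir, environ=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if environ is None else environ)
    existing = env.get("LD_LIBRARY_PATH")
    # LD_LIBRARY_PATH is a colon-separated list; earlier entries win.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + ":" + existing
    return env


# Hypothetical directory, shown only to illustrate the call.
example_env = env_with_library_path("/opt/rpki/openssl/lib", {})
```

The returned dictionary can be passed as the env argument when starting a
child process, leaving your own shell environment untouched.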