$Id$ -*- Text -*-

Python RPKI production tools.

Requires Python 2.5.

See doc/Installation for installation instructions and required
packages.

$Revision$

TO DO:

* Make rpkid fully event-driven (async tasking model).

  STATUS: Done

* Error handling: make sure that exceptions map correctly to up-down
  error codes, flesh out left-right error codes.  Note that the same
  exception may produce different error codes depending on which
  up-down PDU we're processing (sigh).  Will require code audit for
  coherency, which is most of the work.

  TIME REQUIRED: Two weeks

  DEPENDS ON: almost everything else, as almost any code change can
  raise new exceptions that we'd need to handle.

  STATUS: Not started

* db.commit(), db.rollback(), code audit for data integrity issues,
  fix any data integrity issues that turn up.  Among other issues,
  need to handle loss of connection to database server and other
  MySQL errors.  Need to be careful about recovery action depending
  on whether we had uncommitted changes.

  TIME REQUIRED (commit and rollback): 3-4 weeks

  TIME REQUIRED (data integrity audit): 1 week

  TIME REQUIRED (fix data integrity): Unknown, depends on code audit
  and results of runtime testing.

  DEPENDS ON: async tasking model rollback.

  STATUS: Not started

* Test framework for multiple self-instances per engine-instance.

  STATUS: Done

* Replace tlslite with something based on OpenSSL TLS code.

  STATUS: Done

* Resource subsetting (req_* attributes in up-down protocol), full
  implementation.  Requires expanding SQL child_cert table to hold
  subset masks and rewriting a fair amount of code.

  TIME REQUIRED: 3-4 weeks

  STATUS: Not started

* Performance testing.  Some very preliminary tests show a hotspot in
  the TLS code, but further testing will be needed, particularly
  after the async tasking model change.

  STATUS: Barely started

* Clean up rootd.py to be usable in a production system.  Most urgent
  issues are handling of private keys, publishing outputs in pubd,
  and reissuing when details or keys change.
  May not need much else, as this is not a high-traffic server.

  TIME REQUIRED: One week

  STATUS: Not started

* Update internals docs (Doxygen).  Mostly this means updating
  function comments in the Python code, as the rest is automatic.
  May require a bit more overview text to explain the workings and
  usage of the code.

  TIME REQUIRED: One week.

  STATUS: Ongoing

* Add HSM support.  Architecture includes it, current code does not.
  First step here would be talking to somebody with strong
  understanding of PKCS #11.

  TIME REQUIRED: Unknown

  STATUS: Not started

* Installation packaging, so that rpkid can be built and installed
  like a normal package.

  TIME REQUIRED: One week, longer if installation for many platforms
  is required

  STATUS: Not started

* Tighten up syntax checking in left-right schema.

  TIME REQUIRED: One day.

  STATUS: Not started

* Rethink exposing SQL primary indices in protocols.  Right now,
  auto-incremented SQL indices are used in many places in the
  left-right protocol, and are even exposed in a few places in our
  implementation of the up-down protocol.  This is nicely unique but
  may be operationally fragile, since up-down usage means that URLs
  contain mechanically assigned identifiers rather than an identifier
  negotiated between the two parties during contract setup.  Review
  by RIPE NCC staff suggested that we should instead use something
  like a hash of the client's name, which would be probabilistically
  unique and would not expose information, but would be stable even
  if we had to rebuild the database.

  TIME REQUIRED: One week to evaluate.  Implementation time if we
  decide to make a change unknown, but probably on the order of
  another week.

  STATUS: Done

* rcynic handling of RPKI trust anchors does not yet match most
  recent agreement by design team.  Currently waiting for an OID
  assignment for the CMS-wrapped indirection format that the design
  team settled on.
  TIME REQUIRED: Three days

  DEPENDS ON: OID assignment

  STATUS: Not started

* Publication protocol ACL checking may need revisiting.  Tricky bit
  is making sure that repository receives enough information to know
  whether parent has authorized child to use parent's namespace in
  nesting case; in theory this is straightforward but requires
  careful checking.  Current implementation just uses a configured
  path check and does not attempt to trace back to permission from
  parent in nested publication case.  Class and method design is
  intended to make it easy to drop in additional checks if needed.

  STATUS: Trivial version (required path check) done.

* Deaddrop of incoming messages, for audit.  Absent a better theory,
  steal existing tech for this: preface with minimal RFC 2822 header
  and drop it into a Maildir folder using built-in Python Maildir
  library code, at which point it becomes somebody else's problem.

  STATUS: Not started

* Investigate using EKU (RFC 3280 4.2.1.13) as an alternative to
  wiring in BPKI EE certs for left-right protocol.

  STATUS: Not started

* Rethink current ROA generation scheme: why are we pushing objects
  into rpkid instead of letting rpkid pull data from irdbd as it does
  for resources?  Is there a better way to represent proto-ROA data
  in SQL so that we can query directly for the resources we need (NB:
  this might require rethinking other rpkid SQL tables)?

  STATUS: Done

* Really need scripts and better doc on BPKI setup.

  TIME REQUIRED: One week

  STATUS: Not started

* Testing of this by anybody but the author and a few friends is
  going to require some kind of user interface.  Python based web UI
  is probably the most cost effective approach, Django might be a
  good base for this.
  Some of the operations suggested in an initial brainstorming
  session on this are outside the scope of what rpkid currently knows
  how to do (eg, signing S/MIME "please route" messages), so one of
  the tasks here is to see if trying to write a user interface sheds
  light on required features that are currently missing.

  STATUS: Not started

* At present there is no mechanism by which an IRBE could request
  signing of objects other than ROAs.  Eg, there has been some
  discussion of signing S/MIME letters to humans asking for routing,
  as an alternative to ROAs.  If we decide to support this at all, it
  turns into a generalization of the ROA problem, and suggests that
  perhaps ROA generation should be handled somewhere outside of rpkid
  and only passed to rpkid for signing.  This would be a significant
  change to the architecture, as it would remove rpkid's
  responsibility for keeping ROAs up to date.

  STATUS: Not started

Other random notes:

Being able to specify interaction with other servers (not running
under testbed) in a testbed.yaml might be useful for interop tests.
Kind of breaks testbed's fundamental model, though.  Replacing what
testbed thinks is a leaf with somebody else would be easy, so maybe
we could specify some way to hang a bunch of rpkids under an external
parent?  Hmm, data needed would look a lot like testpoke.yaml, maybe
we can reuse some of that language?

There's a three-way tradeoff lurking in the publication protocol,
manifest generation, and CRL generation:

1) Consistency issues for relying parties (eg, don't want to withdraw
   something that's still listed in the manifest);

2) Efficiency issues for the RPKI engine (eg, generating a new
   manifest for each individual change during a batch run could be
   expensive, would prefer to batch up the changes into a single
   manifest run); and

3) Coherency issues for the RPKI engine (don't want to defer things
   that could result in loss of state if something bad happens).

Considerations (1) and (3) have to dominate, which may mean we take a
hit on (2).
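
Rough sketch of how the ordering in the tradeoff above might look in
code.  This is illustration only, not rpkid code: PublicationBatch and
its methods are hypothetical names.  It queues changes, emits one
manifest per batch (consideration 2), and only withdraws an object
after the manifest that stops listing it has gone out (consideration
1).  It keeps everything in memory and so does nothing about
consideration (3); a real version would need to persist the queue
before acknowledging anything.

```python
# Hypothetical sketch only: none of these names exist in rpkid.

class PublicationBatch(object):
    """Queue publish/withdraw requests and emit one manifest per batch."""

    def __init__(self, published):
        # Names of objects currently present in the repository.
        self.published = set(published)
        self.to_publish = set()
        self.to_withdraw = set()

    def publish(self, name):
        # A later publish cancels an earlier queued withdraw.
        self.to_withdraw.discard(name)
        self.to_publish.add(name)

    def withdraw(self, name):
        # A later withdraw cancels an earlier queued publish.
        self.to_publish.discard(name)
        self.to_withdraw.add(name)

    def flush(self):
        """Return the ordered list of steps for one batch run."""
        steps = []
        # New objects go out first, so the manifest never lists
        # something that is not yet in the repository.
        for name in sorted(self.to_publish):
            steps.append(("publish", name))
        # One manifest for the whole batch.
        contents = (self.published | self.to_publish) - self.to_withdraw
        steps.append(("manifest", tuple(sorted(contents))))
        # Withdraw last, after the manifest has stopped listing them.
        for name in sorted(self.to_withdraw & self.published):
            steps.append(("withdraw", name))
        self.published = contents
        self.to_publish.clear()
        self.to_withdraw.clear()
        return steps
```

Usage would be along the lines of creating a PublicationBatch with the
current repository contents, calling publish() and withdraw() as
changes arrive, then calling flush() once at the end of the batch run
and executing the returned steps in order.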
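
Rough sketch of the deaddrop item from the TO DO list above, using the
standard library mailbox and email modules.  Illustration only: the
function name, default sender, and subject are made up, and the choice
of headers (From and Date are the two that RFC 2822 requires, plus
Message-ID and Subject for convenience) is an assumption about what
"minimal" should mean here.

```python
import email.message
import email.utils
import mailbox


def deaddrop(maildir_path, payload,
             sender="rpkid@localhost",      # hypothetical default
             subject="rpkid deaddrop"):     # hypothetical default
    """Wrap payload in a minimal RFC 2822 message and store it.

    Returns the key that the Maildir assigned to the new message.
    """
    msg = email.message.Message()
    msg["From"] = sender
    msg["Date"] = email.utils.formatdate()
    msg["Message-ID"] = email.utils.make_msgid()
    msg["Subject"] = subject
    msg.set_payload(payload)
    # create=True builds the tmp/new/cur layout, but only when
    # maildir_path itself does not exist yet.
    box = mailbox.Maildir(maildir_path, factory=None, create=True)
    return box.add(msg)
```

The payload is assumed to be ASCII text (eg, an already-serialized XML
message); anything else would need a charset or an encoding step
before set_payload().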