Diffstat (limited to 'rpkid.stable/README')
-rw-r--r-- | rpkid.stable/README | 259 |
1 files changed, 259 insertions, 0 deletions
diff --git a/rpkid.stable/README b/rpkid.stable/README
new file mode 100644
index 00000000..be3da4bf
--- /dev/null
+++ b/rpkid.stable/README
@@ -0,0 +1,259 @@
+$Id$ -*- Text -*-
+
+Python RPKI production tools.
+
+Requires Python 2.5.
+
+See doc/Installation for installation instructions and required
+packages.
+
+
+
+$Revision$
+
+TO DO:
+
+
+ * Make rpkid fully event-driven (async tasking model), except for SQL
+   queries. This may involve the "twisted" framework.
+
+   TIME REQUIRED: Two weeks.
+
+   STATUS: Not started
+
+ * Error handling: make sure that exceptions map correctly to up-down error
+   codes, flesh out left-right error codes. Note that the same exception may
+   produce different error codes depending on which up-down PDU we're
+   processing (sigh).
+
+   Will require code audit for coherency, which is most of the work.
+
+   TIME REQUIRED: Two weeks
+
+   DEPENDS ON: almost everything else, as almost any code change can raise new
+   exceptions that we'd need to handle.
+
+   STATUS: Not started
+
+ * db.commit(), db.rollback(), code audit for data integrity issues, fix any
+   data integrity issues that turn up. Among other issues, need to handle loss
+   of connection to database server and other MySQL errors. Need to be careful
+   about recovery action depending on whether we had uncommitted changes.
+
+   TIME REQUIRED (commit and rollback): 3-4 weeks
+
+   TIME REQUIRED (data integrity audit): 1 week
+
+   TIME REQUIRED (fix data integrity): Unknown, depends on code audit and results
+   of runtime testing.
+
+   DEPENDS ON: async tasking model rollback.
+
+   STATUS: Not started
+
+ * Test framework for multiple self-instances per engine-instance (single
+   self-instance per engine-instance already done).
+
+   DEPENDS ON: Async tasking model.
+
+   TIME REQUIRED: One week
+
+   STATUS: Not started
+
+ * Current TLS code (tlslite) is flaky and slow. Unless I can
+   find a good Python TLS interface that somebody else is
+   maintaining, best option would be to add TLS support to POW.
+
+   TIME REQUIRED: 3-4 weeks
+
+   DEPENDS ON: Async tasking model.
+
+   STATUS: Not started
+
+ * Resource subsetting (req_* attributes in up-down protocol), full
+   implementation. Requires expanding SQL child_cert table to hold subset
+   masks and rewriting a fair amount of code.
+
+   TIME REQUIRED: 3-4 weeks
+
+   STATUS: Not started
+
+ * Performance testing. Some very preliminary tests show a
+   hotspot in the TLS code, but further testing will be needed,
+   particularly after the async tasking model change.
+
+   STATUS: Barely started
+
+ * Clean up rootd.py to be usable in a production system. Most
+   urgent issues are handling of private keys, publishing outputs
+   in pubd, and reissuing when details or keys change. May not
+   need much else, as this is not a high-traffic server.
+
+   TIME REQUIRED: One week
+
+   STATUS: Not started
+
+ * Update internals docs (Doxygen). Mostly this means updating
+   function comments in the Python code, as the rest is
+   automatic. May require a bit more overview text to explain
+   the workings and usage of the code.
+
+   TIME REQUIRED: One week.
+
+   STATUS: Ongoing
+
+ * Add HSM support. Architecture includes it, current code does
+   not. First step here would be talking to somebody with strong
+   understanding of PKCS #11.
+
+   TIME REQUIRED: Unknown
+
+   STATUS: Not started
+
+ * Installation packaging, so that rpkid can be built and installed like a
+   normal package.
+
+   TIME REQUIRED: One week, longer if installation for many platforms is
+   required
+
+   STATUS: Not started
+
+ * Tighten up syntax checking in left-right schema.
+
+   TIME REQUIRED: One day.
+
+   STATUS: Not started
+
+ * Rethink exposing SQL primary indices in protocols. Right now,
+   auto-incremented SQL indices are used in many places in the
+   left-right protocol, and are even exposed in a few places in
+   our implementation of the up-down protocol. This is nicely
+   unique but may be operationally fragile, since up-down usage
+   means that URLs contain mechanically assigned identifiers
+   rather than an identifier negotiated between the two parties
+   during contract setup.
+
+   Review by RIPE NCC staff suggested that we should instead use
+   something like a hash of the client's name, which would be
+   probabilistically unique and would not expose information, but
+   would be stable even if we had to rebuild the database.
+
+   TIME REQUIRED: One week to evaluate. Implementation time if we decide to make a
+   change unknown, but probably on the order of another week.
+
+   STATUS: Not started
+
+ * IETF SIDR WG is still talking about ROAs with multiple
+   signatures. No obvious need for this but IETF may mandate it
+   anyway. Full implementation would require significant work
+   revising current SQL table relations and upgrading CMS
+   support, and would also require nontrivial rewrite of rcynic.
+
+   TIME REQUIRED: Unknown
+
+   STATUS: Not started
+
+ * rcynic handling of RPKI trust anchors does not yet match most
+   recent agreement by design team. Currently waiting for an OID
+   assignment for the CMS-wrapped indirection format that the
+   design team settled on.
+
+   TIME REQUIRED: Three days
+
+   DEPENDS ON: OID assignment
+
+   STATUS: Not started
+
+ * Publication protocol ACL checking may need revisiting. Tricky
+   bit is making sure that repository receives enough information
+   to know whether parent has authorized child to use parent's
+   namespace in nesting case; in theory this is straightforward
+   but requires careful checking. Current implementation just
+   uses a configured path check and does not attempt to trace
+   back to permission from parent in nested publication case.
+   Class and method design is intended to make it easy to drop in
+   additional checks if needed.
+
+   STATUS: Trivial version (required path check) done.
+
+ * Deaddrop of incoming messages, for audit. Absent a better
+   theory, steal existing tech for this: preface with minimal RFC
+   2822 header and drop it into a Maildir folder using built-in
+   Python Maildir library code, at which point it becomes somebody
+   else's problem.
+
+   STATUS: Not started
+
+ * Investigate using EKU (RFC 3280 4.2.1.13) as an alternative to
+   wiring in BPKI EE certs for left-right protocol.
+
+   STATUS: Not started
+
+ * Rethink current ROA generation scheme: why are we pushing
+   <route_origin/> objects into rpkid instead of letting rpkid
+   pull data from irdbd as it does for resources? Is there a
+   better way to represent proto-ROA data in SQL so that we can
+   query directly for the resources we need (NB: this might
+   require rethinking other rpkid SQL tables)?
+
+   STATUS: Not started
+
+ * Really need scripts and better doc on BPKI setup.
+
+   TIME REQUIRED: One week
+
+   STATUS: Not started
+
+ * Testing of this by anybody but the author and a few friends is
+   going to require some kind of user interface. Python-based
+   web UI is probably the most cost-effective approach; Django
+   might be a good base for this. Some of the operations
+   suggested in an initial brainstorming session on this are
+   outside the scope of what rpkid currently knows how to do (eg,
+   signing S/MIME "please route" messages), so one of the tasks
+   here is to see if trying to write a user interface sheds light
+   on required features that are currently missing.
+
+   STATUS: Not started
+
+ * At present there is no mechanism by which an IRBE could
+   request signing of objects other than ROAs. Eg, there has
+   been some discussion of signing S/MIME letters to humans
+   asking for routing, as an alternative to ROAs. If we decide
+   to support this at all, it turns into a generalization of the
+   ROA problem, and suggests that perhaps ROA generation should
+   be handled somewhere outside of rpkid and only passed to rpkid
+   for signing. This would be a significant change to the
+   architecture, as it would remove rpkid's responsibility for
+   keeping ROAs up to date.
+
+   STATUS: Not started
+
+
+
+Other random notes:
+
+Being able to specify interaction with other servers (not running
+under testbed) in a testbed.yaml might be useful for interop tests.
+Kind of breaks testbed's fundamental model, though. Replacing what
+testbed thinks is a leaf with somebody else would be easy, so maybe we
+could specify some way to hang a bunch of rpkids under an external
+parent? Hmm, data needed would look a lot like testpoke.yaml, maybe
+we can reuse some of that language?
+
+There's a three-way tradeoff lurking in the publication protocol,
+manifest generation, and CRL generation:
+
+1) Consistency issues for relying parties (eg, don't want to withdraw
+   something that's still listed in the manifest);
+
+2) Efficiency issues for the RPKI engine (eg, generating a new
+   manifest for each individual change during a batch run could be
+   expensive, would prefer to batch up the changes into a single
+   manifest run); and
+
+3) Coherency issues for the RPKI engine (don't want to defer things
+   that could result in loss of state if something bad happens).
+
+Considerations (1) and (3) have to dominate, which may mean we take a
+hit on (2).
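The deaddrop item in the TO DO list (minimal RFC 2822 header, then file the message in a Maildir) might look roughly like the sketch below. It is written against the Python 3 standard library rather than the Python 2.5 the README targets. The function name `deaddrop`, its parameters, and the default subject are hypothetical, and the payload is assumed to be ASCII text such as an XML PDU.

```python
import mailbox
from email.message import Message
from email.utils import formatdate, make_msgid


def deaddrop(maildir_path, payload, sender="unknown", subject="incoming PDU"):
    """File one incoming message in a Maildir for later audit.

    Hypothetical helper, not part of rpkid. Assumes `payload` is ASCII text.
    Returns the key the Maildir assigned to the stored message.
    """
    msg = Message()
    # Minimal RFC 2822 header: originator and origination date, plus
    # a Message-ID and Subject to make the audit trail easier to browse.
    msg["From"] = sender
    msg["Date"] = formatdate()
    msg["Message-ID"] = make_msgid()
    msg["Subject"] = subject
    msg.set_payload(payload)

    # create=True builds the tmp/new/cur directories on first use.
    box = mailbox.Maildir(maildir_path, create=True)
    try:
        return box.add(msg)
    finally:
        box.close()
```

Because the result is an ordinary Maildir, any mail reader or script that understands that format can be used to inspect the audit trail afterwards.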
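For the item on replacing auto-incremented SQL indices with something like a hash of the client's name, a minimal sketch follows. The choice of SHA-256, the whitespace and case normalization, and the 16-character truncation are all assumptions made for illustration; the README does not specify any of them, and `stable_handle` is a hypothetical name.

```python
import hashlib


def stable_handle(client_name, length=16):
    """Derive an identifier from a client's name (hypothetical helper).

    The result depends only on the name, so it would survive a database
    rebuild. Truncating the digest raises the collision probability, so
    `length` trades identifier size against uniqueness.
    """
    # Assumed normalization: collapse whitespace and ignore case so that
    # trivially different spellings map to the same handle.
    normalized = " ".join(client_name.split()).lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:length]
```

Such a handle is only probabilistically unique, so a real deployment would still need to detect and resolve collisions when a new client is registered.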
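The "configured path check" mentioned under publication protocol ACL checking could be sketched as below. This is not the rpkid implementation: `uri_under_base` is a hypothetical name, the example URIs are invented, and percent-encoding is not handled. It only illustrates the idea of requiring a published URI to fall under a configured base URI.

```python
import posixpath
from urllib.parse import urlsplit


def uri_under_base(uri, base):
    """Return True if `uri` lies at or below `base` (hypothetical helper).

    Scheme and host must match exactly; paths are normalized first so
    that ".." segments cannot be used to escape the configured subtree.
    """
    u, b = urlsplit(uri), urlsplit(base)
    if (u.scheme, u.netloc) != (b.scheme, b.netloc):
        return False
    # Normalize both paths and compare with a trailing slash so that
    # "/rpki/alice2" is not treated as being under "/rpki/alice".
    base_path = posixpath.normpath(b.path or "/").rstrip("/") + "/"
    path = posixpath.normpath(u.path or "/") + "/"
    return path.startswith(base_path)
```

A check of this kind says nothing about whether a parent has authorized a child to use its namespace in the nested case; that would be one of the additional checks the README says could be dropped in later.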