Diffstat (limited to 'scripts/README')
-rw-r--r-- | scripts/README | 466 |
1 files changed, 0 insertions, 466 deletions
diff --git a/scripts/README b/scripts/README
deleted file mode 100644
index 3bb44561..00000000
--- a/scripts/README
+++ /dev/null
@@ -1,466 +0,0 @@
-$Id$ -*- Text -*-
-
-Python RPKI production tools.
-
-Requires Python 2.5.
-
-External Python packages required:
-
-- lxml, which in turn requires the libxml2 C libraries.
-
-  http://codespeak.net/lxml/
-
-  FreeBSD: /usr/ports/devel/py-lxml
-
-- MySQLdb, which in turn requires MySQL client and server. I'm
-  testing with MySQL 5.1.
-
-  http://sourceforge.net/projects/mysql-python/
-
-  FreeBSD: /usr/ports/databases/py-MySQLdb
-
-- TLSLite, which pulls in other crypto packages.
-
-  http://trevp.net/tlslite/
-
-  FreeBSD: /usr/ports/security/py-tlslite
-
-- Cryptlib, at the moment just to support TLSlite but may end up using
-  it for other things later.
-
-  http://www.cs.auckland.ac.nz/~pgut001/cryptlib/
-
-  FreeBSD: /usr/ports/security/cryptlib
-
-  ...but the FreeBSD port doesn't (yet?) install the Python bindings,
-  sigh, so at the moment you have to do that by hand:
-
-  # cd /usr/ports/security/cryptlib
-  # make install
-  # cd work/bindings
-  # python setup.py install
-  # cd ../..
-  # make clean
-
-- Eventually I expect that this will require an event-handling package
-  like Twisted, but I'm not there yet.
-
-- The testpoke tool (up-down protocol command line test client) and
-  testbed tools also use PyYAML.
-
-  http://pyyaml.org/
-
-  FreeBSD: /usr/ports/devel/py-yaml
-
-We also use a hacked copy of the Python OpenSSL Wrappers (POW)
-package, but our copy has enough modifications that it's expanded in
-the Subversion tree. Depending on how this all works out, I may end
-up splitting the POW.pkix module out of the POW package and using it
-with Cryptlib, as the POW.pkix package is 98% about doing ASN.1 in
-pure Python and only 2% about any kind of crypto.
-
-
-
-$Revision$
-
-TO DO:
-
-- Scripted tests to grow and shrink and revoke and .... See
-  testbed.*.yaml, but more systematic testing needed.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: open-ended
-
-  STATUS: Ongoing
-
-- Randy's "user validation tool" (fetch and validate certs and
-  probably the ROA for a prefix I want to accept in a route filter I
-  am building in Python/Perl). This probably uses rcynic's output as
-  one of its inputs.
-
-  This is a basic tool for a sysadmin who wants to -use- all this crud
-  we're working so hard to generate. It's not required for the
-  generation tools to work, but without it the entire toolset does
-  nothing obviously useful, which will make it a very hard sell during
-  the limited public test stage.
-
-  PRIORITY: Required
-
-  DEPENDS ON: ROA generation
-
-  TIME REQUIRED: three days
-
-  STATUS: Not started
-
-- Common protocol dump format with APNIC and other implementors so we
-  can read each other's dumps. "Obvious" format would be an
-  OpenSSL-style PEM of the CMS, with a "text" portion (the place where
-  "openssl x509 -text" would put a text dump of a cert) showing the
-  wrapped XML.
-
-  PRIORITY: Desirable
-
-  TIME REQUIRED: one day
-
-  STATUS: Not started
-
-- Clean unused cruft out of left-right protocol, or at least have
-  control booleans we don't intend to implement at present signal an
-  error if used.
-
-  Bottleneck here has been deciding what to punt and what to
-  implement. Removing unused booleans or raising errors when they're
-  used is trivial.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: Less than one day
-
-  STATUS: Error signalling done
-
-- resource_set_notafter attribute added to RelaxNG but not yet to
-  rpki.up_down.class_elt. Need to convert to and from
-  rpki.sundial.datetime. This is an up-down protocol feature that was
-  added fairly late and that none of us properly implement yet, but
-  failing to handle it would be a spec violation and eventually cause
-  an interop problem.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: Less than one day
-
-  STATUS: Done
-
-- Publication protocol and implementation thereof. Protocol design
-  started, Randy had comments that sent me back to the drawing board
-  (he was right). Next step is to integrate Randy's advice, which
-  probably means picking up more of the left-right protocol framework.
-
-  Desirable although not strictly required that protocol be agreed upon
-  among the RIRs. Might not be practical given how long it takes
-  group to decide anything.
-
-  Tricky bit is making sure that repository receives enough
-  information to know whether parent has authorized child to use
-  parent's namespace in nesting case. In theory this is
-  straightforward but requires careful checking.
-
-  ARIN can't host output of non-hosted RPKI engines without this, and
-  that's critical to the security model as discussed with ARIN
-  staff in late 2006, so I believe we need this capability even as
-  part of the initial limited test.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: 1-2 weeks for implementation once protocol settled,
-  depending on how much of the protocol and implementation I can steal
-  from the existing left-right protocol.
-
-  STATUS: Started
-
-- Subsetting (req_* attributes in up-down protocol)
-
-  Minimal implementation would be to recognize this as correct
-  protocol and signal an internal server error if it's ever used.
-
-  More serious implementation would require expanding SQL child_cert
-  table to hold subset masks and tweaking almost every bit of code
-  that touches that table.
-
-  PRIORITY: Required
-
-  TIME REQUIRED (minimal version): One day
-
-  TIME REQUIRED (real version): 1-2 weeks
-
-  STATUS: Not started
-
-- Error handling: make sure that exceptions map correctly to up-down
-  error codes, flesh out left-right error codes. Note that the same
-  exception may produce different error codes depending on which
-  up-down PDU we're processing (sigh).
-
-  Will require code audit for coherency.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: four days
-
-  DEPENDS ON: almost everything else, as almost any code change can
-  raise new exceptions that we'd need to handle.
-
-  STATUS: Not started
-
-- db.commit(), db.rollback(), code audit for data integrity issues,
-  fix any data integrity issues that turn up.
-
-  Among other issues, we need to handle loss of connection to
-  database server and other MySQL errors. MySQLdb throws an
-  exception, which we can catch, and retrying is easy enough, but need
-  to be careful about recovery action depending on whether we had
-  uncommitted changes.
-
-  PRIORITY: Required
-
-  TIME REQUIRED (commit and rollback): Two weeks
-
-  TIME REQUIRED (data integrity audit): Three days
-
-  TIME REQUIRED (fix data integrity): Unknown, depends on code audit
-  and results of runtime testing.
-
-  DEPENDS ON: async tasking model, sort of -- could do it first, but
-  tasking change will affect the exception handling that triggers
-  rollback.
-
-  STATUS: Not started
-
-- Test with larger data set -- Tim gave me plenty of data, I have the
-  low-level tools and the glue logic to create child objects for all
-  the entities in the IRDB, but I don't yet have logic to poll on
-  behalf of each of them and check result for sanity.
-
-  Maybe it'd be easier to write something that dumps Tim's database in
-  YAML format for testbed.py to chew on?
-
-  PRIORITY: Highly desirable
-
-  TIME REQUIRED (setup): One day to convert Tim's data to YAML
-
-  TIME REQUIRED (testing): Unknown, depends on what we turn up
-
-  STATUS: Not started
-
-- Clean up rootd.py to be usable in a production system. Most urgent
-  issue is handling of private keys. May not need much else, as this
-  is not a high-traffic server.
-
-  PRIORITY: Highly desirable (not strictly needed for limited testing)
-
-  TIME REQUIRED: Two days
-
-  STATUS: Not started
-
-- Test framework, multiple self-instances per engine-instance (single
-  self-instance per engine-instance is already done).
-
-  PRIORITY: Required
-
-  DEPENDS ON: async tasking model.
-
-  TIME REQUIRED: One week
-
-  STATUS: Not started
-
-- tlslite code seems flakey under heavy use, and doesn't support all
-  the cert checks we want. Best bet for getting this right is
-  probably to hack on the POW Ssl class until it supports everything
-  shown in the OpenSSL book; aside from speed, the main advantage here
-  is that there -is- a list of all the things one needs to do to use
-  TLS properly if one follows this recipe, whereas with TLSlite it's
-  all a mystery.
-
-  Useful side effect of doing this via POW: it brings us back to only
-  needing one crypto library (in particular it lets us punt M2Crypto,
-  which appears to be coded as an accident waiting to happen).
-
-  PRIORITY: Required (cert checking is a security issue).
-
-  TIME REQUIRED: Two weeks.
-
-  DEPENDS ON: Async tasking model.
-
-  STATUS: Not started
-
-- ROA generation. We have a bunch of the primitives for this but we
-  aren't yet generating the ROAs themselves.
-
-  PRIORITY: Required
-
-  TIME REQUIRED: Three days
-
-  STATUS: Not started
-
-- Make rpkid fully event-driven (async tasking model), except for SQL
-  queries. This probably involves the "twisted" framework.
-
-  PRIORITY: Required (to implement hosting model)
-
-  TIME REQUIRED: one week.
-
-  STATUS: Not started
-
-- Update biz trust anchor model to what we came up with in Amsterdam.
-  This was a direct result of security review by Kent and Housley.
-
-  This has been waiting for work we hope RobK is doing. This is
-  probably not a lot of coding, probably a few extra cert fields in
-  the self object which we then need to toss into the
-  rpki.x509.X509_chain objects before verifying CMS or TLS, and
-  perhaps the existing TA fields in various objects become pairs of
-  certs instead of a single TA, but this is mostly just generalization
-  and reuse of existing code, no bold new adventures.
-
-  PRIORITY: Required (security issue)
-
-  TIME REQUIRED: One week.
-
-  STATUS: Not started
-
-- Performance testing
-
-  STATUS: Not started
-
-- rcynic handling of RPKI trust anchors probably needs updating.
-  Discussions over last N months of how RPKI trust anchors work, how
-  we package them, and how we roll them over. The last (TA rollover)
-  is the driver for this.
-
-  Last I recall (need to check email archives) APNIC had proposed a
-  relatively simple format (CMS signed PEM-encoded X.509 object set,
-  or something like that). Need to do analysis to make sure this is
-  adequate for our needs, if so just use it. This would involve minor
-  changes to rcynic.
-
-  Alternatively, this could be a separate program to keep this grot
-  out of rcynic itself, but that's probably a usability nightmare.
-
-  PRIORITY: Required (usability issue for relying parties)
-
-  TIME REQUIRED: Three days.
-
-  STATUS: Not started
-
-- rcynic does not yet handle manifests. This is both a real problem
-  (manifests were added to plug a security hole) and a user acceptance
-  problem (without manifest support rcynic checks old certs that are
-  supposed to fail because they've been revoked, resulting in what
-  appear to be spurious errors, which just annoy the user).
-
-  PRIORITY: Required
-
-  TIME REQUIRED: One week.
-
-  STATUS: Not started
-
-- Update operation and installation docs.
-
-  Known current omissions: left-right "rekey" and "revoke" operations,
-  testbed.py's rootd_sia config option.
-
-  TIME REQUIRED (current work items): Less than one day
-
-  PRIORITY: Required
-
-  STATUS: Ongoing
-
-- Update internals docs (Doxygen). Mostly this means updating
-  function comments in the Python code, as the rest is automatic. May
-  require a bit of overview text to explain the workings of the code,
-  this overview text may well turn out to be just the current flat
-  text documents marked up for inclusion by Doxygen.
-
-  PRIORITY: Desirable
-
-  TIME REQUIRED: Two days
-
-  STATUS: Ongoing
-
-- Reorganize code (directory names, module names, which objects are in
-  which modules, add gctx pointers to objects so we can stop passing
-  all these flipping explicit gctx pointers in almost every function
-  call) to make it easier to understand and maintain. Portions of the
-  existing code were done in extreme haste to meet testing deadlines,
-  and it shows.
-
-  STATUS: Not started
-
-  TIME REQUIRED: two days
-
-  PRIORITY: Highly desirable (to preserve programmers' and
-  maintainers' sanity, if nothing else)
-
-- Add HSM support. Architecture includes it, current code does not.
-  First step here would be talking to somebody who understands PKCS#11
-  better than I do, ie, Richard Lamb or Francis Dupont.
-
-  STATUS: Not started
-
-  TIME REQUIRED: Unknown
-
-  PRIORITY: Desirable. Am guessing ARIN does not require this for
-  initial test
-
-
-
-Things implemented but not yet tested.
-
-- Client side of expiration now assumes that parent will reissue
-  when its IRDB changes.
-
-- Parent side of revocation (child_cert objects) and CRL generation
-  implemented.
-
-- Parent side of expiration implemented.
-
-- Child batch processing loop: regeneration or removal of expired
-  certs based on what's in the IRDB.
-
-- Batch regeneration of CRLs and manifests for all CAs.
-
-- Protection against up-down operations specifying a class_name that
-  belongs to some other self context.
-
-- Rewrote code that handles revoke on shrink to revoke -all- old certs
-  for that key, not just most recent. Not certain, but this may have
-  been the cause of a cert dropping not showing up in the CRL during
-  testing with APNIC in Vancouver.
-
-- Kludgy local publication hack seems to work now, including
-  withdrawal. rcynic still whines occasionally, but I think that's
-  just because, without manifest support, rcynic has no way of telling
-  the difference between certs we withdrew on purpose and certs that
-  were removed by an attacker, so the first rcynic run after a cert
-  has been revoked pulls the old cert from the previous rcynic pass,
-  finds that it's listed in the CRL, and whines about it.
-
-
-
-Other random notes:
-
-Being able to specify interaction with other servers (not running
-under testbed) in a testbed.yaml might be useful for interop tests.
-Kind of breaks testbed's fundamental model, though. Replacing what
-testbed thinks is a leaf with somebody else would be easy, so maybe we
-could specify some way to hang a bunch of rpkids under an external
-parent? Hmm, data needed would look a lot like testpoke.yaml, maybe
-we can reuse some of that language?
-
-There's a three-way tradeoff lurking in the publication protocol,
-manifest generation, and CRL generation:
-
-1) Consistency issues for relying parties (eg, don't want to withdraw
-   something that's still listed in the manifest);
-
-2) Efficiency issues for the RPKI engine (eg, generating a new
-   manifest for each individual change during a batch run could be
-   expensive, would prefer to batch up the changes into a single
-   manifest run); and
-
-3) Coherency issues for the RPKI engine (don't want to defer things
-   that could result in loss of state if something bad happens).
-
-Considerations (1) and (3) have to dominate, which may mean we take a
-hit on (2).
-
-Most of the explicit calls to sql_fetch*() are now encapsulated in
-one-line methods. The remaining ones are probably hints at minor bits
-of abstraction still to be done.
-
-Biz certs currently used by test scripts don't include SKI or AKI. I
-think this is because the test scripts use "openssl x509" rather than
-"openssl ca" when generating these certs. Not critical, and will
-probably become completely irrelevant with all-singing all-dancing
-post-Amsterdam biz cert scripts, but should not be a big problem to
-fix either if it gets in the way again.
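
Illustration only (not part of the deleted file): a minimal sketch of
the three-way tradeoff described under "Other random notes", assuming
a hypothetical in-memory repository. Changes made during a batch run
are queued, the manifest is regenerated once per batch, and withdrawn
objects are removed only after the new manifest has stopped listing
them. The names PublicationBatch, publish, withdraw, and flush are
invented for this example and do not come from the rpki code, and the
sketch is written in present-day Python rather than the Python 2.5
the README targets.

```python
class PublicationBatch:
    """Queue repository changes and emit one manifest per batch (hypothetical)."""

    def __init__(self, repository):
        # repository: dict mapping object name -> object bytes, a stand-in
        # for a real publication point.
        self.repository = repository
        self.manifest = sorted(repository)
        self.manifest_runs = 0
        self._to_publish = {}
        self._to_withdraw = set()

    def publish(self, name, data):
        # A later publish cancels an earlier queued withdraw of the same name.
        self._to_withdraw.discard(name)
        self._to_publish[name] = data

    def withdraw(self, name):
        # A later withdraw cancels an earlier queued publish of the same name.
        self._to_publish.pop(name, None)
        self._to_withdraw.add(name)

    def flush(self):
        # Step 1: add new objects first, so the new manifest never lists
        # an object that is not yet in the repository.
        self.repository.update(self._to_publish)
        # Step 2: one manifest run for the whole batch (consideration 2).
        self.manifest = sorted(set(self.repository) - self._to_withdraw)
        self.manifest_runs += 1
        # Step 3: only now remove withdrawn objects, so nothing listed in
        # the current manifest is ever missing (consideration 1).
        for name in self._to_withdraw:
            self.repository.pop(name, None)
        self._to_publish.clear()
        self._to_withdraw.clear()


# Usage: replace one cert with another in a single batch.
repo = {"old.cer": b"old"}
batch = PublicationBatch(repo)
batch.publish("new.cer", b"new")
batch.withdraw("old.cer")
batch.flush()
```

The queue here lives only in memory, so a crash before flush() would
lose the pending changes. That is exactly consideration (3), and a
real engine would presumably need to persist the queue or shorten the
batch window, at some cost to (2).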