$Id$ -*- Text -*-

Python RPKI production tools.

Requires Python 2.5.

See doc/Installation for installation instructions and required
packages.

The full manual is available in both PDF and HTML formats; the PDF is
in doc/manual.pdf, the HTML is in a compressed tarball
doc/manual.tar.gz.



$Revision$

TO DO:

      * Rework handling of surprising responses to up-down requests.
        Right now we get confused when we find that parent has issued
        a cert that we don't remember requesting, even when we have
        the ca_detail object in question sitting in our SQL as
        pending.  This can happen if we throw an exception later and
        don't clean up properly -- which should never happen, but
        let's try to be robust about this.

	So we need to be smarter about comparing our own state with
	what we get back from our parent and figuring out what to do
	next.  We probably also need to commit changes to SQL earlier.

	In general we should never have more than one ca_detail in
	state pending for a given ca, but the current code blindly
	assumes that will never happen and never recovers if that
	assumption has been violated.

	STATUS: Started, not complete.  Internal tracking state for
	whether objects have been published is in place, but there's
	no code yet to force retry of failed publication.  Some
	support for cleaning up extraneous ca_detail objects, dunno
	(yet) whether this is enough.

	TIME REQUIRED: One week (remaining).
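
	SKETCH: A minimal illustration of the "at most one pending
	ca_detail per ca" check, not code from this package.  It
	assumes ca_detail rows are available as (ca_id, ca_detail_id,
	state) tuples; that layout and the find_extra_pending() helper
	are hypothetical, and the example is written for current
	Python 3 rather than the Python 2.5 this package requires.

```python
from collections import defaultdict

def find_extra_pending(ca_details):
    """Return {ca_id: [ca_detail_id, ...]} for every ca that has more
    than one ca_detail in state "pending".

    ca_details is an iterable of (ca_id, ca_detail_id, state) tuples;
    this layout is an assumption made for the sketch.
    """
    pending = defaultdict(list)
    for ca_id, ca_detail_id, state in ca_details:
        if state == "pending":
            pending[ca_id].append(ca_detail_id)
    # Only report the cas that violate the "at most one pending" rule.
    return dict((ca_id, ids) for ca_id, ids in pending.items() if len(ids) > 1)

# Made-up sample data: ca 1 is fine, ca 2 has two pending ca_details.
rows = [(1, 10, "active"), (1, 11, "pending"),
        (2, 20, "pending"), (2, 21, "pending")]
violations = find_extra_pending(rows)
```

	This only detects the condition.  What to do with the extras
	(delete them, or match them against what the parent says it
	issued) is the open design question.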

      * Error handling: make sure that exceptions map correctly to
	up-down error codes, flesh out left-right error codes. Note
	that the same exception may produce different error codes
	depending on which up-down PDU we're processing (sigh).

	Will require code audit for coherency, which is most of the work.

	TIME REQUIRED: Two weeks

	DEPENDS ON: almost everything else, as almost any code change
	can raise new exceptions that we'd need to handle.

	STATUS: Not started
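
	SKETCH: One possible way to express the PDU-dependent mapping
	is a table keyed by (PDU type, exception class).  The
	exception classes and the error_code_for() helper below are
	hypothetical, not names from this package.  The numeric codes
	are the up-down codes as I understand them from the
	provisioning protocol spec (published as RFC 6492) and should
	be checked against that text.  Written for current Python 3.

```python
class NoSuchResourceClass(Exception):
    """Hypothetical exception: request named an unknown resource class."""

class BadCertificateRequest(Exception):
    """Hypothetical exception: certificate request could not be accepted."""

# Fallback when nothing more specific applies.
INTERNAL_SERVER_ERROR = 2001

# (PDU type, exception class) -> up-down error code.  Note that the
# same exception maps to different codes for "issue" and "revoke".
ERROR_CODES = {
    ("issue",  NoSuchResourceClass):   1201,
    ("issue",  BadCertificateRequest): 1203,
    ("revoke", NoSuchResourceClass):   1301,
}

def error_code_for(pdu_type, exc):
    """Pick the error code for exception exc raised while handling pdu_type."""
    # Walk the class hierarchy so subclasses inherit their parent's mapping.
    for klass in type(exc).__mro__:
        code = ERROR_CODES.get((pdu_type, klass))
        if code is not None:
            return code
    return INTERNAL_SERVER_ERROR
```

	Keeping the mapping in one table would also give the code
	audit a single place to check for coverage.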

      * db.commit(), db.rollback(), code audit for data integrity
	issues, fix any data integrity issues that turn up. Among
	other issues, need to handle loss of connection to database
	server and other MySQL errors. Need to be careful about
	recovery action depending on whether we had uncommitted
	changes.

	TIME REQUIRED (commit and rollback): 3-4 weeks

	TIME REQUIRED (data integrity audit): 1 week

	TIME REQUIRED (fix data integrity): Unknown, depends on code
	audit and results of runtime testing.

	STATUS: Not started
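
	SKETCH: The basic commit-on-success, rollback-on-exception
	pattern, shown with the standard sqlite3 module as a stand-in
	for MySQL so that the example is self-contained.  The
	transaction() helper and the one-column ca_detail table are
	hypothetical.  Written for current Python 3.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transaction(db):
    """Commit if the block succeeds, roll back if it raises."""
    try:
        yield db.cursor()
    except Exception:
        db.rollback()
        raise
    else:
        db.commit()

db = sqlite3.connect(":memory:")

# First transaction succeeds and is committed.
with transaction(db) as cur:
    cur.execute("CREATE TABLE ca_detail (id INTEGER PRIMARY KEY, state TEXT)")
    cur.execute("INSERT INTO ca_detail (state) VALUES ('active')")

# Second transaction fails part way through, so its insert is rolled back.
try:
    with transaction(db) as cur:
        cur.execute("INSERT INTO ca_detail (state) VALUES ('pending')")
        raise RuntimeError("simulated failure after an uncommitted change")
except RuntimeError:
    pass

states = [row[0] for row in db.execute("SELECT state FROM ca_detail")]
```

	This does not cover the harder case noted above, loss of the
	connection to the database server, where the right recovery
	depends on whether there were uncommitted changes.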

      * Resource subsetting (req_* attributes in up-down protocol),
	full implementation.  Requires expanding SQL child_cert table
	to hold subset masks and rewriting a fair amount of code.

	TIME REQUIRED: 3-4 weeks

	STATUS: Not started

      * Performance testing and profiling.  Getting rid of tlslite was
        a good first step, and RSA will always be slow without an HSM,
        but last time I tried profiling I saw hints that the Python
        ASN.1 code may be a bottleneck.

	TIME REQUIRED: A few days to do profiling.  What happens after
	that depends on what profiling finds.

	DEPENDS ON: Serious load testing may require assistance from
	others with larger test labs than I have directly available.

	STATUS: Barely started

      * Clean up rootd.py to be usable in a production system.  Most
	urgent issues are handling of private keys, publishing outputs
	in pubd, and reissuing when details or keys change. May not
	need much else, as this is not a high-traffic server.

	Alternatively, perhaps rootd's functionality should be merged
	into rpkid after all, given that we now believe that anybody
	who needs to certify private address space may need to run it.

	TIME REQUIRED: One week if just cleaning up rootd.  2-3 weeks
	if folding rootd into rpkid.

	STATUS: Not started

      * Update internals docs (Doxygen). Mostly this means updating
	function comments in the Python code, as the rest is
	automatic.  May require a bit more overview text to explain
	the workings and usage of the code.

	TIME REQUIRED: One week.

	STATUS: Ongoing

      * Add HSM support. Architecture includes it, current code does
	not.  First step here would be talking to somebody with a strong
	understanding of PKCS #11.

	TIME REQUIRED: Unknown

	STATUS: Not started

      * Tighten up syntax checking in left-right schema.

	TIME REQUIRED: One day.

	STATUS: Not started

      * rcynic handling of RPKI trust anchors does not yet support
        draft-ietf-sidr-ta.  Not needed for technical reasons
        ("trust-anchor-uri-with-key" method is roughly equivalent and
        much simpler), may be required for political reasons.

	TIME REQUIRED: Three days

	STATUS: Not started

      * Investigate using EKU (RFC 3280 4.2.1.13) as an alternative to
	wiring in BPKI EE certs for left-right protocol.

	STATUS: Not started

      * Django web UI to RPKI code will require some back-end support,
	not yet sure exactly how much.  Current plan is that Django
	tool just drops data into SQL, at which point it becomes my
	problem; if we keep this model, semantics of the existing
	command line tools should map fairly well, so this part will
	just be a matter of performing essentially the same operations
	that the command line UI does now, with SQL tables instead of
	CSV files as the data store.

	TIME REQUIRED: Two weeks

	STATUS: Not started

      * Django Web UI to RPKI code as currently envisioned will (also)
        require some additional data from rpkid and rcynic that is not
        yet available in machine-parsable form, so that UI can report
        on status of delegated resources and detailed validation
        status.  rpkid extensions for this should be relatively
        simple, rcynic work may be a bit more complex, or at least
        tedious, as it'll be C code generating XML.

	<list_received_resources/> was part of this.

	rcynic XML detail should describe entire certificate
	validation chain and status of validation at each stage.
	Given tree walk this is probably going to be some kind of
	XML-representation tree structure, which I will probably
	design as s-expressions first to keep my brain from exploding
	with gratuitous XML syntax.

	TIME REQUIRED: 1-2 weeks.

	STATUS:	<list_received_resources/> done.
		rcynic work started along a different path (XML
		reporting of validation failure events, mostly done,
		still some corner cases); unclear whether this will
		suffice or UI really needs tree structure per above.

      * At present there is no mechanism by which an IRBE could
	request signing of objects other than ROAs.  Eg, there has
	been some discussion of signing S/MIME letters to humans
	asking for routing, as an alternative to ROAs.  If we decide
	to support this at all, it turns into a generalization of the
	ROA problem, and suggests that perhaps ROA generation should
	be handled somewhere outside of rpkid and only passed to rpkid
	for signing.  This would be a significant change to the
	architecture, as it would remove rpkid's responsibility for
	keeping ROAs up to date.

	On further analysis: ROAs are different from S/MIME letters,
	in that ROAs are something we want both published and
	maintained on an ongoing basis until canceled, while S/MIME
	letters are one-offs that probably are not published.  So ROAs
	need -something- to keep them current, and that something
	might as well be rpkid unless we find a strong argument that
	it should be something else.  So the S/MIME letter
	functionality probably stays a different mechanism from ROAs.

	TIME REQUIRED: One week (including deciding what left-right
	protocol semantics for this should be)

	STATUS: Not started

      * There has been some discussion both in and out of the SIDR WG
        on perhaps dropping TLS out of the up-down protocol, as it is
        arguably not providing much that we can't do equally well with
        CMS.  Left-right and publication are currently not SIDR WG
        docs, but presumably they would follow.  Dunno where this is
        going to go, but assuming for purposes of discussion that we
        do drop TLS, we'll want to rip all that code out.  This
        includes revising BPKI, SQL, left-right and publication
        protocols, and code using all of these both in daemons and UI
        tools.

	TIME REQUIRED: Three weeks (very rough guess).

	DEPENDS ON: Decision whether to keep or drop TLS.

	STATUS: Not started

      * Integrate UI tools into main code base.  Right now there's
        this odd split, with the myrpki stuff off to one side, irdbd
        (which is also a sample implementation, not a core tool) in
        with rpkid, test code scattered hither and yon in all the
        above places, and none of it set up nicely either for running
        in place or installation.  This all needs to be cleaned up,
        most likely by reorganizing all of the Python (and POW) code.

	TIME REQUIRED: Two weeks.

	STATUS: Mostly done.  POW is still hanging off to the side,
	but the myrpki/ directory has now been merged into rpkid/, and
	various other bits have been cleaned up.

      * Autoconf review.  Right now we're making minimal use of
        autoconf, just enough to get the code running on Mac OS and
        clean up a few old annoyances.  There are other things that
        ought to be using autoconf, now that we're stuck with it.  Eg,
        installation scripts, the build code for POW, etc.  In the
        long run we might even want to check for usable system OpenSSL
        code and libraries: the RFC 3779 code is still off by default
        in all known public releases of OpenSSL, but the BPKI stuff
        that myrpki does only requires CMS, not RFC 3779, so it may be
        able to use the system openssl binary in many cases.

	TIME REQUIRED: One week for an initial pass.

	DEPENDS ON: Installation scripts.  Not so much depends on,
	really, as two aspects of one interrelated mess.

	STATUS: Not started

      * We need installation scripts.  Right now the only thing we
        install is rcynic, and that only on FreeBSD.

	TIME REQUIRED: One week, longer if installation for many platforms is
	required

	STATUS: Not started

      * We need better and unified documentation.  Right now doc is
        scattered between rpkid core manual, various READMEs, internal
        documentation in various tools, etcetera.  This is not kind to
        the user.  Depending on how much hand-written (as opposed to
        Doxygen-harvested) doc we end up with, might want to convert
        overall to something like the Doxygen/Docbook combination that
        the Boost project uses (Boostbook).

	TIME REQUIRED: At least two weeks, plus at least one more week
	if making a serious change in doc tools (eg, Boostbook).

	DEPENDS ON: Portions of this would make sense to defer until
	after whatever code reorg happens to integrate UI tools, etc.
	Most of the hand-written content could be done right away,
	might require minor edits later to track reorg changes.

	STATUS: Not started

      * Rewrite irbe_cli.py to use cmd module.  Right now irbe_cli is
        useful only as a debugging tool for its author, and the
        interface is very clunky (even by comparison to other clunky
        bits of code in this package).  Rewriting to use cmd module
        would be a major improvement; some minor challenges here
        because irbe_cli integrates so tightly with the Python message
        classes representing the left-right and publication protocols;
        figuring out how to turn this into a cmd-based program without
        massive (and fragile) duplication of code is probably good for
        a few days of head scratching.

	TIME REQUIRED: Two weeks (including head scratching)

	STATUS: Not started
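
	SKETCH: The general shape of a program built on the standard
	cmd module.  This is not the real irbe_cli; the IRBEShell
	class, its "show" command and the hand-written actions table
	are hypothetical stand-ins.  In a real rewrite the table
	would presumably be derived from the message classes, which
	is the part that needs the head scratching.  Written for
	current Python 3.

```python
import cmd
import io

class IRBEShell(cmd.Cmd):
    """Hypothetical cmd-based front end; not the real irbe_cli."""

    prompt = "irbe> "

    # Hypothetical table of object types and the actions allowed on them.
    actions = {"self": ("create", "list"), "parent": ("create", "list")}

    def do_show(self, arg):
        """show OBJECT: list the actions known for OBJECT."""
        name = arg.strip()
        allowed = self.actions.get(name)
        if allowed is None:
            self.stdout.write("unknown object %r\n" % name)
        else:
            self.stdout.write("%s: %s\n" % (name, ", ".join(allowed)))

    def complete_show(self, text, line, begidx, endidx):
        """Tab completion for the argument of "show"."""
        return [name for name in sorted(self.actions) if name.startswith(text)]

    def do_quit(self, arg):
        """quit: leave the shell."""
        return True

# Drive the shell non-interactively, capturing its output.
out = io.StringIO()
shell = IRBEShell(stdout=out)
shell.onecmd("show parent")
stop = shell.onecmd("quit")
```

	Calling IRBEShell().cmdloop() instead gives an interactive
	prompt, with "help" output built from the do_* docstrings.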

      * Clean up testbed tools.  There's a collection of hacks that
        have been evolving as we've been building the testbed, most of
        which just grew as our needs evolved.  The main scripts are
        checked into the repository, but some of the minor stuff is
        not, and some of the automation used in the testbed (cron
        scripts, automated use of a version control system (currently
        subversion) to archive changes to running data, etcetera)
        might be useful to others, so it should be cleaned up and made
        available as part of the package.

	TIME REQUIRED: One week

	STATUS: Moving target, but let's say "not started" for the
	bits I'm thinking about as I type this.

      * Early (pseudo) operational testing has uncovered a conflict
        between RIRs' need not to be in the business of attesting to
        identities and operators' need to have -some- way of finding
        out who to call when an RPKI cert is broken.  Current proposal
        is to allow signature and publication of blobs of whois-like
        data; these would be signed by an EE cert using RFC 3779
        inheritance, and would in essence be a self-attestation (no
        checking by others, no liability incurred by others, etc) as
        to contact information one might use in case of a problem.
        This would require a minor change to the rescert profile, so
        the SIDR WG would need to sign off on this.  As this data
        would be published, presumably we would want rcynic to be able
        to check it.

	TIME REQUIRED: One week to add to rpkid et al (very rough --
	includes design work to figure out exactly where this would
	fit, actual coding probably relatively minor).  Perhaps an
	additional day or three to add to rcynic and write suitable
	search and display tools.

	DEPENDS ON: Agreement to add this to rescert profile.

	STATUS: Not started

      * myrpki.py should have a command that summarizes current state
        (data on file, actions it might make sense to take now, etc).

	TIME REQUIRED: A day or two

	STATUS: Not started

      * rcynic needs a major rewrite to run multiple rsync processes in
        background, to work around tarpit attack by evil publishers.

	TIME REQUIRED: Three weeks (wild guess)

	STATUS: Not started, beyond some preliminary design thoughts.
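
	SKETCH: rcynic itself is C, so this Python fragment only
	illustrates the scheduling idea: run several fetches at once
	and give each one a deadline, so that one slow publisher
	cannot stall the rest.  The fetch() and fetch_all() helpers
	are hypothetical, and the commands are stand-ins that run the
	Python interpreter rather than rsync.  Written for current
	Python 3.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def fetch(command, timeout):
    """Run one fetch command; return True on success, False on
    failure or when the deadline expires (the child is then killed)."""
    try:
        result = subprocess.run(command, timeout=timeout,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def fetch_all(commands, timeout, workers=4):
    """Run the commands concurrently; results are in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: fetch(c, timeout), commands))

# Stand-ins for rsync invocations: one fast fetch, one "tarpit".
commands = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "import time; time.sleep(30)"],
]
results = fetch_all(commands, timeout=3)
```

	The fast command finishes normally while the slow one is cut
	off at the deadline, without delaying the other.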