$Id$ -*- Text -*-
Python RPKI production tools.
Requires Python 2.5.
See doc/Installation for installation instructions and required
packages.
$Revision$
TO DO:
- rcynic handling of RPKI trust anchors may need updating, per
discussions over previous months of how RPKI trust anchors
work, how we package them, and how we roll them over. Current
code supports local file and RIPE key+URI methods, as these
were trivial to implement and needed no coordinated action.
May need to revisit this depending on subsequent discussions.
PRIORITY: Required
STATUS: Local file and RIPE key+URI methods implemented.
- Publication protocol ACL checking may need revisiting. Tricky
bit is making sure that repository receives enough information
to know whether parent has authorized child to use parent's
namespace in nesting case; in theory this is straightforward
but requires careful checking. Current implementation just
uses a configured path check and does not attempt to trace
back to permission from parent in nested publication case.
Class and method design is intended to make it easy to drop in
additional checks if needed.
PRIORITY: Required
STATUS: Trivial version (required path check) done.
- Make rpkid fully event-driven (async tasking model), except
for SQL queries. This probably involves the "twisted"
framework.
PRIORITY: Required (to implement scalable hosting model)
TIME REQUIRED: Two weeks.
STATUS: Not started
- Error handling: make sure that exceptions map correctly to
up-down error codes, flesh out left-right error codes. Note
that the same exception may produce different error codes
depending on which up-down PDU we're processing (sigh).
Will require code audit for coherency, which is most of the work.
PRIORITY: Required
TIME REQUIRED: Two weeks
DEPENDS ON: almost everything else, as almost any code change
can raise new exceptions that we'd need to handle.
STATUS: Not started
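A minimal sketch of one way to key error codes on both the PDU type and the exception, so that the same exception can yield different codes. The exception class names and the helper are hypothetical, not taken from the rpkid code, and the numeric codes are illustrative and should be checked against the up-down specification.

```python
# Hypothetical exception classes; the real rpkid exception names may differ.
class NoSuchResourceClass(Exception):
    pass


class NoSuchKey(Exception):
    pass


# (PDU type, exception class) -> up-down error code.
# Codes are illustrative; verify them against the up-down specification.
ERROR_CODES = {
    ("issue", NoSuchResourceClass): 1201,
    ("revoke", NoSuchResourceClass): 1301,
    ("revoke", NoSuchKey): 1302,
}

# Fallback for anything the table does not cover.
INTERNAL_SERVER_ERROR = 2001


def error_code_for(pdu_type, exc):
    """Return the error code for exception exc raised while handling pdu_type."""
    # Walk the exception's class hierarchy so subclasses inherit a mapping.
    for cls in type(exc).__mro__:
        code = ERROR_CODES.get((pdu_type, cls))
        if code is not None:
            return code
    return INTERNAL_SERVER_ERROR
```

Keeping the mapping in one table would also give the code audit a single place to check for coverage.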
- db.commit(), db.rollback(), code audit for data integrity
issues, fix any data integrity issues that turn up. Among
other issues, need to handle loss of connection to database
server and other MySQL errors. Need to be careful about
recovery action depending on whether we had uncommitted
changes.
PRIORITY: Required
TIME REQUIRED (commit and rollback): 3-4 weeks
TIME REQUIRED (data integrity audit): 1 week
TIME REQUIRED (fix data integrity): Unknown, depends on code
audit and results of runtime testing.
DEPENDS ON: async tasking model rollback.
STATUS: Not started
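A minimal sketch of the commit-or-rollback pattern, assuming a DB-API 2.0 connection. It uses the standard sqlite3 module as a stand-in for MySQL so that it runs without a server; the helper name and the table are hypothetical. It does not attempt the harder part noted above, recovery after loss of connection.

```python
import sqlite3


def run_in_transaction(db, work):
    """Call work(cursor); commit on success, roll back on any exception."""
    cur = db.cursor()
    try:
        result = work(cur)
    except Exception:
        # Discard uncommitted changes, then let the caller see the error.
        db.rollback()
        raise
    else:
        db.commit()
        return result
    finally:
        cur.close()


# Demonstration against an in-memory database (hypothetical table).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, name TEXT)")
db.commit()


def good(cur):
    cur.execute("INSERT INTO example (name) VALUES (?)", ("kept",))


def bad(cur):
    cur.execute("INSERT INTO example (name) VALUES (?)", ("discarded",))
    raise RuntimeError("simulated failure after a write")


run_in_transaction(db, good)
try:
    run_in_transaction(db, bad)
except RuntimeError:
    pass

rows = db.execute("SELECT name FROM example").fetchall()
```

After this runs, only the row written by good() should remain, since the write made by bad() is rolled back.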
- Test framework for multiple self-instances per engine-instance
(single self-instance per engine-instance is already done).
PRIORITY: Required for testing
DEPENDS ON: Async tasking model.
TIME REQUIRED: One week
STATUS: Not started
- Current TLS code (tlslite) appeared to be flaky under heavy
use back in November, and doesn't support all the required
certificate checks out of the box.
Certificate checker has now been replaced with something based
on OpenSSL/POW, and the result seems to work. If the TLS code
itself is still unstable, best bet would be to replace it with
a Tls class cloned from the existing POW Ssl class; the
current Ssl class isn't adequate either, but there's
documentation (eg, the O'Reilly OpenSSL book) that explains in
some detail what this code would need to do.
PRIORITY: Required for pilot (cert checking is a security issue).
TIME REQUIRED: 3-4 weeks
DEPENDS ON: Async tasking model.
STATUS: Not started
- Resource subsetting (req_* attributes in up-down protocol),
full implementation. Requires expanding SQL child_cert table
to hold subset masks and rewriting a fair amount of code.
PRIORITY: Required for full implementation.
TIME REQUIRED: 3-4 weeks
STATUS: Not started
- Performance testing
STATUS: Not started
- Clean up rootd.py to be usable in a production system. Most
urgent issue is handling of private keys. May not need much
else, as this is not a high-traffic server, but probably
should use publication protocol.
PRIORITY: Highly desirable (not strictly needed for pilot testing)
TIME REQUIRED: One week
STATUS: Not started
- Update internals docs (Doxygen). Mostly this means updating
function comments in the Python code, as the rest is
automatic. May require a bit of overview text to explain the
workings of the code; this overview text may well turn out to
be just the current flat text documents marked up for
inclusion by Doxygen.
PRIORITY: Desirable
TIME REQUIRED: One week.
STATUS: Ongoing
- Reorganize code (directory names, module names, which objects
are in which modules, add gctx pointers to objects to avoid
passing explicit gctx pointers in almost every function call)
to make it easier to understand and maintain. Portions of the
existing code were done in extreme haste to meet testing
deadlines, and it shows.
PRIORITY: Highly desirable
TIME REQUIRED: One week.
STATUS: Explicit gctx eradication done; much file renaming done; other
stuff not started.
- Add HSM support. Architecture includes it, current code does not. First
step here would be talking to somebody with a strong understanding of
PKCS#11.
PRIORITY: Desirable, not required for pilot
TIME REQUIRED: Unknown
STATUS: Not started
- Installation packaging, so that rpkid can be built and
installed like a normal package.
PRIORITY: Desirable
TIME REQUIRED: One week, longer if installation for many
platforms is required
STATUS: Not started
- Tighten up syntax checking in left-right schema.
PRIORITY: Desirable
TIME REQUIRED: One day.
STATUS: Not started
- Rethink exposing SQL primary indices in protocols. Right now,
auto-incremented SQL indices are used in many places in the
left-right protocol, and are even exposed in a few places in
our implementation of the up-down protocol. This is nice and
unique but may be operationally fragile, since up-down usage
means that URLs contain mechanically assigned identifiers
rather than an identifier negotiated between the two parties
during contract setup.
The RIPE NCC suggested that we should instead use something
like a hash of the client's name, which would be
probabilistically unique, would not expose information, but
would be stable even if we had to rebuild the database.
PRIORITY: Rethinking desirable; reworking unknown
TIME REQUIRED: One week to evaluate. Implementation time if we
decide to make a change unknown, but probably on the order of
another week.
STATUS: Not started
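A minimal sketch of the hash-based identifier idea, assuming SHA-256 from the standard hashlib module and a truncated hex digest. The function name, the choice of hash, and the length are assumptions for illustration, not a decided design.

```python
import hashlib


def stable_handle(client_name, length=16):
    """Derive a stable, opaque identifier from a client name.

    The same name always gives the same handle, so the value survives a
    database rebuild. Truncating the digest shortens the handle at the
    cost of a higher collision probability.
    """
    digest = hashlib.sha256(client_name.encode("utf-8")).hexdigest()
    return digest[:length]
```

For example, stable_handle("Alice") returns the same 16-character string on every call, independent of any SQL auto-increment state.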
- Common protocol dump format with APNIC and other implementors so we can
exchange protocol dumps.
PRIORITY: Desirable
TIME REQUIRED: Two days
STATUS: Not started
- IETF SIDR WG is still talking about ROAs with multiple
signatures. No obvious need for this but IETF may mandate it
anyway. Full implementation would require significant work
revising current SQL table relations and upgrading CMS
support.
PRIORITY: Minimal, IETF feeping creaturism
TIME REQUIRED: Unknown
STATUS: Not started
- Deaddrop of incoming messages, for audit. Absent a better
theory, steal existing tech for this: preface with minimal RFC
2822 header and drop it into a Maildir folder using built-in
Python Maildir library code, at which point it becomes somebody
else's problem.
STATUS: Not started
PRIORITY: Desirable, trivial to implement.
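A minimal sketch of the deaddrop using the standard mailbox and email modules. The function name, the default sender address, and the choice of header fields are assumptions; the payload is assumed to be ASCII text such as an XML message.

```python
import email.message
import email.utils
import mailbox


def deaddrop(maildir_path, payload, sender="rpkid@localhost"):
    """Wrap payload in a minimal RFC 2822 message and store it in a Maildir.

    Returns the key the mailbox assigned to the stored message.
    """
    msg = email.message.Message()
    msg["Date"] = email.utils.formatdate()
    msg["From"] = sender
    msg["Message-ID"] = email.utils.make_msgid()
    msg.set_payload(payload)

    # create=True makes the Maildir directory tree if it does not exist yet.
    box = mailbox.Maildir(maildir_path, factory=None, create=True)
    try:
        return box.add(msg)
    finally:
        box.close()
```

Each call adds one message file to the folder, which any Maildir-aware mail tool can then read.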
- Investigate using EKU (RFC 3280 4.2.1.13) as an alternative to
wiring in BPKI EE certs for left-right protocol.
Other random notes:
Being able to specify interaction with other servers (not running
under testbed) in a testbed.yaml might be useful for interop tests.
Kind of breaks testbed's fundamental model, though. Replacing what
testbed thinks is a leaf with somebody else would be easy, so maybe we
could specify some way to hang a bunch of rpkids under an external
parent? Hmm, data needed would look a lot like testpoke.yaml, maybe
we can reuse some of that language?
There's a three-way tradeoff lurking in the publication protocol,
manifest generation, and CRL generation:
1) Consistency issues for relying parties (eg, don't want to withdraw
something that's still listed in the manifest);
2) Efficiency issues for the RPKI engine (eg, generating a new
manifest for each individual change during a batch run could be
expensive, would prefer to batch up the changes into a single
manifest run); and
3) Coherency issues for the RPKI engine (don't want to defer things
that could result in loss of state if something bad happens).
Considerations (1) and (3) have to dominate, which may mean we take a
hit on (2).