
  1. Notes: 2001-09-24
  2. -----------------
  3. This "description" (if one chooses to call it that) needed some major updating
  4. so here goes. This update addresses a change being made at the same time to
  5. OpenSSL, and it pretty much completely restructures the underlying mechanics of
  6. the "ENGINE" code. So it serves a double purpose of being an "ENGINE internals
  7. for masochists" document *and* a rather extensive commit log message. (I'd get
  8. lynched for sticking all this in CHANGES or the commit mails :-).
  9. ENGINE_TABLE underlies this restructuring, as described in the internal header
  10. "eng_local.h", implemented in eng_table.c, and used in each of the "class" files;
  11. tb_rsa.c, tb_dsa.c, etc.
  12. However, "EVP_CIPHER" underlies the motivation and design of ENGINE_TABLE so
  13. I'll mention a bit about that first. EVP_CIPHER (and most of this applies
  14. equally to EVP_MD for digests) is both a "method" and an algorithm/mode
  15. identifier that, in the current API, "lingers". These cipher description +
  16. implementation structures can be defined or obtained directly by applications,
  17. or can be loaded "en masse" into EVP storage so that they can be catalogued and
  18. searched in various ways, ie. two ways of encrypting with the "des_cbc"
  19. algorithm/mode pair are;
  20. (i) directly;
  21.     const EVP_CIPHER *cipher = EVP_des_cbc();
  22.     EVP_EncryptInit(&ctx, cipher, key, iv);
  23.     [ ... use EVP_EncryptUpdate() and EVP_EncryptFinal() ...]
  24. (ii) indirectly;
  25.     OpenSSL_add_all_ciphers();
  26.     cipher = EVP_get_cipherbyname("des_cbc");
  27.     EVP_EncryptInit(&ctx, cipher, key, iv);
  28.     [ ... etc ... ]
  29. The latter is more generally used because it also allows ciphers/digests to be
  30. looked up based on other identifiers which can be useful for automatic cipher
  31. selection, eg. in SSL/TLS, or by user-controllable configuration.
  32. The important point about this is that EVP_CIPHER definitions and structures are
  33. passed around with impunity and there is no safe way, without requiring massive
  34. rewrites of many applications, to assume that EVP_CIPHERs can be reference
  35. counted. Once an EVP_CIPHER is exposed to the caller, neither it nor anything it
  36. comes from can "safely" be destroyed. Unless of course the way of getting to
  37. such ciphers is via entirely distinct API calls that didn't exist before.
  38. However existing API usage cannot be made to understand when an EVP_CIPHER
  39. pointer, that has been passed to the caller, is no longer being used.
  40. The other problem with the existing API w.r.t. hooking EVP_CIPHER support
  41. into ENGINE is storage - the OBJ_NAME-based storage used by EVP to register
  42. ciphers simultaneously registers cipher *types* and cipher *implementations* -
  43. they are effectively the same thing, an "EVP_CIPHER" pointer. The problem with
  44. hooking in ENGINEs is that multiple ENGINEs may implement the same ciphers. The
  45. solution is necessarily that ENGINE-provided ciphers simply are not registered,
  46. stored, or exposed to the caller in the same manner as existing ciphers. This is
  47. especially necessary considering the fact ENGINE uses reference counts to allow
  48. for cleanup, modularity, and DSO support - yet EVP_CIPHERs, as exposed to
  49. callers in the current API, support no such controls.
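To make the storage problem a bit more concrete, here is a toy sketch in plain C. It is not the real OBJ_NAME code; every toy_* name is made up for illustration, and it assumes a store with exactly one slot per name. Because the stored pointer is both the "type" and the "implementation", a second provider of "des_cbc" can only replace the first one, never sit alongside it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for EVP_CIPHER: one pointer is both the "type"
 * (the name) and the "implementation" (the provider). */
typedef struct toy_cipher {
    const char *name;      /* algorithm/mode, eg. "des_cbc" */
    const char *provider;  /* who implements it */
} toy_cipher;

#define TOY_MAX 8
static const toy_cipher *toy_store[TOY_MAX];
static size_t toy_count;

/* Register by name. A second registration under the same name replaces
 * the first, because this toy store has one slot per name. */
int toy_add_cipher(const toy_cipher *c)
{
    size_t i;
    for (i = 0; i < toy_count; i++) {
        if (strcmp(toy_store[i]->name, c->name) == 0) {
            toy_store[i] = c;
            return 1;
        }
    }
    if (toy_count == TOY_MAX)
        return 0;
    toy_store[toy_count++] = c;
    return 1;
}

/* Look up by name; returns NULL if nothing is registered. */
const toy_cipher *toy_get_cipherbyname(const char *name)
{
    size_t i;
    for (i = 0; i < toy_count; i++)
        if (strcmp(toy_store[i]->name, name) == 0)
            return toy_store[i];
    return NULL;
}

/* Two implementations of the same algorithm/mode. */
const toy_cipher toy_sw_des = { "des_cbc", "software" };
const toy_cipher toy_hw_des = { "des_cbc", "some-engine" };
```

Adding toy_sw_des and then toy_hw_des leaves only toy_hw_des reachable by name, which is the collision that keeps ENGINE-provided ciphers out of this kind of storage.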
  50. Another sticking point for integrating cipher support into ENGINE is linkage.
  51. Already there is a problem with the way ENGINE supports RSA, DSA, etc whereby
  52. they are available *because* they're part of a giant ENGINE called "openssl".
  53. Ie. all implementations *have* to come from an ENGINE, but we get round that by
  54. having a giant ENGINE with all the software support encapsulated. This creates
  55. linker hassles if nothing else - linking a 1-line application that calls 2 basic
  56. RSA functions (eg. "RSA_free(RSA_new());") will result in large quantities of
  57. ENGINE code being linked in *and* because of that DSA, DH, and RAND also. If we
  58. continue with this approach for EVP_CIPHER support (even if it *was* possible)
  59. we would lose our ability to link selectively by selectively loading certain
  60. implementations of certain functionality. Touching any part of any kind of
  61. crypto would result in massive static linkage of everything else. So the
  62. solution is to change the way ENGINE feeds existing "classes", ie. how the
  63. hooking to ENGINE works from RSA, DSA, DH, RAND, as well as adding new hooking
  64. for EVP_CIPHER, and EVP_MD.
  65. The way this is now being done is by mostly reverting back to how things used to
  66. work prior to ENGINE :-). Ie. RSA now has a "RSA_METHOD" pointer again - this
  67. was previously replaced by an "ENGINE" pointer and all RSA code that required
  68. the RSA_METHOD would call ENGINE_get_RSA() each time on its ENGINE handle to
  69. temporarily get and use the ENGINE's RSA implementation. Apart from being more
  70. efficient, switching back to each RSA having an RSA_METHOD pointer also allows
  71. us to conceivably operate with *no* ENGINE. As we'll see, this removes any need
  72. for a fallback ENGINE that encapsulates default implementations - we can simply
  73. have our RSA structure pointing its RSA_METHOD pointer to the software
  74. implementation and have its ENGINE pointer set to NULL.
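A rough sketch of that arrangement, in plain C with invented toy_* types (these are not the real RSA, RSA_METHOD or ENGINE structures): the key caches a method pointer once at setup, and a NULL engine pointer simply means "software, no ENGINE involved".

```c
#include <stddef.h>

/* Hypothetical stand-in for RSA_METHOD. */
typedef struct toy_rsa_method {
    const char *name;
    int (*op)(int input);   /* stand-in for the real RSA operations */
} toy_rsa_method;

/* Hypothetical stand-in for ENGINE. */
typedef struct toy_engine {
    const toy_rsa_method *rsa_meth;   /* NULL: engine has no RSA support */
} toy_engine;

/* Hypothetical stand-in for RSA. */
typedef struct toy_rsa {
    const toy_rsa_method *meth;   /* always usable directly */
    const toy_engine *engine;     /* NULL means no ENGINE is involved */
} toy_rsa;

int toy_sw_op(int input) { return input + 1; }
int toy_hw_op(int input) { return input + 100; }

const toy_rsa_method toy_sw_rsa = { "software", toy_sw_op };
const toy_rsa_method toy_hw_rsa = { "hardware", toy_hw_op };

/* Set up a key. With no engine (or an engine without an RSA method),
 * point at the software method and leave the engine pointer NULL. */
void toy_rsa_init(toy_rsa *r, const toy_engine *e)
{
    if (e != NULL && e->rsa_meth != NULL) {
        r->meth = e->rsa_meth;
        r->engine = e;
    } else {
        r->meth = &toy_sw_rsa;
        r->engine = NULL;
    }
}

/* Each operation goes straight through the cached method pointer;
 * nothing asks the engine for its method on every call. */
int toy_rsa_do(const toy_rsa *r, int input)
{
    return r->meth->op(input);
}
```

The point of the design is visible in toy_rsa_do(): it never touches the engine, so the software path needs no fallback ENGINE at all.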
  75. A look at the EVP_CIPHER hooking is most explanatory; the RSA, DSA (etc) cases
  76. turn out to be degenerate forms of the same thing. The EVP storage of ciphers,
  77. and the existing EVP API functions that return "software" implementations and
  78. descriptions remain untouched. However, the storage takes more meaning in terms
  79. of "cipher description" and less meaning in terms of "implementation". When an
  80. EVP_CIPHER_CTX is actually initialised with an EVP_CIPHER method and is about to
  81. begin en/decryption, the hooking to ENGINE comes into play. What happens is that
  82. cipher-specific ENGINE code is asked for an ENGINE pointer (a functional
  83. reference) for any ENGINE that is registered to perform the algo/mode that the
  84. provided EVP_CIPHER structure represents. Under normal circumstances, that
  85. ENGINE code will return NULL because no ENGINEs will have had any cipher
  86. implementations *registered*. As such, a NULL ENGINE pointer is stored in the
  87. EVP_CIPHER_CTX context, and the EVP_CIPHER structure is left hooked into the
  88. context and so is used as the implementation. Pretty much how things work now
  89. except we'd have a redundant ENGINE pointer set to NULL and doing nothing.
  90. Conversely, if an ENGINE *has* been registered to perform the algorithm/mode
  91. combination represented by the provided EVP_CIPHER, then a functional reference
  92. to that ENGINE will be returned to the EVP_CIPHER_CTX during initialisation.
  93. That functional reference will be stored in the context (and released on
  94. cleanup) - and having that reference provides a *safe* way to use an EVP_CIPHER
  95. definition that is private to the ENGINE. Ie. the EVP_CIPHER provided by the
  96. application will actually be replaced by an EVP_CIPHER from the registered
  97. ENGINE - it will support the same algorithm/mode as the original but will be a
  98. completely different implementation. Because this EVP_CIPHER isn't stored in the
  99. EVP storage, nor is it returned to applications from traditional API functions,
  100. there is no associated problem with it not having reference counts. And of
  101. course, when one of these "private" cipher implementations is hooked into
  102. EVP_CIPHER_CTX, it is done whilst the EVP_CIPHER_CTX holds a functional
  103. reference to the ENGINE that owns it, thus the use of the ENGINE's EVP_CIPHER is
  104. safe.
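The context initialisation described above can be sketched roughly as follows. This is a toy model in plain C, not the EVP code: all toy_* names are hypothetical, the "table" is a single registration slot, and the functional reference is a bare counter.

```c
#include <stddef.h>

/* Hypothetical stand-in for EVP_CIPHER. */
typedef struct toy_cipher {
    int nid;                /* identifies the algorithm/mode */
    const char *provider;
} toy_cipher;

/* Hypothetical stand-in for ENGINE. */
typedef struct toy_engine {
    int funct_ref;          /* functional reference count */
    toy_cipher cipher;      /* implementation private to the engine */
} toy_engine;

/* Hypothetical stand-in for EVP_CIPHER_CTX. */
typedef struct toy_cipher_ctx {
    const toy_cipher *cipher;   /* implementation actually used */
    toy_engine *engine;         /* NULL, or a held functional reference */
} toy_cipher_ctx;

/* Stand-in for the lookup side of tb_cipher.c: one slot only. */
toy_engine *toy_registered;

/* Return a functional reference to an engine registered for 'nid',
 * or NULL if none is registered. */
toy_engine *toy_get_cipher_engine(int nid)
{
    if (toy_registered == NULL || toy_registered->cipher.nid != nid)
        return NULL;
    toy_registered->funct_ref++;   /* the caller now owns a reference */
    return toy_registered;
}

/* If an engine is registered, swap in its private cipher and keep the
 * reference; otherwise use the cipher the application supplied. */
void toy_ctx_init(toy_cipher_ctx *ctx, const toy_cipher *supplied)
{
    ctx->engine = toy_get_cipher_engine(supplied->nid);
    ctx->cipher = ctx->engine != NULL ? &ctx->engine->cipher : supplied;
}

/* Release the functional reference (if any) on cleanup. */
void toy_ctx_cleanup(toy_cipher_ctx *ctx)
{
    if (ctx->engine != NULL)
        ctx->engine->funct_ref--;
    ctx->engine = NULL;
    ctx->cipher = NULL;
}
```

The engine's cipher is only ever reachable through a context that holds a reference to the engine, which is what makes using it safe in this model.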
  105. The "cipher-specific ENGINE code" I mentioned is implemented in tb_cipher.c but
  106. in essence it is simply an instantiation of "ENGINE_TABLE" code for use by
  107. EVP_CIPHER code. tb_digest.c is virtually identical but, of course, it is for
  108. use by EVP_MD code. Ditto for tb_rsa.c, tb_dsa.c, etc. These instantiations of
  109. ENGINE_TABLE essentially provide linker-separation of the classes so that even
  110. if ENGINEs implement *all* possible algorithms, an application using only
  111. EVP_CIPHER code will link at most code relating to EVP_CIPHER, tb_cipher.c, core
  112. ENGINE code that is independent of class, and of course the ENGINE
  113. implementation that the application loaded. It will *not* however link any
  114. class-specific ENGINE code for digests, RSA, etc nor will it bleed over into
  115. other APIs, such as the RSA/DSA/etc library code.
  116. ENGINE_TABLE is a little more complicated than may seem necessary but this is
  117. mostly to avoid a lot of "init()"-thrashing on ENGINEs (that may have to load
  118. DSOs, and other expensive setup that shouldn't be thrashed unnecessarily) *and*
  119. to duplicate "default" behaviour. Basically an ENGINE_TABLE instantiation, for
  120. example tb_cipher.c, implements a hash-table keyed by integer "nid" values.
  121. These nids provide the uniqueness of an algorithm/mode - and each nid will hash
  122. to a potentially NULL "ENGINE_PILE". An ENGINE_PILE is essentially a list of
  123. pointers to ENGINEs that implement that particular 'nid'. Each "pile" uses some
  124. caching tricks such that requests on that 'nid' will be cached and all future
  125. requests will return immediately (well, at least with minimal operation) unless
  126. a change is made to the pile, eg. perhaps an ENGINE was unloaded. The reason is
  127. that an application could have support for 10 ENGINEs statically linked
  128. in, and the machine in question may not have any of the hardware those 10
  129. ENGINEs support. If each of those ENGINEs has a "des_cbc" implementation, we
  130. want to avoid every EVP_CIPHER_CTX setup from trying (and failing) to initialise
  131. each of those 10 ENGINEs. Instead, the first such request will try to do that
  132. and will either return (and cache) a NULL ENGINE pointer or will return a
  133. functional reference to the first that successfully initialised. In the latter
  134. case it will also cache an extra functional reference to the ENGINE as a
  135. "default" for that 'nid'. The caching is acknowledged by an 'uptodate' variable
  136. that is unset only if un/registration takes place on that pile. Ie. if
  137. implementations of "des_cbc" are added or removed. This behaviour can be
  138. tweaked; the ENGINE_TABLE_FLAG_NOINIT value can be passed to
  139. ENGINE_set_table_flags(), in which case the only ENGINEs that tb_cipher.c will
  140. try to initialise from the "pile" will be those that are already initialised
  141. (ie. it's simply an increment of the functional reference count, and no real
  142. "initialisation" will take place).
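Here is a much-simplified model of the pile caching, in plain C. It is a sketch under loud assumptions and not a copy of eng_table.c: a small linear array stands in for the hash table, all toy_* names are invented, and TOY_FLAG_NOINIT only imitates the effect described above for ENGINE_TABLE_FLAG_NOINIT.

```c
#include <stddef.h>

#define TOY_PILE_MAX 4
#define TOY_TABLE_MAX 4
#define TOY_FLAG_NOINIT 0x1u  /* toy analogue of ENGINE_TABLE_FLAG_NOINIT */

typedef struct toy_engine {
    int hw_present;     /* would a real init succeed on this machine? */
    int funct_ref;      /* functional references; > 0 means initialised */
    int init_attempts;  /* counts "expensive" init attempts */
} toy_engine;

typedef struct toy_pile {
    int nid;
    toy_engine *engines[TOY_PILE_MAX];
    size_t count;
    toy_engine *funct;  /* cached default; the cache owns one reference */
    int uptodate;       /* cleared whenever the pile changes */
} toy_pile;

typedef struct toy_table {
    toy_pile piles[TOY_TABLE_MAX];  /* linear array standing in for the hash */
    size_t count;
    unsigned flags;
} toy_table;

void toy_table_reset(toy_table *t) { t->count = 0; t->flags = 0; }

/* Take a functional reference, initialising only if not yet initialised. */
int toy_engine_init(toy_engine *e)
{
    if (e->funct_ref == 0) {
        e->init_attempts++;
        if (!e->hw_present)
            return 0;
    }
    e->funct_ref++;
    return 1;
}

toy_pile *toy_find_pile(toy_table *t, int nid)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (t->piles[i].nid == nid)
            return &t->piles[i];
    return NULL;
}

/* Registration creates the pile on demand and invalidates its cache. */
int toy_register(toy_table *t, int nid, toy_engine *e)
{
    toy_pile *p = toy_find_pile(t, nid);
    if (p == NULL) {
        if (t->count == TOY_TABLE_MAX)
            return 0;
        p = &t->piles[t->count++];
        p->nid = nid;
        p->count = 0;
        p->funct = NULL;
    }
    if (p->count == TOY_PILE_MAX)
        return 0;
    p->engines[p->count++] = e;
    p->uptodate = 0;
    return 1;
}

/* Return a functional reference for 'nid', or NULL. Only the first call
 * after a change walks the pile; later calls use the cached answer. */
toy_engine *toy_select(toy_table *t, int nid)
{
    toy_pile *p = toy_find_pile(t, nid);
    size_t i;
    if (p == NULL)
        return NULL;                  /* nothing registered: no state touched */
    if (!p->uptodate) {
        if (p->funct != NULL) {       /* drop the stale cached default */
            p->funct->funct_ref--;
            p->funct = NULL;
        }
        for (i = 0; i < p->count && p->funct == NULL; i++) {
            toy_engine *e = p->engines[i];
            if ((t->flags & TOY_FLAG_NOINIT) && e->funct_ref == 0)
                continue;             /* only reuse initialised engines */
            if (toy_engine_init(e))
                p->funct = e;         /* this reference belongs to the cache */
        }
        p->uptodate = 1;              /* a NULL result is cached too */
    }
    if (p->funct != NULL)
        p->funct->funct_ref++;        /* separate reference for the caller */
    return p->funct;
}
```

With one engine whose hardware is absent and one whose hardware is present registered under the same nid, repeated toy_select() calls attempt each engine's init only once; the failing engine is not retried until the pile changes.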
  143. RSA, DSA, DH, and RAND all have their own ENGINE_TABLE code as well, and the
  144. difference is that they all use an implicit 'nid' of 1. Whereas EVP_CIPHERs are
  145. actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is
  146. not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are
  147. necessarily interoperable and don't have different flavours, only different
  148. implementations. In other words, the ENGINE_TABLE for RSA will either be empty,
  149. or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile
  150. represents ENGINEs that implement the single "type" of RSA there is.
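As a small illustration of the "degenerate" case, the sketch below (plain C, invented toy_* names, a trivial array in place of the real table) shows one generic table API being used two ways: the cipher wrapper passes real nids through, while the RSA wrapper always supplies the same fixed nid of 1 and so ends up with at most one entry key.

```c
#include <stddef.h>

#define TOY_SLOTS 8

typedef struct toy_entry {
    int nid;
    const char *engine_id;
} toy_entry;

typedef struct toy_table {
    toy_entry e[TOY_SLOTS];
    size_t n;
} toy_table;

/* Generic, class-independent table operations. */
int toy_table_register(toy_table *t, int nid, const char *engine_id)
{
    if (t->n == TOY_SLOTS)
        return 0;
    t->e[t->n].nid = nid;
    t->e[t->n].engine_id = engine_id;
    t->n++;
    return 1;
}

/* Return the first engine registered for 'nid', or NULL. */
const char *toy_table_select(const toy_table *t, int nid)
{
    size_t i;
    for (i = 0; i < t->n; i++)
        if (t->e[i].nid == nid)
            return t->e[i].engine_id;
    return NULL;
}

/* Cipher class: ciphers differ by nid, so the caller supplies it. */
toy_table toy_cipher_table;   /* file scope, so zero-initialised */

int toy_register_cipher(const char *engine_id, int nid)
{
    return toy_table_register(&toy_cipher_table, nid, engine_id);
}

/* RSA class: only one "type" of RSA exists, so a fixed nid is used. */
#define TOY_RSA_NID 1
toy_table toy_rsa_table;      /* file scope, so zero-initialised */

int toy_register_rsa(const char *engine_id)
{
    return toy_table_register(&toy_rsa_table, TOY_RSA_NID, engine_id);
}

const char *toy_get_default_rsa(void)
{
    return toy_table_select(&toy_rsa_table, TOY_RSA_NID);
}
```

Keeping the two tables as separate objects behind separate wrappers also mirrors the linker-separation idea: code that only calls the cipher wrapper never references the RSA table.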
  151. Cleanup - the registration and unregistration may pose questions about how
  152. cleanup works with the ENGINE_PILE doing all this caching nonsense (ie. when the
  153. application or EVP_CIPHER code releases its last reference to an ENGINE, the
  154. ENGINE_PILE code may still have references and thus those ENGINEs will stay
  155. hooked in forever). The way this is handled is via "unregistration". With these
  156. new ENGINE changes, an abstract ENGINE can be loaded and initialised, but that
  157. is an algorithm-agnostic process. Even if initialised, it will not have
  158. registered any of its implementations (to do so would link all class "table"
  159. code despite the fact the application may use only ciphers, for example). This
  160. is deliberately a distinct step. Moreover, registration and unregistration have
  161. nothing to do with whether an ENGINE is *functional* or not (ie. you can even
  162. register an ENGINE and its implementations without it being operational, you may
  163. not even have the drivers to make it operate). What actually happens with
  164. respect to cleanup is managed inside eng_lib.c with the "engine_cleanup_***"
  165. functions. These functions are internal-only and each part of ENGINE code that
  166. could require cleanup will, upon performing its first allocation, register a
  167. callback with the "engine_cleanup" code. The other part of this that makes it
  168. tick is that the ENGINE_TABLE instantiations (tb_***.c) use NULL as their
  169. initialised state. So if RSA code asks for an ENGINE and no ENGINE has
  170. registered an implementation, the code will simply return NULL and the tb_rsa.c
  171. state will be unchanged. Thus, no cleanup is required unless registration takes
  172. place. ENGINE_cleanup() will simply iterate across a list of registered cleanup
  173. callbacks calling each in turn, and will then internally delete its own storage
  174. (a STACK). When a cleanup callback is next registered (eg. if the cleanup() is
  175. part of a graceful restart and the application wants to cleanup all state then
  176. start again), the internal STACK storage will be freshly allocated. This is much
  177. the same as the situation in the ENGINE_TABLE instantiations ... NULL is the
  178. initialised state, so only modification operations (not queries) will cause that
  179. code to have to register a cleanup.
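The cleanup arrangement can be modelled in a few lines of plain C. This is only a sketch: the toy_* names are invented, a realloc'd array stands in for the STACK, and a single int stands in for a class table such as the one in tb_rsa.c. It assumes, as described above, that NULL is the initialised state on both sides.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*toy_cleanup_cb)(void);

/* NULL is the initialised state: nothing is allocated until the first
 * callback is registered. */
static toy_cleanup_cb *toy_stack;
static size_t toy_stack_len;

int toy_cleanup_add(toy_cleanup_cb cb)
{
    toy_cleanup_cb *grown =
        realloc(toy_stack, (toy_stack_len + 1) * sizeof(*grown));
    if (grown == NULL)
        return 0;
    toy_stack = grown;
    toy_stack[toy_stack_len++] = cb;
    return 1;
}

size_t toy_cleanup_pending(void) { return toy_stack_len; }

/* Call each registered callback in turn, then delete our own storage
 * so that we are back in the NULL state. */
void toy_cleanup(void)
{
    size_t i;
    for (i = 0; i < toy_stack_len; i++)
        toy_stack[i]();
    free(toy_stack);
    toy_stack = NULL;
    toy_stack_len = 0;
}

/* A toy "class table" whose state is also NULL until something is
 * registered. */
static int *toy_rsa_table;
int toy_rsa_cleanups_run;

static void toy_rsa_table_cleanup(void)
{
    free(toy_rsa_table);
    toy_rsa_table = NULL;
    toy_rsa_cleanups_run++;
}

/* Query: never allocates, so it never needs a cleanup. */
int toy_rsa_query(void)
{
    return toy_rsa_table != NULL ? *toy_rsa_table : 0;
}

/* Modification: the first allocation registers the cleanup callback. */
int toy_rsa_register(int value)
{
    if (toy_rsa_table == NULL) {
        toy_rsa_table = malloc(sizeof(*toy_rsa_table));
        if (toy_rsa_table == NULL)
            return 0;
        if (!toy_cleanup_add(toy_rsa_table_cleanup)) {
            free(toy_rsa_table);
            toy_rsa_table = NULL;
            return 0;
        }
    }
    *toy_rsa_table = value;
    return 1;
}
```

After toy_cleanup() everything is back to NULL, so a later toy_rsa_register() allocates afresh and registers its callback again, which is the graceful-restart case mentioned above.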
  180. What else? The bignum callbacks and associated ENGINE functions have been
  181. removed for two obvious reasons; (i) there was no way to generalise them to the
  182. mechanism now used by RSA/DSA/..., because there's no such thing as a BIGNUM
  183. method, and (ii) because of (i), there was no meaningful way for library or
  184. application code to automatically hook and use ENGINE supplied bignum functions
  185. anyway. Also, ENGINE_cpy() has been removed (although an internal-only version
  186. exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good
  187. one and now certainly doesn't make sense in any generalised way. Some of the
  188. RSA, DSA, DH, and RAND functions that were fiddled during the original ENGINE
  189. changes have now, as a consequence, been reverted back. This is because the
  190. hooking of ENGINE is now automatic (and passive, it can internally use a NULL
  191. ENGINE pointer to simply ignore ENGINE from then on).
  192. Hell, that should be enough for now ... comments welcome.