log_contexts.rst 18 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498
  1. Log Contexts
  2. ============
  3. .. contents::
  4. To help track the processing of individual requests, synapse uses a
  5. 'log context' to track which request it is handling at any given moment. This
  6. is done via a thread-local variable; a ``logging.Filter`` is then used to fish
  7. the information back out of the thread-local variable and add it to each log
  8. record.
  9. Logcontexts are also used for CPU and database accounting, so that we can track
  10. which requests were responsible for high CPU use or database activity.
  11. The ``synapse.logging.context`` module provides a facilities for managing the
  12. current log context (as well as providing the ``LoggingContextFilter`` class).
  13. Deferreds make the whole thing complicated, so this document describes how it
  14. all works, and how to write code which follows the rules.
  15. Logcontexts without Deferreds
  16. -----------------------------
  17. In the absence of any Deferred voodoo, things are simple enough. As with any
  18. code of this nature, the rule is that our function should leave things as it
  19. found them:
  20. .. code:: python
  21. from synapse.logging import context # omitted from future snippets
  22. def handle_request(request_id):
  23. request_context = context.LoggingContext()
  24. calling_context = context.LoggingContext.current_context()
  25. context.LoggingContext.set_current_context(request_context)
  26. try:
  27. request_context.request = request_id
  28. do_request_handling()
  29. logger.debug("finished")
  30. finally:
  31. context.LoggingContext.set_current_context(calling_context)
  32. def do_request_handling():
  33. logger.debug("phew") # this will be logged against request_id
  34. LoggingContext implements the context management methods, so the above can be
  35. written much more succinctly as:
  36. .. code:: python
  37. def handle_request(request_id):
  38. with context.LoggingContext() as request_context:
  39. request_context.request = request_id
  40. do_request_handling()
  41. logger.debug("finished")
  42. def do_request_handling():
  43. logger.debug("phew")
  44. Using logcontexts with Deferreds
  45. --------------------------------
  46. Deferreds — and in particular, ``defer.inlineCallbacks`` — break
  47. the linear flow of code so that there is no longer a single entry point where
  48. we should set the logcontext and a single exit point where we should remove it.
  49. Consider the example above, where ``do_request_handling`` needs to do some
  50. blocking operation, and returns a deferred:
  51. .. code:: python
  52. @defer.inlineCallbacks
  53. def handle_request(request_id):
  54. with context.LoggingContext() as request_context:
  55. request_context.request = request_id
  56. yield do_request_handling()
  57. logger.debug("finished")
  58. In the above flow:
  59. * The logcontext is set
  60. * ``do_request_handling`` is called, and returns a deferred
  61. * ``handle_request`` yields the deferred
  62. * The ``inlineCallbacks`` wrapper of ``handle_request`` returns a deferred
  63. So we have stopped processing the request (and will probably go on to start
  64. processing the next), without clearing the logcontext.
  65. To circumvent this problem, synapse code assumes that, wherever you have a
  66. deferred, you will want to yield on it. To that end, whereever functions return
  67. a deferred, we adopt the following conventions:
  68. **Rules for functions returning deferreds:**
  69. * If the deferred is already complete, the function returns with the same
  70. logcontext it started with.
  71. * If the deferred is incomplete, the function clears the logcontext before
  72. returning; when the deferred completes, it restores the logcontext before
  73. running any callbacks.
  74. That sounds complicated, but actually it means a lot of code (including the
  75. example above) "just works". There are two cases:
  76. * If ``do_request_handling`` returns a completed deferred, then the logcontext
  77. will still be in place. In this case, execution will continue immediately
  78. after the ``yield``; the "finished" line will be logged against the right
  79. context, and the ``with`` block restores the original context before we
  80. return to the caller.
  81. * If the returned deferred is incomplete, ``do_request_handling`` clears the
  82. logcontext before returning. The logcontext is therefore clear when
  83. ``handle_request`` yields the deferred. At that point, the ``inlineCallbacks``
  84. wrapper adds a callback to the deferred, and returns another (incomplete)
  85. deferred to the caller, and it is safe to begin processing the next request.
  86. Once ``do_request_handling``'s deferred completes, it will reinstate the
  87. logcontext, before running the callback added by the ``inlineCallbacks``
  88. wrapper. That callback runs the second half of ``handle_request``, so again
  89. the "finished" line will be logged against the right
  90. context, and the ``with`` block restores the original context.
  91. As an aside, it's worth noting that ``handle_request`` follows our rules -
  92. though that only matters if the caller has its own logcontext which it cares
  93. about.
  94. The following sections describe pitfalls and helpful patterns when implementing
  95. these rules.
  96. Always yield your deferreds
  97. ---------------------------
  98. Whenever you get a deferred back from a function, you should ``yield`` on it
  99. as soon as possible. (Returning it directly to your caller is ok too, if you're
  100. not doing ``inlineCallbacks``.) Do not pass go; do not do any logging; do not
  101. call any other functions.
  102. .. code:: python
  103. @defer.inlineCallbacks
  104. def fun():
  105. logger.debug("starting")
  106. yield do_some_stuff() # just like this
  107. d = more_stuff()
  108. result = yield d # also fine, of course
  109. return result
  110. def nonInlineCallbacksFun():
  111. logger.debug("just a wrapper really")
  112. return do_some_stuff() # this is ok too - the caller will yield on
  113. # it anyway.
  114. Provided this pattern is followed all the way back up to the callchain to where
  115. the logcontext was set, this will make things work out ok: provided
  116. ``do_some_stuff`` and ``more_stuff`` follow the rules above, then so will
  117. ``fun`` (as wrapped by ``inlineCallbacks``) and ``nonInlineCallbacksFun``.
  118. It's all too easy to forget to ``yield``: for instance if we forgot that
  119. ``do_some_stuff`` returned a deferred, we might plough on regardless. This
  120. leads to a mess; it will probably work itself out eventually, but not before
  121. a load of stuff has been logged against the wrong context. (Normally, other
  122. things will break, more obviously, if you forget to ``yield``, so this tends
  123. not to be a major problem in practice.)
  124. Of course sometimes you need to do something a bit fancier with your Deferreds
  125. - not all code follows the linear A-then-B-then-C pattern. Notes on
  126. implementing more complex patterns are in later sections.
  127. Where you create a new Deferred, make it follow the rules
  128. ---------------------------------------------------------
  129. Most of the time, a Deferred comes from another synapse function. Sometimes,
  130. though, we need to make up a new Deferred, or we get a Deferred back from
  131. external code. We need to make it follow our rules.
  132. The easy way to do it is with a combination of ``defer.inlineCallbacks``, and
  133. ``context.PreserveLoggingContext``. Suppose we want to implement ``sleep``,
  134. which returns a deferred which will run its callbacks after a given number of
  135. seconds. That might look like:
  136. .. code:: python
  137. # not a logcontext-rules-compliant function
  138. def get_sleep_deferred(seconds):
  139. d = defer.Deferred()
  140. reactor.callLater(seconds, d.callback, None)
  141. return d
  142. That doesn't follow the rules, but we can fix it by wrapping it with
  143. ``PreserveLoggingContext`` and ``yield`` ing on it:
  144. .. code:: python
  145. @defer.inlineCallbacks
  146. def sleep(seconds):
  147. with PreserveLoggingContext():
  148. yield get_sleep_deferred(seconds)
  149. This technique works equally for external functions which return deferreds,
  150. or deferreds we have made ourselves.
  151. You can also use ``context.make_deferred_yieldable``, which just does the
  152. boilerplate for you, so the above could be written:
  153. .. code:: python
  154. def sleep(seconds):
  155. return context.make_deferred_yieldable(get_sleep_deferred(seconds))
  156. Fire-and-forget
  157. ---------------
  158. Sometimes you want to fire off a chain of execution, but not wait for its
  159. result. That might look a bit like this:
  160. .. code:: python
  161. @defer.inlineCallbacks
  162. def do_request_handling():
  163. yield foreground_operation()
  164. # *don't* do this
  165. background_operation()
  166. logger.debug("Request handling complete")
  167. @defer.inlineCallbacks
  168. def background_operation():
  169. yield first_background_step()
  170. logger.debug("Completed first step")
  171. yield second_background_step()
  172. logger.debug("Completed second step")
  173. The above code does a couple of steps in the background after
  174. ``do_request_handling`` has finished. The log lines are still logged against
  175. the ``request_context`` logcontext, which may or may not be desirable. There
  176. are two big problems with the above, however. The first problem is that, if
  177. ``background_operation`` returns an incomplete Deferred, it will expect its
  178. caller to ``yield`` immediately, so will have cleared the logcontext. In this
  179. example, that means that 'Request handling complete' will be logged without any
  180. context.
  181. The second problem, which is potentially even worse, is that when the Deferred
  182. returned by ``background_operation`` completes, it will restore the original
  183. logcontext. There is nothing waiting on that Deferred, so the logcontext will
  184. leak into the reactor and possibly get attached to some arbitrary future
  185. operation.
  186. There are two potential solutions to this.
  187. One option is to surround the call to ``background_operation`` with a
  188. ``PreserveLoggingContext`` call. That will reset the logcontext before
  189. starting ``background_operation`` (so the context restored when the deferred
  190. completes will be the empty logcontext), and will restore the current
  191. logcontext before continuing the foreground process:
  192. .. code:: python
  193. @defer.inlineCallbacks
  194. def do_request_handling():
  195. yield foreground_operation()
  196. # start background_operation off in the empty logcontext, to
  197. # avoid leaking the current context into the reactor.
  198. with PreserveLoggingContext():
  199. background_operation()
  200. # this will now be logged against the request context
  201. logger.debug("Request handling complete")
  202. Obviously that option means that the operations done in
  203. ``background_operation`` would be not be logged against a logcontext (though
  204. that might be fixed by setting a different logcontext via a ``with
  205. LoggingContext(...)`` in ``background_operation``).
  206. The second option is to use ``context.run_in_background``, which wraps a
  207. function so that it doesn't reset the logcontext even when it returns an
  208. incomplete deferred, and adds a callback to the returned deferred to reset the
  209. logcontext. In other words, it turns a function that follows the Synapse rules
  210. about logcontexts and Deferreds into one which behaves more like an external
  211. function — the opposite operation to that described in the previous section.
  212. It can be used like this:
  213. .. code:: python
  214. @defer.inlineCallbacks
  215. def do_request_handling():
  216. yield foreground_operation()
  217. context.run_in_background(background_operation)
  218. # this will now be logged against the request context
  219. logger.debug("Request handling complete")
  220. Passing synapse deferreds into third-party functions
  221. ----------------------------------------------------
  222. A typical example of this is where we want to collect together two or more
  223. deferred via ``defer.gatherResults``:
  224. .. code:: python
  225. d1 = operation1()
  226. d2 = operation2()
  227. d3 = defer.gatherResults([d1, d2])
  228. This is really a variation of the fire-and-forget problem above, in that we are
  229. firing off ``d1`` and ``d2`` without yielding on them. The difference
  230. is that we now have third-party code attached to their callbacks. Anyway either
  231. technique given in the `Fire-and-forget`_ section will work.
  232. Of course, the new Deferred returned by ``gatherResults`` needs to be wrapped
  233. in order to make it follow the logcontext rules before we can yield it, as
  234. described in `Where you create a new Deferred, make it follow the rules`_.
  235. So, option one: reset the logcontext before starting the operations to be
  236. gathered:
  237. .. code:: python
  238. @defer.inlineCallbacks
  239. def do_request_handling():
  240. with PreserveLoggingContext():
  241. d1 = operation1()
  242. d2 = operation2()
  243. result = yield defer.gatherResults([d1, d2])
  244. In this case particularly, though, option two, of using
  245. ``context.preserve_fn`` almost certainly makes more sense, so that
  246. ``operation1`` and ``operation2`` are both logged against the original
  247. logcontext. This looks like:
  248. .. code:: python
  249. @defer.inlineCallbacks
  250. def do_request_handling():
  251. d1 = context.preserve_fn(operation1)()
  252. d2 = context.preserve_fn(operation2)()
  253. with PreserveLoggingContext():
  254. result = yield defer.gatherResults([d1, d2])
  255. Was all this really necessary?
  256. ------------------------------
  257. The conventions used work fine for a linear flow where everything happens in
  258. series via ``defer.inlineCallbacks`` and ``yield``, but are certainly tricky to
  259. follow for any more exotic flows. It's hard not to wonder if we could have done
  260. something else.
  261. We're not going to rewrite Synapse now, so the following is entirely of
  262. academic interest, but I'd like to record some thoughts on an alternative
  263. approach.
  264. I briefly prototyped some code following an alternative set of rules. I think
  265. it would work, but I certainly didn't get as far as thinking how it would
  266. interact with concepts as complicated as the cache descriptors.
  267. My alternative rules were:
  268. * functions always preserve the logcontext of their caller, whether or not they
  269. are returning a Deferred.
  270. * Deferreds returned by synapse functions run their callbacks in the same
  271. context as the function was orignally called in.
  272. The main point of this scheme is that everywhere that sets the logcontext is
  273. responsible for clearing it before returning control to the reactor.
  274. So, for example, if you were the function which started a ``with
  275. LoggingContext`` block, you wouldn't ``yield`` within it — instead you'd start
  276. off the background process, and then leave the ``with`` block to wait for it:
  277. .. code:: python
  278. def handle_request(request_id):
  279. with context.LoggingContext() as request_context:
  280. request_context.request = request_id
  281. d = do_request_handling()
  282. def cb(r):
  283. logger.debug("finished")
  284. d.addCallback(cb)
  285. return d
  286. (in general, mixing ``with LoggingContext`` blocks and
  287. ``defer.inlineCallbacks`` in the same function leads to slighly
  288. counter-intuitive code, under this scheme).
  289. Because we leave the original ``with`` block as soon as the Deferred is
  290. returned (as opposed to waiting for it to be resolved, as we do today), the
  291. logcontext is cleared before control passes back to the reactor; so if there is
  292. some code within ``do_request_handling`` which needs to wait for a Deferred to
  293. complete, there is no need for it to worry about clearing the logcontext before
  294. doing so:
  295. .. code:: python
  296. def handle_request():
  297. r = do_some_stuff()
  298. r.addCallback(do_some_more_stuff)
  299. return r
  300. — and provided ``do_some_stuff`` follows the rules of returning a Deferred which
  301. runs its callbacks in the original logcontext, all is happy.
  302. The business of a Deferred which runs its callbacks in the original logcontext
  303. isn't hard to achieve — we have it today, in the shape of
  304. ``context._PreservingContextDeferred``:
  305. .. code:: python
  306. def do_some_stuff():
  307. deferred = do_some_io()
  308. pcd = _PreservingContextDeferred(LoggingContext.current_context())
  309. deferred.chainDeferred(pcd)
  310. return pcd
  311. It turns out that, thanks to the way that Deferreds chain together, we
  312. automatically get the property of a context-preserving deferred with
  313. ``defer.inlineCallbacks``, provided the final Defered the function ``yields``
  314. on has that property. So we can just write:
  315. .. code:: python
  316. @defer.inlineCallbacks
  317. def handle_request():
  318. yield do_some_stuff()
  319. yield do_some_more_stuff()
  320. To conclude: I think this scheme would have worked equally well, with less
  321. danger of messing it up, and probably made some more esoteric code easier to
  322. write. But again — changing the conventions of the entire Synapse codebase is
  323. not a sensible option for the marginal improvement offered.
  324. A note on garbage-collection of Deferred chains
  325. -----------------------------------------------
  326. It turns out that our logcontext rules do not play nicely with Deferred
  327. chains which get orphaned and garbage-collected.
  328. Imagine we have some code that looks like this:
  329. .. code:: python
  330. listener_queue = []
  331. def on_something_interesting():
  332. for d in listener_queue:
  333. d.callback("foo")
  334. @defer.inlineCallbacks
  335. def await_something_interesting():
  336. new_deferred = defer.Deferred()
  337. listener_queue.append(new_deferred)
  338. with PreserveLoggingContext():
  339. yield new_deferred
  340. Obviously, the idea here is that we have a bunch of things which are waiting
  341. for an event. (It's just an example of the problem here, but a relatively
  342. common one.)
  343. Now let's imagine two further things happen. First of all, whatever was
  344. waiting for the interesting thing goes away. (Perhaps the request times out,
  345. or something *even more* interesting happens.)
  346. Secondly, let's suppose that we decide that the interesting thing is never
  347. going to happen, and we reset the listener queue:
  348. .. code:: python
  349. def reset_listener_queue():
  350. listener_queue.clear()
  351. So, both ends of the deferred chain have now dropped their references, and the
  352. deferred chain is now orphaned, and will be garbage-collected at some point.
  353. Note that ``await_something_interesting`` is a generator function, and when
  354. Python garbage-collects generator functions, it gives them a chance to clean
  355. up by making the ``yield`` raise a ``GeneratorExit`` exception. In our case,
  356. that means that the ``__exit__`` handler of ``PreserveLoggingContext`` will
  357. carefully restore the request context, but there is now nothing waiting for
  358. its return, so the request context is never cleared.
  359. To reiterate, this problem only arises when *both* ends of a deferred chain
  360. are dropped. Dropping the the reference to a deferred you're supposed to be
  361. calling is probably bad practice, so this doesn't actually happen too much.
  362. Unfortunately, when it does happen, it will lead to leaked logcontexts which
  363. are incredibly hard to track down.