  1. <HEAD>
  2. <TITLE>Security in Plan 9</TITLE>
  3. <META content="text/html; charset=utf-8" http-equiv=Content-Type>
  4. </HEAD>
  5. <BODY BGCOLOR=WHITE>
  6. <h1>Security in Plan 9</h1>
  7. <EM>Russ Cox, MIT LCS<br>
  8. Eric Grosse, Bell Labs<br>
  9. Rob Pike, Bell Labs<br>
  10. Dave Presotto, Avaya Labs and Bell Labs<br>
  11. Sean Quinlan, Bell Labs<br>
  12. </EM><TT>{rsc,ehg,rob,presotto,seanq}@plan9.bell-labs.com<br>
  13. </TT><h4>ABSTRACT</h4>
  14. The security architecture of the Plan 9(tm) operating system has recently been redesigned to address some technical shortcomings. This redesign provided an opportunity also to make the system more convenient to use securely. Plan 9 has thus improved in two ways not usually seen together: it has become more secure <EM>and </EM>easier to use.
  15. <P>
  16. The central component of the new architecture is a per-user self-contained agent called <TT>factotum</TT>. <TT>Factotum</TT> securely holds a copy of the user's keys and negotiates authentication protocols, on behalf of the user, with secure services around the network. Concentrating security code in a single program offers several advantages including: ease of update or repair to broken security software and protocols; the ability to run secure services at a lower privilege level; uniform management of keys for all services; and an opportunity to provide single sign on, even to unchanged legacy applications. <TT>Factotum</TT> has an unusual architecture: it is implemented as a Plan 9 file server. [[ To appear, in a slightly different form, in <EM>Proc. of the 2002 Usenix Security Symposium, </EM>San Francisco. ]]
  17. <H4>1. Introduction
  18. </H4>
  19. Secure computing systems face two challenges: first, they must employ sophisticated technology that is difficult to design and prove correct; and second, they must be easy for regular people to use. The question of ease of use is sometimes neglected, but it is essential: weak but easy-to-use security can be more effective than strong but difficult-to-use security if it is more likely to be used. People lock their front doors when they leave the house, knowing full well that a burglar is capable of picking the lock (or avoiding the door altogether); yet few would accept the cost and awkwardness of a bank vault door on the house even though that might reduce the probability of a robbery. A related point is that users need a clear model of how the security operates (if not how it actually provides security) in order to use it well; for example, the clarity of a lock icon on a web browser is offset by the confusing and typically insecure steps for installing X.509 certificates.
  20. <P>
  21. The security architecture of the Plan 9 operating system [Pike95] has recently been redesigned to make it both more secure and easier to use. By <EM>security </EM>we mean three things: first, the business of authenticating users and services; second, the safe handling, deployment, and use of keys and other secret information; and third, the use of encryption and integrity checks to safeguard communications from prying eyes.
  22. <P>
  23. The old security architecture of Plan 9 had several engineering problems in common with other operating systems. First, it had an inadequate notion of security domain. Once a user provided a password to connect to a local file store, the system required that the same password be used to access all the other file stores. That is, the system treated all network services as belonging to the same security domain.
  24. <P>
Second, the algorithms and protocols used in authentication, by nature tricky and difficult to get right, were compiled into the various applications, kernel modules, and file servers. Changes and fixes to a security protocol required that all components using that protocol be recompiled, or at least relinked, and restarted.
  26. <P>
  27. Third, the file transport protocol, 9P [Pike93], that forms the core of the Plan 9 system, had its authentication protocol embedded in its design. This meant that fixing or changing the authentication used by 9P required deep changes to the system. If someone were to find a way to break the protocol, the system would be wide open and very hard to fix.
  28. <P>
  29. These and a number of lesser problems, combined with a desire for more widespread use of encryption in the system, spurred us to rethink the entire security architecture of Plan 9.<br>
  30. <P>
  31. The centerpiece of the new architecture is an agent, called <TT>factotum</TT>, that handles the user's keys and negotiates all security interactions with system services and applications. Like a trusted assistant with a copy of the owner's keys, <TT>factotum</TT> does all the negotiation for security and authentication. Programs no longer need to be compiled with cryptographic code; instead they communicate with <TT>factotum</TT> agents that represent distinct entities in the cryptographic exchange, such as a user and server of a secure service. If a security protocol needs to be added, deleted, or modified, only <TT>factotum</TT> needs to be updated for all system services to be kept secure.
  32. <P>
  33. Building on <TT>factotum</TT>, we modified secure services in the system to move user authentication code into <TT>factotum</TT>; made authentication a separable component of the file server protocol; deployed new security protocols; designed a secure file store, called <TT>secstore</TT>, to protect our keys but make them easy to get when they are needed; designed a new kernel module to support transparent use of Transport Layer Security (TLS) [RFC2246]; and began using encryption for all communications within the system. The overall architecture is illustrated in Figure 1a.
  34. <DL><DT><DD>
  35. <br><img src="-.2669382.gif"><br>
  36. </DL>
Figure 1a. Components of the security architecture. Each box is a (typically) separate machine; each ellipse a process. The ellipses labeled <I>F</I><I>X</I> are <TT>factotum</TT> processes; those labeled <I>P</I><I>X</I> are the pieces and proxies of a distributed program. The authentication server is one of several repositories for users' security information that <TT>factotum</TT> processes consult as required. <TT>Secstore</TT> is a shared resource for storing private information such as keys; <TT>factotum</TT> consults it for the user during bootstrap.
  53. <P>
  54. Secure protocols and algorithms are well understood and are usually not the weakest link in a system's security. In practice, most security problems arise from buggy servers, confusing software, or administrative oversights. It is these practical problems that we are addressing. Although this paper describes the algorithms and protocols we are using, they are included mainly for concreteness. Our main intent is to present a simple security architecture built upon a small trusted code base that is easy to verify (whether by manual or automatic means), easy to understand, and easy to use.
  55. <P>
Although it is a subjective assessment, we believe we have achieved our goal of ease of use. That we have achieved our goal of improved security is supported by our plan to move our currently private computing environment onto the Internet outside the corporate firewall. The rest of this paper describes the architecture and how it is used, to show why a system that is easy to use securely is also safe enough to run in the open network.
  57. <H4>2. An Agent for Security
  58. </H4> <P>
  59. One of the primary reasons for the redesign of the Plan 9 security infrastructure was to remove the authentication method both from the applications and from the kernel. Cryptographic code is large and intricate, so it should be packaged as a separate component that can be repaired or modified without altering or even relinking applications and services that depend on it. If a security protocol is broken, it should be trivial to repair, disable, or replace it on the fly. Similarly, it should be possible for multiple programs to use a common security protocol without embedding it in each program.
  60. <P>
  61. Some systems use dynamically linked libraries (DLLs) to address these configuration issues. The problem with this approach is that it leaves security code in the same address space as the program using it. The interactions between the program and the DLL can therefore accidentally or deliberately violate the interface, weakening security. Also, a program using a library to implement secure services must run at a privilege level necessary to provide the service; separating the security to a different program makes it possible to run the services at a weaker privilege level, isolating the privileged code to a single, more trustworthy component.
  62. <P>
  63. Following the lead of the SSH agent [Ylon96], we give each user an agent process responsible for holding and using the user's keys. The agent program is called <TT>factotum</TT> because of its similarity to the proverbial servant with the power to act on behalf of his master because he holds the keys to all the master's possessions. It is essential that <TT>factotum</TT> keep the keys secret and use them only in the owner's interest. Later we'll discuss some changes to the kernel to reduce the possibility of <TT>factotum</TT> leaking information inadvertently.
  64. <P>
  65. <TT>Factotum</TT> is implemented, like most Plan 9 services, as a file server. It is conventionally mounted upon the directory <TT>/mnt/factotum</TT>, and the files it serves there are analogous to virtual devices that provide access to, and control of, the services of the <TT>factotum</TT>. The next few sections describe the design of <TT>factotum</TT> and how it operates with the other pieces of Plan 9 to provide security services.
  66. <H4>2.1. Logging in
  67. </H4>
  68. To make the discussions that follow more concrete, we begin with a couple of examples showing how the Plan 9 security architecture appears to the user. These examples both involve a user <TT>gre</TT> logging in after booting a local machine. The user may or may not have a secure store in which all his keys are kept. If he does, <TT>factotum</TT> will prompt him for the password to the secure store and obtain keys from it, prompting only when a key isn't found in the store. Otherwise, <TT>factotum</TT> must prompt for each key.
  69. <P>
  70. In the typescripts, <TT>\n</TT> represents a literal newline character typed to force a default response. User input is in italics, and long lines are folded and indented to fit.<br>
  71. <P>
  72. This first example shows a user logging in without help from the secure store. First, <TT>factotum</TT> prompts for a user name that the local kernel will use:
  73. <pre>
  74. user[none]: <em>gre</em>
  75. </pre>
  76. (Default responses appear in square brackets.) The kernel then starts accessing local resources and requests, through <TT>factotum</TT>, a user/password pair to do so:
  77. <pre>
  78. !Adding key: dom=cs.bell-labs.com proto=p9sk1
  79. user[gre]: <em>\n</em>
password: <em>****</em>
  81. </pre>
  82. Now the user is logged in to the local system, and the mail client starts up:
  83. <pre>
  84. !Adding key: proto=apop server=plan9.bell-labs.com
  85. user[gre]: <em>\n</em>
  86. password: <em>****</em>
  87. </pre>
  88. <TT>Factotum</TT> is doing all the prompting and the applications being started are not even touching the keys. Note that it's always clear which key is being requested.<br>
  89. <P>
  90. Now consider the same login sequence, but in the case where <TT>gre</TT> has a secure store account:
  91. <pre>
user[none]: <em>gre</em>
  93. secstore password: <em>*********</em>
  94. STA PIN+SecurID: <em>*********</em>
  95. </pre>
  96. That's the last <TT>gre </TT>will hear from <TT>factotum </TT>unless an attempt is made to contact a system for which no key is kept in the secure store.<br>
  97. <H4>2.2. The factotum
  98. </H4>
Each computer running Plan 9 has one user id that owns all the resources on that system -- the scheduler, local disks, network interfaces, etc. That user, the <EM>host owner</EM>, is the closest analogue in Plan 9 to a Unix <TT>root</TT> account (although it is far weaker: rather than having special powers, the host owner is, as its name implies, just a regular user that happens to own the resources of the local machine). On a single-user system, which we call a terminal, the host owner is the id of the terminal's user. Shared servers such as CPU servers normally have a pseudo-user that initially owns all resources. At boot time, the Plan 9 kernel starts a <TT>factotum</TT> executing as, and therefore with the privileges of, the host owner.
  100. <P>
  101. New processes run as the same user as the process which created them. When a process must take on the identity of a new user, such as to provide a login shell on a shared CPU server, it does so by proving to the host owner's <TT>factotum</TT> that it is authorized to do so. This is done by running an authentication protocol with <TT>factotum</TT> to prove that the process has access to secret information which only the new user should possess. For example, consider the setup in Figure 1a. If a user on the terminal wants to log in to the CPU server using the Plan 9 <TT>cpu</TT> service [Pike93], then <EM>PT </EM>might be the <TT>cpu</TT> client program and <EM>PC </EM>the <TT>cpu</TT> server. Neither <EM>PC </EM>nor <EM>PT </EM>knows the details of the authentication. They do need to be able to shuttle messages back and forth between the two <TT>factotums</TT>, but this is a generic function easily performed without knowing, or being able to extract, secrets in the messages. <EM>PT </EM>will make a network connection to <EM>PC</EM>. <EM>PT </EM>and <EM>PC </EM>will then relay messages between the <TT>factotum</TT> owned by the user, <EM>FT</EM>, and the one owned by the CPU server, <EM>FC</EM>, until mutual authentication has been established. Later sections describe the RPC between <TT>factotum</TT> and applications and the library functions to support proxy operations.
  102. <P>
  103. The kernel always uses a single local instance of <TT>factotum</TT>, running as the host owner, for its authentication purposes, but a regular user may start other <TT>factotum</TT> agents. In fact, the <TT>factotum</TT> representing the user need not be running on the same machine as its client. For instance, it is easy for a user on a CPU server, through standard Plan 9 operations, to replace the <TT>/mnt/factotum</TT> in the user's private file name space on the server with a connection to the <TT>factotum</TT> running on the terminal. (The usual file system permissions prevent interlopers from doing so maliciously.) This permits secure operations on the CPU server to be transparently validated by the user's own <TT>factotum</TT>, so secrets need never leave the user's terminal. The SSH agent [Ylon96] does much the same with special SSH protocol messages, but an advantage to making our agent a file system is that we need no new mechanism to access our remote agent; remote file access is sufficient.
  104. <P>
  105. Within <TT>factotum</TT>, each protocol is implemented as a state machine with a generic interface, so protocols are in essence pluggable modules, easy to add, modify, or drop. Writing a message to and reading a message from <TT>factotum</TT> each require a separate RPC and result in a single state transition. Therefore <TT>factotum</TT> always runs to completion on every RPC and never blocks waiting for input during any authentication. Moreover, the number of simultaneous authentications is limited only by the amount of memory we're willing to dedicate to representing the state machines.
  106. <P>
  107. Authentication protocols are implemented only within <TT>factotum</TT>, but adding and removing protocols does require relinking the binary, so <TT>factotum</TT> processes (but no others) need to be restarted in order to take advantage of new or repaired protocols.<br>
  108. <P>
  109. At the time of writing, <TT>factotum</TT> contains authentication modules for the Plan 9 shared key protocol (p9sk1), SSH's RSA authentication, passwords in the clear, APOP, CRAM, PPP's CHAP, Microsoft PPP's MSCHAP, and VNC's challenge/response.<br>
  110. <H4>2.3. Local capabilities
  111. </H4>
  112. A capability system, managed by the kernel, is used to empower <TT>factotum</TT> to grant permission to another process to change its user id. A kernel device driver implements two files, <TT>/dev/caphash</TT> and <TT>/dev/capuse</TT>. The write-only file <TT>/dev/caphash</TT> can be opened only by the host owner, and only once. <TT>Factotum</TT> opens this file immediately after booting.
  113. <P>
  114. To use the files, <TT>factotum</TT> creates a string of the form <EM>userid1</EM><TT>@</TT><EM>userid2</EM><TT>@</TT><EM>random-string</EM>, uses SHA1 HMAC to hash <EM>userid1</EM><TT>@</TT><EM>userid2 </EM>with key <EM>random-string</EM>, and writes that hash to <TT>/dev/caphash</TT>. <TT>Factotum</TT> then passes the original string to another process on the same machine, running as user <EM>userid1</EM>, which writes the string to <TT>/dev/capuse</TT>. The kernel hashes the string and looks for a matching hash in its list. If it finds one, the writing process's user id changes from <EM>userid1 </EM>to <EM>userid2</EM>. Once used, or if a timeout expires, the capability is discarded by the kernel.
  115. <P>
  116. The capabilities are local to the machine on which they are created. Hence a <TT>factotum</TT> running on one machine cannot pass capabilities to processes on another and expect them to work.<br>
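<P>
The exchange through <TT>/dev/caphash</TT> and <TT>/dev/capuse</TT> can be sketched in a few lines of Python. This is only a toy model of the scheme described above, not the kernel's code: the class <TT>ToyKernel</TT> and the function <TT>make_capability</TT> are hypothetical names, the timeout is omitted, and user ids are assumed not to contain <TT>@</TT>.

```python
import hashlib
import hmac
import secrets


def make_capability(from_user: str, to_user: str) -> tuple[str, bytes]:
    """Factotum's side: build userid1@userid2@random-string and its hash."""
    nonce = secrets.token_hex(16)  # stands in for random-string
    users = f"{from_user}@{to_user}"
    # SHA1 HMAC of userid1@userid2, keyed with random-string.
    digest = hmac.new(nonce.encode(), users.encode(), hashlib.sha1).digest()
    return f"{users}@{nonce}", digest


class ToyKernel:
    """Hypothetical stand-in for the kernel's list of registered hashes."""

    def __init__(self) -> None:
        self.hashes: set[bytes] = set()

    def caphash_write(self, digest: bytes) -> None:
        # Models a write to /dev/caphash by the host owner's factotum.
        self.hashes.add(digest)

    def capuse_write(self, current_user: str, cap: str) -> str:
        # Models a write to /dev/capuse; returns the writer's new user id.
        from_user, to_user, nonce = cap.split("@", 2)
        users = f"{from_user}@{to_user}"
        digest = hmac.new(nonce.encode(), users.encode(), hashlib.sha1).digest()
        if from_user != current_user or digest not in self.hashes:
            raise PermissionError("invalid capability")
        self.hashes.discard(digest)  # a capability is discarded once used
        return to_user
```

Because the kernel stores only the hash, a process that cannot read <TT>factotum</TT>'s memory has no practical way to produce a string that matches it.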
  117. <H4>2.4. Keys
  118. </H4>
  119. We define the word <EM>key </EM>to mean not only a secret, but also a description of the context in which that secret is to be used: the protocol, server, user, etc. to which it applies. That is, a key is a combination of secret and descriptive information used to authenticate the identities of parties transmitting or receiving information. The set of keys used in any authentication depends both on the protocol and on parameters passed by the program requesting the authentication.
  120. <P>
  121. Taking a tip from SDSI [RiLa], which represents security information as textual S-expressions, keys in Plan 9 are represented as plain UTF-8 text. Text is easily understood and manipulated by users. By contrast, a binary or other cryptic format can actually reduce overall security. Binary formats are difficult for users to examine and can only be cracked by special tools, themselves poorly understood by most users. For example, very few people know or understand what's inside their X.509 certificates. Most don't even know where in the system to find them. Therefore, they have no idea what they are trusting, and why, and are powerless to change their trust relationships. Textual, centrally stored and managed keys are easier to use and safer.
  122. <P>
  123. Plan 9 has historically represented databases as attribute/value pairs, since they are a good foundation for selection and projection operations. <TT>Factotum</TT> therefore represents the keys in the format <EM>attribute</EM><TT>=</TT><EM>value</EM>, where <EM>attribute </EM>is an identifier, possibly with a single-character prefix, and <EM>value </EM>is an arbitrary quoted string. The pairs themselves are separated by white space. For example, a Plan 9 key and an APOP key might be represented like this:
  124. <pre>
  125. dom=bell-labs.com proto=p9sk1 user=gre !password='don''t tell'
  126. proto=apop server=x.y.com user=gre !password='open sesame'
  127. </pre>
  128. If a value is empty or contains white space or single quotes, it must be quoted; quotes are represented by doubled single quotes. Attributes that begin with an exclamation mark (<TT>!</TT>) are considered <EM>secret</EM>. <TT>Factotum</TT> will never let a secret value escape its address space and will suppress keyboard echo when asking the user to type one.
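<P>
A minimal Python sketch of this textual format follows. It is an illustration of the quoting rules just described, not <TT>factotum</TT>'s parser; the function names <TT>parse_key</TT>, <TT>quote</TT>, and <TT>format_key</TT> are hypothetical, and malformed input is simply skipped rather than reported.

```python
import re

# attribute=value, where value is either a quoted string (with '' standing
# for a literal single quote) or a run of non-space characters.
_PAIR = re.compile(r"(\S+?)=('(?:[^']|'')*'|\S*)")


def parse_key(line: str) -> dict[str, str]:
    """Parse one key line into an attribute -> value mapping."""
    attrs: dict[str, str] = {}
    for match in _PAIR.finditer(line):
        name, raw = match.group(1), match.group(2)
        if raw.startswith("'"):
            raw = raw[1:-1].replace("''", "'")  # strip quotes, undouble ''
        attrs[name] = raw
    return attrs


def quote(value: str) -> str:
    """Quote a value if it is empty or contains white space or quotes."""
    if value == "" or any(c.isspace() or c == "'" for c in value):
        return "'" + value.replace("'", "''") + "'"
    return value


def format_key(attrs: dict[str, str], show_secrets: bool = False) -> str:
    """Render a key as text; attributes starting with ! are secret."""
    parts = []
    for name, value in attrs.items():
        if name.startswith("!") and not show_secrets:
            continue  # never display secret attributes by default
        parts.append(f"{name}={quote(value)}")
    return " ".join(parts)
```

Applied to the first key above, <TT>parse_key</TT> yields a <TT>!password</TT> value of <TT>don't tell</TT>, and <TT>format_key</TT> prints only the <TT>dom</TT>, <TT>proto</TT>, and <TT>user</TT> pairs.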
  129. <P>
  130. A program requesting authentication selects a key by providing a <EM>query</EM>, a list of elements to be matched by the key. Each element in the list is either an <EM>attribute</EM><TT>=</TT><EM>value </EM>pair, which is satisfied by keys with exactly that pair; or an attribute followed by a question mark, <EM>attribute</EM><TT>?</TT>, which is satisfied by keys with some pair specifying the attribute. A key matches a query if every element in the list is satisfied. For instance, to select the APOP key in the previous example, an APOP client process might specify the query
  131. <DL><DT><DD><TT>server=x.y.com proto=apop<br>
  132. </TT></DL>Internally, <TT>factotum</TT>'s APOP module would add the requirements of having <TT>user </TT>and <TT>!password </TT>attributes, forming the query<br>
  133. <DL><DT><DD><TT>server=x.y.com proto=apop user? !password?<br>
  134. </TT></DL>when searching for an appropriate key.<br>
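<P>
The matching rule can be written down directly. The sketch below assumes keys are already held as attribute/value dictionaries and that query values contain no white space; <TT>satisfies</TT>, <TT>matches</TT>, and <TT>select</TT> are hypothetical names chosen for illustration.

```python
def satisfies(key: dict[str, str], element: str) -> bool:
    """Check one query element against a key."""
    if element.endswith("?") and "=" not in element:
        # attribute? is satisfied by any pair that specifies the attribute.
        return element[:-1] in key
    name, _, value = element.partition("=")
    # attribute=value is satisfied only by exactly that pair.
    return key.get(name) == value


def matches(key: dict[str, str], query: str) -> bool:
    """A key matches a query if every element in the list is satisfied."""
    return all(satisfies(key, element) for element in query.split())


def select(keys: list[dict[str, str]], query: str) -> list[dict[str, str]]:
    """Return all keys that match the query, in their original order."""
    return [key for key in keys if matches(key, query)]
```

With the two example keys, the query <TT>server=x.y.com proto=apop user? !password?</TT> selects only the APOP key, since the p9sk1 key has no <TT>server</TT> attribute.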
  135. <P>
  136. <TT>Factotum</TT> modules expect keys to have some well-known attributes. For instance, the <TT>proto</TT> attribute specifies the protocol module responsible for using a particular key, and protocol modules may expect other well-known attributes (many expect keys to have <TT>!password</TT> attributes, for example). Additional attributes can be used as comments or for further discrimination without intervention by <TT>factotum</TT>; for example, the APOP and IMAP mail clients conventionally include a <TT>server</TT> attribute to select an appropriate key for authentication.
  137. <P>
  138. Unlike in SDSI, keys in Plan 9 have no nested structure. This design keeps the representation simple and straightforward. If necessary, we could add a nested attribute or, in the manner of relational databases, an attribute that selects another tuple, but so far the simple design has been sufficient.
  139. <P>
  140. A simple common structure for all keys makes them easy for users to administer, but the set of attributes and their interpretation is still protocol-specific and can be subtle. Users may still need to consult a manual to understand all details. Many attributes (<TT>proto</TT>, <TT>user</TT>, <TT>password</TT>, <TT>server</TT>) are self-explanatory and our short experience has not uncovered any particular difficulty in handling keys. Things will likely get messier, however, when we grapple with public keys and their myriad components.
  141. <H4>2.5. Protecting keys
  142. </H4>
  143. Secrets must be prevented from escaping <TT>factotum</TT>. There are a number of ways they could leak: another process might be able to debug the agent process, the agent might swap out to disk, or the process might willingly disclose the key. The last is the easiest to avoid: secret information in a key is marked as such, and whenever <TT>factotum</TT> prints keys or queries for new ones, it is careful to avoid displaying secret information. (The only exception to this is the ``plaintext password'' protocol, which consists of sending the values of the <TT>user</TT> and <TT>!password</TT> attributes. Only keys tagged with <TT>proto=pass</TT> can have their passwords disclosed by this mechanism.)
  144. <P>
  145. Preventing the first two forms of leakage requires help from the kernel. In Plan 9, every process is represented by a directory in the <TT>/proc</TT> file system. Using the files in this directory, other processes could (with appropriate access permission) examine <TT>factotum</TT>'s memory and registers. <TT>Factotum</TT> is protected from processes of other users by the default access bits of its <TT>/proc</TT> directory. However, we'd also like to protect the agent from other processes owned by the same user, both to avoid honest mistakes and to prevent an unattended terminal being exploited to discover secret passwords. To do this, we added a control message to <TT>/proc</TT> called <TT>private</TT>. Once the <TT>factotum</TT> process has written <TT>private</TT> to its <TT>/proc/</TT><EM>pid</EM><TT>/ctl</TT> file, no process can access <TT>factotum</TT>'s memory through <TT>/proc</TT>. (Plan 9 has no other mechanism, such as <TT>/dev/kmem</TT>, for accessing a process's memory.)
  146. <P>
Similarly, the agent's address space should not be swapped out, to prevent discovering unencrypted keys on the swapping media. The <TT>noswap</TT> control message in <TT>/proc</TT> prevents this scenario. Neither <TT>private</TT> nor <TT>noswap</TT> is specific to <TT>factotum</TT>. User-level file servers such as <TT>dossrv</TT>, which interprets FAT file systems, could use <TT>noswap</TT> to keep their buffer caches from being swapped to disk.
  151. <P>
  152. Despite our precautions, attackers might still find a way to gain access to a process running as the host owner on a machine. Although they could not directly access the keys, attackers could use the local <TT>factotum</TT> to perform authentications for them. In the case of some keys, for example those locking bank accounts, we want a way to disable or at least detect such access. That is the role of the <TT>confirm</TT> attribute in a key. Whenever a key with a <TT>confirm</TT> attribute is accessed, the local user must confirm use of the key via a local GUI. The next section describes the actual mechanism.
  153. <P>
  154. We have not addressed leaks possible as a result of someone rebooting or resetting a machine running <TT>factotum</TT>. For example, someone could reset a machine and reboot it with a debugger instead of a kernel, allowing them to examine the contents of memory and find keys. We have not found a satisfactory solution to this problem.
  155. <H4>2.6. Factotum transactions
  156. </H4>
  157. External programs manage <TT>factotum</TT>'s internal key state through its file interface, writing textual <TT>key</TT> and <TT>delkey</TT> commands to the <TT>/mnt/factotum/ctl</TT> file. Both commands take a list of attributes as an argument. <TT>Key</TT> creates a key with the given attributes, replacing any extant key with an identical set of public attributes. <TT>Delkey</TT> deletes all keys that match the given set of attributes. Reading the <TT>ctl</TT> file returns a list of keys, one per line, displaying only public attributes. The following example illustrates these interactions.
  158. <pre>
  159. % cd /mnt/factotum
  160. % ls -l
  161. -lrw------- gre gre 0 Jan 30 22:17 confirm
  162. --rw------- gre gre 0 Jan 30 22:17 ctl
  163. -lr-------- gre gre 0 Jan 30 22:17 log
  164. -lrw------- gre gre 0 Jan 30 22:17 needkey
  165. --r--r--r-- gre gre 0 Jan 30 22:17 proto
  166. --rw-rw-rw- gre gre 0 Jan 30 22:17 rpc
  167. % cat &gt;ctl
  168. key dom=bell-labs.com proto=p9sk1 user=gre !password='don''t tell'
  169. key proto=apop server=x.y.com user=gre !password='bite me'
  170. ^D
  171. % cat ctl
  172. key dom=bell-labs.com proto=p9sk1 user=gre
  173. key proto=apop server=x.y.com user=gre
  174. % echo 'delkey proto=apop' &gt;ctl
  175. % cat ctl
  176. key dom=bell-labs.com proto=p9sk1 user=gre
  177. %
  178. </pre>(A file with the <TT>l</TT> bit set can be opened by only one process at a time.)<br>
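<P>
The behavior of the <TT>ctl</TT> file in this transcript can be modeled as follows. This is a sketch under stated assumptions rather than <TT>factotum</TT>'s implementation: <TT>ToyCtl</TT> is a hypothetical in-memory class, and for brevity it does not re-quote values when listing keys.

```python
import re

# attribute=value pairs; quoted values use '' for a literal single quote.
_PAIR = re.compile(r"(\S+?)=('(?:[^']|'')*'|\S*)")


def parse_attrs(text: str) -> dict[str, str]:
    attrs: dict[str, str] = {}
    for match in _PAIR.finditer(text):
        name, raw = match.group(1), match.group(2)
        if raw.startswith("'"):
            raw = raw[1:-1].replace("''", "'")
        attrs[name] = raw
    return attrs


def public(attrs: dict[str, str]) -> dict[str, str]:
    """Drop secret attributes, i.e. those whose name starts with !."""
    return {n: v for n, v in attrs.items() if not n.startswith("!")}


class ToyCtl:
    """Hypothetical model of writes to and reads from /mnt/factotum/ctl."""

    def __init__(self) -> None:
        self.keys: list[dict[str, str]] = []

    def write(self, line: str) -> None:
        verb, _, rest = line.strip().partition(" ")
        attrs = parse_attrs(rest)
        if verb == "key":
            # Replace any extant key with identical public attributes.
            self.keys = [k for k in self.keys if public(k) != public(attrs)]
            self.keys.append(attrs)
        elif verb == "delkey":
            # Delete every key that matches all the given attributes.
            self.keys = [
                k for k in self.keys
                if not all(k.get(n) == v for n, v in attrs.items())
            ]
        else:
            raise ValueError(f"unknown command: {verb}")

    def read(self) -> str:
        # One key per line, public attributes only (values not re-quoted).
        return "".join(
            "key " + " ".join(f"{n}={v}" for n, v in public(k).items()) + "\n"
            for k in self.keys
        )
```

Feeding it the same <TT>key</TT> and <TT>delkey</TT> lines as the transcript reproduces the two <TT>cat ctl</TT> listings shown above.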
  179. <P>
  180. The heart of the interface is the <TT>rpc</TT> file. Programs authenticate with <TT>factotum</TT> by writing a request to the <TT>rpc</TT> file and reading back the reply; this sequence is called an RPC <EM>transaction</EM>. Requests and replies have the same format: a textual verb possibly followed by arguments, which may be textual or binary. The most common reply verb is <TT>ok</TT>, indicating success. An RPC session begins with a <TT>start</TT> transaction; the argument is a key query as described earlier. Once started, an RPC conversation usually consists of a sequence of <TT>read</TT> and <TT>write</TT> transactions. If the conversation is successful, an <TT>authinfo</TT> transaction will return information about the identities learned during the transaction. The <TT>attr</TT> transaction returns a list of attributes for the current conversation; the list includes any attributes given in the <TT>start</TT> query as well as any public attributes from keys being used.
  181. <P>
  182. As an example of the <TT>rpc</TT> file in action, consider a mail client connecting to a mail server and authenticating using the POP3 protocol's APOP challenge-response command. There are four programs involved: the mail client <EM>PC</EM>, the client <TT>factotum</TT> <EM>FC</EM>, the mail server <EM>PS</EM>, and the server <TT>factotum</TT> <EM>FS</EM>. All authentication computations are handled by the <TT>factotum</TT> processes. The mail programs' role is just to relay messages.
  183. <P>
  184. At startup, the mail server at <TT>x.y.com</TT> begins an APOP conversation with its <TT>factotum</TT> to obtain the banner greeting, which includes a challenge:<br>
  185. <DL><DT><DD><EM>PS->FS</EM><TT>: start proto=apop role=server<br>
  186. </TT><EM>FS->PS</EM><TT>: ok<br>
  187. </TT><EM>PS->FS</EM><TT>: read<br>
  188. </TT><EM>FS->PS</EM><TT>: ok +OK POP3 </TT><EM>challenge<br>
  189. </EM></DL>Having obtained the challenge, the server greets the client:<br>
  190. <DL><DT><DD><EM>PS->PC</EM><TT>: +OK POP3 </TT><EM>challenge<br>
  191. </EM></DL>The client then uses an APOP conversation with its <TT>factotum</TT> to obtain a response:<br>
  192. <DL><DT><DD><EM>PC->FC</EM><TT>: start proto=apop role=client server=x.y.com<br>
</TT><EM>FC->PC</EM><TT>: ok<br>
  194. </TT><EM>PC->FC</EM><TT>: write +OK POP3 </TT><EM>challenge<br>
  195. FC->PC</EM><TT>: ok<br>
  196. </TT><EM>PC->FC</EM><TT>: read<br>
  197. </TT><EM>FC->PC</EM><TT>: ok APOP gre </TT><EM>response<br>
  198. </EM></DL><TT>Factotum</TT> requires that <TT>start</TT> requests include a <TT>proto</TT> attribute, and the APOP module requires an additional <TT>role</TT> attribute, but the other attributes are optional and only restrict the key space. Before responding to the <TT>start</TT> transaction, the client <TT>factotum</TT> looks for a key to use for the rest of the conversation. Because of the arguments in the <TT>start</TT> request, the key must have public attributes <TT>proto=apop</TT> and <TT>server=x.y.com</TT>; as mentioned earlier, the APOP module additionally requires that the key have <TT>user</TT> and <TT>!password</TT> attributes. Now that the client has obtained a response from its <TT>factotum</TT>, it echoes that response to the server:
  199. <DL><DT><DD><EM>PC->PS</EM><TT>: APOP gre </TT><EM>response<br>
  200. </EM></DL>Similarly, the server passes this message to its <TT>factotum </TT>and obtains another to send back.<br>
  201. <DL><DT><DD><EM>PS->FS</EM><TT>: write APOP gre </TT><EM>response<br>
  202. FS->PS</EM><TT>: ok<br>
  203. </TT><EM>PS->FS</EM><TT>: read<br>
  204. </TT><EM>FS->PS</EM><TT>: ok +OK welcome<br>
  205. </TT><EM>PS->PC</EM><TT>: +OK welcome<br>
  206. </TT></DL>Now the authentication protocol is done, and the server can retrieve information about what the protocol established.<br>
  207. <DL><DT><DD><EM>PS->FS</EM><TT>: authinfo<br>
  208. </TT><EM>FS->PS</EM><TT>: ok client=gre <TT>capability=</TT><EM>capability
  209. </EM><br>
  210. </TT></DL>The <TT>authinfo</TT> data is a list of <EM>attr</EM><TT>=</TT><EM>value </EM>pairs, here a client user name and a capability. (Protocols that establish shared secrets or provide mutual authentication indicate this by adding appropriate <EM>attr</EM><TT>=</TT><EM>value </EM>pairs.) The capability can be used by the server to change its identity to that of the client, as described earlier. Once it has changed its identity, the server can access and serve the client's mailbox.
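<P>
As an illustration of consuming an <TT>authinfo</TT> reply, the following is a minimal sketch in portable C that extracts one value from a space-separated list of attr=value pairs. The function name <TT>attrvalue</TT> is hypothetical and is not part of the Plan 9 library; the sketch assumes values contain no spaces or quoting.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of attr from a space-separated attr=value list
 * (the shape of the authinfo reply shown above) into out.
 * Returns 0 on success, -1 if the attribute is absent or out is too small.
 * Hypothetical helper: values containing spaces or quotes are not handled.
 */
int
attrvalue(const char *list, const char *attr, char *out, size_t nout)
{
	size_t alen = strlen(attr);
	const char *p = list;

	while(*p != '\0'){
		while(*p == ' ')
			p++;
		size_t toklen = strcspn(p, " ");
		/* match "attr=" exactly, so "cap" does not match "capability" */
		if(toklen > alen && strncmp(p, attr, alen) == 0 && p[alen] == '='){
			size_t vlen = toklen - alen - 1;
			if(vlen + 1 > nout)
				return -1;
			memcpy(out, p + alen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p += toklen;
	}
	return -1;
}
```

Called on a reply of the form <TT>client=gre capability=...</TT> with the attribute name <TT>client</TT>, the helper would yield the user name the server should become.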
  211. <P>
  212. Two more files provide hooks for a graphical <TT>factotum</TT> control interface. The first, <TT>confirm</TT>, allows the user detailed control over the use of certain keys. If a key has a <TT>confirm=</TT> attribute, then the user must approve each use of the key. A separate program with a graphical interface reads from the <TT>confirm</TT> file to see when a confirmation is necessary. The read blocks until a key usage needs to be approved, whereupon it will return a line of the form
  213. <DL><DT><DD><TT>confirm tag=1</TT> <EM>attributes<br>
  214. </EM></DL>requesting permission to use the key with those public attributes. The graphical interface then prompts the user for approval and writes back<br>
  215. <DL><DT><DD><TT>tag=1 answer=yes<br>
  216. </TT></DL>(or <TT>answer=no</TT>).<br>
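<P>
The confirmation exchange can be sketched in portable C as follows. This is only an illustration of the line formats quoted above; the function name <TT>confirmreply</TT> is hypothetical, and reading from and writing to the <TT>confirm</TT> file is left to the caller.

```c
#include <stdio.h>

/*
 * Given a line read from the confirm file ("confirm tag=N attributes"),
 * format the reply "tag=N answer=yes" or "tag=N answer=no" into out.
 * Returns 0 on success, -1 if the line is malformed or out is too small.
 */
int
confirmreply(const char *line, int approve, char *out, size_t nout)
{
	unsigned long tag;
	int n;

	if(sscanf(line, "confirm tag=%lu", &tag) != 1)
		return -1;
	n = snprintf(out, nout, "tag=%lu answer=%s", tag, approve ? "yes" : "no");
	if(n < 0 || (size_t)n >= nout)
		return -1;
	return 0;
}
```

Echoing the tag lets the reply be matched to the request it answers.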
  217. <P>
  218. The second file, <TT>needkey</TT>, diverts key requests. In the APOP example, if a suitable key had not been found during the <TT>start</TT> transaction, <TT>factotum</TT> would have indicated failure by returning a response indicating what key was needed:
  219. <pre>
  220. <EM>FC->PC</EM>: needkey proto=apop server=x.y.com user? !password?
  221. </pre>
  222. A typical client would then prompt the user for the desired key information, create a new key via the <TT>ctl</TT> file, and then reissue the <TT>start</TT> request. If the <TT>needkey</TT> file is open, then instead of failing, the transaction will block, and the next read from the <TT>/mnt/factotum/needkey</TT> file will return a line of the form
  223. <DL><DT><DD><TT>needkey tag=1</TT> <EM>attributes<br>
  224. </EM></DL>The graphical interface then prompts the user for the needed key information, creates the key via the <TT>ctl</TT> file, and writes back <TT>tag=1</TT> to resume the transaction.<br>
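<P>
A program servicing <TT>needkey</TT> requests must work out which attributes to ask the user for. The sketch below, in portable C, assumes that a requested attribute is written as a bare name followed by <TT>?</TT>, as in the example above; the function name <TT>neededattrs</TT> is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scan a key template such as "proto=apop server=x.y.com user? !password?"
 * and copy the names of attributes whose values are requested (those
 * written as "name?") into out, separated by single spaces.
 * Returns the number of such attributes, or -1 if out is too small.
 */
int
neededattrs(const char *tmpl, char *out, size_t nout)
{
	const char *p = tmpl;
	size_t used = 0;
	int count = 0;

	if(nout == 0)
		return -1;
	out[0] = '\0';
	while(*p != '\0'){
		while(*p == ' ')
			p++;
		size_t toklen = strcspn(p, " ");
		/* a request is a token ending in '?' that carries no value */
		if(toklen > 1 && p[toklen-1] == '?' && memchr(p, '=', toklen) == NULL){
			size_t namelen = toklen - 1;		/* drop the trailing '?' */
			size_t need = namelen + (count > 0);	/* plus a separator */
			if(used + need + 1 > nout)
				return -1;
			if(count > 0)
				out[used++] = ' ';
			memcpy(out + used, p, namelen);
			used += namelen;
			out[used] = '\0';
			count++;
		}
		p += toklen;
	}
	return count;
}
```

The leading <TT>!</TT> of a secret attribute is kept in the output, so the caller can tell that the value should not be echoed while the user types it.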
  225. <P>
  226. The remaining files are informational and used for debugging. The <TT>proto</TT> file contains a list of supported protocols (to see what protocols the system supports, <TT>cat /mnt/factotum/proto</TT>), and the <TT>log</TT> file contains a log of operations and debugging output enabled by a <TT>debug</TT> control message.
  227. <P>
  228. The next few sections explain how <TT>factotum</TT> is used by system services.<br>
  229. <H4>3. Authentication in 9P
  230. </H4>
  231. Plan 9 uses a remote file access protocol, 9P [Pike93], to connect to resources such as the file server and remote processes. The original design for 9P included special messages at the start of a conversation to authenticate the user. Multiple users can share a single connection, such as when a CPU server runs processes for many users connected to a single file server, but each must authenticate separately. The authentication protocol, similar to that of Kerberos [Stei88], used a sequence of messages passed between client, file server, and authentication server to verify the identities of the user, calling machine, and serving machine. One major drawback to the design was that the authentication method was defined by 9P itself and could not be changed. Moreover, there was no mechanism to relegate authentication to an external (trusted) agent, so a process implementing 9P needed, besides support for file service, a substantial body of cryptographic code to implement a handful of startup messages in the protocol.
  232. <P>
  233. A recent redesign of 9P addressed a number of file service issues outside the scope of this paper. On issues of authentication, there were two goals: first, to remove details about authentication from the protocol itself; second, to allow an external program to execute the authentication part of the protocol. In particular, we wanted a way to quickly incorporate ideas found in other systems such as SFS [Mazi99].
  234. <P>
  235. Since 9P is a file service protocol, the solution involved creating a new type of file to be served: an <EM>authentication file</EM>. Connections to a 9P service begin in a state that
  236. allows no general file access but permits the client
  237. to open an authentication file by sending a special message, generated by the new <TT>fauth </TT>system call:
  238. <DL><DT><DD><TT>afd = fauth(int fd, char *servicename);<br>
  239. </TT></DL>Here <TT>fd</TT> is the user's file descriptor for the established network connection to the 9P server and <TT>servicename</TT> is the name of the desired service offered on that server, typically the file subsystem to be accessed. The returned file descriptor, <TT>afd</TT>, is a unique handle representing the authentication file created for this connection to authenticate to this service; it is analogous to a capability. The authentication file represented by <TT>afd</TT> is not otherwise addressable on the server, such as through the file name hierarchy. In all other respects, it behaves like a regular file; most important, it accepts standard read and write operations.
  240. <P>
  241. To prove its identity, the user process (via <TT>factotum</TT>) executes the authentication protocol, described in the next section of this paper, over the <TT>afd</TT> file descriptor with ordinary reads and writes. When client and server have successfully negotiated, the authentication file changes state so it can be used as evidence of authority in <TT>mount</TT>.
  242. <P>
  243. Once identity is established, the process presents the (now verified) <TT>afd</TT> as proof of identity to the <TT>mount</TT> system call:
  244. <pre>
  245. mount(int fd, int afd, char *mountpoint,
  246. int flag, char *servicename)
  247. </pre>
  248. If the <TT>mount</TT> call succeeds, the user now has appropriate permissions for the file hierarchy made visible at the mount point.<br>
  249. <P>
  250. This sequence of events has several advantages. First, the actual authentication protocol is implemented using regular reads and writes, not special 9P messages, so they can be processed, forwarded, proxied, and so on by any 9P agent without special arrangement. Second, the business of negotiating the authentication by reading and writing the authentication file can be delegated to an outside agent, in particular <TT>factotum</TT>; the programs that implement the client and server ends of a 9P conversation need no authentication or cryptographic code. Third, since the authentication protocol is not defined by 9P itself, it is easy to change and can even be negotiated dynamically. Finally, since <TT>afd</TT> acts like a capability, it can be treated like one: handed to another process to give it special permissions; kept around for later use when authentication is again required; or closed to make sure no other process can use it.
  251. <P>
  252. All these advantages stem from moving the authentication negotiation into reads and writes on a separate file. As is often the case in Plan 9, making a resource (here authentication) accessible with a file-like interface reduces <EM>a priori </EM>the need for special interfaces.<br>
  253. <P>
  254. <H4>3.1. Plan 9 shared key protocol
  255. </H4>
  256. In addition to the various standard protocols supported by <TT>factotum</TT>, we use a shared key protocol for native Plan 9 authentication. This protocol provides backward compatibility with older versions of the system. One reason for the new architecture is to let us replace such protocols in the near future with more cryptographically secure ones.
  257. <P>
  258. <EM>P9sk1</EM> is a shared key protocol that uses tickets much like those in the original Kerberos. The difference is that we've replaced the expiration time in Kerberos tickets with a random nonce parameter and a counter. We summarize it here:
  259. <pre>
  260. <EM>C->S</EM>: <EM>nonceC</EM>
  261. <EM>S->C</EM>: <EM>nonceS</EM>,<EM>uidS</EM>,<EM>domainS</EM>
  262. <EM>C->A</EM>: <EM>nonceS</EM>,<EM>uidS</EM>,<EM>domainS</EM>,<EM>uidC</EM>,<EM>factotumC</EM>
  263. <EM>A->C</EM>: <EM>KC</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>},
  264. <EM>KS</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>}
  265. <EM>C->S</EM>: <EM>KS</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>},
  266. <EM>Kn</EM>{<EM>nonceS</EM>,<EM>counter</EM>}
  267. <EM>S->C</EM>: <EM>Kn</EM>{<EM>nonceC</EM>,<EM>counter</EM>}
  268. </pre>
  269. (Here <EM>K</EM>{<EM>x</EM>} indicates <EM>x </EM>encrypted with DES key <EM>K</EM>.) The first two messages exchange nonces and server identification. After this initial exchange, the client contacts the authentication server to obtain a pair of encrypted tickets, one encrypted with the client key and one with the server key. The client relays the server ticket to the server. The server believes that the ticket is new because it contains <EM>nonceS </EM>and that the ticket is from the authentication server because it is encrypted in the server key <EM>KS</EM>. The ticket is basically a statement from the authentication server that now <EM>uidC </EM>and <EM>uidS </EM>share a secret <EM>Kn</EM>. The authenticator <EM>Kn</EM>{<EM>nonceS</EM>,<EM>counter</EM>} convinces the server that the client knows <EM>Kn </EM>and thus must be <EM>uidC</EM>. Similarly, authenticator <EM>Kn</EM>{<EM>nonceC</EM>,<EM>counter</EM>} convinces the client that the server knows <EM>Kn </EM>and thus must be <EM>uidS</EM>. Tickets can be reused, without contacting the authentication server again, by incrementing the counter before each authenticator is generated.
  270. <P>
  271. In the future we hope to introduce a public key version of p9sk1, which would allow authentication even when the authentication server is not available.<br>
  272. <H4>3.2. The authentication server
  273. </H4>
  274. Each Plan 9 security domain has an authentication server (AS) that all users trust to keep the complete set of shared keys. It also offers services for users and administrators to manage the keys, create and disable accounts, and so on. It typically runs on a standalone machine with few other services. The AS comprises two services, <TT>keyfs</TT> and <TT>authsrv</TT>.
  275. <P>
  276. <TT>Keyfs</TT> is a user-level file system that manages an encrypted database of user accounts. Each account is represented by a directory containing the files <TT>key</TT>, containing the Plan 9 key for p9sk1; <TT>secret</TT> for the challenge/response protocols (APOP, VNC, CHAP, MSCHAP, CRAM); <TT>log</TT> for authentication outcomes; <TT>expire</TT> for an expiration time; and <TT>status</TT>. If the expiration time passes, if the number of successive failed authentications exceeds 50, or if <TT>disabled</TT> is written to the status file, any attempt to access the <TT>key</TT> or <TT>secret</TT> files will fail.
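<P>
The three conditions that lock an account can be summarized in a small predicate. This is a sketch in portable C, not the <TT>keyfs</TT> source: the <TT>Account</TT> structure and the function name <TT>accountusable</TT> are hypothetical, and treating an expiration time of 0 as "never expires" is an assumption.

```c
#include <string.h>
#include <time.h>

enum { MAXFAILURES = 50 };	/* threshold quoted in the text */

/* Hypothetical in-memory view of one keyfs account directory. */
typedef struct Account Account;
struct Account {
	time_t expire;		/* expiration time; 0 means "never" (assumption) */
	int failures;		/* successive failed authentications */
	const char *status;	/* contents of the status file */
};

/* Return 1 if reads of key or secret should succeed at time now, else 0. */
int
accountusable(const Account *a, time_t now)
{
	if(a->expire != 0 && now > a->expire)
		return 0;	/* expiration time has passed */
	if(a->failures > MAXFAILURES)
		return 0;	/* too many successive failures */
	if(strcmp(a->status, "disabled") == 0)
		return 0;	/* administratively disabled */
	return 1;
}
```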
  277. <P>
  278. <TT>Authsrv</TT> is a network service that brokers shared key authentications for the protocols p9sk1, APOP, VNC, CHAP, MSCHAP, and CRAM. Remote users can also call <TT>authsrv</TT> to change their passwords.<br>
  279. <P>
  280. The p9sk1 protocol was described in the previous section. The challenge/response protocols differ in detail but all follow the general structure:
  281. <pre>
  282. <EM>C->S</EM>: <EM>nonceC</EM>
  283. <EM>S->C</EM>: <EM>nonceS</EM>,<EM>uidS</EM>,<EM>domainS</EM>
  284. <EM>C->A</EM>: <EM>nonceS</EM>,<EM>uidS</EM>,<EM>domainS</EM>, <EM>hostidC</EM>,<EM>uidC</EM>
  285. <EM>A->C</EM>: <EM>KC</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>},
  286. <EM>KS</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>}
  287. <EM>C->S</EM>: <EM>KS</EM>{<EM>nonceS</EM>,<EM>uidC</EM>,<EM>uidS</EM>,<EM>Kn</EM>}, <EM>Kn</EM>{<EM>nonceS</EM>}
  288. <EM>S->C</EM>: <EM>Kn</EM>{<EM>nonceC</EM>}
  289. </pre>
  290. The password protocol is:
  291. <pre>
  292. <EM>C->A</EM>: <EM>uidC</EM>
  293. <EM>A->C</EM>: <EM>KC</EM>{<EM>Kn</EM>}
  294. <EM>C->A</EM>: <EM>Kn</EM>{<EM>passwordold</EM>,<EM>passwordnew</EM>}
  295. <EM>A->C</EM>: <EM>OK</EM>
  296. </pre>To avoid replay attacks, the pre-encryption clear text for each of the protocols (as well as for p9sk1) includes a tag indicating the encryption's role in the protocol. We elided them in these outlines.<br>
  297. <H4>3.3. Protocol negotiation
  298. </H4>
  299. Rather than require particular protocols for particular services, we implemented a negotiation metaprotocol, <EM>p9any</EM>, which chooses the actual authentication protocol to use. P9any is used now by all native services on Plan 9.<br>
  300. <P>
  301. The metaprotocol is simple. The callee sends a null-terminated string of the form:<br>
  302. <DL><DT><DD><TT>v</TT><EM>n proto</EM><SUB>1</SUB><TT>@</TT><EM>domain</EM><SUB>1</SUB> <EM>proto</EM><SUB>2</SUB><TT>@</TT><EM>domain</EM><SUB>2</SUB> <TT>...<br>
  303. </TT></DL>where <EM>n </EM>is a decimal version number, <EM>proto</EM><SUB><EM>k</EM></SUB> is the name of a protocol for which the <TT>factotum</TT> has a key, and <EM>domain</EM><SUB><EM>k</EM></SUB> is the name of the domain in which the key is valid. The caller then responds<br>
  304. <DL><DT><DD><EM>proto</EM><TT>@</TT><EM>domain<br>
  305. </EM></DL>indicating its choice. Finally the callee responds<br>
  306. <DL><DT><DD><TT>OK<br>
  307. </TT></DL>Any other string indicates failure. At this point the chosen protocol commences. The final fixed-length reply is used to make it easy to delimit the I/O stream should the chosen protocol require the caller rather than the callee to send the first message.<br>
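<P>
The caller's side of the negotiation can be sketched in portable C as follows. The selection policy shown, taking the first offered pair for which the caller holds a key, is an assumption made for illustration, and the function name <TT>p9anychoose</TT> is hypothetical; a real <TT>factotum</TT> may choose differently.

```c
#include <stddef.h>
#include <string.h>

/*
 * Pick the first "proto@domain" pair in the callee's offer for which the
 * caller holds a key. have[] lists the caller's usable pairs.
 * Copies the choice into out and returns 0, or returns -1 if the offer is
 * malformed, nothing matches, or out is too small.
 */
int
p9anychoose(const char *offer, const char *const have[], size_t nhave,
	char *out, size_t nout)
{
	const char *p = offer;

	if(*p != 'v')
		return -1;
	p += strcspn(p, " ");	/* skip the version token */
	while(*p != '\0'){
		while(*p == ' ')
			p++;
		size_t toklen = strcspn(p, " ");
		for(size_t i = 0; toklen > 0 && i < nhave; i++){
			if(strlen(have[i]) == toklen && strncmp(p, have[i], toklen) == 0){
				if(toklen + 1 > nout)
					return -1;
				memcpy(out, p, toklen);
				out[toklen] = '\0';
				return 0;
			}
		}
		p += toklen;
	}
	return -1;
}
```

Because the choice is driven entirely by the keys each side holds, replacing a key is enough to change the protocol that a service ends up using.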
  308. <P>
  309. With this negotiation metaprotocol, the underlying authentication protocols used for Plan 9 services can be changed under any application just by changing the keys known by the <TT>factotum</TT> agents at each end.<br>
  310. <P>
  311. P9any is vulnerable to man-in-the-middle attacks to the extent that the attacker may constrain the possible choices by changing the stream. However, we believe this is acceptable since the attacker cannot force either side to choose algorithms that it is unwilling to use.
  312. <H4>4. Library Interface to Factotum
  313. </H4>
  314. Although programs can access <TT>factotum</TT>'s services through its file system interface, it is more common to use a C library that packages the interaction. There are a number of routines in the library, not all of which are relevant here, but a few examples should give their flavor.
  315. <P>
  316. First, consider the problem of mounting a remote file server using 9P. An earlier discussion showed how the <TT>fauth</TT> and <TT>mount</TT> system calls use an authentication file, <TT>afd</TT>, as a capability, but not how <TT>factotum</TT> manages <TT>afd</TT>. The library contains a routine, <TT>amount</TT> (authenticated mount), that is used by most programs in preference to the raw <TT>fauth</TT> and <TT>mount</TT> calls. <TT>Amount</TT> engages <TT>factotum</TT> to validate <TT>afd</TT>; here is the complete code:
  317. <pre>
  318. int
  319. amount(int fd, char *mntpt, int flags, char *aname)
  320. {
  321.     int afd, ret;
  322.     AuthInfo *ai;
  323.     afd = fauth(fd, aname);
  324.     if(afd &gt;= 0){
  325.         ai = auth_proxy(afd, amount_getkey,
  326.             &quot;proto=p9any role=client&quot;);
  327.         if(ai != NULL)
  328.             auth_freeAI(ai);
  329.     }
  330.     ret = mount(fd, afd, mntpt, flags, aname);
  331.     if(afd &gt;= 0)
  332.         close(afd);
  333.     return ret;
  334. }
  335. </pre>
  336. where parameter <TT>fd</TT> is a file descriptor returned by <TT>open</TT> or <TT>dial</TT> for a new connection to a file server. The conversation with <TT>factotum</TT> occurs in the call to <TT>auth_proxy</TT>, which specifies, as a key query, which authentication protocol to use (here the metaprotocol <TT>p9any</TT>) and the role being played (<TT>client</TT>). <TT>Auth_proxy</TT> will read and write the <TT>factotum</TT> files, and the authentication file descriptor <TT>afd</TT>, to validate the user's right to access the service. If the call is successful, any auxiliary data, held in an <TT>AuthInfo</TT> structure, is freed. In any case, the <TT>mount</TT> is then called with the (perhaps validated) <TT>afd.</TT> A 9P server can cause the <TT>fauth</TT> system call to fail, as an indication that authentication is not required to access the service.
  337. <P>
  338. The second argument to <TT>auth_proxy</TT> is a function, here <TT>amount_getkey</TT>, to be called if secret information such as a password or response to a challenge is required as part of the authentication. This function, of course, will provide this data to <TT>factotum</TT> as a <TT>key</TT> message on the <TT>/mnt/factotum/ctl</TT> file.
  339. <P>
  340. Although the final argument to <TT>auth_proxy</TT> in this example is a simple string, in general it can be a formatted-print specifier in the manner of <TT>printf</TT>, to enable the construction of more elaborate key queries.<br>
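<P>
To illustrate how a printf-style key query might be assembled, here is a minimal sketch in portable C using <TT>vsnprintf</TT>. The helper name <TT>keyquery</TT> is hypothetical; it stands in for whatever formatting <TT>auth_proxy</TT> performs internally and is not part of the Plan 9 library.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Build a key query from a printf-style format and its arguments.
 * Returns 0 on success, -1 on formatting error or truncation.
 */
int
keyquery(char *out, size_t nout, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(out, nout, fmt, ap);
	va_end(ap);
	if(n < 0 || (size_t)n >= nout)
		return -1;	/* reject a truncated query rather than send it */
	return 0;
}
```

A call such as <TT>keyquery(buf, sizeof buf, "proto=%s role=client server=%s", proto, host)</TT> would narrow the key space to a particular server.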
  341. <P>
  342. As another example, consider the Plan 9 <TT>cpu</TT> service, which exports local devices to a shell process on a remote machine, typically to connect the local screen and keyboard to a more powerful computer. At heart, <TT>cpu</TT> is a superset of a service called <TT>exportfs</TT> [Pike93], which allows one machine to see an arbitrary portion of the file name space of another machine, such as to export the network device to another machine for gatewaying. However, <TT>cpu</TT> is not just <TT>exportfs</TT> because it also delivers signals such as interrupt and negotiates the initial environment for the remote shell.
  343. <P>
  344. To authenticate an instance of <TT>cpu</TT> requires <TT>factotum</TT> processes on both ends: the local, client end running as the user on a terminal and the remote, server end running as the host owner of the server machine. Here is schematic code for the two ends:
  345. <pre>
  346. /* client */
  347. int
  348. p9auth(int fd)
  349. {
  350.     AuthInfo *ai;
  351.     ai = auth_proxy(fd, auth_getkey, &quot;proto=p9any role=client&quot;);
  352.     if(ai == NULL)
  353.         return -1;
  354.     /* start cpu protocol here */
  355. }
  356. /* server */
  357. int
  358. srvp9auth(int fd, char *user)
  359. {
  360.     AuthInfo *ai;
  361.     ai = auth_proxy(fd, NULL, &quot;proto=p9any role=server&quot;);
  362.     if(ai == NULL)
  363.         return -1;
  364.     /* set user id for server process */
  365.     if(auth_chuid(ai, NULL) &lt; 0)
  366.         return -1;
  367.     /* start cpu protocol here */
  368. }
  369. </pre>
  370. <TT>Auth_chuid</TT> encapsulates the negotiation to change a user id using the <TT>caphash</TT> and <TT>capuse</TT> files of the (server) kernel. Note that although the client process may ask the user for new keys, using <TT>auth_getkey</TT>, the server machine, presumably a shared machine with a pseudo-user for the host owner, sets the key-getting function to <TT>NULL</TT>.
  371. <H4>5. Secure Store
  372. </H4>
  373. <TT>Factotum</TT> keeps its keys in volatile memory, which must somehow be
  374. initialized at boot time.
  375. Therefore,
  376. <TT>factotum</TT> must be supplemented by a persistent store, perhaps a floppy disk containing a key file of commands to be copied into <TT>/mnt/factotum/ctl</TT> during bootstrap. But removable media are a nuisance to carry and are vulnerable to theft. Keys could be stored encrypted on a shared file system, but only if those keys are not necessary for authenticating to the file system in the first place. Even if the keys are encrypted under a user password, a thief might well succeed with a dictionary attack. Other risks of local storage are loss of the contents through mechanical mishap or dead batteries. Thus for convenience and safety we provide a <TT>secstore</TT> (secure store) server in the network to hold each user's permanent list of keys, a <EM>key file</EM>.
  377. <P>
  378. <TT>Secstore</TT> is a file server for encrypted data, used only during bootstrapping. It must provide strong authentication and resistance to passive and active protocol attacks while assuming nothing more from the client than a password. Once <TT>factotum</TT> has loaded the key file, further encrypted or authenticated file storage can be accomplished by standard mechanisms.
  379. <P>
  380. The cryptographic technology that enables <TT>secstore</TT> is a form of encrypted key exchange called PAK [Boyk00], analogous to EKE [Bell93], SRP [Wu98], or SPEKE [Jabl]. PAK was chosen because it comes with a proof of equivalence in strength to Diffie-Hellman; subtle flaws in some earlier encrypted key exchange protocols and implementations have encouraged us to take special care. In outline, the PAK protocol is:
  381. <DL><DT><DD><EM>C->S</EM>: <EM>C</EM>,<EM>g<SUP>x</SUP>H</EM><br>
  382. <EM>S->C</EM>: <EM>S</EM>,<EM>g<SUP>y</SUP></EM>,<EM>hash</EM>(<EM>g<SUP>xy</SUP></EM>,<EM>C</EM>,<EM>S</EM>)<br>
  383. <EM>C->S</EM>: <EM>hash</EM>(<EM>g<SUP>xy</SUP></EM>,<EM>S</EM>,<EM>C</EM>)<br>
  384. </DL>where <EM>H </EM>is a preshared secret between client <EM>C </EM>and server <EM>S</EM>. There are several variants of PAK, all presented in papers mainly concerned with proofs of cryptographic properties. To aid implementers, we have distilled a description of the specific version we use into an Appendix to this paper. The Plan 9 open source license provides for use of Lucent's encrypted key exchange patents in this context.
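<P>
The arithmetic behind the outline can be illustrated with toy numbers. The sketch below, in portable C, assumes a prime modulus and assumes that the server recovers the client's exponential by dividing the preshared secret out of the first message; it omits the hash confirmations and is not the specific variant given in the Appendix. All function names are hypothetical, and real parameters would be far larger than 64-bit arithmetic allows.

```c
#include <stdint.h>

/* Modular exponentiation by squaring; safe for moduli below 2^32. */
static uint64_t
modpow(uint64_t base, uint64_t exp, uint64_t mod)
{
	uint64_t r = 1;

	base %= mod;
	while(exp > 0){
		if(exp & 1)
			r = r * base % mod;
		base = base * base % mod;
		exp >>= 1;
	}
	return r;
}

/* Client's first message: g^x * H mod p. */
uint64_t
pakclientmsg(uint64_t g, uint64_t x, uint64_t h, uint64_t p)
{
	return modpow(g, x, p) * (h % p) % p;
}

/* Server's view of the secret: strip H from m, then raise to y. */
uint64_t
pakserversecret(uint64_t m, uint64_t y, uint64_t h, uint64_t p)
{
	uint64_t hinv = modpow(h, p - 2, p);	/* inverse of H; requires p prime */

	return modpow(m % p * hinv % p, y, p);
}

/* Client's view of the secret: (g^y)^x mod p. */
uint64_t
pakclientsecret(uint64_t gy, uint64_t x, uint64_t p)
{
	return modpow(gy, x, p);
}
```

With the right preshared secret both sides compute the same value; a server that strips a different value from the first message derives a different one, so the hash sent in the second message would not verify.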
  385. <P>
  386. As a further layer of defense against password theft, we provide (within the encrypted channel <EM>C->S</EM>) information that is validated at a RADIUS server, such as the digits from a hardware token [RFC2138]. This provides two-factor authentication, which potentially requires tricking two independent administrators in any attack by social engineering.
  387. <P>
  388. The key file stored on the server is encrypted with AES (Rijndael) using CBC with a 10-byte initialization vector and trailing authentication padding. All this is invisible to the user of <TT>secstore</TT>. For that matter, it is invisible to the <TT>secstore</TT> server as well; if the AES Modes of Operation are standardized and a new encryption format designed, it can be implemented by a client without change to the server. The <TT>secstore</TT> is deliberately not backed up; the user is expected to use more than one <TT>secstore</TT> or save the key file on removable media and lock it away. The user's password is hashed to create the <EM>H </EM>used in the PAK protocol; a different hash of the password is used as the file encryption key. Finally, there is a command (inside the authenticated, encrypted channel between client and <TT>secstore</TT>) to change passwords by sending a new <EM>H</EM>; for consistency, the client process must at the same time fetch and re-encrypt all files.
  389. <P>
  390. When <TT>factotum</TT> starts, it dials the local <TT>secstore</TT> and checks whether the user has an account. If so, it prompts for the user's <TT>secstore</TT> password and fetches the key file. The PAK protocol ensures mutual authentication and prevents dictionary attacks on the password by passive wiretappers or active intermediaries. Passwords saved in the key file can be long random strings suitable for simpler challenge/response authentication protocols. Thus the user need only remember a single, weaker password to enable strong, "single sign-on" authentication to unchanged legacy applications scattered across multiple authentication domains.
  391. <H4>6. Transport Layer Security
  392. </H4>
  393. Since the Plan 9 operating system is designed for use in network elements that must withstand direct attack, unguarded by firewall or VPN, we seek to ensure that all applications use channels with appropriate mutual authentication and encryption. A principal tool for this is TLS 1.0 [RFC2246]. (TLS 1.0 is nearly the same as SSL 3.0, and our software is designed to interoperate with implementations of either standard.)
  394. <P>
  395. TLS defines a record layer protocol for message integrity and privacy through the use of message digesting and encryption with shared secrets. We implement this service as a kernel device, though it could be performed at slightly higher cost by invoking a separate program. The library interface to the TLS kernel device is:
  396. <pre>
  397. int pushtls(int fd, char *hashalg, char *cryptalg, int isclient,
  398. char *secret, char *dir);
  399. </pre>
  400. Given a file descriptor, the names of message digest and encryption algorithms, and the shared secret, <TT>pushtls</TT> returns a new file descriptor for the encrypted connection. (The final argument <TT>dir</TT> receives the name of the directory in the TLS device that is associated with the new connection.) The function is named by analogy with the "push" operation supported by the stream I/O system of Research Unix and the first two editions of Plan 9. Because adding encryption is as simple as replacing one file descriptor with another, adding encryption to a particular network service is usually trivial.
  401. <P>
  402. The Plan 9 shared key authentication protocols establish a shared 56-bit secret as a side effect. Native Plan 9 network services such as <TT>cpu</TT> and <TT>exportfs</TT> use these protocols for authentication and then invoke <TT>pushtls</TT> with the shared secret.<br>
  403. <P>
  404. Above the record layer, TLS specifies a handshake protocol using public keys to establish the session secret. This protocol is widely used with HTTP and IMAP4 to provide server authentication, though with client certificates it could provide mutual authentication. The library function
  405. <DL><DT><DD><TT>int tlsClient(int fd, TLSconn *conn)<br>
  406. </TT></DL>handles the initial handshake and returns the result of <TT>pushtls</TT>. On return, it fills the <TT>conn</TT> structure with the session ID used and the X.509 certificate presented by the server, but makes no effort to verify the certificate. Although the original design intent of X.509 certificates expected that they would be used with a Public Key Infrastructure, reliable deployment has been so long delayed and problematic that we have adopted the simpler policy of just using the X.509 certificate as a representation of the public key, depending on a locally-administered directory of SHA1 thumbprints to allow applications to decide which public keys to trust for which purposes.
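<P>
The thumbprint policy amounts to a table lookup. The following sketch in portable C shows one way an application might consult such a directory; the <TT>Thumbprint</TT> structure and the function name <TT>trustedcert</TT> are hypothetical and do not reproduce the Plan 9 library interface. Computing the SHA1 digest of the certificate is assumed to happen elsewhere.

```c
#include <stddef.h>
#include <string.h>

enum { SHA1LEN = 20 };	/* a SHA1 digest is 20 bytes */

/* Hypothetical entry in a locally administered thumbprint table. */
typedef struct Thumbprint Thumbprint;
struct Thumbprint {
	unsigned char sha1[SHA1LEN];	/* digest of a trusted certificate */
	const char *purpose;		/* what the key is trusted for */
};

/*
 * Return 1 if digest appears in table with the given purpose, else 0.
 * Computing the digest of the certificate is left to the caller.
 */
int
trustedcert(const Thumbprint *table, size_t n,
	const unsigned char digest[SHA1LEN], const char *purpose)
{
	for(size_t i = 0; i < n; i++)
		if(memcmp(table[i].sha1, digest, SHA1LEN) == 0
		&& strcmp(table[i].purpose, purpose) == 0)
			return 1;
	return 0;
}
```

Keeping the purpose alongside the digest reflects the point that a key trusted for one service need not be trusted for another.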
  407. <H4>7. Related Work and Discussion
  408. </H4>
  409. Kerberos, one of the earliest distributed authentication systems, keeps a set of authentication tickets in a temporary file called a ticket cache. The ticket cache is protected by Unix file permissions. An environment variable containing the file name of the ticket cache allows for different ticket caches in different simultaneous login sessions. A user logs in by typing his or her Kerberos password. The login program uses the Kerberos password to obtain a temporary ticket-granting ticket from the authentication server, initializes the ticket cache with the ticket-granting ticket, and then forgets the password. Other applications can use the ticket-granting ticket to sign tickets for themselves on behalf of the user during the login session. The ticket cache is removed when the user logs out [Stei88]. The ticket cache relieves the user from typing a password every time authentication is needed.
  410. <P>
  411. The secure shell SSH develops this idea further, replacing the temporary file with a named Unix domain socket connected to a user-level program, called an agent. Once the SSH agent is started and initialized with one or more RSA private keys, SSH clients can employ it to perform RSA authentications on their behalf. In the absence of an agent, SSH typically uses RSA keys read from encrypted disk files or uses passphrase-based authentication, both of which would require prompting the user for a passphrase whenever authentication is needed [Ylon96]. The self-certifying file system SFS uses a similar agent [Kami00], not only for moderating the use of client authentication keys but also for verifying server public keys [Mazi99].
  412. <P>
  413. <TT>Factotum</TT> is a logical continuation of this evolution, replacing the program-specific SSH or SFS agents with a general agent capable of serving a wide variety of programs. Having one agent for all programs removes the need to have one agent for each program. It also allows the programs themselves to be protocol-agnostic, so that, for example, one could build an SSH workalike capable of using any protocol supported by <TT>factotum</TT>, without that program knowing anything about the protocols. Traditionally each program needs to implement each authentication protocol for itself, an <EM>O</EM>(<EM>n</EM><SUP>2</SUP>) coding problem that <TT>factotum</TT> reduces to <EM>O</EM>(<EM>n</EM>).
  414. <P>
  415. Previous work on agents has concentrated on their use by clients authenticating to servers. Looking in the other direction, Sun Microsystems' pluggable authentication module (PAM) is one of the earliest attempts to provide a general authentication mechanism for Unix-like operating systems [Sama96]. Without a central authority like PAM, system policy is tied up in the various implementations of network services. For example, on a typical Unix, if a system administrator decides not to allow plaintext passwords for authentication, the configuration files for a half dozen different servers -- <TT>rlogind</TT>, <TT>telnetd</TT>, <TT>ftpd</TT>, <TT>sshd</TT>, and so on -- need to be edited. PAM solves this problem by hiding the details of a given authentication mechanism behind a common library interface. Directed by a system-wide configuration file, an application selects a particular authentication mechanism by dynamically loading the appropriate shared library. PAM is widely used on Sun's Solaris and some Linux distributions.
  416. <P>
  417. <TT>Factotum</TT> achieves the same goals using the agent approach. <TT>Factotum</TT> is the only process that needs to create capabilities, so all the network servers can run as untrusted users (e.g., Plan 9's <TT>none</TT> or Unix's <TT>nobody</TT>), which greatly reduces the harm done if a server is buggy and is compromised. In fact, if <TT>factotum</TT> were implemented on Unix along with an analogue to the Plan 9 capability device, venerable programs like <TT>su</TT> and <TT>login</TT> would no longer need to be installed "setuid root."
  418. <P>
  419. Several other systems, such as Password Safe [Schn], store multiple passwords in an encrypted file, so that the user only needs to remember one password. Our <TT>secstore</TT> solution differs from these by placing the storage in a hardened location in the network, so that the encrypted file is less liable to be stolen for offline dictionary attack and so that it is available even when a user has several computers. In contrast, Microsoft's Passport system [Micr] keeps credentials in the network, but centralized at one extremely-high-value target. The important feature of Passport, setting up trust relationships with e-merchants, is outside our scope. The <TT>secstore</TT> architecture is almost identical to Perlman and Kaufman's [Perl99] but with newer EKE technology. Like them, we chose to defend mainly against outside attacks on <TT>secstore</TT>; if additional defense of the files on the server itself is desired, one can use distributed techniques [Ford00].
  420. <P>
  421. We made a conscious choice of placing encryption, message integrity, and key management at the application layer (TLS, just above layer 4) rather than at layer 3, as in IPsec. This leads to a simpler structure for the network stack, easier integration with applications and, most important, easier network administration since we can recognize which applications are misbehaving based on TCP port numbers. TLS does suffer (relative to IPsec) from the possibility of forged TCP Reset, but we feel that this is adequately dealt with by randomized TCP sequence numbers. In contrast with other TLS libraries, Plan 9 does not require the application to change <TT>write</TT> calls to <TT>sslwrite</TT> but simply to add a few lines of code at startup [Resc01].
  422. <H4>8. Conclusion
  423. </H4>
  424. Writing safe code is difficult. Stack attacks, mistakes in logic, and bugs in compilers and operating systems can each make it possible for an attacker to subvert the intended execution sequence of a service. If the server process has the privileges of a powerful user, such as <TT>root</TT> on Unix, then so does the attacker. <TT>Factotum</TT> allows us to constrain the privileged execution to a single process whose core is a few thousand lines of code. Verifying such a process, through both manual and automatic means, is much easier and less error-prone than verifying every server.
  425. <P>
  426. An implementation of these ideas is in Plan 9 from Bell Labs, Fourth Edition, freely available from <A href=http://plan9.bell-labs.com/plan9><TT>http://plan9.bell-labs.com/plan9</TT></A>.
  427. <H4>Acknowledgments<br>
  428. </H4>
  429. William Josephson contributed to the implementation of password changing in <TT>secstore</TT>. We thank Phil MacKenzie and Mart&iacute;n Abadi for helpful comments on early parts of the design. Chuck Blake, Peter Bosch, Frans Kaashoek, Sape Mullender, and Lakshman Y. N., predominantly Dutchmen, gave helpful comments on the paper. Russ Cox is supported by a fellowship from the Fannie and John Hertz Foundation.
  430. <H4>References<br>
  431. </H4>
  432. [Bell93] S.M. Bellovin and M. Merritt, ``Augmented Encrypted Key Exchange,'' Proceedings of the 1st ACM Conference on Computer and Communications Security, 1993, pp. 244 - 250.<br>
  433. <P>
  434. [Boyk00] Victor Boyko, Philip MacKenzie, and Sarvar Patel, ``Provably Secure Password-Authenticated Key Exchange using Diffie-Hellman,'' Eurocrypt 2000, pp. 156-171.<br>
  435. <P>
  436. [RFC2246] T. Dierks and C. Allen, ``The TLS Protocol, Version 1.0,'' RFC 2246.<br>
  437. <P>
  438. [Ford00] Warwick Ford and Burton S. Kaliski, Jr., ``Server-Assisted Generation of a Strong Secret from a Password,'' IEEE Fifth International Workshop on Enterprise Security, National Institute of Standards and Technology (NIST), Gaithersburg MD, June 14 - 16, 2000.<br>
  439. <P>
  440. [Jabl] David P. Jablon, ``Strong Password-Only Authenticated Key Exchange,'' <A href=http://integritysciences.com/speke97.html><TT>http://integritysciences.com/speke97.html</TT></A>.<br>
  441. <P>
  442. [Kami00] Michael Kaminsky. ``Flexible Key Management with SFS Agents,'' Master's Thesis, MIT, May 2000.<br>
  443. <P>
  444. [Mack] Philip MacKenzie, private communication.<br>
  445. <P>
  446. [Mazi99] David Mazi&egrave;res, Michael Kaminsky, M. Frans Kaashoek and Emmett Witchel, ``Separating key management from file system security,'' Symposium on Operating Systems Principles, 1999, pp. 124-139.<br>
  447. <P>
  448. [Micr] Microsoft Passport, <A href=http://www.passport.com/><TT>http://www.passport.com/</TT></A>.<br>
  449. <P>
  450. [Perl99] Radia Perlman and Charlie Kaufman, ``Secure Password-Based Protocol for Downloading a Private Key,'' Proc. 1999 Network and Distributed System Security Symposium, Internet Society, January 1999.<br>
  451. <P>
  452. [Pike95] Rob Pike, Dave Presotto, Sean Dorward, Bob Flandrena, Ken Thompson, Howard Trickey, and Phil Winterbottom, ``Plan 9 from Bell Labs,'' Computing Systems, 8, 3, Summer 1995, pp. 221-254.<br>
  453. <P>
  454. [Pike93] Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, Phil Winterbottom, ``The Use of Name Spaces in Plan 9,'' Operating Systems Review, 27, 2, April 1993, pp. 72-76 (reprinted from Proceedings of the 5th ACM SIGOPS European Workshop, Mont Saint-Michel, 1992, Paper n&#186; 34).<br>
  455. <P>
  456. [Resc01] Eric Rescorla, ``SSL and TLS: Designing and Building Secure Systems,'' Addison-Wesley, 2001. ISBN 0-201-61598-3, p. 387.<br>
  457. <P>
  458. [RFC2138] C. Rigney, A. Rubens, W. Simpson, S. Willens, ``Remote Authentication Dial In User Service (RADIUS),'' RFC2138, April 1997.<br>
  459. <P>
  460. [RiLa] Ronald L. Rivest and Butler Lampson, ``SDSI--A Simple Distributed Security Infrastructure,'' <A href=http://theory.lcs.mit.edu/~rivest/sdsi10.ps><TT>http://theory.lcs.mit.edu/~rivest/sdsi10.ps</TT></A>.<br>
  461. <P>
  462. [Schn] Bruce Schneier, Password Safe, <A href=http://www.counterpane.com/passsafe.html><TT>http://www.counterpane.com/passsafe.html</TT></A>.<br>
  463. <P>
  464. [Sama96] Vipin Samar, ``Unified Login with Pluggable Authentication Modules (PAM),'' Proceedings of the Third ACM Conference on Computer and Communications Security, March 1996, New Delhi, India.<br>
  465. <P>
  466. [Stei88] Jennifer G. Steiner, Clifford Neuman, and Jeffrey I. Schiller, ``<EM>Kerberos</EM>: An Authentication Service for Open Network Systems,'' Proceedings of USENIX Winter Conference, Dallas, Texas, February 1988, pp. 191-202.<br>
  467. <P>
  468. [Wu98] T. Wu, ``The Secure Remote Password Protocol,'' Proceedings of the 1998 Internet Society Network and Distributed System Security Symposium, San Diego, CA, March 1998, pp. 97-111.<br>
  469. <P>
  470. [Ylon96] Ylonen, T., ``SSH--Secure Login Connections Over the Internet,'' 6th USENIX Security Symposium, pp. 37-42. San Jose, CA, July 1996.<br>
  471. <H4>Appendix: Summary of the PAK protocol
  472. </H4>
  473. Let <EM>q&gt;</EM>2^160 and <EM>p&gt;</EM>2^1024 be primes such that <EM>p=rq+</EM>1 with <EM>r </EM>not a multiple of <EM>q</EM>. Take <EM>h∈Zp* </EM>such that <EM>g==h^r </EM>is not 1. These parameters may be chosen by the NIST algorithm for DSA, and are public, fixed values. The client <EM>C </EM>knows a secret π and computes <EM>H==</EM>(<EM>H</EM>1(<EM>C</EM>, π))^<EM>r </EM>and <EM>H</EM>^-1, where <EM>H</EM>1 is a hash function yielding a random element of <EM>Zp*</EM>, and <EM>H</EM>^-1 may be computed by gcd. (All arithmetic is modulo <EM>p</EM>.) The client gives <EM>H</EM>^-1 to the server <EM>S </EM>ahead of time by a private channel. To start a new connection, the client generates a random value <EM>x</EM>, computes <EM>m==g^xH</EM>, then calls the server and sends <EM>C </EM>and <EM>m</EM>. The server checks <EM>m!=</EM>0 mod <EM>p</EM>, generates random <EM>y</EM>, computes μ==<EM>g^y</EM>, σ==(<EM>mH</EM>^-1)^<EM>y</EM>, and sends <EM>S</EM>, μ, and <EM>k==sha1</EM>(&quot;server&quot;,<EM>C</EM>,<EM>S</EM>,<EM>m</EM>,μ,σ,<EM>H</EM>^-1). Next the client computes σ==μ^<EM>x</EM>, verifies <EM>k</EM>, and sends <EM>k&#180;==sha1</EM>(&quot;client&quot;,<EM>C</EM>,<EM>S</EM>,<EM>m</EM>,μ,σ,<EM>H</EM>^-1). The server then verifies <EM>k&#180; </EM>and both sides begin using session key <EM>K==sha1</EM>(&quot;session&quot;,<EM>C</EM>,<EM>S</EM>,<EM>m</EM>,μ,σ,<EM>H</EM>^-1). In the published version of PAK, the server name <EM>S </EM>is included in the initial hash <EM>H</EM>, but doing so is inconvenient in our application, as the server may be known by various equivalent names.
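<P>
The exchange described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the <TT>secstore</TT> implementation: the public parameters are toy-sized (a 31-bit <EM>q</EM> instead of <EM>q&gt;</EM>2^160), <EM>H</EM>1 is approximated by SHA-1 reduced into <EM>Zp*</EM>, the hash inputs are joined with "|" because the text does not specify an encoding, and every function name is hypothetical. Computing the modular inverse with <TT>pow(h, -1, p)</TT> needs Python 3.8 or later.

```python
import hashlib
import hmac
import secrets


def is_prime(n):
    """Trial division; adequate only for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# ASSUMPTION: toy public parameters, far too small for real use.
Q = 2**31 - 1                                        # prime q
R = next(r for r in range(2, 1000, 2) if is_prime(r * Q + 1))
P = R * Q + 1                                        # prime p = rq + 1
G = next(pow(h, R, P) for h in range(2, P) if pow(h, R, P) != 1)  # g = h^r != 1


def tagged_sha1(label, *fields):
    # ASSUMPTION: fields are serialized as decimal strings joined by "|".
    data = "|".join([label, *map(str, fields)]).encode()
    return hashlib.sha1(data).digest()


def h1(client, secret):
    # ASSUMPTION: stand-in for H1; maps (C, pi) to an element of Zp*.
    digest = hashlib.sha1(f"{client}|{secret}".encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def enroll(client, secret):
    """Return (H, H^-1); the server is given H^-1 ahead of time."""
    h_elem = pow(h1(client, secret), R, P)
    return h_elem, pow(h_elem, -1, P)


def client_start(h_elem):
    """Pick random x and compute m = g^x * H mod p."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P) * h_elem % P


def server_respond(client, server, m, h_inv):
    """Check m, pick random y, compute mu = g^y, sigma = (m * H^-1)^y, and k."""
    if m % P == 0:
        raise ValueError("m is 0 mod p")
    y = secrets.randbelow(Q - 1) + 1
    mu = pow(G, y, P)
    sigma = pow(m * h_inv % P, y, P)
    return mu, sigma, tagged_sha1("server", client, server, m, mu, sigma, h_inv)


def client_finish(client, server, x, m, mu, k, h_inv):
    """Compute sigma = mu^x, verify k, and return (k', session key K)."""
    sigma = pow(mu, x, P)
    args = (client, server, m, mu, sigma, h_inv)
    if not hmac.compare_digest(k, tagged_sha1("server", *args)):
        raise ValueError("server proof k did not verify")
    return tagged_sha1("client", *args), tagged_sha1("session", *args)


def server_finish(client, server, m, mu, sigma, k_prime, h_inv):
    """Verify k' and return the session key K."""
    args = (client, server, m, mu, sigma, h_inv)
    if not hmac.compare_digest(k_prime, tagged_sha1("client", *args)):
        raise ValueError("client proof k' did not verify")
    return tagged_sha1("session", *args)


def run_exchange(client, server, typed_secret, enrolled_secret):
    """Run both sides in one process; return (client K, server K)."""
    _, stored_h_inv = enroll(client, enrolled_secret)   # held by the server
    h_elem, h_inv = enroll(client, typed_secret)        # derived by the client
    x, m = client_start(h_elem)
    mu, sigma, k = server_respond(client, server, m, stored_h_inv)
    k_prime, client_key = client_finish(client, server, x, m, mu, k, h_inv)
    server_key = server_finish(client, server, m, mu, sigma, k_prime, stored_h_inv)
    return client_key, server_key
```

Calling <TT>run_exchange</TT> with matching secrets returns the same 20-byte key for both sides, because (<EM>mH</EM>^-1)^<EM>y</EM> and μ^<EM>x</EM> both equal <EM>g^xy</EM>. With a mismatched secret the client's check of <EM>k</EM> raises <TT>ValueError</TT>, since the two sides then hold different values of σ and <EM>H</EM>^-1.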
  474. <P>
  475. MacKenzie has shown [Mack] that the equivalence proof [Boyk00] can be adapted to cover our version.<br>
  476. <BR><FONT size=1><A HREF="http://www.lucent.com/copyright.html">
  477. Copyright</A> &#169; 2002 Lucent Technologies. All rights reserved.</FONT>
  478. </BODY></HTML>