  1. <html>
  2. <title>
  3. The Organization of Networks in Plan 9
  4. </title>
  5. <body BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#330088" ALINK="#FF0044">
  6. <H1>The Organization of Networks in Plan 9
  7. </H1>
  8. <DL><DD><I>Dave Presotto<br>
  9. Phil Winterbottom<br>
  10. <br>&#32;<br>
  11. presotto,philw@plan9.bell-labs.com<br>
  12. </I></DL>
  13. <DL><DD><H4>ABSTRACT</H4>
  14. <DL>
  15. <DT><DT>&#32;<DD>
  16. NOTE:<I> Originally appeared in
  17. Proc. of the Winter 1993 USENIX Conf.,
  18. pp. 271-280,
  19. San Diego, CA
  20. </I><DT>&#32;<DD></dl>
  21. <br>
  22. In a distributed system, networks are of paramount importance. This
  23. paper describes the implementation, design philosophy, and organization
  24. of network support in Plan 9. Topics include network requirements
  25. for distributed systems, our kernel implementation, network naming, user interfaces,
  26. and performance. We also observe that much of this organization is relevant to
  27. current systems.
  28. </DL>
  29. <H4>1 Introduction
  30. </H4>
  31. <P>
  32. Plan 9 [Pike90] is a general-purpose, multi-user, portable distributed system
  33. implemented on a variety of computers and networks.
  34. What distinguishes Plan 9 is its organization.
  35. The goals of this organization were to
  36. reduce administration
  37. and to promote resource sharing. One of the keys to its success as a distributed
  38. system is the organization and management of its networks.
  39. </P>
  40. <P>
  41. A Plan 9 system comprises file servers, CPU servers and terminals.
  42. The file servers and CPU servers are typically centrally
  43. located multiprocessor machines with large memories and
  44. high speed interconnects.
  45. A variety of workstation-class machines
  46. serve as terminals
  47. connected to the central servers using several networks and protocols.
  48. The architecture of the system demands a hierarchy of network
  49. speeds matching the needs of the components.
  50. Connections between file servers and CPU servers are high-bandwidth point-to-point
  51. fiber links.
  52. Connections from the servers fan out to local terminals
  53. using medium speed networks
  54. such as Ethernet [Met80] and Datakit [Fra80].
  55. Low speed connections via the Internet and
  56. the AT&amp;T backbone serve users in Oregon and Illinois.
  57. Basic Rate ISDN data service and 9600 baud serial lines provide slow
  58. links to users at home.
  59. </P>
  60. <P>
  61. Since CPU servers and terminals use the same kernel,
  62. users may choose to run programs locally on
  63. their terminals or remotely on CPU servers.
  64. The organization of Plan 9 hides the details of system connectivity,
  65. allowing both users and administrators to configure their environment
  66. to be as distributed or centralized as they wish.
  67. Simple commands support the
  68. construction of a locally represented name space
  69. spanning many machines and networks.
  70. At work, users tend to use their terminals like workstations,
  71. running interactive programs locally and
  72. reserving the CPU servers for data- or compute-intensive jobs
  73. such as compiling and computing chess endgames.
  74. At home or when connected over
  75. a slow network, users tend to do most work on the CPU server to minimize
  76. traffic on the slow links.
  77. The goal of the network organization is to provide the same
  78. environment to the user wherever resources are used.
  79. </P>
  80. <H4>2 Kernel Network Support
  81. </H4>
  82. <P>
  83. Networks play a central role in any distributed system. This is particularly
  84. true in Plan 9 where most resources are provided by servers external to the kernel.
  85. The importance of the networking code within the kernel
  86. is reflected by its size;
  87. of 25,000 lines of kernel code, 12,500 are network and protocol related.
  88. Networks are continually being added and the fraction of code
  89. devoted to communications
  90. is growing.
  91. Moreover, the network code is complex.
  92. Protocol implementations consist almost entirely of
  93. synchronization and dynamic memory management, areas demanding
  94. subtle error recovery
  95. strategies.
  96. The kernel currently supports Datakit, point-to-point fiber links,
  97. an Internet (IP) protocol suite and ISDN data service.
  98. The variety of networks and machines
  99. has raised issues not addressed by other systems running on commercial
  100. hardware supporting only Ethernet or FDDI.
  101. </P>
  102. <H4>2.1 The File System protocol
  103. </H4>
  104. <P>
  105. A central idea in Plan 9 is the representation of a resource as a hierarchical
  106. file system.
  107. Each process assembles a view of the system by building a
  108. <I>name space</I>
  109. [Needham] connecting its resources.
  110. File systems need not represent disc files; in fact, most Plan 9 file systems have no
  111. permanent storage.
  112. A typical file system dynamically represents
  113. some resource like a set of network connections or the process table.
  114. Communication between the kernel, device drivers, and local or remote file servers uses a
  115. protocol called 9P. The protocol consists of 17 messages
  116. describing operations on files and directories.
  117. Kernel resident device and protocol drivers use a procedural version
  118. of the protocol while external file servers use an RPC form.
  119. Nearly all traffic between Plan 9 systems consists
  120. of 9P messages.
  121. 9P relies on several properties of the underlying transport protocol.
  122. It assumes messages arrive reliably and in sequence and
  123. that delimiters between messages
  124. are preserved.
  125. When a protocol does not meet these
  126. requirements (for example, TCP does not preserve delimiters)
  127. we provide mechanisms to marshal messages before handing them
  128. to the system.
  129. </P>
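<P>
The text does not say how messages are marshaled over a byte stream such as TCP.
One common approach, shown here only as a minimal sketch, is to prefix each message
with its length so the receiver can restore the boundaries. The 4-byte big-endian
length field and the names <TT>frame</TT> and <TT>unframe</TT> are assumptions made
for illustration; they are not taken from the Plan 9 sources.
</P>

```python
import struct

def frame(message):
    """Prefix a message with its length (assumed format: 4-byte big-endian)."""
    return struct.pack(">I", len(message)) + message

def unframe(stream):
    """Recover the original messages from a byte stream of framed messages."""
    messages = []
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        start = offset + 4
        end = start + length
        if end > len(stream):
            break  # trailing message is incomplete; a real reader would wait for more bytes
        messages.append(stream[start:end])
        offset = end
    return messages

# Three messages written back to back arrive as one undelimited byte stream.
wire = frame(b"walk") + frame(b"open") + frame(b"")
recovered = unframe(wire)
```

<P>
The point of the sketch is only that delimiters lost by the transport can be
reconstructed by a thin layer above it, so 9P itself need not change.
</P>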
  130. <P>
  131. A kernel data structure, the
  132. <I>channel</I>,
  133. is a handle to a file server.
  134. Operations on a channel generate the following 9P messages.
  135. The
  136. <TT>session</TT>
  137. and
  138. <TT>attach</TT>
  139. messages authenticate a connection, established by means external to 9P,
  140. and validate its user.
  141. The result is an authenticated
  142. channel
  143. referencing the root of the
  144. server.
  145. The
  146. <TT>clone</TT>
  147. message makes a new channel identical to an existing channel, much like
  148. the
  149. <TT>dup</TT>
  150. system call.
  151. A
  152. channel
  153. may be moved to a file on the server using a
  154. <TT>walk</TT>
  155. message to descend each level in the hierarchy.
  156. The
  157. <TT>stat</TT>
  158. and
  159. <TT>wstat</TT>
  160. messages read and write the attributes of the file referenced by a channel.
  161. The
  162. <TT>open</TT>
  163. message prepares a channel for subsequent
  164. <TT>read</TT>
  165. and
  166. <TT>write</TT>
  167. messages to access the contents of the file.
  168. <TT>Create</TT>
  169. and
  170. <TT>remove</TT>
  171. perform the actions implied by their names on the file
  172. referenced by the channel.
  173. The
  174. <TT>clunk</TT>
  175. message discards a channel without affecting the file.
  176. </P>
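<P>
The channel operations described above can be sketched with a toy model in which a
file server is a tree of nested dictionaries. The <TT>Channel</TT> class, the
<TT>attach</TT> helper, and the in-memory tree are hypothetical stand-ins chosen for
illustration; authentication and the other messages are omitted.
</P>

```python
class Channel:
    """Toy handle into a tree of dicts (directories) and bytes (files)."""

    def __init__(self, node, path="/"):
        self.node = node
        self.path = path
        self.is_open = False

    def clone(self):
        # A new channel identical to this one, much like dup.
        return Channel(self.node, self.path)

    def walk(self, name):
        # Descend exactly one level in the hierarchy.
        if not isinstance(self.node, dict) or name not in self.node:
            raise FileNotFoundError(name)
        self.node = self.node[name]
        self.path = self.path.rstrip("/") + "/" + name

    def open(self):
        # Prepare the channel for subsequent reads.
        self.is_open = True

    def read(self):
        if not self.is_open:
            raise PermissionError("channel is not open")
        return self.node

    def clunk(self):
        # Discard the channel; the file it referenced is unaffected.
        self.node = None
        self.is_open = False

def attach(tree):
    """Return a channel referencing the root of the (toy) server."""
    return Channel(tree)

tree = {"net": {"tcp": {"2": {"status": b"tcp/2 1 Established connect\n"}}}}
root = attach(tree)
chan = root.clone()            # keep root where it is; move the copy
for name in ("net", "tcp", "2", "status"):
    chan.walk(name)
chan.open()
status = chan.read()
```

<P>
Cloning before walking mirrors the description: the original channel stays at the
root while the copy is moved to the file of interest.
</P>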
  177. <P>
  178. A kernel resident file server called the
  179. <I>mount driver</I>
  180. converts the procedural version of 9P into RPCs.
  181. The
  182. <I>mount</I>
  183. system call provides a file descriptor, which can be
  184. a pipe to a user process or a network connection to a remote machine, to
  185. be associated with the mount point.
  186. After a mount, operations
  187. on the file tree below the mount point are sent as messages to the file server.
  188. The
  189. mount
  190. driver manages buffers, packs and unpacks parameters from
  191. messages, and demultiplexes among processes using the file server.
  192. </P>
  193. <H4>2.2 Kernel Organization
  194. </H4>
  195. <P>
  196. The network code in the kernel is divided into three layers: hardware interface,
  197. protocol processing, and program interface.
  198. A device driver typically uses streams to connect the two interface layers.
  199. Additional stream modules may be pushed on
  200. a device to process protocols.
  201. Each device driver is a kernel-resident file system.
  202. Simple device drivers serve a single level
  203. directory containing just a few files;
  204. for example, we represent each UART
  205. by a data and a control file.
  206. <DL><DT><DD><TT><PRE>
  207. cpu% cd /dev
  208. cpu% ls -l eia*
  209. --rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1
  210. --rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1ctl
  211. --rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2
  212. --rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2ctl
  213. cpu%
  214. </PRE></TT></DL>
  215. The control file is used to control the device;
  216. writing the string
  217. <TT>b1200</TT>
  218. to
  219. <TT>/dev/eia1ctl</TT>
  220. sets the line to 1200 baud.
  221. </P>
  222. <P>
  223. Multiplexed devices present
  224. a more complex interface structure.
  225. For example, the LANCE Ethernet driver
  226. serves a two level file tree (Figure 1)
  227. providing
  228. </P>
  229. <DL COMPACT>
  230. <DT>*<DD>
  231. device control and configuration
  232. <DT>*<DD>
  233. user-level protocols like ARP
  234. <DT>*<DD>
  235. diagnostic interfaces for snooping software.
  236. </dl>
  237. <br>&#32;<br>
  238. The top directory contains a
  239. <TT>clone</TT>
  240. file and a directory for each connection, numbered
  241. <TT>1</TT>
  242. to
  243. <TT>n</TT>.
  244. Each connection directory corresponds to an Ethernet packet type.
  245. Opening the
  246. <TT>clone</TT>
  247. file finds an unused connection directory
  248. and opens its
  249. <TT>ctl</TT>
  250. file.
  251. Reading the control file returns the ASCII connection number; the user
  252. process can use this value to construct the name of the proper
  253. connection directory.
  254. In each connection directory files named
  255. <TT>ctl</TT>,
  256. <TT>data</TT>,
  257. <TT>stats</TT>,
  258. and
  259. <TT>type</TT>
  260. provide access to the connection.
  261. Writing the string
  262. <TT>connect 2048</TT>
  263. to the
  264. <TT>ctl</TT>
  265. file sets the packet type to 2048
  266. and
  267. configures the connection to receive
  268. all IP packets sent to the machine.
  269. Subsequent reads of the file
  270. <TT>type</TT>
  271. yield the string
  272. <TT>2048</TT>.
  273. The
  274. <TT>data</TT>
  275. file accesses the media;
  276. reading it
  277. returns the
  278. next packet of the selected type.
  279. Writing the file
  280. queues a packet for transmission after
  281. prepending a packet header containing the source address and packet type.
  282. The
  283. <TT>stats</TT>
  284. file returns ASCII text containing the interface address,
  285. packet input/output counts, error statistics, and general information
  286. about the state of the interface.
  287. <DL><DT><DD><TT><PRE>
  288. <br><img src="data.7580.gif"><br>
  289. </PRE></TT></DL>
  290. If several connections on an interface
  291. are configured for a particular packet type, each receives a
  292. copy of the incoming packets.
  293. The special packet type
  294. <TT>-1</TT>
  295. selects all packets.
  296. Writing the strings
  297. <TT>promiscuous</TT>
  298. and
  299. <TT>connect</TT>
  300. <TT>-1</TT>
  301. to the
  302. <TT>ctl</TT>
  303. file
  304. configures a conversation to receive all packets on the Ethernet.
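<P>
The delivery rule described above (every connection configured for a packet type
gets its own copy, and type <TT>-1</TT> matches everything) can be sketched as
follows. The <TT>EtherInterface</TT> class is a hypothetical in-memory model, not
the LANCE driver; the <TT>promiscuous</TT> command is left out because it concerns
the hardware rather than the demultiplexing.
</P>

```python
class EtherInterface:
    """Toy model of the per-connection packet type filter."""

    def __init__(self):
        self.connections = {}  # connection number -> {"type": ..., "queue": [...]}

    def clone(self):
        # Reserve an unused connection; directories are numbered from 1 in the text.
        number = len(self.connections) + 1
        self.connections[number] = {"type": None, "queue": []}
        return number

    def ctl(self, number, command):
        # Only "connect <type>" is modelled here.
        words = command.split()
        if len(words) != 2 or words[0] != "connect":
            raise ValueError("unrecognized control message: " + command)
        self.connections[number]["type"] = int(words[1])

    def receive(self, packet_type, payload):
        # Each matching connection receives its own copy of the packet.
        for conn in self.connections.values():
            if conn["type"] in (packet_type, -1):
                conn["queue"].append(payload)

ether = EtherInterface()
ip1, ip2, snoop = ether.clone(), ether.clone(), ether.clone()
ether.ctl(ip1, "connect 2048")
ether.ctl(ip2, "connect 2048")
ether.ctl(snoop, "connect -1")
ether.receive(2048, b"ip packet")
ether.receive(2054, b"arp packet")   # 2054 (0x0806) is the ARP packet type
```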
  305. <P>
  306. Although the driver interface may seem elaborate,
  307. the representation of a device as a set of files using ASCII strings for
  308. communication has several advantages.
  309. Any mechanism supporting remote access to files immediately
  310. allows a remote machine to use our interfaces as gateways.
  311. Using ASCII strings to control the interface avoids byte order problems,
  312. ensures a uniform representation for
  313. devices on the same machine, and even allows devices to be accessed remotely.
  314. Representing dissimilar devices by the same set of files allows common tools
  315. to serve
  316. several networks or interfaces.
  317. Programs like
  318. <TT>stty</TT>
  319. are replaced by
  320. <TT>echo</TT>
  321. and shell redirection.
  322. </P>
  323. <H4>2.3 Protocol devices
  324. </H4>
  325. <P>
  326. Network connections are represented as pseudo-devices called protocol devices.
  327. Protocol device drivers exist for the Datakit URP protocol and for each of the
  328. Internet IP protocols TCP, UDP, and IL.
  329. IL, described below, is a new communication protocol used by Plan 9 for
  330. transmitting file system RPCs.
  331. All protocol devices look identical so user programs contain no
  332. network-specific code.
  333. </P>
  334. <P>
  335. Each protocol device driver serves a directory structure
  336. similar to that of the Ethernet driver.
  337. The top directory contains a
  338. <TT>clone</TT>
  339. file and a directory for each connection numbered
  340. <TT>0</TT>
  341. to
  342. <TT>n</TT>.
  343. Each connection directory contains files to control one
  344. connection and to send and receive information.
  345. A TCP connection directory looks like this:
  346. <DL><DT><DD><TT><PRE>
  347. cpu% cd /net/tcp/2
  348. cpu% ls -l
  349. --rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 ctl
  350. --rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 data
  351. --rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 listen
  352. --r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 local
  353. --r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 remote
  354. --r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 status
  355. cpu% cat local remote status
  356. 135.104.9.31 5012
  357. 135.104.53.11 564
  358. tcp/2 1 Established connect
  359. cpu%
  360. </PRE></TT></DL>
  361. The files
  362. <TT>local</TT>,
  363. <TT>remote</TT>,
  364. and
  365. <TT>status</TT>
  366. supply information about the state of the connection.
  367. The
  368. <TT>data</TT>
  369. and
  370. <TT>ctl</TT>
  371. files
  372. provide access to the process end of the stream implementing the protocol.
  373. The
  374. <TT>listen</TT>
  375. file is used to accept incoming calls from the network.
  376. </P>
  377. <P>
  378. The following steps establish a connection.
  379. </P>
  380. <DL COMPACT>
  381. <DT>1)<DD>
  382. The clone device of the
  383. appropriate protocol directory is opened to reserve an unused connection.
  384. <DT>2)<DD>
  385. The file descriptor returned by the open points to the
  386. <TT>ctl</TT>
  387. file of the new connection.
  388. Reading that file descriptor returns an ASCII string containing
  389. the connection number.
  390. <DT>3)<DD>
  391. A protocol/network specific ASCII address string is written to the
  392. <TT>ctl</TT>
  393. file.
  394. <DT>4)<DD>
  395. The path of the
  396. <TT>data</TT>
  397. file is constructed using the connection number.
  398. When the
  399. <TT>data</TT>
  400. file is opened the connection is established.
  401. </dl>
  402. <br>&#32;<br>
  403. A process can read and write this file descriptor
  404. to send and receive messages from the network.
  405. If the process opens the
  406. <TT>listen</TT>
  407. file it blocks until an incoming call is received.
  408. An address string written to the
  409. <TT>ctl</TT>
  410. file before the listen selects the
  411. ports or services the process is prepared to accept.
  412. When an incoming call is received, the open completes
  413. and returns a file descriptor
  414. pointing to the
  415. <TT>ctl</TT>
  416. file of the new connection.
  417. Reading the
  418. <TT>ctl</TT>
  419. file yields a connection number used to construct the path of the
  420. <TT>data</TT>
  421. file.
  422. A connection remains established while any of the files in the connection directory
  423. are referenced or until a close is received from the network.
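<P>
The four connection steps can be sketched against an in-memory stand-in for a
protocol directory. <TT>ProtocolDevice</TT> and <TT>dial</TT> are hypothetical names,
and the <TT>connect</TT> verb with a <TT>host!port</TT> address is an assumed address
syntax used only for illustration; on a real system the same sequence would be
ordinary opens, reads, and writes on files under <TT>/net</TT>.
</P>

```python
class ProtocolDevice:
    """Toy model of a protocol directory such as /net/tcp."""

    def __init__(self, name):
        self.name = name
        self.conversations = []  # list index is the connection number

    def open_clone(self):
        # Step 1: reserve an unused connection. The returned number stands in
        # for the file descriptor on the new connection's ctl file.
        self.conversations.append({"remote": None, "state": "Closed"})
        return len(self.conversations) - 1

    def read_ctl(self, ctl):
        # Step 2: reading ctl yields the connection number as ASCII text.
        return str(ctl)

    def write_ctl(self, ctl, message):
        # Step 3: an ASCII address string (assumed syntax: "connect host!port").
        verb, _, address = message.partition(" ")
        if verb != "connect" or not address:
            raise ValueError("unrecognized control message: " + message)
        self.conversations[ctl]["remote"] = address

    def open_data(self, path):
        # Step 4: opening the data file establishes the connection.
        number = int(path.split("/")[-2])
        conv = self.conversations[number]
        if conv["remote"] is None:
            raise ConnectionError("no address written to ctl")
        conv["state"] = "Established"
        return number

def dial(device, address):
    ctl = device.open_clone()
    number = device.read_ctl(ctl)
    device.write_ctl(ctl, "connect " + address)
    data_path = "/net/{}/{}/data".format(device.name, number)
    device.open_data(data_path)
    return data_path

tcp = ProtocolDevice("tcp")
first = dial(tcp, "135.104.53.11!564")
```

<P>
Nothing in <TT>dial</TT> depends on the protocol beyond the directory name and the
address string, which is what lets user programs avoid network-specific code.
</P>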
  424. <H4>2.4 Streams
  425. </H4>
  426. <P>
  427. A
  428. <I>stream</I>
  429. [Rit84a][Presotto] is a bidirectional channel connecting a
  430. physical or pseudo-device to user processes.
  431. The user processes insert and remove data at one end of the stream.
  432. Kernel processes acting on behalf of a device insert data at
  433. the other end.
  434. Asynchronous communications channels such as pipes,
  435. TCP conversations, Datakit conversations, and RS232 lines are implemented using
  436. streams.
  437. </P>
  438. <P>
  439. A stream comprises a linear list of
  440. <I>processing modules</I>.
  441. Each module has both an upstream (toward the process) and
  442. downstream (toward the device)
  443. <I>put routine</I>.
  444. Calling the put routine of the module on either end of the stream
  445. inserts data into the stream.
  446. Each module calls the succeeding one to send data up or down the stream.
  447. </P>
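<P>
A minimal sketch of this arrangement: each module has a put routine per direction
and hands its result to its neighbour. The <TT>Module</TT> class, the <TT>link</TT>
helper, and the two example transformations are hypothetical; queueing, blocks,
and helper processes are deliberately left out.
</P>

```python
class Module:
    """One processing module with an upstream and a downstream put routine."""

    def __init__(self, transform_down, transform_up):
        self.transform_down = transform_down
        self.transform_up = transform_up
        self.below = None  # neighbour toward the device
        self.above = None  # neighbour toward the process

    def put_down(self, data, sink):
        data = self.transform_down(data)
        if self.below is not None:
            self.below.put_down(data, sink)   # call the succeeding module
        else:
            sink.append(data)                 # device end of the stream

    def put_up(self, data, sink):
        data = self.transform_up(data)
        if self.above is not None:
            self.above.put_up(data, sink)
        else:
            sink.append(data)                 # process end of the stream

def link(modules):
    """Connect modules given in order from the process end to the device end."""
    for upper, lower in zip(modules, modules[1:]):
        upper.below = lower
        lower.above = upper

# Example modules: one adds and strips a trailer, one adds and strips a header.
top = Module(lambda d: d + b"!", lambda d: d[:-1])
bottom = Module(lambda d: b"H" + d, lambda d: d[1:])
link([top, bottom])

wire = []
top.put_down(b"data", wire)        # inserted at the process end
delivered = []
bottom.put_up(wire[0], delivered)  # inserted at the device end
```

<P>
Because each put routine calls the next directly, data crosses the whole stream in
a single call chain, which matches the observation that most output happens without
context switching.
</P>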
  448. <P>
  449. An instance of a processing module is represented by a pair of
  450. <I>queues</I>,
  451. one for each direction.
  452. The queues point to the put procedures and can be used
  453. to queue information traveling along the stream.
  454. Some put routines queue data locally and send it along the stream at some
  455. later time, either due to a subsequent call or an asynchronous
  456. event such as a retransmission timer or a device interrupt.
  457. Processing modules create helper kernel processes to
  458. provide a context for handling asynchronous events.
  459. For example, a helper kernel process awakens periodically
  460. to perform any necessary TCP retransmissions.
  461. The use of kernel processes instead of serialized run-to-completion service routines
  462. differs from the implementation of Unix streams.
  463. Unix service routines cannot
  464. use any blocking kernel resource and they lack a local long-lived state.
  465. Helper kernel processes solve these problems and simplify the stream code.
  466. </P>
  467. <P>
  468. There is no implicit synchronization in our streams.
  469. Each processing module must ensure that concurrent processes using the stream
  470. are synchronized.
  471. This maximizes concurrency but introduces the
  472. possibility of deadlock.
  473. However, deadlocks are easily avoided by careful programming; to
  474. date they have not caused us problems.
  475. </P>
  476. <P>
  477. Information is represented by linked lists of kernel structures called
  478. <I>blocks</I>.
  479. Each block contains a type, some state flags, and pointers to
  480. an optional buffer.
  481. Block buffers can hold either data or control information, i.e., directives
  482. to the processing modules.
  483. Blocks and block buffers are dynamically allocated from kernel memory.
  484. </P>
  485. <H4>2.4.1 User Interface
  486. </H4>
  487. <P>
  488. A stream is represented at user level as two files,
  489. <TT>ctl</TT>
  490. and
  491. <TT>data</TT>.
  492. The actual names can be changed by the device driver using the stream,
  493. as we saw earlier in the example of the UART driver.
  494. The first process to open either file creates the stream automatically.
  495. The last close destroys it.
  496. Writing to the
  497. <TT>data</TT>
  498. file copies the data into kernel blocks
  499. and passes them to the downstream put routine of the first processing module.
  500. A write of less than 32K is guaranteed to be contained by a single block.
  501. Concurrent writes to the same stream are not synchronized, although the
  502. 32K block size assures atomic writes for most protocols.
  503. The last block written is flagged with a delimiter
  504. to alert downstream modules that care about write boundaries.
  505. In most cases the first put routine calls the second, the second
  506. calls the third, and so on until the data is output.
  507. As a consequence, most data is output without context switching.
  508. </P>
  509. <P>
  510. Reading from the
  511. <TT>data</TT>
  512. file returns data queued at the top of the stream.
  513. The read terminates when the read count is reached
  514. or when the end of a delimited block is encountered.
  515. A per stream read lock ensures only one process
  516. can read from a stream at a time and guarantees
  517. that the bytes read were contiguous bytes from the
  518. stream.
  519. </P>
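<P>
The read rule (stop at the requested count or at the end of a delimited block,
whichever comes first) can be sketched as follows. <TT>ReadQueue</TT> is a
hypothetical model; it returns an empty result instead of blocking when no data is
queued, and it does not model the per-stream read lock.
</P>

```python
class ReadQueue:
    """Toy model of the blocks queued at the top of a stream."""

    def __init__(self):
        self.blocks = []  # each entry is [bytearray of data, delimiter flag]

    def put(self, data, delimited=False):
        self.blocks.append([bytearray(data), delimited])

    def read(self, count):
        out = bytearray()
        while self.blocks and len(out) < count:
            data, delimited = self.blocks[0]
            take = min(count - len(out), len(data))
            out += data[:take]
            del data[:take]
            if not data:
                self.blocks.pop(0)
                if delimited:
                    break  # end of a delimited block terminates the read
        return bytes(out)

q = ReadQueue()
q.put(b"hel")
q.put(b"lo", delimited=True)      # last block of the first write
q.put(b"world", delimited=True)   # a second write
first = q.read(100)
second = q.read(3)
third = q.read(100)
```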
  520. <P>
  521. Like UNIX streams [Rit84a],
  522. Plan 9 streams can be dynamically configured.
  523. The stream system intercepts and interprets
  524. the following control blocks:
  525. </P>
  526. <DL COMPACT>
  527. <DT><TT>push</TT> <I>name</I><DD>
  528. adds an instance of the processing module
  529. <I>name</I>
  530. to the top of the stream.
  531. <DT><TT>pop</TT><DD>
  532. removes the top module of the stream.
  533. <DT><TT>hangup</TT><DD>
  534. sends a hangup message
  535. up the stream from the device end.
  536. </dl>
  537. <br>&#32;<br>
  538. Other control blocks are module-specific and are interpreted by each
  539. processing module
  540. as they pass.
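<P>
As a rough sketch of how the stream system might separate its own control blocks
from module-specific ones, the model below keeps the module list with the top of
the stream first. The <TT>Stream</TT> class and the module names <TT>modA</TT> and
<TT>modB</TT> are hypothetical.
</P>

```python
class Stream:
    """Toy model of control block handling; modules[0] is the top of the stream."""

    def __init__(self):
        self.modules = []
        self.hungup = False
        self.passed_down = []  # control blocks left for the modules to interpret

    def write_ctl(self, text):
        words = text.split()
        if len(words) == 2 and words[0] == "push":
            self.modules.insert(0, words[1])   # add an instance at the top
        elif words == ["pop"]:
            if not self.modules:
                raise ValueError("no module to pop")
            self.modules.pop(0)                # remove the top module
        elif words == ["hangup"]:
            self.hungup = True                 # hangup travels up from the device end
        else:
            self.passed_down.append(text)      # module-specific control block

s = Stream()
s.write_ctl("push modA")
s.write_ctl("push modB")
s.write_ctl("pop")
s.write_ctl("b1200")
```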
  541. <P>
  542. The convoluted syntax and semantics of the UNIX
  543. <TT>ioctl</TT>
  544. system call convinced us to leave it out of Plan 9.
  545. Instead,
  546. <TT>ioctl</TT>
  547. is replaced by the
  548. <TT>ctl</TT>
  549. file.
  550. Writing to the
  551. <TT>ctl</TT>
  552. file
  553. is identical to writing to a
  554. <TT>data</TT>
  555. file except the blocks are of type
  556. <I>control</I>.
  557. A processing module parses each control block it sees.
  558. Commands in control blocks are ASCII strings, so
  559. byte ordering is not an issue when one system
  560. controls streams in a name space implemented on another processor.
  561. The time to parse control blocks is not important, since control
  562. operations are rare.
  563. </P>
  564. <H4>2.4.2 Device Interface
  565. </H4>
  566. <P>
  567. The module at the downstream end of the stream is part of a device interface.
  568. The particulars of the interface vary with the device.
  569. Most device interfaces consist of an interrupt routine, an output
  570. put routine, and a kernel process.
  571. The output put routine stages data for the
  572. device and starts the device if it is stopped.
  573. The interrupt routine wakes up the kernel process whenever
  574. the device has input to be processed or needs more output staged.
  575. The kernel process puts information up the stream or stages more data for output.
  576. The division of labor among the different pieces varies depending on
  577. how much must be done at interrupt level.
  578. However, the interrupt routine may not allocate blocks or call
  579. a put routine since both actions require a process context.
  580. </P>
  581. <H4>2.4.3 Multiplexing
  582. </H4>
  583. <P>
  584. The conversations using a protocol device must be
  585. multiplexed onto a single physical wire.
  586. We push a multiplexer processing module
  587. onto the physical device stream to group the conversations.
  588. The device end modules on the conversations add the necessary header
  589. onto downstream messages and then put them to the module downstream
  590. of the multiplexer.
  591. The multiplexing module looks at each message moving up its stream and
  592. puts it to the correct conversation stream after stripping
  593. the header controlling the demultiplexing.
  594. </P>
  595. <P>
  596. This is similar to the Unix implementation of multiplexer streams.
  597. The major difference is that we have no general structure that
  598. corresponds to a multiplexer.
  599. Each attempt to produce a generalized multiplexer created a more complicated
  600. structure and underlined the basic difficulty of generalizing this mechanism.
  601. We now code each multiplexer from scratch and favor simplicity over
  602. generality.
  603. </P>
  604. <H4>2.4.4 Reflections
  605. </H4>
  606. <P>
  607. Despite five years' experience and the efforts of many programmers,
  608. we remain dissatisfied with the stream mechanism.
  609. Performance is not an issue;
  610. the time to process protocols and drive
  611. device interfaces continues to dwarf the
  612. time spent allocating, freeing, and moving blocks
  613. of data.
  614. However, the mechanism remains inordinately
  615. complex.
  616. Much of the complexity results from our efforts
  617. to make streams dynamically configurable, to
  618. reuse processing modules on different devices
  619. and to provide kernel synchronization
  620. to ensure data structures
  621. don't disappear underfoot.
  622. This is particularly irritating since we seldom use these properties.
  623. </P>
  624. <P>
  625. Streams remain in our kernel because we are unable to
  626. devise a better alternative.
  627. Larry Peterson's X-kernel [Pet89a]
  628. is the closest contender but
  629. doesn't offer enough advantage to switch.
  630. If we were to rewrite the streams code, we would probably statically
  631. allocate resources for a large fixed number of conversations and burn
  632. memory in favor of less complexity.
  633. </P>
  634. <H4>3 The IL Protocol
  635. </H4>
  636. <P>
  637. None of the standard IP protocols is suitable for transmission of
  638. 9P messages over an Ethernet or the Internet.
  639. TCP has a high overhead and does not preserve delimiters.
  640. UDP, while cheap, does not provide reliable sequenced delivery.
  641. Early versions of the system used a custom protocol that was
  642. efficient but unsatisfactory for internetwork transmission.
  643. When we implemented IP, TCP, and UDP we looked around for a suitable
  644. replacement with the following properties:
  645. </P>
  646. <DL COMPACT>
  647. <DT>*<DD>
  648. Reliable datagram service with sequenced delivery
  649. <DT>*<DD>
  650. Runs over IP
  651. <DT>*<DD>
  652. Low complexity, high performance
  653. <DT>*<DD>
  654. Adaptive timeouts
  655. </dl>
  656. <br>&#32;<br>
  657. None met our needs, so a new protocol was designed.
  658. IL is a lightweight protocol designed to be encapsulated by IP.
  659. It is a connection-based protocol
  660. providing reliable transmission of sequenced messages between machines.
  661. No provision is made for flow control since the protocol is designed to transport RPC
  662. messages between client and server.
  663. A small outstanding message window prevents too
  664. many incoming messages from being buffered;
  665. messages outside the window are discarded
  666. and must be retransmitted.
  667. Connection setup uses a two way handshake to generate
  668. initial sequence numbers at each end of the connection;
  669. subsequent data messages increment the
  670. sequence numbers allowing
  671. the receiver to resequence out of order messages.
  672. In contrast to other protocols, IL does not do blind retransmission.
  673. If a message is lost and a timeout occurs, a query message is sent.
  674. The query message is a small control message containing the current
  675. sequence numbers as seen by the sender.
  676. The receiver responds to a query by retransmitting missing messages.
  677. This allows the protocol to behave well in congested networks,
  678. where blind retransmission would cause further
  679. congestion.
  680. Like TCP, IL has adaptive timeouts.
A round-trip timer is used
to calculate acknowledgment and retransmission times in terms of the network speed.
  683. This allows the protocol to perform well on both the Internet and on local Ethernets.
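<P>
The receive window described above can be sketched as follows.
This is only an illustration, not the IL implementation: the window size of 8,
the 32-bit sequence numbers, and the names
<TT>Rx</TT>, <TT>rx_init</TT>, and <TT>rx_offer</TT>
are assumptions made for the example.
A message inside the window is held until the messages before it arrive;
anything outside the window is discarded and would have to be retransmitted.
</P>

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { WINDOW = 8 };	/* assumed window size; the text gives no figure */

typedef struct Rx {
	uint32_t next;		/* next sequence number to deliver */
	bool held[WINDOW];	/* held[i] is set when message next+i is buffered */
} Rx;

void
rx_init(Rx *rx, uint32_t first)
{
	memset(rx, 0, sizeof *rx);
	rx->next = first;
}

/*
 * Offer one arriving message to the receiver.
 * Returns the number of messages that became deliverable in order,
 * or -1 if the message was discarded (outside the window or a duplicate).
 */
int
rx_offer(Rx *rx, uint32_t seq)
{
	uint32_t off = seq - rx->next;	/* unsigned subtraction copes with wraparound */
	int delivered = 0;

	if(off >= WINDOW || rx->held[off])
		return -1;
	rx->held[off] = true;

	/* deliver the contiguous run starting at next, sliding the window */
	while(rx->held[0]){
		memmove(rx->held, rx->held+1, (WINDOW-1)*sizeof rx->held[0]);
		rx->held[WINDOW-1] = false;
		rx->next++;
		delivered++;
	}
	return delivered;
}
```

<P>
A caller would invoke <TT>rx_offer</TT> once per arriving message and hand
the newly contiguous messages to the application whenever the result is positive.
</P>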
  684. <P>
  685. In keeping with the minimalist design of the rest of the kernel, IL is small.
  686. The entire protocol is 847 lines of code, compared to 2200 lines for TCP.
  687. IL is our protocol of choice.
  688. </P>
  689. <H4>4 Network Addressing
  690. </H4>
  691. <P>
  692. A uniform interface to protocols and devices is not sufficient to
  693. support the transparency we require.
  694. Since each network uses a different
  695. addressing scheme,
  696. the ASCII strings written to a control file have no common format.
  697. As a result, every tool must know the specifics of the networks it
  698. is capable of addressing.
  699. Moreover, since each machine supplies a subset
  700. of the available networks, each user must be aware of the networks supported
  701. by every terminal and server machine.
  702. This is obviously unacceptable.
  703. </P>
  704. <P>
  705. Several possible solutions were considered and rejected; one deserves
  706. more discussion.
  707. We could have used a user-level file server
  708. to represent the network name space as a Plan 9 file tree.
  709. This global naming scheme has been implemented in other distributed systems.
  710. The file hierarchy provides paths to
  711. directories representing network domains.
  712. Each directory contains
  713. files representing the names of the machines in that domain;
  714. an example might be the path
  715. <TT>/net/name/usa/edu/mit/ai</TT>.
  716. Each machine file contains information like the IP address of the machine.
  717. We rejected this representation for several reasons.
  718. First, it is hard to devise a hierarchy encompassing all representations
  719. of the various network addressing schemes in a uniform manner.
  720. Datakit and Ethernet address strings have nothing in common.
  721. Second, the address of a machine is
  722. often only a small part of the information required to connect to a service on
  723. the machine.
  724. For example, the IP protocols require symbolic service names to be mapped into
  725. numeric port numbers, some of which are privileged and hence special.
  726. Information of this sort is hard to represent in terms of file operations.
Finally, the size and number of the networks being represented burden users with
an unacceptably large amount of information about the organization of the network
and its connectivity.
  730. In this case the Plan 9 representation of a
  731. resource as a file is not appropriate.
  732. </P>
  733. <P>
  734. If tools are to be network independent, a third-party server must resolve
  735. network names.
  736. A server on each machine, with local knowledge, can select the best network
  737. for any particular destination machine or service.
  738. Since the network devices present a common interface,
  739. the only operation which differs between networks is name resolution.
  740. A symbolic name must be translated to
  741. the path of the clone file of a protocol
  742. device and an ASCII address string to write to the
  743. <TT>ctl</TT>
  744. file.
  745. A connection server (CS) provides this service.
  746. </P>
  747. <H4>4.1 Network Database
  748. </H4>
  749. <P>
  750. On most systems several
  751. files such as
  752. <TT>/etc/hosts</TT>,
  753. <TT>/etc/networks</TT>,
  754. <TT>/etc/services</TT>,
  755. <TT>/etc/hosts.equiv</TT>,
  756. <TT>/etc/bootptab</TT>,
  757. and
  758. <TT>/etc/named.d</TT>
  759. hold network information.
  760. Much time and effort is spent
  761. administering these files and keeping
  762. them mutually consistent.
  763. Tools attempt to
  764. automatically derive one or more of the files from
  765. information in other files but maintenance continues to be
  766. difficult and error prone.
  767. </P>
  768. <P>
  769. Since we were writing an entirely new system, we were free to
  770. try a simpler approach.
  771. One database on a shared server contains all the information
  772. needed for network administration.
  773. Two ASCII files comprise the main database:
  774. <TT>/lib/ndb/local</TT>
  775. contains locally administered information and
  776. <TT>/lib/ndb/global</TT>
  777. contains information imported from elsewhere.
  778. The files contain sets of attribute/value pairs of the form
  779. <I>attr<TT>=</TT>value</I>,
  780. where
  781. <I>attr</I>
  782. and
  783. <I>value</I>
  784. are alphanumeric strings.
  785. Systems are described by multi-line entries;
  786. a header line at the left margin begins each entry followed by zero or more
  787. indented attribute/value pairs specifying
  788. names, addresses, properties, etc.
  789. For example, the entry for our CPU server
  790. specifies a domain name, an IP address, an Ethernet address,
  791. a Datakit address, a boot file, and supported protocols.
  792. <DL><DT><DD><TT><PRE>
sys = helix
	dom=helix.research.bell-labs.com
	bootf=/mips/9power
	ip=135.104.9.31 ether=0800690222f0
	dk=nj/astro/helix
	proto=il flavor=9cpu
  799. </PRE></TT></DL>
  800. If several systems share entries such as
  801. network mask and gateway, we specify that information
  802. with the network or subnetwork instead of the system.
  803. The following entries define a Class B IP network and
  804. a few subnets derived from it.
  805. The entry for the network specifies the IP mask,
  806. file system, and authentication server for all systems
  807. on the network.
  808. Each subnetwork specifies its default IP gateway.
  809. <DL><DT><DD><TT><PRE>
ipnet=mh-astro-net ip=135.104.0.0 ipmask=255.255.255.0
	fs=bootes.research.bell-labs.com
	auth=1127auth
ipnet=unix-room ip=135.104.117.0
	ipgw=135.104.117.1
ipnet=third-floor ip=135.104.51.0
	ipgw=135.104.51.1
ipnet=fourth-floor ip=135.104.52.0
	ipgw=135.104.52.1
  819. </PRE></TT></DL>
  820. Database entries also define the mapping of service names
  821. to port numbers for TCP, UDP, and IL.
  822. <DL><DT><DD><TT><PRE>
  823. tcp=echo port=7
  824. tcp=discard port=9
  825. tcp=systat port=11
  826. tcp=daytime port=13
  827. </PRE></TT></DL>
  828. </P>
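<P>
To make the attribute/value format concrete, the following sketch splits one
database line into its pairs.
It is an illustration under stated assumptions, not the parser used by the system:
the names <TT>Pair</TT> and <TT>ndb_parseline</TT> and the fixed field sizes
are invented for the example, and blanks around the
<TT>=</TT>
are tolerated only because the
<TT>sys = helix</TT>
header line above is written that way.
</P>

```c
#include <stddef.h>

typedef struct Pair {
	char attr[32];
	char val[96];
} Pair;

static const char*
skipblank(const char *p)
{
	while(*p == ' ' || *p == '\t')
		p++;
	return p;
}

/* copy characters up to a blank, '=', newline or NUL into dst (size n) */
static const char*
word(const char *p, char *dst, size_t n)
{
	size_t i = 0;

	while(*p != '\0' && *p != ' ' && *p != '\t' && *p != '=' && *p != '\n'){
		if(i+1 < n)
			dst[i++] = *p;
		p++;
	}
	dst[i] = '\0';
	return p;
}

/*
 * Split one line into attr=value pairs.
 * Returns the number of pairs stored in out (at most max).
 */
int
ndb_parseline(const char *line, Pair *out, int max)
{
	const char *p = line;
	int n = 0;

	while(n < max){
		p = skipblank(p);
		if(*p == '\0' || *p == '\n')
			break;
		if(*p == '='){		/* stray '=' with no attribute: skip it */
			p++;
			continue;
		}
		p = word(p, out[n].attr, sizeof out[n].attr);
		p = skipblank(p);
		out[n].val[0] = '\0';	/* an attribute may appear without a value */
		if(*p == '='){
			p = skipblank(p+1);
			p = word(p, out[n].val, sizeof out[n].val);
		}
		n++;
	}
	return n;
}
```

<P>
Applied to the line
<TT>ip=135.104.9.31 ether=0800690222f0</TT>,
the routine yields two pairs, one for each attribute.
</P>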
  829. <P>
  830. All programs read the database directly so
  831. consistency problems are rare.
However, the database files can become large.
  833. Our global file, containing all information about
  834. both Datakit and Internet systems in AT&amp;T, has 43,000
  835. lines.
  836. To speed searches, we build hash table files for each
  837. attribute we expect to search often.
  838. The hash file entries point to entries
  839. in the master files.
  840. Every hash file contains the modification time of its master
  841. file so we can avoid using an out-of-date hash table.
Searches for attributes that aren't hashed or whose hash table
is out-of-date still work; they just take longer.
  844. </P>
  845. <H4>4.2 Connection Server
  846. </H4>
  847. <P>
  848. On each system a user level connection server process, CS, translates
  849. symbolic names to addresses.
  850. CS uses information about available networks, the network database, and
  851. other servers (such as DNS) to translate names.
  852. CS is a file server serving a single file,
  853. <TT>/net/cs</TT>.
  854. A client writes a symbolic name to
  855. <TT>/net/cs</TT>
  856. then reads one line for each matching destination reachable
  857. from this system.
  858. The lines are of the form
  859. <I>filename message</I>,
  860. where
  861. <I>filename</I>
  862. is the path of the clone file to open for a new connection and
  863. <I>message</I>
  864. is the string to write to it to make the connection.
  865. The following example illustrates this.
  866. <TT>Ndb/csquery</TT>
  867. is a program that prompts for strings to write to
  868. <TT>/net/cs</TT>
  869. and prints the replies.
  870. <DL><DT><DD><TT><PRE>
  871. % ndb/csquery
  872. &#62; net!helix!9fs
  873. /net/il/clone 135.104.9.31!17008
  874. /net/dk/clone nj/astro/helix!9fs
  875. </PRE></TT></DL>
  876. </P>
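<P>
A client reading
<TT>/net/cs</TT>
must separate each reply into the clone file path and the message to write to it.
A minimal sketch of that step follows; the name <TT>cs_split</TT> is hypothetical,
and the sketch assumes the two fields are separated by a single blank,
as in the replies shown above.
</P>

```c
#include <string.h>

/*
 * Split a CS reply such as "/net/il/clone 135.104.9.31!17008" in place.
 * On success *clone points at the clone file path, *msg at the string
 * to write to it, and 0 is returned; -1 means the line was malformed.
 */
int
cs_split(char *line, char **clone, char **msg)
{
	char *sp;

	line[strcspn(line, "\n")] = '\0';	/* drop a trailing newline, if any */
	sp = strchr(line, ' ');
	if(sp == NULL || sp == line || sp[1] == '\0')
		return -1;
	*sp = '\0';
	*clone = line;
	*msg = sp+1;
	return 0;
}
```

<P>
The caller would then open the file named by <TT>clone</TT> and write
<TT>msg</TT> to the control file it obtains.
</P>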
  877. <P>
  878. CS provides meta-name translation to perform complicated
  879. searches.
  880. The special network name
  881. <TT>net</TT>
  882. selects any network in common between source and
  883. destination supporting the specified service.
  884. A host name of the form <TT>$</TT><I>attr</I>
  885. is the name of an attribute in the network database.
  886. The database search returns the value
  887. of the matching attribute/value pair
  888. most closely associated with the source host.
  889. Most closely associated is defined on a per network basis.
  890. For example, the symbolic name
  891. <TT>tcp!$auth!rexauth</TT>
  892. causes CS to search for the
  893. <TT>auth</TT>
  894. attribute in the database entry for the source system, then its
  895. subnetwork (if there is one) and then its network.
  896. <DL><DT><DD><TT><PRE>
  897. % ndb/csquery
  898. &#62; net!$auth!rexauth
  899. /net/il/clone 135.104.9.34!17021
  900. /net/dk/clone nj/astro/p9auth!rexauth
  901. /net/il/clone 135.104.9.6!17021
  902. /net/dk/clone nj/astro/musca!rexauth
  903. </PRE></TT></DL>
  904. </P>
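<P>
The search order for a
<TT>$</TT><I>attr</I>
name can be sketched with in-memory entries.
The code below is an illustration only: the <TT>Entry</TT> structure and the
<TT>meta_lookup</TT> routine are invented for the example, and it models just the
order described above (system, then subnetwork, then network),
not the per-network rules that CS applies.
</P>

```c
#include <stddef.h>
#include <string.h>

enum { MAXATTR = 4 };	/* arbitrary limit for the example */

typedef struct Entry {
	const char *attr[MAXATTR];	/* attribute names; unused slots are NULL */
	const char *val[MAXATTR];	/* value for the attribute in the same slot */
} Entry;

static const char*
entry_get(const Entry *e, const char *attr)
{
	int i;

	if(e == NULL)
		return NULL;
	for(i = 0; i < MAXATTR && e->attr[i] != NULL; i++)
		if(strcmp(e->attr[i], attr) == 0)
			return e->val[i];
	return NULL;
}

/*
 * Resolve $attr for a source host: try the system's own entry,
 * then its subnetwork (which may be absent), then its network.
 * Returns the first value found, or NULL if none of them has it.
 */
const char*
meta_lookup(const Entry *sys, const Entry *subnet, const Entry *net, const char *attr)
{
	const Entry *order[3];
	const char *v;
	int i;

	order[0] = sys;
	order[1] = subnet;
	order[2] = net;
	for(i = 0; i < 3; i++)
		if((v = entry_get(order[i], attr)) != NULL)
			return v;
	return NULL;
}
```

<P>
With entries like those shown earlier, a system that has no
<TT>auth</TT>
attribute of its own inherits the one recorded for its network.
</P>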
  905. <P>
  906. Normally CS derives naming information from its database files.
  907. For domain names however, CS first consults another user level
  908. process, the domain name server (DNS).
  909. If no DNS is reachable, CS relies on its own tables.
  910. </P>
  911. <P>
  912. Like CS, the domain name server is a user level process providing
  913. one file,
  914. <TT>/net/dns</TT>.
  915. A client writes a request of the form
  916. <I>domain-name type</I>,
  917. where
  918. <I>type</I>
  919. is a domain name service resource record type.
  920. DNS performs a recursive query through the
  921. Internet domain name system producing one line
  922. per resource record found. The client reads
  923. <TT>/net/dns</TT>
  924. to retrieve the records.
  925. Like other domain name servers, DNS caches information
  926. learned from the network.
  927. DNS is implemented as a multi-process shared memory application
  928. with separate processes listening for network and local requests.
  929. </P>
  930. <H4>5 Library routines
  931. </H4>
  932. <P>
  933. The section on protocol devices described the details
  934. of making and receiving connections across a network.
  935. The dance is straightforward but tedious.
  936. Library routines are provided to relieve
  937. the programmer of the details.
  938. </P>
  939. <H4>5.1 Connecting
  940. </H4>
  941. <P>
  942. The
  943. <TT>dial</TT>
  944. library call establishes a connection to a remote destination.
  945. It
  946. returns an open file descriptor for the
  947. <TT>data</TT>
  948. file in the connection directory.
  949. <DL><DT><DD><TT><PRE>
  950. int dial(char *dest, char *local, char *dir, int *cfdp)
  951. </PRE></TT></DL>
  952. </P>
  953. <DL COMPACT>
  954. <DT><TT>dest</TT><DD>
  955. is the symbolic name/address of the destination.
  956. <DT><TT>local</TT><DD>
  957. is the local address.
  958. Since most networks do not support this, it is
  959. usually zero.
  960. <DT><TT>dir</TT><DD>
  961. is a pointer to a buffer to hold the path name of the protocol directory
  962. representing this connection.
  963. <TT>Dial</TT>
  964. fills this buffer if the pointer is non-zero.
  965. <DT><TT>cfdp</TT><DD>
  966. is a pointer to a file descriptor for the
  967. <TT>ctl</TT>
  968. file of the connection.
  969. If the pointer is non-zero,
  970. <TT>dial</TT>
  971. opens the control file and tucks the file descriptor here.
  972. </dl>
  973. <br>&#32;<br>
  974. Most programs call
  975. <TT>dial</TT>
  976. with a destination name and all other arguments zero.
  977. <TT>Dial</TT>
  978. uses CS to
  979. translate the symbolic name to all possible destination addresses
  980. and attempts to connect to each in turn until one works.
  981. Specifying the special name
  982. <TT>net</TT>
  983. in the network portion of the destination
  984. allows CS to pick a network/protocol in common
  985. with the destination for which the requested service is valid.
  986. For example, assume the system
  987. <TT>research.bell-labs.com</TT>
  988. has the Datakit address
  989. <TT>nj/astro/research</TT>
  990. and IP addresses
  991. <TT>135.104.117.5</TT>
  992. and
  993. <TT>129.11.4.1</TT>.
  994. The call
  995. <DL><DT><DD><TT><PRE>
fd = dial("net!research.bell-labs.com!login", 0, 0, 0);
  997. </PRE></TT></DL>
  998. tries in succession to connect to
  999. <TT>nj/astro/research!login</TT>
  1000. on the Datakit and both
  1001. <TT>135.104.117.5!513</TT>
  1002. and
  1003. <TT>129.11.4.1!513</TT>
  1004. across the Internet.
  1005. <P>
  1006. <TT>Dial</TT>
  1007. accepts addresses instead of symbolic names.
  1008. For example, the destinations
  1009. <TT>tcp!135.104.117.5!513</TT>
  1010. and
  1011. <TT>tcp!research.bell-labs.com!login</TT>
  1012. are equivalent
  1013. references to the same machine.
  1014. </P>
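<P>
Destinations have the form
<I>network</I><TT>!</TT><I>host</I><TT>!</TT><I>service</I>.
The following sketch shows one way to split such a string into its parts.
It is not the code used by
<TT>dial</TT>;
the names <TT>Dest</TT> and <TT>dest_split</TT> and the field sizes are assumptions,
and for simplicity the sketch requires both a network and a host part.
</P>

```c
#include <string.h>

typedef struct Dest {
	char net[16];
	char host[64];
	char svc[32];	/* empty when no service was given */
} Dest;

/* copy at most n-1 bytes of src[0..len) into dst and terminate it */
static void
field(char *dst, size_t n, const char *src, size_t len)
{
	if(len >= n)
		len = n-1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/*
 * Split "network!host!service" at the '!' separators.
 * The service part is optional.
 * Returns 0 on success, -1 if the network or host part is missing.
 */
int
dest_split(const char *s, Dest *d)
{
	const char *b1, *b2, *svc;

	b1 = strchr(s, '!');
	if(b1 == NULL || b1 == s)
		return -1;
	b2 = strchr(b1+1, '!');
	if(b2 == NULL)
		b2 = s + strlen(s);	/* no service: host runs to the end */
	if(b2 == b1+1)
		return -1;		/* empty host */
	svc = (*b2 == '!') ? b2+1 : b2;
	field(d->net, sizeof d->net, s, (size_t)(b1 - s));
	field(d->host, sizeof d->host, b1+1, (size_t)(b2 - b1 - 1));
	field(d->svc, sizeof d->svc, svc, strlen(svc));
	return 0;
}
```

<P>
Both
<TT>tcp!135.104.117.5!513</TT>
and
<TT>net!research.bell-labs.com!login</TT>
split into three fields; translating the host and service fields is left to CS.
</P>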
  1015. <H4>5.2 Listening
  1016. </H4>
  1017. <P>
  1018. A program uses
  1019. four routines to listen for incoming connections.
  1020. It first
  1021. <TT>announce()</TT>s
  1022. its intention to receive connections,
  1023. then
  1024. <TT>listen()</TT>s
  1025. for calls and finally
  1026. <TT>accept()</TT>s
  1027. or
  1028. <TT>reject()</TT>s
  1029. them.
  1030. <TT>Announce</TT>
  1031. returns an open file descriptor for the
  1032. <TT>ctl</TT>
  1033. file of a connection and fills
  1034. <TT>dir</TT>
  1035. with the
  1036. path of the protocol directory
  1037. for the announcement.
  1038. <DL><DT><DD><TT><PRE>
  1039. int announce(char *addr, char *dir)
  1040. </PRE></TT></DL>
  1041. <TT>Addr</TT>
  1042. is the symbolic name/address announced;
  1043. if it does not contain a service, the announcement is for
  1044. all services not explicitly announced.
  1045. Thus, one can easily write the equivalent of the
  1046. <TT>inetd</TT>
  1047. program without
  1048. having to announce each separate service.
  1049. An announcement remains in force until the control file is
  1050. closed.
  1051. </P>
  1052. <br>&#32;<br>
  1053. <TT>Listen</TT>
  1054. returns an open file descriptor for the
  1055. <TT>ctl</TT>
  1056. file and fills
  1057. <TT>ldir</TT>
  1058. with the path
  1059. of the protocol directory
  1060. for the received connection.
  1061. It is passed
  1062. <TT>dir</TT>
  1063. from the announcement.
  1064. <DL><DT><DD><TT><PRE>
  1065. int listen(char *dir, char *ldir)
  1066. </PRE></TT></DL>
  1067. <br>&#32;<br>
  1068. <TT>Accept</TT>
  1069. and
  1070. <TT>reject</TT>
  1071. are called with the control file descriptor and
  1072. <TT>ldir</TT>
  1073. returned by
<TT>listen</TT>.
  1075. Some networks such as Datakit accept a reason for a rejection;
  1076. networks such as IP ignore the third argument.
  1077. <DL><DT><DD><TT><PRE>
  1078. int accept(int ctl, char *ldir)
  1079. int reject(int ctl, char *ldir, char *reason)
  1080. </PRE></TT></DL>
  1081. <P>
  1082. The following code implements a typical TCP listener.
  1083. It announces itself, listens for connections, and forks a new
  1084. process for each.
  1085. The new process echoes data on the connection until the
  1086. remote end closes it.
  1087. The "*" in the symbolic name means the announcement is valid for
  1088. any addresses bound to the machine the program is run on.
  1089. <DL><DT><DD><TT><PRE>
int
echo_server(void)
{
	int afd, dfd, lcfd;
	char adir[40], ldir[40];
	int n;
	char buf[256];

	/* announce the echo service on every local address */
	afd = announce("tcp!*!echo", adir);
	if(afd &#60; 0)
		return -1;

	for(;;){
		/* listen for a call */
		lcfd = listen(adir, ldir);
		if(lcfd &#60; 0)
			return -1;

		/* fork a process to echo */
		switch(fork()){
		case 0:
			/* accept the call and open the data file */
			dfd = accept(lcfd, ldir);
			if(dfd &#60; 0)
				return -1;

			/* echo until EOF */
			while((n = read(dfd, buf, sizeof(buf))) &#62; 0)
				write(dfd, buf, n);
			exits(0);
		case -1:
			perror("forking");
			/* fall through: the parent still closes lcfd */
		default:
			close(lcfd);
			break;
		}
	}
}
  1124. </PRE></TT></DL>
  1125. </P>
  1126. <H4>6 User Level
  1127. </H4>
  1128. <P>
  1129. Communication between Plan 9 machines is done almost exclusively in
  1130. terms of 9P messages. Only the two services
  1131. <TT>cpu</TT>
  1132. and
  1133. <TT>exportfs</TT>
  1134. are used.
  1135. The
  1136. <TT>cpu</TT>
  1137. service is analogous to
  1138. <TT>rlogin</TT>.
  1139. However, rather than emulating a terminal session
  1140. across the network,
  1141. <TT>cpu</TT>
  1142. creates a process on the remote machine whose name space is an analogue of the window
  1143. in which it was invoked.
  1144. <TT>Exportfs</TT>
  1145. is a user level file server which allows a piece of name space to be
  1146. exported from machine to machine across a network. It is used by the
  1147. <TT>cpu</TT>
  1148. command to serve the files in the terminal's name space when they are
  1149. accessed from the
  1150. cpu server.
  1151. </P>
  1152. <P>
  1153. By convention, the protocol and device driver file systems are mounted in a
  1154. directory called
  1155. <TT>/net</TT>.
  1156. Although the per-process name space allows users to configure an
  1157. arbitrary view of the system, in practice their profiles build
  1158. a conventional name space.
  1159. </P>
  1160. <H4>6.1 Exportfs
  1161. </H4>
  1162. <P>
  1163. <TT>Exportfs</TT>
  1164. is invoked by an incoming network call.
  1165. The
  1166. <I>listener</I>
  1167. (the Plan 9 equivalent of
  1168. <TT>inetd</TT>)
  1169. runs the profile of the user
  1170. requesting the service to construct a name space before starting
  1171. <TT>exportfs</TT>.
  1172. After an initial protocol
  1173. establishes the root of the file tree being
  1174. exported,
  1175. the remote process mounts the connection,
  1176. allowing
  1177. <TT>exportfs</TT>
  1178. to act as a relay file server. Operations in the imported file tree
  1179. are executed on the remote server and the results returned.
  1180. As a result
  1181. the name space of the remote machine appears to be exported into a
  1182. local file tree.
  1183. </P>
  1184. <P>
  1185. The
  1186. <TT>import</TT>
  1187. command calls
  1188. <TT>exportfs</TT>
  1189. on a remote machine, mounts the result in the local name space,
  1190. and
  1191. exits.
  1192. No local process is required to serve mounts;
  1193. 9P messages are generated by the kernel's mount driver and sent
  1194. directly over the network.
  1195. </P>
  1196. <P>
  1197. <TT>Exportfs</TT>
  1198. must be multithreaded since the system calls
<TT>open</TT>,
  1200. <TT>read</TT>
  1201. and
  1202. <TT>write</TT>
  1203. may block.
  1204. Plan 9 does not implement the
  1205. <TT>select</TT>
  1206. system call but does allow processes to share file descriptors,
  1207. memory and other resources.
  1208. <TT>Exportfs</TT>
  1209. and the configurable name space
  1210. provide a means of sharing resources between machines.
  1211. It is a building block for constructing complex name spaces
  1212. served from many machines.
  1213. </P>
  1214. <P>
  1215. The simplicity of the interfaces encourages naive users to exploit the potential
  1216. of a richly connected environment.
  1217. Using these tools it is easy to gateway between networks.
  1218. For example a terminal with only a Datakit connection can import from the server
  1219. <TT>helix</TT>:
  1220. <DL><DT><DD><TT><PRE>
  1221. import -a helix /net
  1222. telnet ai.mit.edu
  1223. </PRE></TT></DL>
  1224. The
  1225. <TT>import</TT>
  1226. command makes a Datakit connection to the machine
  1227. <TT>helix</TT>
  1228. where
it starts an instance of
  1230. <TT>exportfs</TT>
  1231. to serve
  1232. <TT>/net</TT>.
  1233. The
  1234. <TT>import</TT>
  1235. command mounts the remote
  1236. <TT>/net</TT>
  1237. directory after (the
  1238. <TT>-a</TT>
  1239. option to
  1240. <TT>import</TT>)
  1241. the existing contents
  1242. of the local
  1243. <TT>/net</TT>
  1244. directory.
  1245. The directory contains the union of the local and remote contents of
  1246. <TT>/net</TT>.
  1247. Local entries supersede remote ones of the same name so
  1248. networks on the local machine are chosen in preference
  1249. to those supplied remotely.
  1250. However, unique entries in the remote directory are now visible in the local
  1251. <TT>/net</TT>
  1252. directory.
  1253. All the networks connected to
  1254. <TT>helix</TT>,
  1255. not just Datakit,
  1256. are now available in the terminal. The effect on the name space is shown by the following
  1257. example:
  1258. <DL><DT><DD><TT><PRE>
  1259. philw-gnot% ls /net
  1260. /net/cs
  1261. /net/dk
  1262. philw-gnot% import -a musca /net
  1263. philw-gnot% ls /net
  1264. /net/cs
  1265. /net/cs
  1266. /net/dk
  1267. /net/dk
  1268. /net/dns
  1269. /net/ether
  1270. /net/il
  1271. /net/tcp
  1272. /net/udp
  1273. </PRE></TT></DL>
  1274. </P>
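<P>
The rule that local entries supersede remote ones of the same name amounts to
searching the members of the union in order and taking the first match.
The sketch below models this with in-memory lists; it is not how the kernel
implements union directories, and the names <TT>Dir</TT> and <TT>union_lookup</TT>
are invented for the example.
</P>

```c
#include <stddef.h>
#include <string.h>

typedef struct Dir {
	const char *origin;	/* label only, e.g. "local" or "remote" */
	const char **names;	/* NULL-terminated list of entries */
} Dir;

/*
 * Look a name up in a union of directories searched in order.
 * The first directory holding the name wins, so earlier members
 * hide same-named entries in later ones.
 * Returns the origin label of the winning directory, or NULL.
 */
const char*
union_lookup(const Dir *dirs, int ndirs, const char *name)
{
	int i;
	const char **p;

	for(i = 0; i < ndirs; i++)
		for(p = dirs[i].names; *p != NULL; p++)
			if(strcmp(*p, name) == 0)
				return dirs[i].origin;
	return NULL;
}
```

<P>
With the local directory listed first, as after
<TT>import -a</TT>,
a lookup of
<TT>dk</TT>
resolves locally while
<TT>tcp</TT>
resolves to the imported directory.
</P>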
  1275. <H4>6.2 Ftpfs
  1276. </H4>
  1277. <P>
  1278. We decided to make our interface to FTP
  1279. a file system rather than the traditional command.
  1280. Our command,
<I>ftpfs</I>,
  1282. dials the FTP port of a remote system, prompts for login and password, sets image mode,
  1283. and mounts the remote file system onto
  1284. <TT>/n/ftp</TT>.
  1285. Files and directories are cached to reduce traffic.
  1286. The cache is updated whenever a file is created.
  1287. Ftpfs works with TOPS-20, VMS, and various Unix flavors
  1288. as the remote system.
  1289. </P>
  1290. <H4>7 Cyclone Fiber Links
  1291. </H4>
  1292. <P>
  1293. The file servers and CPU servers are connected by
  1294. high-bandwidth
  1295. point-to-point links.
  1296. A link consists of two VME cards connected by a pair of optical
  1297. fibers.
  1298. The VME cards use 33MHz Intel 960 processors and AMD's TAXI
  1299. fiber transmitter/receivers to drive the lines at 125 Mbit/sec.
  1300. Software in the VME card reduces latency by copying messages from system memory
  1301. to fiber without intermediate buffering.
  1302. </P>
  1303. <H4>8 Performance
  1304. </H4>
  1305. <P>
  1306. We measured both latency and throughput
  1307. of reading and writing bytes between two processes
  1308. for a number of different paths.
  1309. Measurements were made on two- and four-CPU SGI Power Series processors.
  1310. The CPUs are 25 MHz MIPS 3000s.
  1311. The latency is measured as the round trip time
  1312. for a byte sent from one process to another and
  1313. back again.
  1314. Throughput is measured using 16k writes from
  1315. one process to another.
  1316. <DL><DT><DD><TT><PRE>
  1317. <br><img src="data.7581.gif"><br>
  1318. </PRE></TT></DL>
  1319. </P>
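<P>
The latency measurement described above can be approximated on a POSIX system
with the sketch below.
It is not the benchmark used for the figures reported here: a pair of POSIX pipes
stands in for the measured paths, the clock is
<TT>clock_gettime</TT>,
and the name <TT>pingpong_usec</TT> is hypothetical.
One byte is sent to a child process, which sends it straight back, and the
mean round trip time is reported.
</P>

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child over a pair of pipes
 * `rounds` times. Returns the mean round trip time in microseconds,
 * or -1.0 on error.
 */
double
pingpong_usec(int rounds)
{
	int to[2], from[2], i;
	char c = 'x';
	struct timespec t0, t1;
	pid_t pid;

	if(rounds <= 0 || pipe(to) < 0)
		return -1.0;
	if(pipe(from) < 0){
		close(to[0]);
		close(to[1]);
		return -1.0;
	}
	pid = fork();
	if(pid < 0){
		close(to[0]);
		close(to[1]);
		close(from[0]);
		close(from[1]);
		return -1.0;
	}
	if(pid == 0){
		/* child: echo each byte straight back until EOF */
		close(to[1]);
		close(from[0]);
		while(read(to[0], &c, 1) == 1)
			if(write(from[1], &c, 1) != 1)
				break;
		_exit(0);
	}

	/* parent: keep only the ends it uses */
	close(to[0]);
	close(from[1]);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for(i = 0; i < rounds; i++)
		if(write(to[1], &c, 1) != 1 || read(from[0], &c, 1) != 1)
			break;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(to[1]);		/* EOF ends the child's loop */
	close(from[0]);
	waitpid(pid, NULL, 0);
	if(i != rounds)
		return -1.0;
	return ((t1.tv_sec - t0.tv_sec)*1e6 + (t1.tv_nsec - t0.tv_nsec)/1e3) / rounds;
}
```

<P>
Averaging over many rounds smooths out scheduling noise; the result depends
heavily on the machine and is not comparable with the numbers in this section.
</P>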
  1320. <H4>9 Conclusion
  1321. </H4>
  1322. <P>
  1323. The representation of all resources as file systems
  1324. coupled with an ASCII interface has proved more powerful
  1325. than we had originally imagined.
  1326. Resources can be used by any computer in our networks
  1327. independent of byte ordering or CPU type.
  1328. The connection server provides an elegant means
  1329. of decoupling tools from the networks they use.
  1330. Users successfully use Plan 9 without knowing the
  1331. topology of the system or the networks they use.
More information about 9P can be found in Section 5 of the Plan 9 Programmer's
  1333. Manual, Volume I.
  1334. </P>
  1335. <H4>10 References
  1336. </H4>
  1337. <br>&#32;<br>
  1338. [Pike90] R. Pike, D. Presotto, K. Thompson, H. Trickey,
  1339. ``Plan 9 from Bell Labs'',
  1340. UKUUG Proc. of the Summer 1990 Conf. ,
  1341. London, England,
  1342. 1990.
  1343. <br>&#32;<br>
  1344. [Needham] R. Needham, ``Names'', in
  1345. Distributed systems,
  1346. S. Mullender, ed.,
  1347. Addison Wesley, 1989.
  1348. <br>&#32;<br>
  1349. [Presotto] D. Presotto, ``Multiprocessor Streams for Plan 9'',
  1350. UKUUG Proc. of the Summer 1990 Conf. ,
  1351. London, England, 1990.
  1352. <br>&#32;<br>
[Met80] R. Metcalfe, D. Boggs, C. Crane, E. Taft and J. Hupp, ``The
  1354. Ethernet Local Network: Three reports'',
  1355. CSL-80-2,
  1356. XEROX Palo Alto Research Center, February 1980.
  1357. <br>&#32;<br>
  1358. [Fra80] A. G. Fraser, ``Datakit - A Modular Network for Synchronous
  1359. and Asynchronous Traffic'',
  1360. Proc. Int'l Conf. on Communication,
  1361. Boston, June 1980.
  1362. <br>&#32;<br>
  1363. [Pet89a] L. Peterson, ``RPC in the X-Kernel: Evaluating new Design Techniques'',
  1364. Proc. Twelfth Symp. on Op. Sys. Princ.,
Litchfield Park, AZ, December 1989.
  1366. <br>&#32;<br>
  1367. [Rit84a] D. M. Ritchie, ``A Stream Input-Output System'',
AT&amp;T Bell Laboratories Technical Journal, 63(8),
  1369. October 1984.
  1370. <br>&#32;<br>
  1371. <A href=http://www.lucent.com/copyright.html>
  1372. Copyright</A> &#169; 2000 Lucent Technologies Inc. All rights reserved.
  1373. </body></html>