  1. .TL
  2. The Inferno Shell
  3. .AU
  4. Roger Peppé
  5. rog@vitanuova.com
  6. .AB
  7. The Inferno shell
  8. .I sh
  9. is a reasonably small shell that brings together aspects of
  10. several other shells along with Inferno's dynamically loaded
  11. modules, which it uses for much of the functionality
  12. traditionally built in to the shell. This paper focuses principally
  13. on the features that make it unusual, and presents
  14. an example ``network chat'' application written entirely
  15. in
  16. .I sh
  17. script.
  18. .AE
  19. .SH
  20. Introduction
  21. .LP
  22. Shells come in many shapes and sizes. The Inferno
  23. shell
  24. .I sh
  25. (actually one of three shells supplied with Inferno)
  26. is an attempt to combine the strengths of a Unix-like
  27. shell, notably Tom Duff's
  28. .I rc ,
  29. with some of the features peculiar to Inferno.
  30. It owes its largest debt to
  31. .I rc ,
  32. which provides almost all of the syntax
  33. and most of the semantics too; when in doubt,
  34. I copied
  35. .I rc 's
  36. behaviour.
  37. In fact, I borrowed as many good ideas as I could
  38. from elsewhere, inventing new concepts and syntax
  39. only when unbearably tempted. See Credits
  40. for a list of those I could remember.
  41. .LP
  42. This paper does not attempt to give more than
  43. a brief overview of the aspects of
  44. .I sh
  45. which it holds in common with Plan 9's
  46. .I rc .
  47. The reader is referred
  48. to
  49. .I sh (1)
  50. (the definitive reference)
  51. and Tom Duff's paper ``Rc - The Plan 9 Shell''.
  52. I have occasionally pinched examples from the latter,
  53. so the differences are easily contrasted.
  54. .SH
  55. Overview
  56. .LP
  57. .I Sh
  58. is, at its simplest level, a command interpreter that will
  59. be familiar to all those who have used the Bourne-shell,
  60. C shell, or any of the numerous variants thereof (e.g.
  61. .I bash ,
  62. .I ksh ,
  63. .I tcsh ).
  64. All of the following commands behave as expected:
  65. .P1
  66. date
  67. cat /lib/keyboard
  68. ls -l > file.names
  69. ls -l /dis >> file.names
  70. wc <file
  71. echo [a-f]*.b
  72. ls | wc
  73. ls; date
  74. limbo *.b &
  75. .P2
  76. An
  77. .I rc
  78. concept that will be less familiar to users
  79. of more conventional shells is the rôle of
  80. .I lists
  81. in the shell.
  82. Each simple
  83. .I sh
  84. command, and the value of any
  85. .I sh
  86. environment variable, consists of a list of words.
  87. .I Sh
  88. lists are flat, a simple ordered list of words,
  89. where a word is a sequence of characters that
  90. may include white-space or characters special
  91. to the shell. The Bourne-shell and its kin
  92. have no such concept, which means that every
  93. time the value of any environment variable is
  94. used, it is split into blank separated words.
  95. For instance, the command:
  96. .P1
  97. x='-l /lib/keyboard'
  98. ls $x
  99. .P2
  100. would in many shells pass the two arguments
  101. .CW -l '' ``
  102. and
  103. .CW /lib/keyboard '' ``
  104. to the
  105. .CW ls
  106. command.
  107. In
  108. .I sh ,
  109. it will pass the single argument
  110. .CW "-l /lib/keyboard" ''. ``
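.LP
The difference is easy to model outside the shell. The following Python
sketch is an illustration only, not a description of how either shell is
implemented; the function names are invented for the purpose. It builds
the argument list for the example above under each rule.

```python
# Toy model of how `x='-l /lib/keyboard'; ls $x` builds its argument list.
# The function names are hypothetical; they do not exist in any shell.

def bourne_expand(value):
    """Bourne-style: the variable holds one string, re-split on blanks at each use."""
    return value.split()


def sh_expand(value):
    """sh/rc-style: the variable already holds a list of words, used unchanged."""
    return list(value)


x = "-l /lib/keyboard"
bourne_argv = ["ls"] + bourne_expand(x)   # two arguments after ls
sh_argv = ["ls"] + sh_expand([x])         # one argument after ls
```

The only state that matters is whether the variable holds a string or a
list; everything else follows from that.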
  111. .LP
  112. The following aspects of
  113. .I sh 's
  114. syntax will be familiar to users of
  115. .I rc .
  116. .LP
  117. File descriptor manipulation:
  118. .P1
  119. echo hello, world > /dev/null >[1=2]
  120. .P2
  121. Environment variable values:
  122. .P1
  123. echo $var
  124. .P2
  125. Count number of elements in a variable:
  126. .P1
  127. echo $#var
  128. .P2
  129. Run a command and substitute its output:
  130. .P1
  131. rm `{grep -li microsoft *}
  132. .P2
  133. Lists:
  134. .P1
  135. echo (((a b) c) d)
  136. .P2
  137. List concatenation:
  138. .P1
  139. cat /appl/cmd/sh/^(std regex expr)^.b
  140. .P2
  141. To the above,
  142. .I sh
  143. adds a variant of the
  144. .CW `{}
  145. operator:
  146. \f5"{}\fP,
  147. which is the same except that it does not
  148. split the command's output into tokens,
  149. for example:
  150. .P1
  151. for i in "{echo one two three} {
  152. echo loop
  153. }
  154. .P2
  155. will only print
  156. .CW loop
  157. once.
  158. .LP
  159. .I Sh
  160. also adds a new redirection operator
  161. .CW <> ,
  162. which opens the standard input (by default) for
  163. reading
  164. .I and
  165. writing.
  166. .SH
  167. Command blocks
  168. .LP
  169. Possibly
  170. .I sh 's
  171. most significant departure from the
  172. norm is its use of command blocks as values.
  173. In a conventional shell, a command block
  174. groups commands together into a single
  175. syntactic unit that can then be used wherever
  176. a simple command might appear.
  177. For example:
  178. .P1
  179. {
  180. echo hello
  181. echo goodbye
  182. } > /dev/null
  183. .P2
  184. .I Sh
  185. allows this, but it also allows a command block to appear
  186. wherever a normal word would appear. In this
  187. case, the command block is not executed immediately,
  188. but is bundled up as if it was a single quoted word.
  189. For example:
  190. .P1
  191. cmd = {
  192. echo hello
  193. echo goodbye
  194. }
  195. .P2
  196. will store the contents of the braced block inside
  197. the environment variable
  198. .CW $cmd .
  199. Printing the value of
  200. .CW $cmd
  201. gets the block back again, for example:
  202. .P1
  203. echo $cmd
  204. .P2
  205. gives
  206. .P1
  207. {echo hello;echo goodbye}
  208. .P2
  209. Note that when the shell parsed the block,
  210. it ignored everything that was not
  211. syntactically relevant to the execution
  212. of the block; for instance, the white space
  213. has been reduced to the minimum necessary,
  214. and the newline has been changed to
  215. the functionally identical semi-colon.
  216. .LP
  217. It is also worth pointing out that
  218. .CW echo
  219. is an external module, implementing only the
  220. standard
  221. .I Command (2)
  222. interface; it has no knowledge of shell command
  223. blocks. When the shell invokes an external command,
  224. and one of the arguments is a command block,
  225. it simply passes the equivalent string. Internally, built in commands
  226. are slightly different for efficiency's sake, as we will see,
  227. but for almost all purposes you can treat command blocks
  228. as if they were strings holding functionally equivalent shell commands.
  229. .LP
  230. This equivalence also applies to the execution of commands.
  231. When the
  232. shell comes to execute a simple command (a sequence of
  233. words), it examines the first word to decide what to execute.
  234. In most shells, this word can be either the file name of
  235. an external command, or the name of a command built in
  236. to the shell (e.g.
  237. .CW exit ).
  238. .LP
  239. .I Sh
  240. follows these conventional rules, but first, it examines
  241. the first character of the first word, and if it is an open
  242. brace
  243. .CW { ) (
  244. character, it treats it as a command block,
  245. parses it, and executes it according to the normal syntax
  246. rules of the shell. For the duration of this execution, it
  247. sets the environment variable
  248. .CW $*
  249. to the list of arguments passed to the block. For example:
  250. .P1
  251. {echo $*} hello world
  252. .P2
  253. is exactly the same as
  254. .P1
  255. echo hello world
  256. .P2
  257. Execution of command blocks is the same whether
  258. the command block is just a string or has already been
  259. parsed by the shell.
  260. For example:
  261. .P1
  262. {echo hello}
  263. .P2
  264. is exactly the same as
  265. .P1
  266. \&'{echo hello}'
  267. .P2
  268. The only difference is that the former case has its syntax
  269. checked for correctness as soon as the shell sees the script;
  270. whereas if the latter contained a malformed command block,
  271. a syntax error will be raised only when it
  272. comes to actually execute the command.
  273. .LP
  274. The shell's treatment of braces can be used to provide functionality
  275. similar to the
  276. .CW eval
  277. command that is built in to most other shells.
  278. .P1
  279. cmd = 'echo hello; echo goodbye'
  280. \&'{'^$cmd^'}'
  281. .P2
  282. In other words, simply by surrounding a string
  283. by braces and executing it, the string
  284. will be executed as if it had been typed to the
  285. shell. Note the use of the caret
  286. .CW ^ ) (
  287. string concatenation operator.
  288. .I Sh
  289. does provide `free carets' in the same way as
  290. .I rc ,
  291. so in the previous example
  292. .P1
  293. \&'{'$cmd'}'
  294. .P2
  295. would work exactly the same, but generally,
  296. and in particular when writing scripts, it is
  297. good style to make the carets explicit.
  298. .SH
  299. Assignment and scope
  300. .LP
  301. The assignment operator in
  302. .I sh ,
  303. in common with most other shells,
  304. is
  305. .CW = .
  306. .P1
  307. x=a b c d
  308. .P2
  309. assigns the four element list
  310. .CW "(a b c d)"
  311. to the environment variable named
  312. .CW x .
  313. The value can later be extracted
  314. with the
  315. .CW $
  316. operator, for example:
  317. .P1
  318. echo $x
  319. .P2
  320. will print
  321. .P1
  322. a b c d
  323. .P2
  324. .I Sh
  325. also implements a form of local variable.
  326. An execution of a braced block command
  327. creates a new scope for the duration of that block;
  328. the value of a variable assigned with
  329. .CW :=
  330. in that block will be lost when the
  331. block exits. For example:
  332. .P1
  333. x = hello
  334. {x := goodbye }
  335. echo $x
  336. .P2
  337. will print ``hello''.
  338. Note that the scoping rules are
  339. .I dynamic
  340. \- variable references are interpreted
  341. relative to their containing scope at execution time.
  342. For example:
  343. .P1
  344. x := hello
  345. cmd := {echo $x}
  346. {
  347. x := goodbye
  348. $cmd
  349. }
  350. .P2
  351. will print ``goodbye'', not ``hello''. For one
  352. way of avoiding this problem, see ``Lexical
  353. binding'' below.
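.LP
Dynamic scoping can be pictured as a stack of scopes searched from the
innermost outwards at the moment a variable is referenced. The Python
sketch below models only that much: the class and method names are
invented, only the local assignment operator is modelled, and it makes
no claim about how the shell stores its environment.

```python
class Env:
    """Toy model of dynamic scoping: a stack of scopes, innermost last."""

    def __init__(self):
        self.scopes = [{}]

    def assign_local(self, name, value):
        """Models `name := value`: bind in the innermost scope only."""
        self.scopes[-1][name] = value

    def lookup(self, name):
        """Models `$name`: search from the innermost scope outwards."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return []  # an unset variable is modelled as the empty list

    def run_block(self, block):
        """Models executing a braced block: new scope, dropped on exit."""
        self.scopes.append({})
        try:
            return block(self)
        finally:
            self.scopes.pop()


env = Env()
env.assign_local("x", ["hello"])


def cmd(e):
    # Models {echo $x}: $x is looked up when the block runs, not when it is stored.
    return e.lookup("x")


def outer(e):
    e.assign_local("x", ["goodbye"])
    return e.run_block(cmd)


seen_inside = env.run_block(outer)   # what $cmd sees when run in the block
seen_after = env.lookup("x")         # what $x holds once the block has exited
```

Here the stored block sees the inner binding of x, and the outer binding
reappears once the enclosing block has finished.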
  354. .LP
  355. One late, but useful, addition to the shell's assignment
  356. syntax is tuple assignment. This partially
  357. makes up for the lack of list indexing primitives in the shell.
  358. If the left hand side of the assignment operator is
  359. a list of variable names, each element of the list on the
  360. right hand side is assigned in turn to its respective variable.
  361. The last variable mentioned gets assigned all the
  362. remaining elements.
  363. For example, after:
  364. .P1
  365. (a b c) := (one two three four five)
  366. .P2
  367. .CW a
  368. is
  369. .CW one ,
  370. .CW b
  371. is
  372. .CW two ,
  373. and
  374. .CW c
  375. contains the three element list
  376. .CW "(three four five)" .
  377. For example:
  378. .P1
  379. (first var) = $var
  380. .P2
  381. knocks the first element off
  382. .CW $var
  383. and puts it in
  384. .CW $first .
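.LP
The rule is compact enough to state as a few lines of Python. This is a
sketch of the behaviour described above, with an invented function name;
what happens when the right hand side has fewer elements than there are
names is not covered by the text, so the empty lists produced in that
case are an assumption.

```python
def tuple_assign(names, values):
    """Bind each name to one value; the last name takes everything left over.

    Returns a dict mapping each name to a list of words.
    ASSUMPTION: if values run out, the remaining names get empty lists.
    """
    if not names:
        return {}
    bound = {}
    for i, name in enumerate(names[:-1]):
        bound[name] = values[i:i + 1]          # one word, or [] if none left
    bound[names[-1]] = values[len(names) - 1:]  # all remaining words
    return bound


example = tuple_assign(["a", "b", "c"], ["one", "two", "three", "four", "five"])

# Models `(first var) = $var` with var holding (x y z).
shifted = tuple_assign(["first", "var"], ["x", "y", "z"])
```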
  385. .LP
  386. One important difference between
  387. .I sh 's
  388. variables and variables in shells under
  389. Unix-like operating systems derives from
  390. the fact that Inferno's underlying process
  391. creation primitive is
  392. .I spawn ,
  393. not
  394. .I fork .
  395. This means that, even though the shell
  396. might create a new process to accomplish
  397. an I/O redirection, variables changed by
  398. the sub-process are still visible in the parent
  399. process. This applies anywhere a new process
  400. is created that runs synchronously with respect
  401. to the rest of the shell script - i.e. there is no
  402. chance of parallel access to the environment.
  403. For example, it is possible to get
  404. access to the status value of a command executed
  405. by the
  406. .CW `{}
  407. operator:
  408. .P1
  409. files=`{du -a; dustatus = $status}
  410. if {! ~ $dustatus ''} {
  411. echo du failed
  412. }
  413. .P2
  414. When the shell does spawn an asynchronous
  415. process (background processes and pipelines
  416. are the two occasions that it does so), the
  417. environment is copied so changes in one
  418. process do not affect another.
  419. .SH
  420. Loadable modules
  421. .LP
  422. The ability to pass command blocks as values is
  423. all very well, but does not in itself provide the
  424. programmability that is central to the power of shell scripts
  425. and is built in to most shells (the conditional
  426. execution of commands, for instance).
  427. The Inferno shell is different;
  428. it provides no programmability within the shell itself,
  429. but instead relies on external modules to provide this.
  430. It has a built in command
  431. .CW load
  432. that loads a new module into the shell. The module
  433. that supports standard control flow functionality
  434. and a number of other useful tidbits is called
  435. .CW std .
  436. .P1
  437. load std
  438. .P2
  439. loads this module into the shell.
  440. .CW Std
  441. is a Dis module that
  442. implements the
  443. .CW Shellbuiltin
  444. interface; the shell looks in the directory
  445. .CW /dis/sh
  446. for the module file, in this case
  447. .CW /dis/sh/std.dis .
  448. .LP
  449. When a module is loaded, it is given the opportunity
  450. to define as many new commands as it wants.
  451. Perhaps slightly confusingly, these are known as
  452. ``built-in'' commands (or just ``builtins''), to distinguish
  453. them from commands executed in a separate process
  454. with no access to shell internals. Built-in
  455. commands run in the same process as the shell, and
  456. have direct access to all its internal state (environment variables,
  457. command line options, and state stored within the implementing
  458. module itself). It is possible to find out
  459. what built-in commands are currently defined with
  460. the command
  461. .CW loaded .
  462. Before any modules have been loaded, typing
  463. .P1
  464. loaded
  465. .P2
  466. produces:
  467. .P1
  468. builtin builtin
  469. exit builtin
  470. load builtin
  471. loaded builtin
  472. run builtin
  473. unload builtin
  474. whatis builtin
  475. ${builtin} builtin
  476. ${loaded} builtin
  477. ${quote} builtin
  478. ${unquote} builtin
  479. .P2
  480. These are all the commands that are built in to the
  481. shell proper; I'll explain the
  482. .CW ${}
  483. commands later.
  484. After loading
  485. .CW std ,
  486. executing
  487. .CW loaded
  488. produces:
  489. .P1
  490. ! std
  491. and std
  492. apply std
  493. builtin builtin
  494. exit builtin
  495. flag std
  496. fn std
  497. for std
  498. getlines std
  499. if std
  500. load builtin
  501. loaded builtin
  502. .P3
  503. or std
  504. pctl std
  505. raise std
  506. rescue std
  507. run builtin
  508. status std
  509. subfn std
  510. unload builtin
  511. whatis builtin
  512. while std
  513. ~ std
  514. .P3
  515. ${builtin} builtin
  516. ${env} std
  517. ${hd} std
  518. ${index} std
  519. ${join} std
  520. ${loaded} builtin
  521. ${parse} std
  522. ${pid} std
  523. ${pipe} std
  524. ${quote} builtin
  525. ${split} std
  526. ${tl} std
  527. ${unquote} builtin
  528. .P2
  529. The name of each command defined
  530. by a loaded module is followed by the name of
  531. the module, so you can see that in this case
  532. .CW std
  533. has defined commands such as
  534. .CW if
  535. and
  536. .CW while .
  537. These commands are reminiscent of the
  538. commands built in to the syntax of
  539. other shells, but have no special syntax
  540. associated with them: they obey the normal
  541. argument gathering and execution semantics.
  542. .LP
  543. As an example, consider the
  544. .CW for
  545. command.
  546. .P1
  547. for i in a b c d {
  548. echo $i
  549. }
  550. .P2
  551. This command traverses the list
  552. .CW "(a b c d)"
  553. executing
  554. .CW "{echo $i}"
  555. with
  556. .CW $i
  557. set to each element in turn. In
  558. .I rc ,
  559. this might be written
  560. .P1
  561. for (i in a b c d) {
  562. echo $i
  563. }
  564. .P2
  565. and in fact, in
  566. .I sh ,
  567. this is exactly equivalent. The round brackets
  568. denote a list and, like
  569. .I rc ,
  570. all lists are flattened before passing to an
  571. executed command.
  572. Unlike the
  573. .CW for
  574. command in
  575. .I rc ,
  576. the braces around the command are
  577. not optional; as with the arguments to
  578. a normal command, gathering of arguments
  579. stops at a newline. The exception to this rule
  580. is that newlines within brackets are treated as white space.
  581. This last rule also
  582. applies to round brackets, for example:
  583. .P1
  584. (for i in
  585. a
  586. b
  587. c
  588. d
  589. {echo $i}
  590. )
  591. .P2
  592. does the same thing.
  593. This is very useful for commands that take multiple
  594. command block arguments, and is actually the only
  595. line continuation mechanism that
  596. .I sh
  597. provides (the usual backslash
  598. .CW \e ) (
  599. character is not in any way special to
  600. .I sh ).
  601. .SH
  602. Control structures
  603. .LP
  604. Inferno commands, like shell commands in Unix
  605. or Plan 9, return a status when they finish.
  606. A command's status in Inferno is a short string
  607. describing any error that has occurred;
  608. it can be found in the environment variable
  609. .CW $status .
  610. This is the value that commands defined by
  611. .CW std
  612. use to determine conditional
  613. execution - if it is empty, it is true; otherwise
  614. false.
  615. .CW Std
  616. defines, for instance, a command
  617. .CW ~
  618. that provides a simple pattern matching capability.
  619. Its first argument is the string to test the patterns
  620. against, and subsequent arguments give the patterns,
  621. in normal shell wildcard syntax; its status is true
  622. if there is a match.
  623. .P1
  624. ~ sh.y '*.y'
  625. ~ std.b '*.y'
  626. .P2
  627. give true and false statuses respectively.
  628. A couple of pitfalls lurk here for the unwary:
  629. unlike its
  630. .I rc
  631. namesake, the patterns
  632. .I are
  633. expanded by the shell if left unquoted, so
  634. one has to be careful to quote wildcard characters,
  635. or escape them with a backslash if they are to
  636. be used literally.
  637. Like any other command,
  638. .CW ~
  639. receives a simple list of arguments, so it has to
  640. assume that the string tested has exactly one element;
  641. if you provide a null variable, or one with more
  642. than one element, then you will get unexpected results.
  643. If in doubt, use the
  644. \f5$"\fP
  645. operator to make sure of that.
  646. .LP
  647. Used in conjunction with the
  648. .CW $#
  649. operator,
  650. .CW ~
  651. provides a way to check the
  652. number of elements in a list:
  653. .P1
  654. ~ $#var 0
  655. .P2
  656. will be true if
  657. .CW $var
  658. is empty.
  659. .LP
  660. This can be tested by the
  661. .CW if
  662. command, which
  663. accepts command blocks for
  664. its arguments, executing its second argument if
  665. the status of the first is empty (true).
  666. For example:
  667. .P1
  668. if {~ $#var 0} {
  669. echo '$var has no elements'
  670. }
  671. .P2
  672. Note that the start of one argument must
  673. come on the same line as the end of the previous,
  674. otherwise it will be treated as a new command,
  675. and always executed. For example:
  676. .P1
  677. if {~ $#var 0}
  678. {echo '$var has no elements'} # this will always be executed
  679. .P2
  680. The way to get around this is to use list bracketing,
  681. for example:
  682. .P1
  683. (if {~ $#var 0}
  684. {echo '$var has no elements'}
  685. )
  686. .P2
  687. will have the desired effect.
  688. The
  689. .CW if
  690. command is more general than
  691. .I rc 's
  692. .CW if ,
  693. in that it accepts an arbitrary number
  694. of condition/action pairs, and executes each condition
  695. in turn until one is true, whereupon it executes the associated
  696. action. If the last condition has no action, then it
  697. acts as the ``else'' clause in the
  698. .CW if .
  699. For example:
  700. .P1
  701. (if {~ $#var 0} {
  702. echo zero elements
  703. }
  704. {~ $#var 1} {
  705. echo one element
  706. }
  707. {echo more than one element}
  708. )
  709. .P2
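.LP
Because
.CW if
is an ordinary command that receives a list of blocks, its argument
handling can be modelled as a plain function. The Python sketch below
treats each block as a callable returning a status string, with the
empty string meaning true. All the names are invented, and the status
yielded when no condition succeeds and there is no ``else'' block is not
stated above, so the value returned in that case is only a placeholder.

```python
def sh_if(*blocks):
    """Model of `if`: condition/action pairs, plus an optional trailing else block."""
    remaining = list(blocks)
    status = ""
    while len(remaining) >= 2:
        condition, action = remaining[0], remaining[1]
        status = condition()
        if status == "":              # empty status means true
            return action()
        remaining = remaining[2:]
    if remaining:                     # a lone final block acts as the else clause
        return remaining[0]()
    return status                     # ASSUMPTION: placeholder, not specified in the text


def has_count(var, n):
    """Condition block modelling {~ $#var n}."""
    return lambda: "" if len(var) == n else "no match"


def describe(var):
    """Runs the three-way example above and returns what it would have echoed."""
    printed = []

    def echo(message):
        def action():
            printed.append(message)
            return ""
        return action

    sh_if(has_count(var, 0), echo("zero elements"),
          has_count(var, 1), echo("one element"),
          echo("more than one element"))
    return printed
```

Conditions are evaluated strictly in order, and no later block is run
once an action has been chosen.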
  710. .LP
  711. .CW Std
  712. provides various other control structures.
  713. .CW And
  714. and
  715. .CW or
  716. provide the equivalent of
  717. .I rc 's
  718. .CW &&
  719. and
  720. .CW ||
  721. operators. They each take any number of command
  722. block arguments and conditionally execute each
  723. in turn.
  724. .CW And
  725. stops executing when a block's status is false,
  726. .CW or
  727. when a block's status is true:
  728. .P1
  729. and {~ $#var 1} {~ $var '*.sbl'} {echo variable ends in .sbl}
  730. (or {mount /dev/eia0 /n/remote}
  731. {echo mount has failed with $status}
  732. )
  733. .P2
  734. An extremely easy trap to fall into is to use
  735. .CW $*
  736. inside a block assuming that its value is the
  737. same as that outside the block. For instance:
  738. .P1
  739. # this will not work
  740. if {~ $#* 2} {echo two arguments}
  741. .P2
  742. It will not work because
  743. .CW $*
  744. is set locally for every block, whether it
  745. is given arguments or not. A solution is to
  746. assign
  747. .CW $*
  748. to a variable at the start of the block:
  749. .P1
  750. args = $*
  751. if {~ $#args 2} {echo two arguments}
  752. .P2
  753. .LP
  754. .CW While
  755. provides looping, executing its second argument
  756. as long as the status of the first remains true.
  757. As the status of an empty block is always true,
  758. .P1
  759. while {} {echo yes}
  760. .P2
  761. will loop forever printing ``yes''.
  762. Another looping command is
  763. .CW getlines ,
  764. which loops reading lines from its standard
  765. input, and executing its command argument,
  766. setting the environment variable
  767. .CW $line
  768. to each line in turn.
  769. For example:
  770. .P1
  771. getlines {
  772. echo '#' $line
  773. } < x.b
  774. .P2
  775. will print each line of the file
  776. .CW x.b
  777. preceded by a
  778. .CW #
  779. character.
  780. .SH
  781. Exceptions
  782. .LP
  783. When the shell encounters some error conditions, such
  784. as a parsing error, or a redirection failure,
  785. it prints a message to standard error and raises
  786. an
  787. .I exception .
  788. In an interactive shell this is caught by the interactive
  789. command loop; in a script it will cause an exit with
  790. a false status, unless handled.
  791. .LP
  792. Exceptions can be handled and raised with the
  793. .CW rescue
  794. and
  795. .CW raise
  796. commands provided by
  797. .CW std .
  798. An exception has a short string associated with it.
  799. .P1
  800. raise error
  801. .P2
  802. will raise an exception named ``error''.
  803. .P1
  804. rescue error {echo an error has occurred} {
  805. command
  806. }
  807. .P2
  808. will execute
  809. .CW command
  810. and will, in the event that it raises an
  811. .CW error
  812. exception, print a diagnostic message.
  813. The name of the exception given to
  814. .CW rescue
  815. can end in an asterisk
  816. .CW * ), (
  817. which will match any exception starting with
  818. the preceding characters. The
  819. .CW *
  820. needs quoting to avoid being expanded as a wildcard
  821. by the shell.
  822. .P1
  823. rescue '*' {echo caught an exception $exception} {
  824. command
  825. }
  826. .P2
  827. will catch all exceptions raised by
  828. .CW command ,
  829. regardless of name.
  830. Within the handler block,
  831. .CW rescue
  832. sets the environment variable
  833. .CW $exception
  834. to the actual name of the exception caught.
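.LP
The matching rule used by
.CW rescue
can be sketched in Python, using a Python exception to stand in for a
shell exception. The class and function names are invented for the
illustration; the value returned by the handler stands in for whatever
the handler block does, and nothing here describes the real
implementation in
.CW std .

```python
class ShException(Exception):
    """Stands in for a shell exception, which carries a short string name."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


def pattern_matches(pattern, name):
    """A trailing * matches any name starting with the preceding characters."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def rescue(pattern, handler, block):
    """Run block; if it raises a matching exception, run handler with its name."""
    try:
        return block()
    except ShException as exc:
        if pattern_matches(pattern, exc.name):
            return handler(exc.name)   # the handler sees the name, like $exception
        raise                          # anything else carries on up to the caller


def failing_command():
    raise ShException("parse error")
```

An exception that does not match the pattern is left to propagate, which
is what allows handlers to be nested.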
  835. .LP
  836. Exceptions can be caught only within a single
  837. process \- if an exception is not caught, then
  838. the name of the exception becomes the
  839. exit status of the process.
  840. As
  841. .I sh
  842. starts a new process for commands with redirected
  843. I/O, this means that
  844. .P1
  845. raise error
  846. echo got here
  847. .P2
  848. behaves differently to:
  849. .P1
  850. raise error > /dev/null
  851. echo got here
  852. .P2
  853. The former prints nothing, while the latter
  854. prints ``got here''.
  855. .LP
  856. The exceptions
  857. .CW break
  858. and
  859. .CW continue
  860. are recognised by
  861. .CW std 's
  862. looping commands
  863. .CW for ,
  864. .CW while ,
  865. and
  866. .CW getlines .
  867. A
  868. .CW break
  869. exception causes the loop to terminate;
  870. a
  871. .CW continue
  872. exception causes the loop to continue
  873. as before. For example:
  874. .P1
  875. for i in * {
  876. if {~ $i 'r*'} {
  877. echo found $i
  878. raise break
  879. }
  880. }
  881. .P2
  882. will print the name of the first
  883. file beginning with ``r'' in the
  884. current directory.
  885. .SH
  886. Substitution builtins
  887. .LP
  888. In addition to normal commands, a loaded module
  889. can also define
  890. .I "substitution builtin"
  891. commands. These are different from normal commands
  892. in that they are executed as part of the argument
  893. gathering process of a command, and instead of
  894. returning an exit status, they yield a list of values
  895. to be used as arguments to a command. They
  896. can be thought of as a kind of `active environment variable',
  897. whose value is created every time it is referenced.
  898. For example, the
  899. .CW split
  900. substitution builtin defined by
  901. .CW std
  902. splits up a single argument into strings separated
  903. by characters in its first argument:
  904. .P1
  905. echo ${split e 'hello there'}
  906. .P2
  907. will print
  908. .P1
  909. h llo th r
  910. .P2
  911. Note that, unlike the conventional shell
  912. backquote operator, the result of the
  913. .CW $
  914. command is not re-interpreted, for example:
  915. .P1
  916. for i in ${split e 'hello there'} {
  917. echo arg $i
  918. }
  919. .P2
  920. will print
  921. .P1
  922. arg h
  923. arg llo th
  924. arg r
  925. .P2
  926. Substitution builtins can only be named
  927. as the initial command inside a dollar-referenced
  928. command block - they live in a different namespace
  929. from that of normal commands.
  930. For instance,
  931. .CW loaded
  932. and
  933. .CW ${loaded}
  934. are quite distinct: the former prints a list
  935. of all builtin names and their defining modules, whereas
  936. the latter yields a list of all the currently loaded
  937. modules.
  938. .LP
  939. .CW Std
  940. provides a number of useful commands
  941. in the form of substitution builtins.
  942. .CW ${join}
  943. is the complement of
  944. .CW ${split} :
  945. it joins together any elements in its argument list
  946. using its first argument as the separator, for example:
  947. .P1
  948. echo ${join . file tar gz}
  949. .P2
  950. will print:
  951. .P1
  952. file.tar.gz
  953. .P2
  954. The in-built shell operator
  955. \f5$"\fP
  956. is exactly equivalent to
  957. .CW ${join}
  958. with a space as its first argument.
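.LP
For readers who think more easily in another language, the two
operations can be modelled in Python as follows. The function names are
invented, and the sketch drops empty fields when splitting; that choice
is an assumption made to agree with the output shown above, not
something the text specifies.

```python
import re


def sh_split(separators, text):
    """Model of ${split}: break text at any character found in separators.

    ASSUMPTION: empty fields are discarded.
    """
    if not separators:
        return [text]
    parts = re.split("[" + re.escape(separators) + "]", text)
    return [part for part in parts if part != ""]


def sh_join(separator, words):
    """Model of ${join}: glue the words together with separator between them."""
    return separator.join(words)


pieces = sh_split("e", "hello there")
name = sh_join(".", ["file", "tar", "gz"])
spaced = sh_join(" ", ["a", "b", "c"])   # the job done by the $" operator
```

Note that the result of the split is a list of three words, one of which
contains a space; nothing re-splits it afterwards.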
  959. .LP
  960. List indexing is provided with
  961. .CW ${index} ,
  962. which given a numeric index and a list
  963. yields the
  964. .I index 'th
  965. item in the list (origin 1). For example:
  966. .P1
  967. echo ${index 4 one two three four five}
  968. .P2
  969. will print
  970. .P1
  971. four
  972. .P2
  973. A pair of substitution builtins with some of
  974. the most interesting uses are defined by
  975. the shell itself:
  976. .CW ${quote}
  977. packages its argument list into a single
  978. string in such a way that it can be later
  979. parsed by the shell and turned back into the same list.
  980. This entails quoting any items in the list
  981. that contain shell metacharacters, such as
  982. .CW ; ` '
  983. or
  984. .CW & '. `
  985. For example:
  986. .P1
  987. x='a;' 'b' 'c d' ''
  988. echo $x
  989. echo ${quote $x}
  990. .P2
  991. will print
  992. .P1
  993. a; b c d
  994. \&'a;' b 'c d' ''
  995. .P2
  996. Travel in the reverse direction is possible
  997. using
  998. .CW ${unquote} ,
  999. which takes a single string, as produced by
  1000. .CW ${quote} ,
  1001. and produces the original list again.
  1002. There are situations in
  1003. .I sh
  1004. where only a single string can be used, but
  1005. it is useful to be able to pass around the values
  1006. of arbitrary
  1007. .I sh
  1008. variables in this form;
  1009. .CW ${quote}
  1010. and
  1011. .CW ${unquote}
  1012. between them make this possible. For instance
  1013. the value of a
  1014. .I sh
  1015. list can be stored in a file and later retrieved
  1016. without loss. They are also useful to implement
  1017. various types of behaviour involving automatically
  1018. constructed shell scripts; see ``Lexical binding'', below,
  1019. for an example.
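.LP
The round trip can be modelled in a few lines of Python. The sketch
below assumes rc-style quoting, in which a word is wrapped in single
quotes and an embedded single quote is written twice; the set of
characters that forces quoting is a guess made for the illustration, and
the function names are invented. It is not the shell's own code.

```python
# ASSUMPTION: an illustrative set of characters that force quoting.
SPECIAL = set(" \t\n;&|^$`'\"{}()<>=#*?[]")


def quote_word(word):
    """Quote one word if it is empty or contains a special character."""
    if word == "" or any(ch in SPECIAL for ch in word):
        return "'" + word.replace("'", "''") + "'"
    return word


def quote(words):
    """Model of ${quote}: pack a list of words into a single string."""
    return " ".join(quote_word(word) for word in words)


def unquote(text):
    """Model of ${unquote}: recover the list of words from a quoted string."""
    words, i, n = [], 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        word = []
        while i < n and not text[i].isspace():
            if text[i] != "'":
                word.append(text[i])
                i += 1
                continue
            i += 1                                   # skip the opening quote
            while i < n:
                if text[i] == "'" and text[i + 1:i + 2] == "'":
                    word.append("'")                 # doubled quote: a literal one
                    i += 2
                elif text[i] == "'":
                    i += 1                           # closing quote
                    break
                else:
                    word.append(text[i])
                    i += 1
        words.append("".join(word))
    return words


x = ["a;", "b", "c d", ""]
packed = quote(x)
```

The property that matters is that unquoting a quoted list gives back
exactly the list that went in, empty words and all.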
  1020. .LP
  1021. Two more list manipulation commands provided
  1022. by
  1023. .CW std
  1024. are
  1025. .CW ${hd}
  1026. and
  1027. .CW ${tl} ,
  1028. which mirror their Limbo namesakes:
  1029. .CW ${hd}
  1030. returns the first element of a list,
  1031. .CW ${tl}
  1032. returns all but the first element of a list.
  1033. For example:
  1034. .P1
  1035. x=one two three four
  1036. echo ${hd $x}
  1037. echo ${tl $x}
  1038. .P2
  1039. will print:
  1040. .P1
  1041. one
  1042. two three four
  1043. .P2
  1044. Unlike their Limbo counterparts, they
  1045. do not complain if their argument list
  1046. is not long enough; they just yield a null list.
  1047. .LP
  1048. .CW Std
  1049. provides three other substitution builtins of
  1050. note.
  1051. .CW ${pid}
  1052. yields the process id of the current
  1053. process.
  1054. .CW ${pipe}
  1055. provides a somewhat more cumbersome equivalent of the
  1056. .CW >{}
  1057. and
  1058. .CW <{}
  1059. commands found in
  1060. .I rc ,
  1061. i.e. branching pipelines.
  1062. For example:
  1063. .P1
  1064. cmp ${pipe from {old}} ${pipe from {new}}
  1065. .P2
  1066. will regression-test a new version of a command.
  1067. Using
  1068. .CW ${pipe}
  1069. yields the name of a file in the namespace
  1070. which is a pipe to its argument command.
  1071. .LP
  1072. The substitution builtin
  1073. .CW ${parse}
  1074. is used to check shell syntax without actually
  1075. executing a command. The command:
  1076. .P1
  1077. x=${parse '{echo hello, world}'}
  1078. .P2
  1079. will return a parsed version of the string
  1080. .CW "echo hello, world" ''; ``
  1081. if an error occurs, then a
  1082. .CW "parse error"
  1083. exception will be raised.
  1084. .SH
  1085. Functions
  1086. .LP
  1087. Shell functions are a facility provided
  1088. by the
  1089. .CW std
  1090. shell module; they associate a command
  1091. name with some code to execute when
  1092. that command is named.
  1093. .P1
  1094. fn hello {
  1095. echo hello, world
  1096. }
  1097. .P2
  1098. defines a new command,
  1099. .CW hello ,
  1100. that prints a message when executed.
  1101. The command is passed arguments in the
  1102. usual way, for example:
  1103. .P1
  1104. fn removems {
  1105. for i in $* {
  1106. if {grep -s Microsoft $i} {
  1107. rm $i
  1108. }
  1109. }
  1110. }
  1111. removems *
  1112. .P2
  1113. will remove all files in the current directory
  1114. that contain the string ``Microsoft''.
  1115. .LP
  1116. The
  1117. .CW status
  1118. command provides a way to return an
  1119. arbitrary status from a function. It takes
  1120. a single argument \- its exit status
  1121. is the value of that argument. For instance:
  1122. .P1
  1123. fn false {
  1124. status false
  1125. }
  1126. fn true {
  1127. status ''
  1128. }
  1129. .P2
  1130. It is also possible to define new substitution builtins
  1131. with the command
  1132. .CW subfn :
  1133. the value of
  1134. .CW $result
  1135. at the end of the execution of the
  1136. command gives the value yielded.
  1137. For example:
  1138. .P1
  1139. subfn backwards {
  1140. for i in $* {
  1141. result=$i $result
  1142. }
  1143. }
  1144. echo ${backwards a b c 'd e'}
  1145. .P2
  1146. will reverse a list, producing:
  1147. .P1
  1148. d e c b a
  1149. .P2
  1150. .LP
  1151. The commands associated with shell functions
  1152. are stored as normal environment variables, and
  1153. so are exported to external commands in the usual
  1154. way.
  1155. .CW Fn
  1156. definitions are stored in environment variables
  1157. starting
  1158. .CW fn- ;
  1159. .CW subfn
  1160. definitions use environment variables starting
  1161. .CW sfn- .
  1162. It is useful to know this, as the shell core knows
  1163. nothing of these functions - they look just like
  1164. builtin commands defined by
  1165. .CW std ;
  1166. looking at the current definition of
  1167. .CW $fn-\fIname\fP
  1168. is the only way of finding out the body of code
  1169. associated with function
  1170. .I name .
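.LP
For instance, a minimal sketch that defines the
.CW hello
function from earlier and then inspects the variable
that holds its body:
.P1
fn hello {echo hello, world}
# the body is held in the variable fn-hello
echo $fn-hello
.P2
The second command should display the block associated with
.CW hello .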
  1171. .SH
  1172. Other loadable
  1173. .I sh
  1174. modules
  1175. .LP
  1176. In addition to
  1177. .CW std ,
  1178. and
  1179. .CW tk ,
  1180. which is mentioned later, there are
  1181. several loadable
  1182. .I sh
  1183. modules that extend
.I sh 's
  1185. functionality.
  1186. .LP
  1187. .CW Expr
  1188. provides a very simple stack-based calculator,
  1189. giving simple arithmetic capability to the shell.
  1190. For example:
  1191. .P1
  1192. load expr
  1193. echo ${expr 3 2 1 + x}
  1194. .P2
  1195. will print
  1196. .CW 9 .
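.LP
Operands are pushed onto the stack and each operator consumes
the values on top of it, so the expression above computes
3 x (2 + 1).
A further illustrative sketch, using only the same two operators:
.P1
load expr
# push 10, 2 and 3; add the top two; multiply the remaining pair
echo ${expr 10 2 3 + x}
.P2
Evaluated by hand, the addition leaves 10 and 5 on the stack,
and the multiplication then yields 50.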
  1197. .LP
  1198. .CW String
  1199. provides shell level access to the Limbo
  1200. string library routines. For example:
  1201. .P1
  1202. load string
  1203. echo ${tolower 'Hello, WORLD'}
  1204. .P2
  1205. will print
  1206. .P1
  1207. hello, world
  1208. .P2
  1209. .CW Regex
  1210. provides regular expression matching and
  1211. substitution operations. For instance:
  1212. .P1
  1213. load regex
  1214. if {! match '^[a-z0-9_]+$' $line} {
  1215. echo line contains invalid characters
  1216. }
  1217. .P2
  1218. .CW File2chan
  1219. provides a way for a shell script to create a
  1220. file in the namespace with properties
  1221. under its control. For instance:
  1222. .P1
  1223. load file2chan
  1224. (file2chan /chan/myfile
  1225. {echo read request from /chan/myfile}
  1226. {echo write request to /chan/myfile}
  1227. )
  1228. .P2
  1229. .CW Arg
  1230. provides support for the parsing of standard
  1231. Unix-style options.
  1232. .SH
  1233. .I Sh
  1234. and Inferno devices
  1235. .LP
  1236. Devices under Inferno are implemented as files,
  1237. and usually device interaction consists of simple
  1238. strings written or read from the device files.
  1239. This is a happy coincidence, as the two things
  1240. that
  1241. .I sh
  1242. does best are file manipulation and string manipulation.
  1243. This means that
  1244. .I sh
  1245. scripts can exploit the power of direct access to
devices without the need to write more long-winded
  1247. Limbo programs. You do not get the type checking
  1248. that Limbo gives you, and it is not quick, but for
  1249. knocking up quick prototypes, or ``wrapper scripts'',
  1250. it can be very useful.
  1251. .LP
  1252. Consider the way that Inferno implements network
  1253. access, for example. A file called
  1254. .CW /net/cs
  1255. implements DNS address translation. A string such as
  1256. .CW tcp!www.vitanuova.com!telnet
  1257. is written to
  1258. .CW /net/cs ;
  1259. the translated form of the address is then read
  1260. back, in the form of a (\fIfile\fP, \fItext\fP)
  1261. pair, where
  1262. .I file
  1263. is the name of a
  1264. .I clone
  1265. file in the
  1266. .CW /net
  1267. directory
  1268. (e.g.
  1269. .CW /net/tcp/clone ),
  1270. and
  1271. .I text
  1272. is a translated address as understood by the relevant
  1273. network (e.g.
  1274. .CW 194.217.172.25!23 ).
  1275. We can write a shell function that performs this
  1276. translation, returning a triple
  1277. (\fIdirectory\fP \fIclonefile\fP \fItext\fP):
  1278. .P1
  1279. subfn cs {
  1280. addr := $1
  1281. or {
  1282. <> /net/cs {
  1283. (if {echo -n $addr >[1=0]} {
  1284. (clone addr) := `{read 8192 0}
  1285. netdir := ${dirname $clone}
  1286. result=$netdir $clone $addr
  1287. } {
  1288. echo 'cs: cannot translate "' ^
  1289. $addr ^
  1290. '":' $status >[1=2]
  1291. status failed
  1292. }
  1293. )
  1294. }
  1295. } {raise 'cs failed'}
  1296. }
  1297. .P2
  1298. The code
  1299. .P1
  1300. <> /net/cs { \fR....\fP }
  1301. .P2
  1302. opens
  1303. .CW /net/cs
  1304. for reading and writing, on the standard input;
  1305. the code inside the braces can then read and
  1306. write it.
  1307. If the address translation fails, an error will
  1308. be generated on the write, so the
  1309. .CW echo
  1310. will fail - this is detected, and an appropriate exit status
  1311. set.
  1312. Being a substitution function, the only way that
  1313. .CW cs
  1314. can indicate an error is by raising an exception, but
  1315. exceptions do not propagate across processes
  1316. (a new process is created as a result of the redirection),
  1317. hence the need for the status check and the raised exception
  1318. on failure.
  1319. .LP
  1320. The external program
  1321. .CW read
  1322. is invoked to make a single read of the
  1323. result from
.CW /net/cs .
  1325. It takes a block size, and a read offset - it
  1326. is important to set this, as the initial write of the
  1327. address to
.CW /net/cs
  1329. will have advanced the file offset, and we will miss
  1330. a chunk of the returned address if we're not careful.
  1331. .LP
  1332. .CW Dirname
  1333. is a little shell function that uses one of the
  1334. .I string
  1335. builtin functions to get the directory name from
  1336. the pathname of the
  1337. .I clone
  1338. file. It looks like:
  1339. .P1
  1340. load string
  1341. subfn dirname {
  1342. result = ${hd ${splitr $1 /}}
  1343. }
  1344. .P2
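.LP
As an illustration of how
.CW cs
might be called, here is a small sketch; the variable names
are chosen for the example, and the address is the one used above:
.P1
# translate a symbolic address into (directory clonefile text)
(netdir clone addr) := ${cs tcp!www.vitanuova.com!telnet}
echo $netdir $clone $addr
.P2
If the translation fails, the
.CW "cs failed"
exception is raised and the assignment does not take place.
.LP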
  1345. Now we have an address translation function, we can
  1346. access the network interface directly. There are
  1347. three main operations possible with Inferno network
  1348. devices: connecting to a remote address, announcing
  1349. the availability of a local dial-in address, and listening
  1350. for an incoming connection on a previously announced
  1351. address. They are accessed in similar ways (see
  1352. .I ip (3)
  1353. for details):
  1354. .LP
  1355. The dial and announce operations require a new
  1356. .CW net
  1357. directory, which is created by reading the
  1358. clone file - this actually opens the
  1359. .CW ctl
  1360. file in a newly created net directory, representing
  1361. one end of a network connection. Reading a
  1362. .CW ctl
  1363. file yields the name of the new directory;
  1364. this enables an application to find the associated
  1365. .CW data
  1366. file; reads and writes to this file go to the
  1367. other end of the network connection.
  1368. The listen operation is similar, but the new
  1369. net directory is created by reading from an existing
  1370. directory's
  1371. .CW listen
  1372. file.
  1373. .LP
  1374. Here is a
  1375. .I sh
  1376. function that implements some behaviour common
  1377. to all three operations:
  1378. .P1
  1379. fn newnetcon {
  1380. (netdir constr datacmd) := $*
  1381. id := "{read 20 0}
  1382. or {~ $constr ''} {echo -n $constr >[1=0]} {
  1383. echo cannot $constr >[1=2]
  1384. raise failed
  1385. }
  1386. net := $netdir/^$id
  1387. $datacmd <> $net^/data
  1388. }
  1389. .P2
  1390. It takes the name of a network protocol directory
  1391. (e.g.
  1392. .CW /net/tcp ),
  1393. a possibly empty string to write into the control
  1394. file when the new directory id has been read,
  1395. and a command to be executed connected to
  1396. the newly opened
  1397. .CW data
  1398. file. The code is fairly straightforward: read
  1399. the name of a new directory from standard input
  1400. (we are assuming that the caller of
  1401. .CW newnetcon
  1402. sets up the standard input correctly); then
  1403. write the configuration string (if it is not empty),
  1404. raising an error if the write failed; then run the
  1405. command, attached to the
  1406. .CW data
  1407. file.
  1408. .LP
  1409. We set up the
  1410. .CW $net
  1411. environment variable so that
  1412. the running command knows its network
  1413. context, and can access other files in the
  1414. directory (the
  1415. .CW local
  1416. and
  1417. .CW remote
  1418. files, for example).
  1419. Given
  1420. .CW newnetcon ,
  1421. the implementation of
  1422. .CW dial ,
  1423. .CW announce ,
  1424. and
  1425. .CW listen
  1426. is quite easy:
  1427. .P1
  1428. fn announce {
  1429. (addr cmd) := $*
  1430. (netdir clone addr) := ${cs $addr}
  1431. newnetcon $netdir 'announce '^$addr $cmd <> $clone
  1432. }
  1433. fn dial {
  1434. (addr cmd) := $*
  1435. (netdir clone addr) := ${cs $addr}
  1436. newnetcon $netdir 'connect '^$addr $cmd <> $clone
  1437. }
  1438. fn listen {
  1439. newnetcon ${dirname $net} '' $1 <> $net/listen
  1440. }
  1441. .P2
  1442. .CW Dial
  1443. and
  1444. .CW announce
  1445. differ only in the string that is written to the control
  1446. file;
  1447. .CW listen
  1448. assumes it is being called in the context of
  1449. an
  1450. .CW announce
  1451. command, so can use the value
  1452. of
  1453. .CW $net
  1454. to open the
  1455. .CW listen
  1456. file to wait for incoming connections.
  1457. .LP
  1458. The upshot of these function definitions is that we
  1459. can make connections to, and announce, services
  1460. on the network. The code for a simple client might look like:
  1461. .P1
  1462. dial tcp!somewhere.com!5432 {
  1463. echo connected to `{cat $net/remote}
  1464. echo hello somewhere >[1=0]
  1465. }
  1466. .P2
  1467. A server might look like:
  1468. .P1
  1469. announce tcp!somewhere.com!5432 {
  1470. listen {
  1471. echo got connection from `{cat $net/remote}
  1472. cat
  1473. }
  1474. }
  1475. .P2
  1476. .SH
  1477. .I Sh
  1478. and the windowing environment
  1479. .LP
  1480. The main interface to the Inferno graphics and windowing
system is a textual one, based on Ousterhout's Tk,
  1482. where commands to manipulate the graphics inside
  1483. windows are strings using a uniform syntax not
  1484. a million miles away from the syntax of
  1485. .I sh .
  1486. (See section 9 of Volume 1 for details).
  1487. The
  1488. .CW tk
  1489. .I sh
  1490. module provides an interface to the Tk graphics
  1491. subsystem, providing not only graphics capabilities,
  1492. but also the channel communication on which
  1493. Inferno's Tk event mechanism is based.
  1494. .LP
  1495. The Tk module gives each window a unique
  1496. numeric id which is used to control that window.
  1497. .P1
  1498. load tk
  1499. wid := ${tk window 'My window'}
  1500. .P2
  1501. loads the tk module, creates a new window titled ``My window''
  1502. and assigns its unique identifier to the variable
  1503. .CW $wid .
  1504. Commands of the form
  1505. .CW "tk $wid"
  1506. .I tkcommand
  1507. can then be used to control graphics in the window.
  1508. When writing tk applets, it is helpful to get feedback
  1509. on errors that occur as tk commands are executed, so
  1510. here's a function that checks for errors, and minimises
  1511. the syntactic overhead of sending a Tk command:
  1512. .P1
  1513. fn x {
  1514. args := $*
  1515. or {tk $wid $args} {
  1516. echo error on tk cmd $"args':' $status
  1517. }
  1518. }
  1519. .P2
  1520. It assumes that
  1521. .CW $wid
  1522. has already been set.
  1523. Using
  1524. .CW x ,
  1525. we could create a button in our new window:
  1526. .P1
  1527. x button .b -text {A button}
  1528. x pack .b -side top
  1529. x update
  1530. .P2
  1531. Note that the nice coincidence of the quoting rules
  1532. of
  1533. .I sh
and tk means that the unquoted
  1535. .I sh
  1536. command block argument to the
  1537. .CW button
  1538. command gets through to tk unchanged,
  1539. there to become quoted text.
  1540. .LP
  1541. Once we've got a button, we want to know when
  1542. it has been pressed. Inferno Tk sends events
  1543. through Limbo channels, so the Tk module provides
  1544. access to simple string channels. A channel is
  1545. created with the
  1546. .CW chan
  1547. command.
  1548. .P1
  1549. chan event
  1550. .P2
  1551. creates a channel named
  1552. .CW event .
  1553. A
  1554. .CW send
  1555. command takes a string to send down the channel,
  1556. and the
  1557. .CW ${recv}
  1558. builtin yields a received value. Both operations
  1559. block until the transfer of data can proceed \- as with
  1560. Limbo channels, the operation is synchronous. For example:
  1561. .P1
  1562. send event 'hello, world' &
  1563. echo ${recv event}
  1564. .P2
  1565. will print ``hello, world''. Note that the send
  1566. and receive operations must execute in different
  1567. processes, hence the use of the
  1568. .CW &
  1569. backgrounding operator.
  1570. Although for implementation reasons they are
  1571. part of the Tk module, these channel operations
  1572. are potentially useful in non-graphical scripts \-
  1573. they will still work fine if there's no graphics context.
  1574. .LP
  1575. The
  1576. .CW "tk namechan"
  1577. command makes a channel known to Tk.
  1578. .P1
  1579. tk namechan $wid event
  1580. .P2
  1581. Then we can get events from Tk:
  1582. .P1
  1583. x .b configure -command {send event buttonpressed}
  1584. while {} {echo ${recv event}} &
  1585. .P2
  1586. This starts a background process that prints a message
  1587. each time the button is pressed.
  1588. Interaction with the window manager is handled in
  1589. a similar way. When a window is created, it is automatically
  1590. associated with a channel of the same name as the window id.
  1591. Strings arriving on this are window manager events, such as
  1592. .CW resize
  1593. and
  1594. .CW move .
  1595. These can be interpreted if desired, or forwarded back
  1596. to the window manager for default handling with
  1597. .CW "tk winctl" .
  1598. The following is a useful idiom that does all the usual
  1599. event handling on a window:
  1600. .P1
  1601. while {} {tk winctl $wid ${recv $wid}} &
  1602. .P2
  1603. One thing worth knowing is that the default
  1604. .CW exit
  1605. action (i.e. when the user closes the window) is
  1606. to kill all processes in the current process group, so
  1607. in a script that creates windows,
  1608. it is usual to fork the process group with
.CW "pctl newpgrp"
  1610. early on, otherwise
  1611. it can end up killing the shell window that spawned it.
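.LP
Putting these pieces together, the start of a windowing script
might be sketched as follows (the window title is arbitrary,
and
.CW std
is assumed to be loaded already):
.P1
load tk
# fork the process group so that closing the window
# does not kill the shell that started the script
pctl newpgrp
wid := ${tk window 'Demo'}
# hand window manager events back for default handling
while {} {tk winctl $wid ${recv $wid}} &
.P2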
  1612. .SH
  1613. An example
  1614. .LP
By way of an example, I'll present a function that implements
  1616. a simple network chat facility, allowing two people on the
  1617. network to send text messages to one another, making use
  1618. of the network functions described earlier.
  1619. .LP
  1620. The core is a function called
  1621. .CW chat
  1622. which assumes that its standard input has
  1623. been directed to an active network connection; it creates a
  1624. window containing an entry widget and a text widget. Any text
  1625. entered into the entry widget is sent to the other end
  1626. of the connection; lines of text arriving from
  1627. the network are appended to the text widget.
  1628. .LP
  1629. The first part of the function creates the window,
  1630. forks the process group, runs the window controller
  1631. and creates the widgets inside the window:
  1632. .P1
  1633. fn chat {
  1634. load tk
  1635. pctl newpgrp
  1636. wid := ${tk window 'Chat'}
  1637. nl := '
  1638. \&' # newline
  1639. while {} {tk winctl $wid ${recv $wid}} &
  1640. x entry .e
  1641. x frame .f
  1642. x scrollbar .f.s -orient vertical -command {.f.t yview}
  1643. x text .f.t -yscrollcommand {.f.s set}
  1644. x pack .f.s -side left -fill y
  1645. x pack .f.t -side top -fill both -expand 1
  1646. x pack .f -side top -fill both -expand 1
  1647. x pack .e -side top -fill x
  1648. x pack propagate . 0
  1649. x bind .e '<Key-'^$nl^'>' {send event enter}
  1650. x update
  1651. chan event
  1652. tk namechan $wid event event
  1653. .P2
  1654. The middle part of
  1655. .CW chat
  1656. loops in the background getting text entered
  1657. by the user and sending it across the network
  1658. (also putting a copy in the local text widget
so that you can see what you have sent):
  1660. .P1
  1661. while {} {
  1662. {} ${recv event}
  1663. txt := ${tk $wid .e get}
  1664. echo $txt >[1=0]
  1665. x .f.t insert end '''me: '^$txt^$nl
  1666. x .e delete 0 end
  1667. x .f.t see end
  1668. x update
  1669. } &
  1670. .P2
  1671. Note the null command on the second line,
  1672. used to wait for the receive event without
  1673. having to deal with the value (there's only
  1674. one event that can arrive on the channel, and
  1675. we know what it is).
  1676. .LP
  1677. The final piece of
  1678. .CW chat
  1679. gets lines from the network and puts them
  1680. in the text widget. The loop will terminate when
  1681. the connection is dropped by the other party, whereupon
the window closes and the chat is finished:
  1683. .P1
  1684. getlines {
  1685. x .f.t insert end '''you: '^$line^$nl
  1686. x .f.t see end
  1687. x update
  1688. }
  1689. tk winctl $wid exit
  1690. }
  1691. .P2
  1692. Now we can wrap up the network functions and the
  1693. chat function in a shell script, to finish off the little demo:
  1694. .P1
  1695. #!/dis/sh
  1696. .I "Include the earlier function definitions here."
  1697. fn usage {
  1698. echo 'usage: chat [-s] address' >[1=2]
  1699. raise usage
  1700. }
  1701. args=$*
  1702. or {~ $#args 1 2} {usage}
  1703. (addr args) := $*
  1704. if {~ $addr -s} {
  1705. # server
  1706. or {~ $#args 1} {usage}
  1707. (addr nil) := $args
  1708. announce $addr {
  1709. echo announced on `{cat $net/local}
  1710. while {} {
  1711. net := $net
  1712. listen {
  1713. echo got connection from `{cat $net/remote}
  1714. chat &
  1715. }
  1716. }
  1717. }
  1718. } {
  1719. or {~ $#args 0} {usage}
  1720. # client
  1721. dial $addr {
  1722. echo made connection
  1723. chat
  1724. }
  1725. }
  1726. .P2
  1727. If this is placed in an executable script file
  1728. named
  1729. .CW chat ,
  1730. then
  1731. .P1
  1732. chat -s tcp!mymachine.com!5432
  1733. .P2
  1734. would announce a chat server using tcp
  1735. on
  1736. .CW mymachine.com
  1737. (the local machine)
  1738. on port 5432.
  1739. .P1
  1740. chat tcp!mymachine.com!5432
  1741. .P2
  1742. would make a connection to
  1743. the previous server; they would both pop
  1744. up windows and allow text to be typed in from
  1745. either end.
  1746. .SH
  1747. Lexical binding
  1748. .LP
  1749. One potential problem with all this passing around
  1750. of fragments of shell script is the scope of names.
  1751. This piece of code:
  1752. .P1
  1753. fn runit {x := Two; $*}
  1754. x := One
  1755. runit {echo $x}
  1756. .P2
  1757. will print ``Two'', which is quite likely to confound the
  1758. expectations of the person writing the script if they
  1759. did not know that
  1760. .CW runit
  1761. set the value of
  1762. .CW $x
  1763. before calling its argument script.
  1764. Some functional languages (and the
  1765. .I es
  1766. shell) implement
  1767. .I "lexical binding"
  1768. to get around this problem. The idea
  1769. is to derive a new script from the old
  1770. one with all the necessary variables bound to
  1771. their current values, regardless of the context in which
  1772. the script is later called.
  1773. .LP
  1774. .I Sh
  1775. does not provide any explicit support for
  1776. this operation; however it is possible to fake
  1777. up a reasonably passable job.
  1778. Recall that blocks can be treated as strings if necessary,
  1779. and that
  1780. .CW ${quote}
  1781. allows the bundling of lists in such a way that they
  1782. can later be extracted again without loss. These two
  1783. features allow the writing of the following
  1784. .CW let
  1785. function (I have omitted argument checking code here and
  1786. in later code for the sake of brevity):
  1787. .P1
  1788. subfn let {
  1789. # usage: let cmd var...
  1790. (let_cmd let_vars) := $*
  1791. if {~ $#let_cmd 0} {
  1792. echo 'usage: let {cmd} var...' >[1=2]
  1793. raise usage
  1794. }
  1795. let_prefix := ''
  1796. for let_i in $let_vars {
  1797. let_prefix = $let_prefix ^
  1798. ${quote $let_i}^':='^${quote $$let_i}^';'
  1799. }
  1800. result=${parse '{'^$let_prefix^$let_cmd^' $*}'}
  1801. }
  1802. .P2
  1803. .CW Let
  1804. takes a block of code, and the names of environment variables
  1805. to bind onto it; it returns the resulting new block of code.
  1806. For example:
  1807. .P1
  1808. fn runit {x := hello, world; $*}
  1809. x := a 'b c d' 'e'
  1810. runit ${let {echo $x} x}
  1811. .P2
  1812. will print:
  1813. .P1
  1814. a b c d e
  1815. .P2
  1816. Looking at the code it produces is perhaps more
  1817. enlightening than examining the function definition:
  1818. .P1
  1819. x=a 'b c d' 'e'
  1820. echo ${let {echo $x} x}
  1821. .P2
  1822. produces
  1823. .P1
  1824. {x:=a 'b c d' e;{echo $x} $*}
  1825. .P2
  1826. .CW Let
has bundled up the value of the bound variable,
stuck it onto the beginning of the code block
  1829. and surrounded the whole thing in braces.
  1830. It makes sure that it has valid syntax by using
  1831. .CW ${parse} ,
  1832. and it ensures that the correct arguments are
  1833. passed to the script by passing it
  1834. .CW $* .
  1835. .LP
  1836. Note that all the variable names used inside the
  1837. body of
  1838. .CW let
  1839. are prefixed with
  1840. .CW let_ .
  1841. This is to try to reduce the likelihood that someone
  1842. will want to lexically bind to a variable of a name used
  1843. inside
  1844. .CW let .
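.LP
Several variables can be bound at once.
A small sketch, using hypothetical variables
.CW a
and
.CW b
and assuming the definition of
.CW let
given above:
.P1
fn runit {a := x; b := y; $*}
a := 1
b := 2
# bind the current values of a and b onto the block
runit ${let {echo $a $b} a b}
.P2
Following that definition, the block handed to
.CW runit
begins by assigning the saved values, so this should print
.CW "1 2"
whatever
.CW runit
sets locally.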
  1845. .SH
  1846. The module interface
  1847. .PP
  1848. It is not within the scope of this paper to discuss in
  1849. detail the public module interface to the shell, but
  1850. it is probably worth mentioning some of the other
  1851. benefits that
  1852. .I sh
  1853. derives from living within Inferno.
  1854. .PP
  1855. Unlike shells in conventional systems, where
  1856. the shell is a standalone program, accessible
  1857. only through
  1858. .CW exec() ,
  1859. in Inferno,
  1860. .I sh
  1861. presents a module interface that allows programs
  1862. to gain lower level access to the primitives provided
  1863. by the shell. For example, Inferno programs can make use of
  1864. the shell syntax parsing directly, so
  1865. a shell command in a configuration script might be
  1866. checked for correctness before running it,
or parsed once in advance to avoid repeated parsing overhead when running
  1868. a shell command within a loop.
  1869. .PP
  1870. More importantly, as long as it implements a superset
  1871. of the
  1872. .CW Shellbuiltin
  1873. interface, an application can
  1874. load
  1875. .I itself
  1876. into the shell as a module, and define builtin commands
  1877. that directly access functionality that it can provide.
  1878. .PP
  1879. This can, with minimum effort, provide an application
  1880. with a programmable interface to its primitives.
  1881. I have modified the Inferno window manager
  1882. .CW wm ,
  1883. for example, so that instead of using a custom, fairly limited
  1884. format file, its configuration file is just
  1885. a shell script.
  1886. .CW Wm
  1887. loads itself into the shell,
  1888. defines a new builtin command
  1889. .CW menu
  1890. to create items in
  1891. its main menu, and runs a shell script.
  1892. The shell script has the freedom to customise
  1893. menu entries dynamically, to run arbitrary programs,
  1894. and even to publicise this interface to
  1895. .CW wm
  1896. by creating a file with
  1897. .CW file2chan
  1898. and interpreting writes to the file as calls
  1899. to the
  1900. .CW menu
  1901. command:
  1902. .P1
  1903. file2chan /chan/wmmenu {} {menu ${unquote ${rget data}}}
  1904. .P2
  1905. A corresponding
  1906. .CW wmmenu
  1907. shell function might be written to provide access to
  1908. the functionality:
  1909. .P1
  1910. fn wmmenu {
  1911. echo ${quote $*} > /chan/wmmenu
  1912. }
  1913. .P2
  1914. Inferno has blurred the boundaries between
  1915. application and library and
  1916. .I sh
  1917. exploits this \- the possibilities have only just begun
  1918. to be explored.
  1919. .SH
  1920. Discussion
  1921. .LP
  1922. Although it is a newly written shell, the use of tried
  1923. and tested semantics means that most of the
  1924. normal shell functionality works quite smoothly.
  1925. The separation between normal commands and
  1926. substitution builtins is arguable, but I think justifiable.
  1927. The distinction between the two classes of command
  1928. means that there is less awkwardness in the transition between
  1929. ordinary commands and internally implemented commands:
  1930. both return the same kind of thing. A normal command's
  1931. return value remains essentially a simple true/false status,
  1932. whereas the new substitution builtins are returning a list
  1933. with no real distinction between true and false.
  1934. .LP
  1935. I believe that the decision to keep as much functionality as
  1936. possible out
  1937. of the core shell has paid off. Allowing command blocks
  1938. as values enables external modules to define new
  1939. control-flow primitives, which in turn means that
  1940. the core shell can be kept reasonably static,
  1941. while the design of the shell modules evolves
  1942. independently. There is a syntactic price
  1943. to pay for this generality, but I think it is worth it!
  1944. .LP
  1945. There are some aspects to the design that I do not
  1946. find entirely satisfactory. It is strange, given the
  1947. throwaway and non-explicit use of subprocesses
  1948. in the shell, that exceptions do not propagate
  1949. between processes. The model is Limbo's, but
  1950. I am not sure it works perfectly for
  1951. .I sh .
  1952. I feel there should probably be some difference
  1953. between:
  1954. .P1
  1955. raise error > /dev/null
  1956. .P2
  1957. and
  1958. .P1
  1959. status error > /dev/null
  1960. .P2
  1961. The shared nature of loaded modules can cause
  1962. problems; unlike environment variables, which
  1963. are copied for asynchronously running processes,
  1964. the module instances for an asynchronously running
  1965. process remain the same. This means that a
  1966. module such as
  1967. .CW tk
  1968. must maintain mutual exclusion locks to
  1969. protect access to its data structures. This
  1970. could be solved if Limbo had some kind of polymorphic
  1971. type that enabled the shell to hold some data on
  1972. a module's behalf \- it could ask the module
  1973. to copy it when necessary.
  1974. .LP
  1975. One thing that is lost going from Limbo to
  1976. .I sh
  1977. when using the
  1978. .CW tk
  1979. module is the usual reference-counted garbage collection
  1980. of windows. Because a shell-script holds not
  1981. a direct handle on the window, but only a string
  1982. that indirectly refers to a handle held inside
  1983. the
  1984. .CW tk
  1985. module, there is no way for the system to
  1986. know when the window is no longer referred to,
  1987. so, as long as a
  1988. .CW tk
  1989. module is loaded, its windows must be
  1990. explicitly deleted.
  1991. .LP
  1992. The names defined by loaded modules will
  1993. become an issue if
  1994. loaded modules proliferate. It is not easy
  1995. to ensure that a command that you are executing
  1996. is defined by the module you think it is, given name clashes
between modules. I have been considering some
  1998. kind of scheme that would allow discrimination
  1999. between modules, but for the moment, the point
  2000. is moot \- there are no module name clashes, and
  2001. I hope that that will remain the case.
  2002. .SH
  2003. Credits
  2004. .LP
  2005. .I Sh
  2006. is almost entirely an amalgam of other people's
  2007. ideas that I have been fortunate enough to
  2008. encounter over the years. I hope they will forgive
  2009. me for the corruption I've applied...
  2010. .LP
  2011. I have been a happy user of a version of Tom Duff's
  2012. .I rc
  2013. for ten years or so; without
  2014. .I rc ,
  2015. this shell would not exist in anything like its present form.
  2016. Thanks, Tom.
  2017. .LP
  2018. It was Byron Rakitzis's UNIX version of
  2019. .I rc
  2020. that I was using for most of those ten years; it was his
  2021. version of the grammar that eventually became
  2022. .I sh 's
  2023. grammar, and the name of my
  2024. .CW glom()
  2025. function came straight from his
  2026. .I rc
  2027. source.
  2028. .LP
  2029. From Paul Haahr's
  2030. .I es ,
a descendant of Byron's
  2032. .I rc ,
  2033. and the shell that probably holds the most in common
  2034. with
  2035. .I sh ,
  2036. I stole the ``blocks as values'' idea;
  2037. the way that blocks transform into strings
  2038. and vice versa is completely
  2039. .I es 's.
  2040. The syntax of the
  2041. .CW if
  2042. command also comes directly from
  2043. .I es .
  2044. .LP
  2045. From Bruce Ellis's
  2046. .I mash ,
  2047. the other programmable shell for Inferno,
  2048. I took the
  2049. .CW load
  2050. command, the
  2051. \f5"{}\fP
  2052. syntax and the
  2053. .CW <>
  2054. redirection operator.
  2055. .LP
  2056. Last, but by no means least, S. R. Bourne,
  2057. the author of the original
  2058. .I sh ,
  2059. the granddaddy of this
  2060. .I sh ,
  2061. is indirectly responsible for all these shells.
  2062. That so much has remained unchanged from
  2063. then is a testament to the power of his original
  2064. vision.