.TL
How to Use the Plan 9 C Compiler
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Introduction
.PP
The C compiler on Plan 9 is a wholly new program; in fact
it was the first piece of software written for what would
eventually become Plan 9 from Bell Labs.
Programmers familiar with existing C compilers will find
a number of differences in both the language the Plan 9 compiler
accepts and in how the compiler is used.
.PP
The compiler is really a set of compilers, one for each
architecture \(em MIPS, SPARC, Motorola 68020, Intel 386, etc. \(em
that accept a dialect of ANSI C and efficiently produce
fairly good code for the target machine.
There is a packaging of the compiler that accepts strict ANSI C for
a POSIX environment, but this document focuses on the
native Plan 9 environment, that in which all the system source and
almost all the utilities are written.
.SH
Source
.PP
The language accepted by the compilers is the core ANSI C language
with some modest extensions,
a greatly simplified preprocessor,
a smaller library that includes system calls and related facilities,
and a completely different structure for include files.
.PP
Official ANSI C accepts the old (K&R) style of declarations for
functions; the Plan 9 compilers
are more demanding.
Without an explicit command-line flag
.CW -B ) (
whose use is discouraged, the compilers insist
on new-style function declarations, that is, prototypes for
function arguments.
The function declarations in the libraries' include files are
all in the new style so the interfaces are checked at compile time.
For C programmers who have not yet switched to function prototypes
the clumsy syntax may seem repellent but the payoff in stronger typing
is substantial.
Those who wish to import existing software to Plan 9 are urged
to use the opportunity to update their code.
.PP
The compilers include an integrated preprocessor that accepts the familiar
.CW #include ,
.CW #define
for macros both with and without arguments,
.CW #undef ,
.CW #line ,
.CW #ifdef ,
.CW #ifndef ,
and
.CW #endif .
It
supports neither
.CW #if
nor
.CW ## ,
although it does
honor a few
.CW #pragmas .
The
.CW #if
directive was omitted because it greatly complicates the
preprocessor, is never necessary, and is usually abused.
Conditional compilation in general makes code hard to understand;
the Plan 9 source uses it sparingly.
Also, because the compilers remove dead code, regular
.CW if
statements with constant conditions are more readable equivalents to many
.CW #ifs .
To compile imported code ineluctably fouled by
.CW #if
there is a separate command,
.CW /bin/cpp ,
that implements the complete ANSI C preprocessor specification.
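.PP
As a sketch of the remark about constant conditions, the fragment below uses an ordinary
.CW if
on a constant where other code might reach for
.CW #ifdef .
It is written in portable C rather than the Plan 9 dialect so it can be tried anywhere, and the names
.CW Debug
and
.CW buildtag
are invented for illustration; they do not come from the Plan 9 source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical build-time switch: an enum constant, not a macro. */
enum { Debug = 0 };

/*
 * Copies a word describing the build into buf (which must hold
 * at least 11 bytes) and returns the length of that word.
 */
int
buildtag(char *buf)
{
	assert(buf != 0);
	if(Debug){
		/*
		 * The condition is a constant 0, so a compiler that
		 * removes dead code can drop this branch entirely.
		 * Unlike text hidden by #ifdef, it is still parsed
		 * and type-checked on every build.
		 */
		strcpy(buf, "debug");
	}else
		strcpy(buf, "production");
	return (int)strlen(buf);
}
```

Changing the constant to 1 selects the other branch; both branches must always compile, which is much of the readability argument.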
.PP
Include files fall into two groups: machine-dependent and machine-independent.
The machine-independent files occupy the directory
.CW /sys/include ;
the others are placed in a directory appropriate to the machine, such as
.CW /mips/include .
The compiler searches for include files
first in the machine-dependent directory and then
in the machine-independent directory.
At the time of writing there are thirty-one machine-independent include
files and two (per machine) machine-dependent ones:
.CW <ureg.h>
and
.CW <u.h> .
The first describes the layout of registers on the system stack,
for use by the debugger.
The second defines some
architecture-dependent types such as
.CW jmp_buf
for
.CW setjmp
and the
.CW va_arg
and
.CW va_list
macros for handling arguments to variadic functions,
as well as a set of
.CW typedef
abbreviations for
.CW unsigned
.CW short
and so on.
.PP
Here is an excerpt from
.CW /68020/include/u.h :
.P1
#define nil ((void*)0)
typedef unsigned short ushort;
typedef unsigned char uchar;
typedef unsigned long ulong;
typedef unsigned int uint;
typedef signed char schar;
typedef long long vlong;
typedef long jmp_buf[2];
#define JMPBUFSP 0
#define JMPBUFPC 1
#define JMPBUFDPC 0
.P2
Plan 9 programs use
.CW nil
for the name of the zero-valued pointer.
The type
.CW vlong
is the largest integer type available; on most architectures it
is a 64-bit value.
A couple of other types in
.CW <u.h>
are
.CW u32int ,
which is guaranteed to have exactly 32 bits (a possibility on all the supported architectures) and
.CW mpdigit ,
which is used by the multiprecision math package
.CW <mp.h> .
The
.CW #define
constants permit an architecture-independent (but compiler-dependent)
implementation of stack-switching using
.CW setjmp
and
.CW longjmp .
.PP
Every Plan 9 C program begins
.P1
#include <u.h>
.P2
because all the other installed header files use the
.CW typedefs
declared in
.CW <u.h> .
.PP
In strict ANSI C, include files are grouped to collect related functions
in a single file: one for string functions, one for memory functions,
one for I/O, and none for system calls.
Each include file is protected by an
.CW #ifdef
to guarantee its contents are seen by the compiler only once.
Plan 9 takes a different approach. Other than a few include
files that define external formats such as archives, the files in
.CW /sys/include
correspond to
.I libraries.
If a program is using a library, it includes the corresponding header.
The default C library comprises string functions, memory functions, and
so on, largely as in ANSI C, some formatted I/O routines,
plus all the system calls and related functions.
To use these functions, one must
.CW #include
the file
.CW <libc.h> ,
which in turn must follow
.CW <u.h> ,
to define their prototypes for the compiler.
Here is the complete source to the traditional first C program:
.P1
#include <u.h>
#include <libc.h>
void
main(void)
{
	print("hello world\en");
	exits(0);
}
.P2
The
.CW print
routine and its relatives
.CW fprint
and
.CW sprint
resemble the similarly-named functions in Standard I/O but are not
attached to a specific I/O library.
In Plan 9
.CW main
is not integer-valued; it should call
.CW exits ,
which takes a string argument (or null; here ANSI C converts the 0 to a
.CW char* ).
All these functions are, of course, documented in the Programmer's Manual.
.PP
To use
.CW printf ,
.CW <stdio.h>
must be included to define the function prototype for
.CW printf :
.P1
#include <u.h>
#include <libc.h>
#include <stdio.h>
void
main(int argc, char *argv[])
{
	printf("%s: hello world; argc = %d\en", argv[0], argc);
	exits(0);
}
.P2
In practice, Standard I/O is not used much in Plan 9. I/O libraries are
discussed in a later section of this document.
.PP
There are libraries for handling regular expressions, raster graphics,
windows, and so on, and each has an associated include file.
The manual for each library states which include files are needed.
The files are not protected against multiple inclusion and themselves
contain no nested
.CW #includes .
Instead the
programmer is expected to sort out the requirements
and to
.CW #include
the necessary files once at the top of each source file. In practice this is
trivial: this way of handling include files is so straightforward
that it is rare for a source file to contain more than half a dozen
.CW #includes .
.PP
The compilers do their own register allocation so the
.CW register
keyword is ignored.
For different reasons,
.CW volatile
and
.CW const
are also ignored.
.PP
To make it easier to share code with other systems, Plan 9 has a version
of the compiler,
.CW pcc ,
that provides the standard ANSI C preprocessor, headers, and libraries
with POSIX extensions.
.CW Pcc
is recommended only
when broad external portability is mandated. It compiles slower,
produces slower code (it takes extra work to simulate POSIX on Plan 9),
eliminates those parts of the Plan 9 interface
not related to POSIX, and illustrates the clumsiness of an environment
designed by committee.
.CW Pcc
is described in more detail in
.I
APE\(emThe ANSI/POSIX Environment,
.R
by Howard Trickey.
.SH
Process
.PP
Each CPU architecture supported by Plan 9 is identified by a single,
arbitrary, alphanumeric character:
.CW k
for SPARC,
.CW q
for Motorola Power PC 630 and 640,
.CW v
for MIPS,
.CW 1
for Motorola 68000,
.CW 2
for Motorola 68020 and 68040,
.CW 5
for Acorn ARM 7500,
.CW 6
for Intel 960,
.CW 7
for DEC Alpha,
.CW 8
for Intel 386, and
.CW 9
for AMD 29000.
The character labels the support tools and files for that architecture.
For instance, for the 68020 the compiler is
.CW 2c ,
the assembler is
.CW 2a ,
the link editor/loader is
.CW 2l ,
the object files are suffixed
.CW \&.2 ,
and the default name for an executable file is
.CW 2.out .
Before we can use the compiler we therefore need to know which
machine we are compiling for.
The next section explains how this decision is made; for the moment
assume we are building 68020 binaries and make the mental substitution for
.CW 2
appropriate to the machine you are actually using.
.PP
To convert source to an executable binary is a two-step process.
First run the compiler,
.CW 2c ,
on the source, say
.CW file.c ,
to generate an object file
.CW file.2 .
Then run the loader,
.CW 2l ,
to generate an executable
.CW 2.out
that may be run (on a 680X0 machine):
.P1
2c file.c
2l file.2
2.out
.P2
The loader automatically links with whatever libraries the program
needs, usually including the standard C library as defined by
.CW <libc.h> .
Of course the compiler and loader have lots of options, both familiar and new;
see the manual for details.
The compiler does not generate an executable automatically;
the output of the compiler must be given to the loader.
Since most compilation is done under the control of
.CW mk
(see below), this is rarely an inconvenience.
.PP
The distribution of work between the compiler and loader is unusual.
The compiler integrates preprocessing, parsing, register allocation,
code generation and some assembly.
Combining these tasks in a single program is part of the reason for
the compiler's efficiency.
The loader does instruction selection, branch folding,
instruction scheduling,
and writes the final executable.
There is no separate C preprocessor and no assembler in the usual pipeline.
Instead the intermediate object file
(here a
.CW \&.2
file) is a type of binary assembly language.
The instructions in the intermediate format are not exactly those in
the machine. For example, on the 68020 the object file may specify
a MOVE instruction but the loader will decide just which variant of
the MOVE instruction \(em MOVE immediate, MOVE quick, MOVE address,
etc. \(em is most efficient.
.PP
The assembler,
.CW 2a ,
is just a translator between the textual and binary
representations of the object file format.
It is not an assembler in the traditional sense. It has limited
macro capabilities (the same as the integral C preprocessor in the compiler),
clumsy syntax, and minimal error checking. For instance, the assembler
will accept an instruction (such as memory-to-memory MOVE on the MIPS) that the
machine does not actually support; only when the output of the assembler
is passed to the loader will the error be discovered.
The assembler is intended only for writing things that need access to instructions
invisible from C,
such as the machine-dependent
part of an operating system;
very little code in Plan 9 is in assembly language.
.PP
The compilers take an option
.CW -S
that causes them to print on their standard output the generated code
in a format acceptable as input to the assemblers.
This is of course merely a formatting of the
data in the object file; therefore the assembler is just
an
ASCII-to-binary converter for this format.
Other than the specific instructions, the input to the assemblers
is largely architecture-independent; see
``A Manual for the Plan 9 Assembler'',
by Rob Pike,
for more information.
.PP
The loader is an integral part of the compilation process.
Each library header file contains a
.CW #pragma
that tells the loader the name of the associated archive; it is
not necessary to tell the loader which libraries a program uses.
The C run-time startup is found, by default, in the C library.
The loader starts with an undefined
symbol,
.CW _main ,
that is resolved by pulling in the run-time startup code from the library.
(The loader undefines
.CW _mainp
when profiling is enabled, to force loading of the profiling start-up
instead.)
.PP
Unlike its counterpart on other systems, the Plan 9 loader rearranges
data to optimize access. This means the order of variables in the
loaded program is unrelated to its order in the source.
Most programs don't care, but some assume that, for example, the
variables declared by
.P1
int a;
int b;
.P2
will appear at adjacent addresses in memory. On Plan 9, they won't.
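.PP
A program that really needs two values side by side can say so in the type instead of relying on declaration order.
The following sketch, in portable C with the invented names
.CW pair
and
.CW pairsum ,
assumes nothing beyond the language's own rule that the elements of an array are contiguous; it is an illustration, not code from the Plan 9 source.

```c
#include <assert.h>

/* Two separate globals: nothing promises they are neighbours. */
int a;
int b;

/* Array elements are contiguous wherever the loader puts the array. */
int pair[2];

/* Sums the two elements by indexing from a pointer to the first. */
int
pairsum(void)
{
	int *p = pair;

	assert(p + 1 == &pair[1]);	/* holds for any array */
	return p[0] + p[1];
}
```

The same reasoning applies to the members of a single structure, whose order is fixed by the declaration, although padding may separate them.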
.SH
Heterogeneity
.PP
When the system starts or a user logs in the environment is configured
so the appropriate binaries are available in
.CW /bin .
The configuration process is controlled by an environment variable,
.CW $cputype ,
with a value such as
.CW mips ,
.CW 68020 ,
.CW 386 ,
or
.CW sparc .
For each architecture there is a directory in the root,
with the appropriate name,
that holds the binary and library files for that architecture.
Thus
.CW /mips/lib
contains the object code libraries for MIPS programs,
.CW /mips/include
holds MIPS-specific include files, and
.CW /mips/bin
has the MIPS binaries.
These binaries are attached to
.CW /bin
at boot time by binding
.CW /$cputype/bin
to
.CW /bin ,
so
.CW /bin
always contains the correct files.
.PP
The MIPS compiler,
.CW vc ,
by definition
produces object files for the MIPS architecture,
regardless of the architecture of the machine on which the compiler is running.
There is a version of
.CW vc
compiled for each architecture:
.CW /mips/bin/vc ,
.CW /68020/bin/vc ,
.CW /sparc/bin/vc ,
and so on,
each capable of producing MIPS object files regardless of the native
instruction set.
If one is running on a SPARC,
.CW /sparc/bin/vc
will compile programs for the MIPS;
if one is running on machine
.CW $cputype ,
.CW /$cputype/bin/vc
will compile programs for the MIPS.
.PP
Because of the bindings that assemble
.CW /bin ,
the shell always looks for a command, say
.CW date ,
in
.CW /bin
and automatically finds the file
.CW /$cputype/bin/date .
Therefore the MIPS compiler is known as just
.CW vc ;
the shell will invoke
.CW /bin/vc
and that is guaranteed to be the version of the MIPS compiler
appropriate for the machine running the command.
Regardless of the architecture of the compiling machine,
.CW /bin/vc
is
.I always
the MIPS compiler.
.PP
Also, the output of
.CW vc
and
.CW vl
is completely independent of the machine type on which they are executed:
.CW \&.v
files compiled (with
.CW vc )
on a SPARC may be linked (with
.CW vl )
on a 386.
(The resulting
.CW v.out
will run, of course, only on a MIPS.)
Similarly, the MIPS libraries in
.CW /mips/lib
are suitable for loading with
.CW vl
on any machine; there is only one set of MIPS libraries, not one
set for each architecture that supports the MIPS compiler.
.SH
Heterogeneity and \f(CWmk\fP
.PP
Most software on Plan 9 is compiled under the control of
.CW mk ,
a descendant of
.CW make
that is documented in the Programmer's Manual.
A convention used throughout the
.CW mkfiles
makes it easy to compile the source into binary suitable for any architecture.
.PP
The variable
.CW $cputype
is advisory: it reports the architecture of the current environment, and should
not be modified. A second variable,
.CW $objtype ,
is used to set which architecture is being
.I compiled
for.
The value of
.CW $objtype
can be used by a
.CW mkfile
to configure the compilation environment.
.PP
In each machine's root directory there is a short
.CW mkfile
that defines a set of macros for the compiler, loader, etc.
Here is
.CW /mips/mkfile :
.P1
</sys/src/mkfile.proto
CC=vc
LD=vl
O=v
AS=va
.P2
The line
.P1
</sys/src/mkfile.proto
.P2
causes
.CW mk
to include the file
.CW /sys/src/mkfile.proto ,
which contains general definitions:
.P1
#
# common mkfile parameters shared by all architectures
#
OS=v486xq7
CPUS=mips 386 power alpha
CFLAGS=-FVw
LEX=lex
YACC=yacc
MK=/bin/mk
.P2
.CW CC
is obviously the compiler,
.CW AS
the assembler, and
.CW LD
the loader.
.CW O
is the suffix for the object files and
.CW CPUS
and
.CW OS
are used in special rules described below.
.PP
Here is a
.CW mkfile
to build the installed source for
.CW sam :
.P1
</$objtype/mkfile
OBJ=sam.$O address.$O buffer.$O cmd.$O disc.$O error.$O \e
	file.$O io.$O list.$O mesg.$O moveto.$O multi.$O \e
	plan9.$O rasp.$O regexp.$O string.$O sys.$O xec.$O
$O.out: $OBJ
	$LD $OBJ
install: $O.out
	cp $O.out /$objtype/bin/sam
installall:
	for(objtype in $CPUS) mk install
%.$O: %.c
	$CC $CFLAGS $stem.c
$OBJ: sam.h errors.h mesg.h
address.$O cmd.$O parse.$O xec.$O unix.$O: parse.h
clean:V:
	rm -f [$OS].out *.[$OS] y.tab.?
.P2
(The actual
.CW mkfile
imports most of its rules from other secondary files, but
this example works and is not misleading.)
The first line causes
.CW mk
to include the contents of
.CW /$objtype/mkfile
in the current
.CW mkfile .
If
.CW $objtype
is
.CW mips ,
this inserts the MIPS macro definitions into the
.CW mkfile .
In this case the rule for
.CW $O.out
uses the MIPS tools to build
.CW v.out .
The
.CW %.$O
rule in the file uses
.CW mk 's
pattern matching facilities to convert the source files to the object
files through the compiler.
(The text of the rules is passed directly to the shell,
.CW rc ,
without further translation.
See the
.CW mk
manual if any of this is unfamiliar.)
Because the default rule builds
.CW $O.out
rather than
.CW sam ,
it is possible to maintain binaries for multiple machines in the
same source directory without conflict.
This is also, of course, why the output files from the various
compilers and loaders
have distinct names.
.PP
The rest of the
.CW mkfile
should be easy to follow; notice how the rules for
.CW clean
and
.CW installall
(that is, install versions for all architectures) use other macros
defined in
.CW /$objtype/mkfile .
In Plan 9,
.CW mkfiles
for commands conventionally contain rules to
.CW install
(compile and install the version for
.CW $objtype ),
.CW installall
(compile and install for all
.CW $objtypes ),
and
.CW clean
(remove all object files, binaries, etc.).
.PP
The
.CW mkfile
is easy to use. To build a MIPS binary,
.CW v.out :
.P1
% objtype=mips
% mk
.P2
To build and install a MIPS binary:
.P1
% objtype=mips
% mk install
.P2
To build and install all versions:
.P1
% mk installall
.P2
These conventions make cross-compilation as easy to manage
as traditional native compilation.
Plan 9 programs compile and run without change on machines from
large multiprocessors to laptops. For more information about this process, see
``Plan 9 Mkfiles'',
by Bob Flandrena.
.SH
Portability
.PP
Within Plan 9, it is painless to write portable programs, programs whose
source is independent of the machine on which they execute.
The operating system is fixed and the compiler, headers and libraries
are constant so most of the stumbling blocks to portability are removed.
Attention to a few details can avoid those that remain.
.PP
Plan 9 is a heterogeneous environment, so programs must
.I expect
that external files will be written by programs on machines of different
architectures.
The compilers, for instance, must handle without confusion
object files written by other machines.
The traditional approach to this problem is to pepper the source with
.CW #ifdefs
to turn byte-swapping on and off.
Plan 9 takes a different approach: of the handful of machine-dependent
.CW #ifdefs
in all the source, almost all are deep in the libraries.
Instead programs read and write files in a defined format,
either (for low volume applications) as formatted text, or
(for high volume applications) as binary in a known byte order.
If the external data were written with the most significant
byte first, the following code reads a 4-byte integer correctly
regardless of the architecture of the executing machine (assuming
an unsigned long holds 4 bytes):
.P1
ulong
getlong(void)
{
	ulong l;
	l = (getchar()&0xFF)<<24;
	l |= (getchar()&0xFF)<<16;
	l |= (getchar()&0xFF)<<8;
	l |= (getchar()&0xFF)<<0;
	return l;
}
.P2
Note that this code does not `swap' the bytes; instead it just reads
them in the correct order.
Variations of this code will handle any binary format
and also avoid problems
involving how structures are padded, how words are aligned,
and other impediments to portability.
Be aware, though, that extra care is needed to handle floating point data.
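.PP
As one such variation, here is a hedged sketch that works on a byte buffer instead of standard input and adds the matching writer and a least-significant-byte-first reader.
It is portable C, not taken from the Plan 9 source; the names
.CW putlong ,
.CW getlongbe ,
and
.CW getlongle
are invented for the example, and casting each byte to
.CW ulong
before shifting is a choice made here so the shifts are never done in a signed
.CW int .

```c
#include <assert.h>

typedef unsigned char uchar;
typedef unsigned long ulong;

/* Stores the low 32 bits of l at p, most significant byte first. */
void
putlong(uchar *p, ulong l)
{
	assert(p != 0);
	p[0] = (uchar)(l>>24);
	p[1] = (uchar)(l>>16);
	p[2] = (uchar)(l>>8);
	p[3] = (uchar)l;
}

/* Reads 4 bytes at p written most significant byte first. */
ulong
getlongbe(uchar *p)
{
	assert(p != 0);
	return (ulong)p[0]<<24 | (ulong)p[1]<<16 | (ulong)p[2]<<8 | (ulong)p[3];
}

/* Reads 4 bytes at p written least significant byte first. */
ulong
getlongle(uchar *p)
{
	assert(p != 0);
	return (ulong)p[0] | (ulong)p[1]<<8 | (ulong)p[2]<<16 | (ulong)p[3]<<24;
}
```

None of the three routines inspects the byte order of the machine running it; each one simply names the position of every byte in the external format.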
.PP
Efficiency hounds will argue that this method is unnecessarily slow and clumsy
when the executing machine has the same byte order (and padding and alignment)
as the data.
The CPU cost of I/O processing
is rarely the bottleneck for an application, however,
and the gain in simplicity of porting and maintaining the code greatly outweighs
the minor speed loss from handling data in this general way.
This method is how the Plan 9 compilers, the window system, and even the file
servers transmit data between programs.
.PP
To port programs beyond Plan 9, where the system interface is more variable,
it is probably necessary to use
.CW pcc
and hope that the target machine supports ANSI C and POSIX.
.SH
I/O
.PP
The default C library, defined by the include file
.CW <libc.h> ,
contains no buffered I/O package.
It does have several entry points for printing formatted text:
.CW print
outputs text to the standard output,
.CW fprint
outputs text to a specified integer file descriptor, and
.CW sprint
places text in a character array.
To access library routines for buffered I/O, a program must
explicitly include the header file associated with an appropriate library.
.PP
The recommended I/O library, used by most Plan 9 utilities, is
.CW bio
(buffered I/O), defined by
.CW <bio.h> .
There also exists an implementation of ANSI Standard I/O,
.CW stdio .
  777. .PP
  778. .CW Bio
  779. is small and efficient, particularly for buffer-at-a-time or
  780. line-at-a-time I/O.
  781. Even for character-at-a-time I/O, however, it is significantly faster than
  782. the Standard I/O library,
  783. .CW stdio .
  784. Its interface is compact and regular, although it lacks a few conveniences.
  785. The most noticeable is that one must explicitly define buffers for standard
  786. input and output;
  787. .CW bio
  788. does not predefine them. Here is a program to copy input to output a byte
  789. at a time using
  790. .CW bio :
  791. .P1
#include <u.h>
#include <libc.h>
#include <bio.h>

Biobuf	bin;
Biobuf	bout;

void
main(void)
{
	int c;

	Binit(&bin, 0, OREAD);
	Binit(&bout, 1, OWRITE);
	while((c=Bgetc(&bin)) != Beof)
		Bputc(&bout, c);
	exits(0);
}
  806. .P2
  807. For peak performance, we could replace
  808. .CW Bgetc
  809. and
  810. .CW Bputc
  811. by their equivalent in-line macros
  812. .CW BGETC
  813. and
.CW BPUTC ,
but the performance gain would be modest.
  817. For more information on
  818. .CW bio ,
  819. see the Programmer's Manual.
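.PP
For contrast, here is a minimal sketch of the same byte-at-a-time copy loop written against ANSI Standard I/O, where the streams are passed in as
.CW FILE
pointers rather than set up with explicit buffers.
It is portable C, not Plan 9 code, and the function name
.CW sketch_copy
is hypothetical.

```c
#include <stdio.h>

/* Copy in to out one byte at a time; return the number of bytes copied. */
long
sketch_copy(FILE *in, FILE *out)
{
	int c;	/* int, not char, so that EOF stays distinct from every byte value */
	long n;

	n = 0;
	while((c = getc(in)) != EOF){
		putc(c, out);
		n++;
	}
	return n;
}
```

Calling it with the predefined streams,
.CW sketch_copy(stdin,
.CW stdout) ,
gives the filter behavior of the program above.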
  820. .PP
Perhaps the most dramatic difference between the I/O interface of Plan 9 and those of other
systems is that text is not ASCII.
  823. The format for
  824. text in Plan 9 is a byte-stream encoding of 16-bit characters.
  825. The character set is based on the Unicode Standard and is backward compatible with
  826. ASCII:
  827. characters with value 0 through 127 are the same in both sets.
  828. The 16-bit characters, called
  829. .I runes
  830. in Plan 9, are encoded using a representation called
  831. UTF,
  832. an encoding that is becoming accepted as a standard.
  833. (ISO calls it UTF-8;
  834. throughout Plan 9 it's just called
  835. UTF.)
  836. UTF
  837. defines multibyte sequences to
  838. represent character values from 0 to 65535.
  839. In
  840. UTF,
  841. character values up to 127 decimal, 7F hexadecimal, represent themselves,
  842. so straight
  843. ASCII
  844. files are also valid
  845. UTF.
  846. Also,
  847. UTF
  848. guarantees that bytes with values 0 to 127 (NUL to DEL, inclusive)
  849. will appear only when they represent themselves, so programs that read bytes
  850. looking for plain ASCII characters will continue to work.
  851. Any program that expects a one-to-one correspondence between bytes and
  852. characters will, however, need to be modified.
  853. An example is parsing file names.
  854. File names, like all text, are in
  855. UTF,
  856. so it is incorrect to search for a character in a string by
  857. .CW strchr(filename,
  858. .CW c)
  859. because the character might have a multi-byte encoding.
  860. The correct method is to call
  861. .CW utfrune(filename,
  862. .CW c) ,
  863. defined in
  864. .I rune (2),
  865. which interprets the file name as a sequence of encoded characters
  866. rather than bytes.
  867. In fact, even when you know the character is a single byte
  868. that can represent only itself,
  869. it is safer to use
  870. .CW utfrune
  871. because that assumes nothing about the character set
  872. and its representation.
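.PP
To make the difference concrete, here is a minimal sketch in portable C of a search that decodes the byte stream instead of comparing raw bytes.
It is not the library's implementation of
.CW utfrune ;
the names
.CW sketch_decode ,
.CW sketch_find ,
and
.CW SKETCH_BAD
are hypothetical, and the decoder handles only sequences of one to three bytes (values up to 65535) and does not reject overlong forms.

```c
#include <stddef.h>

enum { SKETCH_BAD = 0xFFFD };	/* hypothetical error value chosen for this sketch */

/*
 * Decode the sequence at s, store its value in *r, and return
 * the number of bytes consumed.  A malformed sequence yields
 * SKETCH_BAD and consumes a single byte.
 */
int
sketch_decode(const unsigned char *s, unsigned *r)
{
	if(s[0] < 0x80){
		*r = s[0];
		return 1;
	}
	if((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80){
		*r = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
		return 2;
	}
	if((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
	&& (s[2] & 0xC0) == 0x80){
		*r = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6)
			| (s[2] & 0x3Fu);
		return 3;
	}
	*r = SKETCH_BAD;
	return 1;
}

/*
 * Return a pointer to the first encoded occurrence of the
 * character value target in s, or NULL if it does not occur.
 */
const char *
sketch_find(const char *s, unsigned target)
{
	const unsigned char *p = (const unsigned char *)s;
	unsigned r;
	int n;

	while(*p != '\0'){
		n = sketch_decode(p, &r);
		if(r == target)
			return (const char *)p;
		p += n;
	}
	return NULL;
}
```

In a string ending in the two bytes C3 A9 (the encoding of value E9), a byte-wise search for value A9 matches the second byte of that sequence, while
.CW sketch_find
reports that A9 is absent and finds E9 at the start of the sequence.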
  873. .PP
  874. The library defines several symbols relevant to the representation of characters.
  875. Any byte with unsigned value less than
  876. .CW Runesync
  877. will not appear in any multi-byte encoding of a character.
  878. .CW Utfrune
  879. compares the character being searched against
  880. .CW Runesync
  881. to see if it is sufficient to call
  882. .CW strchr
  883. or if the byte stream must be interpreted.
Any character with value less than
.CW Runeself
is represented by a single byte with the same value.
  887. Finally, when errors are encountered converting
  888. to runes from a byte stream, the library returns the rune value
  889. .CW Runeerror
  890. and advances a single byte. This permits programs to find runes
  891. embedded in binary data.
  892. .PP
  893. .CW Bio
  894. includes routines
  895. .CW Bgetrune
  896. and
  897. .CW Bputrune
  898. to transform the external byte stream
  899. UTF
  900. format to and from
  901. internal 16-bit runes.
  902. Also, the
  903. .CW %s
  904. format to
  905. .CW print
  906. accepts
  907. UTF;
  908. .CW %c
  909. prints a character after narrowing it to 8 bits.
  910. The
  911. .CW %S
  912. format prints a null-terminated sequence of runes;
  913. .CW %C
  914. prints a character after narrowing it to 16 bits.
  915. For more information, see the Programmer's Manual, in particular
  916. .I utf (6)
  917. and
  918. .I rune (2),
  919. and the paper,
  920. ``Hello world, or
  921. Καλημέρα κόσμε, or\
  922. \f(Jpこんにちは 世界\f1'',
  923. by Rob Pike and
  924. Ken Thompson;
  925. there is not room for the full story here.
  926. .PP
  927. These issues affect the compiler in several ways.
  928. First, the C source is in
  929. UTF.
  930. ANSI says C variables are formed from
  931. ASCII
  932. alphanumerics, but comments and literal strings may contain any characters
  933. encoded in the native encoding, here
  934. UTF.
  935. The declaration
  936. .P1
char *cp = "abcÿ";
  938. .P2
  939. initializes the variable
  940. .CW cp
  941. to point to an array of bytes holding the
  942. UTF
  943. representation of the characters
.CW abcÿ .
  945. The type
  946. .CW Rune
  947. is defined in
  948. .CW <u.h>
  949. to be
  950. .CW ushort ,
  951. which is also the `wide character' type in the compiler.
  952. Therefore the declaration
  953. .P1
Rune *rp = L"abcÿ";
  955. .P2
  956. initializes the variable
  957. .CW rp
  958. to point to an array of unsigned short integers holding the 16-bit
  959. values of the characters
  960. .CW abcÿ .
  961. Note that in both these declarations the characters in the source
  962. that represent
  963. .CW "abcÿ"
  964. are the same; what changes is how those characters are represented
  965. in memory in the program.
  966. The following two lines:
  967. .P1
print("%s\en", "abcÿ");
print("%S\en", L"abcÿ");
  970. .P2
  971. produce the same
  972. UTF
  973. string on their output, the first by copying the bytes, the second
  974. by converting from runes to bytes.
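.PP
The conversion from runes to bytes can be sketched in portable C as follows.
This is an illustration of the encoding rule only, not the library's code; the names
.CW sketch_encode
and
.CW sketch_runes_to_utf
are hypothetical, and the sketch makes no special case of any 16-bit value.

```c
#include <stddef.h>

/* Encode one 16-bit value into out; return the bytes written (1 to 3). */
int
sketch_encode(unsigned short r, unsigned char *out)
{
	if(r < 0x80){
		out[0] = (unsigned char)r;
		return 1;
	}
	if(r < 0x800){
		out[0] = (unsigned char)(0xC0 | (r >> 6));
		out[1] = (unsigned char)(0x80 | (r & 0x3F));
		return 2;
	}
	out[0] = (unsigned char)(0xE0 | (r >> 12));
	out[1] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (r & 0x3F));
	return 3;
}

/*
 * Convert a zero-terminated array of 16-bit values to a
 * NUL-terminated byte string.  The caller must supply room for
 * three bytes per value plus the terminator.  Returns the number
 * of bytes written, not counting the terminator.
 */
size_t
sketch_runes_to_utf(const unsigned short *rp, unsigned char *out)
{
	size_t n;

	n = 0;
	while(*rp != 0)
		n += sketch_encode(*rp++, out + n);
	out[n] = '\0';
	return n;
}
```

For the four values of
.CW abcÿ
(hexadecimal 61, 62, 63, FF) the result is the five bytes 61 62 63 C3 BF: the three ASCII characters represent themselves and the value FF becomes a two-byte sequence.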
  975. .PP
  976. In C, character constants are integers but narrowed through the
  977. .CW char
  978. type.
  979. The Unicode character
  980. .CW ÿ
  981. has value 255, so if the
  982. .CW char
  983. type is signed,
  984. the constant
  985. .CW 'ÿ'
  986. has value \-1 (which is equal to EOF).
  987. On the other hand,
  988. .CW L'ÿ'
  989. narrows through the wide character type,
  990. .CW ushort ,
  991. and therefore has value 255.
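.PP
The arithmetic can be checked with a small sketch in portable C.
The function names are hypothetical, and the result of converting an out-of-range value to
.CW signed
.CW char
is implementation-defined in ISO C; the value \-1 is what two's-complement implementations commonly produce.

```c
/* Narrow v through signed char, then widen back to int. */
int
sketch_narrow_char(int v)
{
	return (signed char)v;
}

/* Narrow v through unsigned short, then widen back to int. */
int
sketch_narrow_wide(int v)
{
	return (unsigned short)v;
}
```

With the value 255 the first function returns \-1 on such implementations and the second returns 255, which is why a loop that stores input in a signed
.CW char
can mistake this character for end of file.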
  992. .PP
  993. Finally, although it's not ANSI C, the Plan 9 C compilers
  994. assume any character with value above
  995. .CW Runeself
  996. is an alphanumeric,
  997. so α is a legal, if non-portable, variable name.
  998. .SH
  999. Arguments
  1000. .PP
  1001. Some macros are defined
  1002. in
  1003. .CW <libc.h>
  1004. for parsing the arguments to
  1005. .CW main() .
  1006. They are described in
  1007. .I ARG (2)
  1008. but are fairly self-explanatory.
  1009. There are four macros:
  1010. .CW ARGBEGIN
  1011. and
  1012. .CW ARGEND
  1013. are used to bracket a hidden
  1014. .CW switch
  1015. statement within which
  1016. .CW ARGC
  1017. returns the current option character (rune) being processed and
  1018. .CW ARGF
  1019. returns the argument to the option, as in the loader option
  1020. .CW -o
  1021. .CW file .
  1022. Here, for example, is the code at the beginning of
  1023. .CW main()
  1024. in
  1025. .CW ramfs.c
  1026. (see
  1027. .I ramfs (1))
  1028. that cracks its arguments:
  1029. .P1
void
main(int argc, char *argv[])
{
	char *defmnt;
	int p[2];
	int mfd[2];
	int stdio = 0;

	defmnt = "/tmp";
	ARGBEGIN{
	case 'i':
		defmnt = 0;
		stdio = 1;
		mfd[0] = 0;
		mfd[1] = 1;
		break;
	case 's':
		defmnt = 0;
		break;
	case 'm':
		defmnt = ARGF();
		break;
	default:
		usage();
	}ARGEND
  1054. .P2
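.PP
For readers without the macros at hand, here is a hand-written loop in portable C that behaves roughly like the code above.
It is a sketch of the control flow, not the definition of the macros: the names
.CW sketch_opts
and
.CW sketch_parse
are hypothetical, options are treated as single bytes rather than runes, and errors are recorded in a flag instead of calling
.CW usage() .

```c
#include <stddef.h>

struct sketch_opts
{
	const char *defmnt;
	int stdio;
	int bad;	/* set on an unknown option or a missing argument */
};

/* Parse leading options; return the index of the first non-option argument. */
int
sketch_parse(int argc, char *argv[], struct sketch_opts *o)
{
	int i;
	const char *a;

	o->defmnt = "/tmp";
	o->stdio = 0;
	o->bad = 0;
	for(i = 1; i < argc && argv[i][0] == '-' && argv[i][1] != '\0'; i++){
		a = &argv[i][1];
		if(a[0] == '-' && a[1] == '\0'){	/* "--" ends the options */
			i++;
			break;
		}
		while(*a != '\0'){
			switch(*a++){
			case 'i':
				o->defmnt = NULL;
				o->stdio = 1;
				break;
			case 's':
				o->defmnt = NULL;
				break;
			case 'm':
				/* option argument: rest of this word, else the next word */
				if(*a != '\0')
					o->defmnt = a;
				else if(i+1 < argc)
					o->defmnt = argv[++i];
				else
					o->bad = 1;
				a = "";
				break;
			default:
				o->bad = 1;
				break;
			}
		}
	}
	return i;
}
```

The inner
.CW while
loop is what lets several option letters share one word, as in
.CW -is ,
and the
.CW 'm'
case shows the two places an option argument may be found.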
  1055. .SH
  1056. Extensions
  1057. .PP
  1058. The compiler has several extensions to ANSI C, all of which are used
  1059. extensively in the system source.
  1060. First,
  1061. .I structure
  1062. .I displays
  1063. permit
  1064. .CW struct
  1065. expressions to be formed dynamically.
  1066. Given these declarations:
  1067. .P1
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

Point	p, q, add(Point, Point);
Rectangle r;
int	x, y;
  1081. .P2
  1082. this assignment may appear anywhere an assignment is legal:
  1083. .P1
r = (Rectangle){add(p, q), (Point){x, y+3}};
  1085. .P2
  1086. The syntax is the same as for initializing a structure but with
  1087. a leading cast.
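.PP
The same syntax exists in ISO C since C99 under the name compound literal, so the assignment can be tried with any C99 compiler.
The following is a minimal sketch under that assumption; the body given to
.CW add
and the wrapper
.CW sketch_rect
are hypothetical.

```c
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

/* Component-wise sum, built directly as a structure value. */
Point
add(Point a, Point b)
{
	return (Point){a.x + b.x, a.y + b.y};
}

/* Form a Rectangle in a single expression, as in the assignment above. */
Rectangle
sketch_rect(Point p, Point q, int x, int y)
{
	Rectangle r;

	r = (Rectangle){add(p, q), (Point){x, y+3}};
	return r;
}
```

Without the feature each member would need its own assignment through a named temporary.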
  1088. .PP
  1089. If an
  1090. .I anonymous
  1091. .I structure
  1092. or
  1093. .I union
  1094. is declared within another structure or union, the members of the internal
  1095. structure or union are addressable without prefix in the outer structure.
  1096. This feature eliminates the clumsy naming of nested structures and,
  1097. particularly, unions.
  1098. For example, after these declarations,
  1099. .P1
struct Lock
{
	int	locked;
};

struct Node
{
	int	type;
	union{
		double	dval;
		double	fval;
		long	lval;
	};		/* anonymous union */
	struct Lock;	/* anonymous structure */
} *node;

void	lock(struct Lock*);
  1115. .P2
  1116. one may refer to
  1117. .CW node->type ,
  1118. .CW node->dval ,
  1119. .CW node->fval ,
  1120. .CW node->lval ,
  1121. and
  1122. .CW node->locked .
  1123. Moreover, the address of a
  1124. .CW struct
  1125. .CW Node
  1126. may be used without a cast anywhere that the address of a
  1127. .CW struct
  1128. .CW Lock
  1129. is used, such as in argument lists.
  1130. The compiler automatically promotes the type and adjusts the address.
  1131. Thus one may invoke
  1132. .CW lock(node) .
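.PP
Part of this is available outside Plan 9: ISO C since C11 accepts an anonymous union or structure that has no tag, but not the unprefixed
.CW struct
.CW Lock;
member nor the automatic pointer promotion.
A minimal sketch of the nearest portable equivalent follows; the member name
.CW lk
and the function
.CW sketch_lock_node
are hypothetical.

```c
struct Lock
{
	int locked;
};

struct Node
{
	int type;
	union{
		double dval;
		long lval;
	};		/* anonymous union, accepted by C11 */
	struct Lock lk;	/* named member stands in for the anonymous structure */
};

void
lock(struct Lock *l)
{
	l->locked = 1;
}

/* Set some members of node, lock it, and return the lock state. */
int
sketch_lock_node(struct Node *node)
{
	node->type = 1;
	node->lval = 42;	/* union member reached without a prefix */
	lock(&node->lk);	/* explicit member address instead of lock(node) */
	return node->lk.locked;
}
```

The explicit
.CW &node->lk
is the step the Plan 9 compiler performs automatically when it adjusts the address.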
  1133. .PP
  1134. Anonymous structures and unions may be accessed by type name
  1135. if (and only if) they are declared using a
  1136. .CW typedef
  1137. name.
  1138. For example, using the above declaration for
  1139. .CW Point ,
  1140. one may declare
  1141. .P1
struct
{
	int	type;
	Point;
} p;
  1147. .P2
  1148. and refer to
  1149. .CW p.Point .
  1150. .PP
  1151. In the initialization of arrays, a number in square brackets before an
  1152. element sets the index for the initialization. For example, to initialize
  1153. some elements in
  1154. a table of function pointers indexed by
  1155. ASCII
  1156. character,
  1157. .P1
void	percent(void), slash(void);

void	(*func[128])(void) =
{
	['%']	percent,
	['/']	slash,
};
  1164. .P2
  1165. .LP
  1166. A similar syntax allows one to initialize structure elements:
  1167. .P1
Point p =
{
	.y 100,
	.x 200
};
  1173. .P2
These initialization syntaxes were later added to the C standard (as the designated initializers of C99), with the addition of an
equals sign between the index or tag and the value.
  1176. The Plan 9 compiler accepts either form.
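.PP
Written with the equals sign, the two initializations compile with any C99 compiler.
In this sketch the empty function bodies and the variable name
.CW pt
are placeholders added for illustration.

```c
typedef struct Point Point;

struct Point
{
	int x, y;
};

void percent(void) {}
void slash(void) {}

/* Entries not named in the initializer are null pointers. */
void (*func[128])(void) =
{
	['%'] = percent,
	['/'] = slash,
};

/* Members may be given in any order. */
Point pt =
{
	.y = 100,
	.x = 200
};
```

A caller must therefore test an entry of
.CW func
before calling through it, since all but two of the 128 slots are empty.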
  1177. .PP
  1178. Finally, the declaration
  1179. .P1
extern register reg;
  1181. .P2
  1182. .I this "" (
  1183. appearance of the register keyword is not ignored)
  1184. allocates a global register to hold the variable
  1185. .CW reg .
  1186. External registers must be used carefully: they need to be declared in
  1187. .I all
  1188. source files and libraries in the program to guarantee the register
  1189. is not allocated temporarily for other purposes.
  1190. Especially on machines with few registers, such as the i386,
  1191. it is easy to link accidentally with code that has already usurped
  1192. the global registers and there is no diagnostic when this happens.
  1193. Used wisely, though, external registers are powerful.
  1194. The Plan 9 operating system uses them to access per-process and
  1195. per-machine data structures on a multiprocessor. The storage class they provide
  1196. is hard to create in other ways.
  1197. .SH
  1198. The compile-time environment
  1199. .PP
  1200. The code generated by the compilers is `optimized' by default:
  1201. variables are placed in registers and peephole optimizations are
  1202. performed.
  1203. The compiler flag
  1204. .CW -N
  1205. disables these optimizations.
  1206. Registerization is done locally rather than throughout a function:
  1207. whether a variable occupies a register or
  1208. the memory location identified in the symbol
  1209. table depends on the activity of the variable and may change
  1210. throughout the life of the variable.
  1211. The
  1212. .CW -N
  1213. flag is rarely needed;
  1214. its main use is to simplify debugging.
  1215. There is no information in the symbol table to identify the
  1216. registerization of a variable, so
  1217. .CW -N
  1218. guarantees the variable is always where the symbol table says it is.
  1219. .PP
  1220. Another flag,
  1221. .CW -w ,
  1222. turns
  1223. .I on
  1224. warnings about portability and problems detected in flow analysis.
  1225. Most code in Plan 9 is compiled with warnings enabled;
  1226. these warnings plus the type checking offered by function prototypes
  1227. provide most of the support of the Unix tool
  1228. .CW lint
  1229. more accurately and with less chatter.
  1230. Two of the warnings,
  1231. `used and not set' and `set and not used', are almost always accurate but
  1232. may be triggered spuriously by code with invisible control flow,
  1233. such as in routines that call
  1234. .CW longjmp .
  1235. The compiler statements
  1236. .P1
SET(v1);
USED(v2);
  1239. .P2
  1240. decorate the flow graph to silence the compiler.
  1241. Either statement accepts a comma-separated list of variables.
  1242. Use them carefully: they may silence real errors.
  1243. For the common case of unused parameters to a function,
  1244. leaving the name off the declaration silences the warnings.
  1245. That is, listing the type of a parameter but giving it no
  1246. associated variable name does the trick.
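.PP
Compilers without these statements can approximate
.CW USED
with a cast to
.CW void .
The following is a minimal sketch in portable C; the macro
.CW SKETCH_USED
and the function
.CW sketch_handler
are hypothetical and are not the Plan 9 definitions.

```c
/* Hypothetical stand-in; in Plan 9, USED is a compiler statement. */
#define SKETCH_USED(x) ((void)(x))

/*
 * A callback whose signature is fixed by its caller
 * but which needs only its first argument.
 */
int
sketch_handler(int event, void *context)
{
	SKETCH_USED(context);	/* marks the parameter as deliberately unused */
	return event + 1;
}
```

As with the Plan 9 forms, the annotation only quiets the diagnostic, so it should be reserved for variables that are unused on purpose.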
  1247. .SH
  1248. Debugging
  1249. .PP
  1250. There are two debuggers available on Plan 9.
  1251. The first, and older, is
  1252. .CW db ,
  1253. a revision of Unix
  1254. .CW adb .
  1255. The other,
  1256. .CW acid ,
  1257. is a source-level debugger whose commands are statements in
  1258. a true programming language.
  1259. .CW Acid
  1260. is the preferred debugger, but since it
  1261. borrows some elements of
  1262. .CW db ,
  1263. notably the formats for displaying values, it is worth knowing a little bit about
  1264. .CW db .
  1265. .PP
  1266. Both debuggers support multiple architectures in a single program; that is,
  1267. the programs are
  1268. .CW db
  1269. and
  1270. .CW acid ,
  1271. not for example
  1272. .CW vdb
  1273. and
  1274. .CW vacid .
  1275. They also support cross-architecture debugging comfortably:
  1276. one may debug a 68020 binary on a MIPS.
  1277. .PP
  1278. Imagine a program has crashed mysteriously:
  1279. .P1
% X11/X
Fatal server bug!
failed to create default stipple
X 106: suicide: sys: trap: fault read addr=0x0 pc=0x00105fb8
%
  1285. .P2
  1286. When a process dies on Plan 9 it hangs in the `broken' state
  1287. for debugging.
  1288. Attach a debugger to the process by naming its process id:
  1289. .P1
% acid 106
/proc/106/text:mips plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/mips
acid:
  1295. .P2
  1296. The
  1297. .CW acid
  1298. function
  1299. .CW stk()
  1300. reports the stack traceback:
  1301. .P1
acid: stk()
At pc:0x105fb8:abort+0x24 /sys/src/ape/lib/ap/stdio/abort.c:6
abort() /sys/src/ape/lib/ap/stdio/abort.c:4
	called from FatalError+#4e
		/sys/src/X/mit/server/dix/misc.c:421
FatalError(s9=#e02, s8=#4901d200, s7=#2, s6=#72701, s5=#1,
    s4=#7270d, s3=#6, s2=#12, s1=#ff37f1c, s0=#6, f=#7270f)
    /sys/src/X/mit/server/dix/misc.c:416
	called from gnotscreeninit+#4ce
		/sys/src/X/mit/server/ddx/gnot/gnot.c:792
gnotscreeninit(snum=#0, sc=#80db0)
    /sys/src/X/mit/server/ddx/gnot/gnot.c:766
	called from AddScreen+#16e
		/n/bootes/sys/src/X/mit/server/dix/main.c:610
AddScreen(pfnInit=0x0000129c,argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:530
	called from InitOutput+0x80
		/sys/src/X/mit/server/ddx/brazil/brddx.c:522
InitOutput(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/ddx/brazil/brddx.c:511
	called from main+0x294
		/sys/src/X/mit/server/dix/main.c:225
main(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:136
	called from _main+0x24
		/sys/src/ape/lib/ap/mips/main9.s:8
  1328. .P2
  1329. The function
  1330. .CW lstk()
  1331. is similar but
  1332. also reports the values of local variables.
  1333. Note that the traceback includes full file names; this is a boon to debugging,
  1334. although it makes the output much noisier.
  1335. .PP
  1336. To use
  1337. .CW acid
  1338. well you will need to learn its input language; see the
  1339. ``Acid Manual'',
  1340. by Phil Winterbottom,
  1341. for details. For simple debugging, however, the information in the manual page is
  1342. sufficient. In particular, it describes the most useful functions
  1343. for examining a process.
  1344. .PP
  1345. The compiler does not place
  1346. information describing the types of variables in the executable,
  1347. but a compile-time flag provides crude support for symbolic debugging.
  1348. The
  1349. .CW -a
  1350. flag to the compiler suppresses code generation
  1351. and instead emits source text in the
  1352. .CW acid
  1353. language to format and display data structure types defined in the program.
  1354. The easiest way to use this feature is to put a rule in the
  1355. .CW mkfile :
  1356. .P1
syms:	main.$O
	$CC -a main.c > syms
  1359. .P2
  1360. Then from within
  1361. .CW acid ,
  1362. .P1
acid: include("sourcedirectory/syms")
  1364. .P2
  1365. to read in the relevant definitions.
  1366. (For multi-file source, you need to be a little fancier;
  1367. see
  1368. .I 2c (1)).
  1369. This text includes, for each defined compound
  1370. type, a function with that name that may be called with the address of a structure
  1371. of that type to display its contents.
  1372. For example, if
  1373. .CW rect
  1374. is a global variable of type
  1375. .CW Rectangle ,
  1376. one may execute
  1377. .P1
Rectangle(*rect)
  1379. .P2
  1380. to display it.
  1381. The
  1382. .CW *
  1383. (indirection) operator is necessary because
  1384. of the way
  1385. .CW acid
  1386. works: each global symbol in the program is defined as a variable by
  1387. .CW acid ,
  1388. with value equal to the
  1389. .I address
  1390. of the symbol.
  1391. .PP
  1392. Another common technique is to write by hand special
  1393. .CW acid
  1394. code to define functions to aid debugging, initialize the debugger, and so on.
  1395. Conventionally, this is placed in a file called
  1396. .CW acid
  1397. in the source directory; it has a line
  1398. .P1
include("sourcedirectory/syms");
  1400. .P2
  1401. to load the compiler-produced symbols. One may edit the compiler output directly but
  1402. it is wiser to keep the hand-generated
  1403. .CW acid
  1404. separate from the machine-generated.
  1405. .PP
  1406. To make things simple, the default rules in the system
  1407. .CW mkfiles
  1408. include entries to make
  1409. .CW foo.acid
  1410. from
  1411. .CW foo.c ,
  1412. so one may use
  1413. .CW mk
  1414. to automate the production of
  1415. .CW acid
  1416. definitions for a given C source file.
  1417. .PP
There is much more to say here. See the
  1419. .CW acid
  1420. manual page, the reference manual, or the paper
  1421. ``Acid: A Debugger Built From A Language'',
  1422. also by Phil Winterbottom.