  1. .HTML "How to Use the Plan 9 C Compiler
  2. .TL
  3. How to Use the Plan 9 C Compiler
  4. .AU
  5. Rob Pike
  6. rob@plan9.bell-labs.com
  7. .SH
  8. Introduction
  9. .PP
  10. The C compiler on Plan 9 is a wholly new program; in fact
  11. it was the first piece of software written for what would
  12. eventually become Plan 9 from Bell Labs.
  13. Programmers familiar with existing C compilers will find
  14. a number of differences in both the language the Plan 9 compiler
  15. accepts and in how the compiler is used.
  16. .PP
  17. The compiler is really a set of compilers, one for each
  18. architecture \(em MIPS, SPARC, Intel 386, Power PC, ARM, etc. \(em
  19. that accept a dialect of ANSI C and efficiently produce
  20. fairly good code for the target machine.
  21. There is a packaging of the compiler that accepts strict ANSI C for
  22. a POSIX environment, but this document focuses on the
  23. native Plan 9 environment, that in which all the system source and
  24. almost all the utilities are written.
  25. .SH
  26. Source
  27. .PP
  28. The language accepted by the compilers is the core 1989 ANSI C language
  29. with some modest extensions,
  30. a greatly simplified preprocessor,
  31. a smaller library that includes system calls and related facilities,
  32. and a completely different structure for include files.
  33. .PP
  34. Official ANSI C accepts the old (K&R) style of declarations for
  35. functions; the Plan 9 compilers
  36. are more demanding.
  37. Without an explicit command-line flag
  38. .CW -B ) (
  39. whose use is discouraged, the compilers insist
  40. on new-style function declarations, that is, prototypes for
  41. function arguments.
  42. The function declarations in the libraries' include files are
  43. all in the new style so the interfaces are checked at compile time.
  44. For C programmers who have not yet switched to function prototypes
  45. the clumsy syntax may seem repellent but the payoff in stronger typing
  46. is substantial.
  47. Those who wish to import existing software to Plan 9 are urged
  48. to use the opportunity to update their code.
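.PP
As a minimal sketch of the difference, the fragment below declares and
defines a function in the new style; the old-style form appears only in a comment.
The function name
.CW count
is hypothetical, chosen for illustration.

```c
/* New-style (prototype) declaration: the argument types are part of the interface. */
int count(char *s, int c);

/*
 * New-style definition.  An old-style (K&R) definition would read
 *	int count(s, c) char *s; int c; { ... }
 * and would give the compiler nothing to check callers against.
 */
int
count(char *s, int c)
{
	int n;

	n = 0;
	for(; *s != '\0'; s++)
		if(*s == c)
			n++;
	return n;	/* number of times c occurs in s */
}
```

With the prototype in scope, a call that passes the arguments in the wrong
order is diagnosed at compile time rather than misbehaving when the program runs.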
  49. .PP
  50. The compilers include an integrated preprocessor that accepts the familiar
  51. .CW #include ,
  52. .CW #define
  53. for macros both with and without arguments,
  54. .CW #undef ,
  55. .CW #line ,
  56. .CW #ifdef ,
  57. .CW #ifndef ,
  58. and
  59. .CW #endif .
  60. It
  61. supports neither
  62. .CW #if
  63. nor
  64. .CW ## ,
  65. although it does
  66. honor a few
  67. .CW #pragmas .
  68. The
  69. .CW #if
  70. directive was omitted because it greatly complicates the
  71. preprocessor, is never necessary, and is usually abused.
  72. Conditional compilation in general makes code hard to understand;
  73. the Plan 9 source uses it sparingly.
  74. Also, because the compilers remove dead code, regular
  75. .CW if
  76. statements with constant conditions are more readable equivalents to many
  77. .CW #ifs .
  78. To compile imported code ineluctably fouled by
  79. .CW #if
  80. there is a separate command,
  81. .CW /bin/cpp ,
  82. that implements the complete ANSI C preprocessor specification.
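.PP
The constant-condition idiom can be sketched in portable C as follows.
The names
.CW Debug ,
.CW trace ,
.CW ntrace ,
and
.CW twice
are hypothetical; the counter exists only to make the effect observable.

```c
enum { Debug = 0 };	/* compile-time constant; set to 1 to enable tracing */

static int ntrace;	/* counts calls to trace() */

static void
trace(void)
{
	ntrace++;
}

int
twice(int x)
{
	/*
	 * An ordinary if with a constant condition.  The branch is still
	 * parsed and type-checked, and a compiler that removes dead code
	 * can drop it from the generated program.
	 */
	if(Debug)
		trace();
	return 2*x;
}
```

Unlike code hidden behind a preprocessor conditional, the disabled branch
here cannot silently rot: it must continue to compile.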
  83. .PP
  84. Include files fall into two groups: machine-dependent and machine-independent.
  85. The machine-independent files occupy the directory
  86. .CW /sys/include ;
  87. the others are placed in a directory appropriate to the machine, such as
  88. .CW /mips/include .
  89. The compiler searches for include files
  90. first in the machine-dependent directory and then
  91. in the machine-independent directory.
  92. At the time of writing there are thirty-one machine-independent include
  93. files and two (per machine) machine-dependent ones:
  94. .CW <ureg.h>
  95. and
  96. .CW <u.h> .
  97. The first describes the layout of registers on the system stack,
  98. for use by the debugger.
  99. The second defines some
  100. architecture-dependent types such as
  101. .CW jmp_buf
  102. for
  103. .CW setjmp
  104. and the
  105. .CW va_arg
  106. and
  107. .CW va_list
  108. macros for handling arguments to variadic functions,
  109. as well as a set of
  110. .CW typedef
  111. abbreviations for
  112. .CW unsigned
  113. .CW short
  114. and so on.
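.PP
For readers who have not used these macros, here is a small sketch of a variadic function.
It is written for an ISO C environment, where the macros come from
.CW <stdarg.h> ;
under Plan 9 they are supplied as described above.
The function name
.CW sum
is hypothetical.

```c
#include <stdarg.h>

/* Add up n int arguments; n tells the function how many follow it. */
int
sum(int n, ...)
{
	va_list ap;
	int i, total;

	total = 0;
	va_start(ap, n);		/* start after the last named argument */
	for(i = 0; i < n; i++)
		total += va_arg(ap, int);	/* fetch the next argument as an int */
	va_end(ap);
	return total;
}
```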
  115. .PP
  116. Here is an excerpt from
  117. .CW /386/include/u.h :
.P1
#define nil ((void*)0)
typedef unsigned short ushort;
typedef unsigned char uchar;
typedef unsigned long ulong;
typedef unsigned int uint;
typedef signed char schar;
typedef long long vlong;
typedef long jmp_buf[2];
#define JMPBUFSP 0
#define JMPBUFPC 1
#define JMPBUFDPC 0
.P2
  131. Plan 9 programs use
  132. .CW nil
  133. for the name of the zero-valued pointer.
  134. The type
  135. .CW vlong
  136. is the largest integer type available; on most architectures it
  137. is a 64-bit value.
  138. A couple of other types in
  139. .CW <u.h>
  140. are
  141. .CW u32int ,
  142. which is guaranteed to have exactly 32 bits (a possibility on all the supported architectures) and
  143. .CW mpdigit ,
  144. which is used by the multiprecision math package
  145. .CW <mp.h> .
  146. The
  147. .CW #define
  148. constants permit an architecture-independent (but compiler-dependent)
  149. implementation of stack-switching using
  150. .CW setjmp
  151. and
  152. .CW longjmp .
  153. .PP
  154. Every Plan 9 C program begins
.P1
#include <u.h>
.P2
  158. because all the other installed header files use the
  159. .CW typedefs
  160. declared in
  161. .CW <u.h> .
  162. .PP
  163. In strict ANSI C, include files are grouped to collect related functions
  164. in a single file: one for string functions, one for memory functions,
  165. one for I/O, and none for system calls.
  166. Each include file is protected by an
  167. .CW #ifdef
  168. to guarantee its contents are seen by the compiler only once.
  169. Plan 9 takes a different approach. Other than a few include
  170. files that define external formats such as archives, the files in
  171. .CW /sys/include
  172. correspond to
  173. .I libraries.
  174. If a program is using a library, it includes the corresponding header.
  175. The default C library comprises string functions, memory functions, and
  176. so on, largely as in ANSI C, some formatted I/O routines,
  177. plus all the system calls and related functions.
  178. To use these functions, one must
  179. .CW #include
  180. the file
  181. .CW <libc.h> ,
  182. which in turn must follow
  183. .CW <u.h> ,
  184. to define their prototypes for the compiler.
  185. Here is the complete source to the traditional first C program:
.P1
#include <u.h>
#include <libc.h>
void
main(void)
{
	print("hello world\en");
	exits(0);
}
.P2
  196. The
  197. .CW print
  198. routine and its relatives
  199. .CW fprint
  200. and
  201. .CW sprint
  202. resemble the similarly-named functions in Standard I/O but are not
  203. attached to a specific I/O library.
  204. In Plan 9
  205. .CW main
  206. is not integer-valued; it should call
  207. .CW exits ,
  208. which takes a string argument (or null; here ANSI C promotes the 0 to a
  209. .CW char* ).
  210. All these functions are, of course, documented in the Programmer's Manual.
  211. .PP
  212. To use
  213. .CW printf ,
  214. .CW <stdio.h>
  215. must be included to define the function prototype for
  216. .CW printf :
.P1
#include <u.h>
#include <libc.h>
#include <stdio.h>
void
main(int argc, char *argv[])
{
	printf("%s: hello world; argc = %d\en", argv[0], argc);
	exits(0);
}
.P2
  228. In practice, Standard I/O is not used much in Plan 9. I/O libraries are
  229. discussed in a later section of this document.
  230. .PP
  231. There are libraries for handling regular expressions, raster graphics,
  232. windows, and so on, and each has an associated include file.
  233. The manual for each library states which include files are needed.
  234. The files are not protected against multiple inclusion and themselves
  235. contain no nested
  236. .CW #includes .
  237. Instead the
  238. programmer is expected to sort out the requirements
  239. and to
  240. .CW #include
  241. the necessary files once at the top of each source file. In practice this is
  242. trivial: this way of handling include files is so straightforward
  243. that it is rare for a source file to contain more than half a dozen
  244. .CW #includes .
  245. .PP
  246. The compilers do their own register allocation so the
  247. .CW register
  248. keyword is ignored.
  249. For different reasons,
  250. .CW volatile
  251. and
  252. .CW const
  253. are also ignored.
  254. .PP
  255. To make it easier to share code with other systems, Plan 9 has a version
  256. of the compiler,
  257. .CW pcc ,
  258. that provides the standard ANSI C preprocessor, headers, and libraries
  259. with POSIX extensions.
  260. .CW Pcc
  261. is recommended only
  262. when broad external portability is mandated. It compiles slower,
  263. produces slower code (it takes extra work to simulate POSIX on Plan 9),
  264. eliminates those parts of the Plan 9 interface
  265. not related to POSIX, and illustrates the clumsiness of an environment
  266. designed by committee.
  267. .CW Pcc
  268. is described in more detail in
  269. .I
  270. APE\(emThe ANSI/POSIX Environment,
  271. .R
  272. by Howard Trickey.
  273. .SH
  274. Process
  275. .PP
  276. Each CPU architecture supported by Plan 9 is identified by a single,
  277. arbitrary, alphanumeric character:
  278. .CW k
  279. for SPARC,
  280. .CW q
  281. for 32-bit Power PC,
  282. .CW v
  283. for MIPS,
  284. .CW 0
  285. for little-endian MIPS,
  286. .CW 5
  287. for ARM v5 and later 32-bit architectures,
  288. .CW 6
  289. for AMD64,
  290. .CW 8
  291. for Intel 386, and
  292. .CW 9
  293. for 64-bit Power PC.
  294. The character labels the support tools and files for that architecture.
  295. For instance, for the 386 the compiler is
  296. .CW 8c ,
  297. the assembler is
  298. .CW 8a ,
  299. the link editor/loader is
  300. .CW 8l ,
  301. the object files are suffixed
  302. .CW \&.8 ,
  303. and the default name for an executable file is
  304. .CW 8.out .
  305. Before we can use the compiler we therefore need to know which
  306. machine we are compiling for.
  307. The next section explains how this decision is made; for the moment
  308. assume we are building 386 binaries and make the mental substitution for
  309. .CW 8
  310. appropriate to the machine you are actually using.
  311. .PP
  312. To convert source to an executable binary is a two-step process.
  313. First run the compiler,
  314. .CW 8c ,
  315. on the source, say
  316. .CW file.c ,
  317. to generate an object file
  318. .CW file.8 .
  319. Then run the loader,
  320. .CW 8l ,
  321. to generate an executable
  322. .CW 8.out
  323. that may be run (on a 386 machine):
.P1
8c file.c
8l file.8
8.out
.P2
  329. The loader automatically links with whatever libraries the program
  330. needs, usually including the standard C library as defined by
  331. .CW <libc.h> .
  332. Of course the compiler and loader have lots of options, both familiar and new;
  333. see the manual for details.
  334. The compiler does not generate an executable automatically;
  335. the output of the compiler must be given to the loader.
  336. Since most compilation is done under the control of
  337. .CW mk
  338. (see below), this is rarely an inconvenience.
  339. .PP
  340. The distribution of work between the compiler and loader is unusual.
  341. The compiler integrates preprocessing, parsing, register allocation,
  342. code generation and some assembly.
  343. Combining these tasks in a single program is part of the reason for
  344. the compiler's efficiency.
  345. The loader does instruction selection, branch folding,
  346. instruction scheduling,
  347. and writes the final executable.
  348. There is no separate C preprocessor and no assembler in the usual pipeline.
  349. Instead the intermediate object file
  350. (here a
  351. .CW \&.8
  352. file) is a type of binary assembly language.
  353. The instructions in the intermediate format are not exactly those in
  354. the machine. For example, on the 68020 the object file may specify
  355. a MOVE instruction but the loader will decide just which variant of
  356. the MOVE instruction \(em MOVE immediate, MOVE quick, MOVE address,
  357. etc. \(em is most efficient.
  358. .PP
  359. The assembler,
  360. .CW 8a ,
  361. is just a translator between the textual and binary
  362. representations of the object file format.
  363. It is not an assembler in the traditional sense. It has limited
  364. macro capabilities (the same as the integral C preprocessor in the compiler),
  365. clumsy syntax, and minimal error checking. For instance, the assembler
  366. will accept an instruction (such as memory-to-memory MOVE on the MIPS) that the
  367. machine does not actually support; only when the output of the assembler
  368. is passed to the loader will the error be discovered.
  369. The assembler is intended only for writing things that need access to instructions
  370. invisible from C,
  371. such as the machine-dependent
  372. part of an operating system;
  373. very little code in Plan 9 is in assembly language.
  374. .PP
  375. The compilers take an option
  376. .CW -S
  377. that causes them to print on their standard output the generated code
  378. in a format acceptable as input to the assemblers.
  379. This is of course merely a formatting of the
  380. data in the object file; therefore the assembler is just
  381. an
  382. ASCII-to-binary converter for this format.
  383. Other than the specific instructions, the input to the assemblers
  384. is largely architecture-independent; see
  385. ``A Manual for the Plan 9 Assembler'',
  386. by Rob Pike,
  387. for more information.
  388. .PP
  389. The loader is an integral part of the compilation process.
  390. Each library header file contains a
  391. .CW #pragma
  392. that tells the loader the name of the associated archive; it is
  393. not necessary to tell the loader which libraries a program uses.
  394. The C run-time startup is found, by default, in the C library.
  395. The loader starts with an undefined
  396. symbol,
  397. .CW _main ,
  398. that is resolved by pulling in the run-time startup code from the library.
  399. (The loader undefines
  400. .CW _mainp
  401. when profiling is enabled, to force loading of the profiling start-up
  402. instead.)
  403. .PP
  404. Unlike its counterpart on other systems, the Plan 9 loader rearranges
  405. data to optimize access. This means the order of variables in the
  406. loaded program is unrelated to their order in the source.
  407. Most programs don't care, but some assume that, for example, the
  408. variables declared by
.P1
int a;
int b;
.P2
  413. will appear at adjacent addresses in memory. On Plan 9, they won't.
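.PP
If a program genuinely needs two values side by side, one way to say so
is to put them in a single object, such as an array or a structure.
This is a sketch, not something this document prescribes, and the names
.CW pair ,
.CW Ab ,
and
.CW adjacent
are hypothetical.

```c
/* Two values that must be adjacent belong in one object. */
int pair[2];		/* pair[0] and pair[1] are contiguous by definition */

struct Ab {
	int a;
	int b;		/* b is laid out after a; padding between members is possible in general */
};

/* Return 1 if q points at the int immediately after the one p points at. */
int
adjacent(int *p, int *q)
{
	return q == p+1;
}
```

Because the array is one variable, rearranging variables cannot separate its elements.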
  414. .SH
  415. Heterogeneity
  416. .PP
  417. When the system starts or a user logs in the environment is configured
  418. so the appropriate binaries are available in
  419. .CW /bin .
  420. The configuration process is controlled by an environment variable,
  421. .CW $cputype ,
  422. with a value such as
  423. .CW mips ,
  424. .CW 386 ,
  425. .CW arm ,
  426. or
  427. .CW sparc .
  428. For each architecture there is a directory in the root,
  429. with the appropriate name,
  430. that holds the binary and library files for that architecture.
  431. Thus
  432. .CW /mips/lib
  433. contains the object code libraries for MIPS programs,
  434. .CW /mips/include
  435. holds MIPS-specific include files, and
  436. .CW /mips/bin
  437. has the MIPS binaries.
  438. These binaries are attached to
  439. .CW /bin
  440. at boot time by binding
  441. .CW /$cputype/bin
  442. to
  443. .CW /bin ,
  444. so
  445. .CW /bin
  446. always contains the correct files.
  447. .PP
  448. The MIPS compiler,
  449. .CW vc ,
  450. by definition
  451. produces object files for the MIPS architecture,
  452. regardless of the architecture of the machine on which the compiler is running.
  453. There is a version of
  454. .CW vc
  455. compiled for each architecture:
  456. .CW /mips/bin/vc ,
  457. .CW /arm/bin/vc ,
  458. .CW /sparc/bin/vc ,
  459. and so on,
  460. each capable of producing MIPS object files regardless of the native
  461. instruction set.
  462. If one is running on a SPARC,
  463. .CW /sparc/bin/vc
  464. will compile programs for the MIPS;
  465. if one is running on machine
  466. .CW $cputype ,
  467. .CW /$cputype/bin/vc
  468. will compile programs for the MIPS.
  469. .PP
  470. Because of the bindings that assemble
  471. .CW /bin ,
  472. the shell always looks for a command, say
  473. .CW date ,
  474. in
  475. .CW /bin
  476. and automatically finds the file
  477. .CW /$cputype/bin/date .
  478. Therefore the MIPS compiler is known as just
  479. .CW vc ;
  480. the shell will invoke
  481. .CW /bin/vc
  482. and that is guaranteed to be the version of the MIPS compiler
  483. appropriate for the machine running the command.
  484. Regardless of the architecture of the compiling machine,
  485. .CW /bin/vc
  486. is
  487. .I always
  488. the MIPS compiler.
  489. .PP
  490. Also, the output of
  491. .CW vc
  492. and
  493. .CW vl
  494. is completely independent of the machine type on which they are executed:
  495. .CW \&.v
  496. files compiled (with
  497. .CW vc )
  498. on a SPARC may be linked (with
  499. .CW vl )
  500. on a 386.
  501. (The resulting
  502. .CW v.out
  503. will run, of course, only on a MIPS.)
  504. Similarly, the MIPS libraries in
  505. .CW /mips/lib
  506. are suitable for loading with
  507. .CW vl
  508. on any machine; there is only one set of MIPS libraries, not one
  509. set for each architecture that supports the MIPS compiler.
  510. .SH
  511. Heterogeneity and \f(CWmk\fP
  512. .PP
  513. Most software on Plan 9 is compiled under the control of
  514. .CW mk ,
  515. a descendant of
  516. .CW make
  517. that is documented in the Programmer's Manual.
  518. A convention used throughout the
  519. .CW mkfiles
  520. makes it easy to compile the source into binary suitable for any architecture.
  521. .PP
  522. The variable
  523. .CW $cputype
  524. is advisory: it reports the architecture of the current environment, and should
  525. not be modified. A second variable,
  526. .CW $objtype ,
  527. is used to set which architecture is being
  528. .I compiled
  529. for.
  530. The value of
  531. .CW $objtype
  532. can be used by a
  533. .CW mkfile
  534. to configure the compilation environment.
  535. .PP
  536. In each machine's root directory there is a short
  537. .CW mkfile
  538. that defines a set of macros for the compiler, loader, etc.
  539. Here is
  540. .CW /mips/mkfile :
.P1
</sys/src/mkfile.proto
CC=vc
LD=vl
O=v
AS=va
.P2
  548. The line
.P1
</sys/src/mkfile.proto
.P2
  552. causes
  553. .CW mk
  554. to include the file
  555. .CW /sys/src/mkfile.proto ,
  556. which contains general definitions:
.P1
#
# common mkfile parameters shared by all architectures
#
OS=5689qv
CPUS=arm amd64 386 power mips
CFLAGS=-FTVw
LEX=lex
YACC=yacc
MK=/bin/mk
.P2
  568. .CW CC
  569. is obviously the compiler,
  570. .CW AS
  571. the assembler, and
  572. .CW LD
  573. the loader.
  574. .CW O
  575. is the suffix for the object files and
  576. .CW CPUS
  577. and
  578. .CW OS
  579. are used in special rules described below.
  580. .PP
  581. Here is a
  582. .CW mkfile
  583. to build the installed source for
  584. .CW sam :
.P1
</$objtype/mkfile
OBJ=sam.$O address.$O buffer.$O cmd.$O disc.$O error.$O \e
	file.$O io.$O list.$O mesg.$O moveto.$O multi.$O \e
	plan9.$O rasp.$O regexp.$O string.$O sys.$O xec.$O
$O.out: $OBJ
	$LD $OBJ
install: $O.out
	cp $O.out /$objtype/bin/sam
installall:
	for(objtype in $CPUS) mk install
%.$O: %.c
	$CC $CFLAGS $stem.c
$OBJ: sam.h errors.h mesg.h
address.$O cmd.$O parse.$O xec.$O unix.$O: parse.h
clean:V:
	rm -f [$OS].out *.[$OS] y.tab.?
.P2
  603. (The actual
  604. .CW mkfile
  605. imports most of its rules from other secondary files, but
  606. this example works and is not misleading.)
  607. The first line causes
  608. .CW mk
  609. to include the contents of
  610. .CW /$objtype/mkfile
  611. in the current
  612. .CW mkfile .
  613. If
  614. .CW $objtype
  615. is
  616. .CW mips ,
  617. this inserts the MIPS macro definitions into the
  618. .CW mkfile .
  619. In this case the rule for
  620. .CW $O.out
  621. uses the MIPS tools to build
  622. .CW v.out .
  623. The
  624. .CW %.$O
  625. rule in the file uses
  626. .CW mk 's
  627. pattern matching facilities to convert the source files to the object
  628. files through the compiler.
  629. (The text of the rules is passed directly to the shell,
  630. .CW rc ,
  631. without further translation.
  632. See the
  633. .CW mk
  634. manual if any of this is unfamiliar.)
  635. Because the default rule builds
  636. .CW $O.out
  637. rather than
  638. .CW sam ,
  639. it is possible to maintain binaries for multiple machines in the
  640. same source directory without conflict.
  641. This is also, of course, why the output files from the various
  642. compilers and loaders
  643. have distinct names.
  644. .PP
  645. The rest of the
  646. .CW mkfile
  647. should be easy to follow; notice how the rules for
  648. .CW clean
  649. and
  650. .CW installall
  651. (that is, install versions for all architectures) use other macros
  652. defined in
  653. .CW /$objtype/mkfile .
  654. In Plan 9,
  655. .CW mkfiles
  656. for commands conventionally contain rules to
  657. .CW install
  658. (compile and install the version for
  659. .CW $objtype ),
  660. .CW installall
  661. (compile and install for all
  662. .CW $objtypes ),
  663. and
  664. .CW clean
  665. (remove all object files, binaries, etc.).
  666. .PP
  667. The
  668. .CW mkfile
  669. is easy to use. To build a MIPS binary,
  670. .CW v.out :
.P1
% objtype=mips
% mk
.P2
  675. To build and install a MIPS binary:
.P1
% objtype=mips
% mk install
.P2
  680. To build and install all versions:
.P1
% mk installall
.P2
  684. These conventions make cross-compilation as easy to manage
  685. as traditional native compilation.
  686. Plan 9 programs compile and run without change on machines from
  687. large multiprocessors to laptops. For more information about this process, see
  688. ``Plan 9 Mkfiles'',
  689. by Bob Flandrena.
  690. .SH
  691. Portability
  692. .PP
  693. Within Plan 9, it is painless to write portable programs, programs whose
  694. source is independent of the machine on which they execute.
  695. The operating system is fixed and the compiler, headers and libraries
  696. are constant so most of the stumbling blocks to portability are removed.
  697. Attention to a few details can avoid those that remain.
  698. .PP
  699. Plan 9 is a heterogeneous environment, so programs must
  700. .I expect
  701. that external files will be written by programs on machines of different
  702. architectures.
  703. The compilers, for instance, must handle without confusion
  704. object files written by other machines.
  705. The traditional approach to this problem is to pepper the source with
  706. .CW #ifdefs
  707. to turn byte-swapping on and off.
  708. Plan 9 takes a different approach: of the handful of machine-dependent
  709. .CW #ifdefs
  710. in all the source, almost all are deep in the libraries.
  711. Instead programs read and write files in a defined format,
  712. either (for low volume applications) as formatted text, or
  713. (for high volume applications) as binary in a known byte order.
  714. If the external data were written with the most significant
  715. byte first, the following code reads a 4-byte integer correctly
  716. regardless of the architecture of the executing machine (assuming
  717. an unsigned long holds 4 bytes):
.P1
ulong
getlong(void)
{
	ulong l;
	l = (getchar()&0xFF)<<24;
	l |= (getchar()&0xFF)<<16;
	l |= (getchar()&0xFF)<<8;
	l |= (getchar()&0xFF)<<0;
	return l;
}
.P2
  730. Note that this code does not `swap' the bytes; instead it just reads
  731. them in the correct order.
  732. Variations of this code will handle any binary format
  733. and also avoid problems
  734. involving how structures are padded, how words are aligned,
  735. and other impediments to portability.
  736. Be aware, though, that extra care is needed to handle floating point data.
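.PP
The writing side, and one possible treatment of floating point, can be sketched
in portable C on an in-memory buffer.
Everything here is an assumption made for illustration: the names
.CW putbe32 ,
.CW getbe32 ,
.CW putfloat ,
and
.CW getfloat
are hypothetical, the type
.CW uint32_t
comes from the C99 header
.CW <stdint.h>
rather than from the native Plan 9 environment, and the floating point
routines are only correct if every machine involved stores a
.CW float
as a 32-bit IEEE 754 value whose byte order matches its integer byte order.

```c
#include <stdint.h>
#include <string.h>

/* Store v at p, most significant byte first. */
void
putbe32(unsigned char *p, uint32_t v)
{
	p[0] = v>>24;
	p[1] = v>>16;
	p[2] = v>>8;
	p[3] = v;
}

/* Fetch a 32-bit value that was stored most significant byte first. */
uint32_t
getbe32(const unsigned char *p)
{
	return (uint32_t)p[0]<<24 | (uint32_t)p[1]<<16
		| (uint32_t)p[2]<<8 | (uint32_t)p[3];
}

/* Assumption: float is a 32-bit IEEE 754 value on every machine involved. */
void
putfloat(unsigned char *p, float f)
{
	uint32_t v;

	memcpy(&v, &f, sizeof v);	/* copy the bit pattern into an integer */
	putbe32(p, v);
}

float
getfloat(const unsigned char *p)
{
	uint32_t v;
	float f;

	v = getbe32(p);
	memcpy(&f, &v, sizeof f);	/* copy the bit pattern back into a float */
	return f;
}
```

As with the reading code, nothing is swapped: each byte is placed at, or taken from,
the position the external format defines.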
  737. .PP
  738. Efficiency hounds will argue that this method is unnecessarily slow and clumsy
  739. when the executing machine has the same byte order (and padding and alignment)
  740. as the data.
  741. The CPU cost of I/O processing
  742. is rarely the bottleneck for an application, however,
  743. and the gain in simplicity of porting and maintaining the code greatly outweighs
  744. the minor speed loss from handling data in this general way.
  745. This method is how the Plan 9 compilers, the window system, and even the file
  746. servers transmit data between programs.
  747. .PP
  748. To port programs beyond Plan 9, where the system interface is more variable,
  749. it is probably necessary to use
  750. .CW pcc
  751. and hope that the target machine supports ANSI C and POSIX.
  752. .SH
  753. I/O
  754. .PP
  755. The default C library, defined by the include file
  756. .CW <libc.h> ,
  757. contains no buffered I/O package.
  758. It does have several entry points for printing formatted text:
  759. .CW print
  760. outputs text to the standard output,
  761. .CW fprint
  762. outputs text to a specified integer file descriptor, and
  763. .CW sprint
  764. places text in a character array.
  765. To access library routines for buffered I/O, a program must
  766. explicitly include the header file associated with an appropriate library.
  767. .PP
  768. The recommended I/O library, used by most Plan 9 utilities, is
  769. .CW bio
  770. (buffered I/O), defined by
  771. .CW <bio.h> .
  772. There also exists an implementation of ANSI Standard I/O,
  773. .CW stdio .
  774. .PP
  775. .CW Bio
  776. is small and efficient, particularly for buffer-at-a-time or
  777. line-at-a-time I/O.
  778. Even for character-at-a-time I/O, however, it is significantly faster than
  779. the Standard I/O library,
  780. .CW stdio .
  781. Its interface is compact and regular, although it lacks a few conveniences.
  782. The most noticeable is that one must explicitly define buffers for standard
  783. input and output;
  784. .CW bio
  785. does not predefine them. Here is a program to copy input to output a byte
  786. at a time using
  787. .CW bio :
  788. .P1
#include <u.h>
#include <libc.h>
#include <bio.h>

Biobuf	bin;
Biobuf	bout;

void
main(void)
{
	int c;

	Binit(&bin, 0, OREAD);		/* buffer standard input */
	Binit(&bout, 1, OWRITE);	/* buffer standard output */
	while((c=Bgetc(&bin)) != Beof)
		Bputc(&bout, c);
	exits(0);
}
  803. .P2
  804. For peak performance, we could replace
  805. .CW Bgetc
  806. and
  807. .CW Bputc
  808. by their equivalent in-line macros
  809. .CW BGETC
  810. and
  811. .CW BPUTC
  812. but
  813. the performance gain would be modest.
  814. For more information on
  815. .CW bio ,
  816. see the Programmer's Manual.
  817. .PP
Perhaps the most dramatic difference between the I/O interface of Plan 9 and those of other
systems is that text is not ASCII.
  820. The format for
  821. text in Plan 9 is a byte-stream encoding of 16-bit characters.
  822. The character set is based on the Unicode Standard and is backward compatible with
  823. ASCII:
  824. characters with value 0 through 127 are the same in both sets.
  825. The 16-bit characters, called
  826. .I runes
  827. in Plan 9, are encoded using a representation called
  828. UTF,
  829. an encoding that is becoming accepted as a standard.
  830. (ISO calls it UTF-8;
  831. throughout Plan 9 it's just called
  832. UTF.)
  833. UTF
  834. defines multibyte sequences to
  835. represent character values from 0 to 65535.
  836. In
  837. UTF,
  838. character values up to 127 decimal, 7F hexadecimal, represent themselves,
  839. so straight
  840. ASCII
  841. files are also valid
  842. UTF.
  843. Also,
  844. UTF
  845. guarantees that bytes with values 0 to 127 (NUL to DEL, inclusive)
  846. will appear only when they represent themselves, so programs that read bytes
  847. looking for plain ASCII characters will continue to work.
  848. Any program that expects a one-to-one correspondence between bytes and
  849. characters will, however, need to be modified.
  850. An example is parsing file names.
  851. File names, like all text, are in
  852. UTF,
  853. so it is incorrect to search for a character in a string by
  854. .CW strchr(filename,
  855. .CW c)
  856. because the character might have a multi-byte encoding.
  857. The correct method is to call
  858. .CW utfrune(filename,
  859. .CW c) ,
  860. defined in
  861. .I rune (2),
  862. which interprets the file name as a sequence of encoded characters
  863. rather than bytes.
  864. In fact, even when you know the character is a single byte
  865. that can represent only itself,
  866. it is safer to use
  867. .CW utfrune
  868. because that assumes nothing about the character set
  869. and its representation.
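.PP
The failure is easy to reproduce in portable C.
The sketch below is not the library's
.CW utfrune ;
the names
.CW encode
and
.CW findrune
are hypothetical, and only 16-bit character values are handled, matching the runes described here.
It encodes the character and then searches for the resulting byte sequence.

```c
#include <string.h>

/* Write the UTF encoding of c (0 to 0xFFFF) into buf; return the byte count. */
int
encode(char *buf, unsigned c)
{
	if(c < 0x80){
		buf[0] = (char)c;
		return 1;
	}
	if(c < 0x800){
		buf[0] = (char)(0xC0 | (c >> 6));
		buf[1] = (char)(0x80 | (c & 0x3F));
		return 2;
	}
	buf[0] = (char)(0xE0 | (c >> 12));
	buf[1] = (char)(0x80 | ((c >> 6) & 0x3F));
	buf[2] = (char)(0x80 | (c & 0x3F));
	return 3;
}

/* Return a pointer to the first occurrence of character c in s, or 0. */
char*
findrune(const char *s, unsigned c)
{
	char pat[4];

	if(c == 0)
		return (char*)s + strlen(s);	/* like strchr, find the terminator */
	pat[encode(pat, c)] = '\0';
	return strstr(s, pat);
}
```

For a name containing ÿ (value 255, encoded as the two bytes C3 BF),
.CW strchr
looks for a byte with value FF, which never occurs in the encoding, and so reports no match, while the sketch finds the character.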
  870. .PP
  871. The library defines several symbols relevant to the representation of characters.
  872. Any byte with unsigned value less than
  873. .CW Runesync
  874. will not appear in any multi-byte encoding of a character.
  875. .CW Utfrune
  876. compares the character being searched against
  877. .CW Runesync
  878. to see if it is sufficient to call
  879. .CW strchr
  880. or if the byte stream must be interpreted.
Any character with value less than
  882. .CW Runeself
  883. is represented by a single byte with the same value.
  884. Finally, when errors are encountered converting
  885. to runes from a byte stream, the library returns the rune value
  886. .CW Runeerror
  887. and advances a single byte. This permits programs to find runes
  888. embedded in binary data.
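.PP
The decoding side can be sketched in the same spirit.
The routine below is a hypothetical stand-in for the library's conversion routine, written in portable C.
The values given to
.CW Runeself
and
.CW Runeerror
are assumptions made for the example and may not match
.CW <libc.h> ;
.CW decode
and
.CW countrunes
are invented names.

```c
typedef unsigned short Rune;	/* 16-bit rune, as in the text */

enum
{
	Runeself  = 0x80,	/* assumed: values below this encode as one byte */
	Runeerror = 0xFFFD	/* assumed: value reported for a bad sequence */
};

/*
 * Decode one rune from the NUL-terminated string s into *r and
 * return the number of bytes consumed.  On a malformed sequence,
 * store Runeerror and consume a single byte.
 */
int
decode(Rune *r, const unsigned char *s)
{
	unsigned c = s[0];

	if(c < Runeself){
		*r = (Rune)c;
		return 1;
	}
	if((c & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80){
		*r = (Rune)((c & 0x1F) << 6 | (s[1] & 0x3F));
		if(*r >= 0x80)		/* reject overlong forms */
			return 2;
	}else if((c & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
			&& (s[2] & 0xC0) == 0x80){
		*r = (Rune)((c & 0x0F) << 12 | (s[1] & 0x3F) << 6 | (s[2] & 0x3F));
		if(*r >= 0x800)
			return 3;
	}
	*r = Runeerror;
	return 1;
}

/* Count the runes in a NUL-terminated string, bad bytes included. */
int
countrunes(const unsigned char *s)
{
	Rune r;
	int count;

	for(count = 0; *s != '\0'; s += decode(&r, s))
		count++;
	return count;
}
```

Advancing one byte after an error lets the loop resynchronize on the next well-formed sequence, which is the behavior the paragraph above describes.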
  889. .PP
  890. .CW Bio
  891. includes routines
  892. .CW Bgetrune
  893. and
  894. .CW Bputrune
  895. to transform the external byte stream
  896. UTF
  897. format to and from
  898. internal 16-bit runes.
  899. Also, the
  900. .CW %s
  901. format to
  902. .CW print
  903. accepts
  904. UTF;
  905. .CW %c
  906. prints a character after narrowing it to 8 bits.
  907. The
  908. .CW %S
  909. format prints a null-terminated sequence of runes;
  910. .CW %C
  911. prints a character after narrowing it to 16 bits.
  912. For more information, see the Programmer's Manual, in particular
  913. .I utf (6)
  914. and
  915. .I rune (2),
  916. and the paper,
  917. ``Hello world, or
  918. Καλημέρα κόσμε, or\
  919. \f(Jpこんにちは 世界\f1'',
  920. by Rob Pike and
  921. Ken Thompson;
  922. there is not room for the full story here.
  923. .PP
  924. These issues affect the compiler in several ways.
  925. First, the C source is in
  926. UTF.
  927. ANSI says C variables are formed from
  928. ASCII
  929. alphanumerics, but comments and literal strings may contain any characters
  930. encoded in the native encoding, here
  931. UTF.
  932. The declaration
  933. .P1
  934. char *cp = "abcÿ";
  935. .P2
  936. initializes the variable
  937. .CW cp
  938. to point to an array of bytes holding the
  939. UTF
  940. representation of the characters
.CW abcÿ .
  942. The type
  943. .CW Rune
  944. is defined in
  945. .CW <u.h>
  946. to be
  947. .CW ushort ,
  948. which is also the `wide character' type in the compiler.
  949. Therefore the declaration
  950. .P1
  951. Rune *rp = L"abcÿ";
  952. .P2
  953. initializes the variable
  954. .CW rp
  955. to point to an array of unsigned short integers holding the 16-bit
  956. values of the characters
  957. .CW abcÿ .
  958. Note that in both these declarations the characters in the source
  959. that represent
  960. .CW "abcÿ"
  961. are the same; what changes is how those characters are represented
  962. in memory in the program.
  963. The following two lines:
  964. .P1
  965. print("%s\en", "abcÿ");
  966. print("%S\en", L"abcÿ");
  967. .P2
  968. produce the same
  969. UTF
  970. string on their output, the first by copying the bytes, the second
  971. by converting from runes to bytes.
  972. .PP
  973. In C, character constants are integers but narrowed through the
  974. .CW char
  975. type.
  976. The Unicode character
  977. .CW ÿ
  978. has value 255, so if the
  979. .CW char
  980. type is signed,
  981. the constant
  982. .CW 'ÿ'
  983. has value \-1 (which is equal to EOF).
  984. On the other hand,
  985. .CW L'ÿ'
  986. narrows through the wide character type,
  987. .CW ushort ,
  988. and therefore has value 255.
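.PP
The same narrowing can be observed in portable C with explicit casts.
This sketch assumes a two's-complement machine on which converting 255 to
.CW "signed char"
yields \-1 (ANSI C leaves the result implementation-defined);
the helper names are hypothetical.

```c
/* Value of character code c after narrowing through a signed 8-bit char. */
int
narrowchar(int c)
{
	return (signed char)c;
}

/* Value of character code c after narrowing through a 16-bit wide character. */
int
narrowwide(int c)
{
	return (unsigned short)c;
}
```

A loop that stores each input byte in a signed
.CW char
and compares it with EOF would therefore stop early on ÿ; holding the value in an
.CW int ,
as the
.CW bio
example does, avoids the problem.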
  989. .PP
  990. Finally, although it's not ANSI C, the Plan 9 C compilers
  991. assume any character with value above
  992. .CW Runeself
  993. is an alphanumeric,
  994. so α is a legal, if non-portable, variable name.
  995. .SH
  996. Arguments
  997. .PP
  998. Some macros are defined
  999. in
  1000. .CW <libc.h>
  1001. for parsing the arguments to
  1002. .CW main() .
  1003. They are described in
  1004. .I ARG (2)
  1005. but are fairly self-explanatory.
  1006. There are four macros:
  1007. .CW ARGBEGIN
  1008. and
  1009. .CW ARGEND
  1010. are used to bracket a hidden
  1011. .CW switch
  1012. statement within which
  1013. .CW ARGC
  1014. returns the current option character (rune) being processed and
  1015. .CW ARGF
  1016. returns the argument to the option, as in the loader option
  1017. .CW -o
  1018. .CW file .
  1019. Here, for example, is the code at the beginning of
  1020. .CW main()
  1021. in
  1022. .CW ramfs.c
  1023. (see
  1024. .I ramfs (1))
  1025. that cracks its arguments:
  1026. .P1
void
main(int argc, char *argv[])
{
	char *defmnt;
	int p[2];
	int mfd[2];
	int stdio = 0;

	defmnt = "/tmp";
	ARGBEGIN{
	case 'i':
		defmnt = 0;
		stdio = 1;
		mfd[0] = 0;
		mfd[1] = 1;
		break;
	case 's':
		defmnt = 0;
		break;
	case 'm':
		defmnt = ARGF();
		break;
	default:
		usage();
	}ARGEND
  1051. .P2
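.PP
To make the hidden
.CW switch
visible, here is a hand-written loop in portable C that handles the same three options.
It is only an approximation of what the macros do, not their definition:
it assumes single-byte option characters, and
.CW "struct Opts"
and
.CW parseargs
are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct Opts
{
	char *defmnt;	/* mount point, or NULL */
	int stdio;	/* set by -i */
	int bad;	/* set when an unknown option is seen */
};

/*
 * Consume the leading option words of argv, filling in *o, and
 * return the index of the first argument that is not an option.
 */
int
parseargs(int argc, char *argv[], struct Opts *o)
{
	int i;
	char *p;

	o->defmnt = "/tmp";
	o->stdio = 0;
	o->bad = 0;
	for(i = 1; i < argc && argv[i][0] == '-' && argv[i][1] != '\0'; i++){
		if(argv[i][1] == '-' && argv[i][2] == '\0'){
			i++;		/* "--" ends the options */
			break;
		}
		for(p = argv[i]+1; *p != '\0'; p++)
			switch(*p){
			case 'i':
				o->defmnt = NULL;
				o->stdio = 1;
				break;
			case 's':
				o->defmnt = NULL;
				break;
			case 'm':	/* option argument: rest of word, else next word */
				if(p[1] != '\0'){
					o->defmnt = p+1;
					p += strlen(p) - 1;	/* skip to end of word */
				}else if(i+1 < argc)
					o->defmnt = argv[++i];
				else
					o->defmnt = NULL;
				break;
			default:
				o->bad = 1;
				break;
			}
	}
	return i;
}
```

Several flags may share one word, as in
.CW -is ,
and the argument to
.CW -m
may follow immediately or appear as the next word; the sketch accepts both forms.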
  1052. .SH
  1053. Extensions
  1054. .PP
  1055. The compiler has several extensions to 1989 ANSI C, all of which are used
  1056. extensively in the system source.
  1057. Some of these have been adopted in later ANSI C standards.
  1058. First,
  1059. .I structure
  1060. .I displays
  1061. permit
  1062. .CW struct
  1063. expressions to be formed dynamically.
  1064. Given these declarations:
  1065. .P1
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

Point	p, q, add(Point, Point);
Rectangle r;
int	x, y;
  1079. .P2
  1080. this assignment may appear anywhere an assignment is legal:
  1081. .P1
  1082. r = (Rectangle){add(p, q), (Point){x, y+3}};
  1083. .P2
  1084. The syntax is the same as for initializing a structure but with
  1085. a leading cast.
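.PP
The same notation appears in C99 under the name compound literal, so the assignment above can be tried with any C99 compiler.
In this sketch the body of
.CW add
and the wrapper
.CW makerect
are assumptions made for illustration.

```c
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

/* Assumed definition: component-wise sum of two points. */
Point
add(Point p, Point q)
{
	return (Point){p.x+q.x, p.y+q.y};
}

/* Build a rectangle in a single expression, as in the assignment above. */
Rectangle
makerect(Point p, Point q, int x, int y)
{
	Rectangle r;

	r = (Rectangle){add(p, q), (Point){x, y+3}};
	return r;
}
```

The display may also be passed directly as an argument, as in
.CW "add((Point){1, 2}, q)" ,
without declaring a temporary.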
  1086. .PP
  1087. If an
  1088. .I anonymous
  1089. .I structure
  1090. or
  1091. .I union
  1092. is declared within another structure or union, the members of the internal
  1093. structure or union are addressable without prefix in the outer structure.
  1094. This feature eliminates the clumsy naming of nested structures and,
  1095. particularly, unions.
  1096. For example, after these declarations,
  1097. .P1
struct Lock
{
	int locked;
};

struct Node
{
	int type;
	union{
		double dval;
		double fval;
		long lval;
	};		/* anonymous union */
	struct Lock;	/* anonymous structure */
} *node;

void lock(struct Lock*);
  1113. .P2
  1114. one may refer to
  1115. .CW node->type ,
  1116. .CW node->dval ,
  1117. .CW node->fval ,
  1118. .CW node->lval ,
  1119. and
  1120. .CW node->locked .
  1121. Moreover, the address of a
  1122. .CW struct
  1123. .CW Node
  1124. may be used without a cast anywhere that the address of a
  1125. .CW struct
  1126. .CW Lock
  1127. is used, such as in argument lists.
  1128. The compiler automatically promotes the type and adjusts the address.
  1129. Thus one may invoke
  1130. .CW lock(node) .
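.PP
Part of this extension is available elsewhere: C11 accepts the anonymous union, but not the tag-only
.CW "struct Lock;"
member or the automatic pointer promotion.
A portable sketch therefore names the embedded lock and passes its address explicitly.
The member name
.CW lk
and the bodies of
.CW lock
and
.CW setval
are assumptions.

```c
struct Lock
{
	int locked;
};

struct Node
{
	int type;
	union{
		double dval;
		long lval;
	};			/* anonymous union, standard since C11 */
	struct Lock lk;		/* named here; Plan 9 C allows it unnamed */
};

/* Assumed body: mark the lock as held. */
void
lock(struct Lock *l)
{
	l->locked = 1;
}

/* Lock the node and store a long in it. */
void
setval(struct Node *node, long v)
{
	lock(&node->lk);	/* Plan 9 C would accept lock(node) */
	node->type = 1;
	node->lval = v;		/* union member reached without a prefix */
}
```

The comparison shows what the Plan 9 form saves: neither the member name nor the address computation appears in the source.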
  1131. .PP
  1132. Anonymous structures and unions may be accessed by type name
  1133. if (and only if) they are declared using a
  1134. .CW typedef
  1135. name.
  1136. For example, using the above declaration for
  1137. .CW Point ,
  1138. one may declare
  1139. .P1
struct
{
	int type;
	Point;
} p;
  1145. .P2
  1146. and refer to
  1147. .CW p.Point .
  1148. .PP
  1149. In the initialization of arrays, a number in square brackets before an
  1150. element sets the index for the initialization. For example, to initialize
  1151. some elements in
  1152. a table of function pointers indexed by
  1153. ASCII
  1154. character,
  1155. .P1
void	percent(void), slash(void);

void	(*func[128])(void) =
{
	['%']	percent,
	['/']	slash,
};
  1162. .P2
  1163. .LP
  1164. A similar syntax allows one to initialize structure elements:
  1165. .P1
Point p =
{
	.y 100,
	.x 200
};
  1171. .P2
  1172. These initialization syntaxes were later added to ANSI C, with the addition of an
  1173. equals sign between the index or tag and the value.
  1174. The Plan 9 compiler accepts either form.
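.PP
Written with the equals sign, the two initializations above compile under any C99 compiler.
In this sketch the function bodies, the counter
.CW calls ,
and the helper
.CW dispatch
are assumptions added so that the table can be exercised.

```c
typedef struct Point Point;
struct Point
{
	int x, y;
};

int calls;	/* assumed: tallies handler invocations */

void percent(void) { calls += 1; }
void slash(void) { calls += 10; }

/* Entries that are not named are initialized to null pointers. */
void (*func[128])(void) =
{
	['%'] = percent,
	['/'] = slash,
};

/* Members may be listed in any order. */
Point p =
{
	.y = 100,
	.x = 200
};

/* Call the handler for character c, if any; return 1 if one was called. */
int
dispatch(int c)
{
	if(c < 0 || c >= 128 || func[c] == 0)
		return 0;
	func[c]();
	return 1;
}
```

Indexing by character keeps the table readable: each entry sits next to the character that selects it, and the unlisted entries need no explicit zeros.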
  1175. .PP
  1176. Finally, the declaration
  1177. .P1
  1178. extern register reg;
  1179. .P2
  1180. .I this "" (
  1181. appearance of the register keyword is not ignored)
  1182. allocates a global register to hold the variable
  1183. .CW reg .
  1184. External registers must be used carefully: they need to be declared in
  1185. .I all
  1186. source files and libraries in the program to guarantee the register
  1187. is not allocated temporarily for other purposes.
  1188. Especially on machines with few registers, such as the i386,
  1189. it is easy to link accidentally with code that has already usurped
  1190. the global registers and there is no diagnostic when this happens.
  1191. Used wisely, though, external registers are powerful.
  1192. The Plan 9 operating system uses them to access per-process and
  1193. per-machine data structures on a multiprocessor. The storage class they provide
  1194. is hard to create in other ways.
  1195. .SH
  1196. The compile-time environment
  1197. .PP
  1198. The code generated by the compilers is `optimized' by default:
  1199. variables are placed in registers and peephole optimizations are
  1200. performed.
  1201. The compiler flag
  1202. .CW -N
  1203. disables these optimizations.
  1204. Registerization is done locally rather than throughout a function:
  1205. whether a variable occupies a register or
  1206. the memory location identified in the symbol
  1207. table depends on the activity of the variable and may change
  1208. throughout the life of the variable.
  1209. The
  1210. .CW -N
  1211. flag is rarely needed;
  1212. its main use is to simplify debugging.
  1213. There is no information in the symbol table to identify the
  1214. registerization of a variable, so
  1215. .CW -N
  1216. guarantees the variable is always where the symbol table says it is.
  1217. .PP
  1218. Another flag,
  1219. .CW -w ,
  1220. turns
  1221. .I on
  1222. warnings about portability and problems detected in flow analysis.
  1223. Most code in Plan 9 is compiled with warnings enabled;
  1224. these warnings plus the type checking offered by function prototypes
  1225. provide most of the support of the Unix tool
  1226. .CW lint
  1227. more accurately and with less chatter.
  1228. Two of the warnings,
  1229. `used and not set' and `set and not used', are almost always accurate but
  1230. may be triggered spuriously by code with invisible control flow,
  1231. such as in routines that call
  1232. .CW longjmp .
  1233. The compiler statements
  1234. .P1
  1235. SET(v1);
  1236. USED(v2);
  1237. .P2
  1238. decorate the flow graph to silence the compiler.
  1239. Either statement accepts a comma-separated list of variables.
  1240. Use them carefully: they may silence real errors.
  1241. For the common case of unused parameters to a function,
  1242. leaving the name off the declaration silences the warnings.
  1243. That is, listing the type of a parameter but giving it no
  1244. associated variable name does the trick.
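.PP
Outside Plan 9, a similar effect on `unused' warnings is commonly obtained with a cast to
.CW void .
The macro below is a hypothetical single-argument stand-in, not the compiler's built-in
.CW USED ,
and it offers no counterpart for
.CW SET ;
the function
.CW length
is invented for the example.

```c
/* Hypothetical portable substitute: evaluate x and discard the value. */
#define USED(x) ((void)(x))

/* A callback whose signature is fixed by its caller; flags is not needed. */
int
length(const char *s, int flags)
{
	int n;

	USED(flags);	/* records that ignoring flags is deliberate */
	for(n = 0; s[n] != '\0'; n++)
		;
	return n;
}
```

As with the real statements, the substitute should be used sparingly, since it hides the warning whether or not the omission was intended.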
  1245. .SH
  1246. Debugging
  1247. .PP
  1248. There are two debuggers available on Plan 9.
  1249. The first, and older, is
  1250. .CW db ,
  1251. a revision of Unix
  1252. .CW adb .
  1253. The other,
  1254. .CW acid ,
  1255. is a source-level debugger whose commands are statements in
  1256. a true programming language.
  1257. .CW Acid
  1258. is the preferred debugger, but since it
  1259. borrows some elements of
  1260. .CW db ,
  1261. notably the formats for displaying values, it is worth knowing a little bit about
  1262. .CW db .
  1263. .PP
  1264. Both debuggers support multiple architectures in a single program; that is,
  1265. the programs are
  1266. .CW db
  1267. and
  1268. .CW acid ,
  1269. not for example
  1270. .CW vdb
  1271. and
  1272. .CW vacid .
  1273. They also support cross-architecture debugging comfortably:
  1274. one may debug a 386 binary on a MIPS.
  1275. .PP
  1276. Imagine a program has crashed mysteriously:
  1277. .P1
  1278. % X11/X
  1279. Fatal server bug!
  1280. failed to create default stipple
  1281. X 106: suicide: sys: trap: fault read addr=0x0 pc=0x00105fb8
  1282. %
  1283. .P2
  1284. When a process dies on Plan 9 it hangs in the `broken' state
  1285. for debugging.
  1286. Attach a debugger to the process by naming its process id:
  1287. .P1
  1288. % acid 106
  1289. /proc/106/text:mips plan 9 executable
  1290. /sys/lib/acid/port
  1291. /sys/lib/acid/mips
  1292. acid:
  1293. .P2
  1294. The
  1295. .CW acid
  1296. function
  1297. .CW stk()
  1298. reports the stack traceback:
  1299. .P1
  1300. acid: stk()
  1301. At pc:0x105fb8:abort+0x24 /sys/src/ape/lib/ap/stdio/abort.c:6
  1302. abort() /sys/src/ape/lib/ap/stdio/abort.c:4
  1303. called from FatalError+#4e
  1304. /sys/src/X/mit/server/dix/misc.c:421
  1305. FatalError(s9=#e02, s8=#4901d200, s7=#2, s6=#72701, s5=#1,
  1306. s4=#7270d, s3=#6, s2=#12, s1=#ff37f1c, s0=#6, f=#7270f)
  1307. /sys/src/X/mit/server/dix/misc.c:416
  1308. called from gnotscreeninit+#4ce
  1309. /sys/src/X/mit/server/ddx/gnot/gnot.c:792
  1310. gnotscreeninit(snum=#0, sc=#80db0)
  1311. /sys/src/X/mit/server/ddx/gnot/gnot.c:766
  1312. called from AddScreen+#16e
  1313. /n/bootes/sys/src/X/mit/server/dix/main.c:610
  1314. AddScreen(pfnInit=0x0000129c,argc=0x00000001,argv=0x7fffffe4)
  1315. /sys/src/X/mit/server/dix/main.c:530
  1316. called from InitOutput+0x80
  1317. /sys/src/X/mit/server/ddx/brazil/brddx.c:522
  1318. InitOutput(argc=0x00000001,argv=0x7fffffe4)
  1319. /sys/src/X/mit/server/ddx/brazil/brddx.c:511
  1320. called from main+0x294
  1321. /sys/src/X/mit/server/dix/main.c:225
  1322. main(argc=0x00000001,argv=0x7fffffe4)
  1323. /sys/src/X/mit/server/dix/main.c:136
  1324. called from _main+0x24
  1325. /sys/src/ape/lib/ap/mips/main9.s:8
  1326. .P2
  1327. The function
  1328. .CW lstk()
  1329. is similar but
  1330. also reports the values of local variables.
  1331. Note that the traceback includes full file names; this is a boon to debugging,
  1332. although it makes the output much noisier.
  1333. .PP
  1334. To use
  1335. .CW acid
  1336. well you will need to learn its input language; see the
  1337. ``Acid Manual'',
  1338. by Phil Winterbottom,
  1339. for details. For simple debugging, however, the information in the manual page is
  1340. sufficient. In particular, it describes the most useful functions
  1341. for examining a process.
  1342. .PP
  1343. The compiler does not place
  1344. information describing the types of variables in the executable,
  1345. but a compile-time flag provides crude support for symbolic debugging.
  1346. The
  1347. .CW -a
  1348. flag to the compiler suppresses code generation
  1349. and instead emits source text in the
  1350. .CW acid
  1351. language to format and display data structure types defined in the program.
  1352. The easiest way to use this feature is to put a rule in the
  1353. .CW mkfile :
  1354. .P1
syms:	main.$O
	$CC -a main.c > syms
  1357. .P2
  1358. Then from within
  1359. .CW acid ,
  1360. .P1
  1361. acid: include("sourcedirectory/syms")
  1362. .P2
  1363. to read in the relevant definitions.
  1364. (For multi-file source, you need to be a little fancier;
  1365. see
  1366. .I 8c (1)).
  1367. This text includes, for each defined compound
  1368. type, a function with that name that may be called with the address of a structure
  1369. of that type to display its contents.
  1370. For example, if
  1371. .CW rect
  1372. is a global variable of type
  1373. .CW Rectangle ,
  1374. one may execute
  1375. .P1
  1376. Rectangle(*rect)
  1377. .P2
  1378. to display it.
  1379. The
  1380. .CW *
  1381. (indirection) operator is necessary because
  1382. of the way
  1383. .CW acid
  1384. works: each global symbol in the program is defined as a variable by
  1385. .CW acid ,
  1386. with value equal to the
  1387. .I address
  1388. of the symbol.
  1389. .PP
  1390. Another common technique is to write by hand special
  1391. .CW acid
  1392. code to define functions to aid debugging, initialize the debugger, and so on.
  1393. Conventionally, this is placed in a file called
  1394. .CW acid
  1395. in the source directory; it has a line
  1396. .P1
  1397. include("sourcedirectory/syms");
  1398. .P2
  1399. to load the compiler-produced symbols. One may edit the compiler output directly but
  1400. it is wiser to keep the hand-generated
  1401. .CW acid
  1402. separate from the machine-generated.
  1403. .PP
  1404. To make things simple, the default rules in the system
  1405. .CW mkfiles
  1406. include entries to make
  1407. .CW foo.acid
  1408. from
  1409. .CW foo.c ,
  1410. so one may use
  1411. .CW mk
  1412. to automate the production of
  1413. .CW acid
  1414. definitions for a given C source file.
  1415. .PP
There is much more to say here. See the
  1417. .CW acid
  1418. manual page, the reference manual, or the paper
  1419. ``Acid: A Debugger Built From A Language'',
  1420. also by Phil Winterbottom.