  1. .HTML "A Manual for the Plan 9 assembler
  2. .ft CW
  3. .ta 8n +8n +8n +8n +8n +8n +8n
  4. .ft
  5. .TL
  6. A Manual for the Plan 9 assembler
  7. .AU
  8. Rob Pike
  9. rob@plan9.bell-labs.com
  10. .SH
  11. Machines
  12. .PP
  13. There is an assembler for each of the MIPS, SPARC, Intel 386, AMD64,
  14. Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola Power PC, DEC Alpha, and Acorn ARM.
  15. The 68020 assembler,
  16. .CW 2a ,
  17. is the oldest and in many ways the prototype.
  18. The assemblers are really just variations of a single program:
  19. they share many properties such as left-to-right assignment order for
  20. instruction operands and the synthesis of macro instructions
  21. such as
  22. .CW MOVE
  23. to hide the peculiarities of the load and store structure of the machines.
  24. To keep things concrete, the first part of this manual is
  25. specifically about the 68020.
  26. At the end is a description of the differences among
  27. the other assemblers.
  28. .PP
  29. The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
  30. is a prerequisite for this manual.
  31. .SH
  32. Registers
  33. .PP
  34. All pre-defined symbols in the assembler are upper-case.
  35. Data registers are
  36. .CW R0
  37. through
  38. .CW R7 ;
  39. address registers are
  40. .CW A0
  41. through
  42. .CW A7 ;
  43. floating-point registers are
  44. .CW F0
  45. through
  46. .CW F7 .
  47. .PP
  48. A pointer in
  49. .CW A6
  50. is used by the C compiler to point to data, enabling short addresses to
  51. be used more often.
  52. The value of
  53. .CW A6
  54. is constant and must be set during C program initialization
  55. to the address of the externally-defined symbol
  56. .CW a6base .
  57. .PP
  58. The following hardware registers are defined in the assembler; their
  59. meaning should be obvious given a 68020 manual:
  60. .CW CAAR ,
  61. .CW CACR ,
  62. .CW CCR ,
  63. .CW DFC ,
  64. .CW ISP ,
  65. .CW MSP ,
  66. .CW SFC ,
  67. .CW SR ,
  68. .CW USP ,
  69. and
  70. .CW VBR .
  71. .PP
  72. The assembler also defines several pseudo-registers that
  73. manipulate the stack:
  74. .CW FP ,
  75. .CW SP ,
  76. and
  77. .CW TOS .
  78. .CW FP
  79. is the frame pointer, so
  80. .CW 0(FP)
  81. is the first argument,
  82. .CW 4(FP)
  83. is the second, and so on.
  84. .CW SP
  85. is the local stack pointer, where automatic variables are held
  86. (SP is a pseudo-register only on the 68020);
  87. .CW 0(SP)
  88. is the first automatic, and so on as with
  89. .CW FP .
  90. Finally,
  91. .CW TOS
  92. is the top-of-stack register, used for pushing parameters to procedures,
  93. saving temporary values, and so on.
  94. .PP
  95. The assembler and loader track these pseudo-registers so
  96. the above statements are true regardless of what has been
  97. pushed on the hardware stack, pointed to by
  98. .CW A7 .
  99. The name
  100. .CW A7
  101. refers to the hardware stack pointer, but beware of mixed use of
  102. .CW A7
  103. and the above stack-related pseudo-registers, which will cause trouble.
  104. Note, too, that the loader observes that the
  105. .CW PEA
  106. instruction alters SP and thus
  107. inserts a corresponding pop before all returns.
  108. The assembler accepts a label-like name to be attached to
  109. .CW FP
  110. and
  111. .CW SP
  112. uses, such as
  113. .CW p+0(FP) ,
  114. to help document that
  115. .CW p
  116. is the first argument to a routine.
  117. The name goes in the symbol table but has no significance to the result
  118. of the program.
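.PP
As a hedged sketch of the pseudo-registers described above (not taken from
any real source file), the fragment below reads two arguments through
.CW FP ,
stores a value in the first automatic through
.CW SP ,
and pushes it through
.CW TOS .
The names
.CW a ,
.CW b ,
and
.CW tmp
are hypothetical labels, and the fragment assumes the enclosing procedure
has reserved at least four bytes of automatic storage.
.P1
	MOVL	a+0(FP), R0	// first argument, documented as a
	ADDL	b+4(FP), R0	// add the second argument, documented as b
	MOVL	R0, tmp+0(SP)	// store the sum in the first automatic
	MOVL	R0, TOS		// push the sum, say as a parameter for a call
.P2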
  119. .SH
  120. Referring to data
  121. .PP
  122. All external references must be made relative to some pseudo-register,
  123. either
  124. .CW PC
  125. (the virtual program counter) or
  126. .CW SB
  127. (the ``static base'' register).
  128. .CW PC
  129. counts instructions, not bytes of data.
  130. For example, to branch to the second following instruction, that is,
  131. to skip one instruction, one may write
  132. .P1
  133. BRA 2(PC)
  134. .P2
  135. Labels are also allowed, as in
  136. .P1
  137. BRA return
  138. NOP
  139. return:
  140. RTS
  141. .P2
  142. When using labels, there is no
  143. .CW (PC)
  144. annotation.
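.PP
The two forms can be compared side by side in the following sketch, which
uses only instructions already shown; the label name
.CW done
is hypothetical.
Each branch skips exactly one instruction.
.P1
	BRA	2(PC)		// PC-relative form: skip the next instruction
	MOVL	$1, R0		// skipped
	BRA	done		// label form: no (PC) annotation
	MOVL	$2, R0		// skipped
done:
	RTS
.P2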
  145. .PP
  146. The pseudo-register
  147. .CW SB
  148. refers to the beginning of the address space of the program.
  149. Thus, references to global data and procedures are written as
  150. offsets to
  151. .CW SB ,
  152. as in
  153. .P1
  154. MOVL $array(SB), TOS
  155. .P2
  156. to push the address of a global array on the stack, or
  157. .P1
  158. MOVL array+4(SB), TOS
  159. .P2
  160. to push the second (4-byte) element of the array.
  161. Note the use of an offset; the complete list of addressing modes is given below.
  162. Similarly, subroutine calls must use
  163. .CW SB :
  164. .P1
  165. BSR exit(SB)
  166. .P2
  167. File-static variables have syntax
  168. .P1
  169. local<>+4(SB)
  170. .P2
  171. The
  172. .CW <>
  173. will be filled in at load time by a unique integer.
  174. .PP
  175. When a program starts, it must execute
  176. .P1
  177. MOVL $a6base(SB), A6
  178. .P2
  179. before accessing any global data.
  180. (On machines such as the MIPS and SPARC that cannot load a 32-bit constant
  181. in a single instruction, constants are loaded through the static base
  182. register. The loader recognizes code that initializes the static
  183. base register and treats it specially. You must be careful, however,
  184. not to load large constants on such machines when the static base
  185. register is not set up, such as early in interrupt routines.)
  186. .SH
  187. Expressions
  188. .PP
  189. Expressions are mostly what one might expect.
  190. Where an offset or a constant is expected,
  191. a primary expression with unary operators is allowed.
  192. A general C constant expression is allowed in parentheses.
  193. .PP
  194. Source files are preprocessed exactly as in the C compiler, so
  195. .CW #define
  196. and
  197. .CW #include
  198. work.
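.PP
A short sketch of these rules follows.
The macro names
.CW NELEM
and
.CW ELSIZE
and the symbol
.CW array
are hypothetical, and the fragment assumes that
.CW 2a
accepts a parenthesized expression wherever a constant or offset is
expected, as described above.
.P1
#define	NELEM	8
#define	ELSIZE	4
	MOVL	$-1, R0				// primary expression with a unary operator
	MOVL	$(NELEM*ELSIZE), R1		// general C constant expression in parentheses
	MOVL	array+(3*ELSIZE)(SB), R2	// parenthesized expression as an offset
.P2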
  199. .SH
  200. Addressing modes
  201. .PP
  202. The simple addressing modes are shared by all the assemblers.
  203. Here, for completeness, follows a table of all the 68020 addressing modes,
  204. since that machine has the richest set.
  205. In the table,
  206. .CW o
  207. is an offset, which if zero may be elided, and
  208. .CW d
  209. is a displacement, which is a constant between -128 and 127 inclusive.
  210. Many of the modes listed have the same name;
  211. scrutiny of the format will show what default is being applied.
  212. For instance, indexed mode with no address register supplied operates
  213. as though a zero-valued register were used.
  214. For "offset" read "displacement."
  215. For "\f(CW.s\fP" read one of
  216. .CW .L
  217. or
  218. .CW .W
  219. followed by
  220. .CW *1 ,
  221. .CW *2 ,
  222. .CW *4 ,
  223. or
  224. .CW *8
  225. to indicate the size and scaling of the data.
  226. .IP
  227. .TS
  228. l lfCW.
  229. data register	R0
  230. address register	A0
  231. floating-point register	F0
  232. special names	CAAR, CACR, etc.
  233. constant	$con
  234. floating point constant	$fcon
  235. external symbol	name+o(SB)
  236. local symbol	name<>+o(SB)
  237. automatic symbol	name+o(SP)
  238. argument	name+o(FP)
  239. address of external	$name+o(SB)
  240. address of local	$name<>+o(SB)
  241. indirect post-increment	(A0)+
  242. indirect pre-decrement	-(A0)
  243. indirect with offset	o(A0)
  244. indexed with offset	o()(R0.s)
  245. indexed with offset	o(A0)(R0.s)
  246. external indexed	name+o(SB)(R0.s)
  247. local indexed	name<>+o(SB)(R0.s)
  248. automatic indexed	name+o(SP)(R0.s)
  249. parameter indexed	name+o(FP)(R0.s)
  250. offset indirect post-indexed	d(o())(R0.s)
  251. offset indirect post-indexed	d(o(A0))(R0.s)
  252. external indirect post-indexed	d(name+o(SB))(R0.s)
  253. local indirect post-indexed	d(name<>+o(SB))(R0.s)
  254. automatic indirect post-indexed	d(name+o(SP))(R0.s)
  255. parameter indirect post-indexed	d(name+o(FP))(R0.s)
  256. offset indirect pre-indexed	d(o()(R0.s))
  257. offset indirect pre-indexed	d(o(A0))
  258. offset indirect pre-indexed	d(o(A0)(R0.s))
  259. external indirect pre-indexed	d(name+o(SB))
  260. external indirect pre-indexed	d(name+o(SB)(R0.s))
  261. local indirect pre-indexed	d(name<>+o(SB))
  262. local indirect pre-indexed	d(name<>+o(SB)(R0.s))
  263. automatic indirect pre-indexed	d(name+o(SP))
  264. automatic indirect pre-indexed	d(name+o(SP)(R0.s))
  265. parameter indirect pre-indexed	d(name+o(FP))
  266. parameter indirect pre-indexed	d(name+o(FP)(R0.s))
  267. .TE
  268. .in
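.PP
To make a few rows of the table concrete, here is a hedged sketch; the
register choices and the symbol
.CW array
are arbitrary, and the scaled index is spelled according to the
.CW .s
rule stated above.
.P1
	MOVL	(A0)+, R0		// indirect post-increment
	MOVL	R0, -(A1)		// indirect pre-decrement
	MOVL	8(A0), R1		// indirect with offset 8
	MOVL	8(A0)(R2.L*4), R3	// indexed with offset, long index scaled by 4
	MOVL	array+0(SB)(R2.L*4), R3	// external indexed
	MOVL	$array+0(SB), A1	// address of external
.P2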
  269. .SH
  270. Laying down data
  271. .PP
  272. Placing data in the instruction stream, say for interrupt vectors, is easy:
  273. the pseudo-instructions
  274. .CW LONG
  275. and
  276. .CW WORD
  277. (but not
  278. .CW BYTE )
  279. lay down the value of their single argument, of the appropriate size,
  280. as if it were an instruction:
  281. .P1
  282. LONG $12345
  283. .P2
  284. places the long 12345 (base 10)
  285. in the instruction stream.
  286. (On most machines,
  287. the only such operator is
  288. .CW WORD
  289. and it lays down 32-bit quantities.
  290. The 386 has all three:
  291. .CW LONG ,
  292. .CW WORD ,
  293. and
  294. .CW BYTE .
  295. The AMD64 adds
  296. .CW QUAD
  297. to that for 64-bit values.
  298. The 960 has only one,
  299. .CW LONG .)
  300. .PP
  301. Placing information in the data section is more painful.
  302. The pseudo-instruction
  303. .CW DATA
  304. does the work, given two arguments: an address at which to place the item,
  305. including its size,
  306. and the value to place there. For example, to define a character array
  307. .CW array
  308. containing the characters
  309. .CW abc
  310. and a terminating null:
  311. .P1
  312. DATA array+0(SB)/1, $'a'
  313. DATA array+1(SB)/1, $'b'
  314. DATA array+2(SB)/1, $'c'
  315. GLOBL array(SB), $4
  316. .P2
  317. or
  318. .P1
  319. DATA array+0(SB)/4, $"abc\ez"
  320. GLOBL array(SB), $4
  321. .P2
  322. The
  323. .CW /1
  324. gives the number of bytes to define,
  325. .CW GLOBL
  326. makes the symbol global, and the
  327. .CW $4
  328. says how many bytes the symbol occupies.
  329. Uninitialized data is zeroed automatically.
  330. The character
  331. .CW \ez
  332. is equivalent to the C
  333. .CW \e0 .
  334. The string in a
  335. .CW DATA
  336. statement may contain a maximum of eight bytes;
  337. build larger strings piecewise.
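.PP
For instance, a 13-byte string (twelve characters and a terminating null)
can be sketched in two pieces as below.
The symbol name
.CW greeting
is hypothetical, and padding the second piece with nulls to a full eight
bytes is a choice made here for simplicity, not a requirement stated above.
.P1
DATA	greeting+0(SB)/8, $"hello, w"
DATA	greeting+8(SB)/8, $"orld\ez\ez\ez\ez"
GLOBL	greeting(SB), $16
.P2
.PP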
  338. Two pseudo-instructions,
  339. .CW DYNT
  340. and
  341. .CW INIT ,
  342. allow the (obsolete) Alef compilers to build dynamic type information during the load
  343. phase.
  344. The
  345. .CW DYNT
  346. pseudo-instruction has two forms:
  347. .P1
  348. DYNT , ALEF_SI_5+0(SB)
  349. DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
  350. .P2
  351. In the first form,
  352. .CW DYNT
  353. defines the symbol to be a small unique integer constant, chosen by the loader,
  354. which is some multiple of the word size. In the second form,
  355. .CW DYNT
  356. defines the second symbol in the same way,
  357. places the address of the most recently
  358. defined text symbol in the array specified by the first symbol at the
  359. index defined by the value of the second symbol,
  360. and then adjusts the size of the array accordingly.
  361. .PP
  362. The
  363. .CW INIT
  364. pseudo-instruction takes the same parameters as a
  365. .CW DATA
  366. statement. Its symbol is used as the base of an array and the
  367. data item is installed in the array at the offset specified by the most recent
  368. .CW DYNT
  369. pseudo-instruction.
  370. The size of the array is adjusted accordingly.
  371. The
  372. .CW DYNT
  373. and
  374. .CW INIT
  375. pseudo-instructions are not implemented on the 68020.
  376. .SH
  377. Defining a procedure
  378. .PP
  379. Entry points are defined by the pseudo-operation
  380. .CW TEXT ,
  381. which takes as arguments the name of the procedure (including the ubiquitous
  382. .CW (SB) )
  383. and the number of bytes of automatic storage to pre-allocate on the stack,
  384. which will usually be zero when writing assembly language programs.
  385. On machines with a link register, such as the MIPS and SPARC,
  386. the special value -4 instructs the loader to generate no PC save
  387. and restore instructions, even if the function is not a leaf.
  388. Here is a complete procedure that returns the sum
  389. of its two arguments:
  390. .P1
  391. TEXT sum(SB), $0
  392. MOVL arg1+0(FP), R0
  393. ADDL arg2+4(FP), R0
  394. RTS
  395. .P2
  396. An optional middle argument
  397. to the
  398. .CW TEXT
  399. pseudo-op is a bit field of options to the loader.
  400. Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
  401. the program.
  402. For example,
  403. .P1
  404. TEXT sum(SB), 1, $0
  405. MOVL arg1+0(FP), R0
  406. ADDL arg2+4(FP), R0
  407. RTS
  408. .P2
  409. will not be profiled; the first version above would be.
  410. Subroutines with peculiar state, such as system call routines,
  411. should not be profiled.
  412. .PP
  413. Setting the 2 bit allows multiple definitions of the same
  414. .CW TEXT
  415. symbol in a program; the loader will place only one such function in the image.
  416. It was emitted only by the Alef compilers.
  417. .PP
  418. Subroutines to be called from C should place their result in
  419. .CW R0 ,
  420. even if it is an address.
  421. Floating point values are returned in
  422. .CW F0 .
  423. Functions that return a structure to a C program
  424. receive as their first argument the address of the location to
  425. store the result;
  426. .CW R0
  427. is unused in the calling protocol for such procedures.
  428. A caller is responsible for saving any registers whose values it needs after a call,
  429. and therefore a subroutine is free to use any registers without saving them (``caller saves'').
  430. .CW A6
  431. and
  432. .CW A7
  433. are the exceptions as described above.
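.PP
Putting the pieces of this section together, here is a hedged sketch of a
routine callable from C.
The names
.CW nextid
and
.CW count
are hypothetical; the counter is a file-static long that relies on
uninitialized data being zeroed, and the 1 bit keeps the routine from
being profiled.
.P1
GLOBL	count<>(SB), $4
TEXT	nextid(SB), 1, $0
	MOVL	count<>+0(SB), R0	// load the file-static counter
	ADDL	$1, R0			// increment it
	MOVL	R0, count<>+0(SB)	// store it back
	RTS				// the result is returned in R0
.P2
From C, such a routine would be declared as a function of no arguments
returning a long.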
  434. .SH
  435. When in doubt
  436. .PP
  437. If you get confused, try using the
  438. .CW -S
  439. option to
  440. .CW 2c
  441. and compiling a sample program.
  442. The standard output is valid input to the assembler.
  443. .SH
  444. Instructions
  445. .PP
  446. The instruction set of the assembler is not identical to that
  447. of the machine.
  448. It is chosen to match what the compiler generates, augmented
  449. slightly by specific needs of the operating system.
  450. For example,
  451. .CW 2a
  452. does not distinguish between the various forms of
  453. .CW MOVE
  454. instruction: move quick, move address, etc. Instead the context
  455. does the job. For example,
  456. .P1
  457. MOVL $1, R1
  458. MOVL R2, A0
  459. MOVW SR, R3
  460. .P2
  461. generates official
  462. .CW MOVEQ ,
  463. .CW MOVEA ,
  464. and
  465. .CW MOVESR
  466. instructions.
  467. A number of instructions do not have the syntax necessary to specify
  468. their entire capabilities. Notable examples are the bitfield
  469. instructions, the
  470. multiply and divide instructions, etc.
  471. For a complete set of generated instruction names (in
  472. .CW 2a
  473. notation, not Motorola's) see the file
  474. .CW /sys/src/cmd/2c/2.out.h .
  475. Despite its name, this file contains an enumeration of the
  476. instructions that appear in the intermediate files generated
  477. by the compiler, which correspond exactly to lines of assembly language.
  478. .PP
  479. The MC68000 assembler,
  480. .CW 1a ,
  481. is essentially the same, honoring the appropriate subset of the instructions
  482. and addressing modes.
  483. The definitions of these are, nonetheless, part of
  484. .CW 2.out.h .
  485. .SH
  486. Laying down instructions
  487. .PP
  488. The loader modifies the code produced by the assembler and compiler.
  489. It folds branches,
  490. copies short sequences of code to eliminate branches,
  491. and discards unreachable code.
  492. The first instruction of every function is assumed to be reachable.
  493. The pseudo-instruction
  494. .CW NOP ,
  495. which you may see in compiler output,
  496. means no instruction at all, rather than an instruction that does nothing.
  497. The loader discards all
  498. .CW NOP 's.
  499. .PP
  500. To generate a true
  501. .CW NOP
  502. instruction, or any other instruction not known to the assembler, use a
  503. .CW WORD
  504. pseudo-instruction.
  505. Such instructions on RISCs are not scheduled by the loader and must have
  506. their delay slots filled manually.
  507. .SH
  508. MIPS
  509. .PP
  510. The registers are only addressed by number:
  511. .CW R0
  512. through
  513. .CW R31 .
  514. .CW R29
  515. is the stack pointer;
  516. .CW R30
  517. is used as the static base pointer, the analogue of
  518. .CW A6
  519. on the 68020.
  520. Its value is the address of the global symbol
  521. .CW setR30(SB) .
  522. The register holding returned values from subroutines is
  523. .CW R1 .
  524. When a function is called, space for the first argument
  525. is reserved at
  526. .CW 0(FP)
  527. but in C (not Alef) the value is passed in
  528. .CW R1
  529. instead.
  530. .PP
  531. The loader uses
  532. .CW R28
  533. as a temporary. The system uses
  534. .CW R26
  535. and
  536. .CW R27
  537. as interrupt-time temporaries. Therefore none of these registers
  538. should be used in user code.
  539. .PP
  540. The control registers are not known to the assembler by name.
  541. Instead they are numbered registers
  542. .CW M0 ,
  543. .CW M1 ,
  544. etc.
  545. Use this trick to access, say,
  546. .CW STATUS :
  547. .P1
  548. #define STATUS 12
  549. MOVW M(STATUS), R1
  550. .P2
  551. .PP
  552. Floating point registers are called
  553. .CW F0
  554. through
  555. .CW F31 .
  556. By convention,
  557. .CW F24
  558. must be initialized to the value 0.0,
  559. .CW F26
  560. to 0.5,
  561. .CW F28
  562. to 1.0, and
  563. .CW F30
  564. to 2.0;
  565. this is done by the operating system.
  566. .PP
  567. The instructions and their syntax are different from those of the manufacturer's
  568. manual.
  569. There are no
  570. .CW lui
  571. and kin; instead there are
  572. .CW MOVW
  573. (move word),
  574. .CW MOVH
  575. (move halfword),
  576. and
  577. .CW MOVB
  578. (move byte) pseudo-instructions. If the operand is unsigned, the instructions
  579. are
  580. .CW MOVHU
  581. and
  582. .CW MOVBU .
  583. The order of operands is from left to right in dataflow order, just as
  584. on the 68020 but not as in MIPS documentation.
  585. This means that the
  586. .CW Bcond
  587. instructions are reversed with respect to the book; for example, a
  588. .CW va
  589. .CW BGTZ
  590. generates a MIPS
  591. .CW bltz
  592. instruction.
  593. .PP
  594. The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
  595. It understands the 64-bit instructions
  596. .CW MOVV ,
  597. .CW MOVVL ,
  598. .CW ADDV ,
  599. .CW ADDVU ,
  600. .CW SUBV ,
  601. .CW SUBVU ,
  602. .CW MULV ,
  603. .CW MULVU ,
  604. .CW DIVV ,
  605. .CW DIVVU ,
  606. .CW SLLV ,
  607. .CW SRLV ,
  608. and
  609. .CW SRAV .
  610. The assembler does not have any cache, load-linked, or store-conditional instructions.
  611. .PP
  612. Some assembler instructions are expanded into multiple instructions by the loader.
  613. For example, the loader may convert the load of a 32-bit constant into an
  614. .CW lui
  615. followed by an
  616. .CW ori .
  617. .PP
  618. Assembler instructions should be laid out as if there
  619. were no load, branch, or floating point compare delay slots;
  620. the loader will rearrange\(em\f2schedule\f1\(emthe instructions
  621. to guarantee correctness and improve performance.
  622. The only exception is that the correct scheduling of instructions
  623. that use control registers varies from model to model of machine
  624. (and is often undocumented) so you should schedule such instructions
  625. by hand to guarantee correct behavior.
  626. The loader generates
  627. .P1
  628. NOR R0, R0, R0
  629. .P2
  630. when it needs a true no-op instruction.
  631. Use exactly this instruction when scheduling code manually;
  632. the loader recognizes it and schedules the code before it and after it independently. Also,
  633. .CW WORD
  634. pseudo-ops are scheduled like no-ops.
  635. .PP
  636. The
  637. .CW NOSCHED
  638. pseudo-op disables instruction scheduling
  639. (scheduling is enabled by default);
  640. .CW SCHED
  641. re-enables it.
  642. Branch folding, code copying, and dead code elimination are
  643. disabled for instructions that are not scheduled.
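.PP
A hedged sketch of hand scheduling around a control register follows.
It reuses the
.CW STATUS
definition shown earlier; the number of no-ops needed after the move
depends on the machine model, so the two used here are only an assumption.
.P1
#define	STATUS	12
	NOSCHED
	MOVW	R1, M(STATUS)	// write the control register
	NOR	R0, R0, R0	// true no-op, as the loader itself generates
	NOR	R0, R0, R0
	SCHED
.P2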
  644. .SH
  645. SPARC
  646. .PP
  647. Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
  648. Registers have numerical names only:
  649. .CW R0
  650. through
  651. .CW R31 .
  652. Forget about register windows: Plan 9 doesn't use them at all.
  653. The machine has 32 global registers, period.
  654. .CW R1
  655. [sic] is the stack pointer.
  656. .CW R2
  657. is the static base register, with value the address of
  658. .CW setSB(SB) .
  659. .CW R7
  660. is the return register and also the register holding the first
  661. argument to a C (not Alef) function, again with space reserved at
  662. .CW 0(FP) .
  663. .CW R14
  664. is the loader temporary.
  665. .PP
  666. Floating-point registers are exactly as on the MIPS.
  667. .PP
  668. The control registers are known by names such as
  669. .CW FSR .
  670. The instructions to access these registers are
  671. .CW MOVW
  672. instructions, for example
  673. .P1
  674. MOVW Y, R8
  675. .P2
  676. for the SPARC instruction
  677. .P1
  678. rdy %r8
  679. .P2
  680. .PP
  681. Move instructions are similar to those on the MIPS: pseudo-operations
  682. that turn into appropriate sequences of
  683. .CW sethi
  684. instructions, adds, etc.
  685. Instructions read from left to right. Because the arguments are
  686. flipped to
  687. .CW SUBCC ,
  688. the condition codes are not inverted as on the MIPS.
  689. .PP
  690. The syntax for ASI accesses is as follows; for example, to move a word from ASI 2:
  691. .P1
  692. MOVW (R7, 2), R8
  693. .P2
  694. The syntax for double indexing is
  695. .P1
  696. MOVW (R7+R8), R9
  697. .P2
  698. .PP
  699. The SPARC's instruction scheduling is similar to the MIPS's.
  700. The official no-op instruction is:
  701. .P1
  702. ORN R0, R0, R0
  703. .P2
  704. .SH
  705. i960
  706. .PP
  707. Registers are numbered
  708. .CW R0
  709. through
  710. .CW R31 .
  711. Stack pointer is
  712. .CW R29 ;
  713. return register is
  714. .CW R4 ;
  715. static base is
  716. .CW R28 ;
  717. it is initialized to the address of
  718. .CW setSB(SB) .
  719. .CW R3
  720. must be zero; this should be done manually early in execution by
  721. .P1
  722. SUBO R3, R3
  723. .P2
  724. .CW R27
  725. is the loader temporary.
  726. .PP
  727. There is no support for floating point.
  728. .PP
  729. The Intel calling convention is not supported and cannot be used; use
  730. .CW BAL
  731. instead.
  732. Instructions are mostly as in the book. The major change is that
  733. .CW LOAD
  734. and
  735. .CW STORE
  736. are both called
  737. .CW MOV .
  738. The extension character for
  739. .CW MOV
  740. is as in the manual:
  741. .CW O
  742. for ordinal,
  743. .CW W
  744. for signed, etc.
  745. .SH
  746. i386
  747. .PP
  748. The assembler assumes 32-bit protected mode.
  749. The register names are
  750. .CW SP ,
  751. .CW AX ,
  752. .CW BX ,
  753. .CW CX ,
  754. .CW DX ,
  755. .CW BP ,
  756. .CW DI ,
  757. and
  758. .CW SI .
  759. The stack pointer (not a pseudo-register) is
  760. .CW SP
  761. and the return register is
  762. .CW AX .
  763. There is no physical frame pointer but, as for the MIPS,
  764. .CW FP
  765. is a pseudo-register that acts as
  766. a frame pointer.
  767. .PP
  768. Opcode names are mostly the same as those listed in the Intel manual
  769. with an
  770. .CW L ,
  771. .CW W ,
  772. or
  773. .CW B
  774. appended to identify 32-bit,
  775. 16-bit, and 8-bit operations.
  776. The exceptions are loads, stores, and conditionals.
  777. All load and store opcodes to and from general registers, special registers
  778. (such as
  779. .CW CR0 ,
  780. .CW CR3 ,
  781. .CW GDTR ,
  782. .CW IDTR ,
  783. .CW SS ,
  784. .CW CS ,
  785. .CW DS ,
  786. .CW ES ,
  787. .CW FS ,
  788. and
  789. .CW GS )
  790. or memory are written
  791. as
  792. .P1
  793. MOV\f2x\fP src,dst
  794. .P2
  795. where
  796. .I x
  797. is
  798. .CW L ,
  799. .CW W ,
  800. or
  801. .CW B .
  802. Thus to get
  803. .CW AL
  804. use a
  805. .CW MOVB
  806. instruction. If you need to access
  807. .CW AH ,
  808. you must mention it explicitly in a
  809. .CW MOVB :
  810. .P1
  811. MOVB AH, BX
  812. .P2
  813. There are many examples of illegal moves, for example,
  814. .P1
  815. MOVB BP, DI
  816. .P2
  817. that the loader actually implements as pseudo-operations.
  818. .PP
  819. The names of conditions in all conditional instructions
  820. .CW J , (
  821. .CW SET )
  822. follow the conventions of the 68020 instead of those of the Intel
  823. assembler:
  824. .CW JOS ,
  825. .CW JOC ,
  826. .CW JCS ,
  827. .CW JCC ,
  828. .CW JEQ ,
  829. .CW JNE ,
  830. .CW JLS ,
  831. .CW JHI ,
  832. .CW JMI ,
  833. .CW JPL ,
  834. .CW JPS ,
  835. .CW JPC ,
  836. .CW JLT ,
  837. .CW JGE ,
  838. .CW JLE ,
  839. and
  840. .CW JGT
  841. instead of
  842. .CW JO ,
  843. .CW JNO ,
  844. .CW JB ,
  845. .CW JNB ,
  846. .CW JZ ,
  847. .CW JNZ ,
  848. .CW JBE ,
  849. .CW JNBE ,
  850. .CW JS ,
  851. .CW JNS ,
  852. .CW JP ,
  853. .CW JNP ,
  854. .CW JL ,
  855. .CW JNL ,
  856. .CW JLE ,
  857. and
  858. .CW JNLE .
  859. .PP
  860. The addressing modes have syntax like
  861. .CW AX ,
  862. .CW (AX) ,
  863. .CW (AX)(BX*4) ,
  864. .CW 10(AX) ,
  865. and
  866. .CW 10(AX)(BX*4) .
  867. The offsets from
  868. .CW AX
  869. can be replaced by offsets from
  870. .CW FP
  871. or
  872. .CW SB
  873. to access names, for example
  874. .CW extern+5(SB)(AX*2) .
  875. .PP
  876. Other notes: Non-relative
  877. .CW JMP
  878. and
  879. .CW CALL
  880. have a
  881. .CW *
  882. added to the syntax.
  883. Only
  884. .CW LOOP ,
  885. .CW LOOPEQ ,
  886. and
  887. .CW LOOPNE
  888. are legal loop instructions. Only
  889. .CW REP
  890. and
  891. .CW REPN
  892. are recognized repeaters. These are not prefixes, but rather
  893. stand-alone opcodes that precede the strings, for example
  894. .P1
  895. CLD; REP; MOVSL
  896. .P2
  897. Segment override prefixes in
  898. .CW MOD/RM
  899. fields are not supported.
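.PP
As a hedged sketch combining several of these conventions, the routine
below copies longs with a repeated string move.
The routine name
.CW copylongs
and its argument names are hypothetical, and the sketch assumes the
two regions do not overlap.
.P1
TEXT	copylongs(SB), $0
	MOVL	dst+0(FP), DI	// destination pointer
	MOVL	src+4(FP), SI	// source pointer
	MOVL	n+8(FP), CX	// count of 32-bit longs
	CLD			// copy upward
	REP; MOVSL		// REP is a stand-alone opcode, not a prefix
	MOVL	dst+0(FP), AX	// return the destination in AX
	RET
.P2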
  900. .SH
  901. AMD64
  902. .PP
  903. The assembler assumes 64-bit mode unless a
  904. .CW MODE
  905. pseudo-operation is given:
  906. .P1
  907. MODE $32
  908. .P2
  909. to change to 32-bit mode.
  910. The effect is mainly to diagnose instructions that are illegal in
  911. the given mode, but the loader will also assume 32-bit operands and addresses,
  912. and 32-bit PC values for call and return.
  913. The assembler's conventions are similar to those for the 386, above.
  914. The architecture provides extra fixed-point registers
  915. .CW R8
  916. to
  917. .CW R15 .
All registers are 64 bits wide, but instructions can access the low-order 8, 16 or 32 bits
  919. as described in the processor handbook.
  920. For example,
  921. .CW MOVL
  922. to
  923. .CW AX
  924. puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
  927. .CW MOVQ ,
  928. which allows 64-bit literals.
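For illustration, a few instructions and the values they leave behind
(the register choices are arbitrary):
.P1
	MOVQ	$0x123456789ABCDEF0, AX	/* full 64-bit literal: MOVQ only */
	MOVL	$-1, AX		/* AX is 0x00000000FFFFFFFF */
	MOVQ	$-1, AX		/* AX is 0xFFFFFFFFFFFFFFFF */
	ADDQ	$-8, BX		/* literal sign-extended to 64 bits */
.P2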
  929. The external registers in Plan 9's C are allocated from
  930. .CW R15
  931. down.
  932. There are many new instructions, including the MMX and XMM media instructions,
  933. and conditional move instructions.
  934. MMX registers are
  935. .CW M0
  936. to
  937. .CW M7 ,
  938. and
  939. XMM registers are
  940. .CW X0
  941. to
  942. .CW X15 .
  943. As with the 386 instruction names,
  944. all new 64-bit integer instructions, and the MMX and XMM instructions
  945. uniformly use
  946. .CW L
  947. for `long word' (32 bits) and
  948. .CW Q
  949. for `quad word' (64 bits).
  950. Some instructions use
  951. .CW O
  952. (`octword') for 128-bit values, where the processor handbook
  953. variously uses
  954. .CW O
  955. or
  956. .CW DQ .
  957. The assembler also consistently uses
  958. .CW PL
  959. for `packed long' in
  960. XMM instructions, instead of
  961. .CW Q ,
  962. .CW DQ
  963. or
  964. .CW PI .
  965. Either
  966. .CW MOVL
  967. or
  968. .CW MOVQ
  969. can be used to move values to and from control registers, even when
  970. the registers might be 64 bits.
  971. The assembler often accepts the handbook's name to ease conversion
  972. of existing code (but remember that the operand order is uniformly
  973. source then destination).
  974. C's
  975. .CW "long long"
  976. type is 64 bits, but passed and returned by value, not by reference.
  977. More notably, C pointer values are 64 bits, and thus
  978. .CW "long long"
  979. and
  980. .CW "unsigned long long"
  981. are the only integer types wide enough to hold a pointer value.
  982. The C compiler and library use the XMM floating-point instructions, not
  983. the old 387 ones, although the latter are implemented by assembler and loader.
Unlike the 386, the first integer or pointer argument is passed in a register,
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
  988. .CW AX
  989. holds the return value from subroutines as before.
  990. Floating-point results are returned in
  991. .CW X0 ,
  992. although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8-byte slots reserved on the stack
  994. to preserve alignment and simplify variable-length argument list access,
  995. including the first parameter when passed in a register,
  996. even though bytes 4 to 7 are not initialized.
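.PP
A minimal sketch of these conventions, for a hypothetical C function
.CW "long long add3(long long a, long long b, long long c)"
(the names are invented for the example):
.P1
TEXT	add3(SB), $0
	MOVQ	RARG, AX	/* a arrives in BP; its slot at 0(FP) is reserved */
	ADDQ	b+8(FP), AX	/* b is in the second 8-byte slot */
	ADDQ	c+16(FP), AX	/* c is in the third */
	RET			/* result in AX */
.P2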
  997. .SH
  998. Alpha
  999. .PP
  1000. On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
  1001. by giving them a canonical format (sign extension in the case of integer registers).
  1002. Registers are numbered
  1003. .CW R0
  1004. through
  1005. .CW R31 .
  1006. .CW R0
  1007. holds the return value from subroutines, and also the first parameter.
  1008. .CW R30
  1009. is the stack pointer,
  1010. .CW R29
  1011. is the static base,
  1012. .CW R26
  1013. is the link register, and
  1014. .CW R27
  1015. and
  1016. .CW R28
  1017. are linker temporaries.
  1018. .PP
  1019. Floating point registers are numbered
  1020. .CW F0
  1021. to
  1022. .CW F31 .
  1023. .CW F28
  1024. contains
  1025. .CW 0.5 ,
  1026. .CW F29
  1027. contains
  1028. .CW 1.0 ,
  1029. and
  1030. .CW F30
  1031. contains
  1032. .CW 2.0 .
  1033. .CW F31
  1034. is always
  1035. .CW 0.0
  1036. on the Alpha.
  1037. .PP
  1038. The extension character for
  1039. .CW MOV
  1040. follows DEC's notation:
  1041. .CW B
  1042. for byte (8 bits),
  1043. .CW W
  1044. for word (16 bits),
  1045. .CW L
  1046. for long (32 bits),
  1047. and
  1048. .CW Q
  1049. for quadword (64 bits).
  1050. Byte and ``word'' loads and stores may be made unsigned
  1051. by appending a
  1052. .CW U .
  1053. .CW S
  1054. and
  1055. .CW T
  1056. refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
  1057. .SH
  1058. Power PC
  1059. .PP
  1060. The Power PC follows the Plan 9 model set by the MIPS and SPARC,
  1061. not the elaborate ABIs.
  1062. The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
  1063. there is no support for the older POWER instructions.
  1064. Registers are
  1065. .CW R0
  1066. through
  1067. .CW R31 .
  1068. .CW R0
  1069. is initialized to zero; this is done by C start up code
  1070. and assumed by the compiler and loader.
  1071. .CW R1
  1072. is the stack pointer.
  1073. .CW R2
is the static base register, whose value is the address of
  1075. .CW setSB(SB) .
  1076. .CW R3
  1077. is the return register and also the register holding the first
  1078. argument to a C function, with space reserved at
  1079. .CW 0(FP)
  1080. as on the MIPS.
  1081. .CW R31
  1082. is the loader temporary.
  1083. The external registers in Plan 9's C are allocated from
  1084. .CW R30
  1085. down.
  1086. .PP
  1087. Floating point registers are called
  1088. .CW F0
  1089. through
  1090. .CW F31 .
  1091. By convention, several registers are initialized
  1092. to specific values; this is done by the operating system.
  1093. .CW F27
  1094. must be initialized to the value
  1095. .CW 0x4330000080000000
  1096. (used by float-to-int conversion),
  1097. .CW F28
  1098. to the value 0.0,
  1099. .CW F29
  1100. to 0.5,
  1101. .CW F30
  1102. to 1.0, and
  1103. .CW F31
  1104. to 2.0.
  1105. .PP
  1106. As on the MIPS and SPARC, the assembler accepts arbitrary literals
  1107. as operands to
  1108. .CW MOVW ,
  1109. and also to
  1110. .CW ADD
  1111. and others where `immediate' variants exist,
  1112. and the loader generates sequences
  1113. of
  1114. .CW addi ,
  1115. .CW addis ,
  1116. .CW oris ,
  1117. etc. as required.
  1118. The register indirect addressing modes use the same syntax as the SPARC,
  1119. including double indexing when allowed.
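.PP
For example, neither constant below fits in a 16-bit immediate field,
so the loader selects whatever instructions are needed to build it;
the last line is a sketch of double indexing, assuming the SPARC form:
.P1
	MOVW	$0x12345678, R4
	ADD	$0x12345, R4
	MOVW	(R5+R6), R7
.P2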
  1120. .PP
  1121. The instruction names are generally derived from the Motorola ones,
  1122. subject to slight transformation:
  1123. the
  1124. .CW . ' `
  1125. marking the setting of condition codes is replaced by
  1126. .CW CC ,
  1127. and when the letter
  1128. .CW o ' `
  1129. represents `OE=1' it is replaced by
  1130. .CW V .
  1131. Thus
  1132. .CW add ,
  1133. .CW addo.
  1134. and
  1135. .CW subfzeo.
  1136. become
  1137. .CW ADD ,
  1138. .CW ADDVCC
  1139. and
  1140. .CW SUBFZEVCC .
  1141. As well as the three-operand conditional branch instruction
  1142. .CW BC ,
  1143. the assembler provides pseudo-instructions for the common cases:
  1144. .CW BEQ ,
  1145. .CW BNE ,
  1146. .CW BGT ,
  1147. .CW BGE ,
  1148. .CW BLT ,
  1149. .CW BLE ,
  1150. .CW BVC ,
  1151. and
  1152. .CW BVS .
  1153. The unconditional branch instruction is
  1154. .CW BR .
  1155. Indirect branches use
  1156. .CW "(CTR)"
  1157. or
  1158. .CW "(LR)"
  1159. as target.
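.PP
For illustration (the label is invented for the example):
.P1
	ADDCC	R4, R5, R6	/* add. */
	BEQ	zero		/* taken if the result was zero */
	ADDV	R4, R5, R6	/* addo */
zero:
	BR	(LR)		/* indirect branch through the link register */
.P2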
  1160. .PP
  1161. Load or store operations are replaced by
  1162. .CW MOV
  1163. variants in the usual way:
  1164. .CW MOVW
  1165. (move word),
  1166. .CW MOVH
  1167. (move halfword with sign extension), and
  1168. .CW MOVB
  1169. (move byte with sign extension, a pseudo-instruction),
  1170. with unsigned variants
  1171. .CW MOVHZ
  1172. and
  1173. .CW MOVBZ ,
  1174. and byte-reversing
  1175. .CW MOVWBR
  1176. and
  1177. .CW MOVHBR .
  1178. `Load or store with update' versions are
  1179. .CW MOVWU ,
  1180. .CW MOVHU ,
  1181. and
  1182. .CW MOVBZU .
  1183. Load or store multiple is
  1184. .CW MOVMW .
  1185. The exceptions are the string instructions, which are
  1186. .CW LSW
  1187. and
  1188. .CW STSW ,
  1189. and the reservation instructions
  1190. .CW lwarx
  1191. and
  1192. .CW stwcx. ,
  1193. which are
  1194. .CW LWAR
  1195. and
  1196. .CW STWCCC ,
  1197. all with operands in the usual data-flow order.
  1198. Floating-point load or store instructions are
  1199. .CW FMOVD ,
  1200. .CW FMOVDU ,
  1201. .CW FMOVS ,
  1202. and
  1203. .CW FMOVSU .
  1204. The register to register move instructions
  1205. .CW fmr
  1206. and
  1207. .CW fmr.
  1208. are written
  1209. .CW FMOVD
  1210. and
  1211. .CW FMOVDCC .
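.PP
A few moves, each with the Motorola instruction it is expected to produce
(the registers and offsets are arbitrary):
.P1
	MOVW	0(R3), R4	/* lwz */
	MOVW	R4, 4(R3)	/* stw */
	MOVH	8(R3), R5	/* lha */
	MOVBZ	10(R3), R6	/* lbz */
	MOVWU	4(R3), R7	/* lwzu */
	FMOVD	16(R3), F1	/* lfd */
	FMOVD	F1, F2		/* fmr */
.P2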
  1212. .PP
  1213. The assembler knows the commonly used special purpose registers:
  1214. .CW CR ,
  1215. .CW CTR ,
  1216. .CW DEC ,
  1217. .CW LR ,
  1218. .CW MSR ,
  1219. and
  1220. .CW XER .
  1221. The rest, which are often architecture-dependent, are referenced as
  1222. .CW SPR(n) .
  1223. The segment registers of the 60x series are similarly
  1224. .CW SEG(n) ,
  1225. but
  1226. .I n
  1227. can also be a register name, as in
  1228. .CW SEG(R3) .
  1229. Moves between special purpose registers and general purpose ones,
  1230. when allowed by the architecture,
  1231. are written as
  1232. .CW MOVW ,
  1233. replacing
  1234. .CW mfcr ,
  1235. .CW mtcr ,
  1236. .CW mfmsr ,
  1237. .CW mtmsr ,
  1238. .CW mtspr ,
  1239. .CW mfspr ,
  1240. .CW mftb ,
  1241. and many others.
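.PP
For example (272 is assumed here to be the number of
.CW SPRG0 ;
check the manual for the processor at hand):
.P1
	MOVW	LR, R5		/* mfspr, often written mflr */
	MOVW	R5, CTR		/* mtspr, often written mtctr */
	MOVW	MSR, R6		/* mfmsr */
	MOVW	R6, SPR(272)	/* mtspr */
.P2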
  1242. .PP
  1243. The fields of the condition register
  1244. .CW CR
  1245. are referenced as
  1246. .CW CR(0)
  1247. through
  1248. .CW CR(7) .
  1249. They are used by the
  1250. .CW MOVFL
  1251. (move field) pseudo-instruction,
  1252. which produces
  1253. .CW mcrf
  1254. or
  1255. .CW mtcrf .
  1256. For example:
  1257. .P1
  1258. MOVFL CR(3), CR(0)
  1259. MOVFL R3, CR(1)
  1260. MOVFL R3, $7, CR
  1261. .P2
  1262. They are also accepted in
  1263. the conditional branch instruction, for example
  1264. .P1
  1265. BEQ CR(7), label
  1266. .P2
  1267. Fields of the
  1268. .CW FPSCR
  1269. are accessed using
  1270. .CW MOVFL
  1271. in a similar way:
  1272. .P1
  1273. MOVFL FPSCR, F0
  1274. MOVFL F0, FPSCR
  1275. MOVFL F0, $7, FPSCR
  1276. MOVFL $0, FPSCR(3)
  1277. .P2
  1278. producing
  1279. .CW mffs ,
  1280. .CW mtfsf
  1281. or
  1282. .CW mtfsfi ,
  1283. as appropriate.
  1284. .SH
  1285. ARM
  1286. .PP
  1287. The assembler provides access to
  1288. .CW R0
  1289. through
  1290. .CW R14
  1291. and the
  1292. .CW PC .
  1293. The stack pointer is
  1294. .CW R13 ,
  1295. the link register is
  1296. .CW R14 ,
  1297. and the static base register is
  1298. .CW R12 .
  1299. .CW R0
  1300. is the return register and also the register holding
  1301. the first argument to a subroutine.
  1302. The assembler supports the
  1303. .CW CPSR
  1304. and
  1305. .CW SPSR
  1306. registers.
  1307. It also knows about coprocessor registers
  1308. .CW C0
  1309. through
  1310. .CW C15 .
Floating-point registers are
  1312. .CW F0
  1313. through
  1314. .CW F7 ,
  1315. .CW FPSR
  1316. and
  1317. .CW FPCR .
  1318. .PP
  1319. As with the other architectures, loads and stores are called
  1320. .CW MOV ,
  1321. e.g.
  1322. .CW MOVW
  1323. for load word or store word, and
  1324. .CW MOVM
  1325. for
  1326. load or store multiple,
  1327. depending on the operands.
  1328. .PP
  1329. Addressing modes are supported by suffixes to the instructions:
  1330. .CW .IA
  1331. (increment after),
  1332. .CW .IB
  1333. (increment before),
  1334. .CW .DA
  1335. (decrement after), and
  1336. .CW .DB
  1337. (decrement before).
  1338. These can only be used with the
  1339. .CW MOV
  1340. instructions.
  1341. The move multiple instruction,
  1342. .CW MOVM ,
  1343. defines a range of registers using brackets, e.g.
  1344. .CW [R0-R12] .
  1345. The special
  1346. .CW MOVM
  1347. addressing mode bits
  1348. .CW W ,
  1349. .CW U ,
  1350. and
  1351. .CW P
  1352. are written in the same manner, for example,
  1353. .CW MOVM.DB.W .
  1354. A
  1355. .CW .S
  1356. suffix allows a
  1357. .CW MOVM
  1358. instruction to access user
  1359. .CW R13
  1360. and
  1361. .CW R14
  1362. when in another processor mode.
  1363. Shifts and rotates in addressing modes are supported by binary operators
  1364. .CW <<
  1365. (logical left shift),
  1366. .CW >>
  1367. (logical right shift),
  1368. .CW ->
  1369. (arithmetic right shift), and
  1370. .CW @>
  1371. (rotate right); for example
.CW "R7>>R2"
or
  1373. .CW "R2@>2" .
  1374. The assembler does not support indexing by a shifted expression;
  1375. only names can be doubly indexed.
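.PP
A sketch of these forms; the register choices are arbitrary:
.P1
	MOVM.DB.W	[R4-R7], (R13)	/* push R4 through R7 */
	MOVM.IA.W	(R13), [R4-R7]	/* pop them again */
	MOVW	R2<<2, R3		/* R3 = R2 shifted left by 2 */
	ADD	R7>>R2, R0		/* shift count taken from R2 */
	MOVW	R2@>8, R3		/* R3 = R2 rotated right by 8 */
.P2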
  1376. .PP
  1377. Any instruction can be followed by a suffix that makes the instruction conditional:
  1378. .CW .EQ ,
  1379. .CW .NE ,
  1380. and so on, as in the ARM manual, with synonyms
  1381. .CW .HS
  1382. (for
  1383. .CW .CS )
  1384. and
  1385. .CW .LO
  1386. (for
  1387. .CW .CC ), for example
  1388. .CW ADD.NE .
  1389. Arithmetic
  1390. and logical instructions
  1391. can have a
  1392. .CW .S
  1393. suffix, as ARM allows, to set condition codes.
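.PP
For example, to set
.CW R0
to 1 if
.CW R1
is zero and to 0 otherwise, without branching
(the last line merely illustrates the
.CW .S
suffix):
.P1
	CMP	$0, R1
	MOVW.EQ	$1, R0
	MOVW.NE	$0, R0
	SUB.S	$1, R2		/* decrement R2 and set the condition codes */
.P2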
  1394. .PP
  1395. The syntax of the
  1396. .CW MCR
  1397. and
  1398. .CW MRC
  1399. coprocessor instructions is largely as in the manual, with the usual adjustments.
  1400. The assembler directly supports only the ARM floating-point coprocessor
  1401. operations used by the compiler:
  1402. .CW CMP ,
  1403. .CW ADD ,
  1404. .CW SUB ,
  1405. .CW MUL ,
  1406. and
  1407. .CW DIV ,
  1408. all with
  1409. .CW F
  1410. or
  1411. .CW D
  1412. suffix selecting single or double precision.
  1413. Floating-point load or store become
  1414. .CW MOVF
  1415. and
  1416. .CW MOVD .
  1417. Conversion instructions are also specified by moves:
  1418. .CW MOVWD ,
  1419. .CW MOVWF ,
  1420. .CW MOVDW ,
.CW MOVFW ,
  1422. .CW MOVFD ,
  1423. and
  1424. .CW MOVDF .
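.PP
A sketch, assuming that
.CW R1
points at a single-precision value in memory:
.P1
	MOVF	0(R1), F0	/* load single */
	MOVFD	F0, F1		/* convert single to double */
	ADDD	F1, F1		/* double it */
	MOVDW	F1, R0		/* convert double to integer */
.P2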
  1425. .SH
  1426. AMD 29000
  1427. .PP
  1428. For details about this assembly language, which was built for the AMD 29240,
  1429. look at the sources or examine compiler output.