  1. .HTML "A Manual for the Plan 9 assembler
  2. .ft CW
  3. .ta 8n +8n +8n +8n +8n +8n +8n
  4. .ft
  5. .TL
  6. A Manual for the Plan 9 assembler
  7. .AU
  8. Rob Pike
  9. rob@plan9.bell-labs.com
  10. .SH
  11. Machines
  12. .PP
  13. There is an assembler for each of the MIPS, SPARC, Intel 386, AMD64,
  14. Power PC, ARM, and RISC-V.
  15. The 68020 assembler,
  16. .CW 2a ,
  17. (no longer distributed)
  18. is the oldest and in many ways the prototype.
  19. The assemblers are really just variations of a single program:
  20. they share many properties such as left-to-right assignment order for
  21. instruction operands and the synthesis of macro instructions
  22. such as
  23. .CW MOVE
  24. to hide the peculiarities of the load and store structure of the machines.
  25. To keep things concrete, the first part of this manual is
  26. specifically about the 68020.
  27. At the end is a description of the differences among
  28. the other assemblers.
  29. .PP
  30. The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
  31. is a prerequisite for this manual.
  32. .SH
  33. Registers
  34. .PP
  35. All pre-defined symbols in the assembler are upper-case.
  36. Data registers are
  37. .CW R0
  38. through
  39. .CW R7 ;
  40. address registers are
  41. .CW A0
  42. through
  43. .CW A7 ;
  44. floating-point registers are
  45. .CW F0
  46. through
  47. .CW F7 .
  48. .PP
  49. A pointer in
  50. .CW A6
  51. is used by the C compiler to point to data, enabling short addresses to
  52. be used more often.
  53. The value of
  54. .CW A6
  55. is constant and must be set during C program initialization
  56. to the address of the externally-defined symbol
  57. .CW a6base .
  58. .PP
  59. The following hardware registers are defined in the assembler; their
  60. meaning should be obvious given a 68020 manual:
  61. .CW CAAR ,
  62. .CW CACR ,
  63. .CW CCR ,
  64. .CW DFC ,
  65. .CW ISP ,
  66. .CW MSP ,
  67. .CW SFC ,
  68. .CW SR ,
  69. .CW USP ,
  70. and
  71. .CW VBR .
  72. .PP
  73. The assembler also defines several pseudo-registers that
  74. manipulate the stack:
  75. .CW FP ,
  76. .CW SP ,
  77. and
  78. .CW TOS .
  79. .CW FP
  80. is the frame pointer, so
  81. .CW 0(FP)
  82. is the first argument,
  83. .CW 4(FP)
  84. is the second, and so on.
  85. .CW SP
  86. is the local stack pointer, where automatic variables are held
  87. (SP is a pseudo-register only on the 68020);
  88. .CW 0(SP)
  89. is the first automatic, and so on as with
  90. .CW FP .
  91. Finally,
  92. .CW TOS
  93. is the top-of-stack register, used for pushing parameters to procedures,
  94. saving temporary values, and so on.
  95. .PP
  96. The assembler and loader track these pseudo-registers so
  97. the above statements are true regardless of what has been
  98. pushed on the hardware stack, pointed to by
  99. .CW A7 .
  100. The name
  101. .CW A7
  102. refers to the hardware stack pointer, but beware of mixed use of
  103. .CW A7
  104. and the above stack-related pseudo-registers, which will cause trouble.
  105. Note, too, that the
  106. .CW PEA
  107. instruction is observed by the loader to
  108. alter SP; the loader will thus insert a corresponding pop before all returns.
  109. The assembler accepts a label-like name to be attached to
  110. .CW FP
  111. and
  112. .CW SP
  113. uses, such as
  114. .CW p+0(FP) ,
  115. to help document that
  116. .CW p
  117. is the first argument to a routine.
  118. The name goes in the symbol table but has no significance to the result
  119. of the program.
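.PP
As a minimal sketch of these conventions, the fragment below reads its first argument through
.CW FP ,
stores it in the first automatic through
.CW SP ,
and pushes a copy through
.CW TOS ,
as one might before a call.
The names
.CW p
and
.CW tmp
are hypothetical labels chosen for illustration, and the fragment assumes
the enclosing routine has reserved at least four bytes of automatic storage.
.P1
MOVL p+0(FP), R0 /* first argument */
MOVL R0, tmp+0(SP) /* first automatic */
MOVL R0, TOS /* push a copy */
.P2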
  120. .SH
  121. Referring to data
  122. .PP
  123. All external references must be made relative to some pseudo-register,
  124. either
  125. .CW PC
  126. (the virtual program counter) or
  127. .CW SB
  128. (the ``static base'' register).
  129. .CW PC
  130. counts instructions, not bytes of data.
  131. For example, to branch to the second following instruction, that is,
  132. to skip one instruction, one may write
  133. .P1
  134. BRA 2(PC)
  135. .P2
  136. Labels are also allowed, as in
  137. .P1
  138. BRA return
  139. NOP
  140. return:
  141. RTS
  142. .P2
  143. When using labels, there is no
  144. .CW (PC)
  145. annotation.
  146. .PP
  147. The pseudo-register
  148. .CW SB
  149. refers to the beginning of the address space of the program.
  150. Thus, references to global data and procedures are written as
  151. offsets to
  152. .CW SB ,
  153. as in
  154. .P1
  155. MOVL $array(SB), TOS
  156. .P2
  157. to push the address of a global array on the stack, or
  158. .P1
  159. MOVL array+4(SB), TOS
  160. .P2
  161. to push the second (4-byte) element of the array.
  162. Note the use of an offset; the complete list of addressing modes is given below.
  163. Similarly, subroutine calls must use
  164. .CW SB :
  165. .P1
  166. BSR exit(SB)
  167. .P2
  168. File-static variables have syntax
  169. .P1
  170. local<>+4(SB)
  171. .P2
  172. The
  173. .CW <>
  174. will be filled in at load time by a unique integer.
  175. .PP
  176. When a program starts, it must execute
  177. .P1
  178. MOVL $a6base(SB), A6
  179. .P2
  180. before accessing any global data.
  181. (On machines such as the MIPS and SPARC that cannot load a register
  182. in a single instruction, constants are loaded through the static base
  183. register. The loader recognizes code that initializes the static
  184. base register and treats it specially. You must be careful, however,
  185. not to load large constants on such machines when the static base
  186. register is not set up, such as early in interrupt routines.)
  187. .SH
  188. Expressions
  189. .PP
  190. Expressions are mostly what one might expect.
  191. Where an offset or a constant is expected,
  192. a primary expression with unary operators is allowed.
  193. A general C constant expression is allowed in parentheses.
  194. .PP
  195. Source files are preprocessed exactly as in the C compiler, so
  196. .CW #define
  197. and
  198. .CW #include
  199. work.
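.PP
For example, assuming a hypothetical global
.CW array
of 4-byte elements, preprocessor constants and parenthesized expressions
might be combined as in this sketch (the macro names are illustrative, not taken from any real program):
.P1
#define NELEM 8
#define ELEMSIZE 4
MOVL $(NELEM*ELEMSIZE), R0 /* the constant 32 */
MOVL array+(2*ELEMSIZE)(SB), R1 /* third element, at offset 8 */
.P2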
  200. .SH
  201. Addressing modes
  202. .PP
  203. The simple addressing modes are shared by all the assemblers.
  204. Here, for completeness, follows a table of all the 68020 addressing modes,
  205. since that machine has the richest set.
  206. In the table,
  207. .CW o
  208. is an offset, which if zero may be elided, and
  209. .CW d
  210. is a displacement, which is a constant between -128 and 127 inclusive.
  211. Many of the modes listed have the same name;
  212. scrutiny of the format will show what default is being applied.
  213. For instance, indexed mode with no address register supplied operates
  214. as though a zero-valued register were used.
  215. For "offset" read "displacement."
  216. For "\f(CW.s\fP" read one of
  217. .CW .L
  218. or
  219. .CW .W
  220. followed by
  221. .CW *1 ,
  222. .CW *2 ,
  223. .CW *4 ,
  224. or
  225. .CW *8
  226. to indicate the size and scaling of the data.
  227. .IP
  228. .TS
  229. l lfCW.
  230. data register	R0
  231. address register	A0
  232. floating-point register	F0
  233. special names	CAAR, CACR, etc.
  234. constant	$con
  235. floating point constant	$fcon
  236. external symbol	name+o(SB)
  237. local symbol	name<>+o(SB)
  238. automatic symbol	name+o(SP)
  239. argument	name+o(FP)
  240. address of external	$name+o(SB)
  241. address of local	$name<>+o(SB)
  242. indirect post-increment	(A0)+
  243. indirect pre-decrement	-(A0)
  244. indirect with offset	o(A0)
  245. indexed with offset	o()(R0.s)
  246. indexed with offset	o(A0)(R0.s)
  247. external indexed	name+o(SB)(R0.s)
  248. local indexed	name<>+o(SB)(R0.s)
  249. automatic indexed	name+o(SP)(R0.s)
  250. parameter indexed	name+o(FP)(R0.s)
  251. offset indirect post-indexed	d(o())(R0.s)
  252. offset indirect post-indexed	d(o(A0))(R0.s)
  253. external indirect post-indexed	d(name+o(SB))(R0.s)
  254. local indirect post-indexed	d(name<>+o(SB))(R0.s)
  255. automatic indirect post-indexed	d(name+o(SP))(R0.s)
  256. parameter indirect post-indexed	d(name+o(FP))(R0.s)
  257. offset indirect pre-indexed	d(o()(R0.s))
  258. offset indirect pre-indexed	d(o(A0))
  259. offset indirect pre-indexed	d(o(A0)(R0.s))
  260. external indirect pre-indexed	d(name+o(SB))
  261. external indirect pre-indexed	d(name+o(SB)(R0.s))
  262. local indirect pre-indexed	d(name<>+o(SB))
  263. local indirect pre-indexed	d(name<>+o(SB)(R0.s))
  264. automatic indirect pre-indexed	d(name+o(SP))
  265. automatic indirect pre-indexed	d(name+o(SP)(R0.s))
  266. parameter indirect pre-indexed	d(name+o(FP))
  267. parameter indirect pre-indexed	d(name+o(FP)(R0.s))
  268. .TE
  269. .in
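.PP
A few of these modes in use, as a sketch; the symbol
.CW array
is hypothetical and the register choices are arbitrary:
.P1
MOVL (A0)+, R1 /* indirect post-increment */
MOVL R1, -(A1) /* indirect pre-decrement */
MOVL 8(A0), R2 /* indirect with offset */
MOVL array+0(SB)(R0.L*4), R3 /* external indexed, R0 scaled by 4 */
.P2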
  270. .SH
  271. Laying down data
  272. .PP
  273. Placing data in the instruction stream, say for interrupt vectors, is easy:
  274. the pseudo-instructions
  275. .CW LONG
  276. and
  277. .CW WORD
  278. (but not
  279. .CW BYTE )
  280. lay down the value of their single argument, of the appropriate size,
  281. as if it were an instruction:
  282. .P1
  283. LONG $12345
  284. .P2
  285. places the long 12345 (base 10)
  286. in the instruction stream.
  287. (On most machines,
  288. the only such operator is
  289. .CW WORD
  290. and it lays down 32-bit quantities.
  291. The 386 has all three:
  292. .CW LONG ,
  293. .CW WORD ,
  294. and
  295. .CW BYTE .
  296. The AMD64 adds
  297. .CW QUAD
  298. to that for 64-bit values.
  299. The 960 has only one,
  300. .CW LONG .)
  301. .PP
  302. Placing information in the data section is more painful.
  303. The pseudo-instruction
  304. .CW DATA
  305. does the work, given two arguments: an address at which to place the item,
  306. including its size,
  307. and the value to place there. For example, to define a character array
  308. .CW array
  309. containing the characters
  310. .CW abc
  311. and a terminating null:
  312. .P1
  313. DATA array+0(SB)/1, $'a'
  314. DATA array+1(SB)/1, $'b'
  315. DATA array+2(SB)/1, $'c'
  316. GLOBL array(SB), $4
  317. .P2
  318. or
  319. .P1
  320. DATA array+0(SB)/4, $"abc\ez"
  321. GLOBL array(SB), $4
  322. .P2
  323. The
  324. .CW /1
  325. gives the number of bytes to define,
  326. .CW GLOBL
  327. makes the symbol global, and the
  328. .CW $4
  329. says how many bytes the symbol occupies.
  330. Uninitialized data is zeroed automatically.
  331. The character
  332. .CW \ez
  333. is equivalent to the C
  334. .CW \e0 .
  335. The string in a
  336. .CW DATA
  337. statement may contain a maximum of eight bytes;
  338. build larger strings piecewise.
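.PP
As a sketch of building a longer string piecewise, a hypothetical 12-byte array
.CW msg
holding
.CW "hello world"
and its terminating null could be defined in two pieces of eight and four bytes:
.P1
DATA msg+0(SB)/8, $"hello wo"
DATA msg+8(SB)/4, $"rld\ez"
GLOBL msg(SB), $12
.P2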
  339. Two pseudo-instructions,
  340. .CW DYNT
  341. and
  342. .CW INIT ,
  343. allow the (obsolete) Alef compilers to build dynamic type information during the load
  344. phase.
  345. The
  346. .CW DYNT
  347. pseudo-instruction has two forms:
  348. .P1
  349. DYNT , ALEF_SI_5+0(SB)
  350. DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
  351. .P2
  352. In the first form,
  353. .CW DYNT
  354. defines the symbol to be a small unique integer constant, chosen by the loader,
  355. which is some multiple of the word size. In the second form,
  356. .CW DYNT
  357. defines the second symbol in the same way,
  358. places the address of the most recently
  359. defined text symbol in the array specified by the first symbol at the
  360. index defined by the value of the second symbol,
  361. and then adjusts the size of the array accordingly.
  362. .PP
  363. The
  364. .CW INIT
  365. pseudo-instruction takes the same parameters as a
  366. .CW DATA
  367. statement. Its symbol is used as the base of an array and the
  368. data item is installed in the array at the offset specified by the most recent
  369. .CW DYNT
  370. pseudo-instruction.
  371. The size of the array is adjusted accordingly.
  372. The
  373. .CW DYNT
  374. and
  375. .CW INIT
  376. pseudo-instructions are not implemented on the 68020.
  377. .SH
  378. Defining a procedure
  379. .PP
  380. Entry points are defined by the pseudo-operation
  381. .CW TEXT ,
  382. which takes as arguments the name of the procedure (including the ubiquitous
  383. .CW (SB) )
  384. and the number of bytes of automatic storage to pre-allocate on the stack,
  385. which will usually be zero when writing assembly language programs.
  386. On machines with a link register, such as the MIPS and SPARC,
  387. the special value -4 instructs the loader to generate no PC save
  388. and restore instructions, even if the function is not a leaf.
  389. Here is a complete procedure that returns the sum
  390. of its two arguments:
  391. .P1
  392. TEXT sum(SB), $0
  393. MOVL arg1+0(FP), R0
  394. ADDL arg2+4(FP), R0
  395. RTS
  396. .P2
  397. An optional middle argument
  398. to the
  399. .CW TEXT
  400. pseudo-op is a bit field of options to the loader.
  401. Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
  402. the program.
  403. For example,
  404. .P1
  405. TEXT sum(SB), 1, $0
  406. MOVL arg1+0(FP), R0
  407. ADDL arg2+4(FP), R0
  408. RTS
  409. .P2
  410. will not be profiled; the first version above would be.
  411. Subroutines with peculiar state, such as system call routines,
  412. should not be profiled.
  413. .PP
  414. Setting the 2 bit allows multiple definitions of the same
  415. .CW TEXT
  416. symbol in a program; the loader will place only one such function in the image.
  417. It was emitted only by the Alef compilers.
  418. .PP
  419. Subroutines to be called from C should place their result in
  420. .CW R0 ,
  421. even if it is an address.
  422. Floating point values are returned in
  423. .CW F0 .
  424. Functions that return a structure to a C program
  425. receive as their first argument the address of the location to
  426. store the result;
  427. .CW R0
  428. is unused in the calling protocol for such procedures.
  429. A caller is responsible for saving its own registers around a call,
  430. and a subroutine is therefore free to use any registers without saving them (``caller saves'').
  431. .CW A6
  432. and
  433. .CW A7
  434. are the exceptions as described above.
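.PP
For instance, a routine that returns a pointer to C places the address in
.CW R0 .
This sketch assumes a hypothetical global
.CW array
and a hypothetical routine name
.CW second ;
from C it might be declared as
.CW "long *second(void)" :
.P1
TEXT second(SB), $0
MOVL $array+4(SB), R0 /* address of the second 4-byte element */
RTS
.P2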
  435. .SH
  436. When in doubt
  437. .PP
  438. If you get confused, try using the
  439. .CW -S
  440. option to
  441. .CW 2c
  442. and compiling a sample program.
  443. The standard output is valid input to the assembler.
  444. .SH
  445. Instructions
  446. .PP
  447. The instruction set of the assembler is not identical to that
  448. of the machine.
  449. It is chosen to match what the compiler generates, augmented
  450. slightly by specific needs of the operating system.
  451. For example,
  452. .CW 2a
  453. does not distinguish between the various forms of
  454. .CW MOVE
  455. instruction: move quick, move address, etc. Instead the context
  456. does the job. For example,
  457. .P1
  458. MOVL $1, R1
  459. MOVL A0, R2
  460. MOVW SR, R3
  461. .P2
  462. generates official
  463. .CW MOVEQ ,
  464. .CW MOVEA ,
  465. and
  466. .CW MOVESR
  467. instructions.
  468. A number of instructions do not have the syntax necessary to specify
  469. their entire capabilities. Notable examples are the bitfield
  470. instructions, the
  471. multiply and divide instructions, etc.
  472. For a complete set of generated instruction names (in
  473. .CW 2a
  474. notation, not Motorola's) see the file
  475. .CW /sys/src/cmd/2c/2.out.h .
  476. Despite its name, this file contains an enumeration of the
  477. instructions that appear in the intermediate files generated
  478. by the compiler, which correspond exactly to lines of assembly language.
  479. .SH
  480. Laying down instructions
  481. .PP
  482. The loader modifies the code produced by the assembler and compiler.
  483. It folds branches,
  484. copies short sequences of code to eliminate branches,
  485. and discards unreachable code.
  486. The first instruction of every function is assumed to be reachable.
  487. The pseudo-instruction
  488. .CW NOP ,
  489. which you may see in compiler output,
  490. means no instruction at all, rather than an instruction that does nothing.
  491. The loader discards all
  492. .CW NOP 's.
  493. .PP
  494. To generate a true
  495. .CW NOP
  496. instruction, or any other instruction not known to the assembler, use a
  497. .CW WORD
  498. pseudo-instruction.
  499. Such instructions on RISCs are not scheduled by the loader and must have
  500. their delay slots filled manually.
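.PP
For example, on the 68020 a true no-op could be laid down by its opcode.
The value
.CW 0x4E71
is the Motorola encoding of
.CW NOP ;
it is supplied here as an illustration and is not taken from the assembler's sources:
.P1
WORD $0x4E71
.P2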
  501. .SH
  502. MIPS
  503. .PP
  504. The registers are only addressed by number:
  505. .CW R0
  506. through
  507. .CW R31 .
  508. .CW R29
  509. is the stack pointer;
  510. .CW R30
  511. is used as the static base pointer, the analogue of
  512. .CW A6
  513. on the 68020.
  514. Its value is the address of the global symbol
  515. .CW setR30(SB) .
  516. The register holding returned values from subroutines is
  517. .CW R1 .
  518. When a function is called, space for the first argument
  519. is reserved at
  520. .CW 0(FP)
  521. but in C (not Alef) the value is passed in
  522. .CW R1
  523. instead.
  524. .PP
  525. The loader uses
  526. .CW R28
  527. as a temporary. The system uses
  528. .CW R26
  529. and
  530. .CW R27
  531. as interrupt-time temporaries. Therefore none of these registers
  532. should be used in user code.
  533. .PP
  534. The control registers are not known to the assembler.
  535. Instead they are numbered registers
  536. .CW M0 ,
  537. .CW M1 ,
  538. etc.
  539. Use this trick to access, say,
  540. .CW STATUS :
  541. .P1
  542. #define STATUS 12
  543. MOVW M(STATUS), R1
  544. .P2
  545. .PP
  546. Floating point registers are called
  547. .CW F0
  548. through
  549. .CW F31 .
  550. By convention,
  551. .CW F24
  552. must be initialized to the value 0.0,
  553. .CW F26
  554. to 0.5,
  555. .CW F28
  556. to 1.0, and
  557. .CW F30
  558. to 2.0;
  559. this is done by the operating system.
  560. .PP
  561. The instructions and their syntax are different from those of the manufacturer's
  562. manual.
  563. There are no
  564. .CW lui
  565. and kin; instead there are
  566. .CW MOVW
  567. (move word),
  568. .CW MOVH
  569. (move halfword),
  570. and
  571. .CW MOVB
  572. (move byte) pseudo-instructions. If the operand is unsigned, the instructions
  573. are
  574. .CW MOVHU
  575. and
  576. .CW MOVBU .
  577. The order of operands is from left to right in dataflow order, just as
  578. on the 68020 but not as in MIPS documentation.
  579. This means that the
  580. .CW Bcond
  581. instructions are reversed with respect to the book; for example, a
  582. .CW va
  583. .CW BGTZ
  584. generates a MIPS
  585. .CW bltz
  586. instruction.
  587. .PP
  588. The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
  589. It understands the 64-bit instructions
  590. .CW MOVV ,
  591. .CW MOVVL ,
  592. .CW ADDV ,
  593. .CW ADDVU ,
  594. .CW SUBV ,
  595. .CW SUBVU ,
  596. .CW MULV ,
  597. .CW MULVU ,
  598. .CW DIVV ,
  599. .CW DIVVU ,
  600. .CW SLLV ,
  601. .CW SRLV ,
  602. and
  603. .CW SRAV .
  604. The assembler does not have any cache, load-linked, or store-conditional instructions.
  605. .PP
  606. Some assembler instructions are expanded into multiple instructions by the loader.
  607. For example, the loader may convert the load of a 32-bit constant into an
  608. .CW lui
  609. followed by an
  610. .CW ori .
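.PP
As an illustration (the exact sequence the loader emits may differ), a constant load such as
.P1
MOVW $0x12345678, R1
.P2
could become, in the manufacturer's notation,
.P1
lui $1, 0x1234
ori $1, $1, 0x5678
.P2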
  611. .PP
  612. Assembler instructions should be laid out as if there
  613. were no load, branch, or floating point compare delay slots;
  614. the loader will rearrange\(em\f2schedule\f1\(emthe instructions
  615. to guarantee correctness and improve performance.
  616. The only exception is that the correct scheduling of instructions
  617. that use control registers varies from model to model of machine
  618. (and is often undocumented) so you should schedule such instructions
  619. by hand to guarantee correct behavior.
  620. The loader generates
  621. .P1
  622. NOR R0, R0, R0
  623. .P2
  624. when it needs a true no-op instruction.
  625. Use exactly this instruction when scheduling code manually;
  626. the loader recognizes it and schedules the code before it and after it independently. Also,
  627. .CW WORD
  628. pseudo-ops are scheduled like no-ops.
  629. .PP
  630. The
  631. .CW NOSCHED
  632. pseudo-op disables instruction scheduling
  633. (scheduling is enabled by default);
  634. .CW SCHED
  635. re-enables it.
  636. Branch folding, code copying, and dead code elimination are
  637. disabled for instructions that are not scheduled.
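.PP
A hand-scheduled access to a control register might therefore be bracketed as in this sketch,
which reuses the
.CW STATUS
definition shown earlier; the number of no-ops actually required depends on the
machine model and is an assumption here:
.P1
NOSCHED
MOVW M(STATUS), R1
NOR R0, R0, R0
SCHED
.P2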
  638. .SH
  639. SPARC
  640. .PP
  641. Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
  642. Registers have numerical names only:
  643. .CW R0
  644. through
  645. .CW R31 .
  646. Forget about register windows: Plan 9 doesn't use them at all.
  647. The machine has 32 global registers, period.
  648. .CW R1
  649. [sic] is the stack pointer.
  650. .CW R2
  651. is the static base register, whose value is the address of
  652. .CW setSB(SB) .
  653. .CW R7
  654. is the return register and also the register holding the first
  655. argument to a C (not Alef) function, again with space reserved at
  656. .CW 0(FP) .
  657. .CW R14
  658. is the loader temporary.
  659. .PP
  660. Floating-point registers are exactly as on the MIPS.
  661. .PP
  662. The control registers are known by names such as
  663. .CW FSR .
  664. The instructions to access these registers are
  665. .CW MOVW
  666. instructions, for example
  667. .P1
  668. MOVW Y, R8
  669. .P2
  670. for the SPARC instruction
  671. .P1
  672. rdy %r8
  673. .P2
  674. .PP
  675. Move instructions are similar to those on the MIPS: pseudo-operations
  676. that turn into appropriate sequences of
  677. .CW sethi
  678. instructions, adds, etc.
  679. Instructions read from left to right. Because the arguments are
  680. flipped to
  681. .CW SUBCC ,
  682. the condition codes are not inverted as on the MIPS.
  683. .PP
  684. An address space identifier (ASI) is written after the register, inside the parentheses; for example, to move a word from ASI 2:
  685. .P1
  686. MOVW (R7, 2), R8
  687. .P2
  688. The syntax for double indexing is
  689. .P1
  690. MOVW (R7+R8), R9
  691. .P2
  692. .PP
  693. The SPARC's instruction scheduling is similar to the MIPS's.
  694. The official no-op instruction is:
  695. .P1
  696. ORN R0, R0, R0
  697. .P2
  698. .SH
  699. i960
  700. .PP
  701. Registers are numbered
  702. .CW R0
  703. through
  704. .CW R31 .
  705. The stack pointer is
  706. .CW R29 ;
  707. the return register is
  708. .CW R4 ;
  709. the static base is
  710. .CW R28 ,
  711. which is initialized to the address of
  712. .CW setSB(SB) .
  713. .CW R3
  714. must be zero; this should be done manually early in execution by
  715. .P1
  716. SUBO R3, R3
  717. .P2
  718. .CW R27
  719. is the loader temporary.
  720. .PP
  721. There is no support for floating point.
  722. .PP
  723. The Intel calling convention is not supported and cannot be used; use
  724. .CW BAL
  725. instead.
  726. Instructions are mostly as in the book. The major change is that
  727. .CW LOAD
  728. and
  729. .CW STORE
  730. are both called
  731. .CW MOV .
  732. The extension character for
  733. .CW MOV
  734. is as in the manual:
  735. .CW O
  736. for ordinal,
  737. .CW W
  738. for signed, etc.
  739. .SH
  740. i386
  741. .PP
  742. The assembler assumes 32-bit protected mode.
  743. The register names are
  744. .CW SP ,
  745. .CW AX ,
  746. .CW BX ,
  747. .CW CX ,
  748. .CW DX ,
  749. .CW BP ,
  750. .CW DI ,
  751. and
  752. .CW SI .
  753. The stack pointer (not a pseudo-register) is
  754. .CW SP
  755. and the return register is
  756. .CW AX .
  757. There is no physical frame pointer but, as for the MIPS,
  758. .CW FP
  759. is a pseudo-register that acts as
  760. a frame pointer.
  761. .PP
  762. Opcode names are mostly the same as those listed in the Intel manual
  763. with an
  764. .CW L ,
  765. .CW W ,
  766. or
  767. .CW B
  768. appended to identify 32-bit,
  769. 16-bit, and 8-bit operations.
  770. The exceptions are loads, stores, and conditionals.
  771. All load and store opcodes to and from general registers, special registers
  772. (such as
  773. .CW CR0,
  774. .CW CR3,
  775. .CW GDTR,
  776. .CW IDTR,
  777. .CW SS,
  778. .CW CS,
  779. .CW DS,
  780. .CW ES,
  781. .CW FS,
  782. and
  783. .CW GS )
  784. or memory are written
  785. as
  786. .P1
  787. MOV\f2x\fP src,dst
  788. .P2
  789. where
  790. .I x
  791. is
  792. .CW L ,
  793. .CW W ,
  794. or
  795. .CW B .
  796. Thus to get
  797. .CW AL
  798. use a
  799. .CW MOVB
  800. instruction. If you need to access
  801. .CW AH ,
  802. you must mention it explicitly in a
  803. .CW MOVB :
  804. .P1
  805. MOVB AH, BX
  806. .P2
  807. There are many examples of illegal moves, for example,
  808. .P1
  809. MOVB BP, DI
  810. .P2
  811. that the loader actually implements as pseudo-operations.
  812. .PP
  813. The names of conditions in all conditional instructions
  814. .CW J , (
  815. .CW SET )
  816. follow the conventions of the 68020 instead of those of the Intel
  817. assembler:
  818. .CW JOS ,
  819. .CW JOC ,
  820. .CW JCS ,
  821. .CW JCC ,
  822. .CW JEQ ,
  823. .CW JNE ,
  824. .CW JLS ,
  825. .CW JHI ,
  826. .CW JMI ,
  827. .CW JPL ,
  828. .CW JPS ,
  829. .CW JPC ,
  830. .CW JLT ,
  831. .CW JGE ,
  832. .CW JLE ,
  833. and
  834. .CW JGT
  835. instead of
  836. .CW JO ,
  837. .CW JNO ,
  838. .CW JB ,
  839. .CW JNB ,
  840. .CW JZ ,
  841. .CW JNZ ,
  842. .CW JBE ,
  843. .CW JNBE ,
  844. .CW JS ,
  845. .CW JNS ,
  846. .CW JP ,
  847. .CW JNP ,
  848. .CW JL ,
  849. .CW JNL ,
  850. .CW JLE ,
  851. and
  852. .CW JNLE .
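.PP
Thus a test for inequality, which an Intel assembler would end with
.CW JNZ ,
is written as in this sketch (the label
.CW differ
and the register choices are hypothetical):
.P1
CMPL AX, BX
JNE differ
MOVL $1, CX
differ:
RET
.P2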
  853. .PP
  854. The addressing modes have syntax like
  855. .CW AX ,
  856. .CW (AX) ,
  857. .CW (AX)(BX*4) ,
  858. .CW 10(AX) ,
  859. and
  860. .CW 10(AX)(BX*4) .
  861. The offsets from
  862. .CW AX
  863. can be replaced by offsets from
  864. .CW FP
  865. or
  866. .CW SB
  867. to access names, for example
  868. .CW extern+5(SB)(AX*2) .
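.PP
In complete instructions these look like the following sketch; the symbol
.CW extern
and the choice of registers are illustrative only:
.P1
MOVL (AX), BX
MOVL 10(AX)(BX*4), CX
MOVL extern+5(SB)(AX*2), DX
.P2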
  869. .PP
  870. Other notes: Non-relative
  871. .CW JMP
  872. and
  873. .CW CALL
  874. have a
  875. .CW *
  876. added to the syntax.
  877. Only
  878. .CW LOOP ,
  879. .CW LOOPEQ ,
  880. and
  881. .CW LOOPNE
  882. are legal loop instructions. Only
  883. .CW REP
  884. and
  885. .CW REPN
  886. are recognized repeaters. These are not prefixes, but rather
  887. stand-alone opcodes that precede the strings, for example
  888. .P1
  889. CLD; REP; MOVSL
  890. .P2
  891. Segment override prefixes in
  892. .CW MOD/RM
  893. fields are not supported.
  894. .SH
  895. AMD64
  896. .PP
  897. The assembler assumes 64-bit mode unless a
  898. .CW MODE
  899. pseudo-operation is given:
  900. .P1
  901. MODE $32
  902. .P2
  903. to change to 32-bit mode.
  904. The effect is mainly to diagnose instructions that are illegal in
  905. the given mode, but the loader will also assume 32-bit operands and addresses,
  906. and 32-bit PC values for call and return.
  907. The assembler's conventions are similar to those for the 386, above.
  908. The architecture provides extra fixed-point registers
  909. .CW R8
  910. to
  911. .CW R15 .
  912. All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
  913. as described in the processor handbook.
  914. For example,
  915. .CW MOVL
  916. to
  917. .CW AX
  918. puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
  921. .CW MOVQ ,
  922. which allows 64-bit literals.
  923. The external registers in Plan 9's C are allocated from
  924. .CW R15
  925. down.
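.PP
A brief sketch of these rules, with arbitrary illustrative values:
.P1
	MOVQ	$0x123456789ABCDEF0, AX	/* MOVQ allows a 64-bit literal */
	MOVL	$1, AX		/* low-order 32 bits set; top 32 bits cleared */
	ADDQ	$-1, BX		/* 32-bit literal sign-extended to 64 bits */
.P2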
  926. .PP
  927. There are many new instructions, including the MMX and XMM media instructions,
  928. and conditional move instructions.
  929. MMX registers are
  930. .CW M0
  931. to
  932. .CW M7 ,
  933. and
  934. XMM registers are
  935. .CW X0
  936. to
  937. .CW X15 .
  938. As with the 386 instruction names,
  939. all new 64-bit integer instructions, and the MMX and XMM instructions
  940. uniformly use
  941. .CW L
  942. for `long word' (32 bits) and
  943. .CW Q
  944. for `quad word' (64 bits).
  945. Some instructions use
  946. .CW O
  947. (`octword') for 128-bit values, where the processor handbook
  948. variously uses
  949. .CW O
  950. or
  951. .CW DQ .
  952. The assembler also consistently uses
  953. .CW PL
  954. for `packed long' in
  955. XMM instructions, instead of
  956. .CW Q ,
  957. .CW DQ
  958. or
  959. .CW PI .
  960. Either
  961. .CW MOVL
  962. or
  963. .CW MOVQ
  964. can be used to move values to and from control registers, even when
  965. the registers might be 64 bits.
  966. The assembler often accepts the handbook's name to ease conversion
  967. of existing code (but remember that the operand order is uniformly
  968. source then destination).
  969. .PP
  970. C's
  971. .CW long
  972. .CW long
  973. type is 64 bits, but passed and returned by value, not by reference.
  974. More notably, C pointer values are 64 bits, and thus
  975. .CW long
  976. .CW long
  977. and
  978. .CW unsigned
  979. .CW long
  980. .CW long
  981. are the only integer types wide enough to hold a pointer value.
  982. The C compiler and library use the XMM floating-point instructions, not
  983. the old 387 ones, although the latter are implemented by assembler and loader.
Unlike the 386, the first integer or pointer argument is passed in a register,
.CW BP
(it can be referred to in assembly code by the pseudonym
  987. .CW RARG ).
  988. .CW AX
  989. holds the return value from subroutines as before.
  990. Floating-point results are returned in
  991. .CW X0 ,
  992. although currently the first floating-point parameter is not passed in a register.
  993. All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
  994. to preserve alignment and simplify variable-length argument list access,
  995. including the first parameter when passed in a register,
  996. even though bytes 4 to 7 are not initialized.
  997. .
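.PP
As a sketch of these conventions, a hypothetical function
.CW inc
that returns its single integer argument plus one might be written:
.P1
TEXT	inc(SB), $0
	MOVQ	RARG, AX	/* first argument arrives in BP */
	ADDQ	$1, AX		/* result is returned in AX */
	RET
.P2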
  998. .SH
  999. Power PC
  1000. .PP
  1001. The Power PC follows the Plan 9 model set by the MIPS and SPARC,
  1002. not the elaborate ABIs.
  1003. The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
  1004. there is no support for the older POWER instructions.
  1005. Registers are
  1006. .CW R0
  1007. through
  1008. .CW R31 .
  1009. .CW R0
  1010. is initialized to zero; this is done by C start up code
  1011. and assumed by the compiler and loader.
  1012. .CW R1
  1013. is the stack pointer.
  1014. .CW R2
  1015. is the static base register, with value the address of
  1016. .CW setSB(SB) .
  1017. .CW R3
  1018. is the return register and also the register holding the first
  1019. argument to a C function, with space reserved at
  1020. .CW 0(FP)
  1021. as on the MIPS.
  1022. .CW R31
  1023. is the loader temporary.
  1024. The external registers in Plan 9's C are allocated from
  1025. .CW R30
  1026. down.
  1027. .PP
  1028. Floating point registers are called
  1029. .CW F0
  1030. through
  1031. .CW F31 .
  1032. By convention, several registers are initialized
  1033. to specific values; this is done by the operating system.
  1034. .CW F27
  1035. must be initialized to the value
  1036. .CW 0x4330000080000000
  1037. (used by float-to-int conversion),
  1038. .CW F28
  1039. to the value 0.0,
  1040. .CW F29
  1041. to 0.5,
  1042. .CW F30
  1043. to 1.0, and
  1044. .CW F31
  1045. to 2.0.
  1046. .PP
  1047. As on the MIPS and SPARC, the assembler accepts arbitrary literals
  1048. as operands to
  1049. .CW MOVW ,
  1050. and also to
  1051. .CW ADD
  1052. and others where `immediate' variants exist,
  1053. and the loader generates sequences
  1054. of
  1055. .CW addi ,
  1056. .CW addis ,
  1057. .CW oris ,
  1058. etc. as required.
  1059. The register indirect addressing modes use the same syntax as the SPARC,
  1060. including double indexing when allowed.
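.PP
For instance, the following might be written without regard to
instruction encodings (the constants are arbitrary), leaving the
loader to choose the sequences:
.P1
	MOVW	$0x12345678, R5	/* wider than a 16-bit immediate */
	ADD	$0x10000, R5	/* immediate variant of ADD */
.P2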
  1061. .PP
  1062. The instruction names are generally derived from the Motorola ones,
  1063. subject to slight transformation:
  1064. the
  1065. .CW . ' `
  1066. marking the setting of condition codes is replaced by
  1067. .CW CC ,
  1068. and when the letter
  1069. .CW o ' `
  1070. represents `OE=1' it is replaced by
  1071. .CW V .
  1072. Thus
  1073. .CW add ,
  1074. .CW addo.
  1075. and
  1076. .CW subfzeo.
  1077. become
  1078. .CW ADD ,
  1079. .CW ADDVCC
  1080. and
  1081. .CW SUBFZEVCC .
  1082. As well as the three-operand conditional branch instruction
  1083. .CW BC ,
  1084. the assembler provides pseudo-instructions for the common cases:
  1085. .CW BEQ ,
  1086. .CW BNE ,
  1087. .CW BGT ,
  1088. .CW BGE ,
  1089. .CW BLT ,
  1090. .CW BLE ,
  1091. .CW BVC ,
  1092. and
  1093. .CW BVS .
  1094. The unconditional branch instruction is
  1095. .CW BR .
  1096. Indirect branches use
  1097. .CW "(CTR)"
  1098. or
  1099. .CW "(LR)"
  1100. as target.
  1101. .PP
  1102. Load or store operations are replaced by
  1103. .CW MOV
  1104. variants in the usual way:
  1105. .CW MOVW
  1106. (move word),
  1107. .CW MOVH
  1108. (move halfword with sign extension), and
  1109. .CW MOVB
  1110. (move byte with sign extension, a pseudo-instruction),
  1111. with unsigned variants
  1112. .CW MOVHZ
  1113. and
  1114. .CW MOVBZ ,
  1115. and byte-reversing
  1116. .CW MOVWBR
  1117. and
  1118. .CW MOVHBR .
  1119. `Load or store with update' versions are
  1120. .CW MOVWU ,
  1121. .CW MOVHU ,
  1122. and
  1123. .CW MOVBZU .
  1124. Load or store multiple is
  1125. .CW MOVMW .
  1126. The exceptions are the string instructions, which are
  1127. .CW LSW
  1128. and
  1129. .CW STSW ,
  1130. and the reservation instructions
  1131. .CW lwarx
  1132. and
  1133. .CW stwcx. ,
  1134. which are
  1135. .CW LWAR
  1136. and
  1137. .CW STWCCC ,
  1138. all with operands in the usual data-flow order.
  1139. Floating-point load or store instructions are
  1140. .CW FMOVD ,
  1141. .CW FMOVDU ,
  1142. .CW FMOVS ,
  1143. and
  1144. .CW FMOVSU .
  1145. The register to register move instructions
  1146. .CW fmr
  1147. and
  1148. .CW fmr.
  1149. are written
  1150. .CW FMOVD
  1151. and
  1152. .CW FMOVDCC .
  1153. .PP
  1154. The assembler knows the commonly used special purpose registers:
  1155. .CW CR ,
  1156. .CW CTR ,
  1157. .CW DEC ,
  1158. .CW LR ,
  1159. .CW MSR ,
  1160. and
  1161. .CW XER .
  1162. The rest, which are often architecture-dependent, are referenced as
  1163. .CW SPR(n) .
  1164. The segment registers of the 60x series are similarly
  1165. .CW SEG(n) ,
  1166. but
  1167. .I n
  1168. can also be a register name, as in
  1169. .CW SEG(R3) .
  1170. Moves between special purpose registers and general purpose ones,
  1171. when allowed by the architecture,
  1172. are written as
  1173. .CW MOVW ,
  1174. replacing
  1175. .CW mfcr ,
  1176. .CW mtcr ,
  1177. .CW mfmsr ,
  1178. .CW mtmsr ,
  1179. .CW mtspr ,
  1180. .CW mfspr ,
  1181. .CW mftb ,
  1182. and many others.
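.PP
For example, assuming the moves are permitted in the current
processor state:
.P1
	MOVW	MSR, R5		/* replaces mfmsr */
	MOVW	R5, MSR		/* replaces mtmsr */
	MOVW	LR, R6		/* replaces mfspr */
	MOVW	R6, CTR		/* replaces mtspr */
.P2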
  1183. .PP
  1184. The fields of the condition register
  1185. .CW CR
  1186. are referenced as
  1187. .CW CR(0)
  1188. through
  1189. .CW CR(7) .
  1190. They are used by the
  1191. .CW MOVFL
  1192. (move field) pseudo-instruction,
  1193. which produces
  1194. .CW mcrf
  1195. or
  1196. .CW mtcrf .
  1197. For example:
  1198. .P1
  1199. MOVFL CR(3), CR(0)
  1200. MOVFL R3, CR(1)
  1201. MOVFL R3, $7, CR
  1202. .P2
  1203. They are also accepted in
  1204. the conditional branch instruction, for example
  1205. .P1
  1206. BEQ CR(7), label
  1207. .P2
  1208. Fields of the
  1209. .CW FPSCR
  1210. are accessed using
  1211. .CW MOVFL
  1212. in a similar way:
  1213. .P1
  1214. MOVFL FPSCR, F0
  1215. MOVFL F0, FPSCR
  1216. MOVFL F0, $7, FPSCR
  1217. MOVFL $0, FPSCR(3)
  1218. .P2
  1219. producing
  1220. .CW mffs ,
  1221. .CW mtfsf
  1222. or
  1223. .CW mtfsfi ,
  1224. as appropriate.
  1225. .SH
  1226. ARM
  1227. .PP
  1228. The assembler provides access to
  1229. .CW R0
  1230. through
  1231. .CW R14
  1232. and the
  1233. .CW PC .
  1234. The stack pointer is
  1235. .CW R13 ,
  1236. the link register is
  1237. .CW R14 ,
  1238. and the static base register is
  1239. .CW R12 .
  1240. .CW R0
  1241. is the return register and also the register holding
  1242. the first argument to a subroutine.
  1243. The external registers in Plan 9's C are allocated from
  1244. .CW R10
  1245. down.
  1246. .CW R11
  1247. is used by the loader as a temporary register.
  1248. The assembler supports the
  1249. .CW CPSR
  1250. and
  1251. .CW SPSR
  1252. registers.
  1253. It also knows about coprocessor registers
  1254. .CW C0
  1255. through
  1256. .CW C15 .
  1257. Floating registers are
  1258. .CW F0
  1259. through
  1260. .CW F7 ,
  1261. .CW FPSR
  1262. and
  1263. .CW FPCR .
  1264. .PP
  1265. As with the other architectures, loads and stores are called
  1266. .CW MOV ,
  1267. e.g.
  1268. .CW MOVW
  1269. for load word or store word, and
  1270. .CW MOVM
  1271. for
  1272. load or store multiple,
  1273. depending on the operands.
  1274. .PP
  1275. Addressing modes are supported by suffixes to the instructions:
  1276. .CW .IA
  1277. (increment after),
  1278. .CW .IB
  1279. (increment before),
  1280. .CW .DA
  1281. (decrement after), and
  1282. .CW .DB
  1283. (decrement before).
  1284. These can only be used with the
  1285. .CW MOV
  1286. instructions.
  1287. The move multiple instruction,
  1288. .CW MOVM ,
  1289. defines a range of registers using brackets, e.g.
  1290. .CW [R0-R12] .
  1291. The special
  1292. .CW MOVM
  1293. addressing mode bits
  1294. .CW W ,
  1295. .CW U ,
  1296. and
  1297. .CW P
  1298. are written in the same manner, for example,
  1299. .CW MOVM.DB.W .
  1300. A
  1301. .CW .S
  1302. suffix allows a
  1303. .CW MOVM
  1304. instruction to access user
  1305. .CW R13
  1306. and
  1307. .CW R14
  1308. when in another processor mode.
  1309. Shifts and rotates in addressing modes are supported by binary operators
  1310. .CW <<
  1311. (logical left shift),
  1312. .CW >>
  1313. (logical right shift),
  1314. .CW ->
  1315. (arithmetic right shift), and
  1316. .CW @>
  1317. (rotate right); for example
.CW "R7>>R2"
or
  1319. .CW "R2@>2" .
  1320. The assembler does not support indexing by a shifted expression;
  1321. only names can be doubly indexed.
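.PP
A sketch of these suffixes and operators (the register choices are arbitrary,
and the readings of the suffixes in the comments follow the ARM manual):
.P1
	MOVM.DB.W	[R0-R12], (R13)	/* store R0-R12, decrement before, write back */
	MOVM.IA.W	(R13), [R0-R12]	/* load them again, increment after, write back */
	MOVW	R7>>R2, R1		/* R7 shifted right logically by R2 */
	MOVW	R2@>2, R1		/* R2 rotated right by 2 */
.P2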
  1322. .PP
  1323. Any instruction can be followed by a suffix that makes the instruction conditional:
  1324. .CW .EQ ,
  1325. .CW .NE ,
  1326. and so on, as in the ARM manual, with synonyms
  1327. .CW .HS
  1328. (for
  1329. .CW .CS )
  1330. and
  1331. .CW .LO
  1332. (for
  1333. .CW .CC ),
  1334. for example
  1335. .CW ADD.NE .
  1336. Arithmetic
  1337. and logical instructions
  1338. can have a
  1339. .CW .S
  1340. suffix, as ARM allows, to set condition codes.
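.PP
For example (an illustrative fragment only):
.P1
	CMP	$0, R1
	MOVW.EQ	$1, R0		/* executed only if R1 is zero */
	ADD.NE	$1, R2		/* executed only if R1 is not zero */
	SUB.S	$1, R3		/* sets the condition codes */
.P2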
  1341. .PP
  1342. The syntax of the
  1343. .CW MCR
  1344. and
  1345. .CW MRC
  1346. coprocessor instructions is largely as in the manual, with the usual adjustments.
  1347. The assembler directly supports only the ARM floating-point coprocessor
  1348. operations used by the compiler:
  1349. .CW CMP ,
  1350. .CW ADD ,
  1351. .CW SUB ,
  1352. .CW MUL ,
  1353. and
  1354. .CW DIV ,
  1355. all with
  1356. .CW F
  1357. or
  1358. .CW D
  1359. suffix selecting single or double precision.
  1360. Floating-point load or store become
  1361. .CW MOVF
  1362. and
  1363. .CW MOVD .
  1364. Conversion instructions are also specified by moves:
  1365. .CW MOVWD ,
  1366. .CW MOVWF ,
  1367. .CW MOVDW ,
.CW MOVFW ,
  1369. .CW MOVFD ,
  1370. and
  1371. .CW MOVDF .
  1372. .SH
  1373. RISC-V
  1374. .PP
  1375. The riscv and riscv64 assemblers support RV32GC and RV64GC instruction sets,
  1376. conforming as usual to Plan 9 syntax rather than the form described
  1377. in the RISC-V specification.
  1378. .PP
  1379. Registers are
  1380. .CW R0
  1381. through
  1382. .CW R31 ,
  1383. with
  1384. .CW R1
  1385. used as the link register,
  1386. .CW R2
  1387. as stack pointer,
  1388. .CW R3
  1389. as static base,
  1390. .CW R8
  1391. used for the first function argument and function return value, and
  1392. .CW R4
  1393. as the loader temporary.
  1394. These register conventions are different from the usual Plan 9 model,
  1395. for compatibility with the
  1396. compressed instruction set extension. For example, the compressed
  1397. form of the
  1398. .CW JAL
  1399. instruction assumes that the link register is
  1400. .CW R1 .
  1401. There are no separate opcode mnemonics for compressed instructions.
  1402. The loader will generate the compressed (2 byte) form of instructions
  1403. where possible, unless it is invoked with the
  1404. .CW -c
  1405. option.
  1406. .PP
  1407. Three-operand logical and arithmetic instructions are written in the order
  1408. .P1
  1409. op rs2, rs1, rd
  1410. .P2
  1411. where
.I rs1
  1413. may be omitted if it's the same as
  1414. .I rd .
  1415. For all but multiply and divide instructions,
  1416. .I rs2
  1417. may be replaced by a constant
  1418. .I $con
  1419. to obtain the immediate form of the instruction (without appending
  1420. .CW I
  1421. to the opcode).
  1422. .PP
  1423. Three-operand conditional branches are written in similar order
  1424. .P1
  1425. Bcond rs2, rs1, dest
  1426. .P2
  1427. where
  1428. .I rs2
  1429. may be omitted to indicate comparison with
  1430. .CW R0
  1431. (which always contains zero).
  1432. For example:
  1433. .P1
  1434. SUB R1, R2, R3 /* R3 = R2 - R1 */
  1435. SUB $1, R2, R3 /* R3 = R2 - 1 */
  1436. ADD R4, R3 /* R3 += R4 */
  1437. BLT R1, R3, done /* if (R3 < R1) goto done */
  1438. BNE R1, done /* if (R1 != 0) goto done */
  1439. .P2
  1440. Mnemonics for conditional branches (some of which are pseudo-ops) are
  1441. as in the RISC-V specification:
  1442. .CW BEQ ,
  1443. .CW BNE ,
  1444. .CW BGT ,
  1445. .CW BGE ,
  1446. .CW BLT ,
  1447. .CW BLE ,
  1448. .CW BGTU ,
  1449. .CW BGEU ,
  1450. .CW BLTU ,
  1451. and
  1452. .CW BLEU .
  1453. The function call instruction
  1454. .CW JAL
  1455. uses an explicit link register operand, but the
  1456. .CW RET
  1457. pseudo-op assumes the link register is
.CW R1 .
  1459. The unconditional branch is
  1460. .CW JMP ,
  1461. which generates a
  1462. .CW JAL
  1463. with
  1464. .CW R0
  1465. as the link register.
  1466. To branch to an address in a register use
  1467. .CW JMP
  1468. with indirect addressing mode.
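.PP
A sketch of these forms, assuming a hypothetical function
.CW sub
and label
.CW loop ;
the operand order shown for
.CW JAL
is itself an assumption:
.P1
	JAL	R1, sub(SB)	/* call, with R1 as the link register */
	JMP	loop		/* JAL with R0 as the link register */
	JMP	(R5)		/* branch to the address in R5 */
	RET			/* return through R1 */
.P2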
  1469. .PP
  1470. To allow for common source code files to be used with both riscv and riscv64
  1471. assemblers, the built-in constant
  1472. .CW XLEN
  1473. represents the register width in bytes (4 or 8),
  1474. and some opcode mnemonics will generate different machine instructions
  1475. for each instruction set architecture. For data movement (loads, stores and
  1476. register transfers), the
  1477. .CW MOV
  1478. opcode always denotes the native register width. When used with a memory
  1479. operand it will generate an
  1480. .CW lw
  1481. or
  1482. .CW sw
  1483. instruction for riscv, and an
  1484. .CW ld
  1485. or
  1486. .CW sd
  1487. instruction for riscv64.
  1488. .CW MOV
  1489. should also be used to load a constant or copy one register to another.
  1490. On the other hand,
  1491. .CW MOVW
  1492. on either architecture
  1493. will move a 32-bit word. With a memory operand it will generate
  1494. .CW lw
  1495. or
  1496. .CW sw .
  1497. When used to load a constant or copy between registers,
  1498. .CW MOVW
  1499. on riscv is a synonym for
  1500. .CW MOV ;
on riscv64 it generates code for a 32-bit move with sign extension.
  1502. The mnemonic
  1503. .CW MOVWU
  1504. can be used for a 32-bit move with zero extension on riscv64;
  1505. on riscv it is another synonym for
  1506. .CW MOV .
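.PP
For example (the offsets and registers are arbitrary, and writing the
built-in constant as
.CW $XLEN
is an assumption):
.P1
	MOV	8(R2), R9	/* lw on riscv, ld on riscv64 */
	MOV	R9, 8(R2)	/* sw on riscv, sd on riscv64 */
	MOVW	8(R2), R9	/* lw on both */
	MOVWU	R9, R10		/* zero-extended 32-bit move on riscv64 */
	MOV	$XLEN, R11	/* register width in bytes: 4 or 8 */
.P2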
  1507. .PP
  1508. Some other opcodes also have native-width and 32-bit variants:
  1509. .CW ADDW ,
  1510. .CW SUBW ,
  1511. .CW SLLW ,
  1512. .CW SRLW ,
  1513. .CW SRAW ,
  1514. .CW MULW ,
  1515. .CW DIVW ,
  1516. and
  1517. .CW REMW
each generate, on riscv64, an instruction that performs a 32-bit operation with
sign or zero extension (and that would cause
an illegal instruction trap on riscv); on riscv they generate the corresponding
register-width instruction
.CW ADD ,
.CW SUB ,
etc.
  1526. .PP
  1527. Loads and stores of [unsigned] halfword and byte operands use the opcodes
  1528. .CW MOVH[U]
  1529. and
  1530. .CW MOVB[U] ,
  1531. which generate the same machine instructions on both architectures.
  1532. .PP
  1533. If
  1534. .CW MOV
  1535. is used with a constant source operand to load a value which doesn't
  1536. fit into the 12-bit signed immediate field, the loader will generate
  1537. a two instruction sequence to construct the value if possible,
otherwise it will generate a load instruction and place a literal value
  1539. in the data segment.
  1540. .PP
  1541. Atomic instructions are not yet implemented; they need to be
  1542. constructed by hand using
  1543. .CW WORD .