lprof.ms 14 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514
  1. .TL
  2. Limbo profilers in Inferno
  3. .AU
  4. J R Firth
  5. .AI
  6. Vita Nuova
  7. .br
  8. 13 June 2002
  9. .SP 4
  10. .NH 1
  11. Introduction
  12. .LP
  13. Currently there are three application level profiling tools in the
  14. Inferno package.
  15. .I Prof
  16. is a time profiler which, by sampling, can provide statistics
  17. on the percentage of time spent on each line of limbo source.
  18. .I Cprof
  19. is a
  20. coverage profiler which provides the execution profile for limbo source
  21. code. It can accumulate results over a series of runs to allow full
  22. coverage testing. Finally,
  23. .I mprof
  24. is a memory profiler which provides
  25. statistics on the amount of memory used by each line
  26. of limbo source.
  27. .LP
  28. Two gui versions of these tools currently exist.
  29. .I Wm/cprof
  30. shows the
  31. coverage per module and highlights those lines which have not been
  32. executed or have been only partially executed.
  33. .I Wm/mprof
  34. shows the memory
  35. usage per module and highlights lines with high memory allocation in
  36. darker shades of red.
  37. .I Prof
  38. itself does not have a gui equivalent as it
  39. was originally written to determine why acme was so slow when using it's
  40. global editing command. A gui for it was not a requirement at that stage.
  41. .LP
  42. All these tools use a common library module
  43. .CW /module/profile.m
  44. and
  45. .CW /appl/lib/profile.b
  46. that acts as the direct interface with the kernel profiling device.
  47. .LP
  48. Note that none of these tools give kernel profile statistics. For that, the devmem driver should be used.
  49. .LP
  50. Although the use of these tools is very similar, there are a few differences
  51. when it comes to interactive testing as the profilers were written to
  52. answer different questions. Thus
  53. .I prof
  54. tries to determine 'why is blah so
  55. slow ?',
  56. .I cprof
  57. tries to accumulate coverage records over time and
  58. .I mprof
  59. tries to
  60. give a series of memory statistics at intervals during the execution of
  61. a program or series of programs.
  62. .NH 1
  63. Prof
  64. .LP
  65. The time profiler works by sampling. A kernel procedure sleeps for the
  66. given sample time and then notes the particular dis instruction currently
  67. being executed before repeating the process. After many such samples, an accurate profile can be obtained.
  68. .LP
  69. At it's simplest we can profile a particular command by giving the command
  70. to execute followed by any arguments to the command eg
  71. .sp
  72. .RS
  73. prof wm/polyhedra
  74. .RE
  75. .sp
  76. profiles the polyhedra displayer. After letting the latter run for a reasonable
  77. amount of time, we exit and then get the following statistics.
  78. .br
  79. .DS
  80. Module: Wmlib(/dis/lib/wmlib.dis)
  81. 34 0.06 str = load String String->PATH;
  82. **** module sampling points 1 ****
  83. .DE
  84. .DS
  85. Module: Bufio(/dis/lib/bufio.dis)
  86. 340 0.06 n := 0;
  87. 341 0.12 while(b.index < b.size){
  88. 342 0.99 (ch, i, nil) = sys->byte2char(b.buffer[0:b.size], b.index);
  89. **** module sampling points 19 ****
  90. Module: Polyfill(/dis/math/polyfill.dis)
  91. 37 10.80 for(i := 0; i < n; i++)
  92. 38 19.86 b0[i] = b1[i] = ∞;
  93. 57 0.06 p.y += y;
  94. 58 1.18 dst.line((left, y), (right, y), Endsquare, Endsquare, 0, src, p);
  95. 63 0.12 prevx := ∞;
  96. 64 9.93 for(x := left; x <= right; x++){
  97. 65 20.61 if(z+e < zbuf0[k] || (z-e <= zbuf1[k] && x != right && prevx != ∞)){
  98. 66 6.46 zbuf0[k] = z-e;
  99. 67 5.71 zbuf1[k] = z+e;
  100. 68 0.74 if(prevx == ∞)
  101. 69 0.74 prevx = x;
  102. 71 0.12 else if(prevx != ∞){
  103. 72 0.25 fillline(dst, prevx, x-1, y, src, p);
  104. 73 2.61 prevx = ∞;
  105. 75 4.35 z += dx;
  106. 76 3.17 k++;
  107. 78 0.06 if(prevx != ∞)
  108. 79 0.87 fillline(dst, prevx, right, y, src, p);
  109. 80 0.06 }
  110. 152 0.06 return (vx/z, mod);
  111. 186 0.06 sp.dzrem = mod(sp.num, sp.den) << fixshift;
  112. 187 0.06 sp.dz += sdiv(sp.dzrem, sp.den);
  113. 217 0.62 for(q = p = 0; p < ep; p++) {
  114. 218 0.37 sp = seg[p];
  115. 220 0.12 continue;
  116. 221 0.12 sp.z += sp.dz;
  117. 222 0.19 sp.zerr += sp.dzrem;
  118. 223 0.12 if(sp.zerr >= sp.den) {
  119. 224 0.19 sp.z++;
  120. 225 0.25 sp.zerr -= sp.den;
  121. 226 0.25 if(sp.zerr < 0 || sp.zerr >= sp.den)
  122. 227 0.25 sys->print("bad ratzerr1: %d den %d dzrem %d\n", sp.zerr, sp.den, sp.dzrem);
  123. 229 0.31 seg[q] = sp;
  124. 230 0.31 q++;
  125. 233 0.25 for(p = next; seg[p] != nil; p++) {
  126. 234 0.06 sp = seg[p];
  127. 247 0.12 ep = q;
  128. 248 0.06 next = p;
  129. 257 0.06 continue;
  130. 260 0.62 zsort(seg, ep);
  131. 262 0.25 for(p = 0; p < ep; p++) {
  132. 263 0.19 sp = seg[p];
  133. 264 0.06 cnt = 0;
  134. 265 0.06 x = sp.z;
  135. 266 0.25 ix = (x + onehalf) >> fixshift;
  136. 267 0.06 if(ix >= maxx)
  137. 271 0.06 cnt += sp.d;
  138. 272 0.12 p++;
  139. 273 0.25 sp = seg[p];
  140. 275 0.06 if(p == ep) {
  141. 277 0.06 return;
  142. 279 0.06 cnt += sp.d;
  143. 280 0.12 if((cnt&wind) == 0)
  144. 283 0.19 sp = seg[p];
  145. 286 0.25 ix2 = (x2 + onehalf) >> fixshift;
  146. 291 1.92 filllinez(dst, ix, ix2, iy, zv+ix*dx, er, dx, k+ix-zr.min.x, zbuf0, zbuf1, src, spt);
  147. 293 0.06 y += (1<<fixshift);
  148. 294 0.31 iy++;
  149. 295 0.06 k += xlen;
  150. 296 0.06 zv += dy;
  151. 298 0.06 }
  152. 310 0.06 done = 1;
  153. 311 0.12 q--;
  154. 312 0.25 for(p = 0; p < q; p++) {
  155. 313 0.87 if(seg[p].z > seg[p+1].z) {
  156. 367 0.06 t = a[0]; a[0] = a[i]; a[i] = t;
  157. 373 0.06 while(i < n && ycompare(a[i], a[0]) < 0);
  158. 379 0.12 t = a[i]; a[i] = a[j]; a[j] = t;
  159. 384 0.06 qsortycompare(a, j);
  160. **** module sampling points 1584 ****
  161. Module: Polyhedra(/dis/wm/polyhedra.dis)
  162. 327 0.12 return (int (geo.sx*v.x)+geo.tx, int (geo.sy*v.y)+geo.ty);
  163. 471 0.06 if(allf || dot(geo.view, newn[j]) < 0.0)
  164. 472 0.06 polyfilla(fv[j], new, newn[j], dot(geo.light, newn[j]), geo, concave, inc);
  165. 496 0.06 ap[j] = map(vtx, geo);
  166. 512 0.06 if(a <= -LIMIT || a >= LIMIT)
  167. 531 0.06 fillpoly(RDisp, ap, ~0, face, (0, 0), geo.zstate, dc, dx, dy);
  168. **** module sampling points 7 ****
  169. **** total sampling points 1611 ****
  170. .DE
  171. .br
  172. The output lists all lines in all modules with a sampling point. Each line
  173. shows the line number in the corresponding source file, the percentage of
  174. time spent on that line and the source code. We can see that about 60% of
  175. the sampling points occur on lines 37, 38, 64 and 65 of the Polyfill module.
  176. With this information we might then try to speed up this part of the code
  177. by altering the algorithm or making the limbo code more efficient (for
  178. instance by moving constant calculations or addressing out of loops).
  179. .LP
  180. The number of sampling points is also shown. The sampling rate can be
  181. increased with the -s option to give better granularity.
  182. This will cause a decrease in apparent performance but increases the
  183. accuracy of the results. The above example showed the results for all
  184. modules sampled. We might have restricted attention to the two main
  185. polyhedra modules instead by executing
  186. .sp
  187. .RS
  188. prof -m Polyhedra -m Polyfill wm/polyhedra
  189. .RE
  190. .sp
  191. See the manual page for other options to
  192. .I prof
  193. and further examples.
  194. .NH1
  195. Cprof
  196. .LP
  197. Coverage of instructions is achieved by running a special dis instruction execute routine in place of the usual one (just as the debugger does).
  198. This routine notes down each
  199. instruction as it is executed. The profile device then passes this information
  200. to
  201. .I cprof
  202. via the io system.
  203. .LP
  204. The coverage profiler is used in a similar way to the time profiler.
  205. .sp
  206. .RS
  207. cprof -m Zeros zeros 1024 2880
  208. .RE
  209. .sp
  210. gives
  211. .br
  212. .DS
  213. Module: Zeros(zeros.dis) 56% coverage
  214. 1 implement Zeros;
  215. 2
  216. 3 include "sys.m";
  217. 4 sys: Sys;
  218. 5 include "arg.m";
  219. 6 arg: Arg;
  220. 7 include "string.m";
  221. 8 str: String;
  222. 9 include "keyring.m";
  223. 10 include "security.m";
  224. 11 random: Random;
  225. 12
  226. 13 include "draw.m";
  227. 14
  228. 15 Zeros: module
  229. 16 {
  230. 17 init: fn(nil: ref Draw->Context, argv: list of string);
  231. 18 };
  232. 19
  233. 20 init(nil: ref Draw->Context, argv: list of string)
  234. 21 {
  235. 22 z: array of byte;
  236. 23 i: int;
  237. 24 + sys = load Sys Sys->PATH;
  238. 25 + arg = load Arg Arg->PATH;
  239. 26 + str = load String String->PATH;
  240. 27
  241. 28 + if(sys == nil || arg == nil)
  242. 29 - return;
  243. 30
  244. 31 + bs := 0;
  245. 32 + n := 0;
  246. 33 + val := 0;
  247. 34 + rflag := 0;
  248. 35 + arg->init(argv);
  249. 36 + while ((c := arg->opt()) != 0)
  250. 37 - case c {
  251. 38 - 'r' => rflag = 1;
  252. 39 - 'v' => (val, nil) = str->toint(arg->arg(), 16);
  253. 40 - * => sys->raise(sys->sprint("fail: unknown option (%c)\n", c));
  254. 41 }
  255. .DE
  256. .DS
  257. 42 + argv = arg->argv();
  258. 43 + if(len argv >= 1)
  259. 44 + bs = int hd argv;
  260. 45 else
  261. 46 - bs = 1;
  262. 47 + if (len argv >= 2)
  263. 48 + n = int hd tl argv;
  264. 49 else
  265. 50 - n = 1;
  266. 51 + if(bs == 0 || n == 0) {
  267. 52 - sys->fprint(sys->fildes(2), "usage: zeros [-r] [-v value] blocksize [number]\n");
  268. 53 - sys->raise("fail: usage");
  269. 54 }
  270. 55 + if (rflag) {
  271. 56 - random = load Random Random->PATH;
  272. 57 - if (random == nil)
  273. 58 - sys->raise("fail: no security module\n");
  274. 59 - z = random->randombuf(random->NotQuiteRandom, bs);
  275. 60 }
  276. 61 else {
  277. 62 + z = array[bs] of byte;
  278. 63 + for(i=0;i<bs;i++)
  279. 64 + z[i] = byte val;
  280. 65 }
  281. 66 + for(i=0;i<n;i++)
  282. 67 + sys->write(sys->fildes(1), z, bs);
  283. 68 + }
  284. **** module dis instructions 39725 ****
  285. .DE
  286. .br
  287. Here the -m option has restricted attention to the Zeros module itself.
  288. The output shows the source line number, an indicator of coverage and
  289. the source. The indicator is + if the line has been executed, - if
  290. it hasn't and ? if only part of it has (for instance a loop statement that has
  291. never had it's iteration part executed). Lines with no indicator have no
  292. corresponding dis instructions associated with them. Another option (-f)
  293. shows coverage frequencies instead.
  294. .LP
  295. An alternative to
  296. .I cprof
  297. is
  298. .I wm/cprof
  299. which shows the statistics graphically.
  300. It's options are pretty much the same as those to
  301. .I cprof .
  302. Unexecuted and
  303. partially executed lines of code are shown in colour. See the man page
  304. for exact details of the colouring convention
  305. .LP
  306. Results may be accumulated with the -r option so that multiple runs of
  307. code can be made. The resulting statistics go into a file <xxx>.prf when
  308. <xxx>.dis is the corresponding dis file. See the manual page for further
  309. details on how to use this option and then review the accumulated
  310. results.
  311. .NH 1
  312. Mprof
  313. .LP
  314. When memory profiling, the kernel profile device associates each heap allocation with a line of limbo source and each heap deallocation with the line of
  315. limbo source that allocated it. In this way, current memory usage and
  316. high-water usage can be determined on a line by line basis.
  317. .LP
  318. Here it seems that memory usage at a particular point in the execution of
  319. a program is more appropriate than the post-mortem approach of
  320. .I prof
  321. and
  322. .I cprof
  323. , so an interactive example is described (though
  324. .I mprof
  325. can be
  326. used non-interactively and
  327. .I prof
  328. interactively if so desired). See the manual
  329. pages for complete details and further examples.
  330. .LP
  331. To do this we execute
  332. .sp
  333. .RS
  334. mprof -b -m Polyhedra
  335. .RE
  336. .sp
  337. which kicks off profiling and restricts attention to the Polyhedra module
  338. whenever it runs. The -b simply says begin profiling. Note that no command
  339. to execute is given to
  340. .I mprof
  341. at this stage. Then run the command
  342. .sp
  343. .RS
  344. wm/polyhedra &
  345. .RE
  346. .sp
  347. and interact with it. Now show memory statistics
  348. .sp
  349. .RS
  350. mprof
  351. .RE
  352. .sp
  353. This gives
  354. .br
  355. .DS
  356. Module: Polyhedra(/dis/wm/polyhedra.dis)
  357. 44 100 100 sys = load Sys Sys->PATH;
  358. 45 132 132 draw = load Draw Draw->PATH;
  359. 46 68 68 tk = load Tk Tk->PATH;
  360. 47 1788 1788 wmlib = load Wmlib Wmlib->PATH;
  361. 48 232 232 bufio = load Bufio Bufio->PATH;
  362. 49 68 68 math = load Math Math->PATH;
  363. 50 204 204 rand = load Rand Rand->PATH;
  364. 51 0 3504 daytime = load Daytime Daytime->PATH;
  365. 52 544 544 polyfill = load Polyfill Polyfill->PATH;
  366. 53 1824 1824 smenu = load Smenu Smenu->PATH;
  367. 86 36 36 cmdch := chan of string;
  368. 95 36 36 sync := chan of int;
  369. 96 36 36 chanθ := chan of real;
  370. 103 68 68 shade = array[NSHADES] of ref Image;
  371. 116 36 36 yieldc := chan of int;
  372. 120 36 36 sm := array[2] of ref Scrollmenu;
  373. 378 68 176 s += " (" + string p.indx + ")";
  374. 403 36 36 vec := array[2] of array of Vector;
  375. 404 740 740 vec[0] = array[V] of Vector;
  376. 405 740 740 vec[1] = array[V] of Vector;
  377. 407 36 36 norm = array[2] of array of Vector;
  378. 408 612 612 norm[0] = array[F] of Vector;
  379. 409 612 612 norm[1] = array[F] of Vector;
  380. 492 68 68 ap := array[n+1] of Point;
  381. 609 164 164 geo := ref Geom;
  382. 610 36 36 TM := array[4] of array of real;
  383. 612 272 272 TM[i] = array[4] of real;
  384. 663 8000 8000 p := ref Polyhedron;
  385. 707 740 740 p.v = array[p.V] of Vector;
  386. 710 612 612 p.f = array[p.F] of Vector;
  387. 713 132 132 p.fv = array[p.F] of array of int;
  388. 716 164 164 p.vf = array[p.V] of array of int;
  389. 729 9504 9640 return s[0: len s - 1];
  390. 750 3672 3672 a := array[n+2] of int;
  391. 768 0 136 return (n, s[i+1:]);
  392. 779 0 104 return (r, s[i+1:]);
  393. 802 0 68 s = s[1:];
  394. 806 0 72 s = s[0: ln-1];
  395. .DE
  396. .DS
  397. 866 0 200 cmd(mainwin, ".f1.txt configure -text {" + s + "}");
  398. 874 0 356 labs := array[n] of string;
  399. 881 0 5128 labs[i++] = string q.indx + " " + name;
  400. 884 0 68 cmd(top, mname + " configure -borderwidth 3");
  401. 920 0 104 cmd(win, ". configure -height " + string (scrsize.y - bd * 2));
  402. 934 0 244 cmd(win, ". configure -x " + string actr.min.x + " -y " + string actr.min.y);
  403. Module totals 31416 33984
  404. .DE
  405. .br
  406. We get the source line number, the amount of memory in bytes
  407. currently allocated on that line, the high-water mark in bytes and then the source. Thus
  408. loading the Sys module on line 44 used 100 bytes and this memory is
  409. still allocated. Loading Daytime on line 51 used 3504 bytes but this is now
  410. released (because the module pointer is set to nil in the source and the
  411. memory has been reclaimed). The string concatenation on line 378 currently uses 68 bytes
  412. but at it's worst it was 176 bytes.
  413. .LP
  414. Further interaction with wm/polyhedra can now be done and memory
  415. statistics reviewed before the
  416. profiling session is ended, throwing away the accumulated memory
  417. statistics inside the kernel with
  418. .sp
  419. .RS
  420. mprof -c
  421. .RE
  422. .sp
  423. The -c option simply says cease profiling.
  424. .LP
  425. An alternative to
  426. .I mprof
  427. is
  428. .I wm/mprof
  429. which shows the statistics graphically.
  430. It's options are pretty much the same as those to
  431. .I mprof .
  432. Lines of code
  433. which have allocated more of the memory are shown in darker shades of red.
  434. .NH 1
  435. Manual pages
  436. .LP
  437. Further information and other examples are given in the following
  438. manual pages :-
  439. .sp
  440. .RS
  441. .I cprof(1)
  442. .br
  443. .I mprof(1)
  444. .br
  445. .I prof(1)
  446. .br
  447. .I wm-cprof(1)
  448. .br
  449. .I wm-mprof(1)
  450. .RE
  451. .sp
  452. For the lower level library module interface to these profilers
  453. .sp
  454. .RS
  455. .I prof(2)
  456. .RE
  457. .sp
  458. For the kernel profile device which gathers timing, coverage and memory
  459. statistics
  460. .sp
  461. .RS
  462. .I prof(3)
  463. .RE
  464. .br
  465. .NH 1
  466. Sources
  467. .LP
  468. The relevant sources are
  469. .sp
  470. .RS
  471. .CW /module/profile.m
  472. .br
  473. .CW /appl/lib/profile.b
  474. .br
  475. .CW /appl/cmd/cprof.b
  476. .br
  477. .CW /appl/cmd/mprof.b
  478. .br
  479. .CW /appl/cmd/prof.b
  480. .br
  481. .CW /appl/wm/cprof.b
  482. .br
  483. .CW /appl/wm/mprof.b
  484. .br
  485. .CW /emu/devprof.c
  486. .br
  487. .CW /os/port/devprof.c
  488. .RE
  489. .NH 1
  490. Addendum
  491. .LP
  492. A gui version of
  493. .I prof
  494. has been added for completeness. See the manual page
  495. .I wm-prof(1)
  496. and the source
  497. .CW /appl/wm/prof.b .