- .de P1
- .KS
- .DS
- .ft CW
- .ta 5n 10n 15n 20n 25n 30n 35n 40n 45n 50n 55n 60n 65n 70n 75n 80n
- ..
- .de P2
- .ft 1
- .DE
- .KE
- ..
- .de CW
- .lg 0
- \%\&\\$3\f(CW\\$1\fP\&\\$2
- .lg
- ..
- .de WC
- .lg 0
- \%\&\\$3\f(CI\\$1\fP\&\\$2
- .lg
- ..
- .TL
- A tutorial for the
- .CW sam
- .B
- command language
- .AU
- Rob Pike
- .AI
- .MH
- .AB
- .CW sam
- is an interactive text editor with a command language that makes heavy use
- of regular expressions.
- Although the language is syntactically similar to
- .CW ed (1),
- the details are interestingly different.
- This tutorial introduces the command language, but does not discuss
- the screen and mouse interface.
- With apologies to those unfamiliar with the Ninth Edition Blit software,
- it is assumed that the similarity of
- .CW sam
- to
- .CW mux (9)
- at this level makes
- .CW sam 's
- mouse language easy to learn.
- .PP
- The
- .CW sam
- command language applies identically to two environments:
- when running
- .CW sam
- on an ordinary terminal
- (\f2via\f1
- .CW sam\ -d ),
- and in the command window of a
- .I downloaded
- .CW sam ,
- that is, one using the bitmap display and mouse.
- .AE
- .SH
- Introduction
- .PP
- This tutorial describes the command language of
- .CW sam ,
- an interactive text editor that runs on Blits and
- some computers with bitmap displays.
- For most editing tasks, the mouse-based editing features
- are sufficient, and they are easy to use and to learn.
- .PP
- The command language is often useful, however, particularly
- when making global changes.
- Unlike the commands in
- .CW ed ,
- which are necessary to make changes,
- .CW sam
- commands tend to be used
- only for complicated or repetitive editing tasks.
- It is in these more involved uses that
- the differences between
- .CW sam
- and other text editors are most evident.
- .PP
- .CW sam 's
- language makes it easy to do some things that other editors,
- including programs like
- .CW sed
- and
- .CW awk ,
- do not handle gracefully, so this tutorial serves partly as a
- lesson in
- .CW sam 's
- manner of manipulating text.
- The examples below therefore concentrate entirely on the language,
- assuming that facility with the use of the mouse in
- .CW sam
- is at worst easy to pick up.
- In fact,
- .CW sam
- can be run without the mouse at all (not
- .I downloaded ),
- by specifying the
- .CW -d
- flag, and it is this domain that the tutorial
- occupies; the command language in these modes
- is identical.
- .PP
- A word to the Unix adept:
- although
- .CW sam
- is syntactically very similar to
- .CW ed ,
- it is fundamentally and deliberately different in design and detailed semantics.
- You might use knowledge of
- .CW ed
- to predict how the substitute command works,
- but you'd only be right if you had used some understanding of
- .CW sam 's
- workings to influence your prediction.
- Be particularly careful about idioms.
- Idioms form in curious nooks of languages and depend on
- undependable peculiarities.
- .CW ed
- idioms simply don't work in
- .CW sam :
- .CW 1,$s/a/b/
- makes one substitution in the whole file, not one per line.
- .CW sam
- has its own idioms.
- Much of the purpose of this tutorial is to publish them
- and make fluency in
- .CW sam
- a matter of learning, not cunning.
- .PP
- The tutorial assumes familiarity with regular expressions;
- some experience with a more traditional Unix editor may also be helpful.
- To aid readers familiar with
- .CW ed ,
- I have pointed out in square brackets [] some of
- the relevant differences between
- .CW ed
- and
- .CW sam .
- Read these comments only if you wish
- to understand the differences; the lesson is about
- .CW sam ,
- not
- .CW sam
- .I vs.
- .CW ed .
- Another typographic convention is that output appears in
- .CW "this font,
- while typed input appears as
- .WC "slanty text.
- .PP
- Nomenclature:
- .CW sam
- keeps a copy of the text it is editing.
- This copy is called a
- .I file .
- To avoid confusion, I have called the permanent storage on disc a
- .I
- Unix file.
- .R
- .SH
- Text
- .PP
- To get started, we need some text to play with.
- Any text will do; try something from
- James Gosling's Emacs manual:
- .P1
- $ \f(CIsam -d
- a
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- \&.
- .ft
- .P2
- .WC "sam -d
- starts
- .CW sam
- running.
- The
- .CW a
- command adds text until a line containing just a period, and sets the
- .I
- current text
- .R
- (also called
- .I dot )
- to what was typed \(em everything between the
- .CW a
- and the period.
- .CW ed "" [
- would leave dot set to only the last line.]
- The
- .CW p
- command prints the current text:
- .P1
- .WC p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- [Again,
- .CW ed
- would print only the last line.]
- The
- .CW a
- command adds its text
- .I after
- dot; the
- .CW i
- command is like
- .CW a ,
- but adds the text
- .I before
- dot.
- .P1
- .ft CI
- i
- Introduction
- \&.
- p
- .ft
- Introduction
- .P2
- There is also a
- .CW c
- command that changes (replaces) the current text,
- and
- .CW d
- that deletes it; these are illustrated below.
- .PP
- To see all the text, we can specify what text to print;
- for the moment, suffice it to say that
- .WC 0,$
- specifies the entire file.
- .CW ed "" [
- users would probably type
- .WC 1,$ ,
- which in practice is the same thing, but see below.]
- .P1
- .WC 0,$p
- Introduction
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- Except for the
- .CW w
- command described below,
- .I all
- commands,
- including
- .CW p ,
- set dot to the text they touch.
- Thus,
- .CW a
- and
- .CW i
- set dot to the new text,
- .CW p
- to the text printed, and so on.
- Similarly, all commands
- (except
- .CW w )
- by default operate on the current
- text [unlike
- .CW ed ,
- for which some commands (such as
- .CW g )
- default to the entire file].
- .PP
- Things are not going to get very interesting until we can
- set dot arbitrarily.
- This is done by
- .I addresses ,
- which specify a piece of the file.
- The address
- .CW 1 ,
- for example, sets dot to the first line of the file.
- .P1
- .WC 1p
- Introduction
- .WC c
- .WC Preamble
- .WC .
- .P2
- The
- .CW c
- command didn't need to specify dot; the
- .CW p
- left it on line one.
- It's therefore easy to delete the first line utterly;
- the last command left dot set to line one:
- .P1
- .WC d
- .WC 1p
- This manual is organized in a rather haphazard manner. The first
- .P2
- (Line numbers change
- to reflect changes to the file.)
- .PP
- The address \f(CW/\f2text\f(CW/\f1
- sets dot to the first appearance of
- .I text ,
- after dot.
- .CW ed "" [
- matches the first line containing
- .I text .]
- If
- .I text
- is not found, the search restarts at the beginning of the file
- and continues until dot.
- .P1
- .WC /Emacs/p
- Emacs
- .P2
- It's difficult to indicate typographically, but in this example no newline appears
- after
- .CW Emacs :
- the text to be printed is the string
- .CW Emacs ', `
- exactly.
- (The final
- .CW p
- may be left off \(em it is the default command.
- When downloaded, however, the default is instead to select the text,
- to highlight it,
- and to make it visible by moving the window on the file if necessary.
- Thus,
- .CW /Emacs/
- indicates on the display the next occurrence of the text.)
- .PP
- Imagine we wanted to change the word
- .CW haphazard
- to
- .CW thoughtless .
- Obviously, what's needed is another
- .CW c
- command, but the method used so far to insert text includes a newline.
- The syntax for including text without newlines is to surround the
- text with slashes (which is the same as the syntax for
- text searches, but what is going on should be clear from context).
- The text must appear immediately after the
- .CW c
- (or
- .CW a
- or
- .CW i ).
- Given this, it is easy to make the required change:
- .P1
- .WC /haphazard/c/thoughtless/
- .WC 1p
- This manual is organized in a rather thoughtless manner. The first
- .P2
- [Changes can always be done with a
- .CW c
- command, even if the text is smaller than a line.]
- You'll find that this way of providing text to commands is much
- more common than is the multiple-lines syntax.
- If you want to include a slash
- .CW /
- in the text, just precede it with a backslash
- .CW \e ,
- and use a backslash to protect a backslash itself.
- .P1
- .WC /Emacs/c/Emacs\e\e360/
- .WC 3p
- general introduction to the commands in Emacs\e360 and to try to show
- .P2
- We could also make this particular change by
- .P1
- .WC /Emacs/a/\e\e360/
- .P2
- .PP
- This is as good a place as any to introduce the
- .CW u
- command, which undoes the last command.
- A second
- .CW u
- will undo the penultimate command, and so on.
- .P1
- .WC u
- .WC 3p
- general introduction to the commands in Emacs and to try to show
- .WC u
- .WC 1p
- This manual is organized in a rather haphazard manner. The first
- .P2
- Undoing can only back up; there is no way to undo a previous
- .CW u .
- .SH
- Addresses
- .PP
- We've seen the simplest forms of addresses, but there is more
- to learn before we can get too much further.
- An address selects a region in the file \(em a substring \(em
- and therefore must define the beginning and the end of a region.
- Thus, the address
- .CW 13
- selects from the beginning of line thirteen to the end of line thirteen, and
- .CW /Emacs/
- selects from the beginning of the word
- .CW Emacs ' `
- to the end.
- .PP
- Addresses may be combined with a comma:
- .P1
- 13,15
- .P2
- selects lines thirteen through fifteen. The definition of the comma
- operator is to select from the beginning of the left hand address (the
- beginning of line 13) to the end of the right hand address (the
- end of line 15).
- .PP
- A few special simple addresses come in handy:
- .CW .
- (a period) represents dot, the current text,
- .CW 0
- (line zero) selects the null string at the beginning of the file, and
- .CW $
- selects the null string at the end of the file
- [not the last line of the file].
- Therefore,
- .P1
- 0,13
- .P2
- selects from the beginning of the file to the end of line thirteen,
- .P1
- \&.,$
- .P2
- selects from the beginning of the current text to the end of the file, and
- .P1
- 0,$
- .P2
- selects the whole file [that is, a single string containing the whole file,
- not a list of all the lines in the file].
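- .PP
- As a quick check of these definitions, and assuming the four-line sample
- text from the previous section is still loaded and unchanged,
- two comma addresses can be tried with
- .CW p :
- .P1
- .WC 2,3p
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- .WC 3,$p
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- In the second command
- .CW $
- contributes only the end of the file, so the selection runs from the
- beginning of line three to the last character.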
- .PP
- These are all
- .I absolute
- addresses: they refer to specific places in the file.
- .CW sam
- also has relative addresses, which depend
- on the value of dot,
- and in fact we have already seen one form:
- .CW /Emacs/
- finds the first occurrence of
- .CW Emacs
- searching forwards from dot.
- Which occurrence of
- .CW Emacs
- it finds depends on the value of dot.
- What if you wanted the first occurrence
- .I before
- dot? Just precede the pattern with a minus sign, which reverses the direction
- of the search:
- .P1
- -/Emacs/
- .P2
- In fact, the complete syntax for forward searching is
- .P1
- +/Emacs/
- .P2
- but the plus sign is the default, and in practice is rarely used.
- Here is an example that includes it for clarity:
- .P1
- 0+/Emacs/
- .P2
- selects the first occurrence of
- .CW Emacs
- in the file; read it as ``go to line 0, then search forwards for
- .CW Emacs .''
- Since the
- .CW +
- is optional, this can be written
- .CW 0/Emacs/ .
- Similarly,
- .P1
- $-/Emacs/
- .P2
- finds the last occurrence in the file, so
- .P1
- 0/Emacs/,$-/Emacs/
- .P2
- selects the text from the first to last
- .CW Emacs ,
- inclusive.
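- With the sample text loaded, and assuming no intervening changes,
- printing that selection would be expected to give
- .P1
- .WC 0/Emacs/,$-/Emacs/p
- Emacs and to try to show
- the method in the madness that is the Emacs
- .P2
- with no newline after the final
- .CW Emacs .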
- Slightly more interesting:
- .P1
- /Emacs/+/Emacs/
- .P2
- (there is an implicit
- .CW .+
- at the beginning) selects the second
- .CW Emacs
- following dot.
- .PP
- Line numbers may also be relative.
- .P1
- -2
- .P2
- selects the second previous line, and
- .P1
- +5
- .P2
- selects the fifth following line (here the plus sign is obligatory).
- .PP
- Since addresses may select (and dot may be) more than one line,
- we need a definition of `previous' and `following:'
- `previous' means
- .I
- before the beginning
- .R
- of dot, and `following'
- means
- .I
- after the end
- .R
- of dot.
- For example, if the file contains \f(CWA\f(CIAA\f(CWA\f1,
- with dot set to the middle two
- .CW A 's
- (the slanting characters),
- .CW -/A/
- sets dot to the first
- .CW A ,
- and
- .CW +/A/
- sets dot to the last
- .CW A .
- Except under odd circumstances (such as when the only occurrence of the
- text in the file is already the current text), the text selected by a
- search will be disjoint from dot.
- .PP
- To select the
- .CW "troff -ms
- paragraph containing dot, however long it is, use
- .P1
- -/.PP/,/.PP/-1
- .P2
- which will include the
- .CW .PP
- that begins the paragraph, and exclude the one that ends it.
- .PP
- When typing relative line number addresses, the default number is
- .CW 1 ,
- so the above could be written slightly more simply:
- .P1
- -/.PP/,/.PP/-
- .P2
- .PP
- What does the address
- .CW +1-1
- or the equivalent
- .CW +-
- mean? It looks like it does nothing, but recall that dot need not be a
- complete line of text.
- .CW +1
- selects the line after the end of the current text, and
- .CW -1
- selects the line before the beginning. Therefore
- .CW +1-1
- selects the line before the line after the end of dot, that is,
- the complete line containing the end of dot.
- We can use this construction to expand a selection to include a complete line,
- say the first line in the file containing
- .CW Emacs :
- .P1
- .WC 0/Emacs/+-p
- general introduction to the commands in Emacs and to try to show
- .P2
- The address
- .CW +-
- is an idiom.
- .SH
- Loops
- .PP
- Above, we changed one occurrence of
- .CW Emacs
- to
- .CW Emacs\e360 ,
- but if the name of the editor is really changing, it would be useful
- to change
- .I all
- instances of the name in a single command.
- .CW sam
- provides a command,
- .CW x
- (extract), for just that job.
- The syntax is
- \f(CWx/\f2pattern\f(CW/\f2command\f1.
- For each occurrence of the pattern in the selected text,
- .CW x
- sets dot to the occurrence and runs command.
- For example, to change
- .CW Emacs
- to
- .CW vi ,
- .P1
- .WC 0,$x/Emacs/c/vi/
- .WC 0,$p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in vi and to try to show
- the method in the madness that is the vi command structure.
- .P2
- This
- works by subdividing the current text
- .CW 0,$ "" (
- \(em the whole file) into appearances of its textual argument
- .CW Emacs ), (
- and then running the command that follows
- .CW c/vi/ ) (
- with dot set to the text.
- We can read this example as, ``find all occurrences of
- .CW Emacs
- in the file, and for each one,
- set the current text to the occurrence and run the command
- .CW c/vi/ ,
- which will replace the current text by
- .CW vi .''
- [This command is somewhat similar to
- .CW ed 's
- .CW g
- command. The differences will develop below, but note that the
- default address, as always, is dot rather than the whole file.]
- .PP
- A single
- .CW u
- command is sufficient to undo an
- .CW x
- command, regardless of how many individual changes the
- .CW x
- makes.
- .P1
- .WC u
- .WC 0,$p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- .PP
- Of course,
- .CW c
- is not the only command
- .CW x
- can run. An
- .CW a
- command can be used to put proprietary markings on
- .CW Emacs :
- .P1
- .WC 0,$x/Emacs/a/{TM}/
- .WC /Emacs/+-p
- general introduction to the commands in Emacs{TM} and to try to show
- .P2
- [There is no way to see the changes as they happen, as in
- .CW ed 's
- .CW g/Emacs/s//&{TM}/p ;
- see the section on Multiple Changes, below.]
- .PP
- The
- .CW p
- command is also useful when driven by an
- .CW x ,
- but be careful that you say what you mean;
- .P1
- .WC 0,$x/Emacs/p
- EmacsEmacs
- .P2
- since
- .CW x
- sets dot to the text in the slashes, printing only that text
- is not going to be very
- informative. But the command that
- .CW x
- runs can contain addresses. For example, if we want to print all
- lines containing
- .CW Emacs ,
- just use
- .CW +- :
- .P1
- .WC 0,$x/Emacs/+-p
- general introduction to the commands in Emacs{TM} and to try to show
- the method in the madness that is the Emacs{TM} command structure.
- .P2
- Finally, let's restore the state of the file with another
- .CW x
- command, and make use of a handy shorthand:
- a comma in an address has its left side default to
- .CW 0 ,
- and its right side default to
- .CW $ ,
- so the easy-to-type address
- .CW ,
- refers to the whole file:
- .P1
- .WC ",x/Emacs/ /{TM}/d
- .WC ,p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- Notice what this
- .CW x
- does: for each occurrence of Emacs,
- find the
- .CW {TM}
- that follows, and delete it.
- .PP
- The `text'
- .CW sam
- accepts
- for searches in addresses and in
- .CW x
- commands is not simple text, but rather
- .I regular\ expressions.
- Unix has several distinct interpretations of regular expressions.
- The form used by
- .CW sam
- is that of
- .CW egrep (1),
- including parentheses
- .CW ()
- for grouping and an `or' operator
- .CW |
- for matching strings in parallel.
- .CW sam
- makes two extensions:
- although
- .CW .
- (the most overloaded character in Unix) matches any character
- .I except
- newline, the regular expression
- .CW @
- (think of it as a big dot) matches any character, even newlines;
- and the character sequence
- .CW \en
- matches a newline character.
- Replacement text, such as used in the
- .CW a
- and
- .CW c
- commands, is still plain text, but the sequence
- .CW \en
- represents newline in that context, too.
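- .PP
- As a small, non-destructive illustration of the `or' operator,
- assuming the sample text is in its original four-line state,
- an
- .CW x
- command with an alternation prints every match of either word,
- in the order they occur in the file:
- .P1
- .WC ",x/manual|Emacs/ p
- manualEmacsEmacs
- .P2
- As before, no newlines appear because none are part of the matches.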
- .PP
- Here is an example. Say we wanted to double space the document, that is,
- turn every newline into two newlines.
- The following all do the job:
- .P1
- .WC ",x/\en/ a/\en/
- .WC ",x/\en/ c/\en\en/
- .WC ",x/$/ a/\en/
- .WC ",x/^/ i/\en/
- .P2
- The last example is slightly different, because it puts a newline
- .I before
- each line; the other examples place it after.
- The first two examples manipulate newlines directly
- [something outside
- .CW ed 's
- ken]; the last two
- use regular expressions:
- .CW $
- is the empty string at the end of a line, while
- .CW ^
- is the empty string at the beginning.
- .PP
- These solutions all have a possible drawback: if there is already a blank line
- (that is, two consecutive newlines), they make it much larger (four
- consecutive newlines).
- A better method is to extend every group of newlines by one:
- .P1
- .WC ",x/\en+/ a/\en/
- .P2
- The regular expression operator
- .CW +
- means `one or more;'
- .CW \en+
- is identical to
- .CW \en\en* .
- Thus, this example
- takes every sequence of newlines and adds another
- to the end.
- .PP
- A more common example is indenting a block of text by a tab stop.
- The following all work,
- although the first is arguably the cleanest (the blank text in slashes is a tab):
- .P1
- .WC ",x/^/a/ /
- .WC ",x/^/c/ /
- .WC ",x/.*\en/i/ /
- .P2
- The last example uses the pattern (idiom, really)
- .CW .*\en
- to match lines:
- .CW .*
- matches the longest possible string of non-newline characters.
- Taking initial tabs away is just as easy:
- .P1
- .WC ",x/^ /d
- .P2
- In these examples I have specified an address (the whole file), but
- in practice commands like these are more likely to be run without
- an address, using the value of dot set by selecting text with the mouse.
- .SH
- Conditionals
- .PP
- The
- .CW x
- command is a looping construct:
- for each match of a regular expression,
- it extracts (sets dot to) the match and runs a command.
- .CW sam
- also has a conditional,
- .CW g :
- \f(CWg/\f2pattern\f(CW/\f2command\f1
- runs the command if dot contains a match of the pattern
- .I
- without changing the value of dot.
- .R
- The inverse,
- .CW v ,
- runs the command if dot does
- .I not
- contain a match of the pattern.
- (The letters
- .CW g
- and
- .CW v
- are historical and have no mnemonic significance. You might
- think of
- .CW g
- as `guard.')
- .CW ed "" [
- users should read the above definitions very carefully; the
- .CW g
- command in
- .CW sam
- is fundamentally different from that in
- .CW ed .]
- Here is an example of the difference between
- .CW x
- and
- .CW g :
- .P1
- ,x/Emacs/c/vi/
- .P2
- changes each occurrence of the word
- .CW Emacs
- in the file to the word
- .CW vi ,
- but
- .P1
- ,g/Emacs/c/vi/
- .P2
- changes the
- .I "whole file
- to
- .CW vi
- if there is the word
- .CW Emacs
- anywhere in the file.
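- .PP
- The same contrast can be seen without changing the file by using
- .CW p
- as the command.
- Assuming the sample text is unchanged,
- .CW ,x/Emacs/p
- prints only the two matches, as shown earlier, whereas
- .P1
- .WC ,g/Emacs/p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- prints the whole file once, because
- .CW g
- leaves dot set to the address it was given.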
- .PP
- Neither of these commands is particularly interesting in isolation,
- but they are valuable when combined with
- .CW x
- and with themselves.
- .SH
- Composition
- .PP
- One way to think about the
- .CW x
- command is that, given a selection (a value of dot)
- it iterates through interesting subselections (values of dot within).
- In other words, it takes a piece of text and cuts it into smaller pieces.
- But the text that it cuts up may already be a piece cut by a previous
- .CW x
- command or selected by a
- .CW g .
- .CW sam 's
- most interesting property is the ability to define a sequence of commands
- to perform a particular task.\(dg
- .FS
- \(dg
- The obvious analogy with shell pipelines is only partially valid,
- because the individual
- .CW sam
- commands are all working on the same text; it is only how the text is
- sliced up that is changing.
- .FE
- A simple example is to change all occurrences of
- .CW Emacs
- to
- .CW emacs ;
- certainly the command
- .P1
- .WC ",x/Emacs/ c/emacs/
- .P2
- will work, but we can use an
- .CW x
- command to save retyping most of the word
- .CW Emacs :
- .P1
- .WC ",x/Emacs/ x/E/ c/e/
- .P2
- (Blanks can be used
- to separate commands on a line to make them easier to read.)
- What this command does is find all occurrences of
- .CW Emacs
- .CW ,x/Emacs/ ), (
- and then
- .I
- with dot set to that text,
- .R
- find all occurrences of the letter
- .CW E
- .CW x/E/ ), (
- and then
- .I
- with dot set to that text,
- .R
- run the command
- .CW c/e/
- to change the character to lower case.
- Note that the address for the command \(em the whole file, specified by a comma
- \(em is only given to the leftmost
- piece of the command; the rest of the pieces have dot set for them by
- the execution of the pieces to their left.
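- .PP
- A harmless way to watch the nesting at work, sketched here on the assumption
- that the sample text is in its original state, is to replace the final
- .CW c
- by a
- .CW p :
- .P1
- .WC ",x/Emacs/ x/[A-Z]/ p
- EE
- .P2
- Each
- .CW Emacs
- contributes the one capital letter it contains; capitals elsewhere in the
- file, such as the
- .CW T
- of
- .CW This ,
- are never examined because the outer
- .CW x
- has already narrowed dot.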
- .PP
- As another simple example, consider a problem
- solved above: printing all lines in the file containing the word
- .CW Emacs :
- .P1
- .WC ",x/.*\en/ g/Emacs/p
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .P2
- This command says to break the file into lines
- .CW ,x/.*\en/ ), (
- and for each line that contains the string
- .CW Emacs
- .CW g/Emacs/ ), (
- run the command
- .CW p
- with dot set to the line (not the match of
- .CW Emacs ),
- which prints the line.
- To save typing, because
- .CW .*\en
- is a common pattern in
- .CW x
- commands,
- if the
- .CW x
- is followed immediately by a space, the pattern
- .CW .*\en
- is assumed.
- Therefore, the above could be written more succinctly:
- .P1
- .WC ",x g/Emacs/p
- .P2
- The solution we used before was
- .P1
- .WC ,x/Emacs/+-p
- .P2
- which runs the command
- .CW +-p
- with dot set to each match of
- .CW Emacs
- in the file (recall that the idiom
- .CW +-p
- prints the line containing the end of dot).
- .PP
- The two commands usually produce the same result
- (the
- .CW +-p
- form will print a line twice if it contains
- .CW Emacs
- twice). Which is better?
- .CW ,x/Emacs/+-p
- is easier to type and will be much faster if the file is large and
- there are few occurrences of the string, but it is really an odd special case.
- .CW ",x/.*\en/ g/Emacs/p
- is slower \(em it breaks each line out separately, then examines
- it for a match \(em but is conceptually cleaner, and generalizes more easily.
- For example, consider the following piece of the Emacs manual:
- .P1
- command name="append-to-file", key="[unbound]"
- Takes the contents of the current buffer and appends it to the
- named file. If the files doesn't exist, it will be created.
- command name="apropos", key="ESC-?"
- Prompts for a keyword and then prints a list of those commands
- whose short description contains that keyword. For example,
- if you forget which commands deal with windows, just type
- "@b[ESC-?]@t[window]@b[ESC]".
- \&\f2and so on\f(CW
- .P2
- This text consists of groups of non-empty lines, with a simple format
- for the text within each group.
- Imagine that we wanted to find the description of the `apropos'
- command.
- The problem is to break the file into individual descriptions,
- and then to find the description of `apropos' and to print it.
- The solution is straightforward:
- .P1
- .WC ,x/(.+\en)+/\ g/command\ name="apropos"/p
- command name="apropos", key="ESC-?"
- Prompts for a keyword and then prints a list of those commands
- whose short description contains that keyword. For example,
- if you forget which commands deal with windows, just type
- "@b[ESC-?]@t[window]@b[ESC]".
- .P2
- The regular expression
- .CW (.+\en)+
- matches one or more lines with one or more characters each, that is,
- the text between blank lines, so
- .CW ,x/(.+\en)+/
- extracts each description; then
- .CW g/command\ name="apropos"/
- selects the description for `apropos' and
- .CW p
- prints it.
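- .PP
- For comparison, the same extract-then-filter composition can be sketched
- outside
- .CW sam .
- The Python below is only an illustration, not a description of how
- .CW sam
- works internally: the helper names
- .CW x
- and
- .CW g
- and the shortened sample text are assumptions made for the example,
- and Python's regular expressions stand in for
- .CW sam 's.

```python
import re

# Shortened sample in the style of the manual excerpt; groups are separated by a blank line.
MANUAL = (
    'command name="append-to-file", key="[unbound]"\n'
    "Takes the contents of the current buffer and appends it to the\n"
    "named file.\n"
    "\n"
    'command name="apropos", key="ESC-?"\n'
    "Prompts for a keyword and then prints a list of those commands\n"
    "whose short description contains that keyword.\n"
)

def x(pattern, text):
    """Like sam's x: yield each non-overlapping match of pattern in text."""
    for m in re.finditer(pattern, text):
        yield m.group(0)

def g(pattern, pieces):
    """Like sam's g: keep only the pieces that contain a match of pattern."""
    return (p for p in pieces if re.search(pattern, p))

# ,x/(.+\n)+/ g/command name="apropos"/p
selected = list(g(r'command name="apropos"', x(r"(.+\n)+", MANUAL)))
for piece in selected:
    print(piece, end="")
```

- Each stage only narrows the pieces handed to the next one,
- which is why such commands can be read from left to right.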
- .PP
- Imagine that we had a C program containing the variable
- .CW n ,
- but we wanted to change it to
- .CW num .
- This command is a first cut:
- .P1
- .WC ",x/n/ c/num/
- .P2
- but is obviously flawed: it will change all
- .CW n 's
- in the file, not just the
- .I identifier
- .CW n .
- A better solution is to use an
- .CW x
- command to extract the identifiers, and then use
- .CW g
- to find the
- .CW n 's:
- .P1
- .WC ",x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
- .P2
- It looks awful, but it's fairly easy to understand when read
- left to right.
- A C identifier is an alphabetic character or underscore followed by zero or more
- alphanumerics or underscores, that is, matches of the regular expression
- .CW [a-zA-Z_][a-zA-Z_0-9]* .
- The
- .CW g
- command selects those identifiers containing
- .CW n ,
- and the
- .CW v
- is a trick: it rejects those identifiers containing more than one
- character. Hence the
- .CW c/num/
- applies only to free-standing
- .CW n 's.
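- .PP
- Read as a conventional program, the command is a loop over identifiers
- with two tests inside it.
- The following Python is a rough sketch under that reading; the function name
- .CW rename_n
- and the sample statement are invented for the illustration.

```python
import re

IDENT = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")   # x/[a-zA-Z_][a-zA-Z_0-9]*/

def rename_n(source):
    def change(m):
        word = m.group(0)
        # g/n/ keeps identifiers containing n;
        # v/../ rejects those with two or more characters.
        if re.search(r"n", word) and not re.search(r"..", word):
            return "num"                         # c/num/
        return word
    return IDENT.sub(change, source)

print(rename_n("int n, len; n = n + len;"))
```

- The identifiers
- .CW int
- and
- .CW len
- pass the first test but fail the second, so only the free-standing
- .CW n 's
- are replaced.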
- .PP
- There is still a problem here:
- we don't want to change
- .CW n 's
- that are part of the character constant
- .CW \en .
- There is a command
- .CW y ,
- complementary to
- .CW x ,
- that is just what we need:
- \f(CWy/\f2pattern\f(CW/\f2command\f1
- runs the command on the pieces of text
- .I between
- matches of the pattern;
- if
- .CW x
- selects,
- .CW y
- rejects.
- Here is the final command:
- .P1
- .WC ",y/\e\en/ x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
- .P2
- The
- .CW y/\e\en/
- (with backslash doubled to make it a literal character)
- removes the two-character sequence
- .CW \en
- from consideration, so the rest of the command will not touch it.
- There is more we could do here; for example, another
- .CW y
- could be prefixed to protect comments in the code.
- I won't elaborate the example any further, but you should have
- an idea of the way in which the looping and conditional commands
- in
- .CW sam
- may be composed to do interesting things.
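- .PP
- One way to picture
- .CW y
- is as a splitter: the matches are set aside untouched and the command
- sees only the text between them.
- The Python sketch below takes that view; the names
- .CW y
- and
- .CW rename_n
- are hypothetical helpers, not part of
- .CW sam .

```python
import re

def y(pattern, text, command):
    """Like sam's y: run command on the text between matches of pattern."""
    out = []
    pos = 0
    for m in re.finditer(pattern, text):
        out.append(command(text[pos:m.start()]))  # between matches: edited
        out.append(m.group(0))                    # the match itself: left alone
        pos = m.end()
    out.append(command(text[pos:]))               # text after the last match
    return "".join(out)

def rename_n(piece):
    # x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
    # A one-character identifier containing n can only be "n" itself.
    return re.sub(r"[a-zA-Z_][a-zA-Z_0-9]*",
                  lambda m: "num" if m.group(0) == "n" else m.group(0),
                  piece)

line = r'printf("%d\n", n);'           # raw string: backslash followed by n
print(y(r"\\n", line, rename_n))
```

- Without the
- .CW y
- stage, the
- .CW n
- after the backslash would be an identifier like any other and would be changed too.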
- .SH
- Grouping
- .PP
- There is another way to arrange commands.
- By enclosing them in brace brackets
- .CW {} ,
- commands may be applied in parallel.
- This example uses the
- .CW =
- command, which reports the line and character numbers of dot,
- together with
- .CW p ,
- to report on appearances of
- .CW Emacs
- in our original file:
- .P1
- .WC ,p
- This manual is organized in a rather haphazard manner. The first
- several sections were written hastily in an attempt to provide a
- general introduction to the commands in Emacs and to try to show
- the method in the madness that is the Emacs command structure.
- .ft CI
- ,x/Emacs/{
- =
- +-p
- }
- .ft
- 3; #171,#176
- general introduction to the commands in Emacs and to try to show
- 4; #234,#239
- the method in the madness that is the Emacs command structure.
- .P2
- (The number before the semicolon is the line number;
- the numbers beginning with
- .CW #
- are character numbers.)
- As a more interesting example, consider changing all occurrences of
- .CW Emacs
- to
- .CW vi
- and vice versa. We can type
- .P1
- .ft CI
- ,x/Emacs|vi/{
- g/Emacs/ c/vi/
- g/vi/ c/Emacs/
- }
- .ft
- .P2
- or even
- .P1
- .ft CI
- ,x/[a-zA-Z]+/{
- g/Emacs/ v/....../ c/vi/
- g/vi/ v/.../ c/Emacs/
- }
- .ft
- .P2
- to make sure we don't change strings embedded in words.
- .SH
- Multiple Changes
- .PP
- You might wonder why, once
- .CW Emacs
- has been changed to
- .CW vi
- in the above example,
- the second command in the braces doesn't put it back again.
- The reason is that the commands are run in parallel:
- within any top-level
- .CW sam
- command, all changes to the file refer to the state of the file
- before any of the changes in that command are made.
- After all the changes have been determined, they are all applied
- simultaneously.
- .PP
- In particular, commands within a compound
- command see the state of the file as it was before any of the changes apply.
- This method of evaluation makes some things easier (such as the exchange of
- .CW Emacs
- and
- .CW vi ),
- and some things harder.
- For instance, it is impossible to use a
- .CW p
- command to print the changes as they happen,
- because they haven't happened when the
- .CW p
- is executed.
- An indirect ramification is that changes must occur in forward
- order through the file,
- and must not overlap.
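- .PP
- The rule can be modeled in two phases: first collect every change as a
- range of the original text, then apply them all in one pass.
- The Python below is a minimal sketch of that model, not
- .CW sam 's
- actual implementation; the functions
- .CW apply_changes
- and
- .CW swap
- are assumptions made for the illustration.

```python
import re

def apply_changes(text, changes):
    """Apply (start, end, replacement) edits, all expressed against the original text."""
    out = []
    pos = 0
    for start, end, replacement in changes:
        if start < pos:
            raise ValueError("changes must be in forward order and must not overlap")
        out.append(text[pos:start])
        out.append(replacement)
        pos = end
    out.append(text[pos:])
    return "".join(out)

def swap(text):
    changes = []
    for m in re.finditer(r"Emacs|vi", text):    # ,x/Emacs|vi/{
        piece = m.group(0)
        if re.search(r"Emacs", piece):          #   g/Emacs/ c/vi/
            changes.append((m.start(), m.end(), "vi"))
        if re.search(r"vi", piece):             #   g/vi/ c/Emacs/
            changes.append((m.start(), m.end(), "Emacs"))
    return apply_changes(text, changes)         # }

print(swap("Emacs users and vi users"))
```

- Because both tests inspect the original piece, a freshly written
- .CW vi
- is never seen by the second test, and the exchange works.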
- .SH
- Unix
- .PP
- .CW sam
- has a few commands to connect to Unix processes.
- The simplest is
- .CW ! ,
- which runs the given Unix command with its input and output connected to the terminal.
- .P1
- .WC !date
- Wed May 28 23:25:21 EDT 1986
- !
- .P2
- (When downloaded, the input is connected to
- .CW /dev/null
- and only the first few lines of output are printed;
- any overflow is stored in
- .CW $HOME/sam.err .)
- The final
- .CW !
- is a prompt to indicate when the command completes.
- .PP
- Slightly more interesting is
- .CW > ,
- which provides the current text as standard input to the Unix command:
- .P1
- .WC "1,2 >wc
- 2 22 131
- !
- .P2
- The complement of
- .CW >
- is, naturally,
- .CW < :
- it replaces the current text with the standard output of the Unix command:
- .P1
- .WC "1 <date
- !
- .WC 1p
- Wed May 28 23:26:44 EDT 1986
- .P2
- The last command is
- .CW | ,
- which is a combination of
- .CW <
- and
- .CW > :
- the current text is provided as standard input to the Unix command,
- and the Unix command's standard output is collected and used to
- replace the original text.
- For example,
- .P1
- .WC ",| sort
- .P2
- runs
- .CW sort (1)
- on the file, sorting the lines of the text lexicographically.
- Note that
- .CW < ,
- .CW >
- and
- .CW |
- are
- .CW sam
- commands, not Unix shell operators.
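- .PP
- As a rough model of what
- .CW |
- does, the Python sketch below feeds a range of a buffer to a process and
- splices the output back in its place.
- It assumes a
- .CW sort
- program is available on the search path; the function name
- .CW pipe
- and the sample buffer are invented for the example.

```python
import os
import subprocess

def pipe(text, start, end, argv):
    """Mimic sam's | on text[start:end]: feed it to argv, splice the output back."""
    env = dict(os.environ, LC_ALL="C")   # plain byte ordering, so sort is predictable
    result = subprocess.run(argv, input=text[start:end], capture_output=True,
                            text=True, check=True, env=env)
    return text[:start] + result.stdout + text[end:]

buffer = "pear\napple\nfig\n"
print(pipe(buffer, 0, len(buffer), ["sort"]), end="")   # like ,| sort
```

- In the same model,
- .CW >
- would send the range but leave the buffer alone, and
- .CW <
- would send nothing and keep only the replacement.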
- .PP
- The next example converts all appearances of
- .CW Emacs
- to upper case using
- .CW tr (1):
- .P1
- .WC ",x/Emacs/ | tr a-z A-Z
- .P2
- .CW tr
- is run once for each occurrence of
- .CW Emacs .
- Of course, you could do this example more efficiently with a simple
- .CW c
- command, but here's a trickier one:
- given a Unix mail box as input,
- convert all the
- .CW Subject
- headers to distinct fortunes:
- .P1
- .WC ",x/^Subject:.*\en/ x/[^:]*\en/ < /usr/games/fortune
- .P2
- (The regular expression
- .CW [^:]
- refers to any character
- .I except
- .CW :
- and newline; the negation operator
- .CW ^
- always excludes newline in addition to the characters listed.)
- Again,
- .CW /usr/games/fortune
- is run once for each
- .CW Subject
- line, so each
- .CW Subject
- line is changed to a different fortune.
- .SH
- A few other text commands
- .PP
- For completeness, I should mention three other commands that
- manipulate text. The
- .CW m
- command moves the current text to after the text specified by the
- (obligatory) address after the command.
- Thus
- .P1
- .WC "/Emacs/+- m 0
- .P2
- moves the next line containing
- .CW Emacs
- to the beginning of the file.
- Similarly,
- .CW t
- (another historic character) copies the text:
- .P1
- .WC "/Emacs/+- t 0
- .P2
- would make, at the beginning of the file, a copy of the next line
- containing
- .CW Emacs .
- .PP
- The third command is more interesting: it makes substitutions.
- Its syntax is
- \f(CWs/\f2pattern\f(CW/\f2replacement\f(CW/\f1.
- Within the current text, it finds the first occurrence of
- the pattern and replaces it by the replacement text,
- leaving dot set to the entire address of the substitution.
- .P1
- .WC 1p
- This manual is organized in a rather haphazard manner. The first
- .WC s/haphazard/thoughtless/
- .WC p
- This manual is organized in a rather thoughtless manner. The first
- .P2
- Occurrences of the character
- .CW &
- in the replacement text stand for the text matching the pattern.
- .P1
- .WC s/T/"&&&&"/
- .WC p
- "TTTT"his manual is organized in a rather thoughtless manner. The first
- .P2
- There are two variants. The first is that a number may be specified
- after the
- .CW s ,
- to indicate which occurrence of the pattern to substitute; the default
- is the first.
- .P1
- .WC s2/is/was/
- .WC p
- "TTTT"his manual was organized in a rather thoughtless manner. The first
- .P2
- The second is that suffixing a
- .CW g
- (global) causes replacement of all occurrences, not just the first.
- .P1
- .WC s/[a-zA-Z]/x/g
- .WC p
- "xxxx"xxx xxxxxx xxx xxxxxxxxx xx x xxxxxx xxxxxxxxxxx xxxxxx. xxx xxxxx
- .P2
- Notice that in all these examples
- dot is left
- set to the entire line.
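- .PP
- The three forms can be imitated in a few lines of Python.
- This is a sketch under simplifying assumptions: the function
- .CW s
- and its parameters
- .CW nth
- and
- .CW global_
- are hypothetical, every
- .CW &
- in the replacement is expanded with no way to escape it,
- and Python's regular expressions stand in for
- .CW sam 's.

```python
import re

def s(pattern, replacement, text, nth=1, global_=False):
    """Replace the nth match of pattern (default the first), or all of them if global_."""
    def expand(m):
        # & in the replacement stands for the matched text
        return replacement.replace("&", m.group(0))
    if global_:
        return re.sub(pattern, expand, text)
    matches = list(re.finditer(pattern, text))
    if len(matches) < nth:
        return text                      # no such occurrence: leave text unchanged
    m = matches[nth - 1]
    return text[:m.start()] + expand(m) + text[m.end():]

line = "This manual is organized in a rather haphazard manner. The first"
line = s(r"haphazard", "thoughtless", line)      # s/haphazard/thoughtless/
line = s(r"T", '"&&&&"', line)                   # s/T/"&&&&"/
line = s(r"is", "was", line, nth=2)              # s2/is/was/
print(line)
print(s(r"[a-zA-Z]", "x", line, global_=True))   # s/[a-zA-Z]/x/g
```

- Note that the second
- .CW is
- is the word itself, because the first occurrence is inside
- .CW his .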
- .PP
- [The substitute command is vital to
- .CW ed ,
- because it is the only way to make changes within a line.
- It is less valuable in
- .CW sam ,
- in which the concept of a line is much less important.
- For example, many
- .CW ed
- substitution idioms are handled well by
- .CW sam 's
- basic commands. Consider the commands
- .P1
- s/good/bad/
- s/good//
- s/good/& bye/
- .P2
- which are equivalent in
- .CW sam
- to
- .P1
- /good/c/bad/
- /good/d
- /good/a/ bye/
- .P2
- and for which the context search is likely unnecessary because the desired
- text is already dot.
- Also, beware this
- .CW ed
- idiom:
- .P1
- 1,$s/good/bad/
- .P2
- which changes the first
- .CW good
- on each line; the same command in
- .CW sam
- will only change the first one in the whole file.
- The correct
- .CW sam
- version is
- .P1
- ,x s/good/bad/
- .P2
- but what is more likely meant is
- .P1
- ,x/good/ c/bad/
- .P2
- .CW sam
- operates under different rules.]
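- .PP
- The difference between the three readings is easy to see on a tiny sample.
- The Python below is only an analogy, with an invented two-line text;
- each statement mimics the command named in its comment.

```python
import re

text = "good good\ngood\n"

# sam: 1,$s/good/bad/  -- dot is the whole file, so only the first match changes
whole = re.sub(r"good", "bad", text, count=1)

# sam: ,x s/good/bad/  (ed: 1,$s/good/bad/) -- the first match on each line
per_line = "".join(re.sub(r"good", "bad", ln, count=1)
                   for ln in text.splitlines(keepends=True))

# sam: ,x/good/ c/bad/  -- every match
every = re.sub(r"good", "bad", text)

for result in (whole, per_line, every):
    print(repr(result))
```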
- .SH
- Files
- .PP
- So far, we have only been working with a single file,
- but
- .CW sam
- is a multi-file editor.
- Only one file may be edited at a time, but
- it is easy to change which file is the `current' file for editing.
- To see how to do this, we need a
- .CW sam
- with a few files;
- the easiest way to do this is to start it
- with a list of Unix file names to edit.
- .P1
- $ \f(CIecho *.ms\f(CW
- conquest.ms death.ms emacs.ms famine.ms slaughter.ms
- $ \f(CIsam -d *.ms\f(CW
- -. conquest.ms
- .P2
- (I'm sorry the Horsemen don't appear in liturgical order.)
- The line printed by
- .CW sam
- is an indication that the Unix file
- .CW conquest.ms
- has been read, and is now the current file.
- .CW sam
- does not read the Unix file until
- the associated
- .CW sam
- file becomes current.
- .PP
- The
- .CW n
- command prints the names of all the files:
- .P1
- .WC n
- -. conquest.ms
- - death.ms
- - emacs.ms
- - famine.ms
- - slaughter.ms
- .P2
- This list is also available in the menu on mouse button 3.
- The command
- .CW f
- tells the name of just the current file:
- .P1
- .WC f
- -. conquest.ms
- .P2
- The characters to the left of the file name encode helpful information about
- the file.
- The minus sign becomes a plus sign if the file has a window open, and an
- asterisk if more than one is open.
- The period (another meaning of dot) identifies the current file.
- The leading blank changes to an apostrophe if the file is different
- from the contents of the associated Unix file, as far as
- .CW sam
- knows.
- This becomes evident if we make a change.
- .P1
- .WC 1d
- .WC f
- \&'-. conquest.ms
- .P2
- If the file is restored by an undo command, the apostrophe disappears.
- .P1
- .WC u
- .WC f
- -. conquest.ms
- .P2
- The file name may be changed by providing a new name with the
- .CW f
- command:
- .P1
- .WC "f pestilence.ms
- \&'-. pestilence.ms
- .P2
- .CW f
- prints the new status of the file,
- that is, it changes the name if one is provided, and prints the
- name regardless.
- A file name change may also be undone.
- .P1
- .WC u
- .WC f
- -. conquest.ms
- .P2
- .PP
- When
- .CW sam
- is downloaded, the current file may be changed simply by selecting
- the desired file from the menu (selecting the same file subsequently
- cycles through the windows opened on the file).
- Otherwise, the
- .CW b
- command can be used to choose the desired file:\(dg
- .FS
- \(dg A bug prevents the
- .CW b
- command from working when downloaded.
- Because the menu is more convenient anyway, and
- because the method
- of choosing files from the command language is slated to change,
- the bug hasn't been fixed.
- .FE
- .P1
- .WC "b emacs.ms
- -. emacs.ms
- .P2
- Again,
- .CW sam
- prints the name (actually, executes an implicit
- .CW f
- command) because the Unix file
- .CW emacs.ms
- is being read for the first time.
- It is an error to ask for a file
- .CW sam
- doesn't know about, but the
- .CW B
- command will prime
- .CW sam 's
- menu with a new file, and make it current.
- .P1
- .WC "b flood.pic
- ?no such file `flood.pic'
- .WC "B flood.pic
- -. flood.pic
- .WC n
- - conquest.ms
- - death.ms
- - emacs.ms
- - famine.ms
- -. flood.pic
- - slaughter.ms
- .P2
- Both
- .CW b
- and
- .CW B
- will accept a list of file names.
- .CW b
- simply takes the first file in the list, but
- .CW B
- loads them all.
- The list may be typed on one line \(em
- .P1
- .WC "B devil.tex satan.tex 666.tex emacs.tex
- .P2
- \(em or generated by a Unix command \(em
- .P1
- .WC "B <echo *.tex
- .P2
- The latter form requires a Unix command;
- .CW sam
- does not understand the shell file name metacharacters, so
- .CW "B *.tex
- attempts to load a single file named
- .CW *.tex .
- (The
- .CW <
- form is of course derived from
- .CW sam 's
- .CW <
- command.)
- .CW echo
- is not the only useful command to run subservient to
- .CW B ;
- for example,
- .P1
- .WC "B <grep -l Emacs *
- .P2
- will load only those files containing the string
- .CW Emacs .
- Finally, a special case: a
- .CW B
- with no arguments creates an empty, nameless file within
- .CW sam .
- .PP
- The complement of
- .CW B
- is
- .CW D :
- .P1
- .WC "D devil.tex satan.tex 666.tex emacs.tex
- .P2
- eradicates the files from
- .CW sam 's
- memory (not from the Unix machine's disc).
- .CW D
- without any file names removes the current file from
- .CW sam .
- .PP
- There are three other commands that relate the current file
- to Unix files.
- The
- .CW w
- command writes the file to disc;
- without arguments, it writes the entire file to the Unix file associated
- with the current file in
- .CW sam
- (it is the only command whose default address is not dot).
- Of course, you can specify an address to be written,
- and a different file name, with the obvious syntax:
- .P1
- .WC "1,2w /tmp/revelations
- /tmp/revelations: #44
- .P2
- .CW sam
- responds with the file name and the number of characters written to the file.
- The
- .CW write
- command on the button 3 menu is identical in function to an unadorned
- .CW w
- command.
- .PP
- The other two commands,
- .CW e
- and
- .CW r ,
- read data from Unix files.
- The
- .CW e
- command clears out the current file,
- reads the data from the named file (or uses the current file's old name if
- none is explicitly provided), and sets the file name.
- It's much like a
- .CW B
- command, but puts the information in the current file instead of a new one.
- .CW e
- without any file name is therefore an easy way to refresh
- .CW sam 's
- copy of a Unix file.
- [Unlike in
- .CW ed ,
- .CW e
- doesn't complain if the file is modified. The principle is not
- to protect against things that can be undone if wrong.]
- Since its job is to replace the whole text,
- .CW e
- never takes an address.
- .PP
- The
- .CW r
- command is like
- .CW e ,
- but it doesn't clear the file:
- the text in the Unix file replaces dot, or the specified text if an
- address is given.
- .P1
- .WC "r emacs.ms
- .P2
- has essentially the effect of
- .P1
- .WC "<cat emacs.ms
- .P2
- The commands
- .CW r
- and
- .CW w
- will set the name of the file if the current file has no name already defined;
- .CW e
- sets the name even if the file already has one.
- .PP
- There is a command, analogous to
- .CW x ,
- that iterates over files instead of pieces of text:
- .CW X
- (capital
- .CW x ).
- The syntax is easy; it's just like that of
- .CW x
- \(em \f(CWX/\f2pattern\f(CW/\f2command\f1.
- (The complementary command is
- .CW Y ,
- analogous to
- .CW y .)
- The effect is to run the command in each file whose menu entry
- (that is, whose line printed by an
- .CW f
- command) matches the pattern.
- For example, since an apostrophe identifies modified files,
- .P1
- .WC "X/'/ w
- .P2
- writes the changed files out to disc.
- Here is a longer example: find all uses of a particular variable
- in the C source files:
- .P1
- .WC "X/\e.c$/ ,x/variable/+-p
- .P2
- We can use an
- .CW f
- command to identify which file the variable appears in:
- .P1
- .ft CI
- X/\e.c$/ ,g/variable/ {
- f
- ,x/variable/+-{
- =
- p
- }
- }
- .ft
- .P2
- Here, the
- .CW g
- command guarantees that only the names of files containing the variable
- will be printed (but beware that
- .CW sam
- may confuse matters by printing the names of files it reads in during
- the command).
- The
- .CW =
- command shows where in the file the variable appears, and the
- .CW p
- command prints the line.
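- .PP
- The shape of that compound command, a file loop around a text loop,
- can be sketched in Python.
- Everything here is an assumption made for illustration: the dictionary
- standing in for
- .CW sam 's
- file list, the exact spacing of the menu lines, the file names,
- and a simplified report that gives only the line number where
- .CW =
- would also give character offsets.

```python
import re

# Hypothetical stand-in for sam's file list: menu line -> file contents.
files = {
    " -. main.c": "int count;\ncount = 0;\n",
    " -  util.c": "int helper(void);\n",
    " -  notes.ms": "count the ways\n",
}

def X(pattern, files, command):
    """Like sam's X: run command in each file whose menu line matches pattern."""
    for menu_line, contents in files.items():
        if re.search(pattern, menu_line):
            command(menu_line, contents)

report = []

def show_uses(menu_line, contents):
    if not re.search(r"count", contents):        # ,g/count/ tests the whole file
        return
    report.append(menu_line)                     # f
    for number, line in enumerate(contents.splitlines(), start=1):
        for _ in re.finditer(r"count", line):    # ,x/count/+- : once per match
            report.append(f"{number}; {line}")   # = and p, simplified

X(r"\.c$", files, show_uses)
print("\n".join(report))
```

- The file
- .CW util.c
- matches the pattern but contains no use of the variable, so the guard
- keeps its name out of the report.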
- .PP
- The
- .CW D
- command is handy as the target of an
- .CW X .
- This example deletes from the menu all C files that do not contain
- a particular variable:
- .P1
- .WC "X/\e.c$/ ,v/variable/ D
- .P2
- If no pattern is provided for the
- .CW X ,
- the command (which defaults to
- .CW f )
- is run in all files, so
- .P1
- .WC "X D
- .P2
- cleans
- .CW sam
- up for a fresh start.
- .PP
- But rather than working any further, let's stop now:
- .P1
- .WC q
- $
- .P2
- .fi
- .PP
- Some of the file-manipulating commands can be undone:
- undoing a
- .CW f ,
- .CW e ,
- or
- .CW r
- restores the previous state of the file,
- but
- .CW w ,
- .CW B
- and
- .CW D
- are irrevocable.
- And, of course, so is
- .CW q .
|