- .TH REGEXP 6
- .SH NAME
- regexp \- regular expression notation
- .SH DESCRIPTION
- A
- .I "regular expression"
- specifies
- a set of strings of characters.
- A member of this set of strings is said to be
- .I matched
- by the regular expression. In many applications
- a delimiter character, commonly
- .LR / ,
- bounds a regular expression.
- In the following specification for regular expressions
- the word `character' means any character (rune) but newline.
- .PP
- The syntax for a regular expression
- .B e0
- is
- .IP
- .EX
- e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'
- e2:  e3
-   |  e2 REP
- REP: '*' | '+' | '?'
- e1:  e2
-   |  e1 e2
- e0:  e1
-   |  e0 '|' e1
- .EE
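- .PP
- As an illustration only, the grammar above can be read as a
- recursive-descent parser, sketched here in Python.
- The function
- .B parse
- and its tuple-shaped tree are hypothetical names invented for this
- sketch, not part of any library; character classes and delimiters
- are left out to keep it short.
- It suggests how repetition binds tighter than concatenation,
- which in turn binds tighter than alternation.

```python
# Hypothetical sketch of the e0..e3 grammar; not the regexp(2) parser.
# Character classes are omitted, so '[' and ']' are rejected unless escaped.
META = set(".*+?[]()|\\^$")


def parse(src):
    """Parse src as an e0 and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return src[pos] if pos < len(src) else None

    def e3():
        # e3: literal | '.' | '^' | '$' | '(' e0 ')'
        nonlocal pos
        c = peek()
        if c is None:
            raise ValueError("unexpected end of expression")
        if c == "(":
            pos += 1
            node = e0()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        if c == "\\":
            # A backslash turns the next character into a literal.
            if pos + 1 >= len(src):
                raise ValueError("trailing backslash")
            pos += 2
            return ("lit", src[pos - 1])
        if c in ".^$":
            pos += 1
            return ({".": "any", "^": "bol", "$": "eol"}[c],)
        if c in META:
            raise ValueError("unexpected %r at offset %d" % (c, pos))
        pos += 1
        return ("lit", c)

    def e2():
        # e2: e3 | e2 REP
        nonlocal pos
        node = e3()
        while peek() in ("*", "+", "?"):
            node = ("rep", peek(), node)
            pos += 1
        return node

    def e1():
        # e1: e2 | e1 e2
        node = e2()
        while peek() not in (None, "|", ")"):
            node = ("cat", node, e2())
        return node

    def e0():
        # e0: e1 | e0 '|' e1
        nonlocal pos
        node = e1()
        while peek() == "|":
            pos += 1
            node = ("alt", node, e1())
        return node

    tree = e0()
    if pos != len(src):
        raise ValueError("unbalanced ')' at offset %d" % pos)
    return tree
```

- .PP
- Under these assumptions the expression
- .B ab|c*
- parses as an alternation of the concatenation
- .B ab
- and the repetition
- .BR c* .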
- .PP
- A
- .B literal
- is any non-metacharacter, or a metacharacter
- (one of
- .BR .*+?[]()|\e^$ ),
- or the delimiter
- preceded by
- .LR \e .
- .PP
- A
- .B charclass
- is a nonempty string
- .I s
- bracketed
- .BI [ \|s\| ]
- (or
- .BI [^ s\| ]\fR);
- it matches any character in (or not in)
- .IR s .
- A negated character class never
- matches newline.
- A substring
- .IB a - b\f1,
- with
- .I a
- and
- .I b
- in ascending
- order, stands for the inclusive
- range of
- characters between
- .I a
- and
- .IR b .
- In
- .IR s ,
- the metacharacters
- .LR - ,
- .LR ] ,
- an initial
- .LR ^ ,
- and the regular expression delimiter
- must be preceded by a
- .LR \e ;
- other metacharacters
- have no special meaning and
- may appear unescaped.
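- .PP
- The class rules can be tried out with Python's
- .B re
- module, which is a different engine from
- .IR regexp (2)
- but appears to agree on the cases shown here.
- The helper
- .B in_class
- is a hypothetical name used only for this sketch.
- One known difference: in Python a negated class does match newline,
- whereas the classes described above never do.

```python
import re


def in_class(cls, ch):
    """Return True if the one-character string ch matches the class cls."""
    return re.fullmatch(cls, ch) is not None


# Inclusive range: a, b and c all belong to [a-c].
print(in_class(r"[a-c]", "c"))
# Negation: d is not in a-c, so [^a-c] accepts it.
print(in_class(r"[^a-c]", "d"))
# An escaped '-' is a plain member, not a range operator.
print(in_class(r"[a\-c]", "-"), in_class(r"[a\-c]", "b"))
# An escaped ']' and an escaped initial '^' are plain members too.
print(in_class(r"[x\]]", "]"), in_class(r"[\^x]", "^"))
# Other metacharacters need no escape inside a class.
print(in_class(r"[.*]", "*"), in_class(r"[.*]", "a"))
```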
- .PP
- A
- .L .
- matches any character.
- .PP
- A
- .L ^
- matches the beginning of a line;
- .L $
- matches the end of the line.
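- .PP
- A rough sketch of the anchors, again borrowing Python's
- .B re
- module rather than
- .IR regexp (2).
- Python anchors only at the ends of the whole text unless
- .B re.MULTILINE
- is given, so that flag is assumed here to approximate the per-line
- behavior described above.
- The helper
- .B line_hits
- is a hypothetical name.

```python
import re


def line_hits(pattern, text):
    """Return the start offsets of every match, anchoring per line."""
    return [m.start() for m in re.finditer(pattern, text, re.MULTILINE)]


text = "ab\nbc"
# '^b' can match only where b begins a line: offset 3, not offset 1.
print(line_hits(r"^b", text))
# 'b$' can match only where b ends a line: offset 1, not offset 3.
print(line_hits(r"b$", text))
```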
- .PP
- The
- .B REP
- operators match zero or more
- .RB ( * ),
- one or more
- .RB ( + ),
- and zero or one
- .RB ( ? )
- instances, respectively, of the preceding regular expression
- .BR e2 .
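- .PP
- The three operators can be compared side by side with a short
- Python sketch.
- It relies on Python's
- .B re
- module, assumed to behave like the notation above for these simple cases;
- .B matches
- is a hypothetical helper that tests whether the whole text is matched.

```python
import re


def matches(pattern, text):
    """Return True if pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None


for pattern in ("ab*", "ab+", "ab?"):
    # Try zero, one and two instances of the repeated 'b'.
    results = [matches(pattern, text) for text in ("a", "ab", "abb")]
    print(pattern, results)

# A repetition applies to the preceding e2, so parentheses are needed
# to repeat more than one character.
print(matches("(ab)+", "abab"), matches("ab+", "abab"))
```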
- .PP
- A concatenated regular expression,
- .BR "e1\|e2" ,
- matches a string consisting of a match to
- .B e1
- followed by a match to
- .BR e2 .
- .PP
- An alternative regular expression,
- .BR "e0\||\|e1" ,
- matches any string that is either a match to
- .B e0
- or a match to
- .BR e1 .
- .PP
- A match to any part of a regular expression
- extends as far as possible without preventing
- a match to the remainder of the regular expression.
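- .PP
- The rule can be seen with a small Python sketch.
- Python's
- .B re
- module is a backtracking engine rather than the one in
- .IR regexp (2),
- and the two may choose differently among alternatives, so the example
- sticks to repetition, where the behavior is assumed to agree.
- The helper
- .B split_greedy
- is a hypothetical name.

```python
import re


def split_greedy(text):
    """Return what 'a*' and the trailing 'ab' each consumed, or None."""
    m = re.match(r"(a*)(ab)", text)
    return m.groups() if m else None


# On its own, a* takes every leading a.
print(re.match(r"a*", "aaab").group(0))
# Followed by 'ab', a* must leave one a behind for the remainder.
print(split_greedy("aaab"))
```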
- .SH "SEE ALSO"
- .IR awk (1),
- .IR ed (1),
- .IR grep (1),
- .IR sam (1),
- .IR sed (1),
- .IR regexp (2)