List of Examples
REGEXP:MATCH
REGEXP:REGEXP-QUOTE
The “REGEXP” module implements the POSIX
regular expressions
matching by calling the standard C system facilities.
The syntax of these regular expressions is described in many places,
such as your local <regex.h
> manual and Emacs info pages.
This module is present in the base linking set by default.
When this module is present, *FEATURES*
contains the symbol :REGEXP
.
Regular Expression API
(REGEXP:MATCH
pattern
string
&KEY
(:START
0) :END
:EXTENDED
:IGNORE-CASE
:NEWLINE
:NOSUB
:NOTBOL
:NOTEOL
)
This macro returns as first value a REGEXP:MATCH
structure
containing the indices of the start and end of the first match for the
regular expression pattern
in string
; or no values if there is no match.
Additionally, a REGEXP:MATCH
structure is returned for every matched
"\(...\)"
group in pattern
, in the
order that the open parentheses appear in pattern
.
If start
is non-NIL
, the search starts at that index in string
.
If end
is non-NIL
, only (
is considered.
SUBSEQ
string
start
end
)
Example 33.1. REGEXP:MATCH
(REGEXP:MATCH
"quick" "The quick brown fox jumped quickly.") ⇒#S(
(REGEXP:MATCH
:START 4 :END 9)REGEXP:MATCH
"quick" "The quick brown fox jumped quickly." :start 8) ⇒#S(
(REGEXP:MATCH
:START 27 :END 32)REGEXP:MATCH
"quick" "The quick brown fox jumped quickly." :start 8 :end 30) ⇒(
NIL
REGEXP:MATCH
"\\([a-z]*\\)[0-9]*\\(bar\\)" "foo12bar") ⇒#S(
; ⇒REGEXP:MATCH
:START 0 :END 8)#S(
; ⇒REGEXP:MATCH
:START 0 :END 3)#S(
REGEXP:MATCH
:START 5 :END 8)
(REGEXP:MATCH-START
match
)
(REGEXP:MATCH-END
match
)
match
; SETF
-able.
(REGEXP:MATCH-STRING
string
match
)
string
corresponding
to the given pair of start and end indices of match
.
The result is shared with string
.
If you want a fresh STRING
, use COPY-SEQ
or
COERCE
to SIMPLE-STRING
.(REGEXP:REGEXP-QUOTE
string
&OPTIONAL
extended
)
This function returns a regular expression STRING
that matches exactly string
and nothing else.
This allows you to request an exact string match when calling a
function that wants a regular expression.
One use of REGEXP:REGEXP-QUOTE
is to combine an exact string match with
context described as a regular expression.
When extended
is non-NIL
, also
quote #\+ and #\?.
(REGEXP:REGEXP-COMPILE
string
&KEY
:EXTENDED
:IGNORE-CASE
:NEWLINE
:NOSUB
)
string
into an
object suitable for REGEXP:REGEXP-EXEC
.(REGEXP:REGEXP-EXEC
pattern
string
&KEY
:RETURN-TYPE
(:START
0) :END
:NOTBOL
:NOTEOL
)
Execute the pattern
, which must be a compiled
regular expression returned by REGEXP:REGEXP-COMPILE
, against the appropriate
portion of the string
.
Returns REGEXP:MATCH
structures as multiple values (one for each
subexpression which successfully matched and one for the whole pattern).
If :RETURN-TYPE
is LIST
(or VECTOR
), the
REGEXP:MATCH
structures are returned as a LIST
(or a VECTOR
) instead.
Also, if there are more than MULTIPLE-VALUES-LIMIT
REGEXP:MATCH
structures
to return, a LIST
is returned instead of multiple values.
If :RETURN-TYPE
is BOOLEAN
, return T
or NIL
as an indicator
of success or failure, but do not allocate anything.
(REGEXP:REGEXP-SPLIT
pattern
string
&KEY
(:START
0) :END
:EXTENDED
:IGNORE-CASE
:NEWLINE
:NOSUB
:NOTBOL
:NOTEOL
)
string
(all
sharing the structure with string
) separated by pattern
(a
regular expression STRING
or a return value of REGEXP:REGEXP-COMPILE
)
(REGEXP:WITH-LOOP-SPLIT
(variable
stream
pattern
&KEY
(:START
0) :END
:EXTENDED
:IGNORE-CASE
:NEWLINE
:NOSUB
:NOTBOL
:NOTEOL
) &BODY
body
)
stream
, split them with
REGEXP:REGEXP-SPLIT
on pattern
, and bind the resulting list to
variable
.:EXTENDED
:IGNORE-CASE
:NEWLINE
:NOSUB
regex.h
> for their meaning.:NOTBOL
:NOTEOL
regex.h
> for their meaning.REGEXP:REGEXP-MATCHER
CUSTOM:*APROPOS-MATCHER*
.
This will work only when your LOCALE
is CHARSET:UTF-8
because CLISP uses CHARSET:UTF-8
internally and POSIX constrains
<regex.h
> to use the current LOCALE
.Example 33.3. Count unix shell users
The following code computes the number of people who use a particular shell:
#!/usr/bin/clisp -C (USE-PACKAGE
"REGEXP") (let ((h (make-hash-table :test #'equal :size 10)) (n 0)) (with-open-file (f (or (firstEXT:*ARGS*
) "/etc/passwd")) (with-loop-split (s f (s f (or (secondEXT:*ARGS*
) ":"))) (incf (gethash (seventh s) h 0)))) (with-hash-table-iterator (i h) (loop (multiple-value-bind (r k v) (i) (unless r (return)) (format t "[~d] ~s~30t== ~5:d~%" (incf n) k v)))))
For comparison, the same (almost - except for the nice output formatting) can be done by the following Perl:
#!/usr/bin/perl -w use diagnostics; use strict; my $IN = $ARGV[0] || "/etc/passwd"; open(INF,"≤ $IN") or die "$0: cannot read file [$IN]: $!\n"; my %hash; while (≤INF≥) { chop; my @all = split($ARGV[1] || ":"); $hash{$all[6] || ""}++; } my $ii = 0; for my $kk (keys(%hash)) { print "[",++$ii,"] \"",$kk,"\" -- ",$hash{$kk},"\n"; } close(INF);
These notes document CLISP version 2.49.93+ | Last modified: 2018-02-19 |