An update to the new Regular Expression API (added 7/29/04)

In June, we released a new, fast, Perl-compatible regular expression matcher module. It is described in this essay. We have now released an update to that module, with a new function and three new macros. In this essay, we briefly describe this new functionality.

Download the new module with sys:update-allegro, then evaluate (require :regexp2) to load it. The documentation has been updated; see regexp.htm. The new regexp module and the additional functionality are described in the section "The new regexp2 module" of that document.
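As a minimal sketch, the two steps above might look like this at the Lisp prompt. It assumes an Allegro CL installation that can reach the update server; depending on the installation, a restart or image rebuild may be needed between the two forms before the updated module is picked up.

```lisp
;; Fetch available updates, including the updated regexp2 module.
(sys:update-allegro)

;; Load the module so that match-re, re-submatch, re-lambda,
;; re-let, and re-case are available.
(require :regexp2)
```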

The new function is re-submatch. The three new macros are re-lambda, re-let, and re-case (the links are to the descriptions in regexp.htm).

The function re-submatch

Arguments: regexp string indexes selector &key (type :string)

This function is an efficient way to deal with submatches in a regular expression when the caller of match-re, the function that finds the matches, will not use all of them. In the following example, match-re returns a match object, and the submatch string is not created until the call to re-submatch:

cl-user(3): (setq m (match-re "foo (.*)" "foo bar" :return :match))
#S(regexp::regular-expression-match :indexes ((0 . 7) (4 . 7))
                                    :input "foo bar" :num-submatches 1
                                    :named-submatches nil)
cl-user(4): (re-submatch m nil nil 1)
"bar"
cl-user(5): 

Or, using named submatches:

cl-user(9): (setq m (match-re "foo (?<noun>.*)" "foo bar" :return :match))
#S(regexp::regular-expression-match :indexes ((0 . 7) (4 . 7))
                                    :input "foo bar" :num-submatches 1
                                    :named-submatches (("noun" 1)))
cl-user(10): (re-submatch m nil nil "noun")
"bar"
cl-user(11): 

The combination of named submatches and the match object returned by match-re can be very powerful.
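To illustrate the point, here is a small sketch that matches once, keeps the match object, and pulls out a submatch only when it is needed. The pattern, the input string, and the variable name *m* are made up for illustration; the calls follow the same shape as the transcripts above.

```lisp
(require :regexp2)

;; Match once and keep the match object; no submatch strings are created yet.
(defparameter *m*
  (match-re "(?<key>[^=]+)=(?<value>.*)" "color=blue" :return :match))

;; Extract only the submatch that is needed, by name...
(re-submatch *m* nil nil "value")

;; ...or by position, as in the first example.
(re-submatch *m* nil nil 1)
```

Because the match object carries the input and the submatch indexes, it can be passed around and queried later without rerunning the match.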

The macro re-lambda

Arguments: regexp bindings &body body

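This macro returns a function that can later be applied to a string. The function matches the string against regexp and evaluates body with the variables in bindings bound to the selected submatches. For example: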

cl-user(20): (funcall (re-lambda "([^ ]+) ([^ ]+) ([^ ]+)"
                          ((foo 1) (bar 2) (baz 3))
                        (list foo bar baz))
                      
                      "foo the bar")
("foo" "the" "bar")

Now using named submatches:

cl-user(21): (funcall (re-lambda "(?<pos1>[^ ]+) (?<pos2>[^ ]+) (?<pos3>[^ ]+)"
                          ((foo "pos1") (bar "pos2") (baz "pos3"))
                        (list foo bar baz))
                      
                      "foo the bar")
("foo" "the" "bar")

Now using named submatches with keywords as their name (the preferred method):

cl-user(22): (funcall (re-lambda "(?<pos1>[^ ]+) (?<pos2>[^ ]+) (?<pos3>[^ ]+)"
                          ((foo :pos1) (bar :pos2) (baz :pos3))
                        (list foo bar baz))
                      
                      "foo the bar")
("foo" "the" "bar")
cl-user(23): 
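Since re-lambda returns an ordinary function, it can also be stored and reused rather than called on the spot. The following is a sketch only: the variable name *parse-pair*, the pattern, and the input strings are hypothetical, and every input is chosen to match the pattern, so the behavior on non-matching input is not shown.

```lisp
(require :regexp2)

;; Hypothetical parser for "key=value" strings; all names are illustrative.
(defparameter *parse-pair*
  (re-lambda "(?<key>[^=]+)=(?<value>.*)"
      ((k "key") (v "value"))
    (cons k v)))

;; Apply the stored function to several strings, each of which matches.
(mapcar *parse-pair* '("color=blue" "size=large"))
```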

The macro re-let

Arguments: regexp string bindings &body body

Similar to let, but the variables are bound to submatches of regexp matched against string, using the same binding syntax as re-lambda:

cl-user(23): (re-let "(?<pos1>[^ ]+) (?<pos2>[^ ]+) (?<pos3>[^ ]+)"
                "foo the bar"
                ((foo :pos1) (bar :pos2) (baz :pos3))
              (list foo bar baz))
("foo" "the" "bar")
cl-user(24): 

The macro re-case

Arguments: string &rest clauses

This macro provides a handy syntax to test a string against a succession of regular expressions:

cl-user(24): (re-case "foo the barmy"
               ("foo a (.*)" ((it 1)) (list :a it))
               ("foo the (.*)" ((it 1)) (list :the it))
               (t :no-match))
(:the "barmy")
cl-user(25): (re-case "foo a barmy"
               ("foo a (.*)" ((it 1)) (list :a it))
               ("foo the (.*)" ((it 1)) (list :the it))
               (t :no-match))
(:a "barmy")
cl-user(26): (re-case "foo blah blah"
               ("foo a (.*)" ((it 1)) (list :a it))
               ("foo the (.*)" ((it 1)) (list :the it))
               (t :no-match))
:no-match
cl-user(27): 