Regular expressions for Elixir built on top of Erlang's re module.
Like the re module, Regex is based on PCRE (Perl Compatible Regular Expressions). More information can be found in the re documentation.
Regular expressions in Elixir can be created using Regex.compile!/2 or using the special form with ~r:
# A simple regular expression that matches foo anywhere in the string
~r/foo/
# A regular expression with case insensitive and unicode options
~r/foo/iu
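The two creation styles can be compared side by side. The following is a minimal sketch, assuming the modifiers are passed to Regex.compile!/2 as a string; the variable names are only illustrative.

```elixir
# Compile a pattern at runtime; the second argument carries the modifiers.
compiled = Regex.compile!("foo", "i")

# The sigil form of the same pattern and modifier.
sigil = ~r/foo/i

# Both regexes match case-insensitively.
compiled_matches = Regex.match?(compiled, "FOO bar")
sigil_matches = Regex.match?(sigil, "FOO bar")

# Both were built from the same source pattern.
same_source = Regex.source(compiled) == Regex.source(sigil)

IO.inspect({compiled_matches, sigil_matches, same_source})
```

Regex.compile!/2 is the usual choice when the pattern is only known at runtime, while ~r suits patterns written directly in the source code.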
A Regex is represented internally as the Regex struct. Therefore, %Regex{} can be used whenever there is a need to pattern match on a regex.
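As a sketch of matching on the struct, the hypothetical module below (RegexStructExample and describe/1 are illustrative names, not part of Regex) dispatches on whether its argument is a regex or a plain string.

```elixir
defmodule RegexStructExample do
  # Hypothetical helper: this clause is selected only for Regex structs.
  def describe(%Regex{} = regex), do: "regex with source " <> Regex.source(regex)

  # Fallback clause for plain binaries.
  def describe(other) when is_binary(other), do: "plain string " <> other
end

from_regex = RegexStructExample.describe(~r/foo/)
from_string = RegexStructExample.describe("foo")

IO.puts(from_regex)
IO.puts(from_string)
```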
Modifiers
The modifiers available when creating a Regex are:
unicode (u) - enables Unicode-specific patterns like \p and changes escapes like \w, \W, \s and friends to also match on Unicode. It expects valid Unicode strings to be given on match
caseless (i) - adds case insensitivity
dotall (s) - causes dot to match newlines and also sets newline to anycrlf; the newline setting can be overridden by setting (*CR) or (*LF) or (*CRLF) or (*ANY) according to the re documentation
multiline (m) - causes ^ and $ to mark the beginning and end of each line; use \A and \z to match the beginning or end of the string
extended (x) - whitespace characters are ignored except when escaped, and # delimits comments
firstline (f) - forces the unanchored pattern to match before or at the first newline, though the matched text may continue over the newline
ungreedy (r) - inverts the "greediness" of the regexp
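The effect of several modifiers can be seen by running the same pattern with and without them. This is a minimal sketch with sample strings chosen for illustration; it does not cover every modifier.

```elixir
# caseless (i): without it the match is case sensitive.
caseless_off = Regex.match?(~r/foo/, "FOO")
caseless_on = Regex.match?(~r/foo/i, "FOO")

# dotall (s): lets the dot match a newline.
dotall_off = Regex.match?(~r/a.b/, "a\nb")
dotall_on = Regex.match?(~r/a.b/s, "a\nb")

# multiline (m): ^ and $ apply to each line instead of the whole string.
multiline_off = Regex.match?(~r/^bar$/, "foo\nbar")
multiline_on = Regex.match?(~r/^bar$/m, "foo\nbar")

# extended (x): unescaped whitespace is ignored and # starts a comment.
extended_on = Regex.match?(~r/f o o # matches foo/x, "foo")

# unicode (u): enables Unicode property patterns such as \p{L}.
unicode_on = Regex.match?(~r/^\p{L}+$/u, "josé")

IO.inspect({caseless_off, caseless_on})
IO.inspect({dotall_off, dotall_on})
IO.inspect({multiline_off, multiline_on})
IO.inspect({extended_on, unicode_on})
```

Each pair prints {false, true}, and the last tuple prints {true, true}.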
The options not available are:
anchored - not available, use ^ or \A instead
dollar_endonly - not available, use \z instead
no_auto_capture - not available, use ?: instead
newline - not available, use (*CR) or (*LF) or (*CRLF) or (*ANYCRLF) or (*ANY) at the beginning of the regexp according to the re documentation
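The in-pattern replacements for the unavailable options can be sketched as follows. The sample strings are illustrative, and the newline example assumes the default newline convention is LF, as described in the re documentation.

```elixir
# Instead of anchored: \A pins the match to the start of the string.
anchored_hit = Regex.match?(~r/\Afoo/, "foobar")
anchored_miss = Regex.match?(~r/\Afoo/, "barfoo")

# Instead of dollar_endonly: $ also matches before a trailing newline,
# while \z matches only at the very end of the string.
dollar = Regex.match?(~r/foo$/, "foo\n")
end_only = Regex.match?(~r/foo\z/, "foo\n")

# Instead of no_auto_capture: (?:...) groups without capturing.
groups = Regex.run(~r/(?:foo)(bar)/, "foobar")

# Instead of newline: (*CR) at the start of the pattern makes a lone
# carriage return count as a line break for ^ and $ under multiline.
default_newline = Regex.match?(~r/^bar$/m, "foo\rbar")
cr_newline = Regex.match?(~r/(*CR)^bar$/m, "foo\rbar")

IO.inspect({anchored_hit, anchored_miss})
IO.inspect({dollar, end_only})
IO.inspect(groups)
IO.inspect({default_newline, cr_newline})
```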
Captures
Many functions in this module allow you to choose what to capture in a regex match via the :capture option. The supported values are:
:all - all captured subpatterns including the complete matching string (this is the default)
:first - only the first captured subpattern, which is always the complete matching part of the string; all explicitly captured subpatterns are discarded
:all_but_first - all but the first matching subpattern, i.e. all explicitly captured subpatterns, but not the complete matching part of the string
:none - do not return matching subpatterns at all
:all_names - captures all names in the Regex
list(binary) - a list of named captures to capture
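The :capture values can be compared by running one regex against one string with Regex.run/3. This is a minimal sketch; the pattern, the group names year and month, and the sample string are illustrative.

```elixir
regex = ~r/(?<year>\d{4})-(?<month>\d{2})/
string = "date: 2024-05"

# Default (:all): the complete match followed by each subpattern.
all = Regex.run(regex, string)

# :first keeps only the complete match.
first = Regex.run(regex, string, capture: :first)

# :all_but_first keeps only the explicitly captured subpatterns.
all_but_first = Regex.run(regex, string, capture: :all_but_first)

# :none reports a successful match without returning any subpattern.
none = Regex.run(regex, string, capture: :none)

# :all_names returns the named captures, ordered alphabetically by name.
all_names = Regex.run(regex, string, capture: :all_names)

# A list of names returns just those named captures, in the given order.
by_name = Regex.run(regex, string, capture: ["year"])

IO.inspect(all)
IO.inspect(first)
IO.inspect(all_but_first)
IO.inspect(none)
IO.inspect(all_names)
IO.inspect(by_name)
```

Because :all_names orders results by name, month comes before year in its output even though year appears first in the pattern.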