There are multiple ways you can define a character class.
Usage
character_class(x)
one_of(...)
any_of(..., type = c("greedy", "lazy", "possessive"))
some_of(..., type = c("greedy", "lazy", "possessive"))
none_of(...)
except_any_of(..., type = c("greedy", "lazy", "possessive"))
except_some_of(..., type = c("greedy", "lazy", "possessive"))
range(start, end)
`:`(start, end)
exclude_range(start, end)
Arguments
- x
text to include in the character class (must be escaped manually)
- ...
shortcuts, R variables, text, or other rex functions.
- type
the type of match to perform.
There are three match types:
greedy: match the longest string. This is the default matching type.
lazy: match the shortest string. This matches the shortest string from the same anchor point, not necessarily the shortest global string.
possessive: match and don't allow backtracking.
- start
first character of the range
- end
last character of the range
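The three match types described under type can be compared side by side. The following is a minimal sketch, not taken from the package documentation: the input string and variable names are made up for illustration, and perl = TRUE is passed on the assumption that the possessive form needs the PCRE engine.

```r
library(rex)

x <- "aaab"  # illustrative input, not from the package

# Same character class, different match types
greedy <- rex(some_of("a"))                 # default type
lazy   <- rex(some_of("a", type = "lazy"))

regmatches(x, regexpr(greedy, x, perl = TRUE))  # "aaa": longest run of "a"
regmatches(x, regexpr(lazy, x, perl = TRUE))    # "a": shortest match from the same start

# Possessive keeps everything it consumed, so a trailing "a" can never match
greedy_then_a     <- rex(some_of("a"), "a")
possessive_then_a <- rex(some_of("a", type = "possessive"), "a")

grepl(greedy_then_a, "aaa", perl = TRUE)      # TRUE: backtracking frees one "a"
grepl(possessive_then_a, "aaa", perl = TRUE)  # FALSE: no backtracking allowed
```

The last pair shows why possessive matching is mostly useful as an optimization: it only changes the result when a later part of the pattern would need characters the class has already consumed.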
Functions
character_class: explicitly define a character class.
one_of: matches one of the specified characters.
any_of: matches zero or more of the specified characters.
some_of: matches one or more of the specified characters.
none_of: matches anything but one of the specified characters.
except_any_of: matches zero or more of anything but the specified characters.
except_some_of: matches one or more of anything but the specified characters.
range: matches one of any of the characters in the range.
`:`: matches one of any of the characters in the range.
exclude_range: matches one of any of the characters except those in the range.
Examples
# grey = gray
re <- rex("gr", one_of("a", "e"), "y")
grepl(re, c("grey", "gray")) # TRUE TRUE
#> [1] TRUE TRUE
# Match non-vowels
re <- rex(none_of("a", "e", "i", "o", "u"))
# They can also be in the same string
re <- rex(none_of("aeiou"))
grepl(re, c("k", "l", "e")) # TRUE TRUE FALSE
#> [1] TRUE TRUE FALSE
# Match range
re <- rex(range("a", "e"))
grepl(re, c("b", "d", "f")) # TRUE TRUE FALSE
#> [1] TRUE TRUE FALSE
# Explicit creation
re <- rex(character_class("abcd\\["))
grepl(re, c("a", "d", "[", "]")) # TRUE TRUE TRUE FALSE
#> [1] TRUE TRUE TRUE FALSE
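The examples above do not cover the repeating and negated variants. The sketch below is an illustrative addition rather than part of the original examples: the test strings are invented, the start and end shortcuts are used to anchor the pattern, and perl = TRUE is passed explicitly as an assumption about the intended regex engine.

```r
library(rex)

# any_of accepts zero occurrences, some_of needs at least one
re_any  <- rex(start, any_of("0123456789"), end)
re_some <- rex(start, some_of("0123456789"), end)
inputs  <- c("", "42", "4a")  # illustrative inputs

grepl(re_any, inputs, perl = TRUE)   # TRUE TRUE FALSE
grepl(re_some, inputs, perl = TRUE)  # FALSE TRUE FALSE

# except_some_of: one or more characters that are NOT in the set
re_cons <- rex(except_some_of("aeiou"))
regmatches("strength", regexpr(re_cons, "strength", perl = TRUE))  # "str"

# exclude_range: the complement of range("a", "e")
re_out <- rex(exclude_range("a", "e"))
grepl(re_out, c("b", "d", "f"), perl = TRUE)  # FALSE FALSE TRUE
```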