regexp

Imports

Imports #

"regexp/syntax"
"sync"
"io"
"slices"
"strings"
"unicode"
"unicode/utf8"
"bytes"
"strconv"

Constants & Variables

anyRune var #

var anyRune = []rune{...}

anyRuneNotNL var #

var anyRuneNotNL = []rune{...}

arrayNoInts var #

arrayNoInts is returned by doExecute on a match if a nil dstCap is passed to it with ncap=0.

var arrayNoInts [0]int

bitStatePool var #

var bitStatePool sync.Pool

endOfText const #

const endOfText rune = -1

matchPool var #

Pools of *machine for use during (*Regexp).doExecute, split up by the size of the execution queues. matchPool[i] machines have queue size matchSize[i]. On a 64-bit system each queue entry is 16 bytes, so matchPool[0] has 16*2*128 = 4kB queues, etc. The final matchPool is a catch-all for very large queues.

var matchPool [len(matchSize)]sync.Pool

matchSize var #

matchSize gives the execution queue size for each pool in matchPool: matchPool[i] machines have queue size matchSize[i]. See matchPool for details.

var matchSize = [...]int{...}

maxBacktrackProg const #

const maxBacktrackProg = 500

maxBacktrackVector const #

const maxBacktrackVector = 256 * 1024

mergeFailed const #

mergeFailed is the sentinel NextIp value that mergeRuneSets returns when its input rune sets intersect. mergeRuneSets merges two non-intersecting runesets, and returns the merged result and a NextIp array. The idea is that if a rune matches the OnePassRunes at index i, NextIp[i/2] is the target. If the input sets intersect, an empty runeset and a NextIp array with the single element mergeFailed is returned. The code assumes that both inputs contain ordered and non-intersecting rune pairs.

const mergeFailed = uint32(0xffffffff)

noNext var #

var noNext = []uint32{...}

noRune var #

var noRune = []rune{...}

onePassPool var #

var onePassPool sync.Pool

specialBytes var #

Bitmap used by func special to check whether a character needs to be escaped.

var specialBytes [16]byte

startSize const #

const startSize = 10

visitedBits const #

const visitedBits = 32

Type Aliases

lazyFlag type #

A lazyFlag is a lazily-evaluated syntax.EmptyOp, for checking zero-width flags like ^ $ \A \z \B \b. It records the pair of relevant runes and does not determine the implied flags until absolutely necessary (most of the time, that means never).

type lazyFlag uint64

Interfaces

input interface #

input abstracts different representations of the input text. It provides one-character lookahead.

type input interface {
step(pos int) (r rune, width int)
canCheckPrefix() bool
hasPrefix(re *Regexp) bool
index(re *Regexp, pos int) int
context(pos int) lazyFlag
}

Structs

Regexp struct #

Regexp is the representation of a compiled regular expression. A Regexp is safe for concurrent use by multiple goroutines, except for configuration methods, such as [Regexp.Longest].

type Regexp struct {
expr string
prog *syntax.Prog
onepass *onePassProg
numSubexp int
maxBitStateLen int
subexpNames []string
prefix string
prefixBytes []byte
prefixRune rune
prefixEnd uint32
mpool int
matchcap int
prefixComplete bool
cond syntax.EmptyOp
minInputLen int
longest bool
}

bitState struct #

bitState holds state for the backtracker.

type bitState struct {
end int
cap []int
matchcap []int
jobs []job
visited []uint32
inputs inputs
}

entry struct #

An entry is an entry on a queue. It holds both the instruction pc and the actual thread. Some queue entries are just place holders so that the machine knows it has considered that pc. Such entries have t == nil.

type entry struct {
pc uint32
t *thread
}

inputBytes struct #

inputBytes scans a byte slice.

type inputBytes struct {
str []byte
}

inputReader struct #

inputReader scans a RuneReader.

type inputReader struct {
r io.RuneReader
atEOT bool
pos int
}

inputString struct #

inputString scans a string.

type inputString struct {
str string
}

inputs struct #

type inputs struct {
bytes inputBytes
string inputString
reader inputReader
}

job struct #

A job is an entry on the backtracker's job stack. It holds the instruction pc and the position in the input.

type job struct {
pc uint32
arg bool
pos int
}

machine struct #

A machine holds all the state during an NFA simulation for p.

type machine struct {
re *Regexp
p *syntax.Prog
q0 queue
q1 queue
pool []*thread
matched bool
matchcap []int
inputs inputs
}

onePassInst struct #

A onePassInst is a single instruction in a one-pass regular expression program. It is the same as syntax.Inst except for the new 'Next' field.

type onePassInst struct {
syntax.Inst
Next []uint32
}

onePassMachine struct #

type onePassMachine struct {
inputs inputs
matchcap []int
}

onePassProg struct #

A onePassProg is a compiled one-pass regular expression program. It is the same as syntax.Prog except for the use of onePassInst.

type onePassProg struct {
Inst []onePassInst
Start int
NumCap int
}

queue struct #

A queue is a 'sparse array' holding pending threads of execution. See https://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html

type queue struct {
sparse []uint32
dense []entry
}
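The sparse/dense layout above can be illustrated with a small set of integers. This is only a sketch of the general technique described in the linked article, not the package's queue: the type sparseSet and its methods are hypothetical names, and the real queue stores entry values rather than bare uint32s.

```go
package main

import "fmt"

// sparseSet is a hypothetical, simplified sparse set over values in [0, n).
// dense holds the members in insertion order; sparse[v] is the index of v
// in dense, and is only trusted after it has been cross-checked.
type sparseSet struct {
	sparse []uint32
	dense  []uint32
}

func newSparseSet(n int) *sparseSet {
	return &sparseSet{sparse: make([]uint32, n), dense: make([]uint32, 0, n)}
}

// contains cross-checks sparse against dense, so stale sparse entries
// left over from before a clear are harmless.
func (s *sparseSet) contains(v uint32) bool {
	j := s.sparse[v]
	return j < uint32(len(s.dense)) && s.dense[j] == v
}

func (s *sparseSet) add(v uint32) {
	if s.contains(v) {
		return
	}
	s.sparse[v] = uint32(len(s.dense))
	s.dense = append(s.dense, v)
}

// clear empties the set in constant time by truncating dense.
func (s *sparseSet) clear() { s.dense = s.dense[:0] }

func main() {
	s := newSparseSet(8)
	s.add(3)
	s.add(5)
	s.add(3) // duplicate, ignored
	fmt.Println(s.dense, s.contains(5), s.contains(4))
	s.clear()
	fmt.Println(s.contains(3))
}
```

The appeal of this layout for a matcher is that membership tests, insertion, and clearing are all constant time, and iteration over dense preserves insertion order.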

queueOnePass struct #

Sparse Array implementation is used as a queueOnePass.

type queueOnePass struct {
sparse []uint32
dense []uint32
size uint32
nextIndex uint32
}

thread struct #

A thread is the state of a single path through the machine: an instruction and a corresponding capture array. See https://swtch.com/~rsc/regexp/regexp2.html

type thread struct {
inst *syntax.Inst
cap []int
}

Functions

AppendText method #

AppendText implements [encoding.TextAppender]. The output matches that of calling the [Regexp.String] method. Note that the output is lossy in some cases: This method does not indicate POSIX regular expressions (i.e. those compiled by calling [CompilePOSIX]), or those for which the [Regexp.Longest] method has been called.

func (re *Regexp) AppendText(b []byte) ([]byte, error)

Compile function #

Compile parses a regular expression and returns, if successful, a [Regexp] object that can be used to match against text. When matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses the one that a backtracking search would have found first. This so-called leftmost-first matching is the same semantics that Perl, Python, and other implementations use, although this package implements it without the expense of backtracking. For POSIX leftmost-longest matching, see [CompilePOSIX].

func Compile(expr string) (*Regexp, error)

CompilePOSIX function #

CompilePOSIX is like [Compile] but restricts the regular expression to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest. That is, when matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses a match that is as long as possible. This so-called leftmost-longest matching is the same semantics that early regular expression implementations used and that POSIX specifies. However, there can be multiple leftmost-longest matches, with different submatch choices, and here this package diverges from POSIX. Among the possible leftmost-longest matches, this package chooses the one that a backtracking search would have found first, while POSIX specifies that the match be chosen to maximize the length of the first subexpression, then the second, and so on from left to right. The POSIX rule is computationally prohibitive and not even well-defined. See https://swtch.com/~rsc/regexp/regexp2.html#posix for details.

func CompilePOSIX(expr string) (*Regexp, error)
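The difference between leftmost-first and leftmost-longest matching is easiest to see on an alternation whose first branch is a prefix of the second. The following is a minimal sketch; the helper firstVsLongest is a hypothetical name introduced for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstVsLongest returns the match found for pattern in s under
// leftmost-first (Compile) and leftmost-longest (CompilePOSIX) semantics.
func firstVsLongest(pattern, s string) (first, longest string, err error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", "", err
	}
	posix, err := regexp.CompilePOSIX(pattern)
	if err != nil {
		return "", "", err
	}
	return re.FindString(s), posix.FindString(s), nil
}

func main() {
	first, longest, err := firstVsLongest(`a|ab`, "ab")
	if err != nil {
		fmt.Println("compile error:", err)
		return
	}
	// Leftmost-first stops at the first alternative that matches ("a");
	// leftmost-longest prefers the longer alternative ("ab").
	fmt.Printf("leftmost-first: %q, leftmost-longest: %q\n", first, longest)
}
```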

Copy method #

Copy returns a new [Regexp] object copied from re. Calling [Regexp.Longest] on one copy does not affect another. Deprecated: In earlier releases, when using a [Regexp] in multiple goroutines, giving each goroutine its own copy helped to avoid lock contention. As of Go 1.12, using Copy is no longer necessary to avoid lock contention. Copy may still be appropriate if the reason for its use is to make two copies with different [Regexp.Longest] settings.

func (re *Regexp) Copy() *Regexp

Expand method #

Expand appends template to dst and returns the result; during the append, Expand replaces variables in the template with corresponding matches drawn from src. The match slice should have been returned by [Regexp.FindSubmatchIndex]. In the template, a variable is denoted by a substring of the form $name or ${name}, where name is a non-empty sequence of letters, digits, and underscores. A purely numeric name like $1 refers to the submatch with the corresponding index; other names refer to capturing parentheses named with the (?P<name>...) syntax. A reference to an out of range or unmatched index or a name that is not present in the regular expression is replaced with an empty slice. In the $name form, name is taken to be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and $10 is equivalent to ${10}, not ${1}0. To insert a literal $ in the output, use $$ in the template.

func (re *Regexp) Expand(dst []byte, template []byte, src []byte, match []int) []byte

ExpandString method #

ExpandString is like [Regexp.Expand] but the template and source are strings. It appends to and returns a byte slice in order to give the calling code control over allocation.

func (re *Regexp) ExpandString(dst []byte, template string, src string, match []int) []byte
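The template rules described for Expand can be tried out with a short program. This is a sketch using ExpandString; the helper expandFirst and the sample pattern are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// expandFirst expands template against the first match of pattern in src.
// It returns "" when there is no match.
func expandFirst(pattern, template, src string) string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatchIndex(src)
	if m == nil {
		return ""
	}
	return string(re.ExpandString(nil, template, src, m))
}

func main() {
	const pattern = `(?P<key>\w+)=(?P<value>\w+)`
	const src = "color=blue"

	fmt.Printf("%q\n", expandFirst(pattern, "$value <- ${key}", src))
	// $1x is read as the (nonexistent) group named "1x", so it expands to nothing.
	fmt.Printf("%q\n", expandFirst(pattern, "$1x", src))
	// Braces delimit the name explicitly.
	fmt.Printf("%q\n", expandFirst(pattern, "${1}x", src))
	// $$ inserts a literal dollar sign.
	fmt.Printf("%q\n", expandFirst(pattern, "$$1", src))
}
```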

Find method #

Find returns a slice holding the text of the leftmost match in b of the regular expression. A return value of nil indicates no match.

func (re *Regexp) Find(b []byte) []byte

FindAll method #

FindAll is the 'All' version of [Regexp.Find]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAll(b []byte, n int) [][]byte

FindAllIndex method #

FindAllIndex is the 'All' version of [Regexp.FindIndex]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllIndex(b []byte, n int) [][]int

FindAllString method #

FindAllString is the 'All' version of [Regexp.FindString]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllString(s string, n int) []string
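The n argument shared by the 'All' methods caps the number of matches returned. A brief sketch, with findUpTo as a hypothetical wrapper:

```go
package main

import (
	"fmt"
	"regexp"
)

// findUpTo returns at most n successive matches of pattern in s;
// a negative n returns all of them.
func findUpTo(pattern, s string, n int) []string {
	return regexp.MustCompile(pattern).FindAllString(s, n)
}

func main() {
	fmt.Println(findUpTo(`a.`, "paranormal", -1))       // all matches
	fmt.Println(findUpTo(`a.`, "paranormal", 2))        // first two only
	fmt.Println(findUpTo(`z`, "paranormal", -1) == nil) // no match yields nil
}
```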

FindAllStringIndex method #

FindAllStringIndex is the 'All' version of [Regexp.FindStringIndex]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllStringIndex(s string, n int) [][]int

FindAllStringSubmatch method #

FindAllStringSubmatch is the 'All' version of [Regexp.FindStringSubmatch]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllStringSubmatch(s string, n int) [][]string

FindAllStringSubmatchIndex method #

FindAllStringSubmatchIndex is the 'All' version of [Regexp.FindStringSubmatchIndex]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllStringSubmatchIndex(s string, n int) [][]int

FindAllSubmatch method #

FindAllSubmatch is the 'All' version of [Regexp.FindSubmatch]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllSubmatch(b []byte, n int) [][][]byte

FindAllSubmatchIndex method #

FindAllSubmatchIndex is the 'All' version of [Regexp.FindSubmatchIndex]; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindAllSubmatchIndex(b []byte, n int) [][]int

FindIndex method #

FindIndex returns a two-element slice of integers defining the location of the leftmost match in b of the regular expression. The match itself is at b[loc[0]:loc[1]]. A return value of nil indicates no match.

func (re *Regexp) FindIndex(b []byte) (loc []int)

FindReaderIndex method #

FindReaderIndex returns a two-element slice of integers defining the location of the leftmost match of the regular expression in text read from the [io.RuneReader]. The match text was found in the input stream at byte offset loc[0] through loc[1]-1. A return value of nil indicates no match.

func (re *Regexp) FindReaderIndex(r io.RuneReader) (loc []int)
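Any io.RuneReader can serve as input; strings.Reader is a convenient one for experiments. The sketch below, with the hypothetical helper readerMatchOffsets, shows that the returned offsets count bytes rather than runes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// readerMatchOffsets returns the byte offsets of the leftmost match of
// pattern in text, reading text through an io.RuneReader.
func readerMatchOffsets(pattern, text string) []int {
	return regexp.MustCompile(pattern).FindReaderIndex(strings.NewReader(text))
}

func main() {
	fmt.Println(readerMatchOffsets(`b+`, "aabbbcc"))
	// "é" occupies two bytes in UTF-8, so the match starts at offset 2.
	fmt.Println(readerMatchOffsets(`b`, "éb"))
}
```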

FindReaderSubmatchIndex method #

FindReaderSubmatchIndex returns a slice holding the index pairs identifying the leftmost match of the regular expression of text read by the [io.RuneReader], and the matches, if any, of its subexpressions, as defined by the 'Submatch' and 'Index' descriptions in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindReaderSubmatchIndex(r io.RuneReader) []int

FindString method #

FindString returns a string holding the text of the leftmost match in s of the regular expression. If there is no match, the return value is an empty string, but it will also be empty if the regular expression successfully matches an empty string. Use [Regexp.FindStringIndex] or [Regexp.FindStringSubmatch] if it is necessary to distinguish these cases.

func (re *Regexp) FindString(s string) string
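Because FindString returns "" both for no match and for an empty match, a caller that cares about the difference can check the index slice instead. A minimal sketch, assuming a hypothetical helper findWithStatus:

```go
package main

import (
	"fmt"
	"regexp"
)

// findWithStatus returns the leftmost match of pattern in s and whether
// a match was found at all, using FindStringIndex to tell the cases apart.
func findWithStatus(pattern, s string) (match string, ok bool) {
	loc := regexp.MustCompile(pattern).FindStringIndex(s)
	if loc == nil {
		return "", false
	}
	return s[loc[0]:loc[1]], true
}

func main() {
	m, ok := findWithStatus(`a*`, "bbb") // matches the empty string at offset 0
	fmt.Printf("%q %v\n", m, ok)
	m, ok = findWithStatus(`a+`, "bbb") // no match at all
	fmt.Printf("%q %v\n", m, ok)
}
```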

FindStringIndex method #

FindStringIndex returns a two-element slice of integers defining the location of the leftmost match in s of the regular expression. The match itself is at s[loc[0]:loc[1]]. A return value of nil indicates no match.

func (re *Regexp) FindStringIndex(s string) (loc []int)

FindStringSubmatch method #

FindStringSubmatch returns a slice of strings holding the text of the leftmost match of the regular expression in s and the matches, if any, of its subexpressions, as defined by the 'Submatch' description in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindStringSubmatch(s string) []string

FindStringSubmatchIndex method #

FindStringSubmatchIndex returns a slice holding the index pairs identifying the leftmost match of the regular expression in s and the matches, if any, of its subexpressions, as defined by the 'Submatch' and 'Index' descriptions in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindStringSubmatchIndex(s string) []int
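In the returned slice, the pair at positions 2*i and 2*i+1 bounds submatch i, and a subexpression that did not participate in the match is reported as -1, -1. The sketch below guards against that case before slicing; groupText is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// groupText returns the text of submatch i given the index pairs in loc,
// and false if that submatch did not participate in the match.
func groupText(s string, loc []int, i int) (string, bool) {
	if 2*i+1 >= len(loc) || loc[2*i] < 0 {
		return "", false
	}
	return s[loc[2*i]:loc[2*i+1]], true
}

// submatchSpans is a thin wrapper so the example can be driven by strings.
func submatchSpans(pattern, s string) []int {
	return regexp.MustCompile(pattern).FindStringSubmatchIndex(s)
}

func main() {
	const s = "xb"
	loc := submatchSpans(`(a)|(b)`, s)
	fmt.Println(loc) // pairs for the whole match, group 1, group 2
	for i := 0; i < len(loc)/2; i++ {
		text, ok := groupText(s, loc, i)
		fmt.Printf("group %d: %q matched=%v\n", i, text, ok)
	}
}
```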

FindSubmatch method #

FindSubmatch returns a slice of slices holding the text of the leftmost match of the regular expression in b and the matches, if any, of its subexpressions, as defined by the 'Submatch' descriptions in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindSubmatch(b []byte) [][]byte

FindSubmatchIndex method #

FindSubmatchIndex returns a slice holding the index pairs identifying the leftmost match of the regular expression in b and the matches, if any, of its subexpressions, as defined by the 'Submatch' and 'Index' descriptions in the package comment. A return value of nil indicates no match.

func (re *Regexp) FindSubmatchIndex(b []byte) []int

LiteralPrefix method #

LiteralPrefix returns a literal string that must begin any match of the regular expression re. It returns the boolean true if the literal string comprises the entire regular expression.

func (re *Regexp) LiteralPrefix() (prefix string, complete bool)
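A quick way to see what LiteralPrefix reports is to compare a purely literal pattern with one that continues past its literal start. The wrapper literalPrefix below is a hypothetical convenience for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// literalPrefix compiles pattern and returns its literal prefix and
// whether that prefix is the entire expression.
func literalPrefix(pattern string) (string, bool) {
	return regexp.MustCompile(pattern).LiteralPrefix()
}

func main() {
	p, complete := literalPrefix(`foo.*`)
	fmt.Printf("%q %v\n", p, complete) // literal start, but more follows
	p, complete = literalPrefix(`foo`)
	fmt.Printf("%q %v\n", p, complete) // the whole expression is literal
}
```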

Longest method #

Longest makes future searches prefer the leftmost-longest match. That is, when matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses a match that is as long as possible. This method modifies the [Regexp] and may not be called concurrently with any other methods.

func (re *Regexp) Longest()

MarshalText method #

MarshalText implements [encoding.TextMarshaler]. The output matches that of calling the [Regexp.AppendText] method. See [Regexp.AppendText] for more information.

func (re *Regexp) MarshalText() ([]byte, error)

Match method #

Match reports whether the byte slice b contains any match of the regular expression re.

func (re *Regexp) Match(b []byte) bool

Match function #

Match reports whether the byte slice b contains any match of the regular expression pattern. More complicated queries need to use [Compile] and the full [Regexp] interface.

func Match(pattern string, b []byte) (matched bool, err error)

MatchReader method #

MatchReader reports whether the text returned by the [io.RuneReader] contains any match of the regular expression re.

func (re *Regexp) MatchReader(r io.RuneReader) bool

MatchReader function #

MatchReader reports whether the text returned by the [io.RuneReader] contains any match of the regular expression pattern. More complicated queries need to use [Compile] and the full [Regexp] interface.

func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)

MatchString function #

MatchString reports whether the string s contains any match of the regular expression pattern. More complicated queries need to use [Compile] and the full [Regexp] interface.

func MatchString(pattern string, s string) (matched bool, err error)

MatchString method #

MatchString reports whether the string s contains any match of the regular expression re.

func (re *Regexp) MatchString(s string) bool

MustCompile function #

MustCompile is like [Compile] but panics if the expression cannot be parsed. It simplifies safe initialization of global variables holding compiled regular expressions.

func MustCompile(str string) *Regexp

MustCompilePOSIX function #

MustCompilePOSIX is like [CompilePOSIX] but panics if the expression cannot be parsed. It simplifies safe initialization of global variables holding compiled regular expressions.

func MustCompilePOSIX(str string) *Regexp

NumSubexp method #

NumSubexp returns the number of parenthesized subexpressions in this [Regexp].

func (re *Regexp) NumSubexp() int

QuoteMeta function #

QuoteMeta returns a string that escapes all regular expression metacharacters inside the argument text; the returned string is a regular expression matching the literal text.

func QuoteMeta(s string) string
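QuoteMeta is useful when user-supplied text has to be matched literally inside a pattern. A small sketch, where containsLiteral is a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
)

// containsLiteral reports whether haystack contains literal, treating
// every character of literal as plain text rather than regexp syntax.
func containsLiteral(haystack, literal string) (bool, error) {
	return regexp.MatchString(regexp.QuoteMeta(literal), haystack)
}

func main() {
	fmt.Println(regexp.QuoteMeta("1.5*2")) // metacharacters are backslash-escaped

	ok, err := containsLiteral("price: 1x50", "1.50")
	fmt.Println(ok, err) // the dot no longer matches an arbitrary character
}
```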

ReplaceAll method #

ReplaceAll returns a copy of src, replacing matches of the [Regexp] with the replacement text repl. Inside repl, $ signs are interpreted as in [Regexp.Expand].

func (re *Regexp) ReplaceAll(src []byte, repl []byte) []byte

ReplaceAllFunc method #

ReplaceAllFunc returns a copy of src in which all matches of the [Regexp] have been replaced by the return value of function repl applied to the matched byte slice. The replacement returned by repl is substituted directly, without using [Regexp.Expand].

func (re *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte

ReplaceAllLiteral method #

ReplaceAllLiteral returns a copy of src, replacing matches of the [Regexp] with the replacement bytes repl. The replacement repl is substituted directly, without using [Regexp.Expand].

func (re *Regexp) ReplaceAllLiteral(src []byte, repl []byte) []byte

ReplaceAllLiteralString method #

ReplaceAllLiteralString returns a copy of src, replacing matches of the [Regexp] with the replacement string repl. The replacement repl is substituted directly, without using [Regexp.Expand].

func (re *Regexp) ReplaceAllLiteralString(src string, repl string) string

ReplaceAllString method #

ReplaceAllString returns a copy of src, replacing matches of the [Regexp] with the replacement string repl. Inside repl, $ signs are interpreted as in [Regexp.Expand].

func (re *Regexp) ReplaceAllString(src string, repl string) string
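The contrast between the expanding and literal variants shows up as soon as the replacement contains a $ sign. The sketch below runs both on the same input; replaceBoth is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceBoth applies repl to every match of pattern in src twice:
// once with $-expansion (ReplaceAllString) and once verbatim
// (ReplaceAllLiteralString).
func replaceBoth(pattern, src, repl string) (expanded, literal string) {
	re := regexp.MustCompile(pattern)
	return re.ReplaceAllString(src, repl), re.ReplaceAllLiteralString(src, repl)
}

func main() {
	expanded, literal := replaceBoth(`a(x*)b`, "-ab-axxb-", "${1}W")
	fmt.Println(expanded) // ${1} is replaced by the captured x's
	fmt.Println(literal)  // ${1} is inserted as written
}
```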

ReplaceAllStringFunc method #

ReplaceAllStringFunc returns a copy of src in which all matches of the [Regexp] have been replaced by the return value of function repl applied to the matched substring. The replacement returned by repl is substituted directly, without using [Regexp.Expand].

func (re *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string
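When the replacement depends on the matched text, a function is often simpler than a template. A minimal sketch, with upperWords as a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// upperWords upper-cases every run of lowercase ASCII letters in s,
// leaving everything else untouched.
func upperWords(s string) string {
	return regexp.MustCompile(`[a-z]+`).ReplaceAllStringFunc(s, strings.ToUpper)
}

func main() {
	fmt.Println(upperWords("go 1.22 rocks"))
}
```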

Split method #

Split slices s into substrings separated by the expression and returns a slice of the substrings between those expression matches.

The slice returned by this method consists of all the substrings of s not contained in the slice returned by [Regexp.FindAllString]. When called on an expression that contains no metacharacters, it is equivalent to [strings.SplitN].

Example:

    s := regexp.MustCompile("a*").Split("abaabaccadaaae", 5)
    // s: ["", "b", "b", "c", "cadaaae"]

The count determines the number of substrings to return:
- n > 0: at most n substrings; the last substring will be the unsplit remainder;
- n == 0: the result is nil (zero substrings);
- n < 0: all substrings.

func (re *Regexp) Split(s string, n int) []string
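The effect of the count on a more typical separator pattern can be sketched as follows; splitN is a hypothetical wrapper and the comma-separated input is made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// splitN splits s around matches of pattern, returning at most n
// substrings when n > 0, nil when n == 0, and all substrings when n < 0.
func splitN(pattern, s string, n int) []string {
	return regexp.MustCompile(pattern).Split(s, n)
}

func main() {
	const sep = `,\s*`
	fmt.Printf("%q\n", splitN(sep, "a, b,c", -1)) // every field
	fmt.Printf("%q\n", splitN(sep, "a, b,c", 2))  // last element is the unsplit remainder
	fmt.Println(splitN(sep, "a, b,c", 0) == nil)  // zero substrings
}
```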

String method #

String returns the source text used to compile the regular expression.

func (re *Regexp) String() string

SubexpIndex method #

SubexpIndex returns the index of the first subexpression with the given name, or -1 if there is no subexpression with that name. Note that multiple subexpressions can be written using the same name, as in (?P<bob>a+)(?P<bob>b+), which declares two subexpressions named "bob". In this case, SubexpIndex returns the index of the leftmost such subexpression in the regular expression.

func (re *Regexp) SubexpIndex(name string) int
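SubexpIndex pairs naturally with FindStringSubmatch to fetch a capture by name. The following is a sketch; groupByName and the date pattern are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// groupByName returns the text captured by the named group in the
// leftmost match of pattern in s. It reports false if there is no
// match or no group with that name.
func groupByName(pattern, s, name string) (string, bool) {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(s)
	i := re.SubexpIndex(name)
	if m == nil || i < 0 {
		return "", false
	}
	return m[i], true
}

func main() {
	const pattern = `(?P<year>\d{4})-(?P<month>\d{2})`
	fmt.Println(groupByName(pattern, "released 2024-05", "month"))
	fmt.Println(groupByName(pattern, "released 2024-05", "day")) // unknown name
}
```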

SubexpNames method #

SubexpNames returns the names of the parenthesized subexpressions in this [Regexp]. The name for the first sub-expression is names[1], so that if m is a match slice, the name for m[i] is SubexpNames()[i]. Since the Regexp as a whole cannot be named, names[0] is always the empty string. The slice should not be modified.

func (re *Regexp) SubexpNames() []string
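Because names[i] lines up with m[i], the two slices can be zipped into a map. A minimal sketch, with namedGroups as a hypothetical helper that skips unnamed groups and the whole-match entry at index 0:

```go
package main

import (
	"fmt"
	"regexp"
)

// namedGroups returns a map from group name to captured text for the
// leftmost match of pattern in s, or nil if there is no match.
func namedGroups(pattern, s string) map[string]string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	groups := namedGroups(`(?P<user>\w+)@(?P<host>\w+)(\.\w+)`, "mail bob@example.org now")
	fmt.Println(groups["user"], groups["host"], len(groups))
}
```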

UnmarshalText method #

UnmarshalText implements [encoding.TextUnmarshaler] by calling [Compile] on the encoded value.

func (re *Regexp) UnmarshalText(text []byte) error

add method #

add adds an entry to q for pc, unless the q already has such an entry. It also recursively adds an entry for all instructions reachable from pc by following empty-width conditions satisfied by cond. pos gives the current position in the input.

func (m *machine) add(q *queue, pc uint32, pos int, cap []int, cond *lazyFlag, t *thread) *thread

allMatches method #

allMatches calls deliver at most n times with the location of successive matches in the input text. The input text is b if non-nil, otherwise s.

func (re *Regexp) allMatches(s string, b []byte, n int, deliver func([]int))

alloc method #

alloc allocates a new thread with the given instruction. It uses the free pool if possible.

func (m *machine) alloc(i *syntax.Inst) *thread

backtrack method #

backtrack runs a backtracking search of prog on the input starting at pos.

func (re *Regexp) backtrack(ib []byte, is string, pos int, ncap int, dstCap []int) []int

canCheckPrefix method #

func (i *inputBytes) canCheckPrefix() bool

canCheckPrefix method #

func (i *inputString) canCheckPrefix() bool

canCheckPrefix method #

func (i *inputReader) canCheckPrefix() bool

cleanupOnePass function #

cleanupOnePass drops working memory, and restores certain shortcut instructions.

func cleanupOnePass(prog *onePassProg, original *syntax.Prog)

clear method #

func (i *inputs) clear()

clear method #

func (q *queueOnePass) clear()

clear method #

clear frees all threads on the thread queue.

func (m *machine) clear(q *queue)

compile function #

func compile(expr string, mode syntax.Flags, longest bool) (*Regexp, error)

compileOnePass function #

compileOnePass returns a new *onePassProg suitable for onePass execution if the original Prog can be recharacterized as a one-pass regexp program, or nil if the Prog cannot be converted. For a one-pass prog, the fundamental condition that must be true is: at any InstAlt, there must be no ambiguity about what branch to take.

func compileOnePass(prog *syntax.Prog) (p *onePassProg)

contains method #

func (q *queueOnePass) contains(u uint32) bool

context method #

func (i *inputString) context(pos int) lazyFlag

context method #

func (i *inputReader) context(pos int) lazyFlag

context method #

func (i *inputBytes) context(pos int) lazyFlag

doExecute method #

doExecute finds the leftmost match in the input, appends the position of its subexpressions to dstCap and returns dstCap. nil is returned if no matches are found and non-nil if matches are found.

func (re *Regexp) doExecute(r io.RuneReader, b []byte, s string, pos int, ncap int, dstCap []int) []int

doMatch method #

doMatch reports whether either r, b or s match the regexp.

func (re *Regexp) doMatch(r io.RuneReader, b []byte, s string) bool

doOnePass method #

doOnePass implements r.doExecute using the one-pass execution engine.

func (re *Regexp) doOnePass(ir io.RuneReader, ib []byte, is string, pos int, ncap int, dstCap []int) []int

empty method #

func (q *queueOnePass) empty() bool

expand method #

func (re *Regexp) expand(dst []byte, template string, bsrc []byte, src string, match []int) []byte

extract function #

extract returns the name from a leading "name" or "{name}" in str. (The $ has already been removed by the caller.) If it is a number, extract returns num set to that number; otherwise num = -1.

func extract(str string) (name string, num int, rest string, ok bool)

freeBitState function #

func freeBitState(b *bitState)

freeOnePassMachine function #

func freeOnePassMachine(m *onePassMachine)

get method #

get returns a machine to use for matching re. It uses the re's machine cache if possible, to avoid unnecessary allocation.

func (re *Regexp) get() *machine

hasPrefix method #

func (i *inputReader) hasPrefix(re *Regexp) bool

hasPrefix method #

func (i *inputBytes) hasPrefix(re *Regexp) bool

hasPrefix method #

func (i *inputString) hasPrefix(re *Regexp) bool

index method #

func (i *inputReader) index(re *Regexp, pos int) int

index method #

func (i *inputBytes) index(re *Regexp, pos int) int

index method #

func (i *inputString) index(re *Regexp, pos int) int

init function #

func init()

init method #

func (i *inputs) init(r io.RuneReader, b []byte, s string) (input, int)

init method #

func (m *machine) init(ncap int)

insert method #

func (q *queueOnePass) insert(u uint32)

insertNew method #

func (q *queueOnePass) insertNew(u uint32)

iop function #

func iop(i *syntax.Inst) syntax.InstOp

makeOnePass function #

makeOnePass creates a onepass Prog, if possible. It is possible if at any alt, the match engine can always tell which branch to take. The routine may modify p if it is turned into a onepass Prog. If it isn't possible for this to be a onepass Prog, the Prog nil is returned. makeOnePass is recursive to the size of the Prog.

func makeOnePass(p *onePassProg) *onePassProg

match method #

match runs the machine over the input starting at pos. It reports whether a match was found. If so, m.matchcap holds the submatch information.

func (m *machine) match(i input, pos int) bool

match method #

func (f lazyFlag) match(op syntax.EmptyOp) bool

maxBitStateLen function #

maxBitStateLen returns the maximum length of a string to search with the backtracker using prog.

func maxBitStateLen(prog *syntax.Prog) int

mergeRuneSets function #

func mergeRuneSets(leftRunes *[]rune, rightRunes *[]rune, leftPC uint32, rightPC uint32) ([]rune, []uint32)

minInputLen function #

minInputLen walks the regexp to find the minimum length of any matchable input.

func minInputLen(re *syntax.Regexp) int

newBitState function #

func newBitState() *bitState

newBytes method #

func (i *inputs) newBytes(b []byte) input

newLazyFlag function #

func newLazyFlag(r1 rune, r2 rune) lazyFlag

newOnePassMachine function #

func newOnePassMachine() *onePassMachine

newQueue function #

func newQueue(size int) (q *queueOnePass)

newReader method #

func (i *inputs) newReader(r io.RuneReader) input

newString method #

func (i *inputs) newString(s string) input

next method #

func (q *queueOnePass) next() (n uint32)

onePassCopy function #

onePassCopy creates a copy of the original Prog, as we'll be modifying it.

func onePassCopy(prog *syntax.Prog) *onePassProg

onePassNext function #

onePassNext selects the next actionable state of the prog, based on the input character. It should only be called when i.Op == InstAlt or InstAltMatch, and from the one-pass machine. One of the alternates may ultimately lead without input to end of line. If the instruction is InstAltMatch the path to the InstMatch is in i.Out, the normal node in i.Next.

func onePassNext(i *onePassInst, r rune) uint32

onePassPrefix function #

onePassPrefix returns a literal string that all matches for the regexp must start with. Complete is true if the prefix is the entire match. Pc is the index of the last rune instruction in the string. The onePassPrefix skips over the mandatory EmptyBeginText.

func onePassPrefix(p *syntax.Prog) (prefix string, complete bool, pc uint32)

pad method #

The number of capture values in the program may correspond to fewer capturing expressions than are in the regexp. For example, "(a){0}" turns into an empty program, so the maximum capture in the program is 0 but we need to return an expression for \1. Pad appends -1s to the slice a as needed.

func (re *Regexp) pad(a []int) []int

push method #

push pushes (pc, pos, arg) onto the job stack if it should be visited.

func (b *bitState) push(re *Regexp, pc uint32, pos int, arg bool)

put method #

put returns a machine to the correct machine pool.

func (re *Regexp) put(m *machine)

quote function #

func quote(s string) string

replaceAll method #

func (re *Regexp) replaceAll(bsrc []byte, src string, nmatch int, repl func(dst []byte, m []int) []byte) []byte

reset method #

reset resets the state of the backtracker. end is the end position in the input. ncap is the number of captures.

func (b *bitState) reset(prog *syntax.Prog, end int, ncap int)

shouldBacktrack function #

shouldBacktrack reports whether the program is too long for the backtracker to run.

func shouldBacktrack(prog *syntax.Prog) bool

shouldVisit method #

shouldVisit reports whether the combination of (pc, pos) has not been visited yet.

func (b *bitState) shouldVisit(pc uint32, pos int) bool
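One common way to track visited (pc, pos) pairs is a bitmap packed into 32-bit words, which is consistent with the visited []uint32 field and the visitedBits constant listed earlier. The following is a sketch of that idea only: visitedSet and newVisitedSet are hypothetical names, and the layout (one bit per pair, pc-major) is an assumption rather than a statement about the package's internals.

```go
package main

import "fmt"

const visitedBits = 32 // bits per bitmap word

// visitedSet is a hypothetical bitmap with one bit per (pc, pos) pair.
type visitedSet struct {
	end  int      // input length; positions run from 0 to end inclusive
	bits []uint32 // packed visited bits
}

// newVisitedSet sizes the bitmap for numInst instructions and an input
// of length end, rounding up to a whole number of words.
func newVisitedSet(numInst, end int) *visitedSet {
	n := (numInst*(end+1) + visitedBits - 1) / visitedBits
	return &visitedSet{end: end, bits: make([]uint32, n)}
}

// shouldVisit reports whether (pc, pos) has not been seen before,
// and marks it as seen.
func (v *visitedSet) shouldVisit(pc uint32, pos int) bool {
	n := uint(int(pc)*(v.end+1) + pos)
	mask := uint32(1) << (n % visitedBits)
	if v.bits[n/visitedBits]&mask != 0 {
		return false
	}
	v.bits[n/visitedBits] |= mask
	return true
}

func main() {
	v := newVisitedSet(4, 10)
	fmt.Println(v.shouldVisit(2, 5)) // first visit
	fmt.Println(v.shouldVisit(2, 5)) // repeat visit is rejected
	fmt.Println(v.shouldVisit(2, 6)) // a different position is still new
}
```

Rejecting repeat visits is what keeps a backtracking search bounded: each (pc, pos) pair is explored at most once, so the work is proportional to the number of instructions times the input length.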

special function #

special reports whether byte b needs to be escaped by QuoteMeta.

func special(b byte) bool

step method #

step executes one step of the machine, running each of the threads on runq and appending new threads to nextq. The step processes the rune c (which may be endOfText), which starts at position pos and ends at nextPos. nextCond gives the setting for the empty-width flags after c.

func (m *machine) step(runq *queue, nextq *queue, pos int, nextPos int, c rune, nextCond *lazyFlag)

step method #

func (i *inputString) step(pos int) (rune, int)

step method #

func (i *inputBytes) step(pos int) (rune, int)

step method #

func (i *inputReader) step(pos int) (rune, int)

tryBacktrack method #

tryBacktrack runs a backtracking search starting at pos.

func (re *Regexp) tryBacktrack(b *bitState, i input, pc uint32, pos int) bool
