Specification of path-pattern globbing rules
Path Patterns
Preliminaries
This note documents the rules used by the server for all configuration entries that use the path-pattern idiom.
Many of the server's configuration settings use a sequence of path-patterns to apply distinct rules to different requests. Each of these settings follows the same set of rules for matching a request's resource path against the configured path-patterns.
Patterns may include the special symbol `*`, which matches any sequence of characters, or the special symbol `?`, which matches any one character. All other characters in a path-pattern are used as-is when comparing paths to patterns. (There is no special globstar meaning attached to the sequence `**`.)
Here are some examples to demonstrate configured path-patterns that match request resource paths.
Configured path-pattern | Request Resource Path |
---|---|
`/.git` | /.git |
`/js/*obs.js` | /js/version-obs.js |
`/img/*-v?.*` | /img/ab-v1.gif |
`*api-v1*` | /users/api-v1/john-smith |
`*.pdf` | /docs/specification.pdf |
The pattern matching process starts with the beginning portion of the requested resource's path. In practice, most patterns should begin with either a solidus `/`, to start the matching process at the public document-root, or with the special symbol `*`, to start the match anywhere else.
The pattern matching process proceeds up to the end of the resource path. Use a trailing `*` in the path-pattern when the intention is to match all files of a given directory.
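The matching rules described so far can be sketched in a few lines of Python. This is an illustrative sketch only, not the server's implementation: the function name `matches` is hypothetical, and it assumes that `*` may also match an empty sequence, which the text above does not state explicitly. Python's `fnmatch` module is deliberately not used, because it gives `[` a special meaning that path-patterns do not have.

```python
import re

def matches(pattern: str, path: str) -> bool:
    """Return True when the whole path matches the path-pattern.

    Hypothetical helper: '*' matches any sequence of characters
    (assumed to include the empty sequence and '/'), '?' matches
    exactly one character, and everything else is compared literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any sequence of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # literal character, used as-is
    # fullmatch anchors the comparison at both the start and the end of the path.
    return re.fullmatch("".join(parts), path, re.DOTALL) is not None

# The examples from the table above:
print(matches("/js/*obs.js", "/js/version-obs.js"))    # True
print(matches("*api-v1*", "/users/api-v1/john-smith"))  # True
# Without a trailing '*', nothing beyond the pattern may follow:
print(matches("/.git", "/.git/config"))                 # False
```

Because the comparison covers the path from its first character to its last, `/.git` does not match `/.git/config`, whereas `/.git*` would.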
Path-patterns are parsed as BLUEPHRASE sourceref attributes and should therefore be enclosed in GRAVE-ACCENT delimiters.
Sequential table scan
Configuration entries that use path-pattern matching treat the collection of entries in a section as a scan table. The algorithm for determining which entry's configured values to use is this:
- Beginning with the first entry in the list, attempt to match the requested resource's path to the path-pattern.
- If there isn't a match, continue with the next entry in the list.
- If there is a match, break out of the scan and use the entry's configured values.
- When no entry in the list matches, the outcome depends on the handler, because each handler treats the no-matching-path-pattern case differently. Refer to each handler's separate note.
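The scan described in the list above amounts to a first-match-wins lookup. The following Python sketch illustrates it under stated assumptions: `first_match`, `matches`, and the sample entries are hypothetical names and values chosen for illustration, and returning `None` merely stands in for whatever a given handler does when nothing matches.

```python
import re
from typing import Optional, Sequence, Tuple

def matches(pattern: str, path: str) -> bool:
    # Hypothetical matcher: '*' is any sequence, '?' is one character,
    # all other characters are literal; the whole path must match.
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, path, re.DOTALL) is not None

def first_match(entries: Sequence[Tuple[str, str]], path: str) -> Optional[str]:
    """Scan entries in order and return the value of the first matching one."""
    for pattern, value in entries:
        if matches(pattern, path):
            return value   # break out of the scan on the first match
    return None            # no match: the handler decides what happens next

# Illustrative scan table; the values are placeholders, not real settings.
entries = [
    ("/img/*", "rule-for-images"),
    ("*.pdf", "rule-for-documents"),
]

print(first_match(entries, "/img/manual.pdf"))   # rule-for-images
print(first_match(entries, "/docs/manual.pdf"))  # rule-for-documents
print(first_match(entries, "/index.html"))       # None
```

The first call shows why entry order matters: `/img/manual.pdf` matches both patterns, but only the earlier entry is used.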
EBNF
SP | ::= | U+20 |
CR | ::= | U+0D |
ASTERISK | ::= | U+2A |
QUESTION-MARK | ::= | U+3F |
SOLIDUS | ::= | U+2F |
GRAVE-ACCENT | ::= | U+60 |
file-system-chars | ::= | (ALPHA | DIGIT | †)* |
wildcards | ::= | ASTERISK | QUESTION-MARK |
path-pattern | ::= | (SOLIDUS | file-system-chars | wildcards)* |
delimited-path-pattern | ::= | GRAVE-ACCENT path-pattern GRAVE-ACCENT |
† Legal file system characters vary by platform
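As a rough illustration of the `delimited-path-pattern` production, the sketch below strips the GRAVE-ACCENT delimiters from a configured value. The function name `undelimit` is hypothetical. Because legal file system characters vary by platform, the sketch does not validate the inner characters; rejecting a GRAVE-ACCENT inside the pattern is an assumption of this sketch, not something the grammar states.

```python
GRAVE_ACCENT = "\u0060"  # U+60, the delimiter named in the grammar

def undelimit(text: str) -> str:
    """Return the path-pattern found between GRAVE-ACCENT delimiters.

    Hypothetical helper. Raises ValueError when the delimiters are missing.
    """
    if len(text) < 2 or not (text.startswith(GRAVE_ACCENT) and text.endswith(GRAVE_ACCENT)):
        raise ValueError("path-pattern must be enclosed in GRAVE-ACCENT delimiters")
    inner = text[1:-1]
    # Assumption: a delimiter character may not appear inside the pattern.
    if GRAVE_ACCENT in inner:
        raise ValueError("unexpected GRAVE-ACCENT inside path-pattern")
    return inner

print(undelimit("`/img/*-v?.*`"))  # /img/*-v?.*
```

Since the `path-pattern` production is a repetition that permits zero characters, the sketch accepts an empty pattern between the two delimiters.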
Review
Key points to remember:
- Path-patterns match the entire request path.
- Use a `*` wildcard at the start or end of a path-pattern to create generic patterns.
- Path-patterns must always be enclosed with GRAVE-ACCENT delimiters.