Monitoring and capturing what's going on
Logging
Preliminaries
This note describes the options available for fine-grain control over the server's logging operations.
Webmasters need to know what's going on with the server in order to properly fine-tune its operation. One of the most important tools to help with this is the server's log.
Different stages of a website's operation need different information. During development, designers and authors need to know that embedded media and hyperlinked documents are being served without 404 status codes. During initial startup, webmasters need to know that access to the public is allowed, and that access to restricted content is prohibited. When the website begins to receive awareness, and usage patterns begin to be recognized, engineers need to fine tune caching and load balancing. And when the website has reached a steady state, someone needs to monitor what's going on and be alerted when bad actors are behaving maliciously.
The server's logging facility can help with all of these scenarios.
Motivation
The server sends messages to the system log with the intention that they will be read and analyzed by computers rather than people. Messages are designed to be both flexible and parsable. Anyone with basic knowledge of regular expressions and a scripting language should be able to filter the system log to answer these types of questions:
- How many requests were successfully fulfilled yesterday?
- How many megabytes of data were transferred over the network?
- What was the average compression ratio for gzip responses?
- Which IP addresses generated more than ten potentially malicious requests?
- What is the percentage breakdown of user requests by browser type?
- Are there any internal hyperlinks that no longer work?
- Did any unexpected errors occur from the website's custom pages?
SystemD
When operated under control of the SystemD service manager, messages are routed to the system log. When running in direct foreground mode, messages are sent to the console via STDOUT (this is uncommon).
Configuration
The server's logging section consists of several subsections allowing for both high-level and fine-tuned configuration:
The message-type subsection comprises seven "on" or "off" switches to enable or disable logging at a high-level.
requestcomprises entries that define which HTTP/2 request headers should be logged.stagingcomprises entries that define which of the server's work order variables should be logged.informationcomprises entries that define which server information headers should be logged.responsecomprises entries that define which HTTP/2 response headers should be logged.policycomprises entries received from feature policy and content-security policies.networkenables/disables the logging of TLS network errors.clusterenables/disables the logging of startup and shutdown of cluster worker threads.debugenables/disables the logging of internal tracing messages.
Message type
The server emits 12 types of messages:
- Request header entries provide details of each incoming HTTP/2 request.
- Staging variable entries provide details of about the internal state of the server during the request-response cycle.
- Information header entries are emitted for responses that contain any of the special server headers that begin with the letters
"rw-". - Response header entries provide details of each outgoing HTTP/2 response.
- Policy messages are sent by the browser back to the server in response to unmet policy conditions.
- Network messages are emitted for TLS network errors such as: no shared cipher, unknown protocol, http request, version too low, wrong version number, and others.
- Cluster messages are emitted when a worker thread is created or disconnected.
- Debug messages are used for tracing the path of the request/response cycle through the software.
- Config messages are emitted during server initialization to show its progress towards binding to the network interface, and to report any non-recoverable configuration mistakes.
- Error messages are emitted any time an unexpected condition occurs, due to user configuration issues or server software bugs.
- Nodejs messages are emitted any time an uncaught Node.js exception is thrown.
- Systemd messages are emitted by the operating system when the server starts or stops.
The content and format of request, staging, information, response and policy messages are configurable. The choice of which header values and variables to be included is made in the logging section.
The network, cluster and debug messages may be enabled or disabled, but the individual messages themselves are non-configurable.
The config, error, nodejs and systemd messages cannot be disabled.
Log message structure
Here is the breakdown of a standard log entry. First there is a message type, as explained above. This is followed by a request-response (RR) identifier, which is a sequential number that is shared by the request/
This is the regular expression for messages generated by the server, where the second parenthesis will capture the fields:
RR=(.*);( .*=.*;)*
When sent to the system log via the SystemD service manager, each message is stored with these additional OS-provided values: 1) timestamp, 2) hostname, 3) unit name, and 4) process identifier.
Filtering fields
Many HTTP headers are repetitious and of little value to typical website analysis. The server's configuration can specify exactly which fields to include in the logged message. Each of the four configurable subsections comprise a list of fields to include. When specified this way, only those fields are included; any other HTTP header names or work-order variables will be omitted.
Sometimes, a webmaster may not know which header fields to expect in a message. In that case it is possible to use the special field name 'default', in a configuration subsection, either by itself or immediately after the other field entries. The 'default' keyword is a catch-all fallback; it instructs the server to include any other HTTP header actually present in the logged message. This is a useful technique when setting up a web server the first time, and when learning about HTTP requests and responses. Later, it is more prudent to remove the 'default' catch-all and limit the logging to just what's needed.
Abbreviating field names
In the interest of conserving disk space, it is desirable to shorten fields which may be lengthy both because the HTTP header name itself may be long and because its associated value may be long.
Each configuration entry may include an *abbr attribute which will be substituted for the field name. Typically these abbreviations are 2, 3, or 4 capitalized letters. Be sure to choose unique values in order to retain the ability to parse messages using scripts.
When the *abbr attribute value is "none" or omitted, the full field name is used in the log message.
Here are some typical configuration entries showing the use of the *abbr attribute:
Request headers
| Request header | abbreviation |
|---|---|
| :method | *abbr=ME |
| :scheme | *abbr=SC |
| :authority | *abbr=AU |
| :path | *abbr=PA |
| accept | *abbr=AC |
| access-control-request-headers | *abbr=ACRH |
| access-control-request-method | *abbr=ACRM |
| accept-encoding | *abbr=AE |
| accept-language | *abbr=AL |
| alpn | *abbr=ALPN |
| cache-control | *abbr=CC |
| content-encoding | *abbr=CE |
| content-length | *abbr=CL |
| cookie | *abbr=CO |
| content-range | *abbr=CR |
| content-type | *abbr=CT |
| if-match | *abbr=IM |
| if-modified-since | *abbr=IMS |
| if-none-match | *abbr=INM |
| if-unmodified-since | *abbr=IUS |
| link | *abbr=LK |
| origin | *abbr=OR |
| pragma | *abbr=PR |
| referer | *abbr=RF |
| user-agent | *abbr=UA |
| upgrade-insecure-requests | *abbr=UG |
Staging variables
| Staging variables | abbreviation |
|---|---|
| cookie-map | *abbr=CK |
| form-data-map | *abbr=FD |
| proxy-pocation | *abbr=PL |
| masked-replacement-path | *abbr=MRP |
| parameter-map | *abbr=PM |
| push-candidate | *abbr=PC |
| query-string | *abbr=QS |
| remote-address | *abbr=RA |
| resource-path | *abbr=RP |
| user-agent-common-name | *abbr=UCN |
| user-agent-groups | *abbr=UAG |
Information headers
| Information headers | abbreviation |
|---|---|
| rw-bad-range | *abbr=none |
| rw-blue-cache-add | *abbr=none |
| rw-blue-cache-dependency | *abbr=none |
| rw-blue-cache-hit | *abbr=none |
| rw-blue-cache-stale | *abbr=none |
| rw-blue-dir-deleted | *abbr=none |
| rw-blue-file-deleted | *abbr=none |
| rw-blue-invalid-option | *abbr=none |
| rw-blue-map-deleted | *abbr=none |
| rw-blue-unsupported-method | *abbr=none |
| rw-conditional-range-failed | *abbr=none |
| rw-content-decoding | *abbr=none |
| rw-content-decoding | *abbr=none |
| rw-content-length-limit | *abbr=none |
| rw-content-length-missing | *abbr=none |
| rw-content-range | *abbr=none |
| rw-content-type-mismatch | *abbr=none |
| rw-content-type-missing | *abbr=none |
| rw-content-type-not-supported | *abbr=none |
| rw-encoding-cache-add | *abbr=none |
| rw-encoding-cache-add | *abbr=none |
| rw-encoding-cache-hit | *abbr=none |
| rw-encoding-dir-deleted | *abbr=none |
| rw-encoding-file-deleted | *abbr=none |
| rw-forbidden | *abbr=none |
| rw-header-not-allowed | *abbr=none |
| rw-hostname | *abbr=none |
| rw-if-match | *abbr=none |
| rw-if-none-match | *abbr=none |
| rw-if-unmodified-since | *abbr=none |
| rw-ip-access | *abbr=none |
| rw-language-not-acceptable | *abbr=none |
| rw-language-not-configured | *abbr=none |
| rw-method-not-allowed | *abbr=none |
| rw-no-acceptable-type | *abbr=none |
| rw-no-cache-control | *abbr=none |
| rw-origin-not-allowed | *abbr=none |
| rw-path | *abbr=none |
| rw-range-units-not-supported | *abbr=none |
| rw-rbac-expired | *abbr=none |
| rw-rbac-forbidden | *abbr=none |
| rw-rbac-forged | *abbr=none |
| rw-rbac-internal-error | *abbr=none |
| rw-rbac-missing-credentials | *abbr=none |
| rw-rbac-no-matching-role | *abbr=none |
| rw-rbac-no-resource-rule | *abbr=none |
| rw-rbac-remote-address | *abbr=none |
| rw-rbac-renewal | *abbr=none |
| rw-rbac-unsupported-action | *abbr=none |
| rw-rbac-unsupported-content-type | *abbr=none |
| rw-rbac-unsupported-method | *abbr=none |
| rw-rbac-unsupported-scheme | *abbr=none |
| rw-referer-nopush | *abbr=none |
| rw-request-dir-deleted | *abbr=none |
| rw-request-file-deleted | *abbr=none |
| rw-request-file-not-deleted | *abbr=none |
| rw-resource-path-illegal | *abbr=none |
| rw-resource-path-length | *abbr=none |
| rw-router-missing-module | *abbr=none |
| rw-router-not-forwarded | *abbr=none |
| rw-user-agent-nopush | *abbr=none |
| rw-user-agent-noserve | *abbr=none |
Response headers
| Response headers | abbreviation |
|---|---|
| :status | *abbr=ST |
| access-control-allow-headers | *abbr=ACAH |
| access-control-expose-headers | *abbr=ACEH |
| access-control-allow-origin | *abbr=ACAO |
| access-control-allow-credentials | *abbr=ACAC |
| access-control-allow-methods | *abbr=ACAM |
| access-control-max-age | *abbr=ACMA |
| allow | *abbr=AL |
| cache-control | *abbr=CC |
| content-encoding | *abbr=CE |
| content-length | *abbr=CL |
| content-location | *abbr=CLO |
| content-language | *abbr=CLA |
| content-type | *abbr=CT |
| last-modified | *abbr=LM |
| etag | *abbr=ET |
| location | *abbr=LO |
| set-cookie | *abbr=SC |
| server | *abbr=SE |
| vary | *abbr=VA |
Policy data
| Policy data | abbreviation |
|---|---|
| type | *abbr=TYP |
| age | *abbr=AGE |
| blocked-uri | *abbr=B-URI |
| disposition | *abbr=DISP |
| document-uri | *abbr=D-URI |
| effective-directive | *abbr=E-DIR |
| elapsed-time | *abbr=ELAP |
| line-number | *abbr=LN |
| method | *abbr=ME |
| phase | *abbr=PH |
| protocol | *abbr=PROT |
| referrer | *abbr=RF |
| sampling-fraction | *abbr=S-FRAC |
| server-ip | *abbr=S-IP |
| source-file | *abbr=S-FIL |
| status | *abbr=ST |
| status-code | *abbr=ST |
| original-policy | *abbr=O-POL |
| url | *abbr=URL |
| user-agent | *abbr=UA |
| violated-directive | *abbr=V-DIR |
Formatting fields
Some header values can be quite lengthy. These can be shortened using ellipses. Use the *format attribute to instruct the server how much to shorten the string and where to put the ellipsis. There are three meaningful definition patterns: 1) trim leading, 2) trim trailing, and 3) trim middle. Each definition consists of the number of leading characters to keep, followed by three FULL-STOPS, followed by the number of trailing characters to keep.
Given an example header value like this "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz" the text to be sent to the log message is demonstrated here:
| Definition pattern | attribute | log text |
|---|---|---|
| Trim leading | *format= | ...VvWw |
| Trim trailing | *format= | AaBbCc |
| Trim middle | *format= | AaBbCcDd... |
Be sure to enclose the attribute value in APOSTROPHES.
Configuration snippet using the *format attribute
cookie *abbr=CO *format='...10'
referer *abbr=RF *format='10...'
user-agent *abbr=UA *format='8...8'
When no format attribute is provided, the default for most field names is *format='20...20'. The exceptions to this rule are the :path and referer fields, which by default use an implicit *format='100...100'. These two special fields retain more of the original string in order to provide better results with the Analytics CLI.
Coloring fields
The foreground and background color of each header value can be set by specifying *fg and *bg attributes. Eight colors are allowed: black, white, red, green, blue, cyan, magenta, and yellow.
Configuration snippet using the *fg and *bg attributes
cookie *abbr=CO *fg=red *bg=white
referer *abbr=RF *fg=green *bg=black
user-agent *abbr=UA *fg=black *bg=cyan
Colors are normally suppressed by the journalctl utility. To enable colors, use the --output=cat switch, like this:
journalctl -u rwserve --output=cat
Orientation of log fields
Log messages are normally sent to the console as single horizontal lines. It is also possible to configure the server to create messages in a vertically stacked style. This is convenient during development and testing, to bring an extra bit of clarity to the output. Do this by adding an orientation vertical setting to the logging section.
EBNF
| SP | ::= | U+20 |
| CR | ::= | U+0D |
| ASTERISK | ::= | U+2A |
| APOSTROPHE | ::= | U+27 |
| EQUALS-SIGN | ::= | U+3D |
| QUESTION-MARK | ::= | U+3F |
| LEFT-CURLY-BRACKET | ::= | U+7B |
| RIGHT-CURLY-BRACKET | ::= | U+7D |
| orientation-entry | ::= | 'orientation' SP ('vertical' | 'horizontal') |
| http-header | ::= | ALPHA* | 'default' |
| policy-header | ::= | ALPHA* | 'default' |
| work-order-variable | ::= | 'cookieMap' | 'formDataMap' | 'proxyLocation' | 'maskedReplacement |
| ansi-color | ::= | ('white' | 'black' | 'red' | 'green' | 'blue' | 'cyan' | 'magenta' | 'yellow') |
| abbr-attr | ::= | ASTERISK 'abbr' EQUALS-SIGN ALPHA* |
| abbr-fg | ::= | ASTERISK 'fg' EQUALS-SIGN ansi-color |
| abbr-bg | ::= | ASTERISK 'bg' EQUALS-SIGN ansi-color |
| leading-char-count | ::= | DIGITS* |
| trailing-char-count | ::= | DIGITS* |
| format-spec | ::= | leading-char-count '...' trailing-char-count |
| format-attr | ::= | ASTERISK 'format' EQUALS-SIGN APOSTROPHE format-spec APOSTROPHE |
| message-type-entry | ::= | ('request' | 'staging' | 'response' | information' | 'policy' | 'network' | 'cluster' | 'debug') SP ('on' | 'off') CR |
| request-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
| staging-entry | ::= | work-order-variable SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
| information-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
| response-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
| policy-entry | ::= | policy-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
| message-type-subsection | ::= | 'message-type' SP LEFT-CURLY-BRACKET CR message-type-entry* RIGHT-CURLY-BRACKET CR |
| request-subsection | ::= | 'request' SP LEFT-CURLY-BRACKET CR request-entry* RIGHT-CURLY-BRACKET CR |
| staging-subsection | ::= | 'staging' SP LEFT-CURLY-BRACKET CR staging-entry* RIGHT-CURLY-BRACKET CR |
| information-subsection | ::= | 'information' SP LEFT-CURLY-BRACKET CR information-entry* RIGHT-CURLY-BRACKET CR |
| response-subsection | ::= | 'response' SP LEFT-CURLY-BRACKET CR response-entry* RIGHT-CURLY-BRACKET CR |
| policy-subsection | ::= | 'policy' SP LEFT-CURLY-BRACKET CR policy-entry* RIGHT-CURLY-BRACKET CR |
| logging-subsection | ::= | orientation-entry | message-type-subsection | request-subsection | staging-subsection | information-subsection | response-subsection | policy-subsection |
| logging-section | ::= | 'logging' SP LEFT-CURLY-BRACKET CR logging-subsection* RIGHT-CURLY-BRACKET CR |
Cookbook
Example 1: Logging only config and error messages
Note: With these rules in place, important 'config' and 'error' messages are still sent to the system log.
server {
logging {
message-type {
request off
staging off
information off
response off
network off
cluster off
}
}
}
Example 2: Basic request/response logging
server {
logging {
message-type {
request on
staging off
information off
response on
network off
cluster off
}
request {
:method *abbr=ME
:path *abbr=PA
cookie *abbr=CO
referer *abbr=RF
}
response {
:status *abbr=ST
content-length *abbr=CL
content-type *abbr=CT
set-cookie *abbr=SC
}
}
}
Example 3: Debugging the website
server {
logging {
message-type {
request on
staging on
information off
response on
network on
cluster off
}
request {
default *format='64...'
}
staging {
remote-address *abbr=RA
cookie-map *abbr=CM
parameter-map *abbr=PM
form-data-map *abbr=FDM
user-agent-common-name *abbr=UACN
user-agent-groups *abbr=UAG
masked-replacement-path *abbr=MRP
resource-path *abbr=RP
query-string *abbr=QS
push-candidate *abbr=PC
proxy-location *abbr=PL
default
}
response {
default *format='64...'
}
}
}
Example 4: Monitoring the website
server {
logging {
message-type {
request on
staging off
information on
response on
network off
cluster off
}
request {
:method *abbr=ME
:path *abbr=PA
cookie *abbr=CO
referer *abbr=RF
}
information {
default *format='32...32'
}
response {
:status *abbr=ST
content-length *abbr=CL
content-type *abbr=CT
set-cookie *abbr=SC
}
}
}
Example 5: Vertical logging
server {
logging {
orientation vertical
message-type {
request on
staging off
information on
response on
network off
cluster off
}
request {
default *format='32...32'
}
information {
default *format='32...32'
}
response {
default *format='32...32'
}
}
}
Review
Key points to remember:
- The default settings for logging are only a starting point; adjustments should be made to match the operator's needs.
- Request and response headers can be lengthy and repetitive; use of the 'default' entry can produce copious messages.
- Staging variables are useful for debugging during development.
- Information headers are useful during the website's steady-state to monitor for possible misuse.
