Monitoring and capturing what's going on
Logging
Preliminaries
This note describes the options available for fine-grain control over the server's logging operations.
Webmasters need to know what's going on with the server in order to properly fine-tune its operation. One of the most important tools to help with this is the server's log.
Different stages of a website's operation need different information. During development, designers and authors need to know that embedded media and hyperlinked documents are being served without 404
status codes. During initial startup, webmasters need to know that access to the public is allowed, and that access to restricted content is prohibited. When the website begins to receive awareness, and usage patterns begin to be recognized, engineers need to fine tune caching and load balancing. And when the website has reached a steady state, someone needs to monitor what's going on and be alerted when bad actors are behaving maliciously.
The server's logging facility can help with all of these scenarios.
Motivation
The server sends messages to the system log with the intention that they will be read and analyzed by computers rather than people. Messages are designed to be both flexible and parsable. Anyone with basic knowledge of regular expressions and a scripting language should be able to filter the system log to answer these types of questions:
- How many requests were successfully fulfilled yesterday?
- How many megabytes of data were transferred over the network?
- What was the average compression ratio for gzip responses?
- Which IP addresses generated more than ten potentially malicious requests?
- What is the percentage breakdown of user requests by browser type?
- Are there any internal hyperlinks that no longer work?
- Did any unexpected errors occur from the website's custom pages?
SystemD
When operated under control of the SystemD service manager, messages are routed to the system log. When running in direct foreground mode, messages are sent to the console via STDOUT (this is uncommon).
Configuration
The server's logging
section consists of several subsections allowing for both high-level and fine-tuned configuration:
The message-type
subsection comprises seven "on"
or "off"
switches to enable or disable logging at a high-level.
request
comprises entries that define which HTTP/2 request headers should be logged.staging
comprises entries that define which of the server's work order variables should be logged.information
comprises entries that define which server information headers should be logged.response
comprises entries that define which HTTP/2 response headers should be logged.policy
comprises entries received from feature policy and content-security policies.network
enables/disables the logging of TLS network errors.cluster
enables/disables the logging of startup and shutdown of cluster worker threads.debug
enables/disables the logging of internal tracing messages.
Message type
The server emits 12 types of messages:
- Request header entries provide details of each incoming HTTP/2 request.
- Staging variable entries provide details of about the internal state of the server during the request-response cycle.
- Information header entries are emitted for responses that contain any of the special server headers that begin with the letters
"rw-"
. - Response header entries provide details of each outgoing HTTP/2 response.
- Policy messages are sent by the browser back to the server in response to unmet policy conditions.
- Network messages are emitted for TLS network errors such as: no shared cipher, unknown protocol, http request, version too low, wrong version number, and others.
- Cluster messages are emitted when a worker thread is created or disconnected.
- Debug messages are used for tracing the path of the request/response cycle through the software.
- Config messages are emitted during server initialization to show its progress towards binding to the network interface, and to report any non-recoverable configuration mistakes.
- Error messages are emitted any time an unexpected condition occurs, due to user configuration issues or server software bugs.
- Nodejs messages are emitted any time an uncaught Node.js exception is thrown.
- Systemd messages are emitted by the operating system when the server starts or stops.
The content and format of request
, staging
, information
, response
and policy
messages are configurable. The choice of which header values and variables to be included is made in the logging
section.
The network
, cluster
and debug
messages may be enabled or disabled, but the individual messages themselves are non-configurable.
The config
, error
, nodejs
and systemd
messages cannot be disabled.
Log message structure
Here is the breakdown of a standard log entry. First there is a message type, as explained above. This is followed by a request-response (RR) identifier, which is a sequential number that is shared by the request/
This is the regular expression for messages generated by the server, where the second parenthesis will capture the fields:
RR=(.*);( .*=.*;)*
When sent to the system log via the SystemD service manager, each message is stored with these additional OS-provided values: 1) timestamp, 2) hostname, 3) unit name, and 4) process identifier.
Filtering fields
Many HTTP headers are repetitious and of little value to typical website analysis. The server's configuration can specify exactly which fields to include in the logged message. Each of the four configurable subsections comprise a list of fields to include. When specified this way, only those fields are included; any other HTTP header names or work-order variables will be omitted.
Sometimes, a webmaster may not know which header fields to expect in a message. In that case it is possible to use the special field name 'default'
, in a configuration subsection, either by itself or immediately after the other field entries. The 'default'
keyword is a catch-all fallback; it instructs the server to include any other HTTP header actually present in the logged message. This is a useful technique when setting up a web server the first time, and when learning about HTTP requests and responses. Later, it is more prudent to remove the 'default'
catch-all and limit the logging to just what's needed.
Abbreviating field names
In the interest of conserving disk space, it is desirable to shorten fields which may be lengthy both because the HTTP header name itself may be long and because its associated value may be long.
Each configuration entry may include an *abbr
attribute which will be substituted for the field name. Typically these abbreviations are 2, 3, or 4 capitalized letters. Be sure to choose unique values in order to retain the ability to parse messages using scripts.
When the *abbr
attribute value is "none"
or omitted, the full field name is used in the log message.
Here are some typical configuration entries showing the use of the *abbr
attribute:
Request headers
Request header | abbreviation |
---|---|
:method | *abbr=ME |
:scheme | *abbr=SC |
:authority | *abbr=AU |
:path | *abbr=PA |
accept | *abbr=AC |
access-control-request-headers | *abbr=ACRH |
access-control-request-method | *abbr=ACRM |
accept-encoding | *abbr=AE |
accept-language | *abbr=AL |
alpn | *abbr=ALPN |
cache-control | *abbr=CC |
content-encoding | *abbr=CE |
content-length | *abbr=CL |
cookie | *abbr=CO |
content-range | *abbr=CR |
content-type | *abbr=CT |
if-match | *abbr=IM |
if-modified-since | *abbr=IMS |
if-none-match | *abbr=INM |
if-unmodified-since | *abbr=IUS |
link | *abbr=LK |
origin | *abbr=OR |
pragma | *abbr=PR |
referer | *abbr=RF |
user-agent | *abbr=UA |
upgrade-insecure-requests | *abbr=UG |
Staging variables
Staging variables | abbreviation |
---|---|
cookie-map | *abbr=CK |
form-data-map | *abbr=FD |
proxy-pocation | *abbr=PL |
masked-replacement-path | *abbr=MRP |
parameter-map | *abbr=PM |
push-candidate | *abbr=PC |
query-string | *abbr=QS |
remote-address | *abbr=RA |
resource-path | *abbr=RP |
user-agent-common-name | *abbr=UCN |
user-agent-groups | *abbr=UAG |
Information headers
Information headers | abbreviation |
---|---|
rw-bad-range | *abbr=none |
rw-blue-cache-add | *abbr=none |
rw-blue-cache-dependency | *abbr=none |
rw-blue-cache-hit | *abbr=none |
rw-blue-cache-stale | *abbr=none |
rw-blue-dir-deleted | *abbr=none |
rw-blue-file-deleted | *abbr=none |
rw-blue-invalid-option | *abbr=none |
rw-blue-map-deleted | *abbr=none |
rw-blue-unsupported-method | *abbr=none |
rw-conditional-range-failed | *abbr=none |
rw-content-decoding | *abbr=none |
rw-content-decoding | *abbr=none |
rw-content-length-limit | *abbr=none |
rw-content-length-missing | *abbr=none |
rw-content-range | *abbr=none |
rw-content-type-mismatch | *abbr=none |
rw-content-type-missing | *abbr=none |
rw-content-type-not-supported | *abbr=none |
rw-encoding-cache-add | *abbr=none |
rw-encoding-cache-add | *abbr=none |
rw-encoding-cache-hit | *abbr=none |
rw-encoding-dir-deleted | *abbr=none |
rw-encoding-file-deleted | *abbr=none |
rw-forbidden | *abbr=none |
rw-header-not-allowed | *abbr=none |
rw-hostname | *abbr=none |
rw-if-match | *abbr=none |
rw-if-none-match | *abbr=none |
rw-if-unmodified-since | *abbr=none |
rw-ip-access | *abbr=none |
rw-language-not-acceptable | *abbr=none |
rw-language-not-configured | *abbr=none |
rw-method-not-allowed | *abbr=none |
rw-no-acceptable-type | *abbr=none |
rw-no-cache-control | *abbr=none |
rw-origin-not-allowed | *abbr=none |
rw-path | *abbr=none |
rw-range-units-not-supported | *abbr=none |
rw-rbac-expired | *abbr=none |
rw-rbac-forbidden | *abbr=none |
rw-rbac-forged | *abbr=none |
rw-rbac-internal-error | *abbr=none |
rw-rbac-missing-credentials | *abbr=none |
rw-rbac-no-matching-role | *abbr=none |
rw-rbac-no-resource-rule | *abbr=none |
rw-rbac-remote-address | *abbr=none |
rw-rbac-renewal | *abbr=none |
rw-rbac-unsupported-action | *abbr=none |
rw-rbac-unsupported-content-type | *abbr=none |
rw-rbac-unsupported-method | *abbr=none |
rw-rbac-unsupported-scheme | *abbr=none |
rw-referer-nopush | *abbr=none |
rw-request-dir-deleted | *abbr=none |
rw-request-file-deleted | *abbr=none |
rw-request-file-not-deleted | *abbr=none |
rw-resource-path-illegal | *abbr=none |
rw-resource-path-length | *abbr=none |
rw-router-missing-module | *abbr=none |
rw-router-not-forwarded | *abbr=none |
rw-user-agent-nopush | *abbr=none |
rw-user-agent-noserve | *abbr=none |
Response headers
Response headers | abbreviation |
---|---|
:status | *abbr=ST |
access-control-allow-headers | *abbr=ACAH |
access-control-expose-headers | *abbr=ACEH |
access-control-allow-origin | *abbr=ACAO |
access-control-allow-credentials | *abbr=ACAC |
access-control-allow-methods | *abbr=ACAM |
access-control-max-age | *abbr=ACMA |
allow | *abbr=AL |
cache-control | *abbr=CC |
content-encoding | *abbr=CE |
content-length | *abbr=CL |
content-location | *abbr=CLO |
content-language | *abbr=CLA |
content-type | *abbr=CT |
last-modified | *abbr=LM |
etag | *abbr=ET |
location | *abbr=LO |
set-cookie | *abbr=SC |
server | *abbr=SE |
vary | *abbr=VA |
Policy data
Policy data | abbreviation |
---|---|
type | *abbr=TYP |
age | *abbr=AGE |
blocked-uri | *abbr=B-URI |
disposition | *abbr=DISP |
document-uri | *abbr=D-URI |
effective-directive | *abbr=E-DIR |
elapsed-time | *abbr=ELAP |
line-number | *abbr=LN |
method | *abbr=ME |
phase | *abbr=PH |
protocol | *abbr=PROT |
referrer | *abbr=RF |
sampling-fraction | *abbr=S-FRAC |
server-ip | *abbr=S-IP |
source-file | *abbr=S-FIL |
status | *abbr=ST |
status-code | *abbr=ST |
original-policy | *abbr=O-POL |
url | *abbr=URL |
user-agent | *abbr=UA |
violated-directive | *abbr=V-DIR |
Formatting fields
Some header values can be quite lengthy. These can be shortened using ellipses. Use the *format
attribute to instruct the server how much to shorten the string and where to put the ellipsis. There are three meaningful definition patterns: 1) trim leading, 2) trim trailing, and 3) trim middle. Each definition consists of the number of leading characters to keep, followed by three FULL-STOPS, followed by the number of trailing characters to keep.
Given an example header value like this "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz
" the text to be sent to the log message is demonstrated here:
Definition pattern | attribute | log text |
---|---|---|
Trim leading | *format= | ...VvWw |
Trim trailing | *format= | AaBbCc |
Trim middle | *format= | AaBbCcDd... |
Be sure to enclose the attribute value in APOSTROPHES.
Configuration snippet using the *format
attribute
cookie *abbr=CO *format='...10'
referer *abbr=RF *format='10...'
user-agent *abbr=UA *format='8...8'
When no format attribute is provided, the default for most field names is *format='20...20'
. The exceptions to this rule are the :path
and referer
fields, which by default use an implicit *format='100...100'
. These two special fields retain more of the original string in order to provide better results with the Analytics
CLI.
Coloring fields
The foreground and background color of each header value can be set by specifying *fg
and *bg
attributes. Eight colors are allowed: black, white, red, green, blue, cyan, magenta, and yellow.
Configuration snippet using the *fg
and *bg
attributes
cookie *abbr=CO *fg=red *bg=white
referer *abbr=RF *fg=green *bg=black
user-agent *abbr=UA *fg=black *bg=cyan
Colors are normally suppressed by the journalctl
utility. To enable colors, use the --output=cat
switch, like this:
journalctl -u rwserve --output=cat
Orientation of log fields
Log messages are normally sent to the console as single horizontal lines. It is also possible to configure the server to create messages in a vertically stacked style. This is convenient during development and testing, to bring an extra bit of clarity to the output. Do this by adding an orientation vertical
setting to the logging
section.
EBNF
SP | ::= | U+20 |
CR | ::= | U+0D |
ASTERISK | ::= | U+2A |
APOSTROPHE | ::= | U+27 |
EQUALS-SIGN | ::= | U+3D |
QUESTION-MARK | ::= | U+3F |
LEFT-CURLY-BRACKET | ::= | U+7B |
RIGHT-CURLY-BRACKET | ::= | U+7D |
orientation-entry | ::= | 'orientation' SP ('vertical' | 'horizontal') |
http-header | ::= | ALPHA* | 'default' |
policy-header | ::= | ALPHA* | 'default' |
work-order-variable | ::= | 'cookieMap' | 'formDataMap' | 'proxyLocation' | 'maskedReplacement |
ansi-color | ::= | ('white' | 'black' | 'red' | 'green' | 'blue' | 'cyan' | 'magenta' | 'yellow') |
abbr-attr | ::= | ASTERISK 'abbr' EQUALS-SIGN ALPHA* |
abbr-fg | ::= | ASTERISK 'fg' EQUALS-SIGN ansi-color |
abbr-bg | ::= | ASTERISK 'bg' EQUALS-SIGN ansi-color |
leading-char-count | ::= | DIGITS* |
trailing-char-count | ::= | DIGITS* |
format-spec | ::= | leading-char-count '...' trailing-char-count |
format-attr | ::= | ASTERISK 'format' EQUALS-SIGN APOSTROPHE format-spec APOSTROPHE |
message-type-entry | ::= | ('request' | 'staging' | 'response' | information' | 'policy' | 'network' | 'cluster' | 'debug') SP ('on' | 'off') CR |
request-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
staging-entry | ::= | work-order-variable SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
information-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
response-entry | ::= | http-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
policy-entry | ::= | policy-header SP abbr-attribute* SP format-attr* SP fg-attr* SP bg-attr* CR |
message-type-subsection | ::= | 'message-type' SP LEFT-CURLY-BRACKET CR message-type-entry* RIGHT-CURLY-BRACKET CR |
request-subsection | ::= | 'request' SP LEFT-CURLY-BRACKET CR request-entry* RIGHT-CURLY-BRACKET CR |
staging-subsection | ::= | 'staging' SP LEFT-CURLY-BRACKET CR staging-entry* RIGHT-CURLY-BRACKET CR |
information-subsection | ::= | 'information' SP LEFT-CURLY-BRACKET CR information-entry* RIGHT-CURLY-BRACKET CR |
response-subsection | ::= | 'response' SP LEFT-CURLY-BRACKET CR response-entry* RIGHT-CURLY-BRACKET CR |
policy-subsection | ::= | 'policy' SP LEFT-CURLY-BRACKET CR policy-entry* RIGHT-CURLY-BRACKET CR |
logging-subsection | ::= | orientation-entry | message-type-subsection | request-subsection | staging-subsection | information-subsection | response-subsection | policy-subsection |
logging-section | ::= | 'logging' SP LEFT-CURLY-BRACKET CR logging-subsection* RIGHT-CURLY-BRACKET CR |
Cookbook
Example 1: Logging only config and error messages
Note: With these rules in place, important 'config' and 'error' messages are still sent to the system log.
server {
logging {
message-type {
request off
staging off
information off
response off
network off
cluster off
}
}
}
Example 2: Basic request/response logging
server {
logging {
message-type {
request on
staging off
information off
response on
network off
cluster off
}
request {
:method *abbr=ME
:path *abbr=PA
cookie *abbr=CO
referer *abbr=RF
}
response {
:status *abbr=ST
content-length *abbr=CL
content-type *abbr=CT
set-cookie *abbr=SC
}
}
}
Example 3: Debugging the website
server {
logging {
message-type {
request on
staging on
information off
response on
network on
cluster off
}
request {
default *format='64...'
}
staging {
remote-address *abbr=RA
cookie-map *abbr=CM
parameter-map *abbr=PM
form-data-map *abbr=FDM
user-agent-common-name *abbr=UACN
user-agent-groups *abbr=UAG
masked-replacement-path *abbr=MRP
resource-path *abbr=RP
query-string *abbr=QS
push-candidate *abbr=PC
proxy-location *abbr=PL
default
}
response {
default *format='64...'
}
}
}
Example 4: Monitoring the website
server {
logging {
message-type {
request on
staging off
information on
response on
network off
cluster off
}
request {
:method *abbr=ME
:path *abbr=PA
cookie *abbr=CO
referer *abbr=RF
}
information {
default *format='32...32'
}
response {
:status *abbr=ST
content-length *abbr=CL
content-type *abbr=CT
set-cookie *abbr=SC
}
}
}
Example 5: Vertical logging
server {
logging {
orientation vertical
message-type {
request on
staging off
information on
response on
network off
cluster off
}
request {
default *format='32...32'
}
information {
default *format='32...32'
}
response {
default *format='32...32'
}
}
}
Review
Key points to remember:
- The default settings for logging are only a starting point; adjustments should be made to match the operator's needs.
- Request and response headers can be lengthy and repetitive; use of the 'default' entry can produce copious messages.
- Staging variables are useful for debugging during development.
- Information headers are useful during the website's steady-state to monitor for possible misuse.