Parallel processes to handle simultaneous requests
Cluster
Preliminaries
This note describes the server's ability to handle concurrent requests through a single address and port.
Configuration
The RWSERVE software handles multiple simultaneous requests using its built-in cluster capability. The number of concurrent processes to load into memory is declared in the cluster-size entry. The maximum number of concurrent processes is 64.
Round-robin scheduling
When a browser issues an HTTP/2 request to any of the hostnames declared in the configuration file, the incoming connection request is assigned to one of the cluster processes, and that process handles the entire request/response cycle. When subsequent requests are made via the same browser within a short period of time, the connection remains open and the same process handles them.
When another browser issues a request, its incoming connection is assigned to a different cluster process, using a simple round-robin approach. For example, with a cluster size of four, a series of incoming requests coming from different browsers would be assigned to cluster process 0, 1, 2, 3, 0, 1, 2, 3, ...
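The assignment order described above can be sketched in a few lines of Python. This is only an illustration of round-robin selection, not RWSERVE's actual implementation; the function name round_robin is hypothetical.

```python
from itertools import count

def round_robin(cluster_size):
    """Yield process indexes 0 .. cluster_size - 1 repeatedly, in order."""
    for n in count():
        yield n % cluster_size

# Eight incoming connections from different browsers, cluster size of four.
scheduler = round_robin(4)
assignments = [next(scheduler) for _ in range(8)]
print(assignments)  # [0, 1, 2, 3, 0, 1, 2, 3]
```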
Be aware that this simple approach to scheduling is indifferent to the types of requests being made, their potential processing needs, and their possible payload sizes. Because of this, some processes may deliver good throughput while others, whose requests are queued behind slow ones, deliver poor throughput.
For example, suppose a cluster of four processes, with all process queues initially empty, receives a series of eight consecutive requests arriving 50ms apart with this processing profile: [250ms, 250ms, 2000ms, 250ms, 250ms, 250ms, 250ms, 250ms]. The apparent response time as experienced by each browser would then be [250ms, 250ms, 2000ms, 250ms, 300ms, 300ms, 2050ms, 300ms]. The first four requests are served without delay, but the next four are each subjected to process queueing that increases their overall response time. The seventh request has the worst response time, because it waits behind the 2000ms request on the same process, even though its actual processing time is the same as its neighbors'.
process | start offset (ms) | queue time | end offset (ms) | response time |
---|---|---|---|---|
#0 | 0 | 0 ms | 250 | 250 ms |
#1 | 50 | 0 ms | 300 | 250 ms |
#2 | 100 | 0 ms | 2100 | 2000 ms |
#3 | 150 | 0 ms | 400 | 250 ms |
#0 | 200 | 50 ms | 500 | 300 ms |
#1 | 250 | 50 ms | 550 | 300 ms |
#2 | 300 | 1800 ms | 2350 | 2050 ms |
#3 | 350 | 50 ms | 650 | 300 ms |
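The figures in this table can be reproduced with a small simulation. The sketch below assumes each process serves its own queue strictly in arrival order and that connections are assigned round-robin; the function name simulate and its parameters are illustrative and are not part of RWSERVE.

```python
def simulate(processing_ms, cluster_size, arrival_gap_ms):
    """Return (process, start offset, queue time, end offset, response time) per request."""
    free_at = [0] * cluster_size          # time at which each process becomes idle
    rows = []
    for i, work in enumerate(processing_ms):
        proc = i % cluster_size           # round-robin assignment
        arrival = i * arrival_gap_ms      # the table's "start offset"
        begin = max(arrival, free_at[proc])
        end = begin + work
        free_at[proc] = end
        rows.append((proc, arrival, begin - arrival, end, end - arrival))
    return rows

profile = [250, 250, 2000, 250, 250, 250, 250, 250]
rows = simulate(profile, cluster_size=4, arrival_gap_ms=50)
for proc, arrival, queued, end, response in rows:
    print(f"#{proc} | {arrival} | {queued} ms | {end} | {response} ms |")
```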
The solution to this problem is to increase the number of processes in the cluster in order to reduce, or entirely remove, queueing time.
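To see the effect of a larger cluster on the example workload, the sketch below totals the queueing time for cluster sizes four through eight. It assumes round-robin assignment and in-order service per process; total_queue_ms is a hypothetical helper, not an RWSERVE function.

```python
def total_queue_ms(processing_ms, cluster_size, arrival_gap_ms=50):
    """Sum the time that requests spend waiting for their assigned process."""
    free_at = [0] * cluster_size          # time at which each process becomes idle
    total = 0
    for i, work in enumerate(processing_ms):
        proc = i % cluster_size           # round-robin assignment
        arrival = i * arrival_gap_ms
        begin = max(arrival, free_at[proc])
        total += begin - arrival
        free_at[proc] = begin + work
    return total

profile = [250, 250, 2000, 250, 250, 250, 250, 250]
queue_by_size = {size: total_queue_ms(profile, size) for size in range(4, 9)}
print(queue_by_size)  # {4: 1950, 5: 1750, 6: 0, 7: 0, 8: 0}
```

For this particular workload, a fifth process helps only a little, because the eighth request then lands behind the 2000ms request. A sixth process removes the queueing entirely.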
Cluster size considerations
Determining the optimal size for the server's cluster should be done through empirical testing against actual loads.
All things being equal, if you have a choice between provisioning a server with more CPU cores or more memory, choose the server with more CPUs.
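As a starting point for such testing, one option is to begin with one process per CPU and adjust from there. This is an assumption suggested by the cookbook examples, not a documented rule; the only documented constraint is the range 1 to 64. The helper name starting_cluster_size is hypothetical.

```python
import os

MAX_CLUSTER_SIZE = 64  # documented upper limit for cluster-size

def starting_cluster_size(cpu_count=None):
    """Suggest an initial cluster-size: one process per CPU, clamped to 1..64."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1   # os.cpu_count() may return None
    return max(1, min(cpu_count, MAX_CLUSTER_SIZE))

print(f"cluster-size {starting_cluster_size()}")
```

Treat the printed value only as a first guess, to be refined by measuring against actual loads.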
EBNF
SP | ::= | U+20 |
CR | ::= | U+0D |
SOLIDUS | ::= | U+2F |
ASTERISK | ::= | U+2A |
FULL-STOP | ::= | U+2E |
GRAVE-ACCENT | ::= | U+60 |
LEFT-CURLY-BRACKET | ::= | U+7B |
RIGHT-CURLY-BRACKET | ::= | U+7D |
number-of-processes | ::= | [1..64] |
cluster-size-entry | ::= | 'cluster-size' SP number-of-processes CR |
server-section | ::= | 'server' SP LEFT-CURLY-BRACKET CR cluster-size-entry RIGHT-CURLY-BRACKET CR |
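The range constraint on number-of-processes can be checked before a configuration is deployed. The sketch below is a standalone validator for a single cluster-size line, assuming the keyword and the number are separated by one or more spaces as in the cookbook examples. It is not RWSERVE's own parser, and parse_cluster_size is a hypothetical name.

```python
import re

# Keyword, one or more spaces, then a decimal number.
CLUSTER_SIZE_RE = re.compile(r"cluster-size +([0-9]+)")

def parse_cluster_size(line):
    """Return the declared number of processes, or raise ValueError."""
    match = CLUSTER_SIZE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a cluster-size entry: {line!r}")
    size = int(match.group(1))
    if not 1 <= size <= 64:
        raise ValueError(f"cluster-size must be between 1 and 64, got {size}")
    return size

print(parse_cluster_size("cluster-size 3"))  # 3
```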
Cookbook
Example 1: three CPUs, 1 GB RAM
server {
ip-address 10.20.30.40
port 443
cluster-size 3
}
Example 2: two CPUs, 2 GB RAM
server {
ip-address 10.20.30.40
port 443
cluster-size 2
}
Example 3: one CPU, 3 GB RAM
server {
ip-address 10.20.30.40
port 443
cluster-size 1
}
Review
Key points to remember:
- Servers with larger cluster sizes exhibit fewer queue-related delays.
- Servers with more CPUs are able to handle larger cluster sizes.