libwebsockets/READMEs/README.lws_retry.md
Andy Green f9f6bb66fe lws_validity: unified connection validity tracking
Refactor everything around ping / pong handling in ws and h2, so there
is instead a protocol-independent validity lws_sul tracking how long it
has been since the last exchange that confirms the operation of the
network connection in both directions.

Clean out periodic role callback and replace the last two role users
with discrete lws_sul for each pt.
2019-09-22 09:35:07 -07:00

99 lines
3.8 KiB
Markdown

# `lws_retry_bo_t` client connection management
This struct sets the policy for delays between retries, and for
how long a connection may be 'idle' before it first tries to
ping / pong on it to confirm it's up, or drops the connection
if still idle.
## Retry rate limiting
You can define a table of ms-resolution delays indexed by which
connection attempt number is ongoing, this is pointed to by
`.retry_ms_table` with `.retry_ms_table_count` containing the
count of table entries.
`.conceal_count` is the number of retries that should be allowed
before informing the parent that the connection has failed. If it's
greater than the number of entries in the table, the last entry is
reused for the additional attempts.
`.jitter_percent` controls how much additional random delay is
added to the actual interval to be used... this stops a lot of
devices all synchronizing when they try to connect after a single
trigger event and DDoS-ing the server.
The struct and apis are provided for user implementations, lws does
not offer reconnection itself.
## Connection validity management
Lws has a sophisticated idea of connection validity and the need to
reconfirm that a connection is still operable if proof of validity
has not been seen for some time. It concerns itself only with network
connections rather than streams, for example, it only checks h2
network connections rather than the individual streams inside (which
is consistent with h2 PING frames only working at the network stream
level itself).
Connections may fail in a variety of ways, these include that no traffic
at all is passing, or, eg, incoming traffic may be received but no
outbound traffic is making it to the network, and vice versa. In the
case that tx is not failing at any point but just isn't getting sent,
endpoints can potentially kid themselves that since "they are sending"
and they are seeing RX, the combination means the connection is valid.
This can potentially continue for a long time if the peer is not
performing keepalives.
"Connection validity" is proven when one side sends something and later
receives a response that can only have been generated by the peer
receiving what was just sent. This can happen for some kinds of user
transactions on any stream using the connection, or by sending PING /
PONG protocol packets where the PONG is only returned for a received PING.
To ensure that the generated traffic is only sent when necessary, user
code can report for any stream that it has observed a transaction amounting
to a proof of connection validity using an api. This resets the timer for
the associated network connection before the validity is considered
expired.
`.secs_since_valid_ping` in the retry struct sets the number of seconds since
the last validity after which lws will issue a protocol-specific PING of some
kind on the connection. `.secs_since_valid_hangup` specifies how long lws
will allow the connection to go without a confirmation of validity before
simply hanging up on it.
## Defaults
The context defaults to having a 5m valid ping interval and 5m10s hangup interval,
ie, it'll send a ping at 5m idle if the protocol supports it, and if no response
validating the connection arrives in another 10s, hang up the connection.
User code can set this in the context creation info and can individually set the
retry policy per vhost for server connections. Client connections can set it
per connection in the client creation info `.retry_and_idle_policy`.
## Checking for h2 and ws
Check using paired minimal examples with the -v flag on one or both sides to get a
small validity check period set of 3s / 10s
Also give, eg, -d1039 to see info level debug logging
### h2
```
$ lws-minimal-http-server-h2-long-poll -v
$ lws-minimal-http-client -l -v
```
### ws
```
$ lws-minimal-ws-server-h2 -s -v
$ lws-minimal-ws-client-ping -n --server 127.0.0.1 --port 7681 -v
```