Watchdog Integration
Starting with RPort version 0.8.3 you can enable a watchdog integration on the rport client.
The rport client works stable and reliable and broken connections are always resumed. There is no need to enable watchdog integration by default. It should only be activated in exceptional cases when you notice the rport client process is running, but no connection attempts are made.
On the rport.conf
go the the [connection]
section and set watchdog_integration = true
.
Also make sure you have max_retry_count
disabled and keep_alive
must be enabled.
With the watchdog integration enabled, the rport client will create and constantly update a file state.json
in the
data directory. The file looks like the following example:
{
"last_update": "2022-08-15T10:37:25.643839+02:00",
"last_update_ts": 1660552645,
"last_state": "connected",
"last_message": "ping to rport server 81.151.79.171:8080 succeeded within 14.682823ms"
}
The first two fields indicate when the file has been updated for the last time. These are the most important field, and your supervisor logic must be based one of the update times. As long as the last_update constantly increases, the rport client is working correctly.
The field last_state
can have one of the following values:
initialized
- the watchdog integration was started but no rport client state is yet known.
connected
- the rport client is connected to the server.
reconnecting
- the rport client is not connected but trying to reconnect.
The field last_message
can have one of the following values:
ping to {server} succeeded within {latency}
- a ping succeeded
Retrying in {time}...
- a connection failed
connected to {server} within {latency}
- a connection succeeded
The state.json
file gets updated on the following events:
- connection succeeded
- connection retry, according to setting
max_retry_interval
- ping to the server, according to setting
keep_alive
While your server is connected, the file gets updates every time a ping to the server is sent using the keep_alive
interval. With the client being disconnected, the file is updated on each reconnect attempt which can have a maximum
interval set by max_retry_interval
.
Your watchdog implementation should restart the rport client when is “hangs”. Either the last update is older than the current time minus the keep alive interval or older than the max retry interval.
The Windows service does not have a built-in watchdog. So you must implement a schedules task that checks the state.json file regularly.
$stateFile = 'C:\Program Files\rport\data\state.json'
$threshHoldSec = 600
$now = ((Get-Date -UFormat %s) - [int](get-date -Uformat %Z)*3600)
$lastUpdate = (Get-Content $stateFile| ConvertFrom-Json).last_update_ts
$diff = $now - $lastUpdate
if ($diff -gt $threshHoldSec){
Write-Output "RPort hangs. No activity deteced for $($diff) seconds."
Restart-Service rport
} else {
Write-Output "RPort is running fine. Last activity $($diff) seconds ago."
}
🐶 A fully-featured rport watchdog for Windows implemented in PowerShell can be downloaded here. It comes ready-to-use with an easy installation.
On Linux systemd comes with a built-in watchdog. Systemd does not use the state.json file. The file is written anyway, so you can observe what’s happening behind the scenes.
To enable the systemd watchdog you must enter a line WatchdogSec=N
to the service file
/etc/systemd/system/rport.service
. This causes systemd to create a unix socket where watchdog updates can be pushed to.
The rport client will automatically recognize the presence of the socket and updates will be pushed on the events listed
above.
If you enable debug logging, you will get a confirmation like Using NOTIFY_SOCKET /run/systemd/notify for systemd watchdog integration
in the log file.
If systemd does not detect any updates on the socket within the WatchdogSec
period, it will restart the rport client.
This means you must set WatchdogSec
a bit longer than max_retry_interval
and keep_alive
. See above.
Setting keep_alive
and max_retry_interval
both the 3 minutes and WatchdocSec
to 200 are good values.
The shorter you set keep_alive
and max_retry_interval
the faster the watchdog can act. But the more bandwidth the
rport client consumes in idle mode.
Full systemd service example:
[Unit]
Description=Create reverse tunnels with ease.
ConditionFileIsExecutable=/usr/local/bin/rport
[Service]
ExecStart=/usr/local/bin/rport "-c" "/etc/rport/rport.conf"
LimitNOFILE=1048576
User=rport
Restart=always
RestartSec=120
WatchdogSec=200
[Install]
WantedBy=multi-user.target
[Unit]
StartLimitIntervalSec=5
StartLimitBurst=10