Skip to content

Make echo interval and timeout configurable#87

Open
juerghegglin wants to merge 1 commit intoFAForever:masterfrom
juerghegglin:feat/configurable-echo-timeout
Open

Make echo interval and timeout configurable#87
juerghegglin wants to merge 1 commit intoFAForever:masterfrom
juerghegglin:feat/configurable-echo-timeout

Conversation

@juerghegglin
Copy link
Copy Markdown

Problem

PeerConnectivityCheckerModule is the only thing standing between "this peer is healthy" and "this peer is dead." Both knobs are hardcoded:

private static final int ECHO_INTERVAL = 1000;        // checkerThread, line 22
...
if (System.currentTimeMillis() - lastPacketReceived > 10000) {   // line 136

The 10-second silence threshold puts a hard floor on the user-visible deadlock window when a peer drops mid-game: SC's lockstep simulation stalls until the adapter declares the peer disconnected, and that takes 10 seconds regardless of how fast the actual drop was. Tournament players (and anyone debugging chronic mesh drops) have no way to tighten this.

Real-world evidence

A 16-player game session: one peer experienced what looked like a JVM/CPU pause on their machine. From the local adapter's perspective the peer simply went silent. The game froze for every other player while the lockstep waited for input that wasn't coming, until the 10-second echo timeout fired and the simulation could move on with a vote-to-continue. The full freeze duration is essentially ECHO_TIMEOUT_MS + (FA's vote-grace period).

A separate teammate's log shows a flapping peer with three echo timeouts inside 26 seconds, each followed by a successful re-establishment. With the same data and a 2-second timeout, those would have been 2-second blips instead of 10-second freezes.

Change

Two new CLI options, defaults preserve current behavior:

Option Default Effect
--echo-interval-ms 1000 Cadence at which the connectivity checker sends echo packets to peers.
--echo-timeout-ms 10000 Silence threshold before declaring a peer disconnected.

Files touched:

  • IceOptions.java — adds the two @Option fields.
  • IceAdapter.java — adds two static accessors (getEchoIntervalMs(), getEchoTimeoutMs()) following the existing pattern (getPingCount, getAcceptableLatency).
  • PeerConnectivityCheckerModule.java — replaces the two hardcoded values with calls to the new accessors. The log message at line 138 becomes "for the past {}ms from {}" (was "for the past 10 seconds from {}") so it reflects the configured value, but the prefix Didn't receive any answer to echo requests stays identical so existing log-grep tooling continues to work.

Use cases

  • Tournament play / fast-recovery: --echo-timeout-ms 2000 --echo-interval-ms 500. Detects peer loss in 2 seconds; loses 4 echoes before declaring dead (still robust to occasional packet loss). Combined with the FAF client's existing reconnect logic, this makes brief network blips much less disruptive.
  • Slow / high-latency links: leave defaults, or even widen --echo-timeout-ms 15000 if needed.
  • Debugging chronic disconnects: tighten timeouts during repro to make incidents fail faster and cluster more visibly in logs.

Risk

  • Defaults are unchanged, so any caller (FAF client, manual launchers) that doesn't pass the new options sees identical behavior.
  • The log message changes one numeric value and the unit suffix; the grep prefix is preserved.
  • No new threads, locks, or state. The two values are read on each iteration of the checker thread, so a hot config swap would even be possible — though that's not exposed here, since the values are CLI-set once at startup.

Verification

  • Local :ice-adapter:check (compile + spotlessCheck) passes after spotlessApply.
  • Manual smoke test by passing --echo-timeout-ms 2000 and observing the connectivity check now fires after 2 s (verified with the existing repro pattern — silent UDP server).

Co-authored with Claude (Anthropic); reviewed and locally verified by me.

PeerConnectivityCheckerModule has been the only knob between "this peer
is healthy" and "this peer is dead." The echo cadence (1000 ms) and
silence threshold (10000 ms) were hardcoded constants. The 10-second
threshold sets the floor on the user-visible deadlock window when a
peer drops mid-game: lockstep stalls until the adapter declares the
peer disconnected, and that takes ten seconds no matter what.

Real-world data: in a 16-player game session, a single peer-side issue
froze every other player's simulation for the full 10 seconds before
the game's vote-to-continue could fire. Tournament players in
particular have asked for tighter detection.

This change exposes two CLI options:

  --echo-interval-ms  (default 1000)
  --echo-timeout-ms   (default 10000)

Defaults preserve current behavior. Power users can pass for example
'--echo-timeout-ms 2000' to detect peer loss in 2 seconds instead of
10. The log message "for the past N seconds" becomes "for the past
{}ms" so it reflects the configured value.

The diagnostic-grep pattern "Didn't receive any answer to echo
requests" is unchanged, so log-analysis tooling is not affected.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

Warning

Rate limit exceeded

@juerghegglin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 33 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ed9adb41-6ea5-4883-b373-5781958779b2

📥 Commits

Reviewing files that changed from the base of the PR and between d49f9fe and 7eeefb1.

📒 Files selected for processing (3)
  • ice-adapter/src/main/java/com/faforever/iceadapter/IceAdapter.java
  • ice-adapter/src/main/java/com/faforever/iceadapter/IceOptions.java
  • ice-adapter/src/main/java/com/faforever/iceadapter/ice/PeerConnectivityCheckerModule.java
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@juerghegglin
Copy link
Copy Markdown
Author

https://github.com/coderabbitai review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant