feat: use individual module connections#2049
Conversation
Greptile SummaryThis PR refactors Go2 and G1 robot connections from a single monolithic class with internal dispatch into separate per-backend
Confidence Score: 3/5The backend-swap machinery is sound but several unresolved defects in connection modules and CLI routing make this risky to merge as-is. The
Important Files Changed
Reviews (6): Last reviewed commit: "feat: use individual module connections" | Re-trigger Greptile |
| for atom in self.blueprints: | ||
| tag = getattr(atom.module, "_connection_tag", None) | ||
| if tag is None or tag.backend == backend: |
There was a problem hiding this comment.
The
with_backend method uses getattr(atom.module, "_connection_tag", None) which walks the MRO and picks up inherited tags. Go2FleetConnection inherits _connection_tag = ConnectionTag(robot="go2", backend="webrtc") from Go2WebRtcConnection without being directly decorated. This means a blueprint containing Go2FleetConnection would have with_backend("mujoco") silently swap it to a bare Go2MujocoConnection, discarding all fleet-coordination logic.
| for atom in self.blueprints: | |
| tag = getattr(atom.module, "_connection_tag", None) | |
| if tag is None or tag.backend == backend: | |
| for atom in self.blueprints: | |
| tag = vars(atom.module).get("_connection_tag", None) | |
| if tag is None or tag.backend == backend: |
| def _publish_camera_info(self) -> None: | ||
| while True: | ||
| self.camera_info.publish(self.camera_info_static) | ||
| time.sleep(1.0) |
There was a problem hiding this comment.
Camera-info thread never terminates, causing
stop() to always time-out
_publish_camera_info loops forever with no exit condition. The stop() method joins this thread with timeout=DEFAULT_THREAD_JOIN_TIMEOUT, so every stop() call blocks for the full timeout before continuing teardown. The same pattern is present in Go2MujocoConnection and Go2ReplayConnection. Compare with G1MujocoConnection._publish_camera_info_loop, which checks _stop_event.is_set() and exits cleanly. Adding a similar stop-event check here would make teardown prompt.
|
|
||
| self.thread = Thread(target=start_background_loop, daemon=True) | ||
| self.thread.start() | ||
| self.connection_ready.wait() |
There was a problem hiding this comment.
Unbounded block on connection handshake
self.connection_ready.wait() is called with no timeout. If the fleet-member robot at the given IP is unreachable or the WebRTC handshake never completes, constructing Go2FleetConnection (which builds one _FleetMemberClient per extra robot) will block the calling thread indefinitely with no error, no warning, and no recovery path.
The same pattern exists in Go2WebRtcConnection.start() (line 167 of connection_webrtc.py) and G1WebRtcConnection.start() (line 98 of g1/connection.py), both of which are new code introduced in this PR. Adding a timeout parameter (e.g. self.connection_ready.wait(timeout=30)) and raising a TimeoutError when it returns False would make all three sites fail fast on bad configuration.
|
bab5f66 to
7647f65
Compare
| @property | ||
| def effective_simulator(self) -> str | None: | ||
| """Resolved simulator backend from --simulator or --simulation.""" | ||
| if self.simulator: | ||
| return self.simulator | ||
| if self.simulation: | ||
| return "mujoco" | ||
| return None |
There was a problem hiding this comment.
The
effective_simulator property should also cover the replay flag so the CLI's with_backend() call fires for replay mode the same way it does for simulation. Without this, --replay silently stays on Go2WebRtcConnection and blocks on a WebRTC handshake.
| @property | |
| def effective_simulator(self) -> str | None: | |
| """Resolved simulator backend from --simulator or --simulation.""" | |
| if self.simulator: | |
| return self.simulator | |
| if self.simulation: | |
| return "mujoco" | |
| return None | |
| @property | |
| def effective_simulator(self) -> str | None: | |
| """Resolved simulator backend from --simulator, --simulation, or --replay.""" | |
| if self.replay: | |
| return "replay" | |
| if self.simulator: | |
| return self.simulator | |
| if self.simulation: | |
| return "mujoco" | |
| return None |
113994a to
72080c3
Compare
72080c3 to
f82d270
Compare
Problem
Closes DIM-XXX
Solution
How to Test
Contributor License Agreement