Pim/feat/g1 groot wbc#2033
Conversation
…a separate process
…k to dev (1500 LOC)
…y-over from old quadruped branch)
…era_pose, _publish_tf refactor, spec.py docstring expansion)
The DDS callback registered via ChannelSubscriber.Init(handler, 10) doesn't fire reliably on macOS, so the adapter timed out waiting for the first LowState to capture mode_machine. Switch to passive-mode Init(None, 0) and Read() per tick; mode_machine is now hardcoded to the static value for the 29-DOF G1 instead of read back and echoed. Also adds has_motor_states() to satisfy the post-refactor WholeBodyAdapter Protocol, makes _release_sport_mode null-tolerant (CheckMode returns (status, None) once nothing is active), and drops the now-unused _on_low_state callback.

WebsocketVisModule: restore the arm / disarm / set_dry_run socket.io handlers and the matching activate / dry_run Out[Bool] ports that an earlier cleanup pass dropped — the dashboard buttons now publish on the LCM topics the coordinator already subscribes to.

TransportWholeBodyAdapter: trivial Protocol fix — implement read_odom returning None so isinstance(adapter, WholeBodyAdapter) passes the runtime_checkable type check in ControlCoordinator._setup_hardware.
The earlier cleanup pass over-deleted: openarm manipulator support, the dimos/utils/workspace.py module, the audio STT node_whisper edits, CI configs, README, command-center package-lock churn, and two go2 LFS DBs. None of those are part of the GR00T WBC story. Restore each to origin/dev's content. Drop one path that was added on this branch but doesn't exist on dev (api/requirements.txt). PR diff is now 24 files / +2623 / -210, all GR00T-WBC-relevant.
Both Mustafa's #1954 (already on dev) and the GR00T-WBC PR added this method. After the merge, both definitions live in coordinator.py — Python uses the second, the first becomes dead code, and the transport adapter (which depends on hardware_id being passed) silently picks up its default ``"wholebody"`` topic prefix instead of the component's id. Collapse to one definition that passes the union of kwargs both adapters need: dof, hardware_id, network_interface, domain_id, address, plus **adapter_kwargs. Adapters tolerate extras via their constructor's **_ catch-all.
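The collapsed factory can be sketched like this — class names and field values are illustrative, not the real dimos signatures; the point is the single definition passing the kwargs union, with each adapter swallowing extras via a **_ catch-all:

```python
# One factory definition passing the union of kwargs both adapters need.
# TransportAdapter/DDSAdapter are hypothetical stand-ins for the real ones.
from typing import Any


class TransportAdapter:
    def __init__(self, *, hardware_id: str = "wholebody", **_: Any) -> None:
        # Without hardware_id being passed, this silently defaulted to the
        # "wholebody" topic prefix — the bug the duplicate definition caused.
        self.topic_prefix = hardware_id


class DDSAdapter:
    def __init__(self, *, dof: int = 29, network_interface: str = "", **_: Any) -> None:
        self.dof = dof


def create_whole_body_adapter(cls: type, **adapter_kwargs: Any) -> Any:
    # Union of what either adapter needs; extras fall into each **_ catch-all.
    return cls(
        dof=29,
        hardware_id="g1",
        network_interface="enp86s0",
        domain_id=0,
        address=None,
        **adapter_kwargs,
    )


adapter = create_whole_body_adapter(TransportAdapter)
```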
Switch unitree-g1-groot-wbc to the architecture Mustafa landed in #1954 (G1WholeBodyConnection Module + TransportWholeBodyAdapter via LCM bridge), matching unitree-g1-coordinator. Single G1 low-level pattern in the codebase now.

* Delete dimos/hardware/whole_body/unitree/g1/adapter.py — the UnitreeG1LowLevelAdapter (single-process DDS) is gone.
* Rewrite unitree_g1_groot_wbc.py to compose G1WholeBodyConnection + ControlCoordinator(adapter_type="transport_lcm") + dashboard via autoconnect.
* Patch wholebody_connection.py to apply the macOS-cyclonedds fixes the dropped monolith was carrying:
  - Init(None, 0) instead of Init(callback, 10)
  - Hardcode mode_machine = 5 (the static value for the 29-DOF G1)
  - publish_loop polls subscriber.Read() per tick
  - Drop the now-unused _on_low_state callback
_MJCF_PATH was a relative path "data/mujoco_sim/g1_gear_wbc.xml"
which only resolved when dimos was launched from the repo root.
The mujoco viewer subprocess inherits CWD from the launching shell,
so running ``dimos run unitree-g1-groot-wbc-sim`` from any other
directory tripped FileNotFoundError on viewer startup.
get_data("mujoco_sim") resolves the absolute install path and
auto-extracts the LFS tarball on first run.
1. unitree_g1_groot_wbc_sim.py: resolve _MJCF_PATH via get_data so
both MujocoSimModule and the DIMOS_MUJOCO_VIEW=1 subprocess open
the file regardless of shell CWD. Earlier the relative path
"data/mujoco_sim/g1_gear_wbc.xml" only worked from the repo root.
2. wholebody_connection.py: hardcoded mode_machine = 5 has no
fallback if a future G1 firmware revision changes the value.
Add a one-shot warning the first time we read a LowState whose
mode_machine doesn't match. Self-disables after the first log
line so it doesn't spam the operator.
3. test_unitree_g1_groot_wbc.py: composition smoke test verifying
the three deployed modules, bridge adapter binding (transport_lcm,
confirming the migration off the deleted unitree_g1 monolith),
real-hw safety profile (auto_arm=False, auto_dry_run=True,
default_ramp_seconds=10.0), servo_arms defaults, and all bridge
+ dashboard LCM topics. No DDS, no hardware — just module
imports.
Two safety fixes for back-to-back ``dimos run`` cycles on real
hardware:
1. ``stop()`` now sends a final ``LowCmd_`` with ``mode=0x00``
(motors disabled) for every motor slot before closing the DDS
publisher. Previously the last commanded pose lingered in the
motor controllers — when the next process opened a publisher
and re-released sport mode, those stale stiff commands fought
the new participant during the handoff window, producing
audible gearbox stress.
2. ``_release_sport_mode()`` bails immediately if the first
``CheckMode()`` reports nothing active (``result is None`` or
``{"name": ""}``). A clean second start now logs "Sport mode
already released — skipping ReleaseMode" and skips the
loop-and-poll dance entirely, removing the handoff window.
``dimos/project/test_no_sections.py`` forbids ``# -----`` separator banners and ``# --- text ---`` inline section headers as a project style rule. Three files in this PR carried the pattern from earlier drafts; convert inline-section to plain ``# text`` and drop pure separators. No code-behaviour change.
* tick_loop.py / groot_wbc_task.py: import the class
(``from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped``,
same for ``Twist``), not the module — module-as-type tripped
mypy ``[valid-type]``.
* coordinator.py: when constructing ``GrootWBCTask``, narrow ``hw``
to ``ConnectedWholeBody`` via isinstance before passing
``hw.adapter`` (else mypy unions the manipulator + whole-body
adapter types). Also raises a clear TypeError if a non-whole-body
component is referenced by a groot_wbc task.
* test_unitree_g1_groot_wbc.py: type ``_coordinator_kwargs() ->
dict[str, Any]`` so list/iter usages on the values typecheck.
``mypy dimos`` is clean (670 files).
Greptile Summary: This PR adds
Confidence Score: 4/5
Important Files Changed
Sequence Diagram

sequenceDiagram
    participant Dashboard as Dashboard (WebSocket)
    participant WsVis as WebsocketVisModule
    participant Coord as ControlCoordinator
    participant Task as G1GrootWBCTask
    participant HW as ConnectedWholeBody
    participant Adapter as WholeBodyAdapter
    Dashboard->>WsVis: arm (socket event)
    WsVis->>Coord: "RPC set_activated(engaged=True)"
    Coord->>Task: "arm(ramp_seconds=10)"
    Note over Task: _arm_pending=True
    loop 500 Hz tick loop
        Adapter-->>HW: read_motor_states() + read_imu()
        HW-->>Coord: CoordinatorState (joints + imu)
        Coord->>Task: compute(state)
        alt arming (ramp phase)
            Task-->>Coord: JointCommandOutput (lerp toward default_15)
        else armed + not dry-run
            Task->>Task: ONNX inference (every 10th tick)
            Task-->>Coord: JointCommandOutput (policy targets)
        else dry-run
            Task-->>Coord: None (suppressed)
        end
        Coord->>HW: write_command(positions, SERVO_POSITION)
        HW->>Adapter: write_motor_commands(MotorCommand[kp,kd,tau])
    end
    Dashboard->>WsVis: move_command (WASD)
    WsVis->>Coord: twist_command (LCM)
    Coord->>Task: set_velocity_command(vx, vy, yaw_rate, t_now)
Reviews (5): Last reviewed commit: "fix(g1-wbc): wire sim command controls"
@@ -284,20 +376,53 @@ def _apply_shm_commands(self, engine: MujocoEngine) -> None:

Direct access to private _data attribute of MujocoEngine: engine._data reaches into MujocoEngine's private implementation. If that field is ever renamed or restructured, this call will break silently at runtime. A small @property data accessor on MujocoEngine would expose the same information without coupling to the private layout.
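The suggested accessor could look like this — a minimal sketch with a hypothetical stand-in for the real MujocoEngine and its mujoco.MjData field:

```python
# Sketch of the suggested read-only property; the dict stands in for
# mujoco.MjData and the class for the real MujocoEngine.
class MujocoEngine:
    def __init__(self) -> None:
        self._data = {"qpos": [0.0] * 7}  # private simulation state

    @property
    def data(self):
        """Public handle to the simulation state; callers no longer couple
        to the private _data layout."""
        return self._data


engine = MujocoEngine()
qpos = engine.data["qpos"]  # instead of reaching into engine._data
```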
current_15 = np.zeros(_NUM_ACTIONS, dtype=np.float32)
for i, jname in enumerate(self._joint_names_list):
    pos = state.joints.get_position(jname)
    current_15[i] = pos if pos is not None else 0.0
This will return current_15 as [0.0] * 15 if there is a packet drop or a corrupted message. This can cause the robot to snap, because it thinks it is at the zero position when it is actually somewhere else. Better to use a cached pose or handle this gracefully.
q_29[i] = pos if pos is not None else 0.0
dq_29[i] = vel if vel is not None else 0.0
Should use cached or safe values in case of None, similar to the comment above.
# IMU comes from the adapter, not CoordinatorState.
imu = self._adapter.read_imu()
gyro = np.asarray(imu.gyroscope, dtype=np.float32)
gravity = self._projected_gravity(imu.quaternion)
Coordinator state should also hold IMU data. Please create an issue for this.
Attributes:
    name: Task name (e.g., "traj_arm")
-   type: Task type ("trajectory", "servo", "velocity", "cartesian_ik", "teleop_ik")
+   type: Task type ("trajectory", "servo", "velocity", "cartesian_ik", "teleop_ik", "groot_wbc")
We need a better task management system — something similar to the registry for adapters.
# Mismatched rates make the policy hold actions for too long and
# the robot tips over. ``None`` keeps the task's own default (10).
decimation: int | None = None
We can skip so much verbosity — one-line comments are good enough in the coordinator.
def read_odom(self) -> PoseStamped | None:
    # Default: no base pose available. Sim/estimator adapters override.
    return None
Whole-body adapters will likely never publish odom, so we should modify the spec. Should be addressed in a separate PR.
I feel this should just be part of groot_wbc_task.py. I don't see a reason why we must have this here.
One blueprint for real and sim, with just a simulation-flag change, would be what we would like.
# Drain the latest LowState frame (None means no fresh sample
# this tick — keep the previous one).
sub = self._subscriber
if sub is not None:
    fresh = sub.Read()
    if fresh is not None:
        with self._lock:
            self._low_state = fresh
        # One-shot sanity check on the hardcoded mode_machine.
        if not self._mode_machine_verified:
            self._mode_machine_verified = True
            actual = int(getattr(fresh, "mode_machine", -1))
            if actual != self._mode_machine:
                logger.warning(
                    f"mode_machine mismatch: hardcoded "
                    f"{self._mode_machine}, robot reports "
                    f"{actual}. Commands may be silently "
                    f"rejected by firmware — set "
                    f"_MODE_MACHINE_G1 to {actual} for this "
                    f"variant."
                )
Definitely needs a refactor — two tasks in one code block. The `sub is not None` guard is not needed either.
don't PR code like this pls, this shifts the burden of your agent output review to others
Don't like this at all. We already have a MuJoCo sim module. We must not use another implementation.
Technically we have 2 others.
- simulation/mujoco/mujoco_process.py is the first one. It used to be a class, but Jeff noticed that MuJoCo can only run on the main thread on a Mac, so I converted it to a standalone process with shared-memory communication.
- simulation/engines/mujoco_engine.py was added afterwards. It runs inside a Module, which means it's not running on the main thread.
- This PR adds simulation/engines/mujoco_view_subprocess.py. From the comment below, I assume it was added to fix the same issue of MuJoCo not working on Macs. @Nabla7 is this so?
@mustafab0 I agree that we shouldn't be adding another one. The ideal solution would be to convert simulation/engines/mujoco_engine.py to work as the main thread in a new process if possible.
logger.info("GPS goal points cleared and updated clients")

@self.sio.event  # type: ignore[untyped-decorator]
async def arm(sid: str, data: dict[str, Any] | None = None) -> None:
You need to send these commands from the command center, right? Or are you sending from rerun?
# Uses dimos.msgs.std_msgs.Bool to match the coordinator's
# ``activate`` / ``dry_run`` In[Bool] ports, rather than
# dimos_lcm.std_msgs.Bool used by ``explore_cmd`` — the LCM wire
# format is identical; what matters for autoconnect is type parity.

Suggested change: delete this comment block.
activate: Out[DimosBool]
dry_run: Out[DimosBool]
Nit: you might want more specific names. This module is for all robots and it's not clear what activate means for any one particular robot. Maybe groot_activate?
data.qpos[free_qposadr + 3 : free_qposadr + 7] = base_wxyz
mujoco.mj_kinematics(model, data)
v.sync()
time.sleep(1.0 / 60.0)
To achieve 60 Hz you need to subtract the time it took to process the iteration of the loop.
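The reviewer's fix amounts to sleeping only for the remainder of the frame budget. A minimal sketch (the render work is a placeholder callable, not the real viewer loop):

```python
# Sketch of a tick-aware render sleep: measure the work done this
# iteration and sleep only for the leftover frame budget.
import time

FRAME_DT = 1.0 / 60.0


def render_loop(step, frames: int) -> None:
    for _ in range(frames):
        t0 = time.monotonic()
        step()                                    # physics mirror + v.sync()
        elapsed = time.monotonic() - t0
        time.sleep(max(0.0, FRAME_DT - elapsed))  # only the remaining budget


ticks: list[float] = []
render_loop(lambda: ticks.append(time.monotonic()), frames=3)
```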
js_t.stop()
od_t.stop()
This is not good practice. You have to call stop even if the code fails in the middle. You need try-finally.
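The requested try-finally shape, with the LCM subscriber threads stubbed out (StubThread and run_viewer are illustrative names):

```python
# Sketch: teardown in a finally block so stop() runs even when the
# render loop raises mid-flight. Names here are stand-ins.
class StubThread:
    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


def run_viewer(js_t: StubThread, od_t: StubThread, body) -> None:
    try:
        body()  # render loop; may raise at any point
    finally:
        # always executed — no leaked subscriber threads on failure
        js_t.stop()
        od_t.stop()


def failing_body() -> None:
    raise RuntimeError("boom")


js, od = StubThread(), StubThread()
try:
    run_viewer(js, od, failing_body)
except RuntimeError:
    pass
```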
def _on_joint_state(msg: JointState) -> None:
    # Coordinator publishes dimos canonical names ("g1/left_hip_pitch")
    # but the MJCF uses MuJoCo names ("left_hip_pitch_joint"); translate
    # so the lookup against ``name_to_qposadr`` actually hits.
    try:
        for n, q in zip(list(msg.name), list(msg.position), strict=False):
            latest["joints"][dimos_joint_to_mjcf(str(n))] = float(q)
    except Exception:
        pass
def _on_odom(msg: PoseStamped) -> None:
    try:
        latest["base_pos"] = np.array(
            [msg.position.x, msg.position.y, msg.position.z], dtype=np.float64
        )
        latest["base_wxyz"] = np.array(
            [msg.orientation.w, msg.orientation.x, msg.orientation.y, msg.orientation.z],
            dtype=np.float64,
        )
    except Exception:
        pass
These calls are coming on different threads. You need a lock to modify latest. You also need the lock in the loop below where you access latest.
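The locking the reviewer requires can be sketched like this — the callbacks run on LCM subscriber threads while the render loop reads, so both sides take the same lock (names here are illustrative):

```python
# Sketch: guard the shared ``latest`` dict with one lock on both the
# writer (callback threads) and reader (render loop) sides.
import threading

latest = {"joints": {}, "base_pos": None}
_latest_lock = threading.Lock()


def on_joint_state(names, positions) -> None:  # callback thread
    with _latest_lock:
        for n, q in zip(names, positions):
            latest["joints"][n] = q


def snapshot() -> dict:  # render-loop thread
    with _latest_lock:
        # copy under the lock so the loop renders a consistent frame
        return {"joints": dict(latest["joints"]), "base_pos": latest["base_pos"]}


on_joint_state(["left_hip_pitch_joint"], [0.25])
frame = snapshot()
```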
# from_xml_path can't find them on disk. Inject the dimos-bundled mesh
# bytes by name — same trick MujocoEngine uses.
try:
    from dimos.simulation.mujoco.model import get_assets
Please move all the imports in this file to the top.
)
self._imu_accel_slice = _find_sensor_slice(
    self._engine.model,
    "imu-pelvis-linear-acceleration",
These are G1-specific sensor names, right? But this module isn't just for G1. I think this should be abstracted somehow.
    "DIMOS_MUJOCO_VIEW=1: couldn't locate mjpython/python on PATH; viewer not launched"
)
else:
    _viewer_proc = subprocess.Popen(
Blueprint files are for defining blueprints. You can't just start a random process because a file is imported. In any case, you'd need to manage the process (shut it down when it's not needed).
).transports(
    {
        ("joint_state", JointState): LCMTransport("/coordinator/joint_state", JointState),
        ("odom", PoseStamped): LCMTransport("/odom", PoseStamped),
This is the default. No need to define it.
{
    ("cmd_vel", Twist): LCMTransport("/g1/cmd_vel", Twist),
    ("activate", DimosBool): LCMTransport("/g1/activate", DimosBool),
    ("dry_run", DimosBool): LCMTransport("/g1/dry_run", DimosBool),
.transports applies for all the matching modules in a blueprint so it doesn't make sense to define it twice (in _g1_coordinator too).
If you really want to change the default topic, it's better to only define it once:

unitree_g1_groot_wbc_sim = autoconnect(_g1_engine, _g1_coordinator, _g1_ws_vis).transports(
    {("dry_run", DimosBool): LCMTransport("/g1/dry_run", DimosBool)}
)

But I think you should remove both and embrace the defaults. This applies to all. Why not just use the defaults?
# Resolved to an absolute path so MujocoSimModule (parent) and the
# DIMOS_MUJOCO_VIEW=1 subprocess can both open the file regardless of
# the shell's CWD. get_data also auto-extracts the LFS tarball on
# first run.
_MJCF_PATH = str(get_data("mujoco_sim/g1_gear_wbc.xml"))
You shouldn't call get_data at import time. This blocks python's imports until the data is downloaded. Import-time is just for defining (functions/variables/etc), not doing work.
_g1_connection = G1WholeBodyConnection.blueprint(
    release_sport_mode=True,
    network_interface=os.getenv("ROBOT_INTERFACE", "enp86s0"),
There's no need for environment variables.
You can see all the configs for a blueprint with:
$ uv run dimos run unitree-g1-groot-wbc --help
Blueprint arguments:
g1wholebodyconnection:
* g1wholebodyconnection.default_rpc_timeout: float (default: 120.0)
* g1wholebodyconnection.frame_id_prefix: str | None (default: None)
* g1wholebodyconnection.frame_id: str (default: g1_pelvis)
* g1wholebodyconnection.network_interface: str (default: enp86s0)
* g1wholebodyconnection.release_sport_mode: bool (default: True)
* g1wholebodyconnection.publish_rate_hz: float (default: 500.0)
controlcoordinator:
* controlcoordinator.default_rpc_timeout: float (default: 120.0)
* controlcoordinator.frame_id_prefix: str | None (default: None)
* controlcoordinator.frame_id: str | None (default: None)
* controlcoordinator.tick_rate: float (default: 500.0)
* controlcoordinator.publish_joint_state: bool (default: True)
* controlcoordinator.joint_state_frame_id: str (default: coordinator)
* controlcoordinator.publish_odom: bool (default: True)
* controlcoordinator.log_ticks: bool (default: False)
websocketvismodule:
* websocketvismodule.default_rpc_timeout: float (default: 120.0)
* websocketvismodule.frame_id_prefix: str | None (default: None)
* websocketvismodule.frame_id: str | None (default: None)
* websocketvismodule.port: int (default: 7779)

So if you want to pass network_interface you just send:

uv run dimos run unitree-g1-groot-wbc -o g1wholebodyconnection.network_interface=enp86s0

See docs/usage/cli.md
type="groot_wbc",
joint_names=g1_legs_waist,
priority=50,
model_path=os.getenv("GROOT_MODEL_DIR", str(get_data("groot"))),
You'll want to use LfsPath instead of get_data so the file isn't downloaded at import time.
"""Composition smoke tests for ``unitree_g1_groot_wbc`` (real-hw blueprint).

These tests do not exercise DDS or actually run anything — they just
These tests do not [...] run anything
A very damning statement. :P
Please delete the file. Blueprints are just declarations, not logic. There's nothing to test.
[0:3]   cmd_vel * cmd_scale              # scaled velocity command
[3]     height_cmd                       # fixed slot (0.74)
[4:7]   (0, 0, 0)                        # rpy_cmd, zeros
[7:10]  gyro * obs_ang_vel_scale         # body-frame ang vel
[10:13] projected_gravity(quat)          # gravity in body frame
[13:42] (q_29 - default_29) * dof_pos_scale
[42:71] dq_29 * dof_vel_scale
[71:86] last_action (15 dims)
Are these G1 specific values? If so, either abstract it to allow for other robots, or rename the class to make it clear it's only for G1.
Phase 1 — trivial cleanups:
- Delete test_unitree_g1_groot_wbc.py (the "tests don't run anything"
blueprint test the reviewer asked to drop).
- MujocoEngine.data public property; mujoco_sim_module stops reaching
into engine._data.
- JointServoTask.start() resets _last_update_time so a non-zero-timeout
caller doesn't time out on the first tick.
- Split wholebody_connection's two-tasks-in-one publish loop into
_drain_low_state, _verify_mode_machine_once, _snapshot_motor_imu,
_publish_motor_state_and_imu helpers; dropped the `sub is not None`
guard.
- MujocoSimModuleConfig.imu_gyro_sensor_names + imu_accel_sensor_names
configurable instead of G1-hardcoded; default list still covers the
common humanoid pelvis names.
Phase 2 — renames and moves:
- groot_wbc_task.py -> g1_groot_wbc_task.py; GrootWBCTask ->
G1GrootWBCTask (Paul: G1-specific values, name it).
- Fold _groot_wbc_common.py (joint lists, gain tables, ARM_DEFAULT_POSE)
into g1_groot_wbc_task.py and delete the common file.
- Move dimos/hardware/whole_body/mujoco/g1/adapter.py ->
dimos/simulation/adapters/whole_body/g1.py. Adapter registry now
scans both roots.
Phase 3 — safety: G1GrootWBCTask no longer falls back to zero
when a joint is missing from CoordinatorState. Adds last-known-good
caches (_cached_q_29, _cached_dq_29, _cached_q_15) and a _state_seen
guard that refuses to emit a command until at least one fully-
populated state has been observed. Stops the "packet drop -> snap
to zero pose -> robot falls" failure mode.
Phase 4 — task registry: new dimos/control/tasks/registry.py.
Each task module exposes a register(registry) hook; coordinator's
_create_task_from_config drops 100+ lines of if/elif and becomes
a single control_task_registry.create(cfg.type, cfg, hardware=...)
call. Task type "g1_groot_wbc" is the new canonical name;
"groot_wbc" kept as an alias.
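The registry shape described in Phase 4 can be sketched as follows — TaskRegistry and the factory payload are hypothetical names, not the actual dimos/control/tasks/registry.py contents:

```python
# Sketch of a task registry with alias support: task modules register a
# factory; the coordinator makes one create() call instead of if/elif.
from typing import Any, Callable


class TaskRegistry:
    def __init__(self) -> None:
        self._factories: dict[str, Callable[..., Any]] = {}

    def register(self, type_name: str, factory: Callable[..., Any], *aliases: str) -> None:
        for key in (type_name, *aliases):
            self._factories[key] = factory

    def create(self, type_name: str, *args: Any, **kwargs: Any) -> Any:
        try:
            return self._factories[type_name](*args, **kwargs)
        except KeyError:
            raise ValueError(f"unknown task type: {type_name!r}") from None


control_task_registry = TaskRegistry()

# What a task module's register(registry) hook would do:
control_task_registry.register(
    "g1_groot_wbc", lambda cfg: ("G1GrootWBCTask", cfg), "groot_wbc"
)

task = control_task_registry.create("groot_wbc", {"priority": 50})  # alias resolves
```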
Phase 5 — activate / dry_run from streams to RPC:
- Drop the `activate: In[Bool]` and `dry_run: In[Bool]` ports on
ControlCoordinator. Replaced with @rpc set_activated(engaged) and
@rpc set_dry_run(enabled) -- these are one-shot configuration
events, not continuous signals.
- WebsocketVisModule's arm/disarm/set_dry_run socket.io handlers
call the coordinator's RPC directly via RPCClient(None,
ControlCoordinator); the matching Out[DimosBool] ports go away.
- Blueprints lose their now-dead `("activate", DimosBool)` /
`("dry_run", DimosBool)` LCM transport entries.
Phase 6 — hardware addressing: _create_whole_body_adapter passes a
single canonical `address` (matching the manipulator/twist-base
pattern) instead of overloaded address + network_interface.
Adapter __init__ signatures updated accordingly.
Phase 7 — MuJoCo subprocess hygiene:
- Move the standalone viewer's logic from mujoco_view_subprocess.py
into mujoco_engine.view_main() with `python -m
dimos.simulation.engines.mujoco_engine <mjcf>` entry point.
One MuJoCo entry-point file, not two. Adds the missing lock
around `latest_*` shared state, try-finally for LCM teardown,
and tick-aware render sleep so 60 Hz holds under load.
- New MujocoViewerModule wraps the subprocess lifecycle as a
proper dimos Module (start() spawns, stop() terminates, atexit
fallback). Blueprints declare it instead of running
subprocess.Popen at import time. Configurable via -o
mujocoviewermodule.enabled=true; no more DIMOS_MUJOCO_VIEW env
var.
Phase 8 — one blueprint, --simulation flag:
- Merge unitree_g1_groot_wbc.py + unitree_g1_groot_wbc_sim.py into
a single unitree_g1_groot_wbc.py that branches on
global_config.simulation. Real-hw path: G1WholeBodyConnection +
transport_lcm adapter, 500 Hz, unarmed + dry-run, servo_arms.
Sim path: MujocoSimModule + sim_mujoco_g1 adapter, 50 Hz,
auto-arm, optional MujocoViewerModule.
- Drop the unitree-g1-groot-wbc-sim registry entry; CLI runs
`dimos --simulation run unitree-g1-groot-wbc`.
- Replace get_data() at import time with LfsPath() (Paul's
don't-block-imports note).
- Drop ROBOT_INTERFACE / GROOT_MODEL_DIR / DIMOS_DDS_DOMAIN /
DIMOS_MUJOCO_VIEW env-var reads from blueprints; users override
via `-o module.field=value` like every other dimos blueprint.
Deferred (Mustafa flagged for separate PRs, captured as TODOs):
- CoordinatorState should carry IMU data so the task doesn't have to
reach into the adapter for read_imu (TODO in g1_groot_wbc_task.py).
- Drop read_odom from WholeBodyAdapter Protocol -- most adapters
shouldn't be a source of base pose (TODO in whole_body/spec.py).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pitch = math.copysign(math.pi / 2.0, sinp) if abs(sinp) >= 1.0 else math.asin(sinp)
siny = 2.0 * (w * z + x * y)
cosy = 1.0 - 2.0 * (y * y + z * z)
yaw = math.atan2(siny, cosy)
return IMUState(
    quaternion=quat,
    gyroscope=gyro,
    accelerometer=accel,
    rpy=(roll, pitch, yaw),
)


def read_odom(self) -> PoseStamped | None:
    # MujocoSimModule publishes the floating-base pose on its own
    # ``odom`` Out port (TBD); the adapter doesn't currently route
Command-mode transient race in write_motor_commands
Every call to write_motor_commands calls write_position_command first, which internally calls self._set_command_mode(CMD_MODE_POSITION) (setting the SHM control word to 0). write_kp_command then sets it back to CMD_MODE_PD_TAU (2). Because the coordinator process and the sim engine process run concurrently (separate processes, no shared GIL), the sim engine's _apply_shm_commands can read CTRL_COMMAND_MODE in the narrow window between those two adapter writes.
When that happens, shm.read_command_mode() == CMD_MODE_POSITION, so the engine falls into engine.write_joint_command(JointState(position=pos_cmd.tolist())) — direct position-servo mode, bypassing the PD-tau path. On the following tick _latest_pd_pos_target is stale (from two cycles ago) because the new position was consumed by the servo path. The net result is one tick of incorrect actuator-mode followed by one tick of a stale torque target — observable as an occasional control glitch in sim.
The fix is to stop having write_position_command touch the mode at all; let only the kp/kd/tau writes determine the mode, or latch CMD_MODE_PD_TAU once at connect time and never reset it during normal operation.
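The latch-at-connect variant of that fix can be sketched like this — StubShm and Adapter are illustrative stand-ins for the real SHM bridge, not the actual dimos classes:

```python
# Sketch of the suggested fix: latch CMD_MODE_PD_TAU once at connect and
# never toggle the mode word inside the write path, so the engine process
# can never observe a transient position mode between two adapter writes.
CMD_MODE_POSITION = 0
CMD_MODE_PD_TAU = 2


class StubShm:
    def __init__(self) -> None:
        self.command_mode = CMD_MODE_POSITION
        self.pos = None
        self.kp = None

    def read_command_mode(self) -> int:
        return self.command_mode


class Adapter:
    def __init__(self, shm: StubShm) -> None:
        self._shm = shm
        shm.command_mode = CMD_MODE_PD_TAU  # latched once, never reset

    def write_motor_commands(self, pos, kp) -> None:
        # No _set_command_mode() calls here: the mode word never flaps,
        # so a concurrent reader always sees CMD_MODE_PD_TAU.
        self._shm.pos = pos
        self._shm.kp = kp


shm = StubShm()
adapter = Adapter(shm)
adapter.write_motor_commands(pos=[0.1], kp=[40.0])
```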
…Protocol

Promoted the two follow-up TODOs from the PR #2033 review (Mustafa :381 and :91) into this PR after re-reading the actual scope:

1) IMU on CoordinatorState (Mustafa :381):
- Add ``imu: dict[hardware_id, IMUState]`` field to CoordinatorState.
- TickLoop polls every ConnectedWholeBody's read_imu() each tick via a new _read_all_imu() helper and stuffs the dict on the state.
- G1GrootWBCTask reads from state.imu first, falls back to self._adapter.read_imu() only when state.imu is empty (e.g. unit tests that build a bare CoordinatorState). The state path is the non-coupled one going forward.

2) Drop read_odom from WholeBodyAdapter Protocol (Mustafa :91):
- Remove read_odom from spec.py — the WholeBodyAdapter Protocol no longer promises base pose. Most adapters never did (real-hw G1's was hardcoded None, transport_lcm's likewise); the few that did (sim_mujoco_g1) reported pose the engine module already publishes itself, so the adapter polling was redundant.
- Drop the read_odom impls from SimMujocoG1WholeBodyAdapter and TransportWholeBodyAdapter.
- Drop ControlCoordinatorConfig.publish_odom + ControlCoordinator's ``odom: Out[PoseStamped]`` port + TickLoop._publish_odom + odom_callback wiring. ~50 lines deleted.
- Sim path: MujocoSimModule's ``odom`` Out is now the sole producer of ``/odom``. Removed the previous ``/sim/odom`` topic override since the autoconnect default for ``(odom, PoseStamped)`` is ``/odom``.
- Real-hw path: nothing publishes /odom (matches current behaviour — read_odom returned None). A future estimator Module subscribes to motor_states + imu and publishes to /odom directly, which is the architecturally correct seam.

Not addressed in this PR (genuinely pre-existing, not introduced here): Paul's "ideal solution" of converting mujoco_engine.py to run as the main thread in its own process (so mujoco_process.py and mujoco_engine.py can collapse to one). That's a refactor of the pre-existing two-MuJoCo-implementations situation; the PR review only flagged "don't add a third", which Phase 7 of the previous commit already addressed by folding the standalone viewer into mujoco_engine.view_main(). Unifying mujoco_process.py with mujoco_engine.py touches the manipulator stack (xarm/piper) and is out of scope for this G1-WBC PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@self.sio.event  # type: ignore[untyped-decorator]
async def set_dry_run(sid: str, data: dict[str, Any]) -> None:
The set_dry_run handler declares data: dict[str, Any] as a required argument with no default. In python-socketio, if a client emits set_dry_run without a payload (or the JS side sends null), the handler is invoked without the second argument, raising TypeError: missing 1 required positional argument: 'data'. Even when the payload is present but None, data.get("enabled", False) throws AttributeError. Compare to arm and disarm, which both default data to None.
Suggested change:

-    @self.sio.event  # type: ignore[untyped-decorator]
-    async def set_dry_run(sid: str, data: dict[str, Any]) -> None:
+    @self.sio.event  # type: ignore[untyped-decorator]
+    async def set_dry_run(sid: str, data: dict[str, Any] | None = None) -> None:
if client is None:
    logger.warning("set_dry_run requested but ControlCoordinator RPC is unavailable")
    return
enabled = bool(data.get("enabled", False))
After the data parameter is made optional, the data.get() call needs a None guard to avoid an AttributeError when the payload is absent. Without this, the fix above alone still crashes on a no-payload emission.

Suggested change:

-    enabled = bool(data.get("enabled", False))
+    enabled = bool((data or {}).get("enabled", False))
❌ 3 Tests Failed:
Paul's "ideal solution" from the PR #2033 review (paraphrased): "convert mujoco_engine.py to work as the main thread in a new process if possible." Done.

WholeBodySimHooks (new):
- Extracts the per-step SHM-bridge logic out of MujocoSimModule into a standalone class: pre_step pulls motor commands from SHM and applies PD-with-feedforward; post_step writes motor state + gripper back to SHM.
- The Module's _apply_shm_commands + _publish_shm_state methods, the latched _latest_pd_* state, and _gripper_joint_to_ctrl all move here. The same hook bodies now run regardless of whether the engine lives in a thread or a subprocess.

mujoco_engine.engine_main (new entry point):
- Standalone "whole-body sim subprocess" launched with `python -m dimos.simulation.engines.mujoco_engine <mjcf> <shm_key> <dof> [--view]`. Owns: MujocoEngine, WholeBodySimHooks, an LCM publisher for /odom and /imu so non-adapter consumers (viser viewer etc.) still get those topics, plus signal handlers that tear down SHM + LCM cleanly on SIGTERM/SIGINT.
- Replaces the previous view-only view_main(): the subprocess now runs full physics + viewer instead of just mirroring state from LCM.

MujocoSimModule:
- New config field `engine_mode: Literal["thread", "subprocess"]` (default "thread").
- Thread mode: today's behaviour, untouched except for delegating SHM hooks to WholeBodySimHooks instead of inline methods.
- Subprocess mode: skips the in-process engine entirely, spawns `engine_main` via subprocess.Popen, and terminates it cleanly on stop(). macOS uses mjpython for the subprocess (Cocoa main-thread requirement); Linux uses sys.executable.
- The subprocess + cameras combo is intentionally rejected with a clear error — no cross-process frame buffer yet; users either disable cameras or pick thread mode.

MujocoViewerModule deleted:
- Now redundant. The engine subprocess can launch its own viewer directly with viewer.launch_passive (on its own main thread) when the Module is configured headless=False + engine_mode="subprocess".
- Blueprint loses its standalone _viewer module entry; the engine subprocess handles viewing.

What this does NOT do (deliberately):
- Touch the manipulator stack (dimos/simulation/mujoco/mujoco_process.py). Paul didn't ask for that — he asked specifically for mujoco_engine.py to support subprocess + main thread, which is what landed. The pre-existing two-MuJoCo-implementations situation can collapse to one in a separate refactor that addresses xarm + piper too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
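The subprocess spawn described above might look roughly like this. Only the `python -m dimos.simulation.engines.mujoco_engine <mjcf> <shm_key> <dof> [--view]` invocation and the mjpython-on-macOS rule come from the commit; `engine_cmd` and `spawn_engine` are hypothetical helper names, not dimos' real API:

```python
import subprocess
import sys

def engine_cmd(mjcf_path: str, shm_key: str, dof: int,
               view: bool = False, platform: str = sys.platform) -> list[str]:
    """Build the engine_main command line (hypothetical helper)."""
    # macOS needs mjpython so the MuJoCo viewer owns the Cocoa main thread;
    # elsewhere the current interpreter runs the module directly.
    python = "mjpython" if platform == "darwin" else sys.executable
    cmd = [python, "-m", "dimos.simulation.engines.mujoco_engine",
           mjcf_path, shm_key, str(dof)]
    if view:
        cmd.append("--view")
    return cmd

def spawn_engine(*args, **kwargs) -> subprocess.Popen:
    """Spawn the whole-body sim subprocess; stop() would call .terminate()."""
    return subprocess.Popen(engine_cmd(*args, **kwargs))
```

The split into a pure command builder plus a thin `Popen` wrapper keeps the platform branch trivially testable without launching anything.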
…t import
Blueprints can inspect ``global_config`` at module top level to pick
between real-hw and sim backends (``unitree_g1_groot_wbc.py`` branches
on ``global_config.simulation`` to decide which WholeBodyAdapter to
wire). Until now, the ``run`` command only handed the overrides off to
``ModuleCoordinator.build`` *after* the blueprint was already imported —
so the import-time branch read the stale defaults and the resulting
BlueprintConfig was missing the modules the user was trying to
configure via ``-o module.field=value``. Repro before this fix:
$ dimos --simulation run unitree-g1-groot-wbc \
    -o mujocosimmodule.engine_mode=subprocess
ValidationError: mujocosimmodule
Extra inputs are not permitted
(Real-hw branch was chosen at import; ``mujocosimmodule`` slot doesn't
exist in the resulting config.)
Apply the same ``global_config.update(**cli_config_overrides)`` the
``show_config`` command already uses, but earlier — before
``get_by_name_or_exit`` triggers the blueprint module import.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
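A minimal sketch of the ordering this commit establishes — overrides must reach ``global_config`` before the blueprint module is imported, since the blueprint branches at import time. ``GlobalConfig``, ``import_blueprint``, and ``run`` here are stand-ins, not dimos' real symbols:

```python
class GlobalConfig:
    """Minimal stand-in for dimos' global_config (names assumed)."""
    simulation = False

    def update(self, **overrides):
        for key, value in overrides.items():
            setattr(self, key, value)

global_config = GlobalConfig()

def import_blueprint():
    """Stand-in for the blueprint import: the backend branch runs at import time."""
    if global_config.simulation:
        return {"modules": ["mujocosimmodule"]}   # sim backend exposes the sim slot
    return {"modules": ["transport_adapter"]}     # real-hw config has no sim slot

def run(cli_config_overrides: dict):
    # The fix: apply CLI overrides *before* the import-time branch runs,
    # mirroring what show_config already does.
    global_config.update(**cli_config_overrides)
    return import_blueprint()
```

With the override applied first, ``-o mujocosimmodule.engine_mode=subprocess`` targets a module slot that actually exists in the resulting config, so the ValidationError in the repro disappears.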
| """ | ||
| if not self._active: | ||
| logger.warning(f"G1GrootWBCTask '{self._name}' arm() called before start() — ignoring") | ||
| return False | ||
| if self._armed: | ||
| logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored") | ||
| return False | ||
| ramp = ramp_seconds if ramp_seconds is not None else self._config.default_ramp_seconds | ||
| self._arming_duration = max(0.0, float(ramp)) | ||
| self._arm_pending = True | ||
| logger.info( | ||
| f"G1GrootWBCTask '{self._name}' arm requested (ramp={self._arming_duration:.1f}s)" | ||
| ) | ||
| return True |
arm() can silently restart a live ramp mid-execution
arm() returns early only when _armed=True, but not when _arming=True. If an operator double-clicks Arm in the dashboard (or set_activated is called twice in quick succession), the second call overwrites _arming_duration, sets _arm_pending=True, and on the very next compute() tick the ramp restarts from the mid-ramp current pose. On real hardware this means the robot abruptly stops interpolating toward the target and begins a new, unexpected ramp from wherever it currently is. The disarm() guard checks all three flags (_armed or _arming or _arm_pending); arm() should follow the same convention.
| """ | |
| if not self._active: | |
| logger.warning(f"G1GrootWBCTask '{self._name}' arm() called before start() — ignoring") | |
| return False | |
| if self._armed: | |
| logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored") | |
| return False | |
| ramp = ramp_seconds if ramp_seconds is not None else self._config.default_ramp_seconds | |
| self._arming_duration = max(0.0, float(ramp)) | |
| self._arm_pending = True | |
| logger.info( | |
| f"G1GrootWBCTask '{self._name}' arm requested (ramp={self._arming_duration:.1f}s)" | |
| ) | |
| return True | |
| if self._armed: | |
| logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored") | |
| return False | |
| if self._arming or self._arm_pending: | |
| logger.info(f"G1GrootWBCTask '{self._name}' arm in progress — arm() ignored") | |
| return False |
    stop_explore_cmd: Out[Bool]
    cmd_vel: Out[Twist]
    tele_cmd_vel: Out[Twist]
    movecmd_stamped: Out[TwistStamped]
cmd_vel → tele_cmd_vel rename silently breaks an existing blueprint
WebsocketVisModule.cmd_vel was renamed to tele_cmd_vel, but dimos/robot/unitree/g1/blueprints/primitive/uintree_g1_primitive_no_nav.py still wires ("cmd_vel", Twist): LCMTransport("/cmd_vel", Twist) against WebsocketVisModule. After the rename no port matches that binding, so WASD teleoperation from the dashboard is silently dropped for any blueprint that imports that file. The fix is either to update that blueprint to use tele_cmd_vel, or to keep the old port name as an alias and deprecate it.
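The alias option mentioned above could look like the sketch below. `Out` here is a tiny stand-in for dimos' real port type, and `on_wasd` is a hypothetical handler name — the point is only the fan-out to both port names during the deprecation window:

```python
from typing import Generic, TypeVar

T = TypeVar("T")

class Out(Generic[T]):
    """Tiny stand-in for dimos' Out port (real publish API assumed)."""
    def __init__(self):
        self.published: list[T] = []

    def publish(self, msg: T) -> None:
        self.published.append(msg)

class WebsocketVisModule:
    def __init__(self):
        self.tele_cmd_vel: Out[tuple] = Out()  # new port name
        self.cmd_vel: Out[tuple] = Out()       # deprecated alias so old blueprints keep binding

    def on_wasd(self, twist: tuple) -> None:
        # Fan out to both ports until the alias is retired.
        self.tele_cmd_vel.publish(twist)
        self.cmd_vel.publish(twist)
```

With both ports present, `uintree_g1_primitive_no_nav.py`'s existing `("cmd_vel", Twist)` binding still matches, and new blueprints can wire `tele_cmd_vel`.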
Problem

dimos doesn't have a whole-body locomotion controller for the Unitree G1 humanoid. The ControlCoordinator / WholeBodyAdapter machinery from #1954 is set up to host one — there just isn't a task that drives the legs and waist.

Closes DIM-XXX

Solution

A GrootWBCTask that runs the GR00T balance + walk ONNX policies inside the coordinator tick loop, and two blueprints (unitree-g1-groot-wbc real-hw + unitree-g1-groot-wbc-sim) that compose it. Sim and real run identical task code; only the WholeBodyAdapter underneath differs — sim uses an in-process MuJoCo adapter, real hardware uses Mustafa's transport_lcm bridge to G1WholeBodyConnection (no new G1 low-level adapter).

The coordinator gains activate / dry_run / twist_command ports duck-typed to any task exposing arm() / set_dry_run() / set_velocity_command(). TaskConfig gains hardware_id, default_positions, auto_arm, auto_dry_run, default_ramp_seconds, decimation. WholeBodyConfig carries per-joint kp/kd; ConnectedWholeBody resolves them when building MotorCommands. WebsocketVisModule exposes Arm + Dry-Run buttons that publish on the new ports — the operator activates real-hw runs from the dashboard.

Real-hardware safety profile: comes up unarmed + dry-run + 10-s ramp from current pose to the bent-knee default. Sim auto-arms with no ramp.
G1WholeBodyConnection patched for macOS: passive Init(None, 0) + Read()-per-tick (the callback-based variant doesn't fire reliably under cyclonedds on Darwin), and mode_machine hardcoded to the G1's static value, with a one-shot warning if a future firmware reports something else.

Breaking Changes

None. Additive — existing blueprints unchanged. WholeBodyConfig is new (kp/kd were previously direct fields on HardwareComponent; the migration is a constructor rename).

How to Test
Unit:

uv run pytest dimos/control/tasks/test_groot_wbc_task.py \
    dimos/control/test_control.py \
    dimos/robot/unitree/g1/blueprints/basic/test_unitree_g1_groot_wbc.py \
    dimos/robot/test_all_blueprints_generation.py

Sim (no hardware needed):

MuJoCo viewer pops, robot stands. Open http://localhost:7779/, WASD walks the robot.

Real hardware (requires a G1 + ethernet + the [unitree-dds] extra installed; CYCLONEDDS_HOME exported in the shell):

Robot connects in dev-mode damping; dashboard at http://localhost:7779/. Click Arm (10-s ramp to default pose), inspect the dry-run command stream, then toggle Dry Run OFF to hand control to the policy. WASD = cmd_vel.