task: tighten AUTO read-ahead wait-clear to match MDI (#3650) #3971
grandixximo wants to merge 1 commit into LinuxCNC:master from
Conversation
Auto mode cleared the read-ahead wait on `execState==DONE` alone, so a G38 issued just before an `emcMotionUpdate` snapshot could let the next read fire while `motion.traj.queue` was still non-zero, tripping `NCE_QUEUE_IS_NOT_EMPTY_AFTER_PROBING`. Add the `queue==0` and `io.status==DONE` conjuncts already used by `mdi_execute_hook`. Closes LinuxCNC#3650, LinuxCNC#662, LinuxCNC#263.
Not sure I really qualify to comment, but the explanation makes sense.

Need testers; I will try to post in the forum as well.

I'll send a 2.9 PR just so I can have debs built. Users on the forum want 2.9 to test the fix.
I'm slightly confused: does this issue occur when the probe move starts, or when it ends? The error message implies it is at the end; that is, currently it is possible that a state sync check at the end of the probe move passes when it should not. This makes sense to me as an explanation that also matches the error message (but of course it might not be correct).

But when you say that:

Are you saying that the error actually occurs during a state sync BEFORE the probe move takes place, instead of AFTER? Or are you talking about the END of the probe move, with the following move being the one in question? Either way, are you saying that the race condition is that the move is ADDED prematurely, not that the condition passed prematurely because the previous move had not been fully REMOVED? Surely the move wouldn't be added prematurely, because the addition of the move to the interpreter queue would itself be gated behind the state sync check.
Or are you saying that is the race condition: that upon completion of the probe move there is a brief period where moves can be added to the queue before the sync is commanded? This could allow the next move to be added to the queue before the sync takes place, causing the current sync logic to fail because the next move is already in the queue despite the machine otherwise being in a state where the sync should pass. In that case, while tightening up this logic might be a fix, we'd ideally also want to fix the issue closer to its origin by preventing moves which require a sync beforehand from being added to the queue before a sync is confirmed to have occurred.
Just to show where the error message originates: the "Queue is not empty after probing" message is defined in linuxcnc/src/emc/rs274ngc/rs274ngc_return.hh (line 133 at 2378edb), and it is raised in linuxcnc/src/emc/rs274ngc/rs274ngc_pre.cc (line 1448 at 2378edb). The message is triggered when the interpreter is commanded to read an input: if `probe_flag` is true and the queue is not empty, the error occurs. So the error definitely seems to occur AFTER a probe move.
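For readers who haven't traced the interpreter source, the guard described above can be modeled as a minimal sketch. This is not the real `Interp` class; the struct and field names here are illustrative stand-ins for the state the `CHKS(...)` macro inspects in `read_inputs`:

```cpp
#include <stdexcept>

// Minimal model (not the real Interp class) of the guard in
// rs274ngc_pre.cc that emits NCE_QUEUE_IS_NOT_EMPTY_AFTER_PROBING.
// Field names are simplified stand-ins.
struct InterpModel {
    bool probe_flag;    // set when a G38 probe has just completed
    int  queue_depth;   // pending interpreter/motion queue entries

    // Called when the interpreter is commanded to read an input.
    void read_inputs() const {
        if (probe_flag && queue_depth != 0)
            throw std::runtime_error("Queue is not empty after probing");
    }
};
```

The point the comment makes follows directly: the check only fires with `probe_flag` set, which can only be true after a probe move, so the error is necessarily raised AFTER probing.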
Fixes the long-standing "Queue is not empty after probing" error.
`mdi_execute_hook` has, since 2012, gated wait-clear on `execState==DONE && motion.traj.queue==0 && io.status==DONE`. `readahead_reading` (AUTO mode) checks only `execState==DONE`, on the assumption that the prior `EMC_TASK_PLAN_SYNCH` precondition (WAITING_FOR_MOTION_AND_IO) had already drained motion. That assumption breaks across cycles:
`emcTaskIssueCommand` writes `EMC_TRAJ_PROBE` to motion via shared memory; then the same task cycle's `emcMotionUpdate` may snapshot `emcmotStatus` before the servo loop has added the new segment. `motion.status` reads `DONE` (stale), the SYNCH precondition transitions immediately, and the next `Interp::_read` runs `read_inputs` against a queue that is now non-empty. CHKS fires.

Patch: add the same two conjuncts to the AUTO check.
Closes #3650, #662, #263.
Complementary to #3537 (already merged): that fix handled probe re-trip during deceleration. This one handles the inter-command snapshot-lag race that lets the same family of bugs leak through.
cc @andypugh, @DauntlessAq, @rmu75, @BsAtHome: would value review on the wait-clear semantics.
cc @tiket18, @Cromaglious: could you test against your reproducing workload? Built artifacts can be used for testing.
Test plan