
of/irq: Fix MSI map walk regression and NULL deref in of_msi…#507

Open
vijayanandjitta-oss wants to merge 6 commits into qualcomm-linux:qcom-6.18.y from vijayanandjitta-oss:msi-map-fix

@vijayanandjitta-oss vijayanandjitta-oss commented Apr 23, 2026

Commit a4503c1 ("FROMLIST: of: Factor arguments passed to of_map_id() into a struct") refactored of_map_id() to use an explicit filter_np parameter instead of the dual-purpose struct device_node **target pointer.

The old API distinguished three cases via the double pointer:

  • target == NULL -> pass-through (return 0) when no msi-map
  • target != NULL, *target == NULL -> return -ENODEV (walk continues)
  • target != NULL, *target != NULL -> filter by *target
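
The three cases can be modelled in a few lines of userspace C. This is only a sketch, not the kernel implementation: `struct node`, `old_map_id()`, and the `has_map` / `map_target` fields are hypothetical stand-ins, and ID translation is left out so that only the return-value behaviour is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	bool has_map;             /* node carries an msi-map property */
	struct node *map_target;  /* controller that the map points at */
};

/*
 * Toy model of the old dual-purpose "target" parameter.
 * Returns 0 on a match or pass-through, -ENODEV otherwise.
 */
static int old_map_id(const struct node *np, struct node **target)
{
	assert(np != NULL);

	if (!np->has_map) {
		/* target == NULL: no map means pass-through. */
		if (!target)
			return 0;
		/* target != NULL: the caller wants a node and none is here. */
		return -ENODEV;
	}

	/* *target != NULL: the current value acts as a filter. */
	if (target && *target && *target != np->map_target)
		return -ENODEV;

	/* No filter, or the filter matched: report the matched node. */
	if (target)
		*target = np->map_target;
	return 0;
}
```

For a node without a map, `old_map_id(&leaf, NULL)` returns 0 while `old_map_id(&leaf, &t)` with `t == NULL` returns -ENODEV. That distinction is what the walk in of_msi_xlate() relied on.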

In of_msi_xlate(), the call was changed from passing &np (always a non-NULL pointer, with *np initially NULL) to passing *msi_np (the dereferenced value, initially NULL). This collapsed the "pointer-but-no-filter-yet" case into "no filter at all", causing of_map_id() to return 0 (pass-through) instead of -ENODEV when a node has no msi-map property.

Back in of_msi_xlate(), a return value of 0 triggers break, terminating the walk at the first node (e.g., a PCIe port or endpoint) before ever reaching the root complex node that has the msi-map. As a result, *msi_np remains NULL, irq_find_matching_host() returns NULL, and no MSI domain is associated with the device.
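
The premature termination can be reproduced with a toy model of the walk. The sketch below is a simplification under stated assumptions rather than the actual v13 code: `struct node`, `map_id_v13()`, and `walk_v13()` are hypothetical names, reference counting and ID translation are omitted, and the walk returns the controller node directly instead of a translated ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with an OF node. */
struct node {
	struct node *parent;      /* next device up the hierarchy */
	bool has_map;             /* node carries an msi-map property */
	struct node *map_target;  /* controller that the map points at */
};

/*
 * Toy mapping helper with an explicit filter: a NULL filter means
 * "no filter", so a node without a map yields a pass-through 0 and
 * *out is left untouched.
 */
static int map_id_v13(const struct node *np, const struct node *filter,
		      struct node **out)
{
	assert(np && out);

	if (!np->has_map)
		return filter ? -ENODEV : 0;
	if (filter && filter != np->map_target)
		return -ENODEV;
	*out = np->map_target;
	return 0;
}

/* Walk in the shape of the regressed loop: any 0 ends the walk. */
static struct node *walk_v13(struct node *dev)
{
	struct node *msi_np = NULL;

	for (struct node *np = dev; np; np = np->parent)
		if (!map_id_v13(np, msi_np, &msi_np))
			break;
	return msi_np;
}
```

For a chain endpoint -> port -> root complex in which only the root complex has a map, `walk_v13()` started at the endpoint returns NULL, because the endpoint's pass-through 0 ends the loop before the root complex is visited.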

This affects all callers that start with *msi_np == NULL:

  • of_msi_map_get_device_domain(): MSI domain not found for PCIe devices
  • pci_msi_map_rid_ctlr_node(): MSI controller node not found
  • iproc_pcie_setup_msi(): returns -ENODEV

Additionally, fsl_mc_get_msi_id() passes a NULL msi_np directly to of_msi_xlate(), which then dereferences it (*msi_np), resulting in a NULL pointer dereference.

Fix both issues in of_msi_xlate():

  1. Walk regression: after of_map_msi_id() returns 0, check msi_spec.np. A NULL np indicates a pass-through result (no msi-map on this node), so continue walking up the device hierarchy rather than breaking.

  2. NULL msi_np: introduce a local fallback pointer using the __free(device_node) cleanup attribute (consistent with existing usage in this file). When msi_np is NULL, np points to local_np instead, allowing the walk to proceed safely. Any reference acquired is released automatically on function return.
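
Both fixes can be sketched together in userspace C. This is an illustration under assumptions, not the v14 patch: `struct node`, `map_id()`, and `walk_fixed()` are hypothetical names, the walk returns the controller node rather than a translated ID, and the reference counting that `__free(device_node)` handles in the kernel has no equivalent here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with an OF node. */
struct node {
	struct node *parent;      /* next device up the hierarchy */
	bool has_map;             /* node carries an msi-map property */
	struct node *map_target;  /* controller that the map points at */
};

/* Toy mapping helper: 0 with *out untouched means pass-through. */
static int map_id(const struct node *np, const struct node *filter,
		  struct node **out)
{
	assert(np && out);

	if (!np->has_map)
		return filter ? -ENODEV : 0;
	if (filter && filter != np->map_target)
		return -ENODEV;
	*out = np->map_target;
	return 0;
}

/* Walk that tolerates msi_np == NULL and skips pass-through results. */
static struct node *walk_fixed(struct node *dev, struct node **msi_np)
{
	struct node *local_np = NULL;
	/* Fix 2: fall back to a local slot when the caller passes NULL. */
	struct node **np = msi_np ? msi_np : &local_np;

	for (struct node *cur = dev; cur; cur = cur->parent) {
		struct node *found = NULL;

		if (map_id(cur, *np, &found))
			continue;  /* error on this node: keep walking */
		/* Fix 1: 0 without a node is pass-through, not a match. */
		if (!found)
			continue;
		*np = found;
		break;
	}
	return *np;
}
```

With the same endpoint -> port -> root complex chain, `walk_fixed(&ep, &out)` and `walk_fixed(&ep, NULL)` both reach the root complex and return its controller.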

This PR reverts the v13 version of the series and picks up v14, which contains the above fixes.

CRs-Fixed: 4513046

@vijayanandjitta-oss changed the title from "PENDING: of/irq: Fix MSI map walk regression and NULL deref in of_msi…" to "of/irq: Fix MSI map walk regression and NULL deref in of_msi…" on Apr 24, 2026
vijayanandjitta-oss and others added 6 commits April 28, 2026 09:08
This reverts commit 374c98d.

The v13 version of this patch series introduced a regression in
of_msi_xlate(). The "Factor arguments passed to of_map_id() into a
struct" patch changed the call from passing &np to passing *msi_np,
collapsing the "pointer-but-no-filter-yet" case into "no filter at all".
This causes of_map_id() to return 0 (pass-through) instead of -ENODEV
when a node has no msi-map property, terminating the walk prematurely
and leaving *msi_np NULL so no MSI domain is associated with the device.
Additionally, fsl_mc_get_msi_id() passes msi_np == NULL directly to
of_msi_xlate(), causing a NULL pointer dereference.

Revert the v13 series to pick up v14 which has the above fixes.

Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>

…truct"

This reverts commit 62babf2.

The v13 version of this patch series introduced a regression in
of_msi_xlate(). This patch changed the call from passing &np to passing
*msi_np, collapsing the "pointer-but-no-filter-yet" case into "no filter
at all". This causes of_map_id() to return 0 (pass-through) instead of
-ENODEV when a node has no msi-map property, terminating the walk
prematurely and leaving *msi_np NULL so no MSI domain is associated with
the device. Additionally, fsl_mc_get_msi_id() passes msi_np == NULL
directly to of_msi_xlate(), causing a NULL pointer dereference.

Revert the v13 series to pick up v14 which has the above fixes.

Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>

This reverts commit 7a3860c.

The v13 version of this patch series introduced a regression in
of_msi_xlate(). The "Factor arguments passed to of_map_id() into a
struct" patch changed the call from passing &np to passing *msi_np,
collapsing the "pointer-but-no-filter-yet" case into "no filter at all".
This causes of_map_id() to return 0 (pass-through) instead of -ENODEV
when a node has no msi-map property, terminating the walk prematurely
and leaving *msi_np NULL so no MSI domain is associated with the device.
Additionally, fsl_mc_get_msi_id() passes msi_np == NULL directly to
of_msi_xlate(), causing a NULL pointer dereference.

Revert the v13 series to pick up v14 which has the above fixes.

Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>

Since we now have quite a few users parsing "iommu-map" and "msi-map"
properties, give them some wrappers to conveniently encapsulate the
appropriate sets of property names. This will also make it easier to
then change of_map_id() to correctly account for specifier cells.

Link: https://lore.kernel.org/lkml/20260424-parse_iommu_cells-v14-1-fd02f11b6c38@oss.qualcomm.com/
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>

[Conflict: irq-gic-its-msi-parent.c was refactored to split
of_pmsi_get_msi_info() into of_pmsi_get_dev_id() and
of_v5_pmsi_get_msi_info(); updated both of_map_id() calls.]

Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>

Change of_map_id() to take a pointer to struct of_phandle_args
instead of passing target device node and translated IDs separately.
Update all callers accordingly.

Add an explicit filter_np parameter to of_map_id() and of_map_msi_id()
to separate the filter input from the output. Previously, the target
parameter served dual purpose: as an input filter (if non-NULL, only
match entries targeting that node) and as an output (receiving the
matched node with a reference held). Now filter_np is the explicit
input filter and arg->np is the pure output.

Previously, of_map_id() would call of_node_put() on the matched node
when a filter was provided, making reference ownership inconsistent.
Remove this internal of_node_put() call so that of_map_id() now always
transfers ownership of the matched node reference to the caller via
arg->np. Callers are now consistently responsible for releasing this
reference with of_node_put(arg->np) when done.

Link: https://lore.kernel.org/lkml/20260424-parse_iommu_cells-v14-2-fd02f11b6c38@oss.qualcomm.com/
Acked-by: Frank Li <Frank.Li@nxp.com>
Suggested-by: Rob Herring (Arm) <robh@kernel.org>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>

[Conflict: irq-gic-its-msi-parent.c was refactored to split
of_pmsi_get_msi_info() into of_pmsi_get_dev_id() and
of_v5_pmsi_get_msi_info(); updated both of_map_id() calls.]

Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
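
The ownership rule in the commit message above (the matched node is always handed to the caller with a reference held, with or without a filter) can be illustrated with a toy reference counter. Everything here is a hypothetical userspace stand-in: `struct node`, `struct phandle_args`, `node_get()`, `node_put()`, `map_id()`, and `lookup_and_release()` are not the kernel's types or functions, and the map lookup itself is reduced to a single fixed target.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device node. */
struct node {
	int refcount;
};

/* Hypothetical stand-in for the output argument structure. */
struct phandle_args {
	struct node *np;  /* matched node, reference owned by the caller */
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Toy lookup: filter_np is input only, arg->np is output only.
 * On success the reference is always transferred to the caller.
 */
static int map_id(struct node *target, const struct node *filter_np,
		  struct phandle_args *arg)
{
	assert(target && arg);

	if (filter_np && filter_np != target)
		return -ENODEV;
	arg->np = node_get(target);
	return 0;
}

/* Caller pattern: use the result, then drop the reference. */
static int lookup_and_release(struct node *target,
			      const struct node *filter_np)
{
	struct phandle_args arg = { NULL };
	int ret = map_id(target, filter_np, &arg);

	if (ret)
		return ret;
	/* ... arg.np would be used here ... */
	node_put(arg.np);
	return 0;
}
```

Because the filtered and unfiltered paths behave the same way, a caller that always pairs a successful lookup with one put leaves the count unchanged in both cases.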

So far our parsing of {iommu,msi}-map properties has always blindly
assumed that the output specifiers will always have exactly 1 cell.
This typically does happen to be the case, but is not actually enforced
(and the PCI msi-map binding even explicitly states support for 0 or 1
cells) - as a result we've now ended up with dodgy DTs out in the field
which depend on this behaviour to map a 1-cell specifier for a 2-cell
provider, despite that being bogus per the bindings themselves.

Since there is some potential use in being able to map at least single
input IDs to multi-cell output specifiers (and properly support 0-cell
outputs as well), add support for properly parsing and using the target
nodes' #cells values, albeit with the unfortunate complication of still
having to work around expectations of the old behaviour too.

Since there are multi-cell output specifiers, the callers of of_map_id()
may need to get the exact cell output value for further processing.
Update of_map_id() to set args_count in the output to reflect the actual
number of output specifier cells.

Link: https://lore.kernel.org/lkml/20260424-parse_iommu_cells-v14-3-fd02f11b6c38@oss.qualcomm.com/
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>
Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
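
To make the variable-width parsing described in the commit message above more concrete, here is a minimal userspace sketch. It assumes each map entry is laid out as (id-base, target, output specifier cells, length), identifies targets by an index into a hypothetical `cells_of` table instead of real phandles, and adds the input offset only for 1-cell outputs. `parse_map()`, `struct map_result`, and `MAX_ARGS` are illustrative names, and the workaround for legacy device trees is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARGS 4

/* Hypothetical result: matched target plus its output specifier. */
struct map_result {
	uint32_t target;
	unsigned int args_count;
	uint32_t args[MAX_ARGS];
};

/*
 * Scan a flat map whose entries are
 *   id_base, target, out[cells_of[target]], length
 * and translate id. Returns 0 on a match, -ENODEV if no entry
 * covers id, -EINVAL if the map is malformed.
 */
static int parse_map(const uint32_t *map, size_t len,
		     const unsigned int *cells_of, size_t ntargets,
		     uint32_t id, struct map_result *res)
{
	size_t i = 0;

	assert(map && cells_of && res);

	while (i + 2 <= len) {
		uint32_t id_base = map[i];
		uint32_t target = map[i + 1];
		unsigned int cells;
		uint32_t length;

		if (target >= ntargets)
			return -EINVAL;
		cells = cells_of[target];
		/* The entry needs 3 + cells words in total. */
		if (cells > MAX_ARGS || i + 3 + cells > len)
			return -EINVAL;
		length = map[i + 2 + cells];

		if (id >= id_base && id - id_base < length) {
			res->target = target;
			res->args_count = cells;
			for (unsigned int c = 0; c < cells; c++)
				res->args[c] = map[i + 2 + c];
			/* Assumption: only 1-cell outputs take an offset. */
			if (cells == 1)
				res->args[0] += id - id_base;
			return 0;
		}
		i += 3 + cells;
	}
	return -ENODEV;
}
```

The point of the sketch is that the stride between entries depends on the target's cell count, so a parser that assumes exactly one output cell would misread every entry that follows a 0-cell or 2-cell target.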
@qcomlnxci

Test Matrix

Test Case glymur-crd kaanapali-mtp lemans-evk monaco-evk qcs615-ride qcs6490-rb3gen2 qcs8300-ride qcs9100-ride-r3 sm8750-mtp x1e80100-crd
0_qcom-next-ci-premerge-tests ◻️ ◻️ ◻️ ◻️ ◻️ ◻️ ◻️ ❌ Fail ◻️ ◻️
BT_FW_KMD_Service ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
BT_ON_OFF ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
BT_SCAN ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
CPUFreq_Validation ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
CPU_affinity ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
DSP_AudioPD ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
Ethernet ◻️ ◻️ ⚠️ skip ✅ Pass ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ◻️ ◻️
Freq_Scaling ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
GIC ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
IPA ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
Interrupts ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
OpenCV ◻️ ◻️ ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ◻️ ◻️
PCIe ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
Probe_Failure_Check ◻️ ◻️ ❌ Fail ❌ Fail ✅ Pass ❌ Fail ❌ Fail ✅ Pass ◻️ ◻️
RMNET ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
UFS_Validation ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
USBHost ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ❌ Fail ✅ Pass ✅ Pass ◻️ ◻️
WiFi_Firmware_Driver ◻️ ◻️ ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ⚠️ skip ◻️ ◻️
WiFi_OnOff ◻️ ◻️ ✅ Pass ❌ Fail ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
cdsp_remoteproc ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
hotplug ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
irq ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
kaslr ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
pinctrl ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
qcom_hwrng ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
remoteproc ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
rngtest ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
shmbridge ◻️ ◻️ ❌ Fail ✅ Pass ✅ Pass ✅ Pass ✅ Pass ❌ Fail ◻️ ◻️
smmu ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
watchdog ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
wpss_remoteproc ◻️ ◻️ ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ✅ Pass ◻️ ◻️
