Add CLRMA diagnostic command runnable under dotnet-dump #5865
Adds a native SOS 'clrma' command that drives the CLRMA contract (CLRMACreateInstance -> ICLRManagedAnalysis: AssociateClient/GetThread/GetException) and prints the managed thread stack and current/nested exceptions. This lets the Watson/!analyze code path be exercised locally in any SOS host (notably dotnet-dump) without windbg/lldb, as a proxy for Watson bucketing. Registered as a top-level command for DbgEng !clrma parity and framed as a CLRMA-provider diagnostic.

Fixes two latent blockers that only surface in the dotnet-dump host (where Extensions debugger services are null):
- AssociateClient gated the whole CLRMA path on GetDebuggerServices() != null; now uses the target obtained from the host.
- InternalOutputVaList/TraceHostingError dereferenced a null GetDebuggerServices() when CLRMA logging was enabled; now null-guarded.

Adds a dotnet-dump-scoped clrma verification to NestedExceptionTest.script and documents the local workflow in documentation/clrma.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
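The two dotnet-dump fixes described above can be pictured with a small, self-contained sketch. Everything here is hypothetical: `DebuggerServices`, `Target`, `Host`, `TraceIfAvailable`, and `AssociateWithHostTarget` are stand-ins invented for illustration, not the actual SOS types or functions. The sketch only shows the shape of the change: tracing is skipped when debugger services are null, and association depends on the host-provided target rather than on debugger services.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the debugger services interface.
// Under dotnet-dump the pointer to this is null.
struct DebuggerServices {
    std::vector<std::string> lines;
    void Output(const std::string& text) { lines.push_back(text); }
};

// Hypothetical stand-in for the target object supplied by the host.
struct Target {
    int id;
};

// Hypothetical host: debugger services may be absent, the target may not.
struct Host {
    DebuggerServices* debuggerServices;  // nullptr in the dotnet-dump scenario
    Target* target;
};

// Null-guarded tracing: returns false instead of dereferencing a null pointer.
bool TraceIfAvailable(Host& host, const std::string& message) {
    if (host.debuggerServices == nullptr) {
        return false;
    }
    host.debuggerServices->Output(message);
    return true;
}

// Association succeeds whenever the host supplies a target,
// whether or not debugger services exist.
bool AssociateWithHostTarget(Host& host, Target** outTarget) {
    if (outTarget == nullptr || host.target == nullptr) {
        return false;
    }
    *outTarget = host.target;
    TraceIfAvailable(host, "associated with host target");
    return true;
}
```

With a `Host` whose `debuggerServices` is null, `AssociateWithHostTarget` still succeeds and the trace call is a no-op, which mirrors the behavior the commit message describes for dotnet-dump.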
Pull request overview
Adds a new SOS diagnostic command (clrma) to exercise the CLRMA contract end-to-end in any SOS host (notably dotnet-dump), enabling local validation of the Watson/!analyze managed-analysis path and addressing dotnet-dump-specific null debugger-services scenarios.
Changes:
- Introduces a new native SOS `clrma` command that prints managed stack and exception (current/nested) details via `ICLRManagedAnalysis`.
- Removes an `AssociateClient` gate that prevented CLRMA from running when debugger services are unavailable (dotnet-dump host scenario).
- Adds dotnet-dump-scoped script verification and documents the local workflow in `documentation/clrma.md`.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/tests/SOS.UnitTests/Scripts/NestedExceptionTest.script | Adds DOTNETDUMP-only validation of clrma output structure (provider + exception type/HResult). |
| src/SOS/Strike/sos.def | Exports the new clrma command from the SOS module. |
| src/SOS/Strike/clrma/managedanalysis.cpp | Allows CLRMA to proceed even when debugger services are null (dotnet-dump), relying on host target/runtime. |
| src/SOS/Strike/clrma/clrma.cpp | Implements the new clrma command and its printing helpers. |
| src/SOS/SOS.Hosting/Commands/SOSCommand.cs | Registers clrma as a top-level hosted command. |
| src/SOS/extensions/extensions.cpp | Adds null-guards for missing debugger services to avoid crashes in dotnet-dump scenarios. |
| documentation/clrma.md | Documents how to run clrma under dotnet-dump and provides sample output. |
The clrma SOS command was only compiled and exported on Windows, so under dotnet-dump on Linux/macOS it failed with 'Unrecognized SOS command clrma', breaking the NestedExceptionTest CLRMA verification on every non-Windows leg. Compile the clrma sources into the Unix SOS build and export clrma in sos_unixexports.src. Port the native code to the cross-platform debugger services: source the debug interfaces from the ExtQuery globals (no COM IDebugClient on Unix), map IDebugControl/IDebugSymbols3 to the available IDebugControl2/IDebugSymbols compat interfaces, widen ASCII module names from GetModuleNames (no GetModuleNameStringWide), and guard the last-event-thread path (no GetThreadIdsByIndex). Use W() literals and basic_string<WCHAR> so UTF-16 strings match WCHAR/BSTR where wchar_t differs from WCHAR on Unix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
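As a rough illustration of the "widen ASCII module names" step mentioned above, the following sketch converts a narrow ASCII string into a UTF-16 string. It assumes `char16_t` as a stand-in for `WCHAR` on Unix; `Utf16String` and `WidenAscii` are hypothetical names, and replacing non-ASCII bytes with `?` is an arbitrary choice made here, not something taken from the PR.

```cpp
#include <string>

// Assumption: char16_t stands in for WCHAR (one UTF-16 code unit) on Unix.
using Utf16String = std::basic_string<char16_t>;

// Widens a 7-bit ASCII string to UTF-16, one code unit per input byte.
// Bytes outside the ASCII range are replaced with '?' (illustrative policy).
Utf16String WidenAscii(const char* ascii) {
    Utf16String result;
    if (ascii == nullptr) {
        return result;
    }
    for (const char* p = ascii; *p != '\0'; ++p) {
        const unsigned char c = static_cast<unsigned char>(*p);
        result.push_back(c < 0x80 ? static_cast<char16_t>(c) : u'?');
    }
    return result;
}
```

Using a `basic_string` over the 16-bit character type, rather than `std::wstring`, keeps the element size fixed at one UTF-16 code unit even on platforms where `wchar_t` is 32 bits wide.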
In _DEBUG builds, debugreturn.h (pulled in via exts.h -> contract.h) defines 'return' as a macro. The Alpine/musl rootfs uses libstdc++ 10.2.1 whose <vector> has constexpr relocate helpers containing return statements, which break when the macro is active. Including managedanalysis.h (which pulls in <vector>) before exts.h matches the sibling files and resolves the compile error. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
noahfalk left a comment:
I'd only suggest we keep this if we can cut it down to an unseen Windows-only command. I don't think we want to create any customer expectations that this is publicly supported or xplat.
    if (m_message != nullptr)
    {
        memcpy(m_message, none, (noneLen + 1) * sizeof(WCHAR));
    }
Required for xplat. Keeps it utf-16 everywhere.
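The manual-copy pattern in the snippet above can be sketched in isolation as follows. This is a hedged illustration, not the code from exception.cpp: `Char16` is an assumed stand-in for `WCHAR`, and `Utf16Length` and `DuplicateUtf16` are hypothetical helpers. The point is that both the length count and the byte count are expressed in 16-bit code units, so nothing depends on the platform's `wchar_t` width.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Assumption: char16_t stands in for WCHAR on every platform.
using Char16 = char16_t;

// Counts UTF-16 code units up to the terminator.
// wcslen is avoided because it counts wchar_t units, which are 32-bit on most Unix systems.
std::size_t Utf16Length(const Char16* text) {
    std::size_t count = 0;
    while (text[count] != 0) {
        ++count;
    }
    return count;
}

// Allocates a copy of a UTF-16 string, including its terminator.
// Returns an empty pointer if the allocation fails.
std::unique_ptr<Char16[]> DuplicateUtf16(const Char16* source) {
    const std::size_t length = Utf16Length(source);
    std::unique_ptr<Char16[]> copy(new (std::nothrow) Char16[length + 1]);
    if (copy != nullptr) {
        // Byte count is based on the 16-bit unit size, not sizeof(wchar_t).
        std::memcpy(copy.get(), source, (length + 1) * sizeof(Char16));
    }
    return copy;
}
```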
    {
        // To match the built-in SOS provider that scrapes !pe output
    -   typeName = L"<Unknown>";
    +   typeName = W("<Unknown>");
Same. Required for xplat. Keeps it utf-16 everywhere.
        m_debugControl = debugControl.Detach();
        m_debugSymbols = debugSymbols.Detach();
    #else
        // On Unix there is no COM IDebugClient. The cross-platform debugger services (ILLDBServices,
Do we need to support CLRMA on non-Windows systems? Until we have a scenario that requires it, I'd suggest we don't expand our feature support.
For testing purposes I think having something that works on linux and mac makes sense. I hid the command from the actual help list.
    namespace SOS.Hosting
    {
        [Command(Name = "clrma", DefaultOptions = "clrma", Help = "Diagnostic that drives the CLRMA provider used by Watson/!analyze and prints the managed thread and exception analysis. Primarily for validating the CLRMA path; not a general triage command.")]
Do we have a way to make the command hidden? I don't think we want test utilities showing up in the public command list that all .NET developers see.
Added "Hidden" to the CommandAttribute. We didn't before, but we do now.
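The actual change lives in the C# hosting layer, but the idea of a hidden-yet-invocable command can be sketched in C++ for consistency with the native code in this PR. All names below (`CommandInfo`, `GetVisibleCommandHelp`, `FindCommand`) are hypothetical; they only model the behavior of filtering hidden entries out of the help listing while leaving lookup untouched.

```cpp
#include <string>
#include <vector>

// Hypothetical command descriptor; 'hidden' models the new Hidden flag.
struct CommandInfo {
    std::string name;
    std::string help;
    bool hidden;
};

// Builds the help listing, skipping commands marked hidden.
std::vector<std::string> GetVisibleCommandHelp(const std::vector<CommandInfo>& commands) {
    std::vector<std::string> lines;
    for (const CommandInfo& command : commands) {
        if (command.hidden) {
            continue;  // not advertised, but still registered
        }
        lines.push_back(command.name + "  " + command.help);
    }
    return lines;
}

// Lookup ignores the hidden flag, so a hidden command stays invocable by name.
const CommandInfo* FindCommand(const std::vector<CommandInfo>& commands, const std::string& name) {
    for (const CommandInfo& command : commands) {
        if (command.name == name) {
            return &command;
        }
    }
    return nullptr;
}
```

Keeping the filter in the help path only means test scripts can still call the command by name on any host.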
Make clrma a hidden command and apply review feedback from PR dotnet#5865:
- Add a reusable Hidden flag to CommandAttribute and filter hidden commands out of the help/soshelp command list (CommandService.GetAllCommandHelp). The command stays registered and invocable on all hosts/platforms; it just isn't advertised. clrma is marked Hidden.
- Document why exception.cpp uses W()/manual-copy instead of L"" with wcslen/wcscpy (wchar_t != WCHAR on Unix; W() yields UTF-16).
- Harden InternalOutputVaList: use int length, guard negative vsnprintf returns, and always va_end(argsCopy) before returning.
- clrma.md: grammar fix and an internal-validation note clarifying clrma is not a supported public command.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
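The InternalOutputVaList hardening listed above follows a common pattern for formatting through a `va_list`. The sketch below is not the SOS implementation; `FormatVaList` and `Format` are hypothetical helpers that show the three points from the commit message: an `int` length, a guard for negative `vsnprintf` returns, and a `va_end` on the copied list on every path.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Formats into a std::string using a measuring pass followed by a writing pass.
// Returns an empty string if vsnprintf reports an encoding error.
std::string FormatVaList(const char* format, va_list args) {
    va_list argsCopy;
    va_copy(argsCopy, args);
    // vsnprintf returns int; a negative value signals an error.
    const int length = std::vsnprintf(nullptr, 0, format, argsCopy);
    // Release the copy before any return path below.
    va_end(argsCopy);
    if (length < 0) {
        return std::string();
    }
    std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
    const int written = std::vsnprintf(buffer.data(), buffer.size(), format, args);
    if (written < 0) {
        return std::string();
    }
    // The buffer is always null-terminated because its size is at least 1.
    return std::string(buffer.data());
}

// Variadic convenience wrapper around FormatVaList.
std::string Format(const char* format, ...) {
    va_list args;
    va_start(args, format);
    std::string result = FormatVaList(format, args);
    va_end(args);
    return result;
}
```

The copy is needed because a `va_list` that has been consumed by one `vsnprintf` call cannot portably be reused for a second one.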