It's easy to capture call stacks with ETW for managed code that throws exceptions: the CLR provides everything needed. But what can I do for native code that behaves badly?
Let's assume I have a large, mostly native C++ application. I have all debug symbols, and CPU sampling works fine with stacks. Somewhere in the application a bad pointer is dereferenced, and thus the application gets terminated by the OS.
How can I see in my ETW data where in the code the bad pointer was accessed? I'd like to see the full call stack.
What I have done so far is capture stacks for process exit and thread exit, but that doesn't help. It shows me the non-zero exit code, but not the place where the problem happened.
I have already spent days on research and have not found anything so far. It is hard to believe that Microsoft does not provide anything for such cases, while the support for .NET exceptions is just great.
Funny side note: ChatGPT fooled me and made up a provider "Microsoft-Windows-Kernel-Exception" with GUID DABE372C-16F7-4B91-81A7-9CBEB2D0A8FF (Purpose: Captures low-level exception information from the Windows kernel). As far as I can see, such a provider does not exist at all. Maybe that AI creativity is a sign that there really is no solution?
Update 1: I tried Alois' suggestion with the CreationStack of werfault.exe. Unfortunately, that call stack does not show where the error happened. After the crash I see two werfault.exe processes getting created. This is the small exe with an access violation that I used for the test:
    #include <iostream>

    int Level2()
    {
        std::cout << "Hello World from Level2!\n";
        int* ptr = (int*)100;   // deliberately invalid address
        int x = *ptr;           // access violation happens here
        std::cout << "Hello World from Level2 again!\n";
        return x;
    }

    int Level1()
    {
        std::cout << "Hello World from Level1!\n";
        return Level2();
    }

    int main()
    {
        std::cout << "Hello World from main()!\n";
        try
        {
            return Level1();
        }
        catch (const std::exception& )
        {
            // never reached: the crash is not a C++ exception
            std::cout << "Exception caught in main()!\n";
        }
    }
The access violation is a structured (SEH) exception raised by the OS, not a C++ exception derived from std::exception, so the try/catch can't catch it and we don't see the "Exception caught in main()!" message.
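For comparison, here is a minimal sketch of what can be logged in-process, outside of ETW, assuming it is acceptable to add a vectored exception handler to the application. It is not the ETW-based solution I am asking for. The names `ExceptionCodeName`, `LogFirstChance` and `InstallCrashLogger` are hypothetical helpers made up for this illustration; only `AddVectoredExceptionHandler`, `CaptureStackBackTrace` and the `EXCEPTION_RECORD` fields are real Win32 API. The handler prints raw addresses that would still have to be resolved against the symbols afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string_view>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: names for a few well-known NTSTATUS values.
// An unhandled exception of this kind also ends up as the process exit code.
constexpr std::string_view ExceptionCodeName(std::uint32_t code)
{
    switch (code)
    {
    case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
    case 0xC0000094u: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
    case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
    case 0xC0000409u: return "STATUS_STACK_BUFFER_OVERRUN";
    default:          return "unknown";
    }
}

#ifdef _WIN32
// Runs on the faulting thread before any frame-based handler is searched.
// Sketch only: fprintf inside an exception handler is not robust.
static LONG CALLBACK LogFirstChance(PEXCEPTION_POINTERS info)
{
    const EXCEPTION_RECORD* rec = info->ExceptionRecord;
    const std::string_view name = ExceptionCodeName(rec->ExceptionCode);
    std::fprintf(stderr, "exception 0x%08lX (%.*s) at instruction %p\n",
                 static_cast<unsigned long>(rec->ExceptionCode),
                 static_cast<int>(name.size()), name.data(),
                 rec->ExceptionAddress);

    // For access violations, parameter 0 is the access type (0 = read)
    // and parameter 1 is the address that was accessed.
    if (rec->ExceptionCode == EXCEPTION_ACCESS_VIOLATION &&
        rec->NumberParameters >= 2)
    {
        std::fprintf(stderr, "  %s of address %p\n",
                     rec->ExceptionInformation[0] == 0 ? "read" : "write/execute",
                     reinterpret_cast<void*>(rec->ExceptionInformation[1]));
    }

    // Raw return addresses; the first frames belong to exception dispatching.
    void* frames[32];
    const USHORT count = CaptureStackBackTrace(0, 32, frames, nullptr);
    for (USHORT i = 0; i < count; ++i)
        std::fprintf(stderr, "  frame %u: %p\n", static_cast<unsigned>(i), frames[i]);

    return EXCEPTION_CONTINUE_SEARCH;   // only observe, do not handle
}

inline void InstallCrashLogger()
{
    void* handle = AddVectoredExceptionHandler(1, LogFirstChance);
    assert(handle != nullptr);
    (void)handle;
}
#else
inline void InstallCrashLogger() {}     // no-op outside Windows
#endif
```

Calling `InstallCrashLogger()` at the start of `main()` in the test program above should log the faulting instruction in `Level2` before the process dies. Note that a vectored handler also sees first-chance exceptions that are handled later, including ordinary C++ exceptions, so in a real application the output would need filtering.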
Update 2: I tried to get stacks for ALPC calls to WER, as suggested by AloisKraus. Unfortunately, that also does not help: we don't get the function where the exception happened, only "main". Furthermore, adding ALPC to the profile causes far too much noise just to capture exceptions. Not really usable for production :-(