Skip to content

Conversation

@oganigl
Copy link
Contributor

@oganigl oganigl commented Dec 11, 2025

Add a new module to check if there was a hardfault in the last run

@StefanCostea
Copy link
Member

I would investigate if it is possible to emit a full backtrace of what happened, with the corresponding function calls, you probably need to receive a stack pointer and unwind from there somehow. Or check if the MCU has a branch history table that we can access. Also storing those PCs can later be transalted into the corresponding mapping in source using gcc addr2line

@oganigl
Copy link
Contributor Author

oganigl commented Dec 12, 2025

I would investigate if it is possible to emit a full backtrace of what happened, with the corresponding function calls, you probably need to receive a stack pointer and unwind from there somehow. Or check if the MCU has a branch history table that we can access. Also storing those PCs can later be transalted into the corresponding mapping in source using gcc addr2line

Yes I'm already doing that, check my pr in template-project, I read the stack pointer and I unwind it, read also cfsr register, an keep it in the flash.

Cantonplas
Cantonplas previously approved these changes Dec 12, 2025
Copy link
Contributor

@Cantonplas Cantonplas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@oganigl oganigl marked this pull request as draft December 12, 2025 18:05
@jorgesg82
Copy link
Contributor

It would also be nice to toggle the three user LEDs the Nucleo has if we are compiling targeting it

Comment on lines 32 to 34
HardFaultLog log;
memcpy(&log,(void*)HF_FLASH_ADDR,sizeof(HardFaultLog));
LED_init();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not doing anything with the log info you retrieve from the flash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what should I do, my idea was to add a breakpoint so you can read it, and for example check the line where the hard_fault appeared, and have with the rest info about the hard fault type your stack in that point etc. What do you think we could do with that information???

Copy link
Member

@g0nz4I0 g0nz4I0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll combine both reviews in one (here).

In general I like the idea, blinking leds, storage of log report, debugger detection, BUT,

There are several things I have doubts with:

1-From what I remember you can only write to flash by sectors, and also erasing whole sectors, I dont see a logic for that,

e.g. let's say you have some code in the flash sector of you LOG report, and you HardFault, I dont think that would work as expected.

2- I really like the idea of the custom handler and the original one in assembly, but I think there might be a better idea to implement the HardFault Handler directly in assembly, otherwise maybe the stack management corrupts the PC and registers

Also consider moving the asm implementation to the library, and only exposing the C handler so that the user can define its own, in this case it doesnt really matter because we maintain both but it's good practice.

Basically my concerns are:

Flash Writing, double check that that works, maybe consider using the SD?
You are not doing anything with the log you recover from the flash (if that worked)
Maybe you can do something with the debugger thingy and send it via UART or RTT to print the crash report

Instead of having to call HardFaultCheck on the template project maybe you can try to take advantage of static initializtion and try to read the RST reason from RCC?
Otherwise maybe consider placing in STLIB::board, I'm curious to hear why you chose this approach

Suggestion, move the asm handler into stlib and call into a CustomHandler from the user, but this is just a comment

IMO the most important things are in the sContextFrame, which are the PC and the LR, with this you can determine the two previous calls

@g0nz4I0
Copy link
Member

g0nz4I0 commented Dec 12, 2025

I'll combine both reviews in one (here).

In general I like the idea, blinking leds, storage of log report, BUT,

There are several things I have doubts with:

1-From what I remember you can only write to flash by sectors, and also erasing whole sectors, I dont see a logic for that,

e.g. let's say you have some code in the flash sector of you LOG report, and you HardFault, I dont think that would work as expected.

2- I really like the idea of the custom handler and the original one in assembly, but I think there might be a better idea to implement the HardFault Handler directly in assembly, otherwise maybe the stack management corrupts the PC and registers

Also consider moving the asm implementation to the library, and only exposing the C handler so that the user can define its own, in this case it doesnt really matter because we maintain both but it's good practice.

Basically my concerns are:

Flash Writing, double check that that works, maybe consider using the SD? You are not doing anything with the log you recover from the flash (if that worked) Maybe you can do something with the debugger thingy and send it via UART or RTT to print the crash report

Instead of having to call HardFaultCheck on the template project maybe you can try to take advantage of static initializtion and try to read the RST reason from RCC? Otherwise maybe consider placing in STLIB::board, I'm curious to hear why you chose this approach

Suggestion, move the asm handler into stlib and call into a CustomHandler from the user, but this is just a comment

IMO the most important things are in the sContextFrame, which are the PC and the LR, with this you can determine the two previous calls

Parece q se lo he enchufado a una IA, pero juro q no, simplemente ahora escribo y a veces parece una IA

@g0nz4I0
Copy link
Member

g0nz4I0 commented Dec 13, 2025

I would investigate if it is possible to emit a full backtrace of what happened, with the corresponding function calls, you probably need to receive a stack pointer and unwind from there somehow. Or check if the MCU has a branch history table that we can access. Also storing those PCs can later be transalted into the corresponding mapping in source using gcc addr2line

I think with the "default" configuration you get 2 call depth, PC + LR, but I think there can be something done with the frame pointer Frame Pointer Unwinding Linked List this increases the assembly, but might be useful.

@StefanCostea
Copy link
Member

StefanCostea commented Dec 13, 2025

https://github.com/armink/CmBacktrace/blob/master/cm_backtrace/cm_backtrace.c
Here is an example of some code I saw to extract the full call trace, not only the last two calls, IMO it is very important to gather the full call trace that lead to the crash.

To summerize so you do not have to go through what they do is:

sp = read_sp_register();
while(sp-- < stack_start_adrr){ //can get from linker symbol
     mem_val = *sp;
     if mem_val > text_section_start_addr && mem_val < text_section_end_addr){ // this means we now this is a PC value
          instruction_val = *(uint32_t*)mem_val;
          bool is_branch = is_bl_blx(instruction_val); //check if the instruction has the BL BLX format with a mask
          if(is_branch){
                call_trace_buffer.push(mem_val);
          }
     }

}

Basically loop through the whole stack looking for return addresses. You will still have to read the registers for the first return address though as that one has not been pushed to stack

@StefanCostea
Copy link
Member

I would absolutely not enable frame pointer, it would hurt performance

@oganigl oganigl marked this pull request as ready for review December 13, 2025 22:27
@oganigl oganigl marked this pull request as draft December 13, 2025 22:53
@FoniksFox
Copy link
Contributor

I'll combine both reviews in one (here).

In general I like the idea, blinking leds, storage of log report, debugger detection, BUT,

There are several things I have doubts with:

1-From what I remember you can only write to flash by sectors, and also erasing whole sectors, I dont see a logic for that,

e.g. let's say you have some code in the flash sector of you LOG report, and you HardFault, I dont think that would work as expected.

2- I really like the idea of the custom handler and the original one in assembly, but I think there might be a better idea to implement the HardFault Handler directly in assembly, otherwise maybe the stack management corrupts the PC and registers

Also consider moving the asm implementation to the library, and only exposing the C handler so that the user can define its own, in this case it doesnt really matter because we maintain both but it's good practice.

Basically my concerns are:

Flash Writing, double check that that works, maybe consider using the SD? You are not doing anything with the log you recover from the flash (if that worked) Maybe you can do something with the debugger thingy and send it via UART or RTT to print the crash report

Instead of having to call HardFaultCheck on the template project maybe you can try to take advantage of static initializtion and try to read the RST reason from RCC? Otherwise maybe consider placing in STLIB::board, I'm curious to hear why you chose this approach

Suggestion, move the asm handler into stlib and call into a CustomHandler from the user, but this is just a comment

IMO the most important things are in the sContextFrame, which are the PC and the LR, with this you can determine the two previous calls

About using the SD to log the info, we considered it, but deemed it to be too complex and unreliable. The SD card needs a full-blown initialization process that could fail in the process (say, the sd card got loose) and is reliant on many other things working.
So, it could be implemented, with direct control of the SDMMC peripheral and LL calls, but may not be the best of ideas, considering it is an extrenal component and that it needs many calls just to initialize the card...
That's just my opinion though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants