can utils (cansend, candump, cangen...) simply don't work right with mcp251xfd. Possible kernel issue. #563

Open
Markkolas opened this issue Nov 11, 2024 · 2 comments


Markkolas commented Nov 11, 2024

Hello, I am the one who opened the previous issue. I have been playing with a pair of these 2-CH CAN FD HATs: https://www.waveshare.com/wiki/2-CH_CAN_FD_HAT.

As suggested in #561, I moved to a kernel newer than 6.6.51, specifically 6.6.58. At first this seemed to resolve the issue: the ip link set command worked, and I could list and configure the CAN interfaces via the terminal. Nevertheless, some odd behavior persisted. Thinking it was a configuration issue, I started running some tests.

Long story short, I think something is wrong in the kernel driver: with the same configuration, can-utils does not work correctly with mcp251xfd on the post-6.6.51 kernel, but it does work correctly on a pre-6.6.51 kernel (specifically 6.6.28, raspberrypi/rpi-firmware@1a47eac). I do not have the will or the time (for now) to dig into the code, but I will leave here a description of the errors and what I found during testing.

Setup:

Raspberry Pi 4B with kernel 6.6.58 and one or two 2-CH CAN FD HATs. Follow https://www.waveshare.com/wiki/2-CH_CAN_FD_HAT to set them up. Connect the CAN interfaces to each other with a cable (loopback) and make sure the 120 ohm termination resistors are enabled. Beware of the interface naming. Bring the CAN interfaces up. To check that the setup itself works correctly, use a pre-6.6.51 kernel.

Error description:

With the 6.6.58 kernel, when using cangen or cansend to generate CAN frames (I was generating CAN FD frames), the frames are sometimes not received on the other interface. An oscilloscope shows that the CAN bus is idle even though cangen -v reports that frames are being generated. Eventually, when the TX queue fills up, a queue-full message appears in the terminal. A fast way to achieve this is to set txqueuelen to an arbitrarily small value, if not zero. The only way I have found to leave this state is to reset the CAN interfaces (ip link down, ip link up).
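
The txqueuelen shortcut and the down/up reset described above can be scripted. This is a minimal sketch, assuming iproute2's `ip` is on the PATH and the script runs with root privileges; the helper names are hypothetical and the interface name and queue length are only examples:

```python
import subprocess


def txqueuelen_cmd(ifname, qlen):
    """Build the `ip` command that sets the TX queue length of an interface."""
    return ["ip", "link", "set", "dev", ifname, "txqueuelen", str(qlen)]


def reset_cmds(ifname):
    """Build the down/up command pair used to recover a blocked interface."""
    return [
        ["ip", "link", "set", "dev", ifname, "down"],
        ["ip", "link", "set", "dev", ifname, "up"],
    ]


def run(cmds):
    """Run each command in order and raise if any of them fails."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

For example, `run([txqueuelen_cmd("can0", 1)])` shrinks the queue so the blocked state shows up quickly, and `run(reset_cmds("can0"))` performs the reset.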

How to get the error:

With the setup described above, use candump to listen on one interface and cangen -v to generate frames on the other. If the error does not occur, swap the RX and TX roles a couple of times. Setting the generation interval to a short time (1 or 10 ms) almost guarantees that the error will happen. The error also happens with cansend. With two HATs the error always appears: at least one interface blocks on every attempt, if not all of them.
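
To check whether the behavior depends on can-utils at all, the same traffic can be generated directly over a raw SocketCAN socket. The following is a rough sketch, not a replacement for cangen: the function names, the CAN ID 0x123 and the counter payload are illustrative assumptions, and it assumes the interface was brought up with CAN FD enabled. The frame layout follows `struct canfd_frame` from `<linux/can.h>`.

```python
import errno
import socket
import struct
import time

# struct canfd_frame: can_id (u32), len (u8), flags (u8), 2 reserved bytes, 64 data bytes
CANFD_FRAME_FMT = "=IBB2x64s"
CANFD_MTU = struct.calcsize(CANFD_FRAME_FMT)
# Payload lengths that CAN FD can encode
VALID_FD_LENGTHS = set(range(9)) | {12, 16, 20, 24, 32, 48, 64}


def build_canfd_frame(can_id, data, flags=0):
    """Pack one CAN FD frame; the payload is zero-padded to 64 bytes."""
    if len(data) not in VALID_FD_LENGTHS:
        raise ValueError(f"invalid CAN FD payload length: {len(data)}")
    return struct.pack(CANFD_FRAME_FMT, can_id, len(data), flags, data)


def send_frames(ifname, count, gap_s=0.001, can_id=0x123):
    """Send up to `count` frames; return how many were queued before ENOBUFS."""
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        # Allow CAN FD frames on this raw socket
        sock.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_FD_FRAMES, 1)
        sock.bind((ifname,))
        for i in range(count):
            payload = i.to_bytes(8, "big")  # running counter as payload
            try:
                sock.send(build_canfd_frame(can_id, payload))
            except OSError as exc:
                if exc.errno == errno.ENOBUFS:
                    return i  # TX queue is full
                raise
            time.sleep(gap_s)
    return count
```

Calling `send_frames("can1", 10000)` while candump listens on the other interface roughly mirrors a 1 ms generation interval. If this sender also ends up with a full queue while the bus stays idle, that would point away from can-utils and toward the driver.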

What I have observed:

The error was difficult to detect because nothing is displayed in the terminal or recorded in the logs (at least the ones I have looked at). The verbose option only shows that messages are being generated, and journalctl -xe and dmesg do not show anything unusual. The only symptoms are that the TX queue eventually fills up, that the only way out is to reset the CAN interfaces and, of course, that no frames arrive at the RX interface (checked with an oscilloscope: the CAN bus is idle).

In my opinion, something is going wrong with the SPI communication. I have observed with the oscilloscope that sometimes, when this "blocked" state is reached, the MOSI/MISO lines carry data even though no command is being executed. When the CAN interfaces are reset, the SPI lines become idle again. Maybe it has something to do with how the driver uses the chip-enable and interrupt (CE and INT) pins?
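
Since nothing is logged, one way to spot the blocked state without an oscilloscope is to watch the interface's TX counter in sysfs while a sender is running. This is a sketch under the assumption that the driver increments `tx_packets` only when a frame has actually been transmitted, so a counter that stops advancing under load suggests a stall; the function names and the one-second polling interval are hypothetical choices:

```python
import time
from pathlib import Path


def read_tx_packets(ifname, sysfs_root="/sys/class/net"):
    """Read the tx_packets counter of a network interface from sysfs."""
    path = Path(sysfs_root) / ifname / "statistics" / "tx_packets"
    return int(path.read_text())


def stalled(samples):
    """Return True if the counter did not advance between the last two samples."""
    return len(samples) >= 2 and samples[-1] == samples[-2]


def watch(ifname, interval_s=1.0, rounds=10):
    """Poll the counter and report when it stops advancing."""
    samples = []
    for _ in range(rounds):
        samples.append(read_tx_packets(ifname))
        if stalled(samples):
            print(f"{ifname}: tx_packets stuck at {samples[-1]}")
        time.sleep(interval_s)
    return samples
```

Running `watch("can1")` alongside cangen on the same interface should print a line for each polling round in which no frame left the controller.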

In any case, the software works well with the 6.6.28 kernel (I tested cangen -f -g 1 on 4 CAN interfaces at the same time without any errors), so I will stick with that for now. Maybe the issue is specific to 6.6.58. If that is the case, I guess luck is not on my side. I hope you can resolve this. Best regards.

@Markkolas (Author)

Should I create a standard issue at https://github.com/raspberrypi/linux/issues instead? Forgot to put this in the original message. Thank you in advance.

@marckleinebudde (Member)

> Should I create a standard issue at https://github.com/raspberrypi/linux/issues instead? Forgot to put this in the original message. Thank you in advance.

not needed
