Hello, I am the one who opened the previous issue. I have been playing with a pair of these 2-CH CAN FD HATs: https://www.waveshare.com/wiki/2-CH_CAN_FD_HAT.
As indicated in #561, I changed the kernel to a post-6.51 version, specifically 6.6.58. At first this seemed to resolve the issue, because the `ip link set` command worked and I could list and configure the CAN interfaces via the terminal. Nevertheless, some odd behaviour kept happening. Thinking it was a configuration issue, I started running some tests.
Long story short, I think something is wrong in the kernel drivers: with the same configuration, can-utils does not work correctly with mcp251xfd on a post-6.51 kernel, but it does work correctly on a pre-6.51 kernel (specifically 6.6.28, raspberrypi/rpi-firmware@1a47eac). I have neither the will nor the time (for now) to dig into the code, but I will leave here a description of the errors and what I discovered during testing.
Setup:
Raspberry Pi 4B with kernel 6.6.58 and one or two 2-CH CAN FD HATs. Follow https://www.waveshare.com/wiki/2-CH_CAN_FD_HAT to set them up. Connect the CAN interfaces to each other with a cable (loopback) and make sure the 120 ohm resistors are enabled. Beware of the interface naming. Bring the CAN interfaces up. To verify that the setup itself works correctly, use a pre-6.51 kernel.
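For reference, bringing the interfaces up for this kind of loopback test could look roughly like the sketch below. The interface names (`can0`, `can1`) and the bitrates (1 Mbit/s nominal, 8 Mbit/s data) are assumptions for illustration and are not taken from the report; the `run` and `setup_can` helpers are hypothetical names. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: configure CAN FD interfaces for the loopback test.
# Assumptions (not from the report): names can0/can1, 1 Mbit/s nominal
# bitrate, 8 Mbit/s data bitrate. Adjust to your wiring and HAT setup.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1: only print the commands, 0: run them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_can IFACE [BITRATE] [DBITRATE]
setup_can() {
    run ip link set "$1" down
    run ip link set "$1" type can bitrate "${2:-1000000}" dbitrate "${3:-8000000}" fd on
    run ip link set "$1" up
}

for iface in can0 can1; do
    setup_can "$iface"
done
```

Running it with `DRY_RUN=0` as root applies the settings; with two HATs the loop would need to cover all four interfaces, whatever names they get on your system.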
Error description:
With the 6.6.58 kernel, when using `cangen` or `cansend` to generate CAN frames (I was generating CAN FD frames), the frames are sometimes not received on the other interface. An oscilloscope shows that the CAN bus is idle even though `cangen -v` reports that frames are being generated. Eventually, when the TX queue fills up, a queue-full message appears in the terminal. A fast way to achieve this is to set txqueuelen to an arbitrarily small value, if not zero. The only way I found to leave this state is to reset the CAN interfaces (ip link down, ip link up).
How to get the error:
With the setup described, use `candump` to listen on one interface and `cangen -v` to generate on the other. If the error does not occur, swap RX and TX a couple of times. Setting the generation interval to a low value (1 or 10 ms) almost guarantees that the error will happen. The error happens with `cansend` too. With two HATs the error always appears: on every attempt at least one interface blocks, if not all of them.
What I have observed:
The error was difficult to detect because nothing is displayed in the terminal or recorded in the logs (at least the ones I checked). The verbose option only shows that messages are being generated. `journalctl -xe` and `dmesg` show nothing unusual. The only symptoms are that the TX queue eventually fills up, that the only way out of this state is to reset the CAN interfaces and, of course, that no frames are received on the RX interface (checked with an oscilloscope: the CAN bus is idle). In my opinion, something is going wrong with the SPI communication. With the oscilloscope I have seen that sometimes, when this "blocked" state is reached, the MOSI/MISO lines carry data even though no command is being executed. When the CAN interfaces are reset, the SPI lines become idle again. Maybe it has something to do with how the chip-enable and interrupt (CE and INT) pins are being used?
In any case, the software works well with the 6.6.28 kernel (I tested `cangen -f -g 1` on 4 CAN interfaces at the same time without any errors), so I will stick with that for now. Maybe the issue is specific to 6.6.58. If that is the case, I guess luck is not on my side. I hope you can resolve this. Best regards.
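The reproduction steps described above (listen with `candump`, generate with `cangen`, reset the interface when it blocks) can be sketched as a small script. This is only a sketch under assumptions: the interface names, the frame count of 1000 and the 2 second `candump` timeout are arbitrary, the helper names are hypothetical, and treating a `tx_packets` counter that lags behind the requested count as the blocked state is my own heuristic, not something confirmed for this driver.

```shell
#!/bin/sh
# Sketch: reproduce the blocked state and reset the interface if it occurs.
# Assumptions (not from the report): 1000 frames per run, a 2 s receive
# timeout, and tx_packets lagging behind as the indicator of a blocked TX path.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the kernel's count of transmitted frames for an interface.
tx_packets() {
    cat "$SYSFS_NET/$1/statistics/tx_packets"
}

# missing_frames REQUESTED BEFORE AFTER
# Print how many requested frames the counter does not account for.
missing_frames() {
    echo $(( $1 - ($3 - $2) ))
}

# Bring the interface down and up again, the only recovery found so far.
reset_can() {
    ip link set "$1" down
    ip link set "$1" up
}

# reproduce TX_IFACE RX_IFACE [COUNT]
reproduce() {
    tx_if="$1"; rx_if="$2"; count="${3:-1000}"
    before="$(tx_packets "$tx_if")"
    # Listener stops after COUNT frames or 2 s without reception.
    candump -n "$count" -T 2000 "$rx_if" > /dev/null &
    # CAN FD frames with a 1 ms gap; cangen may fail once the TX queue is full.
    cangen -f -g 1 -n "$count" "$tx_if" || true
    wait
    after="$(tx_packets "$tx_if")"
    missing="$(missing_frames "$count" "$before" "$after")"
    if [ "$missing" -gt 0 ]; then
        echo "$tx_if: $missing frames not sent, resetting interface"
        reset_can "$tx_if"
    else
        echo "$tx_if: all $count frames sent"
    fi
}
```

After sourcing the file as root with the interfaces already configured, something like `reproduce can0 can1` followed by `reproduce can1 can0` swaps RX and TX as described. To reach the queue-full message sooner, txqueuelen can be lowered first, for example with `ip link set can0 txqueuelen 1` (the value 1 is an arbitrary choice).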