Messages get dropped when larger than 0.5MB - using shared memory - QoS is Best_effort #739
CC: @Barry-Xu-2018
I have opened a discussion here as well, since I could reproduce it using their HelloWorld example with a few modifications; FYI: eProsima/Fast-DDS#4276
I can reproduce this issue. However, on the host (not in a container), message_lost_callback is never called. The Fast DDS shared memory example also uses RELIABLE QoS; I simply modified it (topic_qos) to BEST_EFFORT.
How come it transfers 1024*1024 bytes and not the 2 * 1024 * 1024 that the segment size is set to? Also, what is the difference between the topic QoS and the DataWriter QoS - do they both need to be set the same?
I figured out it was the buffer size of the data, rather than the size of the segment or the string. Got it working with a 10 MB string.
Do you want to test it in a ROS 2 environment? BTW, there is an easy way: modify the segment size at rmw_fastrtps/rmw_fastrtps_shared_cpp/src/participant.cpp (lines 195 to 197 in 4d0be32):
auto shm_transport =
std::make_shared<eprosima::fastdds::rtps::SharedMemTransportDescriptor>();
shm_transport->segment_size(xxxxxx); // <== change the size of segment
domainParticipantQos.transport().user_transports.push_back(shm_transport);
Then rebuild only the rmw_fastrtps package.
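If you do rebuild from source, a rough sketch of doing so in an overlay workspace on top of the binary install (a common colcon workflow; the branch and package selection should be checked against your ROS 2 distro) might look like:
# in an empty overlay workspace, with the binary ROS 2 install already sourced
mkdir -p src
git clone https://github.com/ros2/rmw_fastrtps.git src/rmw_fastrtps
colcon build --packages-select rmw_fastrtps_shared_cpp rmw_fastrtps_cpp
source install/setup.bash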
Currently I'm using the binary package installation, so I would like to avoid having to deploy a custom-built rmw_fastrtps package. I currently tried with:
It doesn't complain about it, whereas when I tried with segmentSize it did. But it doesn't seem to have an effect (I commented out the shared memory setup in the HelloWorldSharedMem example).
This is expected. Shared memory or not, this is not a bounded data type, so it cannot use LoanedMessage or Data Sharing Delivery; see also eProsima/Fast-DDS#4276. @dk-teknologisk-lag after all, I suggest you try LoanedMessage - the message data type must be bounded. (And underneath, here is the demo code: https://github.com/ros2/demos/blob/rolling/demo_nodes_cpp/src/topics/talker_loaned_message.cpp)
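For illustration only, a minimal sketch along the lines of that demo (assuming rclcpp's loaned message API and a bounded type such as std_msgs::msg::Float64; this is not the reporter's point cloud type, which is unbounded, and the node/topic names are made up):

#include <chrono>
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float64.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("loaned_talker");
  auto pub = node->create_publisher<std_msgs::msg::Float64>("chatter_pod", 10);

  auto timer = node->create_wall_timer(std::chrono::milliseconds(50), [&]() {
    // Borrow a message allocated by the middleware; for bounded types the
    // data can be written directly into shared memory (zero copy).
    auto loaned_msg = pub->borrow_loaned_message();
    loaned_msg.get().data = 3.14;
    // Publishing the loaned message hands ownership back to the middleware.
    pub->publish(std::move(loaned_msg));
  });

  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}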
I'm afraid your participant profile is missing the
Yeah, I understand that. But since sending from an actual sensor to a PC can run at the full 20 Hz, with a somewhat more compressed point cloud format, resulting in 16 MB/s for e.g. an Ouster OS1 lidar, it seems horrible if we can't get 20 Hz over IPC out of the box. Yes, it's an unbounded type and hence limited to the shared memory feature and not loaned messages. That could for sure be interesting to look into, but it would require a change in the Ouster driver itself, which is a bit out of scope for our current project. If we get into CPU overload or timing issues for lidar odometry or something similar, we might try to use the loaned message API.
I think I did try that as well; currently debugging to figure out when and how the XML files are parsed. But if is_default_profile is set, the... yeah, the default profiles get set to those values? So when this is executed:
I should get the XML default values here? I tried to get it working with the modified examples (only HelloWorldSharedMem) from here: But looking more closely, it doesn't seem to use any default QoS but creates its own - or should it work here as well? Thanks for the suggestion, though; I will try again tomorrow.
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
<profiles>
<transport_descriptors>
<!-- Create a descriptor for the new transport -->
<transport_descriptor>
<transport_id>shm_transport</transport_id>
<type>SHM</type>
<segment_size>10485760</segment_size>
</transport_descriptor>
</transport_descriptors>
<participant profile_name="SHMParticipant" is_default_profile="true">
<rtps>
<!-- Link the Transport Layer to the Participant -->
<userTransports>
<transport_id>shm_transport</transport_id>
</userTransports>
</rtps>
</participant>
</profiles>
</dds>
Using this configuration works; it can significantly reduce the packet loss rate. Even with an increased segment size (tested with 30 MB), there is still some packet loss.
RMW_FASTRTPS_USE_QOS_FROM_XML=1 FASTRTPS_DEFAULT_PROFILES_FILE=my_config.xml ros2 run cpp_pubsub talker --ros-args -p freq:=10 -p bytesize:=10000000
RMW_FASTRTPS_USE_QOS_FROM_XML=1 FASTRTPS_DEFAULT_PROFILES_FILE=pub_sub_config.xml ros2 run cpp_pubsub listener
The second buffer is there because you did not disable the builtin SHM transport, so you're adding a second one. Please try with the following:
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
<profiles>
<transport_descriptors>
<!-- Create a descriptor for the new transport -->
<transport_descriptor>
<transport_id>shm_transport</transport_id>
<type>SHM</type>
<segment_size>10485760</segment_size>
</transport_descriptor>
</transport_descriptors>
<participant profile_name="SHMParticipant" is_default_profile="true">
<rtps>
<!-- Link the Transport Layer to the Participant -->
<userTransports>
<transport_id>shm_transport</transport_id>
</userTransports>
<useBuiltinTransports>false</useBuiltinTransports>
</rtps>
</participant>
</profiles>
</dds>
I don't even seem to be able to disable the SHM transport, i.e. like this (borrowed from eProsima/Fast-DDS#2287):
Ahh, thanks. It seems to work like this - I wonder why it didn't work with the previous one, so that it just used UDP?
Ahh, I missed that. Thanks a lot for the help. I think we can close this, unless the default should be something other than 0.5 MB, which seems quite low for a ROS application?
So this can't be configured per topic - since it requires the
Never mind, I guess I can just omit the XML config file for those nodes that don't require a large amount of shared memory.
One more question: why do both the publisher and the subscriber create a shared memory buffer? According to this diagram, the shared memory on the subscriber side is not used.
On the subscriber side, I think you don't need to set the segment size.
If I disable all shared memory and run over UDP it works fine - even though ros2 topic hz still only shows about 15 Hz... That seems to be what I will go for, for now.
Ah, yeah okay. It works fine now that I restarted the Docker container, but
@EduPonz do you think this is something we should adjust? If we are not changing any default, I think we can close this issue.
From my viewpoint, things should work out of the box. Generally, you should be able to send small messages even if you have allocated a "large" shared memory pool, but the other way around leads to packet drops, hence this issue.

How large the default should be is of course a bit difficult to guess, but to cover most cases one could look towards large point clouds or 8K camera images and set that as a target point. The only downside is that you can run out of shared memory. In a default Docker container there is only 64 MB, but you get a nice error message that space could not be allocated if you run short of it. In comparison, we have a NUC PC with about 7 GB of shared memory and my laptop has 32 GB, so a default of 10 or 20 MB would only be a small subset of those. Double the size if it is not easy to set a lower value for subscribers (see below). If the packets you try to send are larger than the shared memory available, you get no warning or error - just a lower Hz and dropped messages.

One more question: as an alternative to increasing the default value, could it be parameterized, so that when you create a publisher you can specify the amount of shared memory? The driver maintainer could then estimate the optimal value for each of their drivers/sensors.
I would like to add that I have run into this exact issue trying to view (I think) small images in rqt, where anything over 420x420 resolution plays extremely poorly. This happens when rqt can no longer get the entire message through SHM and, I guess, struggles with the UDP method. I absolutely believe that the ROS defaults should be changed to have an SHM pool large enough for rqt to work with an average webcam.
Just to add some more insight into the configuration options: for large data transmissions we have
I think it would probably be better to have
Hi, I'm trying this solution, but even trying to run a simple example publisher
If I increase the segment size to 100 MB instead, I get this:
Is there something that I am doing wrong? Everything works fine if I don't specify the
@fdila can you try the configuration from #739 (comment)?
@fujitatomoya I still get the error:
@fdila okay, I would recommend you create another issue with a detailed description and a reproducible procedure. On this issue, we want to focus on #739 (comment).
Bug report
Required Info:
Steps to reproduce issue
It all stems from transferring point cloud data from Ouster's ROS 2 driver to any subscriber - ros2 bag / ros2 topic echo / hz etc. - these also indicate dropped messages.
The minimal example here can reproduce it, though.
As far as I know all sensors have their publishing QoS set to BEST_EFFORT, so this is also the case in this example.
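For illustration only (the actual minimal example is linked above, not reproduced here), a hypothetical publisher along these lines exercises the same path - a payload larger than the default SHM segment, sent BEST_EFFORT at 20 Hz; the node and topic names are made up:

#include <chrono>
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/u_int8_multi_array.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("big_talker");

  // Sensor-style QoS: best effort, small history depth.
  auto qos = rclcpp::QoS(10).best_effort();
  auto pub = node->create_publisher<std_msgs::msg::UInt8MultiArray>("big_topic", qos);

  std_msgs::msg::UInt8MultiArray msg;
  msg.data.resize(2 * 1024 * 1024);  // ~2 MB payload, larger than the ~0.5 MB default segment

  // 50 ms period, i.e. roughly the 20 Hz lidar publishing rate.
  auto timer = node->create_wall_timer(std::chrono::milliseconds(50),
    [&pub, &msg]() { pub->publish(msg); });

  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}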
Expected behavior
Messages get sent and received with the required frequency
Actual behavior
Messages get dropped occasionally, getting worse the higher the frequency or message size. See image:
Additional information
I have searched everywhere for a solution, but the majority of suggestions are to change buffer sizes, which doesn't seem applicable here since it uses shared memory.
As seen in the image, it seemingly only uses around 1.5 MB and has up to 64 MB available.