-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace conda with pyenv to fix incorrect libstdc++ use in jammy CI. #6966
Conversation
Thanks for submitting this pull request! The maintainers of this repository would appreciate if you could update the CHANGELOG.md based on your changes. |
a95f3be
to
3e964a5
Compare
Seg fault in python ...
def test_tensormap(device):
...
# __delitem__ operator. SEGFAULT
with pytest.raises(RuntimeError) as excinfo:
del tm.positions
assert 'cannot be deleted' in str(excinfo.value)
...
# Set primary key. SEGFAULT
with pytest.raises(KeyError) as e:
tm.primary_key = o3c.Tensor.ones((2, 3), dtype, device)
...
# Get unknown attributes. SEGFAULT
with pytest.raises(KeyError) as e:
normals = tm.normals Fix: switch to gcc 13. Update nanoflannimpl to fix this warning as error: inlined from ‘open3d::core::nns::impl::{anonymous}::_KnnSearchCPU<float, int, open3d::core::nns::NeighborSearchAllocator<float, int>, 1>(open3d::core::nns::NanoFlannIndexHolderBase*, int64_t*, size_t, const float*, size_t, const float*, size_t, int, bool, bool, open3d::core::nns::NeighborSearchAllocator<float, int>&)::<lambda(const float*, const float*, size_t)>’ at /home/ssheorey/Documents/Open3D/Code/Open3D/cpp/open3d/core/nns/NanoFlannImpl.h:126:5,
inlined from ‘open3d::core::nns::impl::{anonymous}::_RadiusSearchCPU<float, int, open3d::core::nns::NeighborSearchAllocator<float, int>, 1>(open3d::core::nns::NanoFlannIndexHolderBase*, int64_t*, size_t, const float*, size_t, const float*, size_t, const float*, bool, bool, bool, bool, open3d::core::nns::NeighborSearchAllocator<float, int>&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>’ at /home/ssheorey/Documents/Open3D/Code/Open3D/cpp/open3d/core/nns/NanoFlannImpl.h:258:41:
/usr/include/c++/13/bits/new_allocator.h:168:33: error: ‘void operator delete(void*, std::size_t)’ called on pointer ‘<unknown>’ with nonzero offset [4, 9223372036854775804] [-Werror=free-nonheap-object] |
gcc-11 leads to a seg fault in tensormap in Release mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@@ -147,8 +140,9 @@ void _KnnSearchCPU(NanoFlannIndexHolderBase *holder, | |||
for (size_t valid_i = 0; valid_i < num_valid; ++valid_i) { | |||
TIndex idx = result_indices[valid_i]; | |||
if (ignore_query_point && | |||
points_equal(&queries[i * dimension], | |||
&points[idx * dimension], dimension)) { | |||
std::equal(&queries[i * dimension], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! This should significantly reduce the number of heap allocations
bash Miniconda3-latest-Linux-x86_64.sh -b; \ | ||
rm Miniconda3-latest-Linux-x86_64.sh; \ | ||
curl https://pyenv.run | bash \ | ||
&& pyenv update \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working on switching to Miniforge in #6717 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miniforge likely won't fix this issue, since it's a copy of conda and likely uses it's on libstdc++ as well.
Seg fault with Python, CUDA 11.8, 12.1 and Ubuntu 22.04. No issue with C++, the open3d-cpu package, or if CUDA_VISIBLE_DEVICES is empty. No issue on Ubuntu 20.04 or Windows. |
… No need to find and load explicitly. Use libc++11 to build in Ubuntu 22.04. Warn if newer version is used. TODO: Solution for Ubuntu 24.04
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved. See question comment
docker/Dockerfile.ci
Outdated
@@ -67,34 +67,55 @@ RUN if [ "${BUILD_SYCL_MODULE}" = "ON" ]; then \ | |||
rm -rf /etc/apt/sources.list.d/oneAPI.list; \ | |||
fi | |||
|
|||
# Dependencies: basic | |||
# Dependencies: basic and python-build | |||
# gcc-11 causes a seg fault in tensormap when built in Release mode. Upgrade to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we found the root cause for that or is it really a bug only due to gcc11?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching that. I think it was actually the libunwind issue. I've been working with gcc-11 on jammy (22.04) without issues. Reverted.
Type
Motivation and Context
CUDA jammy CI (python tests) fail since pytest causes the conda libstdc++ library to be loaded instead of the system libstdc++ library. This is older and prevents Open3D cuda pybind lib from loading.
Checklist:
python util/check_style.py --apply
to apply Open3D code styleto my code.
updated accordingly.
results (e.g. screenshots or numbers) here.
Description
TODO: Solution for Ubuntu 24.04 (newer libc++ in general).