Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Python Optional Dependencies #5853

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

samypr100
Copy link
Contributor

@samypr100 samypr100 commented Jan 20, 2023

Closes gh-2991, gh-5740, and potentially gh-5124, gh-3700

This PR allows open3d to support optional dependencies in a manner that allows the end-user to configure what they truly need. The CI/build system should remain unaffected as it defaults to installing [gui,ml] like previous behavior.

Currently, the new code introduces the following optional dependencies:

Optional Dependency Contents
gui Includes new (this PR) requirements_gui.txt file from https://github.com/isl-org/Open3D
ml Includes requirements.txt from https://github.com/isl-org/Open3D-ML
tf Includes ml and requirements-tensorflow.txt from https://github.com/isl-org/Open3D-ML
torch Includes ml and requirements-torch-cuda.txt or requirements-torch.txt from https://github.com/isl-org/Open3D-ML depending if BUILD_CUDA_MODULE is set to ON
all Includes all the above

More can be added (e.g. jupyter extension), but this is the initial list I could think off that would keep it simple. The names can be iterated upon as needed.

✅ Documentation was updated to show new method of installing open3d such that it matches previous behavior
✅ Changelog was updated
✅ Ran python util/check_style.py --apply to apply Open3D code style to the changed code.
✅ Any files that depended on requirements.txt will also depend on requirements_gui.txt to match previous behavior

To provide some motivating examples, I'm listing other packages that have taken this approach to help end users manage what they want to install:


This change is Reviewable

@ssheorey ssheorey added the status / tbd To be decided label Jan 23, 2023
@samypr100
Copy link
Contributor Author

@ssheorey Can the workflows be re-ran? I pushed a minor update to fix an issue that showed up after the first workflows ran. Thanks!

@samypr100
Copy link
Contributor Author

It seems some workflows failed due to situations out of my control 😢

  1. Ubuntu SYCL / ubuntu-sycl seems to be failing due to the following errors:

760: [Open3D WARNING] Failed to download from https://github.com/isl-org/open3d_downloads/releases/download/20220201-data/DemoFeatureMatchingPointClouds.zip. Exception [Open3D Error]

770: [Open3D WARNING] Failed to download from https://github.com/isl-org/open3d_downloads/releases/download/20220201-data/JuneauImage.jpg. Exception [Open3D Error]

Is that something that's known to be failing on the CI? Seems unrelated to the PR

  1. Ubuntu Wheel / Build wheel seems to be failing because there's a git clone to master as part of the job, so it won't find the new file from this PR.

999: Step 31/45 : RUN git clone https://github.com/isl-org/Open3D-ML.git ${OPEN3D_ML_ROOT}

2807: ERROR: Could not open requirements file: [Errno 2] No such file or directory: '/root/Open3D/python/requirements_gui.txt'

This might never pass as the new file would not be found from a clone from master since the change is on the PR, any suggestions?

Copy link
Member

@ssheorey ssheorey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @samypr100, thanks for creating this PR. To ensure backwards compatibility with current users, can you update the PR so that the pip install open3d behavior is unchanged? Dependencies may be removed by using nogui / noml as appropriate. Here is the detailed behavior that would work:

pip install open3d          # full Open3D install as current, with ml and gui,
                            # but requirements-{torch,torch-cuda,tensorflow}.txt
                            # are not installed.
pip install open3d[nogui]   # GUI dependencies are not installed.
pip install open3d[noml]    # ML dependencies are not installed.
pip install open3d[tf]      # requirements-tensorflow.txt is installed.
pip install open3d[torch]   # requirements-torch.txt or requirements-torch-cuda.txt
                            # is installed based on BUILD_CUDA_MODULE
pip install open3d[torch-cpu]  # requirements-torch.txt, ignoring BUILD_CUDA_MODULE
pip install open3d[all]        # identical to pip install open3d[tf,torch]
pip install open3d[all-cpu]    # identical to pip install open3d[tf,torch-cpu]

@ssheorey ssheorey added this to the v0.17 milestone Jan 28, 2023
@ssheorey ssheorey removed the status / tbd To be decided label Jan 28, 2023
@johnthagen
Copy link
Contributor

First off, thank you @samypr100 for putting this PR together!

To ensure backwards compatibility with current users, can you update the PR so that the pip install open3d behavior is unchanged?

@ssheorey So the issue here is that Python packaging does not currently support a way to remove dependencies from the base set of dependencies with extras. Extras are strictly additive. The idea of having "default" extras has been brought up before, but the community could not reach a consensus on it:

This does come with a one-time backwards compatibility issue when a package first makes the transition to use extras for users using the now-optional functionality, but hopefully the popular packages referenced above show that the transition can be successful and offers a lot of flexibility/efficiency to a package's users.

Some ideas to help communicate this change to users could be:

  • Mention the change prominently at the top of the README
  • Mention the change prominetly on the PyPI page
  • Have a pinned GitHub issue that explains the change
  • Mention the change prominently in the user guide

@ssheorey
Copy link
Member

Hi @johnthagen @samypr100 , note that the GUI and ML components in Open3D are not considered extras by the developers -- they are as much a core part of Open3D as the geometry processing code. We are happy to reduce the dependency footprint as much as possible for specific sets of users, but would not want to sacrifice the default experience for users that need the GUI / ML aspects.

Before we consider a solution for the technical issue that @johnthagen highlighted, what is the potential space saving provided by this PR, say for a user who does not want either GUI or ML?

@johnthagen
Copy link
Contributor

@ssheorey I certainly appreciate the fact that this change does have a bit of maintainence overhead.

Before we consider a solution for the technical issue that @johnthagen highlighted, what is the potential space saving provided by this PR, say for a user who does not want either GUI or ML?

I wanted to also explain a bit the full motivation I see with breaking up the dependencies of a large project like Open3D. Size was one of the motivators (as discussed in the other threads, this is primarily the CUDA DLLs included in the Linux wheels), but another similar motivator to me is the dependency tree itself.

When Open3D is included in a larger Python application, Open3D's dependencies impact other things such as locking dependencies using a tools such as Poetry or pip-tools. If Open3D's dependencies conflict with something in the application (such as the application depends on requests <1 but Open3D's transitive dependency depends on requests >2), this causes an issue. Also, licenses of a sub dependency could cause further challenges for some projects (for example if a transitive dependency is GPL, LGPL etc.). Teams often want to do some kind of vetting of their dependency tree to verify trust in those dependecy authors.

Allowing users to opt-in to these rather large dependecy graphs allows them to slim down the chances of these kinds of conflicts or issues to only the portions of a package they need. There is a balance to this (breaking up into a dozen extras is likely too much), and I feel like @samypr100 has chosen fairly reasonable grouping.

I also think other Python packages have been here. Uvicorn is a good example where they have pip install univcorn[standard]. That is in fact the typical way to install it (and includes C-accelerators not too disimliar in purpose to the CUDA accelerators in Open3D). Ultimately, that package decided that the flexibility provided to the users (in their case a pure-Python base option) was worth the added complexity documenting the extra. I'm not trying to say this is for sure the right solution for Open3D, just that Open3D wouldn't be the first to go down this path.

@johnthagen
Copy link
Contributor

Somewhat related to this:

@samypr100
Copy link
Contributor Author

samypr100 commented Jan 29, 2023

Thanks @ssheorey for the suggestions, we can definitely add torch-cpu and all-cpu, but nogui, noml won't work as the extra's pep is only additive as @johnthagen indicated :(

In my mind I like to think of extras as components rather than just additional dependencies. Extras gives you the power to specify the "bare minimum" of packages to get basic functionality in install_requires and if you want even more functionality extras is there to help. That way it can help satisfy a lot of use cases (space, locking, conflicts, etc.) such as the one @johnthagen has described. It's thanks to extras that we can avoid hundreds of packages being installed when installing some libraries unlike other ecosystems (*cough* npm *cough* 😅).

We could also create a open3d[standard] which basically does gui,ml to keep it in one word if that's desirable or more appealing?

All the python tooling has very good support for extras. e.g. in a requirements.txt it would look like

open3d[standard] == 0.17.0
numpy >= 1.19.2
...

I think python users would feel very familiar with the syntax since it's been around for many years, and given there's enough issues open that seem to indicate the desire for this in Open3D, I think it would be a welcoming addition.

Alternatives/Future/Thoughts

Similar to what gh-5124 asks for, we could publish a open3d-lite and/or an open3d-ml pypi package in addition to open3d. I think we'll still need a open3d-ml no matter what to close gh-5124 (see below).

Pip Package Code
open3d Same code base, but no ml?
open3d-lite Same code, no ml, no gui, maybe similar to gh-5869
open3d-ml Same as open3d, but includes ml & cuda bindings/acceleration

The reason I state that open3d-ml will still be needed is because we would definitely need a separate package to isolate the cuda/pybind.so's in the wheels mentioned in gh-5124. I can likely tackle that next if we go forward with this PR.

Some caveats on going this route is that some users will still want to avoid the gui components but desire the ml parts (e.g. on headless servers). So it may very well be that someone will ask to have a no-gui option for open3d-ml as well. Hence, extra's is definitely the answer here I think since that means open3d-lite may become obsolete if it only exists to avoid the dash package. Lastly, the maintenance overhead of keeping three (or more) pypi packages in sync might not be worth it when extras is there to help. If any I'd vote for a maximum of two pypi packages.

@samypr100
Copy link
Contributor Author

@ssheorey Any thoughts on how you'd want me to proceed with this PR?

@ssheorey
Copy link
Member

ssheorey commented Feb 8, 2023

Hi @samypr100 can you respond to my question about the potential space-saving with optional dependencies? I'm not sure it's useful to remove dependencies if it's not going to result in significant space savings. I think there are some dependencies that can be removed / replaced by brief native code in a case by case manner without impacting functionality and we would welcome PRs to reduce dependencies that do not impact functionality.

Regarding concerns of license and version conflicts - these again need to be handled in a case by case manner. Please submit issues / PR per dependency.

CUDA is orthogonal to ML - see the macOS wheel for instance. It contains (almost) all the Linux functionality (including ML) but there is no CUDA support. It is much smaller as a result. Most of the Linux wheel size is actually accelerating classical 3D data processing algorithms with CUDA.

@samypr100
Copy link
Contributor Author

samypr100 commented Feb 9, 2023

@ssheorey I apologize for not answering the sizes question, I must have missed it.

Strictly speaking from a python dependencies size, below are the sizes with the new optional dependencies calculated on OSX (transitive sizes may vary between platforms).

Package Transitive Dependencies Size (MB)
open3d 107.69
open3d[gui] 216.69
open3d[ml] 350.13
open3d[gui,ml] 459.13
open3d[gui,ml,tf] 1437.67

In a clean environment, installing upstream open3d yields 459.13MB, same as open3d[gui,ml] which is to be expected. The PR allows the user to save about 351.44MB on OSX, or even more on other platforms. At least, I consider this a significant amount of savings.

Note: The table does not include the size of open3d alone by itself as it's out of scope of this PR (see clarification below).

This PR is about trying to give the user power over the transitive dependencies pulled in via using pip to install open3d. Maybe there's been some confusion in the discussion since to reduce the size of open3d package itself #5869 would be needed, with similar/same optional dependencies support as shown here.

Edit [02/13/2023]: Once #5902 get's merged, along with this PR could add up to 351.44MB + 303~MB (less from open3d itself) = 650MB in total savings.

@samypr100 samypr100 force-pushed the optional-dependencies-support branch 3 times, most recently from dc542a3 to ae3bc8d Compare February 20, 2023 21:09
@samypr100 samypr100 force-pushed the optional-dependencies-support branch from ae3bc8d to 46552fa Compare March 8, 2023 20:35
@johnthagen
Copy link
Contributor

johnthagen commented Apr 5, 2023

@ssheorey Is there any way we can move this or the more conservative version (#5940) forward?

We're really interested in continuing to reduce the download size/install time for Open3D for developers and within CI pipelines. Additionally, this is valuable for reducing container image sizes that use Open3D.

This would also help Open3D usage on environment managers like Poetry that use the JSON API avoid issues such as:

Thanks for all the great work on Open3D.

Split out visualization and ml modules as optional dependencies
@samypr100 samypr100 force-pushed the optional-dependencies-support branch from 46552fa to 41f17dc Compare April 5, 2023 14:40
@samypr100
Copy link
Contributor Author

Another notable project that just split out into many tiny optional dependencies was pandas (https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html)

Pandas 2.0 now has all, performance, computation, fss, aws, gcp, excel, parquet, feather, hdf5, spss, postgresql, mysql, sql-other, html, xml, plot, output_formatting, clipboard, compression, test optional dependencies.

@artemisart
Copy link

Hello, is this stalled? @ssheorey Open3D dependencies are a pretty big issue for us (starting with dash...) and I believe @samypr100 approach is correct. Is there any way we can help?

@samypr100
Copy link
Contributor Author

@artemisart There's also a smaller PR #5940 you might be interested in. I can rebase the PR's with latest if there's interest by the maintainers in moving forward with either.

@theNded
Copy link
Contributor

theNded commented Aug 11, 2023

@artemisart There's also a smaller PR #5940 you might be interested in. I can rebase the PR's with latest if there's interest by the maintainers in moving forward with either.

Hi @samypr100 , thanks for the contribution. I will follow up with your PRs and try to push it. Do you want to proceed with #5940 first? It seems a relatively smoother transition than this PR. However, I do think this PR is more self-inclusive and clear.

@theNded theNded self-requested a review August 11, 2023 04:51
# Conflicts:
#	cpp/pybind/make_install_pip_package.cmake
#	python/requirements.txt
@samypr100
Copy link
Contributor Author

samypr100 commented Aug 11, 2023

Thanks @theNded

Just updated with master (hopefully I didn't miss anything) feel free to take a look. I can go either way the open3d team prefers. This one is the most complete vs #5940 but #5940 was a simpler change in case it was prefer to minimize the changes. Edit: #5940 has also been updated.

@johnthagen
Copy link
Contributor

@theNded Thanks for your interest in this PR. Many of us Open3D users are excited for it.

As a production user of Open3D, our team would prefer this PR to be considered and merged, as it provides the most holistic support for users to control which Open3D dependencies they use. If this PR is deemed too complex, then #5940 would be something we would also support as it still makes progress in this vein.

# Conflicts:
#	docs/getting_started.in.rst
@johnthagen
Copy link
Contributor

@ssheorey With 0.18 coming soon, is there any consensus among the Open3D team about how/if to proceed with allowing optional dependencies for Open3D?

@RafaelWO
Copy link

Any update on this? It would be great if we could opt-out of e.g. nbformat (which pulls in jupyter etc.) because we don't need it 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip: Why does Open3D pull in Open3D-ML/requirements.txt?
6 participants