Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yarn: SBOM components #739

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

slimreaper35
Copy link
Member

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

Copy link
Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple small questions/nitpicks. Good to go once those are resolved.

"""Return package URL."""
qualifiers = {}

if self.url != DEFAULT_NPM_REGISTRY:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that self.url can be anything? If not then could it be constrained to a union of literals?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have a check for that:
https://github.com/containerbuildsystem/cachi2/blob/main/cachi2/core/package_managers/yarn_classic/resolver.py#L184

IOW, if a package is RegistryPackage it is one of those npm registries.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing me to this. Could you please add this explicitly? Essentially like this:

class RegistryPackage(_BasePackage, _UrlMixin):
    """A Yarn 1.x package from the registry."""
    url: Literal[*NPM_REGISTRY_CNAMES]

It will significantly reduce the number of such questions in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the URL be like "https://registry.yarnpkg.com/foo/-/foo-1.0.0.tgz#fffffff" or just https://registry.yarnpkg.com ? 🤔 cc @taylormadore

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we store https://registry.yarnpkg.com/foo/-/foo-1.0.0.tgz#fffffff as the url attribute so I would need to change the implementation slightly.

cachi2/core/package_managers/yarn_classic/resolver.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/yarn_classic/main.py Outdated Show resolved Hide resolved
qualifiers = {"download_url": self.url}
return PackageURL(
type="npm",
name=self.name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some of the non-registry packages, we may want to read their respective package.json files from the cached tarballs to get their true names. As a user, I can call these non-registry dependencies whatever I want in my project's package.json and it could be completely different from what it really is:

"dependencies": {
  "not-fecha": "https://github.com/taylorhakes/fecha.git"
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the filename in the cache come from package.json or the URL ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is the filename from the resolved URL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a tricky situation. Theoritacily, we could have three potential names:

  • the name declared in the project's package.json
  • the actual package name from the dependency's package.json
  • the repository name from the Git URL

The class is very often interpreted as `algorithm:hash`. Let's make use
of builtin __str__ magic method and return the same format.

Signed-off-by: Michal Šoltis <[email protected]>
Each "yarn classic" package type has a property that returns the package
URL based on its attributes and community PURL specification [1].

All package types share the same base -> name, version, and type which
is set to "npm" ("yarn" does not exist).

- `FilePackage`, `WorkspacePackage`, `LinkPackage` have in addition
  subpath component (extra subpath within a package, relative to the
  package root)

- `UrlPackage` has one extra qualifier -> its URL as it is definied

- `GitPackage` has one extra qualifier -> package version control system
  URL with a specific syntax [2]

- `RegistryPackage` has two extra qualifiers -> repository_url (default
  repository/registry for "npm" is https://registry.npmjs.org so
  alternative registries such as https://registry.yarnpkg.com should be
  qualified via the qualifier) [3], [4] + the checksum of the package
  converted from Subresource Integrity representation

---
[1]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst
[2]: https://github.com/spdx/spdx-spec/blob/cfa1b9d08903/chapters/3-package-information.md#37-package-download-location-
[3]: https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#npm
[4]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#known-qualifiers-keyvalue-pairs

Signed-off-by: Michal Šoltis <[email protected]>
After a successful pre-fetching of all packages, report all downloaded
packages as components in the final SBOM.

Create the `Component` object from each package based on package
attributes.

Dev packages should have `cdx:npm:package:development` property, that is
added to the component if package is marked for development -> `dev`
attribute is set to True.

Move the rest of the unit test logic to `test_fetch_yarn_source` from
its predecessor in yarn-berry implementation.

closes containerbuildsystem#636

Signed-off-by: Michal Šoltis <[email protected]>
The commit follows the previous one, that implements generating SBOM
components. Now e2e tests that generate SBOMs should be updated to
reflect this change by running integration tests and setting our custom
env variable: `CACHI2_GENERATE_TEST_DATA=true`

Signed-off-by: Michal Šoltis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants