-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CVEChecker which guesses the pkg name and version of an archive #8
base: master
Are you sure you want to change the base?
Conversation
abb9802
to
c0031e2
Compare
c0031e2
to
3651151
Compare
Hey @ahayzen ! I realize now I completely dropped the ball on this one! I will dedicate some time to review to it later today, sorry for such a delay. I will make sure I move faster now. Thanks for your patience and stay tuned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this PR @ahayzen and I apologize again for the time it took me to get to it.
Please see my comments, the most important one is regarding the new field pkg_name
which I think should really be part of the custom checker data.
src/checker.py
Outdated
@@ -89,8 +94,8 @@ def _get_module_data_from_json(self, json_data): | |||
size = source.get('size', -1) | |||
checker_data = source.get('x-checker-data') | |||
|
|||
ext_data = ExternalData(data_type, name, url, sha256sum, size, | |||
arches, checker_data) | |||
ext_data = ExternalData(data_type, pkg_name, name, url, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my previous comment, the pkg_name
would be part of the checker_data
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/checker.py
Outdated
@@ -68,6 +69,10 @@ def _get_finish_args_extra_data_from_json(self, json_data): | |||
def _get_module_data_from_json(self, json_data): | |||
external_data = [] | |||
for module in json_data.get('modules', []): | |||
# This is a guess at the package name from the name the author | |||
# has given to the module block | |||
pkg_name = module.get('name', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes more sense to have the package name as part of the checker data, this way we can enforce it for the CVE Checker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/checkers/cvechecker.py
Outdated
try: | ||
version = CVEChecker.extract_version_from_url( | ||
external_data.url, external_data.type, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like the arguments split like this + the single parenthesis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are using 100 chars for each line, moved this onto one line.
src/checkers/cvechecker.py
Outdated
version = CVEChecker.extract_version_from_url( | ||
external_data.url, external_data.type, | ||
) | ||
logging.debug('CVEChecker: Found %s of the version %s' % |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the old string format, it should be "... Found {} of the versions {}...".format(...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/checkers/cvechecker.py
Outdated
logging.debug('CVEChecker: Version not found in {}'.format(url)) | ||
raise ValueError | ||
else: | ||
logging.debug('CVEChecker: Unknown type %s' % data_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/lib/externaldata.py
Outdated
checker_data=None): | ||
def __init__(self, data_type, pkg_name, filename, url, checksum, size=-1, | ||
arches=[], checker_data=None): | ||
self.pkg_name = pkg_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comments regarding the pkg_name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/lib/externaldata.py
Outdated
@@ -54,13 +56,15 @@ def __init__(self, data_type, filename, url, checksum, size=-1, arches=[], | |||
|
|||
def __str__(self): | |||
info = '{filename}:\n' \ | |||
' PkgName: {pkg_name}\n' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto for the pkg_name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
One extra comment: it'd be great if you could add a test for this, similar to what we have for the other checkers. |
@joaquimrocha I have done the changes, now CVEChecker uses package-name from checker_data. I would like the package name to be auto determined in the future from the URL for archive type (optionally without x-checker-data) so that we can run across flathub easily. Sure I'll write some tests next :-) |
@joaquimrocha Hey, I've added a basic test case to this now (as we add more functionality to the CVEChecker itself the tests can be expanded). Please re-review when you have a moment :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comments in the commits.
Also, I thought this was somehow about checking https://cve.mitre.org/ but you're just printing the versions, right? Are you thinking of checking CVEs later or did I completely misunderstood the purpose of this checker?
# | ||
# Authors: | ||
# Andrew Hayzen <[email protected]> | ||
# Joaquim Rocha <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I don't mind but) you don't have to add my name if I didn't touch this code or if most of it wasn't copied from my code.
|
||
def check(self, external_data): | ||
# TODO: if checker_data or package-name are None | ||
# attempt to guess package name from url if archive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: from url IN archive
?
@@ -49,6 +49,7 @@ def __init__(self, data_type, filename, url, checksum, size=-1, arches=[], | |||
self.arches = arches | |||
self.type = data_type | |||
self.checker_data = checker_data | |||
self.current_version = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in checker_data
.
@@ -0,0 +1,74 @@ | |||
# Copyright (C) 2018 Endless Mobile, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to assign the copyright to Endless if you don't want and you're not using most of its code.
Also, please describe what this checker is about (see other checkers for examples).
@@ -30,6 +30,7 @@ | |||
import checker | |||
|
|||
TEST_MANIFEST = os.path.join(tests_dir, 'org.externaldatachecker.Manifest.json') | |||
CVE_TEST_MANIFEST = os.path.join(tests_dir, 'org.cvechecker.Manifest.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this way other checkers don't run your manifest. Maybe it'd be better to add your tests to the big manifest?
@joaquimrocha yeah this pull was just a proof of concept of extracting the version numbers etc using TingPing's original script and integrating into your tool :-) Not submitting them to an external service (yet). It looks like TingPing has something already working with bst and yocto recipes (and maybe flatpak manifests?) for the runtime's, we should check with him before going any further as it sounds much more developed and would be duplicated work. |
Possibly worth it to also use https://gitlab.com/BuildStream/bst-external/blob/master/bst_external/elements/collect_manifest.py for inspiration. It's being used to generate CVE reports for freedesktop-sdk and Gnome runtimes. |
This is an initial implementation of a CVEChecker, for now it only guesses the pkg_name and version - which it then adds to the debug output.
Let me know if you want this behind a flag for now ? eg
--experimental-cve-checker
or if it is fine as is.