--is-pipeline relative import error #52
Comments
Thanks for reporting. I suspect --is-pipeline doesn't get much use, and it doesn't look to be covered by tests. load_pipeline_from_module() has a bit that's supposed to handle these relative imports:

    dirname_ = dirname(module)
    assert(module.endswith('.py'))
    try:
        sys.path.insert(0, dirname_)
        modname = basename(module)[:-3]
        # to allow for relative imports within "stock" pipelines
        if dirname_ == opj(dirname(__file__), 'pipelines'):
            mod = __import__('datalad_crawler.pipelines.%s' % modname,
                             fromlist=['datalad_crawler.pipelines'])
        else:
            mod = __import__(modname, level=0)

The problem is that we don't go down the if-arm because the condition assumes an absolute path for the module. Normalizing both sides with abspath() gets past that check:

diff --git a/datalad_crawler/pipeline.py b/datalad_crawler/pipeline.py
index a23c70e..4f117c3 100644
--- a/datalad_crawler/pipeline.py
+++ b/datalad_crawler/pipeline.py
@@ -50,6 +50,7 @@
 import sys
 from glob import glob
+from os.path import abspath
 from os.path import dirname, join as opj, isabs, exists, curdir, basename
 from os import makedirs
@@ -391,7 +392,7 @@ def load_pipeline_from_module(module, func=None, args=None, kwargs=None, return_
         sys.path.insert(0, dirname_)
         modname = basename(module)[:-3]
         # to allow for relative imports within "stock" pipelines
-        if dirname_ == opj(dirname(__file__), 'pipelines'):
+        if abspath(dirname_) == opj(abspath(dirname(__file__)), 'pipelines'):
             mod = __import__('datalad_crawler.pipelines.%s' % modname,
                              fromlist=['datalad_crawler.pipelines'])
         else:

But that just gets us to another failure:

So --is-pipeline needs some attention.
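
For anyone who wants to see the path mismatch in isolation, here is a minimal standalone sketch. It is not datalad_crawler code: the two helper functions and the package_dir value are hypothetical stand-ins, with package_dir playing the role of dirname(__file__) under the assumption that it is an absolute path while the module path from the command line is relative.

```python
from os.path import abspath, dirname, join as opj


def is_stock_dir_naive(dirname_, package_dir):
    """Compare the paths as given (hypothetical helper mirroring the old check)."""
    return dirname_ == opj(package_dir, 'pipelines')


def is_stock_dir_abs(dirname_, package_dir):
    """Compare after normalizing both sides with abspath (mirrors the patch)."""
    return abspath(dirname_) == opj(abspath(package_dir), 'pipelines')


# A relative path, as it would be typed on the command line.
module = opj('datalad_crawler', 'pipelines', 'nda.py')
dirname_ = dirname(module)

# Assumption: the package directory is known as an absolute path.
package_dir = abspath('datalad_crawler')

naive_result = is_stock_dir_naive(dirname_, package_dir)
abs_result = is_stock_dir_abs(dirname_, package_dir)
print(naive_result, abs_result)
```

The naive comparison treats a relative and an absolute spelling of the same directory as different strings, so the else-arm is taken; normalizing both sides makes them compare equal. None of the paths need to exist for this to run.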
@kyleam thanks for the quick reply! How would you test new or existing pipelines? That is, what are the commands to execute them?
Have you tried following the demo here?
As I am trying to write a new crawler for Zenodo, I was looking for a way to test and execute an existing pipeline to observe the expected behavior. The problem is that executing

    datalad crawl --is-pipeline datalad_crawler/pipelines/<pipeline>.py

fails with a relative import error. So my question is: how do we successfully test crawling existing pipelines with the --is-pipeline flag? I tested multiple different paths and all gave me the same error:

    [ERROR ] Failed to import pipeline from datalad_crawler/pipelines/nda.py: attempted relative import with no known parent package [nda.py:<module>:13] [pipeline.py:load_pipeline_from_module:403] (RuntimeError)

I chose nda.py at random as the pipeline for testing.

Edit: It would be great if the documentation could be more in-depth.
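
The "attempted relative import with no known parent package" message can be reproduced with plain Python, independent of datalad. The sketch below builds a throwaway package in a temporary directory (demo_pkg, helper.py and demo_pipeline.py are made-up names for illustration) and imports the same file two ways: as a top-level module, which is what happens when only the file's directory is put on sys.path, and as a member of its package. Plain Python raises ImportError in the first case; the RuntimeError in the log above presumably comes from the crawler re-raising it, which is an assumption on my part.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create demo_pkg/ with a module that uses a relative import."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "helper.py").write_text("VALUE = 42\n")
    (pkg / "demo_pipeline.py").write_text("from .helper import VALUE\n")
    return pkg


def import_with_path(path_entry, import_func, name):
    """Run import_func(name) with path_entry temporarily first on sys.path."""
    sys.path.insert(0, str(path_entry))
    try:
        return import_func(name)
    finally:
        sys.path.remove(str(path_entry))


def demo():
    with tempfile.TemporaryDirectory() as root:
        pkg = build_demo_package(root)
        importlib.invalidate_caches()

        # Case 1: the module's own directory is on sys.path and the file is
        # imported as a top-level module, so it has no parent package.
        try:
            import_with_path(pkg, lambda n: __import__(n, level=0),
                             "demo_pipeline")
            top_level_error = None
        except ImportError as exc:
            top_level_error = str(exc)

        # Case 2: the package's parent directory is on sys.path and the
        # module is imported by its dotted name, so "." resolves to demo_pkg.
        mod = import_with_path(root, importlib.import_module,
                               "demo_pkg.demo_pipeline")
        return top_level_error, mod.VALUE


top_level_error, value = demo()
print(top_level_error)
print(value)
```

The first import fails because a module loaded by its bare name has an empty parent package, so there is nothing for the leading dot to refer to. The second import succeeds because the dotted name tells the import system which package the module belongs to.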