What's the shortest way to demonstrate a full-fledged Kedro pipeline? #3775
Replies: 3 comments 1 reply
-
I know this is a super quick example, but we could make the run argument even simpler by accepting, e.g., `outputs = session.run(pipeline, catalog={"source": pl.read_csv("companies.csv")})`
-
I was made aware today that so maybe we don't even need to expose the
-
This would be useful in environments like Databricks, where everything is centered around a single file. See also https://github.com/ibis-project/kedro-ibis-tutorial/blob/main/03%20-%20First%20Steps%20with%20Kedro.ipynb
-
AS A person evaluating Kedro
I WOULD LIKE TO have a quick way of trying it out
SO THAT I can form an opinion in minutes rather than hours
Inspired by https://www.tryhamilton.dev/
Consider this:
Some points:
- `Pipeline.from_dicts` proposes a `@classmethod` rather than the `pipeline` helper so it's clearer what object is being returned.
- `from_dicts` saves the user from importing the `Node` class or the `node` helper altogether.
- `KedroSession.run_pipeline` allows the user to run a `Pipeline` object directly without having to go through the registration process, `create_pipeline`, etc. Replaced with `runner.run`, see below.
- `DataCatalog.from_raw_inputs` allows the user to inject data from memory directly, instead of having to build a dummy dataset on the spot.
- `catalog=` should be part of something like `KedroSession.create_???` instead, at the moment.

The philosophical argument against this is clear: if we offer an easy way, people will misuse it (see https://en.wikipedia.org/wiki/Worse_is_better), so we'd need to build the right guardrails.
The practical argument in favour of this is clear: it lowers the barrier to entry, and hence (potentially, ideally, theoretically) could help increase adoption.
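To make the shape of the proposal concrete, here is a minimal, dependency-free sketch of how a `from_dicts` classmethod and a `from_raw_inputs` constructor could fit together. It is not Kedro code: `ToyNode`, `ToyPipeline`, `ToyCatalog`, and `run` are hypothetical stand-ins written only for illustration, and the dict keys (`func`, `inputs`, `outputs`) are an assumption about what such an API might accept. Unlike a real runner, this toy returns every produced dataset, not only the final ones.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToyNode:
    # Hypothetical stand-in for a pipeline node: a function plus dataset names.
    func: Callable[..., Any]
    inputs: list[str]
    outputs: str


@dataclass
class ToyPipeline:
    nodes: list[ToyNode]

    @classmethod
    def from_dicts(cls, specs: list[dict[str, Any]]) -> ToyPipeline:
        # Plain dicts describe the nodes, so callers never import a node class.
        return cls([ToyNode(s["func"], list(s["inputs"]), s["outputs"]) for s in specs])


@dataclass
class ToyCatalog:
    data: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_raw_inputs(cls, **raw: Any) -> ToyCatalog:
        # In-memory objects are injected directly, with no dataset wrapper.
        return cls(dict(raw))


def run(pipeline: ToyPipeline, catalog: ToyCatalog) -> dict[str, Any]:
    """Run each node once its inputs exist; return everything the nodes produced."""
    data = dict(catalog.data)
    pending = list(pipeline.nodes)
    produced: dict[str, Any] = {}
    while pending:
        ready = [n for n in pending if all(name in data for name in n.inputs)]
        if not ready:
            missing = sorted({i for n in pending for i in n.inputs if i not in data})
            raise ValueError(f"Unresolvable inputs: {missing}")
        for n in ready:
            result = n.func(*(data[name] for name in n.inputs))
            data[n.outputs] = result
            produced[n.outputs] = result
            pending.remove(n)
    return produced


def clean(rows):
    # Drop rows with no employee count.
    return [r for r in rows if r["employees"] is not None]


def total_employees(rows):
    return sum(r["employees"] for r in rows)


# Nodes are listed out of order on purpose: execution follows data dependencies.
pipeline = ToyPipeline.from_dicts([
    {"func": total_employees, "inputs": ["clean_companies"], "outputs": "headcount"},
    {"func": clean, "inputs": ["companies"], "outputs": "clean_companies"},
])
catalog = ToyCatalog.from_raw_inputs(companies=[
    {"name": "a", "employees": 10},
    {"name": "b", "employees": None},
    {"name": "c", "employees": 5},
])
outputs = run(pipeline, catalog)
print(outputs["headcount"])
```

The point of the sketch is the user-facing surface: two constructors and one `run` call, with nothing to register and no project scaffolding. Whether Kedro should expose something this small is exactly the guardrails question raised above.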
Related issues and discussions:
- `KedroSession` run #2169