Major features include:
vdk-structlog
By setting structlog_config_preset, users can choose either the LOCAL or the CLOUD configuration preset, each of which groups the recommended logging configuration for that use case. Any config options set together with the preset override the corresponding preset options.
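The override rule can be pictured as a dictionary merge in which explicitly set options win over the preset. The following is a minimal sketch of that rule only, not vdk-structlog's implementation: `PRESETS`, `effective_config`, and the option names `log_format` and `colored` are hypothetical placeholders chosen for illustration.

```python
# Hypothetical preset contents, for illustration only; the real preset
# values are defined by vdk-structlog and are not reproduced here.
PRESETS = {
    "LOCAL": {"log_format": "console", "colored": True},
    "CLOUD": {"log_format": "json", "colored": False},
}


def effective_config(preset: str, explicit: dict) -> dict:
    """Start from the chosen preset, then let explicitly set options win."""
    merged = dict(PRESETS[preset.upper()])  # copy so the preset stays untouched
    merged.update(explicit)                 # explicit options override the preset
    return merged
```

For instance, under these assumed values, `effective_config("CLOUD", {"colored": True})` keeps the preset's `log_format` and replaces only `colored`.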
Example RAG Pipeline
An example of how to build an end-to-end chatbot using VDK.
vdk-dag local execution
To make testing easier, you can now execute an entire DAG locally on your machine without deploying it.
Make sure all data job directories are on the same level, then set:
export DAGS_JOB_EXECUTOR_TYPE=local
Then run the DAG job as normal:
vdk run dag-job
Alternatively, run the DAG job from an IDE and set DAGS_JOB_EXECUTOR_TYPE=local as an environment variable in the run configuration.
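The same steps can be scripted, for example in a local test harness. This is a minimal sketch that assumes the `vdk` CLI is on the PATH and that the script runs from the folder containing all the data job directories; the helper names `local_dag_env` and `run_dag_locally` are hypothetical, and only the environment variable and the `vdk run` command come from the steps above.

```python
import os
import subprocess


def local_dag_env(base_env=None):
    """Return a copy of the environment with the local DAG executor selected."""
    env = dict(os.environ if base_env is None else base_env)
    env["DAGS_JOB_EXECUTOR_TYPE"] = "local"
    return env


def run_dag_locally(dag_job_dir, parent_dir="."):
    """Run `vdk run <dag_job_dir>` from the folder holding all job directories.

    Assumption: parent_dir contains the DAG job and every job it references,
    all on the same level.
    """
    completed = subprocess.run(
        ["vdk", "run", dag_job_dir],
        cwd=parent_dir,
        env=local_dag_env(),
        check=False,  # hand the exit code back instead of raising
    )
    return completed.returncode
```

Calling `run_dag_locally("dag-job")` is then the equivalent of the `export` followed by `vdk run dag-job`, without changing the environment of the calling shell.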
See the VDK DAG documentation for more details.
Support for Python 3.12
Added official support and testing for Python 3.12 in VDK plugins and main components.
What's Changed
- control-service: fix data job deployment by @mivanov1988 in #3110
- control-service: fix data job deployment cpu conversion by @mivanov1988 in #3109
- control-service: optimize job builder pip install by @antoniivanov in #3151
- documentation: fix monthly download badges closing HTML tag by @yonitoo in #3090
- examples: Make RAG examples a bit more generic and demoable by @antoniivanov in #3085
- examples: RAG Chat UI by @gageorgiev in #3108
- examples: RAG Question-answering Web Service by @gageorgiev in #3098
- examples: add Embed and Ingest Confluence JSON data example data job by @yonitoo in #3073
- examples: add Fetch And Embed Data Job Example by @yonitoo in #3065
- examples: add chunker job to support configurable chunking by @yonitoo in #3093
- examples: adopt RAG examples for remote execution by @antoniivanov in #3117
- examples: refactor a bit the examples by @antoniivanov in #3127
- quickstart-vdk: make sure it is released by @antoniivanov in #3113
- specs: update vector database vep with more explanation by @antoniivanov in #3054
- vdk-confluence: add data source plugin for confluence by @duyguHsnHsn in #3094
- vdk-core: Add 'method' to pre_ingest_process API by @doks5 in #3072
- vdk-core: Adopt 'method' argument in pre-process plugins by @doks5 in #3074
- vdk-core: add is_default() function to config by @DeltaMichael in #3076
- vdk-core: do not count memory properties toward the count by @antoniivanov in #3099
- vdk-core: enable/disable structlog based on config by @DeltaMichael in #3102
- vdk-core: relevant info in step result by @DeltaMichael in #3062
- vdk-core: remove structlog logging override by @DeltaMichael in #3066
- vdk-dag: add local executor by @antoniivanov in #3097
- vdk-dag: fix unnecessary authorization failure by @antoniivanov in #3096
- vdk-dag: improve error handling and error messages by @antoniivanov in #3152
- vdk-examples: example job with confluence reader by @duyguHsnHsn in #3070
- vdk-gdp-execution-id: adopt ingester changes by @dakodakov in #3120
- vdk-kerberos-auth: adopt unreleased oscrypto library by @dakodakov in #3130
- vdk-kerberos-auth: adopt unreleased oscrypto library by @dakodakov in #3131
- vdk-kerberos-auth: revert recent changes by @dakodakov in #3134
- vdk-postgres: batch inserts during ingestion by @antoniivanov in #3121
- vdk-quickstart: add vdk-structlog by @duyguHsnHsn in #2956
- vdk-server: fix ingress settings by @antoniivanov in #3101
- vdk-structlog: add default logging format values by @DeltaMichael in #3055
- vdk-structlog: put vdk init logs config behind flag by @DeltaMichael in #3107
- vdk-test-utils: Measure payload size with len, not getsizeof by @gageorgiev in #3157
- vdk-test-utils: adopt ingester changes by @dakodakov in #3119
- versatile-data-kit: Change copyright notice by @gageorgiev in #3116
- versatile-data-kit: Support for Py3.12 by @gageorgiev in #3143
- versatile-data-kit: add link to architecture to contributing.md by @antoniivanov in #3071
Full Changelog: v1.6...v1.7