Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarify 'destination' argument #262

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

smoya
Copy link
Contributor

@smoya smoya commented Nov 28, 2024

I believe we must clarify a bit what the destination field is used for and how VS target_table and view_name, the more granular options.

Open to suggestions, this can taken as the starting point of a discussion.

@smoya smoya requested a review from a team as a code owner November 28, 2024 11:19
| embedding | [Embedding configuration](#embedding-configuration) | - | ✔ | Set how to embed the data. |
| chunking | [Chunking configuration](#chunking-configuration) | - | ✔ | Set the way to split text data, using functions like `ai.chunking_character_text_splitter()`. |
| indexing | [Indexing configuration](#indexing-configuration) | `ai.indexing_default()` | ✖ | Specify how to index the embeddings. For example, `ai.indexing_diskann()` or `ai.indexing_hnsw()`. |
| formatting | [Formatting configuration](#formatting-configuration) | `ai.formatting_python_template()` | ✖ | Define the data format before embedding, using `ai.formatting_python_template()`. |
| scheduling | [Scheduling configuration](#scheduling-configuration) | `ai.scheduling_default()` | ✖ | Set how often to run the vectorizer. For example, `ai.scheduling_timescaledb()`. |
| processing | [Processing configuration](#processing-configuration ) | `ai.processing_default()` | ✖ | Configure the way to process the embeddings. |
| target_schema | name | - | ✖ | Specify the schema where the embeddings will be stored. |
| target_schema | name | - | ✖ | Specify the schema where the embeddings will be stored. This argument takes preference over *destination*. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| target_schema | name | - || Specify the schema where the embeddings will be stored. This argument takes preference over *destination*. |
| target_schema | name | - || Specify the schema where the embeddings will be stored. This argument takes precedence over *destination*. |

| target_table | name | - | ✖ | Specify name of the table where the embeddings will be stored. |
| view_schema | name | - | ✖ | Specify the schema where the view is created. |
| view_name | name | - | ✖ | Specify the name of the view to be created. |
| queue_schema | name | - | ✖ | Specify the schema where the work queue table is created.. |
| view_name | name | - | ✖ | Specify the name of the view to be created. This argument takes preference over *destination*. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| view_name | name | - || Specify the name of the view to be created. This argument takes preference over *destination*. |
| view_name | name | - || Specify the name of the view to be created. This argument takes precedence over *destination*. |

@@ -127,18 +127,18 @@ in other management functions.
| Name | Type | Default | Required | Description |
|------------------|--------------------------------------------------------|-----------------------------------|----------|----------------------------------------------------------------------------------------------------|
| source | regclass | - | ✔ | The source table that embeddings are generated for. |
| destination | name | - | ✖ | A name for the table the embeddings are stored in. |
| destination | name | - | ✖ | A name for the table the embeddings will be or are stored in. If defined, *view_name* argument takes this value, and the value of the *target_table* argument is set as `<destination>_store`. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing that we reference other arguments to explain how this works.

Suggested change
| destination | name | - || A name for the table the embeddings will be or are stored in. If defined, *view_name* argument takes this value, and the value of the *target_table* argument is set as `<destination>_store`. |
| destination | name | - || Sets the name of the embedding table and view. Use this to avoid naming conflicts when configuring additional vectorizers for a source table. The view is named `<destination>`, and the embedding table is named `<destination>_store`. |

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt we should clarify that, normally, you either define destination or target_table and view_name. How could we add this without naming those args?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point. I think that we could settle both your concern and mine by adding the following sentence to my suggestion above: "Note: the view_name and table_name arguments take precedence, if set."

This allows us to both 1. Define how destination works on its own terms, and 2. Inform the user that view_name and table_name are related to destination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants