
[Bug]: Error occurred when the grounding Tool uses a data store with chunking mode enabled #694

Open
yamazakikakuyo opened this issue May 16, 2024 · 7 comments

yamazakikakuyo commented May 16, 2024

File Name

gemini/grounding/intro-grounding-gemini.ipynb

What happened?

Hi, I have an issue when using the grounding Tool class with the GenerativeModel class. I have read the documentation on parsing and chunking for data stores in Vertex AI Agent Builder (this link). I am interested in the Layout Parser's ability to understand tables in PDF files, so I created a data store and a search-type app in Agent Builder with that parsing mode.

I noticed that when I choose the Layout Parser, chunking mode becomes mandatory, which I don't mind. However, when I use that data store ID in Gemini's grounding Tool, an error occurs, shown in the relevant log output below. I have also tried several other parsing modes, and every parsing mode with chunking enabled produces the same error.

Here are the details of the code I used, along with the data store and search app configuration.

  1. Code snippet for the grounding Tool
import vertexai
from vertexai.generative_models import GenerativeModel, Tool
from vertexai.preview import generative_models as preview_generative_models

vertexai.init(project=PROJECT_ID, location="us-central1")

vertex_search_tool = Tool.from_retrieval(
    retrieval=preview_generative_models.grounding.Retrieval(
        source=preview_generative_models.grounding.VertexAISearch(
            datastore=f"projects/{PROJECT_ID}/locations/global/collections/default_collection/dataStores/{DATASTORE_ID}"
        ),
    )
)

model = GenerativeModel(
    "gemini-1.0-pro",
    generation_config={"temperature": 0},
)
chat = model.start_chat()
response = chat.send_message(PROMPT, tools=[vertex_search_tool])
  2. Vertex AI data store settings and configuration
    a. Data store type is unstructured, with data sourced from Google Cloud Storage
    b. The files in the data store are PDFs
    c. Default document parser: Layout Parser
    d. Chunking mode was enabled automatically when I chose the Layout Parser
    e. Chunk size: 500
    f. "Include ancestor headings in chunks" disabled
    g. No file-type exceptions

  3. Vertex AI search app settings and configuration
    a. App type is search (as mentioned above)
    b. App content configuration is the generic type
    c. Enterprise edition features are enabled
    d. Advanced LLM features are enabled
    e. Only one data store is attached: the one described in point 2.

Details of the environment and libraries:

  1. Python version : 3.10.13
  2. SDK version of google-cloud-aiplatform library : 1.51.0
  3. SDK version of vertexai library : 1.49.0

I have also tried google-cloud-aiplatform versions 1.50.0 and 1.49.0 and got the same error.
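When comparing versions across environments, a quick stdlib check avoids guessing. This is a small sketch using importlib.metadata; the distribution name must match what pip installed, and unknown packages are reported rather than raising.

```python
# Print the installed version of each relevant distribution, or flag it as
# missing. Uses only the standard library (Python 3.8+).
from importlib.metadata import version, PackageNotFoundError

def installed_version(dist_name: str) -> str:
    # Look up the version recorded in the package metadata.
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "not installed"

for name in ("google-cloud-aiplatform",):
    print(name, installed_version(name))
```

Note that the `vertexai` module ships inside the google-cloud-aiplatform distribution, so its version is determined by that package.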

Is there any solution for this case, or is there an important step that I missed? I hope you can give me some solutions or insights for this problem. Thanks a lot in advance! =D

Best Regards

Relevant log output

---------------------------------------------------------------------------
_InactiveRpcError                         Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:65, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     64 try:
---> 65     return callable_(*args, **kwargs)
     66 except grpc.RpcError as exc:

File /opt/conda/lib/python3.10/site-packages/grpc/_channel.py:1176, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
   1170 (
   1171     state,
   1172     call,
   1173 ) = self._blocking(
   1174     request, timeout, metadata, credentials, wait_for_ready, compression
   1175 )
-> 1176 return _end_unary_response_blocking(state, call, False, None)

File /opt/conda/lib/python3.10/site-packages/grpc/_channel.py:1005, in _end_unary_response_blocking(state, call, with_call, deadline)
   1004 else:
-> 1005     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "`extractive_content_spec` must be not defined when the datastore is using 'chunking config'"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.251.171.95:443 {grpc_message:"`extractive_content_spec` must be not defined when the datastore is using \'chunking config\'", grpc_status:3, created_time:"2024-05-16T11:21:52.51250561+00:00"}"
>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
Cell In[8], line 1
----> 1 response = chat.send_message("Get a list of datasets", tools=[vertex_search_tool])

File /opt/conda/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:809, in ChatSession.send_message(self, content, generation_config, safety_settings, tools, stream)
    802     return self._send_message_streaming(
    803         content=content,
    804         generation_config=generation_config,
    805         safety_settings=safety_settings,
    806         tools=tools,
    807     )
    808 else:
--> 809     return self._send_message(
    810         content=content,
    811         generation_config=generation_config,
    812         safety_settings=safety_settings,
    813         tools=tools,
    814     )

File /opt/conda/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:905, in ChatSession._send_message(self, content, generation_config, safety_settings, tools)
    903 while True:
    904     request_history = self._history + history_delta
--> 905     response = self._model._generate_content(
    906         contents=request_history,
    907         generation_config=generation_config,
    908         safety_settings=safety_settings,
    909         tools=tools,
    910     )
    911     # By default we're not adding incomplete interactions to history.
    912     if self._response_validator is not None:

File /opt/conda/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:496, in _GenerativeModel._generate_content(self, contents, generation_config, safety_settings, tools, tool_config)
    471 """Generates content.
    472 
    473 Args:
   (...)
    487     A single GenerationResponse object
    488 """
    489 request = self._prepare_request(
    490     contents=contents,
    491     generation_config=generation_config,
   (...)
    494     tool_config=tool_config,
    495 )
--> 496 gapic_response = self._prediction_client.generate_content(request=request)
    497 return self._parse_response(gapic_response)

File /opt/conda/lib/python3.10/site-packages/google/cloud/aiplatform_v1beta1/services/prediction_service/client.py:2103, in PredictionServiceClient.generate_content(self, request, model, contents, retry, timeout, metadata)
   2100 self._validate_universe_domain()
   2102 # Send the request.
-> 2103 response = rpc(
   2104     request,
   2105     retry=retry,
   2106     timeout=timeout,
   2107     metadata=metadata,
   2108 )
   2110 # Done; return the response.
   2111 return response

File /opt/conda/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py:113, in _GapicCallable.__call__(self, timeout, retry, *args, **kwargs)
    110     metadata.extend(self._metadata)
    111     kwargs["metadata"] = metadata
--> 113 return wrapped_func(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:67, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     65     return callable_(*args, **kwargs)
     66 except grpc.RpcError as exc:
---> 67     raise exceptions.from_grpc_error(exc) from exc

InvalidArgument: 400 `extractive_content_spec` must be not defined when the datastore is using 'chunking config'
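To make the 400 above concrete: the backend appears to enforce a mutual-exclusion rule between extractive answers and chunk-based retrieval. The sketch below is purely illustrative (it is not the actual server code) and only mirrors the constraint stated in the error message.

```python
# Illustrative sketch of the mutual-exclusion rule behind the 400 above.
# NOT the real backend code; it only mirrors the documented constraint that a
# request against a chunked data store must leave extractive_content_spec unset.
def validate_content_search_spec(datastore_uses_chunking, extractive_content_spec):
    if datastore_uses_chunking and extractive_content_spec is not None:
        raise ValueError(
            "`extractive_content_spec` must be not defined when the datastore "
            "is using 'chunking config'"
        )

# A chunked store with no extractive spec passes the check.
validate_content_search_spec(True, None)
```

Because the vertexai grounding path sends an extractive content spec on the request, every chunked data store hits this check regardless of which parser produced the chunks, which matches the reports in this thread.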

Code of Conduct

  • I agree to follow this project's Code of Conduct
@federicoarenasl

I'm having the same issue. I can confirm that this happens when Chunking mode is enabled, regardless of the type of parsing.

m3nux commented May 27, 2024

Same issue for me

@krishchyt

same issue for me

@giovanniagazzi

I encountered the same issue.
The only documentation I found about the extractive_content_spec parameter is this: Get snippets and extracted content

@holtskinner
Collaborator

I believe that grounding isn't currently supported for Vertex AI Search data stores that use chunking mode.
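Until grounding supports chunked data stores, one possible workaround is to query the data store directly with the Discovery Engine SDK and pass the returned chunks to Gemini as context yourself. This is a sketch under assumptions: the google-cloud-discoveryengine package is installed, and chunk result mode is available in your SDK version (on older releases it may only exist in discoveryengine_v1alpha). PROJECT_ID and DATASTORE_ID are placeholders matching the snippet in the issue.

```python
# Hypothetical workaround sketch: retrieve chunks directly from the data store
# and feed them to Gemini as context, bypassing the grounding Tool entirely.
# The third-party import is deferred so this module loads even when the
# google-cloud-discoveryengine package is absent.

def serving_config_path(project_id, datastore_id):
    # Resource name of the default serving config for a global data store.
    return (
        f"projects/{project_id}/locations/global/collections/default_collection"
        f"/dataStores/{datastore_id}/servingConfigs/default_search"
    )

def search_chunks(project_id, datastore_id, query, page_size=5):
    # Deferred third-party import (assumption: package is installed).
    from google.cloud import discoveryengine_v1 as discoveryengine

    client = discoveryengine.SearchServiceClient()
    request = discoveryengine.SearchRequest(
        serving_config=serving_config_path(project_id, datastore_id),
        query=query,
        page_size=page_size,
        # Request chunk results and leave extractive_content_spec unset, since
        # chunked data stores reject it (the error in this issue).
        content_search_spec=discoveryengine.SearchRequest.ContentSearchSpec(
            search_result_mode=(
                discoveryengine.SearchRequest.ContentSearchSpec.SearchResultMode.CHUNKS
            )
        ),
    )
    return client.search(request=request)
```

The chunk texts from the response can then be concatenated into the prompt sent to `GenerativeModel.generate_content`, which sidesteps the grounding Tool's extractive content spec.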

@holtskinner holtskinner self-assigned this Aug 6, 2024
@yoosungung

Same issue for me

sarakodeiri commented Oct 23, 2024


8 participants