
Enhance Logging for predict_real_time API in Cloud Predictors #106

Open
tonyhoo opened this issue Apr 3, 2024 · 0 comments

Description

The predict_real_time API currently provides minimal logging information, especially in failure scenarios when using SageMaker endpoints. This lack of detailed logging makes it difficult to diagnose issues or understand why an inference request failed. The logs available from the cloud endpoint (e.g., SageMaker) also offer little information beyond HTTP error codes. Improved logging is needed to give users better visibility into, and debugging capabilities for, their cloud predictors.

Expected Behavior

When an inference request fails, detailed error messages or logs should be provided to the user, including but not limited to:

  • The specific reason for the failure (e.g., model loading issues, data serialization/deserialization problems, etc.).
  • Relevant HTTP error codes along with their descriptions.
  • Suggestions or references for troubleshooting common issues.
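As a concrete illustration of the fields listed above, here is a hypothetical enriched error type. All names here (the class, its parameters) are illustrative only and are not part of the current AutoGluon-Cloud API:

```python
class RealTimePredictionError(RuntimeError):
    """Hypothetical enriched error for failed real-time inference requests.

    Carries the HTTP status code, a human-readable failure reason, and a
    troubleshooting hint alongside the original error message, so the user
    sees more than a bare status code.
    """

    def __init__(self, message, status_code=None, reason=None, hint=None):
        self.status_code = status_code
        self.reason = reason
        self.hint = hint
        details = [message]
        if status_code is not None:
            details.append(f"HTTP {status_code}")
        if reason is not None:
            details.append(f"reason: {reason}")
        if hint is not None:
            details.append(f"hint: {hint}")
        super().__init__(" | ".join(details))


# Example: wrap a bare HTTP failure into an actionable message.
err = RealTimePredictionError(
    "predict_real_time failed",
    status_code=415,
    reason="endpoint rejected the request payload",
    hint="check that the input matches the content type the endpoint expects",
)
print(err)
```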

Actual Behavior

Currently, the predict_real_time API outputs minimal information in both stdout/stderr and the cloud endpoint logs, mainly limited to HTTP status codes without detailed explanations or context. This minimal feedback loop hinders effective troubleshooting and root cause analysis.

Steps to Reproduce

  1. Set up a cloud predictor using AutoGluon-Cloud with a SageMaker endpoint.
  2. Attempt to make an inference request using the predict_real_time API with a setup that is known to fail (e.g., incorrect input format).
  3. Observe the lack of detailed logging information in the event of a failure.

Possible Solution

Implement enhanced logging within the predict_real_time API to capture and relay detailed error messages and diagnostic information from the underlying cloud service (e.g., SageMaker). This could include:

  • Catching exceptions at the API level and enriching them with additional context before re-throwing or logging.
  • Enabling configurable log levels for the API, allowing users to opt-in for more verbose logging based on their debugging needs.
  • Working closely with cloud service providers to ensure that more detailed error information is made available and propagated through the AutoGluon-Cloud interface.
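The first two points above (catch-and-enrich, plus opt-in verbosity) can be sketched as follows. This is a minimal sketch, not AutoGluon-Cloud code: `invoke` stands in for the actual endpoint call (e.g. a SageMaker invocation), and the logger name is illustrative:

```python
import logging

logger = logging.getLogger("cloud_predictor")  # illustrative logger name


def predict_real_time_with_logging(invoke, payload, verbose=False):
    """Catch-and-enrich pattern for a real-time inference call.

    `invoke` is a placeholder for the underlying endpoint invocation;
    `verbose=True` opts the user in to debug-level logging.
    """
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    try:
        logger.debug("Sending payload of %d bytes", len(payload))
        return invoke(payload)
    except Exception as exc:
        # Enrich the bare exception with context before re-raising,
        # preserving the original error as the cause.
        logger.error("Inference request failed: %s", exc)
        raise RuntimeError(
            f"predict_real_time failed while invoking the endpoint: {exc}. "
            "Verify the input format and check the endpoint logs for details."
        ) from exc


# Usage: a simulated failing invocation surfaces an actionable message
# instead of a bare status code.
def failing_invoke(payload):
    raise ConnectionError("415 Unsupported Media Type")


try:
    predict_real_time_with_logging(failing_invoke, b"bad,csv", verbose=True)
except RuntimeError as e:
    print(e)
```

Chaining with `raise ... from exc` keeps the original low-level error visible in the traceback while the top-level message carries the added context.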

Additional Context

Enhancing the logging detail for cloud predictors not only improves the user experience by providing clear insights into the operational aspects but also significantly reduces the time spent on troubleshooting and support.
