Description
The predict_real_time API currently provides minimal logging information, especially in failure scenarios when using SageMaker endpoints. This lack of detailed logging makes it difficult to diagnose issues or understand why inference requests failed. Additionally, the logs available from the cloud endpoint (e.g., SageMaker) offer little information beyond HTTP error codes. Improved logging mechanisms are needed to give users better visibility into, and debugging capabilities for, their cloud predictors.
Expected Behavior
When an inference request fails, detailed error messages or logs should be provided to the user, including but not limited to:
The specific reason for the failure (e.g., model loading issues, data serialization/deserialization problems).
Relevant HTTP error codes along with their descriptions.
Suggestions or references for troubleshooting common issues.
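As a concrete illustration of the second and third points, here is a minimal sketch of pairing HTTP status codes with human-readable descriptions and troubleshooting hints, using only the standard library. The helper name `describe_http_error` and the specific hints are hypothetical, not part of AutoGluon-Cloud:

```python
from http.client import responses


def describe_http_error(status_code: int) -> str:
    """Hypothetical helper: turn a bare HTTP status code into a message
    with a reason phrase and a troubleshooting hint."""
    reason = responses.get(status_code, "Unknown status")
    # Illustrative hints for codes commonly seen from inference endpoints.
    hints = {
        400: "Check that the input payload matches the expected format.",
        413: "The request payload may exceed the endpoint's size limit.",
        500: "Inspect the endpoint's CloudWatch logs for a model-side traceback.",
    }
    hint = hints.get(status_code, "See the endpoint logs for details.")
    return f"HTTP {status_code} ({reason}): {hint}"
```

A message like `describe_http_error(400)` would then tell the user both what the code means and where to look next, instead of surfacing only the number.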
Actual Behavior
Currently, the predict_real_time API outputs minimal information in both stdout/stderr and the cloud endpoint logs, mainly limited to HTTP status codes without detailed explanations or context. This minimal feedback loop hinders effective troubleshooting and root cause analysis.
Steps to Reproduce
Set up a cloud predictor using AutoGluon-Cloud with a SageMaker endpoint.
Attempt to make an inference request using the predict_real_time API with a setup that is known to fail (e.g., incorrect input format).
Observe the lack of detailed logging information in the event of a failure.
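The failure mode above can be simulated without a live endpoint. The stub below mimics the current behavior, where the caller sees only a bare status code with no context; all names here are illustrative stand-ins, not the real AutoGluon-Cloud internals:

```python
class EndpointError(Exception):
    """Stand-in for the opaque error the caller currently receives."""


def predict_real_time_stub(payload):
    """Illustrative stub of an inference call that rejects malformed
    input with nothing beyond a status code, mirroring the minimal
    feedback described in this issue."""
    if not isinstance(payload, dict):
        # Today's behavior: no indication of *why* the request failed.
        raise EndpointError("Received client error (400)")
    return {"prediction": 1}
```

Calling `predict_real_time_stub("not-a-dict")` raises an error whose message contains only the status code, leaving the user to guess whether the cause was input format, serialization, or a model-side problem.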
Possible Solution
Implement enhanced logging within the predict_real_time API to capture and relay detailed error messages and diagnostic information from the underlying cloud service (e.g., SageMaker). This could include:
Catching exceptions at the API level and enriching them with additional context before re-throwing or logging.
Enabling configurable log levels for the API, allowing users to opt-in for more verbose logging based on their debugging needs.
Working closely with cloud service providers to ensure that more detailed error information is made available and propagated through the AutoGluon-Cloud interface.
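The first two ideas can be roughed out with the standard library alone: a decorator that catches failures, logs the raw diagnostics, and re-raises with added context, plus an opt-in verbosity switch. The logger name and function names are assumptions for illustration, not proposed API:

```python
import logging

# Logger name is illustrative; AutoGluon-Cloud may use a different hierarchy.
logger = logging.getLogger("autogluon.cloud.predict")


def with_enriched_errors(func):
    """Decorator sketch: catch failures, log the raw error at DEBUG,
    and re-raise with actionable context instead of a bare code."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            logger.debug("Raw failure from %s: %r", func.__name__, exc)
            raise RuntimeError(
                f"{func.__name__} failed: {exc}. "
                "Check the input format and the endpoint's CloudWatch logs; "
                "enable DEBUG logging for the raw error."
            ) from exc
    return wrapper


def set_verbosity(level: int) -> None:
    """Opt-in verbosity: users who need more detail lower the threshold,
    e.g. set_verbosity(logging.DEBUG)."""
    logger.setLevel(level)
```

Chaining with `raise ... from exc` preserves the original exception for users who want the full traceback, while the enriched message gives everyone else an immediate starting point.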
Additional Context
Enhancing the logging detail for cloud predictors not only improves the user experience by providing clear insights into the operational aspects but also significantly reduces the time spent on troubleshooting and support.