Add the pipeline for the explanation task and LLM #2190

Open

wants to merge 50 commits into base: main

Changes from all commits (50 commits)
adbca17
Add Task EXPLANATION and the visualization of images with description.
Bepitic Jul 15, 2024
5611ec1
upd dataset task with explanation
Bepitic Jul 15, 2024
8ed23a3
fix tasktype on metrics, depth, cataset, inferencer.
Bepitic Jul 15, 2024
a463b5b
Merge branch 'main' into llm-pipeline
Bepitic Jul 15, 2024
d5baf6b
fix lint on visualization/image
Bepitic Jul 16, 2024
b7c8eaa
Merge branch 'openvinotoolkit:main' into llm-pipeline
Bepitic Jul 18, 2024
5b563d9
Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
Bepitic Jul 18, 2024
bfd936e
Fix formatting dataset
Bepitic Jul 18, 2024
f541316
fix format data/base/depth
Bepitic Jul 18, 2024
4e392a9
Fix formatting openvino_inferencer
Bepitic Jul 18, 2024
5fc70ba
fix formatting
Bepitic Jul 18, 2024
75099af
Add Explanation to error-msg.
Bepitic Aug 2, 2024
e5040d3
OpenAI - VLM init
Bepitic Aug 3, 2024
86ad803
Add wrapper to run OpenAI
Bepitic Aug 4, 2024
3678f72
add in ppyproject
Bepitic Aug 4, 2024
7413842
Add Test and fix description/title
Bepitic Aug 12, 2024
dc42cbd
Add Readme and fix bug.
Bepitic Aug 13, 2024
5788d22
Update src/anomalib/models/image/openai_vlm/lightning_model.py
Bepitic Aug 13, 2024
e4f6bec
Update src/anomalib/models/image/openai_vlm/__init__.py
Bepitic Aug 13, 2024
5437467
Add fix pipeline bug.
Bepitic Aug 13, 2024
982c9ca
Add test.
Bepitic Aug 13, 2024
642fd26
Merge branch 'OpenAI-VLM' of github.com:Bepitic/anomalib into OpenAI-VLM
Bepitic Aug 13, 2024
b8cacf0
add changes
Bepitic Aug 16, 2024
0929dc9
Add integration test and unit test + skip export.
Bepitic Aug 16, 2024
39cf996
change to LANGUAGE
Bepitic Aug 16, 2024
671693d
Update images in Readme.
Bepitic Aug 17, 2024
224118b
Update src/anomalib/models/image/chatgpt_vision/__init__.py
Bepitic Aug 20, 2024
b703a41
Update src/anomalib/models/image/chatgpt_vision/chatgpt.py
Bepitic Aug 20, 2024
24c5486
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
68e757e
Update tests/integration/model/test_models.py
Bepitic Aug 20, 2024
86714a1
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
196d2a3
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
b7f345a
fix comments
Bepitic Aug 20, 2024
b285d10
remove last file of chatgpt_vision.
Bepitic Aug 20, 2024
a688530
fix tests
Bepitic Aug 20, 2024
0fb5f79
Merge pull request #1 from Bepitic/OpenAI-VLM (GPTVad)
Bepitic Aug 20, 2024
6503543
Merge branch 'main' into llm-pipeline
Bepitic Aug 20, 2024
8e92e5e
Update src/anomalib/models/image/gptvad/chatgpt.py
Bepitic Aug 21, 2024
5ab044d
upd: language -> VISUAL_PROMPTING
Bepitic Aug 21, 2024
3f9ca93
fix visual prompting and model_name
Bepitic Aug 21, 2024
391b4c4
fix GPT for Gpt and the folder of the tests.
Bepitic Aug 21, 2024
ca1a0bb
fix: change import error outside.
Bepitic Aug 21, 2024
022dcb7
fix readme pointing to the right model.
Bepitic Aug 21, 2024
af7b9e9
fix import cycle, and separate usecase by explicit if.
Bepitic Aug 21, 2024
faf334f
upd: add comments to the few shot / zero shot.
Bepitic Aug 21, 2024
3ed8d3f
fix: dataset expected colums
Bepitic Aug 21, 2024
7f454c4
upd: add the same logic of the label on visualize_full.
Bepitic Aug 22, 2024
45bd520
Merge branch 'main' into llm-pipeline
Bepitic Aug 22, 2024
44586d6
Fix in the logic of the code.
Bepitic Aug 22, 2024
7adb835
Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
Bepitic Aug 22, 2024
Binary file added docs/source/images/gptvad/broken.png
Binary file added docs/source/images/gptvad/good.png
1 change: 1 addition & 0 deletions pyproject.toml
@@ -50,6 +50,7 @@ core = [
     "lightning>=2.2",
     "torch>=2",
     "torchmetrics>=1.3.2",
+    "openai>=1.38.0",
     # NOTE: open-clip-torch throws the following error on v2.26.1
     # torch.onnx.errors.UnsupportedOperatorError: Exporting the operator
     # 'aten::_native_multi_head_attention' to ONNX opset version 14 is not supported
1 change: 1 addition & 0 deletions src/anomalib/__init__.py
@@ -22,3 +22,4 @@ class TaskType(str, Enum):
     CLASSIFICATION = "classification"
     DETECTION = "detection"
     SEGMENTATION = "segmentation"
+    VISUAL_PROMPTING = "visual prompting"
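Since TaskType subclasses str, the new member compares equal to its raw string value and can be constructed from it, for example when a task is read from a config file. A minimal illustrative sketch, not part of the diff:

from anomalib import TaskType

# TaskType is a str enum, so members compare equal to their raw string values.
assert TaskType.VISUAL_PROMPTING == "visual prompting"

# The member can also be built from the raw string, e.g. when parsing a config.
task = TaskType("visual prompting")
assert task is TaskType.VISUAL_PROMPTING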
4 changes: 2 additions & 2 deletions src/anomalib/callbacks/metrics.py
@@ -75,10 +75,10 @@ def setup(
         pixel_metric_names: list[str] | dict[str, dict[str, Any]]
         if self.pixel_metric_names is None:
             pixel_metric_names = []
-        elif self.task == TaskType.CLASSIFICATION:
+        elif self.task in (TaskType.CLASSIFICATION, TaskType.VISUAL_PROMPTING):
             pixel_metric_names = []
             logger.warning(
-                "Cannot perform pixel-level evaluation when task type is classification. "
+                "Cannot perform pixel-level evaluation when task type is classification or visual prompting. "
                 "Ignoring the following pixel-level metrics: %s",
                 self.pixel_metric_names,
             )
4 changes: 3 additions & 1 deletion src/anomalib/data/base/dataset.py
@@ -20,9 +20,11 @@
 from anomalib.data.utils import LabelName, masks_to_boxes, read_image, read_mask

 _EXPECTED_COLUMNS_CLASSIFICATION = ["image_path", "split"]
+_EXPECTED_COLUMNS_VISUAL_PROMPTING = ["image_path", "split"]
 _EXPECTED_COLUMNS_SEGMENTATION = [*_EXPECTED_COLUMNS_CLASSIFICATION, "mask_path"]
 _EXPECTED_COLUMNS_PERTASK = {
     "classification": _EXPECTED_COLUMNS_CLASSIFICATION,
+    "visual prompting": _EXPECTED_COLUMNS_VISUAL_PROMPTING,
     "segmentation": _EXPECTED_COLUMNS_SEGMENTATION,
     "detection": _EXPECTED_COLUMNS_SEGMENTATION,
 }

@@ -169,7 +171,7 @@ def __getitem__(self, index: int) -> dict[str, str | torch.Tensor]:
         image = read_image(image_path, as_tensor=True)
         item = {"image_path": image_path, "label": label_index}

-        if self.task == TaskType.CLASSIFICATION:
+        if self.task in (TaskType.CLASSIFICATION, TaskType.VISUAL_PROMPTING):
             item["image"] = self.transform(image) if self.transform else image
         elif self.task in (TaskType.DETECTION, TaskType.SEGMENTATION):
             # Only Anomalous (1) images have masks in anomaly datasets
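The new task reuses the classification column contract: a samples DataFrame needs only "image_path" and "split". A hypothetical illustration of a frame that would pass this validation (the paths are made up for the example):

import pandas as pd

# Hypothetical samples frame; only "image_path" and "split" are required
# for the "visual prompting" task, mirroring the classification contract.
samples = pd.DataFrame(
    {
        "image_path": ["datasets/bottle/good/000.png", "datasets/bottle/broken/001.png"],
        "split": ["train", "test"],
        "label_index": [0, 1],  # extra columns are allowed
    },
)

expected_columns = ["image_path", "split"]
assert all(column in samples.columns for column in expected_columns)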
2 changes: 1 addition & 1 deletion src/anomalib/data/base/depth.py
@@ -48,7 +48,7 @@ def __getitem__(self, index: int) -> dict[str, str | torch.Tensor]:
         depth_image = to_tensor(read_depth_image(depth_path))
         item = {"image_path": image_path, "depth_path": depth_path, "label": label_index}

-        if self.task == TaskType.CLASSIFICATION:
+        if self.task in (TaskType.CLASSIFICATION, TaskType.VISUAL_PROMPTING):
             item["image"], item["depth_image"] = (
                 self.transform(image, depth_image) if self.transform else (image, depth_image)
             )
2 changes: 1 addition & 1 deletion src/anomalib/deploy/inferencers/openvino_inferencer.py
@@ -277,7 +277,7 @@ def post_process(self, predictions: np.ndarray, metadata: dict | DictConfig | No
         pred_idx = pred_score >= metadata["image_threshold"]
         pred_label = LabelName.ABNORMAL if pred_idx else LabelName.NORMAL

-        if task == TaskType.CLASSIFICATION:
+        if task in (TaskType.CLASSIFICATION, TaskType.VISUAL_PROMPTING):
             _, pred_score = self._normalize(pred_scores=pred_score, metadata=metadata)
         elif task in (TaskType.SEGMENTATION, TaskType.DETECTION):
             if "pixel_threshold" in metadata:
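In deployment, the new task type follows the classification post-processing path: only the image-level score is normalized, with no pixel-level thresholding. A hypothetical inference sketch, assuming a previously exported OpenVINO model; the artifact and image paths are made up:

from anomalib.deploy import OpenVINOInferencer

# Hypothetical artifact paths from a previous export; metadata records the task.
inferencer = OpenVINOInferencer(path="weights/model.xml", metadata="weights/metadata.json")
prediction = inferencer.predict(image="examples/broken.png")
print(prediction.pred_score, prediction.pred_label)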
2 changes: 2 additions & 0 deletions src/anomalib/models/__init__.py
@@ -24,6 +24,7 @@
     Fastflow,
     Fre,
     Ganomaly,
+    GptVad,
     Padim,
     Patchcore,
     ReverseDistillation,

@@ -51,6 +52,7 @@ class UnknownModelError(ModuleNotFoundError):
     "Fastflow",
     "Fre",
     "Ganomaly",
+    "GptVad",
     "Padim",
     "Patchcore",
     "ReverseDistillation",
2 changes: 2 additions & 0 deletions src/anomalib/models/image/__init__.py
@@ -14,6 +14,7 @@
 from .fastflow import Fastflow
 from .fre import Fre
 from .ganomaly import Ganomaly
+from .gptvad import GptVad
 from .padim import Padim
 from .patchcore import Patchcore
 from .reverse_distillation import ReverseDistillation

@@ -34,6 +35,7 @@
     "Fastflow",
     "Fre",
     "Ganomaly",
+    "GptVad",
     "Padim",
     "Patchcore",
     "ReverseDistillation",
7 changes: 7 additions & 0 deletions src/anomalib/models/image/gptvad/__init__.py
@@ -0,0 +1,7 @@
"""Generative Pre-Trained Transformer (GPT) based Large Language Model (LLM)."""
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from .lightning_model import GptVad

__all__ = ["GptVad"]
169 changes: 169 additions & 0 deletions src/anomalib/models/image/gptvad/chatgpt.py
@@ -0,0 +1,169 @@
"""Wrapper for the OpenAI calls to the VLM model."""
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import logging
import os
from typing import Any

import openai


class APIKeyError(Exception):
"""APIKeyError error."""


class GPTWrapper:
"""A wrapper class for making API calls to OpenAI's GPT-4 model to detect anomalies in images.

Environment variable OPENAI_API_KEY (str): API key for OpenAI.
https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key
Other possible models: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
All models with vision capabilities: 'gpt-4-turbo-2024-04-09', 'gpt-4-turbo',
all versions of 'gpt-4o-mini', and 'gpt-4o'

Args:
images (list[str]): List of base64 images. If only one image is provided,
it is treated as the anomalous image. If multiple images are provided,
the last one is considered anomalous, and the rest are treated as normal examples.
model_name (str): Model name for OpenAI API VLM. Default "gpt-4o"
detail (bool): If the images will be sended with high detail or low detail.

"""

def __init__(self, model_name: str = "gpt-4o", detail: bool = True) -> None:
openai_key = os.getenv("OPENAI_API_KEY")
self.model_name = model_name
self.detail = detail
if not openai_key:
msg = "OpenAI environment key not found.(OPENAI_API_KEY)"
raise APIKeyError(msg)

def api_call(
self,
images: list[str],
extension: str = "png",
) -> str:
"""Makes an API call to OpenAI's GPT-4 model to detect anomalies in an image.

Args:
images (list[str]): List of base64 images. If only one image is provided,
it is treated as the anomalous image. If multiple images are provided,
the last one is considered anomalous, and the rest are treated as normal examples.
extension (str): Extension of the group of images that needs to be checked for anomalies. Default = 'png'

Returns:
str: The response from the GPT-4 model indicating whether the image has anomalies or not.
It returns 'NO' if there are no anomalies and 'YES: description' if there are anomalies,
where 'description' provides details of the anomaly and its position.

Raises:
openai.error.OpenAIError: If there is an error during the API call.
"""
prompt: str = ""

detail_img = "high" if self.detail else "low"
messages: list[dict[str, Any]] = []

if len(images) > 0:
# If multiple images are provided, the last one is considered anomalous,
# and the rest are treated as normal examples.
prompt = """
You will receive a group of images that are going to be an example
of the typical image without any anomaly,
and the last image that you need to decide if it has an anomaly or not.
Answer with a 'NO' if it does not have any anomalies and 'YES: description'
where description is a description of the anomaly provided, position.
"""

messages.append(
{
"role": "system",
"content": prompt,
},
)
for image in images:
image_message = [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": f"data:image/{extension};base64,{image}",
"detail": detail_img,
},
},
],
},
]
messages.extend(image_message)

elif len(images) == 1:
# If only one image is provided,
# it is treated as the anomalous image.
prompt = """
Examine the provided image carefully to determine if there is an obvious anomaly present.
Anomalies may include mechanical malfunctions, unexpected objects, safety hazards, structural damages,
or unusual patterns or defects in the objects.

Instructions:

1. Thoroughly inspect the image for any irregularities or deviations from normal operating conditions.

2. Clearly state if an obvious anomaly is detected.
- If an anomaly is detected, begin with 'YES,' followed by a detailed description of the anomaly.
- If no anomaly is detected, simply state 'NO' and end the analysis.

Example Output Structure:

'YES:
- Description: Conveyor belt misalignment causing potential blockages.
This may result in production delays and equipment damage.
Immediate realignment and inspection are recommended.'

'NO'

Considerations:

- Ensure accuracy in identifying anomalies to prevent overlooking critical issues.
- Provide clear and concise descriptions for any detected anomalies.
- Focus on obvious anomalies that could impact final use of the object operation or safety.
"""
messages.append(
{
"role": "system",
"content": prompt,
},
)
# Add the single image
messages.append(
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": f"data:image/{extension};base64,{images[0]}",
"detail": detail_img,
},
},
],
},
)
else:
msg = "No images provided for anomaly detection."
raise ValueError(msg)

try:
# Make the API call using the openai library
response = openai.chat.completions.create(
model=self.model_name,
messages=messages,
max_tokens=300,
)
return response.choices[-1].message.content or ""
except Exception:
msg = "Error generating a response with OpenAI API."
logging.exception(msg)
raise
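For reference, a minimal usage sketch of the wrapper above. The image paths are hypothetical, and OPENAI_API_KEY must be set in the environment:

import base64
from pathlib import Path

from anomalib.models.image.gptvad.chatgpt import GPTWrapper

def encode(path: str) -> str:
    """Base64-encode an image file for the OpenAI vision API."""
    return base64.b64encode(Path(path).read_bytes()).decode("utf-8")

wrapper = GPTWrapper(model_name="gpt-4o", detail=False)

# Zero-shot: a single image is treated as the image to inspect.
print(wrapper.api_call([encode("examples/broken.png")]))

# Few-shot: earlier images are normal references; the last one is inspected.
print(wrapper.api_call([encode("examples/good.png"), encode("examples/broken.png")]))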