[Docs] Inferencer docs (#1744)
* [Enhancement] Support batch visualization & dumping in Inferencer

* fix empty det output

* Update mmocr/apis/inferencers/base_mmocr_inferencer.py

Co-authored-by: liukuikun <[email protected]>

* [Docs] Inferencer docs

* fix

* Support weight_list

* add req

* improve md

* inferencers.md

* update

* add tab

* refine

* polish

* add cn docs

* js

* js

* js

* fix ch docs

* translate

* translate

* finish

* fix

* fix

* fix

* update

* standard inferencer

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* en

* update

* update

* update

* update

* fix

* apply sugg

---------

Co-authored-by: liukuikun <[email protected]>
gaotongxiao and Harold-lkk authored Mar 7, 2023
1 parent cc78866 commit 33cbc9b
Showing 28 changed files with 1,554 additions and 394 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -67,6 +67,7 @@ instance/
# Sphinx documentation
docs/en/_build/
docs/zh_cn/_build/
+docs/*/api/generated/

# PyBuilder
target/
2 changes: 1 addition & 1 deletion configs/kie/sdmgr/metafile.yml
@@ -48,5 +48,5 @@ Models:
Metrics:
macro_f1: 0.931
micro_f1: 0.940
-edgee_micro_f1: 0.792
+edge_micro_f1: 0.792
Weights: https://download.openmmlab.com/mmocr/kie/sdmgr/sdmgr_novisual_60e_wildreceipt-openset/sdmgr_novisual_60e_wildreceipt-openset_20220831_200807-dedf15ec.pth
12 changes: 0 additions & 12 deletions configs/textdet/drrg/metafile.yml
@@ -26,15 +26,3 @@ Models:
Metrics:
hmean-iou: 0.8467
Weights: https://download.openmmlab.com/mmocr/textdet/drrg/drrg_resnet50_fpn-unet_1200e_ctw1500/drrg_resnet50_fpn-unet_1200e_ctw1500_20220827_105233-d5c702dd.pth

-- Name: drrg_resnet50-oclip_fpn-unet_1200e_ctw1500
-In Collection: DRRG
-Config: configs/textdet/drrg/drrg_resnet50-oclip_fpn-unet_1200e_ctw1500.py
-Metadata:
-Training Data: CTW1500
-Results:
-- Task: Text Detection
-Dataset: CTW1500
-Metrics:
-hmean-iou:
-Weights:
8 changes: 4 additions & 4 deletions configs/textdet/maskrcnn/metafile.yml
@@ -26,7 +26,7 @@ Models:
- Task: Text Detection
Dataset: CTW1500
Metrics:
-hmean: 0.7458
+hmean-iou: 0.7458
Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_ctw1500/mask-rcnn_resnet50_fpn_160e_ctw1500_20220826_154755-ce68ee8e.pth

- Name: mask-rcnn_resnet50-oclip_fpn_160e_ctw1500
@@ -38,7 +38,7 @@
- Task: Text Detection
Dataset: CTW1500
Metrics:
-hmean: 0.7562
+hmean-iou: 0.7562
Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500_20221101_154448-6e9e991c.pth

- Name: mask-rcnn_resnet50_fpn_160e_icdar2015
@@ -51,7 +51,7 @@
- Task: Text Detection
Dataset: ICDAR2015
Metrics:
-hmean: 0.8182
+hmean-iou: 0.8182
Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015/mask-rcnn_resnet50_fpn_160e_icdar2015_20220826_154808-ff5c30bf.pth

- Name: mask-rcnn_resnet50-oclip_fpn_160e_icdar2015
@@ -64,5 +64,5 @@
- Task: Text Detection
Dataset: ICDAR2015
Metrics:
-hmean: 0.8513
+hmean-iou: 0.8513
Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015_20221101_131357-a19f7802.pth
2 changes: 1 addition & 1 deletion configs/textrecog/master/README.md
@@ -45,7 +45,7 @@ Attention-based scene text recognizers have gained huge success, which leverages

```bibtex
@article{Lu2021MASTER,
-title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
+title={MASTER: Multi-Aspect Non-local Network for Scene Text Recognition},
author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
journal={Pattern Recognition},
year={2021}
31 changes: 31 additions & 0 deletions docs/en/_static/js/table.js
@@ -0,0 +1,31 @@
$(document).ready(function () {
table = $('.model-summary').DataTable({
"stateSave": false,
"lengthChange": false,
"pageLength": 10,
"order": [],
"scrollX": true,
"columnDefs": [
{ "type": "summary", targets: '_all' },
]
});
// Override the default sorting for the summary columns, which
// never takes the "-" character into account.
jQuery.extend(jQuery.fn.dataTableExt.oSort, {
"summary-asc": function (str1, str2) {
if (str1 == "<p>-</p>")
return 1;
if (str2 == "<p>-</p>")
return -1;
return ((str1 < str2) ? -1 : ((str1 > str2) ? 1 : 0));
},

"summary-desc": function (str1, str2) {
if (str1 == "<p>-</p>")
return 1;
if (str2 == "<p>-</p>")
return -1;
return ((str1 < str2) ? 1 : ((str1 > str2) ? -1 : 0));
}
});
})
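The intent of the comparator above — placeholder `-` cells always sink to the bottom of the table, whichever direction the column is sorted — can be re-expressed outside the DataTables context (a plain-Python sketch for illustration; the cell strings and function name are hypothetical, not part of the committed code):

```python
from functools import cmp_to_key

PLACEHOLDER = "<p>-</p>"  # how an empty metric cell is rendered in the summary table

def summary_asc(a, b):
    # Mirror the custom "summary-asc" comparator: "-" cells always lose,
    # so they end up last regardless of lexicographic rank.
    if a == PLACEHOLDER:
        return 1
    if b == PLACEHOLDER:
        return -1
    return (a > b) - (a < b)

cells = ["<p>0.85</p>", "<p>-</p>", "<p>0.79</p>"]
print(sorted(cells, key=cmp_to_key(summary_asc)))
# ['<p>0.79</p>', '<p>0.85</p>', '<p>-</p>']
```

The descending variant simply flips the comparison for non-placeholder cells while keeping the placeholder rule, which is why both branches in the JS return the same values for `-`.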
6 changes: 3 additions & 3 deletions docs/en/basic_concepts/structures.md
@@ -40,7 +40,7 @@ The conventions for the fields in `InstanceData` in MMOCR are shown in the table
| | | |
| ----------- | ---------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Field | Type | Description |
-| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
| labels | `torch.LongTensor` | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
| polygons | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
| scores | `torch.Tensor` | Confidence scores of the predictions of bounding boxes. `(N, )`. |
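For clarity, the corrected `[x1, y1, x2, y2]` ordering stores each box as its top-left corner followed by its bottom-right corner. A minimal sketch (plain Python; the coordinate values are made up for illustration):

```python
# A bbox in MMOCR's [x1, y1, x2, y2] convention: top-left then bottom-right corner.
# The values are illustrative, not taken from a real sample.
x1, y1, x2, y2 = 10.0, 20.0, 110.0, 70.0

width = x2 - x1   # 100.0
height = y2 - y1  # 50.0

# Both are non-negative exactly when the corners are ordered correctly.
assert width >= 0 and height >= 0
print(width, height)  # 100.0 50.0
```

Under the old, incorrect doc ordering `[x1, x2, y1, y2]`, unpacking the same tensor row would silently swap the second and third values, which is why the table fix matters.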
@@ -99,7 +99,7 @@ The fields of [`InstanceData`](#instancedata) that will be used are:
| | | |
| -------- | ---------------------------------- | ------------------------------------------------------------------------------------------------ |
| Field | Type | Description |
-| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
| labels | `torch.LongTensor` | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
| polygons | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
| scores | `torch.Tensor` | Confidence scores of the predictions of bounding boxes. `(N, )`. |
@@ -182,7 +182,7 @@ The [`InstanceData`](#text-detection-instancedata) fields that will be used by t
| | | |
| ----------- | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Field | Type | Description |
-| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes | `torch.FloatTensor` | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
| labels | `torch.LongTensor` | Instance label with the shape `(N, )`. |
| texts | `list[str]` | The text content of each instance with the shape `(N, )`, used for e2e text spotting or KIE task. |
| edge_labels | `torch.IntTensor` | The node adjacency matrix with the shape `(N, N)`. In the KIE task, the optional values for the state between nodes are `-1` (ignored, not involved in loss calculation),`0` (disconnected) and `1`(connected). |
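The `edge_labels` convention described above can be illustrated with a toy 3-node graph (sketched here with plain Python lists for brevity, whereas the docs specify a `torch.IntTensor`; the matrix values are invented):

```python
# edge_labels for a 3-node KIE graph, per the table above:
# -1 = ignored (excluded from loss), 0 = disconnected, 1 = connected.
# This matrix is illustrative, not taken from a real sample.
edge_labels = [
    [-1,  1,  0],
    [ 1, -1,  0],
    [ 0,  0, -1],
]

N = len(edge_labels)
# Node pairs that form an edge (symmetric, as adjacency requires).
connected = [(i, j) for i in range(N) for j in range(N) if edge_labels[i][j] == 1]
# Entries that contribute to the loss are those not marked -1.
n_valid = sum(edge_labels[i][j] != -1 for i in range(N) for j in range(N))
print(connected, n_valid)  # [(0, 1), (1, 0)] 6
```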
16 changes: 14 additions & 2 deletions docs/en/conf.py
@@ -48,6 +48,7 @@
'sphinx.ext.autodoc.typehints',
'sphinx.ext.autosummary',
'sphinx.ext.autosectionlabel',
+'sphinx_tabs.tabs',
]
autodoc_typehints = 'description'
autodoc_mock_imports = ['mmcv._ext']
@@ -57,6 +58,8 @@
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True

+myst_enable_extensions = ['colon_fence']

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

@@ -149,8 +152,17 @@
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
-html_css_files = ['css/readthedocs.css']
-html_js_files = ['js/collapsed.js']

+html_css_files = [
+'https://cdn.datatables.net/1.13.2/css/dataTables.bootstrap5.min.css',
+'css/readthedocs.css'
+]
+html_js_files = [
+'https://cdn.datatables.net/1.13.2/js/jquery.dataTables.min.js',
+'https://cdn.datatables.net/1.13.2/js/dataTables.bootstrap5.min.js',
+'js/collapsed.js',
+'js/table.js',
+]

myst_heading_anchors = 4

107 changes: 75 additions & 32 deletions docs/en/get_started/install.md
@@ -27,43 +27,45 @@ conda activate openmmlab

**Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.

-On GPU platforms:
+````{tabs}

-```shell
+```{code-tab} shell GPU Platform
conda install pytorch torchvision -c pytorch
```

-On CPU platforms:
-
-```shell
+```{code-tab} shell CPU Platform
conda install pytorch torchvision cpuonly -c pytorch
```
+````

## Installation Steps

We recommend that users follow our best practices to install MMOCR. However, the whole process is highly customizable. See [Customize Installation](#customize-installation) section for more information.

### Best Practices

-**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine) and [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
+**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine), [MMCV](https://github.com/open-mmlab/mmcv) and [MMDetection](https://github.com/open-mmlab/mmdetection) using [MIM](https://github.com/open-mmlab/mim).

```shell
pip install -U openmim
mim install mmengine
mim install 'mmcv>=2.0.0rc1'
+mim install 'mmdet>=3.0.0rc0'
```

-**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection) as a dependency.
+**Step 1.** Install MMOCR.

-```shell
-pip install 'mmdet>=3.0.0rc0'
-```
+If you wish to run and develop MMOCR directly, install it from **source** (recommended).

-**Step 2.** Install MMOCR.
+If you use MMOCR as a dependency or third-party package, install it with **MIM**.

-Case A: If you wish to run and develop MMOCR directly, install it from source:
+`````{tabs}
+````{group-tab} Install from Source
```shell
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
git checkout 1.x
@@ -72,58 +74,99 @@ pip install -v -e .
# "-v" increases pip's verbosity.
# "-e" means installing the project in editable mode,
# That is, any local modifications on the code will take effect immediately.
```
-Case B: If you use MMOCR as a dependency or third-party package, install it with pip:
+````
+````{group-tab} Install via MIM
```shell
-pip install 'mmocr>=1.0.0rc0'
+mim install 'mmocr>=1.0.0rc0'
```
-**Step 3. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:
+````
+`````

+**Step 2. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:

+`````{tabs}
+````{group-tab} Install from Source
```shell
-# If MMOCR is installed from source
pip install -r requirements/albu.txt
-# If MMOCR is installed via pip
+```
+````
+````{group-tab} Install via MIM
+```shell
pip install albumentations>=1.1.0 --no-binary qudida,albumentations
```
+````
+`````

```{note}
We recommend checking the environment after installing `albumentations` to
ensure that `opencv-python` and `opencv-python-headless` are not installed together, otherwise it might cause unexpected issues. If that's unfortunately the case, please uninstall `opencv-python-headless` to make sure MMOCR's visualization utilities can work.
Refer
-to ['albumentations`'s official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.
+to [albumentations's official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.
```

### Verify the installation

-We provide a method to verify the installation via inference demo, depending on your installation method. You should be able to see a pop-up image and the inference result upon successful verification.
+You may verify the installation via this inference demo.

+`````{tabs}
+````{tab} Python
+Run the following code in a Python interpreter:
+```python
+>>> from mmocr.apis import MMOCRInferencer
+>>> ocr = MMOCRInferencer(det='DBNet', rec='CRNN')
+>>> ocr('demo/demo_text_ocr.jpg', show=True, print_result=True)
+```
+````
+````{tab} Shell
+If you installed MMOCR from source, you can run the following in MMOCR's root directory:
+```shell
+python tools/infer.py demo/demo_text_ocr.jpg --det DBNet --rec CRNN --show --print-result
+```
+````
+`````

+You should be able to see a pop-up image and the inference result printed out in the console upon successful verification.

<div align="center">
<img src="https://user-images.githubusercontent.com/24622904/187825445-d30cbfa6-5549-4358-97fe-245f08f4ed94.jpg" height="250"/>
</div>
<br />

+```bash
+# Inference result
+{'rec_texts': ['cbanke', 'docece', 'sroumats', 'chounsonse', 'doceca', 'c', '', 'sond', 'abrandso', 'sretane', '1', 'tosl', 'roundi', 'slen', 'yet', 'ally', 's', 'sue', 'salle', 'v'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores': tensor([...])}
+```

-Run the following in MMOCR's directory:
-
-```bash
-python mmocr/ocr.py --det DB_r18 --recog CRNN demo/demo_text_ocr.jpg --show
-{'predictions': [{'rec_texts': ['cbanks', 'docecea', 'grouf', 'pwate', 'chobnsonsg', 'soxee', 'oeioh', 'c', 'sones', 'lbrandec', 'sretalg', '11', 'to8', 'round', 'sale', 'year', 'ally', 'sie', 'sall'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores': [...]}]}
-```
-
-Also can run the following codes in your Python interpreter:
-
-```python
-from mmocr.ocr import MMOCR
-ocr = MMOCR(recog='CRNN', det='DB_r18')
-ocr.readtext('demo_text_ocr.jpg', show=True)
-```
+```{note}
+If you are running MMOCR on a server without GUI or via SSH tunnel with X11 forwarding disabled, you may not see the pop-up window.
+```
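The result dictionary shown earlier pairs each entry of `rec_texts` with a score in `rec_scores`, so a natural follow-up is to filter recognitions by confidence. A sketch under that assumption (the dict below is fabricated to mimic the printed format; it is not real inferencer output):

```python
# Minimal post-processing sketch for the inferencer's printed result format.
# The sample values are made up; only the field names come from the output above.
result = {
    'rec_texts': ['cbanke', 'docece', 'yet'],
    'rec_scores': [0.92, 0.41, 0.88],
    'det_polygons': [[0, 0, 10, 0, 10, 10, 0, 10]] * 3,
    'det_scores': [0.95, 0.60, 0.90],
}

# Keep only confidently recognized words.
confident = [
    text for text, score in zip(result['rec_texts'], result['rec_scores'])
    if score > 0.5
]
print(confident)  # ['cbanke', 'yet']
```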

## Customize Installation
