the issue about seg #115

Open · jiugexuan opened this issue Sep 11, 2024 · 6 comments

@jiugexuan

When I run the command:

python seg/train.py seg/configs/vim/upernet/upernet_vim_tiny_24_512_slide_60k_debug.py

it prints the following output:

 python seg/train.py seg/configs/vim/upernet/upernet_vim_tiny_24_512_slide_60k_debug.py
/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
/home/jiuth/vim
2024-09-11 16:15:41,142 - mmseg - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0
PyTorch: 2.0.0
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.8
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.7
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.0
OpenCV: 4.10.0
MMCV: 1.7.2
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.8
MMSegmentation: 0.30.0+b9cf48f
------------------------------------------------------------

2024-09-11 16:15:41,142 - mmseg - INFO - Distributed training: False
2024-09-11 16:15:41,241 - mmseg - INFO - Config:
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='VisionMambaSeg',
        patch_size=16,
        embed_dim=192,
        depth=24,
        img_size=512,
        in_chans=3,
        out_indices=[5, 11, 17, 23],
        pretrained=None,
        rms_norm=True,
        residual_in_fp32=False,
        fused_add_norm=True,
        if_abs_pos_embed=True,
        if_rope=False,
        if_rope_residual=False,
        bimamba_type='v2',
        final_pool_type='all',
        if_divide_out=True,
        if_cls_token=False),
    decode_head=dict(
        type='UPerHead',
        in_channels=[192, 192, 192, 192],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=192,
        dropout_ratio=0.1,
        num_classes=150,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=192,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=150,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    train_cfg=dict(),
    test_cfg=dict(mode='slide', crop_size=(512, 512), stride=(341, 341)))
find_unused_parameters = True
dataset_type = 'ADE20KDataset'
data_root = '/home/jiuth/Vim/seg/data/ADEChallengeData2016'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2560, 640), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2560, 640),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=16,
    train=dict(
        type='ADE20KDataset',
        data_root='/home/jiuth/Vim/seg/data/ADEChallengeData2016',
        img_dir='images/training',
        ann_dir='annotations/training',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', reduce_zero_label=True),
            dict(type='Resize', img_scale=(2560, 640), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='ADE20KDataset',
        data_root='/home/jiuth/Vim/seg/data/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2560, 640),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='ADE20KDataset',
        data_root='/home/jiuth/Vim/seg/data/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2560, 640),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
    type='AdamW',
    lr=0.0001,
    betas=(0.9, 0.999),
    weight_decay=0.05,
    constructor='LayerDecayOptimizerConstructor',
    paramwise_cfg=dict(num_layers=24, layer_decay_rate=0.92))
optimizer_config = dict(
    type='DistOptimizerHook',
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=False)
lr_config = dict(
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-06,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
runner = dict(type='IterBasedRunnerAmp', max_iters=60000)
checkpoint_config = dict(by_epoch=False, interval=1000, max_keep_ckpts=4)
evaluation = dict(interval=1000, metric='mIoU', save_best='mIoU')
fp16 = None
work_dir = './work_dirs/upernet_vim_tiny_24_512_slide_60k_debug'
gpu_ids = range(0, 1)

/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/losses/cross_entropy_loss.py:235: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  warnings.warn(
2024-09-11 16:15:41,477 - mmseg - INFO - EncoderDecoder(
  (backbone): VisionMambaSeg(
    (patch_embed): PatchEmbed(
      (proj): Conv2d(80, 192, kernel_size=(16, 16), stride=(16, 16))
      (norm): Identity()
    )
    (pos_drop): Dropout(p=0.0, inplace=False)
    (drop_path): DropPath(drop_prob=0.100)
    (layers): ModuleList(
      (0-1): 2 x Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): Identity()
      )
      (2): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.004)
      )
      (3): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.009)
      )
      (4): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.013)
      )
      (5): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.017)
      )
      (6): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.022)
      )
      (7): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.026)
      )
      (8): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.030)
      )
      (9): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.035)
      )
      (10): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.039)
      )
      (11): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.043)
      )
      (12): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.048)
      )
      (13): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.052)
      )
      (14): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.057)
      )
      (15): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.061)
      )
      (16): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.065)
      )
      (17): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.070)
      )
      (18): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.074)
      )
      (19): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.078)
      )
      (20): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.083)
      )
      (21): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.087)
      )
      (22): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.091)
      )
      (23): Block(
        (mixer): Mamba(
          (in_proj): Linear(in_features=192, out_features=768, bias=False)
          (conv1d): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (act): SiLU()
          (x_proj): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj): Linear(in_features=12, out_features=384, bias=True)
          (conv1d_b): Conv1d(384, 384, kernel_size=(4,), stride=(1,), padding=(3,), groups=384)
          (x_proj_b): Linear(in_features=384, out_features=18, bias=False)
          (dt_proj_b): Linear(in_features=12, out_features=384, bias=True)
          (out_proj): Linear(in_features=384, out_features=192, bias=False)
        )
        (norm): RMSNorm()
        (drop_path): DropPath(drop_prob=0.096)
      )
    )
    (norm_f): RMSNorm()
    (fpn1): Sequential(
      (0): ConvTranspose2d(192, 192, kernel_size=(2, 2), stride=(2, 2))
      (1): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): GELU(approximate='none')
      (3): ConvTranspose2d(192, 192, kernel_size=(2, 2), stride=(2, 2))
    )
    (fpn2): Sequential(
      (0): ConvTranspose2d(192, 192, kernel_size=(2, 2), stride=(2, 2))
    )
    (fpn3): Identity()
    (fpn4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (decode_head): UPerHead(
    input_transform=multiple_select, ignore_index=255, align_corners=False
    (loss_decode): CrossEntropyLoss(avg_non_ignore=False)
    (conv_seg): Conv2d(192, 150, kernel_size=(1, 1), stride=(1, 1))
    (dropout): Dropout2d(p=0.1, inplace=False)
    (psp_modules): PPM(
      (0): Sequential(
        (0): AdaptiveAvgPool2d(output_size=1)
        (1): ConvModule(
          (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activate): ReLU(inplace=True)
        )
      )
      (1): Sequential(
        (0): AdaptiveAvgPool2d(output_size=2)
        (1): ConvModule(
          (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activate): ReLU(inplace=True)
        )
      )
      (2): Sequential(
        (0): AdaptiveAvgPool2d(output_size=3)
        (1): ConvModule(
          (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activate): ReLU(inplace=True)
        )
      )
      (3): Sequential(
        (0): AdaptiveAvgPool2d(output_size=6)
        (1): ConvModule(
          (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activate): ReLU(inplace=True)
        )
      )
    )
    (bottleneck): ConvModule(
      (conv): Conv2d(960, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (lateral_convs): ModuleList(
      (0-2): 3 x ConvModule(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activate): ReLU()
      )
    )
    (fpn_convs): ModuleList(
      (0-2): 3 x ConvModule(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activate): ReLU()
      )
    )
    (fpn_bottleneck): ConvModule(
      (conv): Conv2d(768, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): SyncBatchNorm(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
  )
  init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
  (auxiliary_head): FCNHead(
    input_transform=None, ignore_index=255, align_corners=False
    (loss_decode): CrossEntropyLoss(avg_non_ignore=False)
    (conv_seg): Conv2d(256, 150, kernel_size=(1, 1), stride=(1, 1))
    (dropout): Dropout2d(p=0.1, inplace=False)
    (convs): Sequential(
      (0): ConvModule(
        (conv): Conv2d(192, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activate): ReLU(inplace=True)
      )
    )
  )
  init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
)
2024-09-11 16:15:41,721 - mmseg - INFO - Loaded 20210 images
{'num_layers': 24, 'layer_decay_rate': 0.92}
Build LayerDecayOptimizerConstructor 0.920000 - 26
Param groups = {
  "layer_0_decay": {
    "param_names": [
      "backbone.pos_embed",
      "backbone.patch_embed.proj.weight"
    ],
    "lr_scale": 0.12436428680229507,
    "lr": 1.2436428680229507e-05,
    "weight_decay": 0.05
  },
  "layer_0_no_decay": {
    "param_names": [
      "backbone.patch_embed.proj.bias"
    ],
    "lr_scale": 0.12436428680229507,
    "lr": 1.2436428680229507e-05,
    "weight_decay": 0.0
  },
  "layer_25_decay": {
    "param_names": [
      "backbone.layers.0.mixer.A_log",
      "backbone.layers.0.mixer.A_b_log",
      "backbone.layers.0.mixer.in_proj.weight",
      "backbone.layers.0.mixer.conv1d.weight",
      "backbone.layers.0.mixer.x_proj.weight",
      "backbone.layers.0.mixer.dt_proj.weight",
      "backbone.layers.0.mixer.conv1d_b.weight",
      "backbone.layers.0.mixer.x_proj_b.weight",
      "backbone.layers.0.mixer.dt_proj_b.weight",
      "backbone.layers.0.mixer.out_proj.weight",
      "backbone.layers.1.mixer.A_log",
      "backbone.layers.1.mixer.A_b_log",
      "backbone.layers.1.mixer.in_proj.weight",
      "backbone.layers.1.mixer.conv1d.weight",
      "backbone.layers.1.mixer.x_proj.weight",
      "backbone.layers.1.mixer.dt_proj.weight",
      "backbone.layers.1.mixer.conv1d_b.weight",
      "backbone.layers.1.mixer.x_proj_b.weight",
      "backbone.layers.1.mixer.dt_proj_b.weight",
      "backbone.layers.1.mixer.out_proj.weight",
      "backbone.layers.2.mixer.A_log",
      "backbone.layers.2.mixer.A_b_log",
      "backbone.layers.2.mixer.in_proj.weight",
      "backbone.layers.2.mixer.conv1d.weight",
      "backbone.layers.2.mixer.x_proj.weight",
      "backbone.layers.2.mixer.dt_proj.weight",
      "backbone.layers.2.mixer.conv1d_b.weight",
      "backbone.layers.2.mixer.x_proj_b.weight",
      "backbone.layers.2.mixer.dt_proj_b.weight",
      "backbone.layers.2.mixer.out_proj.weight",
      "backbone.layers.3.mixer.A_log",
      "backbone.layers.3.mixer.A_b_log",
      "backbone.layers.3.mixer.in_proj.weight",
      "backbone.layers.3.mixer.conv1d.weight",
      "backbone.layers.3.mixer.x_proj.weight",
      "backbone.layers.3.mixer.dt_proj.weight",
      "backbone.layers.3.mixer.conv1d_b.weight",
      "backbone.layers.3.mixer.x_proj_b.weight",
      "backbone.layers.3.mixer.dt_proj_b.weight",
      "backbone.layers.3.mixer.out_proj.weight",
      "backbone.layers.4.mixer.A_log",
      "backbone.layers.4.mixer.A_b_log",
      "backbone.layers.4.mixer.in_proj.weight",
      "backbone.layers.4.mixer.conv1d.weight",
      "backbone.layers.4.mixer.x_proj.weight",
      "backbone.layers.4.mixer.dt_proj.weight",
      "backbone.layers.4.mixer.conv1d_b.weight",
      "backbone.layers.4.mixer.x_proj_b.weight",
      "backbone.layers.4.mixer.dt_proj_b.weight",
      "backbone.layers.4.mixer.out_proj.weight",
      "backbone.layers.5.mixer.A_log",
      "backbone.layers.5.mixer.A_b_log",
      "backbone.layers.5.mixer.in_proj.weight",
      "backbone.layers.5.mixer.conv1d.weight",
      "backbone.layers.5.mixer.x_proj.weight",
      "backbone.layers.5.mixer.dt_proj.weight",
      "backbone.layers.5.mixer.conv1d_b.weight",
      "backbone.layers.5.mixer.x_proj_b.weight",
      "backbone.layers.5.mixer.dt_proj_b.weight",
      "backbone.layers.5.mixer.out_proj.weight",
      "backbone.layers.6.mixer.A_log",
      "backbone.layers.6.mixer.A_b_log",
      "backbone.layers.6.mixer.in_proj.weight",
      "backbone.layers.6.mixer.conv1d.weight",
      "backbone.layers.6.mixer.x_proj.weight",
      "backbone.layers.6.mixer.dt_proj.weight",
      "backbone.layers.6.mixer.conv1d_b.weight",
      "backbone.layers.6.mixer.x_proj_b.weight",
      "backbone.layers.6.mixer.dt_proj_b.weight",
      "backbone.layers.6.mixer.out_proj.weight",
      "backbone.layers.7.mixer.A_log",
      "backbone.layers.7.mixer.A_b_log",
      "backbone.layers.7.mixer.in_proj.weight",
      "backbone.layers.7.mixer.conv1d.weight",
      "backbone.layers.7.mixer.x_proj.weight",
      "backbone.layers.7.mixer.dt_proj.weight",
      "backbone.layers.7.mixer.conv1d_b.weight",
      "backbone.layers.7.mixer.x_proj_b.weight",
      "backbone.layers.7.mixer.dt_proj_b.weight",
      "backbone.layers.7.mixer.out_proj.weight",
      "backbone.layers.8.mixer.A_log",
      "backbone.layers.8.mixer.A_b_log",
      "backbone.layers.8.mixer.in_proj.weight",
      "backbone.layers.8.mixer.conv1d.weight",
      "backbone.layers.8.mixer.x_proj.weight",
      "backbone.layers.8.mixer.dt_proj.weight",
      "backbone.layers.8.mixer.conv1d_b.weight",
      "backbone.layers.8.mixer.x_proj_b.weight",
      "backbone.layers.8.mixer.dt_proj_b.weight",
      "backbone.layers.8.mixer.out_proj.weight",
      "backbone.layers.9.mixer.A_log",
      "backbone.layers.9.mixer.A_b_log",
      "backbone.layers.9.mixer.in_proj.weight",
      "backbone.layers.9.mixer.conv1d.weight",
      "backbone.layers.9.mixer.x_proj.weight",
      "backbone.layers.9.mixer.dt_proj.weight",
      "backbone.layers.9.mixer.conv1d_b.weight",
      "backbone.layers.9.mixer.x_proj_b.weight",
      "backbone.layers.9.mixer.dt_proj_b.weight",
      "backbone.layers.9.mixer.out_proj.weight",
      "backbone.layers.10.mixer.A_log",
      "backbone.layers.10.mixer.A_b_log",
      "backbone.layers.10.mixer.in_proj.weight",
      "backbone.layers.10.mixer.conv1d.weight",
      "backbone.layers.10.mixer.x_proj.weight",
      "backbone.layers.10.mixer.dt_proj.weight",
      "backbone.layers.10.mixer.conv1d_b.weight",
      "backbone.layers.10.mixer.x_proj_b.weight",
      "backbone.layers.10.mixer.dt_proj_b.weight",
      "backbone.layers.10.mixer.out_proj.weight",
      "backbone.layers.11.mixer.A_log",
      "backbone.layers.11.mixer.A_b_log",
      "backbone.layers.11.mixer.in_proj.weight",
      "backbone.layers.11.mixer.conv1d.weight",
      "backbone.layers.11.mixer.x_proj.weight",
      "backbone.layers.11.mixer.dt_proj.weight",
      "backbone.layers.11.mixer.conv1d_b.weight",
      "backbone.layers.11.mixer.x_proj_b.weight",
      "backbone.layers.11.mixer.dt_proj_b.weight",
      "backbone.layers.11.mixer.out_proj.weight",
      "backbone.layers.12.mixer.A_log",
      "backbone.layers.12.mixer.A_b_log",
      "backbone.layers.12.mixer.in_proj.weight",
      "backbone.layers.12.mixer.conv1d.weight",
      "backbone.layers.12.mixer.x_proj.weight",
      "backbone.layers.12.mixer.dt_proj.weight",
      "backbone.layers.12.mixer.conv1d_b.weight",
      "backbone.layers.12.mixer.x_proj_b.weight",
      "backbone.layers.12.mixer.dt_proj_b.weight",
      "backbone.layers.12.mixer.out_proj.weight",
      "backbone.layers.13.mixer.A_log",
      "backbone.layers.13.mixer.A_b_log",
      "backbone.layers.13.mixer.in_proj.weight",
      "backbone.layers.13.mixer.conv1d.weight",
      "backbone.layers.13.mixer.x_proj.weight",
      "backbone.layers.13.mixer.dt_proj.weight",
      "backbone.layers.13.mixer.conv1d_b.weight",
      "backbone.layers.13.mixer.x_proj_b.weight",
      "backbone.layers.13.mixer.dt_proj_b.weight",
      "backbone.layers.13.mixer.out_proj.weight",
      "backbone.layers.14.mixer.A_log",
      "backbone.layers.14.mixer.A_b_log",
      "backbone.layers.14.mixer.in_proj.weight",
      "backbone.layers.14.mixer.conv1d.weight",
      "backbone.layers.14.mixer.x_proj.weight",
      "backbone.layers.14.mixer.dt_proj.weight",
      "backbone.layers.14.mixer.conv1d_b.weight",
      "backbone.layers.14.mixer.x_proj_b.weight",
      "backbone.layers.14.mixer.dt_proj_b.weight",
      "backbone.layers.14.mixer.out_proj.weight",
      "backbone.layers.15.mixer.A_log",
      "backbone.layers.15.mixer.A_b_log",
      "backbone.layers.15.mixer.in_proj.weight",
      "backbone.layers.15.mixer.conv1d.weight",
      "backbone.layers.15.mixer.x_proj.weight",
      "backbone.layers.15.mixer.dt_proj.weight",
      "backbone.layers.15.mixer.conv1d_b.weight",
      "backbone.layers.15.mixer.x_proj_b.weight",
      "backbone.layers.15.mixer.dt_proj_b.weight",
      "backbone.layers.15.mixer.out_proj.weight",
      "backbone.layers.16.mixer.A_log",
      "backbone.layers.16.mixer.A_b_log",
      "backbone.layers.16.mixer.in_proj.weight",
      "backbone.layers.16.mixer.conv1d.weight",
      "backbone.layers.16.mixer.x_proj.weight",
      "backbone.layers.16.mixer.dt_proj.weight",
      "backbone.layers.16.mixer.conv1d_b.weight",
      "backbone.layers.16.mixer.x_proj_b.weight",
      "backbone.layers.16.mixer.dt_proj_b.weight",
      "backbone.layers.16.mixer.out_proj.weight",
      "backbone.layers.17.mixer.A_log",
      "backbone.layers.17.mixer.A_b_log",
      "backbone.layers.17.mixer.in_proj.weight",
      "backbone.layers.17.mixer.conv1d.weight",
      "backbone.layers.17.mixer.x_proj.weight",
      "backbone.layers.17.mixer.dt_proj.weight",
      "backbone.layers.17.mixer.conv1d_b.weight",
      "backbone.layers.17.mixer.x_proj_b.weight",
      "backbone.layers.17.mixer.dt_proj_b.weight",
      "backbone.layers.17.mixer.out_proj.weight",
      "backbone.layers.18.mixer.A_log",
      "backbone.layers.18.mixer.A_b_log",
      "backbone.layers.18.mixer.in_proj.weight",
      "backbone.layers.18.mixer.conv1d.weight",
      "backbone.layers.18.mixer.x_proj.weight",
      "backbone.layers.18.mixer.dt_proj.weight",
      "backbone.layers.18.mixer.conv1d_b.weight",
      "backbone.layers.18.mixer.x_proj_b.weight",
      "backbone.layers.18.mixer.dt_proj_b.weight",
      "backbone.layers.18.mixer.out_proj.weight",
      "backbone.layers.19.mixer.A_log",
      "backbone.layers.19.mixer.A_b_log",
      "backbone.layers.19.mixer.in_proj.weight",
      "backbone.layers.19.mixer.conv1d.weight",
      "backbone.layers.19.mixer.x_proj.weight",
      "backbone.layers.19.mixer.dt_proj.weight",
      "backbone.layers.19.mixer.conv1d_b.weight",
      "backbone.layers.19.mixer.x_proj_b.weight",
      "backbone.layers.19.mixer.dt_proj_b.weight",
      "backbone.layers.19.mixer.out_proj.weight",
      "backbone.layers.20.mixer.A_log",
      "backbone.layers.20.mixer.A_b_log",
      "backbone.layers.20.mixer.in_proj.weight",
      "backbone.layers.20.mixer.conv1d.weight",
      "backbone.layers.20.mixer.x_proj.weight",
      "backbone.layers.20.mixer.dt_proj.weight",
      "backbone.layers.20.mixer.conv1d_b.weight",
      "backbone.layers.20.mixer.x_proj_b.weight",
      "backbone.layers.20.mixer.dt_proj_b.weight",
      "backbone.layers.20.mixer.out_proj.weight",
      "backbone.layers.21.mixer.A_log",
      "backbone.layers.21.mixer.A_b_log",
      "backbone.layers.21.mixer.in_proj.weight",
      "backbone.layers.21.mixer.conv1d.weight",
      "backbone.layers.21.mixer.x_proj.weight",
      "backbone.layers.21.mixer.dt_proj.weight",
      "backbone.layers.21.mixer.conv1d_b.weight",
      "backbone.layers.21.mixer.x_proj_b.weight",
      "backbone.layers.21.mixer.dt_proj_b.weight",
      "backbone.layers.21.mixer.out_proj.weight",
      "backbone.layers.22.mixer.A_log",
      "backbone.layers.22.mixer.A_b_log",
      "backbone.layers.22.mixer.in_proj.weight",
      "backbone.layers.22.mixer.conv1d.weight",
      "backbone.layers.22.mixer.x_proj.weight",
      "backbone.layers.22.mixer.dt_proj.weight",
      "backbone.layers.22.mixer.conv1d_b.weight",
      "backbone.layers.22.mixer.x_proj_b.weight",
      "backbone.layers.22.mixer.dt_proj_b.weight",
      "backbone.layers.22.mixer.out_proj.weight",
      "backbone.layers.23.mixer.A_log",
      "backbone.layers.23.mixer.A_b_log",
      "backbone.layers.23.mixer.in_proj.weight",
      "backbone.layers.23.mixer.conv1d.weight",
      "backbone.layers.23.mixer.x_proj.weight",
      "backbone.layers.23.mixer.dt_proj.weight",
      "backbone.layers.23.mixer.conv1d_b.weight",
      "backbone.layers.23.mixer.x_proj_b.weight",
      "backbone.layers.23.mixer.dt_proj_b.weight",
      "backbone.layers.23.mixer.out_proj.weight",
      "backbone.fpn1.0.weight",
      "backbone.fpn1.3.weight",
      "backbone.fpn2.0.weight",
      "decode_head.conv_seg.weight",
      "decode_head.psp_modules.0.1.conv.weight",
      "decode_head.psp_modules.1.1.conv.weight",
      "decode_head.psp_modules.2.1.conv.weight",
      "decode_head.psp_modules.3.1.conv.weight",
      "decode_head.bottleneck.conv.weight",
      "decode_head.lateral_convs.0.conv.weight",
      "decode_head.lateral_convs.1.conv.weight",
      "decode_head.lateral_convs.2.conv.weight",
      "decode_head.fpn_convs.0.conv.weight",
      "decode_head.fpn_convs.1.conv.weight",
      "decode_head.fpn_convs.2.conv.weight",
      "decode_head.fpn_bottleneck.conv.weight",
      "auxiliary_head.conv_seg.weight",
      "auxiliary_head.convs.0.conv.weight"
    ],
    "lr_scale": 1.0,
    "lr": 0.0001,
    "weight_decay": 0.05
  },
  "layer_25_no_decay": {
    "param_names": [
      "backbone.layers.0.mixer.D",
      "backbone.layers.0.mixer.D_b",
      "backbone.layers.0.mixer.conv1d.bias",
      "backbone.layers.0.mixer.dt_proj.bias",
      "backbone.layers.0.mixer.conv1d_b.bias",
      "backbone.layers.0.mixer.dt_proj_b.bias",
      "backbone.layers.0.norm.weight",
      "backbone.layers.1.mixer.D",
      "backbone.layers.1.mixer.D_b",
      "backbone.layers.1.mixer.conv1d.bias",
      "backbone.layers.1.mixer.dt_proj.bias",
      "backbone.layers.1.mixer.conv1d_b.bias",
      "backbone.layers.1.mixer.dt_proj_b.bias",
      "backbone.layers.1.norm.weight",
      "backbone.layers.2.mixer.D",
      "backbone.layers.2.mixer.D_b",
      "backbone.layers.2.mixer.conv1d.bias",
      "backbone.layers.2.mixer.dt_proj.bias",
      "backbone.layers.2.mixer.conv1d_b.bias",
      "backbone.layers.2.mixer.dt_proj_b.bias",
      "backbone.layers.2.norm.weight",
      "backbone.layers.3.mixer.D",
      "backbone.layers.3.mixer.D_b",
      "backbone.layers.3.mixer.conv1d.bias",
      "backbone.layers.3.mixer.dt_proj.bias",
      "backbone.layers.3.mixer.conv1d_b.bias",
      "backbone.layers.3.mixer.dt_proj_b.bias",
      "backbone.layers.3.norm.weight",
      "backbone.layers.4.mixer.D",
      "backbone.layers.4.mixer.D_b",
      "backbone.layers.4.mixer.conv1d.bias",
      "backbone.layers.4.mixer.dt_proj.bias",
      "backbone.layers.4.mixer.conv1d_b.bias",
      "backbone.layers.4.mixer.dt_proj_b.bias",
      "backbone.layers.4.norm.weight",
      "backbone.layers.5.mixer.D",
      "backbone.layers.5.mixer.D_b",
      "backbone.layers.5.mixer.conv1d.bias",
      "backbone.layers.5.mixer.dt_proj.bias",
      "backbone.layers.5.mixer.conv1d_b.bias",
      "backbone.layers.5.mixer.dt_proj_b.bias",
      "backbone.layers.5.norm.weight",
      "backbone.layers.6.mixer.D",
      "backbone.layers.6.mixer.D_b",
      "backbone.layers.6.mixer.conv1d.bias",
      "backbone.layers.6.mixer.dt_proj.bias",
      "backbone.layers.6.mixer.conv1d_b.bias",
      "backbone.layers.6.mixer.dt_proj_b.bias",
      "backbone.layers.6.norm.weight",
      "backbone.layers.7.mixer.D",
      "backbone.layers.7.mixer.D_b",
      "backbone.layers.7.mixer.conv1d.bias",
      "backbone.layers.7.mixer.dt_proj.bias",
      "backbone.layers.7.mixer.conv1d_b.bias",
      "backbone.layers.7.mixer.dt_proj_b.bias",
      "backbone.layers.7.norm.weight",
      "backbone.layers.8.mixer.D",
      "backbone.layers.8.mixer.D_b",
      "backbone.layers.8.mixer.conv1d.bias",
      "backbone.layers.8.mixer.dt_proj.bias",
      "backbone.layers.8.mixer.conv1d_b.bias",
      "backbone.layers.8.mixer.dt_proj_b.bias",
      "backbone.layers.8.norm.weight",
      "backbone.layers.9.mixer.D",
      "backbone.layers.9.mixer.D_b",
      "backbone.layers.9.mixer.conv1d.bias",
      "backbone.layers.9.mixer.dt_proj.bias",
      "backbone.layers.9.mixer.conv1d_b.bias",
      "backbone.layers.9.mixer.dt_proj_b.bias",
      "backbone.layers.9.norm.weight",
      "backbone.layers.10.mixer.D",
      "backbone.layers.10.mixer.D_b",
      "backbone.layers.10.mixer.conv1d.bias",
      "backbone.layers.10.mixer.dt_proj.bias",
      "backbone.layers.10.mixer.conv1d_b.bias",
      "backbone.layers.10.mixer.dt_proj_b.bias",
      "backbone.layers.10.norm.weight",
      "backbone.layers.11.mixer.D",
      "backbone.layers.11.mixer.D_b",
      "backbone.layers.11.mixer.conv1d.bias",
      "backbone.layers.11.mixer.dt_proj.bias",
      "backbone.layers.11.mixer.conv1d_b.bias",
      "backbone.layers.11.mixer.dt_proj_b.bias",
      "backbone.layers.11.norm.weight",
      "backbone.layers.12.mixer.D",
      "backbone.layers.12.mixer.D_b",
      "backbone.layers.12.mixer.conv1d.bias",
      "backbone.layers.12.mixer.dt_proj.bias",
      "backbone.layers.12.mixer.conv1d_b.bias",
      "backbone.layers.12.mixer.dt_proj_b.bias",
      "backbone.layers.12.norm.weight",
      "backbone.layers.13.mixer.D",
      "backbone.layers.13.mixer.D_b",
      "backbone.layers.13.mixer.conv1d.bias",
      "backbone.layers.13.mixer.dt_proj.bias",
      "backbone.layers.13.mixer.conv1d_b.bias",
      "backbone.layers.13.mixer.dt_proj_b.bias",
      "backbone.layers.13.norm.weight",
      "backbone.layers.14.mixer.D",
      "backbone.layers.14.mixer.D_b",
      "backbone.layers.14.mixer.conv1d.bias",
      "backbone.layers.14.mixer.dt_proj.bias",
      "backbone.layers.14.mixer.conv1d_b.bias",
      "backbone.layers.14.mixer.dt_proj_b.bias",
      "backbone.layers.14.norm.weight",
      "backbone.layers.15.mixer.D",
      "backbone.layers.15.mixer.D_b",
      "backbone.layers.15.mixer.conv1d.bias",
      "backbone.layers.15.mixer.dt_proj.bias",
      "backbone.layers.15.mixer.conv1d_b.bias",
      "backbone.layers.15.mixer.dt_proj_b.bias",
      "backbone.layers.15.norm.weight",
      "backbone.layers.16.mixer.D",
      "backbone.layers.16.mixer.D_b",
      "backbone.layers.16.mixer.conv1d.bias",
      "backbone.layers.16.mixer.dt_proj.bias",
      "backbone.layers.16.mixer.conv1d_b.bias",
      "backbone.layers.16.mixer.dt_proj_b.bias",
      "backbone.layers.16.norm.weight",
      "backbone.layers.17.mixer.D",
      "backbone.layers.17.mixer.D_b",
      "backbone.layers.17.mixer.conv1d.bias",
      "backbone.layers.17.mixer.dt_proj.bias",
      "backbone.layers.17.mixer.conv1d_b.bias",
      "backbone.layers.17.mixer.dt_proj_b.bias",
      "backbone.layers.17.norm.weight",
      "backbone.layers.18.mixer.D",
      "backbone.layers.18.mixer.D_b",
      "backbone.layers.18.mixer.conv1d.bias",
      "backbone.layers.18.mixer.dt_proj.bias",
      "backbone.layers.18.mixer.conv1d_b.bias",
      "backbone.layers.18.mixer.dt_proj_b.bias",
      "backbone.layers.18.norm.weight",
      "backbone.layers.19.mixer.D",
      "backbone.layers.19.mixer.D_b",
      "backbone.layers.19.mixer.conv1d.bias",
      "backbone.layers.19.mixer.dt_proj.bias",
      "backbone.layers.19.mixer.conv1d_b.bias",
      "backbone.layers.19.mixer.dt_proj_b.bias",
      "backbone.layers.19.norm.weight",
      "backbone.layers.20.mixer.D",
      "backbone.layers.20.mixer.D_b",
      "backbone.layers.20.mixer.conv1d.bias",
      "backbone.layers.20.mixer.dt_proj.bias",
      "backbone.layers.20.mixer.conv1d_b.bias",
      "backbone.layers.20.mixer.dt_proj_b.bias",
      "backbone.layers.20.norm.weight",
      "backbone.layers.21.mixer.D",
      "backbone.layers.21.mixer.D_b",
      "backbone.layers.21.mixer.conv1d.bias",
      "backbone.layers.21.mixer.dt_proj.bias",
      "backbone.layers.21.mixer.conv1d_b.bias",
      "backbone.layers.21.mixer.dt_proj_b.bias",
      "backbone.layers.21.norm.weight",
      "backbone.layers.22.mixer.D",
      "backbone.layers.22.mixer.D_b",
      "backbone.layers.22.mixer.conv1d.bias",
      "backbone.layers.22.mixer.dt_proj.bias",
      "backbone.layers.22.mixer.conv1d_b.bias",
      "backbone.layers.22.mixer.dt_proj_b.bias",
      "backbone.layers.22.norm.weight",
      "backbone.layers.23.mixer.D",
      "backbone.layers.23.mixer.D_b",
      "backbone.layers.23.mixer.conv1d.bias",
      "backbone.layers.23.mixer.dt_proj.bias",
      "backbone.layers.23.mixer.conv1d_b.bias",
      "backbone.layers.23.mixer.dt_proj_b.bias",
      "backbone.layers.23.norm.weight",
      "backbone.norm_f.weight",
      "backbone.fpn1.0.bias",
      "backbone.fpn1.1.weight",
      "backbone.fpn1.1.bias",
      "backbone.fpn1.3.bias",
      "backbone.fpn2.0.bias",
      "decode_head.conv_seg.bias",
      "decode_head.psp_modules.0.1.bn.weight",
      "decode_head.psp_modules.0.1.bn.bias",
      "decode_head.psp_modules.1.1.bn.weight",
      "decode_head.psp_modules.1.1.bn.bias",
      "decode_head.psp_modules.2.1.bn.weight",
      "decode_head.psp_modules.2.1.bn.bias",
      "decode_head.psp_modules.3.1.bn.weight",
      "decode_head.psp_modules.3.1.bn.bias",
      "decode_head.bottleneck.bn.weight",
      "decode_head.bottleneck.bn.bias",
      "decode_head.lateral_convs.0.bn.weight",
      "decode_head.lateral_convs.0.bn.bias",
      "decode_head.lateral_convs.1.bn.weight",
      "decode_head.lateral_convs.1.bn.bias",
      "decode_head.lateral_convs.2.bn.weight",
      "decode_head.lateral_convs.2.bn.bias",
      "decode_head.fpn_convs.0.bn.weight",
      "decode_head.fpn_convs.0.bn.bias",
      "decode_head.fpn_convs.1.bn.weight",
      "decode_head.fpn_convs.1.bn.bias",
      "decode_head.fpn_convs.2.bn.weight",
      "decode_head.fpn_convs.2.bn.bias",
      "decode_head.fpn_bottleneck.bn.weight",
      "decode_head.fpn_bottleneck.bn.bias",
      "auxiliary_head.conv_seg.bias",
      "auxiliary_head.convs.0.bn.weight",
      "auxiliary_head.convs.0.bn.bias"
    ],
    "lr_scale": 1.0,
    "lr": 0.0001,
    "weight_decay": 0.0
  }
}
2024-09-11 16:15:41,991 - mmseg - INFO - Loaded 2000 images
2024-09-11 16:15:41,992 - mmseg - INFO - Start running, host: jiuth@DESKTOP-8F5VA63, work_dir: /home/jiuth/Vim/work_dirs/upernet_vim_tiny_24_512_slide_60k_debug
2024-09-11 16:15:41,992 - mmseg - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) PolyLrUpdaterHook
(ABOVE_NORMAL) DistOptimizerHook
(NORMAL      ) CheckpointHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_epoch:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_iter:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
 --------------------
after_train_iter:
(ABOVE_NORMAL) DistOptimizerHook
(NORMAL      ) CheckpointHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
after_train_epoch:
(NORMAL      ) CheckpointHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_epoch:
(LOW         ) IterTimerHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_epoch:
(VERY_LOW    ) TextLoggerHook
 --------------------
after_run:
(VERY_LOW    ) TextLoggerHook
 --------------------
2024-09-11 16:15:41,992 - mmseg - INFO - workflow: [('train', 1)], max: 60000 iters
2024-09-11 16:15:41,993 - mmseg - INFO - Checkpoints will be saved to /home/jiuth/Vim/work_dirs/upernet_vim_tiny_24_512_slide_60k_debug by HardDiskBackend.
Traceback (most recent call last):
  File "/home/jiuth/Vim/seg/train.py", line 165, in <module>
    main()
  File "/home/jiuth/Vim/seg/train.py", line 154, in main
    train_segmentor(
  File "/home/jiuth/Vim/seg/mmcv_custom/train_api.py", line 129, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 144, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 64, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/base.py", line 138, in train_step
    losses = self(**data_batch)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/base.py", line 108, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 140, in forward_train
    x = self.extract_feat(img)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 66, in extract_feat
    x = self.backbone(img)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/Vim/seg/backbone/vim.py", line 276, in forward
    x = self.forward_features(x)
  File "/home/jiuth/Vim/seg/backbone/vim.py", line 186, in forward_features
    x = self.patch_embed(x)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/Vim/seg/backbone/models_mamba.py", line 59, in forward
    x = self.proj(x)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [192, 80, 16, 16], expected input[8, 3, 512, 512] to have 80 channels, but got 3 channels instead

How can I fix this?

Besides, the pre-trained model files do not match the model names provided in the scripts. How can this be resolved?
The scripts look like this:

#!/bin/bash
# bash /client-tools/repair_A100.sh
source /mnt/bn/lianghuidata/miniconda/bin/activate /mnt/bn/lianghuidata/miniconda/envs/vim-seg
cd /mnt/bn/lianghuidata/Vim/seg

SEG_CONFIG=configs/vim/upernet/upernet_vim_tiny_24_512_slide_60k.py
PRETRAIN_CKPT=/mnt/bn/lianghuidata/Vim/pretrained_ckpts/pretrained-vim-t.pth

python3 -m torch.distributed.launch --nproc_per_node=4 --nnodes=${WORLD_SIZE} --node_rank=${RANK} --master_addr=${MASTER_ADDR} --master_port=10295 \
--use_env train.py --launcher pytorch \
    ${SEG_CONFIG} \
    --seed 0 --work-dir work_dirs/vimseg-t --deterministic \
    --options model.backbone.pretrained=${PRETRAIN_CKPT} model.backbone.if_bimamba=False model.backbone.bimamba_type=v2 optimizer.lr=2e-4 optimizer.weight_decay=0.1 
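
On the checkpoint question, a small diagnostic sketch that lists the parameter names stored in a downloaded checkpoint so they can be compared with what the backbone expects (the file path and the "model" nesting are assumptions, not confirmed by this thread; adjust to your files):

import torch

# Hypothetical local path -- substitute the checkpoint you downloaded.
ckpt = torch.load("pretrained_ckpts/pretrained-vim-t.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # many releases nest weights under "model"
for k in sorted(state_dict)[:10]:  # a few keys make prefix/name mismatches obvious
    print(k)
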
@ap2749919

ap2749919 commented Oct 29, 2024

vim.py line 47: super().__init__(img_size, patch_size, stride, depth, embed_dim, in_chans, num_classes, **kwargs). The relative position of the in_chans parameter is wrong because d_state is missing (see class VisionMamba in models_mamba.py).
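
A minimal sketch of that argument shift, consistent with the model dump in the log above (the stand-in signature below is hypothetical; the parameter names and the assumed values are taken from this thread, not from the repo's actual code):

# Hypothetical stand-in for VisionMamba.__init__ to illustrate the shift.
def visionmamba_init(img_size, patch_size, stride, depth, embed_dim,
                     d_state, channels=3, num_classes=1000, **kwargs):
    return dict(d_state=d_state, channels=channels, num_classes=num_classes)

in_chans, num_classes = 3, 80  # assumed values for the seg backbone
# The call in seg/backbone/vim.py omits d_state, so later arguments shift left:
print(visionmamba_init(512, 16, 16, 24, 192, in_chans, num_classes))
# {'d_state': 3, 'channels': 80, 'num_classes': 1000}

A channel count of 80 is exactly what the log shows (Conv2d(80, 192, kernel_size=(16, 16), stride=(16, 16))), and d_state = 3 also matches x_proj's out_features of 18 (= dt_rank 12 + 2 * d_state) in the printed model.
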

@huhuhuhuhuuuu

I also encountered the same problem. May I ask whether you have solved the "RuntimeError: Given groups=1, weight of size [192, 80, 16, 16], expected input[8, 3, 512, 512] to have 80 channels, but got 3 channels instead" error?

@ap2749919

I also encountered the same problem. May I ask whether you have solved the "RuntimeError: Given groups=1, weight of size [192, 80, 16, 16], expected input[8, 3, 512, 512] to have 80 channels, but got 3 channels instead" error?

Yes, just as described above. I simply changed seg/backbone/vim.py line 47 from super().__init__(img_size, patch_size, stride, depth, embed_dim, in_chans, num_classes, **kwargs) to super().__init__(img_size, patch_size, stride, depth, embed_dim, d_state, in_chans, num_classes, **kwargs), defining d_state = 16 as the default value.
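
For reference, a sketch of that one-line fix (the parameter order is taken from the comment above; verify it against class VisionMamba in your local models_mamba.py before applying):

# seg/backbone/vim.py, line 47
d_state = 16  # e.g. a new VisionMambaSeg.__init__ parameter with this default
super().__init__(img_size, patch_size, stride, depth, embed_dim,
                 d_state, in_chans, num_classes, **kwargs)

Passing these arguments by keyword instead of by position would make the call immune to this kind of silent shift.
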

@mfq2003

mfq2003 commented Nov 16, 2024

No, I changed it according to what you said, and it reported an error saying there was an extra parameter.
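
One way to check which signature a given checkout actually has before editing the call (the module path is inferred from the traceback above and may differ in your layout):

# Run from the seg/ directory.
import inspect
from backbone.models_mamba import VisionMamba
print(inspect.signature(VisionMamba.__init__))

If d_state does not appear in the printed signature, inserting it positionally produces exactly the "extra parameter" TypeError reported here.
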


@TiSgrc2002

(This comment quotes the original issue report above in full.)
      "decode_head.lateral_convs.2.bn.bias",
      "decode_head.fpn_convs.0.bn.weight",
      "decode_head.fpn_convs.0.bn.bias",
      "decode_head.fpn_convs.1.bn.weight",
      "decode_head.fpn_convs.1.bn.bias",
      "decode_head.fpn_convs.2.bn.weight",
      "decode_head.fpn_convs.2.bn.bias",
      "decode_head.fpn_bottleneck.bn.weight",
      "decode_head.fpn_bottleneck.bn.bias",
      "auxiliary_head.conv_seg.bias",
      "auxiliary_head.convs.0.bn.weight",
      "auxiliary_head.convs.0.bn.bias"
    ],
    "lr_scale": 1.0,
    "lr": 0.0001,
    "weight_decay": 0.0
  }
}
2024-09-11 16:15:41,991 - mmseg - INFO - Loaded 2000 images
2024-09-11 16:15:41,992 - mmseg - INFO - Start running, host: jiuth@DESKTOP-8F5VA63, work_dir: /home/jiuth/Vim/work_dirs/upernet_vim_tiny_24_512_slide_60k_debug
2024-09-11 16:15:41,992 - mmseg - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) PolyLrUpdaterHook
(ABOVE_NORMAL) DistOptimizerHook
(NORMAL      ) CheckpointHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_epoch:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_iter:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
 --------------------
after_train_iter:
(ABOVE_NORMAL) DistOptimizerHook
(NORMAL      ) CheckpointHook
(LOW         ) IterTimerHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
after_train_epoch:
(NORMAL      ) CheckpointHook
(LOW         ) EvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_epoch:
(LOW         ) IterTimerHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_epoch:
(VERY_LOW    ) TextLoggerHook
 --------------------
after_run:
(VERY_LOW    ) TextLoggerHook
 --------------------
2024-09-11 16:15:41,992 - mmseg - INFO - workflow: [('train', 1)], max: 60000 iters
2024-09-11 16:15:41,993 - mmseg - INFO - Checkpoints will be saved to /home/jiuth/Vim/work_dirs/upernet_vim_tiny_24_512_slide_60k_debug by HardDiskBackend.
Traceback (most recent call last):
  File "/home/jiuth/Vim/seg/train.py", line 165, in <module>
    main()
  File "/home/jiuth/Vim/seg/train.py", line 154, in main
    train_segmentor(
  File "/home/jiuth/Vim/seg/mmcv_custom/train_api.py", line 129, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 144, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 64, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/base.py", line 138, in train_step
    losses = self(**data_batch)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/base.py", line 108, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 140, in forward_train
    x = self.extract_feat(img)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 66, in extract_feat
    x = self.backbone(img)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/Vim/seg/backbone/vim.py", line 276, in forward
    x = self.forward_features(x)
  File "/home/jiuth/Vim/seg/backbone/vim.py", line 186, in forward_features
    x = self.patch_embed(x)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/Vim/seg/backbone/models_mamba.py", line 59, in forward
    x = self.proj(x)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/jiuth/anaconda3/envs/vim/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [192, 80, 16, 16], expected input[8, 3, 512, 512] to have 80 channels, but got 3 channels instead

How can I fix this?
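
For context, the failure can be reproduced in isolation. Below is a minimal sketch, with the layer sizes copied from the error message (this is not the repo's actual code):

import torch
import torch.nn as nn

# Patch-embed projection as reported in the traceback: a weight of size
# [192, 80, 16, 16] means Conv2d(in_channels=80, out_channels=192, kernel=16).
proj = nn.Conv2d(in_channels=80, out_channels=192, kernel_size=16, stride=16)

# The dataloader feeds ordinary 3-channel RGB crops.
imgs = torch.randn(8, 3, 512, 512)

try:
    proj(imgs)
except RuntimeError as e:
    print(e)  # same "expected input ... to have 80 channels" error as above

# So whatever sets the patch embed's input channels to 80 (rather than 3)
# in the debug config is the likely thing to fix.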

Besides, the pre-trained model files do not match the model names referenced in the scripts. How can this be resolved? The script looks like this (a checkpoint-inspection sketch follows it):

#!/bin/bash
# bash /client-tools/repair_A100.sh
source /mnt/bn/lianghuidata/miniconda/bin/activate /mnt/bn/lianghuidata/miniconda/envs/vim-seg
cd /mnt/bn/lianghuidata/Vim/seg

SEG_CONFIG=configs/vim/upernet/upernet_vim_tiny_24_512_slide_60k.py
PRETRAIN_CKPT=/mnt/bn/lianghuidata/Vim/pretrained_ckpts/pretrained-vim-t.pth

python3 -m torch.distributed.launch --nproc_per_node=4 --nnodes=${WORLD_SIZE} --node_rank=${RANK} --master_addr=${MASTER_ADDR} --master_port=10295 \
--use_env train.py --launcher pytorch \
    ${SEG_CONFIG} \
    --seed 0 --work-dir work_dirs/vimseg-t --deterministic \
    --options model.backbone.pretrained=${PRETRAIN_CKPT} model.backbone.if_bimamba=False model.backbone.bimamba_type=v2 optimizer.lr=2e-4 optimizer.weight_decay=0.1 
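
If the mismatch is about the checkpoint's contents rather than just the filename, a quick way to inspect it is to print the checkpoint's keys next to the names the seg model expects. A rough sketch (the nesting key 'model' and the local path are assumptions, not something confirmed by the repo):

import torch

ckpt = torch.load('pretrained_ckpts/pretrained-vim-t.pth', map_location='cpu')
sd = ckpt.get('model', ckpt)  # assumption: weights may be nested under 'model'
for k in sorted(sd.keys())[:10]:
    print(k)

# Compare these against the "backbone.layers.*" names in the log above and
# rename keys accordingly before calling load_state_dict(..., strict=False).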

Excuse me, may I ask what the environment configuration for your experiment is? I noticed that the CUDA version for seg is 11.6, but causal_conv1d requires CUDA 11.8.
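
For reference, the CUDA toolkit a given PyTorch build was compiled against can be checked directly:

import torch

print(torch.__version__)    # e.g. 2.0.0
print(torch.version.cuda)   # toolkit PyTorch was built with, e.g. 11.8
print(torch.cuda.is_available())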
