
How to run the program on MacbookPro with M2 Max CPU? #8

Open
zhuchuji opened this issue Apr 12, 2024 · 2 comments

zhuchuji commented Apr 12, 2024

I tried to run the program on my MacBook Pro with an M2 Max CPU, and it throws AssertionError: Torch not compiled with CUDA enabled.

The detailed log is shown below:

/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
Loading checkpoint shards:   0%|                         | 0/10 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/Users/chockiezhu/practice/CLoT/gradio_demo.py", line 369, in <module>
    main()
  File "/Users/chockiezhu/practice/CLoT/gradio_demo.py", line 365, in main
    _launch_demo(args)
  File "/Users/chockiezhu/practice/CLoT/gradio_demo.py", line 114, in _launch_demo
    model, tokenizer = _load_model_tokenizer(args.checkpoint_path)
  File "/Users/chockiezhu/practice/CLoT/gradio_demo.py", line 71, in _load_model_tokenizer
    model = AutoPeftModelForCausalLM.from_pretrained(
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/peft/auto.py", line 104, in from_pretrained
    base_model = target_class.from_pretrained(base_model_path, **kwargs)
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/transformers/modeling_utils.py", line 3958, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/accelerate/utils/modeling.py", line 399, in set_module_tensor_to_device
    new_value = value.to(device)
  File "/Users/chockiezhu/Library/Python/3.9/lib/python/site-packages/torch/cuda/__init__.py", line 293, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
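
For context, the demo hard-codes device_map="cuda", and the standard PyTorch build for Apple Silicon ships without CUDA support, which is what triggers the assertion above. A minimal check of which backend is actually available (an illustrative sketch, not code from this repo) would be:

import torch

# Prefer CUDA when the build supports it, fall back to Apple's MPS backend,
# and finally to plain CPU (illustrative only; the demo itself hard-codes "cuda").
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Selected device: {device}")
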
zhuchuji (Author) commented

By using MPS (Metal Performance Shaders) on my Mac and changing AutoPeftModelForCausalLM.from_pretrained(checkpoint_path, device_map="cuda", trust_remote_code=True, fp16=True).eval() to AutoPeftModelForCausalLM.from_pretrained(checkpoint_path, device_map=torch.device('mps'), trust_remote_code=True, fp16=True).eval(), the error becomes the following:

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 10/10 [01:32<00:00, 9.23s/it]
Traceback (most recent call last):
File "/Users/chockiezhu/practice/CLoT/inference.py", line 9, in
model = AutoPeftModelForCausalLM.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/peft/auto.py", line 126, in from_pretrained
base_model.resize_token_embeddings(len(tokenizer))
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 1803, in resize_token_embeddings
model_embeds = self._resize_token_embeddings(new_num_tokens, pad_to_multiple_of)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 1818, in _resize_token_embeddings
new_embeddings = self._get_resized_embeddings(old_embeddings, new_num_tokens, pad_to_multiple_of)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 1928, in _get_resized_embeddings
new_embeddings = nn.Embedding(
^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/torch/nn/modules/sparse.py", line 143, in init
self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 36.18 GB, other allocations: 384.00 KB, max allowed: 36.27 GB). Tried to allocate 2.32 GB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
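
The error message itself points at one possible workaround: lifting the MPS allocator's high-watermark limit. A rough sketch of that, combined with the device_map change described above (checkpoint_path is a placeholder, fp16=True is the kwarg already used in this thread, and disabling the limit can make the whole system unstable, as the message warns):

import os

# Read by PyTorch's MPS allocator; 0.0 removes the upper limit on allocations.
# Typically exported in the shell before launching the script.
os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"

import torch
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(
    checkpoint_path,                 # placeholder: the same checkpoint path used above
    device_map=torch.device("mps"),
    trust_remote_code=True,
    fp16=True,
).eval()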

zhongshsh (Collaborator) commented

Our testing was conducted on Linux. Due to a lack of Apple Mac devices, we're unable to address your query. Here is a suggestion.

"Out of memory" errors often occur due to insufficient GPU memory. You can try using some methods for transformers (like those outlined in https://huggingface.co/docs/accelerate/en/usage_guides/big_modeling) to resolve the issue.
