Unable to export to .pte format #7099

sixersri · 2024-11-27T01:12:49Z

🐛 Describe the bug

I have a fine-tuned TinyLlama/TinyLlama-1.1B-Chat-v1.0 model.

I created the checkpoint file using the following:
torch.save(model.state_dict(), "/opt/ml/model/model.pth")

A 4.1GB model.pth file gets created.
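
As a quick plausibility check on that file size, the arithmetic below estimates the checkpoint size for two weight precisions. This is only a sketch: the parameter count of 1.1e9 is an assumption taken from the "1.1B" in the model name, not a measured value, and `checkpoint_size_gib` is a hypothetical helper. A result near 4.1 for 4-byte weights would suggest the saved state_dict holds fp32 tensors, assuming the reported size is in GiB.

```python
# Rough checkpoint-size estimate. NOMINAL_PARAMS is an assumption
# derived from the "1.1B" in the model name, not an exact count.
NOMINAL_PARAMS = 1.1e9


def checkpoint_size_gib(n_params: float, bytes_per_param: int) -> float:
    """Return the approximate size of the raw weights in GiB."""
    return n_params * bytes_per_param / 2**30


fp32_gib = checkpoint_size_gib(NOMINAL_PARAMS, 4)  # 4 bytes per fp32 weight
fp16_gib = checkpoint_size_gib(NOMINAL_PARAMS, 2)  # 2 bytes per fp16 weight

if __name__ == "__main__":
    print(f"fp32: {fp32_gib:.1f} GiB, fp16: {fp16_gib:.1f} GiB")
```

The estimate ignores serialization overhead and any non-weight buffers, so it is only good to about one decimal place.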

I then try to create a .pte file as follows:

python -m examples.models.llama.export_llama \
  --checkpoint /home/elxr/projecta/model.pth \
  --params /home/elxr/projecta/params.json \
  -X --xnnpack-extended-ops -qmode 8da4w \
  -d fp16 \
  --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001, 128006, 128007]}' \
  --embedding-quantize 4,32 \
  --output_name="tinyllama_chat_.pte"

Here is the content of the params.json file:
{
  "dim": 2048,
  "multiple_of": 64,
  "n_heads": 32,
  "n_kv_heads": 4,
  "n_layers": 22,
  "norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "use_scaled_rope": false,
  "vocab_size": 32000
}
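
The values in params.json and in the `--metadata` argument of the command above can be cross-checked with a few lines of Python. The sketch below copies both JSON payloads from this report and flags any token ID that falls outside `vocab_size`. The helper names (`out_of_vocab_ids`, `head_dim`) are hypothetical, and whether an out-of-range ID has anything to do with the export failure is not established here. It is simply a consistency check worth running before exporting.

```python
import json

# Both payloads are copied verbatim from this report.
params = json.loads("""
{
  "dim": 2048, "multiple_of": 64, "n_heads": 32, "n_kv_heads": 4,
  "n_layers": 22, "norm_eps": 1e-05, "rope_theta": 10000.0,
  "use_scaled_rope": false, "vocab_size": 32000
}
""")
metadata = json.loads(
    '{"get_bos_id":128000, "get_eos_ids":[128009, 128001, 128006, 128007]}'
)


def out_of_vocab_ids(params: dict, metadata: dict) -> list:
    """Return the BOS/EOS IDs that are not valid indices into the vocabulary."""
    vocab_size = params["vocab_size"]
    ids = [metadata["get_bos_id"], *metadata["get_eos_ids"]]
    return [i for i in ids if not 0 <= i < vocab_size]


def head_dim(params: dict) -> int:
    """Return dim // n_heads, raising if the heads do not divide dim evenly."""
    dim, n_heads = params["dim"], params["n_heads"]
    if dim % n_heads != 0:
        raise ValueError(f"dim={dim} is not divisible by n_heads={n_heads}")
    return dim // n_heads


if __name__ == "__main__":
    print("IDs outside the vocabulary:", out_of_vocab_ids(params, metadata))
    print("per-head dimension:", head_dim(params))
```

With the values above, every ID passed in `--metadata` is larger than `vocab_size`, so it may be worth confirming them against the tokenizer that ships with the fine-tuned model.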

I get the following error:
NotImplementedError: Cannot copy out of meta tensor; no data!

Here is the full stack trace:

INFO:root:Applying quantizers: []
INFO:root:Loading model with checkpoint=/home/elxr/projecta/model.pth, params=/home/elxr/projecta/params.json, use_kv_cache=False, weight_type=WeightType.LLAMA
INFO:root:model.to torch.float16
INFO:root:linear: layers.0.attention.wq, in=2048, out=2048
Traceback (most recent call last):
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/elxr/executorch/examples/models/llama/export_llama.py", line 32, in <module>
    main() # pragma: no cover
  File "/home/elxr/executorch/examples/models/llama/export_llama.py", line 28, in main
    export_llama(args)
  File "/home/elxr/executorch/examples/models/llama/export_llama_lib.py", line 508, in export_llama
    builder = _export_llama(args)
  File "/home/elxr/executorch/examples/models/llama/export_llama_lib.py", line 643, in _export_llama
    builder_exported = _prepare_for_llama_export(args).export()
  File "/home/elxr/executorch/examples/models/llama/export_llama_lib.py", line 564, in _prepare_for_llama_export
    .source_transform(_get_source_transforms(args.model, dtype_override, args))
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/extension/llm/export/builder.py", line 148, in source_transform
    self.model = transform(self.model)
  File "/home/elxr/executorch/examples/models/llama/source_transformation/quantize.py", line 103, in quantize
    ).quantize(model)
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/site-packages/torchao/quantization/GPTQ.py", line 1100, in quantize
    state_dict = self._create_quantized_state_dict(model)
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/elxr/miniconda3/envs/executorch/lib/python3.10/site-packages/torchao/quantization/GPTQ.py", line 1079, in _create_quantized_state_dict
    cur_state_dict[f"{fqn}.weight"] = weight_int8.to(self.device)
NotImplementedError: Cannot copy out of meta tensor; no data!
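
One hypothesis worth ruling out, and it is only a hypothesis that nothing in this report confirms, is a mismatch between the key names in the checkpoint and the parameter names the export script expects. The log above refers to `layers.0.attention.wq`, whereas a state_dict saved from a Hugging Face Llama-style model typically uses names such as `model.layers.0.self_attn.q_proj.weight`. If the names do not line up, some weights may never be populated and could remain on the meta device. The sketch below classifies key names by convention. The prefix lists are assumptions based on common Llama checkpoints, and `classify_key` and `summarize_keys` are hypothetical helpers.

```python
from collections import Counter

# Assumed prefixes for the two naming conventions; adjust as needed.
HF_PREFIXES = ("model.", "lm_head.")
META_PREFIXES = ("layers.", "tok_embeddings.", "norm.", "output.")


def classify_key(key: str) -> str:
    """Return 'hf', 'meta', or 'unknown' for a single state_dict key."""
    if key.startswith(HF_PREFIXES):
        return "hf"
    if key.startswith(META_PREFIXES):
        return "meta"
    return "unknown"


def summarize_keys(keys) -> Counter:
    """Count how many keys fall under each naming convention."""
    return Counter(classify_key(k) for k in keys)


if __name__ == "__main__":
    # Illustrative key names only; replace with the keys of the real checkpoint.
    sample = [
        "model.embed_tokens.weight",
        "model.layers.0.self_attn.q_proj.weight",
        "lm_head.weight",
    ]
    print(summarize_keys(sample))
```

To run this against the real file, the keys could be obtained with something like `torch.load(path, map_location="cpu").keys()` and passed to `summarize_keys`. A count dominated by `hf` would suggest the checkpoint needs its keys converted before export.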

Versions

PyTorch version: 2.6.0.dev20241112+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: eLxr Linux 12 (aria) (x86_64)
GCC version: (Debian 12.2.0-14) 12.2.0
Clang version: Could not collect
CMake version: version 3.31.1
Libc version: glibc-2.36

Python version: 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-6.1.0-22-amd64-x86_64-with-glibc2.36
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Stepping: 7
BogoMIPS: 5000.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 64 KiB (2 instances)
L1i cache: 64 KiB (2 instances)
L2 cache: 2 MiB (2 instances)
L3 cache: 35.8 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Vulnerable
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Retpoline
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] executorch==0.5.0a0+1c7d94e
[pip3] numpy==1.26.4
[pip3] torch==2.6.0.dev20241112+cpu
[pip3] torchao==0.7.0+git75d06933
[pip3] torchaudio==2.5.0.dev20241112+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0.dev20241112+cpu
[conda] executorch 0.5.0a0+1c7d94e pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.6.0.dev20241112+cpu pypi_0 pypi
[conda] torchao 0.7.0+git75d06933 pypi_0 pypi
[conda] torchaudio 2.5.0.dev20241112+cpu pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchvision 0.20.0.dev20241112+cpu pypi_0 pypi

sixersri changed the title ~~Unable to expoert~~ Unable to export to .pte format Nov 27, 2024
