
CoreML support? #18

Open
ArtemBernatskyy opened this issue May 24, 2023 · 3 comments

@ArtemBernatskyy

How can we add CoreML support? Thx!

@stlukey (Owner) commented Jun 3, 2023

Whisper.cpp now has CoreML support:

ggerganov/whisper.cpp#566

Using whisper.cpp directly, it should be as simple as compiling with the appropriate flag:

mkdir -p build && cd build
cmake -DWHISPER_COREML=1 ..
make -j
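
You'll also need the Core ML encoder model itself; per the whisper.cpp README at the time, it is generated with the conversion script (base.en here matches the example run below):

pip install ane_transformers openai-whisper coremltools
./models/generate-coreml-model.sh base.en

This writes models/ggml-base.en-encoder.mlmodelc, which a CoreML-enabled build picks up automatically next to the ggml model.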

Check by running:

./main -m models/ggml-base.en.bin -f samples/gb0.wav

...

whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded

system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | COREML = 1 | 

...

Note the COREML = 1 in the system_info line.

For whispercpp.py, we can add the flag for CoreML here inside setup.py:

if sys.platform == 'darwin':
    os.environ['CFLAGS']   = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11'
    os.environ['CXXFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11'
    os.environ['LDFLAGS']  = '-framework Accelerate'

First, update the whisper.cpp submodule inside whispercpp.py (rough sketch below). Check that it still runs; it might need some changes if the whisper.cpp API has changed. Provided it still works, add the flag inside setup.py.
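
Something like this, assuming the submodule lives at ./whisper.cpp as in this repo:

cd whisper.cpp
git fetch origin && git checkout master    # or pin a known-good commit
cd ..
git add whisper.cpp
pip install -e .                           # rebuild the extension and smoke-test it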

I can't test this at the moment, but feel free to make the pull request, and we can get this feature added.

@ArtemBernatskyy (Author)

Thanks!
I decided to use OpenAI's Whisper API for now; in my tests it beats local Whisper with CoreML by 3-4x (measured on a MacBook M1 with 32 GB RAM).

@ryzn0518 commented Jul 21, 2023

> (quoting @stlukey's comment above in full)

@stlukey

I have verified that my machine is an M2, and CoreML does not seem to be enabled by this build.

I also added the compile flags, which do not seem to work either:

if sys.platform == 'darwin':
    print("run here.....")
    os.environ['CFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11'
    os.environ['CXXFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11'
    os.environ['LDFLAGS'] = '-framework Accelerate'

That is, I added -DWHISPER_COREML=1 and used the latest whisper.cpp code. When I run the generated whisper.xxxx.so to transcribe an audio file, it takes 12 minutes. But if I compile whisper.cpp at the same commit with cmake -DWHISPER_COREML=1 and run ./main on the same audio, it only takes 7 minutes, and I can see in the loading output:

whisper_init_state: loading Core ML model from 'models/ggml-large-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
system_info: n_threads = 4 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | COREML = 1 | OPENVINO = 0 |

It loads the Core ML model and takes only 8 minutes. I expected the .so build to do the same. How can I modify it?
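
A plausible explanation, going by whisper.cpp's CMakeLists at the time: cmake -DWHISPER_COREML=1 does more than define a macro. It also compiles the Objective-C Core ML bridging sources and links the CoreML framework, so setting -DWHISPER_COREML=1 in CFLAGS alone leaves the Core ML code path out of the .so. A hedged sketch of what setup.py would additionally need (source paths are from whisper.cpp circa mid-2023 and may have moved; verify against the submodule):

import os
import sys

if sys.platform == 'darwin':
    os.environ['CFLAGS']   = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11'
    os.environ['CXXFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11'
    # CoreML and Foundation are needed on top of Accelerate once the
    # Core ML bridging sources are compiled in.
    os.environ['LDFLAGS']  = '-framework Accelerate -framework Foundation -framework CoreML'

# Hypothetical: the extension's source list would also need the files that
# cmake -DWHISPER_COREML=1 compiles (paths as of mid-2023):
coreml_sources = [
    'whisper.cpp/coreml/whisper-encoder.mm',
    'whisper.cpp/coreml/whisper-encoder-impl.m',
]

Even then, the ggml-large-encoder.mlmodelc directory has to sit next to the ggml model, exactly as in the ./main run above.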
