Hello,
I am currently developing a personal library for use on a Mac, based on MLX, the Stable Diffusion code in mlx-examples, and the diffusers library. The goal is something that is optimized for the Mac and can easily load and use checkpoints uploaded to Hugging Face. Let me briefly explain why I decided to do this:
- I enjoy coding and wanted to try something new.
- I often use the diffusers library, but occasionally run into bugs on the torch side. Even with data sizes that should fit, I've hit Out Of Memory (OOM) errors, which makes it hard to use smoothly.
- I invested in an MBP M3 Max with 128 GB of memory, and struggled to find ML and DL libraries optimized for it.
- Although I enjoy coding, I am quite a beginner and am learning as I go.
However, when trying to replicate the models implemented in the diffusers library, I found that MLX lacks certain operators, for example interpolation, image resizing, and einsum. I managed to implement interpolation (nearest-neighbor following mlx-examples/stable_diffusion, bilinear following Wikipedia and other GitHub projects) and image resizing using only MLX, but relatively complex functions like einsum are difficult for me to implement.
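As an illustrative sketch (not my exact code; the channels-last layout and the function name here are just assumptions for the example), a 2x nearest-neighbor upsample can be written with existing MLX ops like this:

import mlx.core as mx

def upsample_nearest_2x(x):
    # x is assumed to be (B, H, W, C), MLX's usual channels-last layout.
    B, H, W, C = x.shape
    # Insert singleton axes after H and W, broadcast each to size 2,
    # then merge them back so every pixel is repeated in a 2x2 block.
    x = mx.broadcast_to(x[:, :, None, :, None, :], (B, H, 2, W, 2, C))
    return mx.reshape(x, (B, H * 2, W * 2, C))

x = mx.random.normal((1, 8, 8, 3))
print(upsample_nearest_2x(x).shape)  # (1, 16, 16, 3)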
Currently, I am working around the missing ones by going through NumPy; for einsum, for example, like this (assuming import mlx.core as mx):
mx.array(np.einsum('text', np.array(arr1), np.array(arr2)))
Since I haven't yet developed fully functioning code, I can't assess how much this approach might degrade performance, or whether the impact is negligible. I would appreciate hearing the opinions of professionals on such an implementation.
It may be a vague question to answer clearly, but I would be grateful if you could take it as a novice's inquiry.
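To make the question concrete, here is a self-contained version of that round-trip (the subscripts 'bij,bjk->bik' and the shapes are illustrative assumptions, not from my actual code):

import mlx.core as mx
import numpy as np

arr1 = mx.random.normal((4, 8, 16))
arr2 = mx.random.normal((4, 16, 32))

# Converting to NumPy forces MLX's lazy graph to evaluate and copies the data
# out; np.einsum then runs on the CPU, and mx.array brings the result back.
out = mx.array(np.einsum('bij,bjk->bik', np.array(arr1), np.array(arr2)))
print(out.shape)  # (4, 8, 32)

My rough understanding is that the main costs are the forced evaluation, the host-side copies, and doing the contraction in NumPy rather than on the GPU, but I don't know how much that matters in practice.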
What is in the 'text' subscript string that you pass to einsum? You can decompose various einsum expressions into simpler operations in the meantime. I've recently had to do exactly that, so I'd love to help you out.
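For instance (an illustrative sketch with assumed subscripts 'bhqd,bhkd->bhqk', not something taken from this thread), an attention-style contraction decomposes into a transpose plus a matmul:

import mlx.core as mx
import numpy as np

# 'bhqd,bhkd->bhqk': contract the last axis of q with the last axis of k.
q = mx.random.normal((2, 4, 10, 64))
k = mx.random.normal((2, 4, 12, 64))
scores = mx.matmul(q, mx.transpose(k, (0, 1, 3, 2)))  # (2, 4, 10, 12)

# Sanity check against NumPy's einsum.
ref = np.einsum('bhqd,bhkd->bhqk', np.array(q), np.array(k))
print(np.allclose(np.array(scores), ref, atol=1e-4))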
Indeed, usually (but not always) there is a workaround using existing MLX ops.
We are also in the process of adding more ops. For example, there is currently a PR out for exactly this, einsum (#392), but it will take time until we have all of the missing operations you listed. In the meantime, feel free to post any operation you find missing, with a bit of detail, and we are happy to help you find a workaround.