
[Codegen][llvmgpu] Compute gemmC size when C promotion is done in padding matmul #19307

Status: Open — wants to merge 3 commits into base: main
Conversation

jerryyin (Member) commented Nov 26, 2024

This PR depends on #19271. Please review the last commit only.

The existing implementation of #19271 doesn't take gemmC into consideration when computing shared memory size. However, under the conditions of #19271, gemmC always gets promoted, so we always end up allocating the C tensor in shared memory. Ignoring the C tensor severely underestimates the amount of shared memory used, which eventually causes deduceMMASchedule() to pick a tile size that is too large and exceeds the shared memory limit.

6c9ffd2 addresses this by adding calculateResultSharedMemoryUsedInBytes() and applying it to the matmul tile-size derivation process.

nirvedhmeshram and others added 3 commits November 22, 2024 14:07
- This is a followup of iree-org#19271
- This takes gemmC promotion into consideration when computing shared
  memory consumption

Signed-off-by: jerryyin <[email protected]>
@jerryyin jerryyin changed the title Compute gemmC size when C promotion is done in padding matmul [Codegen][llvmgpu] Compute gemmC size when C promotion is done in padding matmul Nov 26, 2024