[Core] Support all head sizes up to 256 with FlashAttention backend #8910

njhill · 2024-09-27T17:37:40Z

We were previously restricting the FlashAttention backend to a fixed list of head sizes, but the native FA kernels pad internally and support arbitrary head sizes up to 256.
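To make the description concrete, here is a minimal sketch of the kind of check it implies: accept any positive head size up to 256 rather than membership in a fixed list. The names `MAX_HEAD_SIZE`, `is_supported_head_size`, and `padded_head_size` are hypothetical and are not taken from this PR's diff. The padding multiple of 8 is an assumption for illustration only; the actual padding happens inside the FA kernels.

```python
# Illustrative sketch only; names and the padding multiple are assumptions,
# not the code changed by this PR.
MAX_HEAD_SIZE = 256  # upper bound stated in the PR description


def is_supported_head_size(head_size: int, max_head_size: int = MAX_HEAD_SIZE) -> bool:
    """Return True for any positive head size up to the maximum."""
    return 0 < head_size <= max_head_size


def padded_head_size(head_size: int, multiple: int = 8) -> int:
    """Round a supported head size up to the next multiple (assumed to be 8)."""
    if not is_supported_head_size(head_size):
        raise ValueError(f"Unsupported head size: {head_size}")
    # Ceiling division, then scale back up to the padded size.
    return -(-head_size // multiple) * multiple
```

With this sketch, a size such as 100 would be accepted and padded up, while 257 would be rejected.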

tlrmchlsmth

Could you add some unit tests? Looks like we may be able to just extend this list here🤞

Lines 32 to 34 in c2ec430

    
    # FlashAttention forward only supports head dimension at most 128
    # https://github.com/ROCmSoftwarePlatform/flash-attention/blob/3d2b6f5d037782cc2c906909a46fb7e2e1b48b25/csrc/flash_attn_rocm/flash_api.cpp#L62
    HEAD_SIZES = [64, 80, 96, 112, 120, 128, 192, 256]
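As a rough sketch of what "extending this list" could look like, the snippet below merges the existing values with a few additional sizes. The extra values in `EXTRA_HEAD_SIZES` are hypothetical picks for illustration, not the sizes added in this PR.

```python
# Head sizes currently listed in the test file quoted above.
ORIGINAL_HEAD_SIZES = [64, 80, 96, 112, 120, 128, 192, 256]

# Hypothetical additions for illustration; not the values from this PR.
EXTRA_HEAD_SIZES = [32, 72, 160, 224]

# Merge, de-duplicate, and sort so parametrized tests run in a stable order.
HEAD_SIZES = sorted(set(ORIGINAL_HEAD_SIZES) | set(EXTRA_HEAD_SIZES))

# Sanity check: every size stays within the 256 limit from the PR description.
assert all(0 < size <= 256 for size in HEAD_SIZES)
```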

njhill · 2024-09-27T23:54:56Z

Looks like we need to build flash without the FLASHATTENTION_DISABLE_UNEVEN_K flag, have opened vllm-project/flash-attention#21 ... @WoosukKwon wdyt?

[Core] Support all head sizes up to 256 with FlashAttention backend

4451bbc

We were previously restricting to specific sizes, but the native FA kernels pad and support arbitrary sizes up to 256.

njhill requested a review from WoosukKwon September 27, 2024 17:37

tlrmchlsmth reviewed Sep 27, 2024

Test more head sizes

dbbe6dc

vllm-project deleted a comment from github-actions bot Sep 27, 2024
