Grouped Linear layers #857

opfromthestart · 2023-08-25T13:13:44Z

I have a model that needs to run in real time, and one of its layers has a width of 6000, which makes the model take about 0.3 seconds to run (I need 30-60 fps). One solution I can think of is a grouped linear layer, where GroupLinear<I, O, C> takes an input of shape (B, C*I) and returns an output of shape (B, C*O), with each of the C blocks of I inputs processed independently of the others. I don't think new kernels would need to be written, since it just runs C smaller linear layers.
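To make the proposed semantics concrete, here is a minimal sketch of the forward pass in plain Rust over flat row-major buffers. It does not use the library's tensor types. The function name `grouped_linear`, the bias term, and the (C, I, O) weight layout are assumptions for illustration only.

```rust
/// Hypothetical grouped linear forward pass (illustration only).
/// All buffers are flat and row-major:
///   x:    (B, C*I)  input
///   w:    (C, I, O) one weight matrix per group (assumed layout)
///   bias: (C, O)    one bias vector per group (assumed to exist)
/// Returns y with shape (B, C*O).
fn grouped_linear(
    x: &[f32],
    w: &[f32],
    bias: &[f32],
    b: usize,
    c: usize,
    i: usize,
    o: usize,
) -> Vec<f32> {
    assert_eq!(x.len(), b * c * i);
    assert_eq!(w.len(), c * i * o);
    assert_eq!(bias.len(), c * o);

    let mut y = vec![0.0f32; b * c * o];
    for n in 0..b {
        // Each group g only reads its own block of I inputs.
        for g in 0..c {
            for k in 0..o {
                let mut acc = bias[g * o + k];
                for j in 0..i {
                    let x_val = x[n * c * i + g * i + j];
                    let w_val = w[(g * i + j) * o + k];
                    acc += x_val * w_val;
                }
                y[n * c * o + g * o + k] = acc;
            }
        }
    }
    y
}
```

Because group `g` never reads inputs outside its own block, changing one block of the input only changes the matching block of the output. In a real implementation the inner loops would presumably map onto a batched matrix multiply rather than scalar loops.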

opfromthestart · 2023-08-26T15:43:21Z

The weights should probably be stored as roughly a Tensor<Rank3<C, I, O>,...>.
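As a rough illustration of what a (C, I, O) weight layout implies, the sketch below computes the flat row-major offset of one weight and compares weight counts against a single dense layer of the same total width. The helper names are hypothetical, and the choice of C = 4 with 6000 inputs and 6000 outputs is an assumed example, not a figure from the model above.

```rust
/// Flat row-major offset of weight (g, j, k) in a (C, I, O) tensor
/// (hypothetical helper; g = group, j = input index, k = output index).
fn weight_index(g: usize, j: usize, k: usize, i: usize, o: usize) -> usize {
    (g * i + j) * o + k
}

/// Weights in one dense layer mapping C*I inputs to C*O outputs.
fn dense_weight_count(c: usize, i: usize, o: usize) -> usize {
    (c * i) * (c * o)
}

/// Weights in C independent I -> O layers stored as (C, I, O).
fn grouped_weight_count(c: usize, i: usize, o: usize) -> usize {
    c * i * o
}
```

With the assumed C = 4 and I = O = 1500, the dense layer holds 36,000,000 weights and the grouped layer 9,000,000, so the grouped form needs C times fewer weights and multiply-adds.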
