I have a model that needs to run in real time. One of its layers has a width of 6000, which makes it take about 0.3 seconds to run, while I need 30-60 fps (roughly 17-33 ms per frame). One solution I can think of is a grouped linear layer, where `GroupLinear<I, O, C>` takes an input of shape `(B, C*I)` and returns an output of shape `(B, C*O)`, with each of the C blocks of I inputs processed independently of the others. I don't think new kernels would need to be written, since it just runs C smaller linear layers.
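To make the request concrete, here is a minimal CPU sketch of the behaviour I have in mind, in plain Rust with no library dependencies. Only the name `GroupLinear<I, O, C>` and the input/output shapes come from the description above; the `new` and `forward` methods, the flat `Vec<f32>` storage, and the weight layout (C blocks, each a row-major O x I matrix) are assumptions made for illustration, not an existing API.

```rust
/// Hypothetical grouped linear layer: C independent (I -> O) linear maps.
/// The storage layout below is an assumption chosen for this sketch.
struct GroupLinear<const I: usize, const O: usize, const C: usize> {
    weight: Vec<f32>, // length C * O * I: C blocks, each row-major O x I
    bias: Vec<f32>,   // length C * O
}

impl<const I: usize, const O: usize, const C: usize> GroupLinear<I, O, C> {
    fn new(weight: Vec<f32>, bias: Vec<f32>) -> Self {
        assert_eq!(weight.len(), C * O * I);
        assert_eq!(bias.len(), C * O);
        Self { weight, bias }
    }

    /// `input` is row-major (batch, C * I); the result is row-major (batch, C * O).
    fn forward(&self, input: &[f32], batch: usize) -> Vec<f32> {
        assert_eq!(input.len(), batch * C * I);
        let mut out = vec![0.0f32; batch * C * O];
        for b in 0..batch {
            for c in 0..C {
                // Group c only ever reads its own slice of I inputs.
                let x_start = b * C * I + c * I;
                let x = &input[x_start..x_start + I];
                for o in 0..O {
                    let w_start = (c * O + o) * I;
                    let w = &self.weight[w_start..w_start + I];
                    let dot: f32 = w.iter().zip(x).map(|(w, x)| w * x).sum();
                    out[b * C * O + c * O + o] = dot + self.bias[c * O + o];
                }
            }
        }
        out
    }
}
```

For example, `GroupLinear::<2, 1, 2>::new(vec![1.0, 2.0, 3.0, 4.0], vec![0.5, -0.5]).forward(&[1.0, 1.0, 1.0, 1.0], 1)` yields two outputs, one per group. A dense layer from `C*I` inputs to `C*O` outputs needs `C*C*I*O` multiplications per sample, while the grouped version needs `C*I*O`, so the arithmetic drops by a factor of C. Whether that is enough to reach the target frame rate would depend on the hardware and on how the C small products are batched.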