implementation of other operators #125

Open
brightening-eyes opened this issue Dec 30, 2023 · 9 comments

brightening-eyes commented Dec 30, 2023

hello.
thanks for such a cool library.
if you have time, or if possible, please let me know how I can add these operators:
convTranspose, gelu, layerNorm, groupNorm, instanceNorm, swish, mish, localResponseNorm, paddings, permute, repeat, channelShuffle, reshape, celu, selu, elu, shrink, softmax, softplus, expandDims, squeeze, tile, dft/stft?
the reason is that these ops, when implemented, would help rtNeural run transformers and attention-based models, autoencoders, and spectrogram-to-spectrogram conversion models, allowing people to write various plugins like noise removal etc.
other things like abs, log, sin, cos, sinh, cosh, argmin, argmax, clip, etc. seem not to be implemented, although they seem easier to implement than the others.
thanks.

janaboy74 commented Dec 30, 2023

I think I have the softplus:
https://www.desmos.com/calculator/bcdanrpfqv
the abs/log/sin/etc. are math functions; what do you want with them? is #include <cmath> not enough?

@janaboy74

Ok, these are very similar:
https://www.desmos.com/calculator/gudlnw7jxw

@brightening-eyes
Author

@janaboy74
the issue is not with the math functions. for std stuff, a #include completely suffices.
for things like softplus, gelu, and in-place operations like log, sin, etc., a single function call in a loop completely suffices.
but for things like eigen and xsimd, I suppose things might work differently.
also, many of these operators are not in-place ops, meaning that they do not map the input element-wise to the output like relu does.
for example, convolutions, transposed convolutions, and poolings have different input and output shapes, so the outputs have to be calculated from the inputs and the given weights.
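(just to illustrate what I mean by "a single function call in a loop", here is a minimal sketch of an element-wise softplus on a plain buffer:)

#include <cmath>

// element-wise softplus on a plain buffer; input and output shapes stay the same,
// unlike a convolution or pooling where they differ
void softplus(const float* ins, float* outs, int size)
{
    for(int i = 0; i < size; ++i)
        outs[i] = std::log(1.0f + std::exp(ins[i]));
}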

janaboy74 commented Dec 30, 2023

@brightening-eyes
OK, eigen and xsimd -> these are GPU things if I see it correctly. What is the target device? PC / mobile phone / RPi & clones?
This:
https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units
or this:
https://www.openmp.org/
or just:
C++11 threads.

@jatinchowdhury18
Owner

Hello!

A couple of the operators that you've mentioned (e.g. Elu and Softmax) are currently implemented in RTNeural.

You've mentioned a lot of operators, so I'll try to split them into three parts: basic math operations, Fourier-style transforms, and neural operators.

First I should mention that RTNeural is different from many "machine learning libraries", in that it only supports inference, not training. This makes it much simpler for users to provide their own operators. For example, a tanh operator in a traditional ML framework needs to support back-propagation in whatever form the framework requires. However, in RTNeural, you can "bring your own" implementation of tanh... by default RTNeural will use the implementations provided by the chosen backend (e.g. std::tanh or xsimd::tanh), but we've recently worked on making this more flexible (see the recent commits regarding the MathsProvider concept).
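(As a rough sketch of the "bring your own" idea, and not necessarily the exact MathsProvider interface, a custom provider supplying its own tanh might look something like this; the usage line is hypothetical:)

// Sketch only: a maths provider that swaps in a cheap tanh approximation.
// The exact interface expected by each backend may differ; see the MathsProvider commits.
struct FastMathsProvider
{
    template <typename T>
    static T tanh(T x)
    {
        // Pade-style rational approximation of tanh, accurate for small |x|
        return x * (T(27) + x * x) / (T(27) + T(9) * x * x);
    }
};

// hypothetical usage, passing the provider as a template argument to an activation layer:
// RTNeural::TanhActivationT<float, 8, FastMathsProvider> tanh_layer;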

1. Math operators

I think I would generally prefer to avoid supplying basic math operations (e.g. abs, sin/cos, etc.) within RTNeural. While I could definitely see the use-case for having them in the library, I think it would make the library way bigger than it needs to be. The only reason I think it would make sense to add a math operator to the library is if it is also typically used as an "activation" function in traditional neural nets (like with tanh). Of the operators you listed, I think the clip function might meet that requirement... maybe argmin/argmax as well? It's a tricky distinction to make, and I'm definitely open to a conversation about it, but I hope you understand my hesitance to add every possible math operator to the library.

As for how you would implement a math operator for your own use with RTNeural, it should be pretty straightforward.

/** Static implementation of a sin operator. */
template <typename T, int size>
class SinFunctionT
{
public:
    static constexpr auto in_size = size;
    static constexpr auto out_size = size;

    SinFunctionT() = default;

    /** Performs forward propagation for sin function. */
    inline void forward(const T (&ins)[size]) noexcept
    {
        for(int i = 0; i < size; ++i)
            outs[i] = std::sin(ins[i]);
    }

    T outs alignas(RTNEURAL_DEFAULT_ALIGNMENT)[size];
};

If you're using the XSIMD backend, the forward method could be written:

using v_type = xsimd::simd_type<T>;
static constexpr auto v_size = (int)v_type::size;
static constexpr auto v_io_size = ceil_div(size, v_size);

inline void forward(const v_type (&ins)[v_io_size]) noexcept
{
    for(int i = 0; i < v_io_size; ++i)
        outs[i] = xsimd::sin(ins[i]);
}

And with Eigen:

using v_type = Eigen::Matrix<T, size, 1>;

inline void forward(const v_type& ins) noexcept
{
    outs = ins.array().sin();
}
2. Fourier-style operators

For operators like the DFT, STFT, DCT, etc... I think I would prefer to avoid implementing those within RTNeural as well, largely for the same reasoning mentioned above. At least with the math operators, we have a pretty "standard" implementation that's always available to us (<cmath>), whereas with Fourier-style operators, there are so many libraries out there to choose from, each with its own complexities around usage and licensing.

For implementing these types of operators for your own use, I think it's hard to give advice without knowing more about the use-case, but I think in most cases it would probably make sense to apply your Fourier operators outside of RTNeural. For example, do a Fourier transform, then run inference on your transformed data in RTNeural, and then do an inverse transform using the output from RTNeural. The reason I would suggest this is that the way RTNeural uses the XSIMD and Eigen libraries to improve performance may not interact well with the tricks used by many Fourier transform libraries to get the best possible performance for those operations. Again, this may depend on your Fourier transform library of choice, and on your particular application.
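(A rough sketch of that flow, with my_fft / my_ifft as placeholders for whatever FFT library you end up using, and assuming the model's output frame has the same size as its input:)

#include <RTNeural/RTNeural.h>
#include <vector>

// Placeholders for your FFT library of choice (definitions not shown).
std::vector<float> my_fft(const std::vector<float>& time_frame);
std::vector<float> my_ifft(const std::vector<float>& spectrum);

// Sketch: forward transform -> RTNeural inference -> inverse transform,
// keeping the Fourier work entirely outside of RTNeural.
std::vector<float> process_frame(RTNeural::Model<float>& model, const std::vector<float>& time_frame)
{
    const std::vector<float> spectrum = my_fft(time_frame); // hypothetical forward transform
    model.forward(spectrum.data());                          // run inference on the transformed data
    const float* model_out = model.getOutputs();             // grab the model's output buffer
    const std::vector<float> out_spectrum(model_out, model_out + spectrum.size());
    return my_ifft(out_spectrum);                            // hypothetical inverse transform
}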

3. Neural operators

Some of the operators you mentioned (e.g. softplus, selu, etc.) seem like pretty straightforward activation functions, and it would be great to add them to the library! I imagine they could be done pretty simply using the existing activation function implementations as examples.
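For example, a first pass at a static softplus activation could follow the same pattern as the sin example above (just a sketch, not a finished implementation):

/** Sketch of a static softplus activation, following the sin example above. */
template <typename T, int size>
class SoftplusActivationT
{
public:
    static constexpr auto in_size = size;
    static constexpr auto out_size = size;

    SoftplusActivationT() = default;

    /** Performs forward propagation: softplus(x) = log(1 + exp(x)). */
    inline void forward(const T (&ins)[size]) noexcept
    {
        for(int i = 0; i < size; ++i)
            outs[i] = std::log((T)1 + std::exp(ins[i]));
    }

    T outs alignas(RTNEURAL_DEFAULT_ALIGNMENT)[size];
};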

It seems like most of the other operators that you've mentioned are intended to work on two-dimensional data (e.g. reshape, convTranspose, channelShuffle). This is something we've been slowly working on, starting with Conv2D support that was added a few months ago. The current approach is to pass a single array through the model which contains the two-dimensional data in a "flattened" format. However, this approach has some issues (particularly with the XSIMD backend), and I worry that it may not scale well for many layers that are operating on 2D data (at least not in its current form). I have some ideas about how to solve this issue, but I haven't yet written any code in that direction. Again, I'd be happy to discuss this further if you've got some ideas, or have some time available to help with implementation/testing.
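(To make the "flattened" format concrete, the idea is roughly the following, assuming a simple row-major channels-by-samples layout; the exact layout Conv2D expects may differ:)

#include <vector>

// Sketch: pack 2D data (num_channels x num_samples) into one contiguous array,
// so it can be passed through the model as a single flat input.
// This assumes a row-major layout and is only meant to illustrate the idea.
std::vector<float> flatten(const std::vector<std::vector<float>>& data2d)
{
    std::vector<float> flat;
    for(const auto& row : data2d)
        flat.insert(flat.end(), row.begin(), row.end());
    return flat;
}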

Anyway, sorry for the long reply. I hope this information is helpful for you! Feel free to reply here, or to split this out into a couple of separate issues if you prefer.

@janaboy74

@jatinchowdhury18
Hi!
I am not an AI expert. I wanted to get familiar with that area later; that was my plan. 😀
Now I have another project, a crypto thing, which is more important for me right now. 😉
If I understand you correctly, the AI needs the data to be transformed. A picture, for example, needs to be linearised somehow. Fourier is helpful for audio; JPEG also does it for pictures, after the block zig-zag transform.
I know the vector geometry stuff, because I was a 3D game developer for almost 10 years. OK, it was a long time ago.

What I can do is try to fix my json stuff. I don't think it will be 100% compatible with all json files, but hopefully I will make it handle as much as I can. The main design idea was to create an easy-to-use parser, so the data is easy to get at once it is in memory. Maybe the structure I have created is not enough. I have added a vector for single items, and as I said the parser is working now, but formatting the output is much harder. I don't know when I will be done with it.
I have tested it with some json files, but obviously that was not enough.
My test criterion was that whatever "python3 -m json.tool" can understand should work with my stuff as well. So I need to improve it.
Later we can talk about other things...like AI.

@algoravioli

I can work on some of the proposed activation functions, as I had already been working on them: GeLU, swish, mish, CELU, SELU. I can check to see if softmax and softplus are possible/already done.

@brightening-eyes
Author

@jatinchowdhury18
thanks for your detailed reply.
the reason I proposed fft-related stuff was to make rtNeural more complete, like onnx and its operator coverage.
things like channelShuffle can work on 1d stuff as well; channelShuffle is used alongside grouped convolutions.
things like repeat/transpose etc. are similar: they can change the order of channel/time or time/channel, or repeat a matrix along a given axis.
and slice can split a matrix into a vector of matrices; this can be used for GLU activation stuff.

@janaboy74

I see.
