-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try to optimize a couple of things #1740
base: main
Are you sure you want to change the base?
Conversation
d27b1cc
to
f00dd78
Compare
Hmm, I think I have to split this into smaller PRs. |
#include "mpn_extras.h" | ||
#include "fmpz.h" | ||
|
||
__GMP_DECLSPEC extern void * (*__gmp_allocate_func)(size_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we assume that GMP allocation functions are the same as FLINT's, so there shouldn't need to be some special magic in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect there are already cases where we put a flint_malloc
:ed pointer in an mpz
. I could be wrong about this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then why don't we simply define flint_malloc
as __gmp_allocate_func
?
Yes. The inplace functions look nice though. Is |
|
That is my thinking.
It sounds extremely specialized to have whole modules just for inplace versions of functions.
I think you are more likely to search for versions of a specific operation than for a group of inplace operations. |
This PR seems to be faster when ADX is available, but I had troubles getting consistent results. I don't know if these changes for
fmpz_mul
andfmpz_sqr
are actually faster.Perhaps there are things one could optimize, and if so, I would be happy to do so.
Perhaps
X_mat_neg
could have aX_mat_inplace_neg
instead.NOTE: I think the assertion worker will fail due to
fmpz_mul
andfmpz_sqr
allowing aliasing under certain circumstances when ADX assembly is enabled.