Attempt to consolidate SSE and ARM NEON SIMD code for GCC/clang and Visual Studio #252
Open
fanc999 wants to merge 9 commits into ebassi:master
This way, we can try to abstract uses of such calls between different compilers that we support instead of repeating them in the headers due to differences in compiler syntax/feature support.
Use the newly-added macros to abstract the one-liner intrinsic calls for GCC/clang and Visual Studio for building the SSE code, to reduce duplication. It's not totally exhaustive, but should cover quite a number of items.
Use the newly-added macros to abstract one-liner intrinsic calls for GCC/clang and Visual Studio for building the ARM NEON code, to reduce duplication. It's not totally exhaustive, but should cover quite a number of items.
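As a rough illustration of the idea in the commits above, the sketch below wraps a one-call operation in a single macro that expands to a GCC/clang statement expression or to a direct call elsewhere. It is not the code from this PR: `demo_vec4_t`, `demo_intrin_add()`, `DEMO_SIMD_ONELINER()` and `demo_simd4f_add()` are hypothetical names, and a plain struct stands in for the real SSE/NEON vector types so the snippet builds on any target.

```c
/* Stand-in for a 4-float SIMD register (assumption: a real build would use
 * the compiler's SSE or NEON vector type instead of a struct). */
typedef struct { float f[4]; } demo_vec4_t;

/* Stand-in for a one-call intrinsic such as a vector add. */
static inline demo_vec4_t
demo_intrin_add (demo_vec4_t a, demo_vec4_t b)
{
  demo_vec4_t r;
  int i;

  for (i = 0; i < 4; i++)
    r.f[i] = a.f[i] + b.f[i];

  return r;
}

/* Hypothetical wrapper: choose the expression style once per compiler. */
#if defined(__GNUC__)
/* GCC and clang: statement expression, marked as an extension */
# define DEMO_SIMD_ONELINER(expr)  (__extension__ ({ (expr); }))
#else
/* Visual Studio and others: use the call expression directly */
# define DEMO_SIMD_ONELINER(expr)  (expr)
#endif

/* A single definition now serves every compiler. */
#define demo_simd4f_add(a, b) \
  DEMO_SIMD_ONELINER (demo_intrin_add ((a), (b)))
```

With this in place, `demo_simd4f_add (a, b)` is written once in the header, and only `DEMO_SIMD_ONELINER()` differs between compilers.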
These macros can be used to abstract initializing SIMD data arrays for the different compilers that we support, especially as we already require C99 support for building and using Graphene.
Use the macros that we just added to initialize the graphene_simd4f_t arrays with the floats that we pass into graphene_simd4f_init*() as applicable. This especially simplifies the code for Visual Studio since we already require C99 support for building and using Graphene, and we can reduce some code duplication.
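A minimal sketch of what such an initialization macro could look like follows. It is only an assumption about the general shape, not the macros added in this PR: `demo_simd4f_t` is a plain struct standing in for graphene_simd4f_t, and `DEMO_SIMD4F_INIT()` and the `demo_simd4f_*` functions are hypothetical names.

```c
/* Stand-in for graphene_simd4f_t (assumption: the real type is a
 * compiler-specific SIMD vector, not a struct). */
typedef struct { float f[4]; } demo_simd4f_t;

/* Hypothetical macro: the brace initializer is spelled once. If a compiler
 * needed a different spelling, only this macro would have to change. */
#define DEMO_SIMD4F_INIT(x, y, z, w)  { { (x), (y), (z), (w) } }

static inline demo_simd4f_t
demo_simd4f_init (float x, float y, float z, float w)
{
  /* C99 allows non-constant expressions in this aggregate initializer. */
  demo_simd4f_t s = DEMO_SIMD4F_INIT (x, y, z, w);
  return s;
}

static inline demo_simd4f_t
demo_simd4f_splat (float v)
{
  /* Same macro, with one value repeated in all four lanes. */
  demo_simd4f_t s = DEMO_SIMD4F_INIT (v, v, v, v);
  return s;
}

static inline demo_simd4f_t
demo_simd4f_init_zero (void)
{
  demo_simd4f_t s = DEMO_SIMD4F_INIT (0.f, 0.f, 0.f, 0.f);
  return s;
}
```

Every init variant reuses the same macro, so the per-compiler differences stay in one place.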
We don't need separate, identical typedefs for GCC/clang and Visual Studio. Put them in one place for all cases.
We can instead define them to call the respective graphene_simd4f_get() accordingly.
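The commit message does not say which items "them" refers to. Assuming it means per-component accessors, one way to express them in terms of a single indexed getter might look like the sketch below. All `demo_` names are hypothetical and a plain struct stands in for the real SIMD type.

```c
/* Stand-in for graphene_simd4f_t (assumption, for portability). */
typedef struct { float f[4]; } demo_simd4f_t;

/* Hypothetical indexed getter, standing in for graphene_simd4f_get(). */
static inline float
demo_simd4f_get (demo_simd4f_t s, unsigned int i)
{
  /* Masking keeps the index within the four lanes. */
  return s.f[i & 3u];
}

/* Per-component accessors become thin aliases of the indexed getter,
 * so no compiler-specific implementation is needed for each of them. */
#define demo_simd4f_get_x(s)  demo_simd4f_get ((s), 0)
#define demo_simd4f_get_y(s)  demo_simd4f_get ((s), 1)
#define demo_simd4f_get_z(s)  demo_simd4f_get ((s), 2)
#define demo_simd4f_get_w(s)  demo_simd4f_get ((s), 3)
```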
...with graphene_msvc_ instead of just _. This attempts to make things clearer to people.
...with graphene_msvc_ instead of just _. This attempts to make things clearer to people.
Hi,
This attempts to clean up the code a bit in graphene-simd4f.h and graphene-simd4x4f.h by reducing the code duplication in the SSE and ARM NEON SIMD implementations that is caused by syntactical differences between Visual Studio and GCC/clang with regard to inlining, via:

- Macros that abstract the two forms (__extension__ and direct intrinsic call) supported by GCC/clang and Visual Studio, for calls that can be done as one-liners.
- Macros that initialize the graphene_simd4f_t arrays from the 4 floats that we pass in, especially as we have required C99 support for a while and the supported Visual Studio compilers have the needed support for this.
- Prefixing the Visual Studio-specific helpers with graphene_msvc_ instead of just _ to make things clearer to people[1].

[1]: Sadly, I was not able to do the cleanup for the SIMD code that is written in a function-like manner. I couldn't get the preprocessor happy in one shot for Visual Studio and clang, ugh :|, so I had to leave that alone, since the preprocessor doesn't allow a working #define inside a macro and doesn't like a macro body being split across lines by #if/#ifdef blocks. So this is the best I could do for now. For instance:
(unrelated parts omitted for brevity; I'm recalling this off the top of my head, so there might be some mistakes below)
I understand that this PR might well conflict with the changes in #251, so if either this or #251 goes through, I will fix things up as needed as soon as possible.
With blessings, thank you!