@lemire
Member

@lemire lemire commented Jan 7, 2026

Alternative to #49

@lemire lemire merged commit dc287a3 into master Jan 7, 2026
10 checks passed
@Giulio2002
Giulio2002 commented Jan 8, 2026

Wait, but this won't actually work with mmap anyway, and mmap is also part of our use case. Is there any interest in that, or should I just make a fork?

@lemire
Member Author

lemire commented Jan 8, 2026

@Giulio2002 Supporting memory file mapping on big endian platforms based on files created on a little endian system seems quite niche.

Can you elaborate on your application?

Is this a real issue you are encountering? If so, please elaborate on your business needs so that I understand.

@Giulio2002
Giulio2002 commented Jan 8, 2026

We run a distributed database for the Ethereum blockchain. We generate some data, and users sync it with our OSS tool. The data is distributed through BitTorrent (as files we distribute and seed). We currently use Bloom filters as our existence filters and are looking to switch. However, if a user runs a BE CPU while we generate the files on an LE CPU and use fuse filters, we have an issue. We use mmap because some of these filters are huge (more than 2.2B entries), and the Bloom filters already take more than 30GB in total, so they cannot stay in RAM (we are bound by a 64GB total RAM requirement).

If it's too niche (which, thinking out loud, it probably is), I will make my own fork, no issue.
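
For readers following the thread, here is a minimal sketch of the endianness concern with memory-mapped filter data. It assumes the mapped region is simply a packed array of 16-bit fingerprints written by a little-endian machine; that layout and the helper name `fingerprintAt` are illustrative assumptions, not part of the xorfilter API. Decoding each value with an explicit byte order gives the same result on little-endian and big-endian hosts, at the cost of not treating the mapping directly as a `[]uint16`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fingerprintAt decodes the i-th 16-bit fingerprint from a byte slice that
// holds little-endian data (for example, the bytes of a memory-mapped file).
// Hypothetical helper: the packed, header-less layout is an assumption.
func fingerprintAt(data []byte, i int) uint16 {
	// LittleEndian.Uint16 reads data[2*i] as the low byte and data[2*i+1]
	// as the high byte, regardless of the host CPU's byte order.
	return binary.LittleEndian.Uint16(data[2*i:])
}

func main() {
	// Two fingerprints, 0x1234 and 0xabcd, as a little-endian writer stores them.
	data := []byte{0x34, 0x12, 0xcd, 0xab}
	fmt.Printf("%#x %#x\n", fingerprintAt(data, 0), fingerprintAt(data, 1))
}
```

A cast of the same bytes to `[]uint16` on a big-endian host would instead yield 0x3412 and 0xcdab, which is why lookups against an unconverted file would fail there.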

@lemire
Member Author

lemire commented Jan 8, 2026

You can do this without changing the xorfilter library.

@Giulio2002

Well, not really. I suppose you could by changing the exposed Fingerprints, but it is still far from optimal. Changing the lib is more self-contained. If it's not in the lib's interest, I completely understand.

@lemire
Member Author

lemire commented Jan 8, 2026

I think that the proper way is to flip the bytes when transmitting the binary data to a big endian system. If you have enough sophistication to do memory file mapping on mainframe computers using data generated from regular computers, then that’s easy.
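
As an illustration of the byte-flipping approach, the following is a rough sketch, assuming the transferred payload is a packed array of 16-bit fingerprints. The function name `swapBytes16` is hypothetical and not part of xorfilter; any header fields (such as a seed or segment lengths) stored in native byte order would need their own conversion at their own widths, which is not shown. Filters with 8-bit fingerprints need no swapping at all, since single bytes have no byte order.

```go
package main

import (
	"errors"
	"fmt"
)

// swapBytes16 reverses the byte order of every 16-bit value in buf, in place.
// Applying it once to a little-endian fingerprint array produces the
// big-endian layout; applying it twice restores the original bytes.
func swapBytes16(buf []byte) error {
	if len(buf)%2 != 0 {
		return errors.New("buffer length is not a multiple of 2")
	}
	for i := 0; i < len(buf); i += 2 {
		buf[i], buf[i+1] = buf[i+1], buf[i]
	}
	return nil
}

func main() {
	// 0x1234 and 0xabcd as written by a little-endian machine.
	buf := []byte{0x34, 0x12, 0xcd, 0xab}
	if err := swapBytes16(buf); err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", buf) // prints "12 34 ab cd"
}
```

Running this once on the receiving side (or while writing the file to disk) means the converted file can then be memory-mapped and read natively on the big-endian host.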

3 participants