We could replace the []uint64 bit vector implementation we currently use with a []byte implementation. This can save up to 7 bytes per bit vector in unlucky cases, where the bit vector must use one additional uint64 when the vector's size isn't divisible by 64.
Note, however, that initial experiments with a []byte implementation seem to indicate that it is slightly slower to call bv.Set(). I have not tested it in BBHash.
We should benchmark the two implementations and compare them.