Safe Haskell | Safe-Inferred |
---|---|
Language | GHC2021 |
Data.BloomFilter.Mutable
Description
A fast, space efficient Bloom filter implementation. A Bloom filter is a set-like data structure that provides a probabilistic membership test.
- Queries do not give false negatives. When an element is added to
a filter, a subsequent membership test will definitely return
True
. - False positives are possible. If an element has not been added to a filter, a membership test may nevertheless indicate that the element is present.
This module provides low-level control. For an easier to use interface, see the Data.BloomFilter.Easy module.
Synopsis
- type Hash = Word64
- type MBloom s = MBloom' s CheapHashes
- data MBloom' s h a
- data CheapHashes a
- data RealHashes a
- new :: Int -> Word64 -> ST s (MBloom' s h a)
- length :: MBloom' s h a -> Word64
- elem :: (Hashes h, Hashable a) => a -> MBloom' s h a -> ST s Bool
- insert :: (Hashes h, Hashable a) => MBloom' s h a -> a -> ST s ()
Overview
Each of the functions for creating Bloom filters accepts two parameters:
- The number of bits that should be used for the filter. Note that a filter is fixed in size; it cannot be resized after creation.
- A number of hash functions, k, to be used for the filter.
By choosing these parameters with care, it is possible to tune for
a particular false positive rate.
The suggestSizing
function in
the Data.BloomFilter.Easy module calculates useful estimates for
these parameters.
Ease of use
This module provides both mutable interfaces for creating and querying a Bloom filter. It is most useful as a low-level way to manage a Bloom filter with a custom set of characteristics.
Performance
The implementation has been carefully tuned for high performance and low space consumption.
Types
type MBloom s = MBloom' s CheapHashes Source #
Mutable Bloom filter using CheapHashes hashing scheme.
A mutable Bloom filter, for use within the ST
monad.
data CheapHashes a Source #
A pair of hashes used for a double hashing scheme.
See evalCheapHashes
.
Instances
data RealHashes a Source #
A closure of real hashing function.
Instances
Hashes (RealHashes :: Type -> Type) Source # | |
Defined in Data.BloomFilter.Hash Methods makeHashes :: Hashable a => a -> RealHashes a Source # evalHashes :: RealHashes a -> Int -> Hash Source # |
Mutable Bloom filters
Creation
Arguments
:: Int | number of hash functions to use |
-> Word64 | number of bits in filter |
-> ST s (MBloom' s h a) |
Create a new mutable Bloom filter.
The size is ceiled at $2^48$. Tell us if you need bigger bloom filters.
Accessors
elem :: (Hashes h, Hashable a) => a -> MBloom' s h a -> ST s Bool Source #
Query a mutable Bloom filter for membership. If the value is
present, return True
. If the value is not present, there is
still some possibility that True
will be returned.