| Safe Haskell | Safe-Inferred |
|---|---|
| Language | GHC2021 |
Database.LSMTree.Internal.RunAcc
Description
Incremental (in-memory portion of) run construction
Synopsis
- data RunAcc s = RunAcc {
- mbloom :: !(MBloom s SerialisedKey)
- mindex :: !(IndexAcc s)
- mpageacc :: !(PageAcc s)
- entryCount :: !(PrimVar s Int)
- new :: NumEntries -> RunBloomFilterAlloc -> Salt -> IndexType -> ST s (RunAcc s)
- unsafeFinalise :: RunAcc s -> ST s (Maybe RawPage, Maybe Chunk, Bloom SerialisedKey, Index, NumEntries)
- addKeyOp :: RunAcc s -> SerialisedKey -> Entry SerialisedValue BlobSpan -> ST s ([RawPage], [RawOverflowPage], [Chunk])
- addSmallKeyOp :: RunAcc s -> SerialisedKey -> Entry SerialisedValue BlobSpan -> ST s (Maybe (RawPage, Maybe Chunk))
- addLargeKeyOp :: RunAcc s -> SerialisedKey -> Entry SerialisedValue BlobSpan -> ST s ([RawPage], [RawOverflowPage], [Chunk])
- addLargeSerialisedKeyOp :: RunAcc s -> SerialisedKey -> RawPage -> [RawOverflowPage] -> ST s ([RawPage], [RawOverflowPage], [Chunk])
- entryWouldFitInPage :: SerialisedKey -> Entry SerialisedValue b -> Bool
- data RunBloomFilterAlloc
- newMBloom :: NumEntries -> RunBloomFilterAlloc -> Salt -> ST s (MBloom s a)
Documentation
The run accumulator is a mutable structure that accumulates key/op pairs. It yields pages and chunks of the index incrementally, and returns the Bloom filter and complete index at the end.
Use new to start run construction, add new key/operation pairs to the run
by using addKeyOp and co, and complete run construction using
unsafeFinalise.
new :: NumEntries -> RunBloomFilterAlloc -> Salt -> IndexType -> ST s (RunAcc s) Source #
starts an incremental run construction.new nentries
nentries should be an upper bound on the expected number of entries in the
output run.
unsafeFinalise :: RunAcc s -> ST s (Maybe RawPage, Maybe Chunk, Bloom SerialisedKey, Index, NumEntries) Source #
Finalise an incremental run construction. Do not use a RunAcc after
finalising it.
The frozen bloom filter and compact index will be returned, along with the final page of the run (if necessary), and the remaining chunks of the incrementally constructed compact index.
Adding key/op pairs
There are a few variants of actions to add key/op pairs to the run accumulator. Which one to use depends on a couple questions:
- Is it fully in memory or is it pre-serialised and only partly in memory?
- Is the key/op pair known to be "small" or "large"?
If it's in memory but it's not known whether it's small or large then
use addKeyOp. One can use entryWouldFitInPage to find out if it's
small or large. If it's in memory and known to be small or large then
use addSmallKeyOp or addLargeKeyOp as appropriate. If it's large
and pre-serialised, use addLargeSerialisedKeyOp but note its
constraints carefully.
Arguments
| :: RunAcc s | |
| -> SerialisedKey | |
| -> Entry SerialisedValue BlobSpan | the full value, not just a prefix |
| -> ST s ([RawPage], [RawOverflowPage], [Chunk]) |
Add a key/op pair with an optional blob span to the run accumulator.
Note that this version expects the full value to be in the givenEntry, not
just a prefix of the value that fits into a single page.
If the key/op pair is known to be "small" or "large" then you can use
the special versions addSmallKeyOp or addLargeKeyOp. If it is
pre-serialised, use addLargeSerialisedKeyOp.
addSmallKeyOp :: RunAcc s -> SerialisedKey -> Entry SerialisedValue BlobSpan -> ST s (Maybe (RawPage, Maybe Chunk)) Source #
Add a "small" key/op pair with an optional blob span to the run accumulator.
This version is only for small entries that can fit within a single page.
Use addLargeKeyOp if the entry is bigger than a page. If this distinction
is not known at the use site, use entryWouldFitInPage to determine
which case applies, or use addKeyOp.
This is guaranteed to add the key/op, and it may yield (at most one) page.
Arguments
| :: RunAcc s | |
| -> SerialisedKey | |
| -> Entry SerialisedValue BlobSpan | the full value, not just a prefix |
| -> ST s ([RawPage], [RawOverflowPage], [Chunk]) |
Add a "large" key/op pair with an optional blob span to the run accumulator.
This version is only for large entries that span multiple pages. Use
addSmallKeyOp if the entry is smaller than a page. If this distinction
is not known at the use site, use entryWouldFitInPage to determine
which case applies.
Note that this version expects the full large value to be in the given
Entry, not just the prefix of the value that fits into a single page.
For large multi-page values that are represented by a pre-serialised
RawPage (as occurs when merging runs), use addLargeSerialisedKeyOp.
This is guaranteed to add the key/op. It will yield one or two RawPages,
and one or more RawOverflowPages. These pages should be written out to
the run's page file in that order, the RawPages followed by the
RawOverflowPages.
addLargeSerialisedKeyOp Source #
Arguments
| :: RunAcc s | |
| -> SerialisedKey | The key |
| -> RawPage | The page that this key/op is in, which must be the
first page of a multi-page representation of a single
key/op without a |
| -> [RawOverflowPage] | The overflow pages for this key/op |
| -> ST s ([RawPage], [RawOverflowPage], [Chunk]) |
Add a "large" pre-serialised key/op entry to the run accumulator.
This version is for large entries that span multiple pages and are
represented by already serialised RawPage and one or more
RawOverflowPages.
For this case, the caller provides the key, the raw page it is from and the
overflow pages. The raw page and overflow pages are returned along with any
other pages that need to be yielded (in order). The caller should write out
the pages to the run's page file in order: the returned RawPages followed
by the RawOverflowPages (the same as for addLargeKeyOp).
Note that this action is not appropriate for key/op entries that would fit
within a page (entryWouldFitInPage) but just happen to have
ended up in a page on their own in an input to a merge. A page can end up
with a single entry because a page boundary was needed rather than because
the entry itself was too big. Furthermore, pre-serialised pages can only be
used unaltered if the entry does not use a BlobSpan, since the BlobSpan
typically needs to be modified. Thus the caller should use the following
tests to decide if addLargeSerialisedKeyOp should be used:
- The entry does not use a
BlobSpan. - The entry definitely overflows onto one or more overflow pages.
Otherwise, use addLargeKeyOp or addSmallKeyOp as appropriate.
entryWouldFitInPage :: SerialisedKey -> Entry SerialisedValue b -> Bool Source #
Determine if the key, value and optional blobspan would fit within a
single page. If it does, then it's appropriate to use pageAccAddElem, but
otherwise use singletonPage.
If entryWouldFitInPage is True and the PageAcc is empty (i.e. using
resetPageAcc) then pageAccAddElem is guaranteed to succeed.
Bloom filter allocation
data RunBloomFilterAlloc Source #
See BloomFilterAlloc
Constructors
| RunAllocFixed !Double | Bits per element in a filter |
| RunAllocRequestFPR !Double |
Instances
| Show RunBloomFilterAlloc Source # | |
Defined in Database.LSMTree.Internal.RunAcc Methods showsPrec :: Int -> RunBloomFilterAlloc -> ShowS # show :: RunBloomFilterAlloc -> String # showList :: [RunBloomFilterAlloc] -> ShowS # | |
| NFData RunBloomFilterAlloc Source # | |
Defined in Database.LSMTree.Internal.RunAcc Methods rnf :: RunBloomFilterAlloc -> () # | |
| Eq RunBloomFilterAlloc Source # | |
Defined in Database.LSMTree.Internal.RunAcc Methods (==) :: RunBloomFilterAlloc -> RunBloomFilterAlloc -> Bool # (/=) :: RunBloomFilterAlloc -> RunBloomFilterAlloc -> Bool # | |
| DecodeVersioned RunBloomFilterAlloc Source # | |
Defined in Database.LSMTree.Internal.Snapshot.Codec Methods decodeVersioned :: SnapshotVersion -> Decoder s RunBloomFilterAlloc Source # | |
| Encode RunBloomFilterAlloc Source # | |
Defined in Database.LSMTree.Internal.Snapshot.Codec Methods encode :: RunBloomFilterAlloc -> Encoding Source # | |
Exposed for testing
newMBloom :: NumEntries -> RunBloomFilterAlloc -> Salt -> ST s (MBloom s a) Source #