Safe Haskell | Safe-Inferred |
---|---|
Language | GHC2021 |
Database.LSMTree.Internal.MergingTree
Contents
Description
An incremental merge of multiple runs, preserving a bracketing structure.
Synopsis
- data MergingTree m h = MergingTree {
- mergeState :: !(StrictMVar m (MergingTreeState m h))
- mergeRefCounter :: !(RefCounter m)
- data PreExistingRun m h
- = PreExistingRun !(Ref (Run m h))
- | PreExistingMergingRun !(Ref (MergingRun LevelMergeType m h))
- newCompletedMerge :: (MonadMVar m, PrimMonad m, MonadMask m) => Ref (Run m h) -> m (Ref (MergingTree m h))
- newOngoingMerge :: (MonadMVar m, PrimMonad m, MonadMask m) => Ref (MergingRun TreeMergeType m h) -> m (Ref (MergingTree m h))
- newPendingLevelMerge :: forall m h. (MonadMVar m, MonadMask m, PrimMonad m) => [PreExistingRun m h] -> Maybe (Ref (MergingTree m h)) -> m (Ref (MergingTree m h))
- newPendingUnionMerge :: (MonadMVar m, MonadMask m, PrimMonad m) => [Ref (MergingTree m h)] -> m (Ref (MergingTree m h))
- isStructurallyEmpty :: MonadMVar m => Ref (MergingTree m h) -> m Bool
- remainingMergeDebt :: (MonadMVar m, PrimMonad m) => Ref (MergingTree m h) -> m (MergeDebt, NumEntries)
- supplyCredits :: forall m h. (MonadMVar m, MonadST m, MonadSTM m, MonadMask m) => HasFS m h -> HasBlockIO m h -> ResolveSerialisedValue -> RunParams -> CreditThreshold -> SessionRoot -> UniqCounter m -> Ref (MergingTree m h) -> MergeCredits -> m MergeCredits
- data MergingTreeState m h
- = CompletedTreeMerge !(Ref (Run m h))
- | OngoingTreeMerge !(Ref (MergingRun TreeMergeType m h))
- | PendingTreeMerge !(PendingMerge m h)
- data PendingMerge m h
- = PendingLevelMerge !(Vector (PreExistingRun m h)) !(Maybe (Ref (MergingTree m h)))
- | PendingUnionMerge !(Vector (Ref (MergingTree m h)))
Documentation
Semantically, tables are key-value stores like Haskell's
Map
. Table unions then behave like Map.unionWith (<>)
. If one of the
input tables contains a value at a particular key, the result will also
contain it. If multiple tables share that key, the values will be combined
monoidally.
Looking at the implementation, tables are not just key-value pairs, but
consist of runs. If each table was just a single run, unioning would involve
a run merge similar to the one used for compaction (when a level is full),
but with a different merge type MergeUnion
that differs semantically:
Here, runs don't represent updates (overwriting each other), but they each
represent the full state of a table. There is no distinction between no
entry and a Delete
, between an Insert
and a Mupsert
.
To union two tables, we can therefore first merge down each table into a single run (using regular level merges) and then union merge these.
However, we want to spread out the work required and perform these merges
incrementally. At first, we only create a new table that is empty except for
a data structure MergingTree
, representing the merges that need to be
done. The usual operations can then be performed on the table while the
merge is in progress: Inserts go into the table as usual, not affecting its
last level (UnionLevel
), lookups need to consider the tree (requiring some
complexity and runtime overhead), further unions incorporate the in-progress
tree into the resulting one, which also shares future merging work.
It seems necessary to represent the suspended merges using a tree. Other approaches don't allow for full sharing of the incremental work (e.g. because they effectively "re-bracket" nested unions). It also seems necessary to first merge each input table into a single run, as there is no practical distributive property between level and union merges.
data MergingTree m h Source #
A "merging tree" is a mutable representation of an incremental tree-shaped nested merge. This allows to represent union merges of entire tables, each of which itself first need to be merged to become a single run.
Trees have to support arbitrarily deep nesting, since each input to union
might already contain an in-progress merging tree (which then becomes shared
between multiple tables).
Constructors
MergingTree | |
Fields
|
Instances
RefCounted m (MergingTree m h) Source # | |
Defined in Database.LSMTree.Internal.MergingTree Methods getRefCounter :: MergingTree m h -> RefCounter m Source # |
data PreExistingRun m h Source #
Constructors
PreExistingRun !(Ref (Run m h)) | |
PreExistingMergingRun !(Ref (MergingRun LevelMergeType m h)) |
newCompletedMerge :: (MonadMVar m, PrimMonad m, MonadMask m) => Ref (Run m h) -> m (Ref (MergingTree m h)) Source #
newOngoingMerge :: (MonadMVar m, PrimMonad m, MonadMask m) => Ref (MergingRun TreeMergeType m h) -> m (Ref (MergingTree m h)) Source #
Create a new MergingTree
representing the merge of an ongoing run.
The usage of this function is primarily to facilitate the reloading of an
ongoing merge from a persistent snapshot.
newPendingLevelMerge :: forall m h. (MonadMVar m, MonadMask m, PrimMonad m) => [PreExistingRun m h] -> Maybe (Ref (MergingTree m h)) -> m (Ref (MergingTree m h)) Source #
Create a new MergingTree
representing the merge of a sequence of
pre-existing runs (completed or ongoing, plus a optional final tree).
This is for merging the entire contents of a table down to a single run
(while sharing existing ongoing merges).
Shape: if the list of runs is empty and the optional input tree is
structurally empty, the result will also be structurally empty. See
isStructurallyEmpty
.
Resource tracking:
* This allocates a new Ref
which the caller is responsible for releasing
eventually.
* The ownership of all input Ref
s remains with the caller. This action
will create duplicate references, not adopt the given ones.
ASYNC: this should be called with asynchronous exceptions masked because it allocates/creates resources.
newPendingUnionMerge :: (MonadMVar m, MonadMask m, PrimMonad m) => [Ref (MergingTree m h)] -> m (Ref (MergingTree m h)) Source #
Create a new MergingTree
representing the union of one or more merging
trees. This is for unioning the content of multiple tables (represented
themselves as merging trees).
Shape: if all of the input trees are structurally empty, the result will
also be structurally empty. See isStructurallyEmpty
.
Resource tracking:
* This allocates a new Ref
which the caller is responsible for releasing
eventually.
* The ownership of all input Ref
s remains with the caller. This action
will create duplicate references, not adopt the given ones.
ASYNC: this should be called with asynchronous exceptions masked because it allocates/creates resources.
isStructurallyEmpty :: MonadMVar m => Ref (MergingTree m h) -> m Bool Source #
Test if a MergingTree
is "obviously" empty by virtue of its structure.
This is not the same as being empty due to a pending or ongoing merge
happening to produce an empty run.
remainingMergeDebt :: (MonadMVar m, PrimMonad m) => Ref (MergingTree m h) -> m (MergeDebt, NumEntries) Source #
Calculate an upper bound on the merge credits required to complete the
merge, i.e. turn the tree into a CompletedTreeMerge
. For the recursive
calculation, we also return an upper bound on the size of the resulting run.
supplyCredits :: forall m h. (MonadMVar m, MonadST m, MonadSTM m, MonadMask m) => HasFS m h -> HasBlockIO m h -> ResolveSerialisedValue -> RunParams -> CreditThreshold -> SessionRoot -> UniqCounter m -> Ref (MergingTree m h) -> MergeCredits -> m MergeCredits Source #
Internal state
data MergingTreeState m h Source #
Constructors
CompletedTreeMerge !(Ref (Run m h)) | Output run |
OngoingTreeMerge !(Ref (MergingRun TreeMergeType m h)) | Reuses MergingRun to allow sharing existing merges. |
PendingTreeMerge !(PendingMerge m h) |
data PendingMerge m h Source #
A merge that is waiting for its inputs to complete.
Constructors
PendingLevelMerge !(Vector (PreExistingRun m h)) !(Maybe (Ref (MergingTree m h))) | The collection of inputs is the entire contents of a table, i.e. its (merging) runs and finally a union merge (if that table already contained a union). |
PendingUnionMerge !(Vector (Ref (MergingTree m h))) | Each input is the entire content of a table (as a merging tree). |