lsm-tree-0.1.0.0: Log-structured merge-trees
Safe Haskell: Safe-Inferred
Language: GHC2021

Database.LSMTree.Internal.Readers

Contents

Description

Multiple inputs (write buffers, runs) that are being read incrementally.

Synopsis

Documentation

data Readers m h Source #

A collection of runs and write buffers being read from, yielding elements in order. More precisely, elements are ordered first by their key, then by the input they came from. This ordering is important for resolving multiple entries with the same key into one.

Construct with new, then keep calling pop. If aborting early, remember to call close!

Creating a Readers does not retain a reference to the input Runs or the WriteBufferBlobs, but it does retain an independent reference on their blob files. It is not necessary to separately retain the Runs or the WriteBufferBlobs for correct use of the Readers. There is one important caveat, however: to preserve the validity of BlobRefs, it is necessary to separately retain a reference to the Run or its BlobFile.

TODO: handle this more cleanly by changing Reader to preserve the BlobFile ref until it is explicitly closed, also retaining the BlobFile from the WBB, and releasing all of these BlobFiles once the Readers itself is closed.
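The new/pop/close discipline described above follows the usual bracket pattern: release the resource only when aborting before it is fully drained. A generic, self-contained sketch of that discipline (using stand-in actions, not this module's actual types or functions):

```haskell
import Control.Exception (ErrorCall (..), bracketOnError, throwIO, try)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Sketch of the new/pop/close discipline: acquire a resource,
-- consume it, and make sure the stand-in for 'close' runs if we
-- abort before draining.  bracketOnError runs the release action
-- only on exception, matching "if aborting early, call close!".
demo :: IO Bool
demo = do
  closed <- newIORef False
  let acquire   = pure ()                 -- stands in for 'new'
      release _ = writeIORef closed True  -- stands in for 'close'
  r <- try (bracketOnError acquire release
              (\_ -> throwIO (ErrorCall "aborting early")))
         :: IO (Either ErrorCall ())
  case r of
    Left _  -> readIORef closed  -- True iff 'close' ran on abort
    Right _ -> pure False
```

When the readers are drained normally (no exception), no separate close is needed, which is why bracketOnError fits better here than plain bracket.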

Constructors

Readers 

Fields

data ReaderSource m h Source #

Constructors

FromWriteBuffer !WriteBuffer !(Ref (WriteBufferBlobs m h)) 
FromRun !(Ref (Run m h)) 
FromReaders !ReadersMergeType ![ReaderSource m h]

Recursive case, allowing a tree of readers to be built for a merging tree.

new :: forall m h. (MonadMask m, MonadST m, MonadSTM m) => ResolveSerialisedValue -> OffsetKey -> [ReaderSource m h] -> m (Maybe (Readers m h)) Source #

close :: (MonadMask m, MonadSTM m, PrimMonad m) => Readers m h -> m () Source #

Clean up the resources held by the readers.

Only call this function when aborting before all readers have been drained!

peekKey :: PrimMonad m => Readers m h -> m SerialisedKey Source #

Return the smallest key present in the readers, without consuming any entries.

data HasMore Source #

Once a function has returned Drained, do not use the Readers any more!

Constructors

HasMore 
Drained 

Instances

Instances details
Show HasMore Source # 
Instance details

Defined in Database.LSMTree.Internal.Readers

Eq HasMore Source # 
Instance details

Defined in Database.LSMTree.Internal.Readers

Methods

(==) :: HasMore -> HasMore -> Bool #

(/=) :: HasMore -> HasMore -> Bool #

pop :: (MonadMask m, MonadSTM m, MonadST m) => ResolveSerialisedValue -> Readers m h -> m (SerialisedKey, Entry m h, HasMore) Source #

Remove the entry with the smallest key and return it. If there are multiple entries with that key, the one from the source that came first in the list supplied to new is removed. No resolution of multiple entries takes place.
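The duplicate-key behaviour of pop can be modelled purely. This is an illustrative sketch only, not the module's implementation: each source is an already-sorted list of key/value pairs, and repeatedly popping always takes the smallest key, preferring the earliest source, without resolving duplicates.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Pure model of repeatedly calling pop: take the entry with the
-- smallest key; on ties, prefer the source earliest in the list.
-- Duplicate keys are yielded separately, never merged.
popAll :: Ord k => [[(k, v)]] -> [(k, v)]
popAll srcs =
  case [ (i, kv) | (i, kv : _) <- zip [0 ..] srcs ] of
    [] -> []
    candidates ->
      let (i, kv) = minimumBy (comparing (\(j, (k, _)) -> (k, j))) candidates
      in kv : popAll [ if j == i then drop 1 s else s
                     | (j, s) <- zip [0 ..] srcs ]
```

For example, with sources `[[(1,"a"),(3,"c")], [(1,"b"),(2,"d")]]`, both entries for key 1 are yielded, the first source's first.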

dropWhileKey Source #

Arguments

:: (MonadMask m, MonadSTM m, MonadST m) 
=> ResolveSerialisedValue 
-> Readers m h 
-> SerialisedKey 
-> m (Int, HasMore)

How many were dropped?

Drop all entries with a key that is smaller than or equal to the supplied one.
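The semantics of dropWhileKey can be sketched as a pure model over a single sorted stream (again illustrative only, not the module's implementation), returning the count of dropped entries and a Bool standing in for HasMore:

```haskell
-- Pure model of dropWhileKey: drop every entry whose key is <= the
-- supplied key, returning how many were dropped and whether any
-- entries remain (the HasMore analogue).
dropWhileKeyModel :: Ord k => k -> [(k, v)] -> (Int, Bool)
dropWhileKeyModel key entries =
  let (dropped, rest) = span (\(k, _) -> k <= key) entries
  in (length dropped, not (null rest))
```

For example, dropping up to key 2 from `[(1,'a'),(2,'b'),(3,'c')]` drops two entries and leaves more to read.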

Internals

data Reader m h Source #

An individual reader must be able to produce a sequence of SerialisedKey and Entry pairs, with ordered and unique keys.

TODO: This is slightly inelegant. This module could work generically for anything that can produce elements, but it is currently very specific to write buffer and run readers. Also, for run merging no write buffer is involved, but we still need to branch on this sum type. A more general version is possible, but even with SPECIALISE pragmas everywhere it showed ~100 bytes of extra allocation per entry read (which might be avoidable with some tinkering).

Constructors

ReadBuffer !(MutVar (PrimState m) [KOp m h])

The list allows reading from the write buffer incrementally, without having to look up the next entry in the Map again each time (which requires key comparisons) or having to copy out all entries at once.

TODO: more efficient representation? benchmark!

ReadRun !(RunReader m h) 
ReadReaders !ReadersMergeType !(SMaybe (Readers m h))

Recursively read from another reader. This requires keeping track of its HasMore status, since we should not try to read another entry from it once it is drained.

We represent the recursive reader and its HasMore status together as an SMaybe (Readers m h). The reason is subtle: once a Readers becomes drained it is immediately closed, after which the structure must not be used any more, or you would be using resources after they have already been closed.

TODO: it might be a slightly more ergonomic alternative to not close the Readers automatically.
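The ReadBuffer idea above, converting the write buffer's Map to a list once and then consuming it element by element, can be sketched with a pure model (illustrative names only; BufferReader is not part of this module):

```haskell
import qualified Data.Map.Strict as Map

-- Sketch of the ReadBuffer representation: flatten the Map to an
-- ascending list once, then consume it incrementally, so no further
-- key comparisons or Map traversals are needed per entry.
newtype BufferReader k v = BufferReader [(k, v)]

fromBuffer :: Map.Map k v -> BufferReader k v
fromBuffer = BufferReader . Map.toAscList

-- Yield the next entry (smallest remaining key), if any.
next :: BufferReader k v -> Maybe ((k, v), BufferReader k v)
next (BufferReader [])       = Nothing
next (BufferReader (e : es)) = Just (e, BufferReader es)
```

The real implementation wraps this list in a MutVar so it can be advanced in place; the model above shows only the traversal order.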

data ReadCtx m h Source #

Each heap element needs some more context than just the reader. For example, for the Eq instance we need to be able to access the first key to be read in a pure way.

TODO(optimisation): We allocate this record for each k/op. This might be avoidable, see ideas below.

Instances

Instances details
Eq (ReadCtx m h) Source # 
Instance details

Defined in Database.LSMTree.Internal.Readers

Methods

(==) :: ReadCtx m h -> ReadCtx m h -> Bool #

(/=) :: ReadCtx m h -> ReadCtx m h -> Bool #

Ord (ReadCtx m h) Source #

Makes sure we resolve entries in the right order.

Instance details

Defined in Database.LSMTree.Internal.Readers

Methods

compare :: ReadCtx m h -> ReadCtx m h -> Ordering #

(<) :: ReadCtx m h -> ReadCtx m h -> Bool #

(<=) :: ReadCtx m h -> ReadCtx m h -> Bool #

(>) :: ReadCtx m h -> ReadCtx m h -> Bool #

(>=) :: ReadCtx m h -> ReadCtx m h -> Bool #

max :: ReadCtx m h -> ReadCtx m h -> ReadCtx m h #

min :: ReadCtx m h -> ReadCtx m h -> ReadCtx m h #