lsm-tree-0.1.0.0: Log-structured merge-trees
Safe HaskellSafe-Inferred
LanguageGHC2021

Database.LSMTree.Internal.PageAcc

Description

Page accumulator.

Synopsis

Incrementally accumulating a single page.

data PageAcc s Source #

The delete operations take the least amount of space, as there's only the key.

A smallest page is with empty key:

>>> import FormatPage
>>> let Just page0 = pageSizeAddElem (Key "", Delete) (pageSizeEmpty DiskPage4k)
>>> page0
PageSize {pageSizeElems = 1, pageSizeBlobs = 0, pageSizeBytes = 32, pageSizeDisk = DiskPage4k}

Then we can add pages with a single byte key, e.g.

>>> pageSizeAddElem (Key "a", Delete) page0
Just (PageSize {pageSizeElems = 2, pageSizeBlobs = 0, pageSizeBytes = 35, pageSizeDisk = DiskPage4k})

i.e. roughly 3-4 bytes (when we get to 32/64 elements we add more bytes for bitmaps). (key and value offset is together 4 bytes: so it's at least 4, the encoding of single element page takes more space).

If we write as small program, adding single byte keys to a page size:

>>> let calc s ps = case pageSizeAddElem (Key "x", Delete) ps of { Nothing -> s; Just ps' -> calc (s + 1) ps' }
>>> calc 1 page0
759

I.e. we can have a 4096 byte page with at most 759 keys, assuming keys are length 1 or longer, but without assuming that there are no duplicate keys.

And 759 rounded to the next multiple of 64 (for the bitmaps) is 768.

PageAcc can hold up to 759 elements, but most likely pageAccAddElem will make it overflow sooner. Having an upper bound allows us to allocate all memory for the accumulator in advance.

We don't store or calculate individual key nor value offsets in PageAcc, as these will be naturally calculated during serialisation (serialisePageAcc).

Constructors

PageAcc 

Fields

newPageAcc :: ST s (PageAcc s) Source #

Create new PageAcc.

The create PageAcc will be empty.

resetPageAcc :: PageAcc s -> ST s () Source #

Reuse PageAcc for constructing new page, the old data will be reset.

pageAccAddElem Source #

Arguments

:: PageAcc s 
-> SerialisedKey 
-> Entry SerialisedValue BlobSpan 
-> ST s Bool

True if value was successfully added.

Add an entry to PageAcc.

serialisePageAcc :: PageAcc s -> ST s RawPage Source #

Convert mutable PageAcc accumulator to concrete RawPage.

After this operation PageAcc argument can be reset with resetPageAcc, and reused.

Inspection

Entry sizes

entryWouldFitInPage :: SerialisedKey -> Entry SerialisedValue b -> Bool Source #

Determine if the key, value and optional blobspan would fit within a single page. If it does, then it's appropriate to use pageAccAddElem, but otherwise use singletonPage.

If entryWouldFitInPage is True and the PageAcc is empty (i.e. using resetPageAcc) then pageAccAddElem is guaranteed to succeed.

sizeofEntry :: SerialisedKey -> Entry SerialisedValue b -> Int Source #

Calculate the total byte size of key, value and optional blobspan.

To fit into single page this has to be at most 4064. If the entry is larger, the pageAccAddElem is guaranteed to return False.

In other words, if you have a large entry (i.e. Insert with big value), don't use the PageAcc, but construct the single value page directly, using singletonPage.

If it's not known from context, use entryWouldFitInPage to determine if you're in the small or large case.

Checking entry size allows us to use Word16 arithmetic, we don't need to worry about overflows.