Safe Haskell | Safe-Inferred |
---|---|
Language | GHC2021 |
Database.LSMTree.Internal.PageAcc
Description
Page accumulator.
Synopsis
- data PageAcc s = PageAcc {
- paDir :: !(MutablePrimArray s Int)
- paOpMap :: !(MutablePrimArray s Word64)
- paBlobRefsMap :: !(MutablePrimArray s Word64)
- paBlobRefs1 :: !(MutablePrimArray s Word64)
- paBlobRefs2 :: !(MutablePrimArray s Word32)
- paKeys :: !(MVector s SerialisedKey)
- paValues :: !(MVector s SerialisedValue)
- newPageAcc :: ST s (PageAcc s)
- resetPageAcc :: PageAcc s -> ST s ()
- pageAccAddElem :: PageAcc s -> SerialisedKey -> Entry SerialisedValue BlobSpan -> ST s Bool
- serialisePageAcc :: PageAcc s -> ST s RawPage
- keysCountPageAcc :: PageAcc s -> ST s Int
- indexKeyPageAcc :: PageAcc s -> Int -> ST s SerialisedKey
- entryWouldFitInPage :: SerialisedKey -> Entry SerialisedValue b -> Bool
- sizeofEntry :: SerialisedKey -> Entry SerialisedValue b -> Int
Incrementally accumulating a single page.
The delete operations take the least amount of space, as there's only the key.
A smallest page is with empty key:
>>>
import FormatPage
>>>
let Just page0 = pageSizeAddElem (Key "", Delete) (pageSizeEmpty DiskPage4k)
>>>
page0
PageSize {pageSizeElems = 1, pageSizeBlobs = 0, pageSizeBytes = 32, pageSizeDisk = DiskPage4k}
Then we can add pages with a single byte key, e.g.
>>>
pageSizeAddElem (Key "a", Delete) page0
Just (PageSize {pageSizeElems = 2, pageSizeBlobs = 0, pageSizeBytes = 35, pageSizeDisk = DiskPage4k})
i.e. roughly 3-4 bytes (when we get to 32/64 elements we add more bytes for bitmaps). (key and value offset is together 4 bytes: so it's at least 4, the encoding of single element page takes more space).
If we write as small program, adding single byte keys to a page size:
>>>
let calc s ps = case pageSizeAddElem (Key "x", Delete) ps of { Nothing -> s; Just ps' -> calc (s + 1) ps' }
>>>
calc 1 page0
759
I.e. we can have a 4096 byte page with at most 759 keys, assuming keys are length 1 or longer, but without assuming that there are no duplicate keys.
And 759 rounded to the next multiple of 64 (for the bitmaps) is 768.
PageAcc
can hold up to 759 elements, but most likely pageAccAddElem
will make it overflow sooner.
Having an upper bound allows us to allocate all memory for the accumulator in advance.
We don't store or calculate individual key nor value offsets in PageAcc
, as these will be naturally calculated during serialisation (serialisePageAcc
).
Constructors
PageAcc | |
Fields
|
resetPageAcc :: PageAcc s -> ST s () Source #
Reuse PageAcc
for constructing new page, the old data will be reset.
Arguments
:: PageAcc s | |
-> SerialisedKey | |
-> Entry SerialisedValue BlobSpan | |
-> ST s Bool |
|
Add an entry to PageAcc
.
serialisePageAcc :: PageAcc s -> ST s RawPage Source #
Convert mutable PageAcc
accumulator to concrete RawPage
.
After this operation PageAcc
argument can be reset with resetPageAcc
,
and reused.
Inspection
indexKeyPageAcc :: PageAcc s -> Int -> ST s SerialisedKey Source #
Entry sizes
entryWouldFitInPage :: SerialisedKey -> Entry SerialisedValue b -> Bool Source #
Determine if the key, value and optional blobspan would fit within a
single page. If it does, then it's appropriate to use pageAccAddElem
, but
otherwise use singletonPage
.
If entryWouldFitInPage
is True
and the PageAcc
is empty (i.e. using
resetPageAcc
) then pageAccAddElem
is guaranteed to succeed.
sizeofEntry :: SerialisedKey -> Entry SerialisedValue b -> Int Source #
Calculate the total byte size of key, value and optional blobspan.
To fit into single page this has to be at most 4064. If the entry is larger,
the pageAccAddElem
is guaranteed to return False
.
In other words, if you have a large entry (i.e. Insert with big value),
don't use the PageAcc
, but construct the single value page directly,
using singletonPage
.
If it's not known from context, use entryWouldFitInPage
to determine if
you're in the small or large case.
Checking entry size allows us to use Word16
arithmetic, we don't need to
worry about overflows.