lsm-tree-0.1.0.0: Log-structured merge-trees
Safe Haskell: Safe-Inferred
Language: GHC2021

Database.LSMTree.Internal.Config

Documentation

Table configuration

data TableConfig Source #

Table configuration parameters, including LSM tree tuning parameters.

Some config options are fixed (for now):

  • Merge policy: Tiering
  • Size ratio: 4

Constructors

TableConfig 

Fields

defaultTableConfig :: TableConfig Source #

A reasonable default TableConfig.

This uses a write buffer with up to 20,000 elements and a generous amount of memory for Bloom filters (a false-positive rate of 2%).
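
A table configuration is typically built by record update on defaultTableConfig. The following is a sketch only: the field names (confWriteBufferAlloc, confBloomFilterAlloc, confDiskCachePolicy) and the NumEntries wrapper are assumptions, since the Fields list for TableConfig is not shown on this page.

-- Sketch of a customised configuration; the field names are assumed.
myTableConfig :: TableConfig
myTableConfig = defaultTableConfig
  { confWriteBufferAlloc = AllocNumEntries (NumEntries 50000)  -- larger write buffer
  , confBloomFilterAlloc = AllocRequestFPR 0.01                -- stricter 1% FPR
  , confDiskCachePolicy  = DiskCacheLevelsAtOrBelow 1          -- cache the first level only
  }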

Table configuration override

data TableConfigOverride Source #

Override configuration options in TableConfig that can be changed dynamically.

Some parts of the TableConfig are considered fixed after a table is created. That is, these options (i) should stay the same over the lifetime of a table, and (ii) should not be changed when a snapshot is created or loaded. Other options can be changed dynamically without sacrificing correctness.

This type has Semigroup and Monoid instances for composing override options.
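
Because of these instances, a list of overrides can be combined with the standard Monoid machinery, without assuming anything beyond the instances documented here:

-- Combine several overrides into one; mempty is the "override nothing" value.
combineOverrides :: [TableConfigOverride] -> TableConfigOverride
combineOverrides = mconcat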

Merge policy

data MergePolicy Source #

Constructors

MergePolicyLazyLevelling

Use tiering on intermediate levels, and levelling on the last level. This makes it easier for delete operations to disappear on the last level.

Size ratio

Write buffer allocation

data WriteBufferAlloc Source #

Allocation method for the write buffer.

Constructors

AllocNumEntries !NumEntries

Total number of key/value pairs that can be present in the write buffer before flushing the write buffer to disk.

NOTE: if the sizes of values vary greatly, this can lead to runs of uneven size on disk, and therefore unpredictable performance.
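
For example, the write buffer allocation described for defaultTableConfig corresponds to the following sketch, which assumes that NumEntries is a simple wrapper around an Int (the type is not shown on this page):

-- Flush the write buffer to disk once it holds 20,000 entries.
defaultWriteBufferAlloc :: WriteBufferAlloc
defaultWriteBufferAlloc = AllocNumEntries (NumEntries 20000)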

Bloom filter allocation

data BloomFilterAlloc Source #

Allocation method for bloom filters.

NOTE: a physical database entry is a key/operation pair that exists in a file, i.e., a run. Multiple physical entries that have the same key constitute a logical database entry.

Constructors

AllocFixed

Allocate a fixed number of bits per physical entry in each bloom filter.

Fields

  • !Word64

    Bits per physical entry.

AllocRequestFPR

Allocate as many bits as required per physical entry to get the requested false-positive rate. Do this for each bloom filter.

Fields

  • !Double

    Requested false-positive rate.

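The two allocation strategies can be compared side by side. This is a sketch only; that AllocRequestFPR carries a Double false-positive rate is inferred from the surrounding documentation.

-- Spend exactly 10 bits per physical entry in every Bloom filter.
fixedAlloc :: BloomFilterAlloc
fixedAlloc = AllocFixed 10

-- Size each Bloom filter to achieve a 2% false-positive rate,
-- the rate used by defaultTableConfig.
fprAlloc :: BloomFilterAlloc
fprAlloc = AllocRequestFPR 0.02
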
Fence pointer index

data FencePointerIndexType Source #

Configure the type of fence pointer index.

Constructors

CompactIndex

Use a compact fence pointer index.

Compact indexes are designed to work with keys that are large cryptographic hashes, for example 32 bytes long.

When using a compact index, it is vital that the serialiseKey function satisfies the following law:

Minimal size
size (serialiseKey x) >= 8

Use serialiseKeyMinimalSize to test this law; a property-test sketch follows the constructor list below.

OrdinaryIndex

Use an ordinary fence pointer index.

Ordinary indexes place no constraints on keys other than that their serialised forms must be smaller than 64 KiB.
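
As a sketch of checking the CompactIndex law above with QuickCheck: this assumes serialiseKeyMinimalSize has type SerialiseKey k => k -> Bool (as the documentation above suggests) and that the concrete key type has the usual Arbitrary and Show instances.

import Test.QuickCheck (quickCheck)

-- Property form of the CompactIndex minimal-size law; run it at a
-- concrete key type, e.g. quickCheck (prop_compactIndexKeyLaw @MyKey).
prop_compactIndexKeyLaw :: SerialiseKey k => k -> Bool
prop_compactIndexKeyLaw = serialiseKeyMinimalSize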

Disk cache policy

data DiskCachePolicy Source #

The policy for caching data from disk in memory (using the OS page cache).

Caching data in memory can improve performance if the access pattern has good spatial or temporal locality, or if the overall data size fits within memory. On the other hand, caching is detrimental to performance and wastes memory if the access pattern has poor spatial or temporal locality.

This implementation is designed to have good performance using a cacheless policy, where main memory is used only to cache Bloom filters and indexes, but none of the key/value data itself. Nevertheless, some use cases will be faster if some or all of the key/value data is also cached in memory. This implementation does not do any custom caching of key/value data; it relies simply on the OS page cache. Thus caching is done in units of 4 KiB disk pages (as opposed to, for example, individual key/value pairs).

Constructors

DiskCacheAll

Use the OS page cache to cache any/all key/value data in the table.

Use this policy if the expected access pattern for the table has good spatial or temporal locality.

DiskCacheLevelsAtOrBelow !Int

Use the OS page cache to cache data in all levels of the LSM tree at or below a given level number. For example, use 1 to cache the first level. (The write buffer is considered to be level 0.)

Use this policy if the expected access pattern for the table has good temporal locality for recently inserted keys.

DiskCacheNone

Do not cache any key/value data in any level (except the write buffer).

Use this policy if the expected access pattern for the table has poor spatial or temporal locality, such as uniform random access.
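
As an illustration of how the three policies map onto access patterns, the following sketch uses only the constructors documented above; the Workload type is hypothetical and not part of the library.

-- Hypothetical workload description, for illustration only.
data Workload
  = UniformRandomAccess  -- poor spatial and temporal locality
  | RecentKeysHot        -- recently inserted keys are read back soon
  | FitsInMemory         -- the whole data set comfortably fits in memory

-- Map an expected workload to a caching policy, following the guidance
-- in the constructor documentation above.
choosePolicy :: Workload -> DiskCachePolicy
choosePolicy UniformRandomAccess = DiskCacheNone
choosePolicy RecentKeysHot       = DiskCacheLevelsAtOrBelow 1
choosePolicy FitsInMemory        = DiskCacheAll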

diskCachePolicyForLevel :: DiskCachePolicy -> RunLevelNo -> RunDataCaching Source #

Interpret the DiskCachePolicy for a level: should we cache data in runs at this level?

Merge schedule

data MergeSchedule Source #

A configuration option that determines how merges are stepped to completion. This does not affect the amount of work that is done by merges, only how the work is spread out over time.

Constructors

OneShot

Complete merges immediately when started.

The OneShot option will make the merging algorithm perform big batches of work in one go, so intermittent slow-downs can be expected. For use cases where unresponsiveness is unacceptable, e.g. in real-time systems, use Incremental instead.

Incremental

Schedule merges for incremental construction, and step the merge when updates are performed on a table.

The Incremental option spreads out merging work over time. More specifically, updates to a table can cause a small batch of merge work to be performed. The scheduling of these batches is designed such that merges are fully completed in time for when new merges are started on the same level.
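
For example, a table used only for a one-off bulk load might prefer OneShot, while a latency-sensitive table keeps the incremental schedule. This is a sketch only: confMergeSchedule is an assumed field name, since the TableConfig field list is not shown on this page.

-- Complete merges eagerly during a bulk load; the field name is assumed.
bulkLoadConfig :: TableConfig
bulkLoadConfig = defaultTableConfig { confMergeSchedule = OneShot }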

defaultMergeSchedule :: MergeSchedule Source #

The default MergeSchedule.

>>> defaultMergeSchedule
Incremental