lsm-tree-0.1.0.0: Log-structured merge-trees
Safe HaskellSafe-Inferred
LanguageGHC2021

Database.LSMTree.Internal.Config

Synopsis

Documentation

Table configuration

data TableConfig Source #

A collection of configuration parameters for tables, which can be used to tune the performance of the table. To construct a TableConfig, modify the defaultTableConfig, which defines reasonable defaults for all parameters.

For a detailed discussion of fine-tuning the table configuration, see Fine-tuning Table Configuration.

confMergePolicy :: MergePolicy
The merge policy balances the performance of lookups against the performance of updates. Levelling favours lookups. Tiering favours updates. Lazy levelling strikes a middle ground between levelling and tiering, and moderately favours updates. This parameter is explicitly referenced in the documentation of those operations it affects.
confSizeRatio :: SizeRatio
The size ratio pushes the effects of the merge policy to the extreme. If the size ratio is higher, levelling favours lookups more, and tiering and lazy levelling favour updates more. This parameter is referred to as \(T\) in the disk I/O cost of operations.
confWriteBufferAlloc :: WriteBufferAlloc
The write buffer capacity balances the performance of lookups and updates against the in-memory size of the database. If the write buffer is larger, it takes up more memory, but lookups and updates are more efficient. This parameter is referred to as \(B\) in the disk I/O cost of operations. Irrespective of this parameter, the write buffer size cannot exceed 4GiB.
confMergeSchedule :: MergeSchedule
The merge schedule balances the performance of lookups and updates against the consistency of updates. The merge schedule does not affect the performance of table unions. With the one-shot merge schedule, lookups and updates are more efficient overall, but some updates may take much longer than others. With the incremental merge schedule, lookups and updates are less efficient overall, but each update does a similar amount of work. This parameter is explicitly referenced in the documentation of those operations it affects.
confBloomFilterAlloc :: BloomFilterAlloc
The Bloom filter size balances the performance of lookups against the in-memory size of the database. If the Bloom filters are larger, they take up more memory, but lookup operations are more efficient.
confFencePointerIndex :: FencePointerIndexType
The fence-pointer index type supports two types of indexes. The ordinary indexes are designed to work with any key. The compact indexes are optimised for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.
confDiskCachePolicy :: DiskCachePolicy
The disk cache policy supports caching lookup operations using the OS page cache. Caching may improve the performance of lookups if database access follows certain patterns.

defaultTableConfig :: TableConfig Source #

The defaultTableConfig defines reasonable defaults for all TableConfig parameters.

>>> confMergePolicy defaultTableConfig
LazyLevelling
>>> confMergeSchedule defaultTableConfig
Incremental
>>> confSizeRatio defaultTableConfig
Four
>>> confWriteBufferAlloc defaultTableConfig
AllocNumEntries 20000
>>> confBloomFilterAlloc defaultTableConfig
AllocRequestFPR 1.0e-3
>>> confFencePointerIndex defaultTableConfig
OrdinaryIndex
>>> confDiskCachePolicy defaultTableConfig
DiskCacheAll

Merge policy

data MergePolicy Source #

The merge policy balances the performance of lookups against the performance of updates. Levelling favours lookups. Tiering favours updates. Lazy levelling strikes a middle ground between levelling and tiering, and moderately favours updates. This parameter is explicitly referenced in the documentation of those operations it affects.

NOTE: This package only supports lazy levelling.

For a detailed discussion of the merge policy, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

LazyLevelling 

Size ratio

data SizeRatio Source #

The size ratio pushes the effects of the merge policy to the extreme. If the size ratio is higher, levelling favours lookups more, and tiering and lazy levelling favour updates more. This parameter is referred to as \(T\) in the disk I/O cost of operations.

NOTE: This package only supports a size ratio of four.

For a detailed discussion of the size ratio, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

Four 

Write buffer allocation

data WriteBufferAlloc Source #

The write buffer capacity balances the performance of lookups and updates against the in-memory size of the table. If the write buffer is larger, it takes up more memory, but lookups and updates are more efficient. Irrespective of this parameter, the write buffer size cannot exceed 4GiB.

For a detailed discussion of the size ratio, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

AllocNumEntries !Int

Allocate space for the in-memory write buffer to fit the requested number of entries. This parameter is referred to as \(B\) in the disk I/O cost of operations.

Bloom filter allocation

data BloomFilterAlloc Source #

The Bloom filter size balances the performance of lookups against the in-memory size of the table. If the Bloom filters are larger, they take up more memory, but lookup operations are more efficient.

For a detailed discussion of the Bloom filter size, see Fine-tuning: Bloom Filter Size.

Constructors

AllocFixed !Double

Allocate the requested number of bits per entry in the table.

The value must strictly positive, but fractional values are permitted. The recommended range is \([2, 24]\).

AllocRequestFPR !Double

Allocate the required number of bits per entry to get the requested false-positive rate.

The value must be in the range \((0, 1)\). The recommended range is \([1\mathrm{e}{ -5 },1\mathrm{e}{ -2 }]\).

Fence pointer index

data FencePointerIndexType Source #

The fence-pointer index type supports two types of indexes. The ordinary indexes are designed to work with any key. The compact indexes are optimised for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.

For a detailed discussion the fence-pointer index types, see Fine-tuning: Fence-Pointer Index Type.

Constructors

OrdinaryIndex

Ordinary indexes are designed to work with any key.

When using an ordinary index, the serialiseKey function cannot produce output larger than 64KiB.

CompactIndex

Compact indexes are designed for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.

When using a compact index, the serialiseKey function must satisfy the following additional law:

Minimal size
size (serialiseKey x) >= 8

Use serialiseKeyMinimalSize to test this law.

serialiseKeyMinimalSize :: SerialiseKey k => k -> Bool Source #

Test the Minimal size law for the CompactIndex option.

Disk cache policy

data DiskCachePolicy Source #

The disk cache policy determines if lookup operations use the OS page cache. Caching may improve the performance of lookups if database access follows certain patterns.

For a detailed discussion the disk cache policy, see Fine-tuning: Disk Cache Policy.

Constructors

DiskCacheAll

Cache all data in the table.

Use this policy if the database's access pattern has either good spatial locality or both good spatial and temporal locality.

DiskCacheLevelOneTo !Int

Cache the data in the freshest l levels.

Use this policy if the database's access pattern only has good temporal locality.

The variable l determines the number of levels that are cached. For a description of levels, see Merge Policy, Size Ratio, and Write Buffer Size. With this setting, the database can be expected to cache up to \(\frac{k}{P}\) pages of memory, where \(k\) refers to the number of entries that fit in levels \([1,l]\) and is defined as \(\sum_{i=1}^{l}BT^{i}\).

DiskCacheNone

Do not cache any table data.

Use this policy if the database's access pattern has does not have good spatial or temporal locality. For instance, if the access pattern is uniformly random.

diskCachePolicyForLevel :: DiskCachePolicy -> RunLevelNo -> RunDataCaching Source #

Interpret the DiskCachePolicy for a level: should we cache data in runs at this level.

Merge schedule

data MergeSchedule Source #

The merge schedule balances the performance of lookups and updates against the consistency of updates. The merge schedule does not affect the performance of table unions. With the one-shot merge schedule, lookups and updates are more efficient overall, but some updates may take much longer than others. With the incremental merge schedule, lookups and updates are less efficient overall, but each update does a similar amount of work. This parameter is explicitly referenced in the documentation of those operations it affects.

For a detailed discussion of the effect of the merge schedule, see Fine-tuning: Merge Schedule.

Constructors

OneShot

The OneShot merge schedule causes the merging algorithm to complete merges immediately. This is more efficient than the Incremental merge schedule, but has an inconsistent workload. Using the OneShot merge schedule, the worst-case disk I/O complexity of the update operations is linear in the size of the table. For real-time systems and other use cases where unresponsiveness is unacceptable, use the Incremental merge schedule.

Incremental

The Incremental merge schedule spreads out the merging work over time. This is less efficient than the OneShot merge schedule, but has a consistent workload. Using the Incremental merge schedule, the worst-case disk I/O complexity of the update operations is logarithmic in the size of the table.