lsm-tree-0.1.0.0: Log-structured merge-trees
Copyright    (c) 2023 Input Output Global Inc. (IOG)
             (c) 2023-2025 INTERSECT
License      Apache-2.0
Stability    experimental
Portability  portable
Safe Haskell Safe-Inferred
Language     GHC2021

Database.LSMTree.Simple

Description

 
Synopsis

Example

>>> :{
runExample $ \session table -> do
  insert table 0 "Hello"
  insert table 1 "World"
  lookup table 0
:}
Just (Value "Hello")

Usage Notes

Resource Management

This package uses explicit resource management. The Session, Table, and Cursor handles hold open resources, such as file handles, which must be explicitly released. Every operation that allocates a resource is paired with another operation that releases that resource. For each pair of allocate and release operations, there is a bracketed function that combines the two.

Resource  Bracketed              Allocate               Release
Session   withSession            openSession            closeSession
Table     withTable              newTable               closeTable
Table     withDuplicate          duplicate              closeTable
Table     withUnion              union                  closeTable
Table     withIncrementalUnion   incrementalUnion       closeTable
Table     withTableFromSnapshot  openTableFromSnapshot  closeTable
Cursor    withCursor             newCursor              closeCursor

To prevent resource and memory leaks due to asynchronous exceptions, it is recommended to use the bracketed functions whenever possible, and otherwise:

  • Run functions that allocate and release a resource with asynchronous exceptions masked.
  • Ensure that every allocate operation is followed by the corresponding release operation, even in the presence of asynchronous exceptions, e.g., using bracket, as in the sketch below.
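
A minimal sketch of the manual pattern, assuming the session directory "/tmp/lsm-example" already exists; bracket masks the allocate action and guarantees the release runs even if an asynchronous exception arrives:

import Control.Exception (bracket)
import Prelude hiding (lookup)
import Database.LSMTree.Simple

-- Manual allocate/release pairs guarded with bracket; the bracketed
-- functions withSession and withTable package exactly this pattern.
bracketedExample :: IO (Maybe String)
bracketedExample =
  bracket (openSession "/tmp/lsm-example") closeSession $ \session ->
    bracket (newTable session) closeTable $ \table -> do
      insert table (0 :: Int) "Hello"
      lookup table 0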

Concurrency

Table handles may be used concurrently from multiple Haskell threads, and performing read operations concurrently may improve throughput by taking advantage of CPU and I/O parallelism. However, concurrent use of write operations may introduce races. Specifically:

  • It is a race to read and write the same table concurrently.
  • It is a race to write and write the same table concurrently.
  • It is not a race to read and read the same table concurrently.
  • It is not a race to read or write separate tables concurrently.

For the purposes of the above rules, lookups, range lookups, and membership tests count as read operations, and inserts, deletes, and updates count as write operations.

It is possible to read from a stable view of a table while concurrently writing to the table by using duplicate and performing the read operations on the duplicate. However, this requires that the duplicate operation happens before the subsequent writes, as it is a race to duplicate concurrently with any writes. As this package does not provide any construct for synchronisation or atomic operations, this ordering of operations must be accomplished by the user through other means.
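
A sketch of this pattern; the required ordering holds because the writer thread is forked only after withDuplicate has created the duplicate:

import Control.Concurrent (forkIO)
import Prelude hiding (lookup)
import Database.LSMTree.Simple

-- Reads from the stable duplicate are unaffected by the concurrent
-- write to the original table.
readStableView :: Table Int String -> IO (Maybe String)
readStableView table =
  withDuplicate table $ \stableView -> do
    _ <- forkIO (insert table 1 "concurrent write")
    lookup stableView 0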

A Cursor creates a stable view of a table and can safely be read while modifying the original table. However, reading the next key/value pair from a cursor locks the view, so concurrent reads on the same cursor block. This is because next updates the cursor's current position.

Session handles may be used concurrently from multiple Haskell threads, but concurrent use of read and write operations may introduce races. Specifically, it is a race to use listSnapshots and deleteSnapshots with the same session handle concurrently.

ACID properties

This text copies liberally from https://en.wikipedia.org/wiki/ACID and related wiki pages.

Atomicity, consistency, isolation, and durability (ACID) are important properties of database transactions. They guarantee data validity despite errors, power failures, and other mishaps. A transaction is a sequence of database operations that satisfy the ACID properties.

lsm-tree does not support transactions in the typical sense that many relational databases do, where transactions can be built from smaller components/actions, e.g., reads and writes of individual cells. Instead, the public API only exposes functions that individually form a transaction; there are no smaller building blocks. An example of such a transaction is updates.

An lsm-tree transaction may still perform multiple database actions internally, but transactions themselves are not composable into larger transactions, so in a concurrent setting it should be expected that table contents can change between transactions. A consistent view of a table can be created, so that independent transactions have access to their own version of the database state (see concurrency).

All lsm-tree transactions are designed for atomicity, consistency, and isolation (ACI), assuming that users of the library perform proper resource management. Durability is only guaranteed when saving a snapshot, which is the only method of stopping and restarting tables.

We currently cannot guarantee consistency in the presence of synchronous and asynchronous exceptions, even though major strides were made towards it. The safest course of action when an internal exception is encountered is to stop and restart: close the session along with all its tables and cursors, reopen the session, and load a previously saved table snapshot.

Sharing

Tables created via duplicate or union will initially share as much of their in-memory and on-disk data as possible with the tables they were created from. Over time as these related tables are modified, the contents of the tables will diverge, which means that the tables will share less and less.

Sharing of in-memory data is not preserved by snapshots, but sharing of on-disk data is partially preserved. Existing files for runs are shared, but files for ongoing merges are not. Opening a table from a snapshot (using openTableFromSnapshot or withTableFromSnapshot) is expensive, but creating a snapshot (using saveSnapshot) is relatively cheap.

Sessions

data Session Source #

A session stores context that is shared by multiple tables.

Each session is associated with one session directory where the files containing table data are stored. Each session locks its session directory. There can only be one active session for each session directory at a time. If a database must be accessed from multiple parts of a program, one session should be opened and shared between those parts of the program. Session directories cannot be shared between OS processes.

withSession Source #

Arguments

:: forall a. FilePath

The session directory.

-> (Session -> IO a) 
-> IO a 

Run an action with access to a session opened from a session directory.

If there are no open tables or cursors when the session terminates, then the disk I/O complexity of this operation is \(O(1)\). Otherwise, closeTable is called for each open table and closeCursor is called for each open cursor. Consequently, the worst-case disk I/O complexity of this operation depends on the merge policy of the open tables in the session. The following assumes all tables in the session have the same merge policy:

LazyLevelling
\(O(o \: T \log_T \frac{n}{B})\).

The variable \(o\) refers to the number of open tables and cursors in the session.

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of openSession and closeSession.

Throws the following exceptions:

SessionDirDoesNotExistError
If the session directory does not exist.
SessionDirLockedError
If the session directory is locked by another process.
SessionDirCorruptedError
If the session directory is malformed.
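
A usage sketch; the directory name is illustrative and must refer to an existing directory:

import Database.LSMTree.Simple

listDemoSnapshots :: IO [SnapshotName]
listDemoSnapshots = withSession "sessions/demo" listSnapshots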

openSession Source #

Arguments

:: FilePath

The session directory.

-> IO Session 

Open a session from a session directory.

The worst-case disk I/O complexity of this operation is \(O(1)\).

Warning: Sessions hold open resources and must be closed using closeSession.

Throws the following exceptions:

SessionDirDoesNotExistError
If the session directory does not exist.
SessionDirLockedError
If the session directory is locked by another process.
SessionDirCorruptedError
If the session directory is malformed.

closeSession :: Session -> IO () Source #

Close a session.

If there are no open tables or cursors in the session, then the disk I/O complexity of this operation is \(O(1)\). Otherwise, closeTable is called for each open table and closeCursor is called for each open cursor. Consequently, the worst-case disk I/O complexity of this operation depends on the merge policy of the tables in the session. The following assumes all tables in the session have the same merge policy:

LazyLevelling
\(O(o \: T \log_T \frac{n}{B})\).

The variable \(o\) refers to the number of open tables and cursors in the session.

Closing is idempotent, i.e., closing a closed session does nothing. All other operations on a closed session will throw an exception.

Tables

data Table k v Source #

A table is a handle to an individual LSM-tree key/value store with both in-memory and on-disk parts.

Warning: Tables are ephemeral. Once you close a table, its data is lost forever. To persist tables, use snapshots.

withTable :: forall k v a. Session -> (Table k v -> IO a) -> IO a Source #

Run an action with access to an empty table.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of newTable and closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.

withTableWith :: forall k v a. TableConfig -> Session -> (Table k v -> IO a) -> IO a Source #

Variant of withTable that accepts table configuration.

newTable :: forall k v. Session -> IO (Table k v) Source #

Create an empty table.

The worst-case disk I/O complexity of this operation is \(O(1)\).

Warning: Tables hold open resources and must be closed using closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.

newTableWith :: forall k v. TableConfig -> Session -> IO (Table k v) Source #

Variant of newTable that accepts table configuration.

closeTable :: forall k v. Table k v -> IO () Source #

Close a table.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Closing is idempotent, i.e., closing a closed table does nothing. All other operations on a closed table will throw an exception.

Warning: Tables are ephemeral. Once you close a table, its data is lost forever. To persist tables, use snapshots.

Table Lookups

member :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> k -> IO Bool Source #

Check if the key is a member of the table.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Membership tests can be performed concurrently from multiple Haskell threads.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableCorruptedError
If the table data is corrupted.

members :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Vector k -> IO (Vector Bool) Source #

Variant of member for batch membership tests. The batch of keys corresponds in-order to the batch of results.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(b \: T \log_T \frac{n}{B})\).

The variable \(b\) refers to the length of the input vector.

The following property holds in the absence of races:

members table keys = traverse (member table) keys

lookup :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> k -> IO (Maybe v) Source #

Look up the value associated with a key.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Lookups can be performed concurrently from multiple Haskell threads.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableCorruptedError
If the table data is corrupted.

lookups :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Vector k -> IO (Vector (Maybe v)) Source #

Variant of lookup for batch lookups. The batch of keys corresponds in-order to the batch of results.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(b \: T \log_T \frac{n}{B})\).

The variable \(b\) refers to the length of the input vector.

The following property holds in the absence of races:

lookups table keys = traverse (lookup table) keys

rangeLookup :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Range k -> IO (Vector (k, v)) Source #

Look up a batch of values associated with keys in the given range.

The worst-case disk I/O complexity of this operation is \(O(T \log_T \frac{n}{B} + \frac{b}{P})\), where the variable \(b\) refers to the length of the output vector.

Range lookups can be performed concurrently from multiple Haskell threads.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableCorruptedError
If the table data is corrupted.
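
A sketch of a range lookup over the half-open key interval [10, 20); entries are returned in serialised-key order:

import Data.Vector (Vector)
import Database.LSMTree.Simple

keysFrom10To20 :: Table Int String -> IO (Vector (Int, String))
keysFrom10To20 table = rangeLookup table (FromToExcluding 10 20)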

Table Updates

insert :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> k -> v -> IO () Source #

Insert a new key and value in the table. If the key is already present in the table, the associated value is replaced with the given value.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(\frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{n}{P})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

inserts :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Vector (k, v) -> IO () Source #

Variant of insert for batch insertions.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(b \: \frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{b}{P} \log_T \frac{b}{B} + \frac{n}{P})\).

The variable \(b\) refers to the length of the input vector.

The following property holds in the absence of races:

inserts table entries = traverse_ (uncurry $ insert table) entries

delete :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> k -> IO () Source #

Delete a key and its value from the table. If the key is not present in the table, the table is left unchanged.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(\frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{n}{P})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

deletes :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Vector k -> IO () Source #

Variant of delete for batch deletions.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(b \: \frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{b}{P} \log_T \frac{b}{B} + \frac{n}{P})\).

The variable \(b\) refers to the length of the input vector.

The following property holds in the absence of races:

deletes table keys = traverse_ (delete table) keys

update :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> k -> Maybe v -> IO () Source #

Update the value at a specific key:

  • If the given value is Just, this operation acts as insert.
  • If the given value is Nothing, this operation acts as delete.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(\frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{n}{P})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

updates :: forall k v. (SerialiseKey k, SerialiseValue v) => Table k v -> Vector (k, Maybe v) -> IO () Source #

Variant of update for batch updates.

The worst-case disk I/O complexity of this operation depends on the merge policy and the merge schedule of the table:

LazyLevelling/Incremental
\(O(b \: \frac{1}{P} \: \log_T \frac{n}{B})\).
LazyLevelling/OneShot
\(O(\frac{b}{P} \log_T \frac{b}{B} + \frac{n}{P})\).

The variable \(b\) refers to the length of the input vector.

The following property holds in the absence of races:

updates table entries = traverse_ (uncurry $ update table) entries
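
A sketch of a single batch that inserts one key and deletes another, using Just for insertion and Nothing for deletion:

import qualified Data.Vector as V
import Database.LSMTree.Simple

insertAndDelete :: Table Int String -> IO ()
insertAndDelete table =
  updates table (V.fromList [(1, Just "one"), (2, Nothing)])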

Table Duplication

withDuplicate :: forall k v a. Table k v -> (Table k v -> IO a) -> IO a Source #

Run an action with access to the duplicate of a table.

The duplicate is an independent copy of the given table. The duplicate is unaffected by subsequent updates to the given table and vice versa.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of duplicate and closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

duplicate :: forall k v. Table k v -> IO (Table k v) Source #

Duplicate a table.

The duplicate is an independent copy of the given table. The duplicate is unaffected by subsequent updates to the given table and vice versa.

The worst-case disk I/O complexity of this operation is \(O(0)\).

Warning: The duplicate must be independently closed using closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

Table Unions

withUnion :: forall k v a. Table k v -> Table k v -> (Table k v -> IO a) -> IO a Source #

Run an action with access to a table that contains the left-biased union of the entries of the given tables.

The worst-case disk I/O complexity of this operation is \(O(\frac{n}{P})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of union and closeTable.

Warning: Both input tables must be from the same Session.

Warning: This is a relatively expensive operation that may take some time to complete. See withIncrementalUnion for an incremental alternative.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableUnionNotCompatibleError
If both tables are not from the same Session.

withUnions :: forall k v a. NonEmpty (Table k v) -> (Table k v -> IO a) -> IO a Source #

Variant of withUnion that takes any number of tables.

union :: forall k v. Table k v -> Table k v -> IO (Table k v) Source #

Create a table that contains the left-biased union of the entries of the given tables.

The worst-case disk I/O complexity of this operation is \(O(\frac{n}{P})\).

Warning: The new table must be independently closed using closeTable.

Warning: Both input tables must be from the same Session.

Warning: This is a relatively expensive operation that may take some time to complete. See incrementalUnion for an incremental alternative.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableUnionNotCompatibleError
If both tables are not from the same Session.

unions :: forall k v. NonEmpty (Table k v) -> IO (Table k v) Source #

Variant of union that takes any number of tables.

withIncrementalUnion :: forall k v a. Table k v -> Table k v -> (Table k v -> IO a) -> IO a Source #

Run an action with access to a table that incrementally computes the union of the given tables.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of incrementalUnion and closeTable.

The created table has a union debt which represents the amount of computation that remains. See remainingUnionDebt. The union debt can be paid off by supplying union credit which performs an amount of computation proportional to the amount of union credit. See supplyUnionCredits. While a table has unresolved union debt, operations may become more expensive by a factor that scales with the number of unresolved unions.

Warning: Both input tables must be from the same Session.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableUnionNotCompatibleError
If both tables are not from the same Session.

withIncrementalUnions :: forall k v a. NonEmpty (Table k v) -> (Table k v -> IO a) -> IO a Source #

Variant of withIncrementalUnion that takes any number of tables.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B} + b)\).

The variable \(b\) refers to the number of input tables.

incrementalUnion :: forall k v. Table k v -> Table k v -> IO (Table k v) Source #

Create a table that incrementally computes the union of the given tables.

The worst-case disk I/O complexity of this operation is \(O(1)\).

The created table has a union debt which represents the amount of computation that remains. See remainingUnionDebt. The union debt can be paid off by supplying union credit which performs an amount of computation proportional to the amount of union credit. See supplyUnionCredits. While a table has unresolved union debt, operations may become more expensive by a factor that scales with the number of unresolved unions.

Warning: The new table must be independently closed using closeTable.

Warning: Both input tables must be from the same Session.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
TableUnionNotCompatibleError
If both tables are not from the same Session.

incrementalUnions :: forall k v. NonEmpty (Table k v) -> IO (Table k v) Source #

Variant of incrementalUnion for any number of tables.

The worst-case disk I/O complexity of this operation is \(O(b)\), where the variable \(b\) refers to the number of input tables.

remainingUnionDebt :: forall k v. Table k v -> IO UnionDebt Source #

Get the amount of remaining union debt. This includes the union debt of any table that was part of the union's input.

The worst-case disk I/O complexity of this operation is \(O(0)\).

supplyUnionCredits :: forall k v. Table k v -> UnionCredits -> IO UnionCredits Source #

Supply the given amount of union credits.

This reduces the union debt by at least the number of supplied union credits. It is therefore advisable to query remainingUnionDebt every once in a while to see what the current debt is.

This function returns any surplus of union credits as leftover credits when a union has finished. In particular, if the returned number of credits is positive, then the union is finished.

The worst-case disk I/O complexity of this operation is \(O(\frac{b}{P})\), where the variable \(b\) refers to the amount of credits supplied.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
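
A sketch of paying off union debt in fixed-size steps; the step size of 128 is an arbitrary illustrative choice, and the UnionCredits constructor is assumed to take an Int like the UnionDebt constructor does:

import Control.Monad (when)
import Database.LSMTree.Simple

finishUnion :: Table k v -> IO ()
finishUnion table = do
  UnionDebt debt <- remainingUnionDebt table
  when (debt > 0) $ do
    _ <- supplyUnionCredits table (UnionCredits 128)
    finishUnion table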

Cursors

data Cursor k v Source #

A cursor is a stable read-only iterator for a table.

A cursor iterates over the entries in a table following the order of the serialised keys. After the cursor is created, updates to the referenced table do not affect the cursor.

The name of this type references database cursors, not, e.g., text editor cursors.

withCursor :: forall k v a. Table k v -> (Cursor k v -> IO a) -> IO a Source #

Run an action with access to a cursor.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of newCursor and closeCursor.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
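
A sketch that streams a table's entries through a cursor in chunks of 256 (an arbitrary chunk size):

import Prelude hiding (take)
import qualified Data.Vector as V
import Database.LSMTree.Simple

allEntries :: Table Int String -> IO [(Int, String)]
allEntries table = withCursor table go
  where
    go cursor = do
      chunk <- take 256 cursor
      if V.null chunk
        then pure []
        else (V.toList chunk ++) <$> go cursor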

withCursorAtOffset :: forall k v a. SerialiseKey k => Table k v -> k -> (Cursor k v -> IO a) -> IO a Source #

Variant of withCursor that starts at a given key.

newCursor :: forall k v. Table k v -> IO (Cursor k v) Source #

Create a cursor for the given table.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Warning: Cursors hold open resources and must be closed using closeCursor.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.

newCursorAtOffset :: forall k v. SerialiseKey k => Table k v -> k -> IO (Cursor k v) Source #

Variant of newCursor that starts at a given key.

closeCursor :: forall k v. Cursor k v -> IO () Source #

Close a cursor.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Closing is idempotent, i.e., closing a closed cursor does nothing. All other operations on a closed cursor will throw an exception.

next :: forall k v. (SerialiseKey k, SerialiseValue v) => Cursor k v -> IO (Maybe (k, v)) Source #

Read the next table entry from the cursor.

The worst-case disk I/O complexity of this operation is \(O(\frac{1}{P})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
CursorClosedError
If the cursor is closed.

take :: forall k v. (SerialiseKey k, SerialiseValue v) => Int -> Cursor k v -> IO (Vector (k, v)) Source #

Read the next batch of table entries from the cursor.

The worst-case disk I/O complexity of this operation is \(O(\frac{b}{P})\), where the variable \(b\) refers to the length of the output vector, which is at most equal to the given number. In practice, the length of the output vector is only less than the given number once the cursor reaches the end of the table.

The following property holds:

take n cursor = catMaybes <$> replicateM n (next cursor)

Throws the following exceptions:

SessionClosedError
If the session is closed.
CursorClosedError
If the cursor is closed.

takeWhile :: forall k v. (SerialiseKey k, SerialiseValue v) => Int -> (k -> Bool) -> Cursor k v -> IO (Vector (k, v)) Source #

Variant of take that accepts an additional predicate to determine whether or not to continue reading.

The worst-case disk I/O complexity of this operation is \(O(\frac{b}{P})\), where the variable \(b\) refers to the length of the output vector, which is at most equal to the given number. In practice, the length of the output vector is only less than the given number when the predicate returns false or the cursor reaches the end of the table.

The following properties hold:

takeWhile n (const True) cursor = take n cursor
takeWhile n (const False) cursor = pure empty

Throws the following exceptions:

SessionClosedError
If the session is closed.
CursorClosedError
If the cursor is closed.

Snapshots

saveSnapshot :: forall k v. SnapshotName -> SnapshotLabel -> Table k v -> IO () Source #

Save the current state of the table to disk as a snapshot under the given snapshot name. This is the only mechanism that persists a table. Each snapshot must have a unique name, which may be used to restore the table from that snapshot using openTableFromSnapshot. Saving a snapshot does not close the table.

Saving a snapshot is relatively cheap when compared to opening a snapshot. However, it is not so cheap that one should use it after every operation.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
SnapshotExistsError
If a snapshot with the same name already exists.
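
A sketch that persists a table; the snapshot name and label text are illustrative:

import qualified Data.Text as Text
import Database.LSMTree.Simple

backup :: Table Int String -> IO ()
backup table =
  saveSnapshot
    (toSnapshotName "nightly-backup")
    (SnapshotLabel (Text.pack "Int keys, String values"))
    table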

withTableFromSnapshot :: forall k v a. Session -> SnapshotName -> SnapshotLabel -> (Table k v -> IO a) -> IO a Source #

Run an action with access to a table from a snapshot.

The worst-case disk I/O complexity of this operation is \(O(\frac{n}{P})\).

This function is exception-safe for both synchronous and asynchronous exceptions.

It is recommended to use this function instead of openTableFromSnapshot and closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
SnapshotDoesNotExistError
If no snapshot with the given name exists.
SnapshotCorruptedError
If the snapshot data is corrupted.
SnapshotNotCompatibleError
If the snapshot has a different label or is a different table type.

openTableFromSnapshot :: forall k v. Session -> SnapshotName -> SnapshotLabel -> IO (Table k v) Source #

Open a table from a named snapshot.

The worst-case disk I/O complexity of this operation is \(O(\frac{n}{P})\).

Warning: The new table must be independently closed using closeTable.

Throws the following exceptions:

SessionClosedError
If the session is closed.
TableClosedError
If the table is closed.
SnapshotDoesNotExistError
If no snapshot with the given name exists.
SnapshotCorruptedError
If the snapshot data is corrupted.
SnapshotNotCompatibleError
If the snapshot has a different label or is a different table type.

doesSnapshotExist :: Session -> SnapshotName -> IO Bool Source #

Check if the named snapshot exists.

The worst-case disk I/O complexity of this operation is \(O(1)\).

Throws the following exceptions:

SessionClosedError
If the session is closed.

deleteSnapshot :: Session -> SnapshotName -> IO () Source #

Delete the named snapshot.

The worst-case disk I/O complexity of this operation depends on the merge policy of the table:

LazyLevelling
\(O(T \log_T \frac{n}{B})\).

Throws the following exceptions:

SessionClosedError
If the session is closed.
SnapshotDoesNotExistError
If no snapshot with the given name exists.

listSnapshots :: Session -> IO [SnapshotName] Source #

List the names of all snapshots.

The worst-case disk I/O complexity of this operation is \(O(s)\), where the variable \(s\) refers to the number of snapshots in the session.

Throws the following exceptions:

SessionClosedError
If the session is closed.

isValidSnapshotName :: String -> Bool Source #

Check if a String would be a valid snapshot name.

Snapshot names consist of lowercase characters, digits, dashes -, and underscores _, and must be between 1 and 64 characters long.

>>> isValidSnapshotName "main"
True

>>> isValidSnapshotName "temporary-123-test_"
True
>>> isValidSnapshotName "UPPER"
False
>>> isValidSnapshotName "dir/dot.exe"
False
>>> isValidSnapshotName ".."
False
>>> isValidSnapshotName "\\"
False
>>> isValidSnapshotName ""
False
>>> isValidSnapshotName (replicate 100 'a')
False

Snapshot names must be valid directory names on both POSIX and Windows systems. This rules out the following reserved file and directory names on Windows:

>>> isValidSnapshotName "con"
False
>>> isValidSnapshotName "prn"
False
>>> isValidSnapshotName "aux"
False
>>> isValidSnapshotName "nul"
False
>>> isValidSnapshotName "com1" -- "com2", "com3", etc.
False
>>> isValidSnapshotName "lpt1" -- "lpt2", "lpt3", etc.
False

See, e.g., the VBA docs for the "Bad file name or number" error.

toSnapshotName :: String -> SnapshotName Source #

Create a snapshot name.

The given string must satisfy isValidSnapshotName.

Throws the following exceptions:

InvalidSnapshotNameError
If the given string is not a valid snapshot name.

newtype SnapshotLabel Source #

Custom, user-supplied text that is included in the metadata.

The main use case for a SnapshotLabel is for the user to supply textual information about the key/value/blob type for the table that corresponds to the snapshot. This information is used to dynamically check that a snapshot is opened at the correct key/value/blob type.

Constructors

SnapshotLabel Text 

Table Configuration

data TableConfig Source #

A collection of configuration parameters for tables, which can be used to tune the performance of the table. To construct a TableConfig, modify the defaultTableConfig, which defines reasonable defaults for all parameters.

For a detailed discussion of fine-tuning the table configuration, see Fine-tuning Table Configuration.

confMergePolicy :: MergePolicy
The merge policy balances the performance of lookups against the performance of updates. Levelling favours lookups. Tiering favours updates. Lazy levelling strikes a middle ground between levelling and tiering, and moderately favours updates. This parameter is explicitly referenced in the documentation of those operations it affects.
confSizeRatio :: SizeRatio
The size ratio pushes the effects of the merge policy to the extreme. If the size ratio is higher, levelling favours lookups more, and tiering and lazy levelling favour updates more. This parameter is referred to as \(T\) in the disk I/O cost of operations.
confWriteBufferAlloc :: WriteBufferAlloc
The write buffer capacity balances the performance of lookups and updates against the in-memory size of the database. If the write buffer is larger, it takes up more memory, but lookups and updates are more efficient. This parameter is referred to as \(B\) in the disk I/O cost of operations. Irrespective of this parameter, the write buffer size cannot exceed 4GiB.
confMergeSchedule :: MergeSchedule
The merge schedule balances the performance of lookups and updates against the consistency of updates. The merge schedule does not affect the performance of table unions. With the one-shot merge schedule, lookups and updates are more efficient overall, but some updates may take much longer than others. With the incremental merge schedule, lookups and updates are less efficient overall, but each update does a similar amount of work. This parameter is explicitly referenced in the documentation of those operations it affects.
confBloomFilterAlloc :: BloomFilterAlloc
The Bloom filter size balances the performance of lookups against the in-memory size of the database. If the Bloom filters are larger, they take up more memory, but lookup operations are more efficient.
confFencePointerIndex :: FencePointerIndexType
The fence-pointer index type supports two types of indexes. The ordinary indexes are designed to work with any key. The compact indexes are optimised for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.
confDiskCachePolicy :: DiskCachePolicy
The disk cache policy supports caching lookup operations using the OS page cache. Caching may improve the performance of lookups if database access follows certain patterns.
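
A sketch of adjusting the defaults via record update, using the field names listed above; the chosen values are illustrative, not recommendations:

import Database.LSMTree.Simple

myTableConfig :: TableConfig
myTableConfig = defaultTableConfig
  { confWriteBufferAlloc = AllocNumEntries 10000
  , confMergeSchedule    = Incremental
  }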

data MergePolicy Source #

The merge policy balances the performance of lookups against the performance of updates. Levelling favours lookups. Tiering favours updates. Lazy levelling strikes a middle ground between levelling and tiering, and moderately favours updates. This parameter is explicitly referenced in the documentation of those operations it affects.

NOTE: This package only supports lazy levelling.

For a detailed discussion of the merge policy, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

LazyLevelling 

data SizeRatio Source #

The size ratio pushes the effects of the merge policy to the extreme. If the size ratio is higher, levelling favours lookups more, and tiering and lazy levelling favour updates more. This parameter is referred to as \(T\) in the disk I/O cost of operations.

NOTE: This package only supports a size ratio of four.

For a detailed discussion of the size ratio, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

Four 

data WriteBufferAlloc Source #

The write buffer capacity balances the performance of lookups and updates against the in-memory size of the table. If the write buffer is larger, it takes up more memory, but lookups and updates are more efficient. Irrespective of this parameter, the write buffer size cannot exceed 4GiB.

For a detailed discussion of the write buffer capacity, see Fine-tuning: Merge Policy, Size Ratio, and Write Buffer Size.

Constructors

AllocNumEntries !Int

Allocate space for the in-memory write buffer to fit the requested number of entries. This parameter is referred to as \(B\) in the disk I/O cost of operations.

data BloomFilterAlloc Source #

The Bloom filter size balances the performance of lookups against the in-memory size of the table. If the Bloom filters are larger, they take up more memory, but lookup operations are more efficient.

For a detailed discussion of the Bloom filter size, see Fine-tuning: Bloom Filter Size.

Constructors

AllocFixed !Double

Allocate the requested number of bits per entry in the table.

The value must be strictly positive, but fractional values are permitted. The recommended range is \([2, 24]\).

AllocRequestFPR !Double

Allocate the required number of bits per entry to get the requested false-positive rate.

The value must be in the range \((0, 1)\). The recommended range is \([1\mathrm{e}{ -5 },1\mathrm{e}{ -2 }]\).

data FencePointerIndexType Source #

The fence-pointer index type supports two types of indexes. The ordinary indexes are designed to work with any key. The compact indexes are optimised for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.

For a detailed discussion of the fence-pointer index types, see Fine-tuning: Fence-Pointer Index Type.

Constructors

OrdinaryIndex

Ordinary indexes are designed to work with any key.

When using an ordinary index, the serialiseKey function cannot produce output larger than 64KiB.

CompactIndex

Compact indexes are designed for the case where the keys in the database are uniformly distributed, e.g., when the keys are hashes.

When using a compact index, the serialiseKey function must satisfy the following additional law:

Minimal size
size (serialiseKey x) >= 8

Use serialiseKeyMinimalSize to test this law.

data DiskCachePolicy Source #

The disk cache policy determines if lookup operations use the OS page cache. Caching may improve the performance of lookups if database access follows certain patterns.

For a detailed discussion of the disk cache policy, see Fine-tuning: Disk Cache Policy.

Constructors

DiskCacheAll

Cache all data in the table.

Use this policy if the database's access pattern has either good spatial locality or both good spatial and temporal locality.

DiskCacheLevelOneTo !Int

Cache the data in the freshest l levels.

Use this policy if the database's access pattern only has good temporal locality.

The variable l determines the number of levels that are cached. For a description of levels, see Merge Policy, Size Ratio, and Write Buffer Size. With this setting, the database can be expected to cache up to \(\frac{k}{P}\) pages of memory, where \(k\) refers to the number of entries that fit in levels \([1,l]\) and is defined as \(\sum_{i=1}^{l}BT^{i}\).
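
For example, with the illustrative values \(B = 20\,000\), \(T = 4\), and \(l = 2\), the cached levels hold up to \(k = 20\,000 \cdot 4 + 20\,000 \cdot 4^2 = 400\,000\) entries, i.e., up to \(\frac{400\,000}{P}\) pages.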

DiskCacheNone

Do not cache any table data.

Use this policy if the database's access pattern has neither good spatial nor good temporal locality, e.g., if the access pattern is uniformly random.

data MergeSchedule Source #

The merge schedule balances the performance of lookups and updates against the consistency of updates. The merge schedule does not affect the performance of table unions. With the one-shot merge schedule, lookups and updates are more efficient overall, but some updates may take much longer than others. With the incremental merge schedule, lookups and updates are less efficient overall, but each update does a similar amount of work. This parameter is explicitly referenced in the documentation of those operations it affects.

For a detailed discussion of the effect of the merge schedule, see Fine-tuning: Merge Schedule.

Constructors

OneShot

The OneShot merge schedule causes the merging algorithm to complete merges immediately. This is more efficient than the Incremental merge schedule, but has an inconsistent workload. Using the OneShot merge schedule, the worst-case disk I/O complexity of the update operations is linear in the size of the table. For real-time systems and other use cases where unresponsiveness is unacceptable, use the Incremental merge schedule.

Incremental

The Incremental merge schedule spreads out the merging work over time. This is less efficient than the OneShot merge schedule, but has a consistent workload. Using the Incremental merge schedule, the worst-case disk I/O complexity of the update operations is logarithmic in the size of the table.

Table Configuration Overrides

Ranges

data Range k Source #

A range of keys.

Constructors

FromToExcluding k k

FromToExcluding i j is the range from i (inclusive) to j (exclusive).

FromToIncluding k k

FromToIncluding i j is the range from i (inclusive) to j (inclusive).

Instances

Instances details
Functor Range Source # 
Instance details

Defined in Database.LSMTree.Internal.Range

Methods

fmap :: (a -> b) -> Range a -> Range b #

(<$) :: a -> Range b -> Range a #

Show k => Show (Range k) Source # 
Instance details

Defined in Database.LSMTree.Internal.Range

Methods

showsPrec :: Int -> Range k -> ShowS #

show :: Range k -> String #

showList :: [Range k] -> ShowS #

NFData k => NFData (Range k) Source # 
Instance details

Defined in Database.LSMTree.Internal.Range

Methods

rnf :: Range k -> () #

Eq k => Eq (Range k) Source # 
Instance details

Defined in Database.LSMTree.Internal.Range

Methods

(==) :: Range k -> Range k -> Bool #

(/=) :: Range k -> Range k -> Bool #

Union Credit and Debt

newtype UnionDebt Source #

Union debt represents the amount of computation that must be performed before an incremental union is completed. This includes the cost of completing incremental unions that were part of a union's input.

Warning: The UnionDebt returned by remainingUnionDebt is an upper bound on the remaining union debt, not the exact union debt.

Constructors

UnionDebt Int 

Key/Value Serialisation

newtype RawBytes Source #

Raw bytes.

This type imposes no alignment constraint and provides no guarantee of whether the memory is pinned or unpinned.

Constructors

RawBytes (Vector Word8) 

Instances

Instances details
IsString RawBytes Source #

fromString: \(O(n)\).

Warning: fromString truncates multi-byte characters to octets, e.g., "枯朶に烏のとまりけり秋の暮" becomes "�6k�nh~�Q��n�".

Instance details

Defined in Database.LSMTree.Internal.RawBytes

Monoid RawBytes Source #

mempty: \(O(1)\).

mconcat: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.RawBytes

Semigroup RawBytes Source #

(<>): \(O(n)\).

sconcat: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.RawBytes

IsList RawBytes Source #

fromList: \(O(n)\).

toList: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.RawBytes

Associated Types

type Item RawBytes #

Show RawBytes Source # 
Instance details

Defined in Database.LSMTree.Internal.RawBytes

NFData RawBytes Source # 
Instance details

Defined in Database.LSMTree.Internal.RawBytes

Methods

rnf :: RawBytes -> () #

Eq RawBytes Source # 
Instance details

Defined in Database.LSMTree.Internal.RawBytes

Ord RawBytes Source #

This instance uses lexicographic ordering.

Instance details

Defined in Database.LSMTree.Internal.RawBytes

Hashable RawBytes Source # 
Instance details

Defined in Database.LSMTree.Internal.RawBytes

type Item RawBytes Source # 
Instance details

Defined in Database.LSMTree.Internal.RawBytes

class SerialiseKey k where Source #

Serialisation of keys.

Instances should satisfy the following laws:

Identity
deserialiseKey (serialiseKey x) == x
Identity up to slicing
deserialiseKey (packSlice prefix (serialiseKey x) suffix) == x
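
A sketch of an instance for a key wrapper that delegates to the Word64 instance, assuming the class methods are serialiseKey :: k -> RawBytes and deserialiseKey :: RawBytes -> k as referenced in the laws above; both laws are then inherited from the Word64 instance:

import Data.Word (Word64)
import Database.LSMTree.Simple

newtype UserId = UserId Word64

instance SerialiseKey UserId where
  serialiseKey (UserId w) = serialiseKey w
  deserialiseKey = UserId . deserialiseKey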

Instances

Instances details
SerialiseKey ByteArray Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Int16 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Int32 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Int64 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Int8 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Word16 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Word32 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Word64 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Word8 Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey ByteString Source #

serialiseKey: \(O(n)\).

deserialiseKey: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey ByteString Source #

serialiseKey: \(O(n)\).

deserialiseKey: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey ShortByteString Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey String Source #

serialiseKey: \(O(n)\).

deserialiseKey: \(O(n)\).

The String is (de)serialised as UTF-8.

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Int Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseKey Word Source #

serialiseKey: \(O(1)\).

deserialiseKey: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

class SerialiseKey k => SerialiseKeyOrderPreserving k Source #

Order-preserving serialisation of keys.

Table data is sorted by serialised keys. Range lookups and cursors return entries in this order. If serialisation does not preserve the ordering of unserialised keys, then range lookups and cursors return entries out of order.

If the SerialiseKey instance for a type preserves the ordering, then it can safely be given an instance of SerialiseKeyOrderPreserving. These should satisfy the following law:

Order-preserving
x `compare` y == serialiseKey x `compare` serialiseKey y

Serialised keys are lexicographically ordered. To satisfy the Order-preserving law, keys should be serialised into a big-endian format.
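
A sketch of checking the law with the bundled property test, assuming the QuickCheck library is available:

import Data.Word (Word64)
import Test.QuickCheck (quickCheck)
import Database.LSMTree.Simple

checkOrderPreserving :: IO ()
checkOrderPreserving =
  quickCheck (serialiseKeyPreservesOrdering :: Word64 -> Word64 -> Bool)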

class SerialiseValue v where Source #

Serialisation of values and blobs.

Instances should satisfy the following laws:

Identity
deserialiseValue (serialiseValue x) == x
Identity up to slicing
deserialiseValue (packSlice prefix (serialiseValue x) suffix) == x

Instances

Instances details
SerialiseValue ByteArray Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Void Source #

This instance is intended for tables without blobs.

The implementation of deserialiseValue throws an exception.

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Int16 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Int32 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Int64 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Int8 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Word16 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Word32 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Word64 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Word8 Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue ByteString Source #

serialiseValue: \(O(n)\).

deserialiseValue: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue ByteString Source #

serialiseValue: \(O(n)\).

deserialiseValue: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue ShortByteString Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(n)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue String Source #

serialiseValue: \(O(n)\).

deserialiseValue: \(O(n)\).

The String is (de)serialised as UTF-8.

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Int Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue Word Source #

serialiseValue: \(O(1)\).

deserialiseValue: \(O(1)\).

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue a => SerialiseValue (Sum a) Source #

An instance for Sum which is transparent to the serialisation of the value type.

NOTE: If you want to serialise Sum a differently from a, you must use another newtype wrapper.

Instance details

Defined in Database.LSMTree.Internal.Serialise.Class

SerialiseValue v => SerialiseValue (ResolveAsFirst v) Source # 
Instance details

Defined in Database.LSMTree.Internal.Types

SerialiseValue v => SerialiseValue (ResolveViaSemigroup v) Source # 
Instance details

Defined in Database.LSMTree.Internal.Types

Key/Value Serialisation Property Tests

serialiseKeyIdentity :: (Eq k, SerialiseKey k) => k -> Bool Source #

Test the Identity law for the SerialiseKey class

serialiseKeyIdentityUpToSlicing :: (Eq k, SerialiseKey k) => RawBytes -> k -> RawBytes -> Bool Source #

Test the Identity up to slicing law for the SerialiseKey class

serialiseKeyPreservesOrdering :: (Ord k, SerialiseKey k) => k -> k -> Bool Source #

Test the Order-preserving law for the SerialiseKeyOrderPreserving class

serialiseKeyMinimalSize :: SerialiseKey k => k -> Bool Source #

Test the Minimal size law for the CompactIndex option.

serialiseValueIdentity :: (Eq v, SerialiseValue v) => v -> Bool Source #

Test the Identity law for the SerialiseValue class

serialiseValueIdentityUpToSlicing :: (Eq v, SerialiseValue v) => RawBytes -> v -> RawBytes -> Bool Source #

Test the Identity up to slicing law for the SerialiseValue class

packSlice :: RawBytes -> RawBytes -> RawBytes -> RawBytes Source #

packSlice prefix x suffix makes x into a slice with prefix bytes on the left and suffix bytes on the right.
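
A sketch that mirrors serialiseKeyIdentityUpToSlicing, embedding a serialised key between two arbitrary neighbouring byte strings; the RawBytes arguments are built with fromString via the IsString instance, and the SerialiseKey class methods are assumed to be in scope:

import Data.String (fromString)
import Data.Word (Word64)
import Database.LSMTree.Simple

sliceLawHolds :: Word64 -> Bool
sliceLawHolds x =
  deserialiseKey (packSlice (fromString "ab") (serialiseKey x) (fromString "cd")) == x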

Errors

data TableUnionNotCompatibleError Source #

A table union was constructed with two tables that are not compatible.

Constructors

ErrTableUnionHandleTypeMismatch 

Fields

  • !Int

    The index of the first table.

  • !TypeRep

    The type of the filesystem handle of the first table.

  • !Int

    The index of the second table.

  • !TypeRep

    The type of the filesystem handle of the second table.

ErrTableUnionSessionMismatch 

Fields

  • !Int

    The index of the first table.

  • !FilePath

    The session directory of the first table.

  • !Int

    The index of the second table.

  • !FilePath

    The session directory of the second table.