lsm-tree-0.1.0.0: Log-structured merge-trees
Safe Haskell: Safe-Inferred
Language: GHC2021

Database.LSMTree.Internal.CRC32C

Description

Functionality related to CRC-32C (Castagnoli) checksums:

  • Support for calculating checksums while incrementally writing files.
  • Support for verifying checksums of files.
  • Support for a text file format listing file checksums.

Documentation

newtype CRC32C

Constructors

CRC32C 

Fields

Instances

Show CRC32C
Eq CRC32C
    (==) :: CRC32C -> CRC32C -> Bool
    (/=) :: CRC32C -> CRC32C -> Bool
Ord CRC32C
Prim CRC32C

All instances are defined in Database.LSMTree.Internal.CRC32C.
Pure incremental checksum calculation
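
As an illustration of what incremental calculation means here, the following is a minimal sketch of a bitwise CRC-32C that can be fed input piece by piece. The names Crc, crcInit and crcUpdate are hypothetical stand-ins and are not this module's API; the module itself may well rely on a faster table-driven or hardware-accelerated implementation.

```haskell
import Data.Bits (complement, shiftR, testBit, xor)
import qualified Data.ByteString as BS
import Data.Word (Word32, Word8)

-- | Hypothetical stand-in for a checksum newtype (not the module's CRC32C).
newtype Crc = Crc Word32
  deriving (Eq, Show)

-- | Checksum of the empty input.
crcInit :: Crc
crcInit = Crc 0

-- | Feed more bytes into a running checksum. The stored value is always the
-- finalised checksum, so the final bit inversion is undone before continuing
-- and reapplied afterwards.
crcUpdate :: Crc -> BS.ByteString -> Crc
crcUpdate (Crc c) bs = Crc (complement (BS.foldl' step (complement c) bs))
  where
    -- Mix in one byte, then process its eight bits.
    step :: Word32 -> Word8 -> Word32
    step acc byte = iterate shift1 (acc `xor` fromIntegral byte) !! 8

    -- One bit of the reflected (least significant bit first) CRC, using the
    -- reversed Castagnoli polynomial.
    shift1 :: Word32 -> Word32
    shift1 x
      | testBit x 0 = (x `shiftR` 1) `xor` 0x82F63B78
      | otherwise   = x `shiftR` 1
```

With this sketch, updating with two pieces in sequence gives the same result as a single update with their concatenation, which is the property that makes it possible to compute a checksum while a file is being written.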

I/O with checksum calculation

hGetExactlyCRC32C :: MonadThrow m => HasFS m h -> Handle h -> Word64 -> CRC32C -> m (ByteString, CRC32C)

This function ensures that exactly the requested number of bytes is read. If the file is too short, an FsError of type FsReachedEOF is thrown.

It attempts to read everything into a single strict chunk, which should almost always succeed. If it doesn't, multiple chunks are produced.

TODO: To reliably return a strict bytestring without additional copying, fs-api needs to support directly reading into a buffer, which is currently work in progress: https://github.com/input-output-hk/fs-sim/pull/46
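
The read-exactly behaviour described above can be sketched independently of fs-api. This is a minimal sketch under stated assumptions: getExactlyWith, ReachedEOF and memoryReader are hypothetical names, the reader argument stands in for a file handle read that may return fewer bytes than requested, and the update argument stands in for the CRC-32C update.

```haskell
import Control.Exception (Exception, throwIO)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Data.IORef (atomicModifyIORef', newIORef)

-- | Hypothetical error, standing in for an FsError of type FsReachedEOF.
data ReachedEOF = ReachedEOF
  deriving Show

instance Exception ReachedEOF

-- | Read exactly @n0@ bytes, threading a checksum-like accumulator through
-- every chunk. The reader returns at most the requested number of bytes and
-- an empty chunk at end of file.
getExactlyWith
  :: (c -> BS.ByteString -> c)   -- ^ accumulator update (assumed parameter)
  -> (Int -> IO BS.ByteString)   -- ^ reader that may return short reads
  -> Int                         -- ^ number of bytes to read
  -> c
  -> IO (BL.ByteString, c)
getExactlyWith update readSome n0 c0 = go [] n0 c0
  where
    go acc remaining c
      | remaining <= 0 = pure (BL.fromChunks (reverse acc), c)
      | otherwise = do
          chunk <- readSome remaining
          if BS.null chunk
            then throwIO ReachedEOF   -- input ended before n0 bytes were read
            else go (chunk : acc) (remaining - BS.length chunk) (update c chunk)

-- | In-memory reader for experimentation: serves the given bytes in pieces of
-- at most @maxPiece@ bytes, then empty chunks.
memoryReader :: Int -> BS.ByteString -> IO (Int -> IO BS.ByteString)
memoryReader maxPiece contents = do
  ref <- newIORef contents
  pure $ \want ->
    atomicModifyIORef' ref $ \rest ->
      let (piece, rest') = BS.splitAt (min want maxPiece) rest
      in (rest', piece)
```

When the first read returns all requested bytes, the result holds a single chunk; multiple chunks appear only after a short read, which mirrors the behaviour documented for hGetExactlyCRC32C.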

hPutAllCRC32C :: forall m h. Monad m => HasFS m h -> Handle h -> ByteString -> CRC32C -> m (Word64, CRC32C)

This function makes sure that the whole ByteString is written.

hPutAllChunksCRC32C :: forall m h. Monad m => HasFS m h -> Handle h -> ByteString -> CRC32C -> m (Word64, CRC32C)

This function makes sure that the whole lazy ByteString is written.

readFileCRC32C :: forall m h. MonadThrow m => HasFS m h -> FsPath -> m CRC32C

newtype ChunkSize

Constructors

ChunkSize ByteCount 

hGetExactlyCRC32C_SBS

Arguments

:: forall m h. (MonadThrow m, PrimMonad m)
=> HasFS m h
-> Handle h
-> ByteCount    -- ^ Number of bytes to read
-> CRC32C
-> m (ShortByteString, CRC32C)

Reads exactly as many bytes as requested, returning a ShortByteString and updating a given CRC32C value.

If EOF is found before the requested number of bytes is read, an FsError exception is thrown.

The returned ShortByteString is backed by pinned memory.

hGetAllCRC32C'

Arguments

:: forall m h. PrimMonad m
=> HasFS m h
-> Handle h
-> ChunkSize    -- ^ Chunk size, must be larger than 0
-> CRC32C
-> m CRC32C

Reads all bytes, updating a given CRC32C value without returning the bytes.
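
The idea of reading in fixed-size chunks while keeping only the running checksum can be sketched with plain System.IO handles. This is a minimal sketch, not the fs-api based implementation: foldHandleChunks and foldFileChunks are hypothetical names, and the update argument stands in for the CRC-32C update.

```haskell
import qualified Data.ByteString as BS
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- | Read a handle to the end in chunks of at most @chunkSize@ bytes, folding
-- each chunk into an accumulator. No chunk is retained after it has been
-- folded in, so memory use is bounded by the chunk size.
foldHandleChunks :: (c -> BS.ByteString -> c) -> Int -> Handle -> c -> IO c
foldHandleChunks update chunkSize h c0
  | chunkSize <= 0 = ioError (userError "chunk size must be larger than 0")
  | otherwise = go c0
  where
    go c = do
      chunk <- BS.hGetSome h chunkSize
      if BS.null chunk
        then pure c                                   -- end of file
        else let c' = update c chunk in c' `seq` go c'  -- force each step

-- | Fold over a whole file, opening and closing the handle around the fold.
foldFileChunks :: (c -> BS.ByteString -> c) -> Int -> FilePath -> c -> IO c
foldFileChunks update chunkSize path c0 =
  withBinaryFile path ReadMode (\h -> foldHandleChunks update chunkSize h c0)
```

A chunk size of zero would make the first read return no bytes and look like end of file, which is one reason to reject it up front.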

Checksum files

We use .checksum files to help verify the integrity of on-disk snapshots. Each .checksum file lists the CRC-32C (Castagnoli) checksums of other files. For further details see doc/format-directory.md.

The file uses the BSD-style checksum format (e.g. as produced by tools like md5sum --tag), with the algorithm name "CRC32C". This format is text, one line per file, using hexadecimal for the 32-bit output.

Checksum files are used for each LSM run, and for the snapshot metadata.

Typical examples are:

CRC32C (keyops) = fd040004
CRC32C (blobs) = 5a3b820c
CRC32C (filter) = 6653e178
CRC32C (index) = f4ec6724

Or

CRC32C (snapshot) = 87972d7f
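
To make the line format concrete, here is a minimal sketch of rendering and parsing a single line. renderLine and parseLine are hypothetical names, not this module's API. The sketch assumes exactly eight hexadecimal digits per checksum, as in the examples above, and accepts both upper-case and lower-case digits; the actual parser may be stricter or more lenient.

```haskell
import Data.Char (isHexDigit)
import Data.List (stripPrefix)
import Data.Word (Word32)
import Numeric (readHex, showHex)

-- | Render one line, zero-padding the checksum to eight hex digits.
renderLine :: String -> Word32 -> String
renderLine name crc = "CRC32C (" ++ name ++ ") = " ++ pad (showHex crc "")
  where
    pad s = replicate (8 - length s) '0' ++ s

-- | Parse one line. The last eight characters are taken as the checksum, so
-- a file name containing parentheses does not confuse the parser.
parseLine :: String -> Maybe (String, Word32)
parseLine line = do
  body <- stripPrefix "CRC32C (" line
  let (front, hex) = splitAt (length body - 8) body
  name <- stripSuffix ") = " front
  if length hex == 8 && all isHexDigit hex
    then case readHex hex of
      [(value, "")] -> Just (name, value)
      _             -> Nothing
    else Nothing
  where
    -- Drop a suffix by stripping its reverse from the reversed string.
    stripSuffix suffix s = reverse <$> stripPrefix (reverse suffix) (reverse s)
```

A whole checksum file could then be handled by applying parseLine to each element of lines applied to the file contents, treating any Nothing as a format error.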

Checksum checking

checkCRC :: forall m h. (MonadMask m, PrimMonad m) => HasFS m h -> HasBlockIO m h -> Bool -> CRC32C -> FsPath -> m ()

Check the CRC32C checksum for a file.

If the boolean argument is True, all file data for this path is evicted from the page cache.

File format errors

data FileCorruptedError

The file is corrupted.

Constructors

ErrFileFormatInvalid

The file fails to parse.

Fields

ErrFileChecksumMismatch

The file's CRC-32C checksum is invalid.

Fields