This article is part of a series describing the design of KahaDB, a file-based persistence database. The components involved are the page file, pages, transactions, free pages, the page writer, the B-Tree index, page allocation, storage and the recovery file. We will deal with each component separately and try to understand the use cases involved. This article covers loading a page file and is best read as a set of notes on the use cases involved.
Load Page file
- Create Page Cache.
- Allocate OS resources for read/write: Create the main page file, free list file and recovery file, if they don't exist.
- Read metadata: If the page file already exists, load the metadata from it; if this is the first time we are loading the page file, write the metadata to it. Properties of the metadata are page size, free page count, page count, last transaction ID, clean shutdown and last free page ID.
- Recovery file: If we have failed to unload the page file properly during our last access then we have to recover the pages from the recovery file and write to the page file. After recovery, scan the page file for any free pages.
- Page writer: Start writer thread
Create page cache – LFU or LRU. Every page that is loaded is cached. Each page has a page ID, which identifies the page's position (offset) in the page file.
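The LRU variant of such a cache can be sketched in a few lines. This is only an illustration, not KahaDB's implementation: the class name `LRUPageCache`, the `page_offset` helper, and the idea that a page ID is an index multiplied by the page size (plus an assumed fixed header) are all assumptions made for this example.

```python
from collections import OrderedDict

HEADER_SIZE = 4096  # assumed size of the file header that precedes page 0
PAGE_SIZE = 4096    # assumed page size; fixed once the file is created


def page_offset(page_id, page_size=PAGE_SIZE, header_size=HEADER_SIZE):
    """Translate a page ID into a byte offset in the page file."""
    return header_size + page_id * page_size


class LRUPageCache:
    """Keeps the most recently used pages and evicts the oldest when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()  # page_id -> page bytes

    def get(self, page_id):
        if page_id not in self._pages:
            return None
        self._pages.move_to_end(page_id)  # mark as most recently used
        return self._pages[page_id]

    def put(self, page_id, data):
        self._pages[page_id] = data
        self._pages.move_to_end(page_id)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # evict the least recently used
```

An LFU cache would instead track an access count per page and evict the page with the lowest count.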
Allocate OS resources
Create main page file, free list file and recovery file, if they don’t exist, so that they can be accessed for read/write purposes.
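A minimal sketch of this step follows. The helper names and the file suffixes (`.data`, `.free`, `.redo`) are assumptions chosen for the example, not something stated in these notes.

```python
import os


def open_or_create(path):
    """Open a file for random-access read/write, creating it if missing."""
    if not os.path.exists(path):
        # "x+b" creates the file and fails if it appeared in the meantime
        return open(path, "x+b")
    return open(path, "r+b")


def open_page_files(directory, name):
    """Open the three files behind one page file (suffixes are assumed)."""
    suffixes = {"main": ".data", "free": ".free", "recovery": ".redo"}
    return {
        role: open_or_create(os.path.join(directory, name + suffix))
        for role, suffix in suffixes.items()
    }
```

Opening with `r+b` rather than `w+b` matters here: an existing file must not be truncated when it is loaded again.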
Load page metadata. If this is the first time we are loading the page file, write metadata to the file.
One of the main properties of the metadata is the page size, as it can't be changed once the file is created. Other properties are free page count, last transaction ID, clean shutdown and last free page ID. We rewrite the metadata once loading is done. At that point the clean shutdown flag is written as false; it is set to true only when the page file is properly unloaded.
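The load-or-create logic for the metadata might look roughly like the sketch below. The `key=value` text encoding, the 4096-byte metadata area, the field names, and treating a brand-new file as "clean" are all assumptions for illustration; only a subset of the properties is modelled.

```python
from dataclasses import dataclass, asdict

METADATA_SIZE = 4096  # assumed size reserved for metadata at the file start


@dataclass
class MetaData:
    page_size: int = 4096
    free_pages: int = 0
    last_tx_id: int = 0
    clean_shutdown: bool = False

    def to_bytes(self):
        lines = [f"{key}={value}" for key, value in asdict(self).items()]
        return ("\n".join(lines) + "\n").encode("ascii")

    @classmethod
    def from_bytes(cls, raw):
        text = raw.decode("ascii")
        fields = dict(line.split("=", 1) for line in text.splitlines() if line)
        return cls(
            page_size=int(fields["page_size"]),
            free_pages=int(fields["free_pages"]),
            last_tx_id=int(fields["last_tx_id"]),
            clean_shutdown=fields["clean_shutdown"] == "True",
        )


def load_metadata(f, requested_page_size=4096):
    """Load metadata from an open binary file, or create it on first load.

    Returns (metadata, was_clean), where was_clean reports how the previous
    access ended.
    """
    f.seek(0)
    raw = f.read(METADATA_SIZE)
    if raw.strip(b"\x00"):
        meta = MetaData.from_bytes(raw.rstrip(b"\x00"))
        was_clean = meta.clean_shutdown
    else:
        # First load: the requested page size becomes permanent.
        meta = MetaData(page_size=requested_page_size)
        was_clean = True  # assumption: a new file has nothing to recover
    meta.clean_shutdown = False  # stays false until a proper unload
    f.seek(0)
    f.write(meta.to_bytes().ljust(METADATA_SIZE, b"\x00"))
    f.flush()
    return meta, was_clean
```

Note that a page size requested on a later load is ignored in this sketch, which mirrors the rule that the page size cannot change after the file is created.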
The metadata property ‘Clean Shut Down’ tells us whether the page file was properly unloaded after its last access. If it was, we also load the list of pages that were free at that point. The number of free pages available is also a property of the page metadata. The free page list is loaded from the free file, which contains page IDs or ranges of page IDs.
If the last access was not shut down properly, we try to recover the page file using the recovery file.
The recovery file header consists of the next transaction ID, a checksum and the page count. The data that follows stores the offset and the page content for each page.
Data is not recovered if any of the following occurs:
- The recovery record is invalid because the data could not be fully read, probably due to a partial write to the recovery buffer.
- The redo buffer was not fully written out.
- The checksum is not valid, which means the recovery buffer was only partially written to disk.
If none of the above occurs, the data read from redo logs is written to page file.
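These checks can be sketched as follows. The binary layout (a header of next transaction ID, checksum and page count, followed by offset/content records), the use of Adler-32 as the checksum, and the function names are assumptions made for this example rather than the actual on-disk format.

```python
import struct
import zlib

# next transaction ID, checksum, page count (assumed layout)
HEADER = struct.Struct(">qqi")


def write_recovery(f, next_tx_id, pages):
    """Write a recovery file; pages is a list of (offset, content)."""
    body = b"".join(struct.pack(">q", off) + data for off, data in pages)
    f.seek(0)
    f.write(HEADER.pack(next_tx_id, zlib.adler32(body), len(pages)) + body)


def recover(recovery_file, page_file, page_size):
    """Replay recovery records into page_file.

    Returns the next transaction ID, or None if nothing was recovered.
    """
    recovery_file.seek(0)
    header = recovery_file.read(HEADER.size)
    if len(header) < HEADER.size:
        return None  # header itself is incomplete
    next_tx_id, checksum, page_count = HEADER.unpack(header)
    if page_count < 0:
        return None  # invalid recovery record
    record_size = 8 + page_size
    body = recovery_file.read(page_count * record_size)
    if len(body) < page_count * record_size:
        return None  # redo buffer was not fully written out
    if zlib.adler32(body) != checksum:
        return None  # partial or corrupted write
    # Everything checks out: copy each page back to its offset.
    for i in range(page_count):
        record = body[i * record_size:(i + 1) * record_size]
        (offset,) = struct.unpack(">q", record[:8])
        page_file.seek(offset)
        page_file.write(record[8:])
    return next_tx_id
```

All validation happens before the first write, so a damaged recovery file leaves the page file untouched.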
The next transaction ID is the last transaction ID + 1. The last transaction ID is stored in the metadata. If it was not a clean shutdown, the next transaction ID is taken from the redo log instead.
If the shutdown was not clean, we also have to scan the page file to find the free pages.
A page deallocated after initial use is called a free page. If the page file header has free pages, the free list will be loaded from the free file. The free file header has the free pages count and the data. The data here is the page ID or range of page IDs.
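Storing ranges keeps the free file small when many adjacent pages are free. The sketch below shows one way to do it, assuming a layout of an 8-byte free page count followed by (first page ID, run length) pairs; the layout and the function names are assumptions for illustration only.

```python
import struct


def write_free_list(f, page_ids):
    """Store free page IDs as (first, length) runs after a count header."""
    runs = []
    for page_id in sorted(set(page_ids)):
        if runs and page_id == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1           # extends the current run
        else:
            runs.append([page_id, 1])  # starts a new run
    f.write(struct.pack(">q", sum(length for _, length in runs)))
    for first, length in runs:
        f.write(struct.pack(">qq", first, length))


def read_free_list(f, expected_count):
    """Expand the stored runs into page IDs, checking the count."""
    (count,) = struct.unpack(">q", f.read(8))
    if count != expected_count:
        raise ValueError("free file does not match the metadata count")
    free = []
    while len(free) < count:
        first, length = struct.unpack(">qq", f.read(16))
        free.extend(range(first, first + length))
    return free
```

Here `expected_count` stands for the free page count kept in the page file metadata, which gives a cheap consistency check between the two files.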
If the shutdown was not clean, we can't rely on the free file, so the free list is rebuilt by scanning the main page file.
The page file is scanned where each page’s header is loaded. If the page type is of free type, it is added to the free pages list.
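The scan only needs to read each page's header, not its content. A rough sketch, assuming a page header that starts with a one-byte page type followed by a transaction ID, and a type code of 0 for free pages (both assumptions for this example):

```python
import struct

PAGE_FREE_TYPE = 0                  # assumed type code for a free page
PAGE_HEADER = struct.Struct(">bq")  # page type, transaction ID (assumed)


def scan_free_pages(f, page_size, header_size=0):
    """Return the IDs of all pages whose header marks them as free."""
    free = []
    page_id = 0
    while True:
        f.seek(header_size + page_id * page_size)
        raw = f.read(PAGE_HEADER.size)
        if len(raw) < PAGE_HEADER.size:
            break  # reached the end of the file
        page_type, _tx_id = PAGE_HEADER.unpack(raw)
        if page_type == PAGE_FREE_TYPE:
            free.append(page_id)
        page_id += 1
    return free
```

This is a full pass over the file, which is why it is only done when the free file cannot be trusted.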
Start the writer thread, which writes page content to the page file in batches, separately from the threads that produce the updates.
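The batching idea can be illustrated with a queue and a background thread. This is a simplified sketch under assumed names (`PageWriter`, `batch_size`); it assumes a single producer and leaves out the recovery file that a real writer would fill before touching the page file.

```python
import queue
import threading


class PageWriter:
    """Collects page writes and applies them in batches on its own thread."""

    def __init__(self, page_file, batch_size=8):
        self._file = page_file
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.batches = []  # size of each batch written, for inspection

    def start(self):
        self._thread.start()

    def write(self, offset, data):
        self._queue.put((offset, data))

    def stop(self):
        self._queue.put(None)  # sentinel: drain what is queued, then exit
        self._thread.join()

    def _run(self):
        stopping = False
        while not stopping:
            batch = [self._queue.get()]  # block until there is work
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            if None in batch:
                stopping = True
                batch = batch[:batch.index(None)]
            for offset, data in batch:
                self._file.seek(offset)
                self._file.write(data)
            if batch:
                self._file.flush()  # one flush per batch, not per page
                self.batches.append(len(batch))
```

Flushing once per batch rather than once per page is the main benefit of batching, since it cuts down the number of expensive sync operations.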