B Tree Library History

The B Tree routines in this library were originally written in Fortran 77 during the late 1980’s for use in a gazetteer.

The routines were ported to C over a number years (as time permitted), but still very much organised as the original Fortran. This version was utilised in the anag and dict programs, for anagram solving.

The source was re-organised in early 2001 to adopt more closely a C organisation style (note the source code itself still has a Fortran flavour). This was known as "bt_new".

The original B Tree implementation was designed solely as an index handler. Data was expected to be managed by the client application. In addition, the routines were only able to manage exclusive access to the index file - shared access would result in corrupt index files. Lastly, only one index file could be open at a time - effectively preventing any application-mediated copying capability.

"bt_shared" was derived to resolve these three issues: to provide a combined index and data file (yes, something like CP-V/CP-6 keyed files), to allow shared access to a B Tree file by concurrent processes, and to permit a single process to open multiple index files concurrently.

These capabilities were provided in a 32 bit implementation, meaning the largest file that could be supported was 2GiB. Version 3.0 was developed to add support for larger files, i.e those requiring 64 bit addressing. This is dependent on kernel and gcc Large File Support.

Introduction

The B Tree (BT) library offers a set of C language functions which implement a generalised index file capability, based on the B tree indexing scheme. The B tree was originally described by Bayer and McCraight. A B tree is a multiway balanced tree: i.e. there is more than one key per node, and all leaf nodes are the same distance from the root node.

The BT functions implement a 'classical' B tree, not one of the later variants (B* or B+ tree).

The B Tree is stored in a UNIX disk file. The file can contain many B Trees, each of which is referred to by a name assigned by the user (or application). The system allows many such files to exist.

In order for BT to function efficiently on different hardware platforms, the important constants relating to disk block size, maximum number of keys per block, etc. are defined as constants, and may be modified prior to compilation.

System Description

Overview

The B Tree is stored in a standard UNIX file. To support efficient processing, the size of a B Tree node should be the same as the hardware’s disk block size. In the following description, the terms node and block are interchangeable.

A B Tree has a root, which acts as the starting point for all insertions, deletions and searches. A B Tree has a master root stored at file address 0 (zero), which is termed the "superroot". The superroot holds the names and root block addresses of all the B Trees in the file. The superroot also holds the free block list for the file and other administrative information.

Each block contains a number of keys, an associated integer value and pointers to other blocks. The maximum number of keys that can be stored in a block depends on the size of a block and the maximum number of bytes permitted for a key. The integer value associated with a key can be used as desired by the application program. If the record storage facilities of the B Tree are used, it will contain the block address of an associated data record.

Version 3.1 and above allows the definition of duplicate keys, which is enabled on a per-root basis. If not enabled, this version with behave as previous versions, that is, duplicate keys will be rejected.

When a B Tree file is created, the superroot is initialised with two named roots: itself and the default root. These are defined as "super" and "default" respectively. The application may create more roots as required. When a B Tree file is first opened, the default root ($$default) is always selected.

The maximum size of a B Tree file is governed by the implementation. For an implementation with a 32 bit word length, the maximum file size supported is 2GiB. If the B Tree library is built with Large File Support, that limit is removed.

Opening multiple B Tree files

An application program may have more than one B Tree file opened at any one time. When a B Tree file is created or opened, a B Tree context pointer is returned to the application. All B Tree functions must be passed this context pointer to indicate which open B Tree file is to be operated on. This parameter is identified as btact in each function description.

Shared Access

B Tree files may be created/opened in exclusive or shared mode. Application programs that use shared access should be prepared to handle a busy return from a read or update access to the B Tree file.

An application can gain exclusive access to a B File after it has been opened in shared mode. This is achieved via the btlock function. The btunlock function relinquishes exclusive access.

Large File Support

In order to support large files (i.e. those > 2GiB), a new type has been introduced into the BT library, BTint, which can be 32 bits (i.e int) when compiled without Large File Support, or 64 bits (i.e. long long), when compiled with Large File Support. BTint is a typedef, which will be declared as appropriate. BT library function arguments which must be declared as BTint are described in the API, but version 2.x users of bfndky, bdbug, binsky, bnxtky or bupdky should be aware of the need to change argument declarations from int to BTint.

Duplicate Keys

Version 3.1 added support for duplicate keys. By default, duplicate keys are not permitted, so this (and later) versions behaves as previous versions. Duplicate key support is enabled on a per-root basis, using the btdups API function.

Finding a duplicate key (via bfndky) will leave the index at the first instance of the key. bnxtky may be used to walk through the set of duplicate keys. The bprvky function has been added to allow reverse key navigation.

To faciltate the management of duplicate keys, a number of BTree functions have been modified to operate against the current key, as selected by the bfndky, bnxtky or bprvky functions. These are: bupdky, bdelky, btupd, btdel and btrecs. Passing these functions a key pointer of NULL will invoke the desired operation against the currently selected key. See individual function descriptions for further details.

Sample Program

A very simple use of the BTree API is shown below. This program creates a BTree file and inserts one key ("akey") with the value of 99. Error checking has been omitted for clarity.

        #include "btree.h"
        int main(int argc, char *argv[])
        {
            BTA *btfile;
            btinit();
            btfile = btcrt("test_db",0,FALSE);
            binsky(btfile,"akey",99);
            btcls(btfile);
            return 0;
        }

If the program source resides in the bt directory, the command to compile and link the program will be (assuming no Large File Support):

        gcc -o simple simple.c -Iinc -Llib -lbt

A couple of additional sample programs, using the BT Library API, can be found the the samples sub-directory. A Makefile is provided to build the sample programs. The makefile assumes it will be run in the samples directory.

Function Descriptions

Introduction

This chapter describes each of the functions offered by the BT API. Rather than present the functions in alphabetic order (as any sensible document would), the functions are described in order of probable usage by an application program. To make it even more difficult to use as a reference manual, the functions are titled by their functionality, not their names.

Initialising the B Tree library

#include "btree.h"

int btinit(void);

The btinit function initialises the B Tree library. It must be invoked before any other B Tree routine. Failure to do so will result in strange errors.

Calling btinit more than once in the execution lifetime of the B Tree library will cause it to return an error (QINERR). btinit also checks that the block size, in bytes, of the B Tree library is a power of two. An error return will result for non-conformant block sizes. Successful initialisation is indicated by a return value of zero.

Creating a B Tree File

#include "btree.h"

BTA* btcrt(char* fid, int nkeys, int shared);

The btcrt function will create and initialise a new B Tree file. The parameter fid must be set to the name of the file to create. The nkeys defines the maximum number of keys that can be stored in the B Tree. This parameter should always be set to 0 for those operating systems, such as UNIX, that support dynamic file growth. The parameter shared should be set to 0 to disallow shared access to the newly created B Tree, or non-zero to allow shared access.

btcrt will return a pointer to the BT activation context for the newly opened file (BTA*), or NULL in the case of an error. To determine the cause of an error, invoke the btcerr function.

If the B Tree index file has been successfully created, the default root is selected, the file becomes the in-use B Tree file and is ready for further operations.

WARNING: The btcrt function will unconditionally create a new file, even if a file of the same name already exists.

Opening a B Tree File

#include "btree.h"

BTA* btopn(char* fid, int mode, int shared);

The btopn function will open an existing B Tree file. The parameter fid must be set to the name of the file to open. The mode parameter determines if the B Tree file can be updated. A value of zero indicates that updates are allowed, a non-zero value will prohibit updates. The parameter shared should be set to zero to disallow shared access to the B Tree file, or non-zero to allow shared access.

btopn will return a pointer to the BT activation context for the newly opened file (BTA*), or NULL in the case of an error. To determine the cause of an error, invoke the btcerr function.

If the B Tree index file has been successfully opened, the default root is selected, and the file is ready for further operations.

Closing a B Tree File

#include "btree.h"

int btcls(BTA* btact);

The btcls function will close the file associated with the btact context pointer.

A non-zero return code indicates an error occurred in closing the file.

Set/unset support for duplicate keys

#include "btree.h"

int btdups(BTA* btact, int dups);

The btdups controls support for duplicate keys in the current root of the index file associated with the btact context pointer. Setting the value of the dups to non-zero (TRUE) will enable support for duplicate keys in the current root. A value of zero (FALSE) will disable duplicate key support for the current root. Enabling duplicate key support on the superroot is not permitted.

Disabling duplicate key support on a root that previously permitted them merely prevents further duplicate keys from being inserted into the root BTree index. Existing duplicates will remain and must be managed by the application.

A non-zero return code indicates an error occurred.

Set write through threshold for index file blocks

#include "btree.h"

int btthresh(BTA* btact, int threshold);

The btthresh function sets the write threshold for the btree index file associated with the btact context pointer. The threshold defines the number of updates on a block that will cause it to be written to disk. A value of zero (the default for a btree index) means that a block is not written to disk until the memory it occupies is required for a new block.

btthresh offers finer-grained control over disk writes than in previous versions of Btree, which was either only when necessary (in exclusive mode), or after every API call (in shared mode). The intention is to allow the application program to reduce the chance of lost data in a btree index should a hardware or software falure interrupt the running program before the indexes are closed and dirty blocks flushed to disk.

NB: If threshold is set to a small value, it may reduce performance of the BTree application.

A non-zero return code indicates an error occurred.

Inserting a key

#include "btree.h"

int binsky(BTA* btact, char* key, BTint value);

The binsky function inserts a new key and associated integer value into the current root of the file associated with the btact context pointer. The key, a character string, is passed in key, while value holds the associated integer value. value is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

If the key has been inserted successfully, binsky returns zero, otherwise an error code is returned.

Keys longer than the maximum key length (BT constant ZKYLEN) will be silently truncated to the maximum key length.

A non-zero return from binsky indicates an error occurred during the key insertion process.

Finding a key

#include "btree.h"

int bfndky(BTA* btact, char* key, BTint* value);

The bfndky function searches for a key in the current root of the file associated with the btact context pointer. The key, a character string, is passed as a pointer in key. If the key is found, the associated value will be returned in the integer location identified by value. value is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

If the key is found, bfndky returns zero. If the key is not found, bfndky will return an error code of QNOKEY.

Whether or not the key is located, the B Tree context is left at the next highest key within the B Tree file. A call to bnxtky will return this key. The function bprvky may be called to return the previous key.

If the current root supports duplicate keys (enabled by a call to btdups, and the target of the bfndky function has duplicates, the context of the B Tree index is positioned at the start of the duplicate key set.

A non-zero return from bfndky indicates an error occurred during the key location process.

Finding a sequence of keys

#include "btree.h"

int bnxtky(BTA* btact, char* key, BTint* value);

The bnxtky function returns the next key from the current root in the file associated with the btact context pointer. The key, a character string, is returned via the pointer in key. The value associated with the key will be returned in the integer location identified by value. value is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

bnxtky returns zero to indicate the next key has been located. If no next key exists, bnxtky returns the error code QNOKEY.

To initialise the B Tree position, a call to bfndky or btpos must be made before the first call to bnxtky. Thereafter, repeated calls to bnxtky may be made. Calls to bprvky may be freely intermingled with calls to bnxtky.

A non-zero return from bnxtky indicates an error occurred during the key location process.

Finding a reverse sequence of keys

#include "btree.h"

int bprvky(BTA* btact, char* key, BTint* value);

The bprvky function returns the previous key from the current root in the file associated with the btact context pointer. The key, a character string, is returned via the pointer in key. The value associated with the key will be returned in the integer location identified by value. value is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

bprvky returns zero to indicate the previous key has been located. If no previous key exists, bnxtky returns the error code QNOKEY.

To initialise the B Tree position, a call to bfndky or btpos must be made before the first call to bprvky. Thereafter, repeated calls to bprvky may be made. Calls to bnxtky may be freely intermingled with calls to bprvky.

A non-zero return from bprvky indicates an error occurred during the key location process.

Setting the position within a B Tree index

#include "btree.h"

int btpos(BTA* btact, int pos);

The btpos function sets the position in the current root in the file associated with the btact context pointer. The desired position is indicated by the pos; a value of 1 positions before the first key in the index, a value of 2 will position after the last key in the index. These values correspond to the B Tree constants ZSTART and ZEND, respectively.

Following a call to btpos, calls to bnxtky and bprvky may be made to return successive or previous keys.

btpos returns zero to indicate success, otherwise the error code if an error was encountered.

Deleting a key

#include "btree.h"

int bdelky(BTA* btact, char* key);

The bdelky function deletes a key from the current root in the file associated with the btact context pointer. The key, a character string, is passed via the pointer in key. If the key does not exist, bdelky returns the error code QNOKEY. bdelky returns zero on successful deletion of a key.

If bdelky is called with a key value of NULL, the delete operation will act against the current key, as selected by bfndky, bnxtky or bprvky operations. This capability is designed to allow deletion of a duplicate key, presumably based on other, application managed, attributes.

A non-zero return from bdelky indicates an error occurred during the key deletion process.

Updating the value of a key

#include "btree.h"

int bupdky(BTA* btact, char* key, BTint value);

The bupdky function updates the value of an existing key in the current root of the file associated with the btact context pointer. The key, a character string, is passed via the pointer in key. The new value is passed via value. value is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

If the key does not exist, bupdky returns the error code QNOKEY.

If bupdky is called with a key value of NULL, the update operation will act against the current key, as selected by bfndky, bnxtky or bprvky operations. This capability is designed to allow update of a duplicate key, presumably based on other, application managed, attributes.

bupdky returns zero to indicate a successful update, error code otherwise.

Creating a root

#include "btree.h"

int btcrtr(BTA* btact, char* root);

The btcrtr function creates a new root within the file associated with the btact context pointer. The root name, a character string, is passed via the pointer in root. If the new root is created successfully, btcrtr returns zero.

On successful creation of a new root, on return from btcrtr, the new root will have been made current. If the root could not be created, the current root is unchanged.

A non-zero return from btcrtr indicates an error occurred during the root creation process.

Changing the current root

#include "btree.h"

int btchgr(BTA* btact, char* root);

The btchgr function changes the current root within the file associated with the btact context pointer. The target root name, a character string, is passed via the pointer in root. If the switch to the target root is successful, btchgr returns zero.

On successful change to the target root, on return from btchgr, the target root will have been made current. If the root could not be switched, the current root is unchanged.

A non-zero return from btchgr indicates an error occurred during the root change process.

Deleting a root

#include "btree.h"

int btdelr(BTA* btact, char* root);

The btdelr function deletes the named root within the file associated with the btact context pointer. The target root name for deletion, a character string, is passed via the pointer in root. If the deletion of the target root is successful, btdelr returns zero.

All blocks owned by the target root are deleted, and returned to the free list. Whether or not the target root is deleted, the current root is left unchanged.

A non-zero return from btdelr indicates an error occurred during the root delete process. It is considered an error to attempt to delete the current root.

Gaining exclusive access to a B Tree file

#include "btree.h"

int btlock(BTA* btact);

The btlock function enables a process to gain exclusive access to a B Tree file, originally opened in shared mode. btlock is passed btact, which holds the context pointer of the file for which exclusive access is required.

btlock will return zero on success, error code otherwise. Applications should be ready to handle a QBUSY error return, indicating that exclusive access could not be gained. btlock waits for ZSLEEP seconds before giving up the attempt to gain exclusive access. ZSLEEP is an implementation defined constant. The default is five seconds.

Releasing exclusive access on a B Tree file

#include "btree.h"

int btunlock(BTA* btact);

The btunlock function enables a process to relinquish exclusive access to a file, originally gained from a call to btlock. btunlock is passed btact, which holds the context pointer of the B Tree file for which exclusive access is no longer required.

If the B Tree file is not locked, or has been opened for exclusive access, btunlock has no effect.

A non-zero return from btunlock indicates an error occurred.

Inserting a key and data

#include "btree.h"

int btins(BTA* btact, char* key, char* data, int dsize);

The btins function inserts a key and data record into a file associated with the btact context pointer. Both key and data are character pointers. Since the data may legitimately contain null (x'00') characters, the length of the data, in bytes, is passed in dsize. dsize must be zero or greater. If the key and data is successfully stored in the B Tree file, btins returns zero.

A non-zero return from btins indicates an error occurred.

Updating data for an existing key

#include "btree.h"

int btupd(BTA* btact, char* key, char* data, int dsize);

The btupd function updates the data record of an existing key in the file associated with the btact context pointer. Both key and data are character pointers. Since the data may legitimately contain null (x'00) characters, the length of the data, in bytes, must be passed in dsize. If the replacement data is successfully stored in the B Tree file, btupd returns zero.

If btupd is called with a key value of NULL, the update operation will act against the current key, as selected by btsel, btseln or btselp operations. This capability is designed to allow update of a duplicate key, presumably based on other, application managed, attributes.

A non-zero return from btupd indicates an error occurred.

Locating data for an existing key

#include "btree.h"

int btsel(BTA* btact, char* key, char* data, int dsize, int* rsize);

The btsel function locates and returns the data record of an existing key in the file associated with the btact context pointer. Both key and data are character pointers. The dsize parameter must contain the maximum number of bytes to be returned. The caller should ensure that the data pointer refers to an area of memory of at least dsize bytes. The actual number of bytes returned is returned in rsize. Even if the data record contains more than dsize bytes, only dsize bytes will be returned. If the data record is successfully retrieved (even partially), btsel returns zero.

An application program can determine the number of bytes occupied by a data record through the btrecs function.

A non-zero return from btsel indicates an error occurred.

Deleting a key and associated data

#include "btree.h"

int btdel(BTA* btact, char* key);

The btdel function deletes a key and data record in the file associated with the btact context pointer. key is a character pointer, identifying the key to delete. If deletion of the key and data is successful, btdel returns zero.

If btdel is called with a key value of NULL, the delete operation will act against the current key, as selected by btsel, btseln or btselp operations. This capability is designed to allow deletion of a duplicate key, presumably based on other, application managed, attributes.

A non-zero return from btdel indicates an error occurred.

Locating data for the next key in sequence

#include "btree.h"

int btseln(BTA* btact, char* key, char* data, int dsize, int* rsize);

The btseln function locates and returns the next key and data record in the file associated with the btact context pointer. Before using btseln, a call to btsel or btpos must be made to initialise the position within the B Tree. Calls to btselp may be freely intermingled with calls to btseln.

Both key and data are character pointers. The dsize parameter must contain the maximum number of bytes to be returned. The caller should ensure that the data pointer refers to an area of memory of at least dsize bytes. The actual number of bytes returned is returned in rsize. Even if the data record contains more than dsize bytes, only dsize bytes will be returned. If the data record is successfully retrieved (even partially), btseln returns zero.

If no next key exists, btseln will return the error code QNOKEY.

An application program can determine the number of bytes occupied by a data record through the btrecs function.

A non-zero return from btseln indicates an error occurred.

Locating data for the previous key in sequence

#include "btree.h"

int btselp(BTA* btact, char* key, char* data, int dsize, int* rsize);

The btselp function locates and returns the previous key and data record in the file associated with the btact context pointer. Before using btselp, a call to btsel or btpos must be made to initialise the position within the B Tree. Calls to btseln may be freely intermingled with calls to btselp.

Both key and data are character pointers. The dsize parameter must contain the maximum number of bytes to be returned. The caller should ensure that the data pointer refers to an area of memory of at least dsize bytes. The actual number of bytes returned is returned in rsize. Even if the data record contains more than dsize bytes, only dsize bytes will be returned. If the data record is successfully retrieved (even partially), btselp returns zero.

If no previous key exists, btselp will return the error code QNOKEY.

An application program can determine the number of bytes occupied by a data record through the btrecs function.

A non-zero return from btseln indicates an error occurred.

Determine size of data record for specific key

#include "btree.h"

int btrecs(BTA* btact, char* key, int* rsize);

The btrecs function returns the number of bytes occupied by the data record of a key in the file associated with the btact context pointer. The key parameter is a character pointer, identifying the key to query. The number of bytes occupied by the data record is returned in rsize. If the key is located and the data size of the record returned successfully, btrecs returns zero.

If btrecs is called with a key value of NULL, the size operation will act against the current key, as selected by btsel, btseln or btselp operations. This capability is designed to allow the determination of the size of the data record of a duplicate key, presumably based on other, application managed, attributes.

If btrecs is invoked for a key without an associated data record, the results are undefined.

A non-zero return from btrecs indicates an error occurred.

#include "btree.h"

int bdbug(BTA* btact, char* opt, BTint blk);

The bdbug function provides a debug capability for the B Tree package. The following options can be passed via the opt parameter:

Table 1. Debug Options

control

-

displays the in-memory block information, together with the last key found details

super

-

displays superroot information i.e. block usage, free list etc.

stack

-

displays the tree stack (i.e. key context)

space

-

displays occupancy statistics

stats

-

displays B Tree operating statistics

block

-

displays the contents of the block identified by blk. blk is declared as a BTint, which is normally a typedef for int, but with Large File Support will be a typedef for long long.

structure

-

Performs a structure check of the currently active BTree file. If blk is set to ZNULL, information on the index structure, and problems (if any), are displayed. Otherwise, a simple statement of structure condition is displayed.

A non-zero return from bdbug indicates an error occurred during the display of debugging information.

Retrieving error message text

#include "btree.h"

void btcerr(int* ierr, int* ioerr, char* srname, char* msg);

The btcerr function returns the error code (in ierr) and (if relevant) the I/O error code (in ioerr) of the last error encountered by the B Tree system. In addition, it will return the name of the function which detected the error (in srname) and an error message in msg.

The maximum number of chars returned in srname is BT constant ZRNAMESZ. The maximum number of chars returned in msg is BT constant ZMSGSZ. Both char arrays will be zero-padded to ZRNAMESZ and ZMSGSZ respectively. Declaring these arrays to be smaller than the BT constants will ensure btcerr acts as a very effective stack smasher.

Error Messages

This section lists the errors that may be encountered when using the B Tree system. The occurrence of most of these errors indicates a serious failure in the B Tree system, with the following exceptions:

QNOKEY

The key given as a parameter to bfndky (or its brethren) does not exist.

QDUP

The key given as a parameter to binsky (or its brethren) already exists in the index. Duplicate keys are not permitted.

QBUSY

File busy, a normal hazard when using shared access mode in a multiuser environment.

QNOWRT

The B Tree file was originally opened with read-only permission, and a write has subsequently been attempted. Probably an application program error.

QNOBTF

Attempt to perform operation on B Tree file, but there is no file attached to the context pointer provided, likely an application error.

QINERR

Attempt made to open the same file again, likely an application error.

QDELCR

An attempt has been made to delete the current root, or worse, the super root. This is forbidden by the BT library.

QBADOP

Unknown debug option passed to bdbug, likely an application error.

QNOACT

Maximum number of concurrently open B Tree files reached - may be an application error.

QBADAP

Illegal context pointer passed to a B Tree function - may be an application error.

QDNEG

A negative length data record has been passed to a B Tree function.

QBADVR

The B Tree index file was created using an older version of the B Tree library, and cannot be accessed safely with this version. Extract data using a program based on the previous version of the B Tree library, and import into a index file created with the new. Alternatively, it may be possible to use the btr recovery tool to migrate an older BTree index file to the latest version.

QDAOVR

A new data record cannot be entered as the maximum value of a data block pointer has been exceeded.

QF2BIG

The index file has reached its maximum size for this implementation.

QBADAL

Unable to set alarm for for file lock handling. This may be a problem with the underlying OS.

QBADCTXT

Index context invalid for current key operation. An attempt was made to delete or update the current key, but the context is not valid. A valid context is set by bfndky, bnxtky, bprvky, btsel, btseln, or btselp.

QNODUPS

Duplicates are not permitted in the superroot. Attempting to permit duplicate keys in the superroot, via btdups(..,TRUE), is prohibited.

QNOT64BIT

Index file was created with a non-64 bit version (LFS=0) of the library. However, access is being attempted with a 64 bit version.

Q64BIT

Index file was created with a 64 bit version (LFS=1) of the library. However, access is being attempted with a non-64 bit version.

Table 2. B Tree Error Messages

1

QBLKNR

Block %s is not a root block

2

QCLSIO

Unable to close index file: "file"

3

QCRTIO

Unable to create index file: "file"

4

QCPBLK

Unable to read source or destination block

5

QWRBLK

I/O error writing block

6

QRDSUP

I/O error reading super root

7

QWRSUP

I/O error writing super root

8

QOPNIO

I/O error opening index file: "file"

9

QRDBLK

I/O error reading block

10

QIXOPN

An index file is already open

11

QSPLIT

Can’t split full block

12

QINFER

Bad info block index used

13

QNOMEM

Unable to acquire a free memory block

14

QSTKUF

Stack underflow

15

QSTKOF

Stack overflow

16

QBLKFL

Can’t insert key at block: %s

17

QLOCTB

Replace location out of range

18

QSPKEY

Split: search for middle key failed

19

QWRMEM

Requested write block not in memory

20

QBALSE

Balance: search for key failed

21

QDELEX

Exact flag not set for delete

22

QDELER

Internal inconsistency in delete operation

23

QDELRP

Search for deleted key replacement failed (in block %s)

24

QDEMSE

Demote search failed

25

QDEMSP

Demote split failed

26

QJNSE

Join search failed

27

QNODEF

Cannot locate default root ($$default)

28

QDELCR

Deletion of the current or super root is forbidden

29

QBADIX

Negative in-memory index encountered

30

QNOBTF

No index file open for this operation

31

QINERR

Index file already in use

32

QBADOP

Debug option not recognised

33

QNOACT

No more index files may be opened (limit reached)

34

QBADAP

Invalid index file context pointer

35

QBUSY

File is busy

37

QNOBLK

No block available for data storage

38

QNEGSZ

Data block usage gone bad: %s

39

QNOTDA

Data segment header references a non-data block: %s

40

QBADCTXT

Index context invalid for current key operation

41

QDLOOP

Circular data segment pointer encountered

42

QUNLCK

Unlock operation failed

43

QLRUER

LRU queue corrupt - index not in list

44

QDAERR

Unable to insert data record

45

QDNEG

Data record cannot be negative

46

QDUP

Key "key" already exists in index

47

QNOKEY

Key "key" does not exist in index

48

QNOWRT

Write access to index prohibited

49

QNOTFR

Block on free list is not marked as free

50

QBADVR

Index file is incompatible with current version: "version"

51

QDAOVR

Data capacity exceeded at block: "block"

52

QF2BIG

Index file is at maximum size

53

QBADAL

Unable to set alarm for locking

54

QDRANEG

Data record address is negative: "address"

55

QBLKSZERR

Defined block size is not a power of two: "size"

56

QNODUPS

Duplicates keys are not allowed for the superroot

57

QPOSERR

Location search exceeds key count at block: %s

58

QNOT64BIT

Index file likely not LFS (64bit) enabled; doesn’t match library.

59

Q64BIT

Index file likely LFS (64bit) enabled; doesn’t match library.

60

QNOTDUP

Duplicate key address does not reference a duplicate block: %s.

61

QDUPSZ

Duplicate key entry has wrong size.

62

QBADIR

Bad direction parameter.

B Tree Test Harness

The B Tree library is distributed with a test harness, bt, which exercises all of the functions supplied by the B Tree library.

Most bt commands correspond directly to a matching B Tree library function call. Additional commands are available to automate testing scripts and manage concurrently open files. bt reads from stdin and writes normal output to stdout. Terminal error messages go to stderr. A prompt of bt: is issued prior to reading from stdin. Long running commands may be interrupted using cntrl-c, which will return to the command prompt.

bt is built with GNU readline support, if readline libraries and include files are detected when building the BT library and supporting tools. Readline enables bt to offer command editing, command history and file completion. More information on the capabilities provided by readline can be found in the full GNU documentation.

A typical bt session might look like:

      $ bt
      bt: c test
      bt: d newkey 55
      bt: f newkey
      Key: 'newkey' = 55
      bt: dd datakey some_text_string
      bt: fd datakey
      Data returned: 'some_text_string'
      bt: fd datakey d
      some_text_string
      bt: b abuf 512
      bt: dd bufkey *abuf
      bt: fd bufkey
      Data record:
      aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
      aaaaaaaaaaaaaaaaaaaaaaaaa'
      bt: sd bufkey
      Key 'bufkey' record size: 512 bytes
      bt: q
      $

bt commands have a both a full and abbreviated versions. The descriptions below show the full command first, followed by the abbreviated version (comma separated). A command may optionally be followed by an argument and a qualifier. The following table lists the commands supported by bt:

buffer,b <bufname> <size> <filename>

Buffer: Creates a data buffer called bufname. If the numeric size argument is given, the buffer is created with that number of bytes. The buffer is filled with the first character of the bufname. If the size argument is non-numeric, it is assumed to be a file name, and the contents of the file are read into the buffer. The data buffer can subsequently be specified as data for a define-data command.

buffer-delete,bd <bufname>

Buffer Delete: Deletes an existing data buffer identified by bufname.

buffer-list,bl

Buffer List: Lists the names of the currently defined data buffers on stdout.

check-order,co s c

Check Order: Checks the lexicographic order of keys in the current root, starting from the current position within the BTree. If the s argument is given, the check is performed from the first key of the BTree index.

If a disordered index is discovered, the keys at fault are displayed. Otherwise, check-order is silent, unless the c argument is specified, which causes the number of keys checked to be displayed.

create,c <filename> s

Create file: Creates a new B Tree file. If a file of the same name already exists, it will be silently overwritten. If the s qualifier is given, the file will be created in shared mode. The newly created B Tree file becomes the current file; use the file-list command to view the list of open files.

change-root,cr <rootname>

Change Root: Switches the current root to the root named rootname in the in-use B Tree file. If switch is successful, rootname becomes the current root. All subsequent key and/or data operations will take place against rootname.

close,x

Close: Closes the in-use B Tree file. The next available open file, if one exists, is automatically made the in-use B Tree file. If there are no candidate B Tree files, a warning message is issued.

define,d <key> <value>

Define key: Defines a new key in the current root of the in-use B Tree index file. The new key name is defined by key, and is assigned value. If value is omitted, zero is assumed.

data-address,da <key> i

Data Address: Prints, in a decoded form, the data segment address associated with key. If the i qualifier is given, the key is interpreted as data segment address in integer form and decoded immediately.

define-data,dd <key> <string> <*bufname>

Define key with Data: Defines a new key with an associated data record in the current root of the in-use B Tree index file. key defines the key name. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *.

define-root,dr <rootname>

Define Root: Creates a new B Tree index root, named rootname in the currently selected B Tree file. If creation is successful, the current root becomes the new root. All subsequent key and/or data operations will take place against the new root.

duplicates,dups on off

Duplicates: Sets or unsets support for duplicate keys in the current root. When on is specified, duplicate keys are permitted. When off, duplicate keys are not permitted .

echo,ec on off

a Echo: When echo is on, commands read from an execute file are echoed to stdout. If off, no echo is performed.

If no argument is given to echo, the current status of the echo setting is displayed.

error,er on off

a Error: When error is on, an execution error while reading commands from an execute file will cause termination of the execute file. If off, command execution will continue when errors are encountered.

If no argument is given to error, the current status of the error setting is displayed.

execute,e <filename>

a Execute: Causes commands to be read and executed from the file denoted by filename. execute commands can be nested, currently up to five deep. No command prompts will be issued while reading commands from a file.

See also the echo and error command descriptions for more information on execution control when reading commands from a file.

find,f <key>

a Find: Attempts to locate key in the current root of the in-use B Tree index file. If found, the value associated with the key is printed.

If key is omitted, the index is positioned prior to the first key (like position s).

find-data,fd <key> d

Find Data: Attempts to locate key in the current root of the in-use B Tree index file. If found, the first 80 bytes of the data record associated with the key is displayed. If the d qualifier is given, the whole of the data record is displayed. Note that the data record is displayed as character data; control characters or escape sequences in the data record could cause strangenesses on display.

file-list,fl

File List: Lists the set of open B Tree index files. To change the current file, issue a use command.

list,l c

a List: Displays all key and associated value, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

If the c argument is given, the count of keys listed will be displayed in addition.

list-data,ld

List Data: Displays all keys, and associated data records, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data previous or previous-data command.

list-data-prev,ldp

List Data Previous: Displays all keys, and associated data records, prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data previous or previous-data command.

list-prev,lp c

a List Previous: Displays all keys, and associated value, prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

If the c argument is given, the count of keys listed will be displayed in addition.

list-keys-only,lko

List Keys Only: Displays all keys, but not associated value, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

lock,lk

Lock: Acquires exclusive access to the in-use B Tree file which was originally opened in shared mode. If the file was opened in exclusive mode (the default), lock will have no effect.

next,n

Next: Displays the key following the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

next-data,nd

Next Data: Displays the key and associated data record following the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

open,o <filename> s

Open: Opens the existing B Tree file identified by filename. If the optional s qualifier is given, the file will be opened in shared mode. More than one B Tree index file may be open currently; the newly opened file is made the in-use B Tree file. The in-use file may be changed by the use command, while the list of open files is displayed through the file-list command.

open-readonly,or <filename> s

Open Readonly: Opens the existing B Tree file identified by filename in read-only mode. If the optional s qualifier is given, the file will be opened in shared mode. More than one B Tree index file may be open currently; the newly opened file is made the in-use B Tree file. The in-use file may be changed by the use command, while the list of open files is displayed through the file-list command.

position,pos s e

Position: Sets the position in the current root. s will cause the position to be set prior to the first key in the index, e will cause the position to be set after the last key.

prompt,p

Prompt: Toggles the display of a command prompt when bt is ready for the next user command.

previous,prv

Previous: Displays the key prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

previous-data,pd

Previous Data: Displays the key and associated data record prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

quit,q

Quit: Terminates bt. Any open B Tree files will be closed automatically.

remove,r <key>

Remove key: Removes a previously defined key in the current root of the in-use B Tree index file. The key name is specified by key.

remove-cur,rc

Remove Current key: Removes the current key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

remove-data,rd <key>

Removes key with Data: Removes a previously defined key and its associated data record in the current root of the in-use B Tree index file. key defines the key name.

remove-data-cur,rdc

Removes Current Data: Removes the current key and its associated data record in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

remove-root,rr <rootname>

Remove Root: Removes an existing B Tree index root, named rootname in the in-use B Tree file. If removal is successful, all blocks used by the root (both index and data) will be returned to the free list. It is not permitted to remove the current root.

show,s control super stats space stack block <n> structure v

a Show: Displays B Tree debug information. The option specified is passed directly to the bdbug function. See the bdebug description for the information provided by each option.

For the structure option, specifying v will cause a detailed report on the structure to be displayed. Otherwise, only a summary is displayed.

size-data,sd <key>

a Size Data: Displays the number of bytes occupied by the data record associated with key. If key is omitted, the size of the data record associated with the current key is displayed. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

If key has no associated data record, results are undefined.

system,! <command>

System: Passes the text following the system command to the shell for execution.

use,u <filename>

Use: Changes the in-use B Tree file to filename. The file must have already been opened or created, using the open or create command.

update-data,ud <key> <string> <*bufname>

Update Data: Updates an existing key with a new associated data record in the current root of the in-use B Tree index file. key defines the key name. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *.

update-data-cur,udc <string> <*bufname>

Update Data Current: Updates the current key with a new associated data record in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *.

unlock,ulk

Unlock: Unlocks the in-use B Tree file, if it was locked with the lk command. If not locked, or the file was originally opened in exclusive mode, unlock has no effect.

update-value,uv <key> <value>

Update Value: Modifies the value associated with key in the current root of the in-use B Tree index file.

update-value,uv <value>

Update Value Current: Modifies the value associated with current key key in the current root of the in-use B Tree index file. The current key is set by the last find, find-data, next, next-data, previous or previous-data command.

write-threshold,wt <threshold>

Write Threshold: Sets the number of index block updates beyond which the block contents must be writtent to disk. A threshold of zero means writes will not take place unless a block must be flushed to disk.

help,? cmd

Help: Displays a list of bt commands and a terse description of syntax and usage. If cmd is specified, only help on that command will be displayed.

comment,#

Comment: bt will ignore any line starting with a #. Note that bt will also ignore blank lines.

B Tree Recovery

btr provides a B Tree recovery facility. Recovery of a B Tree index file may be required if it is left in an inconsistent state due to a hardware or software failure during update processing.

btr may also be used to migrate a Btree index file created with an earlier version of the Btree library to current version. Due to limited recovery information in earlier versions of the Btree index file, such migration is really only applicable to single-rooted B Tree index files.

Name

btr - B Tree index recovery

Synopsis

btr [-a | -d | -f | -k | -n cnt | -r | -v | — ] <old_file> <new_file>

Description

btr will attempt to recover key (and optionally data) information from the B Tree index file identified by <old_file>. The recovered contents are written to the B Tree index file identified by <new_file>. btr recovery is controlled by a number of arguments:

Command Arguments

-k | -d

Specifies recovery mode; -k for keys, -d for keys and data. Default if omitted is -k.

-n <cnt>

Sets maximum number of io errors to ignore before terminating the recovery. Default is 0.

-v

Causes btr to display information on the recovery process. This flag may be repeated up to three times to increase the level of information. For a large B Tree index this could lead to a significant amount of output.

-a

If specified, the B Tree index in <new_file> will allow duplicate keys. The default is not to allow duplicates.

-f

If specified, the B Tree index represented by <new_file> will be overwritten. Default is to preserve <new_file>, should it exist.

-r

Requests btr to attempt full recovery mode even if the version of the <old_file> does not support it.

 — 

Causes btr to stop processing command arguments. Should be used if <old_file> begins with a '-'.

Recovery Processing

btr will attempt to open the input btree file using the btree library version of btopn. If this fails, the btr version of btopn is used instead, which bypasses the consistency checks.

An attempt to read the superroot is made. If successful, the root names and root blocks are recorded. Only the roots present in the superroot are retained. Root names in any child blocks are ignored; the index structure may be damaged and therefore no attempt to traverse it is made.

Each block, starting from 1, is read. If marked as ZROOT, ZINUSE or ZDUP the keys and values are extracted directly from the in-memory array. If -k specified, the key and value are written to the <new_btree> index file. If -d specified, and the value is a valid disk record address, an attempt is made to read the data record. Data record addresses are stored in a supporting bt index file (.bt_da.db), to enable detection of circular references. If the data record is read OK, the key and data record is written to the <new_file> btree file. If the data record cannot be read, only the key is written.

In version 4 (and later) of the BTree index, each ZINUSE block contains the root block it belongs to. This data allows btr to partition keys by their roots. Only those roots recovered from the superroot will be named as in the <old_file>. Keys from other roots will be copied to new roots, named after their root block number (e.g. root_19834).

btr will display summary statistics about the recovery on stdout when complete.

Exit Status

0 if OK.

Notes

btr can be used to recover (or indeed migrate) data from earlier (i.e. pre-version 4) versions of a Btree index file. Since the root block numbers are not held in the ZINUSE blocks, keys can not be partitioned by root. Therefore, this facility is only really applicable to single-rooted B Tree index files.

Customisation

All compile time constants are defined in the header file bc.h. The following constants may be altered for different hardware environments. The values used in the example given are from the original UNIX implementation.

ZBPW

The number of bytes in a word.

ZBYTEW

The number of bits in a byte.

ZMXBLK

Maximum number of in-memory disk blocks that can be stored. The minimum value for this parameter is 3; there is no maximum. The more in-memory blocks defined, the lower the disk I/O requirements will be.

ZBLKSZ

The number of bytes allocated to a disk block. This value should be set to a multiple of the physical disk block size. This must be defined as a power of two.

ZKEYSZ

The maximum size (in bytes) of a key.

ZTHRES

Threshold for block joining. This value determines the number of free key slots that must exist before two blocks are considered candidates for joining.

ZMXACT

The maximum number of B Tree files that may be open concurrently.

ZSLEEP

The number of seconds to wait for a B Tree file to become unlocked, when in shared mode.

ZRNAMESZ

The maximum number of bytes returned for the name of the function reporting a BT error (via btcerr)/

ZMSGSZ

The maximum number of bytes returned for the error message of the corresponding to a BT library error (via btcerr).

These compile time constants are assigned the following values in the distributed version:

      ZBPW =        4
      ZBYTEW =      8
      ZMXBLK =      3     (16 when LFS=1)
      ZBLKSZ =   1024     (8192 when LFS=1)
      ZKEYSZ =     32
      ZTHRES =      3
      ZMXACT =      5
      ZSLEEP =      5
      ZRNAMESZ =   16
      ZMSGSZ =    123

These values result in the minimum memory usage. If memory is not a constraint, increasing the values for ZMXBLK and ZBLKSZ will make the B Tree implementation much faster, e.g try:

      ZMXBLK =  100
      ZBLKSZ = 8192

The number of keys that can be stored in a block is determined at compile time, using the following definition:

      #define ZMXKEY ((ZBLKSZ-ZBPW-ZINFSZ*ZBPW)/(ZKYLEN+2*ZBPW))

N.B. ZINFSZ is the number of information words that a block must carry as overhead. This value is six in this implementation.

Building and installing the BT Library

The BT library is distributed as a tar file, which contains a set of C source and header files, a Makefile and a set of testcases.

First, unpack the tar file into a convenient directory. cd to the directory containing the source files and issue the command make clean;make. This will compile each BT library file, and create the UNIX static library libbt.a in the lib sub-directory. make will also built the BT test harness bt, a utility, kcp, which performs intelligent copies of BT index files, a BTree index recovery tool btr and two additional testing utilities, bigt and bigtdel, for large file handling.

The default BT library build will create a 32-bit version, in which the maximum size of the index file is 2GiB. A version of BT with Large File Support can be built with the command make clean;make LFS=1.

When compiling programs against a LFS version of BT, if you need to manipulate the BTint value associated with a key, ensure you set the compile-time flag _FILE_OFFSET_BITS=64, e.g:

      $ gcc -o yak yak.c -Iinc -Llib -lbt -D_FILE_OFFSET_BITS=64

This is not necessary if you are just using the in-built data record functions (btsel etc).

In order to test the newly created BT library, you can use the bt test harness for ad-hoc testing. Alternatively, the Makefile provides a means of automated testing. make test_run will run a set of testcases, held in the Testcases sub-directory. These testcases use bt scripts to test key components of the BT library, comparing the results against known, good, output templates. The output templates distributed with BT should suffice for most standard Linux and FreeBSD builds.

Revision History

Revision

Date

Comment

5.0.2

3rd May, 2024

Documentation converted to AsciiDoc and placed in source code tree

5.0.1

1st July, 2020

Minor bug fixes. Modifications to accomodate changes in the toolchain since 2012 (GNU Make and C compilers).

5.0.0

26th November, 2012

Revised duplicate key handling to remove index navigation restrictions when in shared mode.

4.0.0

24th June, 2011

Add support for btree index recovery.

3.1.2

3rd January, 2011

Document revised handling of duplicate keys in shared mode.

3.1.1

21st December, 2010

Bump rev to match library version. No other changes.

3.1.0

10th December, 2010

Added support for previous key search, duplicate keys. Programs that link against the 3.0.x library will operate unchanged with 3.1.0.

3.0.1

2nd July, 2010

Enhanced BT test harness; bug fixes.

3.0.0

4th June, 2010

Added support for large files (> 2GiB)

2.0.4

10th May, 2008

Shared access sleep time added as implementation constant

2.0.3

27th December, 2005

Changes to reflect that zero-length data records are valid

2.0.2

3rd October, 2004

New text on implementation constants and file sizes

2.0

1st June, 2004

Reflect changes to API in version 2.0

1.0

13th April, 2003

First release

Colophon

This manual was written in AsciiDoc using Emacs. Html is generated by Asciidoctor.