Advanced Operations Guide
Rebuild Utility Concepts
The Rebuild utility allows you to perform the following operations on MicroKernel files (data files and dictionary files):
- convert older file formats to a newer Pervasive.SQL format
- convert newer file formats to a format not older than a 6.x format
- rebuild a file using the same file format (provided the format is 6.x, 7.x, or 8.x)
- add file indexes
- change file page size
- rebuild files to include system data and key or system data without key
- specify a location and name of the log file used by Rebuild
If your database uses dictionary files (DDFs), you must rebuild them as well as the data files.
Read further in this section to understand the conceptual aspects of rebuilding data files.
- Platforms Supported
- File Formats
- Command Line Parameters
- Temporary Files
- Optimizing the Rebuild Process
- Log File
For information on using the Rebuild utility, see Rebuild Utility GUI Reference and CLI Tasks.
Platforms Supported
Rebuild comes in two forms: a GUI version for 32-bit versions of Windows, and command-line versions for Linux, NetWare, and Windows. See Rebuild Utility GUI Reference and CLI Tasks.
Linux CLI Rebuild
Rebuild runs as a program, rbldcli, on Linux. By default, the program is located at /usr/local/psql/bin.
NetWare CLI Rebuild
Rebuild runs as an NLM, BREBUILD.NLM, on NetWare. By default, BREBUILD.NLM is located at SYS:\SYSTEM.
Note
When rebuilding a data file larger than 10 MB, BREBUILD.NLM can consume a large share of CPU time on the NetWare platform. Perform rebuild tasks directly at the file server console when server usage is low.
Windows CLI Rebuild
Rebuild runs as a program, rbldcli.exe, on Windows. By default, the program is located at \PVSW\BIN.
File Formats
The current database engines remain compatible with some older file formats, but you may want to convert files to the current format to take advantage of current features. The following table lists the primary reasons for converting from an older to a newer format.
The file format that results from using the command-line Rebuild depends on the -f parameter. If you omit the -f parameter, Rebuild uses the value set for the MicroKernel's Create File Version configuration option. For example, if the Create File Version value is 8.x, then running the Rebuild utility on version 7.x files converts them to 8.x format. See Create File Version and the -f parameter.
Back up all the data files you plan to convert before running Rebuild. This is particularly important if you are rebuilding files to the same location as the source files (in which case the rebuilt files replace the source files). Backup copies allow you to restore the original files if necessary. To ensure that the backup is successful, you may perform one or more of the following operations:
- Close all data files before running the backup utility.
- Use continuous operations (only during the backup).
Note
You cannot run Rebuild on a file that is in continuous operation mode.
Command Line Parameters
The parameter option specifies the parameter(s) used with the utility. You may use the parameters in any order. Precede each parameter with a hyphen (-). Do not place a space after the hyphen or between the single-letter parameter and its value.
Note
On Linux platforms only, the parameters are case sensitive.
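The syntax rules above can be illustrated with a small helper that assembles a Rebuild command line. This is only a sketch: the function name build_rebuild_args and its keyword arguments are hypothetical, and it covers only the -c, -m, -p, -b, and -f parameters with the values listed in the parameter definitions of this section.

```python
# Hypothetical helper; only the flags and values documented in this section are used.
VALID_PAGE_SIZES = {512, 1024, 1536, 2048, 2560, 3072, 3584, 4096}


def build_rebuild_args(files, program="rbldcli", continue_on_error=False,
                       method=None, page_size=None, output_dir=None,
                       file_format=None):
    """Return the Rebuild command line as a list of arguments."""
    args = [program]
    if continue_on_error:
        args.append("-c")
    if method is not None:
        if method not in (0, 2):
            raise ValueError("method must be 0 or 2")
        args.append(f"-m{method}")  # no space between the letter and its value
    if page_size is not None:
        if page_size not in ("D", "P") and page_size not in VALID_PAGE_SIZES:
            raise ValueError("page size must be D, P, or a multiple of 512 up to 4096")
        args.append(f"-p{page_size}")
    if output_dir is not None:
        args.append(f"-b{output_dir}")
    if file_format is not None:
        if file_format not in (6, 7, 8):
            raise ValueError("file format must be 6, 7, or 8")
        args.append(f"-f{file_format}")
    args.extend(files)  # files to rebuild follow the parameters
    return args


print(" ".join(build_rebuild_args([r"c:\pvsw\demodata\class.mkd"], file_format=7)))
```

The list form can be passed to a process launcher without further quoting. On Linux, keep the parameter letters in the case shown, because the parameters are case sensitive there.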
Parameter is defined as follows:
-c
Instructs Rebuild to continue with the next file if an error occurs. The utility notifies you of non-MicroKernel data files or errors with MicroKernel files, but continues rebuilding data files. The errors are written to the log file. See Log File.
Tip: This parameter is particularly useful if you specify wildcard characters (*.*) for a mixed set of files, that is, a combination of MicroKernel files and non-MicroKernel files. Rebuild reports an error for each non-MicroKernel file (or any errors on MicroKernel files), but continues processing.
-d
Converts pre-6.0 supplemental indexes (which allow duplicates) to 6.x, 7.x, or 8.x indexes with linked-duplicatable keys.
If you omit this parameter, Rebuild preserves the indexes as repeating-duplicatable keys.
If you access your data files only through Btrieve and your files have a relatively large number of duplicate keys, you can use the -d parameter to enhance the performance of the Get Next and Get Previous operations.
-m<0 | 2>
The "m" parameter stands for "method." Rebuild selects a processing method whether you specify this parameter or not. If you omit this parameter, Rebuild does the following:
- uses -m2 as the default method if sufficient available memory exists
- uses an alternative method, -m0, if the amount of available memory is not sufficient
See Amount of Memory for how the amount of memory affects the method chosen.
-m0
Clones and copies the file without dropping and replacing indexes. This method is slower than the -m2 method. It is available in case you do not want to rebuild your indexes.
A file built with the -m0 method has key pages that are about 55% to 65% full. The file is optimized more for writing and less for reading. If you can afford the extra rebuild time, which can be considerable depending on the situation, you might want to rebuild a file optimized for writing.
See also Optimizing the Rebuild Process.
-m2
Clones the file, drops the indexes, copies the records into the new file, and rebuilds the indexes. This method is faster and creates smaller files than the -m0 method.
The -m2 method may create a new file in which the records are in a different physical order than in the original file.
A file built with the -m2 method has key pages that are 100% full. This allows the file to be optimized for reading.
-p<D | P | bytes>
Optimizes page size for disk storage or processing, or specifies a page size to use for the rebuilt file.
If you omit this parameter, Rebuild uses the page size from the source file. If the source page size does not work for the current database engine, Rebuild changes the page size and displays an informative message explaining the change. (For example, older file formats, such as 5.x, supported a page size of 1024 with 24 keys. File format 8.x supports only 23 keys for a page size of 1024, so Rebuild would select a different page size if building an 8.x file.)
See also Index Page Size.
-pD
Optimizes page size for disk storage. For a discussion of optimal page size for disk storage, see Choosing a Page Size in Pervasive.SQL Programmer's Guide, which is part of the Pervasive.SQL Software Developer's Kit (SDK).
-pP
Optimizes for processing (that is, for your application accessing its data). For -pP, Rebuild uses a page size of 4096 bytes.
-pbytes
Specifies the page size (in bytes) for the new file. The only valid values are 512, 1024, 1536, 2048, 2560, 3072, 3584, and 4096 (multiples of 512 up to 4096).
-bdirectoryname
Specifies an alternate location for the rebuilt file (which may also be a location on a different server). The default location is the directory where the data file is located. You must specify a location that already exists. Rebuild does not create a directory for you.
You may use either a fully qualified path or a relative path. Do not use wildcard characters in directoryname.
On your local server, the MicroKernel Database Engine and the Message Router must be loaded. On a remote server, the MicroKernel Database Engine and communications components must be loaded.
If you omit this parameter, the rebuilt file replaces the original data file. A copy of the original file is not retained.
If you specify this parameter, the rebuilt file is placed in the specified location and the original file is retained. An exception to this is if the specified location already contains data files with the same names. Rebuild fails if the alternate location you specify contains files with the same names as the source files. For example, suppose you want to rebuild mydata.mkd, which is in a directory named folder1. You want to place the rebuilt file into a directory named folder2. If mydata.mkd also exists in folder2 (perhaps unknown to you), Rebuild fails and informs you to check the log file.
Note: Ensure that you have create file permission for the location you specify (or for the location of the source file if you omit the parameter).
-knumber
Specifies the key number that Rebuild reads from the source file and uses to sort the rebuilt file. If you omit this parameter, Rebuild reads the source file in physical order and creates the rebuilt file in physical order.
See also Optimizing the Rebuild Process.
-s[D | K]
Retains in the rebuilt file the existing system data and key from the source file. If you omit this parameter, Rebuild does not include the system data and key in the rebuilt file.
See also System Data.
-sD
Rebuilds the file to include system data. The system data is not indexed. See also System Data.
-sK
Rebuilds the file to include system data and key. The system data is indexed. See also System Data.
-lfile
Specifies a file name, and optionally a path location, for the Rebuild log file. The default file name is rbldcli.log on Windows and Linux. On NetWare, the default file name is brebuild.log. The default location is the current working directory on Windows and Linux. On NetWare, the default location is SYS:\SYSTEM.
The following conditions apply:
- The path location must already exist. Rebuild does not create the path location.
- If you specify a path location without a file name, Rebuild ignores this parameter and uses the default file name and location.
- If you specify a file name without a path location, Rebuild uses the default location.
- You must have read and write file permission for the location you specify. Rebuild uses the default location if it cannot create the log file because of file permission.
See also Log File.
-f<6 | 7 | 8>
Specifies a file format for the rebuilt file. File formats supported are versions 6.x, 7.x, and 8.x. The following example rebuilds a file to the 7.x format:
rbldcli -f7 c:\pvsw\demodata\class.mkd
If you omit this parameter, Rebuild uses the value set for the MicroKernel's "Create File Version" configuration option. See Create File Version.
Note 1: If you specify a file format newer than the version supported by the current database engine, Rebuild uses the highest supported file format of that engine. Rebuild reports no error or message for this. For example, if the database engine is Pervasive.SQL 2000i and you specify a file format of 8.x (-f8), the file is rebuilt to 7.x. File format 7.x is the highest one supported by the Pervasive.SQL 2000i engine.
Note 2: Rebuild does not convert data types in indexes. If you rebuild a file to an older file format for use with an older database engine, ensure that the engine supports the data types used. You must manually adjust data types as required by your application and by the database engine.
Example 1. Your data file contains index fields that use the WZSTRING data type. If you rebuild the data file to a 6.x file format, the WZSTRING data type is not converted. You would be unable to use the data file with a Btrieve 6.15 engine. That engine does not support the WZSTRING data type.
Example 2. Your data file contains true NULLs. You rebuild the data file to a 7.x file format. The true NULLs are not converted. You would be unable to use the data file with the Pervasive.SQL 7 engine. That engine does not support true NULLs.
File and @command_file are defined as follows:
Temporary Files
On Windows, Rebuild creates temporary files in the directory specified by the TMP system environment variable. By default on NetWare and Linux, Rebuild creates temporary files in the output directory (or in the source directory if the -b parameter is not used). Therefore, you need enough disk space in the temporary file directory (while the Rebuild utility is running) to potentially accommodate both the original file and the new file. You can specify a different directory for storing these files by using the Output Directory option in the Rebuild GUI version or by using the -b parameter with the CLI versions.
Normally, Rebuild deletes temporary files when the conversion is complete. However, if a power failure or other serious interruption occurs, Rebuild may not delete the temporary files. If this occurs, delete the following types of temporary files:
Optimizing the Rebuild Process
Rebuild makes Btrieve calls to the MicroKernel. Therefore, the MicroKernel configuration settings and the amount of random access memory (RAM) in your computer affect the performance of the rebuild process. This is particularly evident in the amount of time required to rebuild large data files.
In general, building indexes requires much more time than building data pages. If you have a data file with many indexes, it requires more time to rebuild than would the same file with fewer indexes.
The following items can affect the rebuild processing time:
- CPU Speed and Disk Speed
- Amount of Memory
- Sort Buffer Size
- Max MicroKernel Memory Usage
- Cache Allocation Size
- Index Page Size
- Number of Indexes
CPU Speed and Disk Speed
The speed of the central processing unit (CPU) and access speed of the physical storage disk can affect processing time during a rebuild. In general, the faster the speed for both of these, the faster the rebuild process. Disk speed is more critical for rebuilding files that are too large to fit entirely in memory.
Tip
Large files, such as 3 or 4 GB or more, may take several hours to convert. If you have more than one database engine available, you may wish to share the rebuild processing among a number of machine CPUs. For example, you could copy some of your files to each machine that has a database engine installed, then copy the files back after the rebuild process.
Amount of Memory
Rebuild can rebuild a file using two different methods, a default method and an alternative method. See the -m<0 | 2> parameter. The method chosen depends on the amount of memory available. For the default method (-m2), Rebuild takes the following steps, provided sufficient memory is available.
1. Creates a new, empty data file with the same record structure and indexes as defined in the source file.
2. Drops all the indexes from the new file.
3. Copies all the data into the new file, without indexes.
4. Adds the indexes, using the following process.
The temporary file now contains several key value sets, each of which has been individually sorted.
5. Merges the sets into index pages, filling each page to capacity. Each index page is added to the data file at the end, extending the file length.
6. Repeats steps 4 and 5 for each remaining key.
If any failure occurs during this process, such as a failure to open or write the temporary file, Rebuild starts over and uses the alternative method to build the file.
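The sort-and-merge idea behind the default method can be sketched as follows. This is an illustration only, not the engine's implementation: the function name build_index_pages is hypothetical, in-memory lists stand in for the temporary file, and block_size stands in for the amount of memory Rebuild managed to allocate.

```python
import heapq


def build_index_pages(keys, block_size, keys_per_page):
    """Sort keys in memory-sized blocks, merge the blocks, and pack full pages."""
    # Each sorted block plays the role of one individually sorted key value set.
    sorted_sets = [sorted(keys[i:i + block_size])
                   for i in range(0, len(keys), block_size)]
    # Merge the sorted sets into one ordered stream of key values.
    merged = list(heapq.merge(*sorted_sets))
    # Fill each index page to capacity; only the last page may be partly full.
    return [merged[i:i + keys_per_page]
            for i in range(0, len(merged), keys_per_page)]


pages = build_index_pages([5, 3, 9, 1, 7, 2, 8], block_size=3, keys_per_page=4)
print(pages)  # [[1, 2, 3, 5], [7, 8, 9]]
```

Because the keys arrive at the page-building step already in order, no page ever has to be split, which is why this method produces key pages that are 100% full.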
Rebuild uses an alternative method (-m0) when insufficient memory exists to use the default method, or if the default method encounters processing errors.
1. Creates a new, empty data file with the same record structure and indexes as defined in the source file.
2. Drops all the indexes from the new file.
3. Copies all the data into the new file, without indexes.
4. Adds the indexes, using the following process.
a. For a particular key in the source file, reads one record at a time using the Step Next operation.
b. Extracts the key value from the record and inserts it into the appropriate place in the index. This necessitates splitting key pages when they get full.
c. Repeats steps a and b, processing the key value from every record.
5. Repeats step 4 for each remaining key.
The alternative method is typically much slower than the default method. If you have large data files with many indexes, the difference between the two methods can amount to many hours or even days. The only way to ensure that Rebuild uses the default method is to have enough available memory. In a NetWare environment, the available memory must also not be highly fragmented. Several configuration settings affect the amount of available memory.
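The effect of inserting one key value at a time can be illustrated with a simplified model of a single level of key pages. This sketch is an assumption-laden toy, not the MicroKernel's B-tree code: the names insert_with_splits and average_fill are hypothetical, and a full page is simply split in half.

```python
import bisect
import random


def insert_with_splits(keys, keys_per_page):
    """Insert keys one at a time into ordered pages, splitting a page that overflows."""
    pages = [[]]
    for key in keys:
        # Find the last page whose first key is <= key (page 0 if there is none).
        firsts = [page[0] for page in pages if page]
        idx = max(bisect.bisect_right(firsts, key) - 1, 0)
        page = pages[idx]
        bisect.insort(page, key)
        if len(page) > keys_per_page:
            mid = len(page) // 2
            pages[idx:idx + 1] = [page[:mid], page[mid:]]  # split the full page
    return pages


def average_fill(pages, keys_per_page):
    """Fraction of the available key slots that are occupied."""
    return sum(len(p) for p in pages) / (len(pages) * keys_per_page)


random.seed(1)
keys = random.sample(range(100_000), 2_000)
pages = insert_with_splits(keys, keys_per_page=18)
print(f"{len(pages)} pages, average fill {average_fill(pages, 18):.0%}")
```

In this model every split leaves two roughly half-full pages, so the pages end up only partly full, whereas the sort-and-merge approach packs them completely. The extra page searches and splits are also where the additional rebuild time goes.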
Formulas For Estimating Memory Requirements
The following formulas estimate the optimal and minimum amount of contiguous free memory required to rebuild file indexes using the fast method. The optimal memory amount is enough memory to store all merge blocks in RAM. The minimum amount of memory is enough to store one merge block in RAM.
Key Length = total size of all segments of the largest key in the file
Key Overhead = 8 if the key type is not linked duplicate; 12 if the key type is linked duplicate
Record Count = number of records in the file
Optimal Memory Bytes = (((Key Length + Key Overhead) * Record Count) + 65536) / 0.6
Minimum Memory Bytes = Optimal Memory Bytes / 30
For example, if your file has 8 million records, and the longest key is 20 bytes (not linked duplicate), the preferred amount of memory is 373.5 MB:
(((20 + 8) * 8,000,000) + 65536) / 0.6 = 373,442,560 bytes
The optimal amount of contiguous free memory is 373.5 MB. If you have at least this much free memory available, the Rebuild process takes place entirely in RAM. Because of the 60% allocation limit, the optimal amount of memory is actually the amount required to be free when the rebuild process starts, not the amount that the rebuild process actually uses. Multiply this optimal amount by 0.6 to determine the maximum amount Rebuild actually uses.
The minimum amount of memory is 1/30th of the optimal amount, 12,448,086 bytes, or 12.45 MB.
The divisor 30 is used because the database engine keeps track of no more than 30 merge blocks at once, but only one merge block is required to be in memory at any time. The divisor 0.6 is used because the engine allocates no more than 60% of available physical memory for rebuild processing.
If you do not have the minimum amount of memory available, Rebuild uses the alternative method to rebuild your data file.
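The two estimates can be computed with a few lines of Python. The function names are hypothetical, and the sketch uses integer arithmetic (dividing by 0.6 is written as multiplying by 10 and dividing by 6, rounded up) so that the results match the worked example exactly.

```python
def optimal_memory_bytes(key_length, record_count, linked_duplicate=False):
    """Optimal Memory Bytes = ((Key Length + Key Overhead) * Record Count + 65536) / 0.6."""
    key_overhead = 12 if linked_duplicate else 8
    raw = (key_length + key_overhead) * record_count + 65536
    return -(-raw * 10 // 6)  # ceiling of raw / 0.6


def minimum_memory_bytes(optimal_bytes):
    """Minimum Memory Bytes = Optimal Memory Bytes / 30, rounded up."""
    return -(-optimal_bytes // 30)


optimal = optimal_memory_bytes(key_length=20, record_count=8_000_000)
print(optimal)                        # 373442560
print(minimum_memory_bytes(optimal))  # 12448086
```

Comparing these figures with the free memory on the server before a large rebuild gives a rough indication of whether the default method is likely to be used.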
Finally, the memory block allocated must meet two additional criteria: blocks required and allocated block size.
Blocks required must be less than or equal to 30, where:
Blocks Required = Round Up (Optimal Memory Bytes / Allocated Block)
Allocated block size must be greater than or equal to:
((2 * Max Keys + 1) * (Key Length + Key Overhead)) * Blocks Required
Assuming a 512-byte page size, and a block of 12.45 MB successfully allocated, the value for blocks required is:
Blocks Required = 373,500,000 / 12,450,000 = 30
The value for allocated block size is:
Max Keys = (512 - 12) / 28 = 18
(((2 * 18) + 1) * (20 + 8)) * 30 = 31,080
Is Allocated Block (12.45 million bytes) larger than 31,080 bytes? Yes, so the second criterion is met. The index keys will be written to a temporary file in 12.45 MB pieces, sorted in memory, and then written to the index.
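The two criteria can be checked together in a short sketch. The function names are hypothetical, Max Keys is passed in rather than derived, and the minimum block size is multiplied by the computed Blocks Required value (30 in this example), as the formula states.

```python
def blocks_required(optimal_bytes, allocated_bytes):
    """Blocks Required = Round Up (Optimal Memory Bytes / Allocated Block)."""
    return -(-optimal_bytes // allocated_bytes)


def minimum_block_bytes(max_keys, key_length, key_overhead, blocks):
    """((2 * Max Keys + 1) * (Key Length + Key Overhead)) * Blocks Required."""
    return (2 * max_keys + 1) * (key_length + key_overhead) * blocks


def block_is_usable(optimal_bytes, allocated_bytes, max_keys, key_length, key_overhead):
    """True when both criteria hold: at most 30 blocks and a large enough block."""
    blocks = blocks_required(optimal_bytes, allocated_bytes)
    needed = minimum_block_bytes(max_keys, key_length, key_overhead, blocks)
    return blocks <= 30 and allocated_bytes >= needed


print(blocks_required(373_442_560, 12_448_086))    # 30
print(minimum_block_bytes(18, 20, 8, 30))          # 31080
print(block_is_usable(373_442_560, 12_448_086, 18, 20, 8))  # True
```

A block much smaller than the minimum fails the first criterion, because more than 30 merge blocks would be needed.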
Sort Buffer Size
This setting specifies the maximum amount of memory that the MicroKernel dynamically allocates and de-allocates for sorting purposes during run-time creation of indexes. See Sort Buffer Size.
If the setting is zero (the default), Rebuild calculates a value for optimal memory bytes and allocates memory based on that value. If the memory allocation succeeds, the size of the block allocated must be at least as large as the value defined for minimum memory bytes. See Formulas For Estimating Memory Requirements.
If the setting is a non-zero value, and the value is smaller than the calculated minimum memory bytes, Rebuild uses the value to allocate memory.
Finally, Rebuild compares the amount of memory that it should allocate with 60% of the amount that is actually available. It then attempts to allocate the smaller of the two. If the memory allocation fails, Rebuild keeps attempting to allocate 80% of the last attempted amount. If the memory allocation fails completely (which means the amount of memory is less than the minimum memory bytes), Rebuild uses the alternative method to rebuild the file.
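The allocation attempts described above can be sketched as a loop. This is a simplified model under stated assumptions: choose_block_size is a hypothetical name, can_allocate is a stand-in callback for the operating system's memory allocator, and a return value of None represents falling back to the alternative method.

```python
def choose_block_size(requested, available, minimum, can_allocate):
    """Return the block size obtained, or None if Rebuild would fall back to -m0."""
    # Start with the smaller of the requested amount and 60% of available memory.
    attempt = min(requested, available * 6 // 10)
    while attempt >= minimum:
        if can_allocate(attempt):
            return attempt
        attempt = attempt * 4 // 5  # retry with 80% of the last attempted amount
    return None


# Example: 1000 units available, but only blocks of 400 units or less can be allocated.
size = choose_block_size(requested=1000, available=1000, minimum=100,
                         can_allocate=lambda n: n <= 400)
print(size)  # 384 (attempts: 600, 480, 384)
```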
Max MicroKernel Memory Usage
This setting specifies the maximum proportion of total physical memory that the MicroKernel is allowed to consume. L1, L2, and all miscellaneous memory usage by the MicroKernel are included (SRDE is not included). See Max MicroKernel Memory Usage.
If you have large files to rebuild, temporarily set Max MicroKernel Memory Usage to a lower percentage than its default setting. Reset it to your preferred percentage after you complete your rebuilding.
Cache Allocation Size
This setting specifies the size of the Level 1 cache that the MicroKernel allocates; the MicroKernel uses this cache when accessing any data files. See Cache Allocation Size.
This setting determines how much memory is available to the MicroKernel for accessing data files, not for use when indexes are built.
Increasing Cache Allocation to a high value does not help indexes build faster. In fact, it may slow the process by taking up crucial memory that is now unavailable to Rebuild. When rebuilding large files, decrease the cache value to a low value, such as 20% of your current value but not less than 5 MB. This leaves as much memory as possible available for index rebuilding.
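The guideline above amounts to a one-line calculation. The function name is hypothetical, and the sketch assumes that 1 MB means 1,048,576 bytes, which the text does not specify.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def suggested_rebuild_cache(current_cache_bytes):
    """20% of the current Cache Allocation Size, but not less than 5 MB."""
    return max(current_cache_bytes // 5, 5 * MB)


print(suggested_rebuild_cache(100 * MB) // MB)  # 20
print(suggested_rebuild_cache(10 * MB) // MB)   # 5
```

Restore the original cache value after the rebuild so that normal data access is not affected.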
Index Page Size
The page size in your file also affects the speed of index building. If Rebuild uses the alternative method, smaller key pages dramatically increase the time required to build indexes. Key page size has a lesser effect on building indexes if Rebuild uses the default method.
Rebuild can optimize page size for application performance or for disk storage.
To optimize for performance (your application accessing its data), Rebuild uses a page size of 4096 bytes. This results in larger page sizes on physical storage and slower rebuilding times.
For a discussion of optimizing page size for disk storage, see Choosing a Page Size in Pervasive.SQL Programmer's Guide, which is part of the Pervasive.SQL Software Developer's Kit (SDK).
Assume that your application has 8 million records, a 20-byte key, and uses a page size of 512 bytes. The MicroKernel places between 8 and 18 key values in each index page. This lessens the amount of physical storage required for each page. However, indexing 8 million records creates a B-tree about seven levels deep, with most of the key pages at the seventh level. Performance will be slower.
If you use a page size of 4096 bytes, the MicroKernel places between 72 and 145 key values in each index page. This B-tree is only about four levels deep and requires many fewer pages to be examined when Rebuild inserts each new key value. Performance is increased but so is the requirement for the amount of physical storage.
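The depth figures in this example can be approximated with a rough model in which each B-tree level multiplies the number of reachable key values by the number of keys per page. The function btree_levels is a hypothetical illustration and ignores details of the real index structure.

```python
def btree_levels(record_count, keys_per_page):
    """Smallest number of levels whose combined fan-out covers record_count keys."""
    levels, capacity = 1, keys_per_page
    while capacity < record_count:
        capacity *= keys_per_page
        levels += 1
    return levels


records = 8_000_000
print(btree_levels(records, 8), btree_levels(records, 18))    # 8 6  (512-byte pages)
print(btree_levels(records, 72), btree_levels(records, 145))  # 4 4  (4096-byte pages)
```

For 512-byte pages the model gives six to eight levels, which brackets the figure of about seven, and for 4096-byte pages it gives four levels at both ends of the range.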
Number of Indexes
The number of indexes also affects the speed of index building. Generally, the larger the number of indexes, the longer the rebuild process takes. The time required to build the indexes increases exponentially with increasing depth of the B-tree.
Log File
Information from a rebuild process is appended to an ASCII log file. By default, the log file is placed in the current working directory except on NetWare. On NetWare, the default location for the log file is SYS:\SYSTEM. The default file name is rbldcli.log on Windows and Linux. On NetWare, the default file name is brebuild.log.
You may specify a location and name for the log file. See the -lfile parameter.
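The defaults above, together with the conditions listed for the -lfile parameter, can be sketched as follows. The function resolve_log_file is hypothetical, it uses the path conventions of the machine on which the sketch runs, and it omits the fallback that applies when file permissions prevent the log file from being created.

```python
import os

DEFAULT_LOG_NAMES = {"windows": "rbldcli.log", "linux": "rbldcli.log",
                     "netware": "brebuild.log"}


def resolve_log_file(l_value, platform, cwd):
    """Return (directory, file name) for the log file under the documented rules."""
    default_dir = "SYS:\\SYSTEM" if platform == "netware" else cwd
    default = (default_dir, DEFAULT_LOG_NAMES[platform])
    if not l_value:
        return default              # -l omitted: default name and location
    if os.path.isdir(l_value):
        return default              # path without a file name: parameter is ignored
    directory, name = os.path.split(l_value)
    if not directory:
        return (default_dir, name)  # file name only: default location
    return (directory, name)


print(resolve_log_file(None, "netware", "."))  # ('SYS:\\SYSTEM', 'brebuild.log')
print(resolve_log_file("nightly.log", "linux", "/data"))  # ('/data', 'nightly.log')
```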
You may examine the log file using a text editor. The information written to the log file includes the following:
- Start time of the rebuild process
- Parameters specified on the command line
- Status code and error description (if an error occurs)
- File being processed
- Information about the processing (such as page size changes)
- Total records processed
- Total indexes rebuilt (if the -m2 processing method is used)
- End time of the rebuild process
- Status of the process (for example, if the file rebuilt successfully)