Replication Support
The Sunbelt Data Manager provides communication between multiple data managers, allowing data files opened and written on a primary data manager to be replicated on a secondary data manager through a TCP/IP network connection.
The following glossary of terms is used with replication:
Enhanced Performance in 9.5A
In version 9.5A, the Data Manager replication support was modified to allow multiple channels between the Primary and Secondary/Backup servers. In addition, the Primary server was modified to write all IO and command transactions into a WAL (write-ahead log) file. A specialized channel transfers the WAL data from the Primary to the Secondary/Backup servers. Replication that uses multiple channels and WAL file data is identified as the version 2 style of replication. Replication with a single channel, as originally implemented, is identified as the version 1 style of replication.
Please note the following regarding the operation of the two replication styles:
The version 2 style can coexist with the version 1 style of replication. A Primary configured for the version 2 style of replication can support Secondary/Backup servers that are configured for either the version 1 or the version 2 style. However, the performance of the Primary can be diminished in this case because of its interaction with the version 1 style Secondary/Backup servers.
The implementation of the version 2 style of replication requires the use of multiple threads and message queues to provide faster transaction processing. This implementation removes the bottlenecks that existed in the version 1 style of replication.
The version 2 style of replication provides the following benefits:
Although the startup synchronization is single threaded, the use of the WAL file eliminates application halts during the synchronization processing.
The file open ID requests for PLB OPEN operations no longer wait on the message queue with other transactions. This eliminates the possibility of having an application halt during a PLB OPEN for a heavily IO oriented system.
IO transactions no longer halt when the replication transaction queue and the message queue are full. This prevents a PLB application from hanging on an IO instruction when the queues are full.
The managed and unmanaged replication operations are processed separately. This change eliminates all interactions where unmanaged replication transactions could affect the performance of the managed PLB program IO operations.
The unmanaged directory scanning no longer suspends the transaction processing.
A slow Backup or Secondary server no longer dictates the speed of replication. The replication services for any one Backup/Secondary server do not affect the replication services for another Backup/Secondary server.
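One way to picture why a slow Backup/Secondary server no longer holds back the others is a shared log that each server reads at its own position. The Python sketch below is only an illustration under that assumption. The class and method names are hypothetical and do not describe the Data Manager's actual implementation.

```python
class SharedLog:
    """Illustrative append-only log with one read position per consumer.

    Hypothetical model: each Backup/Secondary server advances its own
    cursor, so a slow reader never blocks the writer or other readers.
    """

    def __init__(self):
        self._records = []
        self._cursors = {}

    def register(self, name):
        # A new consumer starts at the beginning of the log.
        self._cursors[name] = 0

    def append(self, record):
        # The writer never waits on any consumer.
        self._records.append(record)

    def read(self, name, limit):
        # Return up to `limit` unread records and advance this cursor only.
        start = self._cursors[name]
        batch = self._records[start:start + limit]
        self._cursors[name] = start + len(batch)
        return batch

    def lag(self, name):
        # Number of records this consumer has not read yet.
        return len(self._records) - self._cursors[name]
```

In this model, a consumer that reads one record at a time simply accumulates lag, while a faster consumer on the same log stays current.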
The Primary server processing threads are described as follows:
Main Thread
Handles actions for the Sundm command message queue.
Creates data channel threads based on demand.
Handles file open ID requests when a PLB OPEN/PREP instruction is executed.
Reads the commands from the command message queue.
Handles and processes general commands such as terminate (-t, -f, ... etc.).
Handles and processes all transactions for all version 1 style servers.
Transaction Processing Thread
Reads the transactions from the Sundm child transaction message queue.
Writes the transactions to the WAL files.
Unmanaged Scanner Thread
Scans the unmanaged file directories.
Writes unmanaged transactions to the WAL files.
Data Channel Thread
Handles the message requests from a Backup/Secondary server.
Each data channel thread allocates a file cache memory block to avoid interactions with other data channels.
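The role of the Transaction Processing Thread, which reads from a message queue and writes to the WAL, can be sketched in Python as a single consumer thread. This is a minimal illustration only: the in-memory queue, the 4-byte length prefix, and all names are assumptions made for the example and are not the Data Manager's actual queue or WAL record format.

```python
import io
import queue
import threading

_STOP = object()  # sentinel that tells the writer thread to exit


def transaction_writer(transactions, wal):
    """Drain the queue and append each transaction to the WAL stream."""
    while True:
        item = transactions.get()
        if item is _STOP:
            break
        # Hypothetical framing: 4-byte big-endian length, then the payload.
        wal.write(len(item).to_bytes(4, "big") + item)


def run_demo(records):
    """Queue the given byte strings and return the resulting WAL bytes."""
    transactions = queue.Queue()
    wal = io.BytesIO()  # stands in for a WAL file on disk
    worker = threading.Thread(target=transaction_writer, args=(transactions, wal))
    worker.start()
    for record in records:
        transactions.put(record)
    transactions.put(_STOP)
    worker.join()
    return wal.getvalue()
```

Because only this one thread writes to the WAL stream in the sketch, the producers that queue transactions never touch the file themselves.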
The Backup/Secondary server processing threads are described as follows:
Main Thread
Handles message commands that are received from the Primary replication server and Sundm.
WAL Transfer Thread
Transfers the current transactions from the Primary server queue to a WAL file on disk.
This thread uses a data channel that is connected to the Primary server.
WAL Managed Processing Thread
This thread processes all managed transactions that have been received from the Primary server and stored into the WAL files. Managed transactions are created as PLB program IO operations are executed.
This thread uses a data channel that is connected to the Primary server.
WAL Unmanaged Processing Thread
This thread processes all unmanaged transactions that have been received from the Primary server and stored into the WAL files. The unmanaged transactions are created from the unmanaged thread that scans files on the Primary server.
This thread also performs the Secondary recovery scanning, when it is configured, to verify that no files have been unexpectedly deleted at the Secondary.
This thread uses a data channel that is connected to the Primary server.
Idle File Scanner Thread (Error List)
This thread processes all error and file recovery actions to resolve errors that have been detected during transaction processing.
This thread closes any files that are open and have had no I/O for longer than the secondary open idle time. See the IDLE_CLOSE keyword.
This thread uses a data channel that is connected to the Primary server.
The WAL file support has been implemented for the version 2 style of replication in the Data Manager. The WAL file support improves the overall performance for both managed and unmanaged processes for the replication servers. It eliminates all bottlenecks, thread conflicts, and scenarios where the PLB applications could be slowed by excessive IO transaction processing required for the replication servers. The following points of interest provide basic information about the WAL files:
A WAL file has a 1KB header.
Data in a WAL file following the header is composed of 60KB data segments. The data segments include all transaction command messages required to replicate data files from the Primary.
By default, the number of segments in a single WAL file is limited to 1000 segments. However, the WAL_SEGMENT_MAX keyword can be used to set the segment count from 100 to 100000 segments in the WAL file.
The old physical WAL files are deleted after all of the WAL file messages have been processed.
New application file and new directory information is embedded in the WAL messages to improve the managed transaction performance.
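The header and segment sizes above imply an upper bound on the size of a single WAL file. The Python sketch below works through that arithmetic. It assumes that 1KB means 1024 bytes and that segments are stored back to back immediately after the header. Neither detail is stated in this document, and the function names are hypothetical.

```python
# Sizes taken from the text; 1 KB = 1024 bytes is an assumption.
HEADER_BYTES = 1 * 1024
SEGMENT_BYTES = 60 * 1024


def segment_offset(index):
    """Byte offset of zero-based segment `index`, assuming a contiguous layout."""
    if index < 0:
        raise ValueError("segment index must be non-negative")
    return HEADER_BYTES + index * SEGMENT_BYTES


def max_wal_file_bytes(segment_max=1000):
    """Upper bound on the size of one WAL file for a given segment limit."""
    return HEADER_BYTES + segment_max * SEGMENT_BYTES
```

Under these assumptions, the default limit of 1000 segments gives a WAL file of at most 61,441,024 bytes, and the allowed range of 100 to 100000 segments corresponds to roughly 6 MB to 6 GB per file.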
New keywords have been implemented for the Sundm.cfg that can be placed in the [replication] section to support the version 2 replication operations. The new keywords are defined as follows:
V2_REPLICATION={ON|OFF}
This keyword is used to turn the version 2 replication ON or OFF. By default, the version 2 replication is turned OFF. This keyword must be added to the [replication] section of the replication servers to use the version 2 replication.
WAL_DIR={path}
This keyword is used to define where the WAL files are placed. If this keyword is not specified, the WAL files are placed in the current working directory for the Data Manager.
WAL_SEGMENT_MAX={max}
This keyword is used to specify the maximum number of segments that are written into a single WAL file. If this keyword is not specified, the default maximum segment count is 1000 segments. When this keyword is specified, the {max} value must be from 100 to 100000 segments. Otherwise, it is set to the default of 1000 segments. If the maximum segment value is set to a lower value, the WAL files are removed at a higher frequency than when the maximum segment value is set to a higher value.
WAL_SEGMENT_TIMEOUT={timeout}
This keyword is used to define the maximum number of seconds that a dirty WAL log segment can remain in memory before it is written to a WAL file. If this keyword is not specified, the {timeout} value defaults to 5 seconds. When this keyword is specified, the {timeout} value must be from 1 to 3600 seconds. Otherwise, the {timeout} value is set to 5 seconds.
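The effect of WAL_SEGMENT_TIMEOUT can be pictured as a buffer that is written out either when it fills or when it has been dirty for too long. The Python sketch below is an illustration under assumptions: the class and its methods are hypothetical, time is passed in explicitly so the behavior is easy to follow, and each message is assumed to be smaller than one segment.

```python
SEGMENT_BYTES = 60 * 1024  # segment size from the text; 1 KB = 1024 bytes assumed


class SegmentBuffer:
    """Hypothetical in-memory segment flushed on size or on timeout."""

    def __init__(self, timeout_seconds=5):
        self.timeout = timeout_seconds
        self.data = bytearray()
        self.dirty_since = None  # time of the first unwritten message
        self.flushed = []        # stands in for segments written to a WAL file

    def add(self, message, now):
        # Flush first if the message would not fit in the current segment.
        if len(self.data) + len(message) > SEGMENT_BYTES:
            self.flush()
        if self.dirty_since is None:
            self.dirty_since = now
        self.data += message

    def tick(self, now):
        # Called periodically: flush a segment that has been dirty too long.
        if self.dirty_since is not None and now - self.dirty_since >= self.timeout:
            self.flush()

    def flush(self):
        if self.data:
            self.flushed.append(bytes(self.data))
        self.data = bytearray()
        self.dirty_since = None
```

With the default of 5 seconds, a message added at time 0 is still in memory at time 4.9 and is written out by the first tick at or after time 5.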
Examples of new keywords:
[Replication]
V2_REPLICATION=ON
WAL_DIR=c:\temp\wal
WAL_SEGMENT_MAX=100
WAL_SEGMENT_TIMEOUT=10
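The defaults and range rules described for these keywords can be checked with a short Python sketch. It assumes that the [replication] section can be read as an INI-style fragment by Python's configparser and that a non-numeric value is treated like an out-of-range value. The function names are hypothetical, and the Data Manager's own parser may behave differently.

```python
import configparser

# keyword: (minimum, maximum, default), as described in the text
LIMITS = {
    "WAL_SEGMENT_MAX": (100, 100000, 1000),
    "WAL_SEGMENT_TIMEOUT": (1, 3600, 5),
}

SAMPLE = r"""
[Replication]
V2_REPLICATION=ON
WAL_DIR=c:\temp\wal
WAL_SEGMENT_MAX=100
WAL_SEGMENT_TIMEOUT=10
"""


def _bounded(raw, low, high, default):
    """Return int(raw) if it lies within [low, high], else the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default  # missing or non-numeric (assumption)
    return value if low <= value <= high else default


def effective_wal_settings(cfg_text):
    """Compute the settings that would take effect for a config fragment."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Match the section name without regard to case.
    name = next((s for s in parser.sections() if s.lower() == "replication"), None)
    raw = {k.upper(): v for k, v in parser[name].items()} if name else {}
    settings = {
        "V2_REPLICATION": raw.get("V2_REPLICATION", "OFF").strip().upper() == "ON",
        "WAL_DIR": raw.get("WAL_DIR"),  # None means the current working directory
    }
    for key, (low, high, default) in LIMITS.items():
        settings[key] = _bounded(raw.get(key), low, high, default)
    return settings
```

Calling effective_wal_settings(SAMPLE) returns the four values from the example unchanged, while a fragment that sets WAL_SEGMENT_MAX=50 falls back to 1000 because 50 is below the documented minimum.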
New ADMIN data items have been added for the Data Manager. These keywords can be used in the AdmGetInfo instruction. The new keywords are described as follows:
AdmitemSrvWalMain (136)
Returns the current write position of the WAL file.
AdmItemSrvWalMan (137)
Returns the current processed position of the WAL file by the managed transaction handler on a Secondary/Backup server.
AdmitemSrvWalUnMan (138)
Returns the current processed position of the WAL file by the unmanaged transaction handler on a Secondary/Backup server.
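A monitoring tool could compare these three positions to estimate how far a Secondary/Backup server is behind. The Python sketch below only shows that comparison. It assumes the three values have already been retrieved (for example through AdmGetInfo in a PLB program) and that they are numbers on the same scale where a larger value is later in the WAL. This document does not confirm the unit or scale, and the function name is hypothetical.

```python
def replication_lag(wal_main, wal_managed, wal_unmanaged):
    """Estimate how far each handler is behind the WAL write position.

    Assumes all three positions are comparable numbers on one scale.
    Negative differences are clamped to zero.
    """
    return {
        "managed": max(wal_main - wal_managed, 0),
        "unmanaged": max(wal_main - wal_unmanaged, 0),
    }
```

For example, with a write position of 5000, a managed position of 4200, and an unmanaged position of 5000, the sketch reports a managed lag of 800 and an unmanaged lag of 0.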
See Also: Introduction, Managed File Support