Replication Support


 

The Sunbelt Data Manager provides communication between multiple data managers, allowing data files opened and written on a primary data manager to be replicated on a secondary data manager through a TCP/IP network connection.

 

The following terms are used with replication:

Replication
Replication provides duplication of files and directories. File data is transferred from a primary data manager to a secondary data manager such that the files are exact mirror images. The operating system time stamps for the data files are not expected to be the same due to differences in server clock settings.
Primary
The primary data manager is identified by the HOST00 setting in the [replication] section of the configuration file. The primary data manager is the active data manager that allows PL/B applications to log on and open data files. Normal PL/B language operations are executed via the primary data manager. The primary data manager establishes a communications session with all secondary replication servers before the PL/B applications start executing. If the primary server stops communicating prematurely with the secondary server, the secondary server takes control after a configured time period.
Secondary
A secondary data manager is identified by the HOST01 to HOST03 keyword settings in the primary server's configuration file. After a secondary server establishes a communication session with the primary server, it continues to monitor and transfer any file and directory data that has changed at the primary server. PL/B applications cannot log on to a secondary server while the primary server is communicating with it, whether to transfer modified data or to send keep-alive sequences.
If the primary server terminates prematurely and stops communicating, the HOST01 secondary server detects the error and takes over primary control after a default time period of three (3) minutes. The FAIL_TIME keyword controls this time period.
Backup
A backup data manager is identified by the HOST01 to HOST03 keyword settings in the primary configuration file. After a backup server establishes a communications session with the primary server, it continues to monitor and transfer any file and directory data that has changed at the primary server. PL/B applications can never log on to a backup server.
If the primary server terminates prematurely and stops communicating, a HOSTnn backup server waits an indefinite amount of time trying to establish a connection with any primary data manager server that is activated.
Rollover
Rollover occurs when a secondary server detects the primary server has stopped communicating. The secondary server will then take control from the primary and allow users to log on and open files at the secondary server.
Replication Control File
A replication control file is created at the primary server when the server is loaded the first time. The current configuration keywords determine the replication settings for all servers that are networked. The replication control file named ‘sundmrep.DM’ is created and maintained in the same directory in which the sundm.exe file exists. When the replication network is activated, the primary ‘sundmrep.DM’ file is transferred to any secondary servers that log on to the primary server.
When the primary server is loaded and the replication control file already exists, the current replication keyword settings are automatically updated.
The ‘SUNDM -u’ command line forces the primary server to update its settings from the configuration file while executing.
When secondary servers log on to the primary server, the primary server sends appropriate updated keyword settings, file, and directory information to the secondary servers.
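
The rollover rule described in the Secondary and Rollover entries can be pictured as a simple inactivity timer. The following Python sketch is illustrative only and is not the Data Manager's implementation: the class name RolloverMonitor and its methods are hypothetical, and it assumes the fail time is expressed in seconds (the source states only the three-minute default, not the unit that FAIL_TIME uses).

```python
import time

# Documented default: a secondary takes over after three minutes of silence.
DEFAULT_FAIL_TIME_SECONDS = 3 * 60


class RolloverMonitor:
    """Hypothetical helper that tracks when the primary was last heard from."""

    def __init__(self, fail_time_seconds=DEFAULT_FAIL_TIME_SECONDS,
                 clock=time.monotonic):
        # Assumption: the fail time is supplied in seconds.
        self.fail_time_seconds = fail_time_seconds
        self._clock = clock
        self._last_heard = clock()

    def record_traffic(self):
        """Call whenever data or a keep-alive arrives from the primary."""
        self._last_heard = self._clock()

    def should_take_over(self):
        """True once the primary has been silent for the whole fail time."""
        return self._clock() - self._last_heard >= self.fail_time_seconds
```

A secondary server would call record_traffic() on every message from the primary and poll should_take_over() periodically. A backup server, by contrast, never takes over, so it would have no use for such a timer.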

Enhanced Performance in 9.5A

 

In version 9.5A, the Data Manager replication support was modified to allow multiple channels between the Primary and Secondary/Backup servers. In addition, the Primary server was modified to write all IO and command transactions into a WAL (write-ahead log) file. A specialized channel transfers the WAL data from the Primary to the Secondary/Backup servers. Replication that uses multiple channels and WAL file data is identified as the version 2 style of replication. Replication over a single channel, as originally implemented, is identified as the version 1 style of replication.

 

Please note the following regarding the operation of the two replication styles:

  1. The version 2 style of replication can coexist with the version 1 style. This means that a Primary configured for the version 2 style can support a Secondary/Backup server that is configured for either the version 1 or the version 2 style. However, the performance of the Primary can be diminished in this case because of its interaction with the version 1 style Secondary/Backup servers.

  2. The implementation of the version 2 style of replication requires the use of multiple threads and message queues to provide faster transaction processing. This implementation removes the bottlenecks that existed in the version 1 style of replication.

  3. The version 2 style of replication provides the following benefits:

  4. The Primary server processing threads are described as follows:

    1. Main Thread

    2. Transaction Processing Thread

    3. Unmanaged Scanner Thread

    4. Data Channel Thread

  5. The Backup/Secondary server processing threads are described as follows:

    1. Main Thread

    2. WAL Transfer Thread

    3. WAL Managed Processing Thread

    4. WAL Unmanaged Processing Thread

    5. Idle File Scanner Thread (Error List)

  6. WAL file support has been implemented in the Data Manager for the version 2 style of replication. It improves the overall performance of both managed and unmanaged processes on the replication servers. It eliminates bottlenecks, thread conflicts, and scenarios where PL/B applications could be slowed by the excessive IO transaction processing required for the replication servers. The following points of interest provide basic information about the WAL files:

  7. New keywords have been implemented for the Sundm.cfg file to support version 2 replication operations. They are placed in the [replication] section and are defined as follows:


    V2_REPLICATION={ON|OFF}

    This keyword is used to turn the version 2 replication ON or OFF. By default, the version 2 replication is turned OFF. This keyword must be added to the [replication] section of the replication servers to use the version 2 replication.

     

    WAL_DIR={path}

    This keyword is used to define where the WAL files are placed. If this keyword is not specified, the WAL files are placed in the current working directory for the Data Manager.

     

    WAL_SEGMENT_MAX={max}

    This keyword is used to specify the maximum number of segments that are written into a single WAL file. If this keyword is not specified, the default maximum segment count is 1000 segments. When this keyword is specified, the {max} value must be from 100 to 100000 segments. Otherwise, it is set to the default of 1000 segments. A lower maximum segment value causes WAL files to be removed more frequently than a higher value does.

     

    WAL_SEGMENT_TIMEOUT={timeout}

    This keyword is used to define the maximum number of seconds that a dirty WAL log segment can remain in memory before it is written to a WAL file. If this keyword is not specified, the {timeout} defaults to 5 seconds. When this keyword is specified, the {timeout} value can be from 1 to 3600 seconds. Otherwise, the {timeout} value is set to 5 seconds.

     

    Examples of new keywords:

     

    [Replication]

    V2_REPLICATION=ON

    WAL_DIR=c:\temp\wal

    WAL_SEGMENT_MAX=100

    WAL_SEGMENT_TIMEOUT=10
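
    The default and range rules for these keywords can be checked outside the Data Manager before a configuration is deployed. The Python sketch below is a minimal illustration, not the Data Manager's own parser: the function names read_wal_settings and _ranged_int are hypothetical, and it assumes the relevant part of Sundm.cfg can be read as a standard INI file. It applies only the rules stated above (OFF by default, 1000 segments, 5 seconds, and fallback to the default when a value is out of range).

```python
import configparser


def _ranged_int(raw, low, high, default):
    """Return raw as an int if it lies in [low, high]; otherwise the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if low <= value <= high else default


def read_wal_settings(cfg_text):
    """Hypothetical reader for the version 2 replication keywords."""
    # Assumption: the configuration text is INI-compatible.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # The section name appears as both [replication] and [Replication],
    # so match it without regard to case.
    section = next(
        (s for s in parser.sections() if s.lower() == "replication"), None
    )
    values = parser[section] if section else {}
    return {
        "v2_replication":
            str(values.get("v2_replication", "OFF")).strip().upper() == "ON",
        # None stands for "use the Data Manager's current working directory".
        "wal_dir": values.get("wal_dir"),
        "wal_segment_max":
            _ranged_int(values.get("wal_segment_max"), 100, 100000, 1000),
        "wal_segment_timeout":
            _ranged_int(values.get("wal_segment_timeout"), 1, 3600, 5),
    }
```

    Passing the example section above to read_wal_settings() yields version 2 replication enabled, a segment maximum of 100, and a timeout of 10 seconds. A value such as WAL_SEGMENT_MAX=50 falls outside the documented range and is reported as the default of 1000.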

  8. New ADMIN data items have been added for the Data Manager. These items can be used in the AdmGetInfo instruction and are described as follows:

AdmitemSrvWalMain (136)

Returns the current write position of the WAL file.

 

AdmItemSrvWalMan (137)

Returns the current processed position of the WAL file by the managed transaction handler on a Secondary/Backup server.

 

AdmitemSrvWalUnMan (138)

Returns the current processed position of the WAL file by the unmanaged transaction handler on a Secondary/Backup server.
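
One possible use of these three values is to estimate how far a Secondary/Backup server trails the WAL write position. The Python sketch below shows only the arithmetic. It assumes the three positions are expressed in the same units and are directly comparable, which the descriptions above do not state, and it does not show how the values are retrieved with AdmGetInfo. The function name wal_backlog is hypothetical.

```python
def wal_backlog(main_pos, managed_pos, unmanaged_pos):
    """Return how far each handler trails the WAL write position.

    main_pos      -- value reported for item 136 (current write position)
    managed_pos   -- value reported for item 137 (managed handler position)
    unmanaged_pos -- value reported for item 138 (unmanaged handler position)

    Assumption: all three positions use the same units.
    """
    return {
        # Clamp at zero in case the values were sampled at different moments.
        "managed": max(main_pos - managed_pos, 0),
        "unmanaged": max(main_pos - unmanaged_pos, 0),
    }
```

Under these assumptions, a backlog that keeps growing across successive samples would suggest that the handler is not keeping up with the Primary.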

 

 

See Also: Introduction, Managed File Support

 


