Wednesday, March 4, 2015

Multiple-masters domain enSwFaultTol set-up



The biggest challenge in a multiple-masters domain (e.g. one containing back-up / DR masters and/or "FULLSTATUS ON" FTAs) is making sure that no job details (messages) are lost when the primary master is out of order and another master or "FULLSTATUS ON" FTA has to be promoted to primary master.

There is a global variable called enSwFaultTol (short name: sw) that enables or disables the fault-tolerant switch manager feature, which basically changes the flow of communication inside the TWS environment:

enSwFaultTol / sw = NO: the FTAs are connected to the primary master and send their job data only to it; the other masters and/or "FULLSTATUS ON" FTAs get in sync through the primary master.

enSwFaultTol / sw = YES: the FTAs are connected to each master and/or "FULLSTATUS ON" FTA and send their job data to each of them independently.

Below is a diagram representing only the communication channels that are impacted by the above variable.



The default is NO, but my recommendation is to change it to YES, even though IBM recommends NO (I've had several meetings with IBM support on this topic, but they did not convince me to keep it at NO).
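
For reference, here is a minimal sketch of how the option could be checked and changed from a script. It assumes the `optman` command line is available on the PATH of the TWS user on the master and that the option is addressed by its long name `enSwFaultTol`. The helper names (`optman_commands`, `apply`) are mine and not part of TWS. Please verify the exact syntax against your TWS version before running anything for real; by default the script only prints the commands.

```python
import subprocess

OPTION = "enSwFaultTol"  # global option discussed in this post (short name: sw)


def optman_commands(value):
    """Return the optman command lines to show, change and re-check the option."""
    value = value.upper()
    if value not in ("YES", "NO"):
        raise ValueError("enSwFaultTol accepts only YES or NO")
    return [
        ["optman", "show", OPTION],              # current value
        ["optman", "chg", f"{OPTION}={value}"],  # change it
        ["optman", "show", OPTION],              # confirm the new value
    ]


def apply(value, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in optman_commands(value):
        print(" ".join(cmd))
        if not dry_run:
            # Assumption: optman is on PATH and the user is authorized to run it.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply("YES")  # dry run: only prints the three commands
```

As far as I know, a change to this option becomes effective only with the next plan generation or extension (JnextPlan), so schedule the change accordingly and check the documentation of your release.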

Why?

  • enSwFaultTol / sw = NO: the biggest downside is that if the primary master ends up in a frozen state, you will lose all the job messages sent to it by the FTAs until a new primary master is promoted. It happened to me several times, and we lost up to several hours' worth of data.
  • enSwFaultTol / sw = YES: the above scenario won't happen, as all the other masters or "FULLSTATUS ON" FTAs receive all the job messages themselves. Its small disadvantage is increased network traffic, as the data is sent to multiple servers.
My strong recommendation, even if it goes against IBM's, is to use enSwFaultTol / sw = YES in a multiple-masters domain.
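
To make the difference concrete, here is a toy model of the scenario described above. It is only a sketch of the behaviour as I described it, not of the real TWS internals: the workstation name `BKM1`, the message strings and the `simulate` function are all hypothetical, and the primary master is assumed to freeze just before it processes the message at index `freeze_at`.

```python
def simulate(sw, messages, freeze_at, backups=("BKM1",)):
    """Return the job messages each backup holds at promotion time.

    sw        -- "YES" or "NO", the value of enSwFaultTol
    messages  -- job messages sent by the FTAs, in order
    freeze_at -- index of the first message the frozen primary never processes
    """
    held = {b: [] for b in backups}
    for i, msg in enumerate(messages):
        primary_frozen = i >= freeze_at
        for b in backups:
            if sw == "YES":
                # FTAs send to every master independently of the primary.
                held[b].append(msg)
            elif not primary_frozen:
                # With NO, backups only learn what the primary passes on.
                held[b].append(msg)
    return held


if __name__ == "__main__":
    msgs = ["JOB1 SUCC", "JOB2 SUCC", "JOB3 ABEND", "JOB4 SUCC"]
    for sw in ("NO", "YES"):
        held = simulate(sw, msgs, freeze_at=2)["BKM1"]
        print(sw, "-> backup holds", len(held), "of", len(msgs), "messages")
```

In this model the backup is missing everything sent after the freeze when sw = NO, while with sw = YES it holds the full set, at the cost of each message being sent once per master.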

3 comments:

  1. Thanks for all of your great posts! It should be noted that if you are running a dynamic broker, the enSwFaultTol setting should be disabled. There is a known compatibility issue between the dynamic broker and the fault-tolerant switch manager that can cause jobs on FTAs to get stuck in the READY state. See: http://www-01.ibm.com/support/docview.wss?uid=swg21639575

  2. ni3avhad@gmail.com I would be happy if someone could hold my hand in my current situation.

  3. Will connect on Teams immediately.
