
Asset ID: 1-71-1003766.1
Update Date: 2010-10-29
Keywords:

Solution Type: Technical Instruction

Solution 1003766.1: Tuning Remote Volume Mirroring for Sun Storage[TM] RAID Arrays


Related Items
  • Sun Storage 6540 Array
  • Sun Storage 6130 Array
  • Sun Storage 6140 Array
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - 6xxx Arrays

Previously Published As
205301


Description
This document describes how to evaluate and modify a Remote Volume Mirroring (RVM) configuration in order to mitigate the performance effects of high latency between the two arrays of the RVM configuration, as well as of local and remote system load. Customers should identify their performance goals on a per-application and per-volume basis before using the tools in this document.

This document assumes that the audience has reviewed the Online Help documentation in Sun StorageTek[TM] Common Array Manager as well as the help in Sun StorageTek[TM] SANtricity Storage Manager.

The details of Synchronous versus Asynchronous Replication can be found by reviewing the following Online Help:

Under Configuration Tasks -> Configuring Data Replication:

  1. About Replication Modes
  2. Reference: Synchronous versus Asynchronous Replication Modes
  3. About Replication Set Properties
  4. About Replication Sets

The details of latency and other replication issues are covered in the following Online Help:

Under Configuration Tasks -> Planning for Data Replication

  1. General Planning Considerations
  2. Reference: Checklist of Data Replication Tasks

The following are additional pages that will round out your understanding of
this topic.

Under Configuration Tasks -> Managing Data Replication

  1. Suspending and Resuming Data Replication
  2. Testing of Set Links


Steps to Follow
Follow the steps below to identify which performance changes will impact your environment.

Part I: Defining and Determining Latency

Defining Latency

Latency refers to the amount of time required to complete a transaction. For an RVM configuration, the figure of interest is the so-called round-trip latency: the time for an operation to travel from the start point to the end point, plus the time for the acknowledgement to return to the start point. The term Highly Latent refers to round-trip latencies that are large or unacceptable.

Sun recommends utilizing Asynchronous replication at latency values of 10 ms or more. Whether a lower value is acceptable will depend on the user's requirements.

Defining LBA or Chunk Size

LBA (Logical Block Allocation) is the number of bytes transferred in a single IO. In this case, we are interested in the typical size of an RVM IO, since we will want to know how long each chunk takes to transfer over the network. The LBA size depends on the size of the volume being replicated.

Volume Size In Blocks / 10^6 = LBA

NOTE: The LBA cannot be less than 64KB, or 128 512-byte blocks.

For example, a 500 GB volume:

500 GB / 10^6 = 524,288,000 blocks / 10^6 = 524.288 blocks, or ~262 KB
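
The arithmetic above can be scripted as a quick check. The following is a minimal Python sketch, assuming the formula and the 64 KB floor described above; the function name is illustrative:

def rvm_chunk_kb(volume_blocks):
    """Estimate the RVM chunk size in KB for a volume whose size is
    given in 512-byte blocks, per the formula above."""
    chunk_blocks = max(volume_blocks / 10**6, 128)   # floor of 128 blocks (64 KB)
    return chunk_blocks * 512 / 1024                 # blocks -> KB

print(rvm_chunk_kb(524288000))   # the 500 GB example above: ~262 KB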

Determining Latency

These arrays provide a user interface that helps determine link health and the overall latency between units. You can use either the Sun StorageTek[TM] CAM (Common Array Manager) web user interface or the sscs(1M) command line interface. A Replication Set must be configured before you can use this function (which implies that the Replication feature is enabled and activated).

For CAM:

In the left hand menu:

1) Expand Storage Systems
2) Expand Array Name
3) Click Replication
4) Click Replication Set Link in main window
5) Click Test Communication Button

For Legacy Element Manager 1.3 through 2.1:

1) Click Configuration Services
2) Click Array Name
3) Click Logical Devices Tab
4) Click Replication Tab
5) Click Replication Set link in main window
6) Click Test Communication Button

For sscs:

Use the modify sub-command with the -E option against the specific set you
wish to evaluate:

Example:

sscs modify -a array -E repset volname/#

# /opt/SUNWsesscs/cli/bin/sscs modify -a myarray -E repset primaryvol/#
Communication between the owning controllers of Local Volume primaryvol and Remote Volume remotevol is normal.

Average round trip time: 1052117 microseconds
Shortest round trip time: 2978 microseconds

NOTE: The number after the volume name is the number of the set. It is listed in the output of sscs list -a array repset.
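
If you capture this output, the round-trip figures can be converted to milliseconds and compared against the 10 ms guideline from Part I. The following is a minimal Python sketch, assuming output in exactly the format shown above:

import re

# Sample link-test output, as shown above.
output = """Average round trip time: 1052117 microseconds
Shortest round trip time: 2978 microseconds"""

for label, usec in re.findall(r"(\w+) round trip time: (\d+) microseconds", output):
    ms = int(usec) / 1000.0
    # Part I recommends Asynchronous replication at 10 ms or more.
    flag = " (consider Asynchronous mode)" if ms >= 10 else ""
    print("%s round trip: %.1f ms%s" % (label, ms, flag))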

Estimating Time to Synchronize

With high latency values, initial synchronizations, and even recovery synchronizations, can take considerable time to complete. Take, for example, the replication of 500 GB over a SAN with a latency of 60 ms.

A full synchronization from the primary to the secondary site would take, at a minimum:

Rate: (Chunk Size as calculated above) / (Latency in seconds)

262 KB / 0.06 s = 4369 KB/s, or ~4.3 MB/s

Time to completion:

(500 GB * 1024 MB/GB) / 4.3 MB/s = 119,070 s = ~33 hours = ~1.4 days

This assumes that there is no host IO to the Sun StorEdge[TM] 6130 and that the replication priority is set to its highest value. Cutting the latency in half halves the time to completion; reducing latency is, by far, the best way to improve local application performance and decrease synchronization times, because all replication IO depends on the time to transmit and acknowledge the replicated data. The problem, of course, is that latency can at times be the most expensive variable to reduce and tune.

NOTE: The benefits of Asynchronous mode do not take effect until the update or full synchronization completes. Until then, new write IOs are under the same restrictions as those of Synchronous mode.
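
The estimate above can be scripted in the same way. The following is a minimal Python sketch of the same arithmetic, using the example values from this section; the function name is illustrative:

def sync_time_hours(volume_gb, chunk_kb, rtt_seconds):
    """Best-case hours to fully synchronize a volume, assuming no
    competing host IO and the highest replication priority."""
    rate_kb_s = chunk_kb / rtt_seconds     # e.g. 262.144 / 0.06 ~= 4369 KB/s
    volume_kb = volume_gb * 1024 * 1024    # GB -> KB
    return volume_kb / rate_kb_s / 3600.0

print(sync_time_hours(500, 262.144, 0.06))   # ~33 hours, as computed above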

Part II: Asynchronous Write Order Consistency Group

The Write Order Consistency Group forces every new write to any primary volume in the group to be written to the secondary volume in order of receipt. This serializes write IO operations to the volumes participating in the group, reducing the flushing of data from the primary to the secondary to a single-threaded activity.

Serializing the data flush on an asynchronous group of sets can therefore lead to system performance equivalent to that of a synchronous transfer over the same network with the same latency.
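
For a rough sense of the cost, reuse the Part I example: with a ~262 KB chunk and a 60 ms round trip, a serialized flush moves at most one chunk per round trip, or about 4.3 MB/s for the entire group, the same ceiling a single synchronous set would face on that link.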

Great care should be taken when considering this option. There can be only one Consistency Group defined per array.

Part III: Defining and Setting Priority

Replication Set priority is defined individually for each set. The setting controls a process priority in the array firmware that lengthens or shortens the delay in servicing replication requests. In effect, it increases or decreases the bandwidth utilization of on-the-wire replication IOs at the expense of local, incoming read and write IOs. This function affects sets in a Synchronizing or Optimal state.

In general, the priority levels are defined as: lowest, low, medium, high, and highest. The lowest setting imposes the longest delay between processing remote replication requests for the set; the highest setting is, of course, the opposite.

There is no hard-and-fast rule about what a user should set this to, for an individual set or for a group of sets. There are, however, some guidelines to note when changing these parameters:

  • The priority setting primarily benefits volume synchronization and sets with an Asynchronous mode of transfer.
  • Consider initially setting the priority according to the read/write ratio of the application using the primary volume in the set: a higher priority for heavy writes, a lower one for heavy reads.
  • The priority affects other volume IO on the controller that owns the Replication Set. Setting the priority too high for one or more sets can starve controller resources that would service other IO. Setting it too low will cause incoming write IOs for the sets to suffer as well.
  • For Synchronous sets, the priority setting is not as useful in an Optimal state, because incoming writes must complete remotely before being committed locally. A setting of medium is recommended for Optimal sets in this mode.
  • It is recommended that you raise the priority from its initial setting only when you wish to expedite volume synchronization, then change it back to a lower setting once synchronization completes and the set is Optimal.
  • The impact of this setting is limited by the round-trip latency discussed in Part I: the poorer the latency, the less effect this setting will have.

Because the right settings are specific to each solution, you will have to use your best judgment, and a bit of trial and error, to get the desired result. The guidelines above are meant to get you started and to give you an idea of how to judge each change and its impact.

To set the priority in Element Manager:

To set the priority with sscs:

sscs modify -a arrayname -R priority repset repsetname
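
For example, to expedite a full synchronization (the array and volume names are illustrative; the set is addressed as volname/#, as in Part I):

# /opt/SUNWsesscs/cli/bin/sscs modify -a myarray -R highest repset primaryvol/#

Then, once synchronization completes and the set is Optimal, return it to a lower setting:

# /opt/SUNWsesscs/cli/bin/sscs modify -a myarray -R medium repset primaryvol/#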

Part IV: Considerations in Volume Balancing

Not only can you change the replication process priority, but you can also assign the process to a particular RAID controller. This is accomplished through the volume ownership of the controller. By moving volumes around, you can manage controller resources more effectively. As with setting the priority for a set, there are guidelines for managing this aspect of the Replication Set:

  • These arrays are Asymmetric, so only one controller can have an active path to a given volume at a time. See <Document: 1003652.1> for more detail.
  • Both local and remote volumes may be managed in this fashion, in that a replicated volume owned by the A controller locally must be on the A controller remotely.
  • Start by recognizing which volumes carry the heaviest application load at the primary site, whether replicated or not, and balance them based on that first, then based on replication (a rough balancing sketch follows this list).
  • For full synchronization situations, consider dedicating a single controller to high priority replication, and place volumes with relatively low load on it.
  • Consider that if the Snapshot or Volume Copy features are used, the source and destination of these operations must reside on the same controller.
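
The following is a rough Python sketch of that first-pass balancing, intended as a planning aid only; the volume names and load figures are hypothetical:

# Greedy first-pass balance of volumes across the two controllers by
# application load, per the guideline above. Illustrative values only.
volumes = {"oltp_db": 900, "mailstore": 400, "archive": 150, "logs": 100}

owners = {"A": [], "B": []}
load = {"A": 0, "B": 0}
for name, io_load in sorted(volumes.items(), key=lambda v: -v[1]):
    target = "A" if load["A"] <= load["B"] else "B"   # lighter controller wins
    owners[target].append(name)
    load[target] += io_load

print(owners)   # {'A': ['oltp_db'], 'B': ['mailstore', 'archive', 'logs']}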

To change volume ownership, refer to <Document: 1006464.1> or, more specifically for Solaris[TM], <Document: 1012540.1>.

Again, these are simply guidelines, meant to give you a starting point for deciding whether this tunable can or should be changed to give your solution better performance, whatever that performance goal may be.

Part V: Conclusion

As described above, you have three variables that can be manipulated. They are listed here in order of highest impact on the overall configuration:

  1. Round-Trip Latency
  2. Volume Ownership
  3. Replication Priority

Users must consider the IO profile of each application and of the array as a whole. The next step is to create a goal that satisfies the needs of the applications using the primary or secondary arrays. That's right: array usage at the secondary site does not have to be limited to disaster recovery. Most operating systems now provide a utility that can monitor disk device utilization for statistics such as writes, reads, write size, read size, service time, wait time, and busy time. The web user interface and the sscs(1M) command line interface offer the ability to show similar statistics. For more information on viewing performance statistics for these arrays, review <Document: 1003966.1>.



Product
Sun StorageTek 6130 Array (SATA)
Sun StorageTek 6130 Array
Sun StorageTek 6540 Array
Sun StorageTek 6140 Array
Sun Storage 6780 Array
Sun Storage 6580 Array
Sun Storage 6180 Array

Internal Comments

This was the result of Escalation 1-13723224.


6140, 6540, 6130, RVM, Remote Volume Mirroring, Latency
Previously Published As
83509
Product_uuid
61718837-0e90-11d9-8d5c-080020a9ed93|Sun StorageTek 6130 Array (SATA)
8252cb91-d771-11d8-ab52-080020a9ed93|Sun StorageTek 6130 Array
e35cfcfc-a31a-11da-85b4-080020a9ed93|Sun StorageTek 6540 Array
8ac7dca5-a8bd-11da-85b4-080020a9ed93|Sun StorageTek 6140 Array

Change History
Date: 2009-11-25
User Name: 88109
Action: Approved
Comment: Updated title and products. May need to be updated for SANtricity too at a later date.
Version: 0
Date: 2006-10-30
User Name: 71396
Action: Approved
Comment: Performed final review of article.
No changes required.
Publishing.
Version: 6
Date: 2006-10-27
User Name: 71396
Action: Accept
Comment:
Version: 0

Attachments
This solution has no attachment