| Asset ID: | 1-75-1007046.1 |
| Update Date: | 2011-05-23 |
| Keywords: | |
Solution Type: Troubleshooting Sure
Solution 1007046.1: Troubleshooting Sun StorageTek[TM], Sun StorEdge[TM], and Sun Storage[TM] Management Communication Faults with Arrays
| Related Items |
- Sun Storage Flexline 280 Array
- Sun Storage 2540 Array
- Sun Storage 2510 Array
- Sun Storage 6140 Array
- Sun Storage Common Array Manager (CAM)
- Sun Storage Flexline 210 Array
- Sun Storage 2530 Array
- Sun Storage Flexline 380 Array
- Sun Storage 6540 Array
- SANtricity Storage Manager
- Sun Storage 6130 Array
- Sun Storage Flexline 240 Array
|
| Related Categories |
- GCS>Sun Microsystems>Storage Software>Modular Disk Device Software
|
Previously Published As
209726
Applies to:
- Sun Storage Common Array Manager (CAM) - Version: 5.0 and later [Release: and later]
- Sun Storage 6130 Array - Version: Not Applicable and later [Release: N/A and later]
- Sun Storage 2540 Array - Version: Not Applicable and later [Release: N/A and later]
- Sun Storage 2530 Array - Version: Not Applicable and later [Release: N/A and later]
- Sun Storage 2510 Array - Version: Not Applicable and later [Release: N/A and later]
All Platforms
Purpose
The purpose of this document is to assist in the identification and resolution
of communication issues between Sun StorageTek Common Array Manager (CAM),
Sun StorageTek SANtricity Storage Manager, and any supported Sun StorEdge[TM],
Sun StorageTek[TM], StorageTek[TM], or Sun Storage[TM] array.
Symptoms include:
- ASR Summary with SCRK:oob Component Name:OutOfBand Id:oob
- ASR Summary with ASR:oob
- ASR Summary with ASR:ib
- CAM Alert or ASR event for 2510 - 73.12.31 2510.CommunicationLostEvent.oob
- CAM Alert or ASR event for 2530 - 69.12.31 2530.CommunicationLostEvent.oob
- CAM Alert or ASR event for 2540 - 70.12.31 2540.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6130 - 48.12.31 6130.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6140 - 57.12.31 6140.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6540 - 63.12.31 6540.CommunicationLostEvent.oob
- CAM Alert or ASR event for flx380 - 59.12.31 flx380.CommunicationLostEvent.oob
- CAM Alert or ASR event for flx280 - 72.12.31 flx280.CommunicationLostEvent.oob
- CAM Alert or ASR event for flx240 - 74.12.31 flx240.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6580 - 79.12.31 6580.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6780 - 80.12.31 6780.CommunicationLostEvent.oob
- CAM Alert or ASR event for 6180 - 90.12.31 6180.CommunicationLostEvent.oob
- CAM Alert or ASR event for 2510 - 73.12.21 2510.CommunicationLostEvent.ib
- CAM Alert or ASR event for 2530 - 69.12.21 2530.CommunicationLostEvent.ib
- CAM Alert or ASR event for 2540 - 70.12.21 2540.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6130 - 48.12.21 6130.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6140 - 57.12.21 6140.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6540 - 63.12.21 6540.CommunicationLostEvent.ib
- CAM Alert or ASR event for flx380 - 59.12.21 flx380.CommunicationLostEvent.ib
- CAM Alert or ASR event for flx280 - 72.12.21 flx280.CommunicationLostEvent.ib
- CAM Alert or ASR event for flx240 - 74.12.21 flx240.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6580 - 79.12.21 6580.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6780 - 80.12.21 6780.CommunicationLostEvent.ib
- CAM Alert or ASR event for 6180 - 90.12.21 6180.CommunicationLostEvent.ib
- Failure to register an array in SANtricity or CAM
- Array is listed as Unresponsive in the Enterprise Management window in SANtricity
- Array is listed as Unresponsive in the Array Summary Page in CAM
Last Review Date
March 22, 2011
Instructions for the Reader
A Troubleshooting Guide is provided to assist
in debugging a specific issue. When possible, diagnostic tools are included in the document
to assist in troubleshooting.
Troubleshooting Details
Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
A. Verify whether the issue is based on an Alert/Alarm, or a problem with Registration/Adding.
- If the issue is based on an Alert/Alarm received from your management host, or observed in the Array Summary (CAM) or Enterprise Management (SANtricity), go to Step B.
- If the issue is based on a problem with registering/adding the array to the management software, go to Step C.
B. Verify whether the array is being managed in band or out of band.
For CAM: In the Array Summary window, the management type is listed in parentheses as In-Band or Out-of-Band. Alternatively, the alarm code, listed as XX.12.YY in the alarm, indicates whether the array is managed in band or out of band. XX is the array type code as listed in the Symptoms section of this document. YY is either 21 or 31, indicating In-Band or Out-of-Band respectively.
For SANtricity via GUI: This is displayed in the Enterprise Management window under management connections, as either Out of Band or In Band.
NOTE 1: There is no easy way to identify whether the array is managed in-band or out-of-band via the CLI for either application.
- If the array is being managed Out-of-Band, go to Step C.
- If the array is being managed In-Band, go to Step D.
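The XX.12.YY alarm code described in Step B can be decoded mechanically. The following is a minimal Python sketch, not part of CAM or SANtricity: the function name `decode_alarm_code` is hypothetical, and the code table is simply transcribed from the Symptoms list in this document.

```python
import re

# Array type codes (XX) transcribed from the Symptoms list in this document.
ARRAY_TYPES = {
    "48": "6130", "57": "6140", "59": "flx380", "63": "6540",
    "69": "2530", "70": "2540", "72": "flx280", "73": "2510",
    "74": "flx240", "79": "6580", "80": "6780", "90": "6180",
}

# Management type codes (YY) from Step B: 21 is In-Band, 31 is Out-of-Band.
MANAGEMENT_TYPES = {"21": "In-Band", "31": "Out-of-Band"}

# Matches a code of the form XX.12.21 or XX.12.31 anywhere in the text.
ALARM_CODE = re.compile(r"\b(\d{2})\.12\.(21|31)\b")


def decode_alarm_code(text):
    """Return (array_type, management_type) for the first XX.12.YY code
    found in text, or None if the text contains no such code.

    array_type is None when XX is not one of the codes in the Symptoms list.
    """
    match = ALARM_CODE.search(text)
    if match is None:
        return None
    xx, yy = match.groups()
    return ARRAY_TYPES.get(xx), MANAGEMENT_TYPES[yy]


if __name__ == "__main__":
    print(decode_alarm_code("57.12.31 6140.CommunicationLostEvent.oob"))
```

Passing a full alert line, such as one copied from a CAM alert, is enough; the function ignores the surrounding text and reads only the numeric code.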
C. Validate that you can communicate with each array controller out of band
Reference: <Document: 1008327.1> Validating Sun StorageTek[TM] 2500, 6000, and Flexline Array Controller Out of Band Communication
- If the controllers can be communicated with properly, but the array still cannot be registered, continue to Step F.
- If the controllers can be communicated with properly, but the array still shows up as unresponsive, go to Step E.
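The referenced document holds the full validation procedure for Step C. As a quick first pass before working through it, a plain TCP connection attempt to each controller can show whether the management host reaches the controllers at all. The Python sketch below is an illustration under assumptions: the function names are hypothetical, and the management port is left as a required parameter because this document does not state it (TCP 2463 is often cited for this controller family, but confirm it against the referenced document).

```python
import socket


def controller_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_controllers(addresses, port, timeout=5.0):
    """Map each controller address to its TCP reachability result."""
    return {addr: controller_reachable(addr, port, timeout) for addr in addresses}
```

For example, `check_controllers(["192.0.2.10", "192.0.2.11"], port=2463)` would test both controllers, where the addresses are placeholders and the port is the assumed value noted above. A successful connection only shows that the port is open; it does not replace the validation in the referenced document.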
D. Validate that you can communicate with each array controller in band
Reference: <Document: 1021058.1> Validating Sun StorageTek[TM] 2500, 6000, and Flexline Array Controller In Band Proxy Agent Communication
- If the array and the in-band agent can be communicated with properly, but the array still cannot be registered, go to Step F.
- If the array and the in-band agent can be communicated with properly, but the array still shows up as unresponsive, go to Step E.
E. Validate array status after restarting CAM or SANtricity services
CAM
Restart the service:
Solaris 10 : svcadm restart svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh restart
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh restart
Windows : Use Control Panel to restart Sun_STK_FMS
Then check status:
Solaris 10 : svcs svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh status
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh status
Windows : Use Control Panel to check the status of Sun_STK_FMS
Status should be online.
SANtricity
Simply closing the Enterprise Management window to exit the application, and launching it again, takes care of this task.
- If the array is still unresponsive, or shows an alert/alarm, go to Step F.
- If the array alert/alarm is gone, you have corrected the problem; no further action is required.
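When the CAM service restart in Step E has to be repeated across several management hosts, the per-platform commands can be kept in one lookup table. The Python sketch below is only an illustration: the platform labels and the function name `fms_command` are hypothetical, while the command strings are copied from Step E. Windows is deliberately absent because Step E uses Control Panel there.

```python
# Commands copied from Step E; the dictionary keys are hypothetical labels.
FMS_COMMANDS = {
    "solaris10": {
        "restart": ["svcadm", "restart", "svc:/system/fmservice:default"],
        "status": ["svcs", "svc:/system/fmservice:default"],
    },
    "solaris8_9": {
        "restart": ["/opt/SUNWsefms/sbin/fmservice.sh", "restart"],
        "status": ["/opt/SUNWsefms/sbin/fmservice.sh", "status"],
    },
    "linux": {
        "restart": ["/opt/sun/cam/private/fms/sbin/fmservice.sh", "restart"],
        "status": ["/opt/sun/cam/private/fms/sbin/fmservice.sh", "status"],
    },
}


def fms_command(platform, action):
    """Return the argument list for the given platform label and action.

    Raises ValueError for any platform or action not in FMS_COMMANDS,
    which includes Windows, where Step E uses Control Panel instead.
    """
    try:
        # Return a copy so callers cannot modify the table by accident.
        return list(FMS_COMMANDS[platform][action])
    except KeyError:
        raise ValueError(f"no FMS command for {platform!r} / {action!r}") from None
```

The returned list can be passed to `subprocess.run` on the management host, run with sufficient privileges, followed by the matching `status` command to confirm that the service is online.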
F. Re-register the array
If possible, remove the array from CAM or SANtricity and register it again. Attempt the registration with each controller's IP address in turn.
- If you can register the array, continue to Step G.
- If you cannot register the array using either IP address, continue to Step H.
G. Validate whether the issue is intermittent or not
- If the array alternates between having a communication issue and communicating normally, check whether your CAM version is 6.4.1 or earlier. There are issues with long-running jobs for arrays, and with the scripting client, that are addressed in 6.5 and later.
- If the array status alternates between having a communication issue and communicating normally, check your network for the following traits:
  - Whether your management LAN is a private LAN. A LAN that is not private leaves the network software on the array controllers exposed to attack, and network congestion can cause the poll to fail.
  - Whether any type of port scanning is taking place on the LAN. Port scanning can cause TCP port connections to max out, which results in a failed poll of the array.
- If you suspect either of the above issues, and your connection is intermittent, try increasing the polling interval.
CAM
- Click General Configuration
- Click Health Monitoring
- Change Monitoring Frequency
- Click Save
By default, CAM has a five (5) minute polling interval, retries twice, and raises an alarm for loss of communication after fifteen (15) minutes.
SANtricity
You cannot tune the polling interval in SANtricity Storage Manager.
If your issue is not intermittent, or if tuning the polling interval has not helped, continue to Step H.
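The default CAM timing in Step G (five-minute interval, two retries, alarm after fifteen minutes) is consistent with one failed poll plus two failed retries, each one interval apart. The Python sketch below applies that reading to other intervals. It is an assumption inferred from the default figures: this document does not state how the alarm delay changes when the interval is changed, and the function name is hypothetical.

```python
def minutes_until_alarm(poll_interval_min, retries=2):
    """Estimate minutes of lost communication before CAM raises an alarm.

    Assumes one initial failed poll plus `retries` failed retries, each
    spaced one polling interval apart. This model is inferred from the
    default figures in Step G, not documented behavior.
    """
    return poll_interval_min * (1 + retries)


if __name__ == "__main__":
    for interval in (5, 10, 15):
        print(interval, "minute interval ->", minutes_until_alarm(interval), "minutes")
```

Under this assumption, doubling the interval to ten minutes would also double the time before a genuine outage is reported, which is the trade-off to weigh when increasing the interval to ride out congestion or port scans.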
H. Collect Data
At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required.
- If possible, collect CAM array support data (will not be available if the array cannot be communicated with). Reference: <Document: 1002514.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager
- If possible, collect SANtricity support data (will not be available if the array cannot be communicated with). Reference: <Document: 1014074.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager
- If using CAM, collect CAM host support data. Reference: <Document: 1021091.1> Collecting Sun StorageTek[TM] Common Array Manager Host Support Data
- Provide a network map of the management LAN
- Provide a network map of the in-band management, if applicable
- Provide the polling interval
- Indicate whether the array is on a public or private LAN
- Indicate whether the array has a static or DHCP-assigned IP address
- Indicate which of the above steps were attempted
- Provide the host type of the management software
- Provide the array model
- Provide the management software name and version
Then contact Support.
Internal Comments
This document contains normalized content and is managed by the Domain Lead(s)
of the respective domains. To notify content owners of a knowledge gap
contained in this document, and/or prior to updating this document, please
add a comment to the document and it will be processed.
Most customers will be resolved by following the path of Steps C or D. The remaining few have either CAM services issues or an intermittent problem on their network.
Due to CR6830106, running several long-running jobs, or using the sscs scripting
client for multiple operations in succession, can cause the array to stop
communicating for a period of time. Upgrade to release 6.5 or
later and re-evaluate your scripts/jobs.
Ensure that they are at the latest version of Common Array Manager.
SANtricity, CAM, Common Array Manager, oob, out of band, 6140, 6540, flx240, flx210, flx380, 6540, 2540, 25x0, 2500, 6000, 2530, communication, 6180, 6580, 6780, ib, in band, normalized
Previously Published As
91322
Change History
Date: 2008-01-04
User Name: 7058
Action: Approved
Comment: No further edits required.
OK to publish.
Version: 9
Date: 2008-01-04
User Name: 7058
Action: Accept
Comment:
Version: 0
Date: 2008-01-04
User Name: 88109
Action: Approved
Comment: no technical change. link is correct
Version: 0
Date: 2008-01-03
User Name: 71066
Action: Approved
Comment: Link corrected as requested by the Final Review team.
Nicolas
Version: 0
Date: 2008-01-03
User Name: 31620
Action: Rejected
Attachments
This solution has no attachment