Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1009951.1
Update Date: 2009-09-14
Keywords:

Solution Type  Problem Resolution Sure

Solution  1009951.1 :   Sun Fire[TM] B1600 Blade: Switch drops packets on NETMGT port or loses connectivity  


Related Items
  • Sun Fire B1600 Blade System Chassis
Related Categories
  • GCS>Sun Microsystems>Servers>Blade Servers

Previously Published As
213635


Symptoms

Some customers have observed packets being dropped on the NETMGT port during high traffic. In some cases, connectivity from the NETMGT port to the external network is lost, and users cannot ping to or from the NETMGT port. In other cases, the switch resets frequently. In one reported case, the following was observed in the log of the Sun Fire[TM] B1600 Blade while the NETMGT port was being pinged at regular intervals of 1 second:

sc>showlogs ssc0
Sep 15 19:48:49: MINOR: SSC0: Active LED state changed to ON.
Sep 15 20:32:07: MINOR: Peer SC SSC0 is now online
Sep 15 20:32:07: MINOR: SSC0: Environmental monitoring enabled.
Sep 15 20:32:08: MINOR: SSC0: Environmental monitoring enabled.
Sep 15 20:32:14: MINOR: SSC0: Active LED state changed to OFF.
Sep 15 20:32:14: MINOR: SSC0: Service Required LED state changed to OFF.
Sep 15 20:32:14: MINOR: SSC0: Ready to Remove LED state changed to OFF.
Sep 15 20:32:16: MINOR: SSC0: Powered on.
Sep 15 20:32:22: MINOR: SSC0: Active LED state changed to ON.
Sep 15 21:58:39: MINOR: Peer SC SSC0 is now online
Sep 15 21:58:39: MINOR: SSC0: Environmental monitoring enabled.
Sep 15 21:58:40: MINOR: SSC0: Environmental monitoring enabled.
Sep 15 21:58:46: MINOR: SSC0: Active LED state changed to OFF.
Sep 15 21:58:46: MINOR: SSC0: Service Required LED state changed to OFF.
Sep 15 21:58:46: MINOR: SSC0: Ready to Remove LED state changed to OFF.
Sep 15 21:58:48: MINOR: SSC0: Powered on.
Sep 15 21:58:53: MINOR: SSC0: Active LED state changed to ON.
Sep 16 09:21:03: MAJOR: SSC0: Reset occurred
Nov 04 14:48:56: CRITICAL: SSC0: Network access failure           <---
Nov 04 15:05:53: CRITICAL: SSC0: Network access failure recovered <---

The problem has been observed in cases where the NETMGT port is connected to the customer's network with a number of other switches and hosts. The problem is not seen when the NETMGT port is connected to another host back-to-back.
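
When reviewing a long showlogs capture, it can help to pull out the failure and recovery pairs and see how long each outage lasted. The following is a minimal Python sketch, assuming the log text has been saved from the console and follows the line format shown in the excerpt above. The function names (find_outages, parse_ts) are hypothetical helpers, not part of any Sun tool, and because showlogs timestamps carry no year, the year used for date arithmetic is an assumption.

```python
import re
from datetime import datetime

# Matches lines such as:
#   "Nov 04 14:48:56: CRITICAL: SSC0: Network access failure   <---"
# The trailing "<---" annotation, if present, is ignored.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}): "
    r"(?P<sev>\w+): (?P<msg>.*?)\s*(?:<---)?\s*$"
)


def parse_ts(text, year=2000):
    """Parse a showlogs timestamp; the year is an assumption (not in the log)."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def find_outages(log_text):
    """Return a list of (start, end, seconds) for each failure/recovery pair."""
    outages = []
    start = None
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip prompts and anything that is not a log entry
        ts = parse_ts(m.group("ts"))
        msg = m.group("msg")
        if msg.endswith("Network access failure recovered"):
            if start is not None:
                outages.append((start, ts, (ts - start).total_seconds()))
                start = None
        elif msg.endswith("Network access failure"):
            start = ts
    return outages
```

Applied to the excerpt above, this approach would report a single outage between the "Network access failure" entry and the matching "recovered" entry. Pairs that span a year boundary are not handled by this sketch.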



Resolution

The problem occurs mainly when the NETMGT port is connected to the same network as the other ports (NETPTx), or to a network with other switches and hosts. When the NETMGT port is connected to a network, the network must be configured so that the NETMGT port does not participate in spanning tree. In other words, spanning tree packets must not reach the NETMGT port. The customer may have to reconfigure the network (particularly the other switches in the network) accordingly. Alternatively, the customer may connect the NETMGT port to a separate network with spanning tree disabled. Connecting the NETMGT port back-to-back to another host also resolves this problem.
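
To confirm that spanning tree packets no longer reach the NETMGT port after the network is reconfigured, frames captured on that network segment can be checked for standard spanning tree BPDUs. The following is a minimal Python sketch, assuming raw Ethernet frames are already available as bytes from a capture tool of the customer's choice. The function name is_stp_bpdu is hypothetical. It recognizes only standard IEEE 802.1D BPDUs (destination MAC 01:80:C2:00:00:00 with LLC DSAP/SSAP 0x42); vendor-specific spanning tree variants are not covered.

```python
# IEEE 802.1D bridge group address used as the destination of standard BPDUs
STP_BRIDGE_GROUP_MAC = bytes.fromhex("0180c2000000")

# LLC header of a BPDU: DSAP 0x42, SSAP 0x42, control 0x03
STP_LLC_HEADER = b"\x42\x42\x03"


def is_stp_bpdu(frame: bytes) -> bool:
    """Return True if a raw Ethernet frame looks like a standard 802.1D BPDU."""
    if len(frame) < 17:
        return False  # too short to hold MACs, length field, and LLC header
    dst = frame[0:6]
    length_or_type = int.from_bytes(frame[12:14], "big")
    llc = frame[14:17]
    return (
        dst == STP_BRIDGE_GROUP_MAC
        and length_or_type <= 1500  # 802.3 length field, not an EtherType
        and llc == STP_LLC_HEADER
    )
```

If no captured frame on the NETMGT segment satisfies this check over a period longer than the spanning tree hello interval, that suggests standard BPDUs are not reaching the port.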



Relief/Workaround

Disconnect the cable that connects the NETMGT port to the external network.



Product
Sun Fire B1600 Blade System Chassis

Internal Comments

Ref: CR# 6272982 : B1600 SSC1/SWT NETMGT port could not reply to a ping request

esc#1-13264431

Radiance case#10993486

radiance task#22213763


B1600, NETMGT, packet, loss.
Previously Published As
89898

Change History
Date: 2007-07-05
User Name: 97961
Action: Approved
Comment: - Changed title to comply with the standard format
- Converted to STM formatting for better readability
- Applied trademarking where it is missing
Version: 5
Date: 2007-07-05
User Name: 97961
Action: Accept
Comment:
Version: 0
Date: 2007-07-04
User Name: 107050
Action: Approved
Comment: Useful background information.
Version: 0
Date: 2007-07-04
User Name: 117923
Action: Approved
Comment: As far as I have observed, ping is not the trigger, and the loss of connectivity did not happen immediately after the ping started. What is observed is that connectivity is sometimes lost during periods of activity on the NETMGT port. The current assessment is that some kind of spanning tree related packets trigger this problem.
Version: 0
Date: 2007-07-04
User Name: 107050
Action: Rejected
Comment: Was the pinging started just prior to "Nov 04 14:48:56" ??
Version: 0
Date: 2007-07-04
User Name: 107050
Action: Add Comment
Comment: To clarify one point of this very broad problem:

You said "pinging NETMGT port at regular intervals of 1 second" triggered the problem and that can be seen in the output of "sc>showlogs ssc0"

Was the pinging initiated just prior to the "Network access failure"?
In other words, was the pinging started at about "Nov 04 14:48:56"?

This would help to clarify the cause/effect of that particular test.
Again, I understand the problem is broader and includes STP interactions as another cause of these faults.
Version: 0
Date: 2007-07-04
User Name: 107050
Action: Accept
Comment:
Version: 0
Date: 2007-06-21
User Name: 117923
Action: Approved
Comment: Please do tech review
Version: 0
Date: 2007-06-21
User Name: 107050
Action: Rejected
Comment: Please add escalation/bugs/cases in additional info section
Version: 0
Date: 2007-06-21
User Name: 107050
Action: Accept
Comment:
Version: 0
Date: 2007-06-20
User Name: 117923
Action: Approved
Comment: Sending this for technical review
Version: 0
Date: 2007-06-20
User Name: 117923
Action: Created
Comment:
Version: 0
Product_uuid
10bec5e4-5865-11d6-9ffc-c65b6cd3fd7d|Sun Fire B1600 Blade System Chassis

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.