Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1002213.1
Update Date:2009-09-14
Keywords:

Solution Type  Troubleshooting Sure

Solution  1002213.1 :   Simba, PCI Controller, SERR Fault Isolation on the Netra [TM] ct 410/810  


Related Items
  • Sun Netra CT410 Server
  • Sun Netra CT810 Server
Related Categories
  • GCS>Sun Microsystems>Servers>NEBS-Certified Servers

PreviouslyPublishedAs
203129


Description

This document describes how to isolate SERR panics reported by the Simba PCI controller on Netra[TM] ct 410/810 systems.



Steps to Follow

An SERR results from:
- Address parity errors. A parity error detected during an address cycle is serious because no device knows which one should respond, so every device that sees the parity error asserts SERR.
- Data parity errors during special cycles.
- Critical errors other than parity errors, e.g. data trapped in a bridge with no way to inform the sender that it is being dropped.

The debug driver works by walking the device tree down from the nexus driver that receives the SERR interrupt. At each node it extracts the common PCI status and command registers. For nodes that it recognizes, it extracts more information; for example, for a PCI-to-PCI bridge chip it reads the registers from both sides as well as the chip-specific error registers.
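
The walk described above can be sketched as follows. This is only an illustration of the traversal order, not the debug driver's code: the Node class, its field names, and the output format are hypothetical, and any register values passed in are supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical device-tree node (not a Solaris kernel structure)."""
    name: str
    status: int                        # common PCI status register
    command: int                       # common PCI command register
    sec_status: Optional[int] = None   # set only for PCI-to-PCI bridges
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> List[str]:
    """Depth-first walk starting at the node that took the SERR interrupt."""
    pad = "  " * depth
    lines = [f"{pad}{node.name}: status={node.status:#06x} command={node.command:#06x}"]
    if node.sec_status is not None:
        # Recognized bridge: also report the register from the secondary side.
        lines.append(f"{pad}{node.name}: sec_status={node.sec_status:#06x}")
    for child in node.children:
        lines.extend(walk(child, depth + 1))
    return lines
```

Calling walk() on the root node returns one line per node, plus one extra line for each bridge, indented by depth in the tree.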

The following messages file snippet is annotated with comments preceded by ###:

simba: WARNING: simba-0: Simba fault log start:
simba: WARNING: simba-0: primary err (42a0):
simba: signalled system error.
simba: WARNING: simba-0: sec err (280):
### Secondary side shows no received system error, so simba-0 looks fine
simba: simba-0: PCI fault log end.
simba: WARNING: simba-1: Simba fault log start:
simba: WARNING: simba-1: primary err (42a0):
simba: signalled system error.
simba: WARNING: simba-1: sec err (4280):
simba: received system error.
simba: simba-1: PCI fault log end.
### Not a bridge error, so the fault is on the lower-level bus
pcipsy: [ID 831440 kern.warning] WARNING: pcipsy-0: PCI fault log start:
pcipsy: [ID 303176 kern.notice] PCI SERR
pcipsy: [ID 267043 kern.notice] bytemask=c
pcipsy: [ID 607383 kern.warning] WARNING: pcipsy-0: PCI primary error (0):
pcipsy: [ID 259679 kern.warning] WARNING: pcipsy-0: PCI secondary error (0):
pcipsy: [ID 467665 kern.warning] WARNING: pcipsy-0: PBM AFAR 1ff.a000120a:    
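
The hex values in parentheses can be checked by hand. The sketch below assumes they are the standard 16-bit PCI status register (primary side) and the bridge secondary status register (secondary side); the snippet does not state this, but the decoded bits agree with the text the driver prints. The function name decode is hypothetical.

```python
# Error-related bits of the standard 16-bit PCI status register.
STATUS_BITS = {
    15: "detected parity error",
    14: "signalled system error",
    13: "received master abort",
    12: "received target abort",
    11: "signalled target abort",
    8: "master data parity error",
}


def decode(value, secondary=False):
    """Return the names of the error bits set in a status value."""
    names = dict(STATUS_BITS)
    if secondary:
        # On the secondary side of a PCI-to-PCI bridge, bit 14 reports
        # that SERR# was seen on the secondary bus.
        names[14] = "received system error"
    return [names[bit] for bit in sorted(names, reverse=True) if value & (1 << bit)]


# Values taken from the snippet above.
for name, primary, sec in [("simba-0", 0x42A0, 0x280), ("simba-1", 0x42A0, 0x4280)]:
    print(name, decode(primary), decode(sec, secondary=True))
```

Only simba-1 decodes to a received system error on its secondary side, which is why the analysis moves to the bus below simba-1.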

Use the mdb $<msgbuf macro and prtconf -pv to determine which device corresponds to address 1ff.a000120a.
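
Once the address ranges have been collected by hand from the prtconf -pv output, the comparison can be sketched as below. Two assumptions are made here and are not confirmed by this document: the AFAR is printed as a high word, a dot, and a low 32-bit word, and each device can be described by a (name, base, size) tuple. The function names are hypothetical, and no real address ranges for this platform are given.

```python
def parse_afar(text):
    """Combine 'high.low' hex halves into one integer address.

    Assumption: the low half is a 32-bit word, as the eight hex digits
    in '1ff.a000120a' suggest.
    """
    high, low = text.split(".")
    return (int(high, 16) << 32) | int(low, 16)


def find_owner(address, ranges):
    """Return the first name whose [base, base + size) range holds address."""
    for name, base, size in ranges:
        if base <= address < base + size:
            return name
    return None


print(hex(parse_afar("1ff.a000120a")))
```

The ranges argument is whatever list of tuples you build from the device properties. find_owner() returns None when no range contains the address, which usually means a range was missed or the format assumption is wrong.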

Architecture Description:

The Netra[TM] ct 410/810 CPU board contains an UltraSPARC[TM] IIe processor, which communicates with the Sun[TM] Advanced PCI Bridge (Simba) over a 66 MHz PCI bus. The Simba connects to the I/O devices through two 33 MHz PCI buses, A and B. The A bus (simba-1) connects to the PCI-to-PCI Drawbridge, which connects to the CPCI backplane. The B bus (simba-0) connects to the CPU board's internal I/O devices, which include the two RIO ICs (Ethernet/USB controllers), the PCI Mezzanine Card, and the 53C876 SCSI controller. RIO-A connects to EBUS-A, which connects to the OBP, NVRAM, TOD clock, and other devices. RIO-B connects to EBUS-B, which connects to the I2C controller and other devices.
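
For quick reference, the topology above can be written down as a small lookup table that maps a Simba instance from the fault log to the devices behind it. The table contents come from the paragraph above; the dictionary layout and the function name suspects are hypothetical.

```python
# Devices behind each Simba instance, as described in the text above.
TOPOLOGY = {
    "simba-1": {
        "bus": "A",
        "devices": ["PCI-to-PCI Drawbridge (to the CPCI backplane)"],
    },
    "simba-0": {
        "bus": "B",
        "devices": [
            "RIO-A (EBUS-A: OBP, NVRAM, TOD clock)",
            "RIO-B (EBUS-B: I2C controller)",
            "PCI Mezzanine Card",
            "53C876 SCSI controller",
        ],
    },
}


def suspects(instance):
    """List the devices on the bus behind the given Simba instance."""
    return list(TOPOLOGY[instance]["devices"])


print(suspects("simba-1"))
```

In the example log, the received system error is on simba-1, so the candidates are the Drawbridge and whatever sits on the CPCI backplane behind it.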



Product
Netra CT 810 Server
Netra CT 410 Server

Makaha, Simba
Previously Published As
79038

Change History
Date: 2005-07-11
User Name: 31620
Action: Approved
Comment: Verified Metadata - ok
Verified Keywords - ok
Verified still correct for audience - currently set to contract
Audience left at contract as per FvF http://kmo.central/howto/FvF.html
Checked review date - currently set to 2006-06-22
Checked for TM - added [TM] where appropriate in title
Publishing under the current publication rules of 18 Apr 2005:


(1) The document is in the ARCHIVED state.
The AUDIENCE tag for archived documents should be set to "FREE".
If a document moves out of the ARCHIVED state and is
re-published to the current collections, the AUDIENCE tag
should be changed to "CONTRACT".


(2) The document deals with any of the products listed below
(one or more of the products is listed in the article properties
PRODUCT field). The AUDIENCE tag for these documents should be "FREE"

- StarOffice
- Sun Java Desktop System
- Sun Fortran / Sun Studio
- Sun Java Studio Creator
- Sun Java Studio Enterprise
- Sun Java Studio Mobility
- Sun Java Studio Standard
- Sun Cobalt
- Sun LX50 Server


Version: 11
Date: 2005-07-06
User Name: 31620
Action: Accept
Comment:
Version: 0
Date: 2005-07-06
User Name: 27596
Action: Approved
Comment: DonD,

Kool. Archived older doc.

This one rules.

One more time - off to
Final Review.

Regards, DonP
Version: 0
Date: 2005-07-06
User Name: 38000
Action: Approved
Comment: Archived old doc 79039 which this replaces
Version: 0
Date: 2005-06-23
User Name: 7058
Action: Rejected
Comment: Why is this document separate from doc ID 79039?
Both are talking about Simba fault isolation on Netra ct systems.
The faults may be slightly different, but wouldn't it make sense to have
one doc with the different faults listed and how to isolate them?
Please help me understand this before publishing.

Thanks,
nita
Version: 0
Date: 2005-06-22
User Name: 7058
Action: Accept
Comment:
Version: 0
Date: 2005-06-22
User Name: 27596
Action: Approved
Comment: The article has been improved by the explanation added
by DonD, from the Internal PCI SERR page.

Forwarding to Final Review.

R, DonP
Version: 0
Date: 2005-06-22
User Name: 27596
Action: Accept
Comment:
Version: 0
Date: 2005-06-22
User Name: 38000
Action: Unlock
Comment:
Version: 0
Date: 2005-06-21
User Name: 38000
Action: Approved
Comment: Added general comments regarding the definition of SERR & fault propagation.
Version: 0
Date: 2005-06-20
User Name: 7058
Action: Rejected
Comment: Hi Donald,

Please see William Watson's comments. I think he's working to obtain further
detail from a tech writer. Maybe we should wait for that info before publishing
this document? I hope I'm understanding his comments correctly ..they are:

"File 'Diagnosing PCI SERR errors.html' has been forwarded to tech-writer for additional detail."

Please contact William Watson for more info on this and once the info is added,
please resubmit to the workflow.

Thanks,
nita
Version: 0
Date: 2005-06-20
User Name: 7058
Action: Accept
Comment:
Version: 0
Date: 2005-06-20
User Name: 116519
Action: Add Comment
Comment: File 'Diagnosing PCI SERR errors.html' has been forwarded to tech-writer for additional detail.
Version: 0
Date: 2005-06-20
User Name: 27596
Action: Approved
Comment: The author has responded to "What is SIMBA", by
including PCI Controller in the asset's title.

Back again to Final Review.

Regards, Don
Version: 0
Date: 2005-06-20
User Name: 38000
Action: Approved
Comment: Clarify Simba
Version: 0
Date: 2005-06-20
User Name: 38000
Action: Rejected
Comment: revert
Version: 0
Date: 2005-06-20
User Name: 38000
Action: Accept
Comment:
Version: 0
Date: 2005-06-20
User Name: 38000
Action: Approved
Comment: Clarified Simba
Version: 0
Date: 2005-06-19
User Name: 7058
Action: Rejected
Comment: Hi Donald,

Is "Simba" a product codename? If so, before publishing, either change the
audience tag on this document to Internal Only instead of Contract, or remove
all reference to internal product codenames so that the document can be seen by
contract customers. Then, please resubmit to the workflow.

Thanks,
Nita
Version: 0
Date: 2005-06-19
User Name: 7058
Action: Accept
Comment:
Version: 0
Date: 2005-06-17
User Name: 27596
Action: Approved
Comment: I have discussed the prior recommended changes
with the author Don Drygalski.

In reviewing this version, I can see that those
recommendations have been satisfied, and I
personally think the article is ready for
Final Review.

Sending to Final Review.

R, Don

Version: 0
Date: 2005-06-17
User Name: 27596
Action: Accept
Comment:
Version: 0
Date: 2005-06-09
User Name: 38000
Action: Approved
Comment: Updated due to previous comments
Version: 0
Date: 2005-03-24
User Name: 116519
Action: Rejected
Comment: Don,

Review voyager supplied email for requested modifications.
Version: 0
Date: 2005-03-24
User Name: 116519
Action: Add Comment
Comment: Don,

Your problem statement is as follows:
'This document indicates how to isolate some Simba panics on the Netra CT-410 system'; however, the problem statement should state the types and/or descriptions of the panics for the Netra ct410 and Netra ct810 systems.

Suggestion:
How can one isolate the following Simba panics on a Netra ct 410 or Netra ct810 system:
1. 'panic 1' , or
2. 'panic 2', or
.
.
.
n. 'panic n'.

Ancillary:
1. Place your architecture description at the base of the document. It does not address specifically how to reduce panics in question.

2. Also state where the warnings and errors are located.
3. Cite keywords.
4. What are all platforms specific to the characteristics of these types of errors? You only have ct410 listed in the initial problem statement.
Lastly,
5. I would encourage you to state where the snippets come from (for example, from the console or from the messages files). This will ensure that an individual faced with this type of problem will better understand how to diagnose and map this infodoc to his/her problem.

.....> Good Start <....
Version: 0
Date: 2005-03-24
User Name: 116519
Action: Accept
Comment:
Version: 0
Date: 2005-03-16
User Name: 116819
Action: Unlock
Comment:
Version: 0
Date: 2005-03-16
User Name: 116819
Action: Accept
Comment:
Version: 0
Date: 2005-03-11
User Name: 38000
Action: Approved
Comment: Original submission
Version: 0
Date: 2004-11-04
User Name: 38000
Action: Created
Comment:
Version: 0
Product_uuid
a70f8b46-058a-4c49-b283-9709eb6ef37d|Netra CT 810 Server
bf89fb96-99cf-4e7b-8681-152c788171f7|Netra CT 410 Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.