Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution

1010407.1: DTAG Parity Error Troubleshooting and Analysis

Previously Published As: 214288

Applies to:
Sun Enterprise 3000 Server
Sun Enterprise 3500 Server
Sun Enterprise 4000 Server
Sun Enterprise 4500 Server
Sun Enterprise 5000 Server
All Platforms

Goal

Description:
This document describes how to analyze DTAG parity error events on Sun Enterprise 3x00/4x00/5x00/6x00 (aka Classic) Servers and determine whether a replacement action is necessary.

Examples:
A DTAG parity error event is often visible only on the system console (sometimes called the console log, since it is often logged on a console server) and is usually seen within Fatal Reset output.

An example from console log data is below:

17-OCT-2001 17:07:55.17 LBC5 Fatal Reset

In the example above, the DTAG parity error occurred on System Board 2. The FIX section of this article explains this event in further detail.

A DTAG error looks like the following in prtdiag:

AC: UPA Port B Dtag Parity Error

NOTE: The port can also be Port A, for example:

AC: UPA Port A Dtag Parity Error

Once you have determined that your event matches what has been described above, proceed to the FIX section of this article to resolve the event.

Solution

What is a DTAG parity error?

The event DT_PERR indicates a Duplicate Tag SRAM (DTAG) parity error. These DTAG SRAMs reside on CPU/Memory boards in Sun Enterprise 3x00/4x00/5x00 (Classic) Servers. DTAGs are duplicates of the CPU's ETAGs on the system board.

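As a rough illustration, the two prtdiag message forms quoted in this article can be picked out of saved prtdiag output with a short script. This is a minimal sketch, not a Sun or Oracle tool: the function name `find_dtag_ports` is hypothetical, and the pattern assumes the message text appears exactly as quoted here.

```python
import re

# Matches the two message forms quoted in this article:
#   AC: UPA Port A Dtag Parity Error
#   AC: UPA Port B Dtag Parity Error
# Assumption: real prtdiag output uses exactly this wording.
DTAG_RE = re.compile(r"AC:\s+UPA Port ([AB]) Dtag Parity Error")


def find_dtag_ports(text):
    """Return the UPA port letter ('A' or 'B') for each DTAG parity error in text."""
    return DTAG_RE.findall(text)


if __name__ == "__main__":
    sample = (
        "AC: UPA Port B Dtag Parity Error\n"
        "AC: UPA Port A Dtag Parity Error\n"
    )
    print(find_dtag_ports(sample))  # ['B', 'A']
```

An empty list means no line matched the quoted signature, in which case the event may not be a DTAG parity error as described in this article.
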
Notes about troubleshooting DTAG errors:

DTAG errors are usually caused by bit flips in DTAG SRAM, which is located on the system board. The same issues that cause bit flips in memory (alpha particles, handling, and environmental conditions) cause bit flips in DTAG SRAM.

Repair vendor testing of system boards that logged DTAG parity errors shows that more than 90% of the time these errors are transient and never occur again. For this reason, Oracle's Best Practices (originally "Sun's Best Practices") dictate that a system board should not be replaced after a first DTAG parity error, and should be replaced only on a second occurrence of the same error on that board within the past 6 months.

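The replacement rule in this section (replace the board on a second occurrence of the same error on the same board within the past 6 months, otherwise leave it in place) can be sketched as follows. This is only an illustration under stated assumptions: the function name `should_replace_board` is hypothetical, "6 months" is approximated as 183 days, and the caller is assumed to supply only the dates of earlier occurrences of the same error on the same system board.

```python
from datetime import date, timedelta

# Assumption: the article's "6 months" is approximated as 183 days.
SIX_MONTHS = timedelta(days=183)


def should_replace_board(previous_errors, new_error):
    """Decide whether a board should be replaced after a new DTAG parity error.

    previous_errors: dates of earlier occurrences of the same error on the
                     same system board (empty if this is a first error).
    new_error:       date of the error being analyzed.
    """
    # Replace only if an earlier occurrence falls within the 6-month window.
    return any(
        timedelta(0) <= new_error - earlier <= SIX_MONTHS
        for earlier in previous_errors
    )


if __name__ == "__main__":
    event = date(2001, 10, 17)
    print(should_replace_board([], event))                  # False: first error
    print(should_replace_board([date(2001, 6, 1)], event))  # True: repeat within window
```

A first error, or a repeat that falls outside the window, returns False, which matches the guidance that such boards should stay in service.
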
From the example in the GOALS section: the DTAG parity error occurred on System Board 2. The DTAG memory that suffered a bit flip was associated with CPU location 1 (DT_PERRB). If this was the second occurrence of the same error on the system board in the past 6 months, system board 2 should be replaced. But if this was a first error, Best Practice dictates that the board should not be replaced.

Additional Information:

One leading cause of DTAG errors is environmental factors. A good environmental resource is Document 1011650.1, Sun Enterprise[TM] 3X00-6X00 Servers: Board Temperature Information.

Previously Published As 40760

Attachments: This solution has no attachment.