Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1002133.1
Update Date:2010-07-06
Keywords:

Solution Type  Problem Resolution Sure

Solution  1002133.1 :   Sun Fire[TM] Midrange Servers: Fast Data Access MMU Miss when issuing probe-scsi-all or boot  


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E4900 Server
  • Sun Storage D240 (StorEdge) Media Tray
  • Sun Fire 4800 Server
  • Sun Fire E2900 Server
  • Sun Fire V1280 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - Other

PreviouslyPublishedAs
203034


Symptoms
In a specific configuration, a Sun Fire[TM] domain may encounter a "Fast Data Access MMU Miss" when the command probe-scsi-all or boot is issued at OBP (OpenBoot PROM).
This document was written for a specific customer configuration that exhibited the problem. The configuration included a Cauldron-S card (501-6635) in slot 0 of an I/O Board, with a D240 storage array attached to scsi port 0 of the card. The array contained the domain boot device.

It is not known at this time whether other I/O cards or storage arrays are affected by this issue. In this specific case, both probe-scsi-all and boot failed for the same reason.

In some configurations, the device path that fails in probe-scsi-all may differ from the device path used for booting. In that case, boot should not fail with the same errors as probe-scsi-all, and this document may still apply even though boot succeeds and only probe-scsi-all fails. See the "Additional Information" section for more details regarding the failure and identifying whether this document applies to your situation.

The problem appeared like this in the domain console when trying to boot:

 {10} ok boot /ssm@0,0/pci@18,700000/pci@1/scsi@2/disk@0,0:a
TL = 1, TT = 68.
TSTATE= 0x1402  asi = 0x0, pstate = 0x14, cwp = 0x2]
TPC= 00000000f0035664
TNPC= 00000000f0035668
 SFSR= 0000000000808008, TAGACCESS = 00000000ffffe000
D-SFAR = 00000000ffffffff
TICK= 80000021216428ec, TICKCMP = ffffffffffffffff 

You may see the full or partial error message "Fast Data Access MMU Miss" within the above output.

Partial output of this error appeared like this when issuing a probe-scsi-all:

 {10} ok probe-scsi-all
 /ssm@0,0/pci@18,700000/pci@1/scsi@2,1
 /ssm@0,0/pci@18,700000/pci@1/scsi@2
TL = 1, TT = 68. ata Access MMU Miss
TSTATE= 0x1404 r = 0x0, asi = 0x0, pstate = 0x14, cwp = 0x4]
TPC= 00000000f0035664
TNPC= 00000000f0035668
SFSR= 0000000000808008, TAGACCESS = 00000000ffffe000
D-SFAR = 00000000ffffffff
TICK= 800000214f848d51, TICKCMP = ffffffffffffffff
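
Because the message text can be partly overwritten on the console, as in the probe-scsi-all output above, a captured console log can be scanned for the trap line as well as for the message itself. The following is a minimal sketch only. It assumes that "TT = 68" accompanies this error, as it does in both dumps shown in this document, and the function name find_mmu_miss is hypothetical.

```python
import re

# Signature strings taken from the console output shown in this document.
MESSAGE = "Fast Data Access MMU Miss"
# Matches lines such as "TL = 1, TT = 68." and captures the TT value.
TRAP_LINE = re.compile(r"TL\s*=\s*(\d+),\s*TT\s*=\s*([0-9a-fA-F]+)")


def find_mmu_miss(log_text):
    """Return a list of (line_number, reason) hits found in a console log."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = TRAP_LINE.search(line)
        if match and match.group(2) == "68":
            # Assumption: TT = 68 marks this trap, as in the dumps above.
            hits.append((number, "trap dump with TT = 68"))
        elif MESSAGE in line:
            hits.append((number, "full error message"))
    return hits
```

A hit on the trap line alone is only an indication. The configuration checks in the "Resolution" section are still needed to confirm that this document applies.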


Resolution
At this time, the issue is known to have occurred at only one customer site, in a specific configuration. This document applies to failures of probe-scsi-all or boot when an I/O card is located in slot 0 of an I/O Board. As stated earlier, boot fails for this reason only if the boot device is attached to that same I/O card in slot 0.
At this time, the only I/O adapter reported to have this issue is the Cauldron-S (501-6635), and the issue has occurred only when a storage array (D240) was attached to scsi port 0 of the card (top port; device path is "scsi@2" in probe-scsi-all).

If you encounter a similar failure with a different I/O card, please contact Sun Support Services with the failure details. The CR and this SRDB will then need to be updated with the new failure symptoms and information.

The process below confirms whether this document and CR 6238924 apply to your failure. It shows how to reproduce the issue and how to test the workarounds. If the issue cannot be reproduced, or the workarounds do not resolve it, then this document and CR may not apply to your situation.

Please make sure that this process is executed in the domain console and that console logging is enabled so this information can be sent in to Sun Support Services for documentation purposes.

Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to the document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

Process:

1) I/O card is located in I/O Board slot 0 (the I/O Board number is not relevant). Card scsi port 0 is attached to a scsi device (this assumes a Cauldron-S card with a D240 array attached).

   - It is not known whether the Cauldron-S is the only card susceptible to this, but at this time it is the only adapter reported to have this issue.

2) Setkeyswitch on.

3) At OBP set diag-switch to true.

   - This will show "probing for device" information as the domain exits POST and enters OBP. This is useful troubleshooting information if needed.

4) Issue probe-scsi-all (or boot if the device is the boot device).

   - If you get a "Fast Data Access MMU Miss" as shown in the "Symptoms" section, you have reproduced the issue.

5) Setkeyswitch standby or off.

6) Setkeyswitch on.

7) Issue reset-all.

8) Issue probe-scsi-all (or boot if the device is the boot device; the device path is unchanged at this step).

   - If errors persist, your issue is different from the one described in this document and CR. Perhaps you have bad hardware (card, cable, etc.).
   - If this works, Workaround #3 is confirmed.

9) Setkeyswitch standby/off.

10) Disconnect the cable from scsi interface 0 and attach it to scsi interface 1 on the I/O card (assumes Cauldron-S; SCSI 0 is top port, SCSI 1 is bottom port).

11) Setkeyswitch on

12) Issue probe-scsi-all (or boot if the boot device, but note that the device path has changed).

   - If errors persist, your issue is different from the one described in this document and CR. Perhaps you have bad hardware (card, cable, etc.).
   - If this works, Workaround #1 is confirmed.

13) Setkeyswitch standby/off.

14) Move the I/O card to an empty slot on the same I/O Board. Attach the array to either scsi port (or repeat this later to test each port).

15) Setkeyswitch on.

16) Issue probe-scsi-all (or boot if the boot device, but note that the device path has changed).

   - If errors persist, your issue is different from the one described in this document and CR. Perhaps you have bad hardware (card, cable, etc.).
   - If this works, Workaround #2 is confirmed.

At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required.  Contact Sun Support.
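
The outcome of the process above can be summarized as a small decision function. This is an illustrative sketch, not a Sun tool. The name interpret_results and its arguments are hypothetical, and it assumes that a persisting error in any of steps 8, 12, or 16 means the document does not apply, following the notes under those steps.

```python
def interpret_results(reproduced, reset_all_ok, other_port_ok, other_slot_ok):
    """Map the outcomes of steps 4, 8, 12 and 16 to a conclusion.

    reproduced    -- step 4 showed "Fast Data Access MMU Miss"
    reset_all_ok  -- step 8 worked after reset-all (Workaround #3)
    other_port_ok -- step 12 worked on scsi interface 1 (Workaround #1)
    other_slot_ok -- step 16 worked in another slot (Workaround #2)

    Returns (applies, confirmed) where confirmed lists workaround numbers.
    """
    confirmed = []
    if reproduced:
        if other_port_ok:
            confirmed.append(1)
        if other_slot_ok:
            confirmed.append(2)
        if reset_all_ok:
            confirmed.append(3)
    # Assumption: every test must pass for the document and CR to apply.
    applies = reproduced and len(confirmed) == 3
    return applies, confirmed
```

For example, interpret_results(True, True, False, True) returns (False, [2, 3]): the error persisted in step 12, so a different cause such as bad hardware should be suspected.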



Relief/Workaround

After implementing one of these workarounds, the probe-scsi-all and boot commands will function normally. Remember that boot fails only when the boot device uses the same scsi device path that fails in probe-scsi-all.

The workarounds are:

1) Leave the I/O card in slot 0 on the I/O Board but attach the array to the other scsi interface on the card (assumes Cauldron-S card which has two interfaces). This means the array would now use scsi interface 1 on the card (Device path "scsi@2,1"; Located on the bottom port of the card).

2) Move the I/O card to a different slot on the I/O Board. The problem has been reported only when the card is in slot 0 and card port 0 is used. It has not been reported when the card is located in other I/O Board slots (regardless of the card port used).

 NOTE:  Options 1 and 2 change the I/O device path for the array, so you will have to adjust any device paths within the boot-device alias as well as in the Solaris configuration.

3) Leave the I/O card in slot 0 and the array attached to scsi port 0 (the "bad" port). Issue a reset-all before each probe-scsi-all or boot command, as follows:

    1) reset-all
    2) probe-scsi-all
    3) reset-all <--- If left out, the boot will fail with the same symptoms.
    4) boot
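
The rule in workaround 3 can be checked against a written command plan before it is typed at the ok prompt. The sketch below is illustrative only. It does not talk to OBP, and the name missing_resets is hypothetical. It simply reports any probe-scsi-all or boot that is not immediately preceded by reset-all.

```python
# Commands that workaround 3 says must follow a reset-all.
GUARDED = {"probe-scsi-all", "boot"}


def missing_resets(commands):
    """Return indexes of guarded commands not directly preceded by reset-all."""
    missing = []
    previous = None
    for index, command in enumerate(commands):
        words = command.split()
        if not words:
            continue  # ignore blank entries in the plan
        name = words[0]  # "boot <device path>" counts as "boot"
        if name in GUARDED and previous != "reset-all":
            missing.append(index)
        previous = name
    return missing
```

For the sequence reset-all, probe-scsi-all, boot, the function returns [2], which flags the boot that is missing its own reset-all.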

See the "Additional Information" section for details on the workarounds.
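
As the NOTE for options 1 and 2 states, the device path changes after those workarounds. For option 1, the probe-scsi-all output in this document lists both "scsi@2" and "scsi@2,1" under the same parent path, so the new boot path can be sketched as a single component swap. This is an assumption based on that output, and the name move_to_port_1 is hypothetical. The path for option 2 depends on the new slot and is not given in this document, so it is not covered here.

```python
def move_to_port_1(device_path):
    """Rewrite a device path from card scsi port 0 (scsi@2) to port 1 (scsi@2,1)."""
    parts = device_path.split("/")
    # Only an exact "scsi@2" component is replaced; "scsi@2,1" is left alone.
    changed = ["scsi@2,1" if part == "scsi@2" else part for part in parts]
    if changed == parts:
        raise ValueError("no scsi@2 component in path: " + device_path)
    return "/".join(changed)
```

Applied to the boot path from the "Symptoms" section, this yields /ssm@0,0/pci@18,700000/pci@1/scsi@2,1/disk@0,0:a. Confirm the result against probe-scsi-all on the system before changing boot-device or the Solaris configuration.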



Additional Information
If you have reproduced the problem and confirmed the workaround, please follow the directions in the "Resolution" section of this document to log a Sun Support Services request and have the CR re-opened for further investigation.
Use one of the workarounds provided in the "Relief/Workaround" section of this document while a permanent resolution to the problem is being developed.


Product
Sun Fire V1280 Server
Sun Fire E6900 Server
Sun Fire E4900 Server
Sun Fire E2900 Server
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Sun StorageTek D240 Media Tray

Internal Comments

Link to Cr 6238924

Cr#<SUNBUG: 6238924>

Cauldron, probe-scsi-all, reset-all, boot, Fast Data Access MMU Miss, IO, slot, scsi, normalized

Change History
Date: 2009-11-25
User Name: Josh Freeman
Action: Refreshed
Comment: No changes - doc good as is. ESG Content Team update.
Date: 2008-11-19
User Name: T230884
Action: Quality Review
Date: 2007-03-12
User Name: 7058
Action: Approved
Comment: Keywords and disclaimer only.
OK to republish.
Version: 9
Date: 2007-03-12
User Name: 67850
Action: Approved
Serengeti is a codename, not a product name in the tmark database. Please remove Serengeti. I think it's perfectly acceptable for Sun Fire to stand on its own.


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.