Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1008905.1
Update Date:2011-04-07
Keywords:

Solution Type  Problem Resolution Sure

Solution  1008905.1 :   Sun Fire[TM] 12K/15K/E20K/E25K: Safari Agent ID Cheat Sheet and Decoding CPU, Memory, and IOC Locations  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
212253


Applies to:

Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
Sun Fire 12K Server
All Platforms

Symptoms

Solaris[TM] device paths and messaging reference the Safari Agent ID of a given component (in /var/adm/messages, console logs, core files, OBP probing, etc.).

Cause

In the case of a hardware error or problem, mapping the Agent ID to a location allows us to determine where the suspect component is physically located. Ultimately, by correctly mapping this Agent ID to a physical location, we know that we are servicing the right component to resolve a hardware problem. An incorrect mapping could result in replacing and/or servicing the wrong component and could cause further outages or problems on the platform.

NOTE: Only a certified Sun Service Engineer is authorized to service the Sun Fire[TM] 12K/15K/E20K/E25K platform. If you need help resolving a hardware problem on this platform, please open a Service Request with Sun Support Services so that an engineer can assist in troubleshooting and resolving the problem.

This document provides the Safari Agent ID cheat sheets for the Sun Fire[TM] 12K/15K/E20K/E25K platform. It explains how to correctly decode an Agent ID to the physical location within this platform.

Solution

Determining the physical location from a Safari Agent ID is a two-step process:
  1. Identify the Safari Agent ID in the message or device path in question.
  2. Use the Cheat Sheets at the bottom of this document to translate the Agent ID to a physical location.

Some example device paths are shown below. In each one, the Safari Agent ID is the hexadecimal value between the "@" and the following comma (62, 0, 5c, 7f, and 15d, respectively):

    /SUNW,UltraSPARC-III@62,0
    /memory-controller@0,400000
    /pci@5c,700000
    /address-extender-queue@7f,0
    /SUNW,wci@15d,0:SUNW,wci-rsm
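
As an illustration of step 1, the following is a minimal Python sketch that pulls the Agent ID out of a device path. It assumes the Agent ID is the hexadecimal value between the first "@" and the comma that follows it, as in the example paths above. The helper name `extract_agent_id` is hypothetical and is not part of any Sun tool.

```python
import re

# Assumption: the Agent ID is the hex value between the first "@" and the
# comma that follows it, as in the example device paths in this document.
_AGENT_ID_RE = re.compile(r"@([0-9a-fA-F]+),")


def extract_agent_id(device_path: str) -> int:
    """Return the Safari Agent ID found in a device path as an integer."""
    match = _AGENT_ID_RE.search(device_path)
    if match is None:
        raise ValueError(f"no Agent ID found in {device_path!r}")
    return int(match.group(1), 16)


if __name__ == "__main__":
    for path in (
        "/SUNW,UltraSPARC-III@62,0",
        "/memory-controller@0,400000",
        "/pci@5c,700000",
        "/address-extender-queue@7f,0",
        "/SUNW,wci@15d,0:SUNW,wci-rsm",
    ):
        # Print the Agent ID in hex, the form used in the cheat sheets.
        print(f"{path} -> {extract_agent_id(path):x}")
```

The returned integer can be formatted as hexadecimal or decimal, whichever form is more convenient when reading the cheat sheets.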

For PCI-reported ECC errors, such as the uncorrectable error (UE) in the example below, it is useful to know the location of the detecting I/O controller (IOC). The Agent ID can be determined from the event message.

Example event:

Jan  7 07:12:13  host pcisch: [ID 284024 kern.warning] WARNING: uncorrectable error detected by pci0 (safari id 00000000.0000001c) during
Jan 7 07:12:13 host DVMA read transaction
Jan 7 07:12:13 host pcisch: [ID 475334 kern.info] Transaction was a block operation.
Jan 7 07:12:13 host pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 000001a0.9a550400, owned_in not asserted.
Jan 7 07:12:13 host pcisch: [ID 144388 kern.info] AFSR=48000000.1c000071 AFAR=000001a0.9a550400,
Jan 7 07:12:13 host quad word offset 00000000.00000000, Memory Module SB13/P0/B0 J13300 J13400 J13500 J13600 id 444.
Jan 7 07:12:13 host pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
Jan 7 07:12:13 host pcisch: [ID 308334 kern.info] secondary error from DVMA read transaction
Jan 7 07:12:13 host unix: [ID 836849 kern.notice]
Jan 7 07:12:13 host ^Mpanic[cpu417]/thread=3002663b9c0:
Jan 7 07:12:13 host unix: [ID 261965 kern.notice] Fatal PCI UE Error

1) The Agent ID of the implicated IOC is reported in decimal at the end of this message line:

Jan 7 07:12:13 host quad word offset 00000000.00000000, Memory Module SB13/P0/B0 J13300 J13400 J13500 J13600 id 444.

Converted to hexadecimal, 444 is 1BC, which is the Agent ID.

2) In the NON-CPU Board ASICs table at the bottom of this document, 1BC (shown there as 1bc) corresponds to IO Board 13, PCI0.
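
The decimal-to-hexadecimal conversion in this example can be checked with a short Python sketch. It assumes the decimal ID is the number that follows "id" at the very end of the "Memory Module" line, as in the event shown above. The function name `agent_id_from_message` is hypothetical.

```python
import re

EXAMPLE_LINE = (
    "Jan 7 07:12:13 host quad word offset 00000000.00000000, "
    "Memory Module SB13/P0/B0 J13300 J13400 J13500 J13600 id 444."
)


def agent_id_from_message(line: str) -> str:
    """Return the Agent ID (lowercase hex) from a 'Memory Module ... id N.' line.

    Assumption: the decimal ID is the last field on the line, written as
    'id <digits>.' exactly as in the example event in this document.
    """
    match = re.search(r"\bid (\d+)\.\s*$", line)
    if match is None:
        raise ValueError("no trailing 'id <decimal>.' field found")
    return format(int(match.group(1)), "x")


if __name__ == "__main__":
    print(agent_id_from_message(EXAMPLE_LINE))  # 444 decimal is 1bc hex
```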

NOTES:

  • Make sure you know which type of System Board (SB) you are decoding. Output from the System Management Services (SMS) showboards command, the domain POST logs, and the boot messages all show which type of SB is installed on the platform:

    • USIII shows up as board type CPU

    • USIV shows up as board type V3CPU

  • MaxCPU boards are only supported in Sun Fire[TM] 12K/15K platforms and are USIII boards.

  • The memory-controller is the CPU module on a Sun Fire[TM] 12K/15K/E20K/E25K System Board.

  • A memory-controller device path has a different Memory Offset depending on which DIMM bank is implicated:

    • 400000 for DIMM bank 0.

    • 600000 for DIMM bank 1.

  • In the tables below, the number outside the parentheses is the Safari Agent ID in hexadecimal (also called the Hexadecimal ID). The number inside the parentheses is the same ID in decimal (the Decimal ID).
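
The Memory Offset note above can be sketched in Python as a small lookup. This is only an illustration: the names `BANK_BY_OFFSET` and `decode_memory_controller` are hypothetical, and the sketch recognizes only the two offsets listed in the notes.

```python
# Memory Offset to DIMM bank mapping, taken from the notes above.
BANK_BY_OFFSET = {0x400000: 0, 0x600000: 1}


def decode_memory_controller(device_path: str) -> tuple[int, int]:
    """Split a memory-controller path into (agent_id, dimm_bank).

    Assumption: the path ends in '@<agent-id-hex>,<offset-hex>', for
    example '/memory-controller@2,600000'.
    """
    unit_address = device_path.rsplit("@", 1)[1]
    agent_hex, offset_hex = unit_address.split(",", 1)
    offset = int(offset_hex, 16)
    if offset not in BANK_BY_OFFSET:
        raise ValueError(f"unrecognized Memory Offset {offset_hex!r}")
    return int(agent_hex, 16), BANK_BY_OFFSET[offset]


if __name__ == "__main__":
    agent_id, bank = decode_memory_controller("/memory-controller@2,600000")
    print(f"Agent ID {agent_id:x}, DIMM bank {bank}")
```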

EXAMPLES:

1) Decoding an UltraSPARC III CPU:

    /SUNW,UltraSPARC-III@1e9,0
                         ^^^
                         Agent ID

Using the USIII Table below, Agent ID 1e9 is MaxCPU 1, located on Expander 15.

2) Decoding a memory bank:

    /memory-controller@2,600000
                       ^ ^^^^^^
                       | Memory Offset
                       Agent ID

Using the USIII Table below (after first confirming that the board type is USIII), Agent ID 2 is CPU2 on SB0.

The Memory Offset of 600000 means the implicated DIMM bank is bank 1.

3) Decoding an UltraSPARC IV CPU:

    /SUNW,UltraSPARC-IV@1c4,0
                        ^^^
                        Agent ID

Using the USIV Table below, Agent ID 1c4 is CORE 0.1 on Expander 14.
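
The CPU examples above can also be reproduced with a short Python sketch. It relies on a pattern inferred from the cheat-sheet tables below, not on an official specification: each expander appears to own a block of 32 (0x20) Agent IDs, and the remainder selects the CPU, MaxCPU, or core. The function name `decode_cpu` and the board-type labels "USIII" and "USIV" are hypothetical and are not the strings printed by showboards.

```python
def decode_cpu(agent_id: int, board_type: str) -> str:
    """Map a CPU Agent ID to a location, following the cheat-sheet tables.

    Assumption (inferred from the tables): agent_id = expander * 32 + port.
    board_type is "USIII" or "USIV"; confirm the real board type first.
    """
    expander, port = divmod(agent_id, 32)
    if not 0 <= expander <= 17:
        raise ValueError(f"expander {expander} is outside the tables (0-17)")
    if board_type == "USIII":
        if port in (0, 1, 2, 3):          # System Board CPUs
            return f"CPU {port} on Expander {expander}"
        if port in (8, 9):                # MaxCPU board CPUs
            return f"MaxCPU {port - 8} on Expander {expander}"
    elif board_type == "USIV":
        if 0 <= port <= 7:                # ports 4-7 are the second cores
            return f"CORE {port % 4}.{port // 4} on Expander {expander}"
    raise ValueError(f"Agent ID {agent_id:x} is not a {board_type} CPU ID")


if __name__ == "__main__":
    print(decode_cpu(0x1E9, "USIII"))
    print(decode_cpu(0x002, "USIII"))
    print(decode_cpu(0x1C4, "USIV"))
```

Because the same Agent ID decodes differently on USIII and USIV boards, the sketch requires the board type as an explicit argument rather than guessing it.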


UltraSPARC III (USIII)

      System Boards                           MaxCPU Boards
EXB#  CPU 0     CPU 1     CPU 2     CPU 3     MAX0      MAX1
0     0(0)      1(1)      2(2)      3(3)      8(8)      9(9)
1     20(32)    21(33)    22(34)    23(35)    28(40)    29(41)
2     40(64)    41(65)    42(66)    43(67)    48(72)    49(73)
3     60(96)    61(97)    62(98)    63(99)    68(104)   69(105)
4     80(128)   81(129)   82(130)   83(131)   88(136)   89(137)
5     a0(160)   a1(161)   a2(162)   a3(163)   a8(168)   a9(169)
6     c0(192)   c1(193)   c2(194)   c3(195)   c8(200)   c9(201)
7     e0(224)   e1(225)   e2(226)   e3(227)   e8(232)   e9(233)
8     100(256)  101(257)  102(258)  103(259)  108(264)  109(265)
9     120(288)  121(289)  122(290)  123(291)  128(296)  129(297)
10    140(320)  141(321)  142(322)  143(323)  148(328)  149(329)
11    160(352)  161(353)  162(354)  163(355)  168(360)  169(361)
12    180(384)  181(385)  182(386)  183(387)  188(392)  189(393)
13    1a0(416)  1a1(417)  1a2(418)  1a3(419)  1a8(424)  1a9(425)
14    1c0(448)  1c1(449)  1c2(450)  1c3(451)  1c8(456)  1c9(457)
15    1e0(480)  1e1(481)  1e2(482)  1e3(483)  1e8(488)  1e9(489)
16    200(512)  201(513)  202(514)  203(515)  208(520)  209(521)
17    220(544)  221(545)  222(546)  223(547)  228(552)  229(553)

UltraSPARC IV (USIV, Dual Thread CPU)

EXB#  CORE 0.0  CORE 0.1  CORE 1.0  CORE 1.1  CORE 2.0  CORE 2.1  CORE 3.0  CORE 3.1
0     0(0)      4(4)      1(1)      5(5)      2(2)      6(6)      3(3)      7(7)
1     20(32)    24(36)    21(33)    25(37)    22(34)    26(38)    23(35)    27(39)
2     40(64)    44(68)    41(65)    45(69)    42(66)    46(70)    43(67)    47(71)
3     60(96)    64(100)   61(97)    65(101)   62(98)    66(102)   63(99)    67(103)
4     80(128)   84(132)   81(129)   85(133)   82(130)   86(134)   83(131)   87(135)
5     a0(160)   a4(164)   a1(161)   a5(165)   a2(162)   a6(166)   a3(163)   a7(167)
6     c0(192)   c4(196)   c1(193)   c5(197)   c2(194)   c6(198)   c3(195)   c7(199)
7     e0(224)   e4(228)   e1(225)   e5(229)   e2(226)   e6(230)   e3(227)   e7(231)
8     100(256)  104(260)  101(257)  105(261)  102(258)  106(262)  103(259)  107(263)
9     120(288)  124(292)  121(289)  125(293)  122(290)  126(294)  123(291)  127(295)
10    140(320)  144(324)  141(321)  145(325)  142(322)  146(326)  143(323)  147(327)
11    160(352)  164(356)  161(353)  165(357)  162(354)  166(358)  163(355)  167(359)
12    180(384)  184(388)  181(385)  185(389)  182(386)  186(390)  183(387)  187(391)
13    1a0(416)  1a4(420)  1a1(417)  1a5(421)  1a2(418)  1a6(422)  1a3(419)  1a7(423)
14    1c0(448)  1c4(452)  1c1(449)  1c5(453)  1c2(450)  1c6(454)  1c3(451)  1c7(455)
15    1e0(480)  1e4(484)  1e1(481)  1e5(485)  1e2(482)  1e6(486)  1e3(483)  1e7(487)
16    200(512)  204(516)  201(513)  205(517)  202(514)  206(518)  203(515)  207(519)
17    220(544)  224(548)  221(545)  225(549)  222(546)  226(550)  223(547)  227(551)

NON-CPU Board ASICs

      Slot 1                        Expander
EXB#  PCI0      PCI1      WCI1      AXQ0      AXQ1
0     1c(28)    1d(29)    1d(29)    1e(30)    1f(31)
1     3c(60)    3d(61)    3d(61)    3e(62)    3f(63)
2     5c(92)    5d(93)    5d(93)    5e(94)    5f(95)
3     7c(124)   7d(125)   7d(125)   7e(126)   7f(127)
4     9c(156)   9d(157)   9d(157)   9e(158)   9f(159)
5     bc(188)   bd(189)   bd(189)   be(190)   bf(191)
6     dc(220)   dd(221)   dd(221)   de(222)   df(223)
7     fc(252)   fd(253)   fd(253)   fe(254)   ff(255)
8     11c(284)  11d(285)  11d(285)  11e(286)  11f(287)
9     13c(316)  13d(317)  13d(317)  13e(318)  13f(319)
10    15c(348)  15d(349)  15d(349)  15e(350)  15f(351)
11    17c(380)  17d(381)  17d(381)  17e(382)  17f(383)
12    19c(412)  19d(413)  19d(413)  19e(414)  19f(415)
13    1bc(444)  1bd(445)  1bd(445)  1be(446)  1bf(447)
14    1dc(476)  1dd(477)  1dd(477)  1de(478)  1df(479)
15    1fc(508)  1fd(509)  1fd(509)  1fe(510)  1ff(511)
16    21c(540)  21d(541)  21d(541)  21e(542)  21f(543)
17    23c(572)  23d(573)  23d(573)  23e(574)  23f(575)
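
The NON-CPU Board ASICs table follows a regular pattern that can be sketched in Python: the Agent ID divided by 32 (0x20) gives the EXB#, and the remainder (1c through 1f) selects the ASIC. This pattern is inferred from the table values rather than taken from a specification, and the names `ASIC_BY_PORT` and `decode_asic` are hypothetical. Because PCI1 and WCI1 share the same Agent ID in the table, the sketch reports both and leaves it to the reader to check which board is installed.

```python
# Remainder (Agent ID mod 0x20) to ASIC name, as read from the table above.
ASIC_BY_PORT = {
    0x1C: "PCI0",
    0x1D: "PCI1 or WCI1",   # the table lists the same ID for both
    0x1E: "AXQ0",
    0x1F: "AXQ1",
}


def decode_asic(agent_id: int) -> tuple[int, str]:
    """Return (EXB#, ASIC name) for a non-CPU Safari Agent ID."""
    expander, port = divmod(agent_id, 0x20)
    if not 0 <= expander <= 17:
        raise ValueError(f"expander {expander} is outside the table (0-17)")
    if port not in ASIC_BY_PORT:
        raise ValueError(f"Agent ID {agent_id:x} is not a non-CPU ASIC ID")
    return expander, ASIC_BY_PORT[port]


if __name__ == "__main__":
    # Agent IDs taken from the example device paths and the PCI UE event.
    for agent_id in (0x1BC, 0x5C, 0x7F, 0x15D):
        expander, asic = decode_asic(agent_id)
        print(f"{agent_id:x} ({agent_id}) -> EXB {expander}, {asic}")
```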



The following is strictly for the use of Sun employees:

SRDB(now Symptom Resolution) 48225 was merged with this document. That SRDB has been archived.
Safari Agent ID, Starcat, CPU, SPARC, cheat sheet, decoder, memory, dimm, processor, IOC, controller, memerr, location
Previously Published As 48142



Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.