Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1007933.1
Update Date: 2009-02-17
Keywords:

Solution Type  Problem Resolution Sure

Solution  1007933.1 :   Dynamic Reconfiguration (DR) failure on Sun Fire[tm] 4800 running Oracle  


Related Items
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire 4800 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Servers>Midrange Servers

Previously Published As
210942


Symptoms
Dynamic Reconfiguration (DR) failed while attempting to remove a known-good uniboard
that was being replaced under a Field Change Order.
Configuration: Sun Fire[TM] 4800 running Solaris[TM] 9 (KU 117171-05) and Oracle 9i, with firmware 5.18.1. The affected domain has two boards (SB0 and SB2).
For completeness: the platform also has an additional single-board domain, which was unaffected.
#
# cfgadm -c disconnect N0.SB2
cfgadm: Hardware specific failure: unconfigure N0.SB2: I/O error:
/ssm@0,0/memory-controller@b,400000
or, with verbose output:
# cfgadm -v -c disconnect N0.SB2
request delete capacity (4 cpus)
request delete capacity (1048576 pages)
request delete capacity N0.SB2 done
request offline SUNW_cpu/cpu8
request offline SUNW_cpu/cpu9
request offline SUNW_cpu/cpu10
request offline SUNW_cpu/cpu11
request offline SUNW_cpu/cpu8 done
request offline SUNW_cpu/cpu9 done
request offline SUNW_cpu/cpu10 done
request offline SUNW_cpu/cpu11 done
unconfigure N0.SB2
notify remove SUNW_cpu/cpu8
notify remove SUNW_cpu/cpu9
notify remove SUNW_cpu/cpu10
notify remove SUNW_cpu/cpu11
notify remove SUNW_cpu/cpu8 done
notify remove SUNW_cpu/cpu9 done
notify remove SUNW_cpu/cpu10 done
notify remove SUNW_cpu/cpu11 done
cfgadm: Hardware specific failure: unconfigure N0.SB2: I/O error:
/ssm@0,0/memory-controller@b,400000
#
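The "request delete capacity (1048576 pages)" line can be related to the board's memory size with a little arithmetic. The following is a minimal sketch, assuming the 8 KB base page size of this class of system (confirm with pagesize(1)); the function name pages_to_kb is hypothetical, and the syntax assumes a POSIX shell (on Solaris 9, ksh or /usr/xpg4/bin/sh rather than the legacy /bin/sh).

```shell
# Convert a page count, as printed by "request delete capacity", to KBytes.
# Assumption: 8 KB base page size unless a second argument overrides it.
pages_to_kb() {
    pages=$1
    page_kb=${2:-8}
    echo $((pages * page_kb))
}

pages_to_kb 1048576    # prints 8388608
```

Under that assumption the result, 8388608 KBytes, equals the total reported for N0.SB2::memory, so the delete request covers all of the memory on the board.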
No processes were bound to a specific processor (pbind).
No real-time processes were running.
No Sun[TM] Cluster was involved.
The system was functioning normally, running Oracle in a quality assurance environment.
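The processor-binding and real-time checks listed above could be scripted along the following lines. This is only a sketch: the function names are hypothetical, the functions read command output on stdin rather than running anything themselves, and the "process id N: M" line format assumed for pbind -q should be verified on the target release. On Solaris 9, nawk or /usr/xpg4/bin/awk is a safer choice than the legacy awk.

```shell
# Count real-time (RT class) processes.
# Expects the output of `ps -e -o class,pid,comm` on stdin, header line first.
rt_count() {
    awk 'NR > 1 && $1 == "RT" { n++ } END { print n + 0 }'
}

# Count processor-bound processes.
# Expects the output of `pbind -q` on stdin; assumes one line of the form
# "process id 204: 2" per bound process.
bound_count() {
    awk '/^process id [0-9]+: [0-9]+/ { n++ } END { print n + 0 }'
}
```

Typical use would be `ps -e -o class,pid,comm | rt_count` and `pbind -q | bound_count`; a result of 0 from both matches the situation described here.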
# cfgadm -al N0.SB2
Ap_Id                          Type         Receptacle   Occupant     Condition
N0.SB2                         CPU_V2       connected    configured   ok
N0.SB2::cpu0                   cpu          connected    configured   ok
N0.SB2::cpu1                   cpu          connected    configured   ok
N0.SB2::cpu2                   cpu          connected    configured   ok
N0.SB2::cpu3                   cpu          connected    configured   ok
N0.SB2::memory
#
The SB2 uniboard holds no permanent memory; only SB0 reports any:
# cfgadm -alv |grep memory
N0.SB0::memory                 connected    configured   ok         base address 0x0, 8388608 KBytes total, 3054144 KBytes permanent
Mar  2 13:13 memory       n        /devices/ssm@0,0:N0.SB0::memory
N0.SB2::memory                 connected    configured   ok         base address 0x2000000000, 8388608 KBytes total
Mar  2 13:13 memory       n        /devices/ssm@0,0:N0.SB2::memory
#
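The permanent-memory check above can be condensed into one line per board. The following sketch assumes input in the form produced by `cfgadm -alv | grep memory` as shown in this article; the function name mem_summary and its output format are hypothetical. It accepts the Information text either on the same line as the Ap_Id or wrapped onto the next line.

```shell
# Summarise total and permanent memory per board.
# Reads `cfgadm -alv | grep memory` style output on stdin and prints
# "<ap_id> total_kb=<n> permanent_kb=<n>" for each memory attachment point.
mem_summary() {
    awk '
        # A first field such as N0.SB2::memory starts a new board record.
        $1 ~ /::memory$/ {
            board = $1
            if (!(board in total)) {
                order[++n] = board
                total[board] = 0
                perm[board] = 0
            }
        }
        {
            # Look for "<number> KBytes total" and "<number> KBytes permanent".
            for (i = 2; i < NF; i++) {
                if ($i == "KBytes" && board != "") {
                    kind = $(i + 1)
                    sub(/,$/, "", kind)
                    if (kind == "total") total[board] = $(i - 1)
                    if (kind == "permanent") perm[board] = $(i - 1)
                }
            }
        }
        END {
            for (j = 1; j <= n; j++)
                printf "%s total_kb=%d permanent_kb=%d\n", order[j], total[order[j]], perm[order[j]]
        }
    '
}
```

Used as `cfgadm -alv | grep memory | mem_summary`, a board reporting permanent_kb=0 corresponds to the SB2 case described here. On Solaris 9, nawk or /usr/xpg4/bin/awk is likely needed in place of the legacy awk.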


Resolution
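There is currently no resolution or workaround; the best that can be done is to minimize database downtime.

The DBA had allocated the majority of physical memory to Oracle. The keywords listed for this solution (ISM, ipcs, shared memory) suggest that Oracle's shared memory is what kept the memory on SB2 from being unconfigured.
Once Oracle is shut down, DR works fine (note that only minimal Oracle listener processes remain running):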
#
# ps -ef |grep ora
root 29930 29922  0 11:03:28 pts/6    0:00 grep ora
oracle  9407     1  0   Mar 09 ?        0:05 /dbvol01/oracle/product/9.2.0/bin/tnslsnr 1522 -inherit
oracle  9413  9409  0   Mar 09 ?        0:01 /dbvol01/oracle/product/9.2.0/bin/dbsnmp
oracle  9409     1  0   Mar 09 ?        0:00 /bin/sh /dbvol01/oracle/product/9.2.0/bin/dbsnmpwd
oracle  9398     1  0   Mar 09 ?        5:31 /dbvol01/oracle/product/9.2.0/bin/tnslsnr 1521 -inherit
#
# cfgadm -v -c disconnect N0.SB2
request delete capacity (4 cpus)
request delete capacity (1048576 pages)
request delete capacity N0.SB2 done
request offline SUNW_cpu/cpu8
request offline SUNW_cpu/cpu9
request offline SUNW_cpu/cpu10
request offline SUNW_cpu/cpu11
request offline SUNW_cpu/cpu8 done
request offline SUNW_cpu/cpu9 done
request offline SUNW_cpu/cpu10 done
request offline SUNW_cpu/cpu11 done
unconfigure N0.SB2
unconfigure N0.SB2 done
notify remove SUNW_cpu/cpu8
notify remove SUNW_cpu/cpu9
notify remove SUNW_cpu/cpu10
notify remove SUNW_cpu/cpu11
notify remove SUNW_cpu/cpu8 done
notify remove SUNW_cpu/cpu9 done
notify remove SUNW_cpu/cpu10 done
notify remove SUNW_cpu/cpu11 done
disconnect N0.SB2
disconnect N0.SB2 done
poweroff N0.SB2
poweroff N0.SB2 done
unassign N0.SB2 skipped
#
#
The uniboard in question was then replaced and brought back into the domain:
# cfgadm -v -c connect N0.SB2
assign N0.SB2
assign N0.SB2 done
poweron N0.SB2
poweron N0.SB2 done
test N0.SB2
test N0.SB2 done
connect N0.SB2
connect N0.SB2 done
# mpstat
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
0  165   1  238   296  181  358   16   34   49    0  1483    6   3   4  86
1   70   0  573   209  192  339   15   31   55    0   993    5   4   3  88
2   71   1 1433    37   19  312   14   29   50    0   387    7   6   3  84
3   70   1 1451    18    1  304   13   29   49    0   317    7   5   4  84
#
# cfgadm -v -c configure N0.SB2
configure N0.SB2
configure N0.SB2 done
notify online SUNW_cpu/cpu8
notify online SUNW_cpu/cpu9
notify online SUNW_cpu/cpu10
notify online SUNW_cpu/cpu11
notify add capacity (4 cpus)
notify add capacity (1048576 pages)
notify add capacity N0.SB2 done
#
# mpstat
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
0  165   1  238   296  181  358   16   34   49    0  1483    6   3   4  86
1   70   0  573   209  192  339   15   31   55    0   993    5   4   3  88
2   71   1 1433    37   19  312   14   29   50    0   387    7   6   3  84
3   70   1 1451    18    1  304   13   29   49    0   317    7   5   4  84
8  278   0 1325     4    1   56    1   12   16    0   355    3   2   5  90
9   32   0   59    45   42   23    0   10    9    0    62    0   1   0  99
10   65   0  304     7    5   32    0   11   12    0   136    1   1   1  97
11   60   0  284     4    1   19    0    7   11    0    98    0   2   1  97
#
# cfgadm -alv N0.SB2
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
N0.SB2                         connected    configured   ok         powered-on, assigned
Apr  4 11:36 CPU_V2       n        /devices/ssm@0,0:N0.SB2
N0.SB2::cpu0                   connected    configured   ok         cpuid 8, speed 900 MHz, ecache 8 MBytes
Apr  4 11:36 cpu          n        /devices/ssm@0,0:N0.SB2::cpu0
N0.SB2::cpu1                   connected    configured   ok         cpuid 9, speed 900 MHz, ecache 8 MBytes
Apr  4 11:36 cpu          n        /devices/ssm@0,0:N0.SB2::cpu1
N0.SB2::cpu2                   connected    configured   ok         cpuid 10, speed 900 MHz, ecache 8 MBytes
Apr  4 11:36 cpu          n        /devices/ssm@0,0:N0.SB2::cpu2
N0.SB2::cpu3                   connected    configured   ok         cpuid 11, speed 900 MHz, ecache 8 MBytes
Apr  4 11:36 cpu          n        /devices/ssm@0,0:N0.SB2::cpu3
N0.SB2::memory                 connected    configured   ok         base address 0x2000000000, 8388608 KBytes total
Apr  4 11:36 memory       n        /devices/ssm@0,0:N0.SB2::memory
#
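The sequence used in this resolution can be sketched as a dry-run script, which may help when planning the database outage window. This is an illustration only: the RUN variable and the function names are hypothetical, the Oracle shutdown and restart steps are site-specific and left as placeholders, and the cfgadm invocations are the ones shown in the transcripts above. The syntax assumes a POSIX shell.

```shell
# Print each command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Walk through the DR steps for one attachment point, for example N0.SB2.
dr_replace_board() {
    ap=$1
    echo "step: shut down Oracle (site-specific, not shown)"
    run cfgadm -v -c disconnect "$ap" || return 1
    echo "step: physically replace the board"
    if [ "${RUN:-0}" = "1" ]; then
        # Wait for the operator before reconnecting the new board.
        printf 'Press Enter once the board has been replaced: '
        read -r reply
    fi
    run cfgadm -v -c connect "$ap" || return 1
    run cfgadm -v -c configure "$ap" || return 1
    run cfgadm -alv "$ap" || return 1
    echo "step: restart Oracle (site-specific, not shown)"
}
```

Calling `dr_replace_board N0.SB2` without RUN set only lists the steps. Because the function stops at the first failing command, a repeat of the unconfigure failure would leave the remaining steps unexecuted.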


Additional Information
Notes on System Controller activity during physical replacement:
root@my-server:/tmp# telnet 192.168.0.25
Trying 192.168.0.25...
Connected to 192.168.0.25.
Escape character is '^]'.
System Controller 'my-server':
Type  0  for Platform Shell
Type  1  for domain A console
Type  2  for domain B console
Type  3  for domain C console
Type  4  for domain D console
Input: 0
Enter Password:
Platform Shell
my-server:SC> poweron SB2
/N0/SB2: powered on
my-server:SC>
my-server:SC>
my-server:SC>
my-server:SC> showboard -p version
Component   Compatible Version
---------   ---------- -------
SSC0        Reference  5.18.1 Build_01
/N0/IB6     Yes        5.18.1 Build_01
/N0/SB0     Yes        5.18.1 Build_01
/N0/SB2     Yes        5.18.1 Build_01
/N0/IB8     Yes        5.18.1 Build_01
/N0/SB4     Yes        5.18.1 Build_01
my-server:SC>
my-server:SC>
my-server:SC>
my-server:SC>
my-server:SC> showboards
Slot     Pwr Component Type                 State      Status     Domain
----     --- --------------                 -----      ------     ------
SSC0     On  System Controller              Main       Passed     -
SSC1     On  Present                        Spare      -          -
ID0      On  Sun Fire 4800 Centerplane      -          OK         -
PS0      On  A153 Power Supply              -          OK         -
PS1      On  A153 Power Supply              -          OK         -
PS2      On  A153 Power Supply              -          OK         -
FT0      On  Fan Tray                       Low Speed  OK         -
FT1      On  Fan Tray                       Low Speed  OK         -
FT2      On  Fan Tray                       Low Speed  OK         -
RP0      On  Repeater Board                 -          OK         -
RP2      On  Repeater Board                 -          OK         -
/N0/SB0  On  CPU Board V2                   Active     Passed     A
/N0/SB2  On  CPU Board V2                   Assigned   Under Test A
/N0/SB4  On  CPU Board V2                   Active     Passed     C
/N0/IB6  On  PCI I/O Board                  Active     Passed     A
/N0/IB8  On  PCI I/O Board                  Active     Passed     C
my-server:SC>
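While the replacement board is under test, its status can be picked out of captured showboards output. The sketch below assumes the column layout shown above and simply looks for the word "Passed" on the line for the slot of interest; the function name slot_status and its messages are hypothetical. On Solaris, nawk or /usr/xpg4/bin/awk is likely needed for the -v option.

```shell
# Report whether the showboards line for a slot carries the "Passed" status.
# Reads showboards output on stdin; the slot name is the first argument.
slot_status() {
    slot=$1
    awk -v slot="$slot" '
        $1 == slot {
            found = 1
            if ($0 ~ /[ \t]Passed[ \t]/)
                print slot ": passed"
            else
                print slot ": not passed yet"
        }
        END { if (!found) print slot ": not listed" }
    '
}
```

Fed the output shown above, `slot_status /N0/SB2` would report "not passed yet", because that line reads "Under Test" rather than "Passed".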


Product
Sun Fire 3800 Server
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Oracle9i Database Release 1 (9.0.1)

Keywords: ISM, DR, dynamic reconfiguration, ipcs, shared memory
Previously Published As
81331

Change History
Date: 2005-04-28
User Name: 25440
Action: Approved
Comment: Publishing.
Version: 6
Date: 2005-04-28
User Name: 25440
Action: Accept
Comment:
Version: 0

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.