Document Audience:INTERNAL
Document ID:I1143-1
Title:'logadm' on the SE6320 image is not restarting syslogd resulting in missed log messages and false positive StorADE alarms.
Copyright Notice:Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date:2004-11-17

------------------------------------------------------------
            - Sun Proprietary/Confidential: Internal Use Only -
------------------------------------------------------------------------

  ***  Sun Confidential:  Internal Use and Authorized VARs Only  ***
____________________________________________________________
  This message including any attachments is confidential information
  of Sun Microsystems, Inc.  Disclosure, copying or distribution is
  prohibited without permission of Sun.  If you are not the intended
  recipient, please reply to the sender and then delete this message.
________________________________________________________________________
  
                        FIELD INFORMATION NOTICE
               (For Authorized Distribution by Sun Service)
FIN #: I1143-1
Synopsis: 'logadm' on the SE6320 image is not restarting syslogd resulting in missed log messages and false positive StorADE alarms.
Create Date: Nov/16/04
SunAlert: No
Top FIN/FCO Report: No
Products Reference: Sun StorEdge 6320 Arrays
Product Category: Storage / Diag-Doc-Service
Product Affected: 
Systems Affected:
-----------------  
Mkt_ID      Platform    Model    Description                   Serial Number
------      --------    -----    -----------                   -------------
  -          Anysys       -      System Platform Independent         -


X-Options Affected:
-------------------
Mkt_ID      Platform     Model     Description                 Serial Number
------      --------     -----     -----------                 ------------- 
  -          SE6320        -       Sun StorEdge 6320 Array           -
Parts Affected: 
----------------------
Part Number             Description                     Model
-----------             -----------                     -----
     -                       -                            -
References: 
Bug: 6182101 - Rolling array log files due to size is causing new messages 
               to go to the file that was rolled over.
     6176915 - StorADE phoning home excessive number of rasagent.event 
               files.
Issue Description: 
A cron job executes /usr/bin/logadm periodically to check log files and act
accordingly.  The actions 'logadm' takes are defined in the
/etc/logadm.conf file.  See the manpage for 'logadm' for an explanation of
the arguments.  All versions of the SE6320 have this issue.  An affected
unit can be recognized as an SE6320 cabinet with a Service Processor and
SPA Tray.  All customers with an SE6320 will be affected when the array log
file grows beyond 10MB and is consequently rolled over by 'logadm'.

This is a serviceability issue: false positive alarms about previous
array failures are generated repeatedly, and potentially fatal messages
may be ignored.

It is important to restart syslogd after the existing file has been moved
and a new file has been created.  Refer to the manpage for syslogd for
additional details.  The entry in /etc/logadm.conf does not include the
argument to restart syslogd, resulting in a mismatch of file descriptors:
syslogd keeps writing to the file that was moved.

This mismatch of file descriptors causes StorADE to generate false
positives for previous failure information.  It also causes new messages
from the array to go to the older file.  This could result in missed
indicators of a potential issue with the array.
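
The descriptor mismatch described above can be reproduced without touching
syslogd.  The following is a minimal sketch, assuming a POSIX shell with
'mktemp' available; the function name and the temporary file names are
hypothetical and exist only for illustration.  A writer opens a log, the
log is rolled by rename, and the next write still lands in the rolled file
because the writer never reopened it.

```shell
#!/bin/sh
# Illustration only: throwaway files in a temporary directory, not the
# real array log.  demo_stale_descriptor is a hypothetical helper name.
demo_stale_descriptor() {
    dir=$(mktemp -d) || return 1
    log="$dir/messages.demo"

    # Open the log for append on descriptor 3, as a long-running logger would.
    exec 3>>"$log"
    echo "before roll" >&3

    # Roll the file: rename it, then create a new empty file in its place.
    mv "$log" "$log.0"
    : > "$log"

    # The writer was never told to reopen, so this goes to the rolled file.
    echo "after roll" >&3
    exec 3>&-

    new_size=$(wc -c < "$log" | tr -d ' ')
    old_lines=$(wc -l < "$log.0" | tr -d ' ')
    echo "new file bytes: $new_size, rolled file lines: $old_lines"
    rm -rf "$dir"
}
```

Calling demo_stale_descriptor prints "new file bytes: 0, rolled file
lines: 2".  Sending HUP to syslogd is what makes the real daemon reopen
its files so that it writes to the new one.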

One indication of the symptom is the presence of a messages.se6320.0
file much larger than 10M and a messages.se6320 file that is 0 bytes in
size.  Other indications are false positive StorADE alarms about previous
failures in an array, and array messages being lost or ignored because
they are going to the old log file.

The problem can be observed by running any command at the array CLI and
then viewing the mirrored log file(s).

  # tail /var/adm/messages.se6320
  # ls -li /var/adm/messages.se6320
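
The size-based indication (an empty messages.se6320 next to an oversized
messages.se6320.0) could be checked with a small sketch such as the one
below.  The function names are hypothetical, and the limit assumes that
"10M" means 10 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Assumption: the 10M roll-over limit is 10 * 1024 * 1024 bytes.
LIMIT_BYTES=10485760

# rollover_state CURRENT_SIZE ROLLED_SIZE   (both in bytes)
# Prints "suspect" when the live log is empty and the rolled log is
# larger than the limit; otherwise prints "ok".
rollover_state() {
    if [ "$1" -eq 0 ] && [ "$2" -gt "$LIMIT_BYTES" ]; then
        echo "suspect"
    else
        echo "ok"
    fi
}

# file_size PATH: size in bytes, or 0 if the file does not exist.
file_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}
```

On the Service Processor this might be used as:

  # rollover_state "`file_size /var/adm/messages.se6320`" \
                   "`file_size /var/adm/messages.se6320.0`"

A result of "suspect" is only a hint and should be confirmed by viewing
the log files as shown above.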
    
A burst of messages from a potentially failing array can also trigger the
problem.

The root cause of the issue is that '/etc/logadm.conf' does not include
the command to restart syslogd.

No current release of the SE6320 contains a fix, and future releases may
not be fixed either.

The Service Procedure recommends changing the /etc/logadm.conf file by
adding the following to the last line.

  "-a 'kill -HUP `cat /var/run/syslog.pid`' -s 100k"
Implementation: 
         ---
        |   |   MANDATORY (Fully Proactive)
         ---

         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan)
         ---

         ---
        | X |   REACTIVE (As Required)
         ---
Corrective Action: 
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.

This corrective action fixes an issue with logadm.conf where syslogd is
not restarted after the existing array log file (messages.se6320) is
moved to .0 and a new array log file is created.  An additional command
to restart syslogd is appended to the line that controls rolling of the
array log file.

The 2 possible symptoms being corrected are:

  1. StorADE repeatedly generating events for "old" log messages.
  2. New array messages being logged in the .0 log file.

The steps below describe how to use an editor from the CLI to add an
argument to the existing 'logadm' configuration file on the SE6320
Service Processor.  This fix restarts 'syslogd' when the messages.se6320
mirrored log file is rolled to messages.se6320.0 once it exceeds 10MB in
size.
  
  # cp /etc/logadm.conf /etc/logadm.conf_BAK
  # vi /etc/logadm.conf
  
  NOTE: Change the last line to the following (entered as a single line
        in the file) and then write the file:
        /var/adm/messages.se6320 -P 'Wed Oct 20 15:54:04 2004' -a 
        'kill -HUP `cat /var/run/syslog.pid`' -s 10M
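
After the edit, the presence of the restart command on the
messages.se6320 entry can be confirmed with a quick check.  This is a
sketch only; the function name is hypothetical, and it assumes the entry
starts at the beginning of a line and is followed by a single space.

```shell
#!/bin/sh
# has_restart_action CONF_FILE
# Succeeds (exit status 0) when the messages.se6320 entry in CONF_FILE
# contains a 'kill -HUP' command; fails otherwise.
has_restart_action() {
    grep '^/var/adm/messages\.se6320 ' "$1" | grep 'kill -HUP' > /dev/null
}
```

For example:

  # has_restart_action /etc/logadm.conf && echo "restart command present"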

Field personnel can use the procedure below to test the change.
Testing can be tricky, depending on the user's ability to use the
editor.  The only way to test is to trick 'logadm' into treating the
file as large enough to be rolled.  It is then necessary to confirm
that messages are being mirrored to the correct file.

Successful testing shows a properly working logadm roll-over mechanism
for the array log file.  Remember to change the "size" back to 10M
after testing is complete.

Testing process
---------------
Ensure the mirrored log file is at least 1K in size.
 
   # ls -l /var/adm/messages.se6320

Change the size for the '-s' argument to 1K in /etc/logadm.conf, then
execute logadm.

   # /usr/bin/logadm

Make sure the mirrored log file was rolled as expected.

   # ls -l /var/adm/messages.se6320*

Make sure syslogd is writing to the new mirrored log file by running a 
command on the array.

   # telnet array00
     array00:/:<1> date

Check the size of the new mirrored log file to be sure it has grown 
and check the contents to ensure the date command has been mirrored.

   # ls -l /var/adm/messages.se6320*
   # cat /var/adm/messages.se6320

CHANGE THE SIZE FOR THE '-s' ARGUMENT BACK TO 10M !!
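
To guard against leaving the test value in place, the size currently
configured for the entry can be read back from the file.  This is a
minimal sketch; the function name is hypothetical, and it assumes the
entry is on a single line and that the '-s' value contains no spaces.

```shell
#!/bin/sh
# rollover_size CONF_FILE
# Prints the value that follows '-s' on the messages.se6320 entry.
rollover_size() {
    grep '^/var/adm/messages\.se6320 ' "$1" |
        sed -n 's/.* -s  *\([^ ]*\).*/\1/p'
}
```

For example:

  # rollover_size /etc/logadm.conf

The output should read 10M once testing is complete.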
Comments: 
None.

============================================================================

NOTE: FIN Tracking Instructions for Radiance/SPWeb:
--------------------------------------------------

If a Radiance case involves the application of a FIN to solve a customer
issue, please complete the following steps in Radiance/SPWeb prior to
closing the case:
 
    o Select "Field Information Notice" in the REFERENCE TYPE field.

    o Enter FIN ID number in the REFERENCE ID field.
      For example: I1111-1.

If possible, include additional details in the REFERENCE SUMMARY field
(e.g., implementation complete, customer declined, etc.)
--------------------------------------------------------------------------


Implementation Notes:
--------------------

In case of "Mandatory" FINs, Sun Services will attempt to contact
all known customers to recommend proactive implementation.

For "Controlled Proactive" FINs, Sun Services mission critical
support teams will initiate proactive implementation efforts for
their respective accounts as required.

For "Reactive" FINs, Sun Services and partners will implement
the necessary corrective actions as the need arises.


Billing Information:
-------------------

Warranty: On-Site Labor Rates are based on specified Warranty deliverables
          for the affected product.

Contract: On-Site Labor Rates are based on the type of service contract.

Non Contract: On-Site implementation by Sun is available based on On-Site
              Labor Rates defined in the Price List.

--------------------------------------------------------------------------

All FIN documents are accessible via Internal SunSolve.  Type "sunsolve"
in a browser and follow the prompts to Search Collections.

For questions on this document, please email:

        [email protected]

The FIN and FCO homepage is available at:

        http://sdpsweb.central/FIN_FCO/index.html

For more information on how to submit a FIN, go to:

        http://pronto.central/fin.html

To access the Service Partner Exchange, use:

        https://spe.sun.com
--------------------------------------------------------------------------
Status: active