Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1006645.1
Update Date:2010-05-19
Keywords:

Solution Type  Technical Instruction Sure

Solution  1006645.1 :   What is a Solaris[TM] PCI IOMMU error telling me?  


Related Items
  • Sun Enterprise 3000 Server
  • Sun Enterprise 4500 Server
  • Sun Fire E6900 Server
  • Sun Fire V440 Server
  • Sun Fire V480 Server
  • Sun Enterprise 5500 Server
  • Sun Fire E25K Server
  • Sun Enterprise 450 Server
  • Sun Fire 280R Server
  • Sun Fire T2000 Server
  • Sun Enterprise 4000 Server
  • Sun Enterprise 5000 Server
  • Sun Enterprise 6000 Server
  • Sun Fire E20K Server
  • Sun Fire B100x Blade Server
  • Sun Enterprise 3500 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire V890 Server
  • Sun Enterprise 6500 Server
  • Sun Fire E4900 Server
  • Sun Fire 12K Server
  • Sun Enterprise 220R Server
  • Sun Fire T1000 Server
  • Sun Fire V880 Server
  • Sun Fire 4800 Server
  • Sun Enterprise 250 Server
  • Sun Fire V1280 Server
  • Sun Fire E2900 Server
  • Sun Fire 15K Server
  • Sun Fire B100s Blade Server
  • Sun Fire V490 Server
  • Sun Fire 4810 Server
  • Sun Enterprise 10000 Server
  • Sun Enterprise 420R Server
Related Categories
  • GCS>Sun Microsystems>Servers>Blade Servers
  • GCS>Sun Microsystems>Servers>Entry-Level Servers
  • GCS>Sun Microsystems>Servers>Midrange Servers
  • GCS>Sun Microsystems>Servers>CMT Servers
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
209268


Description
When there is a Peripheral Component Interconnect (PCI) bus I/O Memory Management Unit (IOMMU) fault, the Solaris[TM] kernel logs a long and complex error message and may panic.

This document explains what that message contains. The example is based on the Schizo host-to-PCI bridge chip that is used in most UltraSPARC[TM] based systems.



Steps to Follow
A system can have several PCI buses connected to it. Each PCI bus has its own address space that uses 32-bit physical addressing. The system's CPU, memory and host-to-PCI bridges operate in their own 64-bit address space (for example, the Safari bus address space on a Sun Fire[TM] 6800), with physical and virtual addresses.

In order to support Direct Memory Access (DMA), the host-to-PCI bridge chips need to be able to translate between the two address spaces in a controlled manner. This is achieved using a small IOMMU in each host-to-PCI bridge chip. When a driver that manages a DMA-capable PCI device wants to command the device to perform a DMA, it must first program the IOMMU to translate an unused segment of the PCI bus's physical address space into a similar-sized segment of system physical memory. When the PCI device completes the DMA, it informs the Solaris device driver using an interrupt. The device driver can then destroy the mapping, allowing some other driver to use the address space. The IOMMU in each host-to-PCI bridge is only responsible for translations that are programmed for PCI devices in the tree of PCI buses and devices below that specific host-to-PCI bridge.
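
The map/DMA/unmap life cycle described above can be illustrated with a toy model. This is only a sketch, not the Solaris driver interface: the class name DvmaWindow, its methods, the window base address and the page frame numbers are all hypothetical, and 8KB pages are assumed throughout.

```python
PAGE_BYTES = 8192  # assumption: 8KB IOMMU pages only


class DvmaWindow:
    """Toy model of the PCI-side DMA address window behind one bridge (hypothetical)."""

    def __init__(self, base, pages):
        self.base = base
        # One slot per PCI page: None means unmapped, otherwise the
        # system page frame number that the PCI page translates to.
        self.mappings = [None] * pages

    def map(self, system_pfns):
        """Map a run of unused PCI pages onto system pages; return the PCI address."""
        need = len(system_pfns)
        for start in range(len(self.mappings) - need + 1):
            if all(slot is None for slot in self.mappings[start:start + need]):
                self.mappings[start:start + need] = list(system_pfns)
                return self.base + start * PAGE_BYTES
        raise MemoryError("no free PCI address space in this window")

    def unmap(self, pci_addr, pages):
        """Destroy a mapping so another driver can reuse the PCI addresses."""
        start = (pci_addr - self.base) // PAGE_BYTES
        self.mappings[start:start + pages] = [None] * pages


window = DvmaWindow(base=0xC0000000, pages=4)   # hypothetical window
first = window.map([0x100, 0x101])              # driver A sets up a 2-page DMA
second = window.map([0x200])                    # driver B gets the next free page
window.unmap(first, 2)                          # DMA done, interrupt handled
reused = window.map([0x300, 0x301])             # the freed range is handed out again
```

A device that still issues a DMA to an address after the unmap, or before the map, is exactly the case that produces the error discussed in the rest of this document.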

If a DMA accesses a PCI address that is currently unmapped or has the wrong permissions, the host-to-PCI bridge generates an error interrupt and logs a message like the following:

 line
====
1  - pcisch: [ID 462479 kern.warning] WARNING: pcisch3 (pci@9,600000): PCI fault log start:
2  - pcisch: [ID 309153 kern.notice] PCI iommu error
3  - pcisch: [ID 866426 kern.notice] pcisch3: Error 1 on IOMMU TLB entry 2:
4  -       Context=0 not Writable not Streamable
5  -       PCI Page Size=8k Address in page c446a000
6  - pcisch: [ID 219581 kern.notice] Memory: Valid not Cacheable Page Frame=0
7  - pcisch: [ID 684763 kern.notice] pcisch3 (pci@9,600000): PBM
8  -  AFSR=0x0.00000000
9  - pcisch: [ID 120591 kern.notice] dwordmask=0 bytemask=0
10 - pcisch: [ID 829486 kern.notice] pcisch3 (pci@9,600000): PCI primary error (0):
11 - pcisch: [ID 227296 kern.notice] pcisch3 (pci@9,600000): PCI secondary error (0):
12 - pcisch: [ID 748186 kern.notice] pcisch3 (pci@9,600000): PBM AFAR 0.00000000:
13 - pcisch: [ID 127741 kern.warning] WARNING: pcisch3: PCI config space
14 -       CSR=0xaa0<signaled-target-abort>
15 - pcisch: [ID 656289 kern.notice] pcisch3 (pci@9,600000): PCI fault log end.
16 - pcisch: [ID 686566 kern.notice] Scrubbing PCI iommu TLB entries
17 - pcisch: [ID 193938 kern.notice] No fatal PCI bus error(s)

An IOMMU entry consists of two linked data structures: the tag and the data. These are held in a small array, called the Translation Lookaside Buffer (TLB), inside the bridge chip. This array is a 64-entry subset of the much larger array of Translation Table Entry (TTE) structures that are held in memory and searched automatically when there is no match for the PCI address in the TLB array.

 TLB tag
Bits     Description
=====    ===========
32-25    context, used to link entries.
24-23    error type:
         00 = protection error, 01 = invalid,
         10 = timeout error,    11 = UE ECC error.
22       error, 0 = no error, 1 = error.
21       writeable.
20       streaming.
19       page size, 0 = 8KB, 1 = 64KB.
18-0     19-bit virtual page number (PCI address >> 13).

 TLB data
Bits      Description
====      ===========
32        valid, TLB data is valid.
31        reserved.
30        cacheable.
29-0      30-bit physical page frame number (system address >> 13).

 TTE
Bits      Description
====      ===========
63        valid bit.
61        page size.
60        streamable.
59        localbus (ignored).
58-51     context number.
42-13     bits 42-13 of the system physical address.
12-7      software bits.
4         cacheable.
1         writeable.
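
As an illustration of the three layouts, the following sketch extracts the fields with shifts and masks. The bit positions are transcribed from the tables above; the function names and the example values are hypothetical and are not taken from the pcisch driver source.

```python
ERROR_TYPES = {0b00: "protection", 0b01: "invalid", 0b10: "timeout", 0b11: "UE ECC"}


def bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of value as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_tlb_tag(tag):
    return {
        "context": bits(tag, 32, 25),
        "error_type": ERROR_TYPES[bits(tag, 24, 23)],
        "error": bool(bits(tag, 22, 22)),
        "writable": bool(bits(tag, 21, 21)),
        "streaming": bool(bits(tag, 20, 20)),
        "page_bytes": 65536 if bits(tag, 19, 19) else 8192,
        "vpn": bits(tag, 18, 0),          # PCI address >> 13
    }


def decode_tlb_data(data):
    return {
        "valid": bool(bits(data, 32, 32)),
        "cacheable": bool(bits(data, 30, 30)),
        "pfn": bits(data, 29, 0),         # system address >> 13
    }


def decode_tte(tte):
    return {
        "valid": bool(bits(tte, 63, 63)),
        "page_64k": bool(bits(tte, 61, 61)),
        "streamable": bool(bits(tte, 60, 60)),
        "context": bits(tte, 58, 51),
        "pfn": bits(tte, 42, 13),
        "cacheable": bool(bits(tte, 4, 4)),
        "writable": bool(bits(tte, 1, 1)),
    }


# A tag and data word built by hand to resemble the logged example:
# error type 01 (invalid), error bit set, 8KB page, PCI page c446a000.
example_tag = (0b01 << 23) | (1 << 22) | (0xC446A000 >> 13)
example_data = 1 << 32                    # valid, not cacheable, page frame 0
# A made-up valid, writable TTE pointing at page frame 0x12345.
example_tte = (1 << 63) | (0x12345 << 13) | (1 << 1)

tag_fields = decode_tlb_tag(example_tag)
data_fields = decode_tlb_data(example_data)
tte_fields = decode_tte(example_tte)
```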

So when a PCI device performs a DMA to the host-to-PCI bridge, the bottom 13 bits of the PCI address (the offset within the 8KB page) are saved. The upper 19 bits are then compared with the entries in the TLB array. If the 19 bits match a valid entry, the 30-bit physical page frame number (system bus address divided by 8K) is combined with the 13 bits of saved offset to produce a 43-bit system physical address. This is the real target of the DMA, addressing the real memory in the machine, and the DMA continues.
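
The arithmetic of this translation is small enough to show directly. In the sketch below the function names are illustrative and the page frame number 0x12345 is made up; only the 13/19/30-bit split comes from the text above, and 8KB pages are assumed.

```python
PAGE_SHIFT = 13                       # 8KB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_pci_address(pci_addr):
    """Split a 32-bit PCI address into (19-bit virtual page number, 13-bit offset)."""
    return pci_addr >> PAGE_SHIFT, pci_addr & OFFSET_MASK


def system_address(pfn, offset):
    """Combine a 30-bit page frame number with the saved offset (43 bits in total)."""
    return (pfn << PAGE_SHIFT) | offset


vpn, offset = split_pci_address(0xC446A123)   # an address inside the logged page
target = system_address(0x12345, offset)      # 0x12345 is a hypothetical page frame
largest = system_address((1 << 30) - 1, OFFSET_MASK)
```

Because the page frame number is shifted left by 13 bits before the offset is merged in, the two fields never overlap, which is why the result is exactly 30 + 13 = 43 bits wide.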

If the valid bit is not set in the TLB data, the search for a match continues. If no match is found in the small TLB array, the PCI bus address is used to calculate the offset into the much larger array of TTEs. The TTE at that address is loaded into the TLB array and used. If that entry has the wrong permissions or the valid bit is not set, then we have an IOMMU error. The error reporting code walks the small TLB array looking for entries with the error bit set and prints out most of these fields.
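
The lookup sequence above (TLB search, fall back to the TTE array, load the TTE into the TLB, flag an error) can be sketched as a small software model. Everything here is an assumption made for illustration: the class and method names, the window base address, the oldest-first TLB replacement policy and the rule that a device write needs the writeable bit. The real hardware behaviour is more involved.

```python
PAGE_SHIFT = 13
TLB_ENTRIES = 64


class IommuModel:
    """Toy model of the TLB/TTE lookup (hypothetical, 8KB pages only)."""

    def __init__(self, dvma_base, tte_count):
        self.dvma_base = dvma_base
        # Each TTE is None (valid bit clear) or a (pfn, writable) pair.
        self.ttes = [None] * tte_count
        self.tlb = {}                 # vpn -> entry dict, insertion ordered

    def _tte_index(self, pci_addr):
        index = (pci_addr - self.dvma_base) >> PAGE_SHIFT
        if not 0 <= index < len(self.ttes):
            raise ValueError("address outside the modelled DMA window")
        return index

    def program(self, pci_addr, pfn, writable):
        """What a driver does before starting a DMA: install a TTE."""
        self.ttes[self._tte_index(pci_addr)] = (pfn, writable)
        self.tlb.pop(pci_addr >> PAGE_SHIFT, None)   # drop any stale TLB entry

    def translate(self, pci_addr, is_write):
        """Return the system address, or None after flagging an error."""
        vpn = pci_addr >> PAGE_SHIFT
        offset = pci_addr & ((1 << PAGE_SHIFT) - 1)
        entry = self.tlb.get(vpn)
        if entry is None:             # TLB miss: fetch the TTE from memory
            tte = self.ttes[self._tte_index(pci_addr)]
            if len(self.tlb) >= TLB_ENTRIES:
                self.tlb.pop(next(iter(self.tlb)))   # assumed: evict oldest
            pfn, writable = tte if tte is not None else (0, False)
            entry = {"pfn": pfn, "writable": writable,
                     "tte_valid": tte is not None, "error": None}
            self.tlb[vpn] = entry     # the TTE is loaded even if it is bad
        if not entry["tte_valid"]:
            entry["error"] = "invalid"
            return None
        if is_write and not entry["writable"]:
            entry["error"] = "protection"
            return None
        return (entry["pfn"] << PAGE_SHIFT) | offset

    def error_walk(self):
        """Mimic the error walker: list TLB entries with an error flagged."""
        return [(vpn << PAGE_SHIFT, e["error"])
                for vpn, e in self.tlb.items() if e["error"]]


iommu = IommuModel(dvma_base=0xC0000000, tte_count=0x8000)
iommu.program(0xC0002000, pfn=0x12345, writable=True)
good = iommu.translate(0xC0002010, is_write=True)
bad = iommu.translate(0xC446A000, is_write=False)   # never mapped
faults = iommu.error_walk()
```

In this model the unmapped access leaves a TLB entry with page frame 0, not writable and an "invalid" error, which is the same shape as the entry reported in the example log.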

Let's go through this error line by line.

Line 1 - tells us which host-to-PCI bridge is reporting the error. The PCI DMA transaction that caused the message must have come from a PCI device below this node in the device tree.

Line 2 - the pcisch host-to-PCI bridge chip can generate several types of error; this line tells us that it is an IOMMU error.

Line 3 - "Error 1" is the type of IOMMU error that was found; it is the "error type" field from the tag. The line also gives the entry in the TLB array where the error walker found a tag with the error bit set.

Line 4 - prints the context, writeable and streamable bits from the TLB tag.

Line 5 - prints the page size and the PCI bus address (with the 13 bits of offset masked out) from the TLB tag.

Line 6 - prints the valid bit, the cacheable bit and the 30-bit Page Frame Number (PFN) from the TLB data.
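
When many of these messages have to be triaged, the fields from lines 3 to 6 can be pulled out with a few regular expressions. This is a sketch that has only been matched against the single sample message in this document; the function name is hypothetical, and the wording of other variants (for example a writeable entry) is assumed rather than confirmed.

```python
import re

ERROR_TYPES = {0: "protection", 1: "invalid", 2: "timeout", 3: "UE ECC"}

SAMPLE = """\
pcisch: [ID 866426 kern.notice] pcisch3: Error 1 on IOMMU TLB entry 2:
      Context=0 not Writable not Streamable
      PCI Page Size=8k Address in page c446a000
pcisch: [ID 219581 kern.notice] Memory: Valid not Cacheable Page Frame=0
"""


def summarize_iommu_error(log_text):
    """Extract the IOMMU fields from a fault log; return None if not found."""
    err = re.search(r"(\w+): Error (\d+) on IOMMU TLB entry (\w+)", log_text)
    ctx = re.search(r"Context=(\w+) (not )?Writable (not )?Streamable", log_text)
    page = re.search(r"PCI Page Size=(\w+) Address in page ([0-9a-fA-F]+)", log_text)
    frame = re.search(r"Page Frame=(\w+)", log_text)
    if not (err and ctx and page and frame):
        return None
    return {
        "bridge": err.group(1),
        "error_type": ERROR_TYPES.get(int(err.group(2)), "unknown"),
        "tlb_entry": err.group(3),            # kept as text: number base not assumed
        "writable": ctx.group(2) is None,     # assumed wording when writeable
        "streamable": ctx.group(3) is None,   # assumed wording when streamable
        "page_size": page.group(1),
        "pci_page": int(page.group(2), 16),
        "page_frame": frame.group(1),         # kept as text: number base not assumed
    }


summary = summarize_iommu_error(SAMPLE)
```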

In our example the error type is 1 (invalid), yet the valid bit is set. This means that the lookup in the small TLB array failed, so the PCI virtual page number was used to locate the TTE in the memory array, and the valid bit was not set in that TTE. The TTE is always loaded into the TLB tag and data entries, and the valid bit in the TLB data is set because the entry was installed successfully. Note that the PCI virtual page number is not stored in the memory array of TTEs; it is calculated from the incoming DMA transaction. There is therefore a direct link between the size of the memory array of TTEs and the DMA address space that a device can access.
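
The link between the TTE array size and the reachable DMA space is simple arithmetic. The sketch below assumes that each TTE occupies 8 bytes (the 64-bit layout shown earlier) and maps one 8KB page; the function names and the 64KB table size are illustrative only.

```python
TTE_BYTES = 8        # one 64-bit TTE per page (assumed from the TTE layout)
PAGE_BYTES = 8192    # assumption: 8KB pages only


def dma_window_bytes(tte_table_bytes):
    """DMA address space covered by a TTE array of the given size."""
    return tte_table_bytes // TTE_BYTES * PAGE_BYTES


def tte_table_bytes(window_bytes):
    """TTE array size needed to cover a DMA window of the given size."""
    return window_bytes // PAGE_BYTES * TTE_BYTES


small_window = dma_window_bytes(64 * 1024)   # a 64KB table of TTEs
full_table = tte_table_bytes(1 << 32)        # covering all 32 bits of PCI address
```

Under these assumptions every byte of TTE array covers 1KB of DMA space, so a 64KB array maps 64MB and the full 19-bit page number range needs a 4MB array.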

Lines 7-12 - print out internal registers of the bridge.

Lines 13-14 - print out the PCI configuration space status of the host-to-PCI bridge's own PCI node. Its default action on receiving an invalid DMA transaction is to send a target abort back.

Lines 15 and on - just housekeeping.
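
The CSR value on line 14 can also be decoded by hand. The sketch below assumes that the value is the standard 16-bit PCI configuration space status register; the bit names are paraphrased from the PCI Local Bus specification, and the function name is hypothetical. Whether pcisch prints exactly this register is an assumption.

```python
# Standard PCI status register flag bits (bits 9-10 are the DEVSEL timing field).
PCI_STATUS_FLAGS = {
    3: "interrupt-status",
    4: "capabilities-list",
    5: "66MHz-capable",
    7: "fast-back-to-back-capable",
    8: "master-data-parity-error",
    11: "signaled-target-abort",
    12: "received-target-abort",
    13: "received-master-abort",
    14: "signaled-system-error",
    15: "detected-parity-error",
}
DEVSEL_TIMING = {0: "fast", 1: "medium", 2: "slow"}


def decode_pci_status(status):
    """Return (list of flag names that are set, DEVSEL timing name)."""
    flags = [name for bit, name in sorted(PCI_STATUS_FLAGS.items())
             if status & (1 << bit)]
    devsel = DEVSEL_TIMING.get((status >> 9) & 0x3, "reserved")
    return flags, devsel


flags, devsel = decode_pci_status(0xAA0)
```

Read this way, 0xaa0 contains two capability bits, a DEVSEL timing value and one error bit, signaled-target-abort, which is the only one the log names.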

On the whole, this message is of limited diagnostic value, as it reports an error condition caused by an invalid action and not the source of that action. The faulty component could be any of the following:

- The device driver, which has set up or removed (torn down) an IOMMU mapping incorrectly.
- The card firmware, which has sent a DMA to the wrong address.
- A hardware error that defeats the parity checks on the PCI bus.



Product
Sun Fire E25K Server
Sun Fire V890 Server
Sun Fire V880 Server
Sun Fire V490 Server

kernel, pci, panic, iommu, fault
Previously Published As
82572


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.