
URL: https://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/SV-Firmware-Hist.html

POWER8 SYSTEM FIRMWARE FIX HISTORY - RELEASE LEVELS SV8XX


Firmware Description and History



SV860
For Impact, Severity and other Firmware definitions, please refer to the
'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
SV860_243_165 / FW860.B1

06/02/22 Impact:  Data    Severity:  HIPER

System firmware changes that affect certain systems

 * HIPER/Pervasive:  For systems with PowerVM firmware and an IBM i partition
   with native SR-IOV at firmware levels FW810.00 through FW860.B0, a problem
   was fixed for data incorrectly written to PowerVM/LPAR memory during a DLPAR
   remove of a native SR-IOV Virtual Function (VF) or Concurrent Maintenance
   (CM) of the SR-IOV adapter. This may cause undetected data corruption in a
   partition or a PowerVM crash.

SV860_240_165 / FW860.B0

01/21/22 Impact:  Availability     Severity:  SPE

System firmware changes that affect all systems

 * On systems with PowerVM firmware, a problem was fixed for an incorrect SRC
   logged for a #EMX0 PCIe expansion drawer power fault found on the low CXP
   cable.  An SRC B7006A85 (AOCABLE, PCICARD) is logged instead of the correct
   SRC of B7006A86 (PCICARD, AOCABLE).  This happens every time there is a power
   fault on the low CXP cable.
 * On systems with PowerVM firmware, a problem was fixed for a Live Partition
   Mobility (LPM) hang during LPM validation on the target system.  This is a
   rare system problem triggered during an LPM migration that causes LPM
   attempts to fail as well as other functionality such as configuration changes
   and partition shutdowns.  To recover from this problem and restore LPM and
   other operations such as configuration changes and partition shutdowns, the
   system must be re-IPLed.
 * On systems with PowerVM firmware, a problem was fixed for the HMC Repair and
   Verify (R&V) procedure failing with "Unable to isolate the resource" during
   concurrent maintenance of the #EMX0 Cable Card.  This could lead one to take
   disruptive action in order to do the repair. This should occur infrequently
   and only with cases where a physical hardware failure has occurred which
   prevents access to the PCIe reset line (PERST) but allows access to the slot
   power controls.  As a workaround, pulling both cables from the Cable Card to
   the #EMX0 expansion drawer will result in a completely failed state that can
   be handled by bringing up the "PCIe Hardware Topology" screen from either
   ASMI or the HMC. Then retry the R&V operation to recover the Cable Card.
 * On systems with PowerVM firmware, a problem was fixed for a partition with an
   SR-IOV logical port (VF) having a delay in the start of the partition. If the
   partition boot device is an SR-IOV logical port network device, this issue
   may result in the partition failing to boot with SRCs BA180010 and BA155102
   logged and then stuck on progress code SRC 2E49 for an AIX partition.  This
   problem is infrequent because it requires multiple error conditions at the
   same time on the SR-IOV adapter.  To trigger this problem, multiple SR-IOV
   logical ports for the same adapter must encounter EEH conditions at roughly
   the same time such that a new logical port EEH condition is occurring while a
   previous EEH condition's handling is almost complete but not notified to the
   hypervisor yet.  To recover from this problem, reboot the partition.
 * On systems with PowerVM firmware, a problem was fixed for a system hypervisor
   hang and an Incomplete state on the HMC after a logical partition (LPAR) is
   deleted that has an active virtual session from another LPAR.  This problem
   happens every time an LPAR is deleted with an active virtual session.  This
   is a rare problem because virtual sessions from an HMC (a more typical case)
   prevent an LPAR deletion until the virtual session is closed, but virtual
   sessions originating from another LPAR do not have the same check.
   
 * On systems with PowerVM firmware, the following problems were fixed for
   certain SR-IOV adapters:
   1) An error was fixed that occurs during a VNIC failover where the VNIC
   backing device has a physical port down due to an adapter internal error with
   an SRC B400FF02 logged.  This is an improved version of the fix delivered in
   earlier service pack FW860.A0 for adapter firmware 11.4.415.37 and it
   significantly reduces the frequency of the error being fixed.
   2) An adapter in SR-IOV shared mode may cause a network interruption and SRCs
   B400FF02 and B400FF04 logged.  The problem occurs infrequently during normal
   network traffic.
   These fixes update the adapter firmware to 11.4.415.41 for the following
   Feature Codes and CCINs: #EN15/#EN16 with CCIN 2CE3, #EN17/#EN18 with CCIN
   2CE4, #EN0H/#EN0J with CCIN 2B93, #EN0M/#EN0N with CCIN 2CC0, #EN0K/#EN0L
   with CCIN 2CC1, #EL56/#EL38 with CCIN 2B93, and #EL57/#EL3C with CCIN 2CC1.
   Update instructions: 
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
 * On systems with PowerVM firmware and an AIX or Linux partition, a problem was
   fixed for Platform Error Logs (PELs) that are truncated to only eight bytes
   for error logs created by the firmware and reported to the AIX or Linux OS. 
   These PELs may appear to be blank or missing on the OS.  This rare problem is
   triggered by multiple error log events in the firmware occurring close
   together in time and each needing to be reported to the OS, causing a
   truncation in the reporting of the PEL.  As a workaround, the full error
   logs for the truncated logs can be viewed on the HMC or by using ASMI on the
   service processor.
 * On systems with PowerVM firmware, a problem was fixed for Platform Error Logs
   (PELs) not being logged and shown by the OS if they have an Error Severity
   code of "critical error".  The trigger is the reporting by a system firmware
   subsystem of an error log that has set an Event/Error Severity in the 'UH'
   section of the log to a value in the range 0x50 to 0x5F.  The following
   error logs are affected:
   B200308C ==> PHYP ==> A problem occurred during the IPL of a partition.  The
   adapter type cannot be determined.  Ensure that a valid I/O Load Source is
   tagged.
   B700F104 ==> PHYP ==> Operating System error.  Platform Licensed Internal
   Code terminated a partition.
   B7006990 ==> PHYP ==> Service processor failure.
   B2005149 ==> PHYP ==> A problem occurred during the IPL of a partition.
   B700F10B ==> PHYP ==> A resource has been disabled due to hardware problems.
   A7001150 ==> PHYP ==> System log entry only, no service action required.  No
   action needed unless a serviceable event was logged.
   B7005442 ==> PHYP ==> A parity error was detected in the hardware Segment
   Lookaside Buffer (SLB).
   B200541A ==> PHYP ==> A problem occurred during a partition Firmware Assisted
   Dump.
   B7001160 ==> PHYP ==> Service processor failure.
   B7005121 ==> PHYP ==> Platform LIC failure.
   BC8A0604 ==> Hostboot ==> A problem occurred during the IPL of the system.
   BC8A1E07 ==> Hostboot ==> Secure Boot firmware validation failed.
   Note that these error logs are still reported to the service processor and
   HMC properly. This issue does not affect the Call Home action for the error
   logs.
   
 * On systems with PowerVM firmware, a problem was fixed for the Device
   Description in a System Plan related to Crypto Coprocessors and NVMe cards
   that were only showing the PCI vendor and device ID of the cards.  This is
   not enough information to verify which card is installed without looking up
   the PCI IDs first.  With the fix, more specific/useful information is
   displayed and this additional information does not have any adverse impact on
   sysplan operations.  The problem is seen every time a System Plan is created
   for an installed Crypto Coprocessor or NVMe card.
   
 * A problem was fixed for correct ASMI passwords being rejected when accessing
   ASMI using an ASCII terminal with a serial connection to the server.  This
   problem always occurs for systems at firmware level FW860.A0 and later.
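
The severity range named in the missing-PEL fix earlier in this list (an
Event/Error Severity value of 0x50 to 0x5F) can be expressed as a simple range
check.  The sketch below is illustrative only: the function name and the sample
values are assumptions, and it does not parse a real PEL 'UH' section.

```python
# Illustrative sketch: the function name and sample values are hypothetical.
# Only the 0x50-0x5F range is taken from the fix description.
CRITICAL_SEVERITY_MIN = 0x50
CRITICAL_SEVERITY_MAX = 0x5F


def is_critical_severity(severity: int) -> bool:
    """Return True if a one-byte severity value falls in 0x50-0x5F."""
    return CRITICAL_SEVERITY_MIN <= severity <= CRITICAL_SEVERITY_MAX


# Sample values on either side of the range boundaries.
for value in (0x4F, 0x50, 0x5F, 0x60):
    print(f"0x{value:02X}: {is_critical_severity(value)}")
```

Before the fix, error logs whose severity passed this kind of check were the
ones not shown by the OS, although they were still reported to the service
processor and HMC.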
   

System firmware changes that affect certain systems

 * On systems with PowerVM firmware and an IBM i partition, a problem was fixed
   for a Live Partition Mobility (LPM) hang while performing the migration of an
   IBM i partition.   In some situations, there is a timing issue when the
   hypervisor is managing IBM i software licenses.  When a subsequent LPM
   operation is performed, the LPM operation hangs. To recover from this problem
   to be able to do LPM, the system must be re-IPLed.
 * On systems with PowerVM firmware and an IBM i partition, a problem was fixed
   for an IBM i partition running in P7 or P8 processor compatibility mode
   failing to boot with SRCs BA330002 and B200A101 logged.  This problem can be
   triggered as larger configurations for processors and memory are added to the
   partition.  As a circumvention, reduce the number of processors and the
   amount of memory in the partition, or boot the partition in P9 or later
   compatibility mode.

SV860_236_165 / FW860.A2

12/07/21 Impact:  Security   Severity:  HIPER


System firmware changes that affect all systems

 * HIPER/Non-Pervasive:  On systems with PowerVM firmware, a security problem
   was fixed to prevent an attacker that gains service access to the FSP service
   processor from reading and writing PowerVM system memory using a series of
   carefully crafted service procedures.  This problem is Common Vulnerability
   and Exposure number CVE-2021-38917.
   
 * HIPER/Non-Pervasive:  On systems with PowerVM firmware, a problem was fixed
   for the IBM PowerVM Hypervisor where a specific sequence of VM management
   operations could lead to a violation of the isolation between peer VMs.
   This problem is Common Vulnerability and Exposure number CVE-2021-38918.

SV860_234_165 / FW860.A1

09/16/21 Impact:  Data    Severity:  HIPER


System firmware changes that affect all systems

 * HIPER:  On systems with PowerVM firmware, a problem was fixed which may occur
   on a target system following a Live Partition Mobility (LPM) migration of an
   AIX partition utilizing Active Memory Expansion (AME) with 64 KB page size
   enabled using the vmo tunable: "vmo -ro ame_mpsize_support=1".  The problem
   may result in AIX termination, file system corruption, application
   segmentation faults, or undetected data corruption.
   Note:  If you are doing an LPM migration of an AIX partition utilizing AME
   and 64 KB page size enabled involving a POWER8 or POWER9 system, ensure you
   have a Service Pack including this change for the appropriate firmware level
   on both the source and target systems.

SV860_231_165 / FW860.A0

07/08/21 Impact:  Availability     Severity:  SPE


New features and functions

 * Support added to Redfish to provide a command to set the ASMI user passwords
   using a new AccountService schema.   Using this service, the ASMI admin, HMC,
   and general user passwords can be changed.
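
As a rough illustration of the Redfish password change described above, the
sketch below builds (but does not send) a PATCH request.  It assumes the common
DMTF Redfish convention of an account resource under
/redfish/v1/AccountService/Accounts/ with a "Password" property.  The exact
resource path and account identifiers on the service processor are not stated
in this document, and the host name, account ID, and password shown are
placeholders.

```python
import json
import urllib.request


def build_password_change_request(host: str, account_id: str,
                                  new_password: str) -> urllib.request.Request:
    """Build (but do not send) a Redfish PATCH that sets an account password.

    The URL layout is an assumption based on the generic Redfish
    AccountService convention, not on service processor documentation.
    """
    url = f"https://{host}/redfish/v1/AccountService/Accounts/{account_id}"
    body = json.dumps({"Password": new_password}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host, account ID, and password for illustration only.
req = build_password_change_request("fsp.example.com", "admin", "N3w-Passw0rd")
print(req.get_method(), req.full_url)
```

Sending the request would also require authentication and TLS handling, which
are omitted here.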
   

System firmware changes that affect all systems

 * A problem was fixed for Time of Day (TOD) being lost for the real-time clock
   (RTC) with an SRC B15A3303 logged when the service processor boots or
   resets.  This is a very rare problem that involves a timing problem in the
   service processor kernel.  If the server is running when the error occurs,
   there will be an SRC B15A3303 logged, and the time of day on the service
   processor will be incorrect for up to six hours until the hypervisor
   synchronizes its (valid) time with the service processor.  If the server is
   not running when the error occurs, there will be an SRC B15A3303 logged, and
   if the server is subsequently IPLed without setting the date and time in ASMI
   to fix it, the IPL will abort with an SRC B7881201 which indicates to the
   system operator that the date and time are invalid.
 * A problem was fixed in ASMI to allow setting static routes with two default
   gateway IP addresses.  Without the fix, ASMI  always fails with "Invalid
   entry. Gateway address" for this configuration.  As a workaround, the static
   routes could be created using the ASMI command line and the "route add"
   command. 
   
 * On systems with PowerVM firmware, a problem was fixed for intermittent
   failures for a reset of a Virtual Function (VF) for SR-IOV adapters during
   Enhanced Error Handling (EEH) error recovery.  This is triggered by EEH
   events at a VF level only, not at the adapter level.  The error recovery
   fails if a data packet is received by the VF while the EEH recovery is in
   progress.  A VF that has failed can be recovered by a partition reboot or a
   DLPAR remove and add of the VF.
 * On systems with PowerVM firmware, a problem was fixed where the Floating
   Point Unit Computational Test, which should be set to "staggered" by default,
   has been changed in some circumstances to be disabled. If you wish to
   re-enable this option, this fix is required.  After applying this service
   pack,  do the following steps:
   1) Sign into the Advanced System Management Interface (ASMI).
   2) Select Floating Point Unit Computational Test under the System Configuration
   heading and change it from disabled to what is needed: staggered (run once
   per core each day) or periodic (a specified time).
   3) Click "Save Settings".
 * On systems with PowerVM firmware, the following problems were fixed for
   certain SR-IOV adapters:
   1) An error was fixed that occurs during a VNIC failover where the VNIC
   backing device has a physical port down or read port errors with an SRC
   B400FF02 logged.
   2) A problem was fixed for adding a new logical port that has a PVID assigned
   that is causing traffic on that VLAN to be dropped by other interfaces on the
   same physical port which uses OS VLAN tagging for that same VLAN ID.  This
   problem occurs each time a logical port with a non-zero PVID that matches an
   existing VLAN is dynamically added to a partition or is activated as part of
   a partition activation.  When this happens, the traffic flow stops for other
   partitions with OS-configured VLAN devices with the same VLAN ID.  This
   problem can be
   recovered by configuring an IP address on the logical port with the non-zero
   PVID and initiating traffic flow on this logical port.  This problem can be
   avoided by not configuring logical ports with a PVID if other logical ports
   on the same physical port are configured with OS VLAN devices.
   This fix updates the adapter firmware to 11.4.415.37 for the following
   Feature Codes and CCINs: #EN15/#EN16 with CCIN 2CE3, #EN17/#EN18 with CCIN
   2CE4, #EN0H/#EN0J with CCIN 2B93, #EN0M/#EN0N with CCIN 2CC0, #EN0K/#EN0L
   with CCIN 2CC1, #EL56/#EL38 with CCIN 2B93, and #EL57/#EL3C with CCIN 2CC1.
   Update instructions: 
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
 * On systems with PowerVM firmware, a problem was fixed for some serviceable
   events specific to the reporting of EEH errors not being displayed on the
   HMC.  The sending of an associated call home event, however, was not
   affected.  This problem is intermittent and infrequent.
 * A problem was fixed for newer hardware record names (hardware delivered after
   the original POWER8 GA) not being displayed correctly in the ASMI
   deconfiguration records.  For example, Capp is displayed as "Unknown".
 * A problem was fixed for Over Temperature (OT) errors being reported for the
   processor with SRC B1112A10.  In certain workload environments, additional
   cooling is needed for the processors and this can be provided by a user
   option to increase the floor speed for the fans.  This fix is activated using
   the ASMI command line to install an alternate power management definition
   file to increase the fan speeds.  This change will persist until a factory
   reset of the system.  Please contact IBM Support for information on the
   command to use to increase the fan speeds.
   This problem only pertains to the S822 (8284-22A), S822L (8247-22L), and
   S822L (5148-22L) models.
 * On systems with PowerVM firmware, a problem was fixed for a system
   termination with SRC B700F107 following a time facility processor failure
   with SRC B700F10B.  With the fix, the transparent replacement of the failed
   processor will occur for the B700F10B if there is a free core, with no impact
   to the system.
   
 * On systems with PowerVM firmware, a problem was fixed for possible partition
   errors following a concurrent firmware update from FW810 or later. A
   precondition for this problem is that DLPAR operations of either physical or
   virtual I/O devices must have occurred prior to the firmware update.  The
   error can take the form of a partition crash at some point following the
   update. The frequency of this problem is low.  If the problem occurs, the OS
   will likely report a DSI (Data Storage Interrupt) error.  For example, AIX
   produces a DSI_PROC log entry.  If the partition does not crash, it is also
   possible that some subsequent I/O DLPAR operations will fail.
 * A problem was fixed for spurious out-of-range (greater than 127 C)
   temperatures being reported for the processor with SRC B1112A10.  With the
   fix, only valid temperature sensor readings are used when reporting
   processors that have exceeded the Over Temperature (OT) value.
 * A problem was fixed in ASMI for setting a static route with a network address
   for the IP such as "xxx.xxx.xxx.0".  Without the fix, ASMI always fails with
   "Invalid entry. IP address" for this network address format.  As a
   workaround, the static route could be created with the individual IP endpoint
   entered instead of the network address, or created using the ASMI command
   line and the "route add" command.
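
The PVID conflict described earlier in this list (a logical port whose PVID
matches a VLAN ID that other logical ports on the same physical port tag from
the OS) can be checked before a logical port is added.  The sketch below is a
minimal illustration under stated assumptions: the LogicalPort class, its
fields, and the port names are hypothetical and do not correspond to any HMC or
adapter interface.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalPort:
    """Hypothetical model of an SR-IOV logical port on one physical port."""
    name: str
    pvid: int = 0                                  # 0 means no PVID assigned
    os_vlan_ids: set = field(default_factory=set)  # VLAN IDs tagged by the OS


def conflicting_ports(new_port, existing_ports):
    """Return names of existing ports whose OS-tagged VLANs match the new PVID."""
    if new_port.pvid == 0:
        return []
    return [p.name for p in existing_ports if new_port.pvid in p.os_vlan_ids]


# Two existing logical ports on the same physical port (illustrative data).
existing = [
    LogicalPort("vf0", os_vlan_ids={100, 200}),
    LogicalPort("vf1", os_vlan_ids={300}),
]
print(conflicting_ports(LogicalPort("vf2", pvid=100), existing))  # ['vf0']
```

A non-empty result corresponds to the configuration that the avoidance advice
in the fix description says to stay away from on firmware without the fix.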

System firmware changes that affect certain systems

 * On systems with an IBM i partition, a problem was fixed for physical I/O
   property data not being able to be collected for an inactive partition booted
   in "IOR" mode with SRC B200A101 logged.   This can happen when making a
   system plan (sysplan) for an IBM i partition using the HMC and the IBM i
   partition is inactive.  The sysplan data collection for the active IBM i
   partitions is successful.
   
 * On systems with only Integrated Facility for Linux (IFL) processors and AIX
   or IBM i partitions, a problem was fixed for performance issues for IFL VMs
   (Linux and VIOS).  This problem occurs if AIX or IBM i partitions are active
   on a system with IFL-only cores.  As a workaround, AIX or IBM i partitions
   should not be activated on an IFL-only system.  With the fix, the activation
   of AIX and IBM i partitions is blocked on an IFL-only system.  If this fix
   is installed concurrently with AIX or IBM i partitions running, these
   partitions will be allowed to continue to run until they are powered off. 
   Once powered off, the AIX and IBM i partitions will not be allowed to be
   activated again on the IFL-only system.
   This problem pertains to only the E850 (8408-E8E) and E850C (8408-44E) models.
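
The activation rule introduced by the IFL fix above can be summarized as a
small decision function.  This is only a sketch of the policy as described in
the text: the function name, the OS type strings, and the boolean flag are
assumptions, not hypervisor interfaces.

```python
# OS types described in the text as IFL VMs.
IFL_SUPPORTED_OS = {"Linux", "VIOS"}


def may_activate(os_type: str, ifl_only_system: bool) -> bool:
    """Return True if a powered-off partition may be activated.

    Partitions that are already running when the fix is installed
    concurrently are not covered here; per the text they keep running
    until they are powered off.
    """
    if not ifl_only_system:
        return True
    return os_type in IFL_SUPPORTED_OS


for os_type in ("AIX", "IBM i", "Linux", "VIOS"):
    print(os_type, may_activate(os_type, ifl_only_system=True))
```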

SV860_226_165 / FW860.90

12/09/20 Impact:  Data     Severity:  HIPER


New features and functions


 * On systems with PowerVM firmware, enable periodic logging of internal
   component operational data for the PCIe3 expansion drawer paths.  The logging
   of this data does not impact the normal use of the system.

System firmware changes that affect all systems

 * HIPER/Pervasive:  On systems with PowerVM firmware, a problem was fixed for
   certain SR-IOV adapters for a condition that may result from frequent resets
   of adapter Virtual Functions (VFs), or transmission stalls and could lead to
   potential undetected data corruption.
   The following additional fixes are also included:
   1) The VNIC backing device goes to a powered off state during a VNIC failover
   or Live Partition Mobility (LPM) migration.  This failure is intermittent and
   very infrequent.
   2) Adapter time-outs with SRC B400FF01 or B400FF02 logged.
   3) Adapter time-outs related to adapter commands becoming blocked with SRC
   B400FF01 or B400FF02 logged.
   4) VF function resets occasionally not completing quickly enough resulting in
   SRC B400FF02 logged.
   This fix updates the adapter firmware to 11.4.415.33 for the following
   Feature Codes and CCINs: #EN15/#EN16 with CCIN 2CE3, #EN17/#EN18 with CCIN
   2CE4, #EN0H/#EN0J with CCIN 2B93, #EN0M/#EN0N with CCIN 2CC0, #EN0K/#EN0L
   with CCIN 2CC1, #EL56/#EL38 with CCIN 2B93, and #EL57/#EL3C with CCIN 2CC1.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates: 
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * A problem was fixed for the service processor ASMI "Factory Reset" option to
   disable the IPMI service as part of the factory reset.  Without the fix, the
   IPMI operation state will be unchanged by the factory reset.
 * A rare problem was fixed for a checkstop during an IPL that fails to isolate
   and guard the problem core.  An SRC is logged with B1xxE5xx and an extended
   hex word 8 xxxxDD90.  With the fix, the suspected failing hardware is
   guarded.
 * A problem was fixed for the REST/Redfish interface to change the success
   return code for object creation from "200" to "201".  The "200" status code
   means only that the request succeeded.  A "201" status code indicates that a
   request was successful and, as a result, a resource has been created.  The
   Redfish Ruby Client, "redfish_client", may fail a transaction if a "200"
   status code is returned when "201" is expected.
 * On systems with PowerVM firmware, a problem was fixed to allow quicker
   recovery of PCIe links for the #EMX0 PCIe expansion drawer for a run-time
   fault with B7006A22 logged.  The time for recovery attempts can exceed six
   minutes on rare occasions, which may cause I/O adapter failures and failed
   nodes.  With the fix, the PCIe links will recover or fail faster (on the
   order of seconds) so that redundancy in a cluster configuration can be used
   with failure detection and failover processing by other hosts, if available,
   in the case where the PCIe links fail to recover.
 * On systems with PowerVM firmware, a problem was fixed for a concurrent
   maintenance "Repair and Verify" (R&V) operation for a #EMX0 fanout module
   that fails with an "Unable to isolate the resource" error message.  This
   should occur only infrequently for cases where a physical hardware failure
   has occurred which prevents access to slot power controls.  This problem can
   be worked around by bringing up the "PCIe Hardware Topology" screen from
   either ASMI or the HMC after the hardware failure but before the concurrent
   repair is attempted.  This will avoid the problem with the PCIe slot
   isolation.  These steps can also be used to recover from the error to allow
   the R&V repair to be attempted again.
 * On systems with PowerVM firmware, a problem was fixed for a B7006A96 fanout
   module FPGA corruption error that can occur in unsupported PCIe3 expansion
   drawer (#EMX0) configurations that mix an enhanced PCIe3 fanout module (#EMXH)
   in the same drawer with legacy PCIe3 fanout modules (#EMXF, #EMXG, #ELMF, or
   #ELMG).  This causes the FPGA on the enhanced #EMXH to be updated with the
   legacy firmware and it becomes a non-working and unusable fanout module. 
   With the fix, the unsupported #EMX0 configurations are detected and handled
   gracefully without harm to the FPGA on the enhanced fanout modules.
 * On systems with PowerVM firmware, a problem was fixed for possible
   dispatching delays for partitions running in POWER8 processor compatibility
   mode.
 * On systems with PowerVM firmware, a problem was fixed for system memory not
   returned after create and delete of partitions, resulting in slightly less
   memory available after configuration changes in the systems.  With the fix,
   an IPL of the system will recover any of the memory that was orphaned by the
   issue.
 * On systems with PowerVM firmware, a problem was fixed for utilization
   statistics for commands such as HMC lslparutil and third-party lpar2rrd that
   do not accurately represent CPU utilization.  The values are incorrect every
   time for a partition that is migrated with Live Partition Mobility (LPM).
   Power Enterprise Pools 2.0 is not affected by this problem.  If this problem
   has occurred, here are three possible recovery options:
   1) Re-IPL the target system of the migration.
   2) Or delete and recreate the partition on the target system.
   3) Or perform an inactive migration of the partition.  The cycle values get
   zeroed in this case.
 * On systems with PowerVM firmware, a problem was fixed for a PCIe3 expansion
   drawer cable that has hidden error logs for a single lane failure.  This
   happens whenever a single lane error occurs.  Subsequent lane failures are
   not hidden and have visible error logs.  Without the fix, the hidden or
   informational logs would need to be examined to gather more information for
   the failing hardware.
 * On systems with PowerVM firmware, a problem was fixed for a DLPAR remove of
   memory from a partition that fails if the partition contains 65535 or more
   LMBs.  With 16 MB LMBs, this error threshold is 1 TB of memory.  With 256 MB
   LMBs, it is 16 TB of memory.  A reboot of the partition after the DLPAR will
   remove the memory from the partition.
 * On systems with PowerVM firmware, a problem was fixed for extraneous B400FF01
   and B400FF02 SRCs logged when moving cables on SR-IOV adapters.  This is an
   infrequent error that can occur if the HMC performance monitor is running at
   the same time the cables are moved.  These SRCs can be ignored when
   accompanied by cable movement.
 * On systems with PowerVM firmware, a problem was fixed for B400FF02 errors for
   certain SR-IOV adapters during adapter initialization or error recovery. 
   This is a rare error that can occur because of a race condition in the
   firmware.
   This fix pertains to adapters with the following Feature Codes and CCINs:
   #EN15/#EN16 with CCIN 2CE3, #EN17/#EN18 with CCIN 2CE4, #EN0H/#EN0J with CCIN
   2B93, #EN0M/#EN0N with CCIN 2CC0, #EN0K/#EN0L with CCIN 2CC1, #EL56/#EL38
   with CCIN 2B93, and #EL57/#EL3C with CCIN 2CC1.
 * On systems with OPAL firmware, a problem was fixed for a reset/reload of the
   service processor initiated by ipmitool inband usage on the host (such as "mc
   reset cold") causing all subsequent inband IPMI messages to be blocked.
 * On systems with OPAL firmware, a problem was fixed for host hangs that can
   occur when doing error recovery.
 * On systems with OPAL firmware, a problem was fixed for I2C transactions to
   the On-Chip Controller (OCC) causing a host hang.
 * On systems with PowerVM firmware, a problem was fixed for not logging SRCs
   for certain cable pulls from the #EMX0 PCIe expansion drawer.  With the fix,
   the previously undetected cable pulls are now detected and logged with SRC
   B7006A8B and B7006A88 errors.
 * On systems with PowerVM firmware, a problem was fixed for a rare system hang
   that can occur when a page of memory is being migrated.  Page migration
   (memory relocation) can occur for a variety of reasons, including predictive
   memory failure, DLPAR of memory, and normal operations related to managing
   the page pool resources.
 * On systems with PowerVM firmware, a problem was fixed for running PCM on a
   system with SR-IOV adapters in shared mode that results in an "Incomplete"
   system state with certain hypervisor tasks deadlocked.  This problem is rare
   and is triggered when using SR-IOV adapters in shared mode and gathering
   performance statistics with PCM (Performance Collection and Monitoring) and
   also having a low level error on an adapter.  The only way to recover from
   this condition is to re-IPL the system.
 * On systems with PowerVM firmware, a problem was fixed so that an SRC B7006A99
   informational log is now posted as a Predictive with a call out of the CXP
   cable FRU.  This fix improves FRU isolation for cases where a CXP cable alert
   causes a B7006A99 that occurs prior to a B7006A22 or B7006A8B.  Without the
   fix, the SRC B7006A99 is informational and the latter SRCs cause a larger
   hardware replacement even though the earlier event identified a probable
   cause for the cable FRU.
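
The memory thresholds quoted for the DLPAR memory-remove fix earlier in this
list (1 TB with 16 MB LMBs, 16 TB with 256 MB LMBs) follow from multiplying the
65535 LMB limit by the LMB size.  The short calculation below reproduces that
arithmetic.  It assumes binary units (1 TB = 1024 * 1024 MB), and the function
name is illustrative.

```python
LMB_LIMIT = 65535  # LMB count at which the DLPAR remove failed, per the text


def threshold_tb(lmb_size_mb: int) -> float:
    """Memory covered by LMB_LIMIT blocks, in TB (assuming 1 TB = 1024*1024 MB)."""
    return LMB_LIMIT * lmb_size_mb / (1024 * 1024)


for size_mb in (16, 256):
    print(f"{size_mb} MB LMBs: about {threshold_tb(size_mb):.2f} TB")
```

Both results fall just under the round figures because the limit is 65535
rather than 65536 LMBs.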

SV860_215_165 / FW860.81

03/04/20 Impact:  Security      Severity:  HIPER

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


System firmware changes that affect all systems


 * HIPER/Pervasive:  A problem was fixed for an HMC "Incomplete" state for a
   system after the HMC user password is changed with ASMI on the service
   processor.  This problem can occur if the HMC password is changed on the
   service processor but not also on the HMC, and a reset of the service
   processor happens.  With the fix, the HMC will get the needed "failed
   authentication" error so that the user knows to update the old password on
   the HMC.

SV860_212_165 / FW860.80

12/17/19 Impact:  Security      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


New features and functions


 * Support was added for improved security for the service processor password
   policy.  For the service processor, the "admin", "hmc" and "general" password
   must be set on first use for newly manufactured systems and after a factory
   reset of the system.  The IPMI interface has been changed to be disabled by
   default in these scenarios.  The REST/Redfish interface will return an error
   saying the user account is expired.  This policy change helps to ensure that
   the service processor is not left in a state with a well-known password.  The
   user can change from an expired default password to a new password using the
   Advanced System Management Interface (ASMI).
 * Support was added for real-time data capture for PCIe3 expansion drawer
   (#EMX0) cable card connection data via resource dump selector on the HMC or
   in ASMI on the service processor.  Using the resource selector string of
   "xmfr -dumpccdata" will non-disruptively generate an RSCDUMP type of dump
   file that has the current cable card data, including data from cables and the
   retimers.
   

System firmware changes that affect all systems


 * A problem was fixed for an intermittent IPMI core dump on the service
   processor.  This occurs only rarely when multiple IPMI sessions are starting
   and cleaning up at the same time.  A new IPMI session can fail initialization
   when one of its session objects is cleaned up.  The circumvention is to retry
   the IPMI command that failed.
 * On systems using PowerVM firmware, a problem was fixed for SR-IOV adapters to
   provide a consistent Informational message level for cable plugging issues. 
   For transceivers not plugged on certain SR-IOV adapters, an unrecoverable
   error (UE) SRC B400FF03 was changed to an Informational message logged.  This
   affects the SR-IOV adapters with the following feature codes and CCINs:
   #EC2R/EC2S with CCIN 58FA; #EC2T/EC2U with CCIN 58FB; and #EC3L/EC3M with
   CCIN 2CEC.
   For copper cables unplugged on certain SR-IOV adapters, a missing message was
   replaced with an Informational message logged.  This affects the SR-IOV
   adapters with the following feature codes and CCINs:  #EN17/EN18 with CCIN
   2CE4; #EN0K/EN0L with CCIN 2CC1; and #EL57/EL3C with CCIN 2CC1.
 * On systems with PowerVM firmware, the following problem related to SR-IOV was
   fixed:  If the SR-IOV logical port's VLAN ID (PVID) is modified while the
   logical port is configured, the adapter will use an incorrect PVID for the
   Virtual Function (VF).  This problem is rare because most users do not change
   the PVID once the logical port is configured, so they will not have the
   problem.
   This fix updates adapter firmware to 10.2.252.1940 for the following Feature
   Codes and CCINs: #EN15/EN16 with CCIN 2CE3; #EN17/EN18 with CCIN 2CE4;
   #EN0H/EN0J with CCIN 2B93; #EN0M/EN0N with CCIN 2CC0; #EN0K/EN0L with CCIN
   2CC1; #EL56/EL38 with CCIN 2B93; and #EL57/EL3C with CCIN 2CC1.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * A problem was fixed for unknowingly running at lower (the default)
   frequencies when changing into Fixed Max Frequency (FMF) mode.  This problem
   is unlikely to happen because it requires that the system is already in FMF
   mode when the user requests a change into FMF mode.  The request is not
   handled correctly: the tunable parameters are reset to their defaults,
   which allows the processor frequency to be reduced to the minimum value.
   The recovery for this problem is to change the power mode to "Nominal" and
   then change it to FMF.
 * A problem was fixed for Novalink failing to activate partitions that have
   names with character lengths near the maximum allowed character length.  This
   problem can be circumvented by changing the partition name to have 32
   characters or less.
 * A problem was fixed where a Linux or AIX partition type was incorrectly
   reported as unknown.  Symptoms include: IBM Cloud Management Console (CMC)
   not being able to determine the RPA partition type (Linux/AIX) for partitions
   that are not active; and HMC attempts to dynamically add CPU to Linux
   partitions may fail with a HSCL1528 error message stating that there are not
   enough Integrated Facility for Linux (IFL) cores for the operation.
   
 * A problem was fixed for a possible system crash with SRC B7000103 if the HMC
   session is closed while the performance monitor is active.  As a
   circumvention for this problem, make sure the performance monitor is turned
   off before closing the HMC sessions.
 * A problem was fixed for a Live Partition Mobility (LPM) migration of a large
   memory partition to a target system that causes the target system to crash
   and for the HMC to go to the "Incomplete" state.  For servers with the
   default LMB size (256MB), if a partition is >=16TB and if desired memory is
   different than the maximum memory, LPM may fail on the target system. 
   Servers with LMB sizes less than the default could hit this problem with
   smaller memory partition sizes.  A circumvention to the problem is to set the
   desired and maximum memory to the same value for the large memory partition
   that is to be migrated.
 * A problem was fixed for system hangs or incomplete states displayed by HMC(s)
   caused by a loop in the handling of Segment Lookaside Buffer (SLB) cache
   memory parity errors where SRC B7005442 may be logged.  This problem has a
   low frequency of occurrence as it requires severe errors in the SLB cache
   that are not cleared by an error flush of the entries.  A re-IPL of the
   system can be used to recover from this error.
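
The Live Partition Mobility large-memory condition described in this list can
be checked before a migration is attempted.  The following Python sketch is
illustrative only.  The function and parameter names are hypothetical and it is
not an HMC or PowerVM interface.  It assumes "16TB" means 16 x 1024 x 1024 MB,
and it only evaluates the default 256 MB LMB case because these notes do not
state the size threshold for other LMB sizes.

```python
from typing import Optional

DEFAULT_LMB_MB = 256                   # default LMB size named in the notes
LARGE_PARTITION_MB = 16 * 1024 * 1024  # 16 TB expressed in MB (assumed binary units)


def lpm_large_memory_at_risk(size_mb: int,
                             desired_mb: int,
                             maximum_mb: int,
                             lmb_mb: int = DEFAULT_LMB_MB) -> Optional[bool]:
    """Return True if the partition matches the condition in the notes.

    size_mb is the partition memory size as the notes use the term; it is
    passed separately because the notes do not say which configured value
    it corresponds to.  Returns None for a non-default LMB size, where the
    notes give no threshold.
    """
    if lmb_mb != DEFAULT_LMB_MB:
        return None  # threshold not stated for this LMB size
    return size_mb >= LARGE_PARTITION_MB and desired_mb != maximum_mb


if __name__ == "__main__":
    tb = 1024 * 1024
    print(lpm_large_memory_at_risk(16 * tb, 16 * tb, 32 * tb))
```

If the check returns True, setting the desired and maximum memory to the same
value before the migration is the circumvention given above.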
   

System firmware changes that affect certain systems


 * On systems with an IBM i partition, a problem was fixed for a D-mode IPL
   failure when using a USB DVD drive in an IBM 7226 multimedia storage
   enclosure.  Error logs with SRC BA16010E, B2003110, and/or B200308C can
   occur.  As a circumvention, an external DVD drive can be used for the D-mode
   IPL.
 * On systems with IBM i partitions, a rare problem was fixed for an
   intermittent failure of a DLPAR remove of an adapter.  In most cases, a retry
   of the operation will be successful.
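
To see whether a shared-mode SR-IOV adapter already runs the adapter firmware
level delivered with FW860.80, the level reported for the adapter can be
compared with 10.2.252.1940.  A minimal Python sketch follows, assuming the
level is available as a dotted numeric string.  The function names are
hypothetical.  Ordering between the 10.2.252.x levels and other branches is
not stated in these notes, so the sketch returns None instead of guessing when
the first three fields differ.

```python
from typing import Optional, Tuple

# Adapter firmware level and feature codes listed for FW860.80 in these notes.
FW860_80_SRIOV_LEVEL = "10.2.252.1940"
FW860_80_FEATURE_CODES = frozenset({
    "EN15", "EN16", "EN17", "EN18", "EN0H", "EN0J", "EN0M",
    "EN0N", "EN0K", "EN0L", "EL56", "EL38", "EL57", "EL3C",
})


def parse_level(level: str) -> Tuple[int, ...]:
    """Split a dotted level such as '10.2.252.1940' into integers."""
    return tuple(int(part) for part in level.strip().split("."))


def at_fix_level(current: str,
                 required: str = FW860_80_SRIOV_LEVEL) -> Optional[bool]:
    """True/False within the same branch, None when the branches differ."""
    cur, req = parse_level(current), parse_level(required)
    if cur[:3] != req[:3]:
        return None  # different branch; relative ordering is an open question
    return cur >= req


if __name__ == "__main__":
    print(at_fix_level("10.2.252.1939"))
```

A result of False suggests the adapter has not yet picked up the new level,
for example because no reboot or manual update has happened since the system
firmware was installed.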

SV860_205_165 / FW860.70

06/18/19 Impact:  Availability      Severity:  HIPER

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


System firmware changes that affect all systems


 * HIPER/Pervasive:  On systems with PowerVM firmware, the following problems
   related to SR-IOV were fixed:
   1) A problem was fixed for new or replacement SR-IOV adapters with feature
   codes EN15 and EN17 being rendered non-functional when moved to SR-IOV mode.
   This includes cards moved from dedicated device mode, newly installed
   adapters, and FRU replacements. This problem occurs when the adapter firmware
   is updated to the 10.2.252.x levels from 11.x adapter firmware levels.
   2) A problem was fixed for certain SR-IOV adapters where SRC B400FF01 errors
   are seen during vNIC failovers and Live Partition Mobility (LPM) migration of
   vNIC clients.  This may also result in errors seen in partitions (for
   example, some partitions may show LNC2ENT_TX_ERR).
   3) A problem was fixed where network multicast traffic is not received by an
   SR-IOV logical port (VF) network interface for a Linux partition. The failure
   can occur when the partition transitions the network interface out of
   promiscuous or multicast promiscuous mode.
   These fixes update adapter firmware to 10.2.252.1939 for the following
   Feature Codes:  EN15, EN17, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38, EL3C,
   EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * DEFERRED: PARTITION_DEFERRED:  On systems with PowerVM firmware, a problem
   was fixed for repeated CPU DLPAR remove operations by Linux (Ubuntu, SUSE, or
   RHEL) OSes possibly resulting in a partition crash.  No specific SRCs or
   error logs are reported.   The problem can occur on any DLPAR CPU remove
   operation if running on Linux.  The occurrence is intermittent and rare.  The
   partition crash may result in one or more of the following console messages
   (in no particular order):
    1) Bad kernel stack pointer addr1 at addr2
    2) Oops: Bad kernel stack pointer
    3) ******* RTAS CALL BUFFER CORRUPTION *******
    4)  ERROR: Token not supported
   This fix does not activate until there is a reboot of the partition.
 * A problem was fixed for a PCIe Hub checkstop with SRC B138E504 logged that
   fails to guard the errant processor chip.  With the fix, the problem hardware
   FRU is guarded so there is not a recurrence of the error on the next IPL.
 * A problem was fixed for an incorrect SRC of B1810000 being logged when a
   firmware update fails because of Entitlement Key expiration.  The error
   displayed on the HMC and in the OS is correct and meaningful.  With the fix,
   for this firmware update failure the correct SRC of B181309D is now logged.
 * A problem was fixed for informational logs flooding the error log if a "Get
   Sensor Reading" is not working.
   
 * A problem was fixed for a Redfish (REST) Patch request for PowerSaveMode with
   an unsupported mode value returning an error code "500" instead of the
   correct error code of "400".
   
 * On systems with PowerVM firmware,  a problem was fixed for a rare Live
   Partition Mobility migration hang with the partition left in VPM (Virtual
   Page Mode) which causes performance concerns.  This error is triggered by a
   migration failover operation occurring during the migration state of
   "Suspended" and there has to be insufficient VASI buffers available to clear
   all partition state data waiting to be sent to the migration target. 
   Migration failovers are rare and the migration state of "Suspended" is a
   migration state lasting only a few seconds for most partitions, so this
   problem should not be frequent.  On the HMC, there will be an inability to
   complete either a migration stop or a recovery operation.  The HMC will show
   the partition as migrating and any attempt to change that will fail.  The
   system must be re-IPLed to recover from the problem.
 * A problem was fixed for an IPMI core dump and SRC B1818601 logged
   intermittently when an IPMI session is closed.  A flood of B1818A03 SRCs may
   be logged after the error occurs.  The IPMI server is not impacted and a call
   home is reported for the problem.  There is no service outage for the IPMI
   users because of this.
 * A problem was fixed for IPMI sessions in the service processor causing a
   flood of B181A803 informational error logs on registry read fails for IPv6
   and IPv4 keywords.  These error logs do not represent a real problem and may
   be ignored.
 * On systems with the PowerVM firmware,  a problem was fixed for shared
   processor partitions going unresponsive after changing the processor sharing
   mode of a dedicated processor partition from "allow when partition is active"
   to either "allow when partition is inactive" or "never".  This problem can be
   circumvented by avoiding disabling processor sharing when active on a
   dedicated processor partition.  To recover if the issue has been encountered,
   enable "processor sharing when active" on the dedicated partition.
 * On systems with PowerVM firmware, a problem was fixed for an error in
   deleting a partition with the virtualized Trusted Platform Module (vTPM)
   enabled and SRC B7000602 logged.  When this error occurs, the encryption
   process in the hypervisor may become unusable.  The problem can be recovered
   from with a re-IPL of the system.
 * On systems with PowerVM firmware, a problem was fixed in Live Partition
   Mobility (LPM) of a partition to a shared processor pool, which results in
   the partition being unable to consume uncapped cycles on the target system. 
   To prevent the issue from occurring, partitions can be migrated to the
   default shared processor pool and then dynamically moved to the desired
   shared processor pool.  To recover from the issue, do one of the following:
   1) Use DLPAR to add or remove a virtual processor to/from the affected
   partition.
   2) Dynamically move the partition between shared processor pools.
   3) Reboot the partition.
   4) Re-IPL the system.
   
 * On systems with PowerVM firmware,  a problem was fixed for a boot failure
   using a N_PORT ID Virtualization (NPIV) LUN for an operating system that is
   installed on a disk of 2 TB or greater, and having a device driver for the
   disk that adheres to a non-zero allocation length requirement for the "READ
   CAPACITY 16".  The IBM partition firmware had always used an invalid zero
   allocation length for the return of data and that had been accepted by
   previous device drivers.  Now some of the newer device drivers are adhering
   to the specification and needing an allocation length of non-zero to allow
   the boot to proceed.
 * On systems with PowerVM firmware, a problem was fixed for failing to boot
   from an AIX mksysb backup on a USB RDX drive with SRCs logged of BA210012,
   AA06000D, and BA090010.  The problem trigger is a boot attempt from the RDX
   device. The boot error does not occur if a serial console is used to navigate
   the SMS menus.
   
 * On systems with PowerVM firmware,  a problem was fixed for a system IPLing
   with an invalid time set on the service processor that causes partitions to
   be reset to the Epoch date of 01/01/1970.  With the fix, on the IPL, the
   hypervisor logs a B700120x when the service processor real time clock is
   found to be invalid and halts the IPL to allow the time and date to be
   corrected by the user.  The Advanced System Management Interface (ASMI) can
   be used to correct the time and date on the service processor.  On the next
   IPL, if the time and date have not been corrected, the hypervisor will log
   SRC B7001224 (indicating the user was warned on the last IPL) and allow the
   partitions to start, but the time and date will be set to the Epoch value.
 * A security problem was fixed in the service processor Network Security
   Services (NSS) services which, with a man-in-the-middle attack, could provide
   false completion or errant network transactions or exposure of sensitive data
   from intercepted SSL connections to ASMI, Redfish, or the service processor
   message server.  The Common Vulnerabilities and Exposures issue number is
   CVE-2018-12384.
 * On systems with PowerVM firmware, a problem was fixed for a hypervisor task
   getting deadlocked if partitions are powered on at the same time that SR-IOV
   is being configured for an adapter.  With this problem, workloads will
   continue to run but it will not be possible to change the virtualization
   configuration or power partitions on and off.  This error can be recovered by
   doing a re-IPL of the system.
 * On systems with PowerVM firmware, a problem was fixed for hypervisor tasks
   getting deadlocked that cause the hypervisor to be unresponsive to the HMC
   (this shows as an incomplete state on the HMC) with SRC B200F011 logged.  This
   is a rare timing error.  With this problem,  OS workloads will continue to
   run but it will not be possible for the HMC to interact with the partitions. 
   This error can be recovered by doing a re-IPL of the system with a scheduled
   outage.
 * A problem was fixed for false indication of a real time clock (RTC) battery
   failure with SRC B15A3305 logged.  This error happens infrequently.  If the
   error occurs, and another battery failure SRC is not logged within 24 hours,
   ignore the error as it was caused by a timing issue in the battery test.
 * A problem was fixed for an IPMI core dump and SRC B181720D logged, causing
   the service processor to reset due to a low memory condition.  The memory
   loss is triggered by frequently using the ipmitool to read the network
   configuration.  The service processor recovers from this error but if three
   of these errors occur within a 15 minute time span, the service processor
   will go to a failed hung state with SRC B1817212 logged.  Should a service
   processor hang occur, OS workloads will continue to run but it will not be
   possible for the HMC to interact with the partitions.  This service processor
   hung state can be recovered by doing a re-IPL of the system with a scheduled
   outage.
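
The "three errors within a 15 minute time span" condition in the last item can
be watched for from outside the service processor, for example while reviewing
timestamps of B181720D error log entries.  The Python sketch below is a
hypothetical monitoring helper, not the service processor's own logic.  The
class name is invented for illustration, and it assumes that occurrences
exactly 15 minutes apart still count as inside the span.

```python
from collections import deque
from datetime import datetime, timedelta


class IpmiResetWatch:
    """Track B181720D occurrences and flag when the limit is reached."""

    def __init__(self, limit: int = 3,
                 window: timedelta = timedelta(minutes=15)):
        self.limit = limit
        self.window = window
        self._seen = deque()

    def record(self, logged_at: datetime) -> bool:
        """Add one occurrence; return True once `limit` fall inside `window`.

        Timestamps are expected in non-decreasing order.
        """
        self._seen.append(logged_at)
        # Discard occurrences older than the window relative to the newest one.
        while logged_at - self._seen[0] > self.window:
            self._seen.popleft()
        return len(self._seen) >= self.limit


if __name__ == "__main__":
    start = datetime(2019, 6, 18, 12, 0)
    watch = IpmiResetWatch()
    for minutes in (0, 5, 10):
        print(watch.record(start + timedelta(minutes=minutes)))
```

A True result corresponds to the situation in which the notes say the service
processor goes to the failed hung state with SRC B1817212 logged.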
   

System firmware changes that affect certain systems


 * DEFERRED:  On systems with a PCIe3 I/O expansion drawer (#EMX0) , a problem
   was fixed for the PCIe3 I/O expansion drawer links to improve stability.  
   Intermittent training failures on the links occurred during the IPL with SRC
   B7006A8B logged.  With the fix, the link settings were changed to lower the
   peak link signal amplification to bring the signal level into the middle of
   the operating range, thus improving the high margin to reduce link training
   failures.  The system must be re-IPLed for the fix to activate.
   
 * On a system with an IBM i partition, a problem was fixed for a DLPAR
   force-remove of a physical IO adapter from an IBM i partition and a
   simultaneous power off of the partition causing the partition to hang during
   the power off.  To recover the partition from the error, the system must be
   re-IPLed.  This problem is rare because there is only a 2-second timing
   window for the DLPAR and power off to interfere with each other.
 * On a system with an active IBM i partition, a problem was fixed for a SPCN
   firmware download to the PCIe3 I/O expansion drawer (feature #EMX0) Chassis
   Management Card (CMC) that could possibly get stuck in a pending state.  This
   failure is very unlikely as it would require a concurrent replacement of the
   CMC card that is loaded with a SPCN level that is older than 2015
   (01MEX151012a).  The failure with the SPCN download can be corrected by a
   re-IPL of the system.
 * On a system with an AMS (Active Memory Sharing) partition, a problem was
   fixed for a Live Partition Mobility (LPM) migration failure when migrating
   from P9 to a pre-FW860 P8 or P7 system.  This failure can occur if the P9
   partition is in dedicated memory mode, and the Physical Page Table (PPT)
   ratio is explicitly set on the HMC (rather than keeping the default value)
   and the partition is then transitioned to AMS mode prior to the migration to
   the older system.  This problem can be avoided by using dedicated memory in
   the partition being migrated back to the older system.
 * On systems with PowerVM firmware and a vNIC configuration with multiple
   backing Virtual Functions (VFs), a problem was fixed for a backing VF failure
   after a sequence of repeated failovers where one of the VF backing devices
   goes to a powered off state.  This problem is infrequent and only occurs
   after many vNIC failovers.  A reboot of the partition with the affected VF
   will recover it.
 * On systems with PCIe3 expansion drawers (feature code #EMX0),  a problem was
   fixed for a UE B700BA01 logged after a FRU was replaced in the PCIe Expansion
   drawer.  The log should have been informational instead of unrecoverable
   because it is normal to have this log for a replaced part in the expansion
   drawer that has a different serial number from the old part.  If a part in
   the expansion drawer has been replaced, the UE error log can be ignored.
 * On systems with IBM i partitions, a problem was fixed for Live Partition
   Mobility (LPM) migrations that could have incorrect hardware resource
   information (related to VPD) in the target partition if a failover had
   occurred for the source partition during the migration.  This failover would
   have to occur during the Suspended state of the migration, which only lasts
   about a second, so this should be rare.  With the fix, at a minimum the
   migration error will be detected to abort the migration so it can be
   restarted.  And at a later IBM i OS level, the fix will allow the migration to
   complete even though the failover has occurred during the Suspended state of
   the migration.
   
 * On systems with PCIe3 expansion drawers (feature #EMX0), a problem was fixed
   for PCI link recovery failure during a PCI Host Bridge (PHB) reset with SRCs
   of B7006A80, B7006A22, B7006A8B, and B7006970 logged.  This causes the cable
   card to fail, losing all slots in the expansion drawer.  This is a rare
   problem.  If this error occurs, a concurrent maintenance operation could
   reboot the expansion drawer or a re-IPL of the system could be done to
   recover the drawer.
 * On systems with an IBM i partition with greater than 9999 GB installed, a
   problem was fixed for On/Off COD memory-related amounts not being displayed
   correctly.  This only happens when retrieving the On/Off COD numbers via a
   particular IBM i MATMATR MI command option value.
 * On systems with PCIe3 expansion drawers (feature code #EMX0), a problem was
   fixed for a concurrent exchange of a PCIe expansion drawer cable card that,
   although successful, leaves the fault LED turned on.
 * On systems using PowerVM firmware, a problem was fixed for shared processor
   pools where uncapped shared processor partitions placed in a pool may not be
   able to consume all available processor cycles.  The problem may occur when
   the sum of the allocated processing units for the pool member partitions
   equals the maximum processing units of the pool.
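
The trigger condition in the last item (the member partitions' allocated
processing units adding up to exactly the pool maximum) is easy to test for in
a configuration review.  The Python sketch below is illustrative only.  The
function name is hypothetical, and the values are assumed to be supplied as
decimal strings such as "0.5" so that the equality test is not affected by
binary floating-point rounding.

```python
from decimal import Decimal
from typing import Iterable


def pool_fully_allocated(member_units: Iterable[str],
                         pool_max_units: str) -> bool:
    """True when member allocations sum exactly to the pool maximum."""
    total = sum((Decimal(u) for u in member_units), Decimal("0"))
    return total == Decimal(pool_max_units)


if __name__ == "__main__":
    # Three uncapped partitions in a pool whose maximum is 1.0 units.
    print(pool_fully_allocated(["0.1", "0.2", "0.7"], "1.0"))
```

A True result identifies a pool in which, without the fix, uncapped partitions
may not be able to consume all available processor cycles.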

SV860_180_165 / FW860.60

10/31/18 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


System firmware changes that affect all systems


 * A security problem was fixed in the Dynamic Host Configuration Protocol
   (DHCP) client on the service processor for an out-of-bounds memory access
   flaw that could be used by a malicious DHCP server to crash the DHCP client
   process.  The Common Vulnerabilities and Exposures issue number is
   CVE-2018-5732.
 * A problem was fixed for ipmitool not being able to set the system power limit
   when the power limit is not activated with the standard option.  With the
   fix, the ipmitool user can activate the power limit "dcmi power activate" and
   then set the power limit "dcmi power set_limit xxxx" where "xxxx" is the
   new power limit in watts.
 * A problem was fixed for the periodic guard reminder function to not re-post
   error logs of failed FRUs on each IPL.  Instead, a reminder SRC is created to
   call home the list of FRUs that have failed and require service.  This
   returns the system to the original behavior of only posting one error log
   for each FRU that has failed.
 * For a HMC managed system, a problem was fixed for a rare, intermittent
   NetsCMS core dump that could occur whenever the system is doing a deferred
   shutdown power off.  There is no impact to normal operations as the power off
   completes, but there are extra error logs with SRC B181EF88  and a service
   processor dump.
 * A problem was fixed for the Redfish "Manager" request returning duplicate
   object URIs for the same HMC.  This can occur if the HMC was removed from the
   managed system and then later added back in.  The Redfish objects for the
   earlier instances of the same HMC were never deleted on the remove.
 * Hardware data collection performance was improved for platform-level dumps.
 * A problem was fixed for platform dumps failing for HWPROC checkstops, causing
   the system to terminate instead of re-IPLing after the processor failure.  To
   recover, the system can be powered off and then IPLed.  Any problem hardware
   will be guarded during the IPL to allow normal system operations.
 * A security problem was fixed to detect and prevent Self Boot Engine (SBE)
   SEEPROM corruption.   The Common Vulnerabilities and Exposures issue number
   is CVE-2018-8931.
   

System firmware changes that affect certain systems


 * On systems with PowerVM firmware,  a problem was fixed for certain hypervisor
   error logs being slow to report to the OS.  The error logs affected are those
   created by the hypervisor immediately after the hypervisor is started and if
   there are more than 128 error logs from the hypervisor to be reported.  The
   error logs at the end of the queue take a long time to be processed, and may
   make it appear as if error logs are not being reported to the OS.
 * On systems with PowerVM firmware,  a problem was fixed for an enclosure fault
   LED being stuck on after a repair of a fan.  This problem only occurs after
   the second concurrent repair of a fan.
 * On systems with PowerVM firmware,  a problem was fixed for a concurrent EMX0
   PCIe3 expansion CXP (120 Gb/s 12x Small Form-factor Pluggable) cable adapter
   add or repair that fails with a hypervisor 0x030A error after a previous add
   or repair failure.  The affected CXP cable adapters have feature codes #EJ05
   and #EJ08.  A system IPL will recover from the problem.
 * On systems with PowerVM firmware,  a problem was fixed for a dedicated
   processor partition hanging during a shutdown.  This is a very rare problem
   with only a small timing window in the shutdown that can cause the hang.
   
 * On systems with PowerVM firmware, a problem was fixed for a Novalink enabled
   partition not being able to release master from the HMC that results in error
   HSCLB95B.  To resolve the issue, run a rebuild managed server operation on
   the HMC and then retry the release.  This occurs when attempting to release
   master from HMC after the first boot up of a Novalink enabled partition if
   Master Mode was enforced prior to the boot.
 * On systems with PowerVM firmware, a problem was fixed for resource dumps that
   use the selector "iomfnm" and options "rioinfo" or "dumpbainfo".  This
   combination of options for resource dumps always fails without the fix.
 * On a system with an AIX partition,  a problem was fixed for a partition time
   jump that could occur after doing an AIX Live Update.  This problem could
   occur if the AIX Live Update happens after a Live Partition Mobility (LPM)
   migration to the partition.  AIX applications using the timebase facility
   could observe a large jump forwards or backwards in the time reported by the
   timebase facility.   A circumvention to this problem is to reboot the
   partition after the LPM operation prior to doing the AIX Live Update.  An AIX
   fix is also required to resolve this problem.  The issue will no longer occur
   when this firmware update is applied on the system that is the target of the
   LPM operation and the AIX partition performing the AIX Live Update has the
   appropriate AIX updates installed prior to doing the AIX Live Update.
 * On systems with PowerVM firmware, a problem was fixed for a Virtual Network
   Interface Controller (vNIC) client adapter to prevent a failover when
   disabling the adapter from the HMC.  A failover to a new backing device could
   cause the client adapter to erroneously appear to be active again when it is
   actually disabled.  This causes confusion and failures on the OS for the
   device driver.  This problem can only occur when there is more than a single
   backing device for the vNIC adapter and if commands are issued from the HMC
   to disable the adapter and enable the adapter.
 * On systems with PowerVM firmware, a problem was fixed for all variants (this
   was partially fixed in an earlier release) for the SR-IOV firmware adapter
   updates using the HMC GUI or CLI to only reboot one SR-IOV adapter at a
   time.  If multiple adapters are updated at the same time, the HMC error
   message HSCF0241E may occur:  "HSCF0241E Could not read firmware information
   from SR-IOV device ...".  This fix prevents the system network from being
   disrupted by the SR-IOV adapter updates when redundant configurations are
   being used for the network.  The problem can be circumvented by using the HMC
   GUI to update the SR-IOV firmware one adapter at a time using the following
   steps:
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8efd/p8efd_updating_sriov_firmware.htm
   
 * On systems with PowerVM firmware, a problem was fixed for the callout of SRC
   BA188002 so it does not display three trailing extra garbage characters in
   the location code for the FRU.  The string is correct up to the line ending
   white space, so the three extra characters after that should be ignored. 
   This problem is intermittent and does not occur for all BA188002 error logs.
   
 * On systems with PowerVM firmware, a problem was fixed for when booting a
   large number of LPARs with Virtual Trusted Platform Module (vTPM) capability,
   some partitions may post a SRC BA54504D time-out for taking too long to
   start.  With the fix, the time allowed to boot a vTPM LPAR is increased.  If
   a time-out occurs, the partition can be booted again to recover.  The problem
   can be avoided by auto-starting fewer vTPM LPARs, or booting them a couple at
   a time to prevent flooding the vTPM device server with requests that will
   slow the boot time while the LPARs wait on the vTPM device server responses.
 * On systems with PowerVM firmware, a problem was fixed for SMS menus to limit
   reporting on the NPIV and vSCSI configuration to the first 511 LUNs.  Without
   the fix, LUN 512 through the last configured LUN report with invalid data. 
   Configurations in excess of 511 LUNs are very rare, and it is recommended for
   performance reasons (to be able to search for the boot LUN more quickly) that
   the number of LUNs on a single target be limited to less than 512.
 * On systems with PowerVM firmware, the following two errors in the SR-IOV
   adapter firmware were fixed:  1)  The adapter resets and there is a B400FF01
   reference code logged. This error happens in rare cases when there are
   multiple partitions actively running traffic through the adapter.  System
   firmware resets the adapter and recovers the system with no user-intervention
   required; 2) SR-IOV VFs with defined VLANs and an assigned PVID are not able
   to ping each other.
   This fix updates adapter firmware to 10.2.252.1933, for the following Feature
   Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38,
   EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * On systems with PowerVM firmware, a problem was fixed for an IPL that ends
   with the HMC in the "Incomplete" state with SRCs B182951C and A7001151
   logged.  Partitions may start and can continue to run without the HMC
   services available.  In order to recover the HMC session, a re-IPL of the
   system is needed (however, partition workloads could continue running
   uninterrupted until the system is intentionally re-IPLed at a scheduled
   time).  This problem occurs very rarely.
 * On systems with PowerVM firmware, a problem was fixed for Live Partition
   Mobility (LPM) failing along with other hypervisor tasks, but the partitions
   continue to run.  This is an extremely rare failure where a re-IPL is needed
   to restore HMC or Novalink connections to the partitions, or to do any system
   configuration changes.
 * On systems with PowerVM firmware, a problem was fixed for partition SMS menus
   to display certain network adapters that were unviewable and not usable as
   boot and install devices after a microcode update.  The problem network
   adapter is still present and usable at the OS.  The adapters with this
   problem have the following feature codes:  EN0A, EN0B, EN0H, EN0J, EN0K,
   EN0L, EN15, EN17, EL5B, EL38, EL3C, EL56, and EL57.
 * For a shared memory partition,  a problem was fixed for Live Partition
   Mobility (LPM) migration hang after a Mover Service Partition (MSP) failover
   in the early part of the migration.  To recover from the hang, a migration
   stop command must be given on the HMC.  Then the migration can be retried.
 * For a shared memory partition,  a problem was fixed for Live Partition
   Mobility (LPM) migration failure to an indeterminate state.  This can occur
   if the Mover Service Partition (MSP)  has a failover that occurs when the
   migrating partition is in the state of "Suspended."  To recover from this
   problem, the partition must be shut down and restarted.
 * On a system attached to a Cloud Management Console (CMC) via a Cloud
   Connector on the HMC,  a problem was fixed for Redfish queries to the service
   processor resulting in memory leaks and out of memory (OOM) resets of the
   service processor.
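
The SR-IOV bullet above names a fixed adapter firmware level (10.2.252.1933)
and the feature codes it applies to.  A minimal sketch of checking an adapter
against that information follows.  The helper names and the idea of comparing
the dotted level field by field are assumptions for illustration; they are not
an HMC or firmware interface.

```python
# Hypothetical helpers: decide whether an adapter is below the fixed level.
# The feature codes and the fixed level are taken from the bullet above.
AFFECTED_FEATURE_CODES = {
    "EN15", "EN16", "EN17", "EN18", "EN0H", "EN0J", "EN0M",
    "EN0N", "EN0K", "EN0L", "EL38", "EL3C", "EL56", "EL57",
}
FIXED_LEVEL = "10.2.252.1933"


def parse_level(level: str) -> tuple[int, ...]:
    """Split a dotted level such as '10.2.252.1933' into integer fields."""
    return tuple(int(part) for part in level.split("."))


def needs_update(feature_code: str, current_level: str,
                 fixed_level: str = FIXED_LEVEL) -> bool:
    """Return True if the adapter is in the affected list and below the fix."""
    code = feature_code.lstrip("#").upper()
    if code not in AFFECTED_FEATURE_CODES:
        return False
    # Compare numerically, field by field; a plain string comparison would
    # order '200' after '1933'.
    return parse_level(current_level) < parse_level(fixed_level)
```

For example, needs_update("EN15", "10.2.252.1931") is true, while an adapter
already at 10.2.252.1933 or one with an unlisted feature code returns false.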

SV860_165_165 / FW860.51

05/22/18 Impact:  Security      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


Response for Recent Security Vulnerabilities


 * DISRUPTIVE:  On systems with PowerVM firmware, in response to recently
   reported security vulnerabilities, this firmware update is being released to
   address Common Vulnerabilities and Exposures issue number CVE-2018-3639.  In
   addition, Operating System updates are required in conjunction with this FW
   level for CVE-2018-3639.
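
The DISRUPTIVE tag above can be related to the level name SV860_165_165.  The
sketch below assumes the commonly documented IBM naming convention
SVxxx_yyy_zzz (xxx = release level, yyy = service pack level, zzz = last
disruptive service pack level) and the usual rule for when an installation is
disruptive.  That convention is not stated in this section, so treat the rule
and the helper names as assumptions.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: int          # xxx in SVxxx_yyy_zzz
    service_pack: int     # yyy
    last_disruptive: int  # zzz


# An optional leading "01" is accepted because file names often carry it.
_LEVEL_RE = re.compile(r"^(?:01)?SV(\d{3})_(\d{3})_(\d{3})$")


def parse_level(name: str) -> FirmwareLevel:
    """Parse a level name such as 'SV860_165_165'."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    return FirmwareLevel(*(int(group) for group in match.groups()))


def is_disruptive(installed: str, new: str) -> bool:
    """Assumed rule: an update is disruptive if the release changes, if the
    new service pack is itself the last disruptive level, or if the installed
    service pack is older than the new level's last disruptive level."""
    cur, nxt = parse_level(installed), parse_level(new)
    return (
        cur.release != nxt.release
        or nxt.service_pack == nxt.last_disruptive
        or cur.service_pack < nxt.last_disruptive
    )
```

Under these assumptions, moving from SV860_160_056 to SV860_165_165 is
disruptive, while moving from SV860_127_056 to SV860_138_056 is not.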

SV860_160_056 / FW860.50

05/03/18 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E); Power System E850C (8408-44E); Power System S812L (5148-21L) and
Power System S822L (5148-22L) servers only.


New features and functions


 * On systems with PowerVM firmware, support was added to allow V9R910 and later
   HMC levels to query Live Partition Mobility (LPM) performance data after an
   LPM operation.
 * Support was added to the Advanced System Management Interface (ASMI) to
   provide customer control over speculative execution in response to
   CVE-2017-5753 and CVE-2017-5715 (collectively known as Spectre) and
   CVE-2017-5754 (known as Meltdown).   The ASMI "System
   Configuration/Speculative Execution Control" provides two options that can
   only be set when the system is powered off:
   1) Speculative execution controls to mitigate user-to-kernel and user-to-user
   side-channel attacks.  This mode is designed for systems that need to
   mitigate exposures of the hypervisor, operating systems, and user application
   data to untrusted code.   This mode is set as the default.
   2) Speculative execution fully enabled:  This optional mode is designed for
   systems where the hypervisor, operating system, and applications can be fully
   trusted.
   Note:  Enabling this option could expose the system to CVE-2017-5753,
   CVE-2017-5715, and CVE-2017-5754.  This includes any partitions that are
   migrated (using Live Partition Mobility) to this system.
 * On systems with PowerVM firmware, support was added to allow a periodic data
   capture from the PCIe3 I/O expansion drawer (with feature code #EMX0) cable
   card links.
 * On systems with PowerVM firmware and an IBM i partition,  support was added
   for multipliers for IBM i MATMATR fields that are limited to four
   characters.  For example, when server metrics are retrieved via IBM i
   MATMATR calls on a system with more than 9999 GB, the architected
   "multiplier" field allows 10,000 GB to be represented as 5,000 GB with a
   multiplier of 2, so '5000' and '2' are returned in the quantity and
   multiplier fields, respectively.  The IBM i OS also requires a PTF to
   support the MATMATR field multipliers.
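
The quantity/multiplier arithmetic in the MATMATR bullet above can be sketched
as follows.  How the firmware actually chooses the multiplier is not described
here; picking the smallest multiplier that keeps the quantity within four
digits is an assumption, and the function names are hypothetical.

```python
MAX_QUANTITY = 9999  # largest value that fits in a four-character field


def encode(total_gb: int) -> tuple[int, int]:
    """Return (quantity, multiplier) with quantity <= MAX_QUANTITY.

    Assumption: use the smallest integer multiplier that fits.  Integer
    division means a remainder can be dropped, so decoding may round down.
    """
    if total_gb < 0:
        raise ValueError("total_gb must be non-negative")
    # Ceiling division without floating point.
    multiplier = max(1, -(-total_gb // MAX_QUANTITY))
    quantity = total_gb // multiplier
    return quantity, multiplier


def decode(quantity: int, multiplier: int) -> int:
    """Recover the represented value from the two fields."""
    return quantity * multiplier
```

With this scheme, encode(10000) gives (5000, 2), matching the example in the
text, and values up to 9999 keep a multiplier of 1.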

System firmware changes that affect all systems


 * A problem was fixed in which deconfigured-resource records can become
   malformed and cause the loss of service processor for both redundant and
   non-redundant service processor systems.  These failures can occur during or
   after firmware updates to the FW860.40, FW860.41, or FW860.42 levels.  The
   complete loss of service processor results in the loss of HMC (or FSP
   stand-alone) management of the server and loss of any further error logging. 
   The server itself will continue to run.  Without the fix, the loss of the
   service processor could happen within one month of the deconfiguration
   records being encountered.  It is highly recommended to install the fix. 
   Recovery from the problem, once encountered, requires a full server AC power
   cycle and clearing of deconfiguration records to avoid reoccurrence. 
   Clearing deconfiguration records exposes the server to repeat hardware
   failures and possible unplanned outages.
 * A problem was fixed for the guard reminder processing of garded FRUs and
   error logs that can cause a system power off to hang and time out with a
   service processor reset.
   
 * A problem was fixed for the wrong Redfish method (PATCH or POST) passed for a
   valid Uniform Resource Identifier (URI) causing an incorrect error message of
   "501 - Not Implemented".  With the fix, the message returned is "Invalid
   Method on URI" which is more helpful to the user.
 * A problem was fixed for SRC call home reminders for bad FRUs causing service
   processor dumps with SRC B181E911 and reset/reloads.  This occurred if the
   FRU callout was missing a CCIN number in the error log.  This can happen
   because some error logs only have "Symbolic FRUs" and these were not
   being handled correctly.

System firmware changes that affect certain systems


 * DEFERRED: On systems with PowerVM firmware, a problem was fixed for a PCIe3
   I/O expansion drawer (with feature code #EMX0) where control path stability
   issues may cause certain SRCs to be logged.  Systems using copper cables may
   log SRC B7006A87 or similar SRCs, and the fanout module may fail to become
   active.  Systems using optical cables may log SRC of B7006A22 or similar
   SRCs.  For this problem, the errant I/O drawer may be recovered by a re-IPL
   of the system.
   
 * On systems with PowerVM firmware, a problem was fixed for a Coherent
   Accelerator Processor Proxy (CAPP) unit hardware failure that caused a
   hypervisor hang with SRC B7000602.  This failure is very rare and can only
   occur during the early IPL of the hypervisor, before any partitions are
   started.   A re-IPL will recover from the problem.
 * On systems with PowerVM firmware, a problem was fixed for a Live Partition
   Mobility migration hang that could occur if one of its VIOS Mover Service
   Partitions (MSPs) goes into a failover at the start of the LPM operation. 
   This problem is rare because it requires an MSP error to force an MSP failover
   at the very start of the LPM migration to get the LPM timing error.  The LPM
   hang can be recovered by using the "migrlpar -o s" and "migrlpar -o r"
   commands on the HMC.
 * On systems with PowerVM firmware, a problem was fixed for incorrect low
   affinity scores for a partition reported from the HMC "lsmemopt" command when
   a partition has filled an entire drawer.  A low score indicates the placement
   is poor but in this case the placement is actually good.  More information on
   affinity scores for partitions and the Dynamic Platform Optimizer can be
   found at the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8hat/p8hat_dpoovw.htm.
 * On systems with PowerVM firmware, a problem was fixed to allow the management
   console to display the Active Memory Mirroring (AMM) licensed capability. 
   Without the fix, the AMM licensed capability of a server will always show as
   "off" on the management console, even when it is present.
 * On systems with PowerVM firmware, a problem was fixed for a rare hypervisor
   hang for systems with shared processors with a sharing mode of uncapped.  If
   this hang occurs, all partitions of the system will become unresponsive and
   the HMC will go to an "Incomplete" state.
 * On systems with PowerVM firmware, a problem was fixed for a Live Partition
   Mobility migration abort that could occur if one of its VIOS Mover Service
   Partitions (MSPs) goes into a failover during the LPM operation.  This
   problem is rare because it requires an MSP error to force an MSP failover
   during the LPM migration to get the LPM timing error.  The LPM abort can be
   recovered by retrying the LPM migration.
 * On systems with PowerVM firmware and a shared processor pool, a very rare
   problem was fixed for the hypervisor not responding to partition requests
   such as power off and Live Partition Mobility (LPM).  This error is caused by
   a request for a guard of a failed processor (when there are not any available
   spare processors) that has hung.
 * On systems using PowerVM firmware with mirrored memory running IBM i
   partitions, a problem was fixed for un-mirrored nodal memory errors in the
   partition that also caused the system to crash.   With the fix, the memory
   failure is isolated to the impacted partition, leaving the rest of the system
   unaffected.  This fix improves on an earlier fix delivered for IBM i memory
   errors in FW840.60 by handling the errors in nodal memory.
 * On systems with PowerVM firmware and Huge Page (16 GB) memory enabled for an
   AIX partition, a problem was fixed for the OS failing to boot with an 0607
   SRC displayed.  This error occurs on systems with FW860.40, FW860.41 or
   FW860.42 installed.  To circumvent the problem, disable Huge Pages for the
   AIX partition.  For information on viewing and setting values for AIX
   huge-page memory allocation, see the following link in the IBM Knowledge
   Center:
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8hat/p8hat_aixviewhgpgmem.htm
 * On systems with PowerVM firmware and an IBM i partition, a problem was fixed
   for 64 bytes overwritten in a portion of the IBM i Main Storage Dump (MSD). 
   Approximately 64 bytes are overwritten just beyond the 17 MB (0x11000000)
   address on P8 systems.  This problem is cosmetic as the dump is still
   readable for problem diagnostics and no customer operations are affected by
   it.
 * On systems with PowerVM firmware and a partition with a Fibre Channel Adapter
   (FCA) or a Fibre Channel over Ethernet (FCoE) adapter,  a problem was fixed
   for bootable disks attached to the FCA/FCoE adapter not being seen in the
   System Management Services (SMS) menus for selection as boot devices.  This
   problem is likely to occur if the only I/O device in the partition is an FCA
   or FCoE adapter.  If other I/O devices are present, the problem may still
   occur if the FCA or FCoE is the first adapter discovered by SMS.  A
   work-around to this problem is to define a virtual Ethernet adapter in the
   partition profile.  The virtual adapter does not need to have any physical
   backing device,  as just having the VLAN defined is sufficient to avoid the
   problem.  The FCA has feature codes #EN0A, #EN0B, #EN0F, #EN0G, #EN0Y, #EN12,
   #5729, #5774, #5735, and #5723 and the FCoE adapter has feature codes #5708,
   #EN0H, #EN0J, #EN0K, and #EN0L for all but the Linux on Power 8247 models. 
   For the Linux on Power 8247 models, the FCA has feature codes #5729, #5774,
   #EL43, #EL58, #EL5B, #EL54, and #EL52 and the FCoE adapter has feature codes
   #5708, #EL56, #EL38, #EL57, and #EL3C.
 * On systems with PowerVM firmware and a partition with a USB 3.0 controller, a
   problem was fixed for a partition boot failure.  The USB 3.0 controller may
   be integrated or an adapter card with feature code #EC45 or #EC46.  The boot
   failure is triggered by a fault in the USB controller, but instead of just
   the USB controller failing, the entire partition fails.  With the fix, the
   failure is limited to the USB controller.
 * On systems with PowerVM firmware, a problem was fixed for the FRU callouts
   for the BA188001 and BA188002 EEH errors to include the PCI Host Bridge (PHB)
   FRU which had been excluded.  For the P8 systems, these rare errors will more
   typically isolate to the processor instead of the adapter or slot planar.  
   In the pre-P8 systems, the I/O planar also included the PHB, but for P8
   systems, the PHB was moved to the processor complex.
 * On systems using PowerVM firmware, a problem was fixed for an internal error
   in the SR-IOV adapter firmware that resets the adapter and logs a B400FF01
   reference code.  This error happens in rare cases when there are multiple
   partitions actively running traffic through the adapter and a subset of the
   partitions are shut down hard.  The error causes a temporary disruption of
   traffic but recovery from the error is automatic with no user intervention
   needed.
   This fix updates adapter firmware to 10.2.252.1931, for the following Feature
   Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38,
   EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * On systems using OPAL firmware, Skiboot was updated to V5.4.9 from V5.4.8, 
   providing the following fix:
   -  A problem was fixed for a possible incorrect value for processor frequency
   from /proc/cpuinfo. The value being returned was the last frequency requested
   by the kernel, but may not reflect the current frequency of the processor.
 * On systems with PowerVM firmware, a problem was fixed for a PCIe3 I/O
   expansion drawer (with feature code #EMX0)  failing to initialize during the
   IPL with a SRC B7006A88 logged.  The error is infrequent.  The errant I/O
   drawer can be recovered by a re-IPL of the system.
 * On systems with PowerVM firmware, a problem was fixed for the SR-IOV firmware
   adapter updates using the HMC GUI or CLI to only reboot one SR-IOV adapter at
   a time.  If multiple adapters are updated at the same time, the HMC error
   message HSCF0241E may occur:  "HSCF0241E Could not read firmware information
   from SR-IOV device ...".  This fix prevents the system network from being
   disrupted by the SR-IOV adapter updates when redundant configurations are
   being used for the network.  The problem can be circumvented by using the HMC
   GUI to update the SR-IOV firmware one adapter at a time using the following
   steps: 
   https://www.ibm.com/support/knowledgecenter/en/8247-22L/p8efd/p8efd_updating_sriov_firmware.htm
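
The OPAL bullet above concerns the processor frequency reported in
/proc/cpuinfo.  A minimal sketch of reading that value is shown below.  It
assumes the Linux on Power layout in which each processor entry carries a
"clock" line ending in "MHz"; the sample text is made up for illustration and
is not output captured from a real system.

```python
import re

# Matches lines such as "clock\t\t: 3425.000000MHz" (assumed format).
_CLOCK_RE = re.compile(r"^clock\s*:\s*([0-9.]+)[ \t]*MHz[ \t]*$", re.MULTILINE)


def clock_mhz(cpuinfo_text: str) -> list[float]:
    """Return the reported clock of every processor entry, in MHz."""
    return [float(value) for value in _CLOCK_RE.findall(cpuinfo_text)]


# Hypothetical sample in the assumed format.
SAMPLE = (
    "processor\t: 0\n"
    "clock\t\t: 3425.000000MHz\n"
    "\n"
    "processor\t: 8\n"
    "clock\t\t: 2061.000000MHz\n"
)

clocks = clock_mhz(SAMPLE)
```

On a running system the same function could be applied to the contents of
/proc/cpuinfo.  Before the fix described above, the value could be the last
frequency requested by the kernel rather than the current one.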

SV860_138_056 / FW860.42

01/09/18 Impact:  Security      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E) and Power System E850C (8408-44E) servers only.


New features and functions


 * In response to recently reported security vulnerabilities, this firmware
   update is being released to address Common Vulnerabilities and Exposures
   issue numbers CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754.  Operating
   System updates are required in conjunction with this FW level for
   CVE-2017-5753 and CVE-2017-5754.

SV860_127_056 / FW860.41

12/08/17 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E) and Power System E850C (8408-44E) servers only.


System firmware changes that affect certain systems

 * On systems using PowerVM firmware that are co-managed with HMC and PowerVM
   NovaLink, a problem was fixed for the HMC going into the Incomplete state
   after deleting a NovaLink partition or after using the HMC "chsyscfg
   powervm_mgmt_capable=0" command to remove the NovaLink attribute from a
   partition.  Partitions will continue running but cannot be changed by the
   management console and Live Partition Mobility (LPM) will not function
   in this state.  A power off of the system will remove it from the Incomplete
   state, but the NovaLink partition will not have been deleted.  To force the
   delete of the NovaLink partition or partitions without the fix,  erase the
   service processor NVRAM and then restore the HMC partition data.
 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for the HMC going into the Incomplete state when restoring HMC profile data
   after deleting a NovaLink partition.  This fix will prevent but not repair
   the problem once it has occurred.  Recovery from the problem is to erase the
   service processor NVRAM and then restore the HMC partition data.

SV860_118_056 / FW860.40

11/08/17 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E) and Power System E850C (8408-44E) servers only.


System firmware changes that affect all systems

 * A problem was fixed for the "Minimum code level supported" not being shown by
   the Advanced System Management Interface (ASMI) when selecting the "System
   Configuration/Firmware Update Policy" menu.  The message shown is "Minimum
   code level supported value has not been set".  The workaround to find this
   value is to use the ASMI command line interface with the "registry -l
   cupd/MinMifLevel" command.
 * A problem was fixed for system termination and outage caused by a corrupted
   system reset type.  For cases where the system reset type cannot be
   identified, the service processor will now do a reset/reload to keep the
   system running.  This is a rare problem that occurs during an
   error/recovery situation that involves a reset of the service processor.
   This fix replaces a previous fix attempt (same fix description) that failed
   to prevent the system from terminating.
 * A problem was fixed for a power supply error log with SRC B155A4E0 not
   identifying the FRU location of the failed power supply.  This will happen
   anytime a power supply fails or is removed at system runtime.  A
   circumvention for this problem is to look for other power Predictive Errors
   in the error log and these will help identify the location of the failing
   power supply.
 * A problem was fixed for "sh: errl: not found " error messages to the service
   processor console whenever the Advanced System Management Interface (ASMI)
   was used to display error logs.  These messages did not cause any problems
   except to clutter the console output as seen in the service processor traces.
 * A problem was fixed for the LineInputVoltage and LastPowerOutputWatts being
   displayed in millivolts and milliwatts, respectively,  instead of volts and
   watts for the output from the Redfish API for power properties for the
   chassis.  The URL affected is the following:
   "https://<fsp ip>/redfish/v1/Chassis/<id>/Power"
 * A problem was fixed for a Power Supply Unit (PSU) failure of SRC 110015xF
   logged with a power supply fan call out when doing a hot re-plug of a PSU.  
   The power supply may be made operational again by doing a dummy replace of
   the PSU that was called out (keeping the same PSU for the replace
   operation).  A re-IPL of the system will also recover the PSU.
 * A problem was fixed for the service processor low-level boot code always
   running off the same side of the flash image, regardless of what side has
   been selected for boot ( P-side or T-side).  Because this low-level boot code
   rarely changes, this should not cause a problem unless corruption occurs in
   the flash image of the boot code.  This problem does not affect firmware
   side-switches as the service processor initialization code (higher-level code
   than the boot code) is running correctly from the selected side.  Without the
   fix, there is no recovery for boot corruption for systems with a single
   service processor as the service processor must be replaced.
 * A problem was fixed for a missing serviceable event from a periodic call home
   reminder.  This occurred if there was an FRU deconfigured for the serviceable
   event.
 * A problem was fixed for help text in the Advanced System Management Interface
   (ASMI) not informing the user that system fan speeds would increase if the
   system Power Mode was changed to "Fixed Maximum Frequency" mode.  If ASMI
   panel function "System Configuration->Power Management->Power Mode Setup"
   "Enable Fixed Maximum Frequency mode" help is selected, the updated text
   states "...This setting will result in the fans running at the maximum speed
   for proper cooling."
 * A problem was fixed for a degraded PCI link causing a Predictive SRC for a
   non-cacheable unit (NCU) store time-out that occurred with SRC B113E540 or
   B181E450 and PRD signature "(NCUFIR[9]) STORE_TIMEOUT: Store timed out on
   PB".  With the fix, the error is changed to be an Informational as the
   problem is not with the processor core and the processor should not be
   replaced.  The solution for degraded PCI links is different from the fix for
   this problem, but a re-IPL of the CEC or a reset of the PCI adapters could
   help to recover the PCI links from their degraded mode.
 * A problem was fixed for the IPMI serial over LAN (SOL) console buffer
   becoming full without an active ipmitool client causing a service processor
   hang to host, resulting in a host initiated reset/reload of the service
   processor.  The problem causes a serviceable event and a service processor
   dump, but otherwise it should not impact the jobs on the running host.
 * A problem was fixed for the IPMI serial over LAN (SOL) console intermittently
   dropping a character of data.  This occurred anytime the console data to
   write size matched the free space size in the SOL console 4K buffer.
 * A problem was fixed for a Redfish Patch on the "Chassis" 
   "HugeDynamicDMAWindowSlotCount" for the validation of incorrect values. 
   Without the fix, the user will not get proper error messages when providing
   bad values to the patch.
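
For the Redfish power-properties bullet above (LineInputVoltage and
LastPowerOutputWatts reported in millivolts and milliwatts), a client reading
data from a level without the fix would need to rescale those two values.
The sketch below shows that conversion.  The payload is a hypothetical sample
shaped like a Redfish Power resource, and the helper name is an assumption.

```python
import json


def normalize_power_supply(psu: dict, values_in_milli: bool) -> dict:
    """Return a copy of one power supply entry with volts and watts.

    values_in_milli should be True only when the service processor is known
    to report millivolts/milliwatts (that is, without the fix).
    """
    scale = 1000.0 if values_in_milli else 1.0
    result = dict(psu)
    for key in ("LineInputVoltage", "LastPowerOutputWatts"):
        if result.get(key) is not None:
            result[key] = result[key] / scale
    return result


# Hypothetical response body, not captured from a real service processor.
SAMPLE = json.loads(
    '{"PowerSupplies": [{"Name": "PS1",'
    ' "LineInputVoltage": 208000, "LastPowerOutputWatts": 325000}]}'
)

fixed = [normalize_power_supply(p, values_in_milli=True)
         for p in SAMPLE["PowerSupplies"]]
```

Here the sample values 208000 and 325000 become 208.0 volts and 325.0 watts.
With the fix installed, the values are already in volts and watts and no
rescaling is needed.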
   

System firmware changes that affect certain systems


 * DEFERRED:  On systems using PowerVM firmware, a problem was fixed for DPO
   (Dynamic Platform Optimizer) operations taking a very long time and
   degrading server performance.  The problem is triggered by a
   DPO operation being done on a system with unlicensed processor cores and a
   very high I/O load.  The fix involves using a different lock type for the
   memory relocation activities (to prevent lock contention between memory
   relocation threads and partition threads) that is created at IPL time, so an
   IPL is needed to activate the fix.  More information on the DPO function can
   be found at the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/en/8247-42L/p8hat/p8hat_dpoovw.htm
 * On systems using PowerVM firmware,  a problem was fixed for an intermittent
   service processor core dump and a callout for netsCommonMSGServer with SRC
   B181EF88.   The HMC connection to the service processor automatically
   recovers with a new session.
 * On systems using PowerVM firmware, a problem was fixed for a concurrent
   firmware update failure with HMC error message
   "E302F865-PHYPTooBusyToQuiesce".  This error can occur when the error log is
   full on the hypervisor and it cannot accept more error logs from the service
   processor.  But the service processor keeps retrying the send of an error
   log, resulting in a "denial of service" scenario where the hypervisor is kept
   busy rejecting the error logging attempts.  Without the fix, the problem may
   be circumvented by starting a logical partition (if none are running) or by
   purging the error logs on the service processor.
 * On systems using PowerVM firmware with mirrored memory running IBM i
   partitions, a problem was fixed for memory failures in the partition that also
   caused the system to crash.  The system failure will occur any time that IBM
   i partition memory towards the beginning of the partition's assigned memory
   fails.  With the fix, the memory failure is isolated to the impacted
   partition, leaving the rest of the system unaffected.
 * On systems using PowerVM firmware, a problem was fixed for failures
   deconfiguring SR-IOV Virtual Functions (VFs).  This can occur during Live
   Partition Mobility (LPM) migrations with HMC error messages of HSCLAF16,
   HSCLAF15 and HSCLB602 shown. This results in an LPM migration failure and a
   system reboot is required to recover the VFs for the I/O adapters.  This
   error may occur more frequently in cases where the I/O adapter has pending
   I/O at the time of the deconfigure request for the VF.
 * On systems using PowerVM firmware, a problem was fixed for a vNIC client that
   has backing devices being assigned an active server that was not the one
   intended by an HMC user failover for the client adapter.  This only can
   happen if the vNIC client adapter had never been activated.  A circumvention
   is to activate the client OS and initialize the vNIC device (ifconfig "xxx"
   up) and an active backing device will then be selected.
 * On systems using PowerVM firmware, a problem was fixed for partitions with
   more than 32TB memory failing to IPL with memory space errors.  This can
   occur if the logical memory block (LMB) size is small as there is a memory
   loss associated with each LMB.  The problem can be circumvented by reducing
   the amount of partition memory or increasing the LMB size to reduce the total
   number of LMBs needed for the memory allocation.
 * On systems using PowerVM firmware,  a problem was fixed for the error
   handling of EEH events for the SR-IOV Virtual Functions (VFs) that can result
   in IPL failure with B7006971, B400FF05, and BA210000 SRCs logged.  In these
   cases, the partition console stops at an OFDBG prompt.  Also, a DLPAR add of
   a VF may result in a partition crash due to a 300 DSI exception because of a
   low-level EEH event.  A circumvention for the problem would be to debug the
   EEH events which should be recovered errors and eliminate the cause of the
   EEH events.  With the fix, the EEH events still log Predictive Errors but do
   not cause a partition failure.
 * On systems using PowerVM firmware and running IBM i on stand-alone systems
   (no HMC attached), a problem was fixed for an inadvertent Operations Panel
   function 71 activation that put the system into "Network Boot" mode and
   prevented the IBM i from IPLing.  A circumvention is to use Operations Panel
   function 72 to turn off "Network Boot" mode.  With the fix, the Operations
   Panel function 71 request will be ignored on IBM i stand-alone systems.
 * A problem was fixed for intermittent high-temperature induced link failures
   on the 100Gb EDR IB, NIC, and RoCE adapters caused by system fans running at
   too low a speed.  These adapters include the PCIe3 1-port and 2-port 100Gb
   EDR IB x16 adapters and the PCIe3 2-port 100GbE (NIC and RoCE) QSFP28 x16
   adapter with feature codes EC3E, EC3F, EC3L, EC3M, EC3T, and EC3U.  EDR IB
   (Enhanced Data Rate Infiniband), NIC (Network Interface Controller), and IBTA
   RoCE (Remote Direct Memory Access (RDMA) over Converged Ethernet) are the
   specific network standards supported in the adapters.
   This problem was fixed earlier in FW860.31 for the (8284-xxx) and (8247-xxx)
   models.  The fix has been extended to include the E850 (8408-E8E) and the
   E850 (8408-44E) models.
   
 * On systems using PowerVM firmware, a problem was fixed for an invalid date
   from the service processor causing the customer date and time to go to the
   Epoch value (01/01/1970) without a warning or chance for a correction.  With
   the fix,  the first IPL attempted on an invalid date will be rejected with a
   message alerting the user to set the time correctly in the service
   processor.  If the warning is ignored and the date/time is not corrected, the
   next IPL attempt will complete to the OS with the time reverted to the Epoch
   time and date.  This problem is very rare but it has been known to occur on
   service processor replacements when the repair step to set the date and time
   on the new service processor was inadvertently skipped by the service
   representative.
 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for a loss of a communications channel between the hypervisor and the PowerVM
   NovaLink during a reset of the service processor.  Various NovaLink tasks,
   including deploy, could fail with a "No valid host was found" error.  With
   the fix, PowerVM NovaLink prevents normal operations from being impacted by a
   reset of the service processor.
 * On systems using PowerVM firmware, a problem was fixed for a rare system hang
   caused by a process dispatcher deadlock timing window.  If this problem
   occurs, the HMC will also go to an "Incomplete" state for the managed system.
 * On systems using PowerVM firmware,  a problem was fixed for communication
   failures on adapters in SR-IOV shared mode.  This communication failure only
   occurs when a logical port's VLAN ID (PVID) is dynamically changed from
   non-zero to zero.  An SR-IOV logical port is an I/O device created for a
   partition or a partition profile using the management console (HMC) when a
   user intends for the partition to access an SR-IOV adapter Virtual Function. 
   The error can be recovered from by a reboot of the partition.
   This fix updates adapter firmware to 10.2.252.1929, for the following Feature
   Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38,
   EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * On systems using PowerVM firmware, a problem was fixed for error logs not
   getting sent to the OS running in a partition.   This problem could occur if
   the error log buffer was full in the hypervisor and then a re-IPL of the
   system occurred.  The error log full condition was persisting across the
   re-IPL, preventing further logs from being sent to the OS.
 * On systems using OPAL firmware, Skiboot was updated to V5.4.8 from V5.4.6, 
   providing the following fixes:
   -  A problem was fixed for an intermittent host freeze during a reset/reload
   of the service processor.  The host will resume normal operations after the
   reset/reload has completed.  To have this error occur,  a timing window has
   to be hit where a synchronous message from the host is in progress to the
   service processor at the same time a reset/reload is initiated.
   - A problem was fixed for IPMI Serial Over Lan (SOL) console disconnects to
   prevent Host process hangs related to the console management for output
   buffers and error logging.  If there is a reset of the service processor and
   the console was active, the console session is now closed to free all the
   console resources. 
   - A problem was fixed for "FSP: Unhandled message eb0500" error message. 
   This is a command sent by the FSP to OPAL to get vNVRAM statistics.  Since
   OPAL maintains no NVRAM statistics, it now returns FSP_STATUS_INVALID_SUBCMD
   with its new handler.  Sample of OPAL log that will no longer occur with the
   fix:
   [16944.384670488,3] FSP: Unhandled message eb0500
   [16944.474110465,3] FSP: Unhandled message eb0500
   - A problem was fixed for sending false messages for "Reassociating HVSI
   console" when the console is not available.  These messages are no longer
   issued for unavailable consoles:
   [ 5013.227994012,7] FSP: Reassociating HVSI console 1
   [ 5013.227997540,7] FSP: Reassociating HVSI console 2
   - A problem was fixed for a Delayed Power Off (DPO) failure that occurred if
   the service processor reset right after the request.  With the fix, the DPO
   and normal shutdowns will complete on the host without regard to service
   processor state changes that occur after the request.
 * On systems using OPAL firmware, Petitboot was updated to V1.4.4 from V1.4.2, 
   providing the following fixes:
   - A problem was fixed for line truncation on the Petitboot screen occurring
   for any line that had a multibyte character in it.
   - A problem was fixed for the safe mode message not clearing even after
   "Rescan Devices" button in safe mode was pressed and re-initialization
   completed successfully.
   - A problem was fixed for Petitboot configuration for boot order and network
   settings being cleared when the user just wanted to clear the IPMI override. 
   With the fix, the IPMI override is cleared and safe mode is exited, if
   active, without modifying the rest of the configuration.
 * On systems using PowerVM firmware, a problem was fixed in the text for the
   Firmware License agreement to correct a link that pointed to a URL that was
   not specific to microcode licensing.  The message is displayed for a machine
   during its initial power on.  Once accepted, the message is not displayed
   again.  The fixed link in the licensing agreement is the following:
   http://www.ibm.com/support/docview.wss?uid=isg3T1025362.
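
For the item above on partitions with more than 32TB of memory, the
circumvention of increasing the LMB size works because fewer LMBs are then
needed to cover the same allocation.  The following is a minimal arithmetic
sketch, not an IBM tool.  The helper name lmb_count and the LMB sizes in the
loop are illustrative assumptions, and the per-LMB memory loss is not modeled
because its size is not stated here.

```python
# Illustrative arithmetic only; lmb_count is a hypothetical helper name.
TIB = 1024 ** 4
MIB = 1024 ** 2


def lmb_count(partition_memory_bytes: int, lmb_size_bytes: int) -> int:
    """Return how many LMBs are needed to cover the partition memory."""
    if lmb_size_bytes <= 0:
        raise ValueError("LMB size must be positive")
    # Ceiling division: a partially used LMB still counts as one LMB.
    return -(-partition_memory_bytes // lmb_size_bytes)


if __name__ == "__main__":
    memory = 32 * TIB
    # Assumed example LMB sizes; check the sizes your system actually offers.
    for lmb_mib in (16, 32, 64, 128, 256):
        count = lmb_count(memory, lmb_mib * MIB)
        print(f"LMB size {lmb_mib:>3} MiB -> {count:>9,} LMBs")
```

Each doubling of the LMB size halves the LMB count, which is why a larger LMB
size reduces the total per-LMB loss.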

SV860_109_056 / FW860.31

08/30/17 Impact:  Availability      Severity:  ATT

System firmware changes that affect certain systems

 * A problem was fixed for intermittent high-temperature induced link failures
   on the 100Gb EDR IB, NIC, and RoCE adapters caused by system fans running at
   too low a speed.  These adapters include the PCIe3 1-port and 2-port 100Gb
   EDR IB x16 adapters and the PCIe3 2-port 100GbE (NIC and RoCE) QSFP28 x16
   adapter with feature codes EC3E, EC3F, EC3L, EC3M, EC3T, and EC3U.  EDR IB
   (Enhanced Data Rate Infiniband), NIC (Network Interface Controller), and IBTA
   RoCE (Remote Direct Memory Access (RDMA) over Converged Ethernet) are the
   specific network standards supported in the adapters.
   This problem does not apply to the E850 (8408-E8E) or the E850 (8408-44E)
   models.
   

SV860_103_056 / FW860.30

06/30/17 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E) and Power System E850C (8408-44E) servers only.

New features and functions


 * Support was added for the Redfish API to allow the ISO 8601 extended format
   for the time and date so that the date/time can be represented as an offset
   from UTC (Coordinated Universal Time).
 * Support was added to the Redfish API for power and thermal properties for the
   chassis.  The new URIs are as follows:
   https://<fsp ip>/redfish/v1/Chassis/<id>/Power : Provides power supply data
   https://<fsp ip>/redfish/v1/Chassis/<id>/Thermal : Provides fan data
   Only the Redfish GET operation is supported for these resources.
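
As a minimal sketch of the two features above, the following Python builds
the chassis URIs from the pattern given and parses a date/time expressed as
an offset from UTC.  The function names, the example address 192.0.2.10, the
chassis id "1", and the sample timestamp are illustrative assumptions.  No
request is sent, since authentication and certificate handling for the
service processor are not described here.

```python
from datetime import datetime, timezone


def chassis_resource_uris(fsp_ip: str, chassis_id: str) -> dict:
    """Build the Power and Thermal URIs for one chassis (GET only)."""
    base = f"https://{fsp_ip}/redfish/v1/Chassis/{chassis_id}"
    return {"Power": f"{base}/Power", "Thermal": f"{base}/Thermal"}


def parse_offset_timestamp(text: str) -> datetime:
    """Parse a date/time with a UTC offset, e.g. 2017-06-30T14:05:00-05:00."""
    parsed = datetime.fromisoformat(text)
    if parsed.utcoffset() is None:
        raise ValueError("timestamp carries no UTC offset")
    return parsed


if __name__ == "__main__":
    # Hypothetical address and chassis id, used only for illustration.
    for name, uri in chassis_resource_uris("192.0.2.10", "1").items():
        print(name, uri)
    local = parse_offset_timestamp("2017-06-30T14:05:00-05:00")
    print(local.astimezone(timezone.utc).isoformat())
```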
   

System firmware changes that affect all systems

 * A problem was fixed for service actions with SRC B150F138 missing an Advanced
   System Management Interface (ASMI) Deconfiguration Record.  The
   deconfiguration records make it easier to organize the repairs that are
   needed for the system and they need to be consistent with the periodic
   maintenance reminders that are logged for the failed FRUs.
 * A problem was fixed for a false 1100026B1 (12V power good failure) caused by
   an I2C bus write error for a LED state.  This error can be triggered by the
   fan LEDs changing state.
 * A problem was fixed for a fan LED turning solid amber when there is no fan
   fault, or when the fan fault is for a different fan.  This error can be
   triggered anytime a fan LED needs to change its state.  The fan LEDs can be
   recovered to a normal state concurrently using the following link steps for a
   soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for a system termination and outage caused by a corrupted
   system reset type.  For cases where the system reset type cannot be
   identified, the service processor will now do a reset/reload to keep the
   system running.  This is a rare problem that is occurring during an
   error/recovery situation that involves a reset of the service processor.
 * A problem was fixed for sporadic blinking amber LEDs for the system fans with
   no SRCs logged.  There was no problem with the fans.  The LED corruption
   occurred when two service processor tasks attempted to update the LED state
   at the same time.  The fan LEDs can be recovered to a normal state
   concurrently using the following link steps for a soft reset of the service
   processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for a Redfish Patch on the "Chassis" or
   "IBMEnterpriseComputerSystem" with empty data that caused a "500 Internal
   Server Error".  Validation for the empty data case has been added to prevent
   the server error.
 * A problem was fixed for the loss of Operations Panel function 30 (displaying
   ethernet port HMC1 and HMC2 IP addresses) after a concurrent repair of the
   Operations Panel.  Operations Panel function 30 can be restored concurrently
   using the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for a core dump of the rtiminit (service processor time
   of day) process that logs an SRC B15A3303  and could invalidate the time on
   the service processor.  If the error occurs while the system is powered on,
   the hypervisor has the master time and will refresh the service processor
   time, so no action is needed for recovery.  If the error occurs while the
   system is powered off, the service processor time must be corrected on the
   systems having only a single service processor.  Use the following steps from
   the IBM Knowledge Center to change the UTC time with the Advanced System
   Management Interface: 
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8hby/viewtime.htm.
 * A problem was fixed for the service processor boot watch-dog timer expiring
   too soon during DRAM initialization in the reset/reload, causing the service
   processor to go unresponsive.  On systems with a single service processor,
   the SRC B1817212 was displayed on the control panel.  For systems with
   redundant service processors, the failing service processor was
   deconfigured.  To recover the failed service processor, the system will need
   to be powered off with AC power removed during a regularly scheduled system
   service action.  This problem is intermittent and very infrequent as most of
   the reset/reloads of the service processor will work correctly to restore the
   service processor to a normal operating state.
 * A problem was fixed for host-initiated resets of the service processor
   causing the system to terminate.  A prior fix for this problem did not work
   correctly because some of the host-initiated resets were being translated to
   unknown reset types that caused the system to terminate.  With this new
   correction for failed host-initiated resets, the service processor will still
   be unresponsive but the system and partitions will continue to run.  On
   systems with a single service processor, the SRC B1817212 will be displayed
   on the control panel.  For systems with redundant service processors, the
   failing service processor will be deconfigured.  To recover the failed
   service processor, the system will need to be powered off with AC power
   removed during a regularly scheduled system service action.  This problem is
   intermittent and very infrequent as most of the host-initiated resets of the
   service processor will work correctly to restore the service processor to a
   normal operating state.
 * A problem was fixed for a service processor reset triggered by a spurious
   false IIC interrupt request in the kernel.  On systems with a single service
   processor, the SRC B1817201 is displayed on the Operator Panel.  For systems
   with redundant service processors, an error failover to the backup service
   processor occurs.  The problem is extremely infrequent and does not impact
   processes on the running system.
 * A problem was fixed for an incorrect Redfish error message when trying to use
   the $metadata URI:   "The resource at the URI
   https://<systemip>/redfish/v1/%24metadata was not found.". This %24 is
   meaningless.  The "%24" has been replaced with a "$" in the error message. 
   The Redfish $metadata URI is not supported.
 * A problem was fixed so that IPMI boot parameters are not cleared after a
   service processor reset or loss of AC power to the system.
 * A problem was fixed for serializing concurrent requests for the IPMI serial
   over LAN (SOL) console that were causing a service processor hang with a
   subsequent Host-Initiated Reset/Reload for service processor.
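
For the $metadata item above, "%24" is simply the percent-encoded form of the
"$" character.  The short sketch below, using only the Python standard
library, shows the decoding that makes the URI in the error message readable.
The function name readable_uri is a hypothetical name for illustration, and
this is not the service processor's own code.

```python
from urllib.parse import quote, unquote


def readable_uri(uri: str) -> str:
    """Decode percent-encoded characters such as %24 back to '$'."""
    return unquote(uri)


if __name__ == "__main__":
    encoded = "https://<systemip>/redfish/v1/%24metadata"
    print(readable_uri(encoded))
    # Encoding goes the other way: '$' becomes '%24'.
    print(quote("$metadata"))
```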

System firmware changes that affect certain systems


 * DEFERRED: On systems using PowerVM firmware, link stability was improved for
   the PCIe3 I/O expansion drawer (#EMX0).  The settings for the continuous time
   linear equalizers (CTLE) were updated for all the PCIe adapters for the PCIe
   links to the expansion drawer.  The system must be re-IPLed for the fix to
   activate.
 * On systems using the OPAL firmware, a problem was fixed for an IPMI console
   hang to OPAL that caused the Linux host to be hung for SSH sessions and for
   ipmitool commands to fail with "Error in open session response message :
   insufficient resources for session" error messages on the service
   processor.   An error log with SRC B1818601 is reported for the service
   processor IPMI failure and multiple SRC BB822210  error logs are reported for
   OPAL message time outs to the service processor.  In most cases, this error
   can be recovered from by doing a soft reset of the service processor using
   the following steps from the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   
 * On systems using OPAL firmware, a problem was fixed for intermittent long
   delays in the NX co-processor for asynchronous requests such as NX 842
   compressions.  This problem was observed for PowerVM AIX DB2 when it was
   doing hardware-accelerated compressions of data but could occur on any
   asynchronous request to the NX co-processor.  The PowerVM version of the fix
   was delivered in FW860.00.
 * On systems using PowerVM firmware with a Linux Little Endian (LE) partition,
   a problem was fixed for system reset interrupts returning the wrong values in
   the debug output for the NIP and MSR registers.  This problem reduces the
   ability to debug hung Linux partitions using system reset interrupts.  The
   error occurs every time a system reset interrupt is used on a Linux LE
   partition.
 * On systems using PowerVM firmware, a problem was fixed for "Time Power On"
   enabled partitions not being capable of suspend and resume operations.  This
   means Live Partition Mobility (LPM) would not be able to migrate this type of
   partition.  As a workaround, the partition could be transitioned to a
   "Non-time Power On" state and then made capable of suspend and resume
   operations.
 * On systems using PowerVM firmware, a problem was fixed for manual vNIC
   failovers (from the HMC, manually "Make the Backing Device Active") so that
   the selected server was chosen for the failover, regardless of its priority. 
   With the problem, the server chosen for the vNIC failover will be the one
   with the most favorable priority. 
   There are two possible workarounds to the problem:
   (1) Disable auto-priority-failover; Change priority to the server that is
   needed as the target of the failover; Force the vNIC failover; Change
   priority back to original setting.
   (2) Or use auto-priority-failover and change the priority so the server that
   is needed as the target of the failover is favored.
 * On systems using PowerVM firmware, a problem was fixed for extra error logs
   in the VIOS due to failovers taking place while the client vNIC is inactive. 
   The inactive client vNIC failovers are skipped unless the force flag is on. 
   With the problem occurring, Enhanced Error Handling (EEH) Freeze/Temporary
   Error/Recovery logs posted in the VIOS error log during the client partition
   boot can be ignored unless an actual problem is experienced.
 * On systems using PowerVM firmware, a problem was fixed for a Live Partition
   Mobility (LPM) migration abort and reboot on the FW860  target CEC caused by
   a mismatched address space for the source and target partition.  The
   occurrence of this problem is very rare and related to performance
   improvements made in the memory management on the FW860 system that exposed a
   timing window in the partition memory validation for the migration.  The
   reboot of the migrated partition recovers from the problem as the migration
   was otherwise successful.
 * On systems using PowerVM firmware, a problem was fixed for reboot retries for
   IBM i partitions such that the first load source I/O adapter (IOA) is retried
   instead of bypassed after the first failed attempt.  The reboot retries are
   done for an hour before the reboot process gives up.  This error can occur if
   there is more than one known load source, and the IOA of the first load
   source is different from the IOA of the last load source.  The error can be
   circumvented by retrying the boot of the partition after the load source
   device has become available.
 * On systems using PowerVM firmware, a problem was fixed for adapters failing
   to transition to shared SR-IOV mode on the IPL after changing the adapter
   from dedicated mode.  This intermittent problem could occur on systems using
   SR-IOV with very large memory configurations.
 * On systems using PowerVM firmware,  a problem was fixed for SR-IOV adapters
   in shared mode for a transmission stall or time out with SRC B400FF01
   logged.  The time out happens during Virtual Function (VF) shutdowns and
   during Function Level Resets (FLRs) with network traffic running.
   This fix updates adapter firmware to 10.2.252.1927, for the following Feature
   Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38,
   EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * On systems with maximum memory configurations (where every DIMM slot is
   populated - size of DIMM does not matter), a problem has been fixed for
   systems losing performance and going into Safe mode (a power mode with
   reduced processor frequencies intended to protect the system from overheating
   and excessive power consumption) with B1xx2AC3/B1xx2AC4 SRCs logged.  This
   happened because of On-Chip Controller (OCC) timeout errors when collecting
   Analog Power Subsystem Sweep (APSS) data, used by the OCC to tune the
   processor frequency.  This problem occurs more frequently on systems that are
   running heavy workloads.  Recovery from Safe mode back to normal performance
   can be done with a re-IPL of the system, or concurrently using the following
   link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   Checking that Safe mode is not active on the system requires a dynamic
   celogin password from IBM Support to use the service processor command
   line:
   1) Log into ASMI as celogin with dynamic celogin password generated by IBM
   Support
   2) Select System Service Aids
   3) Select Service Processor Command Line
   4) Enter "tmgtclient --query_mode_and_function" from the command line
   The first line of the output, "currSysPwrMode" should say "NOMINAL" and this
   means the system is in normal mode and that Safe mode is not active.
 * A problem has been fixed for systems losing performance and going into Safe
   mode (a power mode with reduced processor frequencies intended to protect the
   system from overheating and excessive power consumption) with
   B1xx2AC3/B1xx2AC4 SRCs logged.  This happened because of an On-Chip
   Controller (OCC) internal queue overflow. The problem has only been observed
   for systems running heavy workloads with maximum memory configurations (where
   every DIMM slot is populated - size of DIMM does not matter), but this may
   not be required to encounter the problem.  Recovery from Safe mode back to
   normal performance can be done with a re-IPL of the system, or concurrently
   using the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   Checking that Safe mode is not active on the system requires a dynamic
   celogin password from IBM Support to use the service processor command
   line:
   1) Log into ASMI as celogin with dynamic celogin password generated by IBM
   Support
   2) Select System Service Aids
   3) Select Service Processor Command Line
   4) Enter "tmgtclient --query_mode_and_function" from the command line
   The first line of the output, "currSysPwrMode" should say "NOMINAL" and this
   means the system is in normal mode and that Safe mode is not active.
   
 * On systems using PowerVM firmware,  a problem was fixed for a partition boot
   from a USB 3.0 device that has an error log SRC BA210003.  The error is
   triggered by an Open Firmware entry to the trace buffer during the partition
   boot.  The error log can be ignored as the boot is successful to the OS.
 * On systems using PowerVM firmware,  a problem was fixed for a partition boot
   fail or hang from a Fibre Channel device having fabric faults.  Some of the
   fabric errors returned by the VIOS are not interpreted correctly by the Open
   Firmware VFC driver, causing the hang instead of generating helpful error
   logs.
 * On systems using PowerVM firmware,  a problem was fixed for a power off
   hanging at D200C1FF caused by a vNIC VF failover error with SRC B200F011. 
   The power off hang error is infrequent because it requires that a VF failover
   error has occurred first.  The system can be recovered by using the power
   off immediate option from the Hardware Management Console (HMC).
 * On systems using PowerVM firmware, a problem was fixed for the incorrect
   reporting of the Universally Unique Identifier (UUID) to the OS, which
   prevented the tracking of a partition as it moved within a data center.  The
   UUID value as seen on HMC or the NovaLink did not match the value as
   displayed in the OS.
 * On systems using OPAL firmware, a problem was fixed for an IPMI console hang
   to OPAL that caused the Linux host to be hung for SSH sessions and for
   ipmitool commands to fail with "Error in open session response message:
   insufficient resources for session" error messages on the service
   processor.   An error log with SRC B1818601 is reported for the service
   processor IPMI failure and multiple SRC BB822210  error logs are reported for
   OPAL message timeouts to the service processor.  In most cases, this error
   can be recovered from by doing a soft reset of the service processor using
   the following steps from the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
 * On systems using the OPAL firmware, Petitboot was updated to V1.4.2 from
   V1.2.7, including the following update: 
   A problem was fixed for the User Interface server connect message to make it
   clearer.  The current message mentions a "server" which can give the
   misleading impression that the user interface is waiting for a remote network
   server. The delay is actually in waiting for the pb-discover process to be
   ready.
   More information for the Petitboot changes can be found at the following
   link:  http://git.ozlabs.org/?p=petitboot;a=tags
 * On systems using OPAL firmware, Skiboot was updated to V5.4.6 from V5.3.7,
   including the following updates:
   - Fix setting of firmware progress sensor properly.  OPAL was incorrectly
   setting firmware status on a sensor id "00" which doesn't exist.
   - Fix error log timeout to only timeout on the send of the error log to the
   service processor.  This will significantly reduce false time out errors.
   - A problem was fixed for excessive "Poller recursion detected" error
   messages during the skiboot that could require a power off to recover from
   the error.
   - A problem was fixed for an unnecessary error message when a reset occurs on
   an empty PCIe Host Bridge (PHB) - no PCIe adapters attached.  The extra error
   message occurs anytime the PHBs in the system go through error recovery.
   - A problem was fixed to fence off an errant PCIe Host Bridge (PHB) during a
   complete reset to allow the kernel to retry the operation.  This helps the
   system recovery process by guarding out the bad hardware to prevent a fatal
   error loop.
   -  A problem was fixed for unknown command messages in the OPAL log after a
   Host-Initiated Reset/Reload of the service processor.
   - A problem was fixed for the I2C bus locking that sometimes caused an OPAL crash
   with double unlock() detected.
   - A problem was fixed for OPAL kernel lockups when the IPMI SOL console
   became unresponsive.  The console can become full now and drop messages but
   this prevents the lock-up of the Host kernel.
   - A problem was fixed for service processor time-out messages being interpreted
   as "success" by OPAL, preventing correct error reporting and recovery
   actions.
   - A problem was fixed for a kernel hang caused by queued messages needing to
   be sent to the service processor during a reset/reload of the service
   processor.  The messages are now cached and sent when the service processor
   is ready to receive after a reset/reload.
   - A problem was fixed for a soft lockup of the kernel that occurred because
   of RTC/TOD clock errors during a Host-initiated Reset/Reload of the service
   processor.  A frozen process would be seen on the host system along with this
   message:   "NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s!" where the
   CPU number would vary.
   More information on the Skiboot changes can be found at the following link: 
   https://github.com/open-power/skiboot/tree/master/doc/release-notes.
 * For the IBM Power System E850 (8408-44E), a problem was fixed for the power
   supply with feature #EB3M and part number 001KU578  for fans spinning too
   slowly with SRC 110015xf logged, where x is 1, 2, 3, or 4 depending on which
   power supply has the failing fan.
 * On systems using PowerVM firmware, a problem was fixed for an error finding
   the partition load source that has a GPT format.  GUID Partition Table (GPT)
   is a standard for the layout of the partition table on a physical storage
   device used in the server, such as a hard disk drive or solid-state drive,
   using globally unique identifiers (GUID).  Other drives that are working may
   be using the older master boot record (MBR) partition table format.  This
   problem occurs whenever load sources utilizing the GPT format occur in other
   than the first entry of the boot table.  Without the fix, a GPT disk drive
   must be the first entry in the boot table to be able to use it to boot a
   partition.
 * On systems using PowerVM firmware, a problem was fixed for an SRC BA090006
   serviceable event log occurring whenever an attempt was made to boot from an
   ALUA (Asymmetric Logical Unit Access) drive.  These drives are always busy by
   design and cannot be used for a partition boot, but no service action is
   required if a user inadvertently tries to do that.  Therefore, the SRC was
   changed to be an informational log.
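
To illustrate the GPT and MBR distinction in the load source item above, the
sketch below classifies a disk image from its first two sectors.  It relies
on two widely documented markers: a GPT header starts with the signature
"EFI PART" in the second sector, and an MBR ends its first sector with the
bytes 0x55 0xAA.  The 512-byte sector size and the function name are
assumptions, and this is not how the firmware itself locates a load source.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def partition_table_format(leading_bytes: bytes) -> str:
    """Classify a disk as 'GPT', 'MBR', or 'unknown' from its first sectors."""
    if len(leading_bytes) < 2 * SECTOR_SIZE:
        raise ValueError("need at least the first two sectors")
    # A GPT disk also carries a protective MBR, so check for GPT first.
    if leading_bytes[SECTOR_SIZE:SECTOR_SIZE + 8] == b"EFI PART":
        return "GPT"
    if leading_bytes[510:512] == b"\x55\xaa":
        return "MBR"
    return "unknown"


if __name__ == "__main__":
    # Synthetic buffers stand in for real disk reads in this sketch.
    mbr_disk = bytearray(2 * SECTOR_SIZE)
    mbr_disk[510:512] = b"\x55\xaa"
    gpt_disk = bytearray(mbr_disk)
    gpt_disk[SECTOR_SIZE:SECTOR_SIZE + 8] = b"EFI PART"
    print(partition_table_format(bytes(mbr_disk)))
    print(partition_table_format(bytes(gpt_disk)))
```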

SV860_096_056 / FW860.21

06/07/17 Impact:  Availability      Severity:  ATT

Power System S812L (8247-21L), Power System S822L (8247-22L) and Power System
S824L (8247-42L) servers only.

System firmware changes that affect certain systems


 * On systems using the OPAL firmware, a problem was fixed for an IPMI console
   hang to OPAL that caused the Linux host to be hung for SSH sessions and for
   ipmitool commands to fail with "Error in open session response message :
   insufficient resources for session" error messages on the service
   processor.   An error log with SRC B1818601 is reported for the service
   processor IPMI failure and multiple SRC BB822210  error logs are reported for
   OPAL message time outs to the service processor.  In most cases, this error
   can be recovered from by doing a soft reset of the service processor using
   the following steps from the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.

SV860_082_056 / FW860.20

03/17/17 Impact:  Availability      Severity:  SPE

Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L
(8247-42L), Power System S812 (8284-21A), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A); Power System E850
(8408-E8E) and Power System E850C (8408-44E) servers only.

New features and functions


 * Support for the Redfish API for provisioning of Power Management tunable
   (EnergyScale) parameters.  The Redfish Scalable Platforms Management API
   ("Redfish") is a DMTF specification that uses RESTful interface semantics to
   perform out-of-band systems management.
   (http://www.dmtf.org/standards/redfish). 
   Redfish service enables platform management tasks to be controlled by client
   scripts developed using secure and modern programming paradigms.
   For systems with redundant service processors, the Redfish service is
   accessible only on the primary service processor.   Usage information for the
   Redfish service is available at the following IBM Knowledge Center link: 
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8hdx/p8_workingwithconsoles.htm.
   The IBM Power server supports DMTF Redfish API (DSP0266, version 1.0.3
   published 2016-06-17) for systems management.
   A copy of the Redfish schema files in JSON format published by the DMTF
   (http://redfish.dmtf.org/schemas/v1/) is packaged in the firmware image.
   The schema files are distributed on chip to enable proper functioning in
   deployments with no WAN connectivity.
   IBM extensions to the Redfish schema are published at
   http://public.dhe.ibm.com/systems/power/redfish/schemas/v1. Copyright notices
   for the DMTF Redfish API and schemas are at: (a)
   http://www.dmtf.org/about/policies/copyright, and (b)
   http://redfish.dmtf.org/schemas/README8010.html.
   
 * Support for the IBM Power System S812 (8284-21A) with a single partition
   system running either AIX (FC #EPXQ 4-core 3.026GHz 130W module, CCIN 54E9)
   or IBM i (FC #EPXP, 1-core 3.026GHz 130W module,  CCIN 54E9) for the
   operating system.
 * Support added to reduce memory usage for shared SR-IOV adapters.
 * Support for the Advanced System Management Interface (ASMI) was changed to
   allow the special characters of "I", "O", and "Q" to be entered for the
   serial number of the I/O Enclosure under the Configure I/O Enclosure option. 
   These characters have only been found in an IBM serial number rarely, so
   typing in these characters will normally be an incorrect action.  However,
   the special character entry is not blocked by ASMI anymore so it is able to
   support the exception case.  Without the enhancement, the typing of one of
   the special characters causes message "Invalid serial number" to be
   displayed.
 * On systems using PowerVM firmware, support was added to allow the IBM i OS on
   the Power System S822 (8284-22A) without the need for a VET code.
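
For the Redfish support listed above, a client script typically starts at the
service root, which the DMTF specification (DSP0266) places at /redfish/v1/.
The sketch below only builds that URL and reads the "RedfishVersion" property
from a service root document.  The host name and the sample JSON body are
made-up placeholders, a real response from the service processor will contain
more properties, and the IBM EnergyScale resource paths are not given in this
document, so they are not guessed here.

```python
import json


def service_root_url(host: str) -> str:
    """Build the Redfish service root URL defined by DSP0266."""
    return f"https://{host}/redfish/v1/"


def redfish_version(service_root_json: str) -> str:
    """Extract the protocol version reported by a service root document."""
    return json.loads(service_root_json)["RedfishVersion"]


# Hypothetical, trimmed-down service root body used only for illustration.
SAMPLE_SERVICE_ROOT = '{"@odata.id": "/redfish/v1/", "RedfishVersion": "1.0.3"}'

if __name__ == "__main__":
    # "fsp.example.com" is a placeholder for the primary service processor.
    print(service_root_url("fsp.example.com"))
    print(redfish_version(SAMPLE_SERVICE_ROOT))
```

On systems with redundant service processors, the host would need to be the
primary service processor, as noted above.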
   

System firmware changes that affect all systems

 * A problem was fixed for disabling, from the Advanced System Management
   Interface (ASMI), the periodic notification for a call home error log SRC
   B150F138 for Memory Buffer resources (membuf).
 * A problem was fixed for the call home data for the B1xx2A01 SRC to include
   the min/max/average readings for more values.  The values for processor
   utilization, memory utilization, and node power usage were added.
 * A problem was fixed for incorrect callouts of the Power Management Controller
   (PMC) hardware with SRC B1112AC4 and SRC B1112AB2 logged.  These extra
   callouts occur when the On-Chip Controller (OCC) has placed the system in the
   safe state for a prior failure that is the real problem that needs to be
   resolved.
 * A problem was fixed for System Vital Product Data (SVPD) FRUs being guarded
   but not having a corresponding error log entry.  This is a failure to commit
   the error log entry that has occurred only rarely.
 * A problem was fixed for the failover to the backup PNOR on a Hostboot Self
   Boot Engine (SBE) failure.  Without the fix, the failed SBE causes loss of
   processors and memory with B15050AD logged.  With the fix, the SBE is able to
   access the backup PNOR and IPL successfully by deconfiguring the failing PNOR
   and calling it out as a failed FRU.
 * A problem was fixed for the OS not being able to detect the USB connected
   Uninterruptible Power Supply (UPS) that has feature code #ECCF.  An
   informational SRC B1814616 is logged from the service processor and the IBM i
   OS logs a CPI0961 (Uninterruptible power supply no longer attached).  The
   error occurs infrequently because it depends on system timing and system
   configuration.  If a system is having the error, it might have it on every
   IPL.  The circumvention is to reseat the USB cable connector for the USB
   connected UPS.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   "System Service Aids => Error/Event Logs" panel not showing the "Clear" and
   "Show" log options and also having a truncated error log when there are a
   large number of error logs on the system.
 * A problem was fixed for IPMI process core dumps for DCMI commands used to
   gather power and thermal data.  These dumps occur intermittently if the DCMI
   commands are used in a repetitive loop.
 * A problem was fixed to allow changing the IPMI channel authentication
   capabilities from the OS.  The following command was causing an IPMI core
   dump "ipmitool channel authcap 1 4" every time it was run.
 * A problem was fixed for a system going into safe mode with SRC B1502616 logged as
   informational without a call home notification.  Notification is needed
   because the system is running with reduced performance.  If there are
   unrecoverable error logs and any are marked with reduced performance and the
   system has not been rebooted, then the system is probably running in safe
   mode with reduced performance.  With the fix, the SRC B1502616 is an
   Unrecoverable Error (UE).
 * A problem was fixed for valid IPv4 static IP addresses not being allowed to
   communicate on the network and not being allowed to be configured.
   The Advanced System Management Interface (ASMI) static IPv4 address
   configuration was not allowing "255" in the IP address subfields.  The
   corrected range checking is as follows:
   Allowed values:  x.255.x.x, x.x.255.x, x.255.255.x
   Disallowed values:  x.x.x.255
   The failure for the communication on the network is seen if the problematic
   IP addresses are in use prior to a firmware update to 860.00, 860.10, 860.11,
   or 860.12.  After the firmware update, the service processor is unable to
   communicate on the network.  The problem can be circumvented by changing the
   service processor to use DHCP addressing, or by moving the IP address to a
   different static IP range, prior to doing the firmware update.
 * A problem was fixed for intermittent failures of DCMI commands when used from
   the HMC to continuously gather power and thermal data.  The maximum number of
   IPMI sessions was being exceeded by the HMC.  The number of IPMI sessions has
   been increased to allow two HMCs to collect data simultaneously.
 * A problem was fixed for an unneeded service action request for an
   informational VRM redundant phase fail error logged with SRC 11002701.  If
   reminders for service action with SRC B150F138 are occurring for this
   problem, then firmware containing the fix needs to be installed and ASMI
   error logs need to be cleared in order to stop the periodic reminder.
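
The corrected IPv4 range checking described above can be sketched as follows.
This is only an illustration of the stated rule (255 is accepted in the second
and third subfields and rejected in the last one), not the ASMI code.  The
function name is hypothetical, and the treatment of the first subfield, which
the text above does not mention, is limited to an ordinary 0-255 range check.

```python
def is_allowed_static_ipv4(address: str) -> bool:
    """Apply the range rule from the fix description to a dotted-quad string."""
    fields = address.split(".")
    if len(fields) != 4:
        return False
    # Every subfield must be a plain decimal number.
    if not all(field.isascii() and field.isdigit() for field in fields):
        return False
    octets = [int(field) for field in fields]
    if any(octet > 255 for octet in octets):
        return False
    # Disallowed: x.x.x.255.  Allowed: 255 in the second or third subfield.
    return octets[3] != 255
```

For example, 10.255.0.1, 10.0.255.1, and 10.255.255.1 pass the check, while
10.0.0.255 is rejected.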
   

System firmware changes that affect certain systems


 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for returning to HMC-only management from co-management when a NovaLink
   partition holding the master mode is deleted.  A circumvention is to release
   master mode before deleting the NovaLink partition and then reconnect the
   disconnected management console.  Please refer to IBM Knowledge Center link
   "http://ibm.biz/novalink-kc" for more information on the PowerVM NovaLink
   feature and changing the master authority when doing co-management.
 * On systems using PowerVM firmware,  a problem was fixed for a blank SRC in
   the LPA dump for user-initiated non-disruptive adjunct dumps.  The A2D03004
   SRC is needed for problem determination and dump analysis.
 * A problem was fixed for the system VPD showing 4 extra PCIe slots that are
   not actually available to the system.  When running an IBM i partition, the
   IBM i Hardware Service Manager shows twelve PCIe adapter slots instead of the
   actual eight that can be used (P1-C2, P1-C3, P1-C4, and P1-C5 are the extra
   slots displayed).  This problem only pertains to the IBM Power System S814
   (8286-41A).
 * On a system using PowerVM firmware with an IBM i partition and VIOS,  a
   problem was fixed for a Live Partition Mobility migration for an IBM i
   partition that fails if there is a VIOS failover during the migration
   suspended window.
 * On a system using PowerVM firmware and VIOS,  a problem was fixed for an HMC
   "Incomplete State" after a Live Partition Mobility migration followed by a
   VIOS failover.  The error is triggered by a delete operation on a migration
   adapter on the VIOS that did the failover.  The HMC "Incomplete State" can be
   recovered from by doing a re-IPL of the system.  This error can also prevent
   a VIOS from activating.
 * On systems using PowerVM firmware, a problem was fixed with SR-IOV adapter
   error recovery where the adapter is left in a failed state in nested error
   cases for some adapter errors.  The probability of this occurring is very low
   since the problem trigger is multiple low-level adapter failures.  With the
   fix, the adapter is recovered and returned to an operational state.
 * On systems using PowerVM firmware with PCIe adapters in Single Root I/O
   Virtualization (SR-IOV) shared mode, a problem was fixed for the hypervisor
   SR-IOV adjunct partition failing during the IPL with SRCs B200F011 and
   B2009014 logged. The SR-IOV adjunct partition successfully recovers after it
   reboots and the system is operational.
 * On systems using PowerVM firmware with PCIe adapters in Single Root I/O
   Virtualization (SR-IOV) shared-mode in a PCIe slot with Enlarged IO Capacity
   and 2TB or more of system memory, a problem was fixed for the hypervisor
   SR-IOV adjunct partition failing during the IPL with SRCs B200F011 and
   B2009014 logged.   In this configuration, it is possible the SR-IOV adapter
   will not become functional following a system reboot or when an adapter is
   first configured into shared-mode.  Larger system memory configurations of
   2TB or more than 1TB are more likely to encounter the problem.  The problem
   can be avoided by reducing the number of PCIe slots with Enlarged IO Capacity
   enabled so it does not include adapters in SR-IOV shared-mode.  Another
   circumvention option is to move the adapter to an SR-IOV capable PCIe slot
   where Enlarged IO Capacity is not enabled.
   
 * On a system using PowerVM firmware and VIOS,  a problem was fixed for a Live
   Partition Mobility (LPM) migration for an Active Memory Sharing (AMS)
   partition that hangs if there is a VIOS failover during the migration.
 * On systems using PowerVM firmware, a problem was fixed for the PCIe3 Optical
   Cable Adapter for the PCIe3 Expansion Drawer failing with SRC B7006A84 error
   logged during the IPL.  The failed cable adapter can be recovered by using a
   concurrent repair operation to power it off and on.  Or the system can be
   re-IPLed to recover the cable adapter.  The affected optical cable adapters
   have feature codes #EJ05, #EJ06, and #EJ08 with CCINs 2B1C, 6B52, and 2CE2,
   respectively.
 * On systems using PowerVM firmware, the hypervisor "vsp" macro was enhanced to
   show the type of the adjunct partition.  The "vsp -longname" macro option was
   also updated to list the location codes for the SR-IOV adjunct partitions. 
   The hypervisor macros are used by IBM support to help debug Power system
   problems.
 * On systems using PowerVM firmware, a problem was fixed for PCIe Host Bridge
   (PHB) outages and PCIe adapter failures in the PCIe I/O expansion drawer
   caused by error thresholds being exceeded for the LEM bit [21] errors in the
   FIR accumulator.  These are typically minor and expected errors in the PHB
   that occur during adapter updates and do not warrant a reset of the PHB and
   the PCIe adapter failures.  Therefore, the threshold LEM[21] error limit has
   been increased and the LEM fatal error has been changed to a Predictive Error
   to avoid the outages for this condition.
 * On systems using PowerVM firmware, a problem was fixed to improve link
   stability for the PCIe3 I/O expansion drawer (#EMX0).  The settings for the
   continuous time linear equalizers (CTLE) were updated for all the PCIe
   adapters for the PCIe links to the expansion drawer.  The CEC must be
   re-IPLed for the fix to activate.
   
 * On systems using PowerVM firmware with IBM i partitions, a problem was fixed
   for frequent logging of informational B7005120 errors due to communications
   path closed conditions during messaging from HMCs to IBM i partitions.  In the
   majority of cases these errors are due to normal operating conditions and not
   due to errors that require service or attention.  The logging of
   informational errors due to this specific communications path closed
   condition that are the result of normal operating conditions has been
   removed.
 * On a system using PowerVM firmware with an IBM i partition,  a problem was
   fixed for a D-mode boot failure for IBM i from a USB RDX cartridge.  There
   is a hang at the LPAR progress code C2004130 for a period of time and then a
   failure with SRC B2004158 logged.  There is a USB External Dock (FC #EU04)
   and Removable Disk Cartridge (RDX) 63B8-005 attached.  The error is
   intermittent so the RDX can be powered off and back on to retry the D-mode
   boot to recover.
 * On systems using the OPAL firmware, Petitboot was updated to v1.2.7.  It is
   now less verbose during boot - only error-level messages are printed
   during Petitboot bootloader initialization.  This means that there will be
   fewer messages printed as the system boots. Additionally, the Petitboot user
   interface is started earlier in the boot process. This means that the user
   will be presented with the user interface sooner, but it may still take time,
   potentially up to 30 seconds, for the user interface to be populated with
   boot options as storage and network hardware is being initialized.  During
   this time, Petitboot will show the status message "Info: Waiting for device
   discovery".  When Petitboot device discovery is completed, the following
   status message will be shown "Info: Connected to pb-discover!".
 * On systems using PowerVM firmware,  the following problems were fixed for
   SR-IOV adapters:
   1) Insufficient resources reported for SR-IOV logical port configured with
   promiscuous mode enable and a Port VLAN ID (PVID) when creating new interface
   on the SR-IOV adapters.
   2) Spontaneous dumps and reboot of the adjunct partition for SR-IOV adapters.
   3) Adapter enters firmware loop when single bit ECC error is detected. 
   System firmware detects this condition as an adapter command time out.  System
   firmware will reset and restart the adapter to recover the adapter
   functionality.  This condition will be reported as a temporary adapter
   hardware failure.
   4) vNIC interfaces not being deleted correctly causing SRC B400FF01 to be
   logged and Data Storage Interrupt (DSI) errors with failure on boot of the
   LPAR.
   This set of fixes updates adapter firmware to 10.2.252.1926, for the
   following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N,
   EN0K, EN0L, EL38, EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * On systems using PowerVM firmware with an IBM i partition, a problem was
   fixed for incorrect maximum performance reports based on the wrong number of
   "maximum" processors for the system.   Certain performance reports that can
   be generated on IBM i systems contain not only the existing machine
   information, but also "what-if" information, such as "how would this system
   perform if it had all the processors possible installed in this system". 
   This "what-if" report was in error because the maximum number of processors
   possible was too high for the system.
 * On systems using PowerVM firmware, a problem was fixed for degraded PCIe3
   links for the PCIe3 expansion drawer with SRC B7006A8F not being visible on
   the HMC.  This occurred because the SRC was informational.  The problem
   occurs when the link attaching a drawer to the system trains to x8 instead of
   x16.  With the fix, the SRC has been changed to a B70006A8B permanent error
   for the degraded link.
 * On systems using PowerVM firmware, a problem was fixed for a concurrent
   exchange of a CAPI adapter that left the new adapter in a deactivated
   state.   The system can be powered off and IPLed again to recover the new
   adapter.  The CAPI adapters have the following feature codes:  #EC3E, #EC3F,
   #EC3L, #EC3M, #EC3T, #EC3U, #EJ16, #EJ17, #EJ18, #EJ1A, and #EJ1B.
 * On a system using PowerVM firmware with SR-IOV adapters,  a problem was fixed
   for a DLPAR remove on a Virtual Function (VF) of a ConnectX-4 (CX4) adapter
   that failed with AIX error "0931-013 Unable to isolate the resource".  The
   HMC reported error is "HSCL12B5 The operation to remove SR-IOV logical port
   xx failed because of the following error: HSCL131D The SR-IOV logical port is
   still in use by the partition".  The failing PCIe3 adapters are sourced from
   Mellanox Corporation based on ConnectX-4 technology and have the following
   feature codes and CCINs:  #EC3E and #EC3F with CCIN 2CEA; #EC3L and #EC3M with
   CCIN 2CEC; and #EC3T and #EC3U with CCIN 2CEB.  The issue occurs each time a
   DLPAR remove operation is attempted on the VF.  Restarting the partition
   after a failed DLPAR remove recovers from the error.
 * A problem was fixed for the serial port being disabled on the service
   processor for the IBM Power System E850 (8408-44E).  There is no response
   when plugging the serial port.
 * On systems using PowerVM firmware, a problem was fixed for NVRAM corruption
   that can occur when deleting a partition that owns a CAPI adapter, if that
   CAPI adapter is not assigned to another partition before the system is
   powered off.  On a subsequent IPL, the system will come up in recovery mode
   if there is NVRAM corruption.  To recover, the partitions must be restored
   from the HMC.  The frequency of this error is expected to be rare.  The CAPI
   adapters have the following feature codes:  #EC3E, #EC3F, #EC3L, #EC3M,
   #EC3T, #EC3U, #EJ16, #EJ17, #EJ18, #EJ1A, and #EJ1B.
 * On systems using PowerVM firmware, a problem was fixed for NVRAM corruption
   and an HMC recovery state when using Simplified Remote Restart partitions. 
   The failing systems will have at least one Remote Restart partition and on
   the failed IPL there will be a B70005301 SRC with word 7 being 0X00000002.
 * On systems using PowerVM firmware, a problem was fixed for a group of shared
   processor partitions being able to exceed the designated capacity placed on a
   shared processor pool.  This error can be triggered by using the DLPAR move
   function for the shared processor partitions, if the pool has already reached
   its maximum specified capacity.  To prevent this problem from occurring when
   making DLPAR changes when the pool is at the maximum capacity, do not use the
   DLPAR move operation but instead break it into two steps:  DLPAR remove
   followed by DLPAR add.  This gives enough time for the DLPAR remove to be
   fully completed prior to starting the DLPAR add request.
 * On systems using PowerVM firmware, a problem was fixed for partition boot
   failures and run time DLPAR failures when adding I/O that log BA210000,
   BA210003, and/or BA210005 errors.  The fix also applies to run time failures
   configuring an I/O adapter following an EEH recovery that log BA188001
   events.  The problem can impact IBM i partitions running in any processor mode
   or AIX/Linux partitions running in P7 (or older) processor compatibility
   modes.  The problem is most likely to occur when the system is configured in
   the Manufacturing Default Configuration (MDC) mode.  The trigger for the
   problem is a race-condition between the hypervisor and the physical
   operations panel with a very rare frequency of occurrence.
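
The shared processor pool circumvention in the list above (a DLPAR remove that
fully completes, followed by a DLPAR add) can be pictured with a toy accounting
model.  Everything in this sketch is hypothetical: the class, the method names,
and the use of integer hundredths of a processor as capacity units.  It does
not represent how the hypervisor or the HMC implements DLPAR; it only shows why
the order of the two steps matters when the pool is already at its maximum
capacity.

```python
class SharedPool:
    """Toy model of a shared processor pool with a maximum capacity."""

    def __init__(self, max_capacity: int):
        self.max_capacity = max_capacity  # in hundredths of a processor
        self.entitlements = {}            # partition name -> capacity units

    def used(self) -> int:
        return sum(self.entitlements.values())

    def add(self, partition: str, units: int) -> None:
        # Enforce the pool cap at the moment capacity is added.
        if self.used() + units > self.max_capacity:
            raise ValueError("pool capacity would be exceeded")
        self.entitlements[partition] = self.entitlements.get(partition, 0) + units

    def remove(self, partition: str, units: int) -> None:
        if self.entitlements.get(partition, 0) < units:
            raise ValueError("partition does not hold that much capacity")
        self.entitlements[partition] -= units


def move_in_two_steps(pool: SharedPool, source: str, target: str, units: int) -> None:
    """Complete the remove before starting the add, as the circumvention advises."""
    pool.remove(source, units)
    pool.add(target, units)
```

In this model, a pool that is full rejects an add that is attempted before the
matching remove has completed, while the two-step sequence keeps the total at
or below the pool maximum at every point.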

SV860_070_056 / FW860.12

01/13/17 Impact:  Availability      Severity:  SPE

The following pertains to Power System S812L (8247-21L), Power System S822L
(8247-22L), Power System S824L (8247-42L), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A) and Power System E850C
(8408-44E) servers only.


System firmware changes that affect certain systems


 * On a system using PowerVM firmware, a problem was fixed for the System
   Management Services (SMS) SAS utility showing very large (incorrect) disk
   capacity values depending on the size of the disk or Volume Set/Array.  The
   problem occurs when the number of blocks on a disk is 2 G or more.
 * On a system using PowerVM firmware running a Linux OS,  a problem was fixed
   for support for Coherent Accelerator Processor Interface (CAPI) adapters. 
   The CAPI related RTAS h-calls for the CAPI devices could not be made by the
   Linux OS, impacting the CAPI adapter functionality and usability.  This
   problem involves the following adapters:  the PCIe3 LP CAPI Accelerator
   Adapter with F/C #EJ16 that is used on the S812L(8247-21L) and S822L
   (8247-22L) models;  the PCIe3 CAPI FlashSystem Accelerator Adapter with F/C
   #EJ17  that is used on the S814(8286-41A) and S824(8286-42A) models;  and the
   PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ18 that is used on the
   S822(8284-22A), E870(9119-MME), and E880(9119-MHE) models.  This problem does
   not pertain to PowerVM AIX partitions using CAPI adapters.
 * On a system using PowerVM firmware, a problem was fixed for Live Partition
   Mobility (LPM) migrations to FW860.10 or FW860.11 from any other level of
   firmware (i.e. not FW860.10 or FW860.11) that caused errors in the output of
   the AIX "lsattr -El mem0" command and Dynamic LPAR (DLPAR) operations.  The
   "lsattr" command will report the partition only has one logical memory block
   (LMB) of memory assigned to it, even though there is more memory assigned to
   the partition.  Also, as a result of this problem, DLPAR operations will fail
   with an error indicating the request could not be completed.  This issue
   affects AIX 5.3, AIX 6.1, AIX 7.1, AIX 7.2 TL 0, and may result in AIX DLPAR
   error message "0931-032 Firmware failure.   Data may be out of sync and the
   system may require a reboot."  This issue also affects all levels of Linux. 
   Not affected by this issue are AIX 7.2 TL 1, VIOS and IBM i partitions.
   In addition, after performing LPM from FW860 to earlier versions of
   firmware,  the DLPAR of Virtual Adapters will fail with HMC error message
   HSCL294C, which contains text similar to the following:  "0931-007 You have
   specified an invalid drc_name."
   Without the fix, a reboot of the migrated partition will correct the problem.
   
 * On a system using PowerVM firmware, a problem was fixed for I/O DLPARs that
   result in partition hangs.  To trigger the problem, the DLPAR operation must
   be performed on a partition which has been migrated via a Live Partition
   Mobility (LPM) operation from a P6 or P7 system to a P8 system. 
   Additionally, DLPAR of I/O will fail when performed on a partition which has
   been migrated via an LPM operation from a P8 system to a P6 or P7 system. 
   The failure will produce HMC error message HSCL2928, which contains text
   similar to the following: "0931-011  Unable to allocate the resource to the
   partition." DLPAR operations for memory or CPU are not affected.  This issue
   affects all Linux and AIX partitions.  IBM i partitions are not affected.
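
The SMS SAS utility fix in the list above mentions that incorrect capacities
appear once a disk has 2 G or more blocks.  This document does not state the
root cause.  One plausible mechanism, offered here purely as an assumption, is
a block count held in a signed 32-bit field, which wraps negative at 2**31 and
then looks enormous when reinterpreted as an unsigned 64-bit byte count.  The
function names and the 512-byte block size below are illustrative only.

```python
import ctypes

BLOCK_SIZE = 512  # assumed block size in bytes


def capacity_bytes_wrapped(num_blocks: int) -> int:
    """Capacity as computed if the block count passes through a signed 32-bit field."""
    signed_blocks = ctypes.c_int32(num_blocks).value  # wraps negative at 2**31
    # Reinterpreting the (possibly negative) product as unsigned 64-bit
    # produces a very large value.
    return ctypes.c_uint64(signed_blocks * BLOCK_SIZE).value


def capacity_bytes(num_blocks: int) -> int:
    """Capacity computed with a block count wide enough for large disks."""
    return num_blocks * BLOCK_SIZE
```

Below 2**31 blocks both functions agree.  At 2**31 blocks the correct result is
1 TiB (2**40 bytes), while the wrapped calculation yields a value above 2**63.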

SV860_063_056 / FW860.11

12/05/16 Impact:  Availability      Severity:  SPE

The following pertains to Power System S812L (8247-21L), Power System S822L
(8247-22L), Power System S824L (8247-42L), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A) and Power System E850C
(8408-44E) servers only.



System firmware changes that affect certain systems


 * DEFERRED: A problem was fixed for a Field Core Override (FCO) error that
   causes a processor chip without functional cores to be guarded with a SRC
   B111BA24 error logged and by guard association causes all the memory and I/O
   resources behind the processor chip to be lost for the current IPL.  This
   problem is triggered by a system being manufactured with one or more feature
   codes of #2319 (Factory Deconfiguration of 1-core) to assist with
   optimization of software licensing.  For more information on Field Core
   Override, refer to IBM Knowledge Center:
   http://www.ibm.com/support/knowledgecenter/POWER8/p8hby/fieldcore.htm.  The
   error only occurs in systems where the total number of active cores is less
   than the number of processor chips.  When the fix is applied on a system that
   has lost memory or I/O resources due to the errant processor guard, the
   system must be re-IPLed with the guard removed from the processor to recover
   the resources.
   Without the fix, the problem may be circumvented by the following four steps:
   1) Power off the system.
   2) Use the Field Core Override function to increase the number of active
   processor cores in the system.  The Advanced System Management Interface
   (ASMI) "System Configuration -> Hardware Deconfiguration -> Field Core
   Override" panel shows the number of cores that are active in the system and
   it can be used to increase the number of active processor cores in the
   system.
   3) Unguard the failed processor.  Use the ASMI "System Configuration ->
   Hardware Deconfiguration -> Clear All Deconfiguration Errors" panel to
   restore the guarded processor. 
   4) IPL with the increased number of active processor cores and the unguarded
   processor.
   This problem does not pertain to the IBM Power System E850 (8408-44E) model.
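
The trigger condition for the Field Core Override problem above (fewer active
cores than processor chips) is a simple counting argument: if there are fewer
active cores than chips, at least one chip must be left without a functional
core.  The helper names in this short sketch are hypothetical.

```python
def fco_guard_possible(active_cores: int, processor_chips: int) -> bool:
    """True when the configuration matches the stated trigger condition."""
    return active_cores < processor_chips


def min_chips_without_cores(active_cores: int, processor_chips: int) -> int:
    """Smallest number of chips that cannot have any active core."""
    return max(0, processor_chips - active_cores)
```

For example, a two-chip system limited to one active core has at least one chip
with no functional cores, which is the situation the fix addresses.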

SV860_056_056 / FW860.10

11/18/16 Impact:  New      Severity:  New

The following pertains to Power System S812L (8247-21L), Power System S822L
(8247-22L), Power System S824L (8247-42L), Power System S822 (8284-22A), Power
System S814 (8286-41A), Power System S824 (8286-42A) and Power System E850C
(8408-44E) servers only.

New features and functions


 * Support enabled for Live Partition Mobility (LPM) operations.
 * Support enabled for partition Suspend and Resume from the HMC.
 * Support enabled for partition Remote Restart.
 * Support enabled for PowerVM vNIC. PowerVM vNIC combined many of the best
   features of SR-IOV and PowerVM SEA to provide a network solution with options
   for advanced functions such as Live Partition Mobility along with better
   performance and I/O efficiency when compared to PowerVM SEA.  In addition
   PowerVM vNIC provided users with bandwidth control (QoS) capability by
   leveraging SR-IOV logical ports as the physical interface to the network.
 * Support for dynamic setting of the Simplified Remote Restart VM property,
   which enables this property to be turned on or off dynamically with the
   partition running.
 * Support for PowerVM and HMC to get and set the boot list of a partition.
 * Support for PowerVM partition restart in a Disaster Recovery (DR)
   environment.
 * On systems using PowerVM firmware, support for PCIe3 3D graphics (F/C #EC51)
   adapter for Linux boot.  Supported Linux OS distributions are Red Hat
   Enterprise Linux 7.3 and SLES 12 SP2.  This feature only applies to S822
   (8284-22A), S812L (8247-21L), and S822L (8247-22L) systems.
 * Support for concurrent add of a PCIe3 Optical cable card (#EJ08 and CCIN
   2CE2) used to attach the PCIe expansion drawer.  This feature pertains to
   E850(8408-E8E) and E850 (8408-44E) systems only.
 * Support for concurrent add of a PCIe expansion drawer (#EMX0) to an existing
   cable card.  This feature pertains to E850(8408-E8E) and E850 (8408-44E)
   systems only.
 * Support on PowerVM for a partition with 32 TB memory.  AIX, IBM i and Linux
   are supported but IBM i must be IBM i 7.3 TR1.  IBM i 7.2 has a limit of 16
   TB per partition and IBM i 7.1 has a limit of 8 TB per partition.  AIX level
   must be 7.1S or later.  Linux distributions supported are RHEL 7.2 P8,  SLES
   12 SP1,  Ubuntu 16.04 LTS, RHEL 7.3 P8,  SLES 12 SP2, Ubuntu 16.04.1,  and
   SLES 11 SP4 for SAP HANA.
 * Support for four processors for each IBM i partition with VIOS (up from limit
   of two processors) on the IBM Power System S822 (8284-22A).
 * Support for PowerVM and PowerNV (non-virtualized or OPAL bare-metal) booting
   from a PCIe Non-Volatile Memory express (NVMe) flash adapter.  The adapters
   include feature codes #EC54 and #EC55 - 1.6 TB,  and #EC56 and #EC57 - 3.2 TB
   NVMe flash adapters with CCIN 58CB and 58CC respectively.
 * Support for PowerVM NovaLink V1.0.0.4 which includes the following features:
   - IBM i network boot
   - Live Partition Mobility (LPM) support for inactive source VIOS
   - Support for SR-IOV configurations, vNIC, and vNIC failover
   - Partition support for Red Hat Enterprise Linux
 * Support for a decrease in the amount of PowerVM memory needed to support Huge
   Dynamic DMA Window (HDDW) for a PCI slot by using 64K pages instead of 4K
   pages.  The hypervisor only allocates enough storage for the Enlarged IO
   Capacity (Huge Dynamic DMA Window) capable slots to map every page in main
   storage with 64K pages rather than 4K pages as was done previously.  This
   affects only the Linux OS as AIX and IBM i do not use HDDW.
 * Support was enhanced for the Power Linux models to increase the default
   number of slots for I/O Adapter Enlarged Capacity PCI slots from 4 to 13.  In
   860.10, the new default of 13 Enlarged I/O slots will use approximately 1.5
   GB of storage (which is a factor of 10 less than what would have been
   previously required for this many slots, benefiting by the PowerVM change to
   64K pages from 4K pages for HDDW). Huge DMA is a PCIe slot capability on IBM
   Power Systems servers that enables a DMA window to be wider, possibly
   allowing all the partition memory to be mapped for DMA. This feature avoids
   increased system usage when DMA mappings are requested by the adapter driver,
   because all the system memory assigned to the partition is already mapped.
   Consequently, this feature enables the data transfer between the I/O card
   that is placed in this slot and the system memory to be more efficient and
   with lower latency. The performance benefit will vary based on the operating
   system and adapter being used. Linux performance information can be found in
   the 64-bit DMA performance benefit topic in the performance section of the
   IBM Knowledge Center:
   http://www.ibm.com/support/knowledgecenter/linuxonibm/liabm/liabmconcepts.htm.
   This feature enhancement only pertains to the IBM Power System S812L
   (8247-21L), S822L (8247-22L) and S824L (8247-42L) models.
 * Support added to reduce the number of error logs and call homes for the
   non-critical FRUs for the power and thermal faults of the system.
 * Support for redundancy in the transfer of partition state for Live
   Partition Mobility (LPM) migration operations.  Redundant VIOS Mover Service
   Partitions (MSPs) can be defined along with redundant network paths at the
   VIOS/MSP level.  When redundant MSP pairs are used, the migrating memory
   pages of the logical partition are transferred from the source system to the
   target system by using two MSP pairs simultaneously. If one of the MSP pairs
   fails, the migration operation continues by using the other MSP pair. In some
   scenarios, where a common shared Ethernet adapter is not used, use redundant
   MSP pairs to improve performance and reliability.
   Note:  For an LPM migration for a partition using Active Memory Sharing
   (AMS) in a dual (redundant) MSP configuration, the LPM operation may hang if
   the MSP connection fails during the LPM migration. To avoid this issue that
   applies only to AMS partitions,  the AMS migrations should only be done from
   the HMC command line using the migrlpar command and specifying --redundentmsp
   0 to disable the redundant MSPs.
   Note: To use redundant MSP pairs, all VIOS MSPs must be at version 2.2.5.00
   or later, the HMC at version 8.6.0 or later, and the firmware level FW860 or
   later.
   For more information on LPM and VIOS supported levels and restrictions, refer
   to the following links on the IBM Knowledge Center:
   http://www.ibm.com/support/knowledgecenter/PurePower/p8hc3/p8hc3_firmwaresupportmatrix.htm
   https://www.ibm.com/support/knowledgecenter/HW4L4/p8eeo/p8eeo_ipeeo_main.htm
 * Support for failover capability for vNIC client adapters in the PowerVM
   hypervisor, rather than requiring the failover configuration to be done in
   the client OS.  To create a redundant connection, the HMC adds another vNIC
   server with the same remote lpar ID and remote DRC as the first, giving each
   server its own priority.
   
 * Support for SAP HANA with Solution edition with feature code #EPVR on 3.65
   GHz processors and 12-core activations and 512 GB memory activations on SUSE
   Linux.  SAP HANA is an in-memory platform for processing high volumes of
   data in real-time. HANA allows data analysts to query large volumes of data
   in real-time. HANA's in-memory database infrastructure frees analysts from
   having to load or write-back data.
 * Support for the Hardware Management Console (HMC)  to access the service
   processor IPMI credentials and to retrieve Performance and Capacity Monitor
   (PCM) data for viewing in a tabular format or for exporting as CSV values.
   The enhanced HMC interface can now start and stop VIOS Shared Storage Pool
   (SSP) monitoring from the HMC and start and stop SSP historical data
   aggregation.
 * Support for the Advanced System Management Interface (ASMI) was changed to
   not create VPD deconfiguration records and call home alerts for hardware FRUs
   that have one VPD chip of a redundant pair broken or inaccessible.  The
   backup VPD chip for the FRU allows continued use of the hardware resource. 
   The notification of the need for service for the FRU VPD is not provided
   until both of the redundant VPD chips have failed for a FRU.
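
The Huge Dynamic DMA Window (HDDW) items above attribute the memory savings to mapping main storage with 64K pages instead of 4K pages.  The arithmetic behind that can be sketched as follows.  This is only an illustration: the 8-byte translation entry size, the 1 TiB memory size, and the function name are assumptions that do not come from these notes, and the sketch does not try to reproduce the 1.5 GB figure quoted for the 13-slot default.

```python
# Sketch: estimate the translation-table storage needed to map all of memory
# for one HDDW-capable slot at two page sizes.
# ASSUMPTION: 8 bytes per translation entry (not stated in these notes).
ENTRY_BYTES = 8

KIB = 1 << 10
MIB = 1 << 20
TIB = 1 << 40


def table_bytes(memory_bytes: int, page_bytes: int) -> int:
    """Bytes needed to hold one entry for every page of memory."""
    pages = -(-memory_bytes // page_bytes)  # ceiling division
    return pages * ENTRY_BYTES


# Hypothetical configuration: 1 TiB of main storage.
small_pages = table_bytes(1 * TIB, 4 * KIB)
large_pages = table_bytes(1 * TIB, 64 * KIB)

print(small_pages // MIB, "MiB per slot with 4K pages")
print(large_pages // MIB, "MiB per slot with 64K pages")
print("reduction factor:", small_pages // large_pages)
```

Under these assumptions the entry count, and so the table size, shrinks by the ratio of the page sizes (16), regardless of the entry size chosen.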
   

System firmware changes that affect all systems

 * A problem was fixed for a failed IPL with SRC UE BC8A090F that does not have
   a hardware callout or a guard of the failing hardware.  The system may be
   recovered by guarding out the processor associated with the error and
   re-IPLing the system.  With the fix, the bad processor core is guarded and
   the system is able to IPL.
 * A problem was fixed for an Operations Panel Function 04 (Lamp test) during an
   IPL causing the IPL to fail.  With the fix, the lamp test request is rejected
   during the IPL until the hypervisor is available.  The lamp test can be
   requested without problems anytime after the system is powered on to
   hypervisor ready or an OS is running in a partition.
 * A problem was fixed for On-Chip Controller (OCC) errors that had excessive
   callouts for processor FRUs.  Many of the OCC errors are recoverable and do
   not require that the processor be called out and guarded.  With the fix, the
   processors will only be called out for OCC errors if there are three or more
   OCC failures during a time period of a week.
 * A problem was fixed for the On-Chip Controller (OCC) incorrectly calling out
   processors with SRC B1112A16 for L4 Cache DIMM failures with SRC B124E504. 
   This false error logging can occur if the DIMM slot that is failing is
   adjacent to two unoccupied DIMM slots.
 * A problem was fixed for device time-outs during an IPL logged with SRC
   B18138B4.  This error is intermittent and no action is needed for the error
   log.  The service processor hardware server now allots more time for the
   device transactions to allow the transactions to complete without a time-out
   error.
 * Support for the 6-core processor with FC #8A2225 and CCIN 54E1 was extended for use
   in the Power System S822L (8247-22L).  Support was already in place for this
   processor since FW810.20 for the S822 (8284-22A).
 * For the IBM Power System E850 (8408-44E) system, a problem was fixed for the
   incorrect values for the Idle Power Saver (IPS) mode call home data.  The
   call home "max" is reported as a much lower number than what the On-Chip
   Controllers (OCCs) read for the IPS.  This problem only affects 4-socket
   systems as it is caused by an integer overflow of the summation of the IPS
   value from all OCCs in the system.
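
The Idle Power Saver item above says the low "max" value comes from an integer overflow when the IPS values of all OCCs are summed, which is why only 4-socket systems are affected.  A minimal sketch of that failure mode follows.  The 16-bit accumulator width and the per-OCC reading of 20000 are hypothetical values picked for illustration; the notes do not give the real field width or units.

```python
# Sketch: summing per-OCC values into a fixed-width accumulator.
# ASSUMPTION: a 16-bit unsigned accumulator (the real width is not stated).
ACC_BITS = 16
MASK = (1 << ACC_BITS) - 1


def sum_wrapping(values):
    """Sum the values the way a fixed-width unsigned integer would."""
    total = 0
    for value in values:
        total = (total + value) & MASK  # high bits are silently dropped
    return total


def sum_wide(values):
    """Sum the values with no width limit (Python integers do not overflow)."""
    return sum(values)


PER_OCC = 20000  # hypothetical reading from each OCC

two_socket = [PER_OCC] * 2
four_socket = [PER_OCC] * 4

print(sum_wrapping(two_socket), sum_wide(two_socket))    # 40000 40000
print(sum_wrapping(four_socket), sum_wide(four_socket))  # 14464 80000
```

With these numbers the 2-socket total still fits, while the 4-socket total wraps around and is reported as a much smaller value, matching the symptom described.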
   

System firmware changes that affect certain systems


 * DISRUPTIVE:  On systems using the PowerVM firmware, a problem was fixed for
   an "Incomplete" state caused by initiating a resource dump with selector
   macros from NovaLink (vio -dump -lp 1 -fr).  The failure causes the stack
   frame size of a communication process, HVHMCCMDRTRTASK, to be exceeded, with
   a hypervisor page fault that disrupts the NovaLink and/or HMC
   communications. The recovery action is to re-IPL the CEC but that will need
   to be done without the assistance of the management console.  For each
   partition that has an OS running on the system, shut down each partition from
   the OS.  Then from the Advanced System Management Interface (ASMI),  power
   off the managed system.  Alternatively, the system power button may also be
   used to do the power off.  If the management console Incomplete state
   persists after the power off, the managed system should be rebuilt from the
   management console.  For more information on management console recovery
   steps, refer to this IBM Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm. 
   The fix is disruptive because the size of the PowerVM hypervisor must be
   increased to accommodate the over-sized stack frame of the failing task.
 * DEFERRED:  On systems using the PowerVM firmware, a problem was fixed for a
   CAPI function unavailable condition on a system with the maximum number of
   CAPI adapters and partitions.  Not enough bytes were allocated for CAPI for
   the maximum configuration case.  The problem may be circumvented by reducing
   the number of active partitions or CAPI adapters.   The fix is deferred
   because the size of the hypervisor must be increased to provide the
   additional CAPI space.
 * DEFERRED:   On systems using PowerVM firmware, a problem was fixed for cable
   card capable PCI slots that fail during the IPL.  Hypervisor I/O Bus
   Interface UE B7006A84 is reported for each cable card capable PCI slot that
   doesn't contain a PCIe3 Optical Cable Adapter for the PCIe Expansion Drawer
   (feature code #EJ05).  PCI slots containing a cable card will not report an
   error but will not be functional.  The problem can be resolved by performing
   an AC cycle of the system.  The trigger for the failure is the I2C devices
   used to detect the cable cards are not coming out of the power on reset
   process in the correct state due to a race condition.
 * On systems using PowerVM firmware, a problem was fixed for network issues,
   causing critical situations for customers, when an SR-IOV logical port or
   vNIC is configured with a non-zero Port VLAN ID (PVID).  This fix updates
   adapter firmware to 10.2.252.1922, for the following Feature Codes: EN15,
   EN16, EN17, EN18, EN0H, EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * On systems using the PowerVM firmware, a problem was fixed for a Live
   Partition Mobility migration that resulted in the source managed system going
   to the management console Incomplete state after the migration to the target
   system was completed.  This problem is very rare and has only been detected
   once. The problem trigger is that the source partition does not halt
   execution after the migration to the target system.   The management console
   went to the Incomplete state for the source managed system when it failed to
   delete the source partition because the partition would not stop running. 
   When this problem occurred, the customer network was running very slowly and
   this may have contributed to the failure.  The recovery action is to re-IPL
   the source system but that will need to be done without the assistance of the
   management console.  For each partition that has an OS running on the source
   system, shut down each partition from the OS.  Then from the Advanced System
   Management Interface (ASMI),  power off the managed system.  Alternatively,
   the system power button may also be used to do the power off.  If the
   management console Incomplete state persists after the power off, the managed
   system should be rebuilt from the management console.  For more information
   on management console recovery steps, refer to this IBM Knowledge Center
   link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * On systems using the PowerVM firmware, a fix was made to provide an option to
   change the ordering of PCIe Host Bridge (PHB) devices on Power 8 systems to
   match the discovery order on Power 7 systems.
 * On systems using PowerVM firmware,  a problem was fixed for a shared
   processor pool partition showing an incorrect zero "Available Pool Processor"
   (APP) value after a concurrent firmware update.  The zero APP value means
   that no idle cycles are present in the shared processor pool but in this case
   it stays zero even when idle cycles are available.  This value can be
   displayed using the AIX "lparstat" command.  If this problem is encountered,
   the partitions in the affected shared processor pool can be dynamically moved
   to a different shared processor pool.  Before the dynamic move, the
   "uncapped" partitions should be changed to "capped" to avoid a system hang.
   The old affected pool would continue to have the APP error until the system
   is re-IPLed.
 * On systems using PowerVM firmware, a problem was fixed for a latency time of
   about 2 seconds being added to a target Live Partition Mobility (LPM)
   migration system when there is a latency time check failure.  With the fix,
   in the case of a latency time check failure, a much smaller default latency
   is used instead of two seconds.  This error would not be noticed if the
   customer system is using an NTP time server to maintain the time.
 * On systems with OPAL firmware, a problem was fixed for misaligned mapped
   interrupts to virtual PCI devices that could cause a PB_CENT_CRESP_ADDR_ERROR
   checkstop.
 * On systems with OPAL firmware, a problem was fixed for a PXE (Preboot
   eXecution Environment) boot (also known as network boot) hang that occurred
   when a network server was down.  With the fix, the boot is able to recover so
   that alternative methods of booting can be selected using petitboot menu
   items.
 * A problem was fixed for PCI Host Bridge (PHB)  "link down"  Endpoint
   Recoverable errors that became fatal exceptions when not handled by the CAPI
   adapters.  With the fix, the recoverable errors are now detected by the CAPI
   adapters to allow for run-time link recovery.
 * On systems using PowerVM firmware,  a rare problem was fixed for a system
   hang that can occur when dynamically moving "uncapped" partitions to a
   different shared processor pool.  To prevent a system hang, the "uncapped"
   partitions should be changed to "capped" before doing the move.
 * On systems using the PowerVM firmware, support was added for a new utility
   option for the System Management Services (SMS) menus.  This is the SMS SAS
   I/O Information Utility.  It has been introduced to allow a user to get
   additional information about the attached SAS devices.  The utility is
   accessed by selecting option 3 (I/O Device Information) from the main SMS
   menu, and then selecting the option for "SAS Device Information".
 * On systems using the PowerVM hypervisor firmware and NovaLink, a problem was
   fixed for a NovaLink installation error where the hypervisor was unable to
   get the maximum logical memory buffer (LMB) size from the service processor. 
   The maximum supported LMB size should be 0xFFFFFFFF but in some cases it was
   initialized to a value that was less than the amount of configured memory,
   causing the service processor read failure with error code 0X00000134.
 * On systems using the PowerVM hypervisor firmware and CAPI adapters, a problem
   was fixed for CAPI adapter error recovery.  When the CAPI adapter goes into
   the error recovery state, the Memory Mapped I/O (MMIO) traffic to the adapter
   from the OS continues, disrupting the recovery.  With the fix, the MMIO and
   DMA traffic to the adapter are now frozen until the CAPI adapter is fully
   recovered.   If the adapter becomes unusable because of this error, it can be
   recovered using concurrent maintenance steps from the HMC, keeping the
   adapter in place during the repair.  The error has a low frequency since it
   only occurs when the adapter has failed for another reason and needs
   recovery.
   
 * On systems using the PowerVM hypervisor firmware, when using affinity groups,
   if the group includes a VIOS, ensure the group is placed in the same drawer
   where the VIOS physical I/O is located.  Prior to this change,  if the VIOS
   was in an affinity group with other partitions, the partitions' placement
   could override the VIOS adapter placement rules and the VIOS could end up in
   a different drawer from the IO adapters.
 * On systems using PowerVM firmware,  a problem was fixed to improve error
   recovery when attempting to boot an iSCSI target backed by a drive formatted
   with a block size other than 512 bytes.  Instead of stopping on this error,
   the boot attempt fails and then continues with the next potential boot
   device.  Information regarding the reason for the boot failure is available
   in an error log entry.  The 512 byte block size for backing devices for iSCSI
   targets is a partition firmware requirement.
 * On systems using PowerVM firmware, a problem was fixed for a false thermal
   alarm in the active optical cables (AOC) for the PCIe3 expansion drawer with
   SRCs B7006AA6 and B7006AA7 being logged every 24 hours.  The AOC cables have
   feature codes of #ECC6 through #ECC9, depending on the length of the cable. 
   The SRCs should be ignored as they call for the replacement of the cable,
   cable card, or the expansion drawer module.  With the fix, the false AOC
   thermal alarms are no longer reported.
 * On systems using PowerVM firmware that have an attached HMC,  a problem was
   fixed for a Live Partition Mobility migration that resulted in a system hang
   when an EEH error occurred simultaneously with a request for a page migration
   operation.  On the HMC, it shows an incomplete state for the managed system
   with reference code A181D000.  The recovery action is to re-IPL the source
   system but that will need to be done without the assistance of the HMC.  From
   the Advanced System Management Interface (ASMI),  power off the managed
   system.  Alternatively, the system power button may also be used to do the
   power off.  If the HMC Incomplete state persists after the power off, the
   managed system should be rebuilt from the HMC.  For more information on HMC
   recovery steps, refer to this IBM Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * On systems using the OPAL firmware, a problem was fixed for fundamental PCI
   resets at boot time causing the PCI adapters to not be usable in the Linux
   OS.  No errors occur in the skiboot but the adapters are not configurable
   once the OS is reached.
 * On systems using the OPAL firmware, a problem was fixed for time-out errors
   during the power off of PCI slots with " Timeout powering off slot ...
   FIRENZE-PCI: Wrong state 00000000 on slot" error message during a power off
   of the system.
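
The iSCSI boot item above describes the corrected behavior: a boot device whose backing drive does not use 512-byte blocks is failed with an error log entry, and the boot continues with the next potential boot device instead of stopping.  The control flow can be sketched as below.  The class, function, and device names are hypothetical and do not reflect the partition firmware implementation; only the 512-byte requirement comes from the notes.

```python
# Sketch: skip boot devices with an unsupported block size and keep going.
from dataclasses import dataclass
from typing import List, Optional, Tuple

REQUIRED_BLOCK_SIZE = 512  # partition firmware requirement stated in the notes


@dataclass
class BootDevice:  # hypothetical model of a boot list entry
    name: str
    block_size: int


def select_boot_device(
    boot_list: List[BootDevice],
) -> Tuple[Optional[BootDevice], List[str]]:
    """Return the first usable device plus the error log entries collected."""
    error_log: List[str] = []
    for device in boot_list:
        if device.block_size != REQUIRED_BLOCK_SIZE:
            # Record the reason for the failure, then try the next device.
            error_log.append(
                f"{device.name}: unsupported block size {device.block_size}"
            )
            continue
        return device, error_log
    return None, error_log  # no usable device in the boot list


# Hypothetical boot list: an iSCSI target on a 4096-byte drive, then a disk.
candidates = [BootDevice("iscsi0", 4096), BootDevice("disk1", 512)]
chosen, log = select_boot_device(candidates)
print(chosen.name if chosen else "no boot device", log)
```

Returning the log alongside the selected device keeps the failure reason available even when a later device boots successfully.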

SV860_039_039 / FW860.00

11/02/16 Impact:  New      Severity:  New

The following pertains to Power System E850C (8408-44E) servers only.

New Features and Functions

NOTE:
 * GA Level
   Four FW840 features that have been disabled for the 860.00 GA are listed
   below.  These will be re-enabled for the 860.10 service pack:
   1. Support disabled for Live Partition Mobility (LPM) operations.
   2. Support disabled for partition Suspend and Resume from the HMC.
   3. Support disabled for partition Remote Restart.
   4. Support disabled for PowerVM vNIC. PowerVM vNIC combined many of the best
   features of SR-IOV and PowerVM SEA to provide a network solution with options
   for advanced functions such as Live Partition Mobility along with better
   performance and I/O efficiency when compared to PowerVM SEA.  In addition
   PowerVM vNIC provided users with bandwidth control (QoS) capability by
   leveraging SR-IOV logical ports as the physical interface to the network.
 * New features that have been disabled: vNIC failover; new redundant path LPM
   function;  and PCIe cable recovery on a link to the PCIe3 expansion drawer.
 * Do not use the following functions.  They are not disabled but should not be
   used as the implementation and testing have not been completed for 860.00:
   1. SMS SAS I/O Information utility.  If a non-SCDD (Self Configuring Device
   Data) drive is attached to a controller and the utility is used to look at
   devices attached to the controller, a Default Catch condition will occur due
   to a partition firmware data stack underflow.  This utility is accessed by
   selecting option 3 (I/O Device Information) from the main SMS menu, and then
   selecting option 2 (SAS Device Information).
   2. 32TB Max Memory Enablement for partitions. 
   3. PowerVM NovaLink enhancements.  For more information, refer to IBM
   Knowledge Center: 
   http://www.ibm.com/support/knowledgecenter/POWER8/p8eig/p8eig_kickoff.htm
   4. PowerVM change to support HDDW using 64K pages
   5. IBM Power System E850 (8408-44E) concurrent add of the PCIe expansion
   drawer (#EMX0).
   6. IBM Power System E850 (8408-84E) concurrent add of PCIe3 Optical Cable
   Adapter for PCIe3 Expansion Drawer (F/C #EJ08)
   7. Enforcement of limits to IBM i support on IBM Power System S822 (8284-22A)
   8. Dynamic TCE memory allocation for SR-IOV adapters
   9. Dynamic Toggle of SRR
   10. Power Boot List Management Platform Support
   11. SAP HANA (#EPVR) enhancements - Solution edition for SAP HANA 3.65 GHz +
   12 Activations
   12. HMC new GUI enhancements
   13. LPAR DR Restart
   14. HMC override for Port vs LUN level validation
   15. SNMP traps for system state
   16. HMC Option to boot without IPv6 Support
   17. PCIe3 3D Graphics Adapter x16 (#EC51) boot support (for Linux only)
   18. Non-volatile Memory Express (NVMe) boot
   19. Service processor security updates
   20. vHMC support for DHCP server configuration
   
 * Support for the IBM Power System E850 (8408-44E).  Similar in many respects
   to the 8408-E8E but upgraded with faster processors (4.223GHz, 10C 3.957GHz,
   12C 3.658GHz) with a maximum of 48 cores and an upgrade in memory to DDR4
   with expanded capacity to 4 TB with 128 GB DIMMs available.  As with
   8408-E8E, there is no IBM i or OPAL support.  Operating System offerings for
   PowerVM partitions are AIX and Linux (RHEL, SLES, and Ubuntu).



SV840
For Impact, Severity and other Firmware definitions, Please refer to the below
'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
SV840_177_056 / FW840.60

09/29/2017 Impact:  Availability      Severity:  SPE

System firmware changes that affect all systems


 * A problem was fixed for a false 110026B1 (12V power good failure) caused by
   an I2C bus write error for a LED state.  This error can be triggered by the
   fan LEDs changing state.
 * A problem was fixed for a fan LED turning solid amber when there is no fan
   fault, or when the fan fault is for a different fan.  This error can be
   triggered anytime a fan LED needs to change its state.  The fan LEDs can be
   recovered to a normal state concurrently using the following link steps for a
   soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for sporadic blinking amber LEDs for the system fans with
   no SRCs logged.  There was no problem with the fans.  The LED corruption
   occurred when two service processor tasks attempted to update the LED state
   at the same time.  The fan LEDs can be recovered to a normal state
   concurrently using the following link steps for a soft reset of the service
   processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for the loss of Operations Panel function 30 (displaying
   ethernet port HMC1 and HMC2 IP addresses) after a concurrent repair of the
   Operations Panel.  Operations Panel function 30 can be restored concurrently
   using the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for a core dump of the rtiminit (service processor time
   of day) process that logs an SRC B15A3303  and could invalidate the time on
   the service processor.  If the error occurs while the system is powered on,
   the hypervisor has the master time and will refresh the service processor
   time, so no action is needed for recovery.  If the error occurs while the
   system is powered off, the service processor time must be corrected on the
   systems having only a single service processor.  Use the following steps from
   the IBM Knowledge Center to change the UTC time with the Advanced System
   Management Interface: 
   https://www.ibm.com/support/knowledgecenter/en/POWER8/p8hby/viewtime.htm.
 * A problem was fixed for serializing concurrent requests for the IPMI serial
   over LAN (SOL) console that were causing a service processor hang with a
   subsequent Host-Initiated Reset/Reload for the service processor.
 * A problem was fixed for the "Minimum code level supported" not being shown by
   the Advanced System Management Interface (ASMI) when selecting the "System
   Configuration/Firmware Update Policy" menu.  The message shown is "Minimum
   code level supported value has not been set".  The workaround to find this
   value is to use the ASMI command line interface with the "registry -l
   cupd/MinMifLevel" command.
 * A problem was fixed for a degraded PCI link causing a Predictive SRC for a
   non-cacheable unit (NCU) store time-out that occurred with SRC B113E540 or
   B181E450 and PRD signature "(NCUFIR[9]) STORE_TIMEOUT: Store timed out on
   PB".  With the fix, the error is changed to Informational, as the
   problem is not with the processor core and the processor should not be
   replaced.  The solution for degraded PCI links is different from the fix for
   this problem, but a re-IPL of the CEC or a reset of the PCI adapters could
   help to recover the PCI links from their degraded mode.
 * A problem was fixed for a service processor reset triggered by a spurious
   false IIC interrupt request in the kernel.  On systems with a single service
   processor, the SRC B1817201 is displayed on the Operator Panel.  For systems
   with redundant service processors, an error failover to the backup service
   processor occurs.  The problem is extremely infrequent and does not impact
   processes on the running system.
 * A problem was fixed for the service processor low-level boot code always
   running off the same side of the flash image, regardless of what side has
   been selected for boot ( P-side or T-side).  Because this low-level boot code
   rarely changes, this should not cause a problem unless corruption occurs in
   the flash image of the boot code.  This problem does not affect firmware
   side-switches as the service processor initialization code (higher-level code
   than the boot code) is running correctly from the selected side.  Without the
   fix, there is no recovery for boot corruption for systems with a single
   service processor as the service processor must be replaced.
 * A problem was fixed for system termination and outage caused by a corrupted
   system reset type.  For cases where the system reset type cannot be
   identified, the service processor will now do a reset/reload to keep the
   system running.  This is a rare problem that is occurring during an
   error/recovery situation that involves a reset of the service processor. 
   This is a replacement for a previous fix attempt (same fix description) for
   this problem but it failed to prevent the system from terminating.
 * A problem was fixed for help text in the Advanced System Management Interface
   (ASMI) not informing the user that system fan speeds would increase if the
   system Power Mode was changed to "Fixed Maximum Frequency" mode.  If ASMI
   panel function "System Configuration->Power Management->Power Mode Setup"
   "Enable Fixed Maximum Frequency mode" help is selected, the updated text
   states "...This setting will result in the fans running at the maximum speed
   for proper cooling."
 * A problem was fixed for a Power Supply Unit (PSU) failure of SRC 110015xF
   logged with a power supply fan call out when doing a hot re-plug of a PSU.  
   The power supply may be made operational again by doing a dummy replace of
   the PSU that was called out (keeping the same PSU for the replace
   operation).  A re-IPL of the system will also recover the PSU.
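
Two items in the list above trace their failures to unserialized concurrent access: the fan LED corruption occurred when two service processor tasks updated the LED state at the same time, and the IPMI serial over LAN (SOL) fix serializes concurrent requests.  The general technique can be sketched with a lock.  This is not the service processor code; the class, the two-field LED state, and the task function are hypothetical.

```python
# Sketch: serialize updates so two tasks cannot interleave a multi-field change.
import threading


class FanLed:
    """Hypothetical LED whose color and blink flag must change together."""

    def __init__(self):
        self._lock = threading.Lock()
        self.color = "green"
        self.blinking = False

    def set_state(self, color, blinking):
        # Holding the lock makes the two-field update appear atomic.
        with self._lock:
            self.color = color
            self.blinking = blinking

    def get_state(self):
        with self._lock:
            return (self.color, self.blinking)


# The only combinations the two hypothetical tasks ever request.
VALID_STATES = [("green", False), ("amber", True)]


def update_repeatedly(led, state, count=1000):
    for _ in range(count):
        led.set_state(*state)


led = FanLed()
tasks = [
    threading.Thread(target=update_repeatedly, args=(led, state))
    for state in VALID_STATES
]
for task in tasks:
    task.start()
for task in tasks:
    task.join()

final_state = led.get_state()
print(final_state)  # one of VALID_STATES, never a mix such as ("amber", False)
```

Which of the two states wins depends on thread scheduling, so no specific output is shown; the point is only that a mixed state cannot be observed while every access goes through the lock.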
   

System firmware changes that affect certain systems

 * DEFERRED:  On systems using PowerVM firmware, a fix was made to improve
   PCIe3 I/O expansion drawer (#EMX0) link stability.  The settings for the
   continuous time linear equalizers (CTLE) were updated for all the PCIe
   adapters for the PCIe links to the expansion drawer.  The CEC must be
   re-IPLed for the fix to activate.
 * On systems using PowerVM firmware,  a problem was fixed for an intermittent
   service processor core dump and callout for netsCommonMSGServer with SRC
   B181EF88.   The HMC connection to the service processor automatically
   recovers with a new session.
 * On systems using OPAL firmware, a problem was fixed for an IPMI console hang
   to OPAL that caused the Linux host to be hung for SSH sessions and for
   ipmitool commands to fail with "Error in open session response message:
   insufficient resources for session" error messages on the service
   processor.   An error log with SRC B1818601 is reported for the service
   processor IPMI failure and multiple SRC BB822210  error logs are reported for
   OPAL message timeouts to the service processor.  In most cases, this error
   can be recovered from by doing a soft reset of the service processor using
   the following steps from the IBM Knowledge Center: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
 * On systems using PowerVM firmware with a Linux Little Endian (LE) partition,
   a problem was fixed for system reset interrupts returning the wrong values in
   the debug output for the NIP and MSR registers.  This problem reduces the
   ability to debug hung Linux partitions using system reset interrupts.  The
   error occurs every time a system reset interrupt is used on a Linux LE
   partition.
 * On systems using PowerVM firmware, a problem was fixed for "Time Power On"
   enabled partitions not being capable of suspend and resume operations.  This
   means Live Partition Mobility (LPM) would not be able to migrate this type of
   partition.  As a workaround, the partition could be transitioned to a
   "Non-time Power On" state and then made capable of suspend and resume
   operations.
 * On systems using PowerVM firmware, a problem was fixed for reboot retries for
   IBM i partitions such that the first load source I/O adapter (IOA) is retried
   instead of bypassed after the first failed attempt.  The reboot retries are
   done for an hour before the reboot process gives up.  This error can occur if
   there is more than one known load source, and the IOA of the first load
   source is different from the IOA of the last load source.  The error can be
   circumvented by retrying the boot of the partition after the load source
   device has become available.
 * On systems using PowerVM firmware with mirrored memory running IBM i
   partitions, a problem was fixed for memory fails in the partition that also
   caused the system to crash.  The system failure will occur any time that IBM
   i partition memory towards the beginning of the partition's assigned memory
   fails.  With the fix, the memory failure is isolated to the impacted
   partition, leaving the rest of the system unaffected.
 * On systems using PowerVM firmware, a problem was fixed for failures
   deconfiguring SR-IOV Virtual Functions (VFs).  This can occur during Live
   Partition Mobility (LPM) migrations with HMC error messages of
   HSCLAF16, HSCLAF15, and HSCLB602 shown.  This results in an LPM migration
   failure, and a system reboot is required to recover the VFs for the I/O
   adapters.  This error may occur more frequently in cases where the I/O
   adapter has pending I/O at the time of the deconfigure request for the VF.
 * On systems using PowerVM firmware, a problem was fixed for the incorrect
   reporting of the Universally Unique Identifier (UUID) to the OS, which
   prevented the tracking of a partition as it moved within a data center.  The
   UUID value as seen on the HMC or NovaLink did not match the value as
   displayed in the OS.
 * On systems using PowerVM firmware,  a problem was fixed for a partition boot
   from a USB 3.0 device that has an error log SRC BA210003.  The error is
   triggered by an Open Firmware entry to the trace buffer during the partition
   boot.  The error log can be ignored as the boot is successful to the OS.
 * On systems using PowerVM firmware,  a problem was fixed for a partition boot
   fail or hang from a Fibre Channel device having fabric faults.  Some of the
   fabric errors returned by the VIOS are not interpreted correctly by the Open
   Firmware VFC driver, causing the hang instead of generating helpful error
   logs.
 * On systems using PowerVM firmware,  problems were fixed for communication
   failures on adapters in SR-IOV shared mode:
   1) A problem was fixed for SR-IOV adapters in shared mode for a transmission
   stall or time out with SRC B400FF01 logged.  The time out happens during
   Virtual Function (VF) shutdowns and during Function Level Resets (FLRs) with
   network traffic running.
   2) A problem was fixed for an SR-IOV logical port whose Port VLAN ID (PVID)
   changing from non-zero to zero causes a communication failure under certain
   conditions.  The communication failure only occurs when a logical port's PVID
   is dynamically changed from non-zero to zero.  An SR-IOV logical port is an
   I/O device created for a partition or a partition profile using the
   management console (HMC) when a user intends for the partition to access an
   SR-IOV adapter Virtual Function.  The error can be recovered from by a reboot
   of the partition.
   These fixes update adapter firmware to 10.2.252.1929 for the following
   Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L,
   EL38, EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for the loss of a communications channel between the hypervisor and the PowerVM
   NovaLink during a reset of the service processor.  Various NovaLink tasks,
   including deploy, could fail with a "No valid host was found" error.  With
   the fix, PowerVM NovaLink prevents normal operations from being impacted by a
   reset of the service processor.
 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for returning to HMC-only management from co-management when a NovaLink
   partition is deleted while holding master mode.  A circumvention is to release
   master mode before deleting the NovaLink partition and then reconnect the
   disconnected management console.  Please refer to IBM Knowledge Center link
   "http://ibm.biz/novalink-kc" for more information on the PowerVM NovaLink
   feature and changing the master authority when doing co-management.
 * On systems using PowerVM firmware with PowerVM NovaLink, a problem was fixed
   for a master management console becoming disconnected and blocking other
   management consoles from performing virtualization changes. A circumvention
   is to use the HMC CLI on another management console to request the master
   mode with the force option.   Please refer to IBM Knowledge Center link
   "http://ibm.biz/novalink-kc" for more information on the PowerVM NovaLink
   feature and changing the master authority when doing co-management.
 * On systems using PowerVM firmware, a problem was fixed for an invalid date
   from the service processor causing the customer date and time to go to the
   Epoch value (01/01/1970) without a warning or chance for a correction.  With
   the fix,  the first IPL attempted on an invalid date will be rejected with a
   message alerting the user to set the time correctly in the service
   processor.  If the warning is ignored and the date/time is not corrected, the
   next IPL attempt will complete to the OS with the time reverted to the Epoch
   time and date.  This problem is very rare but it has been known to occur on
   service processor replacements when the repair step to set the date and time
   on the new service processor was inadvertently skipped by the service
   representative.
 * On systems using PowerVM firmware,  a problem was fixed for the error
   handling of EEH events for the SR-IOV Virtual Functions (VFs) that can result
   in IPL failure with B7006971, B400FF05, and BA210000 SRCs logged.  In these
   cases, the partition console stops at an OFDBG prompt.  Also a DLPAR add of a
   VF may result in a partition crash due to a 300 DSI exception because of a
   low-level EEH event.  A circumvention for the problem would be to debug the
   EEH events which should be recovered errors and eliminate the cause of the
   EEH events.  With the fix, the EEH events still log Predictive Errors but do
   not cause a partition failure.
 * On systems using PowerVM firmware, a problem was fixed for an error finding
   the partition load source that has a GPT format.  GUID Partition Table (GPT)
   is a standard for the layout of the partition table on a physical storage
   device used in the server, such as a hard disk drive or solid-state drive,
   using globally unique identifiers (GUID).  Other drives that are working may
   be using the older master boot record (MBR) partition table format.  This
   problem occurs whenever load sources utilizing the GPT format occur in other
   than the first entry of the boot table.  Without the fix, a GPT disk drive
   must be the first entry in the boot table to be able to use it to boot a
   partition.
 * On systems using PowerVM firmware, a problem was fixed for an SRC BA090006
   serviceable event log occurring whenever an attempt was made to boot from an
   ALUA (Asymmetric Logical Unit Access) drive.  These drives are always busy by
   design and cannot be used for a partition boot, but no service action is
   required if a user inadvertently tries to do that.  Therefore, the SRC was
   changed to be an informational log.
 * On systems using PowerVM firmware, a problem was fixed for Live Partition
   Mobility (LPM) migrations from FW860.12 or later to the FW840.50 level of
   firmware. Subsequent DLPAR add operations of Virtual Adapters will fail with
   HMC error message HSCLAB2B, which contains text similar to the following: 
   "The operation to add a virtual NIC in slot 8 on partition 9 failed. The
   requested amounts of slot(s) to be added is 1 and the completed amount is
   0."  The AIX OS standard error message with return code 3 is the following:
   "0931-007 You have specified an invalid drc_name."   This issue affects
   partitions installed with AIX 7.2 TL 1 and later.   Not affected by this
   issue are partitions installed with VIOS, IBM i, or earlier levels of AIX. 
   The error can be recovered by a reboot of the affected partition.
 * On systems using OPAL firmware, Skiboot was updated to V5.1.21 from V5.1.19, 
   including the following updates:
   -  A problem was fixed for an intermittent host freeze during a reset/reload
   of the service processor.  The host will resume normal operations after the
   reset/reload has completed.  To have this error occur,  a timing window has
   to be hit where a synchronous message from the host is in progress to the
   service processor at the same time a reset/reload is initiated.
   -  A problem was fixed so that the error log timeout applies only to the send
   of the error log to the service processor.  This will significantly reduce
   false time out errors.
   -  A problem was fixed for unknown command messages in the OPAL log after a
   Host-Initiated Reset/Reload of the service processor.
   - A problem was fixed for OPAL kernel lockups when the IPMI SOL console
   became unresponsive.  The console can become full now and drop messages but
   this prevents the lock-up of the Host kernel.
   - A problem was fixed for service processor time-out messages being interpreted
   as "success" by OPAL, preventing correct error reporting and recovery
   actions.
   - A problem was fixed for a kernel hang caused by queued messages needing to
   be sent to the service processor during a reset/reload of the service
   processor.  The messages are now cached and sent when the service processor
   is ready to receive after a reset/reload.
   - A problem was fixed for the I2C bus locking that sometimes caused an OPAL crash
   with double unlock() detected.
   - A problem was fixed for a soft lockup of the kernel that occurred because
   of RTC/TOD clock errors during a Host-initiated Reset/Reload of the service
   processor.  A frozen process would be seen on the host system along with this
   message:   "NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s!" where the
   CPU number would vary.
   - A problem was fixed for a lockup of the host waiting for responses for
   sensor data during a service processor reset. "OPAL_BUSY" is now returned to
   the host so the host driver knows not to wait but to retry later to gather
   the sensor data.
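
The last Skiboot item above returns "OPAL_BUSY" so that the host retries later
instead of waiting.  The following Python sketch illustrates that retry pattern
only; it is not the Skiboot or Linux driver code.  The function names, the
numeric return-code values, the attempt limit, and the delay are all
assumptions made for the illustration.

```python
import time

# Illustrative return codes; treat the numeric values as placeholders.
OPAL_SUCCESS = 0
OPAL_BUSY = -2


def read_sensor_with_retry(read_fn, max_attempts=5, delay_s=0.1, sleep=time.sleep):
    """Call read_fn() until it stops returning OPAL_BUSY or attempts run out.

    read_fn is a hypothetical callable returning (return_code, value).
    Returns (return_code, value, attempts_used).
    """
    rc, value = OPAL_BUSY, None
    for attempt in range(1, max_attempts + 1):
        rc, value = read_fn()
        if rc != OPAL_BUSY:
            # Either success or a real error: stop retrying.
            return rc, value, attempt
        if attempt < max_attempts:
            # Back off before the next try rather than blocking on a reply.
            sleep(delay_s)
    return rc, value, max_attempts


def make_fake_sensor(busy_calls, reading):
    """Simulate a sensor that reports busy for the first busy_calls reads."""
    state = {"calls": 0}

    def read():
        state["calls"] += 1
        if state["calls"] <= busy_calls:
            return OPAL_BUSY, None
        return OPAL_SUCCESS, reading

    return read
```

With a simulated sensor that is busy for two reads, the helper succeeds on the
third attempt.  If the sensor stays busy, the helper gives up after
max_attempts and hands the busy code back to the caller, so the caller is never
blocked indefinitely.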

SV840_168_056 / FW840.50

04/21/17 Impact:  Availability      Severity:  SPE

New features and functions


 * Support for the Advanced System Management Interface (ASMI) was changed to
   allow the special characters of "I", "O", and "Q" to be entered for the
   serial number of the I/O Enclosure under the Configure I/O Enclosure option. 
   These characters have only been found in an IBM serial number rarely, so
   typing in these characters will normally be an incorrect action.  However,
   the special character entry is not blocked by ASMI anymore so it is able to
   support the exception case.  Without the enhancement, the typing of one of
   the special characters causes message "Invalid serial number" to be
   displayed.
 * On systems using PowerVM firmware, support was added to allow the IBM i OS on
   the Power System S822 (8284-22A) without the need for a VET code.
 * On systems using PowerVM firmware, support was added for the Universally
   Unique IDentifier (UUID) property for each partition.  The UUID provides each
   partition with an identifier that is persisted by the platform across
   partition reboots, reconfigurations, OS reinstalls, partition migration,  and
   hibernation.
   

System firmware changes that affect all systems


 * A problem was fixed for disabling the periodic notification for a call home
   error log SRC B150F138 for Memory Buffer resources (membuf) from the
   Advanced System Management Interface (ASMI).
 * A problem was fixed for incorrect callouts of the Power Management Controller
   (PMC) hardware with SRC B1112AC4 and SRC B1112AB2 logged.  These extra
   callouts occur when the On-Chip Controller (OCC) has placed the system in the
   safe state for a prior failure that is the real problem that needs to be
   resolved.
 * A problem was fixed for device time outs during an IPL logged with SRC
   B18138B4.  This error is intermittent and no action is needed for the error
   log.  The service processor hardware server now allots more time for the
   device transactions to allow the transactions to complete without a time-out
   error.
 * A problem was fixed for the OS not being able to detect the USB connected
   Uninterruptible Power Supply (UPS) that has feature code #ECCF.  An
   informational SRC B1814616 is logged from the service processor and the IBM i
   OS logs a CPI0961 (Uninterruptible power supply no longer attached).  The
   error occurs infrequently because it depends on system timing and system
   configuration.  If a system is having the error, it might have it on every
   IPL.  The circumvention is to reseat the USB cable connector for the USB
   connected UPS.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   "System Service Aids => Error/Event Logs" panel not showing the "Clear" and
   "Show" log options and also having a truncated error log when there are a
   large number of error logs on the system.
   
 * A problem was fixed for the failover to the backup PNOR on a Hostboot Self
   Boot Engine (SBE) failure.  Without the fix, the failed SBE causes loss of
   processors and memory with B15050AD logged.  With the fix, the SBE is able to
   access the backup PNOR and IPL successfully by deconfiguring the failing PNOR
   and calling it out as a failed FRU.
 * A problem was fixed for System Vital Product Data (SVPD) FRUs being guarded
   but not having a corresponding error log entry.  This is a failure to commit
   the error log entry that has occurred only rarely.
 * A problem was fixed for the system VPD showing 4 extra PCIe slots that are
   not actually available to the system.  When running an IBM i partition, the
   IBM i Hardware Service Manager shows twelve PCIe adapter slots instead of the
   actual eight that can be used (P1-C2, P1-C3, P1-C4, and P1-C5 are the extra
   slots displayed).  This problem only pertains to the IBM Power System S814
   (8286-41A).
 * A problem was fixed to allow changing the IPMI channel authentication
   capabilities from the OS.  The command "ipmitool channel authcap 1 4" was
   causing an IPMI core dump every time it was run.
 * A problem was fixed for a system going into safe mode with SRC B1502616
   logged as informational without a call home notification.  Notification is
   needed because the system is running with reduced performance.  If there are
   unrecoverable error logs and any are marked with reduced performance and the
   system has not been rebooted, then the system is probably running in safe
   mode with reduced performance.  With the fix, the SRC B1502616 is an
   Unrecoverable Error (UE).
 * A problem was fixed for the service processor boot watch-dog timer expiring
   too soon during DRAM initialization in the reset/reload, causing the service
   processor to go unresponsive.  On systems with a single service processor,
   the SRC B1817212 was displayed on the control panel.  For systems with
   redundant service processors, the failing service processor was
   deconfigured.  To recover the failed service processor, the system will need
   to be powered off with AC power removed during a regularly scheduled system
   service action.  This problem is intermittent and very infrequent as most of
   the reset/reloads of the service processor will work correctly to restore the
   service processor to a normal operating state.
 * A problem was fixed for host-initiated resets of the service processor
   causing the system to terminate.  A prior fix for this problem did not work
   correctly because some of the host-initiated resets were being translated to
   unknown reset types that caused the system to terminate.  With this new
   correction for failed host-initiated resets, the service processor will still
   be unresponsive but the system and partitions will continue to run.  On
   systems with a single service processor, the SRC B1817212 will be displayed
   on the control panel.  For systems with redundant service processors, the
   failing service processor will be deconfigured.  To recover the failed
   service processor, the system will need to be powered off with AC power
   removed during a regularly scheduled system service action.  This problem is
   intermittent and very infrequent as most of the host-initiated resets of the
   service processor will work correctly to restore the service processor to a
   normal operating state.
 * A problem was fixed for incorrect error messages from the Advanced System
   Management Interface (ASMI) functions when the system is powered on but in
   the "Incomplete State".  For this condition, ASMI was assuming the system was
   powered off because it could not communicate to the PowerVM hypervisor.  With
   the fix, the ASMI error messages will indicate that ASMI functions have
   failed because of the bad hypervisor connection instead of falsely stating
   that the system is powered off.
 * A problem was fixed for system termination and outage caused by a corrupted
   system reset type.  For cases where the system reset type cannot be
   identified, the service processor will now do a reset/reload to keep the
   system running.  This is a rare problem that is occurring during an
   error/recovery situation that involves a reset of the service processor.
 * A problem has been fixed for systems losing performance and going into Safe
   mode (a power mode with reduced processor frequencies intended to protect the
   system from over-heating and excessive power consumption) with
   B1xx2AC3/B1xx2AC4 SRCs logged.  This happened because of an On-Chip
   Controller (OCC) internal queue overflow. The problem has only been observed
   for systems running heavy workloads with maximum memory configurations (where
   every DIMM slot is populated - size of DIMM does not matter), but this may
   not be required to encounter the problem.  Recovery from Safe mode back to
   normal performance can be done with a re-IPL of the system, or concurrently
   using the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   Checking that Safe mode is not active on the system requires a dynamic
   celogin password from IBM Support to use the service processor command
   line:
   1) Log into ASMI as celogin with dynamic celogin password generated by IBM
   Support
   2) Select System Service Aids
   3) Select Service Processor Command Line
   4) Enter "tmgtclient --query_mode_and_function" from the command line
   The first line of the output, "currSysPwrMode" should say "NOMINAL" and this
   means the system is in normal mode and that Safe mode is not active.
   

System firmware changes that affect certain systems

 * On systems using PowerVM firmware, a problem was fixed for cable card (PCIe3
   Optical Cable Adapter for the PCIe3 Expansion Drawer) capable PCI slots that
   fail during the IPL.  Hypervisor I/O Bus Interface UE B7006A84 is reported
   for each cable card capable PCI slot that doesn't contain a cable card.  PCI
   slots containing a cable card will not report an error but will not be
   functional.  The problem can be resolved by doing a "power off/power on"
   re-IPL of the system.  The trigger for the failure is that the I2C devices
   used to detect the cable cards do not come out of the power on reset process
   in the correct state due to a race condition.  The affected optical cable
   adapters have feature codes #EJ05, #EJ07, and #EJ08 with CCINs 2B1C, 6B52,
   and 2CE2, respectively.
 * On systems using PowerVM firmware,  a problem was fixed for a blank SRC in
   the LPA dump for user-initiated non-disruptive adjunct dumps.  The SRC is
   needed for problem determination and dump analysis.
 * On systems using PowerVM firmware, a problem was fixed with SR-IOV adapter
   error recovery where the adapter is left in a failed state in nested error
   cases for some adapter errors.  The probability of this occurring is very low
   since the problem trigger is multiple low-level adapter failures.  With the
   fix, the adapter is recovered and returned to an operational state.
 * On systems using PowerVM firmware with PCIe adapters in Single Root I/O
   Virtualization (SR-IOV) shared mode, a problem was fixed for the hypervisor
   SR-IOV adjunct partition failing during the IPL with SRCs B200F011 and
   B2009014 logged. The SR-IOV adjunct partition successfully recovers after it
   reboots and the system is operational.
 * For the IBM Power System E850 (8408-E8E) system, a problem was fixed for the
   incorrect values for the Idle Power Saver (IPS) mode call home data.  The
   call home "max" is reported with much lower numbers than what the On-Chip
   Controllers (OCC) read for the IPS.  This problem only affects 4-socket
   systems as it is caused by an integer overflow of the summation of the IPS
   value from all OCCs in the system.
 * On systems using PowerVM firmware, a problem was fixed for PCIe Host Bridge
   (PHB) outages and PCIe adapter failures in the PCIe I/O expansion drawer
   caused by error thresholds being exceeded for the LEM bit [21] errors in the
   FIR accumulator.  These are typically minor and expected errors in the PHB
   that occur during adapter updates and do not warrant a reset of the PHB and
   the PCIe adapter failures.  Therefore, the threshold LEM[21] error limit has
   been increased and the LEM fatal error has been changed to a Predictive Error
   to avoid the outages for this condition.
 * On systems using PowerVM firmware with a large memory configuration (greater
   than 8 TB), a problem was fixed for an SR-IOV adjunct failure during the IPL,
   causing loss of SR-IOV function.  The large system memory space causes an
   overflow in the space calculations for SR-IOV adapters in PCIe slots with
   Enlarged IO Capacity enabled.  The problem can be avoided by reducing the
   number of PCIe slots with Enlarged IO Capacity enabled so it does not include
   adapters in SR-IOV shared-mode.  Another circumvention option is to move the
   SR-IOV adapters to SR-IOV capable PCIe slots where Enlarged IO Capacity is
   not enabled.   Reducing system physical memory to below 8 TB will also work
   as a circumvention.
 * On systems using PowerVM firmware, a problem was fixed for Live Partition
   Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of
   firmware. Subsequent DLPAR of Virtual Adapters will fail with HMC error
   message HSCL294C, which contains text similar to the following:  "0931-007
   You have specified an invalid drc_name." This issue affects partitions
   installed with AIX 7.2 TL 1 and later. Not affected by this issue are
   partitions installed with VIOS, IBM i, or earlier levels of AIX.
 * On a system using PowerVM firmware running a Linux OS,  a problem was fixed
   for support for Coherent Accelerator Processor Interface (CAPI) adapters. 
   The CAPI related RTAS h-calls for the CAPI devices could not be made by the
   Linux OS, impacting the CAPI adapter functionality and usability.  This
   problem involves the following adapters:  the PCIe3 LP CAPI Accelerator
   Adapter with F/C #EJ16 that is used on the S812L(8247-21L) and S822L
   (8247-22L) models;  the PCIe3 CAPI FlashSystem Accelerator Adapter with F/C
   #EJ17  that is used on the S814(8286-41A) and S824(8286-42A) models;  and the
   PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ18 that is used on the
   S822(8284-22A), E870(9119-MME), and E880(9119-MHE) models.  This problem does
   not pertain to PowerVM AIX partitions using CAPI adapters.
 * On a system using OPAL firmware, a problem was fixed for excessive "Poller
   recursion detected" error messages during the skiboot that could require a
   power off to recover from the error.
 * On a system using OPAL firmware, a problem was fixed for an unnecessary error
   message when a reset occurs on an empty PCIe Host Bridge (PHB) - no PCIe
   adapters attached.  The extra error message occurs anytime the PHBs in the
   system go through error recovery.
 * On a system using OPAL firmware, a problem was fixed to fence off an errant
   PCIe Host Bridge (PHB) during a complete reset to allow the kernel to retry
   the operation.  This helps the system recovery process by guarding out the
   bad hardware to prevent a fatal error loop.
 * On a system using PowerVM firmware, a problem was fixed for corruption of the
   partition data in the service processor NVRAM during a power off that causes
   the managed system to go into the HMC "Recovery" error state.  A
   circumvention for the error is to restore partition data from the HMC.  If
   using NovaLink to manage the partition, a recovery can be done from the
   NovaLink backup.  The error is very infrequent but more likely to occur on an
   immediate power off of the system.  Instead, if a delayed power off is
   used, that would allow the hypervisor to complete all pending operations
   before shutting down cleanly.
 * On systems using PowerVM firmware, a problem was fixed for a group of shared
   processor partitions being able to exceed the designated capacity placed on a
   shared processor pool.  This error can be triggered by using the DLPAR move
   function for the shared processor partitions, if the pool has already reached
   its maximum specified capacity.  To prevent this problem from occurring when
   making DLPAR changes when the pool is at the maximum capacity, do not use the
   DLPAR move operation but instead break it into two steps:  DLPAR remove
   followed by DLPAR add.  This gives enough time for the DLPAR remove to be
   fully completed prior to starting the DLPAR add request.
 * On systems using PowerVM firmware, a problem was fixed for NVRAM corruption
   and an HMC recovery state when using Simplified Remote Restart partitions. 
   The failing systems will have at least one Remote Restart partition and on
   the failed IPL there will be a B70005301 SRC with word 7 being 0X00000002.
 * On systems using PowerVM firmware with an IBM i partition, a problem was
   fixed for incorrect maximum performance reports based on the wrong number of
   "maximum" processors for the system.   Certain performance reports that can
   be generated on IBM i systems contain not only the existing machine
   information, but also "what-if" information, such as "how would this system
   perform if it had all the processors possible installed in this system". 
   This "what-if" report was in error because the maximum number of processors
   possible was too high for the system.
 * On systems using PowerVM firmware, a problem was fixed for NVRAM corruption
   that can occur when deleting a partition that owns a CAPI adapter, if that
   CAPI adapter is not assigned to another partition before the system is
   powered off.  On a subsequent IPL, the system will come up in recovery mode
   if there is NVRAM corruption.  To recover, the partitions must be restored
   from the HMC.  The frequency of this error is expected to be rare.  The CAPI
   adapters have the following feature codes:  #EC3E, #EC3F, #EC3L, #EC3M,
   #EC3T, #EC3U, #EJ16, #EJ17, #EJ18, #EJ1A, and #EJ1B.
 * On systems using PowerVM firmware, a problem was fixed to improve link
   stability for the PCIe3 I/O expansion drawer (#EMX0).  The settings for the
   continuous time linear equalizers (CTLE) were updated for all the PCIe
   adapters for the PCIe links to the expansion drawer.  The CEC must be
   re-IPLed for the fix to activate.
   
 * On systems using PowerVM firmware,  the following problems were fixed for
   SR-IOV adapters:
   1) Insufficient resources reported for an SR-IOV logical port configured with
   promiscuous mode enabled and a Port VLAN ID (PVID) when creating a new
   interface on the SR-IOV adapters.
   2) Spontaneous dumps and reboot of the adjunct partition for SR-IOV adapters.
   3) Adapter enters a firmware loop when a single bit ECC error is detected. 
   System firmware detects this condition as an adapter command time out.  System
   firmware will reset and restart the adapter to recover the adapter
   functionality.  This condition will be reported as a temporary adapter
   hardware failure.
   4) vNIC interfaces not being deleted correctly, causing SRC B400FF01 to be
   logged and Data Storage Interrupt (DSI) errors with failure on boot of the
   LPAR.
   This set of fixes updates adapter firmware to 10.2.252.1926, for the
   following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N,
   EN0K, EN0L, EL38, EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * On systems using PowerVM firmware, a problem was fixed for partition boot
   failures and run time DLPAR failures when adding I/O that log BA210000,
   BA210003, and/or BA210005 errors.  The fix also applies to run time failures
   configuring an I/O adapter following an EEH recovery that log BA188001
   events.  The problem can impact IBM i partitions running in any processor mode
   or AIX/Linux partitions running in P7 (or older) processor compatibility
   modes.  The problem is most likely to occur when the system is configured in
   the Manufacturing Default Configuration (MDC) mode.  The trigger for the
   problem is a race-condition between the hypervisor and the physical
   operations panel with a very rare frequency of occurrence.
 * On systems with maximum memory configurations (where every DIMM slot is
   populated - size of DIMM does not matter), a problem has been fixed for
   systems losing performance and going into Safe mode (a power mode with
   reduced processor frequencies intended to protect the system from
   over-heating and excessive power consumption) with B1xx2AC3/B1xx2AC4 SRCs
   logged.  This happened because of On-Chip Controller (OCC) time out errors
   when collecting Analog Power Subsystem Sweep (APSS) data, used by the OCC to
   tune the processor frequency.  This problem occurs more frequently on systems
   that are running heavy workloads.  Recovery from Safe mode back to normal
   performance can be done with a re-IPL of the system, or concurrently using
   the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   Checking or validating that Safe mode is not active on the system requires
   a dynamic celogin password from IBM Support to use the service processor
   command line:
   1) Log into ASMI as celogin with dynamic celogin password generated by IBM
   Support
   2) Select System Service Aids
   3) Select Service Processor Command Line
   4) Enter "tmgtclient --query_mode_and_function" from the command line
   The first line of the output, "currSysPwrMode" should say "NOMINAL" and this
   means the system is in normal mode and that Safe mode is not active.
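
The check in step 4 can be sketched in Python against captured command output.
This is a minimal illustration, not an IBM tool: the "field: value" layout of
the output line and the helper name safe_mode_inactive are assumptions.  Only
the "currSysPwrMode" field name and the "NOMINAL" value come from the text
above.

```python
def safe_mode_inactive(output: str) -> bool:
    """Return True if the first non-empty line reports currSysPwrMode as NOMINAL.

    Assumption: the line holds the field name followed by its value, for
    example "currSysPwrMode: NOMINAL".  The real layout may differ.
    """
    for line in output.splitlines():
        if not line.strip():
            continue
        # Only the first non-empty line is examined, as described above.
        return "currSysPwrMode" in line and "NOMINAL" in line.upper()
    return False


# Hypothetical captured output, not taken from a real system.
sample = "currSysPwrMode: NOMINAL\n"
print(safe_mode_inactive(sample))
```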

SV840_147_056 / FW840.40

10/26/16 Impact:  Availability      Severity:  SPE

New features and functions


 * Support was added to protect the service processor from booting on a level of
   firmware that is below the minimum MIF level.  If this is detected, a SRC
   B18130A0 is logged.  A disruptive firmware update would then need to be done
   to the minimum firmware level or higher.  This new support has no effect on
   the system being updated with the service pack but has been put in place to
   provide an enhanced firmware level for the IBM field stock service
   processors.
 * Support for the Advanced System Management Interface (ASMI) was changed to
   not create VPD deconfiguration records and call home alerts for hardware FRUs
   that have one VPD chip of a redundant pair broken or inaccessible.  The
   backup VPD chip for the FRU allows continued use of the hardware resource. 
   The notification of the need for service for the FRU VPD is not provided
   until both of the redundant VPD chips have failed for a FRU.
   

System firmware changes that affect all systems


 * A problem was fixed for excessive, repeating error logs with SRC B150B901 for
   a failed FSI link to a DIMM that had insufficient hardware callouts for easy
   diagnosis of the failure.  With the fix, the B150B901 is limited to one
   occurrence but a new error log is provided with the hardware callouts. 
   Without the fix, if you see repeating B150B901 predictive logs,  there will
   also be repeated informational error logs with SRC B1504800.  These B1504800
   logs would have the hardware involved and could be used to point to the
   failing DIMM.
 * A problem was fixed for unneeded throttling of processors if a power supply
   fails.  The error log SRCs of B1812A05 and B1812A33 are reported when the
   processors are throttled.  The affected systems have four power supplies and
   the loss of one power supply would not normally cause power use to go over
   the power capacity limit, but it happened because the number of power
   supplies was internally set as two instead of the four actually in the
   system.  This problem only affects the IBM Power System S824 (8286-42A) and
   the S824L (8247-42L) models.  Without the fix, the problem with processor
   throttling can be circumvented by replacing the power supply that has failed.
 * A problem was fixed for PCIe slot errors caused by improper PCIe device
   training.  PCIe links do not train properly and PCIe cards may show up as
   unknown in I/O list system properties.  Error log SRC BA180020 may be seen,
   or informational events B7006976 (for PHB slot) or B7006977 (for a switch
   slot).  The applied fix does not recover failed PCIe devices but does prevent
   those failures on the next power on IPL.  If any PCIe devices are in the
   failed state, they can be recovered using the HMC to power cycle the affected
   PCIe slot.  This problem only affects the IBM Power System E850 (8408-E8E)
   model.
 * A problem was fixed for a backplane short causing smoke in the case.  The
   power on sequence was changed to apply power from one power supply at a time
   and then check for excessive current use that could be caused by a backplane
   short.  If excessive current is detected, the system is powered off with a
   SRC logged to call out the failing hardware.  If a short has occurred, the
   backplane must still be replaced but damage to other components will be
   prevented.  The problem is triggered by a physical move of the system.  This
   problem only affects the IBM Power System E850 (8408-E8E) model.
 * A problem was fixed for the Advanced System Management Interface "Network
   Services/Network Configuration" "Reset Network Configuration" button that was
   not resetting the static routes to the default factory setting.  The
   manufacturing default is to have no static routes defined so the fix clears
   any static routes that had been added.  A circumvention to the problem is to
   use the ASMI "Network Services/Network Configuration/Static Route
   Configuration" "Delete" button before resetting the network configuration.
 * A problem was fixed for the HMC Exchange FRU procedure for DVD drive with MTM
   7226-1U3 and feature codes 5757/5762/5763 where it did not verify the DVD
   drive was plugged in at the end of the exchange procedure.  Without the fix, 
   the user must manually verify that the DVD drive is plugged in.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   incorrectly showing the Anchor card as guarded whenever any redundant VPD
   chip is guarded.
 * A problem was fixed for the health monitoring of the NVRAM and DRAM in the
   service processor that had been disabled.  The monitoring has been
   re-established and early warnings of service processor memory failure are
   logged with one of the following Predictive Error SRCs:  B151F107, B151F109,
   B151F10A, or B151F10D.
 * A problem was fixed for infrequent VPD cache read failures during an IPL
   causing an unnecessary guarding of DIMMs with SRC B123A80F logged.  With the
   fix, VPD cache read failures cause a temporary deconfiguration of the
   associated DIMM but the DIMM is recovered on the next IPL.
 * A problem was fixed for a processor hang where the error recovery was not
   guarding the failing processor.  The failure causes a SRC B111E540 to be
   logged with Signature Description of " ex(n0p3c1) (COREFIR[55])
   NEST_HANG_DETECT: External Hang detected".  With the fix, the failing
   processor FRU is called out and guarded so that the error does not re-occur
   when the system is re-IPLed.
 * A problem was fixed for a DDR4 memory training step during hostboot that
   incorrectly failed DIMMs on the timing margins for the HOLD limit.  The DIMMs
   may be recovered by manually unguarding the failed DIMM hardware.   This
   affects the 128GB DDR4 memory DIMM with feature code #EM8S for the E850
   (8408-E8E) system.
 * A problem was fixed for a failed IPL with SRC UE BC8A090F that does not have
   a hardware callout or a guard of the failing hardware.  The system may be
   recovered by guarding out the processor associated with the error and
   re-IPLing the system.  With the fix, the bad processor core is guarded and
   the system is able to IPL.
 * A problem was fixed for the Operations Panel showing swapped physical port
   assignments for logical eth0 and eth1 for the service processor when panel
   function 30 is used.  For eth0, port "T5" is displayed instead of port "T4". 
   For eth1, port "T4" is displayed instead of "T5".  This problem does not
   affect the IP addresses assigned in the Advanced System Management Interface
   (ASMI) for the eth0 and eth1 ports which are correctly assigned.
   This problem only pertains to the IBM Power System E850 (8408-E8E) model.
 * A problem was fixed for On-Chip Controller (OCC) errors that had excessive
   callouts for processor FRUs.  Many of the OCC errors are recoverable and do
   not require that the processor be called out and guarded.  With the fix, the
   processors will only be called out for OCC errors if there are three or more
   OCC failures during a time period of a week.
 * A problem was fixed for an Operations Panel Function 04 (Lamp test) during an
   IPL causing the IPL to fail.  With the fix, the lamp test request is rejected
   during the IPL until the hypervisor is available.  The lamp test can be
   requested without problems anytime after the system is powered on to
   hypervisor ready or an OS is running in a partition.
 * A problem was fixed for a false thermal alarm in the active optical cables
   (AOC) for the PCIe3 expansion drawer with SRCs B7006AA6 and B7006AA7 being
   logged every 24 hours.  The AOC cables have feature codes of #ECC6 through
   #ECC9, depending on the length of the cable.  The SRCs should be ignored as
   they call for the replacement of the cable, cable card, or the expansion
   drawer module.  With the fix, the false AOC thermal alarms are no longer
   reported.
 * A problem was fixed for the On-Chip Controller (OCC) incorrectly calling out
   processors with SRC B1112A16 for L4 Cache DIMM failures with SRC B124E504. 
   This false error logging can occur if the DIMM slot that is failing is
   adjacent to two unoccupied DIMM slots.
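
For the OCC callout change in the list above (processors are called out only
after three or more OCC failures within a week), the policy amounts to a
sliding-window count.  The following Python sketch illustrates that idea only;
the class name OccFailureWindow and its method are hypothetical, failure times
are assumed to arrive in non-decreasing order, and this is not the firmware's
implementation.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)   # "a time period of a week"
THRESHOLD = 3                # "three or more OCC failures"


class OccFailureWindow:
    """Track OCC failure times for one processor (illustrative only)."""

    def __init__(self) -> None:
        self._times: deque = deque()

    def record(self, when: datetime) -> bool:
        """Record a failure; return True if the processor should be called out."""
        self._times.append(when)
        # Drop failures more than one week older than the newest one.
        while when - self._times[0] > WINDOW:
            self._times.popleft()
        return len(self._times) >= THRESHOLD


window = OccFailureWindow()
start = datetime(2016, 10, 1)
for offset_days in (0, 2, 5):
    print(window.record(start + timedelta(days=offset_days)))
```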
   

System firmware changes that affect certain systems

 * On systems using PowerVM firmware, a problem was fixed for network issues,
   causing critical situations for customers, when an SR-IOV logical port or
   vNIC is configured with a non-zero Port VLAN ID (PVID).  This fix updates
   adapter firmware to 10.2.252.1922, for the following Feature Codes: EN15,
   EN16, EN17, EN18, EN0H, EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * A problem was fixed for systems using the OPAL firmware for repeated B181460B
   error logs in the Linux OS message log.  These are informational error logs
   related to a restart of a process in the service processor and can be
   ignored.  The restart of the process has been cleaned up to prevent the error
   message from being logged.
 * On systems using the PowerVM hypervisor firmware and NovaLink, a problem was
   fixed for a NovaLink installation error where the hypervisor was unable to
   get the maximum logical memory buffer (LMB) size from the service processor. 
   The maximum supported LMB size should be 0xFFFFFFFF but in some cases it was
   initialized to a value that was less than the amount of configured memory,
   causing the service processor read failure with error code 0X00000134.
 * On systems using PowerVM firmware, a problem was fixed for an AIX or Linux
   partition failing with a SRC B2008105 LP 00005 on a re-IPL after a dump
   (firmware assisted or error generated dump) following a Live Partition
   Mobility (LPM) migration operation.  The problem does not occur if the
   migrated partition completes a normal IPL after the migration.
 * On systems using PowerVM firmware, a problem was fixed to prevent NovaLink
   managed or co-managed systems from blocking SR-IOV configurations.  When
   configuring or deconfiguring SR-IOV, it is highly likely that the NovaLink
   VMC virtual device will interfere with SR-IOV virtual devices.  Without the
   fix, SR-IOV is ignoring the NovaLink VMC device and trying to use the same
   virtual slot.
 * On systems using PowerVM firmware, a problem was fixed for intermittent long
   delays in the NX co-processor for asynchronous requests such as NX 842
   compressions.  This problem was observed for AIX DB2 when it was doing
   hardware-accelerated compressions of data but could occur on any asynchronous
   request to the NX co-processor.
 * On systems using the PowerVM firmware, a fix was made to provide an option to
   change the ordering of PCIe Host Bridge (PHB) devices on Power 8 systems to
   match the discovery order on Power 7 systems.
 * On systems using PowerVM firmware that have an attached HMC,  a problem was
   fixed for a Live Partition Mobility migration that resulted in the source
   managed system going to the Hardware Management Console (HMC) Incomplete
   state after the migration to the target system was completed.  This problem
   is very rare and has only been detected once.  The problem trigger is that
   the source partition does not halt execution after the migration to the
   target system.   The HMC went to the Incomplete state for the source managed
   system when it failed to delete the source partition because the partition
   would not stop running.  When this problem occurred, the customer network was
   running very slowly and this may have contributed to the failure.  The
   recovery action is to re-IPL the source system but that will need to be done
   without the assistance of the HMC.  For each partition that has an OS running
   on the source system, shut down each partition from the OS.  Then from the
   Advanced System Management Interface (ASMI),  power off the managed system. 
   Alternatively, the system power button may also be used to do the power off. 
   If the HMC Incomplete state persists after the power off, the managed system
   should be rebuilt from the HMC.  For more information on HMC recovery steps,
   refer to this IBM Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * On systems using PowerVM firmware, a problem was fixed for a latency time of
   about 2 seconds being added to a target Live Partition Mobility (LPM)
   migration system when there is a latency time check failure.  With the fix,
   in the case of a latency time check failure, a much smaller default latency
   is used instead of two seconds.  This error would not be noticed if the
   customer system is using a NTP time server to maintain the time.
 * On systems using PowerVM firmware that have an attached HMC,  a problem was
   fixed for a Live Partition Mobility migration that resulted in a system hang
   when an EEH error occurred simultaneously with a request for a page migration
   operation.  On the HMC, it shows an incomplete state for the managed system
   with reference code A181D000.  The recovery action is to re-IPL the source
   system but that will need to be done without the assistance of the HMC.  From
   the Advanced System Management Interface (ASMI),  power off the managed
   system.  Alternatively, the system power button may also be used to do the
   power off.  If the HMC Incomplete state persists after the power off, the
   managed system should be rebuilt from the HMC.  For more information on HMC
   recovery steps, refer to this IBM Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * On systems using PowerVM firmware, a problem was fixed for a system dump
   post-dump IPL that resulted in adjunct partition errors of SRC BA54504D,
   B7005191, and BA220020 when they could not be created due to false space
   constraints.  These adjunct partition failures will prevent normal operations
   of the hypervisor such as creating new partitions, so a power off and power
   on of the system is needed to recover it.  If the customer system is
   experiencing this error (only some systems will be impacted), it is expected
   to occur for each system dump post-dump IPL until the fix is applied.
 * On systems using PowerVM firmware,  a problem was fixed for a shared
   processor pool partition showing an incorrect zero "Available Pool Processor"
   (APP) value after a concurrent firmware update.  The zero APP value means
   that no idle cycles are present in the shared processor pool but in this case
   it stays zero even when idle cycles are available.  This value can be
   displayed using the AIX "lparstat" command.  If this problem is encountered,
   the partitions in the affected shared processor pool can be dynamically moved
   to a different shared processor pool.  Before the dynamic move, the
   "uncapped" partitions should be changed to "capped" to avoid a system hang.
   The old affected pool would continue to have the APP error until the system
   is re-IPLed.
 * On systems using PowerVM firmware,  a rare problem was fixed for a system
   hang that can occur when dynamically moving "uncapped" partitions to a
   different shared processor pool.  To prevent a system hang, the "uncapped"
   partitions should be changed to "capped" before doing the move.
 * On systems using PowerVM firmware,  a problem was fixed for a DLPAR add of
   the USB 3.0 adapter (#EC45 and #EC46) to an AIX partition where the adapter
   could not be configured with the AIX "cfgmgr" command that fails with EEH
   errors and an outstanding illegal DMA transaction.  The trigger for the
   problem is the DLPAR add operation of the USB 3.0 adapter that has a USB
   External Dock (#EU04) and RDX Removable Disk Drives attached, or a USB 3.0
   adapter that has a flash drive attached.  The PCI slot can be powered off
   and on to recover the USB 3.0 adapter.
 * On systems using PowerVM firmware,  a problem was fixed for a missing OF
   trace buffer in the resource dump.  This happens any time a resource dump is
   requested.  The missing FFDC data may require that problems be recreated
   before they can be debugged.
 * On systems using PowerVM firmware, a problem was fixed for a Live Partition
   Mobility (LPM) error where the target partition migration fails with an
   HSCLB98C error.  Frequency of this error can be moderate with source
   partitions that have a vNIC resource but extremely low if the source
   partition does not have a vNIC resource.  The failure originates at the VIOS
   VF level, so recovery from this error may need a re-IPL of the system to
   regain full use of the vNIC resources.

SV840_139_056 / FW840.30

09/28/16 Impact:  Availability      Severity:  SPE

New features and functions


 * Support for the CAPI NVMe (Non-Volatile Memory Express) Flash Accelerator
   Adapter with feature code #EJ1K.  This feature provides a PCIe Gen3 adapter
   with an FPGA and 1.92 TB of low write latency, nonvolatile flash memory. The
   adapter physically is a half length x8 adapter, but requires a x16 PCIe
   CAPI-capable Gen3 slot in the system unit. The system connects to the FPGA
   using the CAPI interface. The FPGA connects to the flash memory using NVMe, 
   which is a high performance software interface to read/write this flash
   memory.    Use of the #EJ1K adapter requires one #EC2A CAPI activation
   feature per system.   This CAPI Flash Accelerator Adapter does not run under
   PowerKVM but is a bare-metal install only for the following minimum Little
   Endian (LE) Linux distribution level:  Ubuntu 16.04.1.
   This feature only pertains to the IBM Power System S812L (8247-21L), S822L
   (8247-22L) and S824L (8247-42L) models.
 * Support for 6 core processor with FC #8A2225 and CCIN 54E1  extended for use
   in the Power System S822L (8247-22L).  Support was already in place for this
   processor since FW810.20 for the S822 (8284-22A).
 * The certificate store on the service processor has been upgraded to include
   the changes contained in version 2.6 of the CA certificate list published by
   the Mozilla Foundation at the mozilla.org website as part of the Network
   Security Services (NSS) version 3.21.
   

System firmware changes that affect all systems


 * A problem was fixed for PCI Host Bridge (PHB)  "link down"  Endpoint
   Recoverable errors that became fatal exceptions when not handled by the CAPI
   adapters.  With the fix, the recoverable errors are now detected by the CAPI
   adapters to allow for run-time link recovery.
 * A problem was fixed for CAPI adapter errors that caused the system processors
   to be called out and guarded instead of the CAPI adapter unit.  The errors
   that cause this problem are the rare fatal adapter errors, so the problem
   should be infrequently seen.  With the fix, the failing CAPI adapter is
   guarded after the checkstop instead of the system processor.
 * A problem was fixed for host-initiated resets of the service processor that
   can cause the service processor to terminate.  In this state, the service
   processor will be unresponsive but the system and partitions will continue to
   run.  On systems with a single service processor, the SRC B1817212 will be
   displayed on the control panel.  For systems with redundant service
   processors, the failing service processor will be deconfigured.  To recover
   the failed service processor, the system will need to be powered off with AC
   power removed during a regularly scheduled system service action.  The
   problem is intermittent and very infrequent as most of the host-initiated
   resets of the service processor will work correctly to restore the service
   processor to a normal operating state.

SV840_132_056 / FW840.24

08/31/16 Impact:  Availability      Severity:  HIPER

System firmware changes that affect certain systems


 * HIPER/Non-Pervasive: For a system using PowerVM firmware at a FW840 level and
   having an AIX partition or VIOS partition at specific back levels,  a problem
   was fixed for PCI adapters not getting configured in the OS.  DVD boots hang
   with status code 518 when attempts are made to boot off the AIX or VIOS DVD
   image.  NIM installs hang with status code 608.  If the firmware is updated
   to 840_104 through 840_118 for a SAS booted system, the subsequent reboot
   will hang with status code 554.
   The failing AIX and VIOS levels for the IBM Power System S822 (8284-22A),
   S814 (8286-41A), and S824 (8286-42A) are as follows:
   AIX:
   AIX 7100-01-10
   AIX 7100-02-05 - AIX 7100-02-07
   AIX 6100-07-10
   AIX 6100-08-05 - AIX 6100-08-07
   VIOS:
   VIOS 2.2.1.9
   VIOS 2.2.2.5 - VIOS 2.2.2.70
   The failing AIX and VIOS levels for the IBM Power System E850 (8408-E8E) are
   as follows:
   AIX:
   AIX 7100-02-07
   AIX 6100-08-07
   VIOS:
   VIOS 2.2.2.70
   Without the fix, the problem may be circumvented by upgrading the AIX to
   7100-03-03 or 6100-09-03 and the VIOS to 2.2.3.4.
   Depending on the adapter not getting configured, the error may result in
   Defined devices, EEH errors, and/or failure to boot the partition (if the
   failing adapter is the boot device). These errors may also be seen for a
   rebooted partition after a LPM migration to FW840.
   With the fix applied, the error state for some of the adapters in the running
   OS may persist and it will be necessary to reboot the OS to recover from
   those errors.
   The problem corrected with this Service Pack does not pertain to the IBM
   Power System S812L (8247-21L), S822L (8247-22L), or S824L (8247-42L) models.
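
The failing level lists above for the S822, S814, and S824 can be checked
mechanically.  The following Python sketch assumes level strings shaped like
"7100-02-05" (AIX) and "2.2.2.5" (VIOS) and a plain numeric comparison of
their fields.  The function and table names are hypothetical, and the E850
list would need its own tables.

```python
import re


def parse_level(level: str) -> tuple:
    """Split a level such as '7100-02-05' or '2.2.2.70' into integer fields."""
    return tuple(int(part) for part in re.split(r"[-.]", level))


# Inclusive (low, high) ranges copied from the S822/S814/S824 list above.
FAILING_AIX = [
    ("7100-01-10", "7100-01-10"),
    ("7100-02-05", "7100-02-07"),
    ("6100-07-10", "6100-07-10"),
    ("6100-08-05", "6100-08-07"),
]
FAILING_VIOS = [
    ("2.2.1.9", "2.2.1.9"),
    ("2.2.2.5", "2.2.2.70"),
]


def is_failing(level: str, ranges) -> bool:
    """Return True if the level falls inside any inclusive range."""
    value = parse_level(level)
    return any(parse_level(lo) <= value <= parse_level(hi) for lo, hi in ranges)


print(is_failing("7100-02-06", FAILING_AIX))   # inside 7100-02-05 .. 7100-02-07
print(is_failing("7100-03-03", FAILING_AIX))   # the AIX circumvention level
print(is_failing("2.2.3.4", FAILING_VIOS))     # the VIOS circumvention level
```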

SV840_118_056 / FW840.23

07/28/16 Impact: Data            Severity:  HIPER

System firmware changes that affect certain systems


 * HIPER/Non-Pervasive: DEFERRED:  On systems with DDR4 memory installed, a
   problem was fixed for the handling of data errors in the L4 cache.   If a
   data error occurs in the L4 cache of the memory buffer on an affected system
   and it is pushed out to mainline memory, the data error will not be correctly
   handled.   A data error originating in the L4 cache may result in incorrect
   data being stored into memory.  The DDR4 DRAM has feature code (FC) EM8S for
   a 128GB 1600 MHz CDIMM.
   IBM strongly recommends that the customer plan an outage to install
   the firmware fix immediately.  Fix activation requires a subsequent platform
   IPL following the installation of the firmware fix to eliminate any exposure
   to this issue.
   This problem only exists on the 8408-E8E systems with the DDR4 DRAM memory
   feature.

SV840_113_056 / FW840.22

07/06/16 Impact:  Availability      Severity:  ATT

New features and functions


 * Support was added to Live Partition Mobility to allow migrations between
   partitions at firmware level FW760 and FW840.22 or later.  Previously,
   migration operations were not allowed between FW760 and FW840 partitions.
 * Support for the CoD on the 226W 4.323 GHz eight core processor (CCIN 54E5,
   F/C EPXF) for the EasyScale offering of the S822 (8284-22A).  This includes
   Processor Capacity on Demand (CoD) with Elastic (On/Off) Processor CoD and
   Trial Processor CoD.  Previously, the CoD support for the EasyScale S822 was
   only available when using the ten core 3.42 GHz processor (CCIN 54E8, F/C
   EPXD).

System firmware changes that affect certain systems


 * On systems using PowerVM firmware, a problem was fixed for a sequence of two
   or more Live Partition Mobility migrations that caused a partition to crash
   with a SRC BA330000 logged (Memory allocation error in partition firmware). 
   The sequence of LPM migrations that can trigger the partition crash is as
   follows:
   The original source partition level can be any FW760.xx, FW763.xx, FW770.xx,
   FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx,
   or FW840.xx P8 level.  It is migrated first to a system running one of the
   following levels:
   1) FW730.70 or later 730 firmware or
   2) FW740.60 or later 740 firmware
   And then a second migration is needed to a system running one of the
   following levels:
   1) FW760.00 - FW760.20 or
   2) FW770.00 - FW770.10
   The twice-migrated system partition is now susceptible to the BA330000
   partition crash during normal operations until the partition is rebooted.  If
   an additional LPM migration is done to any firmware level, the
   thrice-migrated partition is also susceptible to the partition crash until it
   is rebooted.
   With the fix applied, the susceptible partitions may still log multiple
   BA330000 errors but there will be no partition crash.  A reboot of the
   partition will stop the logging of the BA330000 SRC.
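
The trigger sequence above can be expressed as a check over a partition's
migration history.  The following Python sketch assumes levels are written as
"FWrrr.ss" with a two-digit numeric service pack, and that the history lists,
in order, the firmware level of each system the partition has run on since its
last reboot.  The function names are hypothetical.

```python
import re

# Releases named above as possible original source levels.
SOURCE_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_fw(level: str) -> tuple:
    """Turn 'FW760.20' into (760, 20)."""
    match = re.fullmatch(r"FW(\d{3})\.(\d{2})", level)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level}")
    return int(match.group(1)), int(match.group(2))


def first_hop(level: str) -> bool:
    """FW730.70 or later 730 firmware, or FW740.60 or later 740 firmware."""
    release, pack = parse_fw(level)
    return (release == 730 and pack >= 70) or (release == 740 and pack >= 60)


def second_hop(level: str) -> bool:
    """FW760.00 - FW760.20 or FW770.00 - FW770.10."""
    release, pack = parse_fw(level)
    return (release == 760 and pack <= 20) or (release == 770 and pack <= 10)


def susceptible(history: list) -> bool:
    """True if three consecutive levels match source, first hop, second hop."""
    for src, hop1, hop2 in zip(history, history[1:], history[2:]):
        if parse_fw(src)[0] in SOURCE_RELEASES and first_hop(hop1) and second_hop(hop2):
            return True
    return False


print(susceptible(["FW840.20", "FW730.80", "FW760.10"]))
print(susceptible(["FW840.20", "FW760.10"]))
```

Because the check scans the whole history, a partition that is migrated again
after the two triggering hops is still reported as susceptible, which matches
the note above that only a reboot clears the exposure.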

SV840_104_056 / FW840.20

05/31/16 Impact:  Availability      Severity:  SPE

New features and functions


 * Support for a 128GB DDR4 memory DIMM for the E850 (8408-E8E) model.  Memory
   feature code #EM8S provides the 128GB CDIMM (1600 MHz, 8GBIT DDR4).   Note
   that DDR4 and DDR3 DIMMs cannot be mixed in the system. Also, the minimum
   firmware level needed for DDR4 usage is FW840.23 due to a fix needed for a
   data integrity problem.
 * Support was added for the Stevens6+ option of the internal tray loading
   DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM
   Drive.  The Stevens6+ option is a FRU hardware replacement for the
   Stevens3+.  MTM 7226-1U3 (Oliver)  FC 5757/5762/5763 attaches to IBM Power
   Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+  DVD
   drive is installed on the system without the required firmware support, the
   boot of an AIX partition will fail when the DVD is used as the load source. 
   Also, an IBM i partition cannot consistently boot from the DVD drive using
   D-mode IPL.  A SRC C2004130 may be logged for the load source not found
   error.
 * Support for the IBM PCIe3 12GB cache RAID plus SAS dual 4-port 6Gb x8 adapter
   with feature code #EJ14 and CCIN 57B1.  This adapter is very similar to the
   #EJ0L SAS adapter, but it uses a second chip in the card to provide more IOPS
   capacity (significant performance improvement) and can attach more SSD.  This
   adapter uses integrated flash memory to provide protection of the write
   cache, without need for batteries, in case of power failure.
 * Support for PowerVM vNIC extended to Linux OS Ubuntu 16.04 LE with up to ten
   vNIC client adapters for each partition.  PowerVM vNIC combines many of the
   best features of SR-IOV and PowerVM SEA to provide a network solution with
   options for advanced functions such as Live Partition Mobility along with
   better performance and I/O efficiency when compared to PowerVM SEA.  In
   addition PowerVM vNIC provides users with bandwidth control (QoS) capability
   by leveraging SR-IOV logical ports as the physical interface to the network.
 * PowerVM CoD was enhanced to eliminate the yearly Utility CoD renewal on
   systems using Utility CoD.  The Utility CoD usage is already monitored to
   make sure systems are running within the prescribed threshold limit of
   unreported usage, so a yearly customer renewal is not needed to manage the
   Utility CoD processor usage.
 * Support was added to the DHCP client on the service processor for non-random
   backoff mode needed for Data Center Manageability Interface (DCMI) V1.5 
   compliance.  By default, the DHCP client does random backoff delays for
   retries during DHCP discovery.  For DCMI V1.5, non-random backoff delays were
   introduced as an option.  Disabling the random back-off mode is not required
   for normal operations, but if wanted, the system administrator can override
   the default and disable the random back-off mode by sending the “SET DCMI
   Configuration Parameters” for the random back-off property of the Discovery
   Configuration parameter.  A value of "0" for the bit means "Disabled".  Or,
   the DHCP configuration file can be modified to add "random-backoff off",
   causing the non-random mode for the retry delays to be used during DHCP
   discovery.
 * Support was added for enhanced diagnostics for PowerVM Simplified Remote
   Restart (SRR) partitions.   This service pack level is recommended when using
   SRR partitions.  You can learn more about SRR partitions at the IBM Knowledge
   Center:
   http://www.ibm.com/support/knowledgecenter/HW4P4/p8hat/p8hat_createremotereslpar.htm.
 * Support was added for auto-correction in the Advanced System Management
   Interface (ASMI) for the "Feature Code/Sequence Number" field of the "System
   Configuration/Program Vital Product Data/System Enclosures" menu selection. 
   Lower case letters are invalid in the "Feature Code/Sequence Number" field so
   these are now changed to upper case letters to help form a valid entry.  For
   example, if "78c9-001" was entered, it would be changed to "78C9-001".
 * Support was added for HTTP Strict Transport Security (HSTS) compliance for
   the Advanced System Management Interface (ASMI) web connection.  Even without
   this feature, any attempt to access ASMI with the HTTP protocol was rejected
   because the service processor firewall blocks port 80 (HTTP).  But enabling
   HSTS for ASMI prevents HSTS security warnings for the service processor
   during network scans by security scanner programs such as IBM AppScan.
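
The difference between the random and non-random backoff modes described for
the DHCP client in this list can be sketched as follows.  This is an
illustration only, not the service processor's implementation: the function
name and its parameters are hypothetical, and the 4-second base, the doubling,
the 64-second cap, and the plus-or-minus 1 second jitter follow the general
retransmission guidance in RFC 2131 rather than any value stated in this
document.

```python
import random


def dhcp_retry_delays(retries, random_backoff=True, base=4.0, cap=64.0, rng=None):
    """Return the list of retry delays (seconds) for DHCP discovery.

    Delays double from `base` up to `cap`.  With random_backoff=True each
    delay is jittered by a uniform value in [-1, +1] seconds; with
    random_backoff=False (the "non-random" mode) the delays are fixed.
    All defaults are assumptions taken from RFC 2131 guidance.
    """
    rng = rng or random.Random()
    delays = []
    delay = base
    for _ in range(retries):
        jitter = rng.uniform(-1.0, 1.0) if random_backoff else 0.0
        delays.append(delay + jitter)
        delay = min(delay * 2, cap)  # exponential growth, capped
    return delays


if __name__ == "__main__":
    print("non-random:", dhcp_retry_delays(6, random_backoff=False))
    print("random:    ", dhcp_retry_delays(6, random_backoff=True))
```

In non-random mode every client retries on the same fixed schedule, which is
what makes the timing predictable; in random mode the jitter spreads retries
from many clients apart.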
   

System firmware changes that affect all systems

 * DEFERRED:  A problem was fixed in the dynamic RAM (DRAM) initialization to
   update the VREF on the DIMMs to the optimal settings and to add an additional
   margin check test to improve the reliability of the DRAM by screening out
   more marginal DIMMs before they can result in a run-time memory fault.
   
 * A problem was fixed for a degraded PCI link causing a processor core to be
   guarded if a non-cacheable unit (NCU) store time-out occurred with SRC
   B113E540 and PRD signature "(NCUFIR[9]) STORE_TIMEOUT: Store timed out on
   PB".  With the fix, the processor core is not guarded because of the NCU
   error.  If this problem occurs and a core is deconfigured, clear the guard
   record and re-IPL to regain the processor core.  The solution for degraded
   PCI links is different from the fix for this problem, but a re-IPL of the CEC
   or a reset of the PCI adapters could help to recover the PCI links from their
   degraded mode.
 * A problem was fixed for an incorrect reduction in FRU callouts for Processor
   Run-time Diagnostic (PRD) errors after a reference oscillator clock (OSCC)
   error has been logged.  Hardware resources are not called out and guarded as
   expected.  Some of the missing PRD data can be found in the secondary SRC of
   B181BAF5 logged by hardware services.  The callouts that PRD would have made
   are in the user data of that error log.
 * A problem was fixed for a Qualys network scan for security vulnerabilities
   causing a core dump in the Intelligent Platform Management Interface (IPMI) 
   process on the service processor with SRC B181EF88.  The error occurs anytime
   the Qualys scan is run because it sends an invalid IPMI session id that
   should have been handled and discarded without a core dump.
 * A security problem was fixed in OpenSSL for a possible service processor
   reset on a null pointer de-reference during RSA PSS signature verification.
   The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
 * A security problem was fixed in the lighttpd server on the service processor,
   where a remote attacker, while attempting authentication, could insert
   strings into the lighttpd server log file.  Under normal operations on the
   service processor, this does not impact anything because the log is disabled
   by default.  The Common Vulnerabilities and Exposures issue number is
   CVE-2015-3200.
   
 * A problem was fixed for the service processor going to the reset state
   instead of the termination state when the anchor card is missing or broken. 
   At the termination state, the Advanced System Management Interface (ASMI) can be
   used to collect failure data and debug the problem with the anchor card.
 * A problem was fixed for error log entries created by Hostboot not getting
   written to the error log in some situations.  This can cause hardware
   detected as failed by Hostboot to not get reported or have a call-home
   generated.  This problem will occur whenever Hostboot commits a recovered or
   informational error as its last error log in the current IPL.  In the next
   IPL,  one or more error logs from Hostboot will be lost.
 * A problem was fixed for a service processor failure during a system power off
   that causes a reset of the service processor.  The service processor is in
   the correct state for a normal system power on after the error.  The
   frequency for this error should be low as it is caused by a very rare race
   condition in the power off process.
 * A problem was fixed so that service processor NVRAM bit flips are now
   detected and reported as predictive errors after a certain threshold of
   failures has occurred.  The SRCs reported are B151F109 (threshold of NVRAM
   errors was reached) or B151F10A (an NVRAM address has failed multiple times). 
   Previously, these normal wear errors in the NVRAM were ignored.  The bit flip
   is self-corrected and does not cause a problem but a high occurrence of these
   could mean that a service processor card FRU or system backplane FRU, as
   called out in the SRC, is in need of service. 
   
 * A security problem was fixed in OpenSSL for a possible service processor
   reset on a null pointer de-reference during SSL certificate management. The
   Common Vulnerabilities and Exposures issue number is CVE-2016-0797.
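
The two NVRAM bit-flip SRCs described in this list (B151F109 for an overall
error threshold and B151F10A for a single address failing repeatedly) suggest
two independent counters.  The sketch below shows one way such thresholding
could work.  It is not the service processor's algorithm: the class name, the
method name, and both threshold values are hypothetical, since the actual
thresholds are not given in this document.

```python
from collections import Counter

SRC_TOTAL_THRESHOLD = "B151F109"  # threshold of NVRAM errors was reached
SRC_REPEAT_ADDRESS = "B151F10A"   # an NVRAM address has failed multiple times


class NvramBitFlipMonitor:
    """Count self-corrected NVRAM bit flips and decide when to report."""

    def __init__(self, total_threshold=32, per_address_threshold=3):
        # Both thresholds are made-up values for illustration only.
        self.total_threshold = total_threshold
        self.per_address_threshold = per_address_threshold
        self.per_address = Counter()
        self.total = 0

    def record_corrected_flip(self, address):
        """Record one corrected bit flip; return an SRC string or None."""
        self.total += 1
        self.per_address[address] += 1
        if self.per_address[address] >= self.per_address_threshold:
            return SRC_REPEAT_ADDRESS
        if self.total >= self.total_threshold:
            return SRC_TOTAL_THRESHOLD
        return None


if __name__ == "__main__":
    monitor = NvramBitFlipMonitor(total_threshold=4, per_address_threshold=3)
    for addr in (0x10, 0x10, 0x10):
        print(hex(addr), monitor.record_corrected_flip(addr))
```

Below either threshold nothing is reported, which matches the note that an
individual bit flip is self-corrected and harmless on its own.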
   

System firmware changes that affect certain systems


 * DEFERRED:  On systems using PowerVM firmware, a performance improvement was
   made by disabling the Hot/Cold Affinity (HCA) hardware feature, which gathers
   memory usage statistics for consumption by partition operating system memory
   management algorithms.  The statistics gathering can, in rare cases, cause
   performance to degrade.  The workloads that may experience issues are
   memory-intensive workloads that have little locality of reference and thus
   cannot take advantage of hardware memory cache.  As a consequence, the
   problem occurs very infrequently or not at all except for very specific
   workloads in an HPC environment.  This performance fix requires an IPL of the
   system to activate it after it is applied.
 * On systems using PowerVM firmware and NovaLink co-management of the
   partitions, a problem was fixed with the Hardware Management Console (HMC)
   not showing the co-management master name with the HMC lscomgmt command.  The
   command displayed blank text for the master owner when NovaLink established
   the master mode.  This problem occurred whenever NovaLink powered on and took
   the master mode that had been released by the HMC.
 * On systems using OPAL firmware, a problem was fixed for Enhanced Error
   Handling (EEH) recoverable errors on network adapters behind a PLX switch
   having the backplane called out by OPAL instead of the adapter slot.
 * On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX
   Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was
   fixed for a Restart Dump operation failing into KDB mode.  If "q" is entered
   to exit from KDB mode, the partition fails to start.  The AIX partition must
   be powered off and back on to recover.  The problem can be circumvented by
   disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
 * On a PowerVM system, a problem was fixed for an incorrect date in partitions
   created with a Simplified Remote Restart-Capable (SRR) attribute where the
   date is created as Epoch 01/01/1970 (MM/DD/YYYY).  Without the fix, the user
   must change the partition time of day when starting the partition for the
   first time to make it correct.  This problem only occurs with SRR partitions.
 * On a PowerVM system with licensed Power Integrated Facility for Linux (IFL)
   processors, a problem was fixed for a system hang that could occur if the
   system contains both 1) dedicated processor partitions configured to share
   processors while active and 2) shared processor partitions.  This problem is
   more likely to occur on a system with a low number of non-IFL processors.
   
 * On systems using PowerVM firmware with dedicated processor partitions,  a
   problem was fixed for the dedicated processor partition becoming
   intermittently unresponsive. The problem can be circumvented by changing the
   partition to use shared processors.  This is a follow-on to the fix provided
   in 840.11 for a different issue for delays in dedicated processor partitions
   that were caused by low I/O utilization.
 * A problem was fixed for transmit time-outs on a Virtual Function (VF) during
   stressful network traffic, on systems using PCIe adapters in Single Root I/O
   Virtualization (SR-IOV) shared-mode.  This fix updates adapter firmware to
   10.2.252.1918, for the following Feature Codes: EN15, EN16, EN17, EN18, EN0H,
   EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
   
 * On systems using OPAL firmware, a problem was fixed in the PCI Host Bridge
   (PHB) to prevent adapter interrupts from being lost when two interrupts come
   in at the same time.  The lost interrupts could result in a slow down for the
   workload using the affected adapter.  This fixes a problem seen with some
   CAPI workloads that have lots of interrupt masking at the same time as a high
   interrupt load.  However, the fix is not specific to the CAPI adapters.
 * On systems using OPAL firmware, a problem was fixed for an extraneous SRC
   BB822411 being logged during service processor termination occurrences.  This
   SRC is unrelated to the root cause of the termination and should be ignored.
 * On systems using OPAL firmware, a problem was fixed for an incomplete
   reporting of a Hypervisor Maintenance Interrupt (HMI) to the host Linux OS. 
   The fix ensures the CPU Processor Identification Register (PIR) is reported
   correctly instead of having an all zero value.  HMIs are caused by hardware
   failures occurring in the SLW (sleep winkle image) for processors or in CAPI
   (Coherent Accelerator Processor Interface) adapters.  These cause the
   hypervisor to investigate the cause of the error by reading SCOM registers to
   isolate the fault and send an HMI.
 * On systems using OPAL firmware, support was added to allow the Linux OS to
   send alphanumeric strings to the operations panel.  The OS program must use a
   device driver for /dev/oppanel.  The driver implements a 32 character buffer
   which a user can read/write by accessing the device (/dev/oppanel). This
   buffer is then displayed on the operator panel display.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   "System Service Aids/Call-home Setup" menu not being able to clear the old
   Service center phone numbers.  The blank or null characters are now accepted
   and can be used to overlay the existing values.  Without the fix, the
   characters input to clear the phone number field are rejected and replaced
   with the old values.  The ASMI Call-home option is not available for systems
   that are managed by the Hardware Management Console (HMC).
 * On PowerVM systems using Elastic Capacity on Demand (CoD) (also known as
   On/Off CoD), a problem was fixed for losing entitlement amounts when
   upgrading from FW820 or FW830.  If you upgrade to a service pack level that
   does not have this fix and lose the entitlement, you can get another On/Off
   (Elastic) CoD Enablement code from IBM Support.  This problem only pertains
   to the E850 (8408-E8E), E870 (9119-MME), and E880 (9119-MHE) models.
 * On IBM Power System S822 (8284-22A) using PowerVM for IBM i partitions, a
   problem was fixed for the User-based pricing indicator being off.  This was
   changed to be on.  The IBM i licensing fees involve a distinction between
   User-based and non-User-based pricing.  The model S822, for PurePower (IBM i)
   now shows User-based pricing as required.
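
For the /dev/oppanel support described in this list, a Linux program could
write a message to the operator panel roughly as sketched below.  The device
path and the 32-character buffer size come from the description above; the
function names are hypothetical, and the choice to space-pad short messages to
the full buffer width is an assumption, not documented driver behavior.

```python
OPPANEL_DEVICE = "/dev/oppanel"  # device node named in the fix description
OPPANEL_SIZE = 32                # buffer size named in the fix description


def format_for_oppanel(message, size=OPPANEL_SIZE):
    """Truncate or space-pad an ASCII message to exactly `size` bytes."""
    data = message.encode("ascii", errors="replace")[:size]
    return data.ljust(size, b" ")


def write_oppanel(message, device=OPPANEL_DEVICE):
    """Write the formatted message to the operator panel device.

    Requires a system that exposes the device and sufficient privileges.
    """
    with open(device, "wb", buffering=0) as panel:
        panel.write(format_for_oppanel(message))


if __name__ == "__main__":
    print(format_for_oppanel("BACKUP IN PROGRESS"))
```

Separating the formatting from the device write keeps the length handling
checkable on a machine that has no operator panel.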

SV840_087_056 / FW840.11

03/18/16 Impact:  Availability      Severity:  ATT

New features and functions


 * Support for PowerVM co-management mode on the Hardware Management Console
   (HMC). This feature allows the HMC and PowerVM NovaLink to both have a live
   management connection to the system.  This is different than the traditional
   dual-HMC model however, and results in some behavior changes in the HMC.  For
   hardware and service management functions, the HMC works as it does when not
   in co-management mode.  However, when in co-management mode, only the PowerVM
   Co-Management Master can make changes to the PowerVM configuration and change
   the state of the system.  Power System firmware updates must be done using
   the HMC,  with the HMC as the Co-Management Master.  All management entities
   (HMC(s) and NovaLink) have read-access to the partition configuration
   regardless of whether they are the designated master.  Typically NovaLink
   will be the co-management master, however if a virtualization task or a
   firmware update is needed,  one can explicitly request master authority for
   the HMC, perform the action, and then relinquish the authority back to
   NovaLink.  The minimum firmware and HMC levels for this feature are FW840.11
   and HMC V8R8.4.0.1.  If using PowerVC with NovaLink co-management, the
   minimum level is PowerVC 1.3.0.2.  Please refer to IBM Knowledge Center link
   "http://ibm.biz/novalink-kc" for more information on the PowerVM NovaLink
   feature and changing the master authority when doing co-management.
   Note:  If a firmware update is attempted from a co-managing HMC that is not
   in the master role, the update operation will fail with the following
   message: "Could not start the update because this management console is not
   the master console.  Check to see if there is another management console
   program is attached to the target server {0} (HSCF0261E)" along with HMC SRC
   E302FB11.
 * The default setting for the "Enlarged I/O Memory Capacity" feature was
   disabled on newly manufactured E850, E870 & E880 models to reduce hypervisor
   memory usage.  Customers of the new systems using PCI adapters that leverage
   "Enlarged I/O Memory Capacity" will need to explicitly enable this feature
   for the supported PCI slots, using ASMI Menus while the system is powered
   off.  Existing systems will not see a change in their current setting.  For
   existing systems with only AIX and IBM i partitions that do not benefit from
   this feature, it can be disabled by using the Advanced System Management
   Interface (ASMI) for the "System Configuration-> I/O Adapter Enlarged
   Capacity" panel to uncheck the option for the "I/O Adapter Enlarged Adapter
   Capacity" feature.
   

System firmware changes that affect certain systems

 * On systems using PowerVM partitions, a problem was fixed for error recovery
   from failed Live Partition Mobility (LPM) migrations.  The recovery error is
   caused by a partition reset that leaves the partition in an unclean state
   with the following consequences:  1) A retry on the migration for the failed
   source partition may not be allowed; and 2) With enough failed migration
   recovery errors, it is possible that any new migration attempts for any
   partition will be denied.  This error condition can be cleared by a re-IPL of
   the system. The partition recovery error after a failed migration is much
   more likely to occur for partitions managed by NovaLink but it is still
   possible to occur for Hardware Management Console (HMC) managed partitions.

SV840_079_056 / FW840.10

03/04/16 Impact:  Availability      Severity:  SPE

New features and functions


 * Support was added to block a full Hardware Management Console (HMC)
   connection to the service processor when the HMC is at a lower firmware major
   and minor release level than the service processor.  In the past, this check
   was done only for the major version of the firmware release but it now has
   been extended to the minor release version level as well.  The HMC at the
   lower firmware level can still make a limited connection to the higher
   firmware level service processor.  This will put the CEC in a "Version
   Mismatch" state.  Firmware updates are allowed with the CEC in the "Version
   Mismatch" state so that the condition can be corrected with either an HMC
   update or a firmware update of the CEC.
 * Support for Processor Capacity on Demand (CoD) for the IBM Power System S822
   (8284-22A) that includes Elastic (On/Off) Processor CoD and Trial Processor
   CoD.
 * Support was removed in the Advanced Systems Management Interface (ASMI) and
   IPMI for allowing the IBM Power System S822 (8284-22A) to change between OPAL
   and PowerVM hypervisor modes.  The default for new 8284-22A systems is
   PowerVM mode and it cannot be changed to OPAL. For existing customers with
   8284-22A systems, both hypervisor modes (PowerVM & OPAL) are still available
   after the firmware is upgraded to 840.10, so they are not affected by the
   change.
   
 * Support was added for a 4-Core 3.02 GHz POWER8 Processor Card with CCIN 54E9
   and feature code #EPXK for the S822 (8284-22A), S812L (8247-21L), and S822L
   (8247-22L) models.
 * Support for PowerVM vNIC with more vNIC client adapters for each partition,
   up to 10 from a limit of 6 at the FW840.00 level.  PowerVM vNIC combines many
   of the best features of SR-IOV and PowerVM SEA to provide a network solution
   with options for advanced functions such as Live Partition Mobility along
   with better performance and I/O efficiency when compared to PowerVM SEA.  In
   addition PowerVM vNIC provides users with bandwidth control (QoS) capability
   by leveraging SR-IOV logical ports as the physical interface to the network.
 * Support for the IBM Power System E850 (8408-E8E) with AIX and Linux
   partitions.
 * The default setting for the "Enlarged I/O Memory Capacity" feature was
   disabled on newly manufactured E850, E870 & E880 models to reduce hypervisor
   memory usage.  Customers using PCI adapters that leverage "Enlarged I/O
   Memory Capacity" will need to explicitly enable this feature for the
   supported PCI slots, using ASMI Menus while the system is powered off.
   

System firmware changes that affect all systems

 * A problem was fixed for false error logs for SRC B181A40F where upper domain
   fans are incorrectly reported as missing on a reboot of the service
   processor.  This problem only pertains to the IBM Power System E850
   (8408-E8E).
 * A problem was fixed for not being able to control all I/O slots for Huge
   Dynamic DMA Window (HDDW) capability on the IBM Power System E850
   (8408-E8E).  There are 13 I/O slots enabled for HDDW on this system but only
   8 could be controlled by the Advanced System Management Interface (ASMI) 
   panel for "I/O Enlarged Capacity".  This prevented enabling all slots to be
   HDDW enabled, limiting DMA bandwidth on some of the I/O slots.
 * A problem was fixed for a system IPL hang at C100C1B0 with SRC 1100D001 when
   the power supplies have failed to supply the necessary 12-volt output for the
   system.   The 1100D001 SRC was calling out the planar when it should have
   called out the power supplies.  With the fix, the system will terminate as
   needed and call out the power supply for replacement.  One mode of power
   supply failure that could trigger the hang is sync-FET failures that disrupt
   the 12-volt output.
 * A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) not getting all
   error logs reported when its error log queue is full.  In the case where the
   error log queue is full with 16 entries, only one entry is returned to the
   hypervisor for reporting.  This error log truncation only occurs during
   periods of high error activity in the expansion drawer.
 * A problem was fixed for the callout of a VPD collection fault and system
   termination with SRC 11008402 to include the 1.2vcs VRM FRU.  The power good
   fault for the 1.2 volts would be a primary cause of this error. 
   Without the fix, the VRM is missing in the callout list and only has the
   VPDPART isolation procedure.
 * A problem was fixed for excessive logging of the SRC 11002610 on a power good
   (pgood) fault when detected by the Digital Power Subsystem Sweep (DPSS). 
   Multiple pgood interrupts are signaled by the DPSS in the interval between
   the first pgood failure and the node power down.  A threshold was added to
   limit the number of error logs for the condition.
 * A problem was fixed to speed recovery for VPD collection time-out errors for
   PCIe resources in an I/O drawer logged with SRC 10009133 during concurrent
   firmware updates.  With the fix, the hypervisor is notified as soon as the
   VPD collection has finished so the PCIe resources can report as available.
   Without the fix, there is a delay as long as two hours for the recovery to
   complete.
 * A problem was fixed to allow IPMI entity IDs to be used in ipmitool raw
   commands on the service processor to get the temperature reading.  Without
   the fix, the DCMI entity IDs have to be used in the raw command for the "Get
   temperature" function.
 * A problem was fixed for a false unrecoverable error (UE) logged for B1822713
   when an invalid cooling zone is found during the adjustment of the system fan
   speeds.  This error can be ignored as it does not represent a problem with
   the fans.
 * A problem was fixed for loss of back-level protection during firmware updates
   if an anchor card has been replaced.  The Power system manufacturing process
   sets the minimum code level a system is allowed to have for proper
   operation.  If an anchor card is replaced, it is possible that the replacement
   anchor card is one that has the Minimum MIF Level (MinMifLevel) given as
   "blank",  and this removes the system back-level protection. With the fix,
   blanks or nulls on the anchor card for this field are handled correctly to
   preserve the back-level protection.  Systems that have already lost the
   back-level protection due to anchor card replacement remain vulnerable to an
   accidental downgrade of code level by operator error, so code updates to a
   lower level for these systems should only be performed under guidance from
   IBM Support.  The following command can be run in the Advanced System
   Management Interface (ASMI) to determine if the system has lost the
   back-level protection with the presence of "blanks" or ASCII 0x20 values for
   MinMifLevel:
   "registry -l cupd/MinMifLevel" with output:
   "cupd/MinMifLevel:
   2020202020202020 2020202020202020 [ ]
   2020202020202020 2020202020202020 [ ]"
 * A problem was fixed for a code update error from FW830 to a FW840 level
   that causes temperature sensors to be lost so that the ipmitool command to
   list the temperature sensors fails with an IPMI program core dump.  If the
   temperature sensors are already corrupted due to a preceding code update,
   this fix adds back in the temperature sensors to allow the ipmitool to work
   for listing the temperature sensors.
 * A problem was fixed for a system checkstop caused by an L2 cache
   least-recently used (LRU) error that should have been a recoverable error for
   the processor and the cache.  The cache error should not have caused an L2 HW
   CTL error checkstop.
 * A problem was fixed for a re-IPL with power on failure with B181A40F SRC
   logged for VPD not found for a DIMM FRU.  The DIMM had been moved to another
   slot or just removed.  In this situation, an IPL of the system from power off
   will work without errors, but a re-IPL with power on,  such as that done
   after processing a hardware dump, will fail with the B181A40F.  Power off the
   system and IPL to recover.  Until the fix is applied, the problem can be
   circumvented after a DIMM memory move by putting the PNOR flash memory in
   genesis mode by running the following commands in ASMI with the CEC powered
   off:
           1) hwsvPnorCmd -c
           2) hwsvPnorCmd -g
   
 * A problem was fixed for the service processor becoming inaccessible when
   having a dynamic IP address and being in DCMI "non-random" mode for DHCP
   discovery by customer configuration.  The problem can occur intermittently
   during an AC power on of the system.  If the service processor does not
   respond on the network, AC power cycle to recover.  Without the fix, the
   problem can be circumvented by using the DHCP client in the DCMI "random"
   mode for DHCP discovery, which is the default on the service processor.
 * A problem was fixed for a memory initialization error reported with SRC
   BC8A0506 that terminates the IPL.  This problem is unlikely to occur because
   it depends on a specific memory location being used by the code load. The
   system can be recovered from the error by doing another IPL.
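
The MinMifLevel check described earlier in this list (looking for a field made
up entirely of ASCII 0x20 blanks in the "registry -l cupd/MinMifLevel" output)
can be automated once the output has been captured as text.  The sketch below
is a hypothetical helper, not an IBM tool.  It assumes the output has the
layout shown in that item: a key line ending in a colon, followed by lines of
hex groups with a trailing bracketed ASCII column.  Output on other firmware
levels may differ.

```python
import re


def min_mif_level_is_blank(registry_output):
    """Return True if every data byte in the dump is 0x20 (ASCII space)."""
    hex_bytes = []
    for line in registry_output.splitlines():
        line = line.split("[", 1)[0]  # drop the trailing "[ ]" ASCII column
        if ":" in line:               # skip the "cupd/MinMifLevel:" key line
            continue
        for group in re.findall(r"\b[0-9A-Fa-f]+\b", line):
            if len(group) % 2 == 0:
                # split each hex group into two-character bytes
                hex_bytes.extend(group[i:i + 2] for i in range(0, len(group), 2))
    return bool(hex_bytes) and all(b == "20" for b in hex_bytes)


if __name__ == "__main__":
    blank_sample = (
        "cupd/MinMifLevel:\n"
        "2020202020202020 2020202020202020 [ ]\n"
        "2020202020202020 2020202020202020 [ ]\n"
    )
    print(min_mif_level_is_blank(blank_sample))
```

A result of True corresponds to the lost back-level protection case; an empty
or unparseable capture returns False so that it is not mistaken for a blank
field.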
   

System firmware changes that affect certain systems


 * On PowerVM systems a problem was fixed to address a performance degradation.
   The problem surfaces under the following conditions:
   1)    There is at least one VIOS or Linux partition that is running with
   dedicated processors AND
   2)    There is at least one VIOS or Linux partition running with shared
   processors AND
   3)    There is at least one AIX or IBM i partition configured with shared
   processors. 
   If ALL the above conditions are met AND one of the following actions occur,
   1)    VIOS/Linux dedicated processor partition is configured to share
   processors while active OR
   2)    A dynamic platform optimization operation (HMC 'optmem' command) is
   performed OR
   3)    Processors are unlicensed via a capacity on demand operation
   there is an exposure for a loss in performance.
   
 * On systems using PowerVM firmware, a problem was fixed for PCIe switch
   recovery to prevent a partition switch failure during the IPL with error logs
   for SRC B7006A22 and B7006971  reported.  This problem can occur when doing
   recovery for an informational error on the switch.  If this problem occurs,
   the partition must be restarted to recover the affected I/O adapters.
 * On systems using PowerVM firmware, a problem was fixed for a concurrent FRU
   exchange of a CAPI (Coherent Accelerator Processor Interface) adapter for a
   standard I/O adapter that results in a vary off failure.  If this failure
   occurs, the system needs to be re-IPLed to fix the adapter.  The trigger for
   this failure is a dual exchange where the CAPI adapter is exchanged first for
   a standard (non-like-typed) adapter.  Then an attempt is made to exchange the
   standard adapter for a CAPI adapter which fails.
 * On systems using PowerVM firmware, a problem was fixed for a CAPI (Coherent
   Accelerator Processor Interface) device going to a "Defined" state instead of
   "Available" after a partition boot.  If the CAPI device is doing recovery and
   logging error data at the time of the partition boot, the error may occur. 
   To recover from the error, reboot the partition.  With the fix, the
   hypervisor will wait for the logging of error data from the CAPI device to
   finish before proceeding with the partition boot.
 * On systems using PowerVM firmware, a problem was fixed for a hypervisor
   adjunct partition that failed with "SRC B2009008 LP=32770" for an unexpected
   SR-IOV adapter configuration.  Without the fix, the system must be re-IPLed
   to correct the adjunct error.  This error is infrequent and can only occur if
   an adapter port configuration is being changed at the same time that error
   recovery is occurring for the adapter.
 * On systems using PowerVM firmware and PCIe adapters in SR-IOV mode,  the
   following problem was addressed with a Broadcom Limited (formerly known as
   Avago Technologies and Emulex) adapter firmware update to 10.2.252.1913: 
   Transmit time-outs on a Virtual Function (VF) during stressful network
   traffic.
 * On systems using PowerVM firmware with an invalid P-side or T-side in the
   firmware, a problem was fixed in the partition firmware Real-Time Abstraction
   System (RTAS) so that system Vital Product Data (VPD) is returned at least
   from the valid side instead of returning no VPD data.   This allows AIX host
   commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work
   to some extent even if there is one bad code side.  Without the fix,  all the
   VPD data is blocked from the OS until the invalid code side is recovered by
   either rejecting the firmware update or attempting to update the system
   firmware again.
 * On systems using PowerVM firmware without a HMC (and in Manufacturing Default
   Configuration (MDC) mode with a single host partition), a problem was fixed
   for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were
   not off-loaded to the host OS.  This is an infrequent error caused by a
   timing error that causes the dump notification signal to the host OS to be
   lost.  The missing/pending dumps can be retrieved by rebooting the host OS
   partition.  The rebooted host OS will receive new notifications of the dumps
   that have to be off-loaded.
 * On systems using PowerVM firmware, a problem was fixed for truncation on the
   memory fields displayed in the Advanced System Management Interface on the
   COD panels.  ASMI shows three fields of memory called "Installed memory",
   "Permanent memory", and "Inactive memory".  The largest value that can be
   displayed in the fields was "9999" GB.  This has been expanded to a maximum
   of "999999" GB for each of the ASMI fields.  The truncation was only in the
   displayed memory value, not in the actual memory size being used by the
   system which was correct.
 * On systems using PowerVM firmware and a partition using Active Memory Sharing
   (AMS), a problem was fixed for a Live Partition Mobility (LPM) migration of
   the AMS partition that can hang the hypervisor on the target CEC.  When an
   AMS partition migrates to the target CEC, a hang condition can occur after
   processors are resumed on the target CEC, but before the migration operation
   completes.  The hang will prevent the migration from completing, and will
   likely require a CEC reboot to recover the hung processors.  For this problem
   to occur, there needs to be memory page-based activity (e.g. AMS dedup or
   Pool paging) that occurs exactly at the same time that the Dirty Page
   Manager's PSR data for that page is being sent to the target CEC.
 * On systems using PowerVM firmware, a problem was fixed for PCIe adapter hangs
   and network traffic error recovery during Live Partition Mobility (LPM) and
   SR-IOV vNIC (virtual ethernet adapter)  operations.  An error in the PCI Host
   Bridge (PHB) hardware can persist in the L3 cache and fail all subsequent
   network traffic through the PHB.  The PHB error recovery was enhanced to
   flush the PHB L3 cache to allow network traffic to resume.
 * On systems using PowerVM firmware with AIX or Linux partitions with greater
   than 8TB of memory, a problem was fixed for Dynamic DMA Window (DDW) enabled
   adapters IPLing into a "Defined" state,  instead of "Available", and unusable
   with a "0" size DMA window.  If a DDW enabled adapter is plugged into an HDDW
   (Huge Dynamic DMA Window) slot in a partition with the large memory size, the
   OS changes the default DMA window to "0" in size.  To prevent this problem,
   the Advanced System Management Interface (ASMI) in the service processor can
   be used to set "I/O Enlarged Capacity" to "0" (which is off), and all the DDW
   enabled adapters will work on the next IPL.
 * On systems using OPAL firmware, a problem was fixed for a held PSI link in
   delayed power off during a reset/reload of the service processor.  This error
   makes the service processor do a forced recovery of the PSI link on the next
   IPL.  For this problem, the PSI SRCs and error logs can be ignored as there
   is no problem in the PSI link.
 * On systems using OPAL firmware, a problem was fixed for intermittent errors
   in the module autoload function in the ibmpowernv driver.  A compatible
   property "ibm.opal-sensor" was added to implement the fix for a smooth
   autoload in Linux.
 * On systems using OPAL firmware, a problem was fixed for lost console output
   for serial consoles during power downs and reboots.  If a power down or
   reboot is detected, the console output buffer is now flushed before
   proceeding with the operation.
 * On systems using OPAL firmware, an informational message was added that OPAL
   does not support opal-prd since the processor runtime diagnostics (PRD) are
   handled by the service processor.
 * On systems using OPAL firmware, a performance problem was fixed in the OPAL
   hypervisor PCI Host Bridge (PHB) to prevent the PHB L3 cache from retrying
   defunct entries in the L3 after an MSI end of information (EOI) has been
   received.  The cache line is now flushed after updating the P/Q bits in the
   priority queue.  The situation is improved (and thus performance) by sending
   a DCBF (Data Cache Block Flush) to force a flush of PHB cache.  This improves
   interrupt performance, reducing latency per interrupt.  The improvement will
   vary by workload.
 * On systems using OPAL firmware, a problem was fixed for the OPAL hypervisor
   not releasing the PSI link after a power off of the CEC.  With the PSI link
   unavailable, the service processor has to forcibly reclaim it on the next
   IPL, causing erroneous SRCs and error logs for the PSI link when no problem
   exists.
 * On systems using OPAL firmware, a problem was fixed for an infinite loop in
   the boot of a host OS Linux kernel.  Under rare error conditions in the real
   time clock, a bad error code returned to the host could cause it to get stuck
   in an infinite loop.
 * On systems using PowerVM firmware and NovaLink management of the partitions,
   a problem was fixed for error recovery for the NovaLink partition in cases
   where it has gone unresponsive with a heartbeat failure.  Without the fix,
   the system would have to be re-IPLed.  With the fix, the hypervisor reboots
   the NovaLink partition to resume normal operations.
 * On PowerVM systems with partitions running Linux, a problem was fixed for
   intermittent hangs following a Live Partition Mobility (LPM) migration of a
   Linux partition.  A partition migrating from a source system running FW840.00
   to a system running any other supported firmware level may become
   unresponsive and unusable once it arrives on the target system.  The problem
   only affects Linux partitions and is intermittent.  Only partitions that have
   previously been migrated to a FW840.00 system are susceptible to a hang on
   subsequent migration to another system.  If a partition is hung following an
   LPM migration, it must be rebooted on the target system to resume operations.
 * On systems using OPAL firmware, a problem was fixed that prevented multiple
   NVIDIA Tesla K80 GPUs from being attached to one PCIe adapter.  This
   prevented using a PCIe attached GPU drawer.  This fix increases the PCIe MMIO
   (memory-mapped I/O) space to 1 TB from a previous maximum of 64 GB per
   PHB/PCIe slot.
 * On PowerVM systems with dedicated processor partitions with low I/O
   utilization, a problem was fixed for the dedicated processor partition
   becoming intermittently unresponsive.  The problem can be circumvented by
   changing the partition to use shared processors.
 * On systems using OPAL firmware, a problem was fixed in OPAL to identify the
   PCI Host Bridge (PHB) on CAPI adapter errors and not always assume PHB0.
 * On systems using OPAL firmware, a problem was fixed in the OPAL gard utility
   to remove gard records after guarded components have been replaced.  Without
   the fix, Hostboot and the gard utility could be in disagreement on the
   replaced components, causing some components to still display as guarded
   after a repair.
 * On systems using PowerVM firmware with partitions with a very large number of
   PCIe adapters, a problem was fixed for partitions that would hang because the
   partition firmware ran out of memory for the OpenFirmware FCode device
   drivers for PCIe adapters.  With the fix, the hypervisor is able to
   dynamically increase the memory to accommodate the larger partition
   configurations of I/O slots and adapters.
 * On PowerVM systems with vNIC adapters, a problem was fixed for doing a
   network boot or install from the adapter using a VLAN tag.  Without the fix,
   the support is missing for doing a network boot from the VLAN tag from the
   SMS RIPL menu.
   
 * On systems using PowerVM firmware, a problem was fixed for a Live Partition
   Mobility (LPM) migration of a partition with large memory that had a
   migration abort when the partition took longer than five minutes to suspend. 
   This is a rare problem and is triggered by an abnormally slow response time
   from the migrating partition.  With the fix, the five minute time limit on
   the suspend operation has been removed.
 * On systems using PowerVM firmware at FW840.00 with an AIX VIO client
   partition at level 7.1 TL04 SP03 or 7.2 TL01 SP00 or later, a problem was
   fixed for virtual ethernet adapters with an IPv6 largesend packet
   (i.e., data packets of size greater than the maximum transmission unit
   (MTU)) that hung and/or ran slow because largesend packets were discarded by
   the hypervisor.   For example, telnet and ping commands for the system will
   be working but as soon as a send of a large packet of data is attempted, the
   network connection hangs.  This firmware fix requires AIX levels 7.1 TL04
   SP03 or 7.2 TL01 SP00 or later for the largesend feature to work.
   The problem can be circumvented by disabling "mtu_bypass" (largesend) on the
   AIX VIO client.  The "mtu_bypass" is disabled by default but many network
   administrators enable it for a performance gain.  To disable "mtu_bypass" on
   the AIX VIO client,  use the following steps:
   (0) This change may impact existing connections so shut down the affected NIC
   cards (where X is the interface number)  prior to the change
   (1) Login to AIX VIO client from console as root
   (2) ifconfig enX down;ifconfig enX detach
   (3) chdev -l enX -a mtu_bypass=off
   (4) chdev -l enX -a state=up
   (5) mkdev -l inet0
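
The circumvention steps above can be captured as a small helper that only
builds the command strings for a given interface number.  This is a minimal
sketch in Python; it does not run anything, and the function name
mtu_bypass_disable_commands is an assumption made for illustration.

```python
def mtu_bypass_disable_commands(interface_number: int) -> list[str]:
    """Return the AIX commands from steps (2)-(5) for interface enX.

    The commands are only formatted here, never executed; they are meant to
    be reviewed and then entered as root on the AIX VIO client console.
    """
    if interface_number < 0:
        raise ValueError("interface number must be non-negative")
    en = f"en{interface_number}"
    return [
        f"ifconfig {en} down",               # step (2), first command
        f"ifconfig {en} detach",             # step (2), second command
        f"chdev -l {en} -a mtu_bypass=off",  # step (3): turn largesend off
        f"chdev -l {en} -a state=up",        # step (4)
        "mkdev -l inet0",                    # step (5)
    ]


if __name__ == "__main__":
    for command in mtu_bypass_disable_commands(0):
        print(command)
```

Generating the list per interface keeps the "X" substitution consistent when
several interfaces have to be changed.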
   

SV840_056_056 / FW840.00

12/04/15 Impact:  New      Severity:  New

New features and functions

NOTE:

 * POWER8 (and later) servers include an “update access key” that is checked
   when system firmware updates are applied to the system.  The initial update
   access keys include an expiration date which is tied to the product warranty.
   System firmware updates will not be processed if the GA date of the desired
   firmware level occurred after the update access key’s expiration date.  As
   these update access keys expire, they need to be replaced using either the
   Hardware Management Console (HMC) or the Advanced System Management
   Interface (ASMI)
   on the service processor.  Update access keys can be obtained via the key
   management website: http://www.ibm.com/servers/eserver/ess/index.wss.
 * Support for allowing the PowerVM hypervisor to continue to run when
   communication between the service processor and platform firmware has been
   lost and cannot be re-established.  An SRC B1817212 may be logged and any
   active partitions will continue to run but they will not be able to be
   managed by the management console.  The partitions can be allowed to run
   until the next scheduled service window at which time the service processor
   can be recovered with an AC power cycle or a pin-hole reset from the operator
   panel.  This error condition would only be seen on a system that had been
   running with a single service processor (no redundancy for the service
   processor).
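
The update access key rule in the first note above reduces to a date
comparison.  The following is a minimal sketch in Python, not the firmware's
actual check; the function name update_allowed and the expiration date in the
example are assumptions chosen for illustration, and the date listed for
FW840.00 in this history is used as a stand-in for its GA date.

```python
from datetime import date


def update_allowed(firmware_ga_date: date, key_expiration: date) -> bool:
    """Apply the rule from the note: an update is not processed when the GA
    date of the desired firmware level is after the key's expiration date."""
    return firmware_ga_date <= key_expiration


if __name__ == "__main__":
    # Hypothetical update access key expiration date.
    expiration = date(2016, 6, 30)
    # 12/04/15 is the date listed for FW840.00 in this history.
    print(update_allowed(date(2015, 12, 4), expiration))
    # 04/27/17 is the date listed for FW830.50 in this history.
    print(update_allowed(date(2017, 4, 27), expiration))
```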

 * Support for an HVDC (180-400 VDC) 1400W power supply in a one plus one or two
   plus two configuration to support redundancy.  Supported in rack models only,
   with F/C EB2N for the S822 (8284-22A), S814 (8286-41A), S824 (8286-42A), and
   E850 (8404-E8E) models, and with F/C EL1D for the S812L (8247-21L),
   S822L (8247-22L), and S824L (8247-42L) models.
 * Support in the Advanced Systems Management Interface (ASMI) for managing
   certificates on the service processor with option "System
   Configuration/Security/Certificate Management".  Certificate management
   includes 1) Generation of Certificate Signing Request (CSR) 2) Download of
   CSR and 3) Upload of signed certificates.  For more information on managing
   certificates, go to the IBM KnowledgeCenter link for "Certificate Management"
   (https://www-01.ibm.com/support/knowledgecenter/P8ESS/p8hby/p8hby_securitycertificate.htm).
 * Support for water cooling of the processor module in place of air cooling
   fins with feature code #ER2C.  The PCIe C5 slot carries the water lines so a
   PCIe adapter cannot be used there when the water cooling is installed.  This
   feature is available for the S822 (8284-22A) and S822L (8247-22L) models
   only.
 * Support for a High Frequency Trading policy to speed the processors.  When
   this policy is enabled, the processor cores are allowed to run at a higher
   frequency and voltage for better performance.  A new panel was created in the
   Advanced Systems Management Interface (ASMI) "System Configuration/High
   Frequency Trading" to enable and disable this policy.  In PowerVM mode,
   this feature applies only to the S822 (8284-22A), S812L (8247-21L), and S822L
   (8247-22L) models.  In OPAL mode, this feature applies to S812L (8247-21L)
   and S822L (8247-22L) with Ubuntu 14.04.3 bare-metal, Ubuntu 15.10 bare-metal,
   or RHEL 7.2 LE bare-metal.
 * Support for enhanced power management on PowerKVM systems with memory
   throttling and in-band power measurement capability.  This feature applies to
   S812L (8247-21L) and S822L (8247-22L) models only.
   
 * Support for service processor call home of error logs over ethernet (no
   dial-up modem required).  The call home setup is done through an option on
   the Advanced System Management Interface called "System Service
   Aids/Call-Home Setup".  This feature is only available for systems that are
   not attached to a management console.  For guidance on how to set up the
   call-home on the service processor, go to the IBM KnowledgeCenter link for
   "Configuring the call-home policy"
   (https://www-01.ibm.com/support/knowledgecenter/P8DEA/p8hby/callhomesetup.htm).
 * PowerVM support for Coherent Accelerator Processor Interface
   (CAPI) adapters.  The PCIe3 LP CAPI Accelerator Adapter with F/C #EJ16 is
   used on the S812L (8247-21L) and S822L (8247-22L) models.  The PCIe3 CAPI
   FlashSystem Accelerator Adapter with F/C #EJ17 is used on the S814 (8286-41A)
   and S824(8286-42A) models.  The PCIe3 CAPI FlashSystem Accelerator Adapter
   with F/C #EJ18 is used on the S822(8284-22A), E870(9119-MME), and
   E880(9119-MHE) models.  This feature does not apply to the S824L (8247-42L)
   model.
 * Support for PCIe3 Expansion Drawer (#EMX0) lower cable failover, using lane
   reversal mode to bring up the expansion drawer from the top cable.  This
   eliminates a single point of failure by supporting lane reversal in case of
   problems with the lower cable.
 * Expanded support of Virtual Ethernet Large send from IPv4 to the IPv6
   protocol in PowerVM.
 * Support for IBM i network install on an IEEE 802.1Q VLAN.  The OS supported
   levels are IBM i 7.2 TR3 or later.  This feature applies only to S814
   (8286-41A), S824(8286-42A), E870 (9119-MME), and E880 (9119-MHE) models.
 * Support for PowerVM vNIC with up to six vNIC client adapters for each
   partition.  PowerVM vNIC combines many of the best features of SR-IOV and
   PowerVM SEA to provide a network solution with options for advanced functions
   such as Live Partition Mobility along with better performance and I/O
   efficiency when compared to PowerVM SEA.  In addition PowerVM vNIC provides
   users with bandwidth control (QoS) capability by leveraging SR-IOV logical
   ports as the physical interface to the network.
   Note:  If more than six vNIC client adapters are used in a partition, the
   partition will run, as there is no check to prevent the extra adapters, but
   certain operations such as Live Partition Mobility may fail.
   
 * Enhanced handling of errors to allow partial data in a Shared Storage Pool
   (SSP) cluster.  Under partial data error conditions, the management console
   "Manage PowerVM" GUI will correctly show the working VIOS clusters along with
   information about the broken VIOS clusters, instead of showing no data.
 * PowerVM enhanced to support Little Endian (LE)  Linux guest OSes with Nvidia
   Compute Intensive Accelerator (PCIe attached GPU) with F/C EC47 and EC4B. 
   These adapters are only supported on the IBM Power System S824L (8247-42L)
   model.  Little Endian must be used because the Nvidia software stack is only
   enabled for LE mode.
 * Live Partition Mobility (LPM) was enhanced to allow the user to specify VIOS
   concurrency level overrides.
 * Support was added for PowerVM hard compliance enforcement of the Power
   Integrated Facility for Linux (IFL).  IFL is an optional lower cost per
   processor core activation for Linux-only workloads on IBM Power Systems. 
   Power IFL processor cores can be activated that are restricted to running
   Linux workloads.  In contrast, processor cores that are activated for
   general-purpose workloads can run any supported operating system.  PowerVM
   will block partition activation, LPM and DLPAR requests on a system with IFL
   processors configured if the total entitlement of AIX and IBMi partitions
   exceeds the amount of licensed general-purpose processors.  For AIX and IBMi
   partitions configured with uncapped processors, the PowerVM hypervisor will
   limit the entitlement and uncapped resources consumed to the amount of
   expensive processors that are currently licensed.
 * Support was added to allow Power Enterprise Pools to convert
   permanently-licensed (static) processors to Pool Processors using a CPOD COD
   activation code provided by the management console.  Previously, only
   unlicensed processors were able to become Pool Processors.
 * The management console was enhanced to allow a Live Partition Mobility (LPM)
   if there is a failed VIOS in a redundant pair.  During LPM, if the VIOS is
   inactive, the management console will use stored configuration information to
   perform the LPM.
 * The firmware update process from the management console and from in-band OS
   (except for IBM i PTFs) has been enhanced to download new "Update access
   keys" as needed to prevent the access key from expiring.  This provides an
   automatic renewal process for the entitled customer.
 * Live Partition Mobility support was added to allow the user to specify a
   different virtual Ethernet switch on the target server.
 * PowerVM was enhanced to support an AIX Live Update where the AIX kernel is
   updated without rebooting the kernel.  The AIX OS level must be 7.2 or
   later.  Starting with AIX Version 7.2, the AIX operating system provides the
   AIX Live Update function which eliminates downtime associated with patching
   the AIX operating system. Previous releases of AIX required systems to be
   rebooted after an interim fix was applied to a running system. This new
   feature allows workloads to remain active during a Live Update operation and
   the operating system can use the interim fix immediately without needing to
   restart the entire system. In the first release of this feature, AIX Live
   Update will allow customers to install interim fixes (ifixes) only. For more
   information on AIX Live Update,  go to the IBM KnowledgeCenter link for "Live
   Update" 
   (https://www-01.ibm.com/support/knowledgecenter//ssw_aix_72/com.ibm.aix.install/live_update_install.htm).
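
For the IFL compliance enforcement item earlier in this list, the blocking
condition can be sketched as a sum over partition entitlements.  This is an
illustration in Python under stated assumptions, not the hypervisor's
implementation: the Partition class, its field names, and the function
request_blocked are hypothetical, and the handling of uncapped partitions is
not modeled.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    os: str             # for example "AIX", "IBM i", or "Linux"
    entitlement: float  # entitled processor units


def request_blocked(partitions: list[Partition],
                    licensed_general_purpose: float,
                    ifl_configured: bool = True) -> bool:
    """Return True when the system has IFL processors configured and the total
    entitlement of AIX and IBM i partitions exceeds the licensed
    general-purpose processors (the condition the text says is blocked)."""
    non_linux = sum(p.entitlement for p in partitions
                    if p.os in ("AIX", "IBM i"))
    return ifl_configured and non_linux > licensed_general_purpose


if __name__ == "__main__":
    example = [
        Partition("prod-aix", "AIX", 2.0),
        Partition("prod-ibmi", "IBM i", 1.5),
        Partition("web-linux", "Linux", 4.0),
    ]
    print(request_blocked(example, licensed_general_purpose=4.0))
    print(request_blocked(example, licensed_general_purpose=3.0))
```

Linux entitlement is deliberately excluded from the sum, since IFL cores are
restricted to Linux workloads and do not count against the general-purpose
license.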
   
 * The management console has been enhanced to use standard FTP in its firmware
   update process instead of a custom implementation.  This will provide a more
   consistent interface for the users.
 * Support for setting Power Management Tuning Parameters from the management
   console (Fixed Maximum Frequency (FMF), Idle Power Save, and DPS Tunables)
   without needing to use the Advanced System Management Interface (ASMI) on the
   service processor.  This allows FMF mode to be set by default without having
   to modify any tunable parameters using ASMI.
 * Support for a Corsa PCIe adapter with accelerator FPGA for low latency
   connection using CAPI (Coherent Accelerator Processor Interface) attached to
   a FlashSystem 900 using two 8Gb optical SR Fibre Channel (FC) connections.
   Supported IBM Power Systems for this feature are the following:
   1) E880 (9119-MHE) with CAPI Activation feature #EC19 and Corsa adapter #EJ18
   Low profile on AIX.
   2) E870 (9119-MME) with CAPI Activation feature #EC18 and Corsa adapter
   #EJ18 Low profile on AIX.
   3) S822 (8284-22A) with CAPI Activation feature #EC2A and Corsa adapter
   #EJ18 Low profile on AIX.
   4) S814 (8286-41A) with CAPI Activation feature #EC2A and Corsa adapter #EJ17
   Full height on AIX.
   5) S824 (8286-42A) with CAPI Activation feature #EC2A and Corsa adapter #EJ17
   Full height on AIX.
   6) S812L (8247-21L) with CAPI Activation feature #EC2A and Corsa adapter
   #EJ16 Low profile on Linux.
   7) S822L (8247-22L)  with CAPI Activation feature #EC2A and Corsa adapter
   #EJ16 Low profile on Linux.
   OS levels that support this feature are PowerVM AIX 7.2 or later and OPAL
   bare-metal Linux Ubuntu 15.10.
   The IBM FlashSystem 900 storage system is model 9840-AE2 (one year warranty)
   or 9843-AE2 (three year warranty) at the 1.4.0.0 or later firmware level with
   feature codes #AF23, #AF24, and #AF25 supported for 1.2 TB, 2.9 TB, 5.7 TB
   modules, respectively.
 * The Digital Power Subsystem Sweep (DPSS) FPGA, used to control P8 fan speeds
   and memory voltages, was enhanced to support the 840 GA level. This DPSS
   update is delayed to the next IPL of the CEC and adds 18 to 20 minutes to the
   IPL.  See the "Concurrent Firmware Updates" section above for details.
 * Support for Data Center Manageability Interface (DCMI) V1.5 and Energy Star
   compliance.  DCMI features were added to the Intelligent Platform Management
   Interface (IPMI) 2.0 implementation on the service processor.  DCMI adds
   platform management capability for monitoring elements such as system
   temperatures, power supplies, and bus errors.  It also includes automatic and
   manually driven recovery capabilities such as local or remote system resets,
   power on/off operations, and logging of abnormal or "out-of-range" conditions for
   later examination.  In addition, it allows querying for inventory information that can
   help identify a failed hardware unit along with power management options for
   getting and setting power limits.
   Note:  A deviation from the DCMI V1.5 specification exists for 840.00 for the
   DCMI Configuration Parameters for DHCP Discovery.  Random back-off mode is
   enabled by default instead of being disabled.  The random back-off puts a
   random variation delay in the DHCP retry interval so that the DHCP clients
   are not responding at the same time. Disabling the back-off time is not
   required for normal operations, but if wanted, the system administrator can
   override the default and disable the random back-off mode by sending the “SET
   DCMI Configuration Parameters” for the random back-off property of the
   Discovery Configuration parameter.  A value of "0" for the bit means
   "Disabled".
 * Support for PowerVM NovaLink partition management.  The NovaLink architecture
   enables OpenStack to work seamlessly with PowerVM by providing a direct
   connection to the PowerVM server rather than proxying through an HMC.  This
   allows for vastly improved scalability (from 30 to 200+ servers), better
   performance, and better alignment with the OpenStack architecture.  NovaLink
   is enabled via a small software package that runs within a Linux partition
   (Ubuntu) on a POWER8 host.  The following are the NovaLink hardware and
   software requirements:
       o POWER8 hardware coupled with System Firmware 840 (or later)
       o Virtual IO Server 2.2.4 (or later)
       o Ubuntu Linux 15.10 (ppc64le) (or later)
       o PowerVC 1.3 (or later)
 * Support for IBM i operating system over Virtual I/O Server (VIOS) on the IBM
   Power System S822 (8284-22A) server.  The IBM i support requires VIOS (no
   native I/O) and FW840.00.  At this level, the S822 supports IBM i 7.2 or IBM
   i 7.1 with special terms and conditions. Technology Refresh 3 or later for IBM
   i 7.2 or Technology Refresh 11 or later for IBM i 7.1 is required.  Multiple
   IBM i partitions, each up to a maximum of two cores, are supported. The Power
   S822 software tier is P10.
   IBM i partitions that access directly attached disk or SSD through VIOS must
   use 4K byte sector drives, not 5xx byte sector drives. The 4K drives are
   required for performance reasons.
   Note:  Async or bisync adapters or crypto-cards are not supported under
   VIOS.  Thus IBM i applications that require use of these adapters are not a
   good fit for the Power S822.   IBM i 7.2 clients can connect to a
   LAN-attached OEM device that has downstream async connections.
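
The NovaLink software requirements listed in the PowerVM NovaLink item above
lend themselves to a simple version comparison.  The sketch below is in
Python and is only an illustration: the dictionary keys, parse_version, and
unmet_requirements are hypothetical names, and the POWER8 hardware and
ppc64le architecture requirements are not checked.

```python
# Minimum levels taken from the NovaLink requirements list.
MINIMUM_LEVELS = {
    "system_firmware": (840,),
    "vios": (2, 2, 4),
    "ubuntu": (15, 10),
    "powervc": (1, 3),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted level such as '2.2.4' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(installed: dict[str, str]) -> list[str]:
    """Return the components that are missing or below the minimum level."""
    unmet = []
    for name, minimum in MINIMUM_LEVELS.items():
        level = installed.get(name)
        if level is None or parse_version(level) < minimum:
            unmet.append(name)
    return unmet


if __name__ == "__main__":
    print(unmet_requirements({
        "system_firmware": "840",
        "vios": "2.2.4",
        "ubuntu": "15.10",
        "powervc": "1.3",
    }))
```

Comparing tuples of integers rather than strings avoids ordering mistakes such
as treating "15.10" as lower than "15.4".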



SV830
For Impact, Severity and other Firmware definitions, Please refer to the below
'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
SV830_106_048 / FW830.50

04/27/17 Impact: Availability    Severity: SPE

New features and functions


 * Support for the Advanced System Management Interface (ASMI) was changed to
   allow the special characters of "I", "O", and "Q" to be entered for the
   serial number of the I/O Enclosure under the Configure I/O Enclosure option. 
   These characters have only been found in an IBM serial number rarely, so
   typing in these characters will normally be an incorrect action.  However,
   the special character entry is not blocked by ASMI anymore so it is able to
   support the exception case.  Without the enhancement, the typing of one of
   the special characters causes message "Invalid serial number" to be
   displayed.
 * Support was added for the Universally Unique IDentifier (UUID) property for
   each partition.  The UUID provides each partition with an identifier that is
   persisted by the platform across partition reboots, reconfigurations, OS
   reinstalls, partition migration,  and hibernation.
   

System firmware changes that affect all systems

 * A problem was fixed for the OS not being able to detect the USB connected
   Uninterruptible Power Supply (UPS) that has feature code #ECCF.  An
   informational SRC B1814616 is logged from the service processor and the IBM i
   OS logs a CPI0961 (Uninterruptible power supply no longer attached).  The
   error occurs infrequently because it depends on system timing and system
   configuration.  If a system is having the error, it might have it on every
   IPL.  The circumvention is to reseat the USB cable connector for the USB
   connected UPS.
 * A problem was fixed for System Vital Product Data (SVPD) FRUs being guarded
   but not having a corresponding error log entry.  This is a failure to commit
   the error log entry that has occurred only rarely.
 * A problem was fixed for the system VPD showing 4 extra PCIe slots that are
   not actually available to the system.  When running an IBM i partition, the
   IBM i Hardware Service Manager shows twelve PCIe adapter slots instead of the
   actual eight that can be used (P1-C2, P1-C3, P1-C4, and P1-C5 are the extra
   slots displayed).  This problem only pertains to the IBM Power System S814
   (8286-41A).
 * A problem was fixed for a system going into safe mode with SRC B1502616
   logged as informational without a call home notification.  Notification is
   needed because the system is running with reduced performance.  If there are
   unrecoverable error logs and any are marked with reduced performance and the
   system has not been rebooted, then the system is probably running in safe
   mode with reduced performance.  With the fix, the SRC B1502616 is an
   Unrecoverable Error (UE).
 * A problem was fixed for the PCIe3 Optical Cable Adapter for the PCIe3
   Expansion Drawer failing with SRC B7006A84 error logged during the IPL.  The
   failed cable adapter can be recovered by using a concurrent repair operation
   to power it off and on.  Or the system can be re-IPLed to recover the cable
   adapter.  The affected optical cable adapters have feature codes #EJ05,
   #EJ06, and #EJ08 with CCINs 2B1C, 6B52, and 2CE2, respectively.
 * A problem was fixed for PCIe Host Bridge (PHB) outages and PCIe adapter
   failures in the PCIe I/O expansion drawer caused by error thresholds being
   exceeded for the LEM bit [21] errors in the FIR accumulator.  These are
   typically minor and expected errors in the PHB that occur during adapter
   updates and do not warrant a reset of the PHB and the PCIe adapter failures. 
   Therefore, the threshold LEM[21] error limit has been increased and the LEM
   fatal error has been changed to a Predictive Error to avoid the outages for
   this condition.
 * A problem was fixed to improve link stability for the PCIe3 I/O expansion
   drawer (#EMX0).  The settings for the continuous time linear equalizers (CTLE)
   were updated for all the PCIe adapters for the PCIe links to the expansion
   drawer.
   The CEC must be re-IPLed for the fix to activate.
 * The following problems were fixed for SR-IOV adapters:
   1) Insufficient resources reported for SR-IOV logical port configured with
   promiscuous mode enabled and a Port VLAN ID (PVID) when creating a new interface
   on the SR-IOV adapters.
   2) Spontaneous dumps and reboot of the adjunct partition for SR-IOV adapters.
   3) Adapter enters firmware loop when single bit ECC error is detected. 
   System firmware detects this condition as an adapter command timeout.  System
   firmware will reset and restart the adapter to recover the adapter
   functionality.  This condition will be reported as a temporary adapter
   hardware failure.
   4) vNIC interfaces not being deleted correctly causing SRC B400FF01 to be
   logged and Data Storage Interrupt (DSI) errors with failure on boot of the
   LPAR.
   This set of fixes updates adapter firmware to 10.2.252.1926, for the
   following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N,
   EN0K, EN0L, EL38, EL3C, EL56, and EL57.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running). 
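
To check whether an installed adapter is among the feature codes listed above
for the 10.2.252.1926 adapter firmware, a set lookup is enough.  This is a
minimal sketch in Python; the names UPDATED_FEATURE_CODES and
covered_by_update are assumptions made for illustration.

```python
# Feature codes listed above as receiving adapter firmware 10.2.252.1926.
UPDATED_FEATURE_CODES = frozenset({
    "EN15", "EN16", "EN17", "EN18", "EN0H", "EN0J", "EN0M", "EN0N",
    "EN0K", "EN0L", "EL38", "EL3C", "EL56", "EL57",
})


def covered_by_update(feature_code: str) -> bool:
    """Accept codes written with or without a leading '#', in any case."""
    return feature_code.strip().lstrip("#").upper() in UPDATED_FEATURE_CODES


if __name__ == "__main__":
    for code in ("#EN0H", "el3c", "EJ18"):
        print(code, covered_by_update(code))
```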
   
 * A problem was fixed for Live Partition Mobility (LPM) migrations from
   FW860.10 or FW860.11 to older levels of firmware.  Subsequent DLPAR of
   Virtual Adapters will fail with HMC error message HSCL294C, which contains
   text similar to the following:  "0931-007 You have specified an invalid
   drc_name." This issue affects partitions installed with AIX 7.2 TL 1 and
   later. Not affected by this issue are partitions installed with VIOS, IBM i,
   or earlier levels of AIX.
 * A problem was fixed for incorrect callouts of the Power Management Controller
   (PMC) hardware with SRC B1112AC4 and SRC B1112AB2 logged.  These extra
   callouts occur when the On-Chip Controller (OCC) has placed the system in the
   Safe mode state for a prior failure that is the real problem that needs to be
   resolved.
 * A problem was fixed for a failure in launching the Advanced System Management
   Interface (ASMI) from the HMC local console for the HMC levels of V8R8.3.0
   SP2 and V8R8.4.0 SP1.  There was a frozen window displayed instead of the
   ASMI login panel.  A circumvention to the problem is to connect to ASMI from
   a remote browser session.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   "System Service Aids => Error/Event Logs" panel not showing the "Clear" and
   "Show" log options and also having a truncated error log when there are a
   large number of error logs on the system.
 * A problem was fixed to allow changing the IPMI channel authentication
   capabilities from the OS.  The command "ipmitool channel authcap 1 4" was
   causing an IPMI core dump every time it was run.
 * A problem was fixed for sporadic blinking amber LEDs for the system fans with
   no SRCs logged.  There was no problem with the fans.  The LED corruption
   occurred when two service processor tasks attempted to update the LED state
   at the same time.  The fan LEDs can be recovered to a normal state
   concurrently using the following link steps for a soft reset of the service
   processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for the loss of Operations Panel function 30 (displaying
   ethernet port HMC1 and HMC2 IP addresses) after a concurrent repair of the
   Operations Panel.  Operations Panel function 30 can be restored concurrently
   using the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm
 * A problem was fixed for the service processor boot watch-dog timer expiring
   too soon during DRAM initialization in the reset/reload, causing the service
   processor to go unresponsive.  On systems with a single service processor,
   the SRC B1817212 was displayed on the control panel.  For systems with
   redundant service processors, the failing service processor was
   deconfigured.  To recover the failed service processor, the system will need
   to be powered off with AC power removed during a regularly scheduled system
   service action.  This problem is intermittent and very infrequent as most of
   the reset/reloads of the service processor will work correctly to restore the
   service processor to a normal operating state.
 * A problem was fixed for host-initiated resets of the service processor
   causing the system to terminate.  A prior fix for this problem did not work
   correctly because some of the host-initiated resets were being translated to
   unknown reset types that caused the system to terminate.  With this new
   correction for failed host-initiated resets, the service processor will still
   be unresponsive but the system and partitions will continue to run.  On
   systems with a single service processor, the SRC B1817212 will be displayed
   on the control panel.  For systems with redundant service processors, the
   failing service processor will be deconfigured.  To recover the failed
   service processor, the system will need to be powered off with AC power
   removed during a regularly scheduled system service action.  This problem is
   intermittent and very infrequent as most of the host-initiated resets of the
   service processor will work correctly to restore the service processor to a
   normal operating state.
 * A problem was fixed for incorrect error messages from the Advanced System
   Management Interface (ASMI) functions when the system is powered on but in
   the "Incomplete State".  For this condition, ASMI was assuming the system was
   powered off because it could not communicate with the PowerVM hypervisor.  With
   the fix, the ASMI error messages will indicate that ASMI functions have
   failed because of the bad hypervisor connection instead of falsely stating
   that the system is powered off.
 * A problem was fixed for a system termination and outage caused by a corrupted
   system reset type.  For cases where the system reset type cannot be
   identified, the service processor will now do a reset/reload to keep the
   system running.  This is a rare problem that occurs during an
   error/recovery situation that involves a reset of the service processor.
   

System firmware changes that affect certain systems


 * On systems with PCIe adapters in Single Root I/O Virtualization (SR-IOV)
   shared mode, a problem was fixed for the hypervisor SR-IOV adjunct partition
   failing during the IPL with SRCs B200F011 and B2009014 logged. The SR-IOV
   adjunct partition successfully recovers after it reboots and the system is
   operational.
 * On systems with maximum memory configurations (where every DIMM slot is
   populated - size of DIMM does not matter), a problem has been fixed for
   systems losing performance and going into Safe mode (a power mode with
   reduced processor frequencies intended to protect the system from
   over-heating and excessive power consumption) with B1xx2AC3/B1xx2AC4 SRCs
   logged.  This happened because of On-Chip Controller (OCC) time out errors
   when collecting Analog Power Subsystem Sweep (APSS) data, used by the OCC to
   tune the processor frequency.  This problem occurs more frequently on systems
   that are running heavy workloads.  Recovery from Safe mode back to normal
   performance can be done with a re-IPL of the system, or concurrently using
   the following link steps for a soft reset of the service processor: 
   https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm.
   To check or validate that Safe mode is not active on the system will require
   a dynamic celogin password from IBM Support to use the service processor
   command line:
   1) Log into ASMI as celogin with dynamic celogin password generated by IBM
   Support
   2) Select System Service Aids
   3) Select Service Processor Command Line
   4) Enter "tmgtclient --query_mode_and_function" from the command line
   The first line of the output, "currSysPwrMode" should say "NOMINAL" and this
   means the system is in normal mode and that Safe mode is not active.
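
The check in step 4 can be sketched as a small helper that inspects captured
command output.  This is a minimal sketch only: the text above states that
the first line of output contains "currSysPwrMode" and should say "NOMINAL",
but the exact line layout is not documented here, so the helper assumes a
simple substring match.  The function name is hypothetical.

```python
def safe_mode_inactive(output: str) -> bool:
    """Return True when the first non-empty line of the captured
    "tmgtclient --query_mode_and_function" output reports NOMINAL.

    Assumption: the line contains both the label "currSysPwrMode" and
    the value "NOMINAL"; the exact separator and layout are not known.
    """
    for line in output.splitlines():
        if line.strip():
            # Only the first non-empty line is examined, as described above.
            return "currSysPwrMode" in line and "NOMINAL" in line
    # Empty output: the mode cannot be confirmed.
    return False
```

The helper takes the text copied from the service processor command line and
returns False for empty output or any value other than NOMINAL, so an
unexpected format is treated as "not confirmed" rather than as normal mode.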

SV830_101_048 / FW830.40

12/08/16 Impact: Availability    Severity: ATT

New features and functions


 * Support for the Advanced System Management Interface (ASMI) was changed to
   not create VPD deconfiguration records and call home alerts for hardware FRUs
   that have one VPD chip of a redundant pair broken or inaccessible.  The
   backup VPD chip for the FRU allows continued use of the hardware resource. 
   The notification of the need for service for the FRU VPD is not provided
   until both of the redundant VPD chips have failed for a FRU.
   

System firmware changes that affect all systems

 * A problem was fixed for excessive, repeating error logs with SRC B150B901 for
   a failed FSI link to a DIMM that had insufficient hardware callouts for easy
   diagnosis of the failure.  With the fix, the B150B901 is limited to one
   occurrence but a new error log is provided with the hardware callouts. 
   Without the fix, if you see repeating B150B901 predictive logs,  there will
   also be repeated informational error logs with SRC B1504800.  These B1504800
   logs would have the hardware involved and could be used to point to the
   failing DIMM.
 * A problem was fixed for PCIe slot errors caused by improper PCIe device
   training.  PCIe links do not train properly and PCIe cards may show up as
   unknown in I/O list system properties.  Error log SRC BA180020 may be seen,
   or informational events B7006976 (for PHB slot) or B7006977 (for a switch
   slot).  The applied fix does not recover failed PCIe devices but does prevent
   those failures on the next power on IPL.  If any PCIe devices are in the
   failed state, they can be recovered using the HMC to power cycle the affected
   PCIe slot.  This problem only affects the IBM Power System E850 (8408-E8E)
   model.
 * A problem was fixed for a backplane short causing smoke in the case.  The
   power on sequence was changed to apply power from one power supply at a time
   and then check for excessive current use that could be caused by a backplane
   short.  If excessive current is detected, the system is powered off with a
   SRC logged to call out the failing hardware.  If a short has occurred, the
   backplane must still be replaced but damage to other components will be
   prevented.  The problem is triggered by a physical move of the system.  This
   problem only affects the IBM Power System E850 (8408-E8E) model.
 * A problem was fixed for On-Chip Controller (OCC) errors that had excessive
   callouts for processor FRUs.  Many of the OCC errors are recoverable and do
   not require that the processor be called out and guarded.  With the fix, the
   processors will only be called out for OCC errors if there are three or more
   OCC failures during a time period of a week.
 * A problem was fixed for the Operations Panel showing swapped physical port
   assignments for logical eth0 and eth1 for the service processor when panel
   function 30 is used.  For eth0, port "T5" is displayed instead of port "T4". 
   For eth1, port "T4" is displayed instead of "T5".  This problem does not
   affect the IP addresses assigned in the Advanced System Management Interface
   (ASMI) for the eth0 and eth1 ports which are correctly assigned.
   This problem only pertains to the IBM Power System E850 (8408-E8E) model.
 * A problem was fixed for an Operations Panel Function 04 (Lamp test) during an
   IPL causing the IPL to fail.  With the fix, the lamp test request is rejected
   during the IPL until the hypervisor is available.  The lamp test can be
   requested without problems anytime after the system is powered on to
   hypervisor ready or an OS is running in a partition.
 * A problem was fixed for infrequent VPD cache read failures during an IPL
   causing an unnecessary guarding of DIMMs with SRC B123A80F logged.  With the
   fix, the VPD cache read failures cause a temporary deconfiguration of the
   associated DIMM but the DIMM is recovered on the next IPL.
 * A problem was fixed for a Live Partition Mobility (LPM) error where the
   target partition migration fails with an HSCLB98C error.  The frequency of this
   error can be moderate with source partitions that have a vNIC resource but
   extremely low if the source partition does not have a vNIC resource.  The
   failure originates at the VIOS VF level, so recovery from this error may need
   a re-IPL of the system to regain full use of the vNIC resources.
 * A problem was fixed for a latency time of about 2 seconds being added to a
   target Live Partition Mobility (LPM) migration system when there is a latency
   time check failure.  With the fix, in the case of a latency time check
   failure, a much smaller default latency is used instead of two seconds.  This
   error would not be noticed if the customer system is using an NTP time server
   to maintain the time.
 * A problem was fixed for a system dump post-dump IPL that resulted in adjunct
   partition errors of SRC BA54504D, B7005191, and BA220020 when they could not
   be created due to false space constraints.  These adjunct partition failures
   will prevent normal operations of the hypervisor such as creating new
   partitions, so a power off and power on of the system is needed to recover
   it.  If the customer system is experiencing this error (only some systems
   will be impacted), it is expected to occur for each system dump post-dump IPL
   until the fix is applied.
 * A problem was fixed for a shared processor pool partition showing an
   incorrect zero "Available Pool Processor" (APP) value after a concurrent
   firmware update.  The zero APP value means that no idle cycles are present in
   the shared processor pool but in this case it stays zero even when idle
   cycles are available.  This value can be displayed using the AIX "lparstat"
   command.  If this problem is encountered, the partitions in the affected
   shared processor pool can be dynamically moved to a different shared
   processor pool.  Before the dynamic move, the "uncapped" partitions should be
   changed to "capped" to avoid a system hang. The old affected pool would
   continue to have the APP error until the system is re-IPLed.
 * A rare problem was fixed for a system hang that can occur when dynamically
   moving "uncapped" partitions to a different shared processor pool.  To
   prevent a system hang, the "uncapped" partitions should be changed to
   "capped" before doing the move.
 * A problem was fixed for a DLPAR add of the USB 3.0 adapter (#EC45 and #EC46)
   to an AIX partition where the adapter could not be configured with the AIX
   "cfgmgr" command that fails with EEH errors and an outstanding illegal DMA
   transaction.  The trigger for the problem is the DLPAR add operation of the
   USB 3.0 adapter that has a USB External Dock (#EU04) and RDX Removable Disk
   Drives attached, or a USB 3.0 adapter that has a flash drive attached.  The
   PCI slot can be powered off and on to recover the USB 3.0 adapter.
 * A problem was fixed for network issues, causing critical situations for
   customers, when an SR-IOV logical port or vNIC is configured with a non-zero
   Port VLAN ID (PVID).  This fix updates adapter firmware to 10.2.252.1922, for
   the following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EL38, EN0M,
   EN0N, EN0K, EN0L, and EL3C.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note: Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can be updated
   concurrently either by the OS that owns the adapter or the managing HMC (if
   OS is AIX or VIOS and RMC is running).
 * A problem was fixed for a failed IPL with SRC UE BC8A090F that does not have
   a hardware callout or a guard of the failing hardware.  The system may be
   recovered by guarding out the processor associated with the error and
   re-IPLing the system.  With the fix, the bad processor core is guarded and
   the system is able to IPL.
 * A problem was fixed for the On-Chip Controller (OCC) incorrectly calling out
   processors with SRC B1112A16 for L4 Cache DIMM failures with SRC B124E504. 
   This false error logging can occur if the DIMM slot that is failing is
   adjacent to two unoccupied DIMM slots.
 * A problem was fixed for host-initiated resets of the service processor that
   can cause the service processor to terminate.  In this state, the service
   processor will be unresponsive but the system and partitions will continue to
   run.  On systems with a single service processor, the SRC B1817212 will be
   displayed on the control panel.  For systems with redundant service
   processors, the failing service processor will be deconfigured.  To recover
   the failed service processor, the system will need to be powered off with AC
   power removed during a regularly scheduled system service action.  The
   problem is intermittent and very infrequent as most of the host-initiated
   resets of the service processor will work correctly to restore the service
   processor to a normal operating state.
 * A problem was fixed for device time-outs during an IPL logged with SRC
   B18138B4.  This error is intermittent and no action is needed for the error
   log.  The service processor hardware server has allotted more time for the
   device transactions to allow the transactions to complete without a time-out
   error.
 * A problem was fixed for cable card capable PCI slots that fail during the
   IPL.  Hypervisor I/O Bus Interface UE B7006A84 is reported for each cable
   card capable PCI slot that doesn't contain a PCIe3 Optical Cable Adapter for
   the PCIe Expansion Drawer (feature code #EJ05).  PCI slots containing a cable
   card will not report an error but will not be functional.  The problem can be
   resolved by performing an AC cycle of the system.  The trigger for the
   failure is the I2C devices used to detect the cable cards are not coming out
   of the power on reset process in the correct state due to a race condition.
 * A problem was fixed with SR-IOV adapter error recovery where the adapter is
   left in a failed state in nested error cases for some adapter errors.  The
   probability of this occurring is very low since the problem trigger is
   multiple low-level adapter failures.  With the fix, the adapter is recovered
   and returned to an operational state.
 * A problem was fixed for setting the disable of a periodic notification
   for a call home error log SRC B150F138 for Memory Buffer resources (membuf)
   from the Advanced System Management Interface (ASMI).
 * A problem was fixed for a blank SRC in the LPA dump for user-initiated
   non-disruptive adjunct dumps.  The SRC is needed for problem determination
   and dump analysis.
 * A problem was fixed for a missing processor FRU callout for SRC BC8A0307 for
   a node deconfiguration during the IPL.  The failing SCM is now provided on
   the callout when this error occurs during the IPL.  This callout allows the
   guard of the failing processor to occur so that the IPL is successful.
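
The On-Chip Controller (OCC) callout rule described in this list (processors
are called out only after three or more OCC failures during a time period of a
week) can be illustrated with the following minimal sketch.  It assumes a
sliding seven-day window ending at the current time; the text does not say
whether the firmware uses a sliding or a fixed window, and the names used here
are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)  # "a time period of a week"
THRESHOLD = 3               # "three or more OCC failures"


def should_call_out(failure_times, now):
    """Return True when at least THRESHOLD failures fall inside the
    WINDOW that ends at `now` (assumed sliding window)."""
    recent = [t for t in failure_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= THRESHOLD
```

With this rule, two failures in the last week plus an older one would not
trigger a callout, while a third failure inside the same week would.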
   

System firmware changes that affect certain systems


 * On systems using the PowerVM hypervisor firmware and NovaLink, a problem was
   fixed for a NovaLink installation error where the hypervisor was unable to
   get the maximum logical memory buffer (LMB) size from the service processor. 
   The maximum supported LMB size should be 0xFFFFFFFF but in some cases it was
   initialized to a value that was less than the amount of configured memory,
   causing the service processor read failure with error code 0X00000134.
 * On systems that have an attached HMC,  a problem was fixed for a Live
   Partition Mobility migration that resulted in the source managed system going
   to the Hardware Management Console (HMC) Incomplete state after the migration
   to the target system was completed.  This problem is very rare and has only
   been detected once.  The problem trigger is that the source partition does
   not halt execution after the migration to the target system.   The HMC went
   to the Incomplete state for the source managed system when it failed to
   delete the source partition because the partition would not stop running. 
   When this problem occurred, the customer network was running very slowly and
   this may have contributed to the failure.  The recovery action is to re-IPL
   the source system but that will need to be done without the assistance of the
   HMC.  Shut down each partition that has an OS running on the source system
   from the OS.  Then from the Advanced System Management
   Interface (ASMI),  power off the managed system.  Alternatively, the system
   power button may also be used to do the power off.  If the HMC Incomplete
   state persists after the power off, the managed system should be rebuilt from
   the HMC.  For more information on HMC recovery steps, refer to this IBM
   Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * On systems that have an attached HMC,  a problem was fixed for a Live
   Partition Mobility migration that resulted in a system hang when an EEH error
   occurred simultaneously with a request for a page migration operation.  On
   the HMC, it shows an incomplete state for the managed system with reference
   code A181D000.  The recovery action is to re-IPL the source system but that
   will need to be done without the assistance of the HMC.  From the Advanced
   System Management Interface (ASMI),  power off the managed system. 
   Alternatively, the system power button may also be used to do the power off. 
   If the HMC Incomplete state persists after the power off, the managed system
   should be rebuilt from the HMC.  For more information on HMC recovery steps,
   refer to this IBM Knowledge Center link:
   https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
 * For the IBM Power System E850 (8408-xxx) systems, a problem was fixed for the
   incorrect values for the Idle Power Saver (IPS) mode call home data.  The
   call home "max" is reported as much lower than what the On-Chip
   Controllers (OCC) read for the IPS.  This problem only affects 4-socket
   systems as it is caused by an integer overflow of the summation of the IPS
   value from all OCCs in the system.
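
The integer overflow described for the Idle Power Saver (IPS) call home data
can be illustrated with a short sketch.  The accumulator width and the per-OCC
readings below are hypothetical values chosen only to show the effect; the
actual firmware field sizes are not given in this document.

```python
def wrapping_sum(values, bits):
    """Sum values in an unsigned accumulator of the given bit width,
    wrapping around on overflow as fixed-width integer arithmetic does."""
    mask = (1 << bits) - 1
    total = 0
    for v in values:
        total = (total + v) & mask
    return total


# Hypothetical readings: one value per OCC.
two_sockets = [20000, 20000]
four_sockets = [20000, 20000, 20000, 20000]

print(wrapping_sum(two_sockets, 16))   # 40000, fits in 16 bits
print(wrapping_sum(four_sockets, 16))  # 14464, wrapped (80000 - 65536)
print(wrapping_sum(four_sockets, 32))  # 80000, a wider accumulator is correct
```

Under these assumed numbers the two-socket total still fits, while the
four-socket total wraps to a much smaller value, which matches the symptom of
a reported "max" that is far lower than what the OCCs read.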

SV830_097_048 / FW830.30

08/24/16 Impact: Availability    Severity: SPE

New features and functions


 * The certificate store on the service processor has been upgraded to include
   the changes contained in version 2.6 of the CA certificate list published by
   the Mozilla Foundation at the mozilla.org website as part of the Network
   Security Services (NSS) version 3.21.
 * Support was added to the Advanced System Management Interface (ASMI) for the
   Intelligent Platform Management Interface (IPMI) to be able to change the IPMI
   password.  On the "Login Profile/Change Password" menu, a user ID of "IPMI"
   can be selected.  Changing the password for IPMI changes the password for the
   default IPMI user ID.  IPMI is not a user ID for logging into ASMI.  The IPMI
   function on the service processor can be accessed using tool "ipmitool" from
   a client system that has a network connection to the service processor.
 * Support was added to protect the service processor from booting on a level of
   firmware that is below the minimum MIF level.  If this is detected, a SRC
   B18130A0 is logged.  A disruptive firmware update would then need to be done
   to the minimum firmware level or higher.  This new support has no effect on
   the system being updated with the service pack but has been put in place to
   provide an enhanced firmware level for the IBM field stock service
   processors.
 * Support was added for the Stevens6+ option of the internal tray loading
   DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM
   Drive.  The Stevens6+ option is a FRU hardware replacement for the
   Stevens3+.  MTM 7226-1U3 (Oliver)  FC 5757/5762/5763 attaches to IBM Power
   Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+  DVD
   drive is installed on the system without the required firmware support, the
   boot of an AIX partition will fail when the DVD is used as the load source. 
   Also, an IBM i partition cannot consistently boot from the DVD drive using
   D-mode IPL.  A SRC C2004130 may be logged for the load source not found
   error.
   

System firmware changes that affect all systems

 * DEFERRED:  A performance improvement was made by disabling the Hot/Cold
   Affinity (HCA) hardware feature which gathers memory usage statistics for
   consumption by partition operating system memory management algorithms.  The
   statistics gathering can, in rare cases, cause performance to degrade.  The
   workloads that may experience issues are memory-intensive workloads that have
   little locality of reference and thus cannot take advantage of hardware
   memory cache.  As a consequence, the problem occurs very infrequently or not
   at all except for very specific workloads in a HPC environment.  This
   performance fix requires an IPL of the system to activate it after it is
   applied.
   
 * A problem was fixed for the service processor going to the reset state
   instead of the termination state when the anchor card is missing or broken. 
   At the termination state, the Advanced System Management Interface (ASMI) can
   be used to collect failure data and debug the problem with the anchor card.
 * A problem was fixed for error log entries created by Hostboot not getting
   written to the error log in some situations.  This can cause hardware
   detected as failed by Hostboot to not get reported or have a call-home
   generated.  This problem will occur whenever Hostboot commits a recovered or
   informational error as its last error log in the current IPL.  In the next
   IPL,  one or more error logs from Hostboot will be lost.
 * A problem was fixed for the health monitoring of the NVRAM and DRAM in the
   service processor that had been disabled.  The monitoring has been
   re-established and early warnings of service processor memory failure are
   logged with one of the following Predictive Error SRCs:  B151F107, B151F109,
   B151F10A, or B151F10D.
 * A problem was fixed for an incorrect date in partitions created with a
   Simplified Remote Restart-Capable (SRR) attribute where the date is created
   as Epoch 01/01/1970 (MM/DD/YYYY).  Without the fix, the user must change the
   partition time of day when starting the partition for the first time to make
   it correct.  This problem only occurs with SRR partitions.
 * A problem was fixed for hypervisor task failures in adjunct partitions with a
   SRC B7000602 reported in the error log.  These failures occur during adjunct
   partition reboots for concurrent firmware updates but are extremely rare and
   require a re-IPL of the system to recover from the task failure.  The adjunct
   partitions may be associated with the VIOS or I/O virtualization for the
   physical adapters such as done for SR-IOV.
 * A problem was fixed for a shortened "Grace Period" for "Out of Compliance"
   users of a Power Enterprise Pool (PEP).   The "Grace Period" is short by one
   hour, so the user has one less hour to resolve compliance issues before the
   HMC disallows any more borrowing of PEP resources.  For example, if the
   "Grace Period" should have been 48 hours as shown in the "Out of Compliance"
   message, it really is 47 hours in the hypervisor firmware.  The borrowing of
   PEP resources is not a common usage scenario.  It is most often found in Live
   Partition Mobility (LPM) migrations where PEP resources are borrowed from the
   source server and loaned to the target server.
 * A problem was fixed for an AIX or Linux partition failing with a SRC B2008105
   LP 00005 on a re-IPL after a dump (firmware assisted or error generated dump)
   following a Live Partition Mobility (LPM) migration operation.  The problem
   does not occur if the migrated partition completes a normal IPL after the
   migration.
 * A problem was fixed for intermittent long delays in the NX co-processor for
   asynchronous requests such as NX 842 compressions.  This problem was observed
   for AIX DB2 when it was doing hardware-accelerated compressions of data but
   could occur on any asynchronous request to the NX co-processor.
 * A problem was fixed for transmit time-outs on a Virtual Function (VF) during
   stressful network traffic, on systems using PCIe adapters in Single Root I/O
   Virtualization (SR-IOV) shared-mode.  This fix updates adapter firmware to
   10.2.252.1918, for the following Feature Codes: EN15, EN16, EN17, EN18, EN0H,
   EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C.
   The SR-IOV adapter firmware level update for the shared-mode adapters happens
   under user control to prevent unexpected temporary outages on the adapters. 
   A system reboot will update all SR-IOV shared-mode adapters with the new
   firmware level.  In addition, when an adapter is first set to SR-IOV shared
   mode, the adapter firmware is updated to the latest level available with the
   system firmware (and it is also updated automatically during maintenance
   operations, such as when the adapter is stopped or replaced).  And lastly,
   selective manual updates of the SR-IOV adapters can be performed using the
   Hardware Management Console (HMC).  To selectively update the adapter
   firmware, follow the steps given at the IBM Knowledge Center for using HMC to
   make the updates:  
   https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
   Note:  Adapters that are capable of running in SR-IOV mode, but are currently
   running in dedicated mode and assigned to a partition, can only be updated
   concurrently by the OS that owns the adapter.
 * A security problem was fixed in OpenSSL for a possible service processor
   reset on a null pointer de-reference during SSL certificate management. The
   Common Vulnerabilities and Exposures issue number is CVE-2016-0797.
 * A problem was fixed for missing dumps for service processor failures during
   firmware updates.
 * A problem was fixed for a service processor failure during a system power off
   that causes a reset of the service processor.  The service processor is in
   the correct state for a normal system power on after the error.  The
   frequency for this error should be low as it is caused by a very rare race
   condition in the power off process.
 * A problem was fixed for a processor hang where the error recovery was not
   guarding the failing processor.  The failure causes a SRC B111E540 to be
   logged with Signature Description of " ex(n0p3c1) (COREFIR[55])
   NEST_HANG_DETECT: External Hang detected".  With the fix, the failing
   processor FRU is called out and guarded so that the error does not re-occur
   when the system is re-IPLed.
 * A problem was fixed for a sequence of two or more Live Partition Mobility
   migrations that caused a partition to crash with a SRC BA330000 logged
   (Memory allocation error in partition firmware).  The sequence of LPM
   migrations that can trigger the partition crash is as follows:
   The original source partition level can be any FW760.xx, FW763.xx, FW770.xx,
   FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx,
   or FW840.xx P8 level.  It is migrated first to a system running one of the
   following levels:
   1) FW730.70 or later 730 firmware or
   2) FW740.60 or later 740 firmware
   And then a second migration is needed to a system running one of the
   following levels:
   1) FW760.00 - FW760.20 or
   2) FW770.00 - FW770.10
   The twice-migrated system partition is now susceptible to the BA330000
   partition crash during normal operations until the partition is rebooted.  If
   an additional LPM migration is done to any firmware level, the
   thrice-migrated partition is also susceptible to the partition crash until it
   is rebooted.
   With the fix applied, the susceptible partitions may still log multiple
   BA330000 errors but there will be no partition crash.  A reboot of the
   partition will stop the logging of the BA330000 SRC.
 * A problem was fixed for the Advanced System Management Interface "Network
   Services/Network Configuration" "Reset Network Configuration" button that was
   not resetting the static routes to the default factory setting.  The
   manufacturing default is to have no static routes defined so the fix clears
   any static routes that had been added.  A circumvention to the problem is to
   use the ASMI "Network Services/Network Configuration/Static Route
   Configuration" "Delete" button before resetting the network configuration.
 * A problem was fixed for the HMC Exchange FRU procedure for DVD drive with MTM
   7226-1U3 and feature codes 5757/5762/5763 where it did not verify the DVD
   drive was plugged in at the end of the exchange procedure.  Without the fix, 
   the user must manually verify that the DVD drive is plugged in.
 * A problem was fixed for the Advanced System Management Interface (ASMI)
   incorrectly showing the Anchor card as guarded whenever any redundant VPD
   chip is guarded.
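
The Live Partition Mobility migration sequence listed above for the BA330000
partition crash can be expressed as a small check.  This is a sketch under
stated assumptions: the function names are hypothetical, only levels written
as "FWnnn.nn" with a two-digit numeric service pack are handled, and the rule
covers just the first two migrations described in the text.

```python
import re

# Releases named above as possible original source levels (P7 and P8).
SOURCE_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_level(level):
    """Split a level such as 'FW760.20' into (release, service_pack)."""
    m = re.fullmatch(r"FW(\d{3})\.(\d{2})", level)
    if m is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return int(m.group(1)), int(m.group(2))


def is_susceptible(source, first_target, second_target):
    """Return True when the two-migration sequence matches the pattern
    described above for the BA330000 partition crash."""
    src_rel, _ = parse_level(source)
    rel1, sp1 = parse_level(first_target)
    rel2, sp2 = parse_level(second_target)
    source_ok = src_rel in SOURCE_RELEASES
    # First migration: FW730.70 or later 730, or FW740.60 or later 740.
    first_ok = (rel1 == 730 and sp1 >= 70) or (rel1 == 740 and sp1 >= 60)
    # Second migration: FW760.00 - FW760.20, or FW770.00 - FW770.10.
    second_ok = (rel2 == 760 and sp2 <= 20) or (rel2 == 770 and sp2 <= 10)
    return source_ok and first_ok and second_ok
```

For example, is_susceptible("FW830.20", "FW730.70", "FW760.10") returns True,
while a first target of "FW730.60" returns False.  A partition that matches
remains susceptible until it is rebooted, as noted above.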
   

System firmware changes that affect certain systems


 * On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX
   Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was
   fixed for a Restart Dump operation failing into KDB mode.  If "q" is entered
   to exit from KDB mode, the partition fails to start.  The AIX partition must
   be powered off and back on to recover.  The problem can be circumvented by
   disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
   
 * A problem was fixed for unneeded throttling of processors if a power supply
   fails.  The error log SRCs of B1812A05 and B1812A33 are reported when the
   processors are throttled.  The affected systems have four power supplies and
   the loss of one power supply would not normally cause power use to go over
   the power capacity limit, but it happened because the number of power
   supplies was internally set as two instead of the four actually in the
   system.  This problem only affects the IBM Power System S824 (8286-42A) and
   the S824L (8247-42L) models.  Without the fix, the problem with processor
   throttling can be circumvented by replacing the power supply that has failed.

SV830_092_048 / FW830.21

06/01/16 Impact: Availability    Severity: SPE

System firmware changes that affect all systems

 * On systems using PowerVM firmware with dedicated processor partitions,  a
   problem was fixed for the dedicated processor partition becoming
   intermittently unresponsive. The problem can be circumvented by changing the
   partition to use shared processors.  This is a follow-on to the fix provided
   in 830.20 for a different issue for delays in dedicated processor partitions
   that were caused by low I/O utilization.

SV830_086_048 / FW830.20

04/01/16 Impact: Availability    Severity: SPE

New features and functions


 * Support was added for a 4-Core 3.02 GHz POWER8 Processor Card with CCIN 54E9
   and feature code #EPXK for the S822 (8284-22A), S812L (8247-21L), and S822L
   (8247-22L) models.
 * Support was added for IPMI console connections for systems running in MDC
   (Manufacturing Default Configuration) mode without a management console.  The
   Advanced System Management Interface (ASMI) "Configuration/Console Type"
   panel is used to select the IPMI console over the serial console. 
   Auto-selection to an IPMI console will also occur if IPMI commands are used
   prior to the power on of the CEC.
 * Support was added to the Advanced System Management Interface (ASMI) to be
   able to add an IPv4 static route definition for each ethernet interface on the
   service processor.  Using a static route definition,  a Hardware Management
   Console (HMC) configured on a private subnet that is different from the
   service processor subnet is now able to connect to the service processor and
   manage the CEC.  A static route persists until it is deleted or until the
   service processor settings are restored to manufacturing defaults.  The
   static route is managed with the ASMI panel "Network Services/Network
   Configuration/Static Route Configuration" IPv4 radio button.  The "Add"
   button is used to add a static route (only one is allowed for each ethernet
   interface) and the "Delete" button is used to delete the static route.
 * Support was added to the Advanced System Management Interface (ASMI) to
   display the environmental info section of error logs in the "System Service
   Aids-> Error->Event logs" panel.  The following is an example of the
   information displayed:
   |------------------------------------------------------
   |                              Environmental Info      
   |------------------------------------------------------
   | Section Version          : 1                         
   | Sub-section type         : 0                        
   | Created by               : powr                                   
   | Genesis Record Time-Stamp: 03/12/2015 15:31:21
   | Genesis Corr-Resistance  : 4.687847
   | Genesis Ambient-Temp(C)  : 28.000000
   | Genesis Corrosion-Rate   : 0           
   |                                                       
   | Corrosion Rate Status    : 1             
   | Presence of UsrDataSec   : 1
   | Num Corrosion Readings   : 1        
   |                                                      
   | Daily Corr-Resistance    : 4.804206          
   | Daily Ambient-Tempr(C)   : 35.312500      
   | Daily Corrosion-Rate     : 12C                  
   |------------------------------------------------------
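
   The panel above is plain "name : value" text.  As a minimal sketch, assuming
   the panel has been copied out of ASMI by hand as text (the function name
   parse_env_info and the SAMPLE excerpt are illustrative and not part of any
   IBM tool), the fields could be collected like this:

```python
def parse_env_info(text):
    """Collect 'name : value' pairs from a copied Environmental Info panel."""
    fields = {}
    for raw in text.splitlines():
        # Drop the leading '|' border and surrounding whitespace.
        line = raw.strip().lstrip("|").strip()
        # Skip blank lines, dashed rulers, and the title line (no colon).
        if not line or set(line) == {"-"} or ":" not in line:
            continue
        # Split on the first colon only; time stamps contain colons too.
        name, value = line.split(":", 1)
        fields[name.strip()] = value.strip()
    return fields


# Excerpt of the example panel shown above.
SAMPLE = """
|------------------------------------------------------
|                              Environmental Info
|------------------------------------------------------
| Created by               : powr
| Genesis Record Time-Stamp: 03/12/2015 15:31:21
|
| Daily Ambient-Tempr(C)   : 35.312500
| Daily Corrosion-Rate     : 12C
|------------------------------------------------------
"""

if __name__ == "__main__":
    for name, value in parse_env_info(SAMPLE).items():
        print(f"{name} = {value}")
```

   Values are kept as strings because the panel mixes decimal readings with
   entries such as "12C" whose encoding is not described here.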
   
   

System firmware changes that affect all systems

 * A problem was fixed for false error logs for SRC B181A40F where upper domain
   fans are incorrectly reported as missing on a reboot of the service
   processor.  This problem only pertains to the IBM Power System E850
   (8408-E8E).
 * A problem was fixed for Advanced System Management Interface (ASMI) TTY to
   allow "admin" passwords to be greater than eight characters in length to be
   consistent with prior generations of the product.  The ASMI web interface
   works correctly for user "admin" passwords with no truncation in the length
   of the passwords.
 * A problem was fixed for a system IPL hang at C100C1B0 with SRC 1100D001 when
   the power supplies have failed to supply the necessary 12-volt output for the
   system.   The 1100D001 SRC was calling out the planar when it should have
   called out the power supplies.  With the fix, the system will terminate as
   needed and call out the power supply for replacement.  One mode of power
   supply failure that could trigger the hang is sync-FET failures that disrupt
   the 12-volt output.
 * A problem was fixed for recovery from PNOR flash memory corruption that
   causes the IPL to fail with SRC D143900C.  This is very rare and only has
   happened in IBM internal labs.  Without the fix, the service processor cannot
   correct the corruption in the PNOR.  If a system has the problem SRC and
   cannot IPL,  then that system must be disruptively firmware updated to apply
   the fix to be able to IPL again.
 * A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) not getting all
   error logs reported when its error log queue is full.  In the case where the
   error log queue is full with 16 entries, only one entry is returned to the
   hypervisor for reporting.  This error log truncation only occurs during
   periods of high error activity in the expansion drawer.
 * A problem was fixed for hardware system dump collection after a hardware
   checkstop that was missing scan ring data.  This is a very infrequent problem
   caused by an error with timing in the multi-threaded dump collection
   process.  Until this fix is applied, the debug of some hardware dump problems
   may require doing multiple dump collections to get all the data.
 * A problem was fixed for an Advanced System Management Interface (ASMI) error
   that occurred when trying to display detail on a deconfigured Anchor Card
   VPD.  If the error log for the selected deconfiguration record had been
   deleted, it caused ASMI to core dump.  With the fix,  if the error log for
   deconfiguration record is missing, the error log details such as failing SRC
   for the deconfiguration record are returned as blank.
 * A problem was fixed for an On-Chip Controller error with SRC B1702AC4 that
   was logged as unrecoverable without hardware callouts.  This occurred when
   the slave OCC failed to receive any Analog Power Subsystem Sweep (APSS) data
   over a long time interval.  With the fix, if the OCC fails in the same
   manner, the error is predictive with hardware callouts in the error log.
 * A problem was fixed in the Advanced System Management Interface (ASMI) for a
   FRU exchange of a DVD where the DVD was not being powered off as needed for
   the exchange.  The missing power off of the FRU could cause a data read or
   write error if the DVD is in use when the DVD is removed.  With the fix, the
   ASMI deactivate DVD button turns off the DVD green power LED during the
   exchange procedure, so it is known when it is safe to continue with the
   exchange procedure steps and remove the DVD.
 * A problem was fixed so that error logs are now generated for thermal
   errors detected by the service processor.  Without the fix, thermal errors
   such as a temperature over the threshold will not get reported in the error
   log but higher fan speeds will be present as an indicator of the thermal
   problem.  Until the fix is applied, the error log and call home mechanism
   cannot be relied on to monitor for system thermal problems.
 * A problem was fixed for not being able to control all I/O slots for Huge
   Dynamic DMA Window (HDDW) capability on the IBM Power System E850
   (8408-E8E).  There are 13 I/O slots enabled for HDDW on this system but only
   8 could be controlled by the Advanced System Management Interface (ASMI) 
   panel for "I/O Enlarged Capacity".  This prevented enabling all slots to be
   HDDW enabled, limiting DMA bandwidth on some of the I/O slots.
 * A problem was fixed for processor core checkstops that cause an LPAR outage
   but do not create hardware errors and service events.  The processor core is
   deconfigured correctly for the error.  This can happen if the hypervisor
   forces processor checkstops in response to excessive processor recovery.
 * A problem was fixed for the callout of a VPD collection fault and system
   termination with SRC 11008402 to include the 1.2vcs VRM FRU.  The power good
   fault for the 1.2 volts would be a primary cause of this error. 
   Without the fix, the VRM is missing in the callout list and only has the
   VPDPART isolation procedure.
 * A problem was fixed for excessive logging of the SRC 11002610 on a power good
   (pgood) fault when detected by the Digital Power Subsystem Sweep (DPSS). 
   Multiple pgood interrupts are signaled by the DPSS in the interval between
   the first pgood failure and the node power down.  A threshold was added to
   limit the number of error logs for the condition.
 * A problem was fixed to speed up recovery for VPD collection time-out errors
   for PCIe resources in an I/O drawer logged with SRC 10009133 during
   concurrent firmware updates.  With the fix, the hypervisor is notified as
   soon as the VPD collection has finished so the PCIe resources can report as
   available.  Without the fix, there is a delay as long as two hours for the
   recovery to complete.
 * A problem was fixed for a false unrecoverable error (UE) logged for B1822713
   when an invalid cooling zone is found during the adjustment of the system fan
   speeds.  This error can be ignored as it does not represent a problem with
   the fans.
 * A problem was fixed for loss of back-level protection during firmware updates
   if an anchor card has been replaced.  The Power system manufacturing process
   sets the minimum code level a system is allowed to have for proper
   operation.  If an anchor card is replaced, it is possible that the replacement
   anchor card is one that has the Minimum MIF Level (MinMifLevel) given as
   "blank",  and this removes the system back-level protection. With the fix,
   blanks or nulls on the anchor card for this field are handled correctly to
   preserve the back-level protection.  Systems that have already lost the
   back-level protection due to anchor card replacement remain vulnerable to an
   accidental downgrade of code level by operator error, so code updates to a
   lower level for these systems should only be performed under guidance from
   IBM Support.  The following command can be run in the Advanced System
   Management Interface (ASMI) to determine if the system has lost the
   back-level protection with the presence of "blanks" or ASCII hex 20 values
   for MinMifLevel:
   "registry -l cupd/MinMifLevel" with output:
   "cupd/MinMifLevel:
   2020202020202020 2020202020202020 [ ]
   2020202020202020 2020202020202020 [ ]"
 * A problem was fixed for a system checkstop caused by an L2 cache
   least-recently used (LRU) error that should have been a recoverable error for
   the processor and the cache.  The cache error should not have caused an L2 HW
   CTL error checkstop.
 * A problem was fixed that was corrupting the Update Access Key (UAK) date,
   setting it to "1900".  The user should correct the UAK date, if needed, to
   allow the firmware update to proceed, by using the original UAK for the
   system.  On the Management Console, enter the original update access key
   via the "Enter COD Code" panel, or on the Advanced System Management
   Interface (ASMI), enter the original update access key via the "On Demand
   Utilities/COD Activation" panel.
 * A problem was fixed for PCIe switch recovery to prevent a partition switch
   failure during the IPL with error logs for SRC B7006A22 and B7006971
   reported.  This problem can occur when doing recovery for an informational
   error on the switch.  If this problem occurs, the partition must be restarted
   to recover the affected I/O adapters.
 * A problem was fixed to correct the error messages for early failures in the
   Live Partition Mobility (LPM) migration of a partition.  The management
   console might report an unrelated error such as "HSCLA27E The operation to
   lock the physical device location for target adapter" when the actual error
   might be not enough available memory on the target CEC to run the migration. 
   With the fix, the correct error code is returned so there is enough
   information to correct the error and retry the migration.
 * A problem was fixed for a hypervisor task hang during a FRU exchange on the
   PCIe3 I/O expansion drawer (#EMX0) that requires the entire drawer to power
   off and power on again.  The activation phase for the power on may never
   complete if a very rare sequence of events occurs during the power on step. 
   The FRUs to exchange that would cause the expansion drawer to power off and
   power on are the following:  midplane, I/O module, I/O module VRM, chassis
   management card (CMC), cable card, and active optical cable.
 * A problem was fixed for PCIe adapter hangs and network traffic error recovery
   during Live Partition Mobility (LPM) and SR-IOV vNIC (virtual ethernet
   adapter)  operations.  An error in the PCI Host Bridge (PHB) hardware can
   persist in the L3 cache and fail all subsequent network traffic through the
   PHB.  The PHB error recovery was enhanced to flush the PHB L3 cache to allow
   network traffic to resume.
 * A problem was fixed for a network boot/install failure using bootp in a
   network with switches using the Spanning Tree Protocol (STP).  A network
   boot/install using lpar_netboot on the management console was enhanced to
   allow the number of retries to be increased.  If the user is not using
   lpar_netboot, the number of bootp retries can be increased using the SMS
   menus.  If the SMS menus are not an option, the STP in the switch can be set
   up to allow packets to pass through while the switch is learning the network
   configuration.
 * A problem was fixed for a hypervisor adjunct partition that failed with "SRC
   B2009008 LP=32770" for an unexpected SR-IOV adapter configuration.  Without
   the fix, the system must be re-IPLed to correct the adjunct error.  This
   error is infrequent and can only occur if an adapter port configuration is
   being changed at the same time that error recovery is occurring for the
   adapter.
 * A problem was fixed for recovering from FSI interrupt overruns (too many FSI
   interrupts at one time that cause the service processor to go interrupt-bound
   and get stuck in a loop) that caused the service processor to go to a failed
   state with SRC B1817212 on systems with a single service processor.  On
   systems with redundant service processors, the failed service processor would
   get guarded with a B151E6D0 or B152E6D0 SRC depending on which service
   processor fails.  With the fix, the FSI interrupt generation is reset if a
   threshold is exceeded, allowing the service processor to continue normal
   processing.  The failure trigger is a rare hardware fault condition that does
   not persist in the service processor.
 * A problem was fixed for a degraded PCI link causing a processor core to be
   guarded if a non-cacheable unit (NCU) store time-out occurred with SRC
   B113E540 and PRD signature "(NCUFIR[9]) STORE_TIMEOUT: Store timed out on
   PB".  With the fix, the processor core is not guarded for the NCU error.  If
   this problem occurs and a core is deconfigured, clear the guard record and
   re-IPL to regain the processor core.  The solution for degraded PCI links is
   different from the fix for this problem, but a re-IPL of the CEC or a reset
   of the PCI adapters could help to recover the PCI links from their degraded
   mode.
 * A problem was fixed for an L2 cache error on the service processor that caused
   the service processor to reset or go to a failed state with SRC B1817212 on
   systems with a single service processor.  On systems with redundant service
   processors, the failed service processor would get guarded with a B151E6D0 or
   B152E6D0 SRC depending on which service processor fails.  With the fix, the
   L2 cache error is handled as a corrected single-bit error with no error to the
   service processor, so it can continue normal processing.  The L2 cache data
   error that causes this fail is infrequent and the service processor requires
   its limit of three resets in fifteen minutes to be exceeded for the service
   processor to fail, so service processor failure rate for this problem is low.
 * A problem was fixed for an incorrect reduction in FRU callouts for Processor
   Run-time Diagnostic (PRD) errors after a reference oscillator clock (OSCC)
   error has been logged.  Hardware resources are not called out and guarded as
   expected.  Some of the missing PRD data can be found in the secondary SRC of
   B181BAF5 logged by hardware services.  The callouts that PRD would have made
   are in the user data of that error log.
   
 * A problem was fixed for error recovery from failed Live Partition Mobility
   (LPM) migrations.  The recovery error is caused by a partition reset that
   leaves the partition in an unclean state with the following consequences:  1)
   A retry on the migration for the failed source partition may not be
   allowed; and 2) With enough failed migration recovery errors, it is possible
   that any new migration attempts for any partition will be denied.  This error
   condition can be cleared by a re-IPL of the system. The partition recovery
   error after a failed migration is much more likely to occur for partitions
   managed by NovaLink but it is still possible to occur for Hardware Management
   Console (HMC) managed partitions.
 * A problem was fixed for a Qualys network scan for security vulnerabilities
   causing a core dump in the Intelligent Platform Management Interface (IPMI) 
   process on the service processor with SRC B181EF88.  The error occurs anytime
   the Qualys scan is run because it sends an invalid IPMI session id that
   should have been handled and discarded without a core dump.
   
 * A security problem was fixed in the lighttpd server on the service processor,
   where a remote attacker, while attempting authentication, could insert
   strings into the lighttpd server log file.  Under normal operations on the
   service processor, this does not impact anything because the log is disabled
   by default.  The Common Vulnerabilities and Exposures issue number is
   CVE-2015-3200.
 * A security problem was fixed in OpenSSL for a possible service processor
   reset on a null pointer de-reference during RSA PSS signature verification.
   The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
 * A problem was fixed to guard a failed processor core to allow the system to
   IPL.  The processor core chiplet FRU was failing to be called out and guarded
   on a RC_PMPROC_CHKSLW_ADDRESS_MISMATCH error and this prevented the system
   from being able to IPL.
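
   For the back-level protection item above, the registry output can be checked
   mechanically.  The following is a minimal sketch, assuming the output of
   "registry -l cupd/MinMifLevel" has been saved as text in the layout shown in
   that item; the function name min_mif_level_is_blank is illustrative and not
   an IBM-provided tool.

```python
import re

# A data line is one or more whitespace-separated groups of hex byte pairs.
HEX_LINE = re.compile(r"(?:[0-9A-Fa-f]{2})+(?:\s+(?:[0-9A-Fa-f]{2})+)*")


def min_mif_level_is_blank(output):
    """Return True if every dumped byte is 0x20 (an ASCII space)."""
    data = bytearray()
    for line in output.splitlines():
        # Drop the trailing ASCII rendering such as "[ ]".
        hex_part = line.split("[", 1)[0].strip()
        # Ignore the "cupd/MinMifLevel:" header and anything that is not hex.
        if not HEX_LINE.fullmatch(hex_part):
            continue
        for group in hex_part.split():
            data.extend(bytes.fromhex(group))
    # No data at all is treated as "unknown", not as blank.
    return len(data) > 0 and all(b == 0x20 for b in data)


# Output copied from the item above (surrounding quotes removed).
BLANK_OUTPUT = """cupd/MinMifLevel:
2020202020202020 2020202020202020 [ ]
2020202020202020 2020202020202020 [ ]"""

if __name__ == "__main__":
    if min_mif_level_is_blank(BLANK_OUTPUT):
        print("MinMifLevel is blank: back-level protection has been lost")
    else:
        print("MinMifLevel holds a value")
```

   The sketch only tests for hex 20 bytes, as described in that item.  A True
   result means any update to a lower code level should be done only under
   guidance from IBM Support.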
   

System firmware changes that affect certain systems


 * On PowerVM systems with dedicated processor partitions with low I/O
   utilization, a problem was fixed for the dedicated processor partition
   becoming intermittently unresponsive.  The problem can be circumvented by
   changing the partition to use shared processors.
 * On systems where memory relocation (as done by using Live Partition Mobility
   (LPM)) and a partition reboot are occurring simultaneously, a problem for a
   system termination was fixed.  The potential for the problem existed between
   the active migration and the partition reboot.
 * On a system running an IBM i partition, a problem was fixed for a machine
   check incorrectly issued to an IBM i partition running 7.2 or later with 4K
   sector disks.  This problem only pertains to the IBM Power System S814
   (8286-41A), S824 (8286-42A), E870 (9119-MME), and E880 (9119-MHE) models.
 * A problem was fixed that limited Virtual Functions (VFs) to a maximum of 50
   on a single PCIe3 10GbE adapter (feature codes #EN15, #EN16, #EN17, and
   #EN18; and CCINs 2CE3 and 2CE4) when 64 should have been allowed.  This
   problem only occurs for two of the SR-IOV capable slot locations in the Power
   Systems:  slot C4 in the PCIe3 I/O expansion drawer (#EMX0) and slot C7 in
   the Power System E850 (8408-E8E).
 * A problem was fixed for an extraneous PCIe switch SRC B7006A22 being called
   out when there is a valid PCIe expansion drawer cable problem with SRC
   B7006A88 reported.  The callout for SRC B7006A22 should be ignored as the
   PCIe switch hardware is working for this case.
 * On a system with an AIX partition and a Linux partition, a problem was fixed
   for dynamically moving an adapter that uses DMA from the Linux partition to
   the AIX partition that caused the AIX to fail by going into KDB mode (0c20
   crash).  The management console showed the following message for the
   partition operation:  "Dynamic move of I/O resources failed.  The I/O slot
   dynamic partitioning operation failed.".  The error was caused by Linux using
   64K mappings for the DMA window and AIX using 4K mappings for the DMA window,
   causing incorrect calculations on the AIX when it received the adapter. 
   Until the fix is applied, the adapters that use DMA should only be moved from
   Linux to AIX when the partitions are powered off.  This problem does not
   pertain to Power System S812L(8247-21L), S822L(8247-22L), and S824L(8247-42L)
   models.
 * A problem was fixed for a Live Partition Mobility migration failure of a time
   reference partition (TRP) to a FW830 system when setting partition hibernate
   capable "false".  This happens any time a migration of the TRP is
   attempted.  To circumvent the problem, set the partition's Time Reference
   Property to disabled and retry the migration.
 * For Integrated Virtualization Manager (IVM) managed systems with more than 64
   active partitions, a problem was fixed for recovery from Live Partition
   Mobility (LPM) errors.  Without the fix, the IVM managed system partition can
   appear to still be running LPM after LPM has aborted, preventing retries of
   the LPM operation.  In this case, the partition must be stopped and restarted
   to clear the LPM error state.  The problem is not frequent because it
   requires a failed LPM on a partition with a partition ID that is greater than
   64.
 * On systems with a partition using Active Memory Sharing (AMS), a problem was
   fixed for a Live Partition Mobility (LPM) migration of the AMS partition that
   can hang the hypervisor on the target CEC.  When an AMS partition migrates to
   the target CEC, a hang condition can occur after processors are resumed on
   the target CEC, but before the migration operation completes.  The hang will
   prevent the migration from completing, and will likely require a CEC reboot
   to recover the hung processors.  For this problem to occur, there needs to be
   memory page-based activity (e.g. AMS dedup or Pool paging) that occurs
   exactly at the same time that the Dirty Page Manager's PSR data for that page
   is being sent to the target CEC.
 * On systems with an invalid P-side or T-side in the firmware, a problem was
   fixed in the partition firmware Run-Time Abstraction Services (RTAS) so that
   system Vital Product Data (VPD) is returned at least from the valid side
   instead of returning no VPD data.   This allows AIX host commands such as
   lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent
   even if there is one bad code side.  Without the fix,  all the VPD data is
   blocked from the OS until the invalid code side is recovered by either
   rejecting the firmware update or attempting to update the system firmware
   again.
 * On systems using PowerVM firmware without a HMC (and in Manufacturing Default
   Configuration (MDC) mode with a single host partition), a problem was fixed
   for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were
   not off-loaded to the host OS.  This is an infrequent error caused by a
   timing error that causes the dump notification signal to the host OS to be
   lost.  The missing/pending dumps can be retrieved by rebooting the host OS
   partition.  The rebooted host OS will receive new notifications of the dumps
   that have to be off-loaded.
 * On systems using PCIe adapters in SR-IOV mode, a problem was fixed for
   occasional B200F011 and B2009008 SRCs that can occur during an IPL, moving an
   adapter into SR-IOV mode, or with SR-IOV link up/down activity.
 * On systems using PCIe adapters in SR-IOV mode,  the following problems were
   addressed with a Broadcom Limited (formerly known as Avago Technologies and
   Emulex) adapter firmware update to 10.2.252.1905:  1) Eliminating virtual
   function (VF) transmit errors during VF resets and 2) Preventing loss of
   legacy flow control when an adapter port is connected to a priority flow
   control (PFC) capable switch.
   
 * On systems with AIX or Linux encapsulated state partitions, a problem was
   fixed for a Live Partition Mobility migration failure for the encapsulated
   state partitions.  The migration fails on the target CEC when the associated
   paging space needed to support the encapsulated state is not available. 
   Removing the "Encapsulated State" attribute from the partition would allow
   the migration to succeed.  However, removing this attribute can only be
   accomplished if the partition is in the powered off state.  Encapsulated State
   partitions are needed for the remote restart feature.  An encapsulated state
   partition is a partition in which the configuration information and the
   persistent data are stored external to the server on persistent storage.  A
   partition that supports remote restart can be restarted remotely.  For more
   information on the remote restart feature, refer to this IBM Knowledge Center
   link:
   http://www.ibm.com/support/knowledgecenter/P8DEA/p8efd/p8efd_lpar_general_props.htm
   
 * Support was added to eliminate the yearly Utility COD renewal on systems
   using Utility COD.  The Utility COD usage is already monitored to make sure
   systems are running within the prescribed threshold limit of unreported
   usage, so a yearly customer renewal is not needed to manage the Utility COD
   processor usage.

SV830_075_048 / FW830.11

11/11/15 Impact: Availability    Severity: HIPER

New features and functions


 * Support for a new 4.124 GHz processor with CCIN 551E.  This pertains to the
   IBM Power System E850 (8408-E8E).
   

System firmware changes that affect all systems

 * HIPER/Pervasive:  A problem was fixed for recovering from embedded
   MultiMediaCard (eMMC) flash NAND errors that caused the service processor to
   go to a failed state with SRC B1817212 on systems with a single service
   processor.  On systems with redundant service processors, the failed service
   processor would get guarded with a B151E6D0 or B152E6D0 SRC depending on
   which service processor fails.
 * HIPER/Pervasive: A problem associated with workloads using transactional
   memory on PowerVM was discovered and is fixed in this service pack. The
   effect of the problem is non-deterministic but may include undetected
   corruption of data.
   
 * DEFERRED:  A problem was fixed for memory on-die termination (ODT) settings
   to improve the signal integrity of the memory channel.
 * A problem was fixed to raise the temperature threshold for ambient
   temperature warnings for performance degradation, as it was set too low.
   This problem only pertains to the IBM Power System E850 (8408-E8E).
   
 * A problem was fixed for power supply redundancy for the HVDC power supplies. 
   If one power supply had failed, the other power supply would fail to IPL with
   SRC 11002614 when the IPL should have been successful.  This power supply is
   supported in rack models only, with F/C EB2N for the S822 (8284-22A),
   S814 (8286-41A), S824 (8286-42A), and E850 (8408-E8E) models, and with F/C
   EL1D for the S812L (8247-21L), S822L (8247-22L), and S824L (8247-42L) models.
 * A problem was fixed for recovery from unaligned addresses for MSI interrupts
   from PCIe adapters.  The recovery prevents an adapter timeout caused by
   resource exhaustion.  With the fix, the resources for each bad interrupt are
   returned, allowing the PCIe adapter to continue to run for the normal
   traffic.
 * A problem was fixed for an Operations Panel SRC of B1504804 with no FRU
   callout.  A callout of the failed hardware has been added.
 * A problem was fixed to prevent recoverable power faults of short duration
   from causing the system to lose power supply redundancy.  Without the fix,
   the faulted state persisted for the recovered power fault, causing a problem
   with a system power off if other power supplies were lost at a later time.
 * A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) link failure
   with SRC B7006A8B.  The settings for the continuous time linear equalizers
   (CTLE) were adjusted to improve the incoming signal strength to improve the
   stability of the links.  The expansion drawer must be power cycled or the CEC
   can be re-IPLed for the fix to activate.
   
 * A problem was fixed for recovery from a processor local bus (PLB) hang on the
   service processor.  The errant PLB hang recovery would be seen in concurrent
   firmware updates that, on rare occasions, fail to do a side switch to
   activate to the new level of firmware.  On the management console, the error
   message would be "HSCF010180E Operation failed ... E302F873 is the error
   code."  Other than the failed code level activation, the firmware update is
   successful.  If this problem occurs, the system can be set to the new
   firmware level by doing a power off from the management console and then
   doing a power on with side switch selected in the advanced properties.

SV830_068_048 / FW830.10

09/10/15 Impact: Availability    Severity: HIPER

New features and functions


 * Support for an HVDC (180-400 VDC) 1400W power supply in a one plus one or two
   plus two configuration to support redundancy.  Supported in rack models only,
   with F/C EB2N for the S822 (8284-22A), S814 (8286-41A), S824 (8286-42A), and
   E850 (8408-E8E) models, and with F/C EL1D for the S812L (8247-21L),
   S822L (8247-22L), and S824L (8247-42L) models.
 * The firmware code update process was enhanced with a feature to block a
   firmware "downgrade" to a level that is below the system's manufactured code
   level.
   

System firmware changes that affect all systems

 * HIPER/Pervasive:DEFERRED:  A problem was fixed for a TCP/IP performance
   degradation on PCIe ethernet adapters with Remote Direct Memory Access (RDMA)
   over Converged Ethernet (RoCE).  By adjusting the system memory caching, a
   significant improvement was made to the data throughput speed to restore
   performance to expected levels.  This fix requires a system re-IPL to take
   effect.  This problem affects the E850 (8408-E8E), E870 (9119-MME), and E880
   (9119-MHE) systems.
 * HIPER/Pervasive:  A problem was fixed for an ethernet adapter hanging on the
   service processor.  This hang prevents TCP/IP network traffic from the
   management console and the Advanced System Management Interface (ASMI)
   browsers.  It makes it appear as if the service processor is unresponsive and
   can be confused with a service processor in the stopped state.  An A/C power
   cycle would recover a hung ethernet adapter.
 * HIPER/Pervasive:  A problem was fixed for missing the interrupts for
   processor local bus (PLB) time-outs.  This problem could hang the service
   processor or cause it to panic with a reset/reload of the service processor. 
   There is a possibility the reset of the service processor could take it to a
   stopped state where the service processor would be unresponsive.  In the
   service processor stopped state, any active partitions will continue to run
   but they will not be able to be managed by the management console.  The
   partitions can be allowed to run until the next scheduled service window at
   which time the service processor can be recovered with an AC power cycle or a
   pin-hole reset from the operator panel.
 * HIPER/Pervasive:  A problem was fixed for a system reset to clear the boot
   registers to prevent the reset from being mishandled as a chip reset.  If a
   "system reset" is misinterpreted as a "chip reset", the boot of the service
   processor can go inadvertently to a stopped state and be unresponsive. 
   Pin-hole resets from the operations panel could also fail to the service
   processor stopped state.  In the service processor stopped state, any active
   partitions will continue to run but they will not be able to be managed by
   the management console.  The partitions can be allowed to run until the next
   scheduled service window at which time the service processor can be recovered
   with an AC power cycle or a pin-hole reset from the operator panel.
 * HIPER/Pervasive:  A problem was fixed so a corrupted file system partition
   table can be recovered and not have the service processor lose the ability to
   do P and T-side switches.  In error recovery situations, the loss of the
   side-switch option could present itself as an unresponsive service processor
   if it was needed to prevent a failure to the service processor stopped state.
 * HIPER/Pervasive:  A problem was fixed for a runaway interrupt request (IRQ)
   condition that caused the service processor to go to a stopped state.  In the
   service processor stopped state, any active partitions will continue to run
   but they will not be able to be managed by the management console.  The
   partitions can be allowed to run until the next scheduled service window at
   which time the service processor can be recovered with an AC power cycle or a
   pin-hole reset from the operator panel.
 * HIPER/Pervasive:  A problem was fixed for a dump partition full condition
   that caused the service processor to go to a stopped state.  In the service
   processor stopped state, any active partitions will continue to run but they
   will not be able to be managed by the management console.  The partitions can
   be allowed to run until the next scheduled service window at which time the
   service processor can be recovered with an AC power cycle or a pin-hole reset
   from the operator panel.
 * DEFERRED:  A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) link
   failure with SRC B7006A8B.  Data packet send retries were increased and link
   recovery was enabled to improve the stability of the links.  The CEC must be
   re-IPLed for the fix to activate.
 * A problem was fixed for an SRC 11002613 logged during a concurrent repair of a
   power supply.  This SRC was erroneously logged and did not represent a real
   problem.
 * A problem was fixed for an intermittent SRC B1504804 logged on a re-ipl of
   the CEC but that did not result in an IPL failure.
 * A problem was fixed for the capture of the registers for the Hostboot
   Self-Boot Engine (SBE) for SBE failures.  These registers had been missing
   from failure data for SBE failures, making these problems more difficult to
   debug.
 * A problem was fixed to remove an unnecessary delay in the system IPL to
   reduce the time needed to IPL by 30 seconds.
 * A problem was fixed for an unneeded error log with SRC B181DB04 that occurred
   in a failed IPL for a normal condition of lost PNOR flash access after a
   reIPL process had started and taken over the access.
 * A problem was fixed for an Advanced System Management Interface (ASMI) error
   message of "Error in function 'connect', error code 111" when a browser
   attempted to connect before the service processor was ready.  The browser
   connection through the web server is now held off until the ASMI process is
   ready after a reset of the service processor or an AC power cycle of the
   system.
 * A problem was fixed for an incorrect call home for SRC B1818A0F.  There was
   no real problem so this call home should have been ignored.
 * A problem was fixed for a dump reIPL that failed with SRC B1818601 and
   B181460B after processor checkstops had terminated the system.
 * A problem was fixed for an infrequent service processor database corruption
   during concurrent firmware update that caused the system to terminate.
   
 * A problem was fixed for a failed PCI oscillator that was not guarded, causing
   repeated errors with SRC B15050A6 and B158E504 logged on each IPL of the
   system.
 * A problem was fixed for a two rotor fan failure to provide adequate cooling
   to the system by adjusting the remaining fans to maximum speed.
 * A problem was fixed to correct the SRC callouts for the fans.  Symbolics
   PWRSPLY and AIRMOVR were added for missing fans for the power supply and for
   the system, respectively.
 * A problem was fixed for a service processor dump with error logs B181E911 and
   B181D172 during an IPL.  The error logs were for the detection of defunct
   processes but otherwise the IPL was successful.
 * A problem was fixed for Digital Power Subsystem Sweep (DPSS) firmware updates
   that caused an error log with SRC B1819906 but otherwise was successful.
   
 * A problem was fixed for missing Keyword (KW) and Resource ID (RID) for SRC
   B181A40F.
 * A problem was fixed for an I2C bus lock error during a CEC power off that
   caused a ten minute delay for the power off and errorlog SRCs B1561314 and
   B1814803 with error number (errno) 3E.
 * A problem was fixed for concurrent firmware updates to a system that needed
   to be re-IPLed after getting a B113E504 SRC during activation of the new
   firmware level on the hypervisor.  The code update activate failed if the
   Sleep Winkle (SLW) images were significantly different between the firmware
   levels.  The SLW contains the state of the processor and cache so it can be
   restored after sleep or power saving operations.
 * A problem was fixed for System Power Control Network (SPCN) failover for an
   I/O module A/C power fault on the PCIe3 I/O expansion drawer (#EMX0).  A
   sideband failure on one I/O module was blocking SPCN commands for the entire
   drawer instead of SPCN failing over to a working I/O module.  The broken SPCN
   communications path prevented concurrent maintenance operations on the
   expansion drawer.
 * A problem was fixed for a possible lack of recovery for an A/C power loss
   condition on the PCIe3  I/O expansion drawer (#EMX0).   If there was an
   outstanding problem on the expansion drawer and an A/C loss occurred while
   the earlier error was still unprocessed, the auto-recovery for the A/C power
   loss would not have happened.
 * A problem was fixed for a missing FRU call out for error SRC B7006A87  when
   unable to read the drawer module logical flash VPD for the PCIe3 I/O
   expansion drawer (#EMX0).
   
 * For a partition that has been migrated with Live Partition Mobility (LPM)
   from FW730 to FW740 or later, a problem was fixed for a Main Storage Dump
   (MSD) IPL failing with SRC B2006008.  The MSD IPL can happen after a system
   failure and is used to collect failure data.  If the partition is rebooted
   anytime after the migration, the problem cannot happen.  The potential for
   the problem existed between the active migration and a partition reboot.
 * A problem was fixed for partial loss of Entitlement for On/Off Memory
   Capacity On Demand (also called Elastic COD).  Users with large amounts of
   Entitlement on the system of greater than "65535 GB * Days" could have had a
   truncation of the Entitlement value on a re-IPL of the system.  To recover
   lost Entitlement, the customer can request another On/Off Enablement Code
   from IBM support to "re-fill" their entitlement.
 * A problem was fixed for a management console command line failure with a
   return code 0x40000147 (invalid lock state) when trying to delete SR-IOV
   shared mode configurations.  This could have occurred if the adapter slot had
   been re-purposed without involvement of the management console and was owned
   and operational at the time of the requested delete.  With the fix, the
   current ownership of the slot is honored and only the SR-IOV shared mode
   configuration data is deleted on the force delete.
 * A problem was fixed for an incorrect restriction on the amount of
   "Unreturned" resources allowed for a Power Enterprise Pool (PEP).  PEP
   allows for logical moving of resources (processors and memory) from one
   server to another.  Part of this is 'borrowing' resources from one server to
   move to another. This may result in "Unreturned" resources on the source
   server. The management console controls how many total "Unreturned" PEP
   resources can exist.  For this problem,  the user had some "Unreturned" PEP
   memory and asked to borrow more but this request was incorrectly refused by
   the hypervisor.
 * A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) error with SRCs
   B7006A82 and B7004137 for a missing FRU location code.  The FRU location code
   for the Active Optical Cable (AOC)  was added to identify the failing drawer
   side.
 * A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0)  failing to IPL
   when the IPL includes a FPGA update for the drawer.  The FPGA update is
   actually good but perceived as a failure when the FPGA resets as part of the
   update.  For the problem, a re-IPL of the system would have fixed the drawer.
   
 * A problem was fixed for Live Partition Mobility (LPM) to prevent a memory
   access error during LPM operations with unpredictable effects.  When data is
   moved by LPM, the underlying firmware code requires that the buffers be 4K
   aligned.  The fixes made now force the buffers to be 4K aligned and if there
   is still an alignment issue, the LPM operation will fail without impacting
   the system.
 * A problem was fixed for an On-Chip Controller (OCC) failure after a system
   dump with SRCs B18B2616 and BC822024 reported.  This resulted in the system
   running with reduced performance in safe mode, where processor clock
   frequencies are lowered to minimum levels to avoid hardware errors since the
   OCC is not available to monitor the system.   A re-IPL of the system would
   have resolved the problem.
 * A performance problem was fixed for systems entering processor hang recovery
   prematurely with SRC B111E504 and PBCENTFIR(9) "PB_CENT_HANG_RECOV".  The
   ability of the L3 cache to prefetch memory was extended to speed the memory
   accesses and prevent a processor hang condition for applications running with
   lower memory affinity.
 * A problem was fixed for a processor error causing a Hostboot terminate
   instead of a deconfiguration of the bad hardware and continuation of the
   IPL.  The state of the processors was synchronized between the service
   processor and the Hostboot process to correct the error.
 * A problem was fixed for a USB Save and Restore of machine configuration to
   not lose the system name.
 * A problem was fixed for Advanced System Management Interface (ASMI) help text
   for menu "I/O Adapter Enlarged Capacity" being missing with the system IPLed
   and partitions running.  The help text is now available for the system in the
   powered on state as well as in the powered off state.
 * A problem was fixed for an intermittent predictive error log B1504805 during
   an IPL of the S814 (8286-41A) and S812L (8247-21L) systems.  An adjustment
   was being incorrectly attempted to a Voltage Regulator Module (VRM) that did
   not exist on the single-socket systems.
 * A problem was fixed for an intermittent power supply error SRC 1100D008 with
   a flood of VPD SRC B1504804 with errno 3Es logged on a re-ipl of the CEC but
   that did not result in an IPL failure.
 * A problem was fixed for a LED intermittently not lighting for an enclosure
   with a fault.
 * A problem was fixed for an intermittent PSI link error with SRC B15CDA27
   after a firmware update or reset/reload of the service processor.
 * A problem was fixed for PCIe3 adapters failing when requesting more than 32
   Message Signaled Interrupts (MSI-X).  The adapter may fail to ping or cause
   OS tasks to hang that are using the adapter.  This problem was found
   specifically on the 10 Gb Ethernet-SR (Short Range) PCIe3 adapter with
   feature codes #5275 and #5769 and on the 56 Gb Infiniband (IB) Fourteen Data
   Rate (FDR) adapter with feature codes #EC32, #EC33, #EL3D, and #EL50 and CCIN
   2CE7.  However, other PCIe adapters may also be affected.
 * A problem was fixed for IBM copyright statements being displayed on the
   System Management Services (SMS) menu after a repair or replacement of system
   hardware.
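
The Live Partition Mobility (LPM) fix in the list above depends on the data
buffers being 4K aligned.  As a minimal sketch of what such a check involves,
the Python below rounds an address up to the next 4K boundary and rejects a
buffer that is still misaligned.  The names align_up_4k, check_lpm_buffer, and
LpmAlignmentError are hypothetical and chosen only for illustration; they are
not firmware interfaces.

```python
PAGE_SIZE = 4096  # 4K alignment boundary named in the fix description


class LpmAlignmentError(Exception):
    """Hypothetical error: the operation fails instead of touching memory."""


def align_up_4k(address: int) -> int:
    """Round an address up to the next 4K boundary (no change if aligned)."""
    return (address + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def check_lpm_buffer(address: int) -> None:
    """Fail the operation cleanly if the buffer is not 4K aligned."""
    if address & (PAGE_SIZE - 1):
        raise LpmAlignmentError(f"buffer at {address:#x} is not 4K aligned")


# Force alignment first, then verify before using the buffer.
buffer_address = align_up_4k(0x1234)
check_lpm_buffer(buffer_address)
```

Checking after the forced alignment mirrors the described behavior: a remaining
alignment issue fails the single operation rather than affecting the system.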
   

System firmware changes that affect certain systems


 * HIPER/Pervasive:  For partitions with a graphics console and USB keyboard, a
   problem was fixed for an OS boot hang at the CA00E100 progress SRC.  For the
   problem, the hang can be avoided by issuing the boot command from the Open
   Firmware (OF) prompt.
 * HIPER/Pervasive:  On systems using PowerVM with shared processor partitions
   that are configured as capped or in a shared processor pool, there was a
   problem found that delayed the dispatching of the virtual processors which
   caused performance to be degraded in some situations.  Partitions with
   dedicated processors are not affected.   The problem is rare and can be
   mitigated, until the service pack is applied, by creating a new shared
   processor AIX or Linux partition and booting it to the SMS prompt; there is
   no need to install an operating system on this partition.  Refer to help
   document http://www.ibm.com/support/docview.wss?uid=nas8N1020863 for
   additional details.
 * DEFERRED:  A problem was fixed for Non-Volatile Memory Express (NVMe)
   adapters, plugged into PCIe3 switches, mis-training to generation 1 instead
   of generation 3.   NVMe adapters attached directly to the PCIe3 slots trained
   correctly to the generation 3 specification. This fix requires a re-IPL of
   the system to correct the training of any mis-trained adapters.
 * On a system with an IBM i partition using Active Memory Sharing (AMS), a
   problem was fixed for internal memory management errors caused by deleting an
   IBM i partition that had been powered off in the middle of a Main Storage
   Dump (MSD).  Until the fix is installed, if an MSD is interrupted for an
   IBM i partition that has AMS, the partition should be powered on and powered
   off normally before a delete of the partition is done to prevent errors with
   unpredictable effects.  This problem does not affect the S822 (8284-22A),
   S812L (8247-21L), S822L (8247-22L), S824L (8247-42L), and E850 (8408-E8E)
   models.

SV830_048_048 / FW830.00

06/08/15 Impact:  New      Severity:  New

New features and functions for MTM 8408-E8E:


GA Level

NOTE:

 * POWER8 (and later) servers include an “update access key” that is checked
   when system firmware updates are applied to the system.  The initial update
   access keys include an expiration date which is tied to the product warranty.
   System firmware updates will not be processed if the calendar date has passed
   the update access key’s expiration date, until the key is replaced.  As these
   update access keys expire, they need to be replaced using either the Hardware
   Management Console (HMC) or the Advanced System Management Interface (ASMI)
   on the service processor.  Update access keys can be obtained via the key
   management website: http://www.ibm.com/servers/eserver/ess/index.wss.
   

 * This system supports only the PowerVM hypervisor.
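
The update access key rule in the note above reduces to a date comparison:
updates are refused once the calendar date has passed the key's expiration
date.  A minimal sketch, assuming the expiration date is available as a plain
calendar date; the function name firmware_update_allowed is hypothetical and
does not correspond to an HMC or ASMI interface.

```python
from datetime import date


def firmware_update_allowed(key_expiration: date, today: date) -> bool:
    """Return True while the update access key has not yet expired.

    Updates are blocked only after the calendar date has passed the
    expiration date, so the expiration day itself is still allowed.
    """
    return today <= key_expiration


# Example with made-up dates: the key expires at the end of June 2018.
expiration = date(2018, 6, 30)
allowed_before = firmware_update_allowed(expiration, date(2018, 6, 30))
allowed_after = firmware_update_allowed(expiration, date(2018, 7, 1))
```

Once the check fails, the key has to be replaced before any further system
firmware update is processed.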
   

New Features and Functions for MTMs 8247-21L, 8247-22L, 8247-42L, 8284-22A,
8286-41A, 8286-42A:


NOTE:


 * POWER8 (and later) servers include an “update access key” that is checked
   when system firmware updates are applied to the system.  The initial update
   access keys include an expiration date which is tied to the product warranty.
   System firmware updates will not be processed if the calendar date has passed
   the update access key’s expiration date, until the key is replaced.  As these
   update access keys expire, they need to be replaced using either the Hardware
   Management Console (HMC) or the Advanced System Management Interface (ASMI)
   on the service processor.  Update access keys can be obtained via the key
   management website: http://www.ibm.com/servers/eserver/ess/index.wss.
 * The 830 release stream only supports the PowerVM hypervisor.  OPAL firmware
   and PowerKVM support will be provided in a later release for MTMs 8247-21L,
   8247-22L, and 8247-42L.
   
 * Support for a PCIe3 I/O expansion drawer (#EMX0).  This 19-inch 4U (4 EIA)
   enclosure provides PCIe Gen3 slots outside of the system unit. It has two
   module bays. One 6-Slot Fanout Module (#EMXF) is placed in each module bay.
   Two 6-slot modules provide a total of 12 PCIe Gen3 slots. Each fanout module
   is connected to a PCIe3 Optical Cable Adapter located in the system unit over
   an active optical cable (AOC) pair.
 * Support for a PCIe3 x16 optical cable adapter with F/C #EJ08 and CCIN 2CE2
   for a PCIe3 expansion drawer (#EMX0).  This adapter provides two optical CXP
   ports for the attachment of two active optical cables (AOC).   One adapter
   supports the attachment of one PCIe3 module in a PCIe Gen3 I/O Expansion
   Drawer.   This cable adapter is supported in the following IBM Power
   Systems:  S814 (8286-41A), S824 (8286-42A) and S824L (8247-42L).
 * Support for a PCIe3 x16 optical cable adapter with F/C #EJ05 and CCIN 2B1C
   for a PCIe3 expansion drawer (#EMX0).  This adapter provides two optical CXP
   ports for the attachment of two active optical cables (AOC).   One adapter
   supports the attachment of one PCIe3 module in a PCIe Gen3 I/O Expansion
   Drawer.    This cable adapter is supported in the following IBM Power
   Systems: S822 (8284-22A), S812L (8247-21L),  and S822L (8247-22L).
 * Support for Single Root I/O Virtualization (SR-IOV) that enables the
   hypervisor to share a SR-IOV-capable PCI-Express adapter across multiple
   partitions. Twelve ethernet adapters are supported with the SR-IOV NIC
   capability, when placed in the P8 system (SR-IOV supported in both native
   mode and through VIOS):
   - PCIe3 4-port 10GbE SR Adapter (F/C EN15 and CCIN 2CE3)
   - PCIe3 4-port 10GbE SR Adapter (F/C EN16 and CCIN 2CE3).  Fits E870/E880
     system node PCIe slot.
   - PCIe3 4-port 10GbE SFP+ Copper Adapter (F/C EN17 and CCIN 2CE4)
   - PCIe3 4-port 10GbE SFP+ Copper Adapter (F/C EN18 and CCIN 2CE4).  Fits
     E870/E880 system node PCIe slot.
   - PCIe2 4-port (10Gb FCoE & 1GbE) SR and RJ45 SFP+ Adapter
     (F/C EN0H and CCIN 2B93)
   - PCIe2 LP 4-port (10Gb FCoE & 1GbE) SR and RJ45 SFP+ Adapter
     (F/C EN0J and CCIN 2B93)
   - PCIe2 LP Linux 4-port (10Gb FCoE & 1GbE) SR and RJ45 SFP+ Adapter
     (F/C EL38 and CCIN 2B93)
   - PCIe2 4-port (10Gb FCoE & 1GbE) LR and RJ45 Adapter
     (F/C EN0M and CCIN 2CC0)
   - PCIe2 LP 4-port (10Gb FCoE & 1GbE) LR and RJ45 Adapter
     (F/C EN0N and CCIN 2CC0)
   - PCIe2 4-port (10Gb FCoE & 1GbE) SFP+Copper and RJ45 Adapter
     (F/C EN0K and CCIN 2CC1)
   - PCIe2 LP 4-port (10Gb FCoE & 1GbE) SFP+Copper and RJ45 Adapter
     (F/C EN0L and CCIN 2CC1)
   - PCIe2 LP Linux 4-port (10Gb FCoE & 1Gb Ethernet) SFP+Copper and RJ45
     (F/C EL3C and CCIN 2CC1)
   These adapters each have four ports, and all four ports are enabled with
   SR-IOV function. The entire adapter (all four ports) is configured for SR-IOV
   or none of the ports is.
   System firmware updates the adapter firmware level on these adapters to
   10.2.252.16 when a supported adapter is placed into SR-IOV mode.
   Support for SR-IOV adapter sharing is now available for adapters in the PCIe3
   I/O Expansion Drawer with F/C #EMX0.
   SR-IOV NIC on the Power P8 systems is supported by:
       - AIX 6.1 TL9 SP4 and APAR IV63331, or later
       - AIX 7.1 TL3 SP4 and APAR IV63332, or later
       - IBM i 7.1 TR8, or later (Supported on S824/S814)
       - IBM i 7.2, or later (Supported on S824/S814)
       - IBM i 7.1 TR9, or later (Supported on E870/E880)
       - IBM i 7.2 TR1, or later (Supported on E870/E880)
       - Red Hat Enterprise Linux 6.5, or later (Supported on
         E870/E880/S812L/S822/S822L/S814/S824/S824L except for adapters with
         F/Cs EN15/EN16/EN17/EN18)
       - Red Hat Enterprise Linux 6.6, or later (Supported on E850 and minimum
         level needed for adapters with F/Cs EN15/EN16/EN17/EN18)
       - Red Hat Enterprise Linux 7.1, or later
       - SUSE Linux Enterprise Server 11 SP1, or later (Supported on
         S812L/S822/S822L/S814/S824/S824L)
       - SUSE Linux Enterprise Server 11 SP3, or later (Supported on E870/E880)
       - SUSE Linux Enterprise Server 12, or later (Supported on E850)
       - Ubuntu 15.04, or later (Supported on
         E850/S812L/S822/S822L/S814/S824/S824L)
       - VIOS 2.2.3.4 with interim fix IV63331, or later
   
 * Support for adjusting voltage regulators input voltage dynamically based on
   regulator slave failures to achieve the optimal voltage for system operation
   for normal and degraded conditions.
 * Support for a 226W 4.323 GHz eight core processor (CCIN 54E5, F/C EPXF) for
   the S822 (8284-22A) and S822L (8247-22L).
 * Support for Little Endian (LE) Linux in PowerVM for the S812L (8247-21L),
   S822L (8247-22L), and S824L (8247-42L) systems.
   

System firmware changes that affect all systems except MTM 8408-E8E:

 * A problem with concurrent PCIe adapter maintenance was fixed that caused
   On-Chip Controller (OCC) resets with SRCs logged of B18B2616 and BC822029,
   forcing the system into safe mode (processor voltage/frequency reduced to a
   "safe" level where thermal monitoring is not required).  Recovery from safe
   mode requires a system re-IPL.
   

System firmware changes that affect certain systems


 * On systems with memory mirroring enabled, a problem was fixed for PowerVM
   over-estimating its memory needs, allowing more memory to be used by the
   partitions.




SV810
For Impact, Severity and other Firmware definitions, Please refer to the below
'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
SV810_159_081 / FW810.50

05/03/16 Impact: Availability    Severity: SPE

New features and functions


 * Support was added for the Stevens6+ option of the internal tray loading
   DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM
   Drive.  The Stevens6+ option is a FRU hardware replacement for the
   Stevens3+.  MTM 7226-1U3 (Oliver)  FC 5757/5762/5763 attaches to IBM Power
   Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+  DVD
   drive is installed on the system without the required firmware support, the
   boot of an AIX partition will fail when the DVD is used as the load source. 
   Also, an IBM i partition cannot consistently boot from the DVD drive using
   D-mode IPL.  An SRC C2004130 may be logged for the load source not found
   error.
   

System firmware changes that affect all systems


 * A problem was fixed for some service processor error logs not getting
   reported to the OS partitions as needed.  The service processor was not
   checking for a successful completion code on the error log message send, so
   it was not doing retries of the send to the OS when that was needed to ensure
   that the OS received the message.
 * A problem was fixed for new service processor error logs not getting created
   if too many old error logs exist.  This problem can occur if a large number
   of small error logs get created and use up all the available inodes
   (directory entries) for the file system.  The error log garbage collector was
   not checking the available number of inodes correctly, so it was not always
   deleting old error logs before attempting to create a new error log.  
   Without the fix,  this problem will continue until some error logs are
   purged.
 * A problem was fixed for a system IPL hang at C100C1B0 with SRC 1100D001 when
   the power supplies have failed to supply the necessary 12-volt output for the
   system.   The 1100D001 SRC was calling out the planar when it should have
   called out the power supplies.  With the fix, the system will terminate as
   needed and call out the power supply for replacement.  One mode of power
   supply failure that could trigger the hang is sync-FET failures that disrupt
   the 12-volt output.
 * A problem was fixed for recovery from PNOR flash memory corruption that
   causes the IPL to fail with SRC D143900C.  This is very rare and only has
   happened in IBM internal labs.  Without the fix, the service processor cannot
   correct the corruption in the PNOR.  If a system has the problem SRC and
   cannot IPL,  then that system must be disruptively firmware updated to apply
   the fix to be able to IPL again.
 * A problem was fixed for processor core checkstops that cause an LPAR outage
   but do not create hardware errors and service events.  The processor core is
   deconfigured correctly for the error.  This can happen if the hypervisor
   forces processor checkstops in response to excessive processor recovery.
 * A problem was fixed for a false unrecoverable error (UE) logged for B1822713
   when an invalid cooling zone is found during the adjustment of the system fan
   speeds.  This error can be ignored as it does not represent a problem with
   the fans.
 * A problem was fixed for a system checkstop caused by a L2 cache
   least-recently used (LRU) error that should have been a recoverable error for
   the processor and the cache.  The cache error should not have caused a L2 HW
   CTL error checkstop.
 * A security problem was fixed in OpenSSL for a possible service processor
   reset on a null pointer de-reference during RSA PSS signature verification.
   The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
 * A security problem was fixed in the lighttpd server on the service processor,
   where a remote attacker, while attempting authentication, could insert
   strings into the lighttpd server log file.  Under normal operations on the
   service processor, this does not impact anything because the log is disabled
   by default.  The Common Vulnerabilities and Exposures issue number is
   CVE-2015-3200.
 * A problem was fixed for not being able to collect a processor core unit dump
   if there is any error in the sleep winkle image (SLW) that contains the state
   of the processor and the cache.  If there is any SLW error, the affected core
   will not have a dump taken.
 * A problem was fixed for a Qualys network scan for security vulnerabilities
   causing a core dump in the Intelligent Platform Management Interface (IPMI) 
   process on the service processor with SRC B181EF88.  The error occurs anytime
   the Qualys scan is run because it sends an invalid IPMI session id that
   should have been handled and discarded without a core dump.
 * A problem was fixed for a L2 cache error on the service processor that caused
   the service processor to reset or go to a failed state with SRC B1817212 on
   systems with a single service processor.  On systems with redundant service
   processors, the failed service processor would get guarded with a B151E6D0 or
   B152E6D0 SRC depending on which service processor fails.  With the fix, the
   L2 cache error is handled as a corrected single-bit error with no error to
   the service processor, so it can continue normal processing.  The L2 cache
   data error that causes this failure is infrequent, and the service processor
   fails only when its limit of three resets in fifteen minutes is exceeded, so
   the service processor failure rate for this problem is low.
 * A problem was fixed for the service processor going to the reset state
   instead of the termination state when the anchor card is missing or broken. 
   At the termination state, the Advanced System Management Interface (ASMI)
   can be used to collect failure data and debug the problem with the anchor
   card.
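
The L2 cache item in the list above mentions a service processor limit of three
resets in fifteen minutes.  A limit of that kind can be tracked with a sliding
time window, sketched below.  This is an illustration under stated assumptions
and not the service processor's actual logic: the class name ResetLimiter, its
parameters, and the use of timestamps in seconds are all hypothetical.

```python
from collections import deque


class ResetLimiter:
    """Track resets in a sliding window and report when the limit is exceeded."""

    def __init__(self, max_resets: int = 3, window_seconds: float = 15 * 60):
        self.max_resets = max_resets
        self.window_seconds = window_seconds
        self._resets = deque()  # timestamps of recent resets, oldest first

    def record_reset(self, now: float) -> bool:
        """Record a reset at time `now`; return True if the limit is exceeded."""
        self._resets.append(now)
        # Drop resets that have aged out of the window.
        while now - self._resets[0] > self.window_seconds:
            self._resets.popleft()
        return len(self._resets) > self.max_resets


# Four resets one minute apart: the fourth one exceeds the limit of three.
limiter = ResetLimiter()
results = [limiter.record_reset(t) for t in (0, 60, 120, 180)]
```

With resets spaced ten minutes apart, the window never holds more than two
entries, so the limit is never exceeded.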
   

System firmware changes that affect certain systems


 * On systems using PowerVM firmware with an AIX partition and a Linux
   partition, a problem was fixed for dynamically moving an adapter that uses
   DMA from the Linux partition to the AIX partition that caused the AIX to fail
   by going into KDB mode (0c20 crash).  The management console showed the
   following message for the partition operation:  "Dynamic move of I/O
   resources failed.  The I/O slot dynamic partitioning operation failed."  The
   error was caused by Linux using 64K mappings for the DMA window and AIX using
   4K mappings for the DMA window, causing incorrect calculations on the AIX
   when it received the adapter.  Until the fix is applied, the adapters that
   use DMA should only be moved from Linux to AIX when the partitions are
   powered off.  This problem does not pertain to Power System S812L (8247-21L),
   S822L (8247-22L), and S824L (8247-42L) models.
 * On systems using PowerVM firmware with AIX or Linux encapsulated state
   partitions, a problem was fixed for a Live Partition Mobility migration
   failure for the encapsulated state partitions.  The migration fails on the
   target CEC when the associated paging space needed to support the
   encapsulated state is not available.  Removing the "Encapsulated State"
   attribute from the partition would allow the migration to succeed.  However,
   removing this attribute can only be accomplished if the partition is in the
   powered off state.  Encapsulated State partitions are needed for the remote
   restart feature.  An encapsulated state partition is a partition in which the
   configuration information and the persistent data are stored external to the
   server on persistent storage.  A partition that supports remote restart can
   be restarted remotely.  For more information on the remote restart feature,
   refer to this IBM Knowledge Center link:
   http://www.ibm.com/support/knowledgecenter/P8DEA/p8efd/p8efd_lpar_general_props.htm
   
 * On systems using PowerVM firmware, a problem was fixed to correct the error
   messages for early failures in the Live Partition Mobility (LPM) migration of
   a partition.  The management console might report an unrelated error such as
   "HSCLA27E The operation to lock the physical device location for target
   adapter" when the actual error might be not enough available memory on the
   target CEC to run the migration.  With the fix, the correct error code is
   returned so there is enough information to correct the error and retry the
   migration.
 * On systems using PowerVM firmware with Integrated Virtualization Manager
   (IVM) managed partitions with more than 64 active partitions, a problem was
   fixed for recovery from Live Partition Mobility (LPM) errors.  Without the
   fix, the IVM managed system partition can appear to still be running LPM
   after LPM has aborted, preventing retries of the LPM operation.  In this
   case, the partition must be stopped and restarted to clear the LPM error
   state.  The problem is not frequent because it requires a failed LPM on a
   partition with a partition ID that is greater than 64.
 * On systems using PowerVM firmware, a problem was fixed for PCIe adapter hangs
   and network traffic error recovery during Live Partition Mobility (LPM) and
   SR-IOV vNIC (virtual ethernet adapter)  operations.  An error in the PCI Host
   Bridge (PHB) hardware can persist in the L3 cache and fail all subsequent
   network traffic through the PHB.  The PHB error recovery was enhanced to
   flush the PHB L3 cache to allow network traffic to resume.
 * On systems using PowerVM firmware, a problem was fixed for PCIe switch
   recovery to prevent a partition switch failure during the IPL with error logs
   for SRC B7006A22 and B7006971 reported.  This problem can occur when doing
   recovery for an informational error on the switch.  If this problem occurs,
   the partition must be restarted to recover the affected I/O adapters.
   
 * On systems using PowerVM firmware with an invalid P-side or T-side in the
   firmware, a problem was fixed in the partition firmware Run-Time Abstraction
   Services (RTAS) so that system Vital Product Data (VPD) is returned at least
   from the valid side instead of returning no VPD data.   This allows AIX host
   commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work
   to some extent even if there is one bad code side.  Without the fix,  all the
   VPD data is blocked from the OS until the invalid code side is recovered by
   either rejecting the firmware update or attempting to update the system
   firmware again.
 * On systems using PowerVM firmware without an HMC (and in Manufacturing Default
   Configuration (MDC) mode with a single host partition), a problem was fixed
   for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were
   not off-loaded to the host OS.  This is an infrequent error caused by a
   timing error that causes the dump notification signal to the host OS to be
   lost.  The missing/pending dumps can be retrieved by rebooting the host OS
   partition.  The rebooted host OS will receive new notifications of the dumps
   that have to be off-loaded.
 * On systems using PowerVM firmware, a problem was fixed for error recovery
   from failed Live Partition Mobility (LPM) migrations.  The recovery error is
   caused by a partition reset that leaves the partition in an unclean state
   with the following consequences:  1) A retry on the migration for the failed
   source partition may not be allowed; and 2) With enough failed migration
   recovery errors, it is possible that any new migration attempts for any
   partition will be denied.  This error condition can be cleared by a re-IPL of
   the system. The partition recovery error after a failed migration is much
   more likely to occur for partitions managed by the Integrated Virtualization
   Manager (IVM) but it is still possible to occur for Hardware Management
   Console (HMC) managed partitions.
 * On systems using OPAL firmware, a problem was fixed for the OPAL hypervisor
   not releasing the PSI link after a power off of the CEC.  With the PSI link
   unavailable, the service processor has to forcibly reclaim it on the next
   IPL, causing erroneous SRCs and error logs for the PSI link when no problem
   exists.
 * On systems using OPAL firmware, a performance problem was fixed in the OPAL
   hypervisor PCI Host Bridge (PHB) to prevent the PHB L3 cache from retrying
   defunct entries in the L3 after an MSI End of Interrupt (EOI) has been
   received.  After the P/Q bits in the priority queue are updated, a DCBF
   (Data Cache Block Flush) is now sent to force a flush of the PHB cache
   line.  This improves interrupt performance by reducing latency per
   interrupt.  The improvement will vary by workload.
 * On systems using PowerVM firmware with dedicated processor partitions,  a
   problem was fixed for the dedicated processor partition becoming
   intermittently unresponsive. The problem can be circumvented by changing the
   partition to use shared processors.
 * On systems using OPAL firmware, a problem was fixed that prevented multiple
   NVIDIA Tesla K80 GPUs from being attached to one PCIe adapter.  This
   prevented using a PCIe attached GPU drawer.  This fix increases the PCIe MMIO
   (memory-mapped I/O) space to 1 TB from a previous maximum of 64 GB per
   PHB/PCIe slot.

SV810_146_081 / FW810.40

11/10/15 Impact: Availability    Severity: HIPER

New features and functions


 * The firmware code update process was enhanced with a feature to block a
   firmware "downgrade" to a level that is below the system's manufactured code
   level.
 * Support was added to the Advanced System Management Interface (ASMI) to be
   able to add an IPv4 static route definition for each ethernet interface on the
   service processor.  Using a static route definition,  a Hardware Management
   Console (HMC) configured on a private subnet that is different from the
   service processor subnet is now able to connect to the service processor and
   manage the CEC.  A static route persists until it is deleted or until the
   service processor settings are restored to manufacturing defaults.  The
   static route is managed with the ASMI panel "Network Services/Network
   Configuration/Static Route Configuration" IPv4 radio button.  The "Add"
   button is used to add a static route (only one is allowed for each ethernet
   interface) and the "Delete" button is used to delete the static route.
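
The static route feature above applies when the HMC sits on a different subnet
from the service processor.  As a rough illustration only, the sketch below
checks that condition with Python's standard ipaddress module.  The function
name and all addresses are hypothetical and are not taken from ASMI; the code
does not configure anything on the service processor.

```python
import ipaddress


def needs_static_route(service_processor_if, hmc_address):
    """Return True if the HMC address is outside the service processor subnet.

    service_processor_if: interface address with prefix, e.g. "192.168.2.147/24"
    hmc_address:          plain IPv4 address of the management console
    Both values are illustrative assumptions, not real defaults.
    """
    sp_network = ipaddress.ip_interface(service_processor_if).network
    return ipaddress.ip_address(hmc_address) not in sp_network


# Hypothetical example addresses.
print(needs_static_route("192.168.2.147/24", "192.168.3.10"))  # True
print(needs_static_route("192.168.2.147/24", "192.168.2.10"))  # False
```

When the check returns True, the HMC is on a different subnet, which is the
case the static route definition is meant to handle.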
   

System firmware changes that affect all systems


 * HIPER/Pervasive:  A problem was fixed for missing the interrupts for
   processor local bus (PLB) time-outs.  This problem could hang the service
   processor or cause it to panic with a reset/reload of the service processor. 
   There is a possibility the reset of the service processor could take it to a
   failed state with SRC B1817212.
   
 * HIPER/Pervasive:  A problem was fixed for an ethernet adapter hanging on the
   service processor.  This hang prevents TCP/IP network traffic from the
   management console and the Advanced System Management Interface (ASMI)
   browsers.  It makes it appear as if the service processor is unresponsive and
   can be confused with a service processor in the stopped state.  An A/C power
   cycle would recover a hung ethernet adapter.
 * HIPER/Pervasive:  A problem was fixed for the system reset to clear the boot
   registers to prevent the reset from being mis-handled as a chip reset.  If a
   "system" reset is processed as a "chip" reset, the boot of the service
   processor can go inadvertently to a stopped state and be unresponsive. 
   Pin-hole resets from the operations panel could also fail to the service
   processor stopped state.
 * HIPER/Pervasive:  A problem was fixed so a corrupted file system partition
   table can be recovered and not have the service processor lose the ability to
   do P and T-side switches.  In error recovery situations, the loss of the
   side-switch option could present itself as an unresponsive service processor
   if it was needed to prevent a failure to the service processor stopped state.
 * HIPER/Pervasive: A problem was fixed for a dump partition full condition that
   caused the service processor to go to a failed state with SRC B1817212.
   
 * HIPER/Pervasive:  A problem was fixed for recovering from embedded
   MultiMediaCard (eMMC) flash NAND errors that caused the service processor to
   go to a failed state with SRC B1817212.
   
 * DEFERRED:  A problem was fixed for memory on-die termination (ODT) settings
   to improve the signal integrity of the memory channel.
 * DEFERRED:  A problem was fixed for a hang in the processor and cache memory
   that causes a system checkstop with SRC B181E540 logged with a processor FRU
   callout.  The error log details include "Description:  Runtime diagnostics
   has detected a problem on a memory bus" and "Signature Description: 
   mcs(n0p0c6) (MCIFIR[40]) CHANNEL TIMEOUT ERROR" and "Multi-Signature List: 
   ex(n0p0c14) (L3FIR[24]) L3 Hw Control Error".  The trigger for the hang error
   is speculative DMA partial writes into cache and the frequency of the error
   varies with the workload, but may happen several times a month.  A re-IPL of
   the system is needed for this fix to take effect after a concurrent firmware
   update of the service pack.
   
 * A performance problem was fixed for systems entering processor hang recovery
   prematurely with SRC B111E504 and PBCENTFIR(9) "PB_CENT_HANG_RECOV".  The
   ability of the L3 cache to prefetch memory was extended to speed the memory
   accesses and prevent a processor hang condition for applications running with
   lower memory affinity.
 * A problem was fixed for a SRC 11002613 logged during a concurrent repair of a
   power supply.  This SRC was erroneously logged and did not represent a real
   problem.
 * A problem was fixed for an Advanced System Management Interface (ASMI) error
   message of "Error in function 'connect', error code 111" when a browser
   attempted to connect before the service processor was ready.  The browser
   connection through the web server is now held off until the ASMI process is
   ready after a reset of the service processor or an AC power cycle of the
   system.
 * A problem was fixed for an incorrect call home for SRC B1818A0F.  There was
   no real problem so this call home should have been ignored.
 * A problem was fixed for rare database corruption that caused the service
   processor to go to a stopped state.  The fix recovers the database from a
   backup copy to restore it.  For a system without the fix that has failed in
   the service processor stopped state, any active partitions will continue to run
   but they will not be able to be managed by the management console.  The
   partitions can be allowed to run until the next scheduled service window at
   which time the service processor can be recovered with an AC power cycle or a
   pin-hole reset from the operator panel.  The database corruption can be
   caused by firmware updates or other causes, and it is detected during a
   reset/reload of the service processor.
 * A problem was fixed for an infrequent service processor database corruption
   during concurrent firmware update that caused the system to terminate.
 * A problem was fixed for an intermittent PSI link error with SRC B15CDA27
   after a firmware update or reset/reload of the service processor.
 * A problem was fixed to correct the SRC callouts for the fans.  Symbolics
   PWRSPLY and AIRMOVR were added for missing fans for the power supply and for
   the system, respectively.
 * A problem was fixed for a service processor dump with error logs B181E911 and
   B181D172 during an IPL.  The error logs were for the detection of defunct
   processes but otherwise the IPL was successful.
 * A problem was fixed for Digital Power Subsystem Sweep (DPSS) firmware updates
   that caused an error log with SRC B1819906 but otherwise was successful.
   
 * A problem was fixed for missing Keyword (KW) and Resource ID (RID) for SRC
   B181A40F.
 * A security problem was fixed in OpenSSL where a specially crafted X.509
   certificate could cause the service processor to reset in a
   denial-of-service (DoS) attack.  The Common Vulnerabilities and Exposures
   issue number is CVE-2015-1789.
   
 * A problem was fixed for memory not being guarded that had failed
   initialization.  The guarding prevents the bad memory from being used by the
   partitions and also generates a call home so the memory FRUs can be scheduled
   for repair.
 * A problem was fixed for false errors reported with SRC B1812663 for the
   On-Chip Controller (OCC).  These error logs can be ignored as these are
   caused by a prior error log using a buffer that is not properly sized for the
   log data.
 * A problem was fixed for the location code of the TOD battery for bad battery
   SRC B15A3305.  It was calling out the backplane with location code P1.  This
   has been corrected to location code P1-E1.
 * A problem was fixed for a processor error causing a Hostboot terminate
   instead of a deconfiguration of the bad hardware and continuation of the
   IPL.  The state of the processors was synchronized between the service
   processor and the Hostboot process to correct the error.
 * A problem was fixed for specific hardware access failures that did not call
   out the hardware FRU during the IPL, forcing an extra IPL to guard the
   hardware and recover the system.  With the fix, the system is able to
   reconfigure and guard the failed hardware during the course of a single IPL.
 * A problem was fixed for Advanced System Management Interface (ASMI) help text
   for menu "I/O Adapter Enlarged Capacity" being missing with the system IPLed
   and partitions running.  The help text is now available for the system in the
   powered on state as well as in the powered off state.
 * A problem was fixed for an Advanced System Management Interface (ASMI) error
   that occurred when trying to display detail on a deconfigured Anchor Card
   VPD.  If the error log for the selected deconfiguration record had been
   deleted, it caused ASMI to core dump.  With the fix, if the error log for
   the deconfiguration record is missing, the error log details such as the
   failing SRC for the deconfiguration record are returned as blank.
 * A problem was fixed for Advanced System Management Interface (ASMI) TTY to
   allow "admin" passwords to be greater than eight characters in length to be
   consistent with prior generations of the product.  The ASMI web interface
   works correctly for user "admin" passwords with no truncation in the length
   of the passwords.
 * A problem was fixed that was corrupting the Update Access Key (UAK) date,
   setting it to "1900".  The user should correct the UAK date, if
   needed, to allow the firmware update to proceed, by using the original UAK
   key for the system.  On the Management Console, enter the original update
   access key via the "Enter COD Code" panel.  Or, on the Advanced System
   Management Interface (ASMI), enter the original update access key via the
   "On Demand Utilities/COD Activation" panel.
 * A problem was fixed for an Operations Panel SRC of B1504804 with no FRU
   callout.  A callout of the failed hardware has been added.
 * A problem was fixed for PCIe3 adapters failing when requesting more than 32
   Message Signaled Interrupts (MSI-X).  The adapter may fail to ping or cause
   OS tasks to hang that are using the adapter.  This problem was found
   specifically on the 10 Gb Ethernet-SR (Short Range) PCIe3 adapter with
   feature codes #5275 and #5769 and on the 56 Gb Infiniband (IB) Fourteen Data
   Rate (FDR) adapter with feature codes #EC32, #EC33, #EL3D, and #EL50 and CCIN
   2CE7.  However, other PCIe adapters may also be affected.
   Urgent notice:  For AIX 7.1 and AIX 6.1 partitions with the PCIe 2-port Async
   EIA-232 Adapter (feature code EN27/EN28, CCIN 57D4) installed, the ports on
   the adapter may become un-usable with the installation of this fix due to an
   issue with how interrupts are handled.  Many JAS_RTS error log entries are
   written to the AIX error log due to this issue.
   Instructions:  For IBM Power System S822 (8284-22A), S814 (8286-41A), and
   S824 (8286-42A) servers with the PCIe 2-port Async EIA-232 Adapter installed
   on AIX partitions, the AIX ifix for APAR IV77596 resolving the issue must be
   installed before updating to the SV810_146 (FW810.40) or later level of
   firmware.
 * A problem was fixed to prevent recoverable power faults of short duration
   from causing the system to lose power supply redundancy.  Without the fix,
   the faulted state persisted for the recovered power fault, causing a problem
   with a system power off if other power supplies were lost at a later time.
 * A problem was fixed to guard a failed processor core to allow the system to
   IPL.  The processor core chiplet FRU was failing to be called out and guarded
   on a RC_PMPROC_CHKSLW_ADDRESS_MISMATCH error and this prevented the system
   from being able to IPL.
 * A problem was fixed to guard a failed processor during an IPL instead of
   hanging with SRC B1813450 reported to the error log.
 * A problem was fixed for recovery from a processor local bus (PLB) hang on the
   service processor.  The errant PLB hang recovery would be seen in concurrent
   firmware updates that, on rare occasions, fail to do a side switch to
   activate to the new level of firmware.  On the management console, the error
   message would be "HSCF010180E Operation failed ... E302F873 is the error
   code."  Other than the failed code level activation, the firmware update is
   successful.  If this problem occurs, the system can be set to the new
   firmware level by doing a power off from the management console and then
   doing a power on with side switch selected in the advanced properties.
   

System firmware changes that affect certain systems


 * HIPER/Non-Pervasive:  On systems using OPAL firmware, a problem was fixed for
   corruption to a small amount of memory that may occur when the "Service
   Indicators" LEDs feature is used in the Advanced System Management Interface
   (ASMI) on the service processor.  This could result in a system crash or
   possible undetected data corruption.  This issue was discovered by IBM during
   testing and IBM has no knowledge of this being observed in the field.  Until
   this fix can be applied, the problem can be prevented by the following steps
   for a non-persistent Linux patch (must be done again on each reboot of the
   Linux):
   Since the corruption is always in a fixed location in memory, use a Linux
   kernel feature to mark the affected location as one that should not be used. 
   Run the following command as root in the Linux host (e.g. PowerKVM itself,
   *not* inside a guest):
       echo 0x03197000 > /sys/devices/system/memory/soft_offline_page
    If the "echo" command succeeds, you will not be affected by this corruption
   bug for the remainder of this boot.
    If the "echo" command is unsuccessful, an error message will be logged to
   the kernel log.  In this case, it is strongly advised to not use the Service
   Indicators feature NOR access any of the Service Indicators functions from
   ASMI on the service processor, including these ASMI panel options:  System
   Information Indicator, Enclosure Indicators, Indicators by Location code, and
   Lamp Test.
 * HIPER/Pervasive: A problem associated with workloads using transactional
   memory on PowerVM was discovered and is fixed in this service pack. The
   effect of the problem is non-deterministic but may include undetected
   corruption of data.
   
 * On systems using OPAL firmware,  a problem was fixed for system checkstops
   caused by conflicting co-processor requests on the Power Bus from NX and
   coherent accelerator processor proxy (CAPP) units using the same request
   identifier.
 * On systems using PowerVM firmware,  a problem was fixed for concurrent
   firmware updates to a system that needed to be re-IPLed after getting a
   B113E504 SRC during activation of the new firmware level on the hypervisor. 
   The code update activate failed if the Sleep Winkle (SLW) images were
   significantly different between the firmware levels.  The SLW contains the
   state of the processor and cache so it can be restored after sleep or power
   saving operations.
 * On systems using PowerVM firmware, a problem was fixed for an unexpected
   interrupt from a PCIe adapter that causes the AIX OS to abend.  The extra
   interrupt comes in from the adapter before it has been enabled for
   interrupts, after it has reached End of Interrupt (EOI) for its previous
   session.  The double interrupt from the adapter has been corrected.
 * On systems using PowerVM firmware and for a partition that has been migrated
   with Live Partition Mobility (LPM) from FW730 to FW740 or later, a problem
   was fixed for a Main Storage Dump (MSD) IPL failing with SRC B2006008.  The
   MSD IPL can happen after a system failure and is used to collect failure
   data.  If the partition is rebooted anytime after the migration, the problem
   cannot happen.  The potential for the problem existed between the active
   migration and a partition reboot.
 * On a system with an IBM i partition using Active Memory Sharing (AMS), a
   problem was fixed for internal memory management errors caused by deleting an
   IBM i partition that had been powered off in the middle of a Main Storage
   Dump (MSD).  Until the fix is installed, if an MSD is interrupted for an IBM i
   partition that has AMS, the partition should be powered on and powered off
   normally before a delete of the partition is done to prevent errors with
   unpredictable effects.  This problem only affects the S814 (8286-41A) and
   S824 (8286-42A) models.
 * On systems with PowerVM firmware, a problem was fixed for Live Partition
   Mobility (LPM) to prevent a memory access error during LPM operations with
   unpredictable effects.  When data is moved by LPM, the underlying firmware
   code requires that the buffers be 4K aligned.  The fixes made now force the
   buffers to be 4K aligned and if there is still an alignment issue, the LPM
   operation will fail without impacting the system.
 * On systems using OPAL firmware, a problem was fixed to reduce power
   consumption for the Nvidia Compute Intensive Accelerator (PCIe attached 300W
   GPU) with F/C #EC4B to optimize system performance.  This problem only
   pertains to the IBM Power System S824L (8247-42L).
 * On systems using OPAL firmware, a problem was fixed for kexec errors when
   having PCIe adapters with DMA capability configured in the system.
   
 * On systems using OPAL firmware, a problem was fixed for a PCIe adapter hang
   on a messaged interrupt error in the system boot.
 * On systems using PowerVM firmware where memory relocation (as done by using
   Live Partition Mobility (LPM)) and a partition reboot are occurring
   simultaneously, a problem for a system termination was fixed.  The potential
   for the problem existed between the active migration and the partition
   reboot.
 * On systems using PowerVM firmware, a problem was fixed for recovery from
   unaligned addresses for MSI interrupts from PCIe adapters.  The recovery
   prevents an adapter timeout caused by resource exhaustion.  With the fix, the
   resources for each bad interrupt are returned, allowing the PCIe adapter to
   continue to run for the normal traffic.
 * On systems using PowerVM firmware, a problem was fixed for a machine check
   incorrectly issued to an IBM i partition running 7.2 or later with 4K sector
   disks.  This problem only pertains to the IBM Power System S814 (8286-41A)
   and S824 (8286-42A) models.
 * On systems using PowerVM firmware, a problem was fixed for a Network
   boot/install failure using bootp in a network with switches using the
   Spanning Tree Protocol (STP).  A Network boot/install using lpar_netboot on
   the management console was enhanced to allow the number of retries to be
   increased.  If the user is not using lpar_netboot, the number of bootp
   retries can be increased using the SMS menus.  If the SMS menus are not an
   option, the STP in the switch can be set up to allow packets to pass through
   while the switch is learning the network configuration.
 * On systems using OPAL firmware, a problem was fixed for PCIe3 FPGA adapters
   going missing after a hot reset (power on to the adapter).
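
For the OPAL "Service Indicators" memory corruption item earlier in this list,
the documented workaround is the "echo" command run as root on the Linux host.
The following is only a minimal sketch of the same write done from Python, with
a success/failure result that mirrors the two outcomes described in that item.
The path and address come from the workaround text; the function name and the
"path" parameter are hypothetical additions so the logic can be tried against
an ordinary file.

```python
# Path and address are taken from the workaround text in this document.
SOFT_OFFLINE_PATH = "/sys/devices/system/memory/soft_offline_page"
AFFECTED_ADDRESS = 0x03197000


def soft_offline_page(address=AFFECTED_ADDRESS, path=SOFT_OFFLINE_PATH):
    """Write the page address to the soft-offline control file.

    Returns True if the write succeeded and False if it raised OSError.
    The path parameter is a hypothetical hook for trying the function
    against a regular file instead of the real sysfs entry.
    """
    try:
        with open(path, "w") as control:
            # Formats 0x03197000 as the string "0x03197000".
            control.write(f"{address:#010x}\n")
    except OSError:
        return False
    return True
```

A False result corresponds to the unsuccessful "echo" case, where the advice
in that item is to avoid the Service Indicators functions in ASMI.  As with the
"echo" command, the real write must be run as root on the host (not inside a
guest) and repeated after each reboot.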

SV810_133_081 / FW810.33

08/14/15 Impact: Performance    Severity: HIPER



System firmware changes that affect certain systems


 * HIPER/Pervasive:  On systems using PowerVM with shared processor partitions
   that are configured as capped or in a shared processor pool, there was a
   problem found that delayed the dispatching of the virtual processors which
   caused performance to be degraded in some situations.  Partitions with
   dedicated processors are not affected.  The problem is rare and can be
   mitigated, until the service pack is applied, by creating a new shared
   processor AIX or Linux partition and booting it to the SMS prompt; there is
   no need to install an operating system on this partition.  Refer to help
   document http://www.ibm.com/support/docview.wss?uid=nas8N1020863 for
   additional details.

SV810_126_081 / FW810.31

07/08/15 Impact: Usability       Severity: ATT

System firmware changes that affect all systems


 * A problem was fixed for an In-band firmware update exhibiting a 45-minute
   delay (using the i5 PTF process or the update_flash utility for AIX or Linux)
   from FW810 firmware to FW830.  During the delay of the new code level
   activation, SRC D133C002 is displayed on the operations panel.  This delay
   occurred because an updated level for the Digital Power Subsystem Sweep (DPSS)
   chip was needed but the power off to do the DPSS update hung until a time-out
   allowed the power off operation to complete.  The firmware update to FW830.00
   was successful after the 45-minute delay.  The firmware updates done with the
   Hardware Management Console (HMC) will not experience this 45-minute delay.

SV810_124_081 / FW810.30

05/29/15 Impact: Availability    Severity: SPE

New features and functions


 * Support was added for setting Power Management Tuning Parameters from the
   Management Console (Fixed Maximum Frequency (FMF), Idle Power Save, and DPS
   Tunables) without needing to use the Advanced System Management Interface
   (ASMI) on the service processor.  This allows FMF mode to be set by default
   without having to modify any tunable parameters using ASMI.
 * Support was added for a new menu for the Advanced System Management Interface
   (ASMI) that is used to reset/reload the service processor.  A reset/reload or
   "soft reset" maintains the state of the hypervisor and the operating systems
   running in the partitions while rebooting the service processor so it can
   recover from service processor errors.  The menu that does this function is
   called "System Service Aids/Soft Reset Service Processor."
 * Support was added to the Advanced System Management Interface (ASMI) to
   display Anchor card VPD failures in the "Deconfigurations records" menu.
 * Support was added for the Nvidia Compute Intensive Accelerator (PCIe
   attached 300W GPU) with F/C #EC4B.  This feature is only supported on the
   IBM Power System S824L (8247-42L).  It is a PCIe 3 X16/Long/Full
   High/Double wide adapter with the PCIe connection in the left slot, and it
   overlaps another PCIe slot.  This feature ships with an auxiliary power
   cord used inside the system to support the 300W card.
   

System firmware changes that affect all systems


 * A problem was fixed for systems with a corrupted date of "1900" showing for
   the Update Access Key (UAK).  The firmware update is allowed to proceed on
   systems with a bad UAK date because the fix is in an emergency service pack. 
   After the fix is installed, the user should correct the UAK date, if needed,
   by using the original UAK key for the system.  On the Management Console, 
   enter the original update access key via the "Enter COD Code" panel. Or on
   the Advanced System Management Interface (ASMI), enter the original update
   access key via the "On Demand Utilities/COD Activation" panel.
 * A problem was fixed for the iptables process consuming all available memory,
   causing an out of memory dump and reset/reload of the service processor.
 * A problem was fixed for a CEC IPL hang failure with CEC Hardware Subsystem
   SRC UE B150BE14 when having persistent L2/L3 cache memory errors.  The IPL
   was stuck in a loop with progress codes C1C3C200 through C1C3C213 and having
   repeating error log informational SRCs of HostBoot BC8A1402 and Processor
   Unit (CPU) BC13E504.  With the fix, the failing core chiplet is guarded out
   and the IPL is able to complete.
 * A problem was fixed for the NEBS DC power supply showing up in the part
   inventories for the CEC as "IBM AC PS".  The description string has been
   changed to "IBM PS" as power supplies can be of DC or AC type.
 * A problem was fixed for missing hardware callouts in Vital Product Data (VPD)
   error logs.
 * A problem was fixed for the callouts for a checkstop with SRC B111E504 with
   PBCENTFIR[5] of PB_CENT_CRESP_ADDR_ERROR so that FSPSP16 is added as the high
   priority callout.  This checkstop is most likely caused by software error,
   not hardware.
 * A problem was fixed for SRC B1104800 having duplicate FRU call outs for the
   PNOR flash FRU.
 * A problem was fixed in the hardware server to prevent a UE B181BA07 abort
   when a host boot dump collection is in progress.
 * A problem was fixed for the unnecessary guarding of DIMMs for a memory bus
   error for SRC Memory Card/FRU B124E504.  The error recovery has been improved
   so that DIMMs are not guarded and the failing memory bus lane is replaced by
   the spare memory bus data lane.
 * A problem was fixed for a processor core unit being deconfigured but not
   guarded for a SRC B113E504 processor error in host boot with fault isolation
   register (FIR) code "RC_PMPROC_CHKSLW_NOT_IN_ETR" that caused the CEC to go
   to termination.  By guarding the failed processor core, the fix ensures the
   core is not used on the reIPL of the CEC.
 * A problem was fixed for the On Chip Controller (OCC) taking the system into
   safe mode under certain workloads by increasing the time allowed for getting
   an update of the Analog Power Subsystem Sweep (APSS) data for current
   temperatures and power consumption.  If the OCC does not get data from the
   APSS within its time-out period,  the OCC will go to safe mode and run the
   processor at a minimum frequency.
 * A problem was fixed for intermittent firmware database errors that logged a
   UE SRC of B1818611 and had a fwdbServer core dump.
 * A problem was fixed for an intermittent reset/reload of the service processor
   during the early part of an IPL with SRC B1814616 logged.
 * A problem was fixed that prevented a second management console from being
   added to the CEC.  In some cases, network outages caused defunct management
   console connection entries to remain in the service processor connection
   table,  making connection slots unavailable for new management consoles.  A
   reset of the service processor could be used to remove the defunct entries.
 * A problem was fixed for a false guarding and call out of a PSI link with SRC
   B15CDA27.  This failure is very infrequent but sometimes seen after the
   reset/reload of the service processor during a concurrent firmware update.  
   Since there is no actual hardware failure, a manual unguarding of the PSI
   link allows it to be reused.
 * A problem was fixed for performance dumps to speed its processing so it is
   able to handle partitions with a large number of processors configured. 
   Previously, for large systems, the performance dump took too long in
   collecting performance data to be useful in the debugging of some performance
   problems.
 * A problem was fixed for a CEC power off error with SRC B1818903 logged.  The
   error causes a dump and reset of the service processor that allows the power
   off operation to complete.
 * A problem was fixed so that a firmware update is able to do a code update
   downgrade from an SV830 release to SV810.  Without the fix, the downgrade
   causes the service processor to go to a stopped state, and a user power
   cycle is needed to recover to the P-side, which will be correctly at the
   SV810 level.
 * A problem was fixed for missing "fastarray" data in hardware dump type
   HWPROC.  The "fastarray" contains debug information for the processor cores.
 * A problem was fixed for the Dynamic Power Saving (DPS) mode where, when
   favoring performance,  the system instead favored lower power use.   A
   work-around for the problem is to use the Advanced System Management
   Interface (ASMI) menu of System Configuration/Power Management/Tuning
   parameters to change the parameter labelled "Utilization threshold to
   determine active cores with slack" to 10.0%.
 * A problem was fixed for the Automatic Power On Policy (APOR) where the system
   failed to re-IPL after an AC power loss.  The APOR process needed to wait
   longer for the AC fault to clear before doing the IPL retry.
 * A problem was fixed for the Advanced System Management Interface (ASMI) IPv4
   Network Configuration where the IP address was being overwritten by the value
   in the subnet mask field for the initial values of the panel.  If the network
   configuration was saved without fixing the IP address, the wrong IP address
   was also saved.
 * A problem was fixed for missing call outs when having multiple "Memory
   Card/FRU" failures with SRC B124E504.  There is a call out for the first
   memory FRU of the failures but any other memory FRUs failing at the same time
   are not reported.
 * A problem was fixed for errors during a CEC power off with SRCs B1812616 and
   B1812601.  These occurred if the CEC was powered off immediately after a
   power on such that the On-Chip Controllers (OCCs) had to shut down during
   their initialization.
 * A problem was fixed for a highly intermittent IPL failure with SRC B18187D9
   caused by a defunct attention handler process.  For this problem, the IPL
   will continue to fail until the service processor is reset.
 * A security vulnerability, commonly referred to as GHOST, was fixed in the
   service processor glibc functions gethostbyname() and gethostbyname2() that
   allowed remote users of the functions to cause a buffer overflow and execute
   arbitrary code with the permissions of the server application.  There is no
   way to exploit this vulnerability on the service processor but it has been
   fixed to remove the vulnerability from the firmware.  The Common
   Vulnerabilities and Exposures issue number is CVE-2015-0235.
 * A security problem in GNU Bash was fixed to prevent arbitrary commands hidden
   in environment variables from being run during the start of a Bash shell. 
   Although GNU Bash is not actively used on the service processor, it does
   exist in a library so it has been fixed.  This is IBM Product Security
   Incident Response Team (PSIRT) issue #2211.  The Common Vulnerabilities and
   Exposures issue numbers for this problem are CVE-2014-6271, CVE-2014-7169,
   CVE-2014-7186, and CVE-2014-7187.
 * A security problem was fixed in OpenSSL where the service processor would,
   under certain conditions, accept Diffie-Hellman client certificates without
   the use of a private key, allowing a user to falsely authenticate.  The
   Common Vulnerabilities and Exposures issue number is CVE-2015-0205.
 * A security problem was fixed in OpenSSL to prevent a denial of service when
   handling certain Datagram Transport Layer Security (DTLS) messages.  A
   specially crafted DTLS message could exhaust all available memory and cause
   the service processor to reset.  The Common Vulnerabilities and Exposures
   issue number is CVE-2015-0206.
 * A security problem was fixed in OpenSSL to prevent a denial of service when
   handling certain Datagram Transport Layer Security (DTLS) messages.  A
   specially crafted DTLS message could trigger a null pointer de-reference and
   cause the service processor to reset.  The Common Vulnerabilities and
   Exposures issue number is CVE-2014-3571.
 * A security problem was fixed in OpenSSL to fix multiple flaws in the parsing
   of X.509 certificates.  These flaws could be used to modify an X.509
   certificate to produce a certificate with a different fingerprint without
   invalidating its signature, and possibly bypass fingerprint-based
   blacklisting.  The Common Vulnerabilities and Exposures issue number is
   CVE-2014-8275.
 * A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol
   that allowed a man-in-the-middle attacker, via a specially crafted
   fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even
   if both the client and server supported newer protocol versions. The Common
   Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
 * A security problem was fixed in OpenSSL for formatting fields of security
   certificates without null-terminating the output strings.  This could be used
   to disclose portions of the program memory on the service processor.  The
   Common Vulnerabilities and Exposures issue number for this problem is
   CVE-2014-3508.
 * Multiple security problems were fixed in the way that OpenSSL handled
   Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS
   handshake packet could cause the service processor to reset.  The Common
   Vulnerabilities and Exposures issue numbers for these problems are
   CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
 * A security problem was fixed in OpenSSL to prevent a denial of service when
   handling certain Datagram Transport Layer Security (DTLS) ServerHello
   requests.  A specially crafted DTLS handshake packet with an included
   Supported EC Point Format extension could cause the service processor to
   reset.  The Common Vulnerabilities and Exposures issue number for this
   problem is CVE-2014-3509.
 * A security problem was fixed in OpenSSL to prevent a denial of service by
   using an exploit of a null pointer de-reference during anonymous
   Diffie-Hellman (DH) key exchange.  A specially crafted handshake packet could
   cause the service processor to reset.  The Common Vulnerabilities and
   Exposures issue number for this problem is CVE-2014-3510.
 * A problem was fixed for an intermittent problem in a CEC IPL where an On-Chip
   Controller is stuck in a reset loop, logging repeated SRCs for B1702A17, and
   eventually places the CEC in safe mode, running at minimum processor clock
   frequencies.
 * A problem was fixed for NVRAM initialization to support a service processor
   side switch after an in-band firmware downgrade from SV830 to SV810.  A
   service processor failure with SRC B1817212 occurs on a side switch after the
   downgrade (side switching from SV810 to SV830).  This happens because of the
   difference in size of the NVRAM used between SV810 and SV830 with a need for
   more NVRAM initialization on level SV830.  This problem does not affect out
   of band firmware downgrades to a new release using the Management Console
   because in that case a code update accept automatically occurs and the T and
   P sides are updated to the same SV810 release level.
 * A problem was fixed for an error on a re-IPL of a powered-on CEC that fails
   with a time-of-day topology error with SRC B111BA24.
 * A problem was fixed to provide a service alert for failed VPD on the anchor
   card.  Previously, only an informational (INF) SRC B155A435 was generated for
   this failure.  Now the SRC has been made a predictive error (PE) and the
   failed anchor card VPD is guarded and ready for service.
 * A problem was fixed for a clearing of all guard records associated with one
   error log entry.  If a FRU is replaced for any of the related guard records,
   all the related guard records are cleared.  Previously, only the guard record
   for the replaced FRU was cleared and the association was lost.
 * A problem was fixed to reduce switching noise on the memory address bus for
   DIMMs.  Noise on the bus could cause a failure for a marginal DIMM, so this
   fix has the effect of potentially improving the reliability of the memory.
 * A fix was made to prevent processor speculative memory loads from the service
   processor mailbox Direct Memory Access (DMA) area in the CEC memory.  The
   speculative loads caused memory cache faults and system checkstops with SRC
   B181E540.
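
The guard record behavior described above (replacing one FRU clears every guard
record tied to the same error log entry) can be sketched as a small model.
This is an illustration only: the GuardRecord type, its fields, and the
clear_for_replaced_fru function are hypothetical names chosen for the example,
not service processor interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardRecord:
    fru: str           # hypothetical FRU identifier
    error_log_id: int  # error log entry that created this guard record


def clear_for_replaced_fru(records, replaced_fru):
    """Return the guard records that remain after a FRU is replaced."""
    # Collect every error log entry that guarded the replaced FRU.
    related_logs = {r.error_log_id for r in records if r.fru == replaced_fru}
    # Drop all records created by those entries, not only the replaced FRU's.
    return [r for r in records if r.error_log_id not in related_logs]


records = [
    GuardRecord("DIMM-A", 101),
    GuardRecord("DIMM-B", 101),
    GuardRecord("CORE-3", 202),
]
remaining = clear_for_replaced_fru(records, "DIMM-A")
```

In this model, replacing DIMM-A also clears the record for DIMM-B because both
came from error log entry 101, while the unrelated CORE-3 record is kept.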
   

System firmware changes that affect certain systems


 * For a system with a degraded power supply,  a problem was fixed so that
   inaccurate output voltage levels would be handled by the Voltage Regulator
   Modules (VRMs) and not cause a system failure.
 * For a system with a missing or broken operations panel, a problem was fixed
   for excessive logging of SRC B181A734 for the error condition.
 * On systems using Virtual IO Server (VIOS) with the partitions,  a problem was
   fixed for a mainstore dump (MSD) failure with SRC B2005123 when it attempted
   to write to a loadsource DASD connected via VIOS.  VIOS was unable to handle
   the I/O write request exceeding 256K.
 * On systems using OPAL, the time-outs for errors on the PCIe Host Bridge (PHB)
   were increased to allow time for PCIe link error recoveries to complete where
   possible to reduce partition and system errors caused by link errors.
 * A problem was fixed for a PowerVM hypervisor hang after a processor core and
   system checkstop.  The failed processor core was not put into a guarded state
   and the hypervisor hung when it tried to use the failed core.
 * On systems using Field Core Override (FCO) feature code #2319 to reduce the
   number of available cores, a problem was fixed where failed cores were not
   being replaced by unconfigured cores, causing the system to fail to IPL with
   a no cores available condition.  The fix now allows unconfigured cores to be
   substituted for licensed cores that have failed.
 * On systems using OPAL, a problem was fixed for an unnecessary guarding of a
   processor core on an L2 or L3 cache error.  This error was caused by an errant
   attempt to repair the cache using an operation that is not supported on
   OPAL.  Guarding of the processor core on OPAL now only occurs after a daily
   threshold of cache errors is exceeded instead of guarding on the second cache
   error for the core.
 * On systems using PowerVM, a problem was fixed for the handling of the error
   of multiple cache hits in the instruction effective-to-real address
   translation cache (IERAT).  A multi-hit IERAT error was causing system
   termination with SRC B700F105.  The multi-hit IERAT is now recognized by the
   hypervisor and reported to the OS where it is handled.
   
 * On systems using PowerVM, a problem was fixed to prevent a hypervisor task
   failure with a B7000602 SRC logged, if multiple resource dumps running
   concurrently run out of dump buffer space. The failed hypervisor task could
   prevent basic logical partition operations from working, potentially leading
   to an Incomplete state on the Management Console.
 * On systems using PowerVM, a problem was fixed for partitions going back to
   Epoch Time (1970) after a real-time clock (RTC) battery replacement.  If the
   RTC battery is replaced and the correct time is set using the Advanced System
   Management Interface, the partitions end up with the wrong time based in
   1970.
 * On systems using PowerVM, a problem was fixed to allow partitions to recover
   PCIe links from multiple link errors occurring at the same time.  The only
   recovery without the fix would be to re-IPL the CEC.
 * On systems using PowerVM, a problem was fixed for partitions with Virtual
   Trusted Platform Module (VTPM) resources so they could restart partitions
   after a CEC power off and power on sequence without hanging at progress code
   C2006009.
 * On systems using PowerVM, a problem was fixed to fully deconfigure cores that
   have cache repair failures so they cannot be referenced by an On-Chip
   Controller (OCC) reset.  This will prevent an OCC reset failure because of
   the failed cores, logged with SRC B1112AB4 and BC82203B, that forces the OCC
   into safe mode (minimum processor clock frequency) for all of its remaining
   cores.  A CEC re-IPL is needed to get an OCC out of safe mode.
 * On systems using Virtual IO Server (VIOS) to share physical I/O resources
   among client logical partitions using virtual Small Computer System Interface
   (vSCSI) adapters, a problem was fixed that prevented the VIOS from accessing
   storage hosted by a physical adapter that had storage mapped to a vSCSI
   adapter.  The VIOS showed errors on disks under that physical adapter and was
   unresponsive.  To recover from this problem, the VIOS must be rebooted.
 * On systems using the Virtual I/O Server (VIOS) to share physical I/O
   resources among client logical partitions, a problem was fixed for memory
   relocation errors during page migrations for the virtual control blocks. 
   These errors caused a CEC termination with SRC B700F103.  The memory
   relocation could be part of the processing for the Dynamic Platform Optimizer
   (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory
   defragmentation, or a concurrent FRU repair.
 * On systems using PowerVM, a problem was fixed for the PCIe Host Bridge (PHB)
   error recovery process which failed, causing the PCIe slots to fail.  The
   recovery process has been enhanced to allow for delays caused by active power
   bus operations during the recovery and to handle recovery from simultaneous
   PCIe switch and PHB errors.  A CEC re-IPL is needed to get the failed PCIe
   slots working again.
 * On systems using PowerVM, a problem was fixed that could result in
   unpredictable behavior if a memory UE is encountered while relocating the
   contents of a logical memory block during one of these operations:
   - Reducing the size of an Active Memory Sharing (AMS) pool.
   - On systems using mirrored memory, using the memory mirroring optimization
   tool.
   - Performing a Dynamic Platform Optimizer (DPO) operation.
 * On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed
   for an inaccurate pool idle count over a small sampling period.
   
 * On systems using PowerVM and Virtual Trusted Platform Module (VTPM)
   partitions,  a problem was fixed for a Management Console error that occurred
   while restoring a backup profile that caused the system to go to the
   Management Console "Incomplete state".  The failed system had a suspended
   VTPM partition and a B7000602 SRC logged.
 * On systems using PowerVM, a problem was fixed for a partition deletion error
   on the Management Console with error code 0x4000E002 and message
   "...insufficient memory for PHYP".  The partition delete operation has been
   adjusted to accommodate the temporary increase in memory usage caused by
   memory fragmentation, allowing the delete operation to be successful.
 * On systems using PowerVM, a problem was fixed for Live Partition Mobility
   (LPM) migrations of Linux partitions running in P8 compatibility mode.  After
   an active migration, the resumed partition may experience performance
   degradation.
 * On systems using PowerVM, a problem was fixed for a false error message with
   error code 0x8006 when creating a virtual ethernet adapter with the
   Integrated Virtualization Manager (IVM).  The error message can be ignored as
   the virtual ethernet slot is fully functional.
 * On systems using PowerVM with a PCIe 3D graphics adapter (F/C #EC41 or #EC42)
   in a partition, a problem was fixed for a partition hang or BA21xxxx error
   conditions during partition initialization.
 * On systems using PowerVM, a problem was fixed for the Live Partition Mobility
   (LPM) migration of virtual devices to a Power8 system to update each virtual
   device location code correctly to reflect the location code in the target
   system instead of the location code in the source system.  This problem
   prevented the management console from being able to look up AIX Object Data
   Manager (ODM) names for the virtual devices so that operations such as remove
   on the device could not be performed.
 * On systems using PowerVM with a Linux partition, a problem was fixed for the
   Linux "lsslot" command so that it is able to find the F/C EC41 and EC42 PCIe
   3D graphics adapter installed in the CEC, instead of showing the slot as
   "empty".  The Linux graphics adapter worked correctly even though it showed
   as "empty".
 * On systems using PowerVM, support was added for USB 2.0 HUBs so that a
   keyboard plugged into the USB 2.0 HUB will work correctly at the SMS menus. 
   Previously, a keyboard plugged into a USB 2.0 HUB was not a recognized
   device.
 * On systems using PowerVM,  a problem was fixed for a hypervisor deadlock that
   results in the system being in an "Incomplete state" as seen on the management
   console.  This deadlock is the result of two hypervisor tasks using the same
   locking mechanism for handling requests between the partitions and the
   management console.  Except for the loss of the management console control of
   the system, the system is operating normally when the "Incomplete state"
   occurs.
 * On systems using OPAL firmware,  a problem was fixed for Coherent Accelerator
   Processor Interface (CAPI)  devices not being available to the partitions
   after a re-IPL of a CEC with power on.
 * On systems using OPAL firmware, a problem was fixed to support a kdump of a
   baremetal Little Endian (LE) kernel using XFS mounts to prevent a hang in Big
   Endian (BE) Petitboot.  For this problem, there was an endian switch on the
   re-mount of the XFS and Petitboot was unable to read the XFS logs to do
   recovery.  Petitboot now mounts the XFS file system read-only with no
   recovery: "-o ro,norecovery" to prevent the problem.
 * On systems using OPAL firmware, a problem was fixed in Petitboot for the
   default selection of the OS to use the first grub entry if no matching OS
   labels are found in the grub configuration file.  Previously, if a grub label
   did not match, the user had to manually select the OS and boot it.
   
 * On systems using OPAL firmware,  a security problem was fixed to prevent an
   out-of-bounds read in the glibc's iconv() function when converting certain
   encoded data to UTF-8.  This could cause a crash of OPAL.  The Common
   Vulnerabilities and Exposures issue number is CVE-2014-6040.
 * On systems using OPAL firmware,  a security problem was fixed for Name
   Service Switch (NSS) to prevent a denial of service attack from an application
   performing key based look-ups on a database in an infinite loop.   The Common
   Vulnerabilities and Exposures issue number is CVE-2014-8121.
 * On systems using OPAL firmware,  a security problem was fixed for the snap
   utility of powerpc-utils to prevent plain text passwords from being extracted
   from archives containing configuration snapshots of services.  The Common
   Vulnerabilities and Exposures issue number is CVE-2014-4040.
 * On systems using OPAL firmware,  a problem was fixed for the OPAL lsdevinfo
   command as it did not correctly process the path to the device, which made
   the path unreadable in the output.  With the fix, the path is displayed
   correctly.
 * On systems using OPAL firmware, a problem was fixed for Resource Monitoring
   and Control (RMC) failing and going inactive after several OPAL Linux
   partition migrations.  The validation operations failed when the Machine,
   Type, Model, and Serial number (MTMS) were set incorrectly.
   
 * On systems using OPAL firmware, a problem was fixed for the OPAL drmgr
   utility so it correctly gathers Logical Memory Block (LMB) information while
   performing Memory Dynamic Logical Partitioning (DLPAR) on the little-endian
   variation of the Power processor.
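
The Petitboot default-selection fix described above (use the first grub entry
when no OS label matches) amounts to a simple fallback rule.  The sketch below
models that rule only; the function name, the list-of-labels representation,
and the sample labels are assumptions made for the example and are not taken
from Petitboot itself.

```python
def pick_default_entry(entries, wanted_label):
    """Return the boot entry to use by default, or None if there are none."""
    for entry in entries:
        if entry == wanted_label:
            return entry
    # No label matched: fall back to the first entry in file order.
    return entries[0] if entries else None


grub_entries = ["Linux 3.10", "Linux 3.10 (rescue)"]
matched = pick_default_entry(grub_entries, "Linux 3.10 (rescue)")
fallback = pick_default_entry(grub_entries, "no-such-label")
```

With the fallback in place, a mismatched label no longer leaves the boot
waiting for a manual selection.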

SV810_108_081 / FW810.21

01/09/15 Impact: Security         Severity:  SPE

System firmware changes that affect all systems


 * A problem was fixed to prevent the Advanced System Management Interface
   (ASMI) "System Service Aids/Factory Configuration" panel option from
   restoring to factory configuration for FSP or ALL if one boot side of the
   service processor is marked invalid.  The following informational message is
   issued:  "The request cannot be performed because a firmware boot side is
   marked invalid.  This state may have been caused by a previous firmware
   update failure."
 * A problem was fixed for firmware updates from USB to allow the code update
   progress to be seen with the addition of progress code C100B100.  This
   progress code means that the firmware update is busy unpacking the firmware
   image file and that the USB key should not be removed until the operation is
   completed.
 * A security problem was fixed in OpenSSL for padding-oracle attacks known as
   Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows
   a man-in-the-middle attacker to obtain a plain text version of the encrypted
   session data. The Common Vulnerabilities and Exposures issue number is
   CVE-2014-3566.  The service processor POODLE fix is implemented by disabling
   SSL protocol SSLv3 and requiring TLSv1.2 protocol on all secured
   connections.  The Hardware Management Console (HMC) also requires a POODLE
   fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V8 R8.1.0 SP1 with PTF
   MH01481).  This HMC minimum requirement is enforced by the firmware update
   process for this defect.
   
 * A security problem was fixed in OpenSSL for memory leaks that allowed remote
   attackers to cause a denial of service (out of memory on the service
   processor). The Common Vulnerabilities and Exposures issue numbers are
   CVE-2014-3513 and CVE-2014-3567.
 * A problem was fixed for two light-emitting diodes (LEDs) turning on
   incorrectly on the operator panel after a system power off.  These LEDs are
   the blue LED (Identify) and the amber LED (enclosure fault indicator LED with
   the exclamation point symbol "!").
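
Because the POODLE fix above disables SSLv3 and requires TLSv1.2 on secured
connections, any client that talks to the service processor must be able to
negotiate TLS 1.2.  As a minimal sketch, assuming a client written with the
Python standard library "ssl" module (not IBM tooling), the same policy can be
expressed on the client side like this:

```python
import ssl

# Client-side context; PROTOCOL_TLS_CLIENT enables certificate verification.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

# Refuse SSLv3, TLS 1.0 and TLS 1.1; only TLS 1.2 or newer is negotiated.
context.minimum_version = ssl.TLSVersion.TLSv1_2
```

A context configured this way fails the handshake with a peer that only offers
older protocol versions instead of silently downgrading.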
   

System firmware changes that affect certain systems


 * On systems with partitions using shared processors, a problem was fixed that
   could result in latency or timeout issues with IO devices.

SV810_101_081 / FW810.20

10/24/14 Impact: Availability    Severity: HIPER

New features and functions


 * Support for the IBM Power System S824L (8247-42L).
 * Support for NEBS-3 48VDC 750 W power supply with CCIN 51D8 and F/C #EB3H on
   the S822 (8284-22A) and the S822L (8247-22L).
 * Support for 128 GB CDIMM DDR3 DRAM with F/C #EM8E on the IBM Power System S824
   (8286-42A).  These need to be ordered in pairs and each DIMM within a DIMM
   pair must be of the same capacity.
   
 * Support for the Nvidia Compute Intensive Accelerator (PCIe attached GPU) with
   F/C #EC47.  This feature is only supported on the IBM Power System
   S824L (8247-42L).  It is a PCIe 3 X16/Long/Full High/Double wide adapter with
   the PCIe connection in the left slot.
 * Support was added to enable fast sleep on OPAL systems, allowing for
   significant power savings.
 * Support for an Intelligent Platform Management Interface (IPMI) enhancement
   to provide a host Linux boot device path on OPAL systems.
 * Enhancement to the service processor dump for easier problem debugging by
   collecting full kcore dumps as a gzipped file instead of truncating the large
   kcore files.
 * Enhancement made to the Advanced System Management Interface (ASMI) "System
   Service Aids/Factory Configuration" menu to clear all firmware NVRAM for
   PowerVM and OPAL, regardless of the current firmware selection.  Previously,
   only the NVRAM for the current firmware type was cleared.
 * Support for additional PCIe adapters, which had previously been supported on
   Power7+ and earlier servers, to help with server migration:
       Ethernet 1 Gb LAN: 2-port UTP/TX (#5767, #5281), 2-port SX (#5768,
   #5274), and 4-port UTP/TX (#5717, #5271)
       Ethernet and FCoE: 2-port 10 Gb (#5708, #5270)
       SAS:  3-port 6 Gb/1.8 GB cache (#5913, #ESA3)
   

System firmware changes that affect all systems


 * A problem was fixed in the error handling of memory channel failures with SRC
   B181E540 to prevent false processor errors with SRC B113E504 during the next
   IPL after the memory fault.
 * A problem was fixed for L4 cache errors being assigned an incorrect subsystem
   of "Memory Controller" in the SRC B121E504 error log instead of "Memory
   Fru".    L4 cache resides on the DIMM and is not a memory controller.
 * A problem was fixed in the Advanced System Management Interface (ASMI) 
   "Performance Setup/Logical Memory Block Size" menu that prevented the user
   from selecting valid Logical Memory Block (LMB) sizes because they were
   greyed out.
 * A problem was fixed to capture missing trace data for the hardware
   compression accelerator (NX) checkstop failures to allow for easier debug of
   the failures.
 * A problem was fixed to add call outs for the operations panel FRU for SRCs
   B1504804 and B1504805 for operation panel failures.  The FRU call out had
   been missing in the error log.
 * A problem was fixed that caused the system to hang in the IPL state during a
   system dump with SRC B182901E shown in the error log.  The hang occurred when
   system dump detected a prior system dump already in place.  The second system
   dump would normally be bypassed to allow the IPL to complete.
 * A problem was fixed for the service processor error log handling that caused
   SRC B150BAC5 errors when converting an error log entry from an object into a
   flattened array of bytes.
 * A problem was fixed for truncated fan part numbers in the FRU call outs of
   SRC 110076111 so that 4U systems (8286-41A, 8286-42A, 8247-42L) have FRU
   00FV629 for the 80 mm fan and the 2U systems (8284-22A, 8247-21L, 8247-22L) 
   have FRU 00FV726 for the 60 mm fan.  FRU 00FV62 and FRU 00FV72 were being
   incorrectly reported, showing the right-most character of the part number
   truncated.
 * A problem was fixed in the fault isolation of FRUs for errors in the Time Of
   Day (TOD) oscillator topologies and the processors to reduce the number of
   incorrect call outs.  When a problem is detected in a connection between the
   processor and TOD oscillator,  the oscillator is now called out with high
   priority and processor with low priority but neither is guarded to prevent
   unnecessary loss of system resources.
 * A problem was fixed with the DIMM pairing rules to ensure that only the one
   DIMM that is the paired mate of a failing or missing DIMM is guarded.  An
   error in the pairing rules was causing additional DIMMs to be called out and
   guarded in the case of a single DIMM failure.
 * A problem was fixed so that when an L2/L3 cache repair cannot be performed
   because there is no repair available, the error log written is a Predictive
   Error instead of a hidden Recoverable Error.  This improves the customer
   awareness that the processor cache is becoming degraded.
   

System firmware changes that affect certain systems


 * HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem
   was fixed that may affect shared processor partitions where there is a
   mixture of dedicated and shared processor partitions with virtual IO
   connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting,
   between them.  In high availability cluster environments this problem may
   result in a split brain scenario.
   
 * On systems using OPAL firmware, a performance problem was fixed where the
   On-Chip Controller (OCC) failed to establish a session to OPAL, resulting in
   all the system processors being set to minimum (safe mode) frequencies.
 * On systems using PowerVM firmware, a problem was fixed for systems in
   networks using the Juniper 1 GbE and 10 GbE switches (F/Cs #1108, #1145, and
   #1151) to prevent network ping errors and boot from network (bootp)
   failures.  The Address Resolution Protocol (ARP) table information on the
   Juniper aggregated switches is not being shared between the switches and that
   causes problems for address resolution in certain network configurations. 
   Therefore, the CEC network stack code has been enhanced to add three
   gratuitous ARPs (ARP replies sent without a request received) before each
   ping and bootp request to ensure that all the network switches have the
   latest network information for the system.
   
 * On systems using OPAL firmware,  a problem was fixed for the 10/1Gb Ethernet
   adapter (F/C #EL3Z) where it failed by rebooting into the wrong endian mode.
 * On systems using PowerVM firmware, a problem was fixed for a false error
   message displayed on the management console during firmware code updates that
   include Concurrent Core Initialization (CCI) for the processors.  All
   processor cores are correctly initialized but the management console displays
   this message:   "An open serviceable event related to system firmware was
   found.  The firmware update process will not be interrupted.  Please address
   any open serviceable events on the system(s) ...  HSCF0223".
 * On systems using PowerVM firmware,  a problem was fixed so that a system dump
   with Advanced System Management Interface (ASMI)  server firmware content of
   "maximum" or "HCA IO" will not cause the system to fail with an SRC
   B700F103.  There is no Infiniband (IB) Host Channel Adapter (HCA) on an IBM
   Power8 system so this caused an unexpected problem in the hypervisor dump
   data collection for IB adapters.
 * On systems using PowerVM firmware,  a problem was fixed for network
   boot/install using a null pointer when network adapter buffers are depleted
   and failing the boot with an SRC BA210003 - "Partition firmware detected a
   data storage error".
 * On the IBM Power System S824 (8286-42A)  with IBM i partitions, a problem was
   fixed to block a non-applicable IBM i console warning message "CPF9E17 -
   Usage limit exceeded - operator action required".  IBM i software license key
   5722-SS1 feature 5052, the user entitlement key for the number of users who
   are authorized to use the operating system, is not required for the 8286-42A
   system.  This system has the Software Tier P20 licensing, which does not have
   user based licensing and includes the 5250 features.
 * On systems using OPAL firmware,  a problem was fixed when switching into the
   PowerVM mode to prevent the management console from going into recovery mode.
 * On systems using PowerVM firmware, a problem was fixed for a hypervisor
   time-keeping services topology failover that caused errors to be wrongly
   attributed to the new time-of-day topology, resulting in processor FRUs being
   guarded falsely.
 * On systems with a PCIe dual-x4 SAS adapter (F/C #5901, #5278, or #EL10), a
   problem was fixed for the system fans running too fast and loud.  This PCIe
   adapter was incorrectly assigned a hot PCIe rating and this caused the system
   fans to go to high speed for the required extra cooling.
   This fix is not applicable to the IBM Power System S824L (8247-42L).
 * On systems using OPAL firmware,  a problem was fixed for CAPP (Coherent
   Attached Processor Proxy) system checkstops that should have been recoverable
   errors.
 * On systems using OPAL firmware,  a problem was fixed for the CEC memory
   controllers to increase the operation time-out value to be able to handle
   long-running Coherent Accelerator Processor Interface (CAPI) and Peripheral
   Component Interconnect Express (PCIe) operations.
 * On systems using OPAL firmware, a problem was fixed in the Advanced System
   Management Interface (ASMI) "Real Time progress indicator" to not delete the
   first character of the second line of the display.
 * On systems using PowerVM firmware, a problem was fixed to allow booting off
   an iSCSI device.  For the failure, the partition firmware error logs had SRC
   BA012010 "Opening the TCP node failed." and SRC BA010013 "The information in
   the error log entry for this SRC provides network trace data."  The open
   firmware standard output trace showed SRC BA012014  "The TCP re-transmission
   count of 8 was exceeded. This indicates a large number of lost packets
   between this client and the boot or installation server" followed by SRC
   BA012010.
 * On systems using PowerVM firmware, a problem was fixed for partition firmware
   stack corruption that would cause spurious output to the console for failed
   ping or network boot operations.  When a stack imbalance is encountered, text
   is displayed on the console indicating a stack depth error along with a
   number of values and the text string "CUTILS" similar, in format, to the
   following:
               6 1 2 2 0 da15b007 22901dc
               CUTILS: bad exit depth? SCHEDULER call-c-wrapper exit: depth=7 ,
   _indepth=4 , _#inparms=0
 * On systems using PowerVM firmware, a problem was fixed so that the thermal
   and power management tunable parameters for the On-Chip Controller (OCC) in
   the Advanced System Management Interface (ASMI) "System Configuration/Power
   Management/Tuning Parameters" are not set back to the defaults when the CEC
   is powered off.
 * On systems using PowerVM firmware, a problem was fixed in checkstop error
   recovery to force a re-IPL instead of a system termination for checkstops
   that occur during memory-preserving IPLs.  This allows the system to recover
   from the IPL error without any operator intervention needed.
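
The Juniper switch entry above describes gratuitous ARPs as ARP replies sent
without a request having been received.  To make that concrete, the sketch
below builds one such frame with the Python standard library.  It is an
illustration of the general frame layout only: the function name, the sample
MAC and IP addresses, and the choice of a broadcast target hardware address
are assumptions for the example, and the source does not document the exact
frame the CEC network stack sends.

```python
import socket
import struct


def gratuitous_arp_reply(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP reply."""
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth = broadcast + mac + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), sizes 6 and 4, opcode 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender and target protocol addresses are both the announcing host's IP.
    # The target hardware address is set to broadcast here (conventions vary).
    arp += mac + ip_bytes + broadcast + ip_bytes
    return eth + arp


frame = gratuitous_arp_reply(bytes.fromhex("02005e100001"), "192.0.2.10")
```

Because the sender and target IP fields carry the same address, every switch
that sees the frame can refresh its mapping for that address without having
asked for it.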

SV810_087_081 / FW810.11

09/26/14 Impact: Data            Severity:  HIPER

System firmware changes that affect certain systems


HIPER/Pervasive:  A problem was fixed in PowerVM where the effect of the problem
is non-deterministic but may include undetected corruption of data.  This
problem can occur if VIOS (Virtual I/O Server) version 2.2.3.x or later is
installed and either one of the following statements is true:

(A) A storage adapter (including Fibre Channel) is assigned to a VIOS and shared
between multiple partitions (one of which must be an IBM i partition, others can
be AIX, Linux or IBM i partitions), and at least one of the other partitions is
performing LPM (Live Partition Mobility) or an immediate or abnormal shutdown
operation.

-or-

(B) A Shared Ethernet Adapter (SEA) with fail over enabled is configured on the
VIOS.

SV810_081_081 / FW810.10

09/08/14 Impact: Availability    Severity: SPE

New features and functions


 * Extended the availability of the IBM Power System S812L (8247-21L) that was
   enabled in the 810.00 release.
 * Expansion of maximum number of SAS drives on Power System S814 (8286-41A)
   from 8 (SSD, disk, or combination thereof) to 10 drives.
 * Support for SAS EXP24S expansion drawer (#5887, #EL1S) attached using a PCIe
   slot.
 * Support for large M64 based BARs for systems in the OPAL environment.
 * Fan speed settings were enhanced for the case of systems with fan failure to
   set the speed based on system thermal conditions instead of forcing all
   remaining fans to an overdrive speed setting.
   
 * Support for a PCIe Gen3 FPGA x16 slot adapter that acts as a co-processor
   for the POWER8 processor chip for gzip compressions and decompressions. 
   Feature codes #EJ12 and #EJ13 are electronically identical with the same CCIN
   of 59AB.  #EJ12 has full high tail stock and is supported by 8286-41A and
   8286-42A.  #EJ13 has a low profile tail stock and is supported by 8284-22A. 
   OS levels supported are AIX 6.1 and AIX 7.1 or later.  IBM i and Linux are
   not supported.
   
 * Support for use of system and partition templates on the management console.
 * Support for Coherent Accelerator Processor Interface (CAPI) for the PCIe Gen
   3 FPGA on OPAL.  Operating system supported is Linux.
 * Support was added to allow concurrent initialization of the processor cores. 
   This expands the range of concurrent firmware updates to accommodate core
   initialization changes and also allows for dynamic repairs of processor and
   cache memory.
 * Support was added for cache memory L2/L3 column repair to allow concurrent
   repair of memory and propagation of memory errors for better fault isolation
   of memory components.
 * The system operator panel was enhanced to show the firmware mode of the
   system during the IPL of either PowerVM or OPAL for panel function 1.
 * The service processor Processor Runtime Diagnostics (PRD) was enhanced to
   collect debug data for failures in host boot initialization for the Self-Boot
   Engine (SBE).
 * Support was added to the Advanced System Management Interface (ASMI) USB menu
   to allow a system dump to be collected to USB with the power on to the
   system.  This allows the dump to be collected with the system memory state
   intact.
 * Support for enhanced 10 Gb ethernet adapters that were previously announced
   for Power8 for AIX NIM (Network Install Management) or Linux Network Install
   capability.  The enhanced adapters are the following:
       PCIe2 4-port(10Gb+1GbE) SR+RJ45 Adapter (#EN0S, #EN0T)
       PCIe2 4-port(10Gb+1GbE) SFP+Copper+RJ45 Adapter (#EN0U, #EN0V)
       The level of adapter microcode required is level 20100130 or later.
   
       PCIe2 LP 2-port 10/1GbE BaseT RJ45 Adapter (#EN0W, #EN0X, #EL3Z)
       The level of adapter microcode required is level 30080130 or later.

 * Support for a new 4-port Ethernet Adapter with two 10 Gb and two 1Gb ports
   (#EN0M, #EN0N with CCIN 2CC0). The adapter offers NIC and FCoE over its 10 Gb
   ports and NIC over the 1 Gb ports and is SR-IOV capable.  The 10 Gb ports are
   LR (long range) fiber optic, supporting distances up to 10 km.  Except for
   the transceivers and cabling of the 10 Gb ports,  this adapter is
   functionally identical to the 4-port adapter (#EN0H, #EN0J, #EL38) SR optical
   and (#EN0K, #EN0L, #EL3C) active copper twinax.
 * Support for a new PCIe 2-port Async adapter (#EN27, #EN28) that serves the
   same function as the predecessor PCIe 2-port Async adapter (#5289, #5290) on
   the Power7+ and earlier servers.    This adapter provides connection for 2
   asynchronous EIA-232 devices. Ports are programmable to support EIA-232
   protocols, at a line speed of 128K bps. Two RJ45 connections are located on
   the rear of the adapter. To attach to devices using a 9-pin (DB9) connection,
   use an RJ45-to-DB9 converter. For convenience, one converter is included with
   this feature. One converter for each connector needing a DB9 connector is
   needed.
 * Support for additional PCIe adapters, which had previously been supported on
   Power7+ and earlier servers, to help with server migration:
       Ethernet 10 Gb LAN: 1-port optical SR (#5769, #5275)
       Ethernet and FCoE: 4-port 10 Gb/1 Gb Copper (#EN0K, #EN0L, #EL3C)
       Ethernet RoCE: 2-port 10 Gb copper (#EC27, #EC28, #EL27)
       Fibre Channel: 2-port 4 Gb (#5774, #5276, #EL09)
       SAS: 2-port 3 Gb 380 MB cache (#5805)
   
 * Support was added for a new Advanced System Management Interface (ASMI) menu
   to allow the user to choose between an IPMI or a serial console when in OPAL
   mode.
   

System firmware changes that affect all systems


 * A problem was fixed in the service processor that caused the SRC B1504804 to
   be logged as many as 30 times over five minutes for an operations panel
   voltage regulator error.  The error logging has been reduced to one SRC for
   this error.
 * A problem was fixed to allow the system to prevent an intermittent system
   hang until IPL time-out after a processor core checkstop.  This secondary
   failure after a core checkstop had a low probability of occurring.
 * A problem was fixed to maintain time-of-day (TOD) clock redundancy for the
   hypervisor time-keeping services in the case of a TOD error and fail-over to
   the backup clock topology.  There was a failure in the TOD fail-over process
   to correctly assign the new backup TOD topology, causing loss of redundancy
   for the next TOD error.
 * A problem was fixed for the service processor reset/reload process to
   eliminate an extra dump and SRC B1818601 caused by an internal core dump
   during the reset/reload.
 * A problem was fixed for a processor error with an incorrect call out of a
   memory card with SRC B124E504 to eliminate the memory card FRU call out.  The
   processor error call out of SRC B170E540 was correct.
 * A problem was fixed in the Advanced System Management Interface (ASMI) menus to
   restore factory settings so that the default for the Hypervisor mode (PowerVM
   or OPAL) was restored to the factory setting using "System Service
   Aids/Factory Configuration/Service Processor Reset/All Reset".
 * A problem was fixed in how the processor clock speed was reported to the
   hypervisor, causing the partitions to show a clock speed that was about 200
   MHz faster than the actual processor clock speed.
 * A problem was fixed for DRAM repair for the case where two DRAM modules are
   having failures at the same rank such that spares are used to repair each
   DRAM error.  Without the fix, the second DRAM is not repaired and could
   eventually be called out and guarded with a UE SRC.
 * A problem was fixed for system hardware dump collection to collect all the
   hardware registers by stopping all functional clocks before starting the
   collection.
 * A problem was fixed for repairing spare memory DRAM so that repair solutions
   for failed spares persist across IPLs of the system by getting the repair
   solutions written to the Vital Product Data (VPD) of the DRAM.
 * A problem was fixed in the Advanced System Management Interface (ASMI) menus to
   change the name of the "Hypervisor Configuration" menu to "Firmware
   Configuration" to more accurately describe the menu function of being able to
   change firmware between the PowerVM and OPAL modes.
 * A problem was fixed in the Advanced System Management Interface (ASMI) menus to
   move the IPMI password reset operation from the "Firmware Configuration" menu
   to the "Login Profile/Change password" menu.  This change was made to put all
   the password change operations together under one menu.
 * A problem was fixed in the Advanced System Management Interface (ASMI) menu for
   "Resource Dump" to give the message "This feature is not supported for OPAL
   environments" when the system is in OPAL mode.  Previously,  ASMI incorrectly
   stated that the "Resource Dump" function was not supported on the machine
   type.
 * A problem was fixed in the service processor to add missing call outs for the
   memory buffer and memory controller FRUs when there is a time-out error on
   the power bus with PE SRC logged of B170E540.
 * A problem was fixed in memory diagnostics and fault isolation that
   deconfigured more memory than necessary for memory errors.
 * A problem was fixed that caused the Utility COD display of historical usage
   data to be truncated on the management console.
 * A problem was fixed to eliminate service processor dumps after AC power
   cycles of the CEC.
 * A problem was fixed to add a missing hardware call out for service processor
   FSI bus errors logged with SRC BC8A0A11.  This causes the failing hardware to
   be deconfigured and guarded for the next IPL of the system.
 * A problem was fixed so that if an IPL failure occurs that causes the system
   to power off,  error SRCs will be logged instead of the system hanging for
   ten minutes and not logging any SRCs.
 * A problem was fixed in the system dump data collection for missing memory
   data to collect memory data after hardware de-configuration checkstop errors.
 * A problem was fixed for in-band code update to prevent loss of a processor
   support interface (PSI) link that is in a backup role.
   
 * A problem was fixed in system dump collection for a system hang after a
   checkstop.  The system failed to go to terminate state and reboot.
 * A problem was fixed in system dump collection to return full dump data when a
   secondary error occurs during dump data collection for the checkstop primary
   error.
 * A problem was fixed in the Advanced System Management Interface (ASMI) menu "System
   Configuration/Hardware Deconfiguration/Memory Deconfiguration" to be able to
   manually configure and deconfigure DIMMs.
 * A problem was fixed for system terminations that could occur as a result of
   PCIe adapters using a Level Signaled Interrupt (LSI) before the hypervisor
   interrupt handler was ready.  This could occur when in PCIe adapter recovery
   for an error with SRC logs of B7006970 and B700B971.   The PCIe adapters are
   now held in reset until initialization sequences are completed to ensure all
   interrupt handlers are ready for PCIe adapter interrupts.
 * A problem was fixed for a management console firmware update "Remove and
   Activate" operation that fails to activate the OCC (On-Chip Controller for
   thermal and power management) new code level with SRCs logged of B18B2616 and
   B1812601.  An IPL is needed to activate the OCC code level to complete the
   firmware update.
 * A problem was fixed for IPL failures caused by Host Boot PNOR memory
   corruption.  If an IPL Terminate Immediate (TI) from Host Boot has an SRC
   without a specific reason code, a corruption check on the Host Boot memory
   partitions is run and the Host Boot partitions corrected to recover them.
 * A problem was fixed for the power usage regulation of memory to keep memory
   power usage below its specified limits.  Lack of enough memory throttling was
   allowing the memory to consume power past its set limits, leaving the system
   exposed to power faults or unexpected power throttling in other areas of the
   system.
 * A problem was fixed to guard cores on hang errors.  A processor core was not
   being guarded on hang errors where a core timed-out waiting for an
   instruction to complete.
 * A problem was fixed to allow memory diagnostics during a re-IPL of the CEC,
   ensuring that problem memory will be guarded or recovered and preventing
   possible error log flooding with memory errors.
 * A problem was fixed for system dump process memory corruption that could
   cause the wrong dump type to be created for a system failure, resulting in a
   system dump with the wrong content.
 * A problem was fixed for a service processor reset/reload causing an FSP dump
   with a Firmware Database (fwdb) core dump captured within it.
 * A problem was fixed for a processor core forward progress parity error so
   that the core could be guarded without causing a system checkstop.
 * A problem was fixed in the run time diagnostics of DIMMs to read the raw card
   type correctly, preventing failures in the memory repair.
   
 * A problem was fixed to prevent an intermittent hostboot IPL deadlock/hang in
   the deferred work queue with progress code CC009543 and termination with SRC
   B1813450.
 * A problem was fixed in memory diagnostics to be able to handle multiple DIMM
   failures without a time-out failure, reducing the amount of memory that
   needs to be guarded for the errors.
 * A problem was fixed in DIMM initialization to prevent intermittent B181BA08
   DIMM failures in host boot during IPL.
 * A problem was fixed to call home guarded FRUs on each IPL.  Only the initial
   failure of the hardware was being reported to the error log.
 * A problem was fixed for the incorrect fan FRU call outs of SRC 110076111 so
   that 4U systems (8286-41A, 8286-42A) have FRU 00FV629 for the 80 mm fan and
   the 2U systems (8284-22A, 8247-21L, 8247-22L)  have FRU 00FV726 for the 60 mm
   fan.
 * A problem was fixed for a memory write error becoming a system checkstop
   instead of being handled by the memory error handling and recovery processes.
 * A problem was fixed for the error processing of processor core checkstops at
   runtime to not ignore the guard on the failed core on the next IPL of the
   system, thus preventing additional failures with the next IPL during host
   boot.
 * A problem was fixed for error recovery for a failed processor that has all
   cores guarded such that host boot is able to re-IPL using the working
   processor.   In certain situations, the re-IPL on the good processor was
   failing with SRC B113E504 with PRD signature PB_CENT_CRESP_ADDR_ERROR.
 * A problem was fixed for run-time guarding of a processor core that had
   resulted in a system checkstop when the core guard attempt failed.  The
   processor with the non-guarded broken core caused the On-Chip Controller
   (OCC) to have a power measurement time-out to the processor with SRC B1102A00
   that resulted in the system termination.
 * A problem was fixed to prevent incorrect logging of SRC 11007221 whenever the
   operator panel is missing (or broken).  This SRC indicates ambient
   temperature of the system is too high and a performance throttle may occur to
   lower the temperature, causing performance loss.  A missing operator panel
   should not cause lower performance of the system.
 * A problem was fixed for undefined hardware states in the system that caused an
   early IPL failure with SRC B1101314 when configuring the Self Boot Engine
   (SBE) for hostboot.
 * A problem was fixed for the Operator panel where the Enclosure Fault LED was
   swapped with the Attention/Check Log LED.
 * A problem was fixed for memory diagnostics to guard all unusable memory due
   to a channel failure.  This prevents the hypervisor from trying to start
   partitions with memory associated with the bad channel and having the
   partition crash.
 * A problem was fixed to ensure all memory is scrubbed for correctable errors
   to prevent run-time memory failures and possible checkstops.   If memory
   scrubbing actions found the preceding memory rank had persistent ECC errors,
   the next rank of memory was sometimes skipped.
 * A problem was fixed in the Hostboot Self Boot Engine (SBE) to re-IPL without
   guarding the processor on an SBE step that has infrequent failures that are
   recoverable with a retry.
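
The fan FRU call-out fix in the list above pairs each machine type with a single fan part number. As a minimal sketch, that pairing can be held in a lookup table. The machine types and FRU numbers are taken from the fix description; the names FAN_FRU_BY_MACHINE_TYPE and expected_fan_fru are hypothetical and not part of any IBM tool.

```python
# Fan FRU part numbers by machine type-model, as listed in the fix text.
# 4U systems use the 80 mm fan; 2U systems use the 60 mm fan.
FAN_FRU_BY_MACHINE_TYPE = {
    "8286-41A": "00FV629",  # 4U, 80 mm fan
    "8286-42A": "00FV629",  # 4U, 80 mm fan
    "8284-22A": "00FV726",  # 2U, 60 mm fan
    "8247-21L": "00FV726",  # 2U, 60 mm fan
    "8247-22L": "00FV726",  # 2U, 60 mm fan
}


def expected_fan_fru(machine_type_model: str) -> str:
    """Return the fan FRU expected in a call out for the given system.

    Hypothetical helper: raises ValueError for systems the fix text
    does not mention rather than guessing a part number.
    """
    try:
        return FAN_FRU_BY_MACHINE_TYPE[machine_type_model]
    except KeyError:
        raise ValueError(
            f"no fan FRU listed for {machine_type_model!r}"
        ) from None
```

A service script could compare the FRU named in a logged call out against expected_fan_fru("8284-22A") to spot a call out made by firmware that predates this fix.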
   

System firmware changes that affect certain systems


 * A problem was fixed for processor local bus errors during an IPL to call out
   the master and slave bus components with a BC14090F SRC to identify all the
   possible failing components.  For the problem, only the bus slave components
   were being called out on bus error leaving open the possibility that the
   faulty component might not be guarded or repaired.
 * On systems that have a boot disk located on a SAN,  a problem was fixed where
   the SAN boot disk would not be found on the default boot list and then the
   boot disk would have to be selected from SMS menus.  This problem would
   normally be seen for new partitions that had tape drives configured before
   the SAN boot disk.
 * On systems in IPv6 networks,  a problem was fixed for DHCP where a duplicate
   address detection (DAD) message to the DHCP-client on the service processor
   could fail, resulting in duplicate IP addresses being configured on the
   network.
 * On systems that have Active Memory Sharing (AMS) partitions, a problem was
   fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove, leaving a
   logical memory block (LMB) in an unusable state until partition reboot.
 * On systems in IPv6 networks, a problem was fixed for a network boot/install
   failing with SRC B2004158 and IP address resolution failing using neighbor
   solicitation to the partition firmware client.
 * On systems in Dynamic Power Saver (DPS) mode, a problem was fixed so SRC
   B1812A61 is not logged when power throttling is needed for a workload over
   the power capacity.  In DPS mode,  a system power usage adjustment is not an
   error condition.
 * On systems in OPAL mode,  a problem was fixed for OPAL network boots to add
   retries to DHCP to prevent network boot time-out errors caused by network
   lags and slow downs.
 * On systems in OPAL mode, a problem was fixed in the fault isolation
   procedures to not call out hardware FRUs for software failures to reduce loss
   of hardware on errors.
 * On systems in PowerVM mode,  a problem was fixed in Live Partition Mobility
   (LPM) for systems at or near the new 32K maximum for virtual devices, where
   insufficient space existed to store the device attributes of the migrated
   system,  causing RMC failures and incorrect MTMS values for the migrated
   partition.
 * On systems in PowerVM mode,  a problem was fixed for I/O adapters so that
   BA400002 errors were changed to informational for memory boundary adjustments
   made to the size of DMA map-in requests.  These size adjustments were marked
   as UE previously for a condition that is normal.
 * On Power8 2U systems, a problem was fixed for the C5 PCIe slot failing.  This
   PCIe configuration was not supported on the 8284-22A, 8247-21L, and 8247-22L
   systems.
 * On Power8 2U systems, a problem was fixed in the fan speed management to
   lower the maximum RPMs of the fans and reduce the noise level of the system. 
   This problem affects the 8284-22A, 8247-21L, and 8247-22L systems.
 * On systems in PowerVM mode using dedicated processors, a problem with
   concurrent firmware update was fixed to prevent a quiesce of the hypervisor
   process that can result in a system hang.
 * On systems in PowerVM mode, a problem was fixed for unresponsive PCIe
   adapters after a partition power off or a partition reboot.
 * On systems with 64Gb DIMM memory (F/C #EM8D), a problem was fixed to allow
   64Gb DIMM memory error-correcting code (ECC) repairs instead of logging a
   predictive error with no repair to the memory.

SV810_061_054 / FW810.02

07/29/14 Impact: Data            Severity:  HIPER

System firmware changes that affect all systems


 * HIPER/Pervasive: A problem was fixed in PowerVM where the usage of P8
   transactional memory and vector facilities could result in undetected
   corruption of data if the system is running in Power8 native mode. OS levels
   that support Power8 native mode are RHEL 7 and AIX 7.1 TL3 SP3 and later.
   

System firmware changes that affect certain systems


 * HIPER/Pervasive: A problem was fixed with Live Partition Mobility (LPM) on
   PowerVM when migrating a partition between two Power8 systems that are
   running in Power8 native mode. This problem could result in unpredictable
   behavior when the partition resumes execution on the target system, including
   potential undetected corruption of data, a system crash, or a partition
   crash. OS levels that support Power8 native mode are RHEL 7 and AIX 7.1 TL3
   SP3 and later.
   
 * A problem was fixed for an IBM i D-mode IPL failure with SRC B2003110 when
   the alternative load source could not be found.  If a system encounters this
   issue prior to installing the fix, the Service Pack can be applied via the
   Management console or using a USB flash drive with the system powered off.

SV810_058_054 / FW810.01

06/23/14 Impact: Security         Severity:  HIPER

System firmware changes that affect all systems


 * HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket
   Layer) protocol that allowed clients and servers, via a specially crafted
   handshake packet, to use weak keying material for communication.  A
   man-in-the-middle attacker could use this flaw to decrypt and modify traffic
   between the management console and the service processor.  The Common
   Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
 * HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer
   overflow in the Datagram Transport Layer Security (DTLS) when handling
   invalid DTLS packet fragments.  This could be used to execute arbitrary code
   on the service processor.  The Common Vulnerabilities and Exposures issue
   number for this problem is CVE-2014-0195.
 * HIPER/Pervasive:  Multiple security problems were fixed in the way that
   OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode
   was enabled to prevent denial of service.  These could cause the service
   processor to reset or unexpectedly drop connections to the management console
   when processing certain SSL commands.  The Common Vulnerabilities and
   Exposures issue numbers for these problems are CVE-2010-5298 and
   CVE-2014-0198.
 * HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial
   of service when handling certain Datagram Transport Layer Security (DTLS)
   ServerHello requests. A specially crafted DTLS handshake packet could cause
   the service processor to reset.  The Common Vulnerabilities and Exposures
   issue number for this problem is CVE-2014-0221.
 * HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial
   of service by using an exploit of a null pointer de-reference during
   anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially
   crafted handshake packet could cause the service processor to reset.  The
   Common Vulnerabilities and Exposures issue number for this problem is
   CVE-2014-3470.
 * A problem was fixed for hardware dumps on the service processor so that valid
   dump data could be collected from multiple processor checkstops.  Previously,
   the hardware data from multiple processor checkstops would only be correct
   for the first processor.
 * A problem was fixed for platform dumps so that certain operations would work
   after the platform dump completed.  Operations such as firmware updates or
   reset/reloads of the service processor after a platform dump would cause the
   service processor to become inaccessible.

SV810_054_054 / FW810.00

06/10/14 Impact:  New      Severity:  New

New Features and Functions


 * GA Level
   
   NOTE:
   
 * POWER8 firmware addresses the security problem in the OpenSSL Transport Layer
   Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow
   Heartbeat Extension packets to trigger a buffer over-read to steal private
   keys for the encrypted sessions on the service processor.  The Common
   Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also
   known as the heartbleed vulnerability. 
   
 * POWER8 (and later) servers include an “update access key” that is checked
   when system firmware updates are applied to the system.  The initial update
   access keys include an expiration date which is tied to the product warranty.
   System firmware updates will not be processed if the calendar date has passed
   the update access key’s expiration date, until the key is replaced.  As these
   update access keys expire, they need to be replaced using either the Hardware
   Management Console (HMC) or the Advanced System Management Interface (ASMI) on the
   service processor.  Update access keys can be obtained via the key management
   website: http://www.ibm.com/servers/eserver/ess/index.wss .
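
The update access key rule in the note above reduces to a date comparison. The following is a minimal sketch of that rule only, not the firmware's actual check; the function name firmware_update_allowed and its parameters are hypothetical. It assumes the expiration day itself is still valid, since the note says updates are refused once the calendar date has "passed" the expiration date.

```python
from datetime import date


def firmware_update_allowed(key_expiration: date, today: date) -> bool:
    """Return True while the update access key has not expired.

    Assumption: the key remains valid on its expiration date and is
    rejected only on later dates.
    """
    return today <= key_expiration


if __name__ == "__main__":
    # Illustrative dates only; they are not taken from any real key.
    expiration = date(2017, 6, 10)
    print(firmware_update_allowed(expiration, date(2017, 6, 10)))
    print(firmware_update_allowed(expiration, date(2017, 6, 11)))
```

When the check fails, the note indicates that the key must be replaced through the HMC or ASMI before the update is retried.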





--------------------------------------------------------------------------------