Chapter 5. Dynamic logical partitioning

A dynamic logical partition (DLPAR) allows you to add and remove the resources that are associated with a partition dynamically without rebooting the partition.

This functionality was first supported by AIX 5L Version 5.2.


Advanced Virtualization and Micro-Partitioning technology changed DLPAR by making system resources available for DLPAR operations at a finer granularity. This chapter provides an update of the DLPAR functions introduced with the POWER5 systems and includes the following sections:

5.1, “Dynamic logical partitioning overview” on page 140

5.2, “The process flow of a DLPAR operation” on page 148

5.3, “Internal activity in a DLPAR event” on page 152

5.4, “DLPAR-safe and DLPAR-aware applications” on page 155

5.5, “Integrating a DLPAR operation into the application” on page 157

5.6, “Script-based DLPAR event handling” on page 160

5.7, “DLPAR script subcommands” on page 167

5.8, “How to manage DLPAR scripts” on page 186

5.9, “API-based DLPAR event handling” on page 194

5.10, “Error handling of DLPAR operations” on page 205


5.1 Dynamic logical partitioning overview


DLPAR was introduced with AIX 5L Version 5.2. A dynamic partition based on AIX 5L Version 5.2 can consist of the following resource elements:

- A dedicated processor
- A 256 MB memory region
- An I/O adapter slot

Multiple resources can be placed under the exclusive control of a given logical partition. DLPAR extends these capabilities by allowing this fine-grained resource allocation to occur not only when activating a logical partition, but also while the partitions are running. Individual processors, memory regions, and I/O adapter slots can be released into a free pool, acquired from that free pool, or moved directly from one partition to another.

On POWER5 systems with AIX 5L Version 5.3, however, a partition can consist of dedicated processors, or virtual processors with a specific capacity entitlement running in capped or uncapped mode, a dedicated memory region, and virtual or physical I/O adapter slots.


For dedicated and shared processor partitions, it is possible to:

- Add, move, or remove memory dynamically with a granularity of 16 MB regions
- Add, move, or remove physical I/O adapter slots dynamically
- Add or remove virtual I/O adapter slots dynamically

For a dedicated processor partition, it is only possible to add, move, or remove whole processors dynamically. When you remove a processor dynamically from a dedicated partition on a system that uses shared processor partitions, it is then assigned to the shared processor pool.


For shared processor partitions, it is also possible to:

- Remove, move, or add entitled shared processor capacity dynamically
- Change between capped and uncapped processing dynamically
- Change the weight of an uncapped partition dynamically
- Add or remove virtual processors dynamically

A DLPAR operation on a shared processor partition changes processor capacity, which is expressed as a percentage: 100 represents one physical processor, and 180 represents 1.8 processors.


Note: A single DLPAR operation can perform only one type of resource change. You cannot add and remove memory to and from the same partition in a single DLPAR operation. Also, you cannot move processors and memory from one partition to another in a single DLPAR operation.

5.1.1 Processor resources

Figure 5-1 shows the panel for dynamic reconfiguration of processor resources on the HMC. From this panel, you can choose to add, remove, or move your resources. Select the partition that you want to change dynamically, and press the right mouse button. Then choose Dynamic Logical Partitioning → Processor Resources and choose the action that you want to perform.

Figure 5-1 Dynamic Logical Partitioning Processor Resources

From the Add Processor Resources panel shown in Figure 5-2 on page 142, you can specify the processing units and the number of virtual processors that you want to add to the selected partition. The limits for adding processing units and virtual processors are the maximum values that are defined in the partition profile. This panel also allows you to add variable weight when the partition runs in uncapped mode.

Figure 5-2 Add Processor Resources

Additionally, the Add Processor Resources panel allows you to change the partition mode dynamically from uncapped to capped or vice versa. To show the actual status of the partition, use the lparstat -i command from the AIX command line interface of the partition, as shown in the following example:

# lparstat -i

Node Name : applsrv

Partition Name : Apps_Server

Partition Number : 4

Type : Shared-SMT

Mode : Uncapped

Entitled Capacity : 0.30

Partition Group-ID : 32772

Shared Pool ID : 0

Online Virtual CPUs : 2

Maximum Virtual CPUs : 10

Minimum Virtual CPUs : 1

Online Memory : 512 MB

Maximum Memory : 1024 MB

Minimum Memory : 128 MB

Variable Capacity Weight : 128


Minimum Capacity : 0.20

Maximum Capacity : 1.00

Capacity Increment : 0.01

Maximum Dispatch Latency : 16999999

Maximum Physical CPUs in system : 2

Active Physical CPUs in system : 2

Active CPUs in Pool : -

Unallocated Capacity : 0.00

Physical CPU Percentage : 15.00%

Unallocated Weight : 0

Figure 5-3 shows how to change the mode of the partition from uncapped to capped mode. Deselect Uncapped and click OK.

Figure 5-3 Advanced Processor Settings - Uncapped Mode


To verify this dynamic action, use the lparstat -i command on the selected partition again. The partition mode changed from uncapped to capped.

# lparstat -i

Node Name : applsrv

Partition Name : Apps_Server

Partition Number : 4

Type : Shared-SMT

Mode : Capped

Entitled Capacity : 0.30

Partition Group-ID : 32772

Shared Pool ID : 0

Online Virtual CPUs : 2

Maximum Virtual CPUs : 10

Minimum Virtual CPUs : 1

Online Memory : 512 MB

Maximum Memory : 1024 MB

Minimum Memory : 128 MB

Variable Capacity Weight : 128

Minimum Capacity : 0.20

Maximum Capacity : 1.00

Capacity Increment : 0.01

Maximum Dispatch Latency : 16999999

Maximum Physical CPUs in system : 2

Active Physical CPUs in system : 2

Active CPUs in Pool : -

Unallocated Capacity : 0.00

Physical CPU Percentage : 15.00%

Unallocated Weight : 0

Figure 5-4 on page 145 shows the Remove Processing Units panel that allows you to remove processing units and virtual processors dynamically. The limit for the removal of processing units and virtual processors is the minimum value defined in the partition profile.

This panel also allows you to remove variable weight when the partition runs in uncapped mode.


Figure 5-4 Remove Processing Units

When moving processing units, you have to select the partition from which you want the processing units removed and choose the Move Processing Units panel as shown in Figure 5-5 on page 146.


Figure 5-5 Move Processing Units

In the Processing units field, select the amount of processor capacity that you want to remove from the selected partition and move to the partition that you select from the menu under Logical Partition. In this example, 0.7 processing units are to be moved to the Apps_Server partition.

You can also choose to move virtual processors to adjust the number of virtual processors of your partition. This action does not actually move a virtual processor; it removes the specified number of virtual processors from one partition and adds them to the other.
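The same move can also be initiated from the HMC command line interface with the chhwres command. The following invocation is only a sketch: it assumes the POWER5 HMC syntax for chhwres, and the managed system name (p5system) and the source partition name (DB_Server) are hypothetical; option names can differ between HMC releases:

chhwres -r proc -m p5system -o m -p DB_Server -t Apps_Server --procunits 0.7

In this sketch, -r proc selects processor resources, -o m requests a move operation, -p and -t name the source and target partitions, and --procunits gives the amount of processing capacity to move.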

5.1.2 Dynamic partitioning for Virtual Ethernet devices

You can assign and remove virtual Ethernet resources dynamically. On the HMC, you can assign virtual Ethernet adapters to and remove them from a partition using DLPAR. You can also map between physical and virtual resources on the Virtual I/O Server dynamically.


5.1.3 Dynamic partitioning for Virtual SCSI devices

You can assign and remove Virtual SCSI resources dynamically. On the HMC, you can assign and remove Virtual SCSI client and server adapters from a partition using dynamic logical partitioning. You can also map between physical and virtual resources on the Virtual I/O Server dynamically.

5.1.4 Capacity on Demand

Capacity on Demand (CoD) adds operational and configuration flexibility for IBM eServer p5 and pSeries systems. CoD is available in a variety of offerings that allow you to pay when purchased, pay after activation, pay before activation, or pay with a one-time cost.

When activating a processor featured for CoD on a system with defined shared processor partitions, the activated processor is assigned automatically to the shared processor pool. You can then decide to add the processor dynamically to a dedicated processor partition or to add capacity entitlement dynamically to the shared processor partitions.

When the system operates as a full system partition, the processor is added automatically to the system's processor capacity.

To remove a CoD processor (for example, when using On/Off CoD, which enables users to temporarily activate processors), you have to make sure that there are enough processing units to deactivate the processor. You can remove the needed capacity entitlement from the partitions dynamically.

A type of CoD is named Reserve CoD. It represents an autonomic way to activate temporary capacity. Reserve CoD enables the user to place a quantity of inactive processors into the server's shared processor pool, which then become available to the pool's resource manager. When the server recognizes that the base (purchased and active) processors assigned across uncapped partitions have been 100% utilized, and at least 10% of an additional processor is needed, a Processor Day (good for a 24 hour period) is charged against the Reserve CoD account balance. Another Processor Day is charged for each additional processor that is put into use based on the 10% utilization rule. After the 24-hour period elapses and there is no longer a need for the additional performance, no Processor Days are charged until the next performance spike.

DLPAR supports the following dynamic resource changes in a partition without requiring a partition reboot:

- Resource addition
- Resource removal


By performing the resource changes in the following sequence on two partitions in a system, the specified resource can be moved from one partition to another:

1. Resource removal from a partition

2. Resource addition to another partition

This resource movement is implemented as a single task on the HMC, although it is actually composed of two separate tasks on two partitions internally.

Note: A DLPAR operation can perform only one type of resource change. You cannot add and remove memory to and from the same partition in a single DLPAR operation. Also, you cannot move processors and memory from one partition to another in a single DLPAR operation.

Resources that are removed from a partition are marked free (free resources) and are owned by the global firmware of the system. These resources are kept in a free resource pool. You can add free resources to any partition in a system as long as the system has enough free resources.

5.2 The process flow of a DLPAR operation

A DLPAR operation initiated on the HMC is transferred to the target partition through Resource Monitoring and Control (RMC). The request produces a DLPAR event on the partition. After the event has completed, regardless of the result from the event, a notification is returned to the HMC to mark the completion of the DLPAR operation. Thus, a DLPAR operation is considered a single transactional unit, and only one DLPAR operation is performed at a time.

A DLPAR operation is executed in the process flow illustrated in Figure 5-6 on page 149.


Figure 5-6 Process flow of a DLPAR operation

The figure shows the HMC (graphical user interface or command line) communicating over Ethernet through RMC with the IBM.DRM resource manager on the AIX 5L Version 5.2 partition, which invokes the drmgr command and, in turn, platform-dependent commands, the platform-dependent device driver, the kernel, and RTAS. The managed system's global firmware (hypervisor) and the CSP are reached over the serial line. Labeled arrows: A: resource query and allocate requests to the CSP before the DLPAR operation over the serial line. B: DLPAR operation request via RMC from the HMC. C: DLPAR operation result from the partition. D: resource reclaim request to the CSP after the DLPAR operation over the serial line.

The following steps explain the process flow of a DLPAR operation:

1. The system administrator initiates a DLPAR operation request on the HMC using either the graphical user interface or command line interface.

2. The requested DLPAR operation is verified on the HMC with the current resource assignment to the partition and free resources on the managed system before being transferred to the target partition. In other words, the HMC provides the policy that determines whether a DLPAR operation request is actually performed on the managed system. The policy is determined by the partition profile. If the request is a resource addition, the HMC communicates with the global firmware to allocate free resources to the target partition through the service processor.

3. If enough free resources exist on the system, the HMC assigns the requested resource to the specified partition, updates the partition’s object to reflect this addition, and then creates associations between the partition and the resource to be added.


4. After the requested DLPAR operation has been verified on the HMC, it is transferred to the target partition using RMC, which is an infrastructure implemented on both the HMC and AIX partitions, as indicated by arrow B in Figure 5-6 on page 149. RMC is used to provide a secure and reliable connection channel between the HMC and the partitions.

Note: The connection channel established by RMC only exists between the HMC and the partition where the DLPAR operation is targeted. There are no connection paths required between partitions for DLPAR operation purposes.

5. The request is delivered to the IBM.DRM resource manager running on the partition, which is in charge of the DLPAR function in the RMC infrastructure in AIX. As shown in the following example, the IBM.DRM resource manager is running as the IBM.DRMd daemon process and included in the devices.chrp.base.rte fileset on AIX 5L Version 5.2 or later:

# lssrc -ls IBM.DRM

Subsystem : IBM.DRM

PID : 18758

Cluster Name : IW

Node Number : 1

Daemon start time : Wed Aug 21 16:44:12 CDT 2002

Information from malloc about memory use:

Total Space : 0x003502c0 (3474112)

Allocated Space: 0x0030b168 (3191144)

Unused Space : 0x00043e40 (278080)

Freeable Space : 0x00000000 (0)

Class Name(Id) : IBM.DRM(0x2b) Bound

# ps -ef | head -1 ; ps -ef | grep DRMd | grep -v grep

UID PID PPID C STIME TTY TIME CMD

root 18758 10444 0 Aug 21 - 0:22 /usr/sbin/rsct/bin/IBM.DRMd

# lslpp -w /usr/sbin/rsct/bin/IBM.DRMd

File Fileset Type

---------------------------------------------------------------------------

/usr/sbin/rsct/bin/IBM.DRMd

devices.chrp.base.rte File

Note: The absence of the IBM.DRM resource manager in the lssrc -a output does not always mean that the partition has not been configured appropriately for DLPAR. The resource manager is configured automatically and started by RMC after the first partition reboot, if the network configuration is correctly set up on the partition and the HMC.
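As a quick check, you can confirm on the partition that the resource manager is active before attempting a DLPAR operation. The following is a simple sketch using the standard SRC commands; the first line filters the full subsystem list, and the second queries the detailed status shown above:

# lssrc -a | grep IBM.DRM
# lssrc -ls IBM.DRM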


6. The IBM.DRM resource manager invokes the drmgr command, which is a platform-independent command designed as the focal point of the dynamic logical partitioning support on AIX.

As shown in the following example, the drmgr command is installed in the /usr/sbin directory and provided by the bos.rte.methods fileset:

# whence drmgr

/usr/sbin/drmgr

# lslpp -w /usr/sbin/drmgr

File Fileset Type

---------------------------------------------------------------------------

/usr/sbin/drmgr bos.rte.methods File

Note: The drmgr command should not be invoked by the system administrator in order to directly perform resource changes in a partition. It must be invoked in the context explained here to do so. In “How to manage DLPAR scripts” on page 186, another usage of the drmgr command is provided.

7. The drmgr command invokes several platform-dependent commands depending on the resource type (processor, memory, or I/O resource) and request (resource addition or removal) in order to instruct the kernel to process the actual resource change with the necessary information.

8. The kernel performs many tasks, as described in 5.3, “Internal activity in a DLPAR event” on page 152.

9. After the DLPAR event has completed, regardless of the result, a notification is returned to the HMC to mark the completion of the DLPAR operation, indicated as arrow C in Figure 5-6 on page 149. The notification also includes the exit code, standard out, and standard error from the drmgr command. The system administrator who initiated the DLPAR operation sees the exit code and outputs on the HMC.

If the request is a resource removal, the HMC communicates with the global firmware in order to reclaim the resource(s) to the shared or dedicated free resource pool from the source partition through the service processor, indicated as arrow D in Figure 5-6 on page 149. The HMC unassigns the resource from the partition and updates the partition’s object to reflect this removal, and then removes associations between the partition and the resource that was just removed.

A DLPAR operation can take noticeable time depending on the availability and the capability to configure or deconfigure a specific resource.


5.3 Internal activity in a DLPAR event

The AIX kernel communicates with the partition firmware through Run-Time Abstraction Services (RTAS). The partition firmware manages the resources in the partition. The resources are represented in the Open Firmware device tree that serves as a common reference point for the operating system and firmware. RTAS operates on objects represented in this database.

Each AIX partition has a private copy of the Open Firmware device tree that reflects the resources that are actually assigned to the partition and those that might be assigned in the future. Structurally, it is organized like a file system with directories and files, where the files represent configured instances of resources, and the directories provide the list of potential assignments. Each installed resource is represented in this list, and these entries are individually called dynamic reconfiguration connectors.

5.3.1 Internal activity for processors and memory in a DLPAR event

As described previously, the drmgr command handles all DLPAR operations by calling the appropriate commands and controls the process of the reconfiguration of resources.

The following briefly describes the kernel internal activity for processors and memory in a DLPAR event.

1. The Object Data Manager (ODM) lock is taken to guarantee that the ODM, Open Firmware device tree, and the kernel are atomically updated. This step can fail if the ODM lock is held for a long time and the user indicates that the DLPAR operation should have a time limit.

2. The platform-dependent command reads the Open Firmware device tree.

3. The platform-dependent command invokes the kernel to start the DLPAR event. The following steps are taken:

a. Requesting validation.

b. Locking the DLPAR event. Only one event can proceed at a time.

c. Saving the request in the global kernel DR structure that is used to pass information to signal handlers, which run asynchronously to the platform-dependent command.

d. Starting the check phase.

4. The check phase scripts are invoked.

5. The check phase signals are sent, conditional wait if signals were posted.


6. The check phase kernel extension callout. Callback routines of registered kernel extensions are called.

The event might fail in steps 4, 5, or 6 if any check phase handler signals an error. After the check phase has passed without an error, and the DLPAR event is in the pre phase, all pre phase application handlers will be called, even if they fail, and the actual resource change is attempted.

7. The kernel marks the start of the pre phase.

8. Pre-phase scripts are invoked.

9. Pre-phase signals are sent, conditional wait if signals were posted.

10. The kernel marks the doit phase start. This is an internal phase where the resource is either added to or removed from the kernel.

Steps 11-13 can be repeated depending on the request. Processor-based requests never loop; only one shared or dedicated processor can be added or removed at a time in one DLPAR operation. If more than one shared or dedicated processor needs to be added or removed, the HMC invokes AIX once for each processor.

Memory-based requests loop at the LMB level, where each LMB represents a contiguous 16 MB segment of logical memory, until the entire user request has been satisfied. The HMC remotely invokes AIX once for the complete memory request.

11. This step is only taken if adding a resource. The Open Firmware device tree is updated. The resource is allocated and un-isolated, and the connector is configured. When un-isolating the resource, it is assigned to the partition, and ownership is transferred from Open Firmware to AIX:

- For processors, the identity of the global and local interrupt service is discovered.

- For memory, the logical address and size are discovered.

12. Invoke the kernel to add or remove the resource:

a. The callback functions of registered kernel extensions are called. Kernel extensions are told the specific resource that is being removed or added.

b. The resources in the kernel are removed or added.

c. The kernel extensions in the post or posterror phase are invoked.

If step a or b fails, the operation fails.

13. This step is only taken if removing a resource. The Open Firmware device tree is updated. Resources are isolated and unallocated for removal. The Open Firmware device tree must be kept updated so that the configuration methods can determine the set of resources that are actually configured and owned by the operating system.


14. The kernel marks the post (or posterror) phase start, depending on the success of the previous steps.

15. Invoke configuration methods so that DLPAR-aware applications and registered DLPAR scripts will see the state change in the ODM.

16. The post scripts are invoked.

17. The post signals are sent to registered processes, conditional wait if signals were posted.

18. The kernel clears the DLPAR event.

19. ODM locks are released.

5.3.2 Internal activity for I/O slots in a DLPAR event

Dynamic removal and addition of I/O adapters has been provided by AIX prior to DLPAR support, utilizing the PCI adapter Hot Plug capability on the IBM RS/6000 and IBM pSeries server models. To allow for the dynamic addition and removal of PCI I/O slots, enhancements to the lsslot command have been made in AIX 5L Version 5.2.

PCI slots and integrated I/O devices can be listed using the new connector type slot in the lsslot command, as shown in the following example:

# lsslot -c slot

The output of this command looks similar to the following:

#Slot Description Device(s)

U1.5-P1-I1 DLPAR slot pci13 ent0

U1.5-P1-I2 DLPAR slot pci14 ent1

U1.5-P1-I3 DLPAR slot pci15

U1.5-P1-I4 DLPAR slot pci16

U1.5-P1-I5 DLPAR slot pci17 ent2

U1.5-P1/Z1 DLPAR slot pci18 scsi0

Before the I/O slot removal, you must delete the PCI adapter device and all its child devices from AIX. Given that ent2 in the slot U1.5-P1-I5 in the previous example is not used, the devices could be removed using the following command as the root user on the partition.

# rmdev -l pci17 -d -R

After the devices have been removed from AIX, the I/O slot can be removed from the partition using the graphical user interface or command line interface on the HMC.

Note: Any PCI slots defined as required are not eligible for DLPAR operations.


To let AIX recognize the dynamically added I/O slot and its child devices in a partition, you must invoke the cfgmgr command as the root user on the partition.

To add a previously removed I/O slot back to a partition, it first needs to be reassigned to the partition using the HMC.
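For example, after the slot has been reassigned to the partition on the HMC, the following sequence (a sketch using the commands shown in this section) configures the devices and verifies that the slot and its children are visible again:

# cfgmgr
# lsslot -c slot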

5.4 DLPAR-safe and DLPAR-aware applications

The dynamic logical partitioning function was first introduced in AIX 5L Version 5.2 and was designed and implemented so as not to impact existing applications. In fact, most applications are not affected by the results of any DLPAR operations. Therefore, those applications are called DLPAR-safe applications.

There are two types of application classifications regarding DLPAR operations:

DLPAR-safe    Applications that do not fail as a result of DLPAR operations. The application's performance can suffer when resources are removed, or it might not scale as resources are added.

DLPAR-aware   Applications that incorporate DLPAR operations, allowing the application to adjust its use of system resources to match the actual capacity of the system. DLPAR-aware applications are always DLPAR-safe.

5.4.1 DLPAR-safe

Although most applications are DLPAR-safe without requiring any modification, there are certain instances where programs might not be inherently DLPAR-safe.

There are two cases where DLPAR operations can introduce undesirable effects in the application:

- Programs that are optimized for uni-processors can have problems when a processor is added to the system resources.

- On programs that are indexed by processor numbers, the increased processor number can cause the code to go down an unexpected code path during its run-time checks.

In addition, applications that use uni-processor serialization techniques can experience unexpected problems. In order to resolve these concerns, system administrators and application developers need to be aware of how their applications get the number of processors.
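For reference, the following commands show two common ways for an administrator or a script to see how many processors the partition currently has online; this is a sketch using standard AIX commands, and applications typically obtain the same information programmatically, for example through sysconf():

# lsdev -Cc processor | grep -c Available
# bindprocessor -q

The first command counts the processor devices in the Available state, and the second lists the logical processor IDs to which threads can be bound.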


5.4.2 DLPAR-aware

DLPAR-aware applications adapt to system resource changes that are caused by DLPAR operations. When these operations occur, the application recognizes the resource change and accommodates it accordingly.

You can use the following techniques to make applications DLPAR-aware:

- Consistently poll for system resource changes. Polling is not the recommended way to accommodate DLPAR operations, but it is valid for systems that do not need to be tightly integrated with DLPAR. Because the resource changes might not be discovered immediately, an application that uses polling can have limited performance. Polling is not suitable for applications that deploy processor bindings, because they represent hard dependencies. A minimal polling sketch is shown after Table 5-1.

- Applications have other methods to react to the resource change caused by DLPAR operations. See “Integrating a DLPAR operation into the application” on page 157.

- Several applications should be made DLPAR-aware because they need to scale with the system resources. These types of applications can increase their performance by becoming DLPAR-aware. Table 5-1 lists some examples of applications that should be made DLPAR-aware.

Note: These are only a few types of common applications affected by DLPAR operations. The system administrator and application developer should be sensitive to other types of programs that might need to scale with resource changes.

Table 5-1 Applications that should be DLPAR-aware

Application type        Reason
Database applications   The application needs to scale with the system. For example, the number of threads might need to scale with the number of available processors, or the number of large pinned buffers might need to scale with the available system memory.
Licence Managers        Licenses are distributed based on the number of available processors or the memory capacity.
Workload Managers       Jobs are scheduled based on system resources, such as available processors and memory.
Tools                   Certain tools might report processor and memory statistics or rely on available resources.
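The following is a minimal polling sketch in ksh for the first technique in the list above. It is illustrative only: reconfigure_app is a hypothetical helper that would re-tune thread pools, buffers, or licence counts, and the lsdev and lsattr invocations are one possible way to read the current processor and memory configuration:

#!/usr/bin/ksh
# Poll the partition configuration and re-tune the application when it changes.
last_cpus=""
last_mem=""
while true
do
    # Number of processors currently in the Available state
    cpus=$(lsdev -Cc processor | grep -c Available)
    # Usable physical memory in KB, as reported by the sys0 device
    mem=$(lsattr -El sys0 -a realmem -F value)
    if [ "$cpus" != "$last_cpus" ] || [ "$mem" != "$last_mem" ]
    then
        reconfigure_app "$cpus" "$mem"   # hypothetical re-tuning helper
        last_cpus=$cpus
        last_mem=$mem
    fi
    sleep 60
done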


5.5 Integrating a DLPAR operation into the application

The DLPAR operation can be integrated into the application using the following methods:

- Script-based DLPAR event handling

If the application is controlled externally to use a specific number of threads or to size its buffers, use this method. In order to facilitate this method, a new command, drmgr, is provided. The drmgr command is the central focal point of the DLPAR function of AIX. The following several sections discuss the drmgr command, and typical usage examples are provided in “How to manage DLPAR scripts” on page 186.

See “Script-based DLPAR event handling” on page 160 for more information.

- API-based DLPAR event handling

If the application is directly aware of the system configuration, and the application source code is available, use this method.

See “API-based DLPAR event handling” on page 194 for more information.

Applications can monitor and respond to various DLPAR events, such as a memory addition or processor removal, by using these two methods. Although both methods share the same DLPAR event flow at a high level, several key differences exist between them.

One difference is that the script-based method externally reconfigures the application once a DLPAR event takes place, while the API-based method can be directly integrated into the application by registering a signal handler so that the process can be notified with the SIGRECONFIG signal when the DLPAR event occurs.

Note: DLPAR events for I/O resources do not notify applications.

5.5.1 Three phases in a DLPAR event

A DLPAR event executes in three phases: check, pre, and post. Each phase is an atomic execution unit and is executed in its entirety before the next phase is started, preventing partial updates to the system. In the pre and post phases, the state of the application is permitted to change. The operating system only acts upon DLPAR requests between the pre and post phases to perform the actual resource change.


Note: If a dynamic processor deallocation occurs in a partition that is running AIX 5L Version 5.2 or later, it is also treated as a processor removal DLPAR event, and thus invokes these three phases.

Figure 5-7 illustrates the three phases and the order in which they occur for a DLPAR event.

Figure 5-7 Three DLPAR phases of a DLPAR event

The flow in the figure is: the check phase runs first unless the force option was specified on a removal request, in which case it is skipped; if the check phase succeeds, the pre phase runs, then the actual resource change, and finally the post phase.

Check phase

The check phase usually occurs first. It is used to examine the resource’s state and to determine if the application can tolerate a DLPAR event. It gives the script or API a chance to fail the current DLPAR operation request without changing any system state.


Note: In a resource removal DLPAR event, the check phase is skipped if the force option is specified. In a resource addition DLPAR event, the check phase is not skipped regardless of the force option value.

The check phase can be used in several situations, including the following:

- To determine if a processor cannot be removed because it still has threads bound to it.

- By a licence manager to fail the integration of a new processor to the partition because it does not have a valid licence to support the addition of a processor.

- To maintain an application's DLPAR safeness by restricting the effects of DLPAR operations. For instance, if the application is optimized for a uniprocessor environment, the check phase could prevent the application from recognizing the addition of a processor, which could prevent the application from executing an unexpected code path with the presence of additional processors.

Note: If a DLPAR script exits with failure from the check phase, the DLPAR event will not continue. Therefore, the resource change is not performed, and the DLPAR script is not invoked in the pre and post phases.

Pre phase

Before the actual resource change is made, the application is notified that a resource change (addition or removal) is about to occur. The application is given a chance to prepare for the DLPAR request in the pre phase.

When the application is expecting a resource removal, the DLPAR script or API needs to carefully utilize this phase. This phase handles such things as unbinding processors, detaching pinned shared memory segments, removing plocks, and terminating the application if it does not support DLPAR or will be broken by DLPAR requests.

Note: The actual resource change takes place between the pre and post phases.

Post phase

After a resource change has occurred, the application will have a chance to respond to the DLPAR operation. The application can reconfigure itself in the post phase in order to take advantage of the resource addition or to compensate for the resource removal.


If resources are added, the DLPAR script or API could create new threads or attach to pinned shared memory segments. On the other hand, if resources are removed, the DLPAR scripts or API calls might delete threads for scalability.

5.5.2 Event phase summary

When a DLPAR request is made to change resource configurations in a partition, the drmgr command notifies applications of the pending resource change.

Table 5-2 summarizes the phases of a DLPAR event and some important considerations of what needs to be accomplished in each phase.

Table 5-2 Considerations during each event phase

Phase   Considerations
Check   - Can the application support the request?
        - Are there licence restrictions?
        - Can the system withstand this application failing?
Pre     - Is it best to stop the application and then restart it after the DLPAR operation?
        - How can the application help facilitate a DLPAR removal or addition?
        - What can the application eliminate or reduce when a resource is removed? (that is, kill threads)
Post    - Does the application need to be restarted after the DLPAR operation?
        - How can the application take advantage of added resources? (that is, start new threads)
        - Did the operation complete? Was there a partial success?

5.6 Script-based DLPAR event handling

The script-based DLPAR event handling method is performed by several components, as explained in the following (see Figure 5-8 on page 162):

1. A DLPAR operation request is initiated using either the graphical user interface or command line interface on the HMC.

2. The request is transferred to the target partition through RMC. The IBM.DRM resource manager on the partition receives this request.

3. The IBM.DRM resource manager invokes the drmgr command with the necessary information that represents a DLPAR event.


4. The drmgr command invokes registered DLPAR scripts depending on the resource type (processor or memory) that is specified by the DLPAR event. The information about the registered DLPAR scripts is kept in the DLPAR script database and fetched by the drmgr command.

5. The invoked DLPAR scripts perform the necessary tasks that integrate the DLPAR operation into the application.

The DLPAR scripts should satisfy the application’s demands when a DLPAR event takes place so that the application can take the appropriate actions.

Therefore, DLPAR scripts must be carefully developed and tested to ensure the applications' DLPAR awareness.

The DLPAR script can use the following commands in order to resolve the application processes’ resource dependency:

ps             To display bindprocessor attachments and plock system call status at the process level.
bindprocessor  To display online processors and make new attachments.
kill           To send signals to processes.
ipcs           To display pinned shared memory segments at the process level.
lsrset         To display processor sets.
lsclass        To display Workload Manager (WLM) classes, which might include processor sets.
chclass        To change WLM class definitions.
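As an illustration of how a script might use these commands, the following sketch lists the partition's online processors and the per-thread processor bindings that a prerelease handler would have to resolve before a processor can be removed; the BND column of the AIX THREAD output format shows the bound processor, if any:

# bindprocessor -q
# ps -emo THREAD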


Figure 5-8 A DLPAR script invoked by the drmgr command

The figure shows the IBM.DRMd daemon receiving the DLPAR operation request through RMC from the HMC and spawning the drmgr command. The drmgr command fetches the registered DLPAR script information from the dr_script database, interacts with the syslog facility for debug information, and spawns the DLPAR script that acts on the application. The script receives its input as additional command arguments and name-value pairs (environment variables) and returns its output as an exit value and name-value pairs on the standard out.

5.6.1 Script execution environment

When DLPAR scripts are invoked by the drmgr command, it sets up the following script execution environment. The required information to set up this environment is taken from the DLPAR script database and the DLPAR event.

- The UID and GID of the execution process are set to the ones of the DLPAR script.
- The current working directory is changed to /tmp.
- The PATH environment variable is set to /usr/bin:/etc:/usr/sbin.
- Two pipes are established between drmgr and the executing process so that the process reads using the standard in from the drmgr command and writes using the standard out to the drmgr command.

As illustrated in Figure 5-8, the execution environment defines the input and output for the DLPAR script process.


When the DLPAR script is invoked, the DLPAR script process receives its input in the following two ways:

- Additional command line arguments

When a DLPAR script is called, the drmgr command invokes it as follows:

dr_application_script <sub-command> <additional_cmd_arg>

In addition to the subcommands, which are explained in “DLPAR script subcommands” on page 167, additional command arguments can be passed to the script.

- Environment variables with a specified format

When a DLPAR script is called, the drmgr command passes several environment variables using a name-value pair format.

Environment variables that start with DR_ are primarily used to send input data to DLPAR scripts; therefore, they should be exclusively set aside for the drmgr command.

There are three types of environment values:

– General environment values (see Table 5-3 on page 164)

– Processor-specific environment values (Table 5-4 on page 165)

– Memory-specific environment values (Table 5-5 on page 166)

Note: These environment variables only exist during DLPAR events. If you want to view these variable values, the script needs to be coded to write these variables to the standard out using DR_LOG_* variables so that the drmgr command can forward the output to the syslog facility (see Table 5-6 on page 166).

The DLPAR script process produces its output in the following two ways:

- Exit values

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, then the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

- Standard out with a specified format

The DLPAR scripts can write strings using a name-value pair format to the standard out. The drmgr command reads them from the scripts. Strings that start with DR_ are primarily used to send output data from the DLPAR scripts to the drmgr command.


Note: The script should not print the following to the standard out:

- A string whose length is larger than 1024 characters.
- A string that contains newline characters.
- Any strings that are not defined in Table 5-6 on page 166, Table 5-9 on page 171, Table 5-10 on page 173, and Table 5-11 on page 174.

Input (environment variables)

Table 5-3 shows general environment variables.

Table 5-3 General DLPAR environment variables

Environment variable   Description
DR_DETAIL_LEVEL=N      This name-value pair instructs the script to produce the specified level of detailed debug information sent to the standard out. The value of N must be one of the following:
                       0 - None
                       1 - Min
                       2 - Medium/more
                       3 - Max
                       4 - Debug
DR_FORCE=emergency     This name-value pair gives the emergency processing request to the script. The value of emergency must be one of the following:
                       TRUE - Emergency processing is required.
                       FALSE - Emergency processing is not required (default).

Note: The DR_DETAIL_LEVEL=N environment value is set on the HMC. If you use the graphical user interface, select Detail level in the DLPAR operation panel. If you use the command line interface, use the -d option of the chhwres command to set the value.


Table 5-4 shows processor-specific environment variables.

Table 5-4 Processor-specific DLPAR environment variables

Processor environment variable   Description
DR_LCPUID=N                      The logical CPU ID of the processor that is being added or removed. N is a decimal number.
DR_BCPUID=N                      The bind CPU ID of the processor that is being added or removed. N is a decimal number.
DR_CPU_CAPACITY=N                The entitled processor capacity of the partition before the request was made. Capacity is not expressed as a fraction; it is expressed as a percentage, where 100 represents one physical processor and 180 represents the power of 1.8 processors.
DR_CPU_CAPACITY_DELTA=N          The requested change in entitled processor capacity, expressed in the same percentage notation.
DR_VAR_WEIGHT=N                  The variable capacity weight of the partition before the request was made.
DR_VAR_WEIGHT_DELTA=N            The requested change in the variable capacity weight.

The environment variables DR_CPU_CAPACITY and DR_VAR_WEIGHT represent the value of the partition attribute before the request was made, so the script has to internally add or subtract the delta to determine the result of the request.
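For example, a script handling a capacity addition can combine these variables to compute the entitled capacity that the partition will have after the request completes. The following ksh fragment is a sketch only; for a removal event the delta would be subtracted instead:

# Entitled capacity after the addition, still expressed as a percentage
# (100 = one physical processor)
new_capacity=$(( DR_CPU_CAPACITY + DR_CPU_CAPACITY_DELTA ))
echo "DR_LOG_INFO=entitled capacity after this request will be $new_capacity"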


Table 5-5 shows the memory-specific environment variables.

Table 5-5 Memory-specific DLPAR environment variables

Memory environment variable   Description
DR_MEM_SIZE_REQUEST=N         Size of memory requested in megabytes. N is a decimal value.
DR_MEM_SIZE_COMPLETED=N       Number of megabytes that were successfully added or removed. N is a decimal value.
DR_FREE_FRAMES=N              Number of free frames currently in the system. Each frame is a 4 KB page. N is a 32-bit hexadecimal value.
DR_PINNABLE_FRAMES=N          Total number of pinnable frames currently in the system. Each frame is a 4 KB page. N is a 32-bit hexadecimal value.
DR_TOTAL_FRAMES=N             Total number of frames in the system. Each frame is a 4 KB page. N is a 32-bit hexadecimal value.

Output (standard out)

Table 5-6 shows general output variables. The DR_ERROR=failure_cause name-value pair is a mandatory output when the script exits with 1 (failure).

Table 5-6 General DLPAR output variables

Variable                  Description
DR_ERROR=failure_cause    This name-value pair describes the reason for failure. It is required only if the script exits with 1.
DR_LOG_ERR=message        This name-value pair describes the information message to be sent to the syslog facility with the err (LOG_ERR) priority.
DR_LOG_WARNING=message    This name-value pair describes the information message to be sent to the syslog facility with the warning (LOG_WARNING) priority.
DR_LOG_INFO=message       This name-value pair describes the information message to be sent to the syslog facility with the info (LOG_INFO) priority.
DR_LOG_EMERG=message      This name-value pair describes the information message to be sent to the syslog facility with the emerg (LOG_EMERG) priority.
DR_LOG_DEBUG=message      This name-value pair describes the information message to be sent to the syslog facility with the debug (LOG_DEBUG) priority.

Note: Except for the DR_ERROR variable, the other variables are used to send messages to the syslog facility.
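Putting the input and output conventions together, the following is a minimal ksh skeleton of a DLPAR script. It is a sketch only: the resource checks are left as comments, and the vendor string, dates, and usage text are placeholders, but the subcommand dispatching and the name-value pair output follow the conventions described above and the subcommands described in “DLPAR script subcommands” on page 167:

#!/usr/bin/ksh
# Minimal DLPAR script skeleton (illustrative only).
# Input arrives as a subcommand argument plus DR_* environment variables;
# output is an exit code plus DR_* name-value pairs on standard out.

ret=0

case "$1" in
scriptinfo)
    echo "DR_VERSION=1"
    echo "DR_DATE=18102002"
    echo "DR_SCRIPTINFO=Example DLPAR script skeleton"
    echo "DR_VENDOR=Example vendor"
    ;;
register)
    # This script handles both processor and memory events.
    echo "DR_RESOURCE=cpu"
    echo "DR_RESOURCE=mem"
    ;;
usage)
    echo "DR_USAGE=Example description of how resource $2 is used by the application"
    ;;
checkrelease | checkacquire)
    # Examine the application's dependency on the resource named in $2.
    # Exit 1 (with DR_ERROR) to veto the operation, for example when threads
    # are still bound to the departing processor.
    ;;
prerelease | preacquire)
    # Prepare the application: unbind processors, detach pinned shared
    # memory segments, or quiesce threads as needed.
    ;;
postrelease | postacquire)
    # The resource change is complete: restart or re-tune the application.
    echo "DR_LOG_INFO=resource change for $2 completed"
    ;;
undoprerelease | undopreacquire)
    # The operation failed or was partially completed: undo the pre phase.
    ;;
*)
    echo "DR_ERROR=unknown subcommand $1"
    ret=1
    ;;
esac

exit $ret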

5.6.2 DLPAR script naming convention

When developing a DLPAR script, you should follow a few simple naming conventions. It is preferable to name the script using prefixes that describe the vendor name and the subsystem that it controls.

For example, dr_ibm_wlm.pl would be a good name for a DLPAR Perl script that was written by IBM to control the WLM assignments. WLM is a standard function of AIX to prioritize multiple processes depending on the predefined attributes.

Another example is dr_sysadmin_wlm.pl. This name could be a DLPAR Perl script provided by a system administrator to control the WLM assignments.

5.7 DLPAR script subcommands

Every DLPAR script is required to accept all the subcommands found in Table 5-7 on page 168. This section provides detailed information for each subcommand.

Note: The prefix names for these subcommands (check, pre, and post) coincide with the DLPAR phases that are explained in “Script-based DLPAR event handling” on page 160.


Table 5-7 DLPAR script subcommands

Subcommand name                 Description
scriptinfo                      Identifies script-specific information. It must provide the version, date, and vendor information. This command is called when the script is installed.
register                        Identifies the resources managed by the script, such as cpu or mem.
usage resource_name             Returns a description of how the script plans to use the named resources. It contains pertinent information so that the user can determine whether or not to install the script. Further, the command describes the software capabilities of the applications that are impacted.
checkrelease resource_name      This subcommand is invoked when the drmgr command initiates the release of the specified resource. The script checks the resource dependencies of the application and evaluates the effects of resource removal on the application the script is monitoring. The script can indicate that the resource should not be removed if the application is not DLPAR-aware or if the resource is critical for the subsystem.
prerelease resource_name        Before the removal of the specified resource, this subcommand is invoked. The script uses this time to remove any dependencies the application can have on the resource. This command can reconfigure, suspend, or terminate the application such that the named resource can be released.
postrelease resource_name       After the resource is removed successfully, this subcommand is invoked. The script can perform any necessary cleaning up, or it can restart the application if it stopped the application in the prerelease phase.
undoprerelease resource_name    This subcommand is invoked if an error occurs while the resource is being released. The script takes the necessary steps to undo its prerelease operations on the resource and the application. In the case of a partial resource release, this command reads the environment variables to determine the level of success before the failure.
checkacquire resource_name      This subcommand is invoked to determine if the drmgr command can proceed with a resource addition to the application.
preacquire resource_name        This subcommand tells the application that a resource will be available for use.
postacquire resource_name       This subcommand informs the drmgr command that the resource addition completed, and the script allows the application to use the new resources. If the application was stopped in the preacquire phase, the application is restarted in this command.
undopreacquire resource_name    This subcommand notifies the drmgr command that the resource addition aborted or partially completed. The script then makes the necessary changes to undo anything it did in the preacquire phase, or the script determines the level of success of the DLPAR addition request by reading the environment variables.

5.7.1 The scriptinfo subcommand

When a script is first installed, the script is invoked with the scriptinfo subcommand by the drmgr command. The scriptinfo subcommand displays useful information to identify the script, such as the development date and the vendor name, in order to let the drmgr command record appropriate information about the script in the DLPAR script database. The scriptinfo subcommand is also called by the drmgr command in the very early stage of a DLPAR operation request.

When the script is invoked with the scriptinfo subcommand, it takes the following syntax:

dr_application_script scriptinfo

Input to the scriptinfo subcommand

The scriptinfo subcommand takes the following input data:

- Additional command line arguments

None.

- Name-value pairs from environment variables

See Table 5-8 on page 170.


Output from the scriptinfo subcommand

The scriptinfo subcommand produces the following output data:

- Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

- Name-value pairs to the standard output stream

Table 5-8 lists the name-value pairs that must be returned by the script when it is invoked with the scriptinfo subcommand.

Table 5-8 Required output name-value pairs for the scriptinfo subcommand

Required output pair            Description
DR_VERSION=1                    This name-value pair indicates the version level of the script that specifies the compatibility level of the DLPAR script with respect to the DLPAR implementation version of AIX. On AIX 5L Version 5.2, the version must be set to 1, which indicates that the script is compatible with DLPAR implementation Version 1.
DR_DATE=DDMMYYYY                This name-value pair is the publication date of the script. The format should be DDMMYYYY, where DD=days, MM=months, and YYYY=year. For example, a valid date would be 08102002, which is October 8, 2002.
DR_SCRIPTINFO=description       This name-value pair contains a description of the script's functionality. This string should be a brief human-readable message.
DR_VENDOR=vendor_information    This name-value pair indicates the vendor name and related information. This string can also be used to highlight the application represented by the script.

In addition to Table 5-8, Table 5-9 on page 171 lists the optional name-value pair that can be returned by the script when it is invoked with the scriptinfo subcommand. If the script needs more processing time for its execution, it prints the timeout value, explained in Table 5-9, to the standard out so that the drmgr command can read the appropriate timeout value for this script.


Table 5-9 Optional output name-value pair for the scriptinfo subcommand

Optional output pair             Description
DR_TIMEOUT=timeout_in_seconds    This name-value pair indicates the timeout value in seconds of all DLPAR operations done in this script. The default timeout is 10 seconds. This timeout can be overridden by the -w flag of the drmgr command. The drmgr command waits for the timeout before it sends a SIGABRT to the script. After waiting 1 more second for the script to gracefully end, it will send a SIGKILL. A value of zero (0) disables the timer.

Example

In Example 5-1, two sample DLPAR scripts are registered, dr_test.sh and dr_test.pl. The emphasized lines in this example show the information recorded in the DLPAR script database. The information was derived from the script output with the scriptinfo subcommand upon the script registration (see Table 5-8 on page 170).

Also, the two fields, Script Timeout and Admin Override Timeout, correspond to the values specified by the DR_TIMEOUT value and the -w option, respectively (see Table 5-9).

Example 5-1 Registered sample DLPAR scripts

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts/all

Syslog ID: DRMGR

------------------------------------------------------------

/usr/lib/dr/scripts/all/dr_test.sh

DLPAR ksh example script

Vendor:IBM, Version:1, Date:10182002

Script Timeout:10, Admin Override Timeout:0

Resources Supported:

Resource Name: cpu Resource Usage: cpu binding for performance

Resource Name: mem Resource Usage: Shared(Pinned) memory for app XYZ

------------------------------------------------------------

/usr/lib/dr/scripts/all/dr_test.pl

DLPAR Perl example script

Vendor:IBM Corp., Version:1, Date:04192002

Script Timeout:5, Admin Override Timeout:0

Resources Supported:

Resource Name: cpu Resource Usage: Testing DLPAR on CPU removal

Resource Name: mem Resource Usage: Testing DLPAR on MEM removal

------------------------------------------------------------
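The scripts in Example 5-1 were installed into the DLPAR script database beforehand. Installation and removal are covered in “How to manage DLPAR scripts” on page 186; as a preview, registering a script with the drmgr command typically looks like the following sketch, where the script path is hypothetical and the -i (install) flag is assumed here:

# drmgr -i /tmp/dr_test.sh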


5.7.2 The register subcommand

When the script is invoked with the register subcommand by the drmgr command, the script is registered into the DLPAR script database. The register subcommand also informs the drmgr command about the resource type (processor or memory) that the script is designed to handle.

When the script is invoked with the register subcommand, it takes the following syntax:

dr_application_script register

Input to the register subcommand

The register subcommand takes the following input data:

- Additional command line arguments

None.

- Name-value pairs from environment variables

See Table 5-10 on page 173.

Output from the register subcommand

The register subcommand produces the following output data:

- Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

- Name-value pairs to the standard output stream

Table 5-10 on page 173 lists the name-value pair that must be returned by the script when it is invoked with the register subcommand.


Table 5-10 Required output name-value pair for the register subcommand

Required output pair         Description
DR_RESOURCE=resource_name    This string identifies the resource type that the DLPAR script is designed to handle. The valid resource type names are:
                             - cpu
                             - capacity = capacity changes to entitled processor capacity
                             - var_weight = changes to the variable capacity weight
                             - mem
                             If a script needs to handle both processor and memory resource types, the script prints the following two lines:
                             DR_RESOURCE=cpu
                             DR_RESOURCE=mem

Optionally, the script can return the name-value pairs listed in Table 5-9 on page 171.

Note: The resource types, capacity and var_weight, have been added to support shared partitions and virtual processors in eServer p5 servers.

Example

The emphasized fields in the following example are extracted from Example 5-1 on page 171. The fields show the information that is recorded in the DLPAR script database. The information was derived from the script output with the register subcommand upon the script registration (see Table 5-10).

Resources Supported:

Resource Name: cpu Resource Usage: Testing DLPAR on CPU removal

Resource Name: mem Resource Usage: Testing DLPAR on MEM removal

5.7.3 The usage subcommand

The main purpose of the usage subcommand is to tell you which resource type (processor or memory) the script is designed to handle. The usage subcommand is also called by the drmgr command in the very early stage of a DLPAR operation request for information purposes only.

When the script is invoked with the usage subcommand, it takes the following syntax:

dr_application_script usage <resource_type>


Input to the usage subcommand
The usage subcommand takes the following input data:

򐂰 Additional command line arguments

The usage subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

See Table 5-3 on page 164.

Output from the usage subcommand
The usage subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

򐂰 Name-value pairs to the standard output stream

Table 5-11 lists the name-value pair that must be returned by the script when it is invoked with the usage subcommand.

Table 5-11 Required output name-value pair for the usage subcommand

DR_USAGE=usage_description
  This name-value pair contains a human-readable string describing how the resource is used by the associated application. This description should indicate the impact on the application if that resource is removed or added.

Optionally, the script can return the name-value pairs listed in Table 5-9 on page 171.


Example
The emphasized fields in the following example are extracted from Example 5-1 on page 171. The fields show the information recorded in the DLPAR script database. The information was derived from the script output with the usage subcommand upon script registration (see Table 5-10 on page 173).

Resources Supported:

Resource Name: cpu Resource Usage: Testing DLPAR on CPU removal

Resource Name: mem Resource Usage: Testing DLPAR on MEM removal
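Continuing the sketch from 5.7.2, a script could answer the usage subcommand with a case arm such as the following; the usage strings are placeholders and should describe your own application.

usage)
        # $2 holds the resource type passed by the drmgr command
        # (cpu, capacity, var_weight, or mem).
        case "$2" in
        cpu) echo "DR_USAGE=cpu binding for the example application" ;;
        mem) echo "DR_USAGE=pinned shared memory for the example application" ;;
        *)   echo "DR_USAGE=resource not used by this script" ;;
        esac
        exit 0
        ;;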

5.7.4 The checkrelease subcommand

Before the specified resource type is removed, the script is invoked with the checkrelease subcommand by the drmgr command. The resource is not actually changed with this subcommand.

When the drmgr command invokes the script with the checkrelease subcommand, the script determines the resource dependencies of the application, evaluates the effects of resource removal on the application, and indicates whether the resource can be successfully removed. If the resource removal request affects the application, the script returns with an exit status of 1 to let the drmgr command know that it should not release the resource.

When the script is invoked with the checkrelease subcommand, it takes the following syntax:

dr_application_script checkrelease <resource_type>

Input for the checkrelease subcommand
The checkrelease subcommand takes the following input data:

򐂰 Additional command line arguments

The checkrelease subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The checkrelease subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The checkrelease subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.


Note: If the DR_FORCE=TRUE environment value is passed to a script with prerelease, the script interprets the force option as an order, so it returns as soon as possible.

Output for the checkrelease subcommand
The checkrelease subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

Note: When invoking scripts in the prerelease phase, the failure of a script does not prevent the drmgr command from attempting to remove the resource. The theory is that resource removal is safe. It can fail, but the kernel is coded to cleanly remove resources, so there is no harm in trying.

The return code from each script is stored so that the drmgr command can determine whether it needs to call it back. If a script fails in the prerelease phase, it will not be called in the postrelease or undoprerelease phases.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-9 on page 171.
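As an illustration only, a checkrelease arm might veto a processor removal while a hypothetical bound application is running, unless the force option is set:

checkrelease)
        # $2 is the resource type; DR_FORCE is set to TRUE or FALSE by drmgr.
        if [[ "$2" = "cpu" && "$DR_FORCE" != "TRUE" ]]; then
                # Hypothetical check: veto the removal while an application
                # with a bindprocessor() attachment is still running.
                if ps -ef | grep -q "[m]y_bound_app"; then
                        echo "DR_ERROR=my_bound_app is bound to a processor"
                        exit 1
                fi
        fi
        exit 0
        ;;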

5.7.5 The prerelease subcommand

Before the specified resource type is actually released, the script is invoked with the prerelease subcommand by the drmgr command. This is called after the checkrelease subcommand.

When the drmgr command invokes the script with the prerelease subcommand, the script interacts with the application, as briefly summarized in the following:

1. Informs the application about the resource removal event and lets the application release the specified resource, for example, reconfigure, suspend, or terminate the application process that uses the specified resource.

2. If the application has successfully released the specified resource, the script exits with an exit status of 0 (success).


3. Otherwise, there are two options:

– The script exits with an exit status of 0 (success) regardless of the response from the application.

– The script exits with an exit status of 1 (failure).

When the script is invoked with the prerelease subcommand, it takes the following syntax:

dr_application_script prerelease <resource_type>

Input for the prerelease subcommand
The prerelease subcommand takes the following input data:

򐂰 Additional command line arguments

The prerelease subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The prerelease subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The prerelease subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.

Note: If the DR_FORCE=TRUE environment value is passed to a script with the prerelease subcommand, the script returns as soon as possible.

Output for the prerelease subcommand
The prerelease subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

Note: If the script exits with 1 (failure), the drmgr command will not perform the actual resource removal; however, it will invoke the subsequent events (postrelease and undoprerelease) against the specified resource.


򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.
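A prerelease arm typically quiesces the application before the resource is taken away. The following sketch assumes a hypothetical control command, /usr/local/bin/myapp_ctl, that can quiesce and resume the application:

prerelease)
        # Quiesce the application before the resource is actually released.
        /usr/local/bin/myapp_ctl quiesce
        if [[ $? -ne 0 && "$DR_FORCE" != "TRUE" ]]; then
                echo "DR_ERROR=unable to quiesce myapp before the removal"
                exit 1
        fi
        exit 0
        ;;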

5.7.6 The postrelease subcommand

After the specified resource type has been released from the partition, the script is invoked with the postrelease subcommand by the drmgr command. This is called after the prerelease subcommand.

When the drmgr command invokes the script with the postrelease subcommand, the script interacts with the application, including any necessary cleanup, for example, restarting or resuming the application if it was quiesced in the prerelease subcommand.

The script also takes appropriate actions if a partial success occurs. A partial success occurs when a subset of the requested number of resources was successfully removed. For example, the memory-related environment variables are checked to determine if all the requested memory frames were removed.

When the script is invoked with the postrelease subcommand, it takes the following syntax:

dr_application_script postrelease <resource_type>

Input for the postrelease subcommand
The postrelease subcommand takes the following input data:

򐂰 Additional command line arguments

The postrelease subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu or mem.

򐂰 Name-value pairs from environment variables

The postrelease subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The postrelease subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.

Note: The force option should be ignored.


Output for the postrelease subcommand
The postrelease subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-9 on page 171.
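The matching postrelease arm can then resume whatever prerelease stopped; again, myapp_ctl is a hypothetical helper:

postrelease)
        # The resource change is complete; resume whatever prerelease stopped.
        /usr/local/bin/myapp_ctl resume
        exit 0
        ;;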

5.7.7 The undoprerelease subcommand

If the drmgr command fails to release the specified resource, it invokes the script with the undoprerelease subcommand to recover any necessary clean-up tasks that were done by the prerelease subcommand. The script undoes any actions that were taken by the script in the prerelease subcommand.

Note: If the specified resource has been removed successfully, the drmgr command will not invoke the script with the undoprerelease subcommand.

When the script is invoked with the undoprerelease subcommand, it takes the following syntax:

dr_application_script undoprerelease <resource_type>

Input for the undoprerelease subcommand
The undoprerelease subcommand takes the following input data:

򐂰 Additional command line arguments

The undoprerelease subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The undoprerelease subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The undoprerelease subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.

Note: The force option should be ignored.

Output for the undoprerelease subcommand
The undoprerelease subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.

5.7.8 The checkacquire subcommand

Before the specified resource type is added, the script is invoked with the checkacquire subcommand by the drmgr command. The resource is not actually changed with this subcommand.

When the drmgr command invokes the script with the checkacquire subcommand, the script determines the resource dependencies of the application, evaluates the effects of resource addition on the application, and indicates whether the resource can be successfully added. For example, there are some MP-unsafe applications; an MP-unsafe application cannot tolerate multiple processors.

Note: Whether or not an application is MP-unsafe is an application design issue, and is independent of DLPAR functionality.

If the resource addition request affects the application, the script returns with an exit status of 1 to let the drmgr command know that it should not add the resource.

When the script is invoked with the checkacquire subcommand, it takes the following syntax:

dr_application_script checkacquire <resource_type>


Input for the checkacquire subcommand
The checkacquire subcommand takes the following input data:

򐂰 Additional command line arguments

The checkacquire subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The checkacquire subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The checkacquire subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.

Note: The force option should be ignored.

Output for the checkacquire subcommand
The checkacquire subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

Note: If the script exits with 1 (failure), the drmgr command will not add the specified resource and will not invoke the subsequent events (preacquire, postacquire, and undopreacquire) against the specified resource (see Table 5-6 on page 166).

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.
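For example, a checkacquire arm could refuse a processor addition while a hypothetical MP-unsafe application is running:

checkacquire)
        # Hypothetical guard: an MP-unsafe application cannot tolerate more
        # than one online processor, so veto a CPU addition while it runs.
        if [[ "$2" = "cpu" ]] && ps -ef | grep -q "[m]p_unsafe_app"; then
                echo "DR_ERROR=mp_unsafe_app cannot run with additional processors"
                exit 1
        fi
        exit 0
        ;;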


5.7.9 The preacquire subcommand

Before the specified resource type is actually acquired, the script is invoked with the preacquire subcommand by the drmgr command. This is called after the checkacquire subcommand.

When the drmgr command invokes the script with the preacquire subcommand, the script interacts with the application; for example, it informs the application about the resource addition and lets the application acquire the specified resource if it is DLPAR-aware.

Note: Most applications are DLPAR-safe. If your application is DLPAR-safe, but not DLPAR-aware, the script with the preacquire subcommand does not have to do any processing.

When the script is invoked with the preacquire subcommand, it takes the following syntax:

dr_application_script preacquire <resource_type>

Input for the preacquire subcommand
The preacquire subcommand takes the following input data:

򐂰 Additional command line arguments

The preacquire subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The preacquire subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The preacquire subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.


Note: When invoking scripts in the preacquire phase, the failure of a script does not prevent the drmgr command from attempting to add the resource. The theory is that resource addition is safe. It can fail, but the kernel is coded to cleanly add resources, so there is no harm in trying. The return code from each script is remembered so that the drmgr command can determine whether it needs to call it back. If a script fails in the preacquire phase, it will not be called in the postacquire or undopreacquire phases.

Output for the preacquire subcommand
The preacquire subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

Note: If the script exits with 1 (failure), the drmgr command will not perform the actual resource addition; however, it will invoke the subsequent events (postacquire and undopreacquire) against the specified resource.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.

5.7.10 The postacquire subcommand

After the specified resource type has been added to the partition, the script is invoked with the postacquire subcommand by the drmgr command. The script is called after the preacquire subcommand.

When the drmgr command invokes the script with the postacquire subcommand, the script interacts with the application, including any necessary cleanup, for example, restarting or resuming the application if it was quiesced in the preacquire subcommand.

The script also takes the appropriate actions if a partial success occurs. A partial success occurs when a subset of the requested number of resources was successfully added. For example, the memory-related environment variables should be checked to determine if all of the requested memory frames were added.

When the script is invoked with the postacquire subcommand, it takes the following syntax:

dr_application_script postacquire <resource_type>

Input for the postacquire subcommand
The postacquire subcommand takes the following input data:

򐂰 Additional command line arguments

The postacquire subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu, capacity, var_weight, or mem.

򐂰 Name-value pairs from environment variables

The postacquire subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The postacquire subcommand can also take an optional input name-value pair from the environment value shown in Table 5-6 on page 166.

Note: The force option should be ignored.

Output for the postacquire subcommand
The postacquire subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.
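A postacquire arm can resume the application and, for memory events, record the DR_MEM_SIZE_REQUEST and DR_MEM_SIZE_COMPLETED values (shown in the sample outputs in 5.8.6) so that partial successes can be spotted later. The debug file name and helper command below are placeholders:

postacquire)
        # Record how much of a memory request actually completed, then
        # resume the (hypothetical) application.
        if [[ "$2" = "mem" ]]; then
                echo "requested=$DR_MEM_SIZE_REQUEST completed=$DR_MEM_SIZE_COMPLETED" \
                        >> /tmp/myapp_dlpar.dbg
        fi
        /usr/local/bin/myapp_ctl resume
        exit 0
        ;;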

5.7.11 The undopreacquire subcommand

If the drmgr command fails to add the specified resource to the partition, it invokes the script with the undopreacquire subcommand to recover any necessary cleanup tasks that were done by the preacquire subcommand. The script undoes any actions that were taken by the script in the preacquire subcommand.

Note: If the specified resource has been added successfully, the drmgr command will not invoke the script with the undopreacquire subcommand.

When the script is invoked with the undopreacquire subcommand, it takes the following syntax:

dr_application_script undopreacquire <resource_type>

Input for the undopreacquire subcommand
The undopreacquire subcommand takes the following input data:

򐂰 Additional command line arguments

The undopreacquire subcommand requires one additional command line argument that tells the drmgr command which resource type (processor or memory) the script is designed to handle. The valid values are cpu or mem.

򐂰 Name-value pairs from environment variables

The undopreacquire subcommand takes several required input name-value pairs from environment values, depending on the resource type that the script is designed to handle:

– If the script is registered to handle processor, see Table 5-4 on page 165.
– If the script is registered to handle memory, see Table 5-5 on page 166.

The undopreacquire subcommand can also take an optional input name-value pair from the environment value shown in Table 5-3 on page 164.

Note: The force option should be ignored.

Output for the undopreacquire subcommand
The undopreacquire subcommand produces the following output data:

򐂰 Exit value

The script must return an exit value, either 0 (success) or 1 (failure). If the exit value is 1, the DR_ERROR name-value pair must be set to describe the reason for failure, and the script must print it to the standard out.

򐂰 Name-value pairs to the standard output stream

Optionally, the script can return the name-value pairs listed in Table 5-6 on page 166.


5.8 How to manage DLPAR scripts

The drmgr command must be used to manage DLPAR scripts. The function provided by the drmgr command does the following:

򐂰 Lists the registered DLPAR scripts and shows their information.

򐂰 Registers or uninstalls DLPAR scripts in the DLPAR script database.

򐂰 Changes the script install directory path. The default directory is /usr/lib/dr/scripts/all.

Note: The drmgr command is the only interface to manipulate the DLPAR script database. To use the drmgr command, you need root authority.

The following sections provide typical drmgr command usage examples.

5.8.1 List registered DLPAR scripts

To list registered DLPAR scripts and their information, type drmgr -l. If no scripts are registered, it returns the following output:

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts/

Syslog ID: DRMGR

Example 5-1 on page 171 shows the example output of drmgr -l when DLPAR scripts are already registered.

5.8.2 Register a DLPAR script

To register a DLPAR script, type drmgr -i script_file_name. The script is copied into the script install path (the default value is /usr/lib/dr/scripts/all) and registered in the DLPAR script database, as shown in Example 5-2.

Note: The fileset bos.adt.samples must be installed to enable these functions.

Example 5-2 Register a DLPAR script

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts

Syslog ID: DRMGR

# ls /usr/samples/dr/scripts/IBM_template.sh

/usr/samples/dr/scripts/IBM_template.sh


# drmgr -i /usr/samples/dr/scripts/IBM_template.sh

DR script file /usr/samples/dr/scripts/IBM_template.sh installed successfully

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts

Syslog ID: DRMGR

------------------------------------------------------------

/usr/lib/dr/scripts/all/IBM_template.sh
AIX DR ksh example script

Vendor:IBM, Version:1, Date:10182002

Script Timeout:10, Admin Override Timeout:0

Resources Supported:

Resource Name: cpu Resource Usage: cpu binding for performance

Resource Name: mem Resource Usage: Shared(Pinned) memory for app XYZ

------------------------------------------------------------

# ls /usr/lib/dr/scripts/all

IBM_template.sh

If the permission mode of the registered script is not appropriate, for example, no executable bits are set, then drmgr -l will not list the registered script name, even if the registration has been successfully completed. In this case, set the appropriate permission mode on the script and register it with the overwrite option -f, as shown in the following example:

# drmgr -f /usr/samples/dr/scripts/IBM_template.sh

5.8.3 Uninstall a registered DLPAR script

To uninstall a registered DLPAR script, type drmgr -u script_file_name. The script is unregistered from the DLPAR script database, as shown in the following example:

# drmgr -u IBM_template.sh

DR script file IBM_template.sh uninstalled successfully

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts

Syslog ID: DRMGR


The uninstalled script file name is renamed in the script install path, as shown in the following example:

# ls -l /usr/lib/dr/scripts/all
total 32
-rw-r--r--   1 bin   bin   13598 Jul 12 14:08 .IBM_template.sh

Note: A dot character is added in front of the original file name.

5.8.4 Change the script install path

To change the script install path, type drmgr -R new_dir. In the following example, the script install path is changed from the default path /usr/lib/dr/scripts to the newly created directory /local/lpar2 (in this example, the hostname command returns lpar2, the host name of one of our test partitions):

# drmgr -l

DR Install Root Directory: /usr/lib/dr/scripts

Syslog ID: DRMGR

# mkdir -p /local/`hostname`

# drmgr -R /local/`hostname`

0930-022 DR script ROOT directory set to:/local/lpar2 successfully

# drmgr -l

DR Install Root Directory: /local/lpar2

Syslog ID: DRMGR

Note: If you have changed the script install path, scripts that are already registered will not be referenced by the drmgr command.

5.8.5 The drmgr command line options

Table 5-12 on page 189 lists the drmgr command line options and their purpose. For further information about the drmgr command, type man drmgr on the command line or refer to the AIX 5L product documentation, which is available at:

http://publib.boulder.ibm.com/infocenter/pseries/index.jsp?topic=/com.ibm.aix.doc/infocenter/base/aix53.htm


Table 5-12 The drmgr command line options

-i script_name
  Installs a DLPAR script to the default or specified directory. Other associated options: [ -D install_directory ] [ -w timeout ] [ -f ].
  The system administrator should use the -i flag to install a DLPAR script. The script's file name is used as input. Unless the -D flag is used, the scripts are installed into the /usr/lib/dr/scripts/all/ directory under the root install directory (see the -R flag). Permissions for the DLPAR script are the same as the script_name file. If a script with the same name is already registered, the install will fail with a warning unless the force option is used.

-w timeout
  Timeout value in minutes.
  This option is used in conjunction with the -i option. The drmgr command will override the timeout value specified by the DLPAR script with the new user-defined timeout value.

-f
  Forces an override.
  During the installation of a script, the -f option can be set to force an override of a duplicate DLPAR script name.

-u script_name
  Uninstalls a DLPAR script. Other associated options: [ -D host_name ].
  The system administrator invokes this command to uninstall a DLPAR script. The script file name is provided as an input. The user can specify the directory from where the script should be removed by using the -D option. If no directory is specified, the command will try to remove the script from the all directory under the root directory (see the -R option). If the script was registered using the -D option, it will only be invoked on a system with that host name. If no file is found, the command will return with an error.

-R base_directory_path
  Sets the root directory where the DLPAR scripts are installed.
  The default value is /usr/lib/dr/scripts/. The installer looks at the all or hosts directory under this root directory (/usr/lib/dr/scripts/all/).

-d debug_level
  Sets the debug level.
  This option sets the DR_DEBUG environment variable, which controls the level of debug messages from the DLPAR scripts.

-l
  Lists DLPAR scripts.
  This option lists the details of all DLPAR scripts currently active on the system.

-b
  Rebuilds the DLPAR script database.
  This option rebuilds the DLPAR script database by parsing through the entire list of DLPAR script install directories.

-S syslog_chan_id_str
  Specifies a syslog channel.
  This option enables the user to specify a particular channel to which the syslog messages are logged from the DLPAR script by the drmgr command.

5.8.6 Sample output examples from a DLPAR script

Although the syslog facility can be used to record debug information, the debug information for the example DLPAR script was sent to /tmp/IBM_template.sh.dbg for readability reasons.

After registering the DLPAR script written in Korn shell (see Example A-3 on page 245), the following DLPAR operations were initiated on the HMC:

򐂰 2 GB memory addition
򐂰 1 GB memory removal
򐂰 1 CPU addition
򐂰 2 CPU removal

To perform a DLPAR operation using the graphical user interface on the HMC, refer to 3.1, “Hardware Management Console” on page 58.
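The phase blocks shown in the following examples are essentially the DR_* environment variables that the drmgr command exports to the script. A script can produce this kind of trace with a few lines of Korn shell similar to the following sketch (this is not the actual code of IBM_template.sh):

# Sketch: append the DLPAR environment to a debug file for each phase.
DBG=/tmp/IBM_template.sh.dbg
{
        echo "-- start $1 phase --"
        env | grep '^DR_' | sort
        echo "-- end $1 phase --"
} >> $DBG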

Sample output: 2 GB memory addition

Example 5-3 shows the sample output of a 2 GB memory addition DLPAR event.

Notice that there are three line blocks for the check, pre, and post phases.

Example 5-3 Sample output: 2 GB memory addition

-- start checkacquire phase --

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_FREE_FRAMES=0x30a

DR_MEM_SIZE_COMPLETED=0x0

DR_MEM_SIZE_REQUEST=0x80000000

DR_PINNABLE_FRAMES=0x54a55

DR_TOTAL_FRAMES=0x80000
mem resources: 0x80000000

-- end checkacquire phase --

-- start preacquire phase --


DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_FREE_FRAMES=0x30a

DR_MEM_SIZE_COMPLETED=0x0

DR_MEM_SIZE_REQUEST=0x80000000

DR_PINNABLE_FRAMES=0x54a55

DR_TOTAL_FRAMES=0x80000

-- end preacquire phase --

-- start undopreacquire phase --

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_FREE_FRAMES=0x30a

DR_MEM_SIZE_COMPLETED=0x800

DR_MEM_SIZE_REQUEST=0x80000000

DR_PINNABLE_FRAMES=0x54a55

DR_TOTAL_FRAMES=0x80000

-- end undopreacquire phase --

Before the DLPAR operation, the partition had 2 GB memory assigned, as shown in the following example:

# lsattr -El mem0
size     2048 Total amount of physical memory in Mbytes  False
goodsize 2048 Amount of usable physical memory in Mbytes False

After the completion of the DLPAR operation, the memory size has been increased to 4 GB, as shown in the following example:

# lsattr -El mem0
size     4096 Total amount of physical memory in Mbytes  False
goodsize 4096 Amount of usable physical memory in Mbytes False

Sample output: 1 GB memory removal

Example 5-4 shows the sample output of a 1 GB memory removal DLPAR event.

There are three line blocks for the check, pre, and post phases.

Example 5-4 Sample output: 1 GB memory removal

-- start checkrelease phase --

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_FREE_FRAMES=0x7e2f5

DR_MEM_SIZE_COMPLETED=0x0

DR_MEM_SIZE_REQUEST=0x40000000

DR_PINNABLE_FRAMES=0xac3f3

DR_TOTAL_FRAMES=0x100000

-- end checkrelease phase --

-- start prerelease phase --

DR_DRMGR_INFO=DRAF architecture Version 1


DR_FORCE=FALSE

DR_FREE_FRAMES=0x7e2f5

DR_MEM_SIZE_COMPLETED=0x0

DR_MEM_SIZE_REQUEST=0x40000000

DR_PINNABLE_FRAMES=0xac3f3

DR_TOTAL_FRAMES=0x100000

-- end prerelease phase --

-- start postrelease phase --

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_FREE_FRAMES=0x7e2f5

DR_MEM_SIZE_COMPLETED=0x400

DR_MEM_SIZE_REQUEST=0x40000000

DR_PINNABLE_FRAMES=0xac3f3

DR_TOTAL_FRAMES=0x100000

-- end postrelease phase --

Before the DLPAR operation, the partition had 4 GB memory assigned, as shown in the following example:

# lsattr -El mem0
size     4096 Total amount of physical memory in Mbytes  False
goodsize 4096 Amount of usable physical memory in Mbytes False

After the completion of the DLPAR operation, the memory size has been decreased to 3 GB, as shown in the following example:

# lsattr -El mem0
size     3072 Total amount of physical memory in Mbytes  False
goodsize 3072 Amount of usable physical memory in Mbytes False

Sample output: 1 CPU addition

Example 5-5 shows the sample output of a 1 CPU addition DLPAR event. You will see there are three line blocks for the check, pre, and post phases for the CPU ID 2.

Example 5-5 Sample output: 1 CPU addition

-- start checkacquire phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=2
cpu resources: logical 2, bind 2

-- end checkacquire phase --

-- start preacquire phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE


DR_LCPUID=2

-- end preacquire phase --

-- start undopreacquire phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=2

-- end undopreacquire phase --

Before the DLPAR operation, the partition had two processors assigned, as shown in the following example:

# lsdev -Cc processor -S Available
proc6  Available 00-06 Processor
proc7  Available 00-07 Processor

After the completion of the DLPAR operation, the number of active processors has been increased to three, as shown in the following example:

# lsdev -Cc processor -S Available
proc6  Available 00-06 Processor
proc20 Available 00-20 Processor
proc7  Available 00-07 Processor

Sample output: 2 CPU removal

Example 5-6 shows the sample output of a 2 CPU removal DLPAR event. There are three line blocks for the check, pre, and post phases for each CPU (ID 2 and ID 3).

Example 5-6 Sample output: 2 CPU removal

-- start checkrelease phase --

DR_BCPUID=3

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=3

-- end checkrelease phase --

-- start prerelease phase --

DR_BCPUID=3

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=3

-- end prerelease phase --

-- start postrelease phase --

DR_BCPUID=3

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=3

-- end postrelease phase --


-- start checkrelease phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=2

-- end checkrelease phase --

-- start prerelease phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=2

-- end prerelease phase --

-- start postrelease phase --

DR_BCPUID=2

DR_DRMGR_INFO=DRAF architecture Version 1

DR_FORCE=FALSE

DR_LCPUID=2

-- end postrelease phase --

Before the DLPAR operation, the partition had four processors assigned, as shown in the following example:

# lsdev -Cc processor -S Available
proc6  Available 00-06 Processor
proc20 Available 00-20 Processor
proc7  Available 00-07 Processor
proc21 Available 00-21 Processor

After the completion of the DLPAR operation, the number of active processors has been decreased to two, as shown in the following example:

# lsdev -Cc processor -S Available
proc6  Available 00-06 Processor
proc7  Available 00-07 Processor

5.9 API-based DLPAR event handling

The AIX 5L Version 5.3 operating system includes further enhancements to the dr_reconfig() system call. The improvements are intended to exploit the POWER5 processor and virtualization features when building or customizing applications that respond to DLPAR events. Applications wanting to utilize all of the new function will need to be modified to remain DLPAR-aware. Because DLPAR-safe applications should not fail when a DLPAR event occurs, they require no changes.

To properly write an application to recognize a DLPAR event, the application registers a signal handler that calls dr_reconfig(). When a DLPAR event occurs, the application receives SIGRECONFIG signals from the kernel to notify it of the DLPAR event. The signal is sent twice (check and pre phases) before the actual resource change (addition or removal), and sent once (post phase) after the resource change.

Note: The SIGRECONFIG signal is also sent (along with the SIGCPUFAIL signal for backward compatibility) in the case of a CPU Guard event. Therefore, this API-based method can also be utilized by CPU Guard-aware applications.

In the latest release of DLPAR support, DLPAR events for I/O slots do not notify applications using the dr_reconfig() system call.

5.9.1 The dr_reconfig system call

The dr_reconfig() system call is provided to query the information of the current DLPAR event. The system call must be called from a registered signal handler in order for the application to be notified by the kernel when a DLPAR event occurs. The sigaction() system call is used to register a signal handler.

To use dr_reconfig() in your C language application, you need to add the following compiler directive line that instructs the preprocessor to include the /usr/include/sys/dr.h file:

#include <sys/dr.h>

Example 5-7 The dr_reconfig system call usage

int dr_reconfig(int flags, dr_info_t *info);

0 is returned for success; otherwise, -1 is returned and errno is set to the appropriate value.

The dr_reconfig() system call takes two parameters. The flags determine what the system call does. The info parameter is a structure that contains DLPAR-specific data that the signal handler uses to process DLPAR events accordingly. Table 5-13 on page 196 provides the supported flags.


Table 5-13 The dr_reconfig flag parameters

DR_QUERY
  This flag identifies the current DLPAR event. It also identifies any actions, if any, that the application should take to comply with the current DLPAR event. Any pertinent information is returned in the second parameter.

DR_EVENT_FAIL
  This flag fails the current DLPAR event. It requires root authority.

DR_RECONFIG_DONE
  This flag is used in conjunction with the DR_QUERY flag. The application notifies the kernel that the actions it took to comply with the current DLPAR request are now complete. The dr_info structure identifying the DLPAR request that was returned earlier is passed as an input parameter. (This flag is available on AIX 5L Version 5.2 plus the 5200-01 Recommended Maintenance Level and later.)

The other parameter is a pointer to a structure that holds DLPAR-specific information. The signal handler must allocate space for the dr_info_t data structure. The AIX kernel populates this data structure and returns it to the signal handler. In AIX 5L Version 5.3, additional fields have been included. They are outlined in Example 5-8, which shows the definition of the dr_info structure.

Example 5-8 The dr_reconfig info parameter

typedef struct dr_info {
        /* The following fields are filled out for cpu based requests */
        unsigned int ent_cap : 1;         // entitled capacity change request
        unsigned int var_wgt : 1;         // variable weight change request
        unsigned int splpar_capable : 1;  // partition is Shared Processor Partition capable
        unsigned int splpar_shared : 1;   // shared partition (1), dedicated (0)
        unsigned int splpar_capped : 1;   // shared partition is capped
        unsigned int cap_constrained : 1; // capacity is constrained by PHYP
        uint64_t capacity;                // the current entitled capacity or
                                          // variable capacity weight value,
                                          // depending on the bit fields
                                          // ent_cap and var_wgt
        int delta_cap;                    // delta entitled capacity or variable
                                          // weight capacity that is to be added
                                          // or removed
} dr_info_t;



The bindproc and bindpset bits are only set if the request is to remove a processor. If the bindproc is set, then the process has a bindprocessor() attachment that must be resolved before the operation is allowed. If the bindpset bit is set, the application has processor set attachment, which can be lifted by calling the appropriate processor set interface.

The plock and pshm bits are only set if the DLPAR request is to remove memory and the process has plock() memory or is attached to a pinned shared memory segment. If the plock bit is set, the application calls plock() to unpin itself. If the pshm bit is set, the application detaches its pinned memory segments. The memory remove request might succeed, even if the pshm bit is set, as long as there is enough pinnable memory in the partition. Therefore, an action might not be required for the pshm bit to be set, but it is strongly recommended. The sys_pinnable_frames field provides the necessary information if the system has enough excess pinnable memory.

Programming implications of CPU DLPAR events

At boot time, processors are configured in the kernel. In AIX 5L, a processor is identified by three different identifications, namely:

򐂰 The physical CPU ID, which is derived from the Open Firmware device tree and used to communicate with RTAS.

򐂰 The logical CPU ID, which is a ppda-based (per processor description area) index of online and offline processors.

򐂰 The bind CPU ID, which is the index of online processors.

The logical and bind CPU IDs are consecutive and have no holes in the numbering. No guarantee is given across boots that the processors will be configured in the same order or even that the same processors will be used in a partitioned environment at all.

At system startup, the logical and bind CPU IDs are both consecutive and have no holes in the numbering; however, DLPAR operations can remove a processor from the middle of the logical CPU list. The bind CPU IDs remain consecutive because they refer only to online processors, so the kernel has to explicitly map these IDs to logical CPU IDs (containing online and offline CPU IDs).

The range of logical CPU IDs is defined to be 0 to M-1, where M is the maximum number of processors that can be activated within the partition. M is derived from the Open Firmware device tree. The logical CPU IDs name both online and offline processors. The rset (resource set) APIs are predicated on the use of logical CPU IDs.



The range of bind CPU IDs is defined to be 0 to N-1, where N is the current number of online processors. The value of N changes as processors are added and removed from the system by either DLPAR or CPU Guard. In general, new processors are always added to the Nth position. Bind CPU IDs are used by the system call bindprocessor and by the kernel service switch_cpu.

The number of potential processors can be determined by:

򐂰 _system_configuration.max_ncpus
򐂰 _system_configuration.original_ncpus
򐂰 var.v_ncpus_cfg
򐂰 sysconf(_SC_NPROCESSORS_CONF)

The number of online processors can be determined by:

򐂰 _system_configuration.ncpus
򐂰 var.v_ncpus
򐂰 sysconf(_SC_NPROCESSORS_ONLN)

The _system_configuration structure is defined in the /usr/include/sys/systemcfg.h header file, and those members can be accessed from your application, as shown in the following code fragment:

#include <sys/systemcfg.h>

printf("_system_configuration.original_ncpus=%d\n",
       _system_configuration.original_ncpus);

The var structure is defined in the /usr/include/sys/var.h header file and populated by the sysconfig system call. The following code fragment demonstrates how to retrieve var.v_ncpus:

#include <sys/types.h>
#include <sys/sysconfig.h>
#include <sys/var.h>

struct var myvar;
rc = sysconfig(SYS_GETPARMS, &myvar, sizeof(struct var));
if (rc == 0)
        printf("var.v_ncpus = %d\n", myvar.v_ncpus);

The number of online processors can also be determined from the command line. AIX provides the following commands:

򐂰 bindprocessor -q
򐂰 lsrset -a

As previously mentioned, AIX supports two programming models for processors: the bindprocessor model that is based on bind CPU IDs and the rset API model that is based on logical CPU IDs. Whenever a program implements any of these programming models, it should be DLPAR-aware.


The following new interfaces (system calls and kernel services) are provided to query bind and logical CPU IDs and the mapping between them:

򐂰 mycpu(): Returns the bind CPU ID of the process
򐂰 my_lcpu(): Returns the logical CPU ID of the process
򐂰 b2lcpu(): Returns the bind to logical CPU ID mapping
򐂰 l2bcpu(): Returns the logical to bind CPU ID mapping

5.9.2 A sample code using the dr_reconfig system call

A sample application was written in the C language using the dr_reconfig() system call (see Example A-4 on page 257). Because the source code is long, an excerpt of the most important part is provided with annotations.

Basically, this application does nothing on its own, except for the signal handler registration. It simply runs the busy loop in the while loop in main and waits until the SIGRECONFIG signal is delivered. You must implement your application logic in the while loop, at the location specified by the comment Your application logic goes here.

The behavior of the application is briefly explained in the following:

1. Register a signal handler, dr_func(), in main (indicated as #A in the comment). The signal handler is registered in order to react to the SIGRECONFIG signal when it is delivered to the application process.

   if ((rc = sigaction(SIGRECONFIG, &sigact, &sigact_save)) != 0) { /* #A */

2. Wait in the busy loop in main (#B) until the SIGRECONFIG signal is sent:

   while (1) {    /* #B */
           ;
           /* your application logic goes here. */
   }

3. After the SIGRECONFIG signal is delivered by the kernel, the signal handler, dr_func(), is invoked.

4. The handler calls dr_reconfig() in order to query the dr_info structure data. The dr_info structure is used to determine what DLPAR operation triggers this signal (#C).

   l_rc = dr_reconfig(DR_QUERY, &dr_info);    /* #C */

Note: You must include the following preprocessor directive line to use the dr_reconfig() system call:

#include <sys/dr.h>


5. The handler parses the dr_info structure to determine the DLPAR operation type:

   – If the dr_info.add member is set, this signal is triggered by a DLPAR resource addition request (#D):

     if (dr_info.add) {    /* #D */

   – If the dr_info.rem member is set, this signal is triggered by a DLPAR resource removal request (#E):

     if (dr_info.rem) {    /* #E */

6. The handler again parses the dr_info structure to determine the DLPAR resource type:

   – If the dr_info.cpu member is set, this signal is triggered by a DLPAR CPU resource addition or removal request (#F):

     if (dr_info.cpu) {    /* #F */

   – If the dr_info.mem member is set, this signal is triggered by a DLPAR memory resource addition or removal request (#G):

     } else if (dr_info.mem) {    /* #G */

7. Invoke the corresponding function based on the information determined:

   – If the requested DLPAR resource type is CPU, call the function pointer stored in the l_currentPhase->cpu_ptr array (#H):

     l_rc = l_currentPhase->cpu_ptr();    /* #H */

   – If the requested DLPAR resource type is memory, call the function pointer stored in the l_currentPhase->mem_ptr array (#I):

     l_rc = l_currentPhase->mem_ptr();    /* #I */

Note: You must modify the functions included in the definedPhase array (#J) by adding your own logic in order to react to the DLPAR operation phases. The comment Perform actions here specifies the location where you modify the functions.

5.9.3 Sample output examples from a DLPAR-aware application

Although the syslog facility can be used to record debug information, the example application debug information was sent to /tmp/dr_api_template.C.dbg for readability reasons.

After compiling the C source code (see Example A-4 on page 257), the application was run and several DLPAR operations were initiated on the HMC.


The following examples exhibit the internal behavior of these DLPAR operations:

򐂰 1 GB memory addition
򐂰 1 GB memory removal
򐂰 2 CPU addition
򐂰 1 CPU removal

To perform a DLPAR operation using the graphical user interface on the HMC, refer to 3.1, “Hardware Management Console” on page 58.

Sample output: 1 GB memory addition

Example 5-9 shows the sample output of a 1 GB memory addition DLPAR event.

There are three line blocks for the check, pre, and post phases.

Example 5-9 Sample output: 1 GB memory addition

---Start of Signal Handler---

An add request for

** check phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 2147483648
number of free frames in system = 29434
number of pinnable frams in system = 339916
total number of frames in system = 524288

*****Entered CheckedAcquire_mem*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** pre phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 2147483648
number of free frames in system = 29434
number of pinnable frams in system = 339916
total number of frames in system = 524288

*****Entered PreeAcquire_mem*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** post phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 2147483648
number of free frames in system = 284761
number of pinnable frams in system = 516763
total number of frames in system = 786432

*****Entered PostAcquire_mem*****

---end of signal handler---

Sample output: 1 GB memory removal

Example 5-10 shows the sample output of a 1 GB memory removal DLPAR event. There are three line blocks for the check, pre, and post phases.

Example 5-10 Sample output: 1 GB memory removal

---Start of Signal Handler---

A remove request for

** check phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 3221225472
number of free frames in system = 284771
number of pinnable frams in system = 516763
total number of frames in system = 786432

*****Entered CheckeRelease_mem*****

---end of signal handler---

---Start of Signal Handler---

A remove request for

** pre phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 3221225472
number of free frames in system = 284770
number of pinnable frams in system = 516763
total number of frames in system = 786432

*****Entered PreRelease_mem*****

---end of signal handler---

---Start of Signal Handler---

A remove request for

** post phase **

Resource is Memory.

requested memory size (in bytes) = 1073741824
system memory size = 3221225472
number of free frames in system = 29043
number of pinnable frams in system = 339916
total number of frames in system = 524288

*****Entered PostReleasee_mem*****

---end of signal handler---


Sample output: 2 CPU addition

Example 5-11 shows the sample output of a 2 CPU addition DLPAR event. There are three line blocks for the check, pre, and post phases for each processor (CPU ID 2 or 3).

Example 5-11 Sample output: 2 CPU addition

---Start of Signal Handler---

An add request for

** check phase **

Resource is CPU .

logical CPU ID = 2

Bind CPU ID = 2

*****Entered CheckedAcquire_cpu*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** pre phase **

Resource is CPU .

logical CPU ID = 2

Bind CPU ID = 2

*****Entered PreAcquire_cpu*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** post phase **

Resource is CPU .

logical CPU ID = 2

Bind CPU ID = 2

*****Entered PostAcquire_cpu*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** check phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3

*****Entered CheckedAcquire_cpu*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** pre phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3


*****Entered PreAcquire_cpu*****

---end of signal handler---

---Start of Signal Handler---

An add request for

** post phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3

*****Entered PostAcquire_cpu*****

---end of signal handler---

Sample output: 1 CPU removal

Example 5-12 shows the sample output of a 1 CPU removal DLPAR event. There are three line blocks for the check, pre, and post phases for the CPU ID 3.

Example 5-12 Sample output: 1 CPU removal

---Start of Signal Handler---

A remove request for

** check phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3

*****Entered CheckRelease_cpu*****

---end of signal handler---

---Start of Signal Handler---

A remove request for

** pre phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3

*****Entered PreRelease_cpu*****

---end of signal handler---

---Start of Signal Handler---

A remove request for

** post phase **

Resource is CPU .

logical CPU ID = 3

Bind CPU ID = 3

*****Entered PostRelease_cpu*****

---end of signal handler---


5.9.4 DLPAR-aware kernel extensions

Like applications, most kernel extensions are DLPAR-safe by default. However, some are sensitive to the system configuration and might need to be registered with the kernel in order to be notified of DLPAR events.

To register and unregister with the kernel in order to be notified in the case of DLPAR events, the following kernel services are available:

򐂰 reconfig_register
򐂰 reconfig_unregister
򐂰 reconfig_complete

The reconfig_register and reconfig_unregister services have had events added to support the following shared processor functions:

򐂰 Capacity addition and removal
򐂰 Virtual processor add and remove (supported by the pre-existing CPU add and remove events)

5.10 Error handling of DLPAR operations

Knowing what errors the drmgr command can return is fundamental to creating a comprehensive DLPAR script or DLPAR-aware application. This section covers the methods AIX provides to help perform error analysis on failed DLPAR operations. It also discusses some actions that should be taken when an error occurs.

5.10.1 Possible causes of DLPAR operation failures

A DLPAR operation request can fail for various reasons. The most common of these is that the resource is busy, or that there are not enough system resources currently available to complete the request. In these cases, the resource is left in a normal state as though the DLPAR event never happened.

The following are possible causes of DLPAR operation failures:

򐂰 The primary cause of processor removal failure is processor bindings. The operating system cannot ignore processor bindings and carry on DLPAR operations, or applications might not continue to operate properly. To ensure that this does not occur, release the binding, establish a new one, or terminate the application. The processors that are impacted are a function of the type of binding that is used.


򐂰 The primary cause of memory removal failure is that there is not enough pinned memory available in the system to complete the request. This is a system-level issue and is not necessarily the result of a specific application. If the memory region to be removed contains pinned pages, their contents must be migrated to other pinned pages, while the virtual-to-physical mappings are maintained automatically. The failure occurs when there is not enough pinnable memory in the system to accommodate the migration of the pinned data in the region that is being removed. To ensure that this does not occur, lower the level of pinned memory in the system. This can be accomplished by destroying pinned shared memory segments, terminating programs that invoke the plock() system call, or removing the plock on the program.

򐂰 The primary cause of PCI slot removal failure is that the adapters in the slot are busy. Note that device dependencies are not tracked. For example, the device dependency might extend from a slot to one of the following: an adapter, a device, a volume group, a logical volume, a file system, or a file. In this case, resolve the dependencies manually by stopping the relevant applications, unmounting file systems, varying off volume groups, and unconfiguring the device drivers associated with the adapters in the target slot.
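The following query commands can help confirm which of these conditions applies before a removal is attempted. This is a sketch only; the commands simply display the current state, and the exact output varies by AIX level:

# ps -emo THREAD      (the BND column shows threads that are bound to a specific processor)
# vmstat -v           (the pinned pages line shows the current level of pinned memory)
# svmon -G            (system-wide memory usage, including pinned pages)
# lsslot -c slot      (lists dynamically reconfigurable slots and the adapters that occupy them)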

If an error occurs in a DLPAR operation, the error message dialog box shown in Figure 5-9 appears on the HMC.

Figure 5-9 DLPAR operation failed message

The HMC also displays the information message dialog box, as shown in Figure 5-10 on page 207.


Figure 5-10 DLPAR operation failure detailed information

Note: If the registered DLPAR scripts have bugs, they can also cause failures. Code the scripts carefully and test them on a test partition before deploying them on a production partition.

5.10.2 Error analysis facilities

AIX provides the following facilities to help isolate DLPAR operation failures:

򐂰 The syslog facility
򐂰 AIX system trace facility
򐂰 AIX error log facility
򐂰 Kernel debugger

You can also use these facilities if a script or DLPAR-aware application fails. Moreover, by learning how to use them, you can modify your programs to handle some of the possible errors automatically.

The syslog facility

The syslog facility is a useful tool for isolating DLPAR-related errors. You can use it to keep a record of the progress of DLPAR events. Each syslog entry carries a time stamp that indicates when the DLPAR event occurred.

On AIX, the syslog facility is not enabled by default. To enable the recording of DLPAR events using syslog, do the following:

1. Edit the /etc/syslog.conf file as the root user.

2. Add the required syslog entries to the end of the file.

For example, add the following:

*.debug /var/adm/syslog.log rotate size 10k


This directive line instructs the syslog facility to log all messages of priority debug (LOG_DEBUG) and above to the /var/adm/syslog.log file. The /var/adm/syslog.log file is automatically rotated to limit the maximum file size to 10 KB.

3. Create the file explicitly:

# touch /var/adm/syslog.log

4. Restart the syslogd subsystem:

# stopsrc -s syslogd

# startsrc -s syslogd
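After syslogd restarts, DLPAR-related entries are appended to /var/adm/syslog.log as the events occur. For example, to watch them in real time:

# tail -f /var/adm/syslog.log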

In Appendix B, “Dynamic logical partitioning output samples” on page 273, the following syslog output examples are included:

򐂰 Sample syslog output for a processor addition request
򐂰 Sample syslog output for a memory addition request
򐂰 Sample syslog output for a memory removal request

When you register your DLPAR scripts, if you explicitly specify a channel ID string other than the default value DRMGR by using drmgr -S, you can quickly search for the information that is produced by your DLPAR scripts. The default channel ID, DRMGR, is shown in several of the syslog output examples.
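For example, the following sketch registers a DLPAR script under a custom channel ID and then filters the syslog for it. The script path and the channel ID APPDR are hypothetical, and the exact drmgr flag combination should be verified on your AIX level:

# drmgr -i /usr/local/dlpar/app_reconfig.sh -S APPDR
# grep APPDR /var/adm/syslog.log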

AIX system trace facility

The AIX system trace facility is a tool that can trace many kernel internal activities by specifying trace hook IDs. In case of processor- and memory-related

DLPAR events, the trace hook ID is 38F. After capturing the trace of DLPAR events, you can generate a trace report in order to examine the results.

To use the AIX system trace facility to capture DLPAR events, do the following as the root user:

1. Start the trace:

# trace -a -j 38f

2. Perform the desired DLPAR operations.

3. Stop the trace:

# trcstop

4. Analyze the trace:

# trcrpt


You can also use SMIT to do the same activities (you need root authority):

1. Invoke smit and select the following panels, and then press Enter:
   Problem Determination
     Trace
       START Trace

2. Type 38F in the ADDITIONAL event IDs to trace field, as shown in Example 5-13, and then press Enter.

Example 5-13 START Trace panel

START Trace

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

EVENT GROUPS to trace [] +

ADDITIONAL event IDs to trace [38F] +

Event Groups to EXCLUDE from trace [] +

Event IDs to EXCLUDE from trace [] +

Trace MODE [alternate] +

STOP when log file full? [no] +

LOG FILE [/var/adm/ras/trcfile]

SAVE PREVIOUS log file? [no] +

Omit PS/NM/LOCK HEADER to log file? [yes] +

Omit DATE-SYSTEM HEADER to log file? [no] +

Run in INTERACTIVE mode? [no] +

Trace BUFFER SIZE in bytes [131072] #

LOG FILE SIZE in bytes [1310720] #

Buffer Allocation [automatic] +

3. Perform the desired DLPAR operations.

4. Invoke smit and select the following panels, and then press Enter:
   Problem Determination
     Trace
       STOP Trace

5. Invoke smit, select the following panels, select 1 filename (defaults stdout), and then press Enter:
   Problem Determination
     Trace
       Generate a Trace Report


6. Select the following values in the SMIT panel shown in Example 5-14, and then press Enter:
   Show PROCESS IDs for each event?    yes
   Show THREAD IDs for each event?     yes

Example 5-14 Generate a Trace Report panel

Generate a Trace Report

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

Show exec PATHNAMES for each event? [yes] +

Show PROCESS IDs for each event? [yes] +

Show THREAD IDs for each event? [yes] +

Show CURRENT SYSTEM CALL for each event? [yes] +

Time CALCULATIONS for report [elapsed only] +

Event Groups to INCLUDE in report [] +

IDs of events to INCLUDE in report [] +X

Event Groups to EXCLUDE from report [] +

ID's of events to EXCLUDE from report [] +X

STARTING time []

ENDING time []

LOG FILE to create report from [/var/adm/ras/trcfile]

FILE NAME for trace report (default is stdout) []
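A similar report can be produced directly from the command line. As a sketch (the -O option names are assumptions that should be verified against the trcrpt documentation for your AIX level):

# trcrpt -O exec=on,pid=on,tid=on /var/adm/ras/trcfile > /tmp/dlpar_trace.rpt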

For more information, see Appendix B, “Dynamic logical partitioning output samples” on page 273.

AIX error log facility

The drmgr command can generate error log messages in the few cases involving kernel, kernel extension, or platform failures that have been caused by a DLPAR event. Table 5-14 on page 211 shows a list of the possible errors that could be found in the system error log.


Table 5-14 AIX error logs generated by DLPAR operations

Error log entry: DR_SCRIPT_MSG
Description: Application script error or related messages, or both. The entry includes the failing script name and the DLPAR phase in which the error occurred.

Error log entry: DR_RECONFIG_HANDLER_MSG
Description: Kernel extension reconfiguration handler error. The entry includes the failing handler’s registration name.

Error log entry: DR_MEM_UNSAFE_USE
Description: A non-DLPAR-aware kernel extension’s use of physical memory is not valid. The result is that the affected memory is not available for DLPAR removal. The entry includes:
򐂰 The affected logical memory address
򐂰 An address corresponding to the kernel extension’s load module
򐂰 The kernel extension load module’s path name

Error log entry: DR_DMA_MEM_MIGRATE_FAIL
Description: Memory removal failure due to DMA activity. The affected LMB had active DMA mappings that could not be migrated by the platform. The entry includes:
򐂰 The logical memory address within the LMB
򐂰 The hypervisor migration return code
򐂰 The logical bus number of the slot owning the DMA mapping
򐂰 The DMA address

Error log entry: DR_DMA_MEM_MAPPER_FAIL
Description: Memory removal failure due to an error in a kernel extension responsible for controlling DMA mappings. The entry includes:
򐂰 The DMA mapper handler return code
򐂰 An address corresponding to the DMA mapper’s kernel extension load module
򐂰 The DMA mapper’s kernel extension load module’s path name
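These entries can be reviewed with the errpt command; -J selects entries by error label and -a prints the detailed report. For example (a sketch using the labels from Table 5-14):

# errpt -J DR_SCRIPT_MSG,DR_RECONFIG_HANDLER_MSG,DR_MEM_UNSAFE_USE
# errpt -a -J DR_MEM_UNSAFE_USE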


Kernel debugger

The AIX kernel debugger (KDB) helps isolate DLPAR operation errors. KDB can be especially useful for diagnosing errors found in kernel extensions. For further information about the use of KDB, refer to AIX 5L Version 5.3 Technical Reference: Kernel and Subsystems, available at:

http://techsupport.services.ibm.com/server/library

5.10.3 AIX error log messages when DLPAR operations fail

Several different AIX error messages can be generated when a DLPAR event failure occurs. The following tables list these error messages and the actions that should be taken in response.

Table 5-15 shows general error messages that can be displayed when a DLPAR event failure occurs and the recommended actions to take.

Table 5-15 General AIX error messages

Error message: You must have root authority to run this command.
Recommended action: Log in as the root user.

Error message: Failed to set the ODM data.
Recommended action: Contact an IBM service representative.

Error message: Consult AIX error log for more information.
Recommended action: Open the AIX error log and look for error log entries with the DR_ prefix.

Error message: Resource identifier out of range.
Recommended action: Consult the AIX syslog and HMC logs.

Table 5-16 describes the possible errors that can be generated by the drmgr command.

Table 5-16 drmgr-specific AIX error messages

Error message: Error building the DLPAR script information.
Recommended action: Check system resources, such as free space in the /var file system. If the problem persists, contact an IBM service representative.

Error message: Aborting DLPAR operation due to Check Phase failure.
Recommended action: Examine the AIX syslog. Contact the script/application owner.

Error message: Error: Specified DLPAR script file already exists in the destination directory.
Recommended action: Examine the AIX syslog. Contact the script/application owner to change the script name. Use the force flag (drmgr -f) to overwrite the pre-existing script.

Error message: Error: Specified DLPAR script file does not exist in directory.
Recommended action: The file could not be found for uninstallation. Check the file name specified.

Error message: The DLPAR operation is not supported.
Recommended action: The machine or configuration does not support that operation. Upgrade the system firmware or operating system software, or both.

Error message: Invalid parameter.
Recommended action: Contact an IBM service representative.

Errors can also occur while a DLPAR operation is taking place. Table 5-17 shows some of the error messages that AIX produces during a DLPAR operation.

Table 5-17 DLPAR operation-specific AIX error messages

Error message: DLPAR operation failed because of timeout.
Recommended action: Increase the timeout value, or try again later. Also, try the DLPAR operation without a timeout specified.

Error message: DLPAR operation failed. Kernel busy with another DLPAR operation.
Recommended action: Only perform one DLPAR operation at a time. Try again later.

Error message: DLPAR operation failed due to kernel error.
Recommended action: Examine the AIX syslog or contact an IBM service representative, or both.

Error message: The DLPAR operation could not be supported by one or more kernel extensions.
Recommended action: Find the corresponding AIX error log entry DR_RECONFIG_HANDLER_MSG and contact the kernel extension owner.

Error message: DLPAR operation failed since resource is already online.
Recommended action: Examine the AIX syslog and HMC log.

Error message: DLPAR operation timed out.
Recommended action: Increase the timeout, or try again later. Also, try initiating the DLPAR operation without a timeout specified.


Finally, several DLPAR errors result from resource events. Table 5-18 lists these types of errors.

Table 5-18 DLPAR resource-specific AIX error messages

Error message: The specified connector type is invalid, or there is no dynamic reconfiguration support for connectors of this type on this system.
Recommended action: Examine the AIX syslog and HMC log.

Error message: Insufficient resource to complete operation.
Recommended action: Try again later. Free up resources and try again.

Error message: CPU could not be started.
Recommended action: Examine the AIX syslog and HMC log.

Error message: Memory could not be released. DLPAR operation failed since a kernel extension controlling DMA mappings could not support the operation.
Recommended action: Examine the AIX error log and look for the DR_DMA_MAPPER_FAIL entry. The logical memory block or address in the message should correspond to the logical memory address in the error log entry. The LR value in the error log is an address within the failing kernel extension. The Module Name in the error log is the path name of the kernel extension load module. Unconfigure the kernel extension and retry. Contact the kernel extension owner.

Error message: Resource could not be found for the DLPAR operation.
Recommended action: Examine the AIX syslog and HMC log.

Error message: Resource is busy and cannot be released.
Recommended action: Examine the AIX syslog. Quiesce activity using the resource and try again.

Error message: Memory in use by a non-DLPAR-safe kernel extension and hence cannot be released.
Recommended action: Examine the AIX error log and look for the DR_MEM_UNSAFE_USE entry. The logical memory block or address in the message should correspond to the logical memory address in the error log entry. The LR value in the error log is an address within the owning kernel extension. The Module Name in the error log is the path name of the kernel extension load module. Unconfigure the kernel extension and retry. Contact the kernel extension owner.

Error message: Memory could not be released because the system does not contain enough resources for optimal pinned memory and large page memory ratios.
Recommended action: Reduce pinned or large page memory requirements, or both, and try again.
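As a sketch of how to check the current pinned and large page memory settings before retrying (the tunable names shown are the common AIX 5L ones and should be verified with vmo -a on your level):

# vmo -a | egrep 'lgpg_regions|lgpg_size|v_pinshm'
# svmon -G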

