Friday, February 15, 2008

Running Oracle on AIX

A systems administrator always needs to be cognizant of system performance. Performance tuning on IBM® AIX® has changed considerably in recent years due to changes that have been made in AIX and its hardware platform, System p™. If you were to read an AIX-specific performance tuning document from two years ago and applied the same strategies today, not only might you not be improving performance, but in some cases you would be making things worse. As an administrator, while at times you may find that changing some parameters on the fly might increase performance dramatically and fairly quickly, performance tuning in a database environment like Oracle is a marathon, not a sprint. This article drills down into the many aspects of tuning AIX to run Oracle. We'll look at the Virtual Memory Manager (VMM), CPU, Memory, and I/O (disk and network). We'll examine some of the tools that you can use to analyze bottlenecks, while also making some changes to the system. Finally, we'll also review some Oracle tools you can use to help with your performance tuning.

Introduction

As a systems administrator, you should already know some of the basics of memory, CPU, and Disk I/O (see the Resources section for articles on these subjects). What you may not fully understand is how the VMM works in AIX and what that means to Oracle. You will also find that because many of the AIX tuning commands and parameters have changed in recent years, Oracle has changed also, and there are changes to utilities such as the Oracle Enterprise Manager, which is an important utility you should definitely take the time to learn and add to your repertoire.

This article discusses in detail the AIX VMM and the tuning commands that you will be using to tune memory. It also introduces some of the monitoring tools that you will be using, which will help put you in a position to tune.

Before we get started, it is important to note that you must have an overall approach to what you are doing. Make sure you use proper change control processes; only make one change at a time and monitor that change very carefully before introducing that change into other environments, particularly production. Performance tuning really is an iterative, ongoing process and you'll oftentimes find that by fixing one bottleneck you will create another, which is okay as long as you continuously look to improve the health of your systems. Make sure that you start monitoring your system at the beginning, well before your users are screaming about slow performance. How can you know what a poorly performing system is like unless you know what a healthy system looks like? A proper baseline is key. The system we'll be looking at is running Oracle 10g -- 10.1.0.2.0 and AIX 5.3 TL7 on a POWER5™ LPAR with one CPU and 4GB of RAM.


 


 

Memory

In this section, we'll review memory as it relates to AIX and Oracle. We'll discuss how AIX uses virtual memory and how this relates to Oracle. We'll also analyze the data and tune our subsystems.

Let's start with VMM. It's important to understand that the VMM services all memory requests from the system, not just virtual memory. When RAM is accessed, the VMM must allocate space even where there is plenty of physical memory left on the box. This is what confuses both DBAs and systems administrators at times. It does this by using a process called early allocation of paging space, by partitioning segments into pages. These pages can be either RAM or paging space (virtual memory stored on disk). At the same time, VMM maintains a free list of unallocated page frames, which are used to satisfy page faults. The VMM has a page-replacement algorithm, which assigns the page frames and determines exactly which virtual-memory pages currently in RAM will have their page frames brought back to the free list.

Furthermore, the AIX operating system will use all available memory, except that which is configured to be unallocated and known as the free list. Obviously, administrators prefer to use physical memory rather than paging space, where the physical memory is available. VMM classifies memory segments into two categories: persistent segments and working segments. Persistent segments use file memory and working segments use computational memory. What does this mean to us? It's the computational memory that is used while your SQL queries are accessing the database. These are working segments and will terminate when the process is completed. These segments have no real permanent location. On the other hand, file memory uses persistent segments and do have permanent locations on the disks. They will remain in memory usually until the pages are stolen or the database is recycled. Again, you want the file memory paged to disk and not the computational memory.

How do we tune our systems? One critical parameter worth discussing is the Translation Lookaside Buffer (TLB). Applications like Oracle exploit a tremendous amount of virtual memory, so using large pages increases performance substantially. Increasing the size of this buffer allows the system to map more virtual memory, which results in a lower miss rate for applications that use a lot of virtual memory like Oracle. This includes both OLTP and Data Warehouse applications. Oracle uses large pages for its SGA, because it is the SGA that really dominates virtual memory. With AIX 5.3 and older, we will use vmo; prior to that, we used vmtune.

Let's look at the parameters, using vmo, as shown in Listing 1.


Listing 1. Parameters using vmo
 
                 
root@lpar21ml16ed_pub[/] > vmo -L lgpg_size
NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
     DEPENDENCIES
--------------------------------------------------------------------------------
lgpg_size                 0      0      0      0      16M    bytes             D
     lgpg_regions


root@lpar21ml16ed_pub[/] > vmo -L lgpg_regions
NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
     DEPENDENCIES
--------------------------------------------------------------------------------
lgpg_regions              0      0      0      0                               D
     lgpg_size

 

Using the following command, we'll allocate 16777216 bytes to provide large pages, with 256 actual large pages:

# vmo -r -o lgpg_size=16777216 lgpg_regions=256

At the same time, with Oracle Database 10g, make sure that the LOCK_SGA Oracle initialization parameter is set to TRUE, so that Oracle request large pages when allocating shared memory. By far, the two most important vmo settings are minperm and maxperm. We use these parameters to determine whether our system favors computational memory or file memory. The first thing we do here is make certain that our lru_file_repage parameter = 0. This parameter was introduced in ML1 of AIX 5.3 and determines if the VMM repage-counts are considered and the type of memory it should steal (see Listing 2).


Listing 2. The lru_file_repage parameter
 
                
root@lpar21ml16ed_pub[/] > vmo -L lru_file_repage
NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
     DEPENDENCIES
--------------------------------------------------------------------------------
lru_file_repage           1      1      1      0      1      boolean           D
--------------------------------------------------------------------------------
root@lpar21ml16ed_pub[/] >

 

As shown in Listing 2, the default is 1, so we'll need to change this using vmo (see Listing 3).


Listing 3. Changing the default setting for the lru_file_repage parameter using vmo
 
                
root@lpar21ml16ed_pub[/] > vmo -o lru_file_repage=0
Setting lru_file_repage to 0
root@lpar21ml16ed_pub[/] >

 

Setting this to 0 tells the VMM that you want to steal only file pages and not computational pages. As this will change if the numperm < minperm or > maxperm, we will make maxperm high and minperm very low. Years ago, before the lru_file_repage parameter was introduced, we used to make maxperm low. If we did this now, we would stop the application caching programs that are currently running.

Listing 4 shows how we'll set these parameters:


Listing 4. Setting the minperm, maxperm and maxclient parameters
 
                
vmo -p -o minperm%=5
vmo -p -o maxperm%=90
vmo -p -o maxclient%=90

 

We also want to take a look at minfree and maxfree. When the pages on our free list fall below minfree, the VMM will start to steal pages, which we don't want to happen until the free list has beefed up the number in maxfree. The values should be similar to the ones shown in Listing 5.


Listing 5. Setting the minfree and maxfree parameters
 
                
vmo -p -o minfree=960
vmo -p -o maxfree=1088


 


 

CPU

In this section, we'll discuss CPU as it relates to AIX and Oracle. We'll discuss how we can tune our CPU subsystems and take advantage of recent System p innovations to increase Oracle performance.

Let's start with SMT. This important POWER5 innovation allows for the ability of one single physical processor to concurrently dispatch instructions from several hardware threads. In AIX 5L Version 5.3, a dedicated partition created with one physical processor is configured as a logical two-way by turning on SMT, which allows two hardware threads to run on one physical processor at the same time. You should always leave SMT on with Oracle (see Listing 6).


Listing 6. Leaving SMT on with Oracle
 
                
oot@lpar21ml16ed_pub[/home/u0004773] > smtctl

This system is SMT capable.

SMT is currently enabled.

SMT boot mode is not set.
SMT threads are bound to the same virtual processor.

proc0 has 2 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0

root@lpar21ml16ed_pub[/home/u0004773] >

 

Let's run a performance-monitoring utility, mpstat (see Listing 7).


Listing 7. Running the mpstat utility
 
                
root@lpar21ml16ed_pub[/] > mpstat 1 5

System configuration: lcpu=2 ent=0.2 mode=Uncapped

cpu  min  maj  mpc  int   cs  ics   rq  mig lpa sysc us sy wa id   pc  %ec  lcs
  0    0    0    0  557  274  128    1    1 100  682 26 51  0 22 0.02  9.9  769
  1    0    0    0  289    2    2    1    1 100    0  0 27  0 73 0.01  4.1  772
  U    -    -    -    -    -    -    -    -   -    -  -  -  0 86 0.22 86.1    -
ALL    0    0    0  846  276  130    2    2 100  682  3  6  0 91 0.03 13.9 1541

 

Though our system has only one physical CPU, we can see that both logical CPU's come up when analyzing our systems.

Another important utility worth mentioning is nmon, which has been my favorite monitoring utility for years now (see Figure 1).


Figure 1. nmon output
nmon analyzer
 

Although nmon shows activity by CPU, you can use different flags to show the amount of activity that the Oracle processes are using. Furthermore, using the nmon analyzer, you can download information into spreadsheets and compile nice-looking charts that senior management likes to see.

There are some other important things you can do with CPU:

  • Processor affinity -- This allows processes to run on specific processors. You can actually correlate specific processes with running processes.
  • Nice and Renice -- These change the priority of running processes. It is not recommended to renice Oracle processes.

Another utility that is important with monitoring CPU is vmstat, which will also quickly let you know where a bottleneck resides.


 


 

Disk I/O

In this section, we'll discuss the disk I/O subsystem as it relates to AIX and Oracle. We'll review how we can monitor and tune our I/O subsystems and also discuss some important subsystems that relate to I/O.

When our system is slow, most inexperienced administrators will usually look at CPU. It is, however, the disk I/O subsystem that can cause the most problems. We'll examine the ever-important asynchronous I/O and concurrent I/O in this section, as well.

Asynchronous I/O (AIO) servers

AIO determines if Oracle is waiting for your I/O to complete prior to starting new processing. If asynchronous I/O is not tuned properly, it can significantly affect the overall performance of writes on the I/O subsystem. What it does is allow the system to continue processing while I/O completes in the background. This improves performance significantly because processes can run at the same time as I/O is going on. We can monitor the AIO subsystem by using either iostat or nmon (see Listing 8).


Listing 8. Monitoring the AIO subsystem using iostat
 
                
oot@lpar21ml16ed_pub[/home/u0004773] > iostat -A 1 5

System configuration: lcpu=2 drives=2 ent=0.25 paths=2 vdisks=2

aio: avgc avfc maxg maif maxr avg-cpu: % user % sys % idle % iowait physc % entc
        0    0  312    0 4096             3.1   7.1   89.8      0.0   0.0   16.7

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk1           0.0       0.0       0.0          0         0
hdisk0           0.0       0.0       0.0          0         0

 

The following list is a description of parameters used to monitor the AIO subsystem.

  • avfc: This reports back the average fastpath request count per second for your interval.
  • avgc: This reports back the average global asynchronous I/O request per second of the interval you specified.
  • maxgc: This reports back the max global asynchronous I/O request since the last time this value was fetched.
  • maxfc: This reports back the maximum fastpath request count since the last time this value was fetched.
  • maxreqs: This is the maximum asynchronous I/O requests allowed.

In our case, AIO servers are not a system bottleneck.

Concurrent I/O (CIO)

CIO, introduced in AIX Version 5.2, is an extremely important system that you should use in your Oracle environment. Similar to its predecessor, direct I/O, when turned on, it allows filesystem I/O to bypass the VMM and transfer data directly to disk from the user's buffer. CIO allows multiple threads to read and write data concurrently to the same file, which is due to the way in which JFS2 is implemented, allowing users to read and write simultaneously. In order to turn this on, you mount your filesystems with the cio flag: # mount -o cio /orafilesystem.

These elements are important to consider with CIO:

  • Raw devices -- While some Oracle DBAs like to create raw logical volumes for their data, and there is little argument about the performance benefit, in most cases it is too difficult to administer and usually I've found that the UNIX® administrators can talk the Oracle DBAs out of this one. With the advent of CIO, I would not use raw logical volumes unless performance is the driving factor of everything you are doing and you have the staff that can maintain the complexities inherent in this type of environment.
  • Spreading the wealth -- The more spindles you have, the more you should spread your wealth around. The more adapters you will have, the more performance will also increase. You should also try to keep indexes and redo logs off the same volumes as your data.
  • SAN -- Make sure you spend time looking at your SAN; optimizing the hardware will help you more than anything you can do at the OS level.

 


 

Oracle tools

In this section, we'll look at Oracle-specific tools that can help you with your AIX administration.

Statspack

This is an Oracle performance diagnosis tool, and I highly recommend that Unix administrators learn to use this tool. It's really not that hard, once you have it set up and configured. This is done from sql once you have the Oracle installed. There are really two types of collection options: level and threshold. You need to configure the level parameter, which controls the type of data collected from Oracle. The threshold parameter acts as a filter for the collection of SQL statements the status summary tables.

Here's how to install it. After logging on to the systems as Oracle, start up sqlplus and then just follow the steps as instructed (see Listing 9).


Listing 9. Starting up sqlplus to install Statspack
 
                

SQL*Plus: Release 10.1.0.2.0 - Production on Sun May 18 19:21:21 2008

Copyright (c) 1982, 2004, Oracle.  All rights reserved.

Enter user-name: system as sysdba
Enter password:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> execute
SQL> @?/rdbms/admin/spcreate

Choose the PERFSTAT user's password
-----------------------------------
Not specifying a password will result in the installation FAILING

Oracle Enterprise Manager
choose the Temporary tablespace for the PERFSTAT user
-----------------------------------------------------
Below is the list of online tablespaces in this database which can
store temporary data (e.g. for sort workareas).  Specifying the SYSTEM
tablespace for the user's temporary tablespace will result in the
installation FAILING, as using SYSTEM for workareas is not supported.

Choose the PERFSTAT user's Temporary tablespace.

 

Oracle Enterprise Manager

The Oracle Enterprise Manager is a tool that I've used for years. In order to turn it on, you'll need to make sure you first allow it to run when installing Oracle or creating a database using the Oracle dbca utility. After the database is created, you'll need to turn OEM on using: $ emctl start dbconsole.

This is what you'll put in your browser: http://lpar21ml16ed_pub:5505/em.

After logging in, you'll see something like Figure 2.


Figure 2. The Oracle Enterprise Manager
The Oracle Enterprise Manager
 

There is so much you can monitor and tune within OEM that there are actually books on this utility. If you are working in an Oracle environment, this is a must-use system.


 


 

Summary

In this article, we introduced the concepts of performance tuning as it relates to Oracle. We looked at the memory, CPU, and I/O subsystems as we analyzed and tuned our systems. We captured data and analyzed the results of our changes. We discussed important systems such as concurrent I/O and why implementing these systems will help our systems perform better. We also discussed some important kernel parameters, what they do, and how to tune them. At the same time, we made note of some important changes through the years and our approach to certain parameters. We also looked at some Oracle-specific utilities and how they could help us as AIX systems administrators.

No comments: