A systems administrator always needs to be cognizant of
system performance. Performance tuning on IBM® AIX® has
changed considerably in recent years due to changes that
have been made in AIX and its hardware platform, System
p™. If you were to read an AIX-specific performance
tuning document from two years ago and applied the same
strategies today, not only might you not be improving
performance, but in some cases you would be making
things worse. As an administrator, while at times you
may find that changing some parameters on the fly might
increase performance dramatically and fairly quickly,
performance tuning in a database environment like Oracle
is a marathon, not a sprint. This article drills down
into the many aspects of tuning AIX to run Oracle. We'll
look at the Virtual Memory Manager (VMM), CPU, Memory,
and I/O (disk and network). We'll examine some of the
tools that you can use to analyze bottlenecks, while
also making some changes to the system. Finally, we'll
also review some Oracle tools you can use to help with
your performance tuning.
Introduction
As a systems administrator, you should already know some
of the basics of memory, CPU, and Disk I/O (see the
Resources
section for articles on these subjects). What you may not
fully understand is how the VMM works in AIX and what that
means to Oracle. You will also find that because many of the
AIX tuning commands and parameters have changed in recent
years, Oracle has changed also, and there are changes to
utilities such as the Oracle Enterprise Manager, which is an
important utility you should definitely take the time to
learn and add to your repertoire.
This article discusses in detail the AIX VMM and the
tuning commands that you will be using to tune memory. It
also introduces some of the monitoring tools that you will
be using, which will help put you in a position to tune.
Before we get started, it is important to note that you
must have an overall approach to what you are doing. Make
sure you use proper change control processes; only make one
change at a time and monitor that change very carefully
before introducing that change into other environments,
particularly production. Performance tuning really is an
iterative, ongoing process and you'll oftentimes find that
by fixing one bottleneck you will create another, which is
okay as long as you continuously look to improve the health
of your systems. Make sure that you start monitoring your
system at the beginning, well before your users are
screaming about slow performance. How can you know what a
poorly performing system is like unless you know what a
healthy system looks like? A proper baseline is key. The
system we'll be looking at is running Oracle 10g --
10.1.0.2.0 and AIX 5.3 TL7 on a POWER5™ LPAR with one CPU
and 4GB of RAM.
Memory
In this section, we'll review memory as it relates to AIX
and Oracle. We'll discuss how AIX uses virtual memory and
how this relates to Oracle. We'll also analyze the data and
tune our subsystems.
Let's start with VMM. It's important to understand that
the VMM services all memory requests from the system, not
just virtual memory. When RAM is accessed, the VMM must
allocate space even where there is plenty of physical memory
left on the box. This is what confuses both DBAs and systems
administrators at times. It does this by using a process
called early allocation of paging space, by partitioning
segments into pages. These pages can be either RAM or paging
space (virtual memory stored on disk). At the same time, VMM
maintains a free list of unallocated page frames, which are
used to satisfy page faults. The VMM has a page-replacement
algorithm, which assigns the page frames and determines
exactly which virtual-memory pages currently in RAM will
have their page frames brought back to the free list.
Furthermore, the AIX operating system will use all
available memory, except that which is configured to be
unallocated and known as the free list. Obviously,
administrators prefer to use physical memory rather than
paging space, where the physical memory is available. VMM
classifies memory segments into two categories: persistent
segments and working segments. Persistent segments use file
memory and working segments use computational memory. What
does this mean to us? It's the computational memory that is
used while your SQL queries are accessing the database.
These are working segments and will terminate when the
process is completed. These segments have no real permanent
location. On the other hand, file memory uses persistent
segments and do have permanent locations on the disks. They
will remain in memory usually until the pages are stolen or
the database is recycled. Again, you want the file memory
paged to disk and not the computational memory.
How do we tune our systems? One critical parameter worth
discussing is the Translation Lookaside Buffer (TLB).
Applications like Oracle exploit a tremendous amount of
virtual memory, so using large pages increases performance
substantially. Increasing the size of this buffer allows the
system to map more virtual memory, which results in a lower
miss rate for applications that use a lot of virtual memory
like Oracle. This includes both OLTP and Data Warehouse
applications. Oracle uses large pages for its SGA, because
it is the SGA that really dominates virtual memory. With AIX
5.3 and older, we will use vmo; prior to that, we used
vmtune.
Let's look at the parameters, using vmo, as shown in
Listing 1.
Listing 1. Parameters using vmo
root@lpar21ml16ed_pub[/] > vmo -L lgpg_size
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
lgpg_size 0 0 0 0 16M bytes D
lgpg_regions
root@lpar21ml16ed_pub[/] > vmo -L lgpg_regions
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
lgpg_regions 0 0 0 0 D
lgpg_size
|
Using the following command, we'll allocate 16777216 bytes to provide
large pages, with 256 actual large pages:
# vmo -r -o lgpg_size=16777216 lgpg_regions=256
|
.
At the same time, with Oracle Database 10g, make sure
that the LOCK_SGA Oracle initialization
parameter is set to TRUE, so that Oracle request large pages
when allocating shared memory. By far, the two most
important vmo settings are minperm and
maxperm . We use these parameters to determine whether
our system favors computational memory or file memory. The
first thing we do here is make certain that our
lru_file_repage parameter = 0. This parameter was
introduced in ML1 of AIX 5.3 and determines if the VMM
repage-counts are considered and the type of memory it
should steal (see Listing 2).
Listing 2. The lru_file_repage
parameter
root@lpar21ml16ed_pub[/] > vmo -L lru_file_repage
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
lru_file_repage 1 1 1 0 1 boolean D
--------------------------------------------------------------------------------
root@lpar21ml16ed_pub[/] >
|
As shown in Listing 2, the default is 1, so we'll need to change this
using vmo (see Listing 3).
Listing 3. Changing the default setting
for the lru_file_repage parameter using vmo
root@lpar21ml16ed_pub[/] > vmo -o lru_file_repage=0
Setting lru_file_repage to 0
root@lpar21ml16ed_pub[/] >
|
Setting this to 0 tells the VMM that you want to steal only file
pages and not computational pages. As this will change if
the numperm < minperm or >
maxperm , we will make maxperm high and
minperm very low. Years ago, before the
lru_file_repage parameter was introduced, we used to
make maxperm low. If we did this now, we would
stop the application caching programs that are currently
running.
Listing 4 shows how we'll set these parameters:
Listing 4. Setting the minperm ,
maxperm and maxclient parameters
vmo -p -o minperm%=5
vmo -p -o maxperm%=90
vmo -p -o maxclient%=90
|
We also want to take a look at minfree and maxfree .
When the pages on our free list fall below minfree ,
the VMM will start to steal pages, which we don't want to
happen until the free list has beefed up the number in
maxfree . The values should be similar to the ones
shown in Listing 5.
Listing 5. Setting the minfree
and maxfree parameters
vmo -p -o minfree=960
vmo -p -o maxfree=1088
|
CPU
In this section, we'll discuss CPU as it relates to AIX
and Oracle. We'll discuss how we can tune our CPU subsystems
and take advantage of recent System p innovations to
increase Oracle performance.
Let's start with SMT. This important POWER5 innovation
allows for the ability of one single physical processor to
concurrently dispatch instructions from several hardware
threads. In AIX 5L Version 5.3, a dedicated partition
created with one physical processor is configured as a
logical two-way by turning on SMT, which allows two hardware
threads to run on one physical processor at the same time.
You should always leave SMT on with Oracle (see Listing 6).
Listing 6. Leaving SMT on with Oracle
oot@lpar21ml16ed_pub[/home/u0004773] > smtctl
This system is SMT capable.
SMT is currently enabled.
SMT boot mode is not set.
SMT threads are bound to the same virtual processor.
proc0 has 2 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0
root@lpar21ml16ed_pub[/home/u0004773] >
|
Let's run a performance-monitoring utility, mpstat (see Listing 7).
Listing 7. Running the mpstat utility
root@lpar21ml16ed_pub[/] > mpstat 1 5
System configuration: lcpu=2 ent=0.2 mode=Uncapped
cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc %ec lcs
0 0 0 0 557 274 128 1 1 100 682 26 51 0 22 0.02 9.9 769
1 0 0 0 289 2 2 1 1 100 0 0 27 0 73 0.01 4.1 772
U - - - - - - - - - - - - 0 86 0.22 86.1 -
ALL 0 0 0 846 276 130 2 2 100 682 3 6 0 91 0.03 13.9 1541
|
Though our system has only one physical CPU, we can see that both
logical CPU's come up when analyzing our systems.
Another important utility worth mentioning is nmon, which
has been my favorite monitoring utility for years now (see
Figure 1).
Figure 1. nmon output
Although nmon shows activity by CPU, you can use different flags to
show the amount of activity that the Oracle processes are
using. Furthermore, using the nmon analyzer, you can
download information into spreadsheets and compile
nice-looking charts that senior management likes to see.
There are some other important things you can do with
CPU:
- Processor affinity -- This allows processes to run
on specific processors. You can actually correlate
specific processes with running processes.
- Nice and Renice -- These change the priority of
running processes. It is not recommended to renice
Oracle processes.
Another utility that is important with monitoring CPU is
vmstat, which will also quickly let you know where a
bottleneck resides.
Disk I/O
In this section, we'll discuss the disk I/O subsystem as
it relates to AIX and Oracle. We'll review how we can
monitor and tune our I/O subsystems and also discuss some
important subsystems that relate to I/O.
When our system is slow, most inexperienced
administrators will usually look at CPU. It is, however, the
disk I/O subsystem that can cause the most problems. We'll
examine the ever-important asynchronous I/O and concurrent
I/O in this section, as well.
Asynchronous
I/O (AIO) servers
AIO determines if Oracle is waiting for your I/O to
complete prior to starting new processing. If asynchronous
I/O is not tuned properly, it can significantly affect the
overall performance of writes on the I/O subsystem. What it
does is allow the system to continue processing while I/O
completes in the background. This improves performance
significantly because processes can run at the same time as
I/O is going on. We can monitor the AIO subsystem by using
either iostat or nmon (see Listing 8).
Listing 8. Monitoring the AIO subsystem
using iostat
oot@lpar21ml16ed_pub[/home/u0004773] > iostat -A 1 5
System configuration: lcpu=2 drives=2 ent=0.25 paths=2 vdisks=2
aio: avgc avfc maxg maif maxr avg-cpu: % user % sys % idle % iowait physc % entc
0 0 312 0 4096 3.1 7.1 89.8 0.0 0.0 16.7
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk1 0.0 0.0 0.0 0 0
hdisk0 0.0 0.0 0.0 0 0
|
The following list is a description of parameters used to monitor the
AIO subsystem.
avfc : This reports back the average
fastpath request count per second for your interval.
avgc : This reports back the average
global asynchronous I/O request per second of the
interval you specified.
maxgc : This reports back the max global
asynchronous I/O request since the last time this value
was fetched.
maxfc : This reports back the maximum
fastpath request count since the last time this value
was fetched.
maxreqs : This is the maximum
asynchronous I/O requests allowed.
In our case, AIO servers are not a system bottleneck.
Concurrent I/O
(CIO)
CIO, introduced in AIX Version 5.2, is an extremely
important system that you should use in your Oracle
environment. Similar to its predecessor, direct I/O, when
turned on, it allows filesystem I/O to bypass the VMM and
transfer data directly to disk from the user's buffer. CIO
allows multiple threads to read and write data concurrently
to the same file, which is due to the way in which JFS2 is
implemented, allowing users to read and write
simultaneously. In order to turn this on, you mount your
filesystems with the cio flag: # mount -o
cio /orafilesystem .
These elements are important to consider with CIO:
- Raw devices -- While some Oracle DBAs like to create
raw logical volumes for their data, and there is little
argument about the performance benefit, in most cases it
is too difficult to administer and usually I've found
that the UNIX® administrators can talk the Oracle DBAs
out of this one. With the advent of CIO, I would not use
raw logical volumes unless performance is the driving
factor of everything you are doing and you have the
staff that can maintain the complexities inherent in
this type of environment.
- Spreading the wealth -- The more spindles you have,
the more you should spread your wealth around. The more
adapters you will have, the more performance will also
increase. You should also try to keep indexes and redo
logs off the same volumes as your data.
- SAN -- Make sure you spend time looking at your SAN;
optimizing the hardware will help you more than anything
you can do at the OS level.
Oracle tools
In this section, we'll look at Oracle-specific tools that
can help you with your AIX administration.
Statspack
This is an Oracle performance diagnosis tool, and I
highly recommend that Unix administrators learn to use this
tool. It's really not that hard, once you have it set up and
configured. This is done from sql once you have the Oracle
installed. There are really two types of collection options:
level and threshold . You need to
configure the level parameter, which controls
the type of data collected from Oracle. The threshold
parameter acts as a filter for the collection of SQL
statements the status summary tables.
Here's how to install it. After logging on to the systems
as Oracle, start up sqlplus and then just follow the steps
as instructed (see Listing 9).
Listing 9. Starting up sqlplus to
install Statspack
SQL*Plus: Release 10.1.0.2.0 - Production on Sun May 18 19:21:21 2008
Copyright (c) 1982, 2004, Oracle. All rights reserved.
Enter user-name: system as sysdba
Enter password:
Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> execute
SQL> @?/rdbms/admin/spcreate
Choose the PERFSTAT user's password
-----------------------------------
Not specifying a password will result in the installation FAILING
Oracle Enterprise Manager
choose the Temporary tablespace for the PERFSTAT user
-----------------------------------------------------
Below is the list of online tablespaces in this database which can
store temporary data (e.g. for sort workareas). Specifying the SYSTEM
tablespace for the user's temporary tablespace will result in the
installation FAILING, as using SYSTEM for workareas is not supported.
Choose the PERFSTAT user's Temporary tablespace.
|
Oracle Enterprise Manager
The Oracle Enterprise Manager is a tool that I've used
for years. In order to turn it on, you'll need to make sure
you first allow it to run when installing Oracle or creating
a database using the Oracle dbca utility. After the database
is created, you'll need to turn OEM on using: $ emctl
start dbconsole .
This is what you'll put in your browser:
http://lpar21ml16ed_pub:5505/em .
After logging in, you'll see something like Figure 2.
Figure 2. The Oracle Enterprise Manager
There is so much you can monitor and tune within OEM that there are
actually books on this utility. If you are working in an
Oracle environment, this is a must-use system.
Summary
In this article, we introduced the concepts of
performance tuning as it relates to Oracle. We looked at the
memory, CPU, and I/O subsystems as we analyzed and tuned our
systems. We captured data and analyzed the results of our
changes. We discussed important systems such as concurrent
I/O and why implementing these systems will help our systems
perform better. We also discussed some important kernel
parameters, what they do, and how to tune them. At the same
time, we made note of some important changes through the
years and our approach to certain parameters. We also looked
at some Oracle-specific utilities and how they could help us
as AIX systems administrators. |
No comments:
Post a Comment