Tuesday, April 15, 2008

Shared library memory footprints on AIX 5L

Learn about shared library mechanisms and memory footprints on IBM® AIX®. This article is aimed at developers writing native server code and administrators managing production AIX systems. It offers commands and techniques, along with the background needed, to analyze the memory requirements of server processes on AIX. It also helps you avoid resource shortages that can't be identified with standard runtime analysis tools such as ps or topas.

Introduction

This article examines how shared libraries occupy memory on 32-bit AIX 5L™ (5.3), demonstrating the following commands:

  • ps
  • svmon
  • slibclean
  • procldd
  • procmap
  • genkld
  • genld

The article discusses the virtual address space of processes, as well as the kernel shared-library segment, how to examine them, and how to interpret the output of the various diagnostic utilities mentioned above. The article also discusses how to diagnose situations where the kernel shared segment is full and possible approaches to resolving that situation.

In the examples throughout, we use processes from the software product Business Objects Enterprise XIr2®. The choice is arbitrary; the concepts apply to all processes running on AIX 5L.



 

Review

To make sure we share the same starting point, let's briefly review 32-bit architecture. Along the way, I'll use the handy 'bc' command-line calculator.

In a 32-bit processor, the registers are capable of holding 2^32 possible values:

 $ bc
 2^32
 4294967296
 obase=16
 2^32
 100000000


 

That is a 4 gigabyte range. This means a program running on the system can access any function or data address in the range 0 to 2^32 - 1.

 $ bc
 obase=16
 2^32 - 1
 FFFFFFFF
 obase=10
 2^32 - 1 
 4294967295


 

Now, as you know, any operating system has potentially hundreds of programs running at the same time. Even though each one of them is capable of accessing a 4GB range of memory, it doesn't mean that they each get their own 4GB allotment of physical RAM. That would be impractical. Rather, the OS implements a sophisticated scheme of swapping code and data between a moderate amount of physical RAM and areas of the file system designated as swap (or paging) space. Moreover, even though each process is capable of accessing 4GB of memory space, many don't even use most of it. So the OS only loads or swaps the required amount of code and data for each particular process.


Figure 1. Conceptual diagram of virtual memory
 

This mechanism is often referred to as virtual memory and virtual address spaces.

When an executable file is run, the Virtual Memory Manager (VMM) of the OS looks at the code and data that comprise the file, and decides which parts it will load into RAM, load into swap, or reference from the file system. At the same time, it establishes structures that map the physical locations to virtual locations in the 4GB range. This 4GB range represents the process's maximum theoretical extent and, sometimes together with the VMM structures that represent it, is known as the virtual address space of the process.

On AIX, the 4GB virtual address space is divided into sixteen 256-megabyte segments. The segments have predetermined functions, some of which are described below:

  • Segment 0 is for kernel-related data.
  • Segment 1 is for code.
  • Segment 2 is for stack and dynamic memory allocation.
  • Segment 3 is for mapped files and mmap'd memory.
  • Segment d is for shared library code.
  • Segment f is for shared library data.
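
As a small illustration of the segment layout above, the segment an address belongs to can be read from its top four bits, because each segment spans 2^28 bytes (256MB). The following is a minimal sketch, not an AIX utility: segment_of is a hypothetical helper name, and it assumes a shell whose arithmetic accepts 0x hexadecimal constants (such as bash). The sample addresses are taken from the procmap listings shown later in this article.

```shell
#!/bin/sh
# Sketch: print the segment number (one hex digit) of a 32-bit address.
# segment_of is a hypothetical helper name, not an AIX command.
segment_of() {
    # $1 is a hex address without the 0x prefix, e.g. d0049000.
    # Each segment spans 2^28 bytes, so shifting right by 28 bits
    # leaves only the segment number.
    printf '%x\n' $(( 0x$1 >> 28 ))
}

segment_of 10000000   # program code
segment_of d0049000   # a shared library text mapping
segment_of f03e6000   # a shared library data mapping
```

The three calls print 1, d, and f, matching the segment roles in the list above.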

On HP-UX® by comparison, the address space is divided into four quadrants. Quadrants three and four are available for shared library mappings if they are designated using the chatr command with the +q3p enable and +q4p enable options.



 

Where shared libraries are loaded

Shared libraries are, naturally, intended to be shared. More specifically, the read-only sections of the binary image, namely the code (also known as "text") and read-only data (const data, and data that can be copy-on-write), may be loaded once into physical memory and mapped into every process that requires them.

To demonstrate this, take a running AIX machine and see which shared libraries are presently loaded:

> su 
# genkld
Text address     Size File

    d1539fe0    1a011 /usr/lib/libcurses.a[shr.o]
    d122f100    36732 /usr/lib/libptools.a[shr.o]
    d1266080    297de /usr/lib/libtrace.a[shr.o]
    d020c000     5f43 /usr/lib/nls/loc/iconv/ISO8859-1_UCS-2
    d7545000    161ff /usr/java14/jre/bin/libnet.a
    d7531000    135e2 /usr/java14/jre/bin/libzip.a
.... [ lots more libs ] ....
    d1297108     3a99 /opt/rational/clearcase/shlib/libatriastats_svr.a[atriastats_svr-shr.o]
    d1bfa100    2bcdf /opt/rational/clearcase/shlib/libatriacm.a[atriacm-shr.o]
    d1bbf100    2cf3c /opt/rational/clearcase/shlib/libatriaadm.a[atriaadm-shr.o]
.... [ lots more libs ] ....
    d01ca0f8     17b6 /usr/lib/libpthreads_compat.a[shr.o]
    d10ff000    30b78 /usr/lib/libpthreads.a[shr.o]
    d00f0100    1fd2f /usr/lib/libC.a[shr.o]
    d01293e0    25570 /usr/lib/libC.a[shrcore.o]
    d01108a0    18448 /usr/lib/libC.a[ansicore_32.o]
.... [ lots more libs ] ....
    d04a2100    fdb4b /usr/lib/libX11.a[shr4.o]
    d0049000    365c4 /usr/lib/libpthreads.a[shr_xpg5.o]
    d0045000     3c52 /usr/lib/libpthreads.a[shr_comm.o]
    d05bb100     5058 /usr/lib/libIM.a[shr.o]
    d05a7100    139c1 /usr/lib/libiconv.a[shr4.o]
    d0094100    114a2 /usr/lib/libcfg.a[shr.o]
    d0081100    125ea /usr/lib/libodm.a[shr.o]
    d00800f8      846 /usr/lib/libcrypt.a[shr.o]
    d022d660   25152d /usr/lib/libc.a[shr.o]

 

As an interesting observation, we can see right away that ClearCase and Java™ are running on this machine. Let's take one of these common libraries, say libpthreads.a, and browse it to see which symbols it exports:

# dump -Tv /usr/lib/libpthreads.a | grep EXP
[278]   0x00002808    .data      EXP     RW SECdef        [noIMid] pthread_attr_default
[279]   0x00002a68    .data      EXP     RW SECdef        [noIMid] pthread_mutexattr_default
[280]   0x00002fcc    .data      EXP     DS SECdef        [noIMid] pthread_create
[281]   0x0000308c    .data      EXP     DS SECdef        [noIMid] pthread_cond_init
[282]   0x000030a4    .data      EXP     DS SECdef        [noIMid] pthread_cond_destroy
[283]   0x000030b0    .data      EXP     DS SECdef        [noIMid] pthread_cond_wait
[284]   0x000030bc    .data      EXP     DS SECdef        [noIMid] pthread_cond_broadcast
[285]   0x000030c8    .data      EXP     DS SECdef        [noIMid] pthread_cond_signal
[286]   0x000030d4    .data      EXP     DS SECdef        [noIMid] pthread_setcancelstate
[287]   0x000030e0    .data      EXP     DS SECdef        [noIMid] pthread_join
.... [ lots more stuff ] ....

 

That was cool. Now let's see which running processes on the system currently have it loaded:

# for i in $(ps -o pid -e | grep '^[0-9]' ) ; do j=$(procldd $i | grep libpthreads.a); \
 if [ -n "$j" ] ; then ps -p $i -o comm | grep -v COMMAND; fi  ; done
portmap
rpc.statd
automountd
rpc.mountd
rpc.ttdbserver
dtexec
dtlogin
radiusd
radiusd
radiusd
dtexec
dtterm
procldd : no such process : 24622
dtterm
xmwlm
dtwm
dtterm
dtgreet
dtexec
ttsession
dtterm
dtexec
rdesktop
procldd : no such process : 34176
java
dtsession
dtterm
dtexec
dtexec

 

Cool! Now let's get the same thing, but eliminate the redundancies:

# sort -u prev.command.out.txt
       
automountd
dtexec
dtgreet
dtlogin
dtsession
dtterm
dtwm
java
portmap
radiusd
rdesktop
rpc.mountd
rpc.statd
rpc.ttdbserver
ttsession
xmwlm

 

There, now we have a nice list of distinct binaries that are currently executing and that all load libpthreads.a. Note that many more processes than these are running on the system at this time:

# ps -e | wc -l  
      85

 

Now, let's see where each process happens to load libpthreads.a:

# ps -e | grep java
 34648      -  4:13 java
#
# procmap 34648 | grep libpthreads.a
d0049000         217K  read/exec      /usr/lib/libpthreads.a[shr_xpg5.o]
f03e6000          16K  read/write     /usr/lib/libpthreads.a[shr_xpg5.o]
d0045000          15K  read/exec      /usr/lib/libpthreads.a[shr_comm.o]
f03a3000         265K  read/write     /usr/lib/libpthreads.a[shr_comm.o]
#
# ps -e | grep automountd
 15222      -  1:00 automountd
 25844      -  0:00 automountd
#
# procmap 15222 | grep libpthreads.a
d0049000         217K  read/exec      /usr/lib/libpthreads.a[shr_xpg5.o]
f03e6000          16K  read/write     /usr/lib/libpthreads.a[shr_xpg5.o]
d0045000          15K  read/exec      /usr/lib/libpthreads.a[shr_comm.o]
f03a3000         265K  read/write     /usr/lib/libpthreads.a[shr_comm.o]
d10ff000         194K  read/exec         /usr/lib/libpthreads.a[shr.o]
f0154000          20K  read/write        /usr/lib/libpthreads.a[shr.o]
#
# ps -e | grep portmap              
 12696      -  0:06 portmap
 34446      -  0:00 portmap
#
# procmap 12696 | grep libpthreads.a
d0045000          15K  read/exec      /usr/lib/libpthreads.a[shr_comm.o]
f03a3000         265K  read/write     /usr/lib/libpthreads.a[shr_comm.o]
d10ff000         194K  read/exec         /usr/lib/libpthreads.a[shr.o]
f0154000          20K  read/write        /usr/lib/libpthreads.a[shr.o]
#
# ps -e | grep dtlogin
  6208      -  0:00 dtlogin
  6478      -  2:07 dtlogin
 20428      -  0:00 dtlogin
#
# procmap 20428 | grep libpthreads.a
d0045000          15K  read/exec      /usr/lib/libpthreads.a[shr_comm.o]
f03a3000         265K  read/write     /usr/lib/libpthreads.a[shr_comm.o]
d0049000         217K  read/exec      /usr/lib/libpthreads.a[shr_xpg5.o]
f03e6000          16K  read/write     /usr/lib/libpthreads.a[shr_xpg5.o]

 

Notice that each process loads the library at the same address every time. Don't be confused by the constituent listings for the .o's in the library. On AIX, you can share archive libraries (.a files, customarily) as well as dynamic shared libraries (.so files, customarily). The purpose of this is to bind symbols at link time, just as in traditional archive linking, without requiring the constituent object (the .o file in the archive) to be copied into the final binary image. Unlike with dynamic shared libraries (.so/.sl files), however, no dynamic (or runtime) symbol resolution is performed.

Also note that the libpthreads.a code sections, those marked read/exec, are loaded into segment 0xd. That segment, as mentioned above, is designated on AIX as the segment for shared library code. That is to say, the kernel loads the shareable sections of this shared library into an area that is shared by all processes running on the same kernel.

You might notice that the data sections are also loaded into the same segment in every process: the shared library data segment 0xf. That doesn't mean, however, that the processes share the data section of libpthreads.a. Such an arrangement wouldn't work, because different processes need to maintain different data values at different times. Segment 0xf is distinct for each process using libpthreads.a, even though the virtual memory address is the same.

The svmon command can show us the segment IDs in the Virtual Memory Manager (Vsid) for processes. We'll see the shared-library code segments all have the same Vsid, while the shared-library data segments all have distinct Vsids. The Esid, meaning Effective Segment ID, is the segment ID within the scope of the process's address space (just terminology; don't let it confuse you).

# svmon -P 17314

-------------------------------------------------------------------------------
     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
   17314 dtexec           20245     9479       12    20292      N     N     N

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
       0         0 work kernel segment               s  14361  9477    0 14361 
   6c01b         d work shared library text          s   5739     0    9  5786 
   19be6         f work shared library data          s     83     0    1    87 
   21068         2 work process private              s     56     2    2    58 
   18726         1 pers code,/dev/hd2:65814          s      5     0    -     - 
    40c1         - pers /dev/hd4:2                   s      1     0    -     - 
#
# svmon -P 20428

-------------------------------------------------------------------------------
     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
   20428 dtlogin          20248     9479       23    20278      N     N     N

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
       0         0 work kernel segment               s  14361  9477    0 14361 
   6c01b         d work shared library text          s   5735     0    9  5782 
   7869e         2 work process private              s     84     2   10    94 
                   parent=786be
   590b6         f work shared library data          s     37     0    4    41 
                   parent=7531d
   6c19b         1 pers code,/dev/hd2:65670          s     29     0    -     - 
   381ae         - pers /dev/hd9var:4157             s      1     0    -     - 
    40c1         - pers /dev/hd4:2                   s      1     0    -     - 
   4c1b3         - pers /dev/hd9var:4158             s      0     0    -     - 
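
To make the comparison between the two svmon listings easier to see, the rows for segments d and f can be pulled out and their Inuse page counts converted to kilobytes. This is only a sketch: seg_summary is a hypothetical helper name, it assumes the column layout shown above (Inuse is the fourth field from the end), and it relies on PSize "s" denoting 4 KB pages.

```shell
#!/bin/sh
# Sketch: print "Esid Vsid KB-in-use" for segments d and f from
# svmon -P style output read on stdin. seg_summary is a hypothetical name.
seg_summary() {
    awk '$2 == "d" || $2 == "f" {
        inuse = $(NF - 3)              # Inuse column, counted in 4 KB pages
        printf "%s %s %d\n", $2, $1, inuse * 4
    }'
}

# Rows copied from the dtexec (17314) listing above.
seg_summary <<'EOF'
   6c01b         d work shared library text          s   5739     0    9  5786
   19be6         f work shared library data          s     83     0    1    87
EOF

# Rows copied from the dtlogin (20428) listing above.
seg_summary <<'EOF'
   6c01b         d work shared library text          s   5735     0    9  5782
   590b6         f work shared library data          s     37     0    4    41
                   parent=7531d
EOF
```

Both processes report Vsid 6c01b for segment d (about 22MB in use), while segment f has a different Vsid in each process (19be6 and 590b6). On a live system the input would come from svmon -P followed by a process ID.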


 

Doing the math

Let's see how much is currently loaded in the shared segment 0xd. We'll return to our bc calculator again. As a sanity check, we'll first verify the size of segment 0xd:

# bc    
ibase=16
E0000000-D0000000
268435456
ibase=A
268435456/(1024^2)
256

 

That looks good. As stated above, each segment is 256MB. OK, now let's see how much is currently being used.

$ echo "ibase=16; $(genkld | egrep ^\ \{8\} | awk '{print $2}' | tr '[a-f]' '[A-F]' \
 |  tr '\n' '+' ) 0" | bc
39798104
$
$ bc <<EOF
> 39798104/(1024^2)
> EOF
37

 

That says about 37MB is currently being used (bc truncates the integer division). Let's start up XIr2, and compare:

$ echo "ibase=16; $(genkld | egrep ^\ \{8\} | awk '{print $2}' | tr '[a-f]' '[A-F]' \
 |  tr '\n' '+' ) 0" | bc
266069692
$
$ bc <<EOF
> 266069692/(1024^2)
> EOF
253
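
The same total can be computed without bc and expressed as a share of the 256MB segment. The following is a sketch under stated assumptions: kld_total is a hypothetical helper name, the input is assumed to have the genkld layout shown earlier (hex address, hex size, file name), and the shell's arithmetic is assumed to accept 0x constants (such as bash). On a shell without 0x support, the base#digits notation (for example 16#365c4) may be needed instead.

```shell
#!/bin/sh
# Sketch: total the hex Size column of genkld-style output read on stdin
# and report it as a share of one 256 MB segment (268435456 bytes).
# kld_total is a hypothetical helper name.
kld_total() {
    total=0
    while read -r addr size file; do
        # Skip the header, blank lines, and anything that is not hex.
        case "$addr$size" in
            ''|*[!0-9a-fA-F]*) continue ;;
        esac
        [ -n "$size" ] || continue
        total=$(( total + 0x$size ))
    done
    echo "$total bytes, $(( total * 100 / 268435456 ))% of segment 0xd"
}

# On an AIX system the input would come from:  genkld | kld_total
kld_total <<'EOF'
Text address     Size File

    d0049000    365c4 /usr/lib/libpthreads.a[shr_xpg5.o]
    d0045000     3c52 /usr/lib/libpthreads.a[shr_comm.o]
    d10ff000    30b78 /usr/lib/libpthreads.a[shr.o]
EOF
```

For the three libpthreads.a members in the sample, this prints 437646 bytes, 0% of segment 0xd. Applied to the two totals measured above, 39798104 bytes is roughly 15% of the segment and 266069692 bytes is roughly 99%.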

 

Now there is 253MB being used. That is very close to the limit of 256MB. Let's pick a random process, like WIReportServer, and see how many shared libraries made it into shared space, and how many had to be mapped privately. Since we know the shared segment begins at address 0xd0000000, we can filter those mappings out of the procmap output. Remember, only code sections are mapped to segment 0xd, so we'll just look for the lines that are read/exec:

$ procmap 35620 | grep read/exec | grep -v ^d
10000000       10907K  read/exec         boe_fcprocd
31ad3000       14511K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libEnterpriseFramework.so
3167b000        3133K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libcpi18nloc.so
3146c000        1848K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libBOCP_1252.so
31345000         226K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/btlat300.so

 

It looks like the above four libraries couldn't be mapped into the shared segment. Consequently, they were mapped to the private segment 0x3, which is used for any general memory allocated by a call to the mmap() routine.
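
To put a number on the cost of those private mappings, the sizes of the read/exec entries outside segments 0x1 and 0xd can be added up. This is a sketch only: private_code_kb is a hypothetical helper name, and it assumes procmap-style input in which the address is eight hex digits, the size carries a K suffix, and the third column is the access mode.

```shell
#!/bin/sh
# Sketch: sum the sizes (in KB) of read/exec mappings that lie outside
# segment 0x1 (program code) and segment 0xd (shared library text).
# private_code_kb is a hypothetical helper name.
private_code_kb() {
    awk '$3 == "read/exec" && $1 !~ /^[1d]/ {
        sub(/K$/, "", $2)      # strip the K suffix from the size column
        kb += $2
    }
    END { print kb + 0 }'
}

# Lines copied from the procmap listing above.
private_code_kb <<'EOF'
10000000       10907K  read/exec         boe_fcprocd
31ad3000       14511K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libEnterpriseFramework.so
3167b000        3133K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libcpi18nloc.so
3146c000        1848K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/libBOCP_1252.so
31345000         226K  read/exec         /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/btlat300.so
EOF
```

For this listing the sketch prints 19718, that is, roughly 19MB of library code that this one process maps privately instead of sharing.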

There are a few conditions that force a shared library to be mapped privately on 32-bit AIX:

  • The shared segment 0xd is out of space (as above).
  • The shared library does not have execute permissions for group or other. You can use a permission designation of rwxr-xr-x to correct this; however, developers may prefer private permissions (e.g. rwx------) so they don't have to run slibclean each time they recompile a shared library and deploy it for testing.
  • According to some documentation, the shared library is loaded over NFS.
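
The permission condition in the list above can be checked before a library is deployed. The following sketch applies the rule exactly as stated in this article (execute permission for both group and other); lib_share_mode is a hypothetical helper name, and it only inspects the mode string printed by ls, so it says nothing about the other two conditions.

```shell
#!/bin/sh
# Sketch: report whether a file's mode includes execute permission for
# both group and other, which this article gives as a precondition for
# loading into the shared segment. lib_share_mode is a hypothetical name.
lib_share_mode() {
    perms=$(ls -ld "$1" | cut -c1-10)   # e.g. -rwxr-xr-x
    case "$perms" in
        ??????x??x) echo "shareable" ;;  # 7th char: group x, 10th: other x
        *)          echo "private" ;;
    esac
}

# Example call (path taken from later in this article):
# lib_share_mode /runtime/app/lib/libc3_calc.so
```

A file with mode rwxr-xr-x is reported as shareable, while rwx------ is reported as private.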

The AIX kernel will even load the same library twice into shared memory, if it comes from a different location:

sj2e652a-chloe:~/e652_r>genkld | grep libcplib.so
        d5180000    678c6 /space2/home/sj2e652a/e652_r/lib/libcplib.so
        d1cf5000    678c6 /home/sj1e652a/xir2_r/lib/libcplib.so


 

When it goes wrong

If we run another instance of XIr2 deployed in a different directory, we see a significant difference in the process footprint:

$ ps -e -o pid,vsz,user,comm | grep WIReportServer
28166 58980   jbrown WIReportServer
46968 152408 sj1xir2a WIReportServer
48276 152716 sj1xir2a WIReportServer
49800 152788 sj1xir2a WIReportServer
50832 152708 sj1xir2a WIReportServer

 

The instance for account 'jbrown' was started first, and the instance for account 'sj1xir2a' was started second. If we were to do something obscure and risky, like setting the following at the appropriate place in our bobje/setup/env.sh file,

    LIBPATH=~jbrown/vanpgaix40/bobje/enterprise115/aix_rs6000:$LIBPATH

 

before starting the second instance, we would see the footprints normalized. (I switched to the process boe_fcprocd, as I couldn't get WIReportServer to start for this LIBPATH test.)

$ ps -e -o pid,vsz,user,comm | grep boe_fcprocd   
29432 65036   jbrown boe_fcprocd
35910 67596   jbrown boe_fcprocd
39326 82488 sj1xir2a boe_fcprocd
53470 64964 sj1xir2a boe_fcprocd

 

And we see procmap shows us the files are loaded from ~jbrown as expected:

53470 : /crystal/sj1xir2a/xir2_r/bobje/enterprise115/aix_rs6000/boe_fcprocd -name vanpg 
10000000       10907K  read/exec         boe_fcprocd
3000079c        1399K  read/write        boe_fcprocd
d42c9000        1098K  read/exec         /home7/jbrown/vanpgaix40/bobje/enterprise115/aix_rs6000/libcrypto.so
33e34160         167K  read/write        /home7/jbrown/vanpgaix40/bobje/enterprise115/aix_rs6000/libcrypto.so
33acc000        3133K  read/exec         /home7/jbrown/vanpgaix40/bobje/enterprise115/aix_rs6000/libcpi18nloc.so
33ddc697         349K  read/write        /home7/jbrown/vanpgaix40/bobje/enterprise115/aix_rs6000/libcpi18nloc.so



 

Clean up

Once applications are shut down, shared libraries may still reside in the shared segment 0xd. In that case, you can use the utility 'slibclean' to unload any shared libraries that are no longer referenced. The utility requires no arguments:

slibclean

 

There is also the utility genld, which, when passed the -l option, shows output like that of procmap, but for all existing processes on the system:

genld -l

 

Sometimes, after running slibclean, you may still be prohibited from copying a shared library. For example:

$ cp /build/dev/bin/release/libc3_calc.so   /runtime/app/lib/
cp: /runtime/app/lib/libc3_calc.so: Text file busy

 

You may have run slibclean already, and running 'genld -l' doesn't show any process having this library loaded. Yet the system still has this file protected. You can overcome this limitation by first deleting the shared library in the target location, and then copying the new shared library:

$ rm /runtime/app/lib/libc3_calc.so
$ cp /build/dev/bin/release/libc3_calc.so   /runtime/app/lib/

 

During shared-library development, if you are repeating compile, link, execute, and test cycles, you can avoid having to run slibclean in each cycle by making your shared library executable only by the owner (e.g. r-xr--r--). This causes the process that you use for testing to load and map your shared library privately. Be sure to make it executable by all (e.g. r-xr-xr-x) at product release time, however.
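
The two permission sets described above can be applied with chmod. The following is a sketch: set_lib_mode and its "dev" and "release" labels are made up for illustration, and the octal modes 544 and 555 are simply the numeric forms of r-xr--r-- and r-xr-xr-x.

```shell
#!/bin/sh
# Sketch: switch a library between the development and release
# permission sets. set_lib_mode and its labels are hypothetical.
set_lib_mode() {
    case "$1" in
        dev)     chmod 544 "$2" ;;   # r-xr--r--: executable by owner only
        release) chmod 555 "$2" ;;   # r-xr-xr-x: executable by all
        *)       echo "usage: set_lib_mode dev|release FILE" >&2
                 return 1 ;;
    esac
}

# Example calls (path taken from the cp example above):
# set_lib_mode dev     /runtime/app/lib/libc3_calc.so
# set_lib_mode release /runtime/app/lib/libc3_calc.so
```

Keeping both modes behind one helper makes it harder to ship a library that still carries the development permissions.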



 

Summary

I hope you've been able to see in more detail how shared libraries occupy memory and the utilities used to examine them. With this you'll be better able to assess the sizing requirements for your applications and analyze the constituents of memory footprints for processes running on AIX systems.
