Tuesday, January 15, 2008

dnstop: Monitor BIND DNS Server (DNS Network Traffic) From a Shell Prompt

Q. How do I monitor my BIND 9 (named) server, or any other DNS server's network traffic, under Linux? How do I find out and view current DNS queries such as A, MX, PTR and so on in real time? How do I find out who is querying my DNS server, a specific domain, or a specific DNS client IP address?

A. Log files can provide the required information, but dnstop is like the top command for monitoring DNS traffic. It is a small tool that listens on a device, or parses a tcpdump savefile, and collects and prints statistics on the local network's DNS traffic. You must have read access to /dev/bpf*. bpf (Berkeley Packet Filter) provides a raw interface to data link layers in a protocol-independent fashion; all packets on the network, even those destined for other hosts, are accessible through this mechanism.
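On systems that expose bpf devices (FreeBSD, for example), you can quickly verify the device permissions before running dnstop:

$ ls -l /dev/bpf*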

dnstop can either read packets from a live capture device or from a tcpdump savefile.
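For example, you can capture DNS traffic with tcpdump first and analyze the savefile offline (the file name below is only an illustration):

# tcpdump -w /tmp/dns.pcap -i eth0 port 53
# dnstop /tmp/dns.pcap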

Install dnstop

Type the following command to install dnstop under Debian / Ubuntu Linux:
$ sudo apt-get update
$ sudo apt-get install dnstop

A note about Red Hat / CentOS / RHEL / Fedora Linux

Install the latest version using the make command (see below for the binary RPM file). First, grab the latest source code from the official dnstop website:
# cd /tmp
# wget http://dns.measurement-factory.com/tools/dnstop/src/dnstop-20080502.tar.gz
# tar -zxvf dnstop-20080502.tar.gz
# cd dnstop-20080502

Compile and install dnstop, enter:
# ./configure
# make
# make install
 

dnstop rpm file

Alternatively, you can download dnstop rpm from dag's repo for RHEL / CentOS / Fedora Linux.

dnstop under FreeBSD

If you are using FreeBSD, follow these installation instructions.

How do I view dns traffic with dnstop?

Simply type the following command at a shell prompt, passing the interface you want to monitor (for example, eth0 or em0):
# dnstop {interface-name}
# dnstop eth0
# dnstop em0

Sample output:

2 new queries, 220 total queries                  Mon Aug  4 05:56:50 2008

Sources              count      %
---------------- --------- ------
180.248.xxx.26          72   32.7
77.89.xx.108             7    3.2
186.xxx.13.108           5    2.3
90.xxx.94.39             4    1.8
178.xx.77.83             4    1.8
187.xxx.149.23           4    1.8
xxx.13.249.70            4    1.8
1.xxx.169.102            4    1.8
189.xx.191.126           4    1.8
xxx.239.194.97           3    1.4

You can force dnstop to keep counts on names up to the specified number of domain name levels by using the -l {level} option. For example, with -l 2 (the default), dnstop keeps two tables: one with top-level domain names (such as .com, .org, .biz, and so on), and another with second-level domain names (such as co.in or co.uk).
# dnstop -l 3 eth0
Under Debian / Ubuntu Linux, enter:
# dnstop -t -s eth0
Where,

  • -s Track second level domains
  • -t Track third level domains

Please note that increasing the level provides more details, but also requires more memory and CPU to keep track of DNS traffic.

How do I exit or reset counters?

To exit dnstop, press ^X (hold the [CTRL] key and press X). Press ^R to reset the counters.

How do I find out which TLD is generating the most traffic?

While dnstop is running, press the 1 key to view first-level query names (TLDs):

5 new queries, 1525 total queries                 Mon Aug  4 06:11:09 2008

TLD                                count      %
------------------------------ --------- ------
net                                  520   34.1
biz                                  502   32.9
in-addr.arpa                         454   29.8
in                                    23    1.5
org                                   15    1.0
com                                   11    0.7

It looks like this DNS server is mostly serving the .net TLD. You can also find out more about the actual domain names being queried by pressing the 2 key while dnstop is running:

3 new queries, 1640 total queries                 Mon Aug  4 06:13:20 2008

SLD                                count      %
------------------------------ --------- ------
cyberciti.biz                        557   34.0
nixcraft.net                         556   33.9
74.in-addr.arpa                       34    2.1
208.in-addr.arpa                      29    1.8
195.in-addr.arpa                      28    1.7
192.in-addr.arpa                      27    1.6
64.in-addr.arpa                       27    1.6
theos.in                              23    1.4
203.in-addr.arpa                      20    1.2
202.in-addr.arpa                      18    1.1
212.in-addr.arpa                      15    0.9
nixcraft.com                          13    0.8
217.in-addr.arpa                      13    0.8
213.in-addr.arpa                      12    0.7
128.in-addr.arpa                      12    0.7
193.in-addr.arpa                      12    0.7
simplyguide.org                       12    0.7
cricketnow.in                          3    0.2

To view third-level domains, press the 3 key:

www.cyberciti.biz         60   39.0
figs.cyberciti.biz        33   21.4
ns1.nixcraft.net          18   11.7
ns3.nixcraft.net          13    8.4
ns2.nixcraft.net          13    8.4
theos.in                   5    3.2
nixcraft.com               5    3.2
cyberciti.biz              2    1.3
jobs.cyberciti.biz         1    0.6
bash.cyberciti.biz         1    0.6

How do I display the breakdown of query types seen?

You can easily find out the most requested query types (A, AAAA, PTR, and so on) by pressing the t key:

Query Type     Count      %
---------- --------- ------
A?               224   56.7
AAAA?            142   35.9
A6?               29    7.3

How do I find out who is connecting to my DNS server?

Press the d key to view DNS client IP addresses:

Source         Query Name        Count       %
-------------- ------------- ---------  ------
xx.75.164.90   nixcraft.net          20    9.1
xx.75.164.90   cyberciti.biz         18    9.1
x.68.25.4      nixcraft.net           9    9.1
xxx.131.0.10   cyberciti.biz          5    4.5
xx.104.200.202 cyberciti.biz          4    4.5
202.xxx.0.2    cyberciti.biz          1    4.5

Option help

There are many more options that provide a detailed view of current traffic. Just type ? to view help for all the run-time options:

 s - Sources list
 d - Destinations list
 t - Query types
 o - Opcodes
 r - Rcodes
 1 - 1st level Query Names      ! - with Sources
 2 - 2nd level Query Names      @ - with Sources
 3 - 3rd level Query Names      # - with Sources
 4 - 4th level Query Names      $ - with Sources
 5 - 5th level Query Names      % - with Sources
 6 - 6th level Query Names      ^ - with Sources
 7 - 7th level Query Names      & - with Sources
 8 - 8th level Query Names      * - with Sources
 9 - 9th level Query Names      ( - with Sources
^R - Reset counters
^X - Exit

 ? - this

Learning doxygen for source code documentation

Maintaining and adding new features to legacy systems developed using C/C++ is a daunting task. Fortunately, doxygen—a documentation system for the C/C++, Java™, Python, and other programming languages—can help. Discover the features of doxygen in the context of projects using C/C++ as well as how to document code using doxygen-defined tags.

Maintaining and adding new features to legacy systems developed using C/C++ is a daunting task. There are several facets to the problem—understanding the existing class hierarchy and global variables, the different user-defined types, and function call graph analysis, to name a few. This article discusses several features of doxygen, with examples in the context of projects using C/C++. However, doxygen is flexible enough to be used for software projects developed using the Python, Java, PHP, and other languages, as well. The primary motivation of this article is to help extract information from C/C++ sources, but it also briefly describes how to document code using doxygen-defined tags.

Installing doxygen

You have two choices for acquiring doxygen. You can download it as a pre-compiled executable file, or you can check out sources from the SVN repository and build it. Listing 1 shows the latter process.


Listing 1. Install and build doxygen sources
 
                
bash-2.05$ svn co https://doxygen.svn.sourceforge.net/svnroot/doxygen/trunk doxygen-svn

bash-2.05$ cd doxygen-svn
bash-2.05$ ./configure --prefix=/home/user1/bin
bash-2.05$ make

bash-2.05$ make install

 

Note that the configure script is tailored to install the compiled binaries in /home/user1/bin (add this directory to the PATH variable after the build), as not every UNIX® user has permission to write to the /usr folder. Also, you need the svn utility to check out the sources.
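Following that note, a quick way to make the freshly built binary visible in the current bash session looks like this (adjust the directory to match your --prefix):

bash-2.05$ export PATH=/home/user1/bin:$PATH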


 


 

Generating documentation using doxygen

To use doxygen to generate documentation of the sources, you perform three steps.

Generate the configuration file

At a shell prompt, type the command doxygen -g. This command generates a text-editable configuration file called Doxyfile in the current directory. You can choose to override this file name, in which case the invocation should be doxygen -g <user-specified file name>, as shown in Listing 2.


Listing 2. Generate the default configuration file
 
                
bash-2.05b$ doxygen -g
Configuration file 'Doxyfile' created.
Now edit the configuration file and enter
  doxygen Doxyfile
to generate the documentation for your project
bash-2.05b$ ls Doxyfile
Doxyfile

 

Edit the configuration file

The configuration file is structured as <TAGNAME> = <VALUE>, similar to the Makefile format. Here are the most important tags:

  • <OUTPUT_DIRECTORY>: You must provide a directory name here—for example, /home/user1/documentation—for the directory in which the generated documentation files will reside. If you provide a nonexistent directory name, doxygen creates the directory subject to proper user permissions.
  • <INPUT>: This tag takes a space-separated list of all the directories in which the C/C++ source and header files whose documentation is to be generated reside. For example, consider the following snippet:
    INPUT = /home/user1/project/kernel /home/user1/project/memory
    

     

    In this case, doxygen would read in the C/C++ sources from these two directories. If your project has a single source root directory with multiple sub-directories, specify that folder and make the <RECURSIVE> tag Yes.

  • <FILE_PATTERNS>: By default, doxygen searches for files with typical C/C++ extensions such as .c, .cc, .cpp, .h, and .hpp. This happens when the <FILE_PATTERNS> tag has no value associated with it. If the sources use different naming conventions, update this tag accordingly. For example, if a project convention is to use .c86 as a C file extension, add this to the <FILE_PATTERNS> tag (see the example after this list).
  • <RECURSIVE>: Set this tag to Yes if the source hierarchy is nested and you need to generate documentation for C/C++ files at all hierarchy levels. For example, consider the root-level source hierarchy /home/user1/project/kernel, which has multiple sub-directories such as /home/user1/project/kernel/vmm and /home/user1/project/kernel/asm. If this tag is set to Yes, doxygen recursively traverses the hierarchy, extracting information.
  • <EXTRACT_ALL>: This tag is an indicator to doxygen to extract documentation even when the individual classes or functions are undocumented. You must set this tag to Yes.
  • <EXTRACT_PRIVATE>: Set this tag to Yes. Otherwise, private data members of a class would not be included in the documentation.
  • <EXTRACT_STATIC>: Set this tag to Yes. Otherwise, static members of a file (both functions and variables) would not be included in the documentation.
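For instance, if a project also uses the nonstandard .c86 extension for C sources mentioned above, the tag might be set as follows (a sketch; the list simply extends the usual patterns):

FILE_PATTERNS = *.c *.cc *.cpp *.h *.hpp *.c86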

Listing 3 shows an example of a Doxyfile.


Listing 3. Sample doxyfile with user-provided tag values
 
                
OUTPUT_DIRECTORY = /home/user1/docs
EXTRACT_ALL = yes
EXTRACT_PRIVATE = yes
EXTRACT_STATIC = yes
INPUT = /home/user1/project/kernel
#Do not add anything here unless you need to. Doxygen already covers all 
#common formats like .c/.cc/.cxx/.c++/.cpp/.inl/.h/.hpp
FILE_PATTERNS = 
RECURSIVE = yes

 

Run doxygen

Run doxygen in the shell prompt as doxygen Doxyfile (or with whatever file name you've chosen for the configuration file). Doxygen issues several messages before it finally produces the documentation in Hypertext Markup Language (HTML) and Latex formats (the default). In the folder that the <OUTPUT_DIRECTORY> tag specifies, two sub-folders named html and latex are created as part of the documentation-generation process. Listing 4 shows a sample doxygen run log.


Listing 4. Sample log output from doxygen
 
                
Searching for include files...
Searching for example files...
Searching for images...
Searching for dot files...
Searching for files to exclude
Reading input files...
Reading and parsing tag files
Preprocessing /home/user1/project/kernel/kernel.h
…
Read 12489207 bytes
Parsing input...
Parsing file /project/user1/project/kernel/epico.cxx
…
Freeing input...
Building group list...
..
Generating docs for compound MemoryManager::ProcessSpec
…
Generating docs for namespace std
Generating group index...
Generating example index...
Generating file member index...
Generating namespace member index...
Generating page index...
Generating graph info page...
Generating search index...
Generating style sheet...


 


 

Documentation output formats

Doxygen can generate documentation in several output formats other than HTML. You can configure doxygen to produce documentation in the following formats:

  • UNIX man pages: Set the <GENERATE_MAN> tag to Yes. By default, a sub-folder named man is created within the directory provided using <OUTPUT_DIRECTORY>, and the documentation is generated inside the folder. You must add this folder to the MANPATH environment variable.
  • Rich Text Format (RTF): Set the <GENERATE_RTF> tag to Yes. Set the <RTF_OUTPUT> to wherever you want the .rtf files to be generated—by default, the documentation is within a sub-folder named rtf within the OUTPUT_DIRECTORY. For browsing across documents, set the <RTF_HYPERLINKS> tag to Yes. If set, the generated .rtf files contain links for cross-browsing.
  • Latex: By default, doxygen generates documentation in Latex and HTML formats. The <GENERATE_LATEX> tag is set to Yes in the default Doxyfile. Also, the <LATEX_OUTPUT> tag is set to Latex, which implies that a folder named latex would be generated inside OUTPUT_DIRECTORY, where the Latex files would reside.
  • Microsoft® Compiled HTML Help (CHM) format: Set the <GENERATE_HTMLHELP> tag to Yes. Because this format is not supported on UNIX platforms, doxygen would only generate a file named index.hhp in the same folder in which it keeps the HTML files. You must feed this file to the HTML help compiler for actual generation of the .chm file.
  • Extensible Markup Language (XML) format: Set the <GENERATE_XML> tag to Yes. (Note that the XML output is still a work in progress for the doxygen team.)

Listing 5 provides an example of a Doxyfile that generates documentation in all the formats discussed.


Listing 5. Doxyfile with tags for generating documentation in several formats
 
                
#for HTML 
GENERATE_HTML = YES
HTML_FILE_EXTENSION = .htm

#for CHM files
GENERATE_HTMLHELP = YES

#for Latex output
GENERATE_LATEX = YES
LATEX_OUTPUT = latex

#for RTF
GENERATE_RTF = YES
RTF_OUTPUT = rtf 
RTF_HYPERLINKS = YES

#for MAN pages
GENERATE_MAN = YES
MAN_OUTPUT = man
#for XML
GENERATE_XML = YES


 


 

Special tags in doxygen

Doxygen contains a couple of special tags.

Preprocessing C/C++ code

First, doxygen must preprocess C/C++ code to extract information. By default, however, it does only partial preprocessing -- conditional compilation statements (#if…#endif) are evaluated, but macro expansions are not performed. Consider the code in Listing 6.


Listing 6. Sample C++ code that makes use of macros
 
                
#include <cstring>
#include <rope>

#define USE_ROPE

#ifdef USE_ROPE
  #define STRING std::rope
#else
  #define STRING std::string
#endif

static STRING name;

 

With <USE_ROPE> defined in sources, generated documentation from doxygen looks like this:

                Defines
    #define USE_ROPE
    #define STRING std::rope

Variables
    static STRING name

 

Here, you see that doxygen has performed a conditional compilation but has not done a macro expansion of STRING. The <ENABLE_PREPROCESSING> tag in the Doxyfile is set by default to Yes. To allow for macro expansions, also set the <MACRO_EXPANSION> tag to Yes. Doing so produces this output from doxygen:

                Defines
    #define USE_ROPE
    #define STRING std::rope

Variables
    static std::rope name

 

If you set the <ENABLE_PREPROCESSING> tag to No, the output from doxygen for the earlier sources looks like this:

                Variables
    static STRING name

 

Note that the documentation now has no definitions, and it is not possible to deduce the type of STRING. It thus makes sense always to set the <ENABLE_PREPROCESSING> tag to Yes.

As part of the documentation, it might be desirable to expand only specific macros. For such purposes, along with setting <ENABLE_PREPROCESSING> and <MACRO_EXPANSION> to Yes, you must set the <EXPAND_ONLY_PREDEF> tag to Yes (this tag is set to No by default) and provide the macro details as part of the <PREDEFINED> or <EXPAND_AS_DEFINED> tag. Consider the code in Listing 7, where only the macro CONTAINER would be expanded.


Listing 7. C++ source with multiple macros
 
                
#ifdef USE_ROPE
  #define STRING std::rope
#else
  #define STRING std::string
#endif

#if ALLOW_RANDOM_ACCESS == 1
  #define CONTAINER std::vector
#else
  #define CONTAINER std::list
#endif

static STRING name;
static CONTAINER gList;

 

Listing 8 shows the configuration file.


Listing 8. Doxyfile set to allow select macro expansions
 
                
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = YES
EXPAND_ONLY_PREDEF = YES
EXPAND_AS_DEFINED = CONTAINER
…

 

Here's the doxygen output with only CONTAINER expanded:

                Defines
#define STRING   std::string 
#define CONTAINER   std::list

Variables
static STRING name
static std::list gList

 

Notice that only the CONTAINER macro has been expanded. Subject to <MACRO_EXPANSION> and <EXPAND_ONLY_PREDEF> both being set to Yes, the <EXPAND_AS_DEFINED> tag selectively expands only those macros listed on the right-hand side of the equality operator.

As part of preprocessing, the final tag to note is <PREDEFINED>. Much as you use the -D switch to pass preprocessor definitions to the g++ compiler, you use this tag to define macros. Consider the Doxyfile in Listing 9.


Listing 9. Doxyfile with macro expansion tags defined
 
                
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = YES
EXPAND_ONLY_PREDEF = YES
EXPAND_AS_DEFINED = 
PREDEFINED = USE_ROPE= \
             ALLOW_RANDOM_ACCESS=1

 

Here's the doxygen-generated output:

                Defines
#define USE_ROPE 
#define STRING   std::rope 
#define CONTAINER   std::vector

Variables
static std::rope name 
static std::vector gList

 

When used with the <PREDEFINED> tag, macros should be defined as <macro name>=<value>. If no value is provided—as in the case of simple #define —just using <macro name>=<spaces> suffices. Separate multiple macro definitions by spaces or a backslash (\).

Excluding specific files or directories from the documentation process

In the <EXCLUDE> tag in the Doxyfile, add the names of the files and directories for which documentation should not be generated separated by spaces. This comes in handy when the root of the source hierarchy is provided and some sub-directories must be skipped. For example, if the root of the hierarchy is src_root and you want to skip the examples/ and test/memoryleaks folders from the documentation process, the Doxyfile should look like Listing 10.


Listing 10. Using the EXCLUDE tag as part of the Doxyfile
 
                
INPUT = /home/user1/src_root
EXCLUDE = /home/user1/src_root/examples /home/user1/src_root/test/memoryleaks
…


 


 

Generating graphs and diagrams

By default, the Doxyfile has the <CLASS_DIAGRAMS> tag set to Yes. This tag is used for generation of class hierarchy diagrams. For a better view, download the dot tool from the Graphviz download site. The following tags in the Doxyfile deal with generating diagrams:

  • <CLASS_DIAGRAMS>: The default tag is set to Yes in the Doxyfile. If the tag is set to No, diagrams for inheritance hierarchy would not be generated.
  • <HAVE_DOT>: If this tag is set to Yes, doxygen uses the dot tool to generate more powerful graphs, such as collaboration diagrams that help you understand individual class members and their data structures. Note that if this tag is set to Yes, the effect of the <CLASS_DIAGRAMS> tag is nullified.
  • <CLASS_GRAPH>: If the <HAVE_DOT> tag is set to Yes along with this tag, the inheritance hierarchy diagrams are generated using the dot tool and have a richer look and feel than what you'd get by using only <CLASS_DIAGRAMS>.
  • <COLLABORATION_GRAPH>: If the <HAVE_DOT> tag is set to Yes along with this tag, doxygen generates a collaboration diagram (apart from an inheritance diagram) that shows the individual class members (that is, containment) and their inheritance hierarchy.

Listing 11 provides an example using a few data structures. Note that the <HAVE_DOT>, <CLASS_GRAPH>, and <COLLABORATION_GRAPH> tags are all set to Yes in the configuration file.


Listing 11. Interacting C++ classes and structures
 
                
struct D {
  int d;
};

class A {
  int a;
};

class B : public A {
  int b;
};

class C : public B {
  int c;
  D d;
};

 

Figure 1 shows the output from doxygen.


Figure 1. The Class inheritance graph and collaboration graph generated using the dot tool

 


 

Code documentation style

So far, you've used doxygen to extract information from code that is otherwise undocumented. However, doxygen also advocates documentation style and syntax, which helps it generate more detailed documentation. This section discusses some of the more common tags doxygen advocates using as part of C/C++ code. For further details, see Resources.

Every code item has two kinds of descriptions: one brief and one detailed. Brief descriptions are typically single lines. Functions and class methods have a third kind of description known as the in-body description, which is a concatenation of all comment blocks found within the function body. Some of the more common doxygen tags and styles of commenting are:

  • Brief description: Use a single-line C++ comment, or use the <\brief> tag.
  • Detailed description: Use JavaDoc-style commenting /** … text … */ (note the two asterisks [*] at the beginning) or the Qt-style /*! … text … */.
  • Documenting C++ elements: Individual C++ elements such as classes, structures, unions, and namespaces have their own tags, such as <\class>, <\struct>, <\union>, and <\namespace> (a short illustrative example follows this list).
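As a minimal illustration (the function name here is made up), the brief and detailed descriptions and an in-body comment look like this in source code:

/**
 * \brief Checks whether a buffer contains only zero bytes.
 *
 * Detailed description (JavaDoc style, note the two asterisks at
 * the start): walks the buffer byte by byte and returns true only
 * if every byte is zero.
 */
bool is_all_zero(const unsigned char *buf, int len)
{
    /*! In-body description: comment blocks placed inside the
        function body are concatenated by doxygen into a third,
        in-body description. */
    for (int i = 0; i < len; ++i)
        if (buf[i] != 0)
            return false;
    return true;
}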

To document global functions, variables, and enum types, the corresponding file must first be documented using the <\file> tag. Listing 12 provides an example with a function tag (<\fn>), a function argument tag (<\param>), a variable name tag (<\var>), a tag for #define (<\def>), and a tag to indicate specific issues related to a code snippet (<\warning>).


Listing 12. Typical doxygen tags and their use
 
                
/*! \file globaldecls.h
      \brief Place to look for global variables, enums, functions
           and macro definitions
  */

/** \var const int fileSize
      \brief Default size of the file on disk
  */
const int fileSize = 1048576;

/** \def SHIFT(value, length)
      \brief Left shift value by length in bits
  */
#define SHIFT(value, length) ((value) << (length))

/** \fn bool check_for_io_errors(FILE* fp)
      \brief Checks if a file is corrupted or not
      \param fp Pointer to an already opened file
      \warning Not thread safe!
  */
bool check_for_io_errors(FILE* fp);

 

Here's how the generated documentation looks:

                Defines
#define SHIFT(value, length)   ((value) << (length))  
             Left shift value by length in bits.

Functions
bool check_for_io_errors (FILE *fp)  
        Checks if a file is corrupted or not.

Variables
const int fileSize = 1048576;
Function Documentation
bool check_for_io_errors (FILE* fp)
Checks if a file is corrupted or not.

Parameters
              fp: Pointer to an already opened file

Warning
Not thread safe! 


 


 

Conclusion

This article discusses how doxygen can extract a lot of relevant information from legacy C/C++ code. If the code is documented using doxygen tags, doxygen generates output in an easy-to-read format. Put to good use, doxygen is a ripe candidate in any developer's arsenal for maintaining and managing legacy systems.

The importance of UNIX in SOA environments

These are exciting times in solution architecture—days of Web 2.0, SOA, Web services, mash-ups, and the full integration of technical solutions derived from business models integrating with old and new systems alike. Discover how and why existing systems and applications with which you are already familiar deployed on operating systems that you know well are so critical to the present and future of Web-based computing, particularly in the area of SOA.

These are exciting times in solution architecture . . . that is, if you embrace the challenges of learning and implementing technologies such as Service-Oriented Architecture (SOA), Web services, mash-ups, portals, and the like. For business executives, project managers, sales execs, and various resource managers, SOA and the myriad of new tools and technologies about which you must make immediate business decisions may seem impossible to keep up with. The goal of this article is to explain how and why existing systems and applications with which you are already familiar deployed on operating systems that you know are so critical to the present and future of Web-based computing, particularly in the area of SOA.

This article provides:

  • A brief history of the UNIX® operating system in enterprise environments.
  • A high-level overview of the IBM SOA.
  • IBM® SOA solution architecture scenarios.
  • Information about implementation components for SOA in UNIX.

The evolution of business and IT

As businesses have evolved from mainframe computing to applications built on open standards platforms and from independent, proprietary, and complex enterprise systems to open, integrated, reusable, services-based, platform-independent systems, it's important not only to understand the vague concepts behind such a model but actually see how systems you're familiar with integrate with this modern-day approach to information technology (IT). In addition, it's extremely helpful to see what software is used on what platforms to deliver on SOA's promises. This is where UNIX comes into play.

Regardless of what the future may bring—virtualization; complete services-, mash-up-, or portlet-based application front ends with shared computing hosting and database environments—somewhere out there you will still have UNIX servers doing what they do best: providing a reliable operating system to host a variety of Web computing needs. This article's high-level explanation of the history of UNIX servers in the enterprise and how important they are to the SOA puzzle drills into and exposes which technologies are used for each SOA implementation and the platforms on which they perform best.


 


 

UNIX in the enterprise environment

UNIX was created in the 1970s by AT&T's Bell Laboratories and has gone through design evolutions by both universities and companies. After more than 30 years of use, the UNIX operating system is still regarded as one of the most powerful, versatile, and flexible operating systems in the computer world. Its popularity hinges on its simplicity, open standards design, its ability to run on a wide variety of machines, and its portability.

The bottom line is that UNIX was and is a reliable, secure, multi-user operating system that continues to dominate the enterprise Web- and application-hosting landscape. Many large organizations funded UNIX development platforms, and they remain loyal to the platform to this day. This loyalty is to some extent the result of cost and support.

Many experts agree that UNIX is the operating system of choice for Web hosting, with the only real alternative being Linux®, which IBM and others now are strongly backing.


 


 

Service-Oriented Architecture

If you're new to SOA, review the following developerWorks document, New to SOA and Web services. This document accurately explains in great detail the foundation of the five entry points to an SOA system that I summarize below.

Entry points

Note that these are "technical" executive summaries. They should help the tech-minded readers who get confused with all the SOA jargon when all they really want to know are the technical details of each entry point along with which software you can use to implement these solutions:

  • People: Dynamic portal/portlet front end, proxy, access managers, and process flows.
  • Information: Managing multiple data sources and harvesting business value from the data.
  • Connectivity: Connecting Web services and existing systems through standard protocols, adapters, and buses.
  • Process: Business process modeling and monitoring along with content management and collaboration.
  • Reuse: Integrating existing applications or new Web services.

Figure 1 illustrates these SOA entry points.


Figure 1. SOA entry points
 

The IBM SOA solution architecture

IBM has made great strides in standardizing a common SOA model to better help IT professionals logically understand the multiple components and layers of a true SOA. In Figure 2, you can see five main top-layer components, with the consumers being the system that needs a service provided by the referenced operational systems. These available services are sometimes defined by the consumer through business process models or straight service naming references and sometimes by the provider. As I show below, most of the time, this process is routed through the Enterprise Service Bus (ESB) at the service components layer. This whole process is governed by security and management layers, which I don't discuss here.


Figure 2. IBM SOA solution architecture
 

As you study the architecture diagrams here, you'll see UNIX-based and UNIX-like (Linux) systems spread through the enterprise. These are labeled with icons representing their corresponding operating systems—a sun for Sun Solaris, a penguin for Linux, and so on. They are at the proxy layer and make up the ESB, and Java™ application servers are also deployed on UNIX systems. Now, you can start to see how UNIX fits in and is a vital part of SOA in most enterprise environments. Although for most server hosting needs you can choose among UNIX, Linux, and Windows, most enterprises use UNIX servers.


 


 

IBM SOA solution architecture scenarios

Although there are dozens of patterns and scenarios to use as examples, to keep things simple, I visually demonstrate patterns that you may have seen discussed previously in other places. The patterns I use are the Direct, Indirect, External Provider, and External Consumer patterns. The diagrams below assume two organizations within a large enterprise, one as a primary provider and the other as a primary consumer. You also see how SOA scenarios would look with external users, systems, or applications needing to reference the provider services.

Direct exposure

The direct exposure pattern exposes an existing asset as a service with the use of open standards protocols such as SOAP and Web Services Description Language (WSDL). This solution will not likely be aligned with the business processes.

As you see in Figure 3, assets are open and directly accessible to the consumer through a Web service. This situation isn't necessarily good, as it takes away from a large part of what SOA is all about—namely, decoupling. In this case, you'll be tied directly to that legacy system or asset. Nonetheless, it is an option that may be available to your organization.


Figure 3. The direct SOA pattern
 

Indirect exposure

The indirect exposure pattern connects several existing software functions to realize a service while using a well-planned interoperable service interface solution. This pattern essentially aligns your services with business activities.

Based on which application you need to access as defined by your service definition, you can route this request through the local ESB, providing for a true decoupled SOA solution. In this case, systems and applications are sharing data and services within their own organization, as shown in Figure 4.


Figure 4. The indirect SOA pattern
 

External Provider

The External Provider pattern demonstrates how to outsource a noncritical business function, with the consumer accessing one or more third-party services. The enterprise can also share services among the different organizations within it.

In this scenario, shown in Figure 5, the consumer has the power, because the solution is based on their service definition and what they need at the time. They may or may not choose to connect to this particular organization's service. This allows great flexibility.


Figure 5. The External Provider pattern
 

External Consumer

The External Consumer pattern allows third parties outside the provider governance domain access to services that the provider serves. In this scenario, shown in Figure 6, the provider has the power, because the provider sets the definition of which services are allowed or opened to the outside world. A benefit of this scenario is that your organization can host services and assets that it can make available to a business partner.


Figure 6. The External Consumer SOA pattern
 

For some reason, sales people and most high-level business decision makers sell the "idea" of SOA very well as a sort of magic concept for integrating business goals with technical solutions by using existing platforms, allowing for future growth by using reusable services. The implementation of such magic is where things get a bit trickier.

In all but one of the scenarios you see above, you will be ESB dependent. So, let's take a look at this vital part of the SOA infrastructure and see how UNIX operating systems may be the operating systems of choice for your ESB solution.


 


 

Implementation components for SOA in UNIX

From my experience and observation, the least-understood facts regarding implementing SOA in the enterprise are:

  • The software or tools needed at each SOA entry point to make the magic happen.
  • The complexity involved in actual integration of all of the services, systems, reusable assets, and applications residing on different platforms using existing or outdated means.

Integration in SOA is achieved at the service components layer, with the primary concept or tool being an ESB, as shown in Figure 7. The ESB is a vital piece of your SOA infrastructure, as it provides security and virtualization.


Figure 7. The ESB
 

This particular ESB has a demilitarized zone (DMZ) File Transfer Protocol (FTP) server with a security and virus-checking process to correctly validate data, while a file server helps connect multiple systems by sharing data across a shared storage solution and hosts your file-move software for transferring data off and on IBM® WebSphere® MQ queues. IBM WebSphere Business Integration Message Broker allows for secure routing of messages through Publish and Subscribe definitions. WebSphere MQ is the core product at the heart of the ESB in this scenario.

Obstacles

One major obstacle is allowing Windows and UNIX servers to share and manipulate the same data. Data passing from UNIX-based ESB servers to Windows servers is formatted for the UNIX operating system and vice versa, requiring data conversion. Also, I have yet to see a great Network Attached Storage (NAS), Storage Area Network (SAN), or Samba shared-storage solution that easily and reliably shares data between UNIX and Windows servers. In addition, Windows messaging platforms do not integrate well with UNIX-based messaging software solutions such as WebSphere MQ. These are all challenges that are important to understand.

Although SOA by definition means integration of various systems, services, and assets regardless of architecture or platform, it still requires actual software solutions such as an ESB, likely deployed on UNIX-based servers. In addition, UNIX and Linux-based servers have an edge in shared storage solutions and in sharing data with existing UNIX-based Web server environments, which still dominate the Web landscape. Although UNIX and Linux are not the only alternatives, it is important to understand their advantages in integrating such typically non-conforming systems. In this, you will see why UNIX or UNIX-like operating systems are still highly needed for the future of Web-based computing.


 


 

Summary

My goal in this article was to give you a high-level explanation of where we have come from regarding UNIX-based server environments in our enterprises as well as where we're going in regard to SOA while building a foundation of knowledge about SOA solution patterns and architecture. I even drilled down to specific implementations of such solutions, explaining why UNIX-based systems are beneficial given some of the challenges that we face when given the task of connecting all these services and systems.

I hope you now have a decent understanding of where your organization is and where you may be going in relation to SOA and why UNIX-based servers are still critical pieces of the overall puzzle.

DB2 and the Live Partition Mobility feature of PowerVM on IBM System p using storage area network (SAN) storage

Learn about Live Partition Mobility, a feature of the System p™ virtualization PowerVM™ Enterprise edition. See how Live Partition Mobility can be applied to DB2® deployments, and how it helps you migrate AIX® and Linux® partitions and hosted applications from one physical server to another compatible physical server. Live Partition Mobility allows hardware maintenance, firmware upgrades, system maintenance, and on-the-fly server consolidation without application outage. Setup, configuration, best practices, and performance characterization for Storage Area Network (SAN) storage and DB2 are covered.

Introduction

System p virtualization technologies offer a rich set of platform deployment features. Features range from simple resource isolation to an array of the most advanced and powerful functions, including: server-resource partitioning, autonomic dynamic resource reallocation, and workload management.

Live Partition Mobility, a new feature on POWER6™ processor-based systems, is designed to enable migration of a logical partition (LPAR, or partition) from one system to another compatible system. Live Partition Mobility uses a simple and automated procedure that transfers the migrating LPAR’s running state (primarily memory pages belonging to the migrating LPAR) and configuration information (such as virtual network interface, virtual Small Computer System Interface (SCSI) interface, or LPAR profiles) from the source server to the destination server without disrupting the hosted operating system, network connectivity, and hosted applications.

Live Partition Mobility gives the administrator greater control over the system resources in a data center. It increases flexibility in workload deployment and workload management, which previously wasn't possible due to narrow maintenance windows, stringent application availability requirements, or strict service level agreements that don't allow an application to be stopped.

The migration process can be performed either with a partition in powered off state or a live partition. There are two available migration types:

Inactive migration
The logical partition is powered off and moved to the destination system.
Active migration
The migration of the partition is performed while service is provided, without disrupting user activities.

 

Maintenance activities that previously required down time can now be performed with no disruption to services. Activities include, but are not limited to:

  • Preventive hardware maintenance
  • Hardware upgrades
  • Server consolidation that involves reallocation of partitions among servers

 

DB2 V9.5 is the industry’s leading information management platform. DB2 V9.5 is designed to leverage the System p virtualization technology. Innovative features such as host dynamic-resource awareness, online tuning of configuration parameters, and self-tuning memory management (STMM) make DB2 well suited for the virtualization environment.


 


 

Prerequisites for Live Partition Mobility

It is assumed that you are knowledgeable about System p virtualization concepts, storage management, and network administration, and that you have basic system administration skills.

Live Partition Mobility has specific requirements in terms of the operating system level, firmware level, DB2 data storage layout, and network interfaces. Setting up for the migration, regardless of whether it's active or inactive, requires careful deployment planning beforehand. Sometimes you can qualify a partition for mobility by taking additional steps, such as removing physical adapters (non-virtual adapters) using a dynamic logical partitioning (DLPAR) operation.

Only the partitions meeting the following requirements are eligible for migration. The typical requirements for the migration of a logical partition are:

  • Two POWER6 processor-based systems controlled by the same Hardware Management Console (HMC). An optional redundant HMC configuration is supported.

     

  • The destination system must have at least an equal amount of available memory and processors as the migrating partition’s current usage. DLPAR operation might have altered partition resources (including memory, processor, and adapters) at run time after it is booted with a partition profile. Hence, the current amount of partition resources may be different than the amount specified as the “desired” in the partition profile.

    A partition can have multiple profiles. At operating system boot time, the system administrator selects which profile a partition is booted with. In general, a partition profile specifies operating system boot time resource requirements, such as the desired amount of memory, processor entitlement, adapters, and so on. The partition profile also specifies the minimum and maximum amount of memory, and number of virtual processors and processor entitlement.

  • The operating system, applications (DB2) installation, DB2 instance data, and DB2 table data must reside on Virtual I/O storage on an external storage subsystem.

     

  • All of the mobile partition's network and disk access must be virtualized using one or more Virtual I/O Servers (VIOS).

     

  • No other physical adapters, such as serial I/O adapter, SCSI adapters, and so on, may be used by the mobile partition at the time it is migrated. The pre-migration validation phase will fail if any physical adapter belongs to a migrating partition.

     

  • The VIOS on both systems must have a shared Ethernet adapter configured to bridge to the same Ethernet network used by the mobile partition.

     

  • The VIOS on both systems must be capable of providing virtual access to all storage resources that the mobile partition is using.

Figure 1 shows a general schematic to enable mobility. Each POWER6-based system is configured with a VIOS partition. The mobile partition has access to network and disk resources only through virtual adapters. The VIOS on the destination system is connected to the same network, and is configured to access the same storage used by the mobile partition.


Figure 1. Partition mobility infrastructure

 


 

Configuration to enable Live Partition Mobility

This section provides a reference checklist of configuration settings for Live Partition Mobility. In general, one method to verify items in the checklist is to navigate to the respective Property screen. For example, for items in the Systems checklist navigate to System -> Property screen. For a VIOS partition, navigate to its Property screen. The Redbook PowerVM Live Partition Mobility on IBM System p (see Resources) has detailed instructions for verifying and changing these settings.

Systems
  • The source and destination servers must be at least POWER6 processor architecture-based systems. Both systems must be controlled by the same HMC. Live Partition Mobility requires HMC Version 7.3.2 or higher.
  • The memory region size (also called logical memory block (LMB) size) must be the same on the source and destination systems. It can be verified on HMC by navigating to the Memory tab on the managed system Properties screen. See Figure 2 below.
  • The destination system cannot be running on battery power for migration to take place.
  • The destination system must have at least as much memory and as many processors available (as dictated by the active partition profile) as the mobile partition requires.

Figure 2. Verify system memory region (LMB) size on HMC
 
VIOS partitions
  • At least one VIOS at release level 1.5 or higher has to be installed both on the source and destination systems.
  • The partition attribute “Mover service partition” (MSP) must be enabled on both the source system and target system VIOS. You can enable this by selecting a check box when creating the VIOS partition profile, or later by navigating to the partition property and selecting a check box, as shown in Figure 3.
  • On the VIOS where MSP is enabled, a Virtual Asynchronous Service Interface (VASI) device must be defined and configured in order to be considered a valid MSP candidate.
  • Virtual disks for use by the mobile partition have to be backed by whole physical devices; they cannot be part of any logical volume group on the VIOS (see the sketch after this list).
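As a rough sketch of what that backing looks like, on the VIOS a whole physical volume is mapped to the mobile partition's virtual SCSI server adapter (the device and adapter names here are illustrative):

$ mkvdev -vdev hdisk4 -vadapter vhost0 -dev db2_disk1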

Figure 3. Enable or check status of mover service partition
 
Mobile partition
  • AIX 5.3 technology level 7 or later.
  • Ensure that Resource Monitoring and Control (RMC) connections are established.
  • Redundant error path reporting has to be disabled on the migrating partition.
  • The mobile partition cannot have any required virtual serial adapters, except for the two reserved for the HMC.
  • The mobile partition cannot be part of a partition workload group. A partition workload group is a group of logical partitions whose resources are managed collectively by a workload management application.

    Workload management applications can balance memory and processor resources within groups of logical partitions without intervention from the HMC.

  • The mobile partition cannot use barrier synchronization register (BSR) arrays.
  • The mobile partition cannot use huge (16GB) memory pages. The 64KB and 16MB AIX page sizes are allowed, however.
  • The mobile partition does not have physical or dedicated I/O adapters and devices.
  • The mobile partition cannot be configured with any logical host Ethernet adapter (LHEA) devices.

    A host Ethernet adapter (HEA) is a physical Ethernet adapter that is integrated directly into the system (GX+ bus) on a managed system. HEAs offer high throughput, low latency, and virtualization support for Ethernet connections. HEAs have the same uses as other types of physical Ethernet adapters. For example, you can use an HEA to establish a console connection to a logical partition.

External storage
  • The same Storage Area Network (SAN) disks used as virtual disks by the mobile partition must be assigned to the source and destination Virtual I/O logical partitions.
  • The reserve_policy attribute on the shared physical volumes must be set to no_reserve on the VIOS (see the example commands after these checklists).
  • The physical volumes must have the same unique identifier, physical identifier, or an IEEE volume attribute. This can be verified using the AIX lsattr command.

    The VIOS must be able to accurately identify a physical volume each time it boots, even if an event such as a SAN reconfiguration or adapter change has taken place. Physical volume attributes, such as the name, address, and location, might change after the system reboots due to SAN reconfiguration. However, the VIOS must be able to recognize that this is the same device and update the virtual device mappings.

    To export a physical volume as a virtual device, the physical volume must have either a unique identifier (UDID), a physical identifier (PVID), or an IEEE volume attribute.

  • The mobile partition must have access to the storage through the VIOS.
  • The destination VIOS must have sufficient free virtual slots to create the virtual SCSI adapters required to host the mobile partition.
  • The mobile partition must have access to the same physical storage on the SAN from both the source and destination environments.
Network considerations
  • A shared Ethernet adapter has to be configured on both the source and destination VIOS.
  • A virtual Ethernet adapter has to be configured on the mobile partition.
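For reference, here is roughly how the storage items above can be checked or set (a sketch only; the device names are illustrative):

$ chdev -dev hdisk4 -attr reserve_policy=no_reserve    (on the VIOS: disable SCSI reservation)
$ lsdev -dev hdisk4 -attr reserve_policy               (on the VIOS: verify the setting)
# lsattr -El hdisk4 | grep -E "unique_id|pvid"         (from an AIX command line: confirm the disk exposes a UDID or PVID)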

 


 

Mobility in action

From the end user’s perspective, an active migration is simply a few clicks on the HMC graphical user interface. Behind the scenes, a lot of work is involved in transferring the migrating LPAR state and configuration information from the source to the destination system (including many gigabytes of memory configured in the migrating LPAR), while keeping all services of the LPAR functional, and maintaining coherency between the source and destination systems. An active migration involves:

  1. A validation phase to make sure migration criteria are met. The validation process essentially verifies the prerequisites listed above.
  2. New LPAR creation on the destination system.
  3. MSP setup on both VIO servers.
  4. Virtual SCSI adapter setup on the destination VIO server.
  5. Memory copy from the source to the destination system.
  6. Suspend the LPAR on the source system and resume it on the destination system.
  7. Virtual SCSI removal from the source VIO server.
  8. Notification of completion to the mobile partition and VIO servers.
  9. Removal of the LPAR from the source system.

 


 

Test plan

The test plan outlines the test environment and test cases.

Test environment

The following table summarizes the system, storage, and software used for the proof of concept.


Table 1. Environment
 
Hardware
  System: Two systems, each a System p 570 with 4x 4.7 GHz POWER6 cores and the IBM PowerVM Enterprise edition feature
  Number of cores used: Four
  Physical memory: 9 GB
  Firmware version: EM320_031
  HMC version: V7R3.2.0.0
  Disk characteristics:
    Number of disks: 28 RAID5 FCP LUNs; 196 external disks (140 data + 56 transaction logs)
    Type, size, speed: FCP, 18 GB, 15,000 RPM

Software
  Operating system: AIX 5L v5.3 TL07, 64-bit kernel
  Database: DB2 9.5 for AIX, 64-bit

Workload
  Characteristics: Online transaction processing (OLTP)
  Database size: ~100 GB


 

Test cases

The test cases are designed to characterize DB2 performance under active migration, with the DB2 server instance continuing to serve clients at the time of migration:

  • Performance impact at the various phases of the migration.
  • Client- or end-user-perceived throughput during active migration.
  • How soon the original throughput is reestablished on the destination server.
  • The effect of load on the migration time.

 


 


 

Setup for mobility

Figure 4 shows the setup used for the DB2 and Live Partition Mobility performance characterization. Throughout the test, the setup followed best practices to ensure ease of management and high performance throughput. As mentioned, preparation for mobility requires careful storage and network planning.


Figure 4. DB2 live partition mobility hardware setup
 

Hardware setup

This section describes the system, storage, and network setup required to enable the active migration.

System #1 in Figure 4, above, is considered the source system. The LPAR (running a DB2 server instance) was originally defined on this system. System #2 was set up to receive the migrating DB2 server LPAR. The DB2 server LPAR does not have any physical resources (such as a physical adapter or local disk) belonging to it; this is one of the major requirements for mobility to work. The required network and storage interfaces are a shared Ethernet adapter (SEA) and a virtual SCSI adapter, respectively. The DB2 network client connects to the DB2 server through an SEA, whereas the DB2 server data is hosted off the DS4500 using a virtual SCSI adapter.

The storage system is connected through a SAN switch so that it can reach both systems. The important step in configuring SAN storage is to ensure that worldwide port names (WWPNs) are appropriately declared so both systems can access the data. (At any point in time, though, only one system is accessing the storage system.) The section Configuration to enable Live Partition Mobility has detailed setup and configuration steps.

PowerVM Enterprise edition, a separately licensed, paid feature, must be enabled on both systems to enable partition mobility. The minimum required AIX operating system is version 5.3 technology level 07. Specific minimum HMC and firmware versions are also required. Complete details about the environment are in the Test environment section.

You must install the AIX operating system on virtual storage backed by the SAN storage device. This is required so that the AIX installation (specifically, rootvg) also moves along with the migrating partition.

To minimize migration latency, high-speed network infrastructure such as Gigabit or 10 Gigabit Ethernet interfaces is recommended at both ends. To improve network throughput, if possible connect both systems and the HMC to the same high-speed Ethernet switch, as shown in Figure 4.

DB2 setup

Just as you need the AIX installation on SAN storage, your DB2 installation location, DB2 instance directory, DB2 catalog, and all other storage (including DB2 tablespace containers) must also be on SAN storage. The DB2 database cannot use local storage devices.

Be sure that the DB2 instance is not using any physical network adapters, disk adapters, or devices. The migration is not possible with physical adapters belonging to a migrating LPAR. The physical adapters can be removed using the DLPAR operation before the migration. If DB2 is accessing such devices, then the DLPAR operation to remove them will fail, resulting in a migration validation failure. The best way to avoid this situation is to follow the rule that there should be no physical adapters belonging to the LPAR. All adapters (network, storage) must be virtual adapters.
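One quick check from inside the mobile LPAR is to list the configured adapters (a sketch; adapter names vary from system to system):

# lsdev -Cc adapter

Every adapter reported should be a virtual device (virtual Ethernet or virtual SCSI client adapters); any physical adapter listed must be removed with a DLPAR operation before the migration is attempted.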

PowerVM Virtualization on IBM System p Introduction and Configuration (see Resources) has details about System p virtualization setup and configuration.
 

Partition mobility is not supported if the LPAR is using huge AIX pages (16GB page size). It's important to ensure that DB2 is not using the AIX huge page configuration; for example, via the DB2 registry variable DB2_LARGE_PAGE_MEM=DB:16GB. The AIX 4KB, 64KB, and 16MB page sizes, which DB2 supports, are supported for use with partition mobility.
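A quick way to confirm the registry variable from the DB2 instance owner's shell (a sketch; empty output means the variable is not set):

$ db2set DB2_LARGE_PAGE_MEM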

The DB2 database manager configuration parameter INSTANCE_MEMORY controls the amount of memory used by the DB2 instance. If the purpose of the migration is to give the DB2 instance more memory on the target system (by performing a DLPAR operation after the migration to add memory to the LPAR hosting DB2), the INSTANCE_MEMORY configuration parameter may need attention after the migration so that DB2 can use the additional memory. No other change to the DB2 configuration is needed.
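For illustration only (the explicit value below is arbitrary and given in 4KB pages), the parameter can be adjusted from the DB2 command line processor after the migration and the follow-up DLPAR operation:

$ db2 update dbm cfg using INSTANCE_MEMORY AUTOMATIC

or, to set an explicit limit:

$ db2 update dbm cfg using INSTANCE_MEMORY 4000000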


 


 

Migration analysis

This section discusses migration performance, LPAR suspend/resume duration, effect of workload intensity, and resource requirements.

Active migration performance profile

The shaded area in Chart 1 below shows when the mobile LPAR is migrating. Execution of the mobile LPAR on the source system ends with a suspension (shown in green). It is followed by a resume operation (the red shaded area), which is observed between suspension on the source system and resumption on the target system.


Chart 1. Performance profile of DB2 active migration

Notice that the memory copy goes beyond the point of resumption of execution of the mobile LPAR on the destination system. When the LPAR is suspended, some memory pages that were recently modified have not been copied over to the destination system. When processes access memory pages that are still on the source system, the memory pages are demand-paged over from the source system. Therefore, processes can continue execution on the destination without waiting for all the remaining memory pages to be copied over.

Chart 1 also shows that database transactions continue once the mobile LPAR is resumed, and they return to full performance within a few seconds. This timing is workload dependent.

Table 2 below summarizes elapsed time at various stages of the active migration. The times are for example purposes only, and not representative. Times depend entirely on the workload, workload type, DB2 memory consumption, network adapter speed, network topology, amount of memory, storage system, and so forth.


Table 2. DB2 live partition mobility summary

Event                                                                                              Duration (mm:ss)
-------------------------------------------------------------------------------------------       ----------------
Pre-migration validation and state transfer (memory copy)                                          03:15
Total migration time (elapsed time between the “Begin migration” and “End migration” markers)      03:24

The whole migration scenario took only three minutes and 24 seconds. This time is a function of the amount of memory in the LPAR and how frequently memory pages are updated during the migration. The total migration time breaks down into two major events: the state transfer took 3 minutes and 15 seconds, followed by roughly 9 seconds of suspend/resume. The original throughput (the throughput at or before the “Begin migration” marker) is reestablished almost instantly on the target system.

As seen in Chart 1, transaction throughput is affected during the state transfer phase. The average throughput dropped by about 12% between the markers “Begin migration” and “Suspend on source system.”

It is worth reemphasizing that all the performance analysis metrics and elapsed times mentioned in this discussion are examples only. They will vary widely from environment to environment.

The next section discusses test results with different levels of load on the mobile LPAR in combination with different VIOS configurations.

Load scenarios

The next set of tests examines the effect of LPAR load on the mobility characteristics. There are three load levels synthesized by a varying number of DB2 clients. The resulting average DB2 transaction throughput is approximately 1500, 1800, and 2200 transactions/second (TPS), as shown in Chart 2.


Chart 2. Effect of partition load on mobility profile

There were no other changes in the LPAR, AIX, or DB2 configuration. It is clear that the amount of load has an apparent effect on the mobility metrics, including total migration time, state transfer time, and suspend/resume duration. The detailed metrics are shown in Table 3.


Table 3. Effect of workload intensity on live partition mobility

Event                                                                                              Low     Medium   High
-------------------------------------------------------------------------------------------       ------  ------   ------
Pre-migration validation and state transfer (memory copy)                                          02:16   02:33    03:15
Suspend/resume duration                                                                            00:09   00:09    00:09
Total migration time (elapsed time between the “Begin migration” and “End migration” markers)      02:25   02:42    03:24

Processor consumption

This section reviews processor consumption during the various stages of the active migration operation. Chart 3 below represents the same run scenario as Chart 1, but this time shows processor utilization for each of the involved partitions. Each VIOS (on the source and the target system) is configured as a dedicated LPAR with one processor and 512MB of physical memory. The DB2 LPAR is configured as a dedicated LPAR with two processors and 9GB of physical memory.


Chart 3. Processor utilization of VIOS and DB2 LPAR during active migration operation

Before the “Begin migration” event, DB2 LPAR processor utilization is around 85%. Right after the migration is triggered, it increases slightly to around 87% and stays there until the “End migration” marker. The slight increase is attributed to the DB2 partition participating in the state transfer. Compare this with Chart 1, where DB2 throughput gradually declines over the same period, up to time marker 120 in the charts. After marker 120, DB2 throughput stays at the same level (down by around 12% from the pre-migration level).

Before the migration, the source VIOS is servicing DB2 disk and network I/O requests, so its processor utilization is around 18%. Since there is no other activity on the target system, its VIOS processor utilization is 0%. After marker 120 in the chart, the processor utilization of both VIOS partitions increases significantly because both are now conducting the state transfer operation. The source VIOS continues servicing DB2 disk and network I/O in addition to performing the state transfer, causing its processor utilization to jump as high as 90%.

Around the same time, the target VIOS processor utilization increases from 0% to around 60% and stays at that level until the end of the state transfer operation. Right before the suspend/resume operation, the target system VIOS processor utilization increases to as high as 95%; the source VIOS shows a similar trend, as does the DB2 LPAR.

Observe that after the “End migration” event, the VIOS processor utilizations are reversed: the target system VIOS is now at the level previously seen on the source system VIOS, since the target system is now servicing the DB2 clients, and vice versa.

To minimize the total active migration duration, it is essential to size the VIOS appropriately. A shared dedicated capacity partition or an uncapped shared-processor partition is recommended for the VIOS. To derive the initial amount of processing capacity, set the VIOS partition type to uncapped shared-processor, do a trial run with a peak workload, perform an active migration, and record the VIOS entitled capacity utilization (using the VIOS viostat command, for example). As a safeguard, add 10% spare capacity to the recorded peak entitled capacity utilization. For a shared dedicated partition, round this value up. For an uncapped shared-processor VIOS partition, use the calculated value as the entitled capacity and set the VIOS uncapped weight to the highest value.
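
As an illustration of the sizing arithmetic (the utilization numbers are hypothetical, and the viostat interval/count syntax is assumed to mirror iostat):

$ viostat 5 60        # sample VIOS utilization every 5 seconds during the trial migration

# Hypothetical sizing calculation:
#   recorded peak entitled capacity utilization = 0.85 processing units
#   add 10% spare capacity: 0.85 x 1.10 = 0.94 processing units
#   shared dedicated partition: round up to 1 dedicated processor
#   uncapped shared-processor partition: entitled capacity 0.94, uncapped weight at the maximum (255)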

Summary

Live Partition Mobility gives administrators greater control over the use of resources in their data center. The migration can be performed either with a partition that is powered off or with a live partition. There are two migration types: with inactive migration, the logical partition is powered off and moved to the destination system; with active migration, the partition is migrated while it continues to provide service, without disrupting user activities.

Maintenance activities that required downtime in the past can now be performed with no disruption to services. These activities include, but are not limited to, preventive hardware maintenance, hardware upgrades, and server consolidation that involves reallocating partitions among servers.

The separately licensed product PowerVM Enterprise edition is required to enable the mobility. The PowerVM Enterprise edition includes Live Partition Mobility, Shared Dedicated Capacity, Micro-partitioning, VIOS, and more.

This article provided a detailed analysis of DB2 9.5 running with the PowerVM Enterprise edition Live Partition Mobility feature. The DB2 9.5 instance hosted an OLTP workload and serviced a multitude of clients, with more than 2000 transactions/second (TPS) of throughput migrated to another system. The client network connections remained intact, and the application observed only a blip of a few seconds.

Acknowledgements

The authors would like to thank Horace Tong for allowing them to derive some of the content for this article from his previous work; Sunil Kamath and Peter Kokosielis for their help with DB2 setup and configuration; Rich Bassemir for review comments; and Pete Jordan for assistance with lab support and administrative matters related to the system and storage used.

Complex networking using Linux on Power blades

Blades are an excellent choice for many applications and services, especially in the telecommunications service provider industry. But the unique requirements of these provider networks often require configurations that are complex and need up-front focus and planning so all the stringent functional requirements are met. In this article, learn how to plan and set up the necessary network configurations for a POWER6™ JS22 blade deployment.

Blade-based operational models are of tremendous value in the wired and wireless telecommunications domain for several reasons:

  1. Small footprints mean cost-effective use of data center space.
  2. Deployment meets NEBS requirements for distributed deployments. (NEBS, or Network Equipment Building System, is a set of criteria that networked equipment must meet to qualify for compatibility.)
  3. Cost-effective massive horizontal scalability lowers deployment costs for the telecommunications service provider.
  4. Centralized management support provides better OAM&P support for in-network deployment in service provider networks. (OAM&P stands for "operations, administration, maintenance, and provisioning." The term describes this collection of disciplines and specific software used to track it.)
  5. Built-in support for continuous-availability-based operational models—including upgrades and maintenance activities—avoids service downtime from a subscriber perspective.

These additional considerations are key in a telecommunications service provider environment, especially one with complex configurations:

  • Multiple VLANs. These are used for CDN (Customer Data Network) and Management (OAM&P) traffic. Considering them separately ensures that subscriber QoS (Quality of Service) is effectively maintained across multiple LPARs (logical partitions).
  • Micro-partitioning and virtualization. These strategies help maximize the capacity utilization and TCO (Total Cost Of Ownership).
  • Existing network complexities. Existing networks can have a higher degree of load variability, requiring resource load balancing among multiple client LPARs.

In this article, we describe an implementation of a multiple-VLAN configuration on a blade chassis using Cisco switches paired in an active/passive configuration. In our example, we configured networks to connect multiple VLANs on a BladeCenter® JS22 using Linux® on Power. The architecture consists of six Cisco Catalyst switch modules, each with fourteen internal ports and four external 1Gb ports.

To properly leverage all six switches in the chassis, the blade requires six Ethernet interfaces. Ethernet interface ent0 on the blade maps to the first switch in the blade chassis; each consecutive Ethernet interface maps to the next available switch. This mapping creates a limitation, because it does not allow administrators to map a physical adapter on the blade to the switch of their choice.

When creating the network architecture for your blades, you must have one physical interface for each Cisco switch you want to use on the chassis. If a blade has fewer adapters than the chassis has switches, the switches that do not have a physical adapter associated with them on that blade cannot be used by that blade.

It is important to understand how to pair the Ethernet interfaces on the blade and the switches within the chassis. The first switch in the chassis is normally located in the upper left of the blade chassis, just below the power plug. It would map to ent0 on the blade since it is the first interface on the blade. Figure 1 shows the numbering of the switches in our configuration.


Figure 1. Physical adapter switches in the configuration we're using

Determining switch pairing is extremely important for high availability. In a typical configuration, one power distribution unit (PDU) supplies power to the top half of the blade chassis while another PDU provides power to the bottom half of the chassis. If you are creating a redundant solution, it's important to split the primary and secondary switch between the upper half and lower half of the chassis.

In our case, we created a solution with the following switch pairs: (1, 9) (2, 3) (4, 7). Because adapter pairs (ent0, ent1), (ent2, ent3), and (ent4, ent5) are on the same physical I/O card, we also needed to make sure that the network traffic for our target VLANs did not travel across the same I/O card. Our configuration splits traffic across multiple PDUs and across multiple interfaces.

Even though the pairing of the adapters and switches may seem simple, there are multiple steps to configure the IVM (Integrated Virtualization Manager), switches, and LPARs to take advantage of this architecture. Figure 2 represents the setup of one of our blades with the associated switch, trunking, and VLAN tagging. This configuration allows multiple VLANs to communicate over the same physical adapters to multiple switches.


Figure 2. One of the blades with the associated switch, trunking, and VLAN tagging

In this example, each LPAR has two Ethernet adapters that connect to one Virtual Ethernet Adapter (VEA) on the IVM. Notice that the VEA on the IVM has multiple VLANs associated with it. Traffic for each VLAN is separated on the LPARs by their adapters. The VEA trunks the VLANs together and sends them over the Shared Ethernet Adapter (SEA) through a Link Aggregation Device and out to the network via one of the chassis switches. The switches route the VLAN traffic to the appropriate network through the use of VLAN tagging.

Five steps to configuring VLANs

There are five main steps (and one optional step) for configuring VLANs for client LPARs on the IVM:

  1. Configure Cisco switches. (An optional step at this point may be to create a link aggregation device, commonly referred to as the LNAGG device.)
  2. Create 802.1q IEEE VLAN-aware interfaces on the IVM to support the VLANs you want to use on each interface. It is very important to design the VLANs before you begin, because the created VLAN-aware interfaces cannot be modified afterwards; you would have to delete and re-create them, which is time-consuming.
  3. Assign the Virtual Ethernet Adapter to the physical adapter (LNAGG) in the Virtual Ethernet Menu on the IVM.
  4. Modify the LPAR properties to map the new virtual adapters to LPAR adapters. Make sure the LPARs are inactive before changing the network device properties.
  5. Boot each LPAR and configure the new interfaces.

The example in this article is intended for a fresh install of a blade.

Step 1: Configure the Cisco switches

You may choose to skip this step if the switches are already properly configured with the VLANs you want to use. The example does not demonstrate how to configure the spanning tree and switch priority settings that may be required. If you want to follow the example, configure your switches to match the configuration that follows.

Log in to the switch

Type the following commands in this order:

  • enable
  • config
  • interface GigabitEthernet0/1
  • description blade1
  • switchport trunk native vlan 383
  • switchport trunk allowed vlan 373,383
  • switchport mode trunk

These commands configure port 1 on the switch for trunking. If traffic arriving from the blade, or from the external ports, is not tagged with a VLAN, the switch tags it with VLAN 383. The port allows only traffic for VLANs 373 and 383, in this case, to route to and from the IVM's VEAs. To change which VLANs may access the port from the IVM, simply change the trunk allowed VLAN numbers.
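
For example, to permit one more VLAN on the trunk without retyping the whole list, you could append it with the standard IOS add keyword; the VLAN number 393 here is purely illustrative:

interface GigabitEthernet0/1
 switchport trunk allowed vlan add 393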

Splitting the traffic

Once the traffic is trunked and sent to the switch, the switch will determine which external port routes the VLAN traffic. In the example, we are sending external VLAN 373 traffic across port 17, and VLAN 383 over port 20.

To configure VLAN 373 traffic over port 17, type the following commands on the Cisco switch:

  • interface GigabitEthernet0/17
  • description extern1
  • switchport access vlan 373
  • switchport mode access

To configure VLAN 383 traffic over port 20, type the following commands on the Cisco switch:

  • interface GigabitEthernet0/20
  • description extern4
  • switchport access vlan 383
  • switchport mode access

After setting the external port configuration, type exit twice to back out of configuration mode. Then run the show run command to see your configuration. The show run command displays the actively running switch configuration; it has not yet been written to the switch's memory, but you can see the changes currently in effect. In the running configuration you can find the changes made in the steps above. Look for the Ethernet port for blade 1 as we configured it:


Listing 1. Displaying the configuration of switch port 1
 
interface GigabitEthernet0/1
       description blade1
       switchport trunk native vlan 383
       switchport trunk allowed vlan 373,383
       switchport mode trunk

 

If you issue a show config command, you will see the previous configuration of the interface, not the one we just created. If the switch is rebooted in this state, the running configuration will be lost. To write it to memory, type write on the command line of the switch. If you then run show config again, the configuration stored on the switch for our interfaces will match the running configuration.
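
Putting the verification and save steps together, the sequence described above looks like this:

exit
exit
show run        ! review the running (unsaved) configuration
write           ! save the running configuration to memory
show config     ! the stored configuration should now match the running one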

Optional step: Create a Link Aggregation Adapter

A link aggregation device can be used to connect two physical adapters together to look like one adapter. This is useful for creating an active/passive configuration for failover.

In our example, we wanted to create an active/passive configuration on the blade linking adapters ent0 and ent5 together. Issue the following command from the IVM's command line to create a link aggregation device (LNAGG) with ent5 as the backup adapter:

$ mkvdev -lnagg ent0 -attr backup_adapter=ent5

Use the lsdev command to verify the creation of the LNAGG device.


Listing 2. Verifying the creation of the LNAGG device
 
$lsdev |grep ^ent
ent0             Available   Logical Host Ethernet Port (lp-hea)
ent1             Available   Logical Host Ethernet Port (lp-hea)
ent2             Available   Gigabit Ethernet-SX Adapter (e414a816)
ent3             Available   Gigabit Ethernet-SX Adapter (e414a816)
ent4             Available   Gigabit Ethernet-SX PCI-X Adapter (14106703)
ent5             Available   Gigabit Ethernet-SX PCI-X Adapter (14106703)
ent6             Available   Virtual I/O Ethernet Adapter (l-lan)
ent7             Available   Virtual I/O Ethernet Adapter (l-lan)
ent8             Available   Virtual I/O Ethernet Adapter (l-lan)
ent9             Available   Virtual I/O Ethernet Adapter (l-lan)
ent10            Available   EtherChannel/IEEE 802.3ad Link Aggregation
ent11            Available   EtherChannel/IEEE 802.3ad Link Aggregation
ent12            Available   EtherChannel/IEEE 802.3ad Link Aggregation
ent13            Available   Virtual I/O Ethernet Adapter (l-lan)
ent14            Available   Virtual I/O Ethernet Adapter (l-lan)
ent15            Available   Virtual I/O Ethernet Adapter (l-lan)
ent16            Available   Shared Ethernet Adapter
ent17            Available   Shared Ethernet Adapter
ent18            Available   Shared Ethernet Adapter

 

In the output from the lsdev command, you can see that the link aggregation devices (ent10 - ent12) look like physical adapters. This allows you to map a link aggregation device to a virtual device (ent13 - ent15) via a Shared Ethernet Adapter (ent16 - ent18). The SEA treats the link aggregation device (LNAGG) as though it were a physical adapter. The Virtual Ethernet Adapters (VEA) ent6 - ent9 are created by default; they are not VLAN-aware devices, nor can you modify them to become VLAN-aware. Adapters ent0 through ent5 are the physical adapters on the blade server.

Use the -attr flag on the lsdev command to see how the LNAGG device maps to its physical adapters: $ lsdev -dev ent10 -attr produces the mapping shown in Table 1.


Table 1. LNAGG to PA mapping with $lsdev -dev ent10 -attr
 
Attribute Value Description Is it user-settable?
adapter_names ent0 EtherChannel Adapters True < Primary
alt_addr 0x000000000000 Alternate EtherChannel Addr True
auto_recovery no Enable automatic recovery after failover True
backup_adapter ent5 Adapter used when whole channel fails True < Backup
hash_mode default Determines how outgoing adapter is chosen True
mode standard EtherChannel mode of operation True
netaddr   Address to ping True
noloss_failover yes Enable lossless failover after ping failure True
num_retries 8 Times to retry ping before failing True
retry_time 1 Wait time (in seconds) between pings True
use_alt_addr no Enable Alternate EtherChannel Address True
use_jumbo_frame no Enable Gigabit Ethernet Jumbo Frames True

 

Step 2: Create virtual adapters on the IVM

In this example, we're going to configure network traffic to flow through the Virtual Ethernet Adapter for VLANs 373 and 383. The first step in configuring the solution is to create the virtual adapter on the IVM that is needed to transport traffic to and from the client LPARs. Let's create a virtual adapter with a primary port of 373 and a secondary port of 383.

To create the VLAN-aware interface on the IVM:

  1. Use your favorite telnet program (such as PuTTY) to open a session to the IVM, and log in as padmin (the default password is "passw0rd", with a zero "0" instead of the letter "O").
  2. Use the lshwres -r virtualio --rsubtype eth --level lpar command to list the Ethernet resources:

    Listing 3. Listing the Ethernet resources
     
    $lshwres -r virtualio --rsubtype eth --level lpar
    lpar_name=IVM_01,lpar_id=1,slot_num=3,state=1,ieee_virtual_eth=0,port_vlan_id=1,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B503
    lpar_name=IVM_01,lpar_id=1,slot_num=4,state=1,ieee_virtual_eth=0,port_vlan_id=2,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B504
    lpar_name=IVM_01,lpar_id=1,slot_num=5,state=1,ieee_virtual_eth=0,port_vlan_id=3,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B505
    lpar_name=IVM_01,lpar_id=1,slot_num=6,state=1,ieee_virtual_eth=0,port_vlan_id=4,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B506
    

     
  3. Use the chhwres command to create a virtual adapter on the IVM that supports IEEE VLAN awareness and the additional VLANs you want on that interface. To do this, issue the following command from the command line of the IVM: $ chhwres -p IVM_01 -o a -r virtualio --rsubtype eth -s 15 -a '"ieee_virtual_eth=1","port_vlan_id=373","addl_vlan_ids=383","is_trunk=1","trunk_priority=1"'. The chhwres command tells the IVM how to construct a new VLAN-aware Virtual Ethernet Adapter.

    There are some important features of the command that you need to know to create multiple virtual adapters on the IVM:
    • -p partition: In the command, we are telling the chhwres that there is a change to the IVM partition by issuing the -p command.
    • -s nn: This tells the IVM to use a particular slot number. If this parameter is not specified, the IVM uses the next available slot. The slot number is required when a device is removed from the IVM (see the sketch after this list).
    • ieee_virtual_eth: A value of 1 informs the IVM that this adapter supports IEEE 802.1Q. This needs to be set to 1 if there are additional VLANs that are required.
    • port_vlan_id: This is the primary VLAN for the virtual adapter.
    • addl_vlan_ids: If trunking is enabled, then this parameter accepts the additional VLANs.
    • is_trunk: This attribute must also be set to 1 if you are passing multiple VLANs.
    • trunk_priority: When trunking, the priority of the adapter must be set between 1-15.
  4. Ensure the creation is complete by re-running the lshwres command and look for the new devices.

    Listing 4. Displaying the new devices
     
    $lshwres -r virtualio --rsubtype eth --level lpar
    lpar_name=IVM_01,lpar_id=1,slot_num=3,state=1,ieee_virtual_eth=0,port_vlan_id=1,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B503
    lpar_name=IVM_01,lpar_id=1,slot_num=4,state=1,ieee_virtual_eth=0,port_vlan_id=2,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B504
    lpar_name=IVM_01,lpar_id=1,slot_num=5,state=1,ieee_virtual_eth=0,port_vlan_id=3,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B505
    lpar_name=IVM_01,lpar_id=1,slot_num=6,state=1,ieee_virtual_eth=0,port_vlan_id=4,
     addl_vlan_ids=none,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B506
    lpar_name=IVM_01,lpar_id=1,slot_num=15,state=1,ieee_virtual_eth=1,port_vlan_id=383,
     addl_vlan_ids=378,is_trunk=1,trunk_priority=1,is_required=0,mac_addr=463337C4B50F
    lpar_name=IVM_01,lpar_id=1,slot_num=16,state=1,ieee_virtual_eth=1,port_vlan_id=6,
     "addl_vlan_ids=22,23",is_trunk=1,trunk_priority=1,is_required=0,
     mac_addr=463337C4B510
    lpar_name=IVM_01,lpar_id=1,slot_num=17,state=1,ieee_virtual_eth=1,port_vlan_id=7,
     "addl_vlan_ids=565,566,567,568",is_trunk=1,trunk_priority=1,is_required=0,
     mac_addr=463337C4B511
    

     

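For reference, removing a VLAN-aware virtual adapter later requires the slot number noted above. This is a minimal sketch, assuming the remove operation (-o r) takes the same option style as the add command, for the slot 15 adapter created in this example:

$ chhwres -p IVM_01 -o r -r virtualio --rsubtype eth -s 15
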
Step 3: Assign the virtual Ethernet ID to a physical adapter

Once the link aggregation device has been created, it needs to be mapped to a virtual adapter. This is easily accomplished via the IVM GUI as shown in Figure 3.


Figure 3. Mapping the LNAGG to the virtual adapter via the IVM GUI

After logging in to the GUI, select "View/Modify Virtual Ethernet" from the left-side navigation, then choose the "Virtual Ethernet Bridge" tab. From this menu you can see the virtual Ethernet adapter we created previously, with the 383 primary VLAN and the 373 additional VLAN. From the drop-down box, select the link aggregation device we created in the previous step. Once the new device has been chosen, click Apply. This creates a Shared Ethernet Adapter (SEA) within the IVM.

Step 4: Modify the LPAR's properties

Once a physical adapter or LNAGG device has been mapped to a Virtual Ethernet ID on the IVM, the virtual adapter needs to be created for each logical partition. The first step is to log in to the GUI on the IVM (see Figure 4).


Figure 4. Creating the virtual adapter for each LPAR

After logging in, select View/Modify Partition in the upper left corner. Once the page refreshes, choose the LPAR you are going to modify.

Select the client LPAR in the menu, and be sure that your browser allows pop-ups. In the pop-up, select the "Ethernet" tab (Figure 5).


Figure 5. Remember to power off the LPAR to modify properties

In Figure 5, you can see that Virtual Ethernet pull-downs are grayed out; that is because the LPAR was running when the screenshot was taken. Make sure the client LPAR is powered off or inactive before modifying the properties.

Use the pull-downs on this screen to map the VLAN-aware VEAs to the client LPAR's adapters. Notice that each virtual adapter is associated with one VLAN; this allows the IVM to attach VLAN tags to traffic as it comes in from the operating system and to send it out the appropriate LNAGG device. If more adapters are needed, click Create Adapter.

Step 5: Configure Linux to use the VLAN-aware interfaces

Once the IVM and Cisco switches have been configured, you may need to do one additional step if the configuration requires static IP addresses for your Linux partitions. From the IVM GUI, activate the LPAR.

  1. Log in to the box and change your user to root.
  2. Type the following: cd /etc/sysconfig/network-scripts.

If you run the ifconfig command, you will see the virtual adapter for the individual VLAN mapped to the LPAR. With your favorite editor, change the interface parameters to meet your configuration requirements. The following is an example of an interface, ifcfg-eth0, with a static IP address:


Listing 5. An interface with a static IP address
 
DEVICE=eth0
BOOTPROTO=static
BROADCAST=192.168.1.31
HWADDR=00:1A:64:8C:B2:32
IPADDR=192.168.1.44
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
GATEWAY=192.168.1.1

 

Restart the interfaces using /etc/init.d/network restart.
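
For reference, a minimal restart-and-verify sequence, assuming the eth0 interface from Listing 5 (output not shown):

/etc/init.d/network restart     # reload the interface configuration
ifconfig eth0                   # confirm the static address from ifcfg-eth0 is now active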

Conclusion

As with any deployment, planning is crucial to success. To avoid costly rework, it's important to lay out your network design before implementing a complex network within the blade chassis. From our experience, reconfiguring a complex IVM-supported network requires considerable effort; in fact, the administrator usually must remove the previous configuration entirely before reconfiguring.

Planning is also critical to the installation because you cannot add new VLANs to the virtual adapters on the fly in the IVM. Since you can have only one IVM on the JS22, you cannot use SEA failover as you can in a traditional VIOS installation. Link aggregation provides a mechanism for routing traffic across multiple switches in case a switch fails. When considering redundancy in the blade chassis, remember that the top half is powered by one PDU while the bottom half is powered by the other.

All these considerations add up to a relatively complex network implementation, so our best advice is to plan before you act.