Performance Blog

Posts Tagged ‘loadtesting’

Going by the many posts in various LinkedIn groups and blogs, there seems to be some confusion about how to measure and analyze a web application’s performance. This article tries to clarify the different aspects of web performance and how to go about measuring it, explaining key terms and concepts along the way.

Web Application Architecture

The diagram below shows a high-level view of typical architectures of web applications.

The simplest applications have the web and app tiers combined while more complex ones may have multiple application tiers (called “middleware”) as well as multiple datastores.

The Front end refers to the web tier that generates the HTML response for the browser.

The Back end refers to the server components that are responsible for the business logic.

Note that in architectures where a single web/app server tier is responsible for both the front and back ends, it is still useful to think of them as logically separate for the purposes of performance analysis.

Front End Performance

When measuring front end performance, we are primarily concerned with understanding the response time that the user (sitting in front of a browser) experiences. This is typically measured as the time taken to load a web page. Performance of the front end depends on the following:

  • Time taken to generate the base page
  • Browser parse time
  • Time to download all of the components on the page (CSS, JS, images, etc.)
  • Browser render time of the page

For most applications, the response time is dominated by the third item above, i.e. the time spent by the browser in retrieving all of the components on a page. As pages have become increasingly complex, their sizes have mushroomed as well – it is not uncommon to see pages of 0.5 MB or more. Depending on where the user is located, it can take a significant amount of time for the browser to fetch components across the internet.

Front end Performance Tools

Front-end performance is typically viewed as waterfall charts produced by tools such as the Firebug Net Panel. During development, Firebug is an invaluable tool to understand and fix client-side issues. However, to get a true measure of end user experience on production systems, performance needs to be measured from points on the internet where your customers typically are. Many tools are available to do this and they vary in price and functionality. Do your research to find a tool that fits your needs.

Back End Performance

The primary goal of measuring back end performance is to understand the maximum throughput that it can sustain. Traditionally, enterprises perform “load testing” of their applications to ensure they can scale. I prefer to call this “scalability testing”. Test clients drive load via bare-bones HTTP clients and measure the throughput of the application, i.e. the number of requests per second it can handle. To increase the throughput, the number of client drivers needs to be increased until the point where throughput stops increasing or, worse, starts to drop off.
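The ramp-up described above can be illustrated with a small Java sketch. It is not taken from any particular load testing tool: the class name, the method name and the 5% gain threshold are all assumptions made for illustration. Given the throughput measured at each load level, it reports the last level that still produced a meaningful gain.

```java
/** Sketch: find the load level at which throughput stops scaling. */
final class ThroughputKnee {

    /**
     * clients[i] is the number of client drivers used in step i and
     * requestsPerSecond[i] the throughput measured at that step.
     * Returns the client count of the last step that improved throughput
     * by at least minGain (0.05 means 5%) over the best step so far.
     * The threshold is an illustrative assumption, not a standard value.
     */
    static int saturationPoint(int[] clients, double[] requestsPerSecond, double minGain) {
        if (clients.length == 0 || clients.length != requestsPerSecond.length) {
            throw new IllegalArgumentException("need one throughput value per load level");
        }
        int best = 0;
        for (int i = 1; i < clients.length; i++) {
            double gain = (requestsPerSecond[i] - requestsPerSecond[best]) / requestsPerSecond[best];
            if (gain < minGain) {
                break; // throughput has flattened out or started to drop off
            }
            best = i;
        }
        return clients[best];
    }

    public static void main(String[] args) {
        // Made-up measurements: throughput flattens after 40 clients, then drops.
        int[] clients = {10, 20, 40, 80, 160};
        double[] rps = {200, 390, 700, 720, 650};
        System.out.println(saturationPoint(clients, rps, 0.05)); // prints 40
    }
}
```

In practice each load level should be run long enough to reach a steady state before its throughput is recorded.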

For complex multi-tier architectures, it is beneficial to break up the back end analysis by testing the scalability of individual tiers. For example, database scalability can be measured by running a workload just on the database. This can greatly help identify problems and also provides developers and QA engineers with tests they can repeat during subsequent product releases.

Many applications are thrown into production before any scalability testing is done. Things may seem fine until the day the application gets hit with increased traffic (good for business!). If the application crashes and burns because it cannot handle the load, you may not get a second chance.

Back End Performance Tools

Numerous load testing tools exist with varying functionality and price. There are also a number of open source tools available. Depending on the resources you have and your budget, you can also outsource your entire scalability testing.

Summary

Front end performance is primarily concerned with measuring end user response times while back end performance is concerned with measuring throughput and scalability.

 


Recently, an engineer came to me puzzled that the response times of some performance benchmark she was running were increasing. She had already looked at the usual trouble spots – resource utilization on the app and database systems, database statistics, application server tunables, the network stack, etc. I asked her about the CPU metrics on the load driver systems (the machines which drive the load). Usually, when I ask this question, the answer is “I don’t know. Let me find out and get back to you”. But this engineer had looked at that as well. “It isn’t a problem. There is plenty of CPU left – I have 30% idle”.

Aha – I had spotted the problem. When we run benchmarks, we tend to squeeze every bit of performance we can out of the systems. This means running the servers as close to 100% utilized as possible. This mantra is sometimes carried over to the load driver systems as well. Unfortunately, that can result in severe performance degradation. Here’s why.

The load driver systems emulate users and generate requests to the system under test. They receive the responses and measure and record response times. A typical driver emulates hundreds to thousands of users. Each emulated user is then competing for system resources. Now suppose an emulated user has issued a read to receive the response from the server. It is very likely that this thread will be context switched out by the operating system as there are so many other users it needs to serve. Depending on the number of CPUs on the system and the load, the original emulated user thread may get to execute only after a considerable delay and consequently record a much larger response time. My rule of thumb is never to run the load generator systems more than 50% busy if the application is latency sensitive. In this particular case, the system was already 70% utilized.
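The arithmetic behind this rule of thumb can be sketched in Java. The class and method names are hypothetical, and the estimate of how many drivers are needed assumes that driver CPU load spreads evenly and linearly across machines, which is a simplification.

```java
/** Sketch: apply the 50% rule of thumb to a load driver's CPU readings. */
final class DriverHeadroom {

    /** Rule-of-thumb ceiling for latency-sensitive tests, in percent busy. */
    static final double MAX_DRIVER_UTILIZATION = 50.0;

    /** Converts the idle percentage reported by a monitoring tool into percent busy. */
    static double utilizationPercent(double idlePercent) {
        if (idlePercent < 0.0 || idlePercent > 100.0) {
            throw new IllegalArgumentException("idle percentage must be between 0 and 100");
        }
        return 100.0 - idlePercent;
    }

    /** True when the driver is busier than the rule of thumb allows. */
    static boolean isOverloaded(double idlePercent) {
        return utilizationPercent(idlePercent) > MAX_DRIVER_UTILIZATION;
    }

    /**
     * Estimates how many identical drivers would keep each one at or below 50% busy.
     * Assumes the load divides evenly across drivers (an idealization).
     */
    static int driversNeeded(int currentDrivers, double idlePercent) {
        double totalLoad = currentDrivers * utilizationPercent(idlePercent);
        return Math.max(1, (int) Math.ceil(totalLoad / MAX_DRIVER_UTILIZATION));
    }

    public static void main(String[] args) {
        // The case from the story: one driver reporting 30% idle.
        System.out.println(isOverloaded(30.0));     // prints true
        System.out.println(driversNeeded(1, 30.0)); // prints 2
    }
}
```

For the single driver at 30% idle, the estimate comes out at two drivers, which matches adding one more load driver system.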

Sure enough – when a new load driver system was added and the performance tests re-run, all the response time criteria passed and the engineer could continue scaling the benchmark.

Moral of the story – don’t forget to monitor your load driver systems and don’t be complacent if their utilization starts climbing above 50%.

After finishing my post on the Faban 1.0 announcement, I realized that it was geared towards users who were already using Faban. So I decided to write this post for users who have never used Faban.

Faban is two things:

  1. A framework for developing performance and load tests.
  2. A tool to run these tests and view their results.

The former is called the “Faban Driver Framework” and the latter is called the “Faban Harness”. Although the two are related, it is entirely possible to run an arbitrary test developed outside of Faban using the Faban Harness. In fact, many benchmarks do just that. In this respect, Faban is rather unique.

The Driver Framework

The real power of Faban is unleashed only when you use the framework. The framework provides capabilities similar to other load testing tools, namely: emulate users, keep track of response times and other metrics, run monitoring tools, etc. Some unique features of Faban include the ability to:

  • accurately measure server response times at the network layer
  • emulate a Markov model to realistically model web workloads
  • emulate a Poisson process for inter-arrival times
  • support hundreds of thousands of users with the lowest possible resource requirement

If that doesn’t convince you to try Faban, maybe some of the features in the Harness will. Please read on.

The Harness

The Faban Harness is a web application that queues and executes runs, displays reports and graphs from previous runs and in general serves to maintain the results repository. Some features of the harness include:

  • Gathering of configuration information across all the systems in the test configuration (including driver systems)
  • Automatic collection of system level performance monitoring data
  • Ability to run arbitrary scripts/commands to collect any type of monitoring data
  • Automated management of a wide variety of common server applications like Apache, MySQL, GlassFish, etc.
  • Graphing of both workload and monitoring data for easier analysis

If a lot of these features sound like LoadRunner and other fancy, high-priced tools, they are (in fact these tools don’t even have all of the functionality I’ve listed above). And you get all this for free in an open-source tool!

Tutorials

So check it out. The easiest way to get started with Faban is using the Quick Start Tutorial. It gives step-by-step instructions, including screenshots, for installing Faban and running a pre-built workload.

If you are ready to start creating your first workload, check out the Creating your first Workload Tutorial. For users new to Java, this step-by-step tutorial should make it really easy to get started.

I mentioned fhb in a prior post, when it was still in its infancy. Both Faban and fhb have now come a long way and so I wanted to post an update.

Faban is both a framework for creating workloads and a harness for running them while providing performance monitoring of all parts of the system being tested. There are myriad free and open source load testing tools out there but none that match the power, functionality and accuracy of Faban (my own opinion of course, but I think you will agree once you evaluate it yourself).

To make use of the flexibility and power that Faban provides, you do need to write some code in Java. As soon as that sentence is uttered, I can see some heads shaking and, worse, it sends shivers down some spines! For some reason, some people seem to think that writing code is bad, or difficult. And writing in a programming language like Java – what could be worse! I don’t want to get into the merits of programming versus scripting languages, but the fact is that the kind of programming you have to do for Faban is largely declarative with some high-level method calls. Faban uses annotations heavily, provides the build scripts and does all the heavy lifting for you – here is some sample code from the ‘web101’ sample benchmark:

@BenchmarkOperation (
    name    = "MyOperation1",
    max90th = 2,
    timing  = Timing.AUTO
)
public void doMyOperation1() throws IOException {
    logger.finest("Accessing " + url1);
    http.fetchURL(url1);
    // Record the response size only while the run is in steady state.
    if (ctx.isTxSteadyState())
        contentStats.sumContentSize[ctx.getOperationId()] += http.getContentSize();
}

That’s it. You now have a web request that will do a GET on url1 and keep track of the size of the response data.
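The max90th attribute in the sample above suggests a limit on the 90th percentile of response times. As a rough illustration of what such a check involves, here is a Java sketch using the nearest-rank method. It is not how Faban computes its statistics, the names are hypothetical, and treating the limit as a value in seconds is an assumption.

```java
import java.util.Arrays;

/** Sketch: check recorded response times against a 90th percentile limit. */
final class Percentile90 {

    /** Nearest-rank percentile: the smallest sample with at least pct percent of samples at or below it. */
    static double percentile(double[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples recorded");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True when the 90th percentile response time is within the limit (assumed to be in seconds). */
    static boolean passes(double[] responseTimesSeconds, double max90th) {
        return percentile(responseTimesSeconds, 90.0) <= max90th;
    }

    public static void main(String[] args) {
        // Made-up response times in seconds; one slow outlier does not fail the check.
        double[] times = {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.5, 4.0};
        System.out.println(percentile(times, 90.0)); // prints 1.5
        System.out.println(passes(times, 2.0));      // prints true
    }
}
```

A percentile limit is more forgiving of a few outliers than a maximum and more revealing than an average, which can hide a slow tail.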

But for those who think this is too hard, Faban provides the full functionality (almost!) of the driver framework in a simple command-line tool called ‘fhb’ (for Faban Http Bench). Countless benchmarks have now been run using ‘fhb’ and it is the only tool that gives you the power of modeling transitions in Markov chains using a simple XML configuration file. See the tutorial for how to use fhb.
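For readers unfamiliar with Markov chains in this context: the next operation an emulated user performs is chosen at random, with probabilities that depend on the operation just completed. The Java sketch below shows the idea with a hypothetical three-operation workload. It does not show fhb's XML format or Faban's implementation; the operation names and probabilities are invented.

```java
import java.util.Random;

/** Sketch: choose the next operation of an emulated user from a Markov transition matrix. */
final class MarkovWorkload {

    /**
     * transitions[i][j] is the probability of moving from operation i to operation j;
     * each row is expected to sum to 1. u is a uniform random number in [0, 1).
     */
    static int nextState(double[][] transitions, int current, double u) {
        double[] row = transitions[current];
        double cumulative = 0.0;
        for (int next = 0; next < row.length; next++) {
            cumulative += row[next];
            if (u < cumulative) {
                return next;
            }
        }
        return row.length - 1; // guards against rounding when a row sums to slightly less than 1
    }

    public static void main(String[] args) {
        String[] operations = {"Home", "Search", "Checkout"}; // hypothetical operations
        double[][] transitions = {
            {0.0, 0.8, 0.2}, // after Home: mostly Search
            {0.5, 0.3, 0.2}, // after Search: back to Home, search again, or Checkout
            {1.0, 0.0, 0.0}  // after Checkout: always back to Home
        };
        Random rng = new Random();
        int state = 0;
        for (int step = 0; step < 10; step++) {
            System.out.println(operations[state]);
            state = nextState(transitions, state, rng.nextDouble());
        }
    }
}
```

Compared with a fixed script or a flat operation mix, this keeps the sequence of requests plausible, for example a checkout is never followed directly by another checkout.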

With a few ‘fhb’ runs under your belt, the transition to the full-fledged Faban Driver Framework should be much easier, so I hope you will give it a try. Doing quality performance testing is hard work and a good tool is essential if you want to get accurate results.

Did I mention that Faban and fhb are free and open source? And actively supported via a user forum?

