Performance Blog

Archive for the ‘web2.0’ Category

Olio was developed by Sun Microsystems as a way to compare, measure and analyze the performance of various web2.0 technology stacks. We had a great collaboration with the RADLab at UC Berkeley and contributed the project to Apache. However, after the takeover by Oracle, Sun was no longer willing to support the project. Many users continued to find and use Olio, but no one (including big-name companies like VMware, who used it for their own benchmark) was willing to contribute to it. I’ve always felt that open source works only when there is big corporation support, but I digress.

Anyway, I’ve asked for the Apache Olio project to be wound down. For those who may still be interested in using it, I have now copied the repository over to GitHub – feel free to fork it. I have also moved some of the documentation to the wiki.

For anyone considering moving an svn repository to git, git-svn was mostly painless. It preserves the full edit history, which is really great.

Web pages today are becoming increasingly complex, and it is now well recognized that simply measuring page load time does not represent the response time of a page.

But what exactly do we mean by response time? Terms such as time to interactivity, perceived performance, above-the-fold performance, etc. have come into vogue. Let’s examine these in turn.

Time To Interactivity

At last year’s Velocity Conference, Nicholas Zakas of Yahoo! gave an excellent presentation on how the responsiveness of the Yahoo! Front Page was improved, in which he focused on time to interactivity. In other words, when a user can actually interact with a page (say, click on a link) matters more than when the entire page finishes rendering. With pages increasingly containing animated multimedia content that the user may not care about, this definition makes sense. However, imagine a page whose primary purpose is to serve up links to other pages (e.g. news) and that loads lots of images after onload. All the links appear first, which means the user can interact with the site, yet the page has lots of white space where the images go – can we truly just measure the time to interactivity and claim this to be the response time of the page?
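
As a rough illustration (this is my own sketch, not the methodology from the talk), the Navigation Timing API that newer browsers expose lets you compare the point at which the DOM becomes interactive against the full load event:

```typescript
// Sketch: approximate "time to interactivity" vs. full page load using the
// Navigation Timing API. domInteractive is only a crude proxy for real
// interactivity (scripts can still block the UI), but it makes the gap visible.
window.addEventListener("load", () => {
  // Defer one tick so that loadEventEnd has been filled in.
  setTimeout(() => {
    const t = performance.timing;
    const interactive = t.domInteractive - t.navigationStart; // DOM parsed, links clickable
    const fullLoad = t.loadEventEnd - t.navigationStart;      // images, iframes, etc. done
    console.log(`time to interactivity ~${interactive} ms, full load ~${fullLoad} ms`);
  }, 0);
});
```

On an image-heavy page like the news example above, the two numbers can differ by several seconds, which is exactly why quoting only one of them is misleading.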

Perceived Performance

Another popular term, perceived performance, is defined loosely as the time the user perceives it takes for the page to load. This means that all the major elements of the page that the user can easily see must be rendered. By definition, this measurement is highly subjective. Perceptions differ – for example, one user may not miss the Chat pane in Gmail, while for another it is a very important feature. Further, again by definition, this metric is application-dependent. Nevertheless, for a particular web application, it is possible for developers and performance engineers to agree on a definition of perceived performance and work towards measuring and improving it.

Above The Fold Performance

At the Velocity Online conference last week, Jake Brutlag of Google proposed “Above the Fold Time” (AFT) as a way of measuring a more meaningful response time. It is defined as the time taken to render all of the static elements in the visible portion of the browser window. Jake and others have put in some serious thought and defined an algorithm to distinguish between the static and dynamic (e.g. ads) content on a page.

Clearly, one part of the AFT proposal is valid – one doesn’t really care about content that is not visible initially and that the user can only get to by scrolling. But measuring everything above the fold has its issues as well. Take, for instance, the new Yahoo! Mail Beta. Y! Mail now has extensive social features that let one connect to Facebook, Twitter, Messenger and endless third-party applications. It is arguable whether the user expects all of these third-party links to be on the page before perceiving that the request is complete. The page still looks finished without those links.

In my opinion, we need to distinguish between the essential parts of the page and the optional ones (a caveat here – although no one would argue that an ad is essential content, the ad does need to render for the page to look complete). Looking at just static vs. dynamic pixels misses this point. The difficulty, of course, is that there is no uniform way to define what is essential – it once again becomes application- and page-specific.

But for now, that’s the way I am going – defining “perceived performance” on a case-by-case basis.
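
In practice, that means instrumenting the handful of elements the application considers essential and recording when the last of them renders. Here is a minimal sketch of what I mean – the element IDs and the beacon URL are made up, and each application would pick its own list:

```typescript
// Minimal sketch of a case-by-case "perceived performance" measurement.
// "essentialIds" and "/perf-beacon" are hypothetical; each page defines its own
// set of essential elements and its own way to report the number.
const essentialIds = ["headline-list", "lead-photo", "nav-bar"];
const navStart = performance.timing.navigationStart;
let remaining = essentialIds.length;

function markRendered(id: string): void {
  remaining -= 1;
  if (remaining === 0) {
    const perceived = Date.now() - navStart;
    console.log(`perceived render complete in ~${perceived} ms`);
    new Image().src = `/perf-beacon?t=${perceived}`; // report back for aggregation
  }
}

essentialIds.forEach((id) => {
  const el = document.getElementById(id);
  if (el instanceof HTMLImageElement && !el.complete) {
    el.addEventListener("load", () => markRendered(id)); // image still downloading
  } else if (el) {
    markRendered(id); // already rendered by the time this script runs
  }
});
```

The optional pieces (ads, the social links in the Y! Mail example) simply don’t make the list, so they don’t hold the measurement hostage.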

Velocity 2010 came to an end today. I attended all 3 days – it was a great conference. I did not attend last year, but the crowd this year must have been at least 3 times the size of the one in 2008, when I first presented at Velocity. Here are some of my thoughts on the conference.

Performance

Being a performance person, I am naturally biased towards performance topics, so I’ll cover this first. All of the performance sessions at the conference can be summed up thus:

The biggest performance issue is the time it takes for the browser to load a web page (aka page load times). Here is technique x and trick y and hack z to help you fix this problem.

I learned a lot about how to optimize CSS, JavaScript, HTTP headers, etc. But I was still disappointed that there was hardly a whisper about how to optimize the server side. The claim is that of the total response time, the server takes tens or at most hundreds of milliseconds whereas the client takes several seconds – so where do you want to focus your energy? I can accept that. But it seems to presuppose that all web applications have a scalable architecture and have already solved their server-side performance and scalability issues. I find that a little hard to believe.

Operations

As expected, the details of how Facebook, Yahoo and Twitter run their operations were of great interest to the audience. With so much media now being served, I was surprised to see only one session on optimizing video serving, and even that was not well attended. There was hardly any talk about optimizing infrastructure. I can’t help wondering why web operations wouldn’t be interested in optimizing their infrastructure costs. After all, we’ve been hearing a lot lately about the cost of power and about data centers going green and becoming more efficient. Aren’t these things applicable to the web world as well (not just enterprise IT)? Even more surprising, only a very small portion of the audience said they were deployed on the cloud.

Neil Gunther and I presented a session on Hidden Scalability Gotchas in Memcached and Friends.

We had a great audience, with some folks squatting on the floor in front and a standing-room-only crowd in the back. There was tremendous interest in applying the USL model to actual data to quantify scalability. If anyone has additional feedback or comments, I would love to hear them.
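
For anyone who missed the session, the USL models relative capacity as C(N) = N / (1 + α(N−1) + βN(N−1)), where α captures contention (serialization) and β captures coherency (crosstalk) costs. The tiny sketch below uses made-up coefficients just to show why measured throughput can actually drop as you add load:

```typescript
// Universal Scalability Law: relative capacity C(N) at N concurrent users.
// alpha = contention (serialization), beta = coherency (crosstalk) penalty.
// The coefficients below are invented for illustration; in a real analysis
// you fit them to measured throughput data.
function uslCapacity(n: number, alpha: number, beta: number): number {
  return n / (1 + alpha * (n - 1) + beta * n * (n - 1));
}

const alpha = 0.03;   // 3% of the work is serialized
const beta = 0.0006;  // small per-pair coherency cost
for (const n of [1, 2, 4, 8, 16, 32, 64, 128]) {
  console.log(`N=${n}  relative capacity=${uslCapacity(n, alpha, beta).toFixed(1)}`);
}
// The model predicts a peak at N* = sqrt((1 - alpha) / beta), beyond which
// adding load reduces throughput.
console.log(`predicted peak at N ≈ ${Math.round(Math.sqrt((1 - alpha) / beta))}`);
```

With these particular numbers the curve peaks around 40 concurrent users and then bends downward, which is exactly the kind of retrograde scaling we talked about seeing in memcached-style deployments.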

Tools

I was blown away by the plethora of tools, a good many of which I had never heard of. Firebug with its various add-ons (YSlow, PageSpeed) set the trend in browser-side monitoring, and now even commercial vendors have versions of their products (available for free!) that do the same. This is great news for developers. If you haven’t heard of HttpWatch, showslow.com or webpagetest, check them out. DynaTrace announced a free end-user response time monitoring tool as well.

Products

One really cool product I came across was Strangeloop – an appliance that sits in front of your web server and optimizes the response page. It’s amazing that it can do so much JavaScript optimization, resulting in a dramatic reduction in latency. I can’t help wondering why browsers don’t do this – surely Mozilla and Google have enough smart engineers to come up with an optimized JavaScript interpreter. It will be interesting to watch.

The usual monitoring vendors were all there – Keynote, Gomez (now part of Compuware), Webmetrics, AppDynamics etc.

Misc.

Tuesday was billed as “Workshop” day. However, there really weren’t any workshops – they were all just regular sessions, only longer. I guess it’s hard to run workshops with several hundred people in the room. If Velocity really wants to do workshops, they need to schedule at least a dozen of them and they need to be longer.

On the whole, the conference was a great success, with sold-out crowds, well-attended and well-delivered sessions and lots of new products. I hope I can make it to Velocity 2011.

Olio is a web2.0 application that can be used for performance testing of various web infrastructure technologies. The cool thing about Olio is that the same application (a social calendar similar to yahoo.com/upcoming) is implemented in PHP, Rails and Java. Thus it can be used to compare different stacks (to answer questions like “Should I move my PHP app to Rails?”), to compare different software infrastructure components (Apache vs. nginx, say) or to compare different hardware. Olio is currently an Apache Incubator project. You can get an overview of Olio from this presentation.

Release 0.2 was a lot of work. I was the release manager for both Olio releases done so far, and 0.2 went a lot more smoothly (anyone who has done an Apache release knows all the criteria and rules involved!).

The major features of this release are:

  • First release of the Java version. The Java code has been in the svn repository for some time now and has been actively used by VMware and others. Understandably, it was fairly buggy, and a lot of fixes went into creating this release.
  • Re-organization of the file store. Previously, the entire unstructured file store (all the photos and documents) was in a single directory. This caused severe navigation problems at loads of several thousand concurrent users, as the number of files grew very large. The file store now has a hierarchical directory structure.
  • The load generator (driver) has been upgraded to use HttpClient 3.0.1. This eliminates some of the complexity in the driver when dealing with multi-part posts, redirects, etc.
  • The cycle times for the AddEvent and AddPerson operations have been adjusted to result in passing runs in most cases. Previously, the minimum cycle time was defined as 0, which caused problems when the 90th-percentile response time was several seconds: the cycle time curve would keep shifting to the right, making it impossible to meet the requirements. This is a topic for a separate article, so more on it later.
  • Olio simulates browser caches being cleared when new users first log in by re-fetching all static files 40% of the time. Previously, when the simulated cache was warm, the driver issued no requests for static files at all. It turns out that for older apps that do not set the Cache-Control header, the browser does issue a small request for every static file with the If-Modified-Since header. To better reflect such sites, Olio 0.2 now issues these If-Modified-Since requests (see the sketch after this list). This results in substantially more traffic to the server.
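
To make the last point concrete, here is a sketch of what a warm-cache revalidation looks like at the HTTP level. It is illustrative only – the actual Olio driver is Java code on top of HttpClient, and the URL below is made up:

```typescript
// Sketch of a conditional GET for a static file. Instead of skipping the
// request entirely, a warm-cache client revalidates each static resource
// and expects "304 Not Modified" when the file hasn't changed.
async function revalidate(url: string, lastFetched: Date): Promise<void> {
  const res = await fetch(url, {
    headers: { "If-Modified-Since": lastFetched.toUTCString() },
  });
  if (res.status === 304) {
    console.log(`${url}: not modified, cached copy is still good`);
  } else {
    console.log(`${url}: status ${res.status}, body must be re-downloaded`);
  }
}

// Hypothetical static resource, last fetched a day ago.
revalidate("http://olio-server/images/event1.jpg", new Date(Date.now() - 86_400_000));
```

Even though the 304 responses carry no body, every one of them is an extra request the server must handle, which is where the additional traffic in 0.2 comes from.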

Because of these major changes, any performance results obtained using Olio 0.1 should not be compared with those from Olio 0.2.

If you are interested in a realistic web workload to test your LAMP stack, Rails stack or Java EE stack, please do consider downloading Olio.

I was at the Enterprise 2.0 conference on Nov 3 and 4 in San Francisco. This was my first time at this conference (in fact, it was the first time the conference has been held on the West Coast). Although it was in Moscone North, it turned out to be quite small – the conference shared space with VoiceCon and only had 3 parallel sessions in rooms that hold about 100-120 people. The keynotes had a larger audience, which probably included people with expo and other free badges. Next year, the conference is being held in Santa Clara.

On the plus side, it was great to see so many companies with products aimed at the enterprise. Older companies (like BroadVision) were there with new social products, while a whole bunch of new ones (Xwiki, CubeTree) showcased collaboration and analysis tools.

The dominant theme of the conference was still how to convince CXOs and IT of the benefits of web2.0, how to make the business case, how to measure ROI, etc. Although some large enterprises have taken the leap and others have taken baby steps, the bulk of enterprises still look at social media with suspicion. Surprisingly, user adoption ranked high among the barriers to E2.0 (even higher than management and IT resistance). The good news is that Enterprise 2.0 is certainly taking off, according to Andrew McAfee, who gave a keynote on the opening day. Even in the recession, E2.0 spending has grown, with enterprises typically spending anywhere from 500K to a few million. McAfee also said that the CIA has gone social. If a highly secretive, bureaucratic organization like the CIA can do this, what is the excuse for the rest of us?

Microsoft and IBM were dominant – from keynotes to speakers to panelists. And of course on the Expo floor. But even companies that don’t have specific products (HP, Cisco) for E2.0 participated in panels and as speakers. In fact, a good portion of the sessions were focused on technology vendors describing how they have adopted social technologies within their own enterprise.

The more interesting sessions for me were the ones in which other large companies like Kaiser Permanente and Booz Allen Hamilton explained how they made the transition to social communication within their companies. Booz Allen in particular has implemented its own solution using open source technologies, plugging them into its internal data sources (they viewed SharePoint as too document-centric, not conversation-centric).

Some key points that came through for me from the various sessions:

  • User adoption is key. To ensure adoption, the user interfaces must be really easy to use. Single Sign-On (SSO) is key – if a tool does not hook into your current LDAP, don’t use it. You can also aid user adoption with a little bit of push, e.g. send users an email of recent blog posts from their communities, or post profile images/pages/status messages on screens in conspicuous places around the company.
  • Choose technology/tools that allow you to extract the social data. A lot can be gleaned by analyzing such data, although the tools are not there yet. This will be the next wave of innovation in E2.0.
  • Social media for customer support is a no-brainer. It has clear ROI, and this may be the place to first implement E2.0 if you have reluctant management and/or are struggling to show the benefits of the technology. Communities for customer support not only lower the cost of support but also provide a big marketing opportunity to talk to the people who directly use your products. Being proactive and resolving issues before they become a nightmare for customers improves credibility and builds customer loyalty.

Being a performance engineer, I must add that I found absolutely no talk of performance at the conference. It is well known that performance and scalability are always afterthoughts, but it seems incredible that a company with 170,000 users (like Kaiser) wouldn’t be worrying about whether its social infrastructure scales and can handle the volume of traffic if it is really successful and E2.0 takes off. Poor performance can be a huge barrier to adoption as well, but I didn’t hear anyone mention this. Perhaps next year, when there is more adoption and companies really start running into performance problems, we will hear more about it!

