An old IT Warrior bemoans that he hates users telling him that the Web application is slow. You see, a user complaint carries the expectation that IT can do something to fix the problem while saying nothing about what went wrong. To fix the problem, the Warrior would love to ask: “So the application was slow yesterday at around 2pm. Was the application really slow or just perceived to be slow? Which URL was launched? Was the entire page slow or just a particular object? What server was used? Which method call was involved? Which SQL call? Was the problem inside the data center, or was the service provider having a hiccup? What, where, why, when, how, etc., etc.” Just the facts, please. Without these facts, tackling this kind of elusive problem with the revenue clock ticking away (and the business owner having a seizure) is akin to French soccer star Eric Carriere heading a ball…OUCH!
© Copyright Reuters, http://photos.reuters.com
Unfortunately, the chance of our poor Warrior getting any actionable triage information from either the real user or his existing monitoring tools is nil. In fact, Jean-Pierre Garbani of Forrester Research reported that 76% of user-visible problems are missed by existing tools. This surely raises the question of why one should bother with these traditional tools.
When a problem hits, the Warrior, like King Arthur of old, will convene his round table of specialists, including the network guy (or gal), the server guy, the apps developer, and the DBA; the list could go on and on depending on the perceived severity of the problem. Each will bring along monitored data from his silo and argue that it is not his problem. As a result, each problem becomes an inquisition and, more often than not, is ignored until it can be caught the “next time.” But what is the chance that the problem can be caught the next time, or the time after next…or ever? Time wasted, users’ expectations not met, bonus not earned.
The only way to solve the problem is to monitor, in real time, what the real user is experiencing performance-wise. More importantly, any performance problem has to be quickly related back to its source: inside or outside the firewall, which server, which method call, which SQL statement, etc., etc. I will go deeper into this thread in future posts.


