Tue, 15 May 2012
Monitoring and Testability
At UDS last week there was another "Testing in Ubuntu" session. During the event I gave a brief presentation on monitoring and testability. The thesis was that there are so many parallels between monitoring and testing that it's worth thinking of monitoring as a type of testing at times. Because of that, great monitoring requires a testable system, and it pays to think about monitoring right at the start so that you build a system that is monitorable as well as testable.
You can watch a video of the talk here. (Thanks to the video team for recording it and getting it online quickly.)
I have two main questions. Firstly, what are the conventional names for the "passive" and "active" monitoring that I describe? Secondly, do you agree with me about monitoring?
'Active' is what most folk mean when they talk about monitoring of a site.
I do agree that monitoring is as important a part of the delivery of a product as a unit/functional test suite. There are some very effective tools for writing such monitoring as behavioural tests using cucumber.
Posted by Robert Collins at Wed May 16 01:29:21 2012
It's interesting that most people talk about 'active' monitoring, because most of the services I have seen only support it in a very shallow way, e.g. checking for 200 response codes on a few pages. It's good that it's considered the ideal though :-)
Thanks for the reference to cucumber, I'll check it out.
Posted by James Westby at Wed May 16 02:59:27 2012
On the automation side:
- gathering the data in the first place
- presenting it sensibly (e.g. 80% of folk that start an order fail to complete, vs 10 people fail to complete)
- adding exit interviews (you're cancelling the process, can you tell us why?)
On the experimentation side, run a number of tests (separately or concurrently) to see what makes people more or less likely to complete an order. Things like page design, prose, price, discounts, bundles, should all be part of that.
That may seem like a digression, but it actually ties right back to actionability: if your normal situation is (say) that 20% of folk who start an order fail to complete it, you can alert when that rises to 25%, and then start tweaking in realtime - you can observe after quite a short window (minutes if you have enough folk landing on the page) whether a particular test helped or hindered.
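As a rough illustration of the alerting rule described above, here is a minimal Python sketch. The function names, the 25% threshold default, and the minimum sample size are all assumptions made for the example, not part of any particular monitoring tool.

```python
def abandonment_rate(started, completed):
    """Fraction of started orders that were not completed."""
    if started == 0:
        return 0.0
    return (started - completed) / started


def should_alert(started, completed, threshold=0.25, min_sample=100):
    """Return True when the abandonment rate reaches the threshold.

    threshold and min_sample are illustrative values: 0.25 matches the
    "rises to 25%" example, and min_sample avoids alerting on a window
    with too few orders to say anything meaningful.
    """
    if started < min_sample:
        return False
    return abandonment_rate(started, completed) >= threshold
```

With these defaults, a window of 200 started orders and 160 completed (20% abandoned) stays quiet, while 200 started and 150 completed (25% abandoned) would alert.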
Posted by Robert Collins at Wed May 16 03:08:02 2012
When I referenced purchases in the talk I meant things like alerting if a purchase still fails after 3 retries because the purchasing service is down. That's certainly something that I would want to be alerted about, and it isn't related to the user's intent, so it will be low on false positives.
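To make that concrete, here is a small Python sketch of "alert only once the retries are exhausted". Everything named here (PurchaseServiceDown, purchase_with_retries, the attempt_purchase and alert callables) is hypothetical, and it assumes "3 retries" means one initial attempt plus three more.

```python
class PurchaseServiceDown(Exception):
    """Hypothetical error: the purchasing service could not be reached."""


def purchase_with_retries(attempt_purchase, alert, max_retries=3):
    """Run attempt_purchase, retrying when the service is down.

    attempt_purchase and alert are caller-supplied callables. An alert
    is sent only after the initial attempt and every retry have failed,
    so a user simply abandoning an order never triggers it.
    """
    last_error = None
    for _ in range(1 + max_retries):  # initial attempt plus the retries
        try:
            return attempt_purchase()
        except PurchaseServiceDown as exc:
            last_error = exc
    alert(f"purchase failed after {max_retries} retries: {last_error}")
    raise last_error
```

A real version would probably wait between attempts; that is left out here to keep the sketch short.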
Posted by James Westby at Wed May 16 03:10:50 2012
Posted by John A Meinel at Wed May 16 09:30:04 2012