Wed 7 Feb 2007
Split Testing - Be Sure to Triple Check Validity Factors!
Posted by admin under Analytics, Testing, Analysis, A/B Testing, Blog
Aaron Wall of www.seobook.com posted his findings on an AdWords split test that may baffle some.
I'm not sure what led Aaron to do this, but he decided to run two identical ads with the same title, copy, and display URL. He went on to clarify that the ads ran at the same time, on Google search delivery only, with no content targeting. He posted a screenshot of the ads with the Google reporting results. Visually, everything is identical.
The outcome, in my opinion, raises the "stink factor".
Ad 1 was served 50.5% of the time, had 316 impressions, 12 clicks @ a 3.79% CTR.
Ad 2 was served 49.5% of the time, had 310 impressions, 4 clicks @ a 1.29% CTR.
There are four primary threats that affect test validity:
- History Effects
- Instrumentation Effects
- Selection Effects
- Sampling Distortion Effects
My Findings:
- History - both ads ran at the same time.
- Instrumentation - both ads ran at the same time, using the same reporting metrics.
- Selection - on the surface, it looks like the ads match the profile of the test subjects. I suspect that selection may be distorted due to broad, phrase, or exact matching for the keyword used. The use or lack of negative keywords may also have an impact on selection effects.
- Sampling - based on the information provided, sample size distortion does not appear to be a factor. A sample size of 626 impressions is sufficient to produce a 95% confidence level.
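One way to sanity-check the sampling point is to run the reported numbers through a significance test. The sketch below is my own illustration, not something Aaron or Google published: it assumes a pooled two-proportion z-test is an acceptable approximation, and the function name `two_proportion_z_test` is hypothetical. The click and impression counts are the ones quoted above.

```python
import math


def two_proportion_z_test(clicks_a, impressions_a, clicks_b, impressions_b):
    """Return (z, two_sided_p) for the difference between two click-through rates.

    Assumption: a pooled two-proportion z-test (normal approximation).
    """
    ctr_a = clicks_a / impressions_a
    ctr_b = clicks_b / impressions_b

    # Pooled CTR under the null hypothesis that both ads perform the same.
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)

    # Standard error of the difference between the two CTRs.
    se = math.sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))

    z = (ctr_a - ctr_b) / se

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Figures from the post: Ad 1 had 12 clicks on 316 impressions,
# Ad 2 had 4 clicks on 310 impressions.
z, p_value = two_proportion_z_test(12, 316, 4, 310)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the p-value lands just under 0.05, so the gap only barely clears a 95% threshold. With just 4 clicks on one ad, the normal approximation is rough, and an exact test may well be more conservative, so I would treat this as a borderline result rather than proof.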
What are your suspicions? Do you think that over time, extending the sample size will even things out, or compound the distortion?
