As we’ve discussed in previous articles, running experiments is the best way to get the data you need to boost e-commerce sales and revenue. Data collected from experiments can be fed into a machine learning algorithm to predict the outcomes of future tests so you can concentrate your efforts in the right areas.
To be useful, though, experiments must be run properly. Here are 10 common mistakes that can render your experiments ineffective or even counterproductive.
1. Not knowing your platform or setting up experiments properly. This is crucial to getting your experiments off to the right start. For example, make sure the code is on all the right pages and in all the right places. If the code sits too far down the page and loads late, it can cause a flicker that negatively affects the client journey. And if the code isn’t on every page being tested, you could miss critical steps in the client journey.
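To make the flicker point concrete, here is a minimal sketch of the anti-flicker pattern many testing setups rely on: hide the page until the experiment script has applied its changes, with a safety timeout so a slow or failed load never leaves the page blank. The event name, timeout and class are illustrative assumptions, not any specific platform’s snippet.

```ts
// Minimal anti-flicker sketch (assumed pattern, not a specific platform's snippet).
const ANTI_FLICKER_TIMEOUT_MS = 2000; // assumption: tune to your load budget

function hidePageUntilExperimentReady(): void {
  // Hide the page as early as possible in the <head>.
  const style = document.createElement("style");
  style.id = "anti-flicker";
  style.textContent = "body { opacity: 0 !important; }";
  document.head.appendChild(style);

  const reveal = () => document.getElementById("anti-flicker")?.remove();

  // Reveal when the testing platform signals readiness (hypothetical event name),
  // or after the timeout, whichever comes first.
  window.addEventListener("experiment:ready", reveal, { once: true });
  window.setTimeout(reveal, ANTI_FLICKER_TIMEOUT_MS);
}

hidePageUntilExperimentReady();
```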
We also recommend implementing integrations with third-party analytics tools like Google Analytics. These can open up a whole new world of data beyond what most testing platforms offer, which is usually limited to basic e-commerce tracking: metrics like which pages visitors landed on, where they arrived from and on which visit they made a purchase. This is data you might not otherwise have thought to capture.
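As a rough illustration of what such an integration can look like, the sketch below forwards experiment exposure to Google Analytics 4 as a custom event via gtag. The event name and parameter keys are our own conventions, not a GA standard.

```ts
// Sketch: sending experiment exposure to Google Analytics 4 as a custom event.
declare function gtag(...args: unknown[]): void; // provided by the GA snippet on the page

function trackExperimentExposure(experimentId: string, variantId: string): void {
  gtag("event", "experiment_exposure", {
    experiment_id: experimentId, // e.g. "checkout-cta-test" (hypothetical name)
    variant_id: variantId,       // e.g. "control" or "variation-1"
  });
}

// Call once when the visitor is bucketed into a variant.
trackExperimentExposure("checkout-cta-test", "variation-1");
```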
We recommend running A/A tests at this point. They let you validate the data being collected in your experiments, so you can be sure you’re not making decisions based on bad data, and confirm that your experiments aren’t causing performance issues that slow down page loads.
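One simple way to sanity-check an A/A test is a two-proportion z-test: since both groups see the identical experience, the difference in conversion rate should not be statistically significant, and a significant result points to bucketing or tracking problems. A minimal sketch, with purely illustrative numbers:

```ts
// Sketch: two-proportion z-test for an A/A test.
interface GroupStats {
  visitors: number;
  conversions: number;
}

function twoProportionZ(a: GroupStats, b: GroupStats): number {
  const pA = a.conversions / a.visitors;
  const pB = b.conversions / b.visitors;
  const pooled = (a.conversions + b.conversions) / (a.visitors + b.visitors);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / a.visitors + 1 / b.visitors));
  return (pA - pB) / se;
}

// Illustrative numbers only.
const z = twoProportionZ(
  { visitors: 10_000, conversions: 310 },
  { visitors: 10_050, conversions: 298 },
);
// |z| > 1.96 (~95% confidence) in an A/A test is a red flag worth investigating.
console.log(Math.abs(z) > 1.96 ? "Investigate data quality" : "A/A looks healthy");
```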
2. Not planning your experiments properly. Maybe you have a hypothesis: you’ve seen X and believe that if you change Y, then Z will happen. But what specific changes will you make to test this? What do you want your experiment to look like, and which pages will be involved? What’s the goal of your experiment, and what will success look like? Map out every customer experience and set measurable goals.
Also, don’t get stuck in a desktop-first mindset when planning experiments, because most e-commerce traffic comes from mobile devices. Think about how users will interact with your site and how your code will affect their ability to complete different tasks.
3. Not setting goals as part of your build. Most testing platforms have some goals built in, like sales and revenue, but you might want to learn more than this, like how users interact with search. In this case, you wouldn’t need to track all users, but just those who use the search function. The conversion rates of these users would be much more relevant and targeted.
Here’s another way to put this: Be sure you can track the things you want to track. Build the events into the user experience of the control, not just the variation, by using a pseudo control to add more tracking to the site while the experiment is running.
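For illustration, here is a minimal sketch of that kind of event tracking for search users. The trackEvent function stands in for whatever your testing platform or analytics tool exposes, and the event name and selectors are assumptions.

```ts
// Sketch: tracking a search-usage event so conversion can be segmented to searchers only.
declare function trackEvent(name: string, data?: Record<string, unknown>): void; // hypothetical hook

function instrumentSearch(form: HTMLFormElement): void {
  form.addEventListener("submit", () => {
    const query = (form.querySelector<HTMLInputElement>("input[type=search]")?.value ?? "").trim();
    trackEvent("site_search_used", { query_length: query.length });
  });
}

// Attach in both the control (via a pseudo control) and the variation,
// so both groups are measured the same way.
const searchForm = document.querySelector<HTMLFormElement>("form#search");
if (searchForm) instrumentSearch(searchForm);
```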
4. Not pre-reporting. Ideally, you should build reports before running experiments. This forces you to identify the questions you want to answer before you start experimenting and to make sure you can answer them. The last thing you want is to finish a four-week experiment and then wish you had tracked something or asked a question you didn’t. Building your reports in advance and hypothesising about the possible outcomes allows you to cover those eventualities.
5. Relying on your developer for QA. Your developer will do the initial QA, but you need a dedicated QA resource or team to review the briefs and make sure the experiments work properly and test what they’re supposed to test. Perform QA, at least once if not twice, across every browser and device on which the experiments will run.
Remember: While experiments enable you to make site improvements, they can also break things. QA is crucial before launching an experiment to make sure everything is working properly. Use the tools you have available, such as Google Analytics, to identify your key devices and browsers.
6. Not following activation protocols. Common protocols include not pushing website changes or experiments live on a Friday afternoon. If there are problems, they might not be discovered until Monday, which could mean two or three days of poor site performance and bad data. For the same reason, don’t make changes or launch experiments after 3 p.m.
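If you want to enforce such a protocol rather than just document it, a launch-window guard can encode the rule. A minimal sketch, assuming a 3 p.m. cut-off, no weekend launches and no Friday-afternoon launches; adjust to your own protocol and time zone.

```ts
// Sketch: a launch-window guard encoding an assumed activation protocol.
function isSafeToLaunch(now: Date = new Date()): boolean {
  const day = now.getDay();   // 0 = Sunday ... 5 = Friday, 6 = Saturday
  const hour = now.getHours();

  const isWeekend = day === 0 || day === 6;
  const isFridayAfternoon = day === 5 && hour >= 12;
  const isAfterCutOff = hour >= 15; // 3 p.m. cut-off on any day

  return !isWeekend && !isFridayAfternoon && !isAfterCutOff;
}

if (!isSafeToLaunch()) {
  console.warn("Outside the activation window: schedule this launch for the next working morning.");
}
```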
7. Not establishing emergency protocols. Mistakes happen. When they do, you need a system for making sure that the right people can be notified to fix them. We use a traffic light system in which:
Green means there’s a problem but it isn’t stopping users from doing what they need to do, so immediate action isn’t necessary, but we should keep an eye on it.
Amber means there’s a problem that’s affecting users. The experiment doesn’t need to be turned off, but the problem should be identified and fixed as soon as possible.
Red means there’s a serious problem that is stopping users from doing what they need to do. The experiment should be turned off immediately and not restarted until the problem is fixed.
We use WhatsApp groups containing the key stakeholders for each client so that, depending on severity, all relevant parties can be contacted at any time of the day or night and serious problems aren’t left live for long.
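A simple way to make this protocol operational is to model the three severities explicitly and route each one to a different action. The sketch below assumes hypothetical notifyStakeholders and pauseExperiment hooks standing in for your alerting channel (such as a WhatsApp group webhook) and your testing platform’s API.

```ts
// Sketch: traffic-light severity routing for experiment incidents.
type Severity = "green" | "amber" | "red";

declare function notifyStakeholders(message: string): void;   // hypothetical alerting hook
declare function pauseExperiment(experimentId: string): void; // hypothetical platform call

function handleIncident(experimentId: string, severity: Severity, detail: string): void {
  switch (severity) {
    case "green": // not blocking users: monitor, no immediate action
      console.log(`[monitor] ${experimentId}: ${detail}`);
      break;
    case "amber": // affecting users: alert and fix as soon as possible, keep the experiment live
      notifyStakeholders(`Amber on ${experimentId}: ${detail}`);
      break;
    case "red":   // blocking users: alert and switch the experiment off immediately
      notifyStakeholders(`RED on ${experimentId}: ${detail}`);
      pauseExperiment(experimentId);
      break;
  }
}
```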
8. Not performing checks after experiments go live. Almost everyone previews experiments before they go live, but you can’t trust a preview. You need to see the experiment live on all devices and browsers to make sure everything is running exactly like it’s supposed to.
We added a step to every experience: the person who QA’d it must also sign it off after it has gone live, taking screenshots to verify that it matches the experience that passed QA.
9. Not checking the data regularly. It’s true that you can go overboard by looking at the data too often, especially early in an experiment when it can fluctuate wildly. But this settles down over time if you have enough volume. As a general rule, check the data at least once a day, looking for anomalies and anything else that doesn’t look right.
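One daily check that is easy to automate is a sample-ratio check: if traffic is meant to be split 50/50 but the visitor counts have drifted well apart, bucketing or tracking is probably broken. A minimal sketch, assuming a 50/50 split and a 3-sigma threshold, with illustrative numbers:

```ts
// Sketch: daily sample-ratio check for a two-variant experiment.
function sampleRatioLooksHealthy(controlVisitors: number, variantVisitors: number): boolean {
  const total = controlVisitors + variantVisitors;
  const expected = total / 2;              // assumes a 50/50 split
  const sd = Math.sqrt(total * 0.5 * 0.5); // binomial standard deviation
  return Math.abs(controlVisitors - expected) <= 3 * sd;
}

// Illustrative numbers only.
if (!sampleRatioLooksHealthy(10_420, 9_180)) {
  console.warn("Sample ratio mismatch: investigate targeting, redirects or tracking.");
}
```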
Site updates, content changes and new releases all affect the code on a page. If an experience is live on that page, those changes could affect its performance.
10. Not knowing when to stop your experiments. There are certain milestones to hit before you can trust a result, such as covering multiple business cycles and reaching the sample size needed for statistical significance. But in some instances it can be wise to stop an experiment early. If results have been consistent (either positive or negative) after two weeks, for example, they aren’t likely to change much within the next two weeks. And if they do, you would be wise to question the data and why it has fluctuated so much. You can also lose a lot of money in two weeks.
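To put a rough number on the sample size needed, the standard two-proportion formula gives an approximate requirement per variant. The baseline conversion rate, detectable lift, confidence and power below are illustrative assumptions, not recommendations.

```ts
// Sketch: approximate required sample size per variant for a two-variant test,
// assuming 95% confidence (two-sided) and 80% power.
function requiredSamplePerVariant(baselineRate: number, minDetectableLift: number): number {
  const zAlpha = 1.96; // 95% confidence
  const zBeta = 0.84;  // 80% power
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + minDetectableLift);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p1 - p2) ** 2);
}

// e.g. a 3% baseline conversion rate and a 10% relative lift you care about detecting.
console.log(requiredSamplePerVariant(0.03, 0.10), "visitors per variant (approximate)");
```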
An optimisation agency can help you avoid costly mistakes like these and get the most out of your e-commerce experiments.