Leading A/B testing to boost app downloads | Product management

At what3words, the product team was eager to build a culture of experimentation in the company and win more buy-in for incremental, A/B-test-driven product development. One context I had previous experience with was the Google Play Store, which has a dedicated console for running experiments on an app's store listing. As no one was managing this at the time, and it carried far less risk than dedicating development time to potentially unsuccessful experiments in our products, I got buy-in to set up a group to carry out App Store Optimisation (ASO).

I led a group of 7 people from different teams, managing and executing a testing strategy and reporting back to the whole company regularly.

Our goals included:
(a) Increase the % of people visiting our app store listings who then downloaded our app (i.e. app store conversion rate).
(b) Educate the company about testing along the way, getting people excited to try experimenting in other contexts.
(c) Foster better cross-functional collaboration across teams, in particular between product and marketing.

Company: what3words
Date: Throughout 2020
Based in: London, UK

Assembling a cross-functional team

I chose individuals from several key teams strategically: a mixture of people who were sceptical about testing, and longer-tenured people who held more influence across the company. I saw both groups as key to building a culture of learning through experimenting.

We had a workshop where I shared my previous ASO experience with the group, and we critiqued some existing listings to get a feel for the kinds of things each of us would be looking out for.

We then assigned each individual a key area of responsibility and ownership. While any of us could suggest ideas for another's area, the final decision rested with the individual accountable for it, as the expert in that area.

Oh, and on Slack, each of us had an associated power ranger emoji. I was the green ranger.

What can we test?

Next up, we agreed as a group on what we wanted to test. From experience and best practice, we knew which parts of the listing would have the highest impact, but screenshots in particular would require more work.

We put together a backlog of ideas, which I sized, and each week we held our own mini-sprint planning to decide what to pull from the backlog next.

Additionally, we had 5 different testing slots available at any one time. We were keen to test one change at a time incrementally, keeping the best variant in each case, so we could properly learn what worked and what didn't. However, with listings in over 30 languages, we could run tests in multiple languages at once, which is why localisation became key to our strategy.

For each test, our Growth Manager calculated roughly how long it would take to reach statistical significance.
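For intuition, here's a minimal sketch of that kind of back-of-envelope estimate in Python, using the standard two-proportion sample-size formula. The function name, parameters, and traffic figures are all illustrative assumptions, not our actual tooling:

```python
from math import ceil, sqrt
from statistics import NormalDist

def days_to_significance(baseline_cr, relative_lift, daily_visitors,
                         alpha=0.05, power=0.80):
    """Estimate how many days a two-variant conversion test needs to
    reach significance, via the standard two-proportion sample-size
    formula. Names and defaults here are illustrative only."""
    p1 = baseline_cr                         # control conversion rate
    p2 = baseline_cr * (1 + relative_lift)   # variant rate we hope to detect
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n_per_variant = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                      + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
                     / (p2 - p1) ** 2)
    total_visitors = 2 * ceil(n_per_variant)  # control + variant
    return total_visitors / daily_visitors

# e.g. a 25% baseline conversion rate, hoping to detect a 10% relative
# lift, on a listing receiving 1,000 store visits per day:
print(f"~{days_to_significance(0.25, 0.10, 1000):.0f} days")  # ~10 days
```

Note how strongly the runtime depends on the size of the lift you want to detect: smaller expected effects need far more visitors, which is one reason lower-traffic listings favour testing bolder changes.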

Developing hyper-localised listings

As we began experimenting across languages, we found – unsurprisingly – that different things worked well in different markets. Initially, we had tested the same things across our core markets (UK, US, Germany, Japan, and India). However, our localisation manager reached out to our network of local translators and came back with culturally specific test suggestions for each listing separately.

So, we began producing hyper-localised listings: for each of our core markets, we ran different tests, with different visuals and messaging, according to what our local experts recommended. Below you can see how this differentiation looked in the early stages…

Unique branding per market

I want to call out our Japanese listing in particular. When we started testing, our conversion rate was below 25%, meaning less than 1 in 4 people in Japan visiting our listing downloaded the app. This was significantly lower than other markets.

Our resident localisation expert had heard about a technique called "sourceless transcreation": rather than translating our English source messaging or imagery, we would give a translation specialist all the same information we had used to create the original English listing, and ask them to create a listing specifically for their market.

For Japan, this led to a radically different approach, with highly differentiated messaging, visuals, and even a different layout and colour scheme reflecting local digital norms and cultural associations. This new listing alone more than doubled our conversion rate in Japan, and we went on to replicate the approach with great success in our other markets.

(And yes, that's one of the Olympic stadiums in the second pic, as it was 2020 and there was a lot of buzz about the Olympics in Tokyo despite COVID delaying everything.)

A huge boost in conversion

After a year, we had achieved an outstanding increase in conversion across all core markets, representing hundreds of thousands of additional app downloads per year with no additional ad spend. In each market, our conversion rate beat 90% of similar apps in our selected comparison cohort, and overall traffic to our listings increased too.

I also gave monthly progress updates at our company all-hands, which became a highlight: people shared ideas, guessed which variants would win, and learned about testing. During the year, our findings inspired marketing campaigns and opened the door to running tests in our mobile app's onboarding.

The ASO task force continues today at what3words!