I can imagine a world where every game producer and product owner never reads anything related to their expertise except one article, and they're still better professionals, on average, than those in our reality. It's possible if that one article is this:
The A/B Testing Playbook For Mobile Game Growth, Part 1: Structuring the Experiment
But there is one important thing that article lacks: what to test?!
Of course, the correct answer depends on the specific product — but not that much. I believe there's a what-to-test list that applies to most games regardless of genre, and most of its entries have the potential for high impact. Here it is:
Tests you might run at the just-released-MVP, soft-launch, or early-scale stages.
- Forced Tutorial Duration. Let's skip the basics: we all know that longer tutorials perform better and that the forced part is necessary. The main question is the optimal duration of the forced part of the tutorial, and that's what you should test in most games. Default numbers that fit a lot of genres: from 15 to 60 minutes.
- Progress Speed. One of the common answers to “How could we increase engagement?” sounds like “let's show them a lot of new content and drag them through progression quickly so they won't get bored”. I used this logic a lot too (professional deformation, probably), but sometimes got counter-intuitive results: players actually preferred slower progress and fewer new mechanics over time. And of course it matters most in the first raw hours of gameplay.
The thing is, the forced part is usually mixed with contextual tutorials, and in most genres, while the game still forces you through its tutorial funnel, there will also be moments of raw gameplay where you have some freedom of action. So we can measure not only the duration but also the force level of the tutorial, where the maximum means “tap here and nowhere else”.
But to keep it simple, I consider a tutorial forced even if there is relative freedom of action within the current activity, as long as the player can't freely shift between available activities (modes/regimes/gameplays) and the current step still inevitably leads to the next one, regardless of the player's actions or results (which are either pre-scripted or irrelevant).
So, the perfect FTUE tests should target Forced Tutorial Duration and Progress Speed over the first hours of gameplay. Obviously, control-version parameters should be based on the most relevant competitor (probably adjusted to the amount of available content), and variants should be something like x1.5 and x0.75 of the control value. Specific test configurations will vary based on how much traffic you can afford, but in terms of the above-mentioned article, it's best to run them as Blended Stacked Testing.
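The variant setup above can be sketched as a tiny helper. This is a minimal illustration, not a real SDK: the control values and parameter names are my assumptions, and in practice they'd come from your remote config.

```python
# Hypothetical sketch: deriving x0.75 / x1.5 test variants for the two FTUE
# parameters. Control values here are placeholders; base yours on the
# closest competitor, as described above.

CONTROL = {
    "forced_tutorial_minutes": 30,   # within the 15-60 min default range
    "progress_speed_multiplier": 1.0,
}

def make_variants(control, factors=(0.75, 1.5)):
    """Scale every numeric control parameter by each factor."""
    variants = {}
    for f in factors:
        variants[f"x{f}"] = {k: round(v * f, 2) for k, v in control.items()}
    return variants

variants = make_variants(CONTROL)
# variants["x0.75"]["forced_tutorial_minutes"] -> 22.5
# variants["x1.5"]["forced_tutorial_minutes"]  -> 45.0
```

Whether you scale both parameters together or test them separately depends on how much traffic you can afford, per the stacking approach mentioned above.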
In my experience, this might increase D1 by at most ~15% (percent, not points), but only about half of that shows up in expected cohort revenue (probably because the core audience doesn't give a shit about FTUE quality). Not much, but not nothing, and relatively cheap.
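The percent-vs-points distinction matters for setting expectations, so here is the arithmetic spelled out (the 40% control D1 is an assumed example, not a benchmark):

```python
# "+15% to D1" means relative lift, not percentage points.
control_d1 = 0.40                                # assumed control D1 retention
relative_lift = 0.15
variant_d1 = control_d1 * (1 + relative_lift)    # ~0.46, i.e. 46%
points_gained = variant_d1 - control_d1          # ~0.06 -> +6 points, not +15
```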
Probably every game has not only a converting offer (a “starter pack” or similar) that is supposed to be a potential payer's first purchase, but also a well-scripted roadmap that leads players to the point where this offer seems like the most desired purchase. Approaches vary: in some games it's the point where the player loses several times in a row, progress slows down, or energy runs out; sometimes it's “Use It, Then Lose It” + “Then Buy It”; sometimes it's just showing an interstitial and offering “no ads” right after. Anyway, that's what I call the converting point, and there's a lot to test here.
- Time to pressure. Don't confuse it with time to show the offer: that usually comes before the game has actually created a need or clearly shown the offer's value to the player. In other words, we're talking about the time before the first pressure. The optimum varies but usually lies between 20 and 120 net minutes from the start.
- Pressure level and type. Modern games don't use classic hard paywalls in the early game, but all sorts of “soft paywalls” are at your disposal (some were mentioned above), so you can apply different levels of pressure, making your converting point more or less soft for the player.
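The two test dimensions above can be captured in a single variant config. A minimal sketch, assuming a simple session-time trigger; all names, pressure types, and values are hypothetical:

```python
# Parametrizing the converting point so "time to pressure" and
# "pressure level/type" can be A/B-tested together.
from dataclasses import dataclass

@dataclass
class ConvertingPoint:
    time_to_pressure_min: int   # net minutes of play before first pressure
    pressure_level: int         # 0 = none ... 3 = close to a hard paywall
    pressure_type: str          # e.g. "energy_gate", "loss_streak", "no_ads_offer"

VARIANTS = {
    "control": ConvertingPoint(60, 1, "energy_gate"),
    "A":       ConvertingPoint(20, 2, "loss_streak"),
    "B":       ConvertingPoint(120, 1, "energy_gate"),
}

def should_trigger_offer(cfg, net_minutes_played, pressure_shown):
    # The starter pack appears only after the pressure moment, never before:
    # showing it earlier is exactly the mistake warned about above.
    return pressure_shown and net_minutes_played >= cfg.time_to_pressure_min
```

The point of the guard is that the offer follows the need, not the other way around.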
And despite the almost infinite combinations of parameters you could vary in the test, they can easily be reduced to a few with the highest chances of winning. I'll give you examples in the next article.
What you're actually selling (the offer's content) is truly important, but that's not where tests will tell you anything beyond well-known practices and common sense. Yes, the offer's value should be clear to players, it should solve their current needs, and it's always better to offer some persistent/constant value (not only resources/energy), even if it's just a decoration.
The same goes for standard offer mechanics: time limitations, high discounts, visibility from the main screen, triggers for pop-ups, a “last chance” plus cancel confirmation, etc. Yes, you need all of those from the start.
And the last thing: there is no harm in running a lot of concurrent offers even in the early game. Don't be afraid to look intrusive; at worst, you'll get 0.1% more bad reviews from those who would never pay for anything anyway.
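The standard mechanics listed above map naturally onto a single offer config. A sketch with made-up field names and values, just to show them as testable parameters:

```python
# Hypothetical offer config: each standard mechanic becomes a parameter
# you can tweak per variant. Names and defaults are illustrative only.
from dataclasses import dataclass, field

@dataclass
class OfferConfig:
    offer_id: str
    duration_hours: int                   # time limitation
    discount_percent: int                 # advertised discount
    main_screen_badge: bool               # visibility from the main screen
    popup_triggers: list = field(default_factory=list)  # e.g. ["session_start"]
    last_chance_minutes: int = 30         # "last chance" window before expiry
    confirm_on_cancel: bool = True        # cancel-confirmation dialog

starter = OfferConfig(
    "starter_pack",
    duration_hours=48,
    discount_percent=80,
    main_screen_badge=True,
    popup_triggers=["session_start"],
)
```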
The best result I've gotten from these experiments was about +15% in net revenue / LTV; but to be honest, such a high impact was reached primarily because the test ran against an obviously weak variant (an old-fashioned paywall that cut the audience off from further payments).
In many kinds of games you have something like time-replenished energy, or passive income, or reward timers (e.g., time to open chests), or daily missions, etc. All of these and similar mechanics create a type of gameplay that can be divided into four phases:
1.1. useful (active): you get the fastest progress over time by spending those resources (rewards, energy) before they run out
1.2. wasted (active): gameplay is available, but progress is significantly slowed due to the lack of whatever was spent in the useful phase
2.1. useful (inactive): you're away from the screen, but energy is replenishing, timers are ticking, income is generating, etc.
2.2. wasted (inactive): the important timers have already completed, so staying away earns you little additional value
Sure, it's not a universal description, and not every game actually fits this model, but a lot of them do. And if yours does, there is something to test.
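The four phases can be made concrete with a toy classifier. This assumes a simple time-replenished energy system; the function and names are mine, purely for illustration:

```python
# Toy classifier for the four phases described above, assuming a
# time-replenished energy mechanic (values and names are illustrative).

def classify_phase(active: bool, energy: int, energy_cap: int) -> str:
    if active:
        # Spending energy while you still have it = fastest progress.
        return "useful-active" if energy > 0 else "wasted-active"
    # Away from the screen: replenishing is useful only until the cap is hit.
    return "useful-inactive" if energy < energy_cap else "wasted-inactive"

assert classify_phase(True, 5, 30) == "useful-active"     # phase 1.1
assert classify_phase(True, 0, 30) == "wasted-active"     # phase 1.2
assert classify_phase(False, 10, 30) == "useful-inactive" # phase 2.1
assert classify_phase(False, 30, 30) == "wasted-inactive" # phase 2.2
```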
First, forget about ‘wasted active’: there is nothing to test, because it's always better to have an option of unlimited time-killing activities that won't affect your economy too much yet allow you to retain players as much as possible. Only the 'useful' phases are test-worthy, and varying each between ‘short’ and ‘long’ gives us four combinations:
- short useful-active / short useful-inactive
- short useful-active / long useful-inactive
- long useful-active / short useful-inactive
- long useful-active / long useful-inactive
Of course, ‘short’ and ‘long’ are relative measures based on default values from your closest competitors. Intensity modifications affect everything up to the late game, so the best time to run them is before you build the mid-game.
'Intensity’ tests delivered me less than 10% (in net revenue), yet there was always at least one variant better than the control, and that's good enough considering these tests are very cheap. I mean, you already have a remote balance config where you can create those variants by just tweaking a few parameters (you have one, right?)
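Since these variants are just a few overrides on top of the base balance config, the whole setup is small. A sketch of what "tweaking a few parameters" might look like; all keys and values are invented for illustration:

```python
# 'Intensity' variants as overrides on a remote balance config.
# Parameter names and numbers are placeholders, not recommendations.

BASE_CONFIG = {
    "energy_cap": 30,
    "energy_regen_minutes": 6,     # one energy point per 6 minutes
    "chest_open_hours": 4,
}

def apply_overrides(base, overrides):
    """Return a copy of the base config with variant overrides applied."""
    cfg = dict(base)
    cfg.update(overrides)
    return cfg

VARIANTS = {
    "control":        {},
    "high_intensity": {"energy_regen_minutes": 4, "chest_open_hours": 3},
    "low_intensity":  {"energy_regen_minutes": 8, "chest_open_hours": 6},
}

configs = {name: apply_overrides(BASE_CONFIG, ov) for name, ov in VARIANTS.items()}
```

Keeping variants as sparse overrides (rather than full config copies) makes it obvious exactly what each arm changes.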
OMG, long read again 😣
In the next episodes:
- LiveOps Stage
- Events: as tools and targets
- Balance curve, win-costs, etc
- Never-ending optimization
- Test It Before You Make It
- Tricks for testing on small samples