## What you’ll measure
Test the order and pacing of the first three onboarding steps. A common pattern in mid-core games:

| Variant | Step 1 | Step 2 | Step 3 | Hypothesis |
|---|---|---|---|---|
| Control | Tutorial battle | Account binding prompt | Reward claim | Current sequence. |
| Variant A | Tutorial battle | Reward claim | Account binding prompt | Showing reward before friction lifts D1 retention. |
| Variant B | Reward claim | Tutorial battle | Account binding prompt | Frontloading the reward and deferring everything else maximises early delight. |
Measure two things: a primary metric of day-1 retention, and a secondary metric of tutorial completion (a `tutorial_complete` backend event within 1 hour of exposure).
## Pre-flight check
Confirm your backend serves the onboarding sequence as a config endpoint, not as hardcoded logic in the binary. Open the Fleack backoffice, navigate to Endpoints, and find the endpoint your app calls at first launch. Check:

- Classification: must read `config-candidate`. If it reads `user-data`, the response varies per user — check whether you’re looking at the right endpoint.
- Body sample: the step array must appear. Typical paths:
  - `data.onboarding.steps` (array of step IDs)
  - `data.tutorial.flow` (array of `{ id, type, params }` objects)
  - `config.first_session.steps`
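For reference, a qualifying body sample might look like the following. The field names are illustrative; match them against what your endpoint actually returns:

```json
{
  "data": {
    "onboarding": {
      "steps": ["battle", "account", "reward"]
    }
  }
}
```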
## Main workflow
### 1. Declare the lever
Open the Levers page. The AI enrichment pass may have already detected an onboarding steps lever — look for something labelled “Onboarding steps” or “Tutorial flow”. If it exists, click it, verify the path is correct, and proceed to step 2. If you need to create it manually, click + New lever:
- Pick the onboarding config endpoint.
- In the path picker, search for `steps` (or `flow`, depending on your path) and click the array path.
- Set the lever details:
  - Label: `Onboarding step order`
  - Type: `text` — Fleack treats the JSON array as a string value so you can paste whole arrays as variant values.
  - Test suggestions: paste the step arrays you plan to test, for example:
    - `["battle","account","reward"]` — control sequence
    - `["battle","reward","account"]` — Variant A
    - `["reward","battle","account"]` — Variant B
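Because the lever type is `text`, the variant value reaches the game client as a JSON array serialised inside a string, so the client has to parse it back into a list before walking the steps. A minimal sketch, using the Variant A value from above:

```python
import json

# The lever value is delivered as text, not as a native JSON array.
raw = '["battle","reward","account"]'  # Variant A value from the suggestions above

# Parse it back into the ordered step list the client will walk through.
steps = json.loads(raw)
```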
### 2. Set up the test
From the lever detail page, click Test.

- Variant A value: `["battle","reward","account"]`
- Variant B value: `["reward","battle","account"]`
- Allocation: 33% / 33% / 34%
- Segment: this is critical — restrict to new installs only.
  - Add rule: `days_since_install` eq `0`
  - Add rule: `total_sessions` lte `1`
- Primary metric: Retention day 1 — select the endpoint your app calls on session start (the same session-start endpoint used by all retention metrics).
- Secondary metric: Conversion on `POST /api/onboarding/complete` (or your equivalent tutorial-complete event endpoint), conversion window 1 hour after exposure.
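The two segment rules reduce to a simple eligibility predicate. A sketch, assuming user attributes that carry the same names as the rule fields:

```python
def eligible_for_onboarding_test(user: dict) -> bool:
    """True only for brand-new installs: install day 0, at most one session."""
    return user["days_since_install"] == 0 and user["total_sessions"] <= 1
```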
### 3. Watch the results

Variant assignment is sticky per user. A new install sees the same step sequence on every request for the duration of the test — Fleack hashes `test_id + user_identifier` to guarantee consistency.
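That stickiness guarantee can be reproduced with deterministic hashing alone, no per-user storage. A sketch of the general technique (an illustration, not Fleack’s actual implementation; the weights mirror the 33/33/34 allocation above):

```python
import hashlib

def assign_variant(test_id: str, user_identifier: str,
                   variants=("control", "variant_a", "variant_b"),
                   weights=(33, 33, 34)) -> str:
    """Deterministically bucket a user: same inputs, same variant, every call."""
    digest = hashlib.sha256(f"{test_id}:{user_identifier}".encode()).hexdigest()
    bucket = int(digest, 16) % sum(weights)  # a stable number in 0..99
    cumulative = 0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if bucket < cumulative:
            return variant
    return variants[-1]  # unreachable when weights sum to the modulus
```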
Onboarding tests have fast exposure accumulation because every new install triggers them. For a game receiving 1,000 new installs per day:
- Hundreds of D1-eligible exposures per day per variant
- A meaningful early read within 5–7 days
- A stable verdict typically available at 14 days
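Those timelines can be sanity-checked with a standard two-proportion sample-size estimate (normal approximation). The baseline retention, detectable lift, and install volume below are assumptions for illustration only:

```python
from statistics import NormalDist

def days_to_verdict(baseline=0.35, lift=0.03, installs_per_day=1000,
                    arm_share=0.33, alpha=0.10, power=0.80) -> float:
    """Days until one variant arm reaches the sample size needed to
    detect `lift` over `baseline` D1 retention (one-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    p1, p2 = baseline, baseline + lift
    n_per_arm = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / lift ** 2
    return n_per_arm / (installs_per_day * arm_share)
```

Under these assumptions the estimate lands at roughly a week per arm, consistent with the 5–7 day early read; halving install volume roughly doubles the wait.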
### 4. Make the call
| Verdict | Condition | Action |
|---|---|---|
| Promote | Variant wins D1 retention with ≥ 90% confidence AND tutorial completion is flat or improved | Click Promote |
| Reject | Variant lifts D1 but tutorial completion drops more than 5 percentage points | Stop the test, keep control |
| Run longer | No clear difference at 14 days | Extend to 21 days — deferred-friction variants sometimes only show their advantage at D7–D14 |
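The decision table reads naturally as a function. A sketch using the table’s thresholds (the final fall-through branch, for tests that have not yet reached day 14, is an assumption, not a documented Fleack behaviour):

```python
def make_the_call(d1_wins_at_90: bool, tutorial_delta_pp: float, day: int) -> str:
    """Map test results to a verdict, per the decision table."""
    if d1_wins_at_90 and tutorial_delta_pp < -5:
        return "reject"       # D1 lift bought with a tutorial-completion drop
    if d1_wins_at_90 and tutorial_delta_pp >= 0:
        return "promote"      # win on D1, completion flat or improved
    if day >= 14:
        return "run longer"   # no clear difference yet: extend to 21 days
    return "keep waiting"     # too early to call (assumed fall-through)
```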
## Common pitfalls
- Don’t test onboarding mid-soft-launch. Soft-launch cohorts are small and skewed toward a specific demographic or region. You’ll get noisy, non-generalisable results. Wait until you have reliable, representative daily installs before running onboarding experiments.
- Don’t expose returning users to the test. The segment rules `days_since_install = 0` AND `total_sessions ≤ 1` are not optional. A returning user encountering a reshuffled onboarding is a bug, and the resulting confusion will pollute your retention numbers.
- Watch for store-listing mismatch. If your App Store or Google Play screenshots promise a specific first-screen experience and your winning variant changes it, update the screenshots in your next release. Mismatched expectations between the listing and the actual experience hurt D1 more than any step reorder can lift it.
- Test one thing at a time. Reordering the existing steps is a clean, single-variable change. Don’t also vary step content, reward values, or tutorial skip availability in the same experiment — variant interaction effects will make the results unreadable.
A/B test interstitial ad frequency
Balance ad revenue and D7 retention by testing three interstitial cadences on a live game.
A/B test in-app pricing
Test IAP bundle composition for a fixed-price SKU to lift first-purchase conversion and ARPPU.