In usability testing, choosing the number of testers is a tricky decision. Your choice will be affected by resources (budget and time available) and is usually set before the study starts.
In this article, we discuss 3 frameworks for deciding how many testers to work with: RITE, Nielsen’s 5 testers rule and T-shirt sizing.
RITE: Rapid Iterative Testing and Evaluation
The core concept of this method is to fix problems as soon as they have been discovered rather than fixing problems in one block at the end of the process.
It means iterations are made to the design each time a problem is discovered: that may take only one tester, or several.
This cycle should then be repeated until there are no more problems.
The next decision to be made is: after how many testers who discover no problems should you stop testing? The higher the number, the safer, but it all depends on your resources.
Here is an example of what a RITE cycle can look like:
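In code form, one hypothetical way to model such a cycle is the short simulation below. It is only a sketch: the function name `run_rite`, the `stop_after_clean` threshold and the sample session data are assumptions made for illustration, and the sketch assumes that "testers not discovering problems" means testers in a row.

```python
from typing import Iterable, List, Set, Tuple


def run_rite(sessions: Iterable[Set[str]], stop_after_clean: int = 3) -> Tuple[int, List[str]]:
    """Simulate a RITE cycle (illustrative model, not a prescribed procedure).

    sessions: problems each successive tester would hit on the original design.
    stop_after_clean: consecutive problem-free testers required before stopping
                      (an assumed stopping rule).
    Returns the number of testers used and the problems fixed, in order.
    """
    fixed: List[str] = []
    clean_streak = 0
    testers_used = 0
    for problems in sessions:
        testers_used += 1
        # Problems fixed in an earlier iteration no longer show up.
        new_problems = sorted(p for p in problems if p not in fixed)
        if new_problems:
            fixed.extend(new_problems)  # iterate on the design right away
            clean_streak = 0
        else:
            clean_streak += 1
            if clean_streak >= stop_after_clean:
                break
    return testers_used, fixed


# Hypothetical study: each set lists what one tester would run into.
sessions = [
    {"unclear label"},
    {"unclear label", "hidden menu"},
    set(),
    {"slow checkout"},
    set(),
    set(),
    set(),
]
testers_used, fixed = run_rite(sessions, stop_after_clean=2)
print(testers_used, fixed)
```

Raising `stop_after_clean` makes the study safer but longer, which is the trade-off against your resources described above.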
We propose 2 ways to apply the RITE framework:
- a weekly research ritual for small and medium teams
- a daily research sprint for teams conducting research at scale
To choose between these 2 options, study your resources and needs. What is your capacity to recruit participants? How often and for how long can you work on design rework? When is your due date? How many team members are dedicated to the project?
In the case of a weekly research ritual, this is what it can look like:
- meeting participants/testers on Monday and Tuesday
- working on design iterations on Wednesday to Friday
This pace is suitable and manageable for most teams. If you are interested in the RITE method, starting slow might be a good idea. Once you have experienced this pace of research, you can decide to go forward with a tighter schedule.
In the case of a daily research sprint, you can organize your days as follows:
- booking 2 testers every morning
- dedicating every afternoon to corrections
This pacing only works if you are able to change your interface/product within a few hours. If corrections are more complex, you might need longer cycles of iteration. The advantage of planning your pace in advance is that you optimize your time and ensure a continuous flow of user insights.
Some argue that the RITE method is better suited to prototype testing than to product testing. That is because fixing a problem in a finished product may take longer, which delays the overall study and the discovery of other problems.
When should you use the RITE framework?
It works best for continuous research: when you have long-term improvement goals that require a sustained effort, supported by rituals and regularity in your research.
Nielsen’s 5 testers rule
You are probably already familiar with this one: Jakob Nielsen’s (in)famous 5 testers rule!
This rule is based on data from Nielsen’s work showing that, on average, testing with 5 testers will uncover 85% of the problems of a digital interface. This is why, for a lot of teams, 5 is the default number of testers they recruit for their usability tests.
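The 85% figure is usually traced back to a simple model attributed to Nielsen and Landauer: the share of problems found by n testers is 1 - (1 - L)^n, where L is the probability that a single tester reveals a given problem, commonly quoted as about 31% on average. The sketch below just evaluates that formula; the function name and the default value are illustrative choices.

```python
def share_of_problems_found(testers: int, detection_rate: float = 0.31) -> float:
    """Expected share of problems found after `testers` sessions.

    detection_rate is L in 1 - (1 - L)^n; 0.31 is the commonly quoted
    average and will differ from one product to another.
    """
    if testers < 0:
        raise ValueError("testers must be non-negative")
    return 1 - (1 - detection_rate) ** testers


for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} testers: {share_of_problems_found(n):.0%}")
```

With the default rate, 5 testers land just under 85%, which is where the rule of thumb comes from.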
However, over the years, this theory has been widely criticized: many factors affect this number, and the sweeping statement makes it easy to overlook them.
For instance, your 5 testers might find a percentage of issues that is nowhere near the promised 85% if you take the following into account:
- frequency of problems
- severity of issues
- level of proficiency of testers
- level of complexity of the product tested
In reality, the original mathematical theory does take these factors into account (to some extent), but in a way that is neither easy to define nor easy to use. If you want to know more about the 5 testers rule and its mathematical theory, read our dedicated article (in French).
So, does it mean you should avoid Nielsen’s theory completely? Not really!
5 might not be a magic number, but it certainly is a good number to start with, especially if your team is still building its UX research practice, as it is:
- small enough that most teams will be able to spare the effort and budget for one research study
- big enough to uncover real insights and issues, thus helping you improve the product
To get the most out of it:
- make sure you test with 5 people with similar profiles (with the same type of use cases or problems, for instance)
- iterate! Test another 5 testers a couple of weeks later, after you’ve improved your product, and then again.
Once you’ve iterated a couple of times with 5 testers (of the same profiles!), look back on your process, and ask yourself:
- from iteration to iteration, did later testers find new issues on parts of the product that you hadn’t changed? If yes, it probably means that 5 wasn’t enough, as those issues were missed in the first test. You should probably add a couple of testers for your next set of tests.
- on the other hand, did you get fewer and fewer problems to fix with each iteration? If yes, it means that 5 was probably a good number that you can keep.
- and finally, did you feel like, each time, the last couple of testers were not helping you learn anything new? Then maybe 5 is too many, and you could cut down a little bit.
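The three questions above can be read as a small decision rule. The sketch below is one possible encoding, not a formal method: the function name, the parameter names, and the exact adjustments (+2 for "a couple of testers", -1 for "a little bit") are assumptions.

```python
def next_sample_size(
    current: int,
    new_issues_in_unchanged_areas: bool,
    last_testers_added_nothing: bool,
    step_up: int = 2,
    step_down: int = 1,
) -> int:
    """Suggest the number of testers for the next round of tests."""
    if new_issues_in_unchanged_areas:
        # Issues were missed earlier: the sample was probably too small.
        return current + step_up
    if last_testers_added_nothing:
        # The last sessions brought nothing new: the sample can shrink a bit.
        return max(1, current - step_down)
    # Fewer problems each round and every tester still useful: keep the size.
    return current


print(next_sample_size(5, new_issues_in_unchanged_areas=True, last_testers_added_nothing=False))
```

Missed issues take priority here over redundant testers, since under-testing is the costlier mistake; you may prefer a different order.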
When should you use the 5 testers rule?
If you are new to UX research or are struggling to implement it regularly in your organization, this rule is a way to take it slow. Once you have results to support your work, you can start pushing for more resources and working towards more ambitious testing methods.
T-shirt sizing
T-shirt sizing is a popular method to help teams align on how much effort a project will require, while keeping discussions nice and simple. This creates more flexibility and consensus within projects.
In the context of UX research, T-shirt sizing involves defining levels of complexity into which research projects are sorted. Each level is then assigned a number of testers.
For example, you might have 5 categories and their corresponding tester number:
- Simple research project: 5 testers (XS)
- Pretty simple research project: 7 testers (S)
- Moderately complex research project: 10 testers (M)
- Complex project: 15 testers (L)
- Very complex project: 20 testers (XL)
Depending on your UX maturity and means, you might need more or fewer categories. However, we advise having at least 3 sizes: S, M and L.
The advantage of this technique is that, once the categories are created, assessing a project’s complexity tells you how many testers you need.
Base your complexity assessment on the following elements:
- project’s stage: creation, nth iteration, enhancement
- the team’s familiarity with the subject
- project’s criticality
- the type and number of research targets
- time and resources available
This framework invites stakeholders to take part in the definition of categories and the complexity of each project collectively, trying to reach consensus as much as possible.
If relevant to your project, you can assign other variables to the complexity levels in addition to the number of testers required, for example the team members assigned to the project and the sprint duration.
Here is an example of different complexity levels assigned to different variables:
When should you use the T-shirt sizing rule?
T-shirt sizing is the most versatile framework of the 3.
Small teams can start with 2 or 3 sizes and keep their categorization simple.
Bigger, more organized teams with higher research rates and objectives can create as many sizes as they need to account for the variety of their projects’ specifications and complexity.
Time to test these frameworks
There is no single answer to how many testers you need for your research, because many factors are at play. We hope these frameworks inspire your decision-making in defining a rhythm and format for your research.
Remember to stay agile and to iterate on your research methods themselves. Also keep in mind that while T-shirt sizing is valid for all types of research, RITE and Nielsen’s 5 testers rule are designed for user testing.