Test Data Factory: Why and How to Use
- What problem does it solve?
- What is the Test Data Factory?
- The example
- How to implement it?
- Real Examples
What problem does it solve?
One of the biggest pain points in test automation, whatever layer you are testing, is managing the test data. I would say it’s quite easy to manage at the unit and integration levels, where we can control the application state, but for functional and e2e tests it’s quite hard.
It would be awesome if we could stop manually changing the data we keep in the code or in an external source like a CSV or JSON file, right?
We would like to reduce the code maintenance and the test failures caused by data changes, and create an easy and extensible way to generate test data.
What is the Test Data Factory?
We don’t have an agreement about test terminologies (yes, I know that sucks) and we don’t have a definition for the Test Data Factory approach, so here’s my definition.
During some research [1] (read this as “I was using Google”) I found interesting posts related to the same concept: the ObjectMother pattern.
[1] The research:
- Combining Object Mother and Fluent Builder for the Ultimate Test Data Factory
- Test Data Builders and Object Mother: another look
- Writing Clean Tests – New Considered Harmful
- Test Data Builders: an alternative to the Object Mother pattern
Imagine an application where you must fill in a form to get a loan, with these four requirements:
- a non-null name
- a valid e-mail
- an amount greater than $1,000 and less than $40,000
- a number of installments greater than 2 and less than 18
If you already know some testing techniques, you can see yourself applying Boundary Value Analysis (BVA) to the amount and installments inputs.
We must figure out a way to test all those possible scenarios, filling the form with, of course, some data:
- all valid information
- name as null
- invalid e-mail
- an amount less than $1,000
- an amount greater than $40,000
- installments less than 2
- installments greater than 18
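To make the boundary values concrete, here is a minimal sketch of the two numeric rules. The LoanRules class and its method names are hypothetical, both ranges are read as exclusive, and the upper bound of 18 installments is an assumption; none of this is the application's real validation code.

```java
import java.math.BigDecimal;

// Hypothetical helper: encodes the numeric rules of the loan form so that
// boundary values can be checked. Both ranges are treated as exclusive.
final class LoanRules {

    static final BigDecimal MIN_AMOUNT = new BigDecimal("1000");
    static final BigDecimal MAX_AMOUNT = new BigDecimal("40000");
    static final int MIN_INSTALLMENTS = 2;
    static final int MAX_INSTALLMENTS = 18; // assumed upper bound

    private LoanRules() { }

    // Valid when strictly greater than the minimum and strictly less than the maximum.
    static boolean isValidAmount(BigDecimal amount) {
        return amount.compareTo(MIN_AMOUNT) > 0 && amount.compareTo(MAX_AMOUNT) < 0;
    }

    // Same exclusive check for the number of installments.
    static boolean isValidInstallments(int installments) {
        return installments > MIN_INSTALLMENTS && installments < MAX_INSTALLMENTS;
    }
}
```

Under these assumptions, BVA points at each limit and its nearest valid neighbour: 1000 and 40000 are rejected while 1001 and 39999 are accepted, and the same reasoning applies to the installments.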
Test example with hard-coded data
Let’s assume you’re using Selenium WebDriver to fill in this information on a web page. A hard-coded data example would be:
The example uses Java and the Page Object pattern to fill in a form on a web page. Notice that it uses fixed data to fill the form.
Using fixed data in the test is considered a bad practice and should be avoided in order to reduce the maintenance of the automated test code.
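A minimal sketch of what such a hard-coded test might look like. The LoanPage page object and its method names are hypothetical, and it is stubbed with a map so the example runs without Selenium or a browser; in a real test each method would drive the page through WebDriver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical page object, stubbed so the sketch runs without a browser.
// In a real test each method would drive the page through Selenium WebDriver.
final class LoanPage {
    private final Map<String, String> fields = new LinkedHashMap<>();

    void fillName(String name) { fields.put("name", name); }
    void fillEmail(String email) { fields.put("email", email); }
    void fillAmount(String amount) { fields.put("amount", amount); }
    void fillInstallments(String installments) { fields.put("installments", installments); }

    // Returns what was typed into the form, standing in for a real submit.
    Map<String, String> submit() { return new LinkedHashMap<>(fields); }
}

final class HardCodedLoanTest {
    // Every value below is fixed in the test code itself.
    static Map<String, String> submitValidLoan() {
        LoanPage page = new LoanPage();
        page.fillName("elias");
        page.fillEmail("email@example.com");
        page.fillAmount("10000");
        page.fillInstallments("8");
        return page.submit();
    }
}
```

If a business rule changes (say, the minimum amount), every test that embeds values like these has to be found and edited by hand.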
How to implement it?
There are a few steps we must follow:
- Create an object to model the data you need
- Create a builder class
- Create a factory class to consume the data
- Modify the class to generate dynamic data
- Use the factory class in the test
Create an object to model the data you need
In our example, we are filling a loan form based on name, e-mail, amount, and installments. We need to create an object (class) to record this information.
In a plain Java class we can model our example like this:
When we create a model class using Java we need three things: the private attributes, the constructor, and the getters and setters.
We need to create the same attributes we’re going to use to fill in the data.
The constructor is necessary to create the object and easily supply the data, and we are going to use it from our builder class.
The getters and setters can be used to get or set a single value in the object.
The toString method will be necessary to log the data we use in the tests.
Note from the author: you might not need the getters and setters if you plan to use the builder object, so you can remove them and set the attributes as final, but you still need the constructor and the toString method.
The model class would be used as in the example below, but we will make it more readable and fluent using the builder pattern.
Loan loan = new Loan("elias", "email@example.com", new BigDecimal("10000"), 8);
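A minimal sketch of such a model class, with the private attributes, the constructor, the getters and setters, and toString. The field types are inferred from the constructor call shown above; the layout and the toString format are assumptions.

```java
import java.math.BigDecimal;

class Loan {
    // Private attributes: the same data the form asks for.
    private String name;
    private String email;
    private BigDecimal amount;
    private int installments;

    // Constructor: creates the object with all the data at once.
    Loan(String name, String email, BigDecimal amount, int installments) {
        this.name = name;
        this.email = email;
        this.amount = amount;
        this.installments = installments;
    }

    // Getters and setters: read or change a single value.
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    public BigDecimal getAmount() { return amount; }
    public void setAmount(BigDecimal amount) { this.amount = amount; }

    public int getInstallments() { return installments; }
    public void setInstallments(int installments) { this.installments = installments; }

    // toString: used to log the data a test ran with.
    @Override
    public String toString() {
        return "Loan{name='" + name + "', email='" + email + "', amount=" + amount
                + ", installments=" + installments + '}';
    }
}
```

The amount is built from a String because new BigDecimal(10.000) takes a double literal, which Java reads as 10.0 rather than ten thousand.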
Create a builder class
You could use the getters and setters in the Loan class (if you didn’t remove them) to fill the object with data, but we have a better and more fluent way to do it: the builder pattern.
I recommend reading the link below but, in general, and for this example, a builder consists of the same attributes we have in the Loan class, methods that receive an attribute as a parameter and return the builder itself, and a build method that creates the concrete class.
Let’s learn the internal structure.
The builder has the same attributes as the Loan class. We need them to create the object with the parameters filled in using its constructor.
Next come the builder methods. Each one must take a parameter, assign it to the matching attribute, and return the builder itself (this). Usually, the method name is the same as the data you fill in.
Finally, there is the build method. This method is responsible for creating the main object, which in our case is Loan. It uses all the attribute values set through the builder methods to create and return that object.
With it, you can create the object with data in a fluent way, like this:
Note that we can chain the LoanBuilder methods, as each one returns the builder itself, filling in the necessary data and always finishing by constructing the object with the build method.
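A minimal sketch of a builder with that structure. The method names simply mirror the attribute names, as suggested above; the compact Loan class is repeated so the example compiles on its own, and LoanBuilderExample is a hypothetical holder for the usage.

```java
import java.math.BigDecimal;

// Compact, immutable version of the model so the sketch compiles on its own.
final class Loan {
    private final String name;
    private final String email;
    private final BigDecimal amount;
    private final int installments;

    Loan(String name, String email, BigDecimal amount, int installments) {
        this.name = name;
        this.email = email;
        this.amount = amount;
        this.installments = installments;
    }

    public String getName() { return name; }
    public String getEmail() { return email; }
    public BigDecimal getAmount() { return amount; }
    public int getInstallments() { return installments; }
}

final class LoanBuilder {
    // Same attributes as the Loan class.
    private String name;
    private String email;
    private BigDecimal amount;
    private int installments;

    // Builder methods: set one attribute and return the builder itself.
    public LoanBuilder name(String name) { this.name = name; return this; }
    public LoanBuilder email(String email) { this.email = email; return this; }
    public LoanBuilder amount(BigDecimal amount) { this.amount = amount; return this; }
    public LoanBuilder installments(int installments) { this.installments = installments; return this; }

    // Creates the concrete object from the collected values.
    public Loan build() { return new Loan(name, email, amount, installments); }
}

final class LoanBuilderExample {
    // Fluent usage: chain the builder methods and finish with build().
    static Loan validLoan() {
        return new LoanBuilder()
                .name("elias")
                .email("email@example.com")
                .amount(new BigDecimal("10000"))
                .installments(8)
                .build();
    }
}
```

Compared with the four-argument constructor, each value is labelled by the method that sets it, so the call reads without having to remember the parameter order.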
Create a factory class to consume the data
The data factory class is slightly different from the Factory pattern (actually, it’s simpler). We have the following aspects to implement:
- a private constructor to prevent the class from being instantiated
- a static method that generates the data
- as it generates the data, it must return it
- the method must return the data model
In the example below, we have different factory methods creating a loan (createLoan), creating a loan with an invalid e-mail (createLoanWithNotValidEmail), and others.
Note that we are using the LoanBuilder to easily add the correct data to the attributes, returning the Loan object with all the necessary data.
As you can see, the data is still hard-coded. That’s OK, but we can improve it a little more by adding dynamic generation.
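A minimal sketch of a data factory with those aspects. The method names createLoan and createLoanWithNotValidEmail come from the text; the class name LoanDataFactory and the sample values are assumptions, and compact Loan and LoanBuilder classes are repeated so the example compiles on its own.

```java
import java.math.BigDecimal;

final class Loan {
    private final String name;
    private final String email;
    private final BigDecimal amount;
    private final int installments;

    Loan(String name, String email, BigDecimal amount, int installments) {
        this.name = name;
        this.email = email;
        this.amount = amount;
        this.installments = installments;
    }

    public String getName() { return name; }
    public String getEmail() { return email; }
    public BigDecimal getAmount() { return amount; }
    public int getInstallments() { return installments; }
}

final class LoanBuilder {
    private String name;
    private String email;
    private BigDecimal amount;
    private int installments;

    public LoanBuilder name(String name) { this.name = name; return this; }
    public LoanBuilder email(String email) { this.email = email; return this; }
    public LoanBuilder amount(BigDecimal amount) { this.amount = amount; return this; }
    public LoanBuilder installments(int installments) { this.installments = installments; return this; }
    public Loan build() { return new Loan(name, email, amount, installments); }
}

// Hypothetical class name; the data is still hard-coded at this step.
final class LoanDataFactory {

    // Private constructor: the class only exposes static factory methods.
    private LoanDataFactory() { }

    // All valid information.
    public static Loan createLoan() {
        return new LoanBuilder().name("elias").email("email@example.com")
                .amount(new BigDecimal("10000")).installments(8).build();
    }

    // Same data, except for an e-mail without the "@".
    public static Loan createLoanWithNotValidEmail() {
        return new LoanBuilder().name("elias").email("email.example.com")
                .amount(new BigDecimal("10000")).installments(8).build();
    }
}
```

Each remaining scenario from the list (null name, amount or installments outside the limits) would get its own factory method in the same style.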
Modify the class to generate dynamic data
We can dynamically generate data in different ways. The three main ones are:
- using a library
- using an API endpoint
- using a database
I will focus on the first one: using a library.
For that, I will introduce you to Java Faker, a library that generates fake random data every time it’s called. For example, each time you ask for a name you get a randomly chosen one, so the values vary from call to call.
In the factory class, the hard-coded data is replaced by calls to Java Faker. You will receive different data every time you call the factory method, but it’s still valid data.
The constants were added for clarity and to reduce the maintenance of changing the values in the future.
Java Faker generates the correct value for each field. For example, the e-mail is added to the Loan object using faker.internet().emailAddress(), which generates a valid e-mail.
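A minimal sketch of the dynamic version of createLoan. Java Faker is an external dependency, so this sketch swaps in ThreadLocalRandom from the JDK to stay self-contained; the comments show where the Java Faker calls would go. The class name, the sample names, the limits (read as exclusive, with an assumed maximum of 18 installments), and the use of the constructor instead of the builder are all assumptions made to keep it short.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.concurrent.ThreadLocalRandom;

final class Loan {
    private final String name;
    private final String email;
    private final BigDecimal amount;
    private final int installments;

    Loan(String name, String email, BigDecimal amount, int installments) {
        this.name = name;
        this.email = email;
        this.amount = amount;
        this.installments = installments;
    }

    public String getName() { return name; }
    public String getEmail() { return email; }
    public BigDecimal getAmount() { return amount; }
    public int getInstallments() { return installments; }
}

final class LoanDataFactory {
    // Limits kept as constants so a rule change is a one-line edit.
    private static final int MIN_AMOUNT = 1000;
    private static final int MAX_AMOUNT = 40000;
    private static final int MIN_INSTALLMENTS = 2;
    private static final int MAX_INSTALLMENTS = 18; // assumed upper bound
    private static final String[] NAMES = {"Ana", "Bruno", "Carla", "Diego"};

    private LoanDataFactory() { }

    public static Loan createLoan() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        // With Java Faker: faker.name().firstName()
        String name = NAMES[random.nextInt(NAMES.length)];
        // With Java Faker: faker.internet().emailAddress()
        String email = name.toLowerCase(Locale.ROOT) + random.nextInt(1000, 10000) + "@example.com";
        // Strictly between the limits: the upper bound of nextInt is exclusive.
        BigDecimal amount = BigDecimal.valueOf(random.nextInt(MIN_AMOUNT + 1, MAX_AMOUNT));
        int installments = random.nextInt(MIN_INSTALLMENTS + 1, MAX_INSTALLMENTS);
        return new Loan(name, email, amount, installments);
    }
}
```

Because the values change on every call, logging the generated Loan in the test helps to reproduce a failure with the exact data that caused it.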
Use the factory class in the test
I showed what this class looks like in the Test example with hard-coded data section. Let’s now use the Test Data Factory approach we learned.
The test calls the data factory method createLoan() and stores the returned object in the validLoanData attribute.
It then uses the Loan object (validLoanData) to fill in each field through the getters. The data will be different every time we run the test.
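A minimal sketch of a test that consumes the factory. The names createLoan() and validLoanData come from the text; the LoanPage stub, the simplified LoanDataFactory, and the test class name are hypothetical stand-ins so the example runs without Selenium.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

final class Loan {
    private final String name;
    private final String email;
    private final BigDecimal amount;
    private final int installments;

    Loan(String name, String email, BigDecimal amount, int installments) {
        this.name = name;
        this.email = email;
        this.amount = amount;
        this.installments = installments;
    }

    public String getName() { return name; }
    public String getEmail() { return email; }
    public BigDecimal getAmount() { return amount; }
    public int getInstallments() { return installments; }
}

// Simplified stand-in for the data factory: random but valid data.
final class LoanDataFactory {
    private LoanDataFactory() { }

    public static Loan createLoan() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        String user = "user" + random.nextInt(1000, 10000);
        BigDecimal amount = BigDecimal.valueOf(random.nextInt(1001, 40000));
        return new Loan(user, user + "@example.com", amount, 8);
    }
}

// Hypothetical page object, stubbed with a map instead of WebDriver calls.
final class LoanPage {
    private final Map<String, String> fields = new LinkedHashMap<>();

    void fillName(String name) { fields.put("name", name); }
    void fillEmail(String email) { fields.put("email", email); }
    void fillAmount(String amount) { fields.put("amount", amount); }
    void fillInstallments(String installments) { fields.put("installments", installments); }
    Map<String, String> submit() { return new LinkedHashMap<>(fields); }
}

final class LoanWithFactoryTest {
    static Map<String, String> submitValidLoan() {
        // The test asks the factory for data instead of embedding it.
        Loan validLoanData = LoanDataFactory.createLoan();

        LoanPage page = new LoanPage();
        page.fillName(validLoanData.getName());
        page.fillEmail(validLoanData.getEmail());
        page.fillAmount(validLoanData.getAmount().toPlainString());
        page.fillInstallments(String.valueOf(validLoanData.getInstallments()));
        return page.submit();
    }
}
```

The test body no longer contains any literal value, so a change in the data rules only touches the factory.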
Would you like to see real examples for two different test targets: web and API?
Notice that I have a restriction: in the frontend, I have fixed data for the countries and daily budget. We can still randomize it and make the factory class return random data from a limited set of values.
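A minimal sketch of returning random data from a limited set. The class name and the three countries are hypothetical placeholders for whatever options the frontend really accepts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class CountryDataFactory {
    // Hypothetical fixed options, standing in for the values the frontend accepts.
    private static final List<String> COUNTRIES =
            Collections.unmodifiableList(Arrays.asList("Brazil", "Netherlands", "Portugal"));

    private CountryDataFactory() { }

    // Picks one of the allowed values at random on every call.
    public static String randomCountry() {
        int index = ThreadLocalRandom.current().nextInt(COUNTRIES.size());
        return COUNTRIES.get(index);
    }
}
```

The value still varies between runs, but it can never fall outside the options the form offers.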
In the project restassured-complete-basic-example you will find different data generation approaches in the data package. The SimulationDataFactory class has a complete set of factory methods for all the possible scenarios.
In some cases, you can even call existing endpoints to provide the data you need in your tests. That is the case for the method getAllSimulationsFromApi(), where I get the existing simulations and use them in the oneExistingSimulation() factory method to return a random existing simulation and in allExistingSimulations() to return all of them.