Data Management approaches for your tests

Published by Elias on July 30, 2020

Introduction

My personal GitHub repo now offers a way to ask me questions about Quality Engineering.

Rafael Teixeira sent me a question regarding the Data Management approach. As this topic is very interesting I created this blog post based on my answers and explanations.

Questions Answered

Which type of management is better in which situation?
When is it better to restore a database without data?
It usually depends on your context, that is, the current situation of your project.
Regardless of the project, I like to use both fake data generation and static files.

When is it best to run scripts to define a specific configuration? And the main thing, when the system is very large and complex and requires several setups, what to do?
For me, the best way to set up data is in the definition of your test pre-condition, so you can share the same condition across all the tests that have that data condition in common.
I’m facing the same “problem” you are: the system that I test is very large and has different data conditions. The approach we are adopting is to group the related tests under the same pre-conditions. We create a Base Test class with the pre- and post-conditions for that situation.

Example: the system can run its functionalities with or without levels of approval, where running with approvals requires more configuration, a different data setup, etc.
We have two different BaseTest classes based on that need, so we can easily keep these conditions (configuration, data, etc.) isolated.
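
The idea of one base class per group of pre-conditions can be sketched as follows. This is a minimal, framework-free illustration rather than the real project code: the class names, the configuration keys, and the in-memory map that stands in for the system setup are all assumptions made for the example. In a JUnit 5 project, setUp and tearDown would typically carry lifecycle annotations such as @BeforeEach and @AfterEach.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: holds the pre- and post-conditions shared by a group of tests.
abstract class BaseTest {

    // Stands in for the real system configuration and data (assumption for this sketch).
    protected final Map<String, String> configuration = new HashMap<>();

    // Pre-condition: with JUnit 5 this method would usually be annotated with @BeforeEach.
    void setUp() {
        configuration.put("environment", "test");
        configure();
    }

    // Post-condition: with JUnit 5 this method would usually be annotated with @AfterEach.
    void tearDown() {
        configuration.clear();
    }

    // Each group of tests supplies its own configuration and data setup.
    protected abstract void configure();
}

// Pre-conditions for the tests that run with levels of approval (hypothetical keys).
class WithApprovalsBaseTest extends BaseTest {
    @Override
    protected void configure() {
        configuration.put("approvals.enabled", "true");
        configuration.put("approvals.levels", "2");
    }
}

// Pre-conditions for the tests that run without approvals.
class WithoutApprovalsBaseTest extends BaseTest {
    @Override
    protected void configure() {
        configuration.put("approvals.enabled", "false");
    }
}
```

A test class then extends whichever base class matches the condition it needs, and inherits the whole setup without repeating it.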

I usually use three different approaches to data management…

Possible approaches to generating data

Fake data generation

This is the easiest approach, and I use it all the time.
You can hand the responsibility for generating data to a data generation library. I use DataFaker.
Instead of creating the data inside the test code (sometimes I do) I use this approach:

I create the Object I want to model
I create a Builder for this model
I create a Data Factory class to create the object for each condition I need

The model gives you an easy way to create an object with the necessary data
The Builder allows you to easily set the data and build the object
The Data Factory class gives you the flexibility to create different sets of data and a centralized point to change them when necessary.

Example
Let’s imagine we need to add the age of the person to enable him/her to apply for a driver’s license. I will assume that the age must be >= 16.

Person example

public class Person {

    private String name;
    private int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }

    public void setAge(int age) { this.age = age; }
}

Builder example

public class PersonBuilder {

    private String name;
    private int age;

    public PersonBuilder setName(String name) {
        this.name = name;
        return this;
    }

    public PersonBuilder setAge(int age) {
        this.age = age;
        return this;
    }

    public Person build() {
        return new Person(name, age);
    }
}

Data Factory example

import net.datafaker.Faker;

public class PersonFactory {

    private final Faker faker;

    public PersonFactory() {
        faker = new Faker();
    }

    // numberBetween's upper bound is exclusive, so the age is from 16 to 64
    public Person personAbleToGetDriverLicense() {
        return new PersonBuilder()
            .setName(faker.name().fullName())
            .setAge(faker.number().numberBetween(16, 65))
            .build();
    }

    // the age is from 0 to 15, always below the minimum of 16
    public Person personNotAbleToGetDriverLicense() {
        return new PersonBuilder()
            .setName(faker.name().fullName())
            .setAge(faker.number().numberBetween(0, 16))
            .build();
    }
}

Note that you can use the Data Factory class in your tests to generate any data you need.
In the example below, you can see the usage of personAbleToGetDriverLicense, which returns a Person object aged 16 to 64. In your test, you don’t need to fill in the age, create an attribute for it, or build the entire object as we did in the Data Factory.

import org.junit.jupiter.api.Test;

public class PersonTest {

    @Test
    public void personCanApplyForDriverLicense() {
        PersonFactory personFactory = new PersonFactory();
        Person person = personFactory.personAbleToGetDriverLicense();
        int age = person.getAge(); // already valid for a driver's license
        // your test goes here
    }
}

Consuming static files

I don’t use this approach that much, but I think it’s a good one and worth using when you need it.
You can keep the data for your tests in different kinds of static files, like txt, csv, json, yaml, etc.
With this approach, you need either a library that can read the file or a class of your own to do it.

Keep in mind that:

all static files (at least in Java projects) must be placed in the resources folder, not together with your source code
you might change them constantly, so they should follow your PR process (change -> commit -> push -> approve) to get into your main branch
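
If you would rather write the reader class yourself than add a library, a minimal sketch could look like the one below. The Product and ProductCsvReader names are hypothetical, and the parsing is deliberately naive: it assumes a header line, exactly two comma-separated columns, and no quoted values or commas inside a field.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for one line of the CSV file.
final class Product {
    final String name;
    final BigDecimal amount;

    Product(String name, BigDecimal amount) {
        this.name = name;
        this.amount = amount;
    }
}

// Hypothetical reader: turns "product,amount" lines into Product objects.
final class ProductCsvReader {

    private ProductCsvReader() { }

    static List<Product> read(Reader source) {
        List<Product> products = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            reader.readLine(); // skip the header line
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines
                }
                String[] columns = line.split(",");
                // trim() tolerates spaces around the separator
                products.add(new Product(columns[0].trim(), new BigDecimal(columns[1].trim())));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return products;
    }
}
```

In a test, the Reader would typically wrap a file from the resources folder, for example an InputStreamReader around getResourceAsStream("/products.csv"). For anything beyond simple files, a dedicated CSV library is usually the safer choice.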

In the example below I’m using a CSV file and @CsvFileSource, a JUnit 5 annotation from the junit-jupiter-params library that reads a CSV file.

CSV file example

product,amount
Micro SD Card 16Gb,6.09
JBL GO 2,22.37
iPad Air Case,14.99

Test Example

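// assertThat is assumed to come from AssertJ
import static org.assertj.core.api.Assertions.assertThat;

import java.math.BigDecimal;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvFileSource;

class JUnitCSVSourceTest {

    private static final String MAXIMUM_PRICE = "30.0";

    // runs once per CSV line; @ParameterizedTest is required for @CsvFileSource to work
    @ParameterizedTest
    @CsvFileSource(resources = "/products.csv", numLinesToSkip = 1)
    void productsLessThan(String product, BigDecimal amount) {

        assertThat(product).isNotEmpty();
        assertThat(amount).isLessThanOrEqualTo(new BigDecimal(MAXIMUM_PRICE));
    }
}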

@CsvFileSource reads the file given in its resources attribute, which must be located in the resources folder, and numLinesToSkip = 1 skips the first line because, in the CSV file, the first line is the header.

Dynamic consumption

I used this approach on my last project, at a financial institution, where we needed different sets of data based on different requirements. Examples:

an account with a credit card
an account with a limit greater than $10,000 and investments

We had a replica of the production database. Whenever developers (backend and QA) needed some data, they followed this manual process:

querying the replica database using an SQL script
updating the data into the test or a static file
running the tests

It was also a manual process (data setup) before the regression test… you can see it’s time-consuming.

The solution we had at that time was to create a DSL that queried the replica automatically based on the data requirements we needed.

import org.junit.jupiter.api.Test;

// DynamicData, CreditCard and Restriction belong to the project's in-house DSL
public class MyTestWithDynamicData {

    @Test
    void cannotPayWithExpiredCreditCard() {
        DynamicData dynamicData = new DynamicData();

        CreditCard creditCard = dynamicData.creditCard().withRestriction(Restriction.EXPIRED);

        // extract the credit card information to use in the test
        String holderName = creditCard.getHolderName();
        String creditCardNumber = creditCard.getNumber();
        String expirationDate = creditCard.getExpirationDate();
        String cvv = creditCard.getCvv();
    }
}

In the example above we needed a credit card that had already expired. The method withRestriction receives Restriction.EXPIRED and adds the corresponding SQL (a WHERE clause) to the query, which returns an object that matches these criteria.
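
To illustrate how a restriction can become part of the query, here is a minimal sketch. It is not the DSL from that project: the CreditCardQuery class, the table and column names, and the SQL fragments are all hypothetical. It also stops at building the SQL text, whereas the real withRestriction ran the query and returned a CreditCard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical restrictions, each mapped to the SQL condition it adds to the query.
enum Restriction {
    EXPIRED("expiration_date < CURRENT_DATE"),
    BLOCKED("status = 'BLOCKED'");

    final String whereClause;

    Restriction(String whereClause) {
        this.whereClause = whereClause;
    }
}

// Hypothetical query builder: collects restrictions and renders the final SQL.
final class CreditCardQuery {

    private static final String BASE_QUERY =
        "SELECT holder_name, number, expiration_date, cvv FROM credit_card";

    private final List<String> conditions = new ArrayList<>();

    // Returns this so that restrictions can be chained.
    CreditCardQuery withRestriction(Restriction restriction) {
        conditions.add(restriction.whereClause);
        return this;
    }

    String toSql() {
        if (conditions.isEmpty()) {
            return BASE_QUERY;
        }
        // Every restriction must hold, so the conditions are joined with AND.
        return BASE_QUERY + " WHERE " + String.join(" AND ", conditions);
    }
}
```

Keeping the SQL fragments next to the restriction names gives the tests a readable vocabulary, while the query details stay in one place that is easy to change.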

Conclusion

We can apply different approaches to generate data. There is another one I didn’t mention, because I don’t use it right now: generating data by consuming an API.

If you have any questions, please feel free to comment on this post or send me a question.

Thanks for your time!

Elias

Java Champion, Senior Principal Software Engineer, Oracle ACE for Java, Java Magazine NL Editor, Browserstack Champion.
