Applications need to be tested at various stages of development. Because of the nature of the data stored in the database, for example its confidentiality, it is often undesirable to test on copies of production data, so tests should be run on non-production data instead. Anonymising production data is a separate issue: it is usually expensive, and the anonymisation procedure has to be updated whenever the structure of the database changes.
To improve the software development process and to keep customer data secure during both user and penetration tests, EuroDB provides a mechanism for generating random data.
Data is generated from the database schema: the field types it contains, their parameters, and foreign and excluding keys. The generated data can be tuned with the available parameters, e.g. by using regular expressions. Built-in data types are supported, including, importantly, IP and MAC addresses. Data can be generated from alternative macros, including user-defined ones, in given ratios (for example, logins to internal and external client applications). Generating “data gaps” that simulate failures of dependent systems is also supported.
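The idea of mixing alternative generators in given ratios, with occasional data gaps, can be sketched in plain Python. This is only an illustration under stated assumptions and not EuroDB's own interface: the names `internal_login`, `external_login` and `generate_mixed`, the 3:1 ratio and the 10% gap rate are all hypothetical.

```python
import random

# Hypothetical alternative generators: each returns one login event.
def internal_login(rng):
    return f"internal:{rng.randrange(10**6):06d}"

def external_login(rng):
    return f"external:{rng.randrange(10**6):06d}"

def generate_mixed(rng, alternatives, weights, gap_ratio, count):
    """Pick one alternative per row according to the weights.

    With probability gap_ratio the row is None instead, which stands in
    for a "data gap" caused by a failed dependent system.
    """
    rows = []
    for _ in range(count):
        if rng.random() < gap_ratio:
            rows.append(None)  # simulated gap
            continue
        generator = rng.choices(alternatives, weights=weights, k=1)[0]
        rows.append(generator(rng))
    return rows

# Assumed example: internal logins three times as frequent as external
# ones, with about one row in ten missing.
rng = random.Random(42)  # fixed seed so the run is repeatable
rows = generate_mixed(rng, [internal_login, external_login], [3, 1], 0.1, 1000)
```

Seeding the random number generator makes the generated set reproducible, which is convenient when a failing test has to be rerun against identical data.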
Generator functions:
- Generating data with a fixed offset
- Scaling per table (e.g., 100 products, 10,000 clients, one million session identifiers)
- Setting the percentage of NULL values (e.g. no customer phone number)
- Concatenating the values of several generators
- Creating BLOBs
- Generating values that satisfy the Luhn algorithm (e.g. credit card numbers with a valid check digit)
Many other operations are also supported, allowing for the adjustment of the generated data to the designated purpose.
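Three of the functions listed above (the NULL percentage, concatenation of generators, and Luhn-valid numbers) can be sketched in plain Python. This is a minimal illustration, not EuroDB's implementation or syntax: every function name here (`luhn_check_digit`, `luhn_valid`, `card_number`, `nullable`, `concat`, `phone`) is hypothetical, as are the default prefix and length.

```python
import random

def luhn_check_digit(payload):
    """Return the Luhn check digit for a string of digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

def luhn_valid(number):
    """Check a full number (payload plus check digit) against Luhn."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # every second digit, starting left of the check digit
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def card_number(rng, prefix="4", length=16):
    """Random digits after an assumed prefix, closed with a check digit."""
    body = "".join(str(rng.randrange(10)) for _ in range(length - len(prefix) - 1))
    payload = prefix + body
    return payload + str(luhn_check_digit(payload))

def nullable(gen, null_ratio):
    """Wrap a generator so that roughly null_ratio of its values are None."""
    def wrapped(rng):
        return None if rng.random() < null_ratio else gen(rng)
    return wrapped

def concat(*gens, sep=""):
    """Join the outputs of several generators into one string."""
    def wrapped(rng):
        return sep.join(str(g(rng)) for g in gens)
    return wrapped

def phone(rng):
    # Hypothetical phone format, for illustration only.
    return f"555-{rng.randrange(10**4):04d}"

rng = random.Random(7)
maybe_phone = nullable(phone, 0.2)  # about 20% of customers without a phone
customer_ref = concat(lambda r: "ACC", card_number, sep="-")
sample = [(maybe_phone(rng), customer_ref(rng)) for _ in range(5)]
```

Numbers produced this way only pass the checksum test; they are not tied to any real account, which is the point of using them as test data.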
Example of EuroDB data generation
A company provides an account management application for its customers. The application is released in 3-month development cycles. As part of the security policy, an external company performs penetration tests before each deployment to the production environment. These tests are performed on an environment in which EuroDB has been populated with randomly generated data. A unique prefix, unknown to the external company, is added to the data. If the testers can present this prefix, it is proof that they penetrated the application.
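The prefix mechanism in this scenario can be sketched as follows. This is an assumed illustration rather than a description of how EuroDB tags data: the function names, the `cx` marker and the prefix length are all hypothetical.

```python
import hmac
import secrets

def new_secret_prefix(nbytes=8):
    """Create an unpredictable prefix; "cx" is an arbitrary marker."""
    return "cx" + secrets.token_hex(nbytes)

def tag_values(values, prefix):
    """Prepend the secret prefix to each generated value."""
    return [f"{prefix}-{value}" for value in values]

def proves_penetration(claimed_prefix, secret_prefix):
    """Compare the testers' claimed prefix with the secret in constant time."""
    return hmac.compare_digest(claimed_prefix.encode(), secret_prefix.encode())

secret_prefix = new_secret_prefix()
tagged = tag_values(["alice", "bob"], secret_prefix)
```

Because the prefix comes from a cryptographically strong source and is never shared with the testers, presenting it is only possible after actually reading the stored data.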