The client’s business includes partnerships with 40+ different news outlets, while it also hosts 20+ websites of its own. Its products enable advertisers to reach high-intent users in 20+ markets around the world. Each of the websites uses one of several available design themes, and each of the designs has different features. For some of the websites, the available features differ as well.
Many of these websites have numerous sub-pages that also need to be checked. Moreover, the data on every page changes on a daily basis. The problem with testing numerous pages is that a new feature can work on one partner’s page or one theme but break on another. Because of this, performing manual QA on every new feature and every design update is extremely time-consuming, and covering everything is practically impossible. Besides the risk of new bugs appearing with every new feature, there is also the risk of mistakes creeping into the data, which editors change manually every day. Nevertheless, all of the websites are essential to our client’s business and are expected to provide the desired UI and function properly.
The web application has a complex structure, containing the code for every design theme in the same repository.
The functionality is reusable where needed, with technologies such as Redis, CloudFront, MongoDB, and ElasticSearch behind it. To test the application, we developed an automation framework in Java, designed specifically for testing the different design themes of the websites. However, as the company grew, so did the complexity of its structure and the number of websites. The automation tests took longer to finish, and the code became harder to maintain. New requests were also coming in that the framework was not designed to fulfill at the time. The automation framework therefore had to be improved to match the company’s growth, but to accomplish this we had to overcome a few challenges.
First challenge: how to categorize the bug report
The first challenge was to fulfill our client’s request to categorize the bug report: every element, functionality, link, and data check failure should be classified by bug and page type, so that only the relevant bugs are reported to each group of people, such as the SEO team, the project manager, the editors, or the developers. At the time, the automation framework sent all the test failures together in a single report.
Second challenge: how to decrease the execution time
As mentioned before, as the number of websites grew, so did the number of webpages, URL links, and image links. The tests are expected to run daily, before, during, and after every deployment, and on demand, so the large number of webpages considerably increased the execution time, which slowed down the deployments. On top of this, the daily tests made large numbers of HTTP requests, which slowed down the websites, and new instances often had to be scaled up to preserve the user experience. The challenge here was to decrease the execution time of the tests and lower the number of HTTP requests.
Third challenge: how to reduce the maintenance effort
The last challenge was to decrease the time spent on code maintenance. The attributes of the HTML elements change very frequently, and the large number of websites and the diversity of the pages caused the number of element selectors in the framework to grow. Keeping every selector up to date meant the automation developers spent a lot of time debugging and updating instead of adding new tests.
Part 1: categorized bug report
As mentioned before, the automation framework already had an integrated report system, used to gather all the bugs without any categorization.
We updated its functionality so that each individual check is assigned to a category group, with the purpose of sending each bug report only to the people for whom that category is relevant.
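The idea of per-check categories can be sketched as follows. This is a minimal illustration, not the client’s actual code: the category names, the `Failure` type, and the grouping method are all hypothetical stand-ins for the framework’s real report system.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CategorizedReport {
    // Illustrative categories; the real framework's groups may differ.
    enum Category { SEO, EDITORIAL, UI, BACKEND }

    // Each failed check carries the category assigned to it.
    record Failure(Category category, String page, String message) {}

    // Group failures by category so each team receives only its own report,
    // while the full list can still go to the automation developer.
    static Map<Category, List<Failure>> groupByCategory(List<Failure> failures) {
        Map<Category, List<Failure>> report = new EnumMap<>(Category.class);
        for (Failure f : failures) {
            report.computeIfAbsent(f.category(), c -> new ArrayList<>()).add(f);
        }
        return report;
    }

    public static void main(String[] args) {
        List<Failure> failures = List.of(
            new Failure(Category.SEO, "/news/article-1", "missing canonical link"),
            new Failure(Category.UI, "/home", "header widget not rendered"),
            new Failure(Category.SEO, "/home", "duplicate meta description"));
        Map<Category, List<Failure>> report = groupByCategory(failures);
        // The SEO team's report contains only the two SEO failures.
        System.out.println(report.get(Category.SEO).size());
    }
}
```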
Part 2: faster page testing
In order to speed up the execution of the test runs, we divided the implementation into two parts:
This made testing thousands of pages much faster. To decrease the frequency of HTTP requests, we adjusted the automation framework’s caching to:
Every time a link URL or page needed to be tested or called again, it was taken from the cache instead of from the service’s response. Depending on the website, there could be hundreds of subpages or more, so the cache could grow large. Therefore, the automation framework was adjusted to monitor the size of the cache throughout its execution and to evict part of the cached data when memory starts to run low. Caching was provided for the pages opened with Jsoup; when the Selenium ChromeDriver was used to open a page for the first time, the page’s status code was taken from the driver’s performance logs instead. Decreasing the number of HTTP requests also decreased the test execution time.
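A cache of this kind can be sketched with an access-ordered `LinkedHashMap`, where a fixed entry budget stands in for the framework’s real memory monitoring. The class name, the entry limit, and the `fetch` callback are illustrative assumptions, not the framework’s actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

public class PageCache {
    private final int maxEntries;
    private final LinkedHashMap<String, Integer> statusByUrl;

    public PageCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // Access-ordered map: iteration order is least- to most-recently used.
        this.statusByUrl = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                // Evict the least recently used entry once over budget,
                // analogous to dropping part of the cache when memory runs low.
                return size() > PageCache.this.maxEntries;
            }
        };
    }

    // Return the cached status code for a URL, fetching it only on a miss
    // (the real framework fetches via Jsoup or the ChromeDriver logs).
    public int statusOf(String url, ToIntFunction<String> fetch) {
        Integer cached = statusByUrl.get(url);
        if (cached == null) {
            cached = fetch.applyAsInt(url);
            statusByUrl.put(url, cached);
        }
        return cached;
    }

    public int size() { return statusByUrl.size(); }
}
```

Repeated checks of the same URL then cost a map lookup instead of an HTTP request, which is where both the request count and the execution time drop.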
Part 3: “smart” maintenance
The solution for the frequently changing selectors was to make the automation framework “smarter”. This was achieved by updating the element finders to use three selectors:
When an element changes, the main selector can no longer find it. In such cases, the neighbor selectors are used to locate the element, so the test does not break. After the test finishes, the framework uses the page’s and the element’s HTML to generate a new selector that replaces the old one. In addition, the framework adds all of the selector fixes to a new git branch and alerts the automation developer to review and merge the changes.
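The fallback lookup can be sketched roughly as below. The `Element` type, the selector strings, and the `query` callback (standing in for a real DOM lookup such as `driver.findElements(By.cssSelector(...))`) are hypothetical; the selector regeneration step is only marked, since it depends on the page’s HTML.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class SelfHealingFinder {
    record Element(String tag) {}

    private String mainSelector;
    private final List<String> neighborSelectors;
    private boolean healedThisRun = false;

    SelfHealingFinder(String mainSelector, List<String> neighborSelectors) {
        this.mainSelector = mainSelector;
        this.neighborSelectors = neighborSelectors;
    }

    // Try the main selector first; on a miss, fall back to the neighbors.
    Optional<Element> find(Function<String, Optional<Element>> query) {
        Optional<Element> found = query.apply(mainSelector);
        if (found.isPresent()) return found;
        for (String neighbor : neighborSelectors) {
            found = query.apply(neighbor);
            if (found.isPresent()) {
                // The element's attributes changed: flag the main selector for
                // regeneration from the page's current HTML after the run.
                healedThisRun = true;
                return found;
            }
        }
        return Optional.empty(); // only now does the test report a failure
    }

    boolean needsSelectorUpdate() { return healedThisRun; }
}
```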
To support this system, another “self-fix” operation was implemented. For some perspective: most of the framework’s test scenarios are written per page type, and each element or widget has its own test. When the full scenario finishes, all of its tests are validated and the failed ones are added to the report. Normally, if something breaks one of the tests (for example, clicking an element that should open a popup instead triggers a bug that navigates to another URL), every following test would break as well, since the wrong page would be tested. We could have handled this type of case by adding a “navigate to URL” step at the start of each test, but that would only clutter the classes with duplicate code.
There are other circumstances that can break the test flow as well, such as pagination (the target element may not be on the first page), lazy loading, or hidden elements (for example, the page needs to be scrolled down for the element to show up). Adding extra code to satisfy every case wouldn’t be practical.
For this purpose, we implemented the “self-fix” functionality so that every test executes normally until it fails to find the element it needs. Before the test is marked as failed, the framework first reviews whether all the conditions for testing are correct, using checks such as whether:
And many more…
Depending on the nature of the test, the framework selects the appropriate condition checks. If any of them fails, the corresponding corrective actions are taken, the part of the test that failed is rechecked, and the test continues if the recheck passes. This functionality was implemented in the core of the framework and can be activated for each test simply by setting “true” in its configuration.
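The check-and-correct loop described above can be sketched as follows. The check names and corrective actions here (navigating back, scrolling, paginating) are illustrative examples; the real framework selects its checks based on the test’s nature and configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

public class SelfFix {
    // Each check pairs a condition with the action that should restore it.
    record Check(BooleanSupplier condition, Runnable correctiveAction) {}

    private final Map<String, Check> checks = new LinkedHashMap<>();

    void register(String name, BooleanSupplier condition, Runnable correctiveAction) {
        checks.put(name, new Check(condition, correctiveAction));
    }

    // Run before marking a test as failed: for every condition that does not
    // hold, apply its corrective action once and recheck. Returns true if all
    // conditions hold afterwards, so the test can continue instead of failing.
    boolean runChecks() {
        boolean allPassed = true;
        for (Check check : checks.values()) {
            if (!check.condition().getAsBoolean()) {
                check.correctiveAction().run();               // e.g. navigate back, scroll down
                allPassed &= check.condition().getAsBoolean(); // recheck once after the fix
            }
        }
        return allPassed;
    }
}
```

Only when a condition still fails after its corrective action does the original failure stand, which is what keeps false bug reports down.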
The increased performance of the tests makes it possible to run all of them before, during, and after every deployment without delaying the process. This greatly helps manual QA by eliminating repetitive checks and makes smoke testing the application much faster. It also helps a lot with regression testing, where everything needs to be checked, so the automation tests can be run many times a day, after every bug fix.
Decreasing the number of requests makes it possible to run the tests daily and when needed on production without impacting the user experience.
By categorizing the different checks, the groups receiving the bug reports are never cluttered with emails about bugs that are not their concern and that they could do nothing about, while the full bug report is still sent to the automation developer. Sending each bug to the right people helps fix it much faster than sending all bugs to everyone, or to one or a few people who would then need to find the right person to fix them.
Making the tests “smarter” cuts down development time significantly. Every time one or more selectors need updating, it is done automatically, and the developer only needs to review and merge the change.
And last but not least, the support system makes sure that all the conditions pass before a check is made, keeping false bug reports to a minimum.