Enhancing Selenium Automation Testing Framework Using Generative AI

Alexander Meshkov
Gen AI QA Director

In the dynamic landscape of software testing, automation has become a cornerstone of efficient and reliable delivery. Selenium, as one of the most widely used automation frameworks, has empowered testers to simulate user interactions and validate web applications effectively. Yet, the growing need for faster releases and more complex testing scenarios presents challenges to the scalability and adaptability of even seasoned Selenium automation testing frameworks.

Generative AI (GenAI) is emerging as a transformative technology that opens new avenues for enhancing Selenium-based automation. With GenAI’s natural language processing and code generation capabilities, testers can automate test generation, optimize scripts, and analyze results intelligently, reducing manual effort while improving accuracy and efficiency.

This article explores how we integrated GenAI with Selenium to change our automation testing workflows. We’ll walk through our practical use case, in which we reduced the time needed to create new automated tests and expanded the capacity of our automation testers using a specific technology called AI agents. Whether you’re a testing professional or a tech enthusiast, this article offers insights into bridging the gap between traditional automation and current AI innovations.

To start, let me share a bit about the specifics of our projects, where we work extensively with test automation. The majority of our projects use Cucumber with Gherkin notation. This approach allows us to create clear and user-friendly automated tests using a human-readable language, reducing the technical barrier to entry for creating automated tests. Gherkin also promotes collaboration and improves test readability, leading to better test coverage.

Over the years of working on various projects, we have gradually shifted away from the conventional Page Object approach towards methods that enable dynamic locator identification. This shift has significantly reduced the burden on testers to create Page Objects for new elements. The primary framework used in our projects is Selenium, and its open-source nature has allowed us to implement a mechanism for dynamically determining locators during test execution.
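To make the idea of dynamic locators more concrete, here is a minimal sketch of how a locator could be derived at run time from nothing but the visible text of an element. It is an illustration under stated assumptions, not the framework’s actual code: the helper names `xpath_literal` and `xpath_for_text` are hypothetical, and the match rule (first text node, whitespace-normalized, exact match) is only one of several reasonable choices.

```python
def xpath_literal(text: str) -> str:
    """Quote `text` as an XPath 1.0 string literal.

    XPath 1.0 has no escape character, so a string containing both quote
    types has to be assembled with concat().
    """
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present: split on single quotes and glue the
    # pieces back together with a double-quoted single quote between them.
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def xpath_for_text(text: str) -> str:
    """Build an XPath for any element whose first text node, after
    whitespace normalization, equals `text` (hypothetical match rule)."""
    return f"//*[normalize-space(text())={xpath_literal(text)}]"
```

With Selenium’s Python bindings, the result could then be passed to something like `driver.find_element(By.XPATH, xpath_for_text("Dashboard"))`, so no Page Object entry is needed for the element.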

As a result, Gherkin notations have also taken on a more dynamic character. To interact with a specific element in the system, it is now sufficient to specify a parameter that is relevant to the required element during the test.

Below are a few examples of Gherkin notations that can be used for any web application.

When I click an element with text “/*” on the current page

When I set value “/*” to input with label “/*” in “/*” area on the current page

These are the default steps available in our framework. Because the steps are independent of any particular application, writing a step for an automated test is simply a matter of replacing /* with specific values for the elements of your application. Currently, we have over 120 universal steps preconfigured for web applications, more than 40 steps for API testing, over 60 steps for mobile applications, and additional steps for more specific solutions such as working with databases, Kafka message brokers, and others.
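The relationship between a universal step and a concrete test line can be sketched as simple template matching. The snippet below is an assumption-laden illustration rather than our step library: it assumes placeholders are written as `"/*"` with straight double quotes, that the Gherkin keyword (When, Then, And) has already been stripped, and the function names `compile_step` and `match_step` are hypothetical.

```python
import re
from typing import List, Optional, Sequence, Tuple

PLACEHOLDER = '"/*"'  # assumed placeholder syntax inside a step template


def compile_step(template: str) -> re.Pattern:
    """Turn a step template into a regex with one capture group per placeholder."""
    literal_parts = [re.escape(part) for part in template.split(PLACEHOLDER)]
    # Each placeholder becomes a quoted capture group that accepts any
    # characters except a double quote.
    return re.compile("^" + '"([^"]*)"'.join(literal_parts) + "$")


def match_step(
    step_text: str, templates: Sequence[str]
) -> Optional[Tuple[str, List[str]]]:
    """Return the first matching template and its extracted parameters, or None."""
    for template in templates:
        found = compile_step(template).match(step_text)
        if found:
            return template, list(found.groups())
    return None
```

For example, matching `I click an element with text "Create" on the current page` against the first template above would yield that template together with the single parameter `Create`, which is the only application-specific piece of the step.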

This means that to start automation on a new project, there is no need to write tests from scratch. You only need to transfer the core step library and its processing methods, specify the necessary parameters, and execute the tests. It sounds great, but we decided to go even further and explore how GenAI could enhance our work even more.

The main challenge we’ve faced in our projects is the constant need for test automation specialists to rewrite manual test cases into Gherkin notations. Manual tests are usually written in a classic format, listing each step’s actions and expected results. This process takes time—sometimes a significant amount—because the specialist must first reproduce the test, then identify the names of the required elements (since, despite having dynamic locators, testers still need to visually locate elements on the page), and only then use the predefined Gherkin steps to add the necessary parameters.

This made us wonder: why not use modern GenAI technologies to fully or partially automate this process as well?

This led to the idea of creating an AI Agent to empower our test automation team. Here’s what we decided to do: given a traditional test scenario, a link to our application, and predefined Gherkin notations, we aimed to automatically populate the necessary parameters for elements using GenAI.

Below is a simple diagram of how our AI Agent works.

[Figure: Selenium automation testing AI Agent logic]

How Selenium Automation Testing Works

To use the AI Agent, you need to provide it with a standard test scenario written by a manual tester and the Gherkin notations from your test automation project. Then, a little magic happens.

Using a specialized AI pattern called WebVoyager, our agent opens a separate browser tab and begins interacting with your application. For example, it navigates to specific pages, clicks buttons, fills in fields with given values, and so on. During the execution of these actions, the AI Agent recognizes the active elements on the page, saves them, and captures screenshots of your application for further processing.

After several iterations of element recognition, our AI Agent independently determines whether it has gathered enough information about the test it performed. If sufficient elements have been collected, it proceeds to generate the Gherkin notation for the automated test.
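The iterate-until-enough behaviour described above can be pictured as a small observe-and-act loop. What follows is a generic sketch, not the agent’s actual implementation: the callables `observe`, `act`, and `has_enough`, as well as the `max_iterations` cap, are hypothetical stand-ins for the browser interaction, the model-driven next action, and the sufficiency check.

```python
from typing import Callable, Iterable, Set


def explore(
    observe: Callable[[], Iterable[str]],
    act: Callable[[int], None],
    has_enough: Callable[[Set[str]], bool],
    max_iterations: int = 10,
) -> Set[str]:
    """Collect element names until `has_enough` is satisfied or the cap is hit."""
    seen: Set[str] = set()
    for iteration in range(max_iterations):
        # Record whatever active elements are visible on the current page.
        seen.update(observe())
        if has_enough(seen):
            break
        # Otherwise perform the next action (navigate, click, type) and repeat.
        act(iteration)
    return seen
```

The iteration cap is a design choice in this sketch: it keeps the loop from running indefinitely if the sufficiency check is never satisfied.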

Here’s a small example of a test scenario that can be provided as input to our agent:

Scenario: Create a New Page

Steps:

1) Open URL: https://…..

2) Click the Server dropdown then click Dashboard

3) Go to the Admin tab and click the Create New Page button

4) Set values: PageName123, PageDisplayName123 and description123 to inputs

5) Click Create

Expected Result:

6) Validate page with a given name is created

As a result of our AI Agent’s work, we get a test written in Gherkin notation, which already includes all the necessary interface elements.

Feature: Create New Page

Scenario: Create new page

Given I open URL `https://…..`
Then I click an element with text `Log in` on the current page
Then I click an element with text `Server` on the current page
Then I click an element with text `Dashboard` on the current page
Then I click an element with text `Admin` on the current page
Then I click an element with text `Create new page` on the current page
And I set the value `PageName123` to input with near text `Name` on the current page
And I set the value `PageDisplayName123` to input with near text `Display name` on the current page
And I set value `description123` to input with near text `Description` on the current page
Then I click an element with text `Create` on the current page
Then I see text `PageName123` on the current page

After that, our test automation specialist only needs to transfer the generated test into the automated testing development environment, create a new feature file, and run the test. The framework will dynamically identify the element locators on the page and execute the test successfully.

Additionally, there are cases where it’s necessary to add specific locators manually. To address this, our AI Agent includes a feature for automatically determining locators for all elements on the page based on priorities. The agent first looks for the element’s ID, then the Class, followed by the CSS selector, and if none of these are found, it generates an XPath for the element.
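The priority order described above (ID, then class, then CSS selector, then a generated XPath) can be sketched as a simple fallback chain. This is an illustrative sketch only: the `ElementInfo` structure and `pick_locator` function are hypothetical, and the XPath here is built from an assumed list of (tag, position) pairs from the document root. The strategy strings match the values of Selenium’s `By.ID`, `By.CLASS_NAME`, `By.CSS_SELECTOR`, and `By.XPATH`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class ElementInfo:
    """Hypothetical snapshot of what is known about a page element."""
    id: Optional[str] = None
    class_name: Optional[str] = None            # raw class attribute, may hold several classes
    css: Optional[str] = None                   # a CSS selector, if one was found
    path: Sequence[Tuple[str, int]] = ()        # (tag, 1-based position) pairs from the root


def pick_locator(element: ElementInfo) -> Tuple[str, str]:
    """Return (strategy, value) using the priority ID > class > CSS > XPath."""
    if element.id:
        return ("id", element.id)
    if element.class_name:
        # A class-name locator accepts a single class, so take the first token.
        return ("class name", element.class_name.split()[0])
    if element.css:
        return ("css selector", element.css)
    # Last resort: build an absolute XPath from the recorded path.
    xpath = "/" + "/".join(f"{tag}[{pos}]" for tag, pos in element.path)
    return ("xpath", xpath)
```

Because the returned pair uses Selenium’s own strategy names, it could be unpacked directly, for example `driver.find_element(*pick_locator(info))`.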

Our AI Agent supports various programming languages, including Java, Python, JavaScript, and C#.

As a result, our team has achieved a significant reduction in the time required to create new automated tests, ranging from 30% to 80%, depending on the complexity of the test. We’ve also made it possible to start test automation on any project within just 1–2 days.

Incorporating GenAI into automation testing workflows has proven to be a game-changer, enabling faster test creation, dynamic element recognition, and seamless integration across projects. Our AI Agent demonstrates the potential of combining cutting-edge AI technologies with established frameworks like Selenium, significantly reducing the time and effort required for test automation.

So, if you’re interested in strengthening your test automation team with specialists equipped with GenAI tools and skills, or if you’re looking to quickly kickstart automated testing on your project using GenAI, contact us and we’ll get back to you.
