Decomposing High-Complexity UI Automation through a Specification–Flow–Component Architecture

Luis Angel Moroco Ramos

Software Engineer & Solver

1. Introduction

UI automation is defined as the practice of simulating human interactions with a specific system's user interface. While these interactions can be simple tasks, such as checking whether an element exists or clicking a button, they can be combined to represent complex user workflows from start to finish. Its main goal is to ensure that changes, whether minor or major, added during the continuous delivery process do not break existing flows or impact the user experience in the production environment. This helps achieve thorough regression testing, accelerates delivery speed, and enables QA engineers to focus on testing truly unusual flows.

As applications become more complex, their tests also become more complicated. UI automation suites tend to become fragile, hard to maintain, and costly to expand. Frequent updates to the layout or structure of the DOM in web interfaces cause multiple tests to fail, even when the workflows are still correct. Additionally, combining business logic, UI technical details, and validations into a single test file leads to duplication, low reusability, and high maintenance effort. Approaches like Layered Test Architecture (LTA), Screenplay Pattern (SP), and Page Object Pattern (POP) have been proposed to solve these problems; however, they still have limitations and room for improvement. For instance, POP can lead to the buildup of redundant and similar methods without clearly separating technical interactions from business logic. The same issue affects SP, which, although it improves workflow clarity, also experiences code duplication when dealing with common DOM elements. While these methods help define a responsibility boundary between layers, the absence of a unified way to interact with standard HTML elements results in scattered custom functions that are hard to maintain.

This article discusses modifications to LTA by adding reusable abstractions for common UI components and using models that extend these abstractions to inherit properties and behaviors. In a web application, these abstractions represent standard HTML elements—such as button, input, select, or toggle—and are implemented as centralized base classes. Models based on these foundations encapsulate the data and actions needed to interact with these elements, explicitly identifying them within the interface, and enabling a consistent connection between functional logic and the DOM.

This proposal is structured into three main layers—Specification, Flow, and Component—that classify test code based on its level of abstraction. Throughout the article, their responsibilities, benefits, and practical uses will be explained with examples in Cypress and TypeScript.

Figure 1: (A, B, C) stand for the main layers in this proposal. Layer (A) contains the test specifications, where high-level scenarios are defined such as (A1). Layer (B) serves as an intermediary, orchestrating complex flows and mapping them into steps by leveraging Components as gateways. Layer (C) encapsulates all models that belong to a given Component. To ensure clarity and maintainability, three practices are recommended within this layer: use prefixes to avoid duplicated Component names across modules (C1), place each model in an isolated file to facilitate refactoring (C2), and extract widely used Models from the scope of Components to prevent coupling (C3). Finally, (D) denotes the set of browser automation tools available, such as Cypress (D1), which interact with and control the DOM (E). Source: Own elaboration.

2. The Model

Models are defined as "extensible data containers" for atomic, indivisible UI elements: for example, the labels or tags that identify interface elements such as a Selector or Toggle. While this is already workable on its own, this article encourages defining Abstractions that make these Models identifiable throughout the system.

Since the interface offers a finite set of elements to interact with, each model should also hold a finite set of values. To this end, we promote the use of Enums as the basis for each model. Ideally, each model should:

1. Declare properties as statically immutable.
2. Prohibit the creation of new values at runtime.

2.1 Abstraction

In web interfaces, abstractions refer to standard HTML elements such as Button, Toggle, Button Group, Input, Selector, and others. They act as supermodels that encapsulate common properties and behaviors, helping us reduce code by working with abstractions instead of concrete models. This way, models can focus on storing specific data and behaviors that are not shared across the entire system.

Some abstractions are described below using the TypeScript language:

1. Selector: Used to create drop-down menus that let users select the desired value. The instance property (A) acts as the selector's identifier.

2. Button Group: A collection of functionally related buttons. The instance property (A) holds the button's ID in the DOM.

3. Input: Creates interactive form controls that receive user input. The instance property (A) refers to the value of the placeholder, if one exists, or to the label associated with the input.

4. Toggle: A mechanism that switches between two states, usually on/off, active/inactive, or true/false. The instance property (A) refers to the label associated with the toggle.
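
The four abstractions above might be sketched as small base classes. This is a minimal sketch, not the original listing: the class and field names (`tag`, `id`, `placeholder`, `label`) are assumptions taken from the descriptions, and (A) marks the identifying property of each one.

```typescript
// Hypothetical base abstractions; names and fields are assumptions
// derived from the descriptions in the list above.

/** Drop-down menu from which the user picks a value. */
abstract class Selector {
  readonly tag: string; // (A) identifier of the selector
  protected constructor(tag: string) {
    this.tag = tag;
  }
}

/** Collection of functionally related buttons. */
abstract class ButtonGroup {
  readonly id: string; // (A) ID of the button in the DOM
  protected constructor(id: string) {
    this.id = id;
  }
}

/** Form control that receives user input. */
abstract class Input {
  readonly placeholder: string; // (A) placeholder, or the associated label if there is none
  protected constructor(placeholder: string) {
    this.placeholder = placeholder;
  }
}

/** Switch between two states, such as on/off. */
abstract class Toggle {
  readonly label: string; // (A) label associated with the toggle
  protected constructor(label: string) {
    this.label = label;
  }
}
```

Keeping the constructors `protected` means these classes can only be created through a model that extends them.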

The models, therefore, extend these supermodels to become identifiable and acquire functional meaning within the system. For example:
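
A minimal sketch of such a model, assuming a hypothetical `Toggle` abstraction and two illustrative labels (neither comes from the original listing):

```typescript
// Hypothetical abstraction, repeated here so the sketch is self-contained.
abstract class Toggle {
  readonly label: string;
  protected constructor(label: string) {
    this.label = label;
  }
}

// Hypothetical model: the two labels are illustrative values.
class NotificationToggle extends Toggle {
  static readonly EMAIL = new NotificationToggle("Email notifications"); // (A)
  static readonly SMS = new NotificationToggle("SMS notifications"); // (B)

  // (C) private: no new values can be created outside this class
  private constructor(label: string) {
    super(label);
  }
}

// Usage: tests refer to a named value instead of a raw string.
const chosen: Toggle = NotificationToggle.EMAIL;
console.log(chosen.label); // prints "Email notifications"
```

Note that `private` is checked by the TypeScript compiler only; it does not exist at runtime.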

Where we make sure to declare the possible values (A, B) in a static and immutable way using the 'static readonly' modifier, and we make sure not to allow new values by marking the constructor as 'private' in (C).

3. Specification–Flow–Component Architecture

Tests should be able to evolve as quickly as user interfaces. To achieve this, it's not enough to make them work; they should also be structured for constant change.

To explain this organization, a bottom-up approach will be used:

3.1 Component Layer

It is the foundation of the architecture because it encapsulates the logic of interaction with individual DOM elements. Since it can be reused throughout the system, it should not be aware of any functional context. Its role is strictly technical, as it abstracts away implementation details.

Each class in this layer represents a specific, functional element in the interface, such as a form or a modal.

Therefore, a component can be represented as:
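
A minimal sketch of such a base class follows. Everything in it is an assumption rather than the original listing: `ui` is a recording stand-in for Cypress's global `cy` so the code runs without a browser, and the CSS conventions inside `selectors` are illustrative.

```typescript
// Recording stand-in for Cypress's global `cy`. In a Cypress project, click would be
// cy.get(css).click() and type would be cy.get(css).type(text).
const actions: string[] = [];
const ui = {
  click: (css: string): void => { actions.push(`click ${css}`); },
  type: (css: string, text: string): void => { actions.push(`type ${css} <- ${text}`); },
};

// Hypothetical abstractions from section 2.1, reduced to their identifying property.
abstract class Toggle {
  readonly label: string;
  protected constructor(label: string) { this.label = label; }
}
abstract class ButtonGroup {
  readonly id: string;
  protected constructor(id: string) { this.id = id; }
}
abstract class Input {
  readonly placeholder: string;
  protected constructor(placeholder: string) { this.placeholder = placeholder; }
}

abstract class Component {
  // (A) protected: the class can be extended but not created from outside
  protected constructor() {}

  // (B) selector helpers common to any component (CSS conventions are assumptions)
  static readonly selectors = {
    byId: (id: string): string => `#${id}`,
    byLabel: (label: string): string => `[aria-label="${label}"]`,
    byPlaceholder: (text: string): string => `[placeholder="${text}"]`,
  };

  // (C) change the state of a toggle
  static setToggle(toggle: Toggle): void {
    ui.click(Component.selectors.byLabel(toggle.label));
  }

  // (D) click a button
  static clickButton(button: ButtonGroup): void {
    ui.click(Component.selectors.byId(button.id));
  }

  // (E) fill an input
  static fillInput(input: Input, value: string): void {
    ui.type(Component.selectors.byPlaceholder(input.placeholder), value);
  }
}
```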

Where (A) is declared as 'protected' to allow subclassing. (B) serves as a utility holding the selectors common to any component, such as selecting elements by their identifier, tag, and so on. Finally, context-agnostic methods shared by every component are provided, such as changing state with a toggle in (C), clicking a button in (D), and filling an input in (E).

Finally, we can add a custom component. For example:
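
A minimal sketch of a custom component, under stated assumptions: `FilterForm`, its `#filter-form` root, and the recording `ui` object (a stand-in for Cypress's `cy`) are all hypothetical.

```typescript
// Recording stand-in for Cypress's global `cy`, so the sketch runs without a browser.
const actions: string[] = [];
const ui = {
  type: (css: string, text: string): void => { actions.push(`type ${css} <- ${text}`); },
};

// Base class reduced to its constructor for brevity.
abstract class Component {
  protected constructor() {}
}

// Hypothetical custom component whose root element is assumed to be #filter-form.
class FilterForm extends Component {
  // (A) private: the component holds no state and is never instantiated
  private constructor() {
    super();
  }

  // (B) element container: selectors resolved inside this component's visual context
  static readonly elements = {
    getInputByLabel: (label: string): string => `#filter-form input[aria-label="${label}"]`,
  };

  static fill(label: string, value: string): void {
    ui.type(FilterForm.elements.getInputByLabel(label), value);
  }
}

// Usage: callers never touch a CSS selector directly.
FilterForm.fill("Owner", "qa-team");
```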

Where, since a selector should not have state (as it is only a means of communication with the UI), the constructor (A) is marked as 'private'. In addition, (B) is added as an element container, for example, 'getInputByLabel(label: string)' in the visual context of the component.

3.1.1 SidePanel

We will evaluate a SidePanel component within this internal interface. For this, it is necessary to first represent the models described in point 2. If the component looks like:

**Figure 2:** Side Panel Component in UI. *Source: Own elaboration.*

A selector can be identified by its associated tag, so the template should be able to include all possible values related to that element. Why isn't it necessary to specify even more at this level? Generally, the same style rules apply to each possible value within this HTML element.

So, the mapping for the input:

Would be:

Where we identify inputs by the placeholder associated with them.

As for the Selector:

Would be:

Where we identify each value by the tag associated with it.

As for the toggle mapping:

Would be:

Where we identify them by the label associated with the toggle.

As for the Button Group:

Would be:
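
Taken together, the four mappings above might be sketched as follows. Since the figure is not reproduced here, every value (placeholders, tags, labels, IDs) is illustrative, and the abstractions are the hypothetical base classes from section 2.1.

```typescript
// Hypothetical abstractions, repeated so the sketch is self-contained.
abstract class Input {
  readonly placeholder: string;
  protected constructor(placeholder: string) { this.placeholder = placeholder; }
}
abstract class Selector {
  readonly tag: string;
  protected constructor(tag: string) { this.tag = tag; }
}
abstract class Toggle {
  readonly label: string;
  protected constructor(label: string) { this.label = label; }
}
abstract class ButtonGroup {
  readonly id: string;
  protected constructor(id: string) { this.id = id; }
}

// Inputs are identified by their placeholder.
class SidePanelInput extends Input {
  static readonly REPORT_NAME = new SidePanelInput("Report name");
  private constructor(placeholder: string) { super(placeholder); }
}

// Selectors are identified by their associated tag.
class SidePanelSelector extends Selector {
  static readonly PERIOD = new SidePanelSelector("Period");
  static readonly FORMAT = new SidePanelSelector("Format");
  private constructor(tag: string) { super(tag); }
}

// Toggles are identified by their associated label.
class SidePanelToggle extends Toggle {
  static readonly INCLUDE_CHARTS = new SidePanelToggle("Include charts");
  private constructor(label: string) { super(label); }
}

// Buttons are identified by their ID in the DOM.
class SidePanelButton extends ButtonGroup {
  static readonly GENERATE = new SidePanelButton("generate-report");
  static readonly RESET = new SidePanelButton("reset-filters");
  private constructor(id: string) { super(id); }
}
```

The `SidePanel` prefix follows the naming practice from Figure 1 (C1), which avoids clashes with models of other components.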

Since we have the Component’s models declared, we can represent this component as:
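
A minimal sketch of this component, with several assumptions: `ui` is a recording stand-in for Cypress's `cy`, the `#side-panel` root and the model values are illustrative, the `Component` base class is omitted for brevity, and the assignment of (B), (C), and (D) to specific methods is a guess.

```typescript
// Recording stand-in for Cypress's global `cy` (cy.get(css).type / .select / .click).
const actions: string[] = [];
const ui = {
  type: (css: string, text: string): void => { actions.push(`type ${css} <- ${text}`); },
  select: (css: string, option: string): void => { actions.push(`select ${css} <- ${option}`); },
  click: (css: string): void => { actions.push(`click ${css}`); },
};

// Hypothetical abstractions and models (illustrative values).
abstract class Input {
  readonly placeholder: string;
  protected constructor(placeholder: string) { this.placeholder = placeholder; }
}
abstract class Selector {
  readonly tag: string;
  protected constructor(tag: string) { this.tag = tag; }
}
abstract class ButtonGroup {
  readonly id: string;
  protected constructor(id: string) { this.id = id; }
}
class SidePanelInput extends Input {
  static readonly REPORT_NAME = new SidePanelInput("Report name");
  private constructor(placeholder: string) { super(placeholder); }
}
class SidePanelSelector extends Selector {
  static readonly PERIOD = new SidePanelSelector("Period");
  private constructor(tag: string) { super(tag); }
}
class SidePanelButton extends ButtonGroup {
  static readonly GENERATE = new SidePanelButton("generate-report");
  private constructor(id: string) { super(id); }
}

class SidePanel {
  private constructor() {}

  // (A) selectors derived from model properties, never from fixed values
  static readonly elements = {
    input: (model: Input): string => `#side-panel input[placeholder="${model.placeholder}"]`,
    selector: (model: Selector): string => `#side-panel select[aria-label="${model.tag}"]`,
    button: (model: ButtonGroup): string => `#side-panel #${model.id}`,
  };

  // (B) fill an input
  static fillInput(model: Input, value: string): void {
    ui.type(SidePanel.elements.input(model), value);
  }

  // (C) pick an option in a selector
  static selectOption(model: Selector, option: string): void {
    ui.select(SidePanel.elements.selector(model), option);
  }

  // (D) click a button
  static clickButton(model: ButtonGroup): void {
    ui.click(SidePanel.elements.button(model));
  }
}

SidePanel.fillInput(SidePanelInput.REPORT_NAME, "Q3 revenue");
SidePanel.selectOption(SidePanelSelector.PERIOD, "Last quarter");
SidePanel.clickButton(SidePanelButton.GENERATE);
```

Because the methods accept the abstractions rather than concrete models, a change to the panel's markup only touches `elements`.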

The SidePanel component is a static facade that centralizes the interaction logic with that component. It defines selectors within 'elements' in (A), accessing the DOM through properties of the abstractions and not fixed values. Therefore, generalities can be standardized.

(B, C, D) methods operate on these reusable abstractions, decoupling the technical code from the functional flow. This reduces duplication, improves maintainability, and makes tests more resilient to changes in UI structure.

3.1.2 ReportDashboard

**Figure 3:** Dashboard Component in UI. *Source: Own elaboration.*

Since there is only one input area, the only Model would be:

Where we identify inputs by the placeholder associated with them.

So, the Component class would be represented as:
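
A minimal sketch of the model and its component, with the usual caveats: the placeholder text, the `#report-dashboard` root, and the recording `ui` object (a stand-in for Cypress's `cy`) are assumptions, as is splitting the search into a fill step and a submit step.

```typescript
// Recording stand-in for Cypress's global `cy`, so the sketch runs without a browser.
const actions: string[] = [];
const ui = {
  type: (css: string, text: string): void => { actions.push(`type ${css} <- ${text}`); },
};

// Hypothetical abstraction, repeated so the sketch is self-contained.
abstract class Input {
  readonly placeholder: string;
  protected constructor(placeholder: string) { this.placeholder = placeholder; }
}

// The only model of this component; the placeholder text is an illustrative value.
class ReportDashboardInput extends Input {
  static readonly SEARCH = new ReportDashboardInput("Search reports");
  private constructor(placeholder: string) { super(placeholder); }
}

class ReportDashboard {
  private constructor() {}

  static readonly elements = {
    input: (model: Input): string =>
      `#report-dashboard input[placeholder="${model.placeholder}"]`,
  };

  // (A) search logic: type the query into the search input...
  static fillSearch(query: string): void {
    ui.type(ReportDashboard.elements.input(ReportDashboardInput.SEARCH), query);
  }

  // ...then confirm it. "{enter}" is the key sequence Cypress's .type() uses for Enter.
  static submitSearch(): void {
    ui.type(ReportDashboard.elements.input(ReportDashboardInput.SEARCH), "{enter}");
  }
}
```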

Like any Component, it works as a facade, where (A) holds the search logic.

3.2 Flow Layer

It represents the functional level of the architecture. Its purpose is to compose complete user flows, such as applying filters, creating a report, or exporting results, reusing only logic previously defined in the components. It should not interact directly with the DOM, keeping the tests clean and expressive.

In the context of web interfaces, these functional flows are grouped in Pages. Each represents a section or module of the application (such as Reporting, Dashboard, or Auth) and exposes methods that describe user tasks consistent with the business logic.

A Page can be represented as:
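
A minimal sketch of a base Page, assuming a recording `ui.visit` in place of Cypress's `cy.visit(url)`; `DashboardPage` and the example URLs are hypothetical and exist only to exercise the base class.

```typescript
// Recording stand-in for cy.visit(url), so the sketch runs without a browser.
const actions: string[] = [];
const ui = {
  visit: (url: string): void => { actions.push(`visit ${url}`); },
};

abstract class Page {
  protected readonly baseUrl: string;
  protected readonly path: string;

  // (A) receives the module path within the application, plus the environment's base URL
  protected constructor(baseUrl: string, path: string) {
    this.baseUrl = baseUrl;
    this.path = path;
  }

  // (B) a common method: navigate to the module
  visit(): void {
    ui.visit(`${this.baseUrl}${this.path}`);
  }
}

// Hypothetical page used only to exercise the base class.
class DashboardPage extends Page {
  constructor(baseUrl: string) {
    super(baseUrl, "/dashboard");
  }
}

// The same page can point at different environments.
new DashboardPage("https://qa.example.test").visit();
new DashboardPage("https://dev.example.test").visit();
```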

Where (A) receives the module's path within the application and (B) implements, for example, a common method.

In this case, we will evaluate an internal module called Reporting. Specifically, we will add flow (F1), which configures a report, and flow (F2), which searches for it. Together they involve redirection to the module interface, configuration, generation, and searching. Therefore, the Reporting page would be represented as:
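
A minimal sketch of this page follows. To keep it short and runnable without Cypress, the Component layer is replaced by stand-ins that only record calls, and the models are reduced to constants; the DTO fields, method names, and values are all assumptions.

```typescript
// Recorded calls stand in for real browser actions.
const actions: string[] = [];

// Component-layer stand-ins (hypothetical, reduced to what the two flows need).
const SidePanel = {
  fillInput: (model: { placeholder: string }, value: string): void => {
    actions.push(`SP.fillInput ${model.placeholder} <- ${value}`);
  },
  setToggle: (model: { label: string }): void => { actions.push(`SP.setToggle ${model.label}`); },
  clickButton: (model: { id: string }): void => { actions.push(`SP.clickButton ${model.id}`); },
};
const ReportDashboard = {
  fillSearch: (query: string): void => { actions.push(`RD.fillSearch ${query}`); },
  submitSearch: (): void => { actions.push("RD.submitSearch"); },
};

// Models reduced to constants for brevity (illustrative values).
const SidePanelInput = { REPORT_NAME: { placeholder: "Report name" } };
const SidePanelToggle = { INCLUDE_CHARTS: { label: "Include charts" } };
const SidePanelButton = { GENERATE: { id: "generate-report" } };

// (A) parameters of flow F1
interface ReportConfig {
  name: string;
  includeCharts: boolean;
}

// (S) parameters of flow F2
interface ReportSearch {
  query: string;
}

class ReportingPage {
  // (B) entry path of the reporting module
  private static readonly PATH = "/reporting";
  private readonly baseUrl: string;

  constructor(baseUrl: string) {
    this.baseUrl = baseUrl;
  }

  private visit(): void {
    actions.push(`visit ${this.baseUrl}${ReportingPage.PATH}`);
  }

  // (C) flow F1: configure and generate a report through SidePanel (SP)
  configureReport(config: ReportConfig): void {
    this.visit();
    SidePanel.fillInput(SidePanelInput.REPORT_NAME, config.name);
    if (config.includeCharts) {
      SidePanel.setToggle(SidePanelToggle.INCLUDE_CHARTS);
    }
    SidePanel.clickButton(SidePanelButton.GENERATE); // (D) trigger generation
  }

  // (J) flow F2: search for a report through ReportDashboard (RD)
  searchReport(search: ReportSearch): void {
    this.visit();
    ReportDashboard.fillSearch(search.query);
    ReportDashboard.submitSearch(); // (K) start the search
  }
}
```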

ReportingPage represents a functional page in the system. It groups a series of flows consistent with the business. (A) centralizes all the parameters needed to execute the flow (F1) and (S) does the same for flow (F2), allowing the complete decoupling of the functional intent from the implementation details. (B) defines the entry path to the reporting module, enabling automatic navigation. Within method (C), it redirects to the view and orchestrates SidePanel (SP) calls using only model properties, without direct selection logic. Finally, in (D) the generation is triggered by delegating that logic to (SP) as well.

As for the search, method (J) first redirects to the view, then receives the search parameters and fills in the filter form using ReportDashboard (RD), and finally starts the search in (K).

By representing business intentions declaratively, duplication is reduced and system evolution is facilitated without compromising test stability.

Although a Page should not have state, it is useful to be able to instantiate multiple instances when testing QA and DEVELOPMENT environments in parallel. For this reason, the constructor accepts a baseUrl, which allows creating the same Page pointing to different environments without changing the flow logic.

3.3 Specification Layer

It represents the highest point of this architecture and is responsible for defining complete functional test scenarios. Unlike the previous layers, here the use cases are organized from a business perspective, invoking flows previously defined in the Flow Layer.

In this approach, validation is not limited solely to explicit assertions at the end of the test but also allows error detection at any stage of the process. For instance, if an inconsistency or unexpected behavior occurs while applying filters or exporting a report, the process immediately throws an exception, so the error can be identified and traceability improves. This dual strategy provides flexibility and robustness: on one hand, distributed validations within the functional flows ensure strong contracts; on the other, specific assertions can be added when more control over what is being evaluated is needed.
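
A minimal sketch of such a scenario and of the dual validation strategy. The `ReportingPage` here is an in-memory stand-in for the Flow layer, and the scenario is written as a plain function so it runs outside a test runner; in a Cypress spec its body would sit inside `describe(...)` and `it(...)`. All names and values are assumptions.

```typescript
interface ReportConfig {
  name: string;
  includeCharts: boolean;
}
interface ReportSearch {
  query: string;
}

// In-memory stand-in for the Flow layer: it remembers which reports were created.
class ReportingPage {
  readonly baseUrl: string;
  private readonly created: string[] = [];
  private results: string[] = [];

  constructor(baseUrl: string) {
    this.baseUrl = baseUrl;
  }

  configureReport(config: ReportConfig): void {
    // Distributed validation: the flow fails fast when its contract is broken.
    if (config.name.trim() === "") {
      throw new Error("Report name must not be empty");
    }
    this.created.push(config.name);
  }

  searchReport(search: ReportSearch): void {
    this.results = this.created.filter((name) => name.indexOf(search.query) !== -1);
  }

  visibleReports(): string[] {
    return this.results.slice();
  }
}

// (A) scenario expressed from the business perspective
function userCreatesAndFindsReport(): void {
  const page = new ReportingPage("https://qa.example.test"); // (B) QA environment
  page.configureReport({ name: "Monthly sales", includeCharts: true }); // (C) flow F1
  page.searchReport({ query: "Monthly" }); // (D) flow F2

  // (E) explicit assertion on the outcome
  if (page.visibleReports().indexOf("Monthly sales") === -1) {
    throw new Error("Expected the created report to be listed");
  }
}

userCreatesAndFindsReport();
```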

(A) defines the functional scenario from the business perspective, clearly expressing what is tested without exposing how it is implemented internally. Then, (B) instantiates the Reporting Page configured for a specific QA environment. (C) executes the complete configuration flow (F1) according to the exposed method signature, in this case a strongly typed DTO. The search flow (F2) is executed in (D) in the same way. Finally, (E) triggers an assertion to verify that the report was created successfully.

This approach keeps the Specification Layer clean, focused, and highly expressive: it describes concrete business behaviors without being polluted with technical logic or implementation details. The result is a test suite that is more readable, maintainable, and resilient to structural changes in the UI.

4. Conclusion

This proposal presents a practical extension to LTA by incorporating reusable abstractions for common DOM elements and a declarative model system. These additions centralize interaction logic, reduce code duplication, and standardize behaviors, thus addressing typical limitations such as excessive coupling, low maintainability, and scattered selectors.

The architecture, which includes the Component, Flow, and Specification layers, was designed to be scalable, expressive, and adaptable. Its implementation in internal projects at Ensolvers—especially in complex modules like Reporting, Dashboard, and Settings—enabled the creation of reusable flows, the decoupling of configurations, and the extension of exception handling to a more distributed system, enhancing error traceability.

As a result, the tests not only verify the correct behavior of the system but also do so in a structured, robust way that is aligned with the business logic. This, in turn, enables the QA team to focus on finding bugs in truly complex scenarios instead of wasting time on trivial tasks.
