The Best Way to Perform Web Application UI Test Automation

Suppose you have a Web application of some kind and you want to write test automation which exercises the app through its GUI. There are several approaches you can take.
First, you can buy a commercial record-playback tool. A record-playback tool captures a user’s manual manipulation of a Web application running in a browser and records the user’s actions as a script of some kind. The script can be manually edited if necessary, and then re-run against new versions of the Web application under test to verify that new functionality has not broken existing behavior. These tools are generally easy to use and can be effective for regression testing of relatively simple scenarios. However, they are expensive, browser-dependent, and relatively sensitive to changes in the GUI of the Web app under test.
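The replay half of a record-playback tool can be pictured with a small sketch. This is not the format of any particular commercial tool; the step format, the element ids ("userName", "greetButton", "greeting"), and the fake page are all hypothetical, chosen only so the sketch runs without a browser.

```javascript
// Hypothetical recorded script: each step is one captured user action.
var recordedScript = [
  { action: "type", target: "userName", value: "alice" },
  { action: "click", target: "greetButton" },
  { action: "verify", target: "greeting", value: "Hello, alice" }
];

// Replays the steps against a document-like object and returns a list of
// failed verification steps (an empty array means the run passed).
function replay(script, doc) {
  var failures = [];
  script.forEach(function (step, index) {
    var element = doc.getElementById(step.target);
    if (step.action === "type") {
      element.value = step.value;
    } else if (step.action === "click") {
      element.click();
    } else if (step.action === "verify" && element.value !== step.value) {
      failures.push("step " + index + ": expected '" + step.value +
                    "' but found '" + element.value + "'");
    }
  });
  return failures;
}

// Stand-in for a browser document (an assumption made for this sketch).
function makeFakePage() {
  var elements = {
    userName: { value: "" },
    greeting: { value: "" },
    greetButton: {
      click: function () {
        elements.greeting.value = "Hello, " + elements.userName.value;
      }
    }
  };
  return { getElementById: function (id) { return elements[id]; } };
}

console.log(replay(recordedScript, makeFakePage()).length === 0 ? "Pass" : "Fail");
```

Because every step refers to an element by id, renaming a control in the app breaks the script, which is one way to see why recorded scripts are sensitive to GUI changes.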
Second, you can use a JavaScript approach. A JavaScript-based solution involves creating a lightweight Web page harness which consists of two HTML frames. One frame of the harness holds the Web application under test. The second frame holds JavaScript code which loads the application into the first frame, exercises it through a series of state changes by simulating user actions such as button clicks through simple JavaScript wrappers around the browser’s Document Object Model, and then compares the final state of the application with an expected state to determine a pass-fail result for the test scenario. This approach is simple, effective, and relatively browser-agnostic. However, JavaScript does not have any built-in debugging capabilities and does not fully support object-oriented programming, which makes JavaScript tests somewhat difficult to create.
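As a rough illustration of the script frame's job, here is a minimal sketch of one test scenario. The calculator-style page and its element ids ("num1", "num2", "addButton", "result") are hypothetical, and a fake document stands in for the application frame so the sketch runs outside a browser.

```javascript
// Exercises a hypothetical "add two numbers" page through its DOM and
// reports "Pass" or "Fail". appDoc is the document of the app under test.
function runAddScenario(appDoc) {
  // Simulate the user typing into two text fields.
  appDoc.getElementById("num1").value = "2";
  appDoc.getElementById("num2").value = "3";

  // Simulate a button click.
  appDoc.getElementById("addButton").click();

  // Compare the final state with the expected state.
  var actual = appDoc.getElementById("result").value;
  return actual === "5" ? "Pass" : "Fail";
}

// Stand-in for the application frame's document (an assumption made so
// that the sketch is runnable without a browser).
function makeFakeDocument() {
  var elements = {
    num1: { value: "" },
    num2: { value: "" },
    result: { value: "" },
    addButton: {
      click: function () {
        var sum = Number(elements.num1.value) + Number(elements.num2.value);
        elements.result.value = String(sum);
      }
    }
  };
  return { getElementById: function (id) { return elements[id]; } };
}

console.log(runAddScenario(makeFakeDocument())); // prints "Pass"
```

In a real two-frame harness, the script frame would pass the application frame's document instead of the fake one, for example something like `parent.frames[0].document`, once that frame has finished loading. That only works when the harness and the application are served from the same origin.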
Third, you can use a UI automation API approach. A UI automation API-based solution involves writing code which calls directly into the low-level functions exposed by a browser’s Document Object Model. For Internet Explorer, these function calls can be made directly, typically by using the C++ language, or indirectly, by using wrapper code such as the C# language P/Invoke mechanism. For Firefox, you can use XPCOM calls. The API approach gives you maximum control over the browser DOM, but it is the most difficult to write and is browser-dependent.
Fourth, you can use an open source framework such as WATIR. These frameworks are scripting-language wrappers (Ruby in the case of WATIR) around browser DOM API sets. They are relatively easy to use, but they add an external dependency to your test effort.
There are several variations on these four approaches too. So, which is the best approach? There isn’t a single best approach. Each technique has pros and cons depending upon your particular testing scenario. The point is that you need to be aware of these approaches so you can decide which one fits your scenario.