Selenium
Selenium automates web browsers. It is most famous for enabling rapid, repeatable web-app testing, which allows developers to ship new releases faster and with confidence.
Countless hours are spent testing a web app to make sure it’s functional both inside and outside the local development environment. Before Selenium, this testing fell to a host of manual testers, enacting and reenacting hundreds of test case scenarios on all benchmarked browsers, flagging what broke and trying to pinpoint the source of that breakage.
Depending on the size of the manual testing team, an end-to-end system test could take anywhere from days to weeks to run its course.
Today’s development methodologies work in significantly shorter time frames of two to four weeks. Shipping new, bug-free releases in that time requires deterministic, repeatable testing that provides near-instant feedback. That’s why Selenium testing is integral to development today.
Here’s a deeper look at Selenium automation testing, how the toolset that enables it came to be, and where its usage fits within fast-paced development pipelines that are common today.
What is Selenium?
Selenium is an open-source tool that automates web browsers. It provides a single interface that lets you write test scripts in programming languages like Ruby, Java, NodeJS, PHP, Perl, Python, and C#, among others.
Note: To get started with the latest version of Selenium (the WebDriver-based implementation), you only need a single Selenium JAR file (selenium-server-standalone-{version}.jar) to run tests both locally and on remote devices. This JAR file contains the W3C-standard WebDriver API and Selenium Grid, along with Selenium Server (for existing users of the deprecated Selenium RC implementation).
Here’s how those components work:
Selenium WebDriver
Also known as Selenium 2.0, WebDriver executes test scripts through browser-specific drivers. It consists of:
API
Application Programming Interface. Translates the test scripts you write in Ruby, Java, Python, or C# into WebDriver commands, through language bindings.
Library
Houses the API and language-specific bindings. Although plenty of third-party bindings exist to support different programming languages, the core client-side bindings supported by the main project are: Selenium Java (as selenium jar files), Selenium Ruby, Selenium dotnet (or Selenium C#, available as .dll files), Selenium Python, and Selenium JavaScript (Node).
Driver
Executable module that opens up a browser instance and runs the test script. Drivers are browser-specific: for instance, Google develops and maintains ChromeDriver so that Selenium can automate Chromium/Chrome.
Framework
Support libraries for integration with natural or programming language test frameworks, like Selenium with Cucumber or Selenium with TestNG.
How it Works: The WebDriver protocol has a local end (‘client’) which sends the commands (test script) to a browser-specific driver. The driver executes these commands on its browser-instance. So, if the test script calls for execution on Chrome and Firefox, the ChromeDriver will execute the test on Chrome; the GeckoDriver will do the same on Firefox.
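To make the client/driver split concrete, here is a minimal sketch of the HTTP requests a client might send to each driver. The endpoints follow the W3C WebDriver specification; the driver addresses (the default local ports of ChromeDriver and GeckoDriver), the `SESSION_ID` placeholder, and the `input[name=email]` selector are assumptions for illustration. The code only builds and prints the requests; it does not contact a driver.

```python
import json

# Assumed addresses: the default local ports of ChromeDriver and GeckoDriver.
DRIVER_URLS = {
    "chrome": "http://localhost:9515",
    "firefox": "http://localhost:4444",
}


def commands_for(browser_name, session_id="SESSION_ID"):
    """Return one small test as (method, path, body) WebDriver commands.

    A real driver returns the session id in its New Session response;
    "SESSION_ID" is only a placeholder here.
    """
    return [
        # New Session: ask the driver to start a browser instance.
        ("POST", "/session",
         {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}),
        # Navigate To: open the page under test.
        ("POST", f"/session/{session_id}/url",
         {"url": "https://www.example.com/signup"}),
        # Find Element: locate a (hypothetical) email field by CSS selector.
        ("POST", f"/session/{session_id}/element",
         {"using": "css selector", "value": "input[name=email]"}),
        # Delete Session: close the browser instance.
        ("DELETE", f"/session/{session_id}", None),
    ]


def render(browser_name):
    """Format each command as the request sent to that browser's driver."""
    base = DRIVER_URLS[browser_name]
    lines = []
    for method, path, body in commands_for(browser_name):
        payload = "" if body is None else " " + json.dumps(body)
        lines.append(f"{method} {base}{path}{payload}")
    return lines


for browser in DRIVER_URLS:
    print("\n".join(render(browser)))
```

The same list of commands goes to both drivers; only the target address and the requested `browserName` differ, which is why one test script can run on Chrome and Firefox alike.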
Note: Test scripts execute only when the WebDriver’s client and browser/driver are connected. They don’t have to be on the same device. To enable test execution on multiple remote drivers, you need RemoteWebDriver and the Grid.
Selenium Grid
The Grid can minimize test runtime—by executing multiple test scripts on any number of remote devices at once. This is called parallel testing.
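The effect of parallel testing can be sketched without a Grid at all. In the sketch below, `run_test` is a hypothetical stand-in that sleeps instead of driving a browser, and the browser list is an assumption; a real suite would open a remote browser session inside `run_test`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical browsers to cover; a real list would come from your test plan.
BROWSERS = ["chrome", "firefox", "safari", "edge"]


def run_test(browser):
    """Stand-in for a real browser test: sleeps to simulate the work."""
    time.sleep(0.1)
    return (browser, "passed")


def run_sequentially(browsers):
    """Run one test after another, as a single tester on one machine would."""
    return [run_test(browser) for browser in browsers]


def run_in_parallel(browsers):
    """Run all tests at once, one worker per browser."""
    with ThreadPoolExecutor(max_workers=len(browsers)) as pool:
        # map() returns results in the same order as the input list.
        return list(pool.map(run_test, browsers))


def timed(function, browsers):
    """Return (results, elapsed_seconds) for one way of running the tests."""
    start = time.perf_counter()
    results = function(browsers)
    return results, time.perf_counter() - start
```

Calling `timed(run_sequentially, BROWSERS)` and `timed(run_in_parallel, BROWSERS)` should return the same results, with the parallel run finishing in roughly the time of a single test rather than the sum of all of them.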
Selenium Grid is a smart server that routes test commands to browser instances on remote devices. The two main components needed for this (other than the test script from client-side/tester) are:
The ‘Hub’ (server):
Accepts access requests from WebDriver client. Routes JSON test commands to remote drivers on registered ‘nodes’.
‘Node’ (remote device):
Contains a native OS, browsers, and RemoteWebDriver.
How it works: The WebDriver client executes the test on a remote device through RemoteWebDriver. RemoteWebDriver is like your regular WebDriver, except its two components are the Client (your test script) and Server (a Java servlet that actually executes the test on the remote device).
In your test script, you define ‘desired capabilities’ (device, platform, browser, etc.) of the node where the test will execute. The Hub receives this script, runs through the registered nodes to find one that matches the desired capabilities, and assigns the test to it for execution.
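The matching step can be pictured with a small sketch. This is a simplified illustration of the idea, not the Grid's actual implementation: the node names, the `busy` flag, and the exact-equality matching rule are all assumptions. The capability keys `browserName` and `platformName` are standard WebDriver capability names.

```python
def matches(node_capabilities, desired):
    """A node matches when it offers every desired capability value."""
    return all(node_capabilities.get(key) == value
               for key, value in desired.items())


def assign_node(nodes, desired):
    """Return the name of the first free matching node, or None."""
    for node in nodes:
        if not node["busy"] and matches(node["capabilities"], desired):
            node["busy"] = True  # reserve the node for this test
            return node["name"]
    return None


# Hypothetical registered nodes.
NODES = [
    {"name": "node-1", "busy": False,
     "capabilities": {"browserName": "chrome", "platformName": "windows"}},
    {"name": "node-2", "busy": False,
     "capabilities": {"browserName": "firefox", "platformName": "linux"}},
]
```

With these nodes, a request for `{"browserName": "firefox"}` is assigned to `node-2`, while a request for Chrome on Linux finds no match and would have to wait or be rejected.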
Note: Setting up the Grid is pretty easy, but scaling, configuring, and maintaining its integrity can take up a lot of resources. Make sure to adopt it after careful consideration.
Selenium IDE
Selenium IDE is a Chrome and Firefox plugin that can log ‘natural’ interactions in the browser and generate the corresponding code in programming languages like C#, Java, Python, and Ruby, as well as Selenese (Selenium’s own scripting language).
Testers can enable ‘recording’ within the IDE and ‘play out’ the test scenario on the browser. The IDE can then replay those interactions and highlight any errors (during replay) in red.
Keep in mind that while the plugin is quick and helpful, the code generated is generally too messy to be used in automation test scripts. So use it for rapid prototyping, but for more serious cross browser testing, we recommend Selenium WebDriver.
Selenium: A History
A timeline of major events in the evolution of Selenium from an in-house side-project to an open-source industry standard in browser automation:
2004: Making history in two parts (from Selenium A to B)
- Jason Huggins of ThoughtWorks needs to test his web app’s front-end behavior across different browsers.
- He develops a tool that works by injecting JavaScript underneath the webpage, allowing the tester to write code that can ‘automate’ front-end user interactions. This becomes the JavaScript TestRunner.
- Although the JS-injection approach can’t naturally replicate user interactions (via keystrokes/mouse movements), it is a workaround for the ‘same-origin policy’, which prohibits JavaScript code from accessing elements of a page served from a different origin. The tool is positively received by in-house developers and ThoughtWorks’ clients alike.
- The tool is open sourced due to popular demand.
- To eliminate the need for JS-injections, Huggins and his colleague Paul Hammant discuss the possibility of a ‘server’ component. This server would act as an HTTP proxy and trick the browser-instance into believing that the test script and the web app under test are from the same source.
- They develop the server component in Java and the original client-side driver (TestRunner) gets ported to Ruby.
- This is the original Selenium. Known as Driven Selenium or Selenium B in the evolution timeline.
2005: Selenium RC (Remote Control)
- Elsewhere (at BEA, specifically), Dan Fabulich and Nelson Sproul begin working on the driver code. They eventually mold it into a standalone server that bundles Mort Bay’s Jetty as the HTTP proxy.
- This becomes ‘Selenium RC (Remote Control)’ or Selenium 1.0. Before we cut to 2.0, there is another significant development in the form of…
2006: The Selenium IDE
- Shinya Kasatani wraps the Selenium driver code in an IDE module in the Firefox browser.
- When it works, he finds that he can run a functional ‘live test’ on a website: interacting with the browser (as a user would), recording and replaying the interactions, and debugging as needed.
- Kasatani donates this tool to the Selenium project, where it becomes known as the Selenium IDE.
2007: The Selenium WebDriver (Selenium 2.0)
- Back at ThoughtWorks, Simon Stewart diligently codes up separate ‘driver’ clients for every popular browser, so they’d all support automation with native browser capabilities.
- It pays off. The project becomes famous as the WebDriver.
2008: Multiply by ‘n’: The Selenium Grid
- At ThoughtWorks, Philippe Hanrigou creates a server which would allow testers to access and run tests on browser instances on any number of remote devices.
- This becomes known as the Grid. Cut to…
2016: Selenium 3.0
Selenium RC gets deprecated and WebDriver becomes the standard implementation, aka Selenium 3.0.
2018: W3C Protocol
WebDriver becomes a W3C standard protocol (a W3C Recommendation).
2021: Selenium 4 released
On October 13, 2021, Selenium 4.0 was officially released.
Why do I need Selenium Automation Testing?
Imagine that a manual tester has this scenario: Checking whether the web app’s signup page (www.example.com/signup) validates input strings and registers a user successfully in the latest versions of Chrome and Firefox, on Windows 7.
Assume that the signup page has these input fields: username, email address, and password. The tester will get a Windows 7 desktop and follow these steps, consecutively, on the latest versions of Chrome and Firefox:
- Enter the URL in the address bar (www.example.com/signup)
- Enter an invalid string in each input field (email, username, and password)
- Check whether the input strings were validated against corresponding regexes and any pre-existing values in the database
- Enter ‘valid’ strings in each input field; click Sign Up
- Check whether the “Welcome, {username}” page showed up
- Check whether the system database created a new userID for {username}
- Mark the test ‘passed’ if it did, ‘failed’ if the signup feature broke anywhere during the test.
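The ‘valid input’ part of these steps could be automated along the following lines. This is a sketch under assumptions: the field names (`username`, `email`, `password`) and the `signup-button` id are hypothetical, and the database check is left out because it happens outside the browser. The function only relies on `get`, `find_element`, `send_keys`, `click`, and `page_source`, which Selenium's Python WebDriver objects provide.

```python
def check_signup(driver, base_url, username, email, password):
    """Fill in the signup form and report whether the welcome page appears.

    `driver` is expected to behave like a Selenium WebDriver.
    Returns True for 'passed' and False for 'failed'.
    """
    # Step 1: open the signup page.
    driver.get(base_url + "/signup")

    # Step 2: type a value into each input field (hypothetical field names).
    fields = (("username", username), ("email", email), ("password", password))
    for field_name, value in fields:
        driver.find_element("name", field_name).send_keys(value)

    # Step 3: submit the form (hypothetical button id).
    driver.find_element("id", "signup-button").click()

    # Step 4: the test passes if the welcome message is on the page.
    return f"Welcome, {username}" in driver.page_source


# Usage with a real browser (needs the selenium package and a local Chrome):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   try:
#       print(check_signup(driver, "https://www.example.com",
#                          "alice", "alice@example.com", "s3cret!"))
#   finally:
#       driver.quit()
```

Because the function takes the driver as a parameter, the same check can be repeated on Firefox by passing a different driver. A production test would likely also add an explicit wait before reading the page, since the welcome page may not have loaded the instant the click returns.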
That’s a very basic system test. In the real world, testers are more likely to be checking all user workflows on www.example.com for breakage, on as many OS-browser combinations as needed to meet the benchmarked compatibility standards.
Depending on the number of manual testers (and thoroughness of test cases), it may take anywhere from hours to weeks to be sure that the web app is fully functional.
Modern developers and product teams don’t have that kind of time to allot for testing, but they can’t skip exhaustive testing in the rush to release either. This is why they supercharge their testing with automation, powered by Selenium.
How Selenium Testing Boosts Agile Development
What is Agile?
Agile is a development methodology. It starts with the simplest working version of the product design—one that can be continuously improved.
Here’s what a typical Agile workflow looks like:
- Stakeholders agree upon the ‘simplest working’ design of the product.
- The design gets divided into smaller modules.
- Each module is assigned to a cross-functional team of developers, designers, and Quality Assurance personnel.
- Teams work in sprints to create their modules within a time-frame (‘iteration’)—a window of one to four weeks.
- At the end of each iteration, finished modules are put together. Tests are run and a functional product (with minimum bugs) is demonstrated to the stakeholders.
- The stakeholders evaluate project priorities, add customer feedback, and adapt as needed.
The whole cycle begins again with the next iteration and a new set of modules. A ‘market-ready’ product or a new feature will always need multiple iterations.
Where test automation comes in:
- QAs are involved from early stages to run a series of unit and acceptance tests on modules.
- Integration tests on every iteration ensure that separately coded modules don’t break when put together.
- Each new iteration requires regression tests (so it doesn’t break the previous working iteration).
It’s essential to keep track of code as well as test cases, so all iterations are well documented. While we’re on the subject, you should note that this recurrent testing is a theme in any sub-category of rapid, iterative development based on Agile, like CI/CD.
How Selenium Testing is Integral to Continuous Integration/Delivery (CI/CD)
What is CI/CD?
Continuous Integration/Delivery prioritizes delivery of new releases of a build, frequently and quickly. A project that’s launched remains open to continuous iterations (like Agile).
The only difference is this: the project also remains ready to be shipped at all times (instead of waiting for iterations to run their course).
A CI/CD pipeline looks like this:
- A developer has code they want to integrate into the project.
- An external CI server does an ‘integration’ test—it grabs the source files and attempts to do a build with the new code.
- If the build completes successfully, the server packages the changes with source files. If not, the server notifies members of the team.
CI engines (like Jenkins or Bamboo) have dashboards that display current and previous builds, logs of previous check-ins and their status (successful/failed), what broke (and when), etc. Everyone remains informed about any change in code, infrastructure, or configuration. This ensures that deployment failures are caught (and fixed) early.
Note: There’s a difference between a ‘successful build’ and ‘quality build’. Even if a new integration is successful, it’s not considered ready to ship until it has passed a series of tests by QA engineers. That’s where automation testing with Selenium comes in handy.
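The distinction between a ‘successful build’ and a ‘quality build’ can be expressed as a small gate. This is an illustrative sketch, not how any particular CI server works: the function name, the status strings, and the test names are assumptions.

```python
def check_in_status(build_succeeded, test_results):
    """Classify a check-in from its build outcome and test results.

    `test_results` maps a test name to True (passed) or False (failed).
    """
    if not build_succeeded:
        return "build failed: notify the team"
    if not test_results:
        # A build that compiled but was never tested is not a quality build.
        return "built, but untested: not ready to ship"
    failed = sorted(name for name, passed in test_results.items() if not passed)
    if failed:
        return "built, but tests failed: " + ", ".join(failed)
    return "ready to ship"
```

For example, `check_in_status(True, {"signup": True, "checkout": False})` reports the failing `checkout` test instead of treating the successful build as shippable.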
Selenium automates frequent and recurrent functional, performance, and compatibility testing. This gives developers near-instant feedback for faster debugging, leaving them with more time to code business logic for newer versions/features.
Modern web development needs Selenium testing because:
- It automates repeated testing of smaller components of a large(r) code-base
- It’s integral to agile development and CI/CD
- It frees resources from manual testing
- It’s consistently reliable; catches bugs that human testers might miss
- It can provide extensive test coverage
- It’s precise; the customizable error reporting is an added plus
- It’s reusable; you can refactor and reuse an end-to-end test script every time a new feature gets deployed.
- It’s scalable; over time, you can develop an extensive library of repeatable test cases for a product
What Types of Testing can be Automated with Selenium?
Types of testing that are commonly automated with Selenium are:
Compatibility Testing:
Done by QA professionals/Testers to ensure that the web app meets compatibility benchmarks on different browser-OS combinations. For example, testing on different devices (mobile and desktop) to ensure that the front-end fits to scale (responsive); testing on different browsers to see if video ads render on the pages as they should.
Performance Testing:
A series of tests done by QA professionals/Testers to ensure that the project meets performance benchmarks set by the stakeholders. For example, a tester could write a script that checks whether all elements on the homepage load within 2 seconds on different browsers/browser versions.
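A load-time check of this kind might look like the sketch below. It assumes a driver object with Selenium-style `get` and `execute_script` methods, and it reads the browser's Navigation Timing data (`performance.timing`, an older but widely supported browser API). Measuring up to `domComplete` and the 2-second budget are choices made for this illustration.

```python
# JavaScript run inside the page: milliseconds from navigation start
# until the document and its sub-resources finished loading.
LOAD_TIME_SCRIPT = (
    "const t = window.performance.timing;"
    "return t.domComplete - t.navigationStart;"
)


def page_load_seconds(driver, url):
    """Open `url` and return the page load time reported by the browser."""
    driver.get(url)
    millis = driver.execute_script(LOAD_TIME_SCRIPT)
    return millis / 1000.0


def within_budget(driver, url, budget_seconds=2.0):
    """Return True when the page loads within the performance budget."""
    return page_load_seconds(driver, url) <= budget_seconds
```

Running `within_budget` once per browser or browser version gives a simple pass/fail signal for the benchmark; results from a single run can be noisy, so repeated measurements are usually more trustworthy.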
Integration Testing:
Done by developers to verify that units/modules coded separately (that work on their own), also work when put together. Parallel Test Calculator, for instance, has separate layers. UI takes input and business logic calculates the output—then sends it back to UI to display. The tester could verify whether they are able to relay data/output when integrated.
System Testing:
Also known as black-box testing. Done by Testers/QA professionals with no context of the code or any previously executed tests. Typically centered on a single user workflow. The check-out process on a product website, for instance, comprises validating user credentials, fetching products from the cart, checking their availability, and validating payment details before redirecting to the bank website. The tester could write a script to verify that the entire system is functional.
End-to-end Testing:
Also done by Testers/QA professionals, typically from the user’s point of view. The aim is to verify that all touchpoints on the web app are functional. From the previous example, the tester could write a series of test cases to check that sign-up, product search, checkout, review, bookmark, and all other features function as intended (and fail when invalid values are entered in input fields).
Regression Testing:
A series of tests done to ensure that newly built features work with the existing system. From the same example, say the product website launches a new feature (promotional codes) that automatically applies to eligible items before checkout. The tester could write cases to verify that it doesn’t break the rest of the checkout feature.
Well-written test suites can also automate Smoke and Sanity testing with Selenium.
Note: Selenium testing is not meant to replace manual testing. Testing automation, by its very definition, automates that which does not merit human evaluation. You can’t automate the testing of your newly revamped UI for human-usability. But for everything else, there’s Selenium.
Who Uses Selenium?
Short answer: Everyone who cares about the state of their web app.
Part of the reason why Selenium is so popular is its flexibility. Anyone who codes for the web can use Selenium to test their code/app–from individual freelance developers running a quick series of tests for debugging to UI engineers doing visual regression tests after a new integration.
In an enterprise environment, testing with Selenium falls under the purview of QA engineers. They are tasked with writing focused, non-flaky (i.e., deterministic) scripts to maximize test coverage and accuracy, refactoring old test suites for newer versions of the project, and maintaining test infrastructure (from the Hub to the test-case library).
They would be the ones creating comprehensive test suites to pinpoint ‘show-stopper’ bugs and advising stakeholders about updating performance benchmarks for the project. Their end goal is to ensure maximum test coverage and efficacy, which in turn boosts the overall productivity of the engineers at work.
Which Browsers can I run Selenium Tests on?
Most desktop and mobile browsers today support automation testing with Selenium. The vendors of consumer browsers like Firefox, Chrome, Safari, IE, and Opera develop and ship drivers for their browsers.
In the years since it was first open-sourced, others have contributed to the Selenium project by adding third-party drivers for specialized browsers like BlackBerry 10 and HtmlUnit, as well as bindings for integration with development frameworks like PhantomJS, Qt, etc.
What do I need to get started?
You don’t need to download Selenium (so to speak) like a piece of software. You will, however, need some of its components in order to run tests on automated browser-instances on your own device.
To get the hang of this, simply head over to Automate Documentation. Pick a language and/or framework you typically work with. Follow the steps to install the components you’ll need, and run a couple of sample tests. You’ll get a functional understanding of Selenium automation testing in no time.
Ready to take your web app through a real testing mill? Take a look at Automate, which gives you instant access to our Selenium Grid and 3000+ browsers on real desktop and mobile devices.
Future of Selenium
Selenium 4 is the latest major version of the tool. This upgrade comes with features like an enhanced Selenium Grid, an upgraded Selenium IDE, relative locators, Chrome DevTools Protocol support, better window and tab management, and more.