Webscraping and Automation Using Selenium with Python

Table of Contents

What is Selenium?

What can I do with Selenium?

Getting started

Headless drivers

Automation

Acknowledgements

What is Selenium?

Selenium is an open-source browser automation framework that can be used from many programming languages, including Python. The table below, compiled from the official documentation, lists languages and their level of Selenium support.

| Programming Language | Selenium Support |
| --- | --- |
| Java | Main language for Selenium. Officially supported with Selenium WebDriver. |
| Python | Widely used for Selenium scripting. Supported with Selenium WebDriver. |
| C# | Supported with Selenium WebDriver. Often used in combination with the .NET framework. |
| JavaScript | Supported for browser automation with Selenium WebDriver. Popular for frontend testing using tools like Protractor. |
| Ruby | Has Selenium support through libraries like Watir and Capybara. |
| Kotlin | Interoperable with Java, can use Java Selenium bindings. |
| Groovy | Compatible with Java, often used with Selenium. |
| PHP | Selenium bindings available for PHP. |
| Perl | Selenium bindings available for Perl. |
| Swift | Selenium bindings available for Swift. |
| Objective-C | Selenium bindings available for Objective-C. |
| R | Selenium bindings available for R. |
| C++ | Limited support through third-party libraries. |
| Go | Limited support through third-party libraries. |

However, for the purposes of this introduction, we will refer to Selenium usage in Python.

What can I do with Selenium?

Selenium automates browser interactions, such as extracting information from web pages and interacting with elements like buttons and forms. It is widely used in test automation to run test cases against a website you are developing. If you are interested in learning more about using Selenium for testing purposes, you can check out this guide.

Getting started

  1. Install Selenium bindings

    The easiest way to install Selenium is with the pip package manager. With Python and pip installed, simply run the following command to get the latest version.

     pip install selenium
    

    You can then run the following command to see if you successfully installed Selenium.

     pip show selenium
    

    If the installation was successful, the output should be similar to the following.

     Name: selenium
     Version: 4.15.2
     Summary: 
     Home-page: https://www.selenium.dev
     Author: 
     Author-email: 
     License: Apache 2.0
     Location: /Users/john/CSC301/venv/lib/python3.11/site-packages
     Requires: certifi, trio, trio-websocket, urllib3
     Required-by: 
    
  2. Install Chrome WebDriver

    Next, download the Chrome WebDriver (ChromeDriver) version that matches your Chrome browser version and place it in the project directory. You can find your Chrome version in Google Chrome under Settings > About Chrome.

    If you do not have Chrome installed, you can download it here. Linux users can instead download the package with wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb and then install it with sudo dpkg -i google-chrome-stable_current_amd64.deb.

  3. Start Writing Code

    At this point, you are ready to start using Selenium. Import only the functionality you need with a statement of the form from selenium import [specific module], for example from selenium import webdriver.

Headless drivers

If you are running your scripts in an environment with no GUI, or don’t want the browser actions to be visible, you can use Selenium with a headless WebDriver. This is useful if your testing environment is a server that you are connected to over SSH.

With Python, you can accomplish this by using pyvirtualdisplay. Before your Selenium interaction, initiate your virtual display using:

from pyvirtualdisplay import Display

display = Display(visible=0, size=(800, 600))
display.start()

Then, after the interaction runs, stop your virtual display.

display.stop()

Typically, you would wrap the code in a try, except, finally statement: start the display at the top of the try block and stop it in the finally block, so that the display is shut down even if the script raises an error.
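
The try/finally pattern described above can be sketched as a small helper. This is a minimal sketch rather than a definitive implementation: `run_with_display`, `make_display`, and `task` are hypothetical names introduced for illustration, and the helper only assumes that the display object exposes `start()` and `stop()`, as pyvirtualdisplay's `Display` does.

```python
def run_with_display(make_display, task):
    """Run task() while a virtual display is active, then always stop it.

    make_display: zero-argument callable returning an object with
                  start() and stop() (assumed: pyvirtualdisplay.Display).
    task:         zero-argument callable containing the Selenium work.
    """
    display = make_display()
    display.start()
    try:
        return task()
    finally:
        # Runs whether task() returned normally or raised an exception.
        display.stop()
```

With pyvirtualdisplay installed, a call might look like `run_with_display(lambda: Display(visible=0, size=(800, 600)), my_selenium_task)`, where `my_selenium_task` stands for your own function.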

Automation

The Selenium WebDriver can be used for many purposes, including logging in to websites, filling out forms, and clicking buttons. For example, a daily task that requires you to log in to a website and then navigate to a certain page is simple to script with Selenium. You can then schedule the script to run at a regular interval on a Linux server using cron. For example, you could include the following line in your crontab to run the script at midnight every day:

0 0 * * * python3 /path/to/selenium/script.py

You can learn more about what the numbers and wildcards mean here, or you can check out the Linux man page to learn more about cron.
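
As a rough illustration of how the crontab line above is laid out, the following sketch splits it into the five schedule fields (minute, hour, day of month, month, day of week) and the command. The function name `describe_cron_line` is hypothetical and the sketch does not validate field values; it only labels them.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron_line(line):
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)  # five schedule fields, then the command
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    schedule["command"] = parts[5]
    return schedule


entry = describe_cron_line("0 0 * * * python3 /path/to/selenium/script.py")
for name, value in entry.items():
    print(f"{name}: {value}")
```

Here minute 0 and hour 0 select midnight, and each `*` means "every value" for that field, so the script runs once a day.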

This is a small program I made to log in to a website and click a certain button.

from login_utils import login
from config_utils import load_config
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Log in and get back the WebDriver for the logged-in session
driver = login()
config = load_config()

# Wait up to 10 seconds for the button element to be clickable
button_to_click = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, config['button_to_click_xpath']))
)

button_to_click.click()

driver.quit()

In this code, login_utils enters a specified username and password, then returns the driver after logging in, and config_utils loads the XPath of the button to click. XPath (XML Path Language) is a query language for identifying elements in an XML document, and it also works with HTML documents. WebDriverWait waits until the element is loaded and clickable, with a timeout of 10 seconds. Once the desired button is located, you can call the .click() method on it to simulate a user click. Once you are finished, close the WebDriver with driver.quit().
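
The config_utils module itself is not shown here. As one possible shape for it, the sketch below reads the XPath from a JSON file. The file name `config.json`, the JSON format, and the validation step are all assumptions made for illustration, not the original module.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Read settings such as the button's XPath from a JSON file.

    Hypothetical stand-in for config_utils.load_config; the file name
    and format are assumptions.
    """
    with Path(path).open(encoding="utf-8") as handle:
        config = json.load(handle)
    if "button_to_click_xpath" not in config:
        raise KeyError("config is missing 'button_to_click_xpath'")
    return config
```

Under these assumptions, the JSON file would hold a single object with a `button_to_click_xpath` key whose value is the XPath string for your own page.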

A Comprehensive Selenium UI Example: Automating a Google Form

Now that you have Selenium installed and ready to go, we will dive into a more comprehensive example by automating an online Google Form in Google Chrome. We will use this example form I have created to submit a response - https://forms.gle/rbxs4fNCbBSzfPW1A.

Below is a screenshot of the form that will be used in this tutorial.

[Screenshot: the example Google Form]

In this tutorial we will first (1) set up our project, then use Selenium to (2) populate text fields, (3) choose multiple choice options, (4) choose dropdown values and (5) submit the form.

  1. Setting up the project

    Since we are working with a small project, we can use one file to accomplish what we need. However, in larger projects we typically split components such as the CSS selectors/XPaths, the location of the chromedriver executable, and the URLs to navigate to into different files.

    Ensure that the installed chromedriver version matches your Chrome browser version; otherwise Chrome will not open properly. You can find chromedriver.exe here: https://googlechromelabs.github.io/chrome-for-testing/

    Below I have provided the commented starter code which we will work with today.

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import time
    
    # The Google Form we are working with
    browserURL = "https://forms.gle/rbxs4fNCbBSzfPW1A"
    
    # Make sure the driver version matches your Chrome browser
    # Can be found at https://googlechromelabs.github.io/chrome-for-testing/
    pathToChromeDriver = "C:\\Users\\edwar\\Downloads\\chromedriver.exe"
    
    
    def fill_form(driver):
        pass
    
    
    if __name__ == '__main__':
        # Set up ChromeDriver service
        service = Service(executable_path=pathToChromeDriver)
    
        # Initialize WebDriver with ChromeDriver service
        driver = webdriver.Chrome(service=service)
        driver.get(browserURL)
    
        # Call function to populate the form
        fill_form(driver)
    
        # Stop the webdriver after form has been submitted
        driver.quit()
    
    

    The fill_form function is what we will focus on. If you execute the current file, you will see Chrome open, navigate to the Google Form, and then close.

  2. Selenium: Populating Text Fields

    To accomplish this, we will need to do the following - (a) locate the element which we want to populate, (b) wait for it to be visible, (c) populate the field. This is the general structure to be followed when doing UI Automation and we will follow this for all the fields we will populate.

    To locate the element we will use CSS selectors, the same syntax used to target elements in a CSS stylesheet. Inspecting the text field element we see the following:

    [Screenshot: the text field in the browser inspector]

    We see that the text field element has class = "whsOnd zHQkBf", so we will use this as the CSS selector for this element.

    After locating the css selector we will need to wait for that element to be visible and then we need to populate it with some text. We will do this in our fill_form function and below is an example of how we can achieve this.

    # Text Field
    textField = WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.CLASS_NAME, "whsOnd.zHQkBf"))
    ) # Function waits until element is present on the screen before executing actions
    textField.click() # Simulate mouse button press
    textField.send_keys("Hello") # Simulate keyboard typing
    

    Waiting is vital in Selenium UI testing to ensure that web elements are fully loaded and ready for interaction. Without proper waiting mechanisms, tests may fail due to elements not being accessible or interactive. By incorporating waiting strategies, such as explicit waits, testers can allow sufficient time for elements to appear, become clickable, or undergo necessary changes. This improves test reliability and reduces the likelihood of false negatives, resulting in more accurate and stable test results.
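
To make the idea of an explicit wait concrete, here is a conceptual sketch of the polling loop behind it. It is not Selenium's implementation: `wait_until`, its parameters, and the use of the built-in `TimeoutError` are assumptions for illustration (Selenium's `WebDriverWait` raises its own `TimeoutException`).

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if no truthy value is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

The key design point is that the script proceeds as soon as the condition holds, rather than always sleeping for a fixed duration.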

  3. Selenium: Selecting Multiple Choice

    Let’s say we wanted to select ‘Option 2’. Following similar steps as above we will first locate the css selectors. Inspecting the multiple choice option for ‘Option 2’ we see the following:

    [Screenshot: the ‘Option 2’ element in the browser inspector]

    We see there is a unique id attribute, id = "i12", which we can use for our selector. We want our selectors to be unique so that we select only the item we intend. Below is example code showing how we could do this.

    # Multiple Choice
    option2 = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.ID, "i12"))
    )
    option2.click()
    

    To identify and choose fields for CSS selectors effectively, testers can inspect the HTML structure of the webpage using browser developer tools. Pay attention to unique attributes or classes associated with the elements of interest. For dynamically generated content, focus on attributes or classes that remain consistent across different instances of the element.

    For example, if you’re targeting the third option under a <div> element with the class name “something,” your CSS selector could be structured as .something > option:nth-child(3). This selector targets an <option> element that is a direct child, and the third child, of any element with the class “something”; prefix the class with the tag name (div.something) to restrict the match to <div> elements.
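
When an element carries several classes, as the form fields here do, the space-separated class attribute copied from the inspector has to be turned into a dotted CSS selector. A minimal sketch of that conversion follows; the helper name `class_selector` and its optional `tag` parameter are hypothetical.

```python
def class_selector(class_attribute, tag=""):
    """Build a CSS selector matching elements that carry every listed class."""
    classes = class_attribute.split()
    if not classes:
        raise ValueError("class_attribute must contain at least one class name")
    return tag + "".join(f".{name}" for name in classes)


print(class_selector("whsOnd zHQkBf"))         # .whsOnd.zHQkBf
print(class_selector("something", tag="div"))  # div.something
```

The resulting string can be passed to Selenium together with `By.CSS_SELECTOR`.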

  4. Selenium: Selecting Dropdown

    Selecting from a dropdown is a bit trickier, as we need to first click the dropdown to view the options, wait for the options to load, and finally click on the option we want. As such, we need two CSS selectors for this step: one for the dropdown and another for the dropdown option.

    Inspecting first the dropdown we see the following:

    [Screenshot: the dropdown in the browser inspector]

    As with the text field earlier, we will use the class name, class = "MocG8c HZ3kWc mhLiyf LMgvRb DEh1R KKjvXb", to locate the dropdown and click it open.

    Once we open up the dropdown options we see the following:

    [Screenshot: the dropdown options in the browser inspector]

    In this case let’s choose ‘Option 2’. We see that both ‘Option 1’ and ‘Option 2’ have the same class name so we need another way to select only ‘Option 2’. Looking closer we see that there is a unique identifier for each of the options and in this case it is data-value="Option 2". However, when we check on the inspect tab (CTRL+F and enter the selector to test it out) we see that there are multiple elements with that selector.

    [Screenshot: the inspector search showing multiple matches for the selector]

    Since no single selector identifies the dropdown option uniquely, we can combine selectors in a parent-child hierarchy (“[parent selector] [child selector]”) to form a unique selector, just as you would when styling elements in CSS.

    After trial and error, the unique selector that selects ‘Option 2’ is '[class="OA0qNb ncFHed QXL7Te"] [data-value="Option 2"]'.

    Combining all of this together we get the following code:

    # Dropdown
    dropdown = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.CLASS_NAME, "MocG8c.HZ3kWc.mhLiyf.LMgvRb.DEh1R.KKjvXb"))
    )
    dropdown.click()
    
    # Dropdown Options
    dropdownOption2 = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, '[class="OA0qNb ncFHed QXL7Te"] [data-value="Option 2"]'))
    )
    dropdownOption2.click()
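
The manual CTRL+F uniqueness check described above can also be done from the script. The sketch below is one way to do it; `require_unique` is a hypothetical helper, and it relies only on the driver's `find_elements(by, value)` method.

```python
def require_unique(driver, by, selector):
    """Return the single element matching the locator.

    Raises LookupError when the locator matches zero or several elements,
    which signals that the selector needs to be refined.
    """
    matches = driver.find_elements(by, selector)
    if len(matches) != 1:
        raise LookupError(
            f"{selector!r} matched {len(matches)} elements, expected 1"
        )
    return matches[0]
```

For the dropdown option, a call might look like `require_unique(driver, By.CSS_SELECTOR, '[class="OA0qNb ncFHed QXL7Te"] [data-value="Option 2"]')`.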
    
  5. Selenium: Submit

    Lastly, all we need to do now is to click the submit button and our form will be sent. After doing all these steps you are probably quite knowledgeable in automation and expect this last button click to be easy as well. However, sometimes Selenium is unable to click on elements with certain tags and the tags that are clickable are not unique even if we try to combine selectors like we did previously.

    As such, we need to approach this problem a different way. From trial and error we notice that the css selector [role="button"] is clickable but is not unique.

    [Screenshot: the inspector search showing the Submit button as match ‘3 of 5’]

    In Selenium, we can use find_elements() to get all elements that match a certain selector. After retrieving this list, we can index into it and click on the Submit button. Notice in the image above that the Submit button is ‘3 of 5’. Because Python lists are zero-based, the third match is at index 2.

    As such, the code will be as follows:

    # Submit 
    submit = driver.find_elements(By.CSS_SELECTOR, "[role='button']")
    submit[2].click()
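
Indexing by position depends on the order of the buttons staying the same. As an alternative sketch, the matching elements can be filtered by their visible text instead. The helper name `first_with_text` is hypothetical, and the sketch assumes the button's visible text is exactly "Submit".

```python
def first_with_text(elements, text):
    """Return the first element whose visible text equals `text`."""
    for element in elements:
        # WebElement.text holds the element's visible text.
        if element.text.strip() == text:
            return element
    raise LookupError(f"no element with text {text!r}")
```

Under that assumption, the click would become `first_with_text(driver.find_elements(By.CSS_SELECTOR, "[role='button']"), "Submit").click()`.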
    
  6. Debugging Selenium

    Selenium UI testing, while powerful, is prone to errors during test execution. These errors can stem from issues such as CSS selectors that match nothing, element visibility problems, or timing issues where elements haven’t loaded yet. Below are some common errors and strategies for handling them:

    • CSS Selector Issues: Sometimes, CSS selectors may not accurately identify the intended elements due to complex page structures or dynamic content. To address this, consider using alternative locator strategies such as XPath or employing more robust CSS selectors.

    • Element Visibility and Interactivity: Selenium may attempt to interact with elements that are not yet visible or interactive on the page. Utilize explicit waits to ensure elements are fully loaded and ready for interaction before performing actions on them. This can help mitigate errors related to element not clickable, element not visible, or stale element references.

    • Page Loading Delays: Timing issues can arise when Selenium attempts to interact with elements before the page has finished loading. Implement implicit or explicit waits to handle dynamic loading elements or asynchronous JavaScript operations, ensuring that the page is fully loaded before proceeding with test execution.
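
For intermittent failures such as stale element references, one common approach is to retry the action a few times. The following is a minimal sketch of that idea, not a built-in Selenium feature; `retry` and its parameters are hypothetical names.

```python
import time


def retry(action, exceptions, attempts=3, delay=0.5):
    """Call action() up to `attempts` times, retrying on the given exceptions."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
```

With Selenium, `exceptions` could be `StaleElementReferenceException` from `selenium.common.exceptions`, and `action` a function that locates the element again and clicks it.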

Conclusion

Now that we have gone through all the steps, we can automate the Google Form. Below is the completed code with the fill_form function finished:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# The Google Form we are working with
browserURL = "https://forms.gle/rbxs4fNCbBSzfPW1A"

# Make sure the driver version matches your Chrome browser
# Can be found at https://googlechromelabs.github.io/chrome-for-testing/
pathToChromeDriver = "C:\\Users\\edwar\\Downloads\\chromedriver.exe"


def fill_form(driver):

    time.sleep(5)

    # Text Field
    textField = WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.CLASS_NAME, "whsOnd.zHQkBf"))
    )
    textField.click()
    textField.send_keys("Hello")

    # Multiple Choice
    option2 = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.ID, "i12"))
    )
    option2.click()
   
    # Dropdown
    dropdown = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.CLASS_NAME, "MocG8c.HZ3kWc.mhLiyf.LMgvRb.DEh1R.KKjvXb"))
    )
    dropdown.click()

    # Dropdown Options
    dropdownOption2 = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, '[class="OA0qNb ncFHed QXL7Te"] [data-value="Option 2"]'))
    )
    dropdownOption2.click()

    # Submit 
    submit = driver.find_elements(By.CSS_SELECTOR, "[role='button']")
    submit[2].click()

    time.sleep(2)

if __name__ == '__main__':
    # Set up ChromeDriver service
    service = Service(executable_path=pathToChromeDriver)

    # Initialize WebDriver with ChromeDriver service
    driver = webdriver.Chrome(service=service)
    driver.get(browserURL)

    # Call function to populate the form
    fill_form(driver)

    # Stop the webdriver after form has been submitted
    driver.quit()

One last thing to note: I’ve added two time.sleep() calls in the final code. The first allows the page to fully load, and the second, at the end, lets you see the finished result before the browser closes.

Acknowledgements: