Web Scraping with CSS Selectors (with Python Examples)

In this tutorial, you will learn what CSS Selectors are and how to use them in web scraping.

CSS Selectors allow you to locate exact elements within an HTML document. They are also supported by most web scraping tools, which makes them very useful for web scraping.

What is a CSS Selector

CSS Selectors use the Cascading Style Sheets rules to locate elements in an HTML document.

For example, in CSS, you could use this rule to apply a color to all links on a page that have a class named “myclass”.

a.myclass {
	color: blue;
}

Similarly, in web scraping, you can use the same CSS selector to select all the links that have that class:

a.myclass
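As a rough sketch of how this looks in Python, the snippet below runs the a.myclass selector with BeautifulSoup's select() method. The HTML string and its URLs are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, used only for illustration
html = """
<a class="myclass" href="/one">First link</a>
<a href="/two">Second link</a>
<a class="myclass other" href="/three">Third link</a>
"""

soup = BeautifulSoup(html, "html.parser")

# select() returns every element that matches the CSS selector
links = soup.select("a.myclass")
hrefs = [link["href"] for link in links]

print(hrefs)  # ['/one', '/three']
```

The third link is matched even though it carries a second class, because a class selector only requires that the class be present.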

Why CSS Selectors are Useful in Web Scraping

CSS Selectors are very useful in web scraping because they allow you to:

locate the element you want to extract from a webpage,
identify and extract data from HTML documents quickly,
automate the scraping of webpages.

Simplest Way to Find CSS Selectors in Chrome

Chrome DevTools has a handy feature that lets you find the CSS selector of any DOM element without any prior knowledge.

Open Chrome DevTools with Command + Option + I on Mac (Ctrl + Shift + I on Windows and Linux), or by right-clicking the page and choosing Inspect.

Then right-click on any element in the DOM and select Copy > Copy selector.
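A copied selector can usually be pasted straight into a scraping library. The sketch below assumes BeautifulSoup; both the HTML and the selector string are hypothetical, written in the style that DevTools tends to produce.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page
html = """
<div id="content">
  <ul>
    <li><a href="/first">First</a></li>
    <li><a href="/second">Second</a></li>
  </ul>
</div>
"""

# Hypothetical selector, in the style produced by Copy > Copy selector
copied_selector = "#content > ul > li:nth-child(2) > a"

soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first matching element, or None if nothing matches
element = soup.select_one(copied_selector)

print(element.get_text())  # Second
```

Keep in mind that DevTools shows the DOM as rendered by the browser. If the page is modified by JavaScript, the raw HTML that your scraper downloads may differ, and the copied selector may need adjusting.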

Basics of CSS Selectors

CSS selectors are patterns used to describe the location of element(s) within an HTML document.

They consist of one or more steps (tag names, classes, IDs, etc.) joined by a combinator such as a white space or the > symbol.

The last step on the right of the selector expression is the element that you want to select.

For example, the CSS Selector below locates the h1 that is a direct child of the HTML body element:

html > body > h1

Basic Structure of the CSS Selectors

The basic structure of a CSS Selector is similar to the structure used in XPath, or to the structure of a URL path, where the > symbol replaces the slash symbol (/).

XPath: /html/body/p
CSS selector: html > body > p

The CSS locator is represented by a series of steps. Each step is separated by the greater-than (>) symbol, which moves forward one generation.
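To make the idea of steps concrete, here is a minimal sketch, assuming BeautifulSoup, that applies a selector one step at a time and prints the tag reached at each step.

```python
from bs4 import BeautifulSoup

html = """<html>
    <head>
        <title>Title of your web page</title>
    </head>
    <body>
        <h1>Heading of the page</h1>
        <p>Paragraph of text</p>
    </body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# Each additional "> step" moves one generation deeper into the tree
steps = ["html", "html > body", "html > body > h1"]
matched_tags = [soup.select_one(step).name for step in steps]

print(matched_tags)  # ['html', 'body', 'h1']
```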

Types of CSS Selectors (Filters)

There are several types of CSS selectors that can be used to target HTML elements on a web page.

Tag Selector
Class Selector
ID Selector
Attribute Selector
Descendant Selector
Child Selector
Pseudo-Class Selector

Basic CSS Selectors Cheatsheet

Selector	Description	Example	Explanation
Tag Selector	Selects elements based on their tag name.	`p`	Selects all `<p>` elements.
Class Selector	Selects elements based on their class name.	`.example`	Selects all elements with the class name “example”.
ID Selector	Selects an element based on its ID.	`#example`	Selects the element with the ID “example”.
Attribute Selector	Selects elements based on their attribute and value.	`[type="text"]`	Selects all elements with the attribute “type” and value “text”.
Descendant Selector	Selects elements that are descendants of another element.	`div p`	Selects all `<p>` elements that are descendants of a `<div>` element.
Child Selector	Selects elements that are direct children of another element.	`ul > li`	Selects all `<li>` elements that are direct children of a `<ul>` element.
Pseudo-Class Selector	Selects elements based on their state or position in the document.	`a:hover`	Selects all `<a>` elements when the mouse is hovering over them.

Tag Selector

Tag selectors can be used to target specific HTML elements.

For example, given the HTML below, you could use the title, h1, p or a tag selectors to select the corresponding elements.

<html>
    <head>
        <title>Title of your web page</title>
    </head>
    <body>
        <h1>Heading of the page</h1>
        <p id="first-paragraph" class="paragraph">Paragraph of text</p>
        <p class="paragraph">Paragraph of text 2</p>
        <div><p class="paragraph">Nested paragraph</p></div>
        <a href="/a-link">hyperlink</a>
    </body>
</html>

To select the h1, here is the selector that you would use.

h1

<h1>Heading of the page</h1>
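The same selection can be sketched in Python. This example assumes BeautifulSoup and reuses the HTML document shown above.

```python
from bs4 import BeautifulSoup

html = """<html>
    <head>
        <title>Title of your web page</title>
    </head>
    <body>
        <h1>Heading of the page</h1>
        <p id="first-paragraph" class="paragraph">Paragraph of text</p>
        <p class="paragraph">Paragraph of text 2</p>
        <div><p class="paragraph">Nested paragraph</p></div>
        <a href="/a-link">hyperlink</a>
    </body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# A tag selector matches every element with that tag name
heading = soup.select_one("h1")
paragraphs = soup.select("p")

print(heading)          # <h1>Heading of the page</h1>
print(len(paragraphs))  # 3
```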

Class Selector

This example shows how to use the dot notation to select the elements with a certain class.

This expression selects all the <p> HTML tags whose class attribute includes the class “paragraph”.

p.paragraph

<p id="first-paragraph" class="paragraph">Paragraph of text</p>
<p class="paragraph">Paragraph of text 2</p>
<p class="paragraph">Nested paragraph</p>
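In Python, a minimal sketch of the class selector could look like this, assuming BeautifulSoup and a fragment of the HTML used in this tutorial.

```python
from bs4 import BeautifulSoup

html = """
<h1>Heading of the page</h1>
<p id="first-paragraph" class="paragraph">Paragraph of text</p>
<p class="paragraph">Paragraph of text 2</p>
<div><p class="paragraph">Nested paragraph</p></div>
<a href="/a-link">hyperlink</a>
"""

soup = BeautifulSoup(html, "html.parser")

# "p.paragraph" means: <p> elements that carry the class "paragraph"
paragraph_texts = [p.get_text() for p in soup.select("p.paragraph")]

# The tag name is optional: ".paragraph" matches any tag with that class
any_tag_count = len(soup.select(".paragraph"))

print(paragraph_texts)
# ['Paragraph of text', 'Paragraph of text 2', 'Nested paragraph']
print(any_tag_count)  # 3
```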

ID Selector

An ID Selector uses the hash symbol (#) to target an element by its unique id.

p#first-paragraph

<p id="first-paragraph" class="paragraph">Paragraph of text</p>
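Since an id is meant to be unique in a page, select_one() is a natural fit. The sketch below assumes BeautifulSoup.

```python
from bs4 import BeautifulSoup

html = """
<p id="first-paragraph" class="paragraph">Paragraph of text</p>
<p class="paragraph">Paragraph of text 2</p>
"""

soup = BeautifulSoup(html, "html.parser")

# With or without the tag name, the ID selector finds the same element
with_tag = soup.select_one("p#first-paragraph")
without_tag = soup.select_one("#first-paragraph")

print(with_tag.get_text())      # Paragraph of text
print(with_tag == without_tag)  # True
```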

Attribute Selector

An attribute selector selects elements based on an attribute and its value.

It uses the syntax:

tag-name[attribute="attribute-value"]

Example:

a[href="/a-link"]

<a href="/a-link">hyperlink</a>
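Here is a small sketch of the attribute selector in Python, assuming BeautifulSoup. The second and third links are hypothetical additions, and the a[href] form (attribute present, any value) is standard CSS that is not covered above.

```python
from bs4 import BeautifulSoup

# The first link comes from the tutorial's HTML; the others are hypothetical
html = """
<a href="/a-link">hyperlink</a>
<a href="/b-link" rel="nofollow">another link</a>
<a name="anchor">no href</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Exact match on the attribute value
exact = [a["href"] for a in soup.select('a[href="/a-link"]')]

# Presence only: any <a> that has an href attribute, whatever its value
has_href = [a["href"] for a in soup.select("a[href]")]

print(exact)     # ['/a-link']
print(has_href)  # ['/a-link', '/b-link']
```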

Descendant Selector

The descendant selector selects all the descendant elements of another element, at any depth.

This is done by writing the ancestor and the descendant tags separated by a space instead of a >.

body p

<p id="first-paragraph" class="paragraph">Paragraph of text</p>
<p class="paragraph">Paragraph of text 2</p>
<p class="paragraph">Nested paragraph</p>
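A minimal sketch of the descendant selector, assuming BeautifulSoup, shows that the nesting depth does not matter.

```python
from bs4 import BeautifulSoup

html = """<html>
    <body>
        <p id="first-paragraph" class="paragraph">Paragraph of text</p>
        <p class="paragraph">Paragraph of text 2</p>
        <div><p class="paragraph">Nested paragraph</p></div>
    </body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# Every <p> anywhere inside <body>, including the one nested in the <div>
body_paragraphs = [p.get_text() for p in soup.select("body p")]

# Only the <p> elements that live inside a <div>
div_paragraphs = [p.get_text() for p in soup.select("div p")]

print(body_paragraphs)
# ['Paragraph of text', 'Paragraph of text 2', 'Nested paragraph']
print(div_paragraphs)  # ['Nested paragraph']
```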

Child Selector

The child selector selects elements that are direct children of another element.

body > p

The difference here is that elements that are not direct children, e.g. nested elements like the p inside the div, will not be selected.

<p id="first-paragraph" class="paragraph">Paragraph of text</p>
<p class="paragraph">Paragraph of text 2</p>
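To see the difference between the two combinators side by side, here is a short sketch, assuming BeautifulSoup, that runs both on the same HTML.

```python
from bs4 import BeautifulSoup

html = """<html>
    <body>
        <p id="first-paragraph" class="paragraph">Paragraph of text</p>
        <p class="paragraph">Paragraph of text 2</p>
        <div><p class="paragraph">Nested paragraph</p></div>
    </body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# Space: any depth. ">": direct children only.
descendants = [p.get_text() for p in soup.select("body p")]
children = [p.get_text() for p in soup.select("body > p")]

# What the child selector leaves out
only_nested = [text for text in descendants if text not in children]

print(children)     # ['Paragraph of text', 'Paragraph of text 2']
print(only_nested)  # ['Nested paragraph']
```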

Pseudo-Class Selector

The pseudo-class selector uses the colon : to select elements based on their state or position in the document.

For example, you can define which sibling you want to locate using :nth-of-type(N).

The CSS expression below shows how to select the first div that is a direct child of the body element.

html > body > div:nth-of-type(1)

Unlike Python, which uses zero-based indexing, the index in CSS Selectors starts at 1.
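The one-based index is easy to get wrong, so here is a small sketch, assuming BeautifulSoup and a hypothetical HTML snippet, that compares the CSS index with the equivalent Python list index.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with two sibling <div> elements
html = """<html>
    <body>
        <p>Intro paragraph</p>
        <div>First div</div>
        <div>Second div</div>
    </body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# CSS counts from 1 ...
first_div = soup.select_one("html > body > div:nth-of-type(1)")
second_div = soup.select_one("html > body > div:nth-of-type(2)")

# ... while the Python list returned by select() counts from 0
all_divs = soup.select("html > body > div")

print(first_div.get_text())     # First div
print(all_divs[0].get_text())   # First div
print(second_div.get_text())    # Second div
print(all_divs[1].get_text())   # Second div
```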

There are many pseudo-class selectors, some of which are described in this table.

Pseudo-class Selector	Description
:hover	Selects an element when the mouse pointer hovers over it
:active	Selects an element when it is being activated (e.g. clicked on)
:visited	Selects a link that has been visited by the user
:focus	Selects an element when it has focus (e.g. when a form field is selected)
:first-child	Selects the first child element of its parent element
:last-child	Selects the last child element of its parent element
:nth-child(n)	Selects the nth child element of its parent element (where n is a number)
:nth-of-type(n)	Selects the nth element of its type (where n is a number)
:last-of-type	Selects the last occurrence of an element within its container (e.g. h2:last-of-type)
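The difference between the "child" and the "of-type" variants is subtle. The sketch below, assuming BeautifulSoup and a hypothetical HTML snippet, shows that :nth-child counts every sibling while :nth-of-type counts only siblings with the same tag. The text_of helper is defined here for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the first child of the <div> is an <h2>, not a <p>
html = """
<div>
  <h2>Title</h2>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def text_of(selector):
    """Return the text of the first match, or None when nothing matches."""
    match = soup.select_one(selector)
    return match.get_text() if match else None


selectors = ["p:first-child", "p:nth-child(2)", "p:nth-of-type(2)", "p:last-child"]
results = {selector: text_of(selector) for selector in selectors}

for selector, text in results.items():
    print(selector, "->", text)
# p:first-child -> None
# p:nth-child(2) -> First paragraph
# p:nth-of-type(2) -> Second paragraph
# p:last-child -> Second paragraph
```

State-based pseudo-classes such as :hover or :visited depend on user interaction in a browser, so they are generally not useful when parsing static HTML.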

Using CSS Selectors with Python

You can use CSS selectors when Web Scraping with BeautifulSoup or when Web Scraping with Scrapy.

Scraping CSS Selectors with BeautifulSoup

BeautifulSoup supports CSS selectors through its select() and select_one() methods. You can also match elements by tag name and class with find_all(), as in this example:

from bs4 import BeautifulSoup
import requests

# Fetch the HTML
url = 'https://crawler-test.com/'
response = requests.get(url)

# Parse the HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Find every div with the class "panel-header"
# (the same elements as the CSS selector div.panel-header)
soup.find_all("div", {"class":"panel-header"})
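BeautifulSoup also provides the select() and select_one() methods, which take a CSS selector string directly. The sketch below uses a hypothetical offline HTML snippet instead of a live request, so the class names and text are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, so the example runs without a network request
html = """
<div class="panel">
  <div class="panel-header"><h3>Section A</h3></div>
  <div class="panel-body">Body A</div>
</div>
<div class="panel">
  <div class="panel-header"><h3>Section B</h3></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() with a tag name and a class filter ...
with_find_all = soup.find_all("div", {"class": "panel-header"})

# ... and select() with the equivalent CSS selector
with_select = soup.select("div.panel-header")

find_all_texts = [div.get_text(strip=True) for div in with_find_all]
select_texts = [div.get_text(strip=True) for div in with_select]

print(find_all_texts)  # ['Section A', 'Section B']
print(select_texts)    # ['Section A', 'Section B']

# select_one() returns only the first match
first_title = soup.select_one("div.panel-header > h3").get_text()
print(first_title)  # Section A
```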

Scraping CSS Selectors with Scrapy

Scrapy allows you to scrape content using CSS selectors with the css() method from the Selector class.

from scrapy import Selector
html = '''<html>
    <head>
        <title>Title of your web page</title>
    </head>
    <body>
        <h1>Heading of the page</h1>
        <p id="first-paragraph" class="paragraph">Paragraph of text</p>
        <p class="paragraph">Paragraph of text 2</p>
        <div><p class="paragraph">Nested paragraph</p></div>
        <a href="/a-link">hyperlink</a>
    </body>
</html>'''

# Instantiate Selector
sel = Selector(text=html)

# Define the CSS selector
css_selector = 'body > p#first-paragraph'

sel.css(css_selector).extract()

['<p id="first-paragraph" class="paragraph">Paragraph of text</p>']

Using CSS Selectors with JavaScript

However you execute JavaScript (e.g. in the Chrome DevTools Console, or in Node.js with a DOM library), you can use the querySelector or querySelectorAll methods to locate HTML elements with their CSS selectors.

Here is an example using querySelectorAll to find all the links on a page.

var links = document.querySelectorAll('a');
var anchorText = [];

links.forEach(link => {
  anchorText.push(link.textContent);
});

console.log(anchorText);

XPath to CSS Selector Conversion

Equivalency	XPath Notation	CSS Selector
Select by element type	//div	div
Select by class name	//div[@class="example"]	div.example
Select by ID	//*[@id="example"]	#example
Select by attribute	//input[@name="example"]	input[name="example"]
Select by attribute value containing	//input[contains(@class, "example")]	input[class*="example"]
Select by attribute value starting with	//input[starts-with(@id, "example")]	input[id^="example"]
Select by attribute value ending with	//a[ends-with(@href, "example")]	a[href$="example"]
Select by following sibling	//div/following-sibling::p	div ~ p
Select by descendant	//div//p	div p
Select by first child of its type	//div/p[1]	div > p:first-of-type
Select by last child of its type	//div/p[last()]	div > p:last-of-type

Note that some pairs are only approximately equivalent. //div[@class="example"] matches only when the class attribute is exactly "example", whereas div.example also matches elements that have additional classes. The ends-with() function requires XPath 2.0, which tools based on XPath 1.0 do not support.

Difference Between XPath and CSS Selectors

The main difference between XPath and CSS selectors is that XPath can move both forward and backward in the document (for example, from an element to its parent), while a CSS selector generally only moves forward. In exchange, XPath expressions can be a bit more complex.
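To illustrate what moving backward means, here is a minimal sketch that uses the .. step of XPath to go from an element to its parent. It assumes Python's built-in xml.etree.ElementTree module, which implements only a limited subset of XPath and requires well-formed markup. The id "nested" is added here for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed markup, so that the standard-library XML parser accepts it
html = """<html>
    <body>
        <p id="first-paragraph">Paragraph of text</p>
        <div><p id="nested">Nested paragraph</p></div>
    </body>
</html>"""

root = ET.fromstring(html)

# ".." steps backward, from the matched <p> to its parent element
parent_of_nested = root.find(".//p[@id='nested']/..")
parent_of_first = root.find(".//p[@id='first-paragraph']/..")

print(parent_of_nested.tag)  # div
print(parent_of_first.tag)   # body
```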

Jean-Christophe Chouinard

SEO Strategist at Tripadvisor, ex- Seek (Melbourne, Australia). Specialized in technical SEO. Writer in Python, Information Retrieval, SEO and machine learning. Guest author at SearchEngineJournal, SearchEngineLand and OnCrawl.