Why Learn Python for SEO

This post is part of the complete guide on Python for SEO.

Top SEO Experts share their advice on how and why you should learn Python for SEO. (JR Oakes, Hamlet Batista, Elias Dabbas and more)

I’ve invited experts to answer the challenges that I faced while learning Python and to help people get started learning Python.

This post covers a lot of ground, so feel free to use the navigation to jump around this Python for SEO goldmine.


Best Advice When You Start Learning Python

What Should You Learn First?

What Should You Use to Run Your Python Code?

Best Python Installation Set-Up

Best Python for SEO Blog Posts

What is Your Favorite Python Tool?

Best and Most Useful Script for SEO

Best Python for SEO Resources

SEO Tasks you can do with Python


Invited Python Experts


Why Learn Python for SEO?


For me, learning Python gave me access to insights and information in a way that conventional SEO tools couldn’t, at a speed that suited fast-paced agency life.

Automating projects in Python that would normally have taken much longer gave me opportunities to work on new projects with different teams and to give more back to my client base than they were ever expecting. For SEOs who want more out of their work, faster and beyond the boundaries of conventional SEO, it’s almost essential to learn programming. Python just happens to be the most accessible compared to other languages (R, Java, C++, etc.).


Best Advice When You Start Learning Python

What is your best advice for people who start learning Python?


Start with a simple problem you need to solve and find code that gets you 80% there. Run the code line by line to understand how it works and use a tutorial to guide you in your learning.



Give yourself a small project that will be rewarding if accomplished. 

Don’t make it too hard.

It could be something like extracting search results or images from Google.  Packt has some great books on Python that are paced and include code.  It is great to start a project, use a book to learn as you look to complete the project, and have a feeling of accomplishment when you get something to work. 

Once it works, you are a Python programmer :-).

Also, use StackOverflow and Github to see how other people have solved the same problem. Python is a very intuitive language, and you can find beautiful ways others have solved the problem in a simple and easy-to-follow way.



Never give up! 

If you have never coded before, after a while you might think that it is too difficult to learn.

My suggestion is to give yourself some time in order to understand the concept and logic of coding before doing any project. It can take one or two months which is absolutely okay.

Believe me, it is not that scary!

Just copy and paste the code and try to adapt small parts of it to your needs. I learn Python best when I am searching for “how to do this” or when my code gives an error.

And one last piece of advice: follow one or two courses step by step, and don’t get distracted by lots of resources. That may cause confusion and make you lose your motivation. I attended a bootcamp, which I highly recommend.



My biggest advice for people starting to learn Python is to start with Corey’s videos. Also, realize that installing things and learning how to do things (like setting a path) can sometimes be a massive headache.



Start with data manipulation, master it. Pandas, pandas, pandas.

Learn data visualization, master it! Plotly (in my opinion).



Don’t be intimidated by Python; it’s one of the easiest languages to learn. Also, try to code a little every day.



If you learn Python for SEO purposes, I think in most cases it will be to learn scripts to handle or collect data.

With some knowledge from other languages, I would say you can quickly start using Jupyter Notebook and just try some things.

Without any coding experience, it’s useful to do some online courses, some youtube tutorials, maybe some books about Python in general first.



My number one tip for anyone starting to learn Python is to persevere and practice.

Learning any coding language is difficult and can take a long time. There are going to be times when you get frustrated and want to give up, but persevere!

Practice makes perfect, so make sure you learn a little more and practice each day, building your own scripts or trying to replicate ones you already know with slight changes.



Some advice:

  1. Use the “Google-it-first” algorithm. StackOverflow has saved my day a lot of times.
  2. Struggle. Spin your wheels. Set the code aside. Run around the block. Then, come back to it.
  3. Learn libraries one at a time. Read their docs and then dive into their codebase, trying to understand what is going on. I have learned a lot by simply reading others’ code.
  4. If you are using a library and you have a problem, go check present and past issues in the related Github repo: there is a good chance you’ll find the answer you need.


Code, read, then code some more. The practice is the most effective way to get better at something.



Find ways to incorporate Python into your day-to-day tasks, no matter how big or small. Have a daily task in Excel that only takes you about 10 minutes to do?

See how you can utilize Python to take that 10 minutes down to 5 minutes or 1 minute. In tandem with that, definitely start up a side project that you feel passionate about that isn’t directly involved with your professional life. My personal passion project is working with HR data (job descriptions, job postings, salary data, etc) and finding interesting trends across the board within that field.

It gets me thinking in a new way when coding, addressing problems I might not face when using Python for SEO but ultimately becomes a learning experience for me.


What Should You Learn First?

What should people start learning first?

Hamlet Batista


The language building blocks.

I cover them in this Python introduction to SEO on Search Engine Journal.

Author Note: You can also check out the Python Beginner Guide to learn language building blocks.



I would learn how to use the libraries Pandas and BeautifulSoup. In almost all scraping or data collection / manipulation tasks, these libraries come in very handy.  In addition, almost all the code I write starts with the line import pandas as pd because I will want to be able to save the data in a form I can use easily. Pandas makes reading and writing CSV files dead simple.
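
As a rough illustration of the CSV round trip described above, the sketch below builds a small table, writes it out as CSV, and reads it back. The column names and sample rows are made up for the example, and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data; real rows would come from a crawl or an API.
rows = [
    {"url": "https://example.com/", "title": "Home", "clicks": 120},
    {"url": "https://example.com/blog", "title": "Blog", "clicks": 45},
]
df = pd.DataFrame(rows)

# Write to CSV. Passing a path such as "pages.csv" instead of the buffer
# would save the table to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the CSV back into a new DataFrame.
buffer.seek(0)
df_loaded = pd.read_csv(buffer)

# Keep only the pages above a click threshold.
popular = df_loaded[df_loaded["clicks"] > 100]
print(popular)
```

Swapping the buffer for a file path is usually the only change needed to work with real exports.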



Basic programming, then data manipulation, then visualization. Later on, get into machine learning (and stats).



The basics, of course, but I think because you will for sure as an SEO query an API to get some data at some point, that’s something you could try early…

To check out Pandas is for sure interesting too…

Author Note: Check out how to Get Google API Keys, how to query Google Analytics API and Google Search Console API.



Python courses usually start with data types, variables, loops, and conditional statements. If you are okay with these topics, keep going. The next chapters will be more difficult, but also more enjoyable. 

If you just don’t like what you learn so far, ask yourself whether you really want to learn Python or not. And again keep going until you really don’t want to do it 🙂

I asked myself once, but now I am really happy not to have given up.



I would always advise starting with the very basics. Using online learning tools such as Solo Learn has always been my go-to starting point. You’re able to work through various lessons covering all basic and advanced concepts of coding languages.



Data types are very important in Python and can give you a bit of a headache if you don’t understand them correctly. Variables, functions, and loops are very useful and something that I use in almost every script.



Data structures, functions, and loops are a good starting point to do something cool. Then you can start thinking about how to automate your boring day-to-day tasks, like reporting or monitoring.



Programming and Computer Science have a chicken-and-egg problem.

To be good at computer programming, having a solid foundation in Computer Science helps tremendously; to be good at Computer Science having a solid grasp of computer programming helps tremendously.

My advice would be to try to solve a basic problem or automate a simple task first; that way you have a tangible goal to strive for. Will it be difficult? Yes. Will you get frustrated? Yes. When you are finished will you have learned something? YES!



Definitely the core essentials and syntax of Python. I think a lot of people, when they first get into Python, want to immediately begin automating everything and start incorporating a ton of third-party APIs / machine-learning tools. But in that excitement, a lot of the core functions are lost. There are a TON of built-in functions within Python that many coders have no idea about. Get a grasp of the language in full before you jump into even your first libraries.


What Should You Use to Run Your Python Code?

What tool would you recommend working with? (Command-Line? IDEs? Notebooks? Google Colab? Others?)


Please use Google Colab which is part of Google Drive. Python is already installed.



My IDE is Atom because it includes FTP/SFTP to servers, a terminal (command line), and a code editor that does language-specific highlighting.  It also has Git baked in. But, answering this depends a lot on what level you are at and what you want to accomplish. Installing Python and Cuda libraries to run Machine Learning libraries like TensorFlow and Pytorch on a computer can be very challenging.  Google Colab makes that very easy and allows you to speed up the training of projects in Google’s GPUs and TPUs. Further, the compute capacity and memory available in Google’s cloud GPUs are greater than what is in most laptops and desktops.

I really like JupyterLab for a lot of coding because the UX between Google Colab and what I use on my desktop is, mostly, the same and it allows me to annotate my code in meaningful ways to share with others, or my future self.



For now, I only use Jupyter Notebook, but Google Colab platform, which is a free Jupyter notebook environment provided by Google, is also great if your computer can’t take the workload.



Whatever works for the individual… And Project. Not a great answer but it’s the truth. Usually, people start with notebooks these days.

I really don’t like Colab, since learning how to install things and use tools is important. And on the job, Google Colab isn’t used.

For traditional software engineering approaches, notebooks don’t work too well.



For marketing people, Jupyter lab (notebook) should be fine. Once you are comfortable with that, and want to go on building apps you can select one of the good IDEs (I use PyCharm). But for general interactive work, the notebook should be fine.



I use Jupyter Notebook



Initially, I would recommend starting with the command-line whilst you learn Python and action your scripts.

This is due to how simple the command line actually is: there are no fancy displays or additional features that can overload your brain when you’re trying to learn.

Once you feel more confident in your Python skills however, I would definitely suggest using PyCharm. Out of all the IDEs I’ve tried and tested, PyCharm is simple, easy to use and has a fantastic UI.



I like using the command line and Spyder (Anaconda).



If you are in a hurry to test something quickly: IPython (more powerful than the standard interactive Python shell).

If you need computational power for free: Google Colab, while if you are working on more structured projects I am in love with Visual Studio Code.



I do almost all of my work command-line. I find that it makes my code more portable and easier to share.



I use Jupyter Notebook for development purposes, Spyder for scripts in production and Google Colab to collaborate with other people on my team. 

Best Python Installation Set-Up

How do you recommend installing Python?


For newbies, Google Colab. You need a certain level of proficiency to work with IDEs and the command line.



I love Anaconda

I am pretty versed with the command line and installing, compiling code that I need, but I cannot tell you how many times conda has saved me hours by having libraries prebuilt / compiled to run on windows.  MMH3 (a hashing library) is one example. In addition, I really like the GUI for reviewing environment packages and one-click access to launching Jupyterlab.

The only issue I have ever had with Anaconda (Conda) is that some programs look for Python to be installed in a location that is different from where Anaconda installs it. This usually involves updating the Windows PATH to point to the new location.



From the website.



I went with the default installation guide. So nothing special.



Personally, I always recommend using default installation instructions. The reason for this is simply for ease of use and the speed with which you can install Python.



If you are a beginner, I think anaconda is a good choice, since it comes with all the standard packages that are good for manipulating data.



I usually install default Python and have different virtual environments based on my needs.

My favorite extension is virtualenvwrapper, but the basic venv is also enough to start understanding what is going on.

If you are a newcomer and you prefer to have a GUI, Anaconda Navigator can be another good choice.



I tend to do things “the hard way”.

I use the system packaged version of Python and use virtual environments to keep everything tidy.

When doing machine learning tasks, I use Docker to pre-build numpy, scipy and pandas packages to keep them from clashing with other projects and make distribution easier.



I have always used Anaconda as my backend for downloading everything Python related. It comes prepackaged with Spyder, Jupyter, a bunch of other Python/R IDEs and all the libraries you need to get started.

Best Python for SEO Blog Posts

What are the best posts that you have read about Python for SEO?

Hamlet Batista

I was very impressed with this one by Kristin Tynski.

JR Oakes

But, I really don’t read that many SEO posts about Python.  I am more interested in finding cool work others are doing in other areas and seeing if it can be applied to SEO.  Python is relatively new to the SEO community and most of the current content is not as developed as in other areas. For example, researchers in NLP have needed to crawl the web and extract content for years to build datasets.  A lot of this code, and many of these papers and posts, are out there and are approachable to SEOs.



Mine! 🙂 Haven’t seen many actually, but I’d love to read one if you have.



I used Google APIs a lot … so maybe the stuff at Google Developers.

This one maybe:



I first met Python with this great post about intent classification by Hamlet Batista. I was so impressed and felt impatient for the future of SEO when I read it. So I strongly recommend all of Hamlet’s inspiring works.

Some other great articles:



In regards to specific blog posts, here are a few of my favorites;



For newcomers: Ruth Everett’s “An Introduction to Python for Technical SEO” to have a brief insight into what can be done with this scripting language.

For more geeky guys, I recommend the Deep Learning series by Hamlet Batista on Search Engine Journal: so inspiring!



Python.org has some pretty decent posts on Python for SEO.



Specifically for SEO? I surprisingly don’t have any that come to mind. I learned SO much of my Python knowledge from sentdex’s videos, which keep Python in the context of overall use.

If I had to think of anyone who immediately comes to mind who’s written about the subject in the past, I would have to shout out Paul Shapiro, who has challenged the idea of the technical SEO as the SEO with programming as a skillset to further enhance all aspects of the medium (not just how sites are built).

What is Your Favorite Python Tool?

What is your favorite Python Tool?


Ludwig is really cool.



Pytorch and Pandas.



  • Pandas for data manipulation
  • Plotly for data visualization, Dash for building dashboards
  • Advertools for SERP analysis on a large scale (Google and YouTube)


Pandas



Ipywidgets is great for interactive controls without changing the inputs. If you will draw lots of plots and analyze data, using Ipywidgets can make you save time and create fancy charts.

Source: Will Koehrsen


If I had to pick one specific Python tool as my favorite, I would have to say it’s Gquestions.py (by Alessio Nittoli).

This tool allows you to enter a search query, and Selenium will then scrape all the ‘People Also Ask’ questions that show up for those terms. It’s super useful for top-level keyword research, generating content ideas and building out content pillars.



Spyder.



One of the things I find especially pleasing is the possibility to use custom Python shells when you are working with some libraries.

For example, Scrapy Shell or Django Shell. This is useful when you are learning a new library and you’re not sure about something or you want to test something on the fly.



PDB (The Python Debugger). I don’t think my code would ever work without it. Haha!



In terms of libraries, I am a huge advocate for NLP libraries and practices for the future of SEO. NLTK (more academic but I use for streamline preprocessing), Spacy (industry standard for NLP), Gensim (for topic modeling), scikit-learn (for machine learning algorithms) and Scattertext (clustering and visualizations) are what make me most excited for coding lately.

For some everyday use though, https://regex101.com/ is regularly open on my computer because I write regex paths all the time but I am really bad with regex overall!
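
For readers who would rather test patterns in Python itself, here is a minimal sketch that uses the standard `re` module to sort URLs into site sections by path. The section names, patterns, and URLs are illustrative assumptions, not taken from any real project.

```python
import re
from urllib.parse import urlparse

# Hypothetical path patterns for the sections of a site.
SECTION_PATTERNS = {
    "blog": re.compile(r"^/blog/[\w-]+/?$"),
    "product": re.compile(r"^/products/\d+/?$"),
}


def classify(url):
    """Return the first section whose pattern matches the URL path."""
    path = urlparse(url).path
    for section, pattern in SECTION_PATTERNS.items():
        if pattern.match(path):
            return section
    return "other"


urls = [
    "https://example.com/blog/python-for-seo/",
    "https://example.com/products/1234",
    "https://example.com/about",
]
sections = {url: classify(url) for url in urls}
for url, section in sections.items():
    print(section, url)
```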

Best and Most Useful Script for SEO

What is your most useful Script for SEO?


That I’ve shared publicly, probably this one:



Pandas is the most useful library for me, because of the ability to easily manipulate data tables of millions of rows and import and export via CSVs.  It even connects with Big Query to dump data to the cloud.

Below are a few functions that I use in many projects.  GET_UA randomizes the User-Agent string to get around servers that throw errors if you try to crawl with the default User-Agent.  parse_url returns the content, BeautifulSoup-parsed DOM, and content type for a URL. parse_internal_links(soup, current_page) is how you can use those two to grab the internal links for a web page.

import random
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse


def GET_UA():
    uastrings = ["Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
                "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36",\
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25",\
                "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",\
                "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.1.17 (KHTML, like Gecko) Version/7.1 Safari/537.85.10",\
                "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",\
                "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",\
                "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"\
                ]

    return random.choice(uastrings)


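def parse_url(url):
    """Fetch a URL and return (content, soup, content_type)."""
    headers = {'User-Agent': GET_UA()}
    # Defaults so the return statement works even if the request fails.
    content = None
    soup = None
    ct = None

    try:
        response = requests.get(url, headers=headers)
        ct = response.headers.get('Content-Type', '').lower().strip()
        content = response.content

        # Only parse HTML responses; other content types are returned raw.
        if 'text/html' in ct:
            soup = BeautifulSoup(content, "lxml")

    except Exception as e:
        print("Error:", str(e))

    return content, soup, ct


def parse_internal_links(soup, current_page):
    # Keep links whose host matches the current page. Relative links have
    # an empty netloc, so they are not included.
    return [a['href'].lower().strip() for a in soup.find_all('a', href=True) if urlparse(a['href']).netloc == urlparse(current_page).netloc]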
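
One caveat with the netloc comparison above: relative links such as `/about` have an empty netloc, so they are not counted as internal. The sketch below is one possible variant, not the author's code. It resolves each href against the current page with `urljoin` before comparing hosts. The names `LinkCollector` and `internal_links` and the sample HTML are assumptions made for the example, and it uses the standard-library HTML parser instead of BeautifulSoup so that it runs without extra packages or a network call.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href.strip())


def internal_links(html, current_page):
    """Return absolute URLs of links on the same host as current_page."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(current_page).netloc
    # Resolve relative hrefs against the page URL before comparing hosts.
    resolved = (urljoin(current_page, href) for href in collector.hrefs)
    return [url for url in resolved if urlparse(url).netloc == host]


sample_html = """
<a href="/about">About</a>
<a href="https://example.com/blog">Blog</a>
<a href="https://other.example.org/">Elsewhere</a>
"""
links = internal_links(sample_html, "https://example.com/")
print(links)
```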



I have one for custom search (also available as a Dashboard), and another for YouTube. Different ways of using them are mainly in the Semrush article, but there’s a few more (not very different) on Kaggle.



I think my first blog post about basic Python coding for SEO data can be very helpful for those who want to learn it.

If you want to go deeper into data science, check my page speed score prediction model. Learning statistics is also important for data analysis.



I wouldn’t say I have a specific script that I find most useful, however, all of my top scripts are all based around web scraping and data mining. I use a lot of variations of scripts that leverage these, but the core functionalities are paramount for me.



I have a lot of useful scripts that I use to clean data and do specific tasks. I also have some scripts that check if a page is down or if something on the page has changed. But I think the most useful are the ones that clean large amounts of data, which would otherwise take me hours.



Recently I have been dealing with some big site migrations, so to figure out what was going on I had to write a script to compare multiple Screaming Frog crawls and spot whether everything was going as expected (JS version vs non-JS / mobile version vs desktop).

You can find a shared notebook here:



I’ve been using my Python SEO Analyzer for over ten years now, and it’s still my go-to script.



I have a lot of scripts that take the SEMRush API along with web scraping libraries to crawl the SERP based on specific keyword clusters to pull out language trends within top position pages. I use this script, along with a semantic similarity keyword research script (also using the SEMRush API) on a regular basis.

In terms of scripts that I can directly share with the community, my Pagespeed Insights automated speed tester is another lifesaver for longer audits I conduct. You can find that script here!

Best Python for SEO Resources

Who would you recommend following?


These are some great Python SEOs from whom you will learn something new for sure:





I haven’t found many people doing programming for marketing, but there’s pbpython.



I cannot name just one name… maybe twitter.com/tobias_willmann/following 😉





There are now so many people who are great at Python and share amazing insights, however, from an SEO perspective, there are two people who instantly come to my mind;



JR Oakes, Andrea Volpini, Hulya Coban for SEO related machine learning and python stuff.



It really depends on what you are focusing on.

If you’re interested in NLP (Natural Language Processing), Andrea Volpini (Title tag optimization using deep learning) and Hamlet Batista do a lot of interesting stuff using BERT for text summarization, image captioning, etc.

If otherwise, you’re interested in learning Data Analysis or want to have an idea of how SEOs use Python: Paul Shapiro and Aysun Akarsu (87 Million domain PageRank).

Another really interesting thing I would recommend is an amazing talk at TechSEO Boost 2019 by JR Oakes where he explains one of his crazy projects: basically he built a “simple” search engine (code here: Tech SEO Crawler). It really impressed me.

Lastly, the most complete YouTube channel for watching cool things is without doubt Sentdex’s channel; I learned a lot watching his videos.



This is more of a “what” than a “who”: I feel like The Python List has more information than pretty much anywhere else. I learn something new every time I go through it.



  • Google’s Webmaster’s Blog / API documentation (for updates)
  • Paul Shapiro
  • Jamie Alberico
  • Britney Muller
  • Harrison Kinsley (sentdex)
  • /r/Python

SEO Tasks you can do with Python

What is the most common SEO task you do with Python?

JR Oakes

Time series prediction / forecasting, internal linking, topic analysis.

Plus, anything that you can share that will help people (myself included) with automation, that would be awesome😀

  • Learn how to use Git / Github to share your code. It is rewarding when other people find your code helpful.
  • Search Github for projects that other people are working in, even if it is not in SEO.
  • AWS Lambda functions are incredible and very approachable.  They make scheduling processes, and developing APIs that do one thing really well, easy and very cost-effective.
  • Learn how to access and parse JSON from APIs.  There are a lot of tool vendors in the SEO space that have APIs.  A lot of data from APIs is better than what you get in their UX product.
  • Follow David Sottimano.  He is big into Google Sheets and JavaScript.  JavaScript is a great language to learn in addition to Python because of its use across Google products (Apps Script), and it is very helpful for data collection and visualization.
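
To make the JSON point concrete, here is a small sketch that parses an API response with the standard `json` module. The payload shape and field names (`rows`, `keyword`, `position`, `clicks`) are hypothetical and do not correspond to any particular vendor; a real script would fetch the response text from the API first.

```python
import json

# Hypothetical response body, hard-coded so the example runs offline.
response_text = """
{
  "rows": [
    {"keyword": "python seo", "position": 3, "clicks": 40},
    {"keyword": "learn python", "position": 12, "clicks": 5},
    {"keyword": "seo automation", "position": 7, "clicks": 18}
  ]
}
"""

data = json.loads(response_text)
rows = data.get("rows", [])

# Pull out the keywords ranking on the first page (positions 1 to 10).
first_page = [row["keyword"] for row in rows if row["position"] <= 10]

# Add up clicks across all rows.
total_clicks = sum(row["clicks"] for row in rows)

print(first_page)
print(total_clicks)
```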

Matthew Jones

The most common SEO task I carry out with Python is probably keyword research. It may sound like a very simple task for Python, but the speed and ease it brings are exactly why I use it so often.

Alongside this, I have a few scripts that let me run periodic technical audits on websites, which makes auditing my second most common Python task.

Konrad Burchardt

Clean data.

I think that by learning a few tricks you can save so much time. It may take you some time to put the script together, but once you have it you can recycle it and use it on other projects.

Alessio Nittoli

Data Analysis of Screaming Frog exports with Pandas on Google Colab.

Pandas is basically Excel on steroids, so you can analyze huge datasets with just a few lines. I also work a lot with scraping libraries such as Scrapy or Selenium.
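
A minimal sketch of that kind of analysis is shown below. It assumes a crawl export with `Address`, `Status Code`, and `Title 1` columns; check the headers of your own export, since they may differ. The rows are invented and loaded from a string rather than a real file.

```python
import io

import pandas as pd

# Invented rows standing in for a crawl export CSV.
export_csv = """Address,Status Code,Title 1
https://example.com/,200,Home
https://example.com/old,301,
https://example.com/missing,404,
https://example.com/blog,200,Blog
https://example.com/blog-copy,200,Blog
"""
crawl = pd.read_csv(io.StringIO(export_csv))

# Count URLs per status code.
status_counts = crawl["Status Code"].value_counts()

# List the URLs that returned an error.
errors = crawl[crawl["Status Code"] >= 400]["Address"].tolist()

# Find titles shared by more than one page (missing titles are ignored).
titled = crawl.dropna(subset=["Title 1"])
duplicate_titles = titled[titled.duplicated("Title 1", keep=False)]

print(status_counts)
print(errors)
print(duplicate_titles[["Address", "Title 1"]])
```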

Seth Black

Definitely Technical SEO checks. Small issues with crawling (broken links, invalid redirects, duplicate pages) are the fastest and easiest things to check for, and I do that quite frequently.
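
The three checks mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `pages` list and its field names are hypothetical stand-ins for whatever a crawler would return, and duplicate detection here is a plain hash of the page body, which only catches exact copies.

```python
import hashlib
from collections import defaultdict

# Hypothetical crawl results; a real crawler would fill these in.
pages = [
    {"url": "/", "status": 200, "redirect_to": None, "body": "Welcome"},
    {"url": "/old", "status": 301, "redirect_to": "/gone", "body": ""},
    {"url": "/gone", "status": 404, "redirect_to": None, "body": ""},
    {"url": "/a", "status": 200, "redirect_to": None, "body": "Same text"},
    {"url": "/b", "status": 200, "redirect_to": None, "body": "Same text"},
]
status_by_url = {page["url"]: page["status"] for page in pages}

# Broken pages: any 4xx or 5xx response.
broken = [p["url"] for p in pages if p["status"] >= 400]

# Invalid redirects: 3xx responses whose target is not a 200 page.
invalid_redirects = [
    p["url"]
    for p in pages
    if 300 <= p["status"] < 400
    and status_by_url.get(p["redirect_to"]) != 200
]

# Duplicate pages: 200 pages whose bodies hash to the same value.
urls_by_hash = defaultdict(list)
for p in pages:
    if p["status"] == 200:
        digest = hashlib.sha256(p["body"].encode("utf-8")).hexdigest()
        urls_by_hash[digest].append(p["url"])
duplicates = [urls for urls in urls_by_hash.values() if len(urls) > 1]

print(broken, invalid_redirects, duplicates)
```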


Python SEO FAQs

📙How to Start Learning Python for SEO?

To learn Python for SEO, learn Python basics, Jupyter Notebook and Python Libraries.

🐍How to get started with Python?

To get started with Python, install Python with Anaconda and learn Python basic building blocks.

Which Tool Should You Use to Run Python Code?

Experts say that beginner Python SEOs should start with Jupyter Notebook or Google Colab.

❤️Which SEO Experts Should You Follow?

According to experts, you should follow Hamlet Batista, JR Oakes and Ruth Everett.

This is it! I hope you were convinced by the reasons why you should learn Python for SEO and that experts have helped you learn how to get started.