How to use the basic Python packages like `pandas`, `numpy`, `matplotlib`, `seaborn` and `requests` for SEO?

In this post, I will show you how to use each package to help you work with Python.

## Getting Started With Python Packages

If you haven’t installed Python on your computer yet, make sure that you read this easy step-by-step guide to install Python using Anaconda.

Also, make sure that you know how to use Spyder IDE, Jupyter notebook or the IPython console before you read this guide. You can find all this information by reading my previous tutorial to help you learn Python or start learning Python for SEO from scratch.

## What Packages Will We Cover?

In this guide, we will cover the most useful packages that are used in Data Science.

If you followed the advice outlined in the preface and installed Python using Anaconda, you already have all these packages installed and ready to go.

NumPy – Pandas – Matplotlib – Seaborn – Requests

### Install and use the Packages

Before you can get started using NumPy, Pandas, and Matplotlib, you need to install the packages. If you have installed Python using Anaconda, you can skip the first step.

#### Step #1: Install the packages (optional)

Open the command prompt and type each line one by one.

`pip install numpy`

`pip install pandas`

`pip install matplotlib`

`pip install seaborn`

`pip install requests`

#### Step #2: Load the libraries

To load the libraries, just use the `import` statement.

```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

## NumPy

NumPy, short for Numerical Python, provides efficient storage and manipulation of numerical arrays that you can use for advanced calculations instead of regular Python lists.

### Import Numpy

```
import numpy as np
```

### NumPy Arrays

Arrays, like lists, are used to store multiple values in a single variable. The main difference is that you can perform calculations over entire arrays.

A regular list doesn’t allow you to do this.

Let’s say we want to analyze three days’ traffic data for two groups.

`seo_grp1 = [8, 9, 13]`

`seo_grp2 = [2, 6, 5]`

`ttl_seo = seo_grp1 + seo_grp2`

`ttl_seo`

`## [8, 9, 13, 2, 6, 5]`

Instead of two lists pasted together, what we wanted was the element-by-element sum, [10, 15, 18]. Hence the use of NumPy.

#### NumPy Arrays Calculations

First, let’s import NumPy.

```
import numpy as np
```

Then let’s calculate our two groups and three days’ worth of data, but this time using NumPy.

```
import numpy as np
seo_grp1=np.array([8,9,13])
seo_grp2=np.array([2,6,5])
ttl_seo=seo_grp1+seo_grp2
ttl_seo
## array([10, 15, 18])
```

NumPy arrays perform their calculations element-wise.

The addition adds the first element of the first array to the first element of the second array, and so on.

In our case, it works this way: array([8+2 , 9+6 , 13+5]).

Remember that, unlike Python lists, all the elements of a NumPy array must have the same type. If the types do not match, NumPy converts every element to a single common type.

```
np.array(["my domain authority is", 89])
## array(['my domain authority is', '89'], dtype='<U22')
```

Here, the *integer* was converted to a *string*, so that both elements share the same type.
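The same conversion happens between numeric types. Here is a minimal sketch, using made-up values, that shows integers being converted to floats and how the `dtype` argument lets you choose the type yourself.

```python
import numpy as np

# Integers mixed with a float are all converted to float64
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)   # float64

# You can also request a type explicitly with the dtype argument
as_float = np.array([8, 9, 13], dtype=float)
print(as_float.dtype)        # float64
```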

You can also select some data using arrays.

```
# Indexing a single element
ttl_seo[2]
## 18
ttl_seo > 10
## array([False, True, True])
```
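The array of `True` and `False` values returned by a comparison can itself be used to filter the data. As a small sketch (the names `mask` and `high_traffic` are only illustrative), this keeps the days where traffic was above 10.

```python
import numpy as np

ttl_seo = np.array([10, 15, 18])

# The comparison returns a boolean mask...
mask = ttl_seo > 10
# ...which can be used to keep only the matching values
high_traffic = ttl_seo[mask]
print(high_traffic)   # [15 18]
```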

### Create Arrays Using NumPy

You can also create arrays from scratch using NumPy functions.

You could use np.arange() to build an ordered list of values up to the value you select; np.zeros() to create arrays filled with zeros; np.reshape() to build multi-dimensional arrays; and so on.

Here are a few useful functions.

Build an ordered list

```
np.arange(5)
##array([0,1,2,3,4])
```

Build an ordered list over a range, with a fixed step between values

```
np.arange(6,28,3)
## array([6, 9, 12, 15, 18, 21, 24, 27])
```

Create a list filled with zeros

```
np.zeros(6, dtype=int)
## array([0, 0, 0, 0, 0, 0])
```

Fill a matrix with a single value

```
np.full((5,5),4.15)
##array([[4.15, 4.15, 4.15, 4.15, 4.15],
## [4.15, 4.15, 4.15, 4.15, 4.15],
## [4.15, 4.15, 4.15, 4.15, 4.15],
## [4.15, 4.15, 4.15, 4.15, 4.15],
## [4.15, 4.15, 4.15, 4.15, 4.15]])
```

Create a 2D array

```
np.arange(8).reshape(2,4)
## array([[0, 1, 2, 3],
## [4, 5, 6, 7]])
```

Create a 3D array

```
np.arange(8).reshape(2,2,2)
## array([[[0, 1],[2, 3]],
## [[4, 5],[6, 7]]])
```

Create a 2D array with random values between 0 and 1

```
np.random.random((2,2))
## array([[0.48556036, 0.94031317],
## [0.01495329, 0.79882602]])
```

Copy an array

```
copy = np.arange(5).copy()
```

### Arrays Slicing

As we have seen with lists in our beginner Guide to Python, it is possible to access subarrays with the *slice* notation, using the colon (:).

Create an array

```
x = np.arange(10)
## array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Select the first five elements

```
x[:5]
#array([0, 1, 2, 3, 4])
```

Select the elements from index 5 onward

```
x[5:]
#array([5, 6, 7, 8, 9])
```

Select the range of values

```
x[2:5]
#array([2, 3, 4])
```

Select a range of elements jumping two steps

```
x[::2]
#array([0, 2, 4, 6, 8])
```

Select every element starting at index 1, jumping two steps

```
x[1::2]
#array([1, 3, 5, 7, 9])
```

### Multi-Dimensional NumPy Arrays

Using NumPy, you can easily create multi-dimensional arrays.

Let’s create a 2D array.

```
x = np.arange(8).reshape(2,4)
x
## array([[0, 1, 2, 3],
## [4, 5, 6, 7]])
```

To find what shape our “x” array has, use the *shape* attribute.

```
x.shape
## (2, 4)
```

Let’s find other information about our array.

```
#Find the dimension
x.ndim
##2
#Find the size of the array
x.size
##8
#Find the Type
x.dtype
##dtype('int32')
```

Let’s change one value to a different data type.

```
# Look at the element in the first row, second column
x[0,1]
## 1
#Change it to a float
x[0,1]=1.5
x[0,1]
##1
```

NumPy has truncated the float to an integer, because the array was created with an integer type.

A NumPy array always keeps a single data type and converts new values to it automatically.
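If you do want to store decimal values, one option is to convert the whole array first. The sketch below uses `astype()`, which returns a new array with the type you ask for (the name `x_float` is just for illustration).

```python
import numpy as np

x = np.arange(8).reshape(2, 4)

# astype() returns a new array with the requested type
x_float = x.astype(float)
x_float[0, 1] = 1.5
print(x_float[0, 1])   # 1.5
print(x_float.dtype)   # float64
```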

#### 2D Arrays Subsetting

There are multiple ways to subset 2-dimensional arrays.

*Create a Matrix*

```
x = np.arange(8).reshape(2,4)
x
## array([[0, 1, 2, 3],
## [4, 5, 6, 7]])
```

*Select the 1st row*

```
x[0]
## array([0, 1, 2, 3])
```

*Same as previous notation*

```
x[0,:]
## array([0, 1, 2, 3])
```

*Select 1st row, 4th column*

```
x[0][3]
## 3
```

*Same as previous notation*

```
x[0,3]
## 3
```

*Select fourth column*

```
x[:,3]
## array([3, 7])
```

*Select the intersection between the 2nd and 3rd column and both rows*

```
x[:,1:3]
## array([[1, 2],
## [5, 6]])
```

### Randomize With NumPy

A really useful feature of NumPy is the ability to generate random numbers.

You can do this using the `np.random.rand()` function, which returns values drawn uniformly between 0 and 1.

```
random_data = np.random.rand(1000)
plot = plt.scatter(range(1000),random_data)
```

### Use NumPy to Visualize The Gaussian Distribution

The Gaussian distribution, also called the normal distribution, is a central concept in statistics.

```
import numpy as np
import matplotlib.pyplot as plt
from math import sqrt, pi, exp
domaine = range(-100,100)
mu = 0
sigma = 20
f = lambda x : 1/(sqrt(2*pi*pow(sigma,2))) * exp(-pow((x-mu),2)/(2*pow(sigma,2)))
y = [f(x) for x in domaine]
plot = plt.plot(domaine, y)
```

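What does it tell us?

The central limit theorem states that when you add up a lot of independent random values, the sums tend to follow the bell-shaped distribution that we have just seen, even if the individual values do not. When data follows this shape, the distribution is considered normal.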

Let’s do an example.

We will generate random data and plot the histogram to see if it follows a normal distribution.

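1. Create a random matrix with 1000 random variables (rows) and a 100k sample for each of those variables (columns). Note that this matrix holds 100 million floats, which takes about 800 MB of memory.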

```
normal_matrix =np.random.rand(1000,100000)
```

2. Sum the 1000 random variables in each column. This will help you visualize a distribution.

```
matrix_sum = np.sum(normal_matrix,0)
```

Note that if you set the `axis` argument to 0, NumPy sums down the rows and returns one total per column; if you set it to 1, it sums across the columns and returns one total per row. You could also set no value and sum all the elements of the matrix.
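The effect of the `axis` argument is easier to see on a tiny matrix. This is a minimal sketch with a made-up 2 x 3 matrix named `m`.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(m, axis=0))  # [5 7 9]  -> one total per column
print(np.sum(m, axis=1))  # [ 6 15]  -> one total per row
print(np.sum(m))          # 21       -> total of all elements
```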

3. Plot Histogram

```
plot = plt.hist(matrix_sum, bins=1000)
```

4. Print Details of Your Distribution

As a data scientist, you will need to generate reports to justify your actions.

```
print("The mean of your distribution is {}."
.format(np.mean(matrix_sum)))
print("The mean distribution generated by rand is {}."
.format(np.mean(np.random.rand(1000000))))
print("The variance is {}."
.format(np.var(matrix_sum)))
print("The variance of rand is {}."
.format(np.var(np.random.rand(1000000))))
##The mean of your distribution is 500.04352575358934.
##The mean distribution generated by rand is 0.4999905658429283.
##The variance is 82.77889479273443.
##The variance of rand is 0.08342877035074636.
```

### NumPy Statistical Functions

There are plenty of other operations that you can do using NumPy.

As a Data Scientist, you’ll be faced with a number of statistical problems that can be solved using NumPy Functions. Here is a quick overview of the functions that you might be interested in:

- np.mean() : Calculate the Mean of an array
- np.median() : Calculate the Median of an array
- np.max(): Find the highest value
- np.min(): Find the lowest value
- np.corrcoef(x,y) : Find Correlation
- np.std() : Compute Standard deviation
- np.sqrt() : Square Root
- np.sum(): Calculate the Sum of an array
- np.sort(): Sort Data
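To give an idea of how these functions are called, here is a short sketch on hypothetical daily clicks and impressions for one week (the numbers are invented for the example).

```python
import numpy as np

# Hypothetical daily figures for one week
clicks = np.array([120, 135, 150, 110, 160, 175, 140])
impressions = np.array([2400, 2600, 3100, 2300, 3200, 3500, 2900])

print(np.mean(clicks))                  # average clicks per day
print(np.median(clicks))                # 140.0
print(np.max(clicks), np.min(clicks))   # 175 110
print(np.sum(clicks))                   # 990
print(np.std(clicks))                   # standard deviation
print(np.sort(clicks))                  # values from lowest to highest

# corrcoef returns a 2x2 matrix; the correlation is at position [0, 1]
r = np.corrcoef(clicks, impressions)[0, 1]
print(r)
```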

To understand the scope of what you can do, just read how to make a T-Test using NumPy or this advanced NumPy Tutorial for Data Analysis.

## Pandas

Pandas is one of the basic libraries that you will need in SEO and in Data Science. This library provides data structures and data analysis tools for Python.

Simply put, it is the library that opens up the world of **Data Frames**.

A data frame is basically a table, with rows and columns, that is available in languages such as R and Python.

We have seen with Numpy that we could create multi-dimensional arrays easily.

```
x = np.arange(8).reshape(2,4)
x
## array([[0, 1, 2, 3],
## [4, 5, 6, 7]])
```

This is cool, but not convenient to analyze.

Using Pandas’ data frames, you can create a table.

```
quarterly_sales = pd.DataFrame(x)
quarterly_sales
```

You can make it better by adding row and column names.

```
quarterly_sales = pd.DataFrame(x,
index=["2018","2019"],
columns = ["Q1","Q2","Q3","Q4"])
quarterly_sales
```

You can also select a column using its name, either with the dot notation or with square brackets.

```
quarterly_sales.Q1
quarterly_sales["Q1"]
##2018 0
##2019 4
##Name: Q1, dtype: int32
```

### How to Import An Excel File Using Pandas

You can easily import an Excel file into Python using pandas. To do this, you need to use the `read_excel` function.

```
import pandas as pd
df = pd.read_excel(r'Path to your document\yourFile.xlsx')
```

### How to Inspect Your DataFrame

There are multiple tools that you can use in Pandas to help you inspect your dataset: `head()`, `shape`, `type()` and `columns`.

Let’s create a new data frame.

```
import pandas as pd
import numpy as np
x = np.arange(10000).reshape(2000,5)
yearly_sales = pd.DataFrame(x,
columns = ["2015","2016","2017","2018","2019"])
```

#### Preview of Your Dataset Using Head() and Tail()

When you have a large dataset, you might want to show just the first or last few lines to help you get a good idea of your data structure.

```
yearly_sales.head()
## 2015 2016 2017 2018 2019
##0 0 1 2 3 4
##1 5 6 7 8 9
##2 10 11 12 13 14
##3 15 16 17 18 19
##4 20 21 22 23 24
yearly_sales.tail()
## 2015 2016 2017 2018 2019
##1995 9975 9976 9977 9978 9979
##1996 9980 9981 9982 9983 9984
##1997 9985 9986 9987 9988 9989
##1998 9990 9991 9992 9993 9994
##1999 9995 9996 9997 9998 9999
```

#### Show Rows and Columns of Your DataFrame With Shape

The `shape` attribute gives information on the data set size. It gives you a tuple with a count of rows and columns (rows, columns).

```
yearly_sales.shape
##(2000, 5)
```

#### Show the Type of Your Dataset

Is your dataset a list? A string? A data frame? Find out by using the `type()` function.

```
type(yearly_sales)
##pandas.core.frame.DataFrame
```

#### Print Header Names

You can print your header names by iterating over the columns using `for`. This is really useful when you have a really large dataset.

```
for col in yearly_sales.columns:
    print(col)
##2015
##2016
##2017
##2018
##2019
```

You can also easily see the header names by selecting an empty slice of rows.

```
yearly_sales[:0]
##Empty DataFrame
##Columns: [2015,2016,2017,2018,2019]
##Index: []
```

Or, you can use the `columns` attribute.

```
yearly_sales.columns
##Index(["2015","2016","2017","2018","2019"])
```

#### Get the Column Index Using Name With get_loc

You might want to get the column index from the column name in pandas. Do it with `get_loc`.

```
yearly_sales.columns.get_loc("2018")
##3
```

### Most Useful Dataframe Manipulations

There are many great functions in Pandas for data frame manipulation. In this tutorial, I will show you some of the most useful ones for SEO and Data Science.

#### Select Only Unique Values

In a dataset, you might end up with duplicates in your columns.

```
cities = ['montreal',"quebec","montreal","montreal","toronto","vancouver","edmonton"]
cities =pd.DataFrame(cities)
cities.columns = ["city"]
cities
```

To get a list of unique values, use the unique() function.

```
cities.city.unique()
```
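Two related tools are often handy alongside `unique()`. The sketch below, which rebuilds the same `cities` data frame, uses `value_counts()` to count the duplicates and `drop_duplicates()` to remove them.

```python
import pandas as pd

cities = pd.DataFrame({"city": ["montreal", "quebec", "montreal", "montreal",
                                "toronto", "vancouver", "edmonton"]})

# Count how many times each value appears
counts = cities["city"].value_counts()
print(counts["montreal"])   # 3

# Keep one row per distinct value
deduplicated = cities.drop_duplicates()
print(len(deduplicated))    # 5
```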

#### Drop NA Values In Rows and Columns

To drop NA values in a Dataframe, use the `dropna` function. To know more about the function, just follow the well-explained guide on pydata.
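As a quick illustration, here is a minimal sketch on a hypothetical crawl export (the `pages` data frame and its values are invented). It shows dropping rows, dropping columns, and restricting the check to one column with `subset`.

```python
import numpy as np
import pandas as pd

# Hypothetical crawl export with missing values
pages = pd.DataFrame({
    "url": ["/home", "/blog", "/contact"],
    "title": ["Home", np.nan, "Contact"],
    "clicks": [120, 45, np.nan],
})

rows_without_na = pages.dropna()                 # drop rows with any NA
cols_without_na = pages.dropna(axis=1)           # drop columns with any NA
title_present = pages.dropna(subset=["title"])   # only check the "title" column

print(rows_without_na.shape)          # (1, 3)
print(list(cols_without_na.columns))  # ['url']
print(len(title_present))             # 2
```

Like most Pandas functions, `dropna` returns a new data frame and leaves `pages` untouched.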

#### Modify Column Names

The `rename` function helps you rename the rows or columns of a DataFrame. It returns a new DataFrame, so assign the result to a variable to keep it.

```
cities.rename(columns={"city":"area"})
```

#### Remove Rows or Columns

You can remove rows and columns in your data frame using the `drop` function.

To remove rows.

```
cities.drop(0)
```

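This returns a copy of the data frame without the row whose index is equal to 0. The original `cities` data frame is left unchanged unless you assign the result back to it.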

To drop a column.

```
cities.drop(columns=["city"])
```

This will delete the column named “city”.

#### Select Rows and Columns in DataFrames Using iloc & loc

The easiest way to select and index rows and columns in Pandas is to use either `.iloc` or `.loc`.

You can select data by row or column number using `.iloc`, or select data by label or conditional statement using `.loc`.

`df.iloc[<row number>,<column number>]`

`df.loc[<row label>,<column label>]`

To select an entire row using `iloc`:

```
quarterly_sales.iloc[1,:]
```

To select an entire column using `iloc`:

```
quarterly_sales.iloc[:,1]
```
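The same kind of selection can be done by label with `.loc`. Here is a small sketch that rebuilds the `quarterly_sales` table from earlier; the names `row_2018` and `big_q1` are only illustrative.

```python
import numpy as np
import pandas as pd

quarterly_sales = pd.DataFrame(np.arange(8).reshape(2, 4),
                               index=["2018", "2019"],
                               columns=["Q1", "Q2", "Q3", "Q4"])

# Select a single cell by row label and column label
print(quarterly_sales.loc["2019", "Q2"])   # 5

# Select a whole row by label
row_2018 = quarterly_sales.loc["2018", :]

# Select rows with a condition, and keep only some columns
big_q1 = quarterly_sales.loc[quarterly_sales["Q1"] > 2, ["Q1", "Q4"]]
print(big_q1)
```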

### Split Columns & Extract Data Using Delimiters

Let’s say that your first column has values like in a CSV document that uses a semicolon as a delimiter.

```
## Column A
## 322;435;423
## 111;2443;23556
## 222
## 111;354
```

To split columns using delimiters in Pandas, use the `str.split` function.

```
newdf = df.iloc[:,0].str.split(";", expand = True)
```
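To try this without a file, the sketch below rebuilds the sample column shown above in a small data frame (the column name "Column A" is taken from the example; the rest is an assumption for illustration).

```python
import pandas as pd

df = pd.DataFrame({"Column A": ["322;435;423", "111;2443;23556", "222", "111;354"]})

# expand=True returns one column per piece; missing pieces are left empty
newdf = df["Column A"].str.split(";", expand=True)
print(newdf.shape)          # (4, 3)
print(newdf[0].tolist())    # ['322', '111', '222', '111']
```

Note that the pieces are still strings. You would need to convert them, for example with `pd.to_numeric`, before doing calculations.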

### Extract Data Using Regex

To break up a string into columns using Regex in pandas, you will need the `pandas` package. You can also import the `re` (Regular expression operations) package if you want to test your patterns outside of pandas.

```
import re
import pandas as pd
```

#### Create a New Dataset

```
import pandas as pd
import numpy as np
date = pd.date_range(start='1/01/2019', end='1/05/2019', freq='D')
date=pd.DataFrame(date)
randnum = pd.DataFrame(np.random.randint(10,99,size=(5, 1)))
semicolumn = pd.DataFrame(';', index=range(5), columns=list('A'))
cities = ['Montreal',"quebec","toronto","Vancouver","Edmonton"]
cities =pd.DataFrame(cities)
df = pd.concat([date,randnum,semicolumn,cities], axis=1)
df['combined'] = df.apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df=pd.DataFrame(df['combined'])
df
```

#### Extract a Column With Dates

```
df['date'] = df['combined'].str.extract('(....-..-..)', expand=True)
df['date']
```

#### Extract a Column With Numbers

```
df['integer'] = df['combined'].str.extract(r'( \d\d )', expand=True)
df['integer']
```

#### Extract a Column With Text

```
df['city'] = df['combined'].str.extract(r'([A-Za-z]\w{0,})', expand=True)
df['city']
```

#### Split Column Using Delimiter

```
df=df['combined'].str.split(";", expand = True)
df
```

### Pandas Pivot Tables

Pivot tables are super useful in SEO as well as in Data Science.

Those tables are useful to aggregate and compare large datasets.

```
import numpy as np
import pandas as pd
import seaborn as sns
tips = sns.load_dataset('tips')
```

```
tips.head()
```

```
tips.pivot_table('tip', index='sex', columns='time')
```
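By default, `pivot_table` shows the average of the values. The `aggfunc` parameter lets you pick another aggregation. Here is a minimal sketch on a hypothetical Search Console-style export (the `data` frame and its numbers are invented) that sums clicks per page type and device.

```python
import pandas as pd

# Hypothetical Search Console-style export
data = pd.DataFrame({
    "page_type": ["blog", "blog", "product", "product", "blog", "product"],
    "device":    ["mobile", "desktop", "mobile", "desktop", "mobile", "mobile"],
    "clicks":    [100, 50, 30, 70, 20, 10],
})

# One row per page type, one column per device, summed clicks in the cells
pivot = data.pivot_table(values="clicks", index="page_type",
                         columns="device", aggfunc="sum")
print(pivot)
```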

### Other Useful Pandas Functions

As you can now see, not only are there more than enough packages to work with data in Python, there are multiple ways of doing the same thing.

Here, I’ll state a few other functions that you might find useful.

#### Sort Data in Pandas

```
# Sort the rows by the values of the second column, highest first
df = df.sort_values(by=df.columns[1], ascending=False)
```
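Data frames have their own sorting method, `sort_values`. As a self-contained sketch on hypothetical page metrics (the `pages` data frame is invented for the example), this sorts by a named column.

```python
import pandas as pd

# Hypothetical page metrics
pages = pd.DataFrame({
    "url": ["/home", "/blog", "/contact"],
    "clicks": [120, 300, 45],
})

# Sort by one column, highest value first
by_clicks = pages.sort_values(by="clicks", ascending=False)
print(by_clicks["url"].tolist())   # ['/blog', '/home', '/contact']

# reset_index(drop=True) renumbers the rows after sorting
by_clicks = by_clicks.reset_index(drop=True)
```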

## Matplotlib

Matplotlib is a Python package to visualize data, a very important component of data analysis. After this section, you’ll know how to make awesome visualizations for data analysis.

### Create a Line Chart With Matplotlib

To create a line chart, you need to use the `pyplot` subpackage.

```
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
plt.plot(quarter,sales)
plt.show()
```

Note that you have to use the `plt.show()` function to actually display the plot.

You could also plot various mathematical functions by combining NumPy and Matplotlib.

```
x = np.linspace(start = 0, stop = 10, num = 1000)
plt.plot(x,np.sin(x))
```

```
plt.plot(x,np.exp(x))
```

### Make Great Scatter Plots

Scatter plots are great visualizations to show the relationships between two variables.

Similar to the line charts, they use cartesian coordinates (x and y-axis) to display the values of two variables.

They can help you spot a correlation (how closely the two variables move together) between the two variables.

To create the graph, just use the `scatter` function.

```
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
plt.scatter(quarter,sales)
plt.show()
```

### Build a Histogram

A histogram is a type of visualization that uses bars of different heights to show the frequency distribution of your variables.

A histogram is not a bar graph.

The difference between a bar chart and a histogram is that the former is a comparison of *discrete variables* and shows *categorical data*.

The latter represents the frequency distribution of *continuous variables* and presents *numerical data*.

When you look at a bar graph, you’ll see gaps between the bars.

```
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [100,327,225,300]
plt.bar(quarter,sales)
plt.ylim(0,500)
plt.show()
```

The histogram, however, has no such gaps.

```
import numpy as np
import matplotlib.pyplot as plt
dataset = np.random.uniform(0.0,10.0,100)
plt.hist(dataset, bins=10)
plt.show()
```

### Modify The Axis: Limits, Ticks, and Scale

Let’s start with our basic line chart.

```
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
```

#### Tip #1: Set Limits

**First,** you can add limits to your axis using `xlim` and `ylim`.

```
plt.ylim(0,500)
plt.plot(quarter,sales)
plt.show()
```

See in this graph how you can alter the perception of the data, using different visualizations? The same data now looks like there are absolutely no variations from Q1 to Q4.

#### Tip #2: Set Ticks

**Second,** choose the coordinates of your axis using `xticks` and `yticks`.

```
import numpy as np
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
plt.yticks(np.arange(320.0,331.0, step=1.0))
plt.plot(quarter,sales)
plt.show()
```

#### Tip #3: Apply a Different Scale

Last, you could make your scale logarithmic.

You can modify the scale of the X and the Y-axis using `xscale` for the former, and `yscale` for the latter.

```
plt.xscale("log")
```

In this case, it doesn’t make any sense to plot a logarithmic scale, so we’ll skip this step.
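A logarithmic scale makes more sense when the values grow exponentially. As a minimal sketch, assuming made-up data that doubles every month (`months` and `sessions` are hypothetical names), this is how `yscale` could be applied.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical traffic that doubles every month: 1, 2, 4, 8, ...
months = np.arange(1, 11)
sessions = 2.0 ** np.arange(10)

plt.plot(months, sessions)
plt.yscale("log")   # exponential growth appears as a straight line
plt.xlabel("Month")
plt.ylabel("Sessions (log scale)")
# Call plt.show() to display the chart
```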

You could also add color and other funky stuff to your graphs. To learn more about the potential of matplotlib customization, you could read this awesome guide on Towards Data Science.

### How to Add a Title And Labels to Your Visualization

To add labels to your axis, use the `xlabel` and `ylabel` functions. To add a title, use the `title` function.

Import the package and create your dataset.

```
import matplotlib.pyplot as plt
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
```

Add your labels

```
plt.xlabel("Quarter")
plt.ylabel("Sales")
```

Add a title.

```
plt.title("Sales per Quarter")
```

Create your visualization.

```
plt.plot(quarter,sales)
plt.show()
```

### Modify Visual Components

We now have a well-labeled graph. Let’s make it more beautiful.

#### Change Font-Size

To change font size, use `rcParams.update`.

```
plt.rcParams.update({"font.size":20})
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
plt.yticks(np.arange(320.0,331.0, step=1.0))
plt.xlabel("Quarter")
plt.ylabel("Sales")
plt.title("Sales per Quarter")
plt.plot(quarter,sales)
plt.show()
```

Don’t like it?

Here is how to reset the Matplotlib default settings.

```
plt.rcParams.update(plt.rcParamsDefault)
```

#### Change The Color of a Graph in MatplotLib

To change the color of your graph in Matplotlib, use the `color` parameter.

```
plt.plot(quarter,sales, color="red")
plt.show()
```

#### Add a Label and a Legend to Your Graph

Awesome, we’re almost done. Let’s finish by adding a legend to the graph.

```
plt.plot(quarter,sales, label="Quarterly Sales")
plt.legend(loc="lower left")
```

Now, the entire code.

```
quarter = ["Q1","Q2","Q3","Q4"]
sales = [320.06,327.2, 325.3, 330.4]
plt.yticks(np.arange(320.0,331.0, step=1.0))
plt.xlabel("Quarter")
plt.ylabel("Sales")
plt.title("Sales per Quarter")
plt.plot(quarter,sales, color="red", linestyle="dotted", label="Quarterly Sales")
plt.legend(loc="lower left")
plt.show()
```

### Show Data Uncertainty

Sometimes, as a Data Scientist, you’ll need to make predictions.

Predictions always come with a degree of uncertainty, which is usually reported as a margin of error or a confidence interval.

If you want to show this uncertainty in your graph, use `plt.errorbar`. In the example, `margin` is simply the length of the error bar drawn above and below each point.

```
x = np.linspace(0, 10, 50)
margin = 0.95
y = np.sin(x) + margin * np.random.randn(50)
plt.errorbar(x, y, yerr=margin, fmt=".");
```

This is it. If you want to see more customization options with Matplotlib, I suggest that you bookmark this Matplotlib Cheatsheet.

## Seaborn

Matplotlib is great but has its flaws. A good example of those flaws is that Matplotlib’s functions don’t interact very well with Pandas’ Dataframes. Seaborn comes to the rescue.

Seaborn is a layer added on top of the Matplotlib package. It brings intuitive functions to help solve most problems encountered with the other library.

To learn more, read the full tutorial on Python data analysis with Seaborn.

### Import Seaborn Package

```
import seaborn as sns
sns.set()
```

### Histplot: A Basic Seaborn Function

Let’s recreate our normal distribution graph, this time using Seaborn’s `histplot`. It replaces the older `distplot` function, which is deprecated since Seaborn 0.11.

```
import numpy as np
normal_matrix =np.random.rand(100,1000)
matrix_sum = np.sum(normal_matrix,0)
sns.histplot(matrix_sum, kde=True)
```

### How to Load a Template DataSet

To build a proper example, let’s load the “Iris” dataset.

Iris is simply a data set that is frequently used in tutorials.

```
iris = sns.load_dataset("iris")
iris.head()
```

```
sns.pairplot(iris, hue="species", height=2)
```

### Make a Linear Regression in Seaborn

Every plot in Seaborn has its own set of parameters. For `sns.jointplot`, you need three of them: the x-axis data, the y-axis data, and the dataset.

To make a linear regression, we add to those three parameters the optional parameter `kind="reg"` (for Linear Regression).

```
tips=sns.load_dataset("tips")
sns.jointplot(x="total_bill", y="tip", data=tips, kind='reg')
```

Note that you could also make a linear regression using `lmplot()` or `regplot()`. Just follow this awesome guide on linear regression with Seaborn.

To learn more about Seaborn, just read the official documentation.

## Requests

The requests library is one of the most important libraries for SEO. It lets you make HTTP requests to servers using Python.

This is the library that you need to:

- Check for server response code (2XX, 3XX, 4XX…);
- Post something;
- Read a JSON file;

**Install Requests**

```
!pip install requests
import requests
```

### Get URL Using requests.get()

Calling a URL is the basis of any SEO request. Whether you call an API or a web page, you will need to use the `requests.get()` function.

```
response = requests.get("https://en.wikipedia.org/wiki/Search_engine_marketing")
print(response)
# <Response [200]>
```

### View The Attributes You Can Run

Whenever you want to know the attributes and methods available for a specific object (here `response`), you can use the `dir()` function.

```
print(dir(response))
```

Here, you can see a bunch of useful attributes and methods you can use to access content within this `response` object, such as `status_code`, `text`, `headers`, `is_redirect` and `json`.

### Get Response Code

```
response = requests.get("https://en.wikipedia.org/wiki/Search_engine_marketing")
response.status_code
# 200
```
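When you check many URLs, it can help to group the codes by family (2XX, 3XX, 4XX…). The sketch below is one possible way to do it; `status_family` and `check_urls` are hypothetical helper names, not part of the `requests` library.

```python
import requests


def status_family(status_code):
    """Return the family of an HTTP status code, e.g. 404 -> '4XX'."""
    return "{}XX".format(status_code // 100)


def check_urls(urls):
    """Request each URL and return a list of (url, status_code, family)."""
    results = []
    for url in urls:
        # allow_redirects=False reports the 3XX code itself instead of following it
        response = requests.get(url, allow_redirects=False, timeout=10)
        results.append((url, response.status_code, status_family(response.status_code)))
    return results


print(status_family(301))   # 3XX
```

You could then call `check_urls(["https://en.wikipedia.org/wiki/Search_engine_marketing"])` to get the status code and family for each URL in the list.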

### Get Content Using Text

You can get the content of a web page with `requests` using the `text` attribute.

The `text` attribute will return the content of the response as Unicode text.

```
response.text
```

As you can see, the request returned the HTML content of the page in Unicode (or text). This is useful.

It is not the best way to extract HTML, though. If you want to “interpret” (i.e. parse) this data, you should use an HTML parser like `BeautifulSoup` or `Requests-HTML`.

### Get HTTP Header

You can review your HTTP response headers using `headers`. This can provide great information about your SEO performance.

```
response.headers
```

You can also select a specific element of the HTTP header.

```
response.headers['Last-Modified']
```

This is it.

We have now covered the basics of Python for SEO. We have seen how to use the NumPy, Pandas, Matplotlib, Seaborn, and Requests packages in Python. Sure, there is plenty more to learn. Keep in touch. Next, we’ll cover what we can do with Pandas for SEO.

SEO Strategist at Tripadvisor, ex- Seek (Melbourne, Australia). Specialized in technical SEO. Writer in Python, Information Retrieval, SEO and machine learning. Guest author at SearchEngineJournal, SearchEngineLand and OnCrawl.