Data Visualization in Python

Data visualization is critical for data analysis. Without it, it is challenging, and sometimes impossible, to share insights about your data. In this tutorial, we will look at the most popular Python libraries for data visualization: Matplotlib, Seaborn, and Plotly.

It is a fundamental part of the data science process. Serious machine learning projects almost always rely on data visualization at some stage.

Most Popular Data Visualization Libraries

Before we dive into creating visualizations, let’s discuss the various libraries we will be using.

    Matplotlib

    Matplotlib is the most widely used library for data visualization in Python. It provides a wide range of plots, including line plots, scatter plots, bar plots, and histograms.
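As a quick illustration of the plot types just listed, here is a minimal sketch that draws one of each on a 2x2 grid. The values in `x` and `y` are made up for the example and do not come from any dataset used in this tutorial.

```python
import matplotlib.pyplot as plt

# Made-up illustrative values (an assumption, not real data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 8, 7]

# One figure with four axes, one per plot type
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, y)           # line plot
axes[0, 0].set_title('Line plot')

axes[0, 1].scatter(x, y)        # scatter plot
axes[0, 1].set_title('Scatter plot')

axes[1, 0].bar(x, y)            # bar plot
axes[1, 0].set_title('Bar plot')

axes[1, 1].hist(y, bins=4)      # histogram of the y values
axes[1, 1].set_title('Histogram')

fig.tight_layout()              # keep titles and tick labels from overlapping
```

Calling `plt.show()` afterwards displays the figure.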

    Seaborn

    Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface with intuitive functions for common statistical plots, which often take more code to produce with Matplotlib alone.

    Plotly

    Plotly is an interactive data visualization library. Its default styling is often considered more polished than that of Matplotlib and Seaborn, and you can also interact with the charts by hovering, zooming, and panning. With Plotly, you can create various plots, such as scatter plots, line plots, bar plots, and more.

    Example of a Matplotlib Visualization in Python

    import matplotlib.pyplot as plt
    import pandas as pd 
     
    from sklearn.datasets import fetch_openml
    
    # load dataset
    titanic = fetch_openml('titanic', version=1, as_frame=True)
    df = titanic['data']
    df['survived'] = titanic['target']
    
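    # percentage of missing values in each column
    miss_vals = pd.DataFrame(df.isnull().sum() / len(df) * 100)
    # draw one bar per column
    miss_vals.plot(kind='bar',
        title='Missing values in percentage',
        ylabel='percentage',
        legend=False  # the single unnamed column would otherwise show up as "0" in a legend
        )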
     
    plt.show()
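To see what the missing-value expression in the example above computes without downloading the Titanic data, here is a small sketch on a hand-made table. The `toy` DataFrame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hand-made table with a few gaps (invented values)
toy = pd.DataFrame({
    'age': [22.0, np.nan, 35.0, np.nan],
    'fare': [7.25, 71.28, np.nan, 8.05],
    'sex': ['male', 'female', 'female', 'male'],
})

# isnull() marks missing cells, sum() counts them per column,
# and dividing by the row count turns the counts into percentages
missing_pct = toy.isnull().sum() / len(toy) * 100
print(missing_pct)
```

Here `age` has two missing values out of four rows (50%), `fare` has one (25%), and `sex` has none (0%).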
    

    Example of a Seaborn Visualization in Python

    In Seaborn, you can create visualizations such as a count plot, which shows how many times each category occurs.

    import seaborn as sns
    import matplotlib.pyplot as plt
     
    colors = ['Blue','Blue','Red','Red','Red','Yellow','Yellow','Yellow','Yellow','Yellow']
     
    sns.countplot(x=colors)
    plt.show()
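The bar heights in the count plot above are simply the number of times each value appears in the list. As a sanity check, the same counts can be computed with the standard library's `collections.Counter`; this is only a sketch to show what the plot represents, not how Seaborn computes it internally.

```python
from collections import Counter

colors = ['Blue', 'Blue', 'Red', 'Red', 'Red',
          'Yellow', 'Yellow', 'Yellow', 'Yellow', 'Yellow']

# Count how often each color appears in the list
counts = Counter(colors)
print(counts)
```

The result is 2 for Blue, 3 for Red, and 5 for Yellow, matching the three bars in the plot.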
    

    Example of a Plotly Visualization in Python

    import plotly.express as px
    import pandas as pd
    
    # Load the dataset (in this case, the Iris dataset from the seaborn-data repository)
    iris_df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
    
    # Create a scatter plot using Plotly Express
    fig = px.scatter(iris_df, x='sepal_length', y='sepal_width', color='species')
    
    # Set the title and axis labels
    fig.update_layout(title='Sepal Length vs. Sepal Width', xaxis_title='Sepal Length', yaxis_title='Sepal Width')
    
    # Show the plot
    fig.show()
    
