Learn SQL by Building Your First SQLite Database

In this article, I will show you the basics of SQL and build a simple SQLite database that you can play with.

We are going to cover:

  • Basic SQL Definitions
  • Basic SQL Commands
  • How to build your own SQLite database

SQL Definitions

To get started, you need to understand the basics SQLs concepts like SQL, Databases, Database Management Systems and Data Types.


Subscribe to my Newsletter


What is SQL?

SQL is a programming language that is used to interact with databases.

What is a Database?

A database is a structured set of data store in a computer system. Simply put, a database stores data into tables that have rows and columns.

idnamelatin_namephylumordername
1Common brushtail possum Trichosurus vulpecula Chordata Diprotodontia
2grey-headed flying fox Pteropus poliocephalu Chordata Chiroptera
3rainbow lorikeetTrichoglossus moluccaChordataPsittaciforme
4laughing kookaburraDacelo novaeguineaeChordataCoraciiformes
5eastern brown snakePseudonaja textilisReptiliaSerpentes
Database of animals in Australia

What are Database Management Systems?

Database management systems (DBMS) are used to store and retrieve data from databases. There are different kinds of database management systems with different features.

Let’s look at the most popular ones:

  • MySQL
  • PostgreSQL
  • SQLite

MySQL and PostgreSQL are heavier DBMS and can be stored in separate servers as your web application for example. That way, you have separate processes and one being down does not impact the other.,

SQLite is a lighter DBMS. Instead of sending requests to retrieve the data from separate servers, it will store all the data as a single file.

What are SQL Data Types?

SQL is a programming language. Like other programming languages, it has datatypes to represent a different type of information.

Data types are used by programming languages to tell the computer how to interpret the data. Is it a number or is it text?

You need to define that for the compiler to interpret.

Examples of data types in SQLite. There are a lot more data types in SQL.

Data TypeExample
TEXTHello | World
NUMERIC2020-05-01
INTEGER1 | 2 | 3
REAL0.1 | 0.5 | 0.7347
BLOB (Binary Large Objects)Any type of binary data
(Zeros and Ones)

MySQL data types are much more extensive:

  • CHAR(size)
  • BINARY(size)
  • BOOLEAN
  • INT(size)
  • FLOAT(p)
  • DOUBLE

Basic SQL Commands

Now, let’s look at the basic SQL commands to help you interact with your database.

By convention SQL commands are uppercase.

CREATE TABLE

The CREATE TABLE keyword will be used to create a new table to store data inside the database.

Typical format

That command will define the structure of the table, not the actual data inside it.

CREATE TABLE name 
(
    column DATATYPE CONSTRAINT,
    column DATATYPE CONSTRAINT
);

Example Table

Let’s create a table to store animal with their scientific classification and various behaviour information.

CREATE TABLE animals 
(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    latin_name TEXT NOT NULL,
    kingdom TEXT NOT NULL,
    phylum TEXT NOT NULL,
    classname TEXT NOT NULL,
    ordername TEXT NOT NULL,
    family TEXT NOT NULL,
    genus TEXT NOT NULL,
    species TEXT NOT NULL,
    subspecies TEXT,
    habitat TEXT,
    country TEXT,
    image BLOB,
    video BLOB,
    region TEXT
);

Constraints

Constraint are used to make sure that the data that you add to the table will always be valid.

There are various constraints that can be assigned to columns.

  • PRIMARY KEY: primary way to identify an element
  • NOT NULL: column should not be empty
  • DEFAULT: define default values for a column
  • UNIQUE: all values in a column should be different
  • CHECK(expression): define expressions to test values

The PRIMARY KEY constraint defines which is the primary way in which I will identify the animals.

In this case, I use a unique ID as the PRIMARY KEY as it is more robust than using the name, as a single animal could have many names. It is even better than latin_name, as Latin names sometimes get updated over time.

The NOT NULL constraint makes it a requirement for the creation of a new element. It means that you don’t want this column to ever be empty.

The AUTOINCREMENT key is an indication that you always want to update the ID as you add a new row to a table. Thus, SQL will automatically assign the ID.

INSERT: Add Data to the Table

The INSERT command is used to add data to the SQL table.

INSERT INTO animals
    (name,latin_name,kingdom,phylum,classname,ordername,family,genus,species)
    VALUES ("Brushtail possum","Trichosurus vulpecula","Animalia","Chordata","Mammalia","Diprotodontia","Phalangeridae","Trichosurus","T. vulpecula");

Note that I only needed to add NOT NULL values to the table and I didn’t need to add the ID as it automatically added using AUTOINCREMENT.

DELETE: Remove Data from the Table

The DELETE command is used to remove data to the SQL table.

DELETE FROM animals WHERE id = 2;

UPDATE: Modify data in the Table

The UPDATE command is used to modify data in the SQL table.

UPDATE animals SET name="Common brushtail possum" WHERE name="Brushtail possum";

SELECT: Interact with the table

SELECT is used to interact with the table. It is not used to modify data, but to retrieve data from the table.

SELECT all columns

SELECT * FROM animals
LIMIT 10;

This command will SELECT all ( * ) the columns from the animals table and will show a LIMIT of 10 rows of data.

The LIMIT argument is really as tables are usually quite big, and it makes it much faster to check a table when you just want to peek into it.

SELECT one or multiple columns

To SELECT only a subset of columns you can use the column names.

SELECT name, latin_name FROM animals
LIMIT 10;

Filter Rows in SQL Table

To SELECT columns that match a specific pattern.

SELECT * FROM animals
WHERE classname = "Mammalia"
LIMIT 10;

Order the Data in the Output

Sort columns using the ORDER BY keyword.

SELECT * FROM animals
WHERE order = "Mammalia"
ORDER BY "latin_name" DESC
LIMIT 10;

As opposed to the ASC keyword, the DESC keyword will let you sort from the highest to the lowest value. You could also have used the column index instead of the column name.

SELECT * FROM animals
WHERE order = "Mammalia"
ORDER BY 2 DESC
LIMIT 10;

Aggregate SQL table using Group By

GROUP BY is useful to aggregate data of the table

SELECT phylum, count(*) FROM animals 
GROUP BY 1;

AS Statement

The AS statement is great to control the name of a selected column.

SELECT phylum, count(*) AS number_of_animals FROM animals 
GROUP BY 1;

HAVING

The HAVING statement is a constraint that you can add to a GROUP BY statement to restrict it.

Create a SQLite database

To create a SQLite database, open the Terminal of your computer.

If like me you are using VSCode, open a new Terminal in VSCode using control ^~.

Create the SQL File

Use the touch command to create the SQL file.

$ touch animals.sql

Open the SQLite prompt

Open the SQLite prompt using the sqlite3 command.

$ sqlite3 animals.sql

Create your First Table

Using the CREATE TABLE keywords we saw earlier, we are going to create the table.

I can now inspect my table.

The .table shows me all existing tables in my database. Here I see that there is nothing in my table as nothing is returned.

INSERT Values in the table

Use the INSERT INTO keyword to add 2 animals to the table.

Animal 1: Common brushtail possum

Animal 2: Grey-headed flying fox

Now, looking at the table, I can see that I have two rows of values in my table.

Format the Display of the SQLite Table

Use .mode and .headers to display the table in a human-readable format.

Why Use SQL for Data Science

SQL is a commonly used language used by data scientists to interact with relational databases (e.g MySQL, Redshift, etc.). SQL is needed in data science to extract data from these systems

Conclusion

This is great! We now have our SQL database created. If you want to go further check out some basic SQL commands to interact with the database.

You may want to dive further by reading these articles:

5/5 - (1 vote)