In this article, I will show you the basics of SQL and build a simple SQLite database that you can play with.
We are going to cover:
- Basic SQL Definitions
- Basic SQL Commands
- How to build your own SQLite database
SQL Definitions
To get started, you need to understand the basics SQLs concepts like SQL, Databases, Database Management Systems and Data Types.
What is SQL?
SQL is a programming language that is used to interact with databases.
What is a Database?
A database is a structured set of data store in a computer system. Simply put, a database stores data into tables that have rows and columns.
id | name | latin_name | phylum | ordername |
1 | Common brushtail possum | Trichosurus vulpecula | Chordata | Diprotodontia |
2 | grey-headed flying fox | Pteropus poliocephalu | Chordata | Chiroptera |
3 | rainbow lorikeet | Trichoglossus molucca | Chordata | Psittaciforme |
4 | laughing kookaburra | Dacelo novaeguineae | Chordata | Coraciiformes |
5 | eastern brown snake | Pseudonaja textilis | Reptilia | Serpentes |
What are Database Management Systems?
Database management systems (DBMS) are used to store and retrieve data from databases. There are different kinds of database management systems with different features.
Let’s look at the most popular ones:
- MySQL
- PostgreSQL
- SQLite
MySQL and PostgreSQL are heavier DBMS and can be stored in separate servers as your web application for example. That way, you have separate processes and one being down does not impact the other.,
SQLite is a lighter DBMS. Instead of sending requests to retrieve the data from separate servers, it will store all the data as a single file.
What are SQL Data Types?
SQL is a programming language. Like other programming languages, it has datatypes to represent a different type of information.
Data types are used by programming languages to tell the computer how to interpret the data. Is it a number or is it text?
You need to define that for the compiler to interpret.
Examples of data types in SQLite. There are a lot more data types in SQL.
Data Type | Example |
TEXT | Hello | World |
NUMERIC | 2020-05-01 |
INTEGER | 1 | 2 | 3 |
REAL | 0.1 | 0.5 | 0.7347 |
BLOB (Binary Large Objects) | Any type of binary data (Zeros and Ones) |
MySQL data types are much more extensive:
CHAR(size)
BINARY(size)
BOOLEAN
INT(size)
FLOAT(p)
DOUBLE
- …
Basic SQL Commands
Now, let’s look at the basic SQL commands to help you interact with your database.
By convention SQL commands are uppercase.
CREATE TABLE
The CREATE TABLE keyword will be used to create a new table to store data inside the database.
Typical format
That command will define the structure of the table, not the actual data inside it.
CREATE TABLE name
(
column DATATYPE CONSTRAINT,
column DATATYPE CONSTRAINT
);
Example Table
Let’s create a table to store animal with their scientific classification and various behaviour information.
CREATE TABLE animals
(
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
latin_name TEXT NOT NULL,
kingdom TEXT NOT NULL,
phylum TEXT NOT NULL,
classname TEXT NOT NULL,
ordername TEXT NOT NULL,
family TEXT NOT NULL,
genus TEXT NOT NULL,
species TEXT NOT NULL,
subspecies TEXT,
habitat TEXT,
country TEXT,
image BLOB,
video BLOB,
region TEXT
);
Constraints
Constraint are used to make sure that the data that you add to the table will always be valid.
There are various constraints that can be assigned to columns.
- PRIMARY KEY: primary way to identify an element
- NOT NULL: column should not be empty
- DEFAULT: define default values for a column
- UNIQUE: all values in a column should be different
- CHECK(expression): define expressions to test values
The PRIMARY KEY
constraint defines which is the primary way in which I will identify the animals.
In this case, I use a unique ID as the PRIMARY KEY
as it is more robust than using the name, as a single animal could have many names. It is even better than latin_name
, as Latin names sometimes get updated over time.
The NOT NULL
constraint makes it a requirement for the creation of a new element. It means that you don’t want this column to ever be empty.
The AUTOINCREMENT
key is an indication that you always want to update the ID as you add a new row to a table. Thus, SQL will automatically assign the ID.
INSERT: Add Data to the Table
The INSERT
command is used to add data to the SQL table.
INSERT INTO animals
(name,latin_name,kingdom,phylum,classname,ordername,family,genus,species)
VALUES ("Brushtail possum","Trichosurus vulpecula","Animalia","Chordata","Mammalia","Diprotodontia","Phalangeridae","Trichosurus","T. vulpecula");
Note that I only needed to add NOT NULL
values to the table and I didn’t need to add the ID as it automatically added using AUTOINCREMENT
.
DELETE: Remove Data from the Table
The DELETE
command is used to remove data to the SQL table.
DELETE FROM animals WHERE id = 2;
UPDATE: Modify data in the Table
The UPDATE
command is used to modify data in the SQL table.
UPDATE animals SET name="Common brushtail possum" WHERE name="Brushtail possum";
SELECT: Interact with the table
SELECT
is used to interact with the table. It is not used to modify data, but to retrieve data from the table.
SELECT all columns
SELECT * FROM animals
LIMIT 10;
This command will SELECT
all ( *
) the columns from the animals
table and will show a LIMIT
of 10 rows of data.
The LIMIT
argument is really as tables are usually quite big, and it makes it much faster to check a table when you just want to peek into it.
SELECT one or multiple columns
To SELECT
only a subset of columns you can use the column names.
SELECT name, latin_name FROM animals
LIMIT 10;
Filter Rows in SQL Table
To SELECT
columns that match a specific pattern.
SELECT * FROM animals
WHERE classname = "Mammalia"
LIMIT 10;
Order the Data in the Output
Sort columns using the ORDER BY
keyword.
SELECT * FROM animals
WHERE order = "Mammalia"
ORDER BY "latin_name" DESC
LIMIT 10;
As opposed to the ASC
keyword, the DESC
keyword will let you sort from the highest to the lowest value. You could also have used the column index instead of the column name.
SELECT * FROM animals
WHERE order = "Mammalia"
ORDER BY 2 DESC
LIMIT 10;
Aggregate SQL table using Group By
GROUP BY is useful to aggregate data of the table
SELECT phylum, count(*) FROM animals
GROUP BY 1;
AS Statement
The AS
statement is great to control the name of a selected column.
SELECT phylum, count(*) AS number_of_animals FROM animals
GROUP BY 1;
HAVING
The HAVING
statement is a constraint that you can add to a GROUP BY
statement to restrict it.
Create a SQLite database
To create a SQLite database, open the Terminal of your computer.
If like me you are using VSCode, open a new Terminal in VSCode using control ^~
.
Create the SQL File
Use the touch
command to create the SQL file.
$ touch animals.sql
Open the SQLite prompt
Open the SQLite prompt using the sqlite3
command.
$ sqlite3 animals.sql
Create your First Table
Using the CREATE TABLE keywords we saw earlier, we are going to create the table.
I can now inspect my table.
The .table
shows me all existing tables in my database. Here I see that there is nothing in my table as nothing is returned.
INSERT Values in the table
Use the INSERT INTO
keyword to add 2 animals to the table.
Animal 1: Common brushtail possum
Animal 2: Grey-headed flying fox
Now, looking at the table, I can see that I have two rows of values in my table.
Format the Display of the SQLite Table
Use .mode
and .headers
to display the table in a human-readable format.
Why Use SQL for Data Science
SQL is a commonly used language used by data scientists to interact with relational databases (e.g MySQL, Redshift, etc.). SQL is needed in data science to extract data from these systems
Conclusion
This is great! We now have our SQL database created. If you want to go further check out some basic SQL commands to interact with the database.
You may want to dive further by reading these articles:
- Simple SQLite3 Tutorial With Python
- Create a MySQL Database using Python (pymySql)
- Install MySQL and PHPMyAdmin With XAMPP
- Backup Google Search Console Data Into MySQL With Python
SEO Strategist at Tripadvisor, ex- Seek (Melbourne, Australia). Specialized in technical SEO. Writer in Python, Information Retrieval, SEO and machine learning. Guest author at SearchEngineJournal, SearchEngineLand and OnCrawl.