Uncompress Multiple GZip files with Python

Recently, I had a lot of CSV files that needed uncompressing.

I didn’t like the options available, because I didn’t fully understand them.

I created an alternative solution with my favourite programming language: Python.

Requirements

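In this tutorial, I will be using Python along with the glob and os modules from the standard library. The script also calls the gzip command-line tool, so gzip must be available on your system.

If you haven’t already, you will need to install Python.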

Uncompress All GZip Files in a Directory

from glob import glob
import os

path = '/path/to/' # Directory that contains the files (keep the trailing slash)
list_of_files = glob(path + '*.gz') # List every gzip file in that directory

bash_command = 'gzip -dk ' + ' '.join(list_of_files) # Build the shell command as one string
os.system(bash_command) # Run the command in the terminal

Understand the Python Script

In the script above, glob lists all the files that end with the .gz extension.

It returns a list:

# ['/path/to/file1.csv.gz','/path/to/file2.csv.gz']
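To see what glob returns without touching real data, here is a minimal sketch that runs in a temporary directory. The helper name list_gzip_files and the file names are made up for this illustration; they are not part of the script above.

```python
import os
import tempfile
from glob import glob

def list_gzip_files(directory):
    """Return the sorted paths of all .gz files directly inside directory."""
    # os.path.join adds the separator, so the directory need not end in '/'
    return sorted(glob(os.path.join(directory, '*.gz')))

# Demonstration in a throwaway directory with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ('file1.csv.gz', 'file2.csv.gz', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()  # create an empty file
    found = [os.path.basename(p) for p in list_gzip_files(tmp)]

print(found)  # ['file1.csv.gz', 'file2.csv.gz']
```

Only the two names ending in .gz come back; notes.txt is ignored. I sort the result because glob does not promise any particular order.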

The Gunzip Command

By using join(), I convert the list to a string.

The bash_command variable holds the command that will be run in the terminal.

It is a string that uses the gzip -dk file.gz format.

The -d flag decompresses the files (the same job gunzip does), and -k keeps the compressed versions instead of deleting them.

This is what is stored in the bash_command variable.

# gzip -dk /path/to/file1.csv.gz /path/to/file2.csv.gz
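One caveat with joining paths on a space: a file name that itself contains a space would be split into two arguments by the shell. A possible way around this, sketched here with a hypothetical helper named build_gzip_command and made-up paths, is to quote each path with shlex.quote from the standard library before joining.

```python
import shlex

def build_gzip_command(paths):
    """Build a 'gzip -dk ...' shell command with each path quoted for the shell."""
    # shlex.quote leaves simple paths untouched and wraps risky ones in quotes
    return 'gzip -dk ' + ' '.join(shlex.quote(p) for p in paths)

# Hypothetical paths; the second one contains a space
example_paths = ['/path/to/file1.csv.gz', '/path/to/my file.csv.gz']
command = build_gzip_command(example_paths)

print(command)
# gzip -dk /path/to/file1.csv.gz '/path/to/my file.csv.gz'
```

The resulting string can be passed to os.system() in the same way as before.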

Run the Command in the Terminal With os

Using the os.system() function, we execute the gzip command in a shell to uncompress the files.
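As an alternative to os.system(), the subprocess module can run gzip without going through a shell. This is a sketch, not the method used in the script above: the function name decompress_directory is my own, and it assumes a gzip version that supports the -k flag is installed.

```python
import os
import subprocess
from glob import glob

def decompress_directory(directory):
    """Run 'gzip -dk' on every .gz file in directory; return None if there are none."""
    files = sorted(glob(os.path.join(directory, '*.gz')))
    if not files:
        # With no file arguments gzip would act on standard input instead
        return None
    # Passing a list avoids shell parsing, so spaces in file names are safe.
    # check=True raises CalledProcessError if gzip exits with an error.
    return subprocess.run(['gzip', '-dk', *files], check=True)
```

Calling decompress_directory('/path/to/') would then uncompress every gzip file in that directory and keep the originals.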

That’s it. You have now uncompressed multiple gzip files at once using Python and gzip.
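If the gzip command is not available on your machine, Python's built-in gzip module can do a similar job. The sketch below is a swapped-in technique, not the approach of this tutorial: gunzip_keep is a hypothetical helper, and unlike gzip -dk it overwrites an existing output file and does not copy the original timestamps.

```python
import gzip
import os
import shutil
import tempfile

def gunzip_keep(gz_path):
    """Decompress gz_path next to itself and keep the .gz file."""
    if not gz_path.endswith('.gz'):
        raise ValueError('expected a path ending in .gz')
    out_path = gz_path[:-3]  # drop the '.gz' suffix
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)  # copy in chunks, not all at once
    return out_path

# Demonstration in a throwaway directory with a made-up CSV file
with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, 'file1.csv.gz')
    with gzip.open(gz_path, 'wt') as f:
        f.write('a,b\n1,2\n')
    out_path = gunzip_keep(gz_path)
    with open(out_path) as f:
        content = f.read()
    both_exist = os.path.exists(gz_path) and os.path.exists(out_path)

print(repr(content), both_exist)
```

To handle a whole directory, you could loop over the list returned by glob and call gunzip_keep on each path.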
