This tutorial explains how to compress and decompress files in Linux along with the similarities and differences between gzip and bzip2 commands. Learn how to use the gzip, bzip2, gunzip and bunzip2 commands in Linux with practical examples.
A compressed file not only uses less disk space but also consumes less network bandwidth when moved to another location. Linux contains several compression utilities. Among those, this tutorial discusses the two most popular utilities: gzip and bzip2.
Similarities between gzip and bzip2
Both commands not only work in a similar fashion but also use similar syntax and options to compress and decompress files. For example, the gzip command uses the following syntax.
#gzip [option] [file]
Just like the above syntax, the bzip2 command uses the following syntax.
#bzip2 [option] [file]
Once compression is done, both commands replace the supplied source file with the compressed file. For decompression, each command also comes with a dedicated companion command: gunzip for gzip and bunzip2 for bzip2. To decompress a compressed file, we can use either the companion command or the -d option that is built into gzip and bzip2.
The following table lists the supported options and their descriptions.
|Short option||Long option||Supported command||Description|
|-h||--help||Both||List all supported options|
|-d||--decompress||Both||Decompress the compressed file|
|-f||--force||Both||Overwrite existing output file|
|-t||--test||Both||Test compressed file integrity|
|-c||--stdout||Both||Write output to standard output device|
|-q||--quiet||Both||Don’t display noncritical errors and warnings|
|-v||--verbose||Both||Display verbose messages|
|-L||--license||Both||bzip2 displays both software version and license information. gzip displays license information only.|
|-V||--version||Both||bzip2 displays both software version and license information. gzip displays version information only.|
|-1||--fast||Both||bzip2 sets the block size to 100k. gzip compresses faster.|
|-9||--best||Both||bzip2 sets the block size to 900k. gzip compresses better.|
|-z||--compress||bzip2 only||Force compression|
|-k||--keep||Both (gzip 1.6 and later)||Keep the original file|
|-s||--small||bzip2 only||Use less memory|
|-l||--list||gzip only||Display compressed and decompressed size|
|-n||--no-name||gzip only||Do not save or restore the original name and time stamp|
|-N||--name||gzip only||Save or restore the original name and time stamp|
|-r||--recursive||gzip only||Operate recursively on directories|
|-S||--suffix=SUF||gzip only||Use suffix SUF on compressed files|
As we can see in the above table:
- The options -h, -d, -f, -t, -c, -q and -v work the same way in both commands.
- The options -1, -9, -L and -V work slightly differently in the two commands.
- The options -z and -s work only in the bzip2 command. The option -k works in bzip2 and also in gzip 1.6 and later.
- The options -l, -n, -N, -S and -r work only in the gzip command.
Besides the command line options, there are a few more differences between the two commands. The following table lists those differences.
Differences between gzip command and bzip2 command
|The gzip command||The bzip2 command|
|It uses the DEFLATE algorithm.||It uses the Burrows-Wheeler block sorting algorithm.|
|To denote the compressed file, it uses the extension .gz.||To denote the compressed file, it uses the extension .bz2.|
|It compresses files at higher speed in comparison with the bzip2 command.||It provides higher compression ratio in comparison with the gzip command.|
|It doesn’t provide any inbuilt functionality or companion program to recover damaged .gz files.||It provides an additional program, bzip2recover, that can recover data from damaged .bz2 files.|
|For decompression, it provides the utility gunzip.||For decompression, it provides the utility bunzip2.|
|It supports recursive compression.||It doesn’t support recursive compression.|
This tutorial is the first part of the article "Compressing and archiving explained in Linux". It covers the following RHCSA/RHCE topic.
Archive, compress, unpack, and uncompress files using tar, star, gzip, and bzip2
The other parts of this article are as follows.
The second part explains the basic usage of the tar command with syntax and options.
The last part explains how to use the tar command in Linux with practical examples.
gzip, bzip2, gunzip and bunzip2 practical examples
Although both the gzip and bzip2 commands use fairly simple and straightforward options, if you forget an option or are unsure about one, you can list all supported options with the -h option.
To list all supported options of the gzip command, use the following command.
#gzip -h
To list all supported options of the bzip2 command, use the following command.
#bzip2 -h
Compressing and decompressing files
Compressing and decompressing files with gzip and bzip2 is relatively simple. To compress a file, simply specify its name (if the file is located in the current directory) or its full path (if the file is located in another directory) with these commands. For example, to compress a file named file_a, we can use either of the following commands.
#gzip file_a
#bzip2 file_a
As explained earlier, both commands replace the supplied file with the compressed file. So if we use gzip or bzip2 for compression, the supplied file file_a will be replaced with the compressed file file_a.gz or file_a.bz2 respectively.
To decompress the compressed file, we can use the -d option with both commands, or we can use the gunzip command or the bunzip2 command if the file is compressed with gzip or bzip2 respectively.
For example, to decompress the file file_a.bz2, we can use either of the following commands.
#bzip2 -d file_a.bz2
#bunzip2 file_a.bz2
To decompress the file file_a.gz, we can use either of the following commands.
#gzip -d file_a.gz
#gunzip file_a.gz
The following figure shows compression and decompression with the gzip and gunzip commands.
The following figure shows compression and decompression with the bzip2 and bunzip2 commands.
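The steps above can be tried safely in a scratch directory. The following is a minimal sketch and not a reproduction of the figures: the sample file contents and the use of mktemp and cksum are assumptions made for illustration. It checks that a file is identical after a compress and decompress round trip, and it runs the bzip2 part only if bzip2 is installed.

```shell
# Work in a throwaway directory so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small sample file (illustrative content) and record its checksum.
printf 'line %s\n' 1 2 3 4 5 > file_a
before=$(cksum < file_a)

# gzip replaces file_a with file_a.gz; gunzip (same as gzip -d) reverses it.
gzip file_a
gunzip file_a.gz
after_gzip=$(cksum < file_a)
[ "$before" = "$after_gzip" ] && echo "gzip round trip: contents unchanged"

# Same round trip with bzip2 and bunzip2 (same as bzip2 -d), if available.
if command -v bzip2 >/dev/null 2>&1; then
    bzip2 file_a
    bunzip2 file_a.bz2
    after_bzip2=$(cksum < file_a)
    [ "$before" = "$after_bzip2" ] && echo "bzip2 round trip: contents unchanged"
fi
```

Comparing checksums rather than file sizes confirms that the contents, not just the length, survived the round trip.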
gzip vs bzip2: which provides the higher compression ratio
The bzip2 command provides a higher compression ratio but takes more time to compress. To verify this practically, let’s compress a file with both commands and compare the sizes of the compressed files.
As we can see in the above figure, the file compressed with gzip is larger than the file compressed with bzip2. This shows that, for this file, bzip2 provides a higher compression ratio than gzip. If you need more proof, you can perform the same compression with the -v option.
As we can see in the above figure, when we compressed the file file_a with bzip2, the compression ratio was 62.58%, whereas when we compressed the same file with gzip, the compression ratio was 61.6%.
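The comparison can be repeated on any file. The sketch below is one way to do it, assuming a generated sample file (the text and the line count are made up for illustration) and using the -c option from the options table so that nothing is replaced on disk. The numbers it prints depend on the data, so treat them as a measurement for that file only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate roughly one megabyte of repetitive text (illustrative sample data).
i=0
while [ "$i" -lt 20000 ]; do
    echo "record $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > file_a

# -c writes the compressed stream to standard output, so wc -c can count
# its bytes without creating or replacing any file.
original=$(wc -c < file_a)
gz_size=$(gzip -c file_a | wc -c)
echo "original: $original bytes"
echo "gzip:     $gz_size bytes"

# Run the bzip2 measurement only if bzip2 is installed.
if command -v bzip2 >/dev/null 2>&1; then
    bz_size=$(bzip2 -c file_a | wc -c)
    echo "bzip2:    $bz_size bytes"
fi
```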
Redirecting output to a device or file
As we have seen above, by default both commands store their output in a new compressed file and, once compression is done, replace the supplied file with that compressed file. If required, we can send the output to any device, file or custom location instead.
To do so, use the -c option. The -c option makes the command write its output to the standard output device (the console) and keeps the original file intact.
The following figure shows an example. In this example, a small file is created and the gzip command is used with the -c option to compress it.
As we can see in the above figure, when the -c option is used, the command writes the output to the console.
We can use the shell redirection operator (>) to store the output in a custom location. For example, the following command compresses two files, small-file and small-file-2, in the supplied sequence and writes the output to a new file named small.gz.
#gzip -c small-file small-file-2 > small.gz
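You can also use this feature to create a single compressed file from multiple files. When small.gz is decompressed, it yields the contents of both files joined together in the same sequence, not two separate files.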
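To see what a multi-file small.gz actually contains, the command above can be run in a scratch directory. This is a minimal sketch; the contents of the two small files are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two tiny input files (illustrative content).
echo "first file"  > small-file
echo "second file" > small-file-2

# Compress both files into one output file; -c leaves the originals in place.
gzip -c small-file small-file-2 > small.gz
ls small-file small-file-2 small.gz

# Decompressing to standard output prints the contents of both files
# one after the other, in the order they were supplied.
restored=$(gunzip -c small.gz)
echo "$restored"
```

Because the file boundaries are not kept, this approach suits data that can simply be concatenated, such as log files.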
Getting information from a compressed file
The gzip command, if used with the -l option, scans the supplied compressed file and lists the following information about that file.
Compressed size, uncompressed size, compression ratio and uncompressed name
This option only works with the gzip command. The bzip2 command doesn’t support this option.
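The following sketch shows the -l option in use. It assumes GNU gzip, whose listing consists of one header row followed by one data row per file. The sample file and the awk one-liner that picks out the second column are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative sample file; remember its size before compressing.
i=0
while [ "$i" -lt 1000 ]; do
    echo "sample line $i"
    i=$((i + 1))
done > file_a
original=$(wc -c < file_a)

gzip file_a

# Prints a header row and one data row for file_a.gz.
gzip -l file_a.gz

# The second column of the data row is the uncompressed size.
listed=$(gzip -l file_a.gz | awk 'NR == 2 { print $2 }')
echo "size before compression: $original, size reported by -l: $listed"
```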
Compressing files recursively
Use the -r option with the gzip command to scan and compress all files in a directory and all of its sub-directories. For example, the following command not only compresses all files in the directory named a_dir but also recursively scans all of its sub-directories. If it finds any file in any sub-directory, it compresses that file as well.
#gzip -r a_dir
We can also use this option with the gunzip command to decompress all files from a directory and all of its sub-directories recursively.
#gunzip -r a_dir
The bzip2 command neither supports this option nor provides any other option for recursive operation.
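Recursive compression with bzip2 is still possible by letting another tool walk the directory tree. The sketch below first shows gzip -r and gunzip -r on a small sample tree, then uses the standard find command as a workaround for bzip2. The find approach is a suggestion and not a bzip2 feature, and the directory layout and file contents are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative directory tree with one nested file.
mkdir -p a_dir/sub_dir
echo "top level" > a_dir/file_a
echo "nested"    > a_dir/sub_dir/file_b

# gzip walks the tree itself.
gzip -r a_dir
gz_count=$(find a_dir -type f -name '*.gz' | wc -l)
echo "files compressed by gzip -r: $gz_count"
gunzip -r a_dir

# bzip2 has no -r option, so find does the walking and hands every
# regular file that is not already a .bz2 file to bzip2.
if command -v bzip2 >/dev/null 2>&1; then
    find a_dir -type f ! -name '*.bz2' -exec bzip2 {} +
    find a_dir -type f
fi
```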
Keeping original file intact
By default, the bzip2 command replaces the supplied input file with the compressed output file. To keep the input file intact, use the -k option with the bzip2 command. For example, the following command keeps the supplied file file_a along with the compressed output file.
#bzip2 -k file_a
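This option has always been available in bzip2. The gzip command supports it only in version 1.6 and later. With an older gzip, use the -c option with output redirection to keep the original file.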
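With gzip, the original file can be kept by combining the -c option with output redirection, as described earlier. Newer gzip releases (1.6 and later) also accept -k. The following sketch tries -k and falls back to the -c form if the installed gzip rejects it. The file names file_a and file_b and their contents are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "sample data" > file_a
echo "sample data" > file_b

# Works with any gzip: -c writes to standard output, so file_a is left alone.
gzip -c file_a > file_a.gz

# Try -k first; if this gzip rejects the option, fall back to the -c form.
if gzip -k file_b 2>/dev/null; then
    echo "this gzip accepts -k"
else
    gzip -c file_b > file_b.gz
fi

# Both originals and both compressed files should now exist.
ls file_a file_a.gz file_b file_b.gz
```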
Recovering a damaged compressed file
To recover a damaged compressed file, bzip2 provides a separate tool known as bzip2recover. This tool scans the damaged file, locates the individual compressed data blocks and writes each block to a new file of its own (with a name such as rec00001file_a.bz2), so that the intact blocks can be salvaged and the corrupt ones skipped.
To understand how this tool works, let’s take an example.
Create a compressed file with bzip2 and open it with a text editor. Add an extra line and save it. Because the file now contains both compressed data and plain text, bzip2 treats it as a corrupt compressed file.
To repair this file, we can use the bzip2recover tool. Once the intact blocks are recovered, they can be decompressed with bzip2.
The following figure shows this exercise.
The bzip2recover tool works only with bzip2 compression. A file that is compressed with any other utility or tool can’t be repaired with it.
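Before reaching for a recovery tool, the -t option from the options table can confirm whether a compressed file is actually damaged. The following is a minimal sketch of that check. Note that it simulates damage by truncating the compressed file, which is a different kind of damage from the text-editor exercise above. The sample data and file names are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

if command -v bzip2 >/dev/null 2>&1; then
    # Illustrative sample file.
    i=0
    while [ "$i" -lt 5000 ]; do
        echo "sample line $i"
        i=$((i + 1))
    done > file_a
    bzip2 file_a    # file_a is replaced by file_a.bz2

    # Simulate damage by keeping only the first half of the compressed file.
    size=$(wc -c < file_a.bz2)
    head -c "$((size / 2))" file_a.bz2 > damaged.bz2

    # -t decompresses in memory only and reports problems through its exit status.
    if bzip2 -t file_a.bz2; then intact=ok; else intact=bad; fi
    if bzip2 -t damaged.bz2 2>/dev/null; then damaged=ok; else damaged=bad; fi
    echo "file_a.bz2:  $intact"
    echo "damaged.bz2: $damaged"
fi
```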
Adjusting speed and compression ratio
We can adjust the speed and the compression ratio in both commands. Both commands support a scale of 1 to 9, where 1 provides the highest speed but the lowest compression ratio and 9 provides the highest compression ratio but the lowest speed. In other words, speed and compression ratio trade off against each other.
The default value is 6 for gzip and 9 for bzip2. To use any other value, specify that value as an option, for example -1 or -9.
The following figure shows how changing this value can impact the compression ratio.
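The effect of the level can be measured directly. The sketch below compresses the same generated file with gzip at -1, at the default level and at -9, and prints the resulting sizes. The sample data is an assumption made for illustration, and the exact numbers depend on the data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate roughly one megabyte of repetitive text (illustrative sample data).
i=0
while [ "$i" -lt 20000 ]; do
    echo "record $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > file_a

# -c sends the compressed data to standard output so that only its size is measured.
original=$(wc -c < file_a)
fast=$(gzip -1 -c file_a | wc -c)
normal=$(gzip -c file_a | wc -c)
best=$(gzip -9 -c file_a | wc -c)

echo "original:        $original bytes"
echo "gzip -1:         $fast bytes"
echo "gzip (default):  $normal bytes"
echo "gzip -9:         $best bytes"
```

To compare speed as well, each gzip command can be prefixed with the shell's time keyword.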
Compressing an already compressed file
When we compress a file, the information that is required to decompress it is stored along with the compressed data. If we compress the file again, this information is added again. Since the data was already compressed the first time, it barely shrinks the second time. So if we compress an already compressed file, we usually end up with a slightly larger file.
Although it is a waste of time and space, if required, you can compress an already compressed file again with the -f option.
Let’s take an example. Compress a file with gzip and note down its size. Now compress it again. Since the file already has the .gz suffix, gzip will refuse to compress it again. Use the -f option to force it. Once the file is compressed again, compare its size with the noted size.
The following figure shows this exercise.
A file compressed two times also needs to be decompressed two times.
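The exercise can be scripted as follows. This is a sketch assuming GNU gzip and a generated sample file (the content is made up for illustration). It prints both sizes so that they can be compared, then decompresses twice and checks that the original contents come back.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative sample file; record its checksum for the final comparison.
i=0
while [ "$i" -lt 5000 ]; do
    echo "sample line $i"
    i=$((i + 1))
done > file_a
before=$(cksum < file_a)

gzip file_a                      # file_a -> file_a.gz
once=$(wc -c < file_a.gz)

# Without -f, gzip skips a file that already ends in .gz.
gzip file_a.gz 2>/dev/null || echo "gzip refused to compress file_a.gz again"

gzip -f file_a.gz                # file_a.gz -> file_a.gz.gz
twice=$(wc -c < file_a.gz.gz)
echo "compressed once: $once bytes, compressed twice: $twice bytes"

# Two compressions need two decompressions.
gunzip file_a.gz.gz              # file_a.gz.gz -> file_a.gz
gunzip file_a.gz                 # file_a.gz -> file_a
after=$(cksum < file_a)
[ "$before" = "$after" ] && echo "original contents restored"
```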
That’s all for this tutorial. If you like this tutorial, please don’t forget to share it with friends through your favorite social channel.