When you use the Linux du command, you get both the actual disk usage and the true size of a file or directory. We will explain why these values do not match.
Actual disk usage and true size
The size of a file and the space it takes up on your hard drive are rarely the same. Disk space is allocated in blocks. If the file is smaller than a block, it is still allocated a whole block because the filesystem does not have a smaller unit of real estate to use.
If the file's size is not an exact multiple of the block size, the space it uses on the hard drive is always rounded up to the next whole block. For example, if a file is larger than two blocks but smaller than three, it still requires three blocks to store.
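The rounding rule can be sketched with shell integer arithmetic. The 4096-byte block size and the 9000-byte file size are illustrative assumptions, not values taken from any particular system.

```shell
#!/bin/sh
# Illustrative values (assumptions): a 4096-byte block and a 9000-byte file,
# which is larger than two blocks (8192) but smaller than three (12288).
block_size=4096
file_size=9000

# Ceiling division: adding (block_size - 1) before dividing makes any
# partial block count as a whole one.
blocks=$(( (file_size + block_size - 1) / block_size ))
allocated=$(( blocks * block_size ))

echo "blocks needed:   $blocks"
echo "bytes allocated: $allocated"
```

With these numbers the file needs three blocks, so 12288 bytes are allocated for 9000 bytes of content.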
Two measurements are used in relation to file size. The first is the actual file size, which is the number of bytes of content that make up the file. The second is the effective size of the file on the hard drive, which is the number of file system blocks needed to store the file.
Let’s look at a simple example. We will redirect one character to a file to create a small file:
echo "1" > geek.txt
Now we will use the long-format listing, ls -l, to see the length of the file:
ls -l geek.txt
The length is the numeric value that follows the dave dave entries, and it is two bytes. Why is it two bytes when we only sent one character to the file? Let's take a look at what is going on inside the file.
We will use the hexdump command, which gives us an exact byte count and lets us "see" non-printable characters as hexadecimal values. We will also use the -C (canonical) option so that the body of the output shows the hexadecimal values alongside their alphanumeric character equivalents:
hexdump -C geek.txt
The output shows us that starting at offset 00000000 in the file, there is a byte that contains the hex value 31 and one that contains the hex value 0A. The right side of the output displays these values as alphanumeric characters where possible.
The hexadecimal value 31 represents the character "1". The hexadecimal value 0A represents the newline character, which cannot be shown as an alphanumeric character, so it is displayed as a dot (.) instead. The newline character is added by echo. By default, echo starts a new line after it has displayed the text it needs to write to the terminal window.
This agrees with the output from ls and is consistent with a file length of two bytes.
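To confirm that the second byte really comes from echo, the byte counts of echo and printf output can be compared. This is a minimal sketch; the temporary directory and the two file names are made up for the illustration.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing file is overwritten.
tmpdir=$(mktemp -d)

echo "1" > "$tmpdir/with_newline.txt"        # echo appends a newline (hex 0A)
printf '%s' "1" > "$tmpdir/no_newline.txt"   # printf writes only what it is given

# wc -c counts bytes; reading from stdin keeps the file name out of the output.
with_nl=$(wc -c < "$tmpdir/with_newline.txt")
without_nl=$(wc -c < "$tmpdir/no_newline.txt")

echo "echo:   $with_nl bytes"
echo "printf: $without_nl bytes"

rm -r "$tmpdir"
```

The file written by echo is two bytes long, while the one written by printf holds only the single character.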
Now we will use the du command to look at the file size:
du geek.txt
It reports a size of four, but four of what?
There are blocks and then there are blocks
When du reports file sizes in blocks, the block size it uses depends on several factors. You can specify the block size on the command line. If you don't force du to use a particular block size, it follows a set of rules to decide which one to use.
It first checks the following environment variables:
- DU_BLOCK_SIZE
- BLOCK_SIZE
- BLOCKSIZE
If any of these is set, du uses its value as the block size and stops checking. If none is set, du defaults to a block size of 1024 bytes, unless the POSIXLY_CORRECT environment variable is set, in which case the default block size is 512 bytes.
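The lookup order can be sketched as a small shell function. This is only an illustration of the precedence, not du's actual code: resolve_block_size is a hypothetical helper, and it assumes the GNU du variable names DU_BLOCK_SIZE, BLOCK_SIZE, and BLOCKSIZE. It also echoes the value unchanged, whereas du itself interprets suffixes such as K or M.

```shell
#!/bin/sh
# Hypothetical helper (not part of du): echo the block size that the
# lookup order described above would select.
# Usage: resolve_block_size [size-given-on-command-line]
resolve_block_size() {
    if [ -n "${1:-}" ]; then
        echo "$1"                  # an explicit command-line size wins
    elif [ -n "${DU_BLOCK_SIZE:-}" ]; then
        echo "$DU_BLOCK_SIZE"
    elif [ -n "${BLOCK_SIZE:-}" ]; then
        echo "$BLOCK_SIZE"
    elif [ -n "${BLOCKSIZE:-}" ]; then
        echo "$BLOCKSIZE"
    elif [ -n "${POSIXLY_CORRECT+set}" ]; then
        echo 512                   # default when POSIXLY_CORRECT is set
    else
        echo 1024                  # usual default
    fi
}

resolve_block_size          # result depends on your environment
resolve_block_size 4096     # an explicit size overrides everything else
```

On the real command line, GNU du takes the explicit size through its --block-size option.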
So how do we know which one is being used? You could check each environment variable to work it out, but there is a faster way. Let's compare the du result with the block size that the file system uses.
To determine the block size that the file system is using, we will use the tune2fs program. We will use the -l (list superblock) option, pipe the output through grep, and print only the lines that contain the word "Block".
In this example, we will look at the file system on the first partition of the first hard drive, sda1, and we will need to use sudo:
sudo tune2fs -l /dev/sda1 | grep Block
The file system block size is 4096 bytes. If we divide this by the result we got from du (four), we find that du's default block size is 1024 bytes. Now we know a few important things.
First, we know that the smallest amount of file system resources that can be allocated to store a file is 4096 bytes. This means that even our tiny two-byte file takes up 4 KB on the hard drive.
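The two measurements can be put side by side with du itself. This is a sketch that assumes GNU du from coreutils, since --apparent-size and --block-size are GNU options; the temporary directory is created only for the illustration.

```shell
#!/bin/sh
# Assumes GNU du (coreutils).
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Content length in bytes (the figure ls -l reports).
apparent=$(du --apparent-size --block-size=1 "$tmpdir/geek.txt" | cut -f1)

# Space allocated on disk, in bytes and in 1024-byte units.
on_disk_bytes=$(du --block-size=1 "$tmpdir/geek.txt" | cut -f1)
on_disk_1k=$(du --block-size=1024 "$tmpdir/geek.txt" | cut -f1)

echo "apparent size: $apparent bytes"
echo "disk usage:    $on_disk_bytes bytes ($on_disk_1k units of 1024 bytes)"

rm -r "$tmpdir"
```

On a file system with 4096-byte blocks like the one above, the disk usage would be expected to come out as 4096 bytes, or four units of 1024 bytes. The figures can differ on other file systems, some of which store very small files without allocating a full block.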