
The Linux du command can report both the actual disk usage and the true size of a file or directory. We will explain why these two values do not match.

Actual disk usage and true size

The size of a file and the space it takes up on your hard drive are rarely the same. Disk space is allocated in blocks. If the file is smaller than a block, it is still allocated a whole block because the filesystem does not have a smaller unit of real estate to use.

If the file’s size is not an exact multiple of the block size, the space it uses on the hard drive must always be rounded up to the next whole block. For example, if a file is larger than two blocks but smaller than three, it still requires three blocks to store.
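The rounding rule can be sketched as integer arithmetic in the shell. This is only an illustration: blocks_needed is a hypothetical helper, not a system command, and the 4096-byte block size and the sample file sizes are assumptions.

```shell
# Hypothetical helper: how many whole blocks does a file of a given size need?
# Usage: blocks_needed FILE_SIZE_IN_BYTES BLOCK_SIZE_IN_BYTES
blocks_needed() {
    # Integer ceiling division: round up to the next whole block.
    echo $(( ($1 + $2 - 1) / $2 ))
}

blocks_needed 2 4096       # a two-byte file still needs one block
blocks_needed 8192 4096    # exactly two blocks
blocks_needed 8193 4096    # one byte over two blocks needs three
```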

Two measurements are used in relation to file size. The first is the actual size of the file, which is the number of bytes of content that make up the file. The second is the effective size of the file on the hard drive. This is the number of file system blocks needed to store the file.

Example

Let’s look at a simple example. We will redirect one character to a file to create a small file:

  echo "1" > geek.txt


Now we will use the long-format listing, ls -l, to see the length of the file:

  ls -l geek.txt 

Command "ls -l geek.txt" in a terminal window.

The length is the numeric value that follows the dave dave entries, which is two bytes. Why is it two bytes when we only sent one character to the file? Let’s take a look at what’s going on inside the file.

We will use the hexdump command, which gives us an exact byte count and lets us "see" non-printing characters as hexadecimal values. We will also use the -C (canonical) option so that the body of the output shows the hexadecimal values alongside their alphanumeric character equivalents:

  hexdump -C geek.txt 

Command "hexdump -C geek.txt" in a terminal window.

The output shows us that, starting at offset 00000000 in the file, there is a byte that contains the hexadecimal value 31 and one that contains the hexadecimal value 0A. The right side of the output displays these values as alphanumeric characters where possible.

The hexadecimal value 31 represents the digit one. The hexadecimal value 0A represents the newline character, which cannot be displayed as an alphanumeric character, so it is shown as a dot (.) instead. The newline character was added by echo. By default, echo starts a new line after it writes its text to the terminal window.

This matches the output from ls and agrees with the file length of two bytes.
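As a quick check of where the second byte comes from, the sketch below writes the same character once with echo and once with printf, which adds no newline of its own. The temporary directory and file names are assumptions made for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d)

echo "1" > "$tmpdir/with_newline.txt"        # echo appends a newline (0A)
printf '1' > "$tmpdir/without_newline.txt"   # printf writes only what it is given

# wc -c counts bytes; reading from stdin keeps the file name out of the output.
with_newline=$(wc -c < "$tmpdir/with_newline.txt")
without_newline=$(wc -c < "$tmpdir/without_newline.txt")
echo "echo wrote $with_newline bytes, printf wrote $without_newline"

rm -r "$tmpdir"
```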


Now we will use the du command to look at the file size:

  du geek.txt 

The du geek.txt command in a terminal window.

It says the size is four, but four of what?

There are blocks and then there are blocks

When du reports file sizes in blocks, the block size it uses depends on several factors. You can specify which block size it should use on the command line. If you don’t force du to use a particular block size, it follows a set of rules to decide which one to use.

It first checks the following environment variables:

  • DU_BLOCK_SIZE
  • BLOCK_SIZE
  • BLOCKSIZE

If any of these exists, the block size is set and du stops checking. If none of them is set, du defaults to a block size of 1024 bytes, unless an environment variable called POSIXLY_CORRECT is set. If that is the case, du defaults to a block size of 512 bytes.
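The lookup order can be sketched as a small shell function. This is a sketch under stated assumptions, not the code du itself runs: du_default_block_size is a hypothetical helper, and it treats an empty block-size variable as unset.

```shell
# Hypothetical helper that mirrors the lookup order described above.
# It only prints the value du would start from; it does not run du.
du_default_block_size() {
    if [ -n "${DU_BLOCK_SIZE:-}" ]; then
        echo "$DU_BLOCK_SIZE"
    elif [ -n "${BLOCK_SIZE:-}" ]; then
        echo "$BLOCK_SIZE"
    elif [ -n "${BLOCKSIZE:-}" ]; then
        echo "$BLOCKSIZE"
    elif [ -n "${POSIXLY_CORRECT+set}" ]; then
        # POSIXLY_CORRECT only has to exist; its value does not matter.
        echo 512
    else
        echo 1024
    fi
}

# Each call runs in a subshell so the variables do not leak into the session.
( unset DU_BLOCK_SIZE BLOCK_SIZE BLOCKSIZE POSIXLY_CORRECT; du_default_block_size )
( unset DU_BLOCK_SIZE BLOCK_SIZE BLOCKSIZE; POSIXLY_CORRECT=1; du_default_block_size )
```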

So how do we know which one is being used? You could check each environment variable to work it out, but there is a faster way. Let’s compare the results with the block size that the file system uses.

To find the block size the file system is using, we will use the tune2fs program. We will use the -l (list superblock) option, pipe the output through grep, and print only the lines that contain the word "Block".

In this example, we will look at the file system on the first partition of the first hard drive, sda1, and we will need to use sudo:

  sudo tune2fs -l /dev/sda1 | grep Block

Command "sudo tune2fs -l /dev/sda1 | grep Block" in a terminal window.

The file system block size is 4096 bytes. If we divide that by the result we got from du (four), it shows that the default du block size is 1024 bytes. Now we know a few important things.

First, we know that the smallest amount of file system resources that can be allocated to store a file is 4096 bytes. This means that even our tiny two-byte file takes up 4 KB on the hard drive.

The second thing to keep in mind is that applications designed to report hard disk and file system statistics, such as du, ls, and tune2fs, can have different notions of what "block" means. The tune2fs application reports the true block size of the file system, while ls and du can be configured or forced to use other block sizes. Those block sizes are not meant to relate to the block size of the file system; they are just "chunks" that these commands use in their output.

Finally, apart from using different block sizes, the answers from du and tune2fs convey the same value. The tune2fs result was one block of 4096 bytes, and the du result was four blocks of 1024 bytes.
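To see one allocation expressed in different units, the sketch below asks du for the same file twice and then asks stat how it counts blocks. It assumes GNU coreutils (for --block-size and stat -c) and uses a temporary copy of the example file; the numbers printed depend on your file system.

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# The same disk usage, reported in two different block sizes.
in_1k=$(du --block-size=1024 "$tmpdir/geek.txt" | cut -f1)
in_4k=$(du --block-size=4096 "$tmpdir/geek.txt" | cut -f1)
echo "du in 1024-byte blocks: $in_1k"
echo "du in 4096-byte blocks: $in_4k"

# GNU stat: %b is the number of blocks allocated, %B the size of each of those blocks.
stat -c 'stat counts %b blocks of %B bytes' "$tmpdir/geek.txt"

rm -r "$tmpdir"
```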

Using du

With no command-line parameters or options, du lists the total disk space used by the current directory and all of its subdirectories.

Let’s look at an example:

  du

The du command in a terminal window.

The size is specified in the default block size of 1024 bytes per block. The entire subdirectory tree has been traversed.

Using du on a different directory

If you want du to report on a directory other than the current one, you can pass the path to that directory on the command line:

  du ~/.cach/evolution/

The du ~/.cach/evolution/ command in a terminal window.

Using du on a specific file

If you want du to report on a specific file, pass the path to that file on the command line. You can also pass a shell pattern to select a group of files, such as *.txt:

  du ~/.bash_aliases

du ~/.bash_aliases command in a terminal window.

Reporting on files in directories

To get a report on the files in the current directory and subdirectories, use the parameter -a (all files):

  du -a

The du -a command in a terminal window.

For each directory, the size of each file is listed, as well as a total for the directory.

Directory tree depth limit

You can tell du to list the directory tree only down to a certain depth. To do this, use the -d (max depth) option and give it a depth value. Note that all subdirectories are still scanned and used to calculate the reported totals, but they are not all listed. To set a maximum directory depth of one level, use this command:

  du -d 1

Command "du -d 1" in the terminal window.

The output lists the subdirectories of the current directory and gives a total size for each one.

To list directories one level deeper, use this command:

  du -d 2

Command "du -d 2" in the terminal window.
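The effect of the depth limit is easy to see on a small throwaway tree. In this sketch the nested directory names a/b/c are assumptions chosen for illustration; only the number of lines du prints changes with the depth.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a/b/c"    # three nested directories below the starting point

# Count how many directories du lists at each depth limit.
depth1=$(du -d 1 "$tmpdir" | wc -l)   # the starting directory and a
depth2=$(du -d 2 "$tmpdir" | wc -l)   # the starting directory, a, and a/b
echo "lines listed with -d 1: $depth1"
echo "lines listed with -d 2: $depth2"

rm -r "$tmpdir"
```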

Setting the block size

You can use the --block option (an abbreviation of --block-size) to set the block size for the current operation. To use a block size of one byte, and so get the exact sizes of directories and files, use the following command:

  du --block=1

"du --block=1" command in terminal window.

If you want to use a block size of one megabyte, you can use the option -m (megabyte), which is the same as --block=1M :

  du -m

The du -m command in a terminal window.

If you want the sizes to be reported in the most appropriate unit for the disk space used by each directory and file, use the -h (human-readable) option:

  du -h

The du -h command in a terminal window.

To see the apparent size of the file, rather than the amount of hard disk space used to store the file, use the option --apparent-size :

  du --apparent-size 

du --apparent-size command in a terminal window.

You can combine this with the option -a (all) to see the apparent size of each file:

  du --apparent-size -a 

The du --apparent-size -a command in a terminal window.

Each file is listed along with its apparent size.

The output of the command "du --apparent-size -a" in the terminal window.
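The two measurements can be placed side by side for a single file. This sketch assumes GNU du and reuses the two-byte example file in a temporary directory; the disk usage figure depends on your file system, so no value is shown for it.

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Apparent size in bytes: the length of the content (the digit plus a newline).
apparent=$(du --apparent-size --block-size=1 "$tmpdir/geek.txt" | cut -f1)
# Disk usage in bytes: whatever the file system actually allocated.
on_disk=$(du --block-size=1 "$tmpdir/geek.txt" | cut -f1)

echo "apparent size: $apparent bytes"
echo "disk usage:    $on_disk bytes"

rm -r "$tmpdir"
```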

Display only totals

If you want du to report only the total for the directory, use the -s (summarize) option. You can also combine this with other options, such as the -h (human-readable) option:

  du -h -s

Command "du -h -s" in the terminal window.

Here we will use it with the parameter --apparent-size :

  du --apparent-size -s 

The du --apparent-size -s command in a terminal window.

Modification time display

To see the time and date of the most recent modification of any file in each directory, use the --time option:

  du --time -d 2 

Command "du --time -d 2" in a terminal window.

Strange results?

If you see strange results from du, especially when you compare sizes with the output of other commands, the cause is usually the different block sizes that different commands can be set to, or that they use by default. It can also be due to the difference between actual file sizes and the disk space needed to store them.

If you need to match the output of other commands, experiment with the --block option of du.
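As one way of lining up two commands, the sketch below forces du and ls -s to use the same block size and compares the counts. It assumes GNU coreutils, where both commands accept --block-size, and uses a temporary file for illustration.

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Ask both commands for the allocated size in 1024-byte blocks.
du_blocks=$(du --block-size=1024 "$tmpdir/geek.txt" | cut -f1)
ls_blocks=$(ls -s --block-size=1024 "$tmpdir/geek.txt" | awk '{print $1}')

echo "du reports $du_blocks, ls -s reports $ls_blocks"

rm -r "$tmpdir"
```

With the block size pinned on both sides, the two counts should agree, because both describe the same allocation.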
