How file size is measured in the operating system. File and folder sizes, or what KB, MB and GB are

  • Date of: 24.02.2022

A magnetic disk drive (NMD) offers the same sequential access to information as a magnetic tape drive (NML). In effect, a disk drive combines several sequential access devices, and data search time is reduced because access to a record does not depend on its position relative to other records.

The design of an NMD is more complicated than that of an NML and, consequently, its cost is higher. An NMD uses a pack of magnetic disks (platters) as the data carrier. The disks are fixed on a single spindle, around which they rotate at a constant speed. The surface of a magnetic disk covered with a ferromagnetic layer is called the working surface.

The first such devices used interchangeable disk packs. Placed in a casing with a hermetically sealed tray, they formed compact storage units called volumes. The most common volume capacities were 7.25, 29 and 100 MB. The operator placed the pack on the drive spindle, removed the casing (at this point the pack was fixed on the spindle automatically) and switched on the motor that spins up the pack. Once the required rotation speed was reached, the block of magnetic heads (the "comb") was inserted into the gaps between the disks.

The heads are of the floating type: they are kept at the required distance from the disk surface by the air flows that arise when the pack rotates. Later designs mainly used either full-contact heads (floppy disks) or heads held at a set distance from the surface ("hard drives"). Attempts to use liquid media (various oils) to keep the heads at the necessary distance were not successful.

The number of magnetic heads is equal to the number of working surfaces on one disk pack. If the package consists of 11 disks, then the access mechanism consists of 10 holders with two magnetic heads on each of them. The set of tracks that can be accessed when the block is in a fixed position is called a cylinder. The magnetic head holders are combined into a single block in such a way as to ensure their synchronous movement along all cylinders. By fixing the block of the access mechanism on any of the cylinders, it is possible to make a transition from one track to another of this cylinder by electronically switching the heads.

Any operation of reading information from a magnetic disk, or writing to it, consists of three stages. At the first stage, the magnetic head is moved mechanically to the track containing the required data. At the second stage, the drive waits until the required record passes under the magnetic head. At the third stage, the information is actually exchanged between the computer and the magnetic disk. Thus, the total time of a read or write operation is the sum of the time spent seeking the track, the time spent waiting for the record (the so-called rotational delay) and the time of the exchange with the computer. The maximum rotational delay is equal to the time of one complete rotation of the magnetic disk.
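The three stages can be added up in a few lines of code. The sketch below is only an illustration: the rotation speed, seek time and transfer time are hypothetical values, and the function names are not taken from any real library.

```php
<?php
// Hypothetical drive parameters, chosen only for illustration.
$rpm = 7200;         // rotational speed, revolutions per minute
$seekMs = 9.0;       // stage 1: moving the heads to the track, ms
$transferMs = 0.1;   // stage 3: exchange with the computer, ms

// The maximum rotational delay is the time of one full revolution.
function max_rotational_delay_ms(int $rpm): float
{
    return 60000 / $rpm;
}

// Total time of one operation: seek + rotational delay + transfer.
function access_time_ms(float $seekMs, float $rotationMs, float $transferMs): float
{
    return $seekMs + $rotationMs + $transferMs;
}

$worstDelay = max_rotational_delay_ms($rpm);
$worstCase = access_time_ms($seekMs, $worstDelay, $transferMs);

printf("Maximum rotational delay: %.2f ms\n", $worstDelay);
printf("Worst-case access time: %.2f ms\n", $worstCase);
```

With these assumed numbers the rotational delay is of the same order as the seek time, which is why both mechanical stages dominate the total.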

There are several ways to physically lay out data on a hard drive. Conventional hard drives use "vertical" mapping: data is written first to one cylinder from top to bottom, then the heads move to the next cylinder, and so on. With "horizontal" mapping, data is first written sequentially from cylinder to cylinder on the surface of one platter, then in the same way on the surface of the next platter, and so on. This method is better suited to recording a continuous high-speed data stream, for example "live" video.

The mechanism is enclosed in a sealed, dust-tight case. This unit is usually called the head disk assembly (HDA). The inside of the hard drive must be free of dust, so the air entering the HDA passes through special filters. The motor, which rotates the disks at a constant speed measured in revolutions per minute (rpm), starts when power is applied to the drive and keeps running until power is removed.

Between the platters there is room for a read/write head mounted at the end of a moving arm. The head is separated from the platter by a fraction of a millimeter. In the first systems this distance was 0.2 millimeters; today it has been reduced to 0.07 millimeters. Therefore, the slightest contamination can push the head into the disk, destroying the head and damaging the magnetic coating of the disk.

The heads are designed to touch the disk only after it has stopped, when the power is off. As the rotational speed decreases, the air flow weakens and, when the disk stops completely, the head gently touches its surface. The point of contact is called the landing zone (LZ); it is reserved for the heads to land on and does not contain data.

When a disk is formatted at the physical level, it is divided into tracks and sectors. Physically, the tracks are located one above the other and form cylinders, which are then divided into sectors. One sector holds 512 bytes. A sector is the smallest unit in which disk space is measured. All hard disks have spare sectors, which the drive's controller uses when bad sectors are found on the disk.
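For a drive where every track holds the same number of sectors, the capacity follows directly from this layout. The sketch below assumes a hypothetical geometry (1024 cylinders, 16 heads, 63 sectors per track); only the 512-byte sector size comes from the text.

```php
<?php
const BYTES_PER_SECTOR = 512; // sector size given in the text

// Capacity in bytes of a drive without zone recording, where every
// track holds the same number of sectors.
function disk_capacity_bytes(int $cylinders, int $heads, int $sectorsPerTrack): int
{
    return $cylinders * $heads * $sectorsPerTrack * BYTES_PER_SECTOR;
}

// Hypothetical geometry, chosen only for illustration.
$bytes = disk_capacity_bytes(1024, 16, 63);
printf("%d bytes = %.1f MB\n", $bytes, $bytes / (1024 * 1024));
```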

Theoretically, outer cylinders can contain more data because they have a larger circumference. However, in drives that do not use the zone recording method, all cylinders contain the same amount of data, despite the fact that the circumference of the outer cylinders can be twice that of the inner ones. As a result, outer track space is wasted, as it is used extremely inefficiently.

The process of controlling the recording density is called pre-compensation. To compensate for different recording densities, the Zone Bit Recording method is used, where the entire disk space is divided into zones (eight or more), each of which usually includes from 20 to 30 cylinders with the same number of sectors.

In the zone at the outer radius (the lowest-numbered zone), more sectors are recorded per track (120 to 96). Towards the center of the disk the number of sectors decreases, reaching 64 to 56 in the highest-numbered zone. In this way the capacity of a hard drive can be increased by approximately 30%.

As the recording density on a disk increases, it becomes difficult to detect the peaks of the analog signals coming from the magnetic heads. To overcome this, drives now use the PRML (Partial Response Maximum Likelihood) method, which applies a special digital filtering algorithm to the input signal.

The standard capacity of a DVD disc is 4.7 GB. There are also double-sided DVDs, on which data can be recorded on both sides. These discs have a capacity of 9.4 GB. Dual-layer discs also exist, but are less common. Their capacities are: single-sided dual-layer, 8.5 GB; double-sided dual-layer, 17.1 GB.

How to find out the size of a file or folder

To find out the size of a file or a folder with files, move the cursor (arrow) over it and hold it there for a few seconds. A small window will appear with the characteristics of the file or folder, and the size is one of them.

If nothing appears when hovering over a file or folder, then right-click on that file or folder. From the list that opens, select "Properties". A window will open showing the size of this file or folder.

Now let's practice determining the size:

Task:

We have a 30 MB file. Can we write it to a CD? To a DVD? To a 1 GB flash drive?

Solution:

A CD can fit 700 MB. Our file is 30 MB in size. 700 MB is more than 30 MB. Conclusion: the file will fit on a CD.

A DVD disc holds 4.7 GB. One Gigabyte equals 1024 Megabytes. That is, about 5000 MB is placed on one DVD disc. And 5000 MB is much more than 30 MB. Conclusion: our file will fit on a DVD.

We were given a 1 GB flash drive. One GB contains 1024 MB. 1024 is more than 30. Conclusion: the file will fit on a 1 GB flash drive.
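The same comparison can be written as a short script. This is only a sketch of the reasoning in the task: the capacities are the ones quoted above, and the function name fits() is made up for the example.

```php
<?php
// Capacities in megabytes, as quoted in the task (1 GB = 1024 MB).
$media = [
    'CD'               => 700,
    'DVD'              => 4.7 * 1024,
    '1 GB flash drive' => 1 * 1024,
];

// A file fits if its size does not exceed the capacity of the medium.
function fits(float $fileMb, float $capacityMb): bool
{
    return $fileMb <= $capacityMb;
}

$fileMb = 30;
foreach ($media as $name => $capacityMb) {
    echo $name . ': ' . (fits($fileMb, $capacityMb) ? 'fits' : 'does not fit') . "\n";
}
```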

You have probably heard expressions such as "my game weighs too much", "a light file" or "a heavy folder". Can folders and files really be weighed? And in what units? Strange as it sounds, files and folders do have their own weight or, more correctly, size. If they weighed nothing, we would not need hard drives and would never have to free up space for other information.

What is file and folder size

Even information can be measured. For this, computer terminology has its own units of measurement: bytes, kilobytes, megabytes, gigabytes, terabytes, and so on. All computer information is written using 0 (zero) and 1 (one). A single zero or one is called a bit. A group of eight bits is called a byte.

Basic units of information storage:

1 byte = 8 bits

1 kilobyte (KB) = 1024 bytes

1 megabyte (MB) = 1024 kilobytes

Since the computer works in the binary system (1 and 0), it is much more convenient for it to divide information in this way. One kilobyte is 2^10 = 1024 bytes. People use the decimal number system, so numbers like this look unusual to us.

Any file (graphic, music, video, etc.) has its own size. All information in the computer is recorded on a hard disk, which has a certain capacity. Computer memory is also measured in these units.

Any storage medium, such as a hard drive, a floppy disk, a flash drive, a memory card or a CD/DVD disc, has its own capacity, and you cannot write more than that to it.

How to find out how much a file or folder weighs

To find out how much a file or folder weighs, hover over the file or folder and wait a couple of seconds until a window with information pops up.

If a folder or file is too large, its size will not be shown this way. In this case, right-click the folder or file, select Properties (at the very bottom of the drop-down menu), and look at the size on the General tab of the new window.

Information in general is measured in BYTES. Measurement in these units began back in 1956, and at that time the unit was large enough. To make the scale clearer: 1 byte = 1 character. As technology developed, the amount of information grew, and it became inconvenient to measure large amounts of it in BYTES. That is when the prefixed units KILOBYTE (KB), MEGABYTE (MB), GIGABYTE (GB), TERABYTE (TB), etc. appeared.

To understand how big or small these values are, here is a comparison:
- 1 KB (one kilobyte) = 1024 bytes, roughly the amount of information on one printed A4 sheet;
- 1 MB (one megabyte) = 1024 kilobytes, the amount of information in a sizeable volume of 600-700 pages;
- 1 GB (one gigabyte) = 1024 megabytes, already a whole library of 1024 books with 600 pages each;
- 1 TB (one terabyte) = 1024 gigabytes, an amount of information comparable to an average European library, which contains about 8 million books. For comparison, the Russian State Library has about 43 million items.

Now let's compare the volume and type of information with respect to the media on which this information can be recorded.

Floppy disk with a capacity of 1.44 MB. Once upon a time, the floppy disk was the main affordable carrier of digital information, because a great deal could be written on it. Now floppy disks are mainly used by accountants to store electronic keys and signatures. The reason is simple: there is not enough space on a floppy disk for modern files. A floppy disk can hold one or two photographs taken with a 3-megapixel mobile phone camera, or five to ten Word or Excel documents.

Flash drive with a capacity of 1 GB. The most convenient medium today. I chose a 1 GB flash drive to keep the numbers simple, but at the time of writing there are also 64 GB flash drives!
What can be recorded on a 1 GB flash drive: one movie of relatively good quality; about 200 music files in .mp3 format; about 200 good quality photos; many small documents and programs.

CD with a capacity of 700 MB. On a CD, you can burn: one movie in .avi format, in relatively good quality; about 150 music files in .mp3 format; about 150 good quality photos; many small documents and programs.

DVD disc with a capacity of 4.7 GB. On a DVD disc, you can burn: one movie in DVD or HDTV format; 4-5 good quality .avi movies; about 1200 music files in .mp3 format; about 1000 good quality photos; a great many documents and programs.

Hard drive (Winchester) with a capacity of 120 GB. Here, rather than counting documents, I will compare it with the number of films that fit on such a drive. So, on a 120 GB hard drive, you can record 25 movies in DVD or HDTV quality!

Now let's take it in turn to figure out how to determine the size of a disk, file or folder.
In Windows, you can determine the size of a file, folder, or drive in EXPLORER. You can start the "Explorer" by double-clicking the LEFT mouse button on the shortcut "My Computer" on the desktop or using the key combination "Win + E".

If, for example, you want to know how much free space is left on a disk, such as a flash drive, right-click the icon of the removable disk. It is usually labeled "Removable disk (F:)" or "Flash drive name (F:)".

In the menu that opens, select the "Properties" item at the very bottom. A window will open.

Here you can see how much space is used (highlighted in blue), how much is free (highlighted in pink) and the total capacity of the disk.

In this way, you can find out the remaining free space not only on a flash drive, but also on any removable drive or logical drive of a hard disk.

The procedure for determining the size of a file or folder is the same as for a disk. That is, find the desired file or folder on the disk, right-click it and select "Properties".


All the necessary information will be there.


If you want to know the size of a group of files or folders, then you need to select them and do the same operations, i.e. right-click on one of the selected files or folders, select "Properties" and view the size.

Yes, the second part of the lesson does not quite fall under the heading "Computer basics", but nevertheless. If you have any questions, you can always ask them in the comments.

You have probably often seen a caption on websites that shows the size of a file. Nobody writes this caption by hand. The task is solved by writing a function in PHP. As a result, it outputs a line like:

File size: 2.3 MB
This is very convenient when materials are offered for download from a site. So let's get started.

Create a PHP function that determines the file size

The function will be quite simple and clear. It uses three built-in functions:

file_exists - checks whether the specified file or directory exists.

filesize - determines the file size. It returns the result in bytes. If the file is larger than 2 GB, then, depending on the server, it may return incorrect results.

round - a built-in function that rounds a value; here it is used to round to one decimal place.
The function first checks that the file exists and then determines step by step how large it is: if the size is more than 1024 bytes, the result is shown in KB; if it is more than 1024 KB, in MB; and if it is more than 1024 MB, in GB. At the end of each step, the built-in round function rounds the many-digit result to a whole number plus one decimal place.

Now let's create a file for the function. Such files are usually kept in a separate folder, for example function.

PHP code (file function.php)

<?php
// The function's argument is the path to the file
function get_filesize($file)
{
    // Check that the file exists
    if (!file_exists($file)) return "File not found";
    // Get the size in bytes, then convert it in several steps
    $filesize = filesize($file);
    // If the size is greater than 1 KB, convert it to kilobytes
    if ($filesize > 1024)
    {
        $filesize = ($filesize / 1024);
        // If the size is greater than 1024 KB,
        // it is better to display it in megabytes. Converting to MB
        if ($filesize > 1024)
        {
            $filesize = ($filesize / 1024);
            // And if the file is larger than 1024 MB,
            // convert it to gigabytes
            if ($filesize > 1024)
            {
                $filesize = ($filesize / 1024);
                $filesize = round($filesize, 1);
                return $filesize." GB";
            }
            else
            {
                $filesize = round($filesize, 1);
                return $filesize." MB";
            }
        }
        else
        {
            $filesize = round($filesize, 1);
            return $filesize." KB";
        }
    }
    else
    {
        $filesize = round($filesize, 1);
        return $filesize." bytes";
    }
}
?>
We have created a function. The next step is to apply it.

PHP code

<?php
include_once "function/function.php"; // include the file with the function

// pass a path, or a variable holding a path, to the function
$size = get_filesize("images/photo.jpg");
echo "File size: ".$size; // display the result with the size
?>
Everything is ready. Enjoy!
Thanks for your attention, and good luck with your work!
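The same step-by-step conversion can also be expressed as a loop over a list of unit names instead of nested conditions. The function below is a hypothetical variant, not the article's code: the name format_bytes and the unit list are assumptions, and it switches to the next unit at exactly 1024 rather than above it.

```php
<?php
// Hypothetical helper: formats a byte count using 1 KB = 1024 bytes
// and rounds the result to one decimal place.
function format_bytes(int $bytes): string
{
    $units = ['bytes', 'KB', 'MB', 'GB', 'TB'];
    $size = (float) $bytes;
    $i = 0;
    // Divide by 1024 until the value drops below 1024 or units run out.
    while ($size >= 1024 && $i < count($units) - 1) {
        $size /= 1024;
        $i++;
    }
    return round($size, 1) . ' ' . $units[$i];
}

echo format_bytes(500), "\n";      // 500 bytes
echo format_bytes(1536), "\n";     // 1.5 KB
echo format_bytes(2411724), "\n";  // 2.3 MB
```

Adding a larger unit then only requires extending the $units array.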

Length is measured in units such as the millimeter, centimeter, meter and kilometer. Mass is measured in grams, kilograms, centners and tons. Time is expressed in seconds, minutes, hours, days, months, years and centuries. The computer works with information, and there are appropriate units for measuring its volume too.

Bit and byte - the minimum units of information

We already know that a computer perceives all information as zeros and ones.

A bit is the minimum unit of measurement of information, corresponding to one binary digit ("0" or "1").

A bit is only 0 ("zero") or only 1 ("one"). With one bit, two states can be written: 0 (zero) or 1 (one). A bit is the smallest cell of memory; there is nothing smaller. This cell can store either a zero or a one.

A byte consists of eight bits. Using one byte, you can encode one character out of 256 possible (256 = 2^8). Thus, one byte is equal to one character, that is, 8 bits:

1 character = 8 bits = 1 byte.
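This relation makes it easy to estimate the size of a piece of text. The sketch below assumes a single-byte encoding, where every character takes exactly one byte; the function name text_size is made up for the example.

```php
<?php
// Counts the bytes and bits needed for a text, assuming a single-byte
// encoding in which every character occupies exactly one byte.
function text_size(string $text): array
{
    $bytes = strlen($text); // strlen() returns the length in bytes
    return ['bytes' => $bytes, 'bits' => $bytes * 8];
}

$size = text_size("Hello, world!");
echo $size['bytes'] . " bytes = " . $size['bits'] . " bits\n";
```

The space, the comma and the exclamation mark are counted as characters too, exactly like the letters.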

Letter, number, punctuation mark are symbols. One letter, one symbol. One number is also one character. One punctuation mark (either a period, or a comma, or a question mark, etc.) is again one character. One space is also one character.

In addition to the bit and byte, of course, there are other, larger units of information.

Byte table:

1 byte = 8 bits

1 KB (1 kilobyte) = 2^10 bytes = 2*2*2*2*2*2*2*2*2*2 bytes = 1024 bytes (approximately 1 thousand bytes, 10^3 bytes)

1 MB (1 megabyte) = 2^20 bytes = 1024 kilobytes (approximately 1 million bytes, 10^6 bytes)

1 GB (1 gigabyte) = 2^30 bytes = 1024 megabytes (approximately 1 billion bytes, 10^9 bytes)

1 TB (1 terabyte) = 2^40 bytes = 1024 gigabytes (approximately 10^12 bytes). A terabyte is sometimes called a "ton".

1 PB (1 petabyte) = 2^50 bytes = 1024 terabytes (approximately 10^15 bytes).

1 exabyte = 2^60 bytes = 1024 petabytes (approximately 10^18 bytes).

1 zettabyte = 2^70 bytes = 1024 exabytes (approximately 10^21 bytes).

1 yottabyte = 2^80 bytes = 1024 zettabytes (approximately 10^24 bytes).

In the table above, the powers of two (2^10, 2^20, 2^30, etc.) are the exact values of the kilobyte, megabyte and gigabyte. The powers of ten (10^3, 10^6, 10^9, etc.) are approximate values, rounded down. Thus, 2^10 = 1024 bytes is the exact value of a kilobyte, and 10^3 = 1000 bytes is its approximate value.

Such an approximation (or rounding) is quite acceptable and generally accepted.
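How rough the approximation is can be checked directly. The sketch below compares each power of two with the corresponding power of ten; the function name is hypothetical, and the list stops at the exabyte so that the values stay within 64-bit integers.

```php
<?php
$names = ['kilobyte', 'megabyte', 'gigabyte', 'terabyte', 'petabyte', 'exabyte'];

// Ratio of the exact binary value 2^(10n) to the approximation 10^(3n).
function binary_to_decimal_ratio(int $n): float
{
    return (2 ** (10 * $n)) / (10 ** (3 * $n));
}

foreach ($names as $i => $name) {
    $n = $i + 1;
    $percent = (binary_to_decimal_ratio($n) - 1) * 100;
    printf("1 %s: 2^%d is %.1f%% larger than 10^%d\n", $name, 10 * $n, $percent, 3 * $n);
}
```

For a kilobyte the difference is 2.4%, and it grows with every following unit, so the rounding is less harmless for large units than for small ones.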

The following is a byte table with English abbreviations (in the left column):

1 Kb ~ 10^3 b = 10*10*10 b = 1000 b - kilobyte

1 Mb ~ 10^6 b = 10*10*10*10*10*10 b = 1 000 000 b - megabyte

1 Gb ~ 10^9 b - gigabyte

1 Tb ~ 10^12 b - terabyte

1 Pb ~ 10^15 b - petabyte

1 Eb ~ 10^18 b - exabyte

1 Zb ~ 10^21 b - zettabyte

1 Yb ~ 10^24 b - yottabyte

Above in the right column are the so-called "decimal prefixes", which are used not only with bytes, but also in other areas of human activity. For example, the prefix "kilo" in the word "kilobyte" means a thousand bytes. In the case of a kilometer, it corresponds to a thousand meters, and in the example of a kilogram, it is equal to a thousand grams.

To be continued…

The question arises: does the byte table have a continuation? In mathematics, there is the concept of infinity, which is denoted as an inverted eight: ∞.

It is clear that the byte table can be continued by adding zeros, or rather by raising the power of 10: 10^27, 10^30, 10^33 and so on ad infinitum. But why is this necessary? For now, terabytes and petabytes are enough. In the future, perhaps even a yottabyte will not be enough.


Finally, a couple of examples on devices that can store terabytes and gigabytes of information.

There is a convenient "terabyte" - an external hard drive that connects via USB to a computer. It can store a terabyte of information. It is especially convenient for laptops (where changing the hard drive can be problematic) and for backing up information. It is better to make backup copies of information in advance, and not after everything is gone.

Flash drives come in 1 GB, 2 GB, 4 GB, 8 GB, 16 GB, 32 GB, 64 GB and even 1 terabyte.

In this article, I want to introduce my readers to the concept of the size of a file, a folder, or even a program (given that a program is a set of folders and files).

Any file or folder with files occupies a certain amount of memory on local drives. That is, all files and folders have a volume, in other words, a weight or size.

From school we know such concepts as grams and kilograms, meters and kilometers. The computer world also has its own units of measurement, which are used to measure files and folders. Following the slang of advanced users, we will determine how much this or that file or folder "weighs". The main units of measurement are bytes, kilobytes, megabytes, gigabytes and, if you like, terabytes.

1 KB = 1024 bytes

1 MB = 1024 KB

1 GB = 1024 MB

Let's decipher:

There are 1024 bytes in one KB (kilobyte).
One MB (megabyte) contains 1024 KB (kilobytes).
One GB (gigabyte) contains 1024 MB (megabytes).

How to find out file or folder size?

To find out the size of a file or folder of files, move the cursor over the file or folder and hold for a few seconds. A small window will appear with the characteristics of the file or folder, one of the parameters is the size.

If nothing appears when hovering over a file or folder, then right-click on that file or folder. Select "Properties" from the context menu that opens.

A window will open showing the size of this file or folder.

Why do we need to know the sizes? For example, to determine whether we can write a file or folder to a disc, floppy disk or flash drive, or how much space is left on local disks.

In order for us to be able to determine this, we need to know how much information fits on a disk (floppy disk, flash drive):

  • Floppy disk - 1.44 MB (suitable for writing text files)
  • CD disc - 700 MB (suitable for recording music, small videos and programs)
  • DVD disc - from 4 GB (suitable for recording anything). The standard capacity of a DVD disc is 4.7 GB. There are also double-sided DVDs, on which data can be recorded on both sides. These discs have a capacity of 9.4 GB. Dual-layer discs also exist, but are less common. Their capacities are: single-sided dual-layer, 8.5 GB; double-sided dual-layer, 17.1 GB.
  • Flash drives - from 1GB (suitable for recording anything)

That's all I wanted to talk about in this article.




First of all, let's say that we will talk here about the FAT and NTFS file systems, as the most common ones, and nothing will be said about the file systems used in non-Windows systems, since such systems are outside the scope of the author's interests. And now - to business.

It would seem there can be no ambiguity about file size. However much data was written to the file, that is its size (or length). However many bytes it has from beginning to end (and this number is recorded in the file system as the size of the file), that is its size, right? As Shelmenko the orderly used to say: it is so, only slightly not so.

Do an experiment. Take any executable and copy it with the command
copy something.exe something_else.exe

If you have run into this before, you already know that the resulting file can turn out much shorter than the original and will not be a true copy. The reason is simple: when the copy command works in text mode (without the /b option), it copies the file only until it meets a byte with code 1Ah, the character known as "end of file".

So, we already have two different signs of the end of the file - by the number recorded in the file system, and by a special byte in the file body. True, it is worth noting that the second sign has remained since the times when the files were predominantly text and is now practically not used.
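The difference between these two signs can be shown on a string in memory. The sketch below assumes the DOS end-of-file marker Ctrl+Z (byte 1Ah) and uses a made-up function name; it does not call the copy command itself.

```php
<?php
const EOF_MARKER = "\x1A"; // Ctrl+Z, the DOS text-mode end-of-file byte (assumed here)

// Length of the data as a text-mode reader would see it: everything
// before the first end-of-file byte, or the whole string if none exists.
function text_mode_length(string $data): int
{
    $pos = strpos($data, EOF_MARKER);
    return $pos === false ? strlen($data) : $pos;
}

$data = "first part" . EOF_MARKER . "second part";
echo "Size recorded by the file system: " . strlen($data) . " bytes\n";
echo "Size up to the end-of-file byte: " . text_mode_length($data) . " bytes\n";
```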

In file systems that use clusters, and FAT and NTFS refer specifically to such file systems, there is also a third size - the size of the file on disk, that is, the total size of the clusters allocated to this file. In FAT file systems, this size is greater than or equal to the size of the actual file. The difference between the sizes, if any, - the so-called tail of the file - is wasted disk space, the payment for placing files in clusters, and not end-to-end one after another, although file systems with such file placement also exist.
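The size on disk and the tail can be computed from the file size and the cluster size. In the sketch below the 4096-byte cluster is a hypothetical value (real volumes may use other sizes), and the function name is made up for the example.

```php
<?php
// Size on disk for a file stored in whole clusters; the difference
// from the actual size is the "tail" of the file.
function size_on_disk(int $fileSize, int $clusterSize): int
{
    return (int) ceil($fileSize / $clusterSize) * $clusterSize;
}

$clusterSize = 4096; // hypothetical cluster size in bytes
$fileSize = 10000;
$onDisk = size_on_disk($fileSize, $clusterSize);
echo "Size on disk: $onDisk bytes, tail: " . ($onDisk - $fileSize) . " bytes\n";
```

This models only the plain cluster allocation described for FAT; it does not cover the NTFS cases where the size on disk is smaller than the file.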

However, sometimes this place is used. In particular, in the days of floppy disks, there were programs that allowed you to write data to the tails of files in order to secretly transfer information on such diskettes. After all, it is impossible to get access to the tails of files by standard means.

If we include NTFS into consideration, then the picture will be supplemented with new touches.
First of all, the size of the file on disk may be smaller than the actual size of the file.
If the body of a file fits into the free area of an MFT file record, then the file does not occupy any clusters on the disk.

The maximum size of such a file depends on the size of the record and is approximately 600 bytes for a small record (1 KB) and 3600 for a large record (4 KB). However, it should be noted that until recently, Windows showed the size of such a file on disk as equal to one cluster, although in fact no cluster was allocated to the file.

If the file is compressed, then its size on disk can be noticeably smaller than the actual length of the file (the amount of data in it).

The so-called sparse files further complicate the picture. They contain useful data only in certain sections of the file, and the rest of the file is not used at all. Take as an example the changelog file \$Extend\$UsnJrnl found on almost every computer (don't try to see it in Explorer or other file managers, it won't work).


It can be several gigabytes long, but it usually contains only 32 megabytes of meaningful data at the very end. The rest of the file contains no data at all and takes up no disk space, and when you try to read from this part, the system returns zeros without even accessing the disk.
If the reader wishes to experiment with sparse files, such a file can be created using the fsutil sparse command. And at your leisure, you can think about what the real length of the file is if the system has written the number 4 GB in the corresponding column, while the real data in the file is only 32 MB and it also occupies 32 MB on disk.

And finally, let's talk about one more length: the valid data length. This length, and the functions that set it, are of interest almost exclusively to programmers, but ordinary users may occasionally encounter it too.

In FAT file systems, this concept does not exist, and functions that use this value write zeros in the file body in the appropriate places. In NTFS, this length is a characteristic of the file.

Let's try to explain what we are talking about with an example. Take a flash drive (a flash drive is used for clarity, since it works slower than a hard drive with large amounts of data) larger than a gigabyte, formatted in FAT32, and create a large file on it with the command
fsutil file createnew k:\trial.txt 900000000

If the letter assigned to the flash drive is different from K, then correct the command accordingly.
You will see that creating the file takes quite a long time, half a minute or even more (the message "file created" appears immediately, but you will have to wait for the command prompt to return). This is not at all surprising, because the description of the command says that the file being created consists of zeros. The file we got is 858 megabytes, so writing it cannot be quick.

Now format the flash drive in NTFS (for the purity of the experiment it is better to take the same one) and repeat the file creation. This time, the operation completes almost instantly. It is no longer necessary to write zeros to the body of the file; it is enough to allocate space for the file and set its valid data length to zero. The body of the file will contain the "garbage" that was previously written in these sectors, but this data will not be accessed when the file is read. Having found that the valid data length is zero, the system will not read anything beyond that point, because this data is invalid. It can be made valid by changing the valid data length value.

Let's look at this with an example. Create a new file on one of the NTFS-formatted scratch drives. Hundreds of megabytes are completely optional, a dozen or two kilobytes will be enough:
fsutil file createnew C:\trial.txt 10000

Now open it with any file viewer like FAR.


As you can see, there are indeed zeros in the file. But if you look at this file using a disk editor that accesses sectors directly, such as DMDE, the picture will be different.

If we open volume C as a logical device and look at the contents of the file, we will see the same zeros.

But if you open the disk as a physical device, then in the same sector (pay attention to the LBA numbers: the difference of 63 arises because the beginning of the partition is shifted relative to the beginning of the disk) we will see data that was previously written to some since-deleted file.


And if we increase the valid data length, we will see this data in the file. Set this length to 300 bytes:

fsutil file setvaliddata C:\trial.txt 300

Note that the parameter in this command cannot be set arbitrarily, but must be no less than the current valid data length value and no larger than the file size. You cannot reduce the length of valid data with this command.

Now look again at the contents of the file. Note that we did not write any data to it!


Purely by chance, this file turned out to contain quite a lot of meaningful text, which makes the picture clearer. 300 in decimal is 12C in hexadecimal, and it is at this byte that the text breaks off and the zeros begin. If we move the boundary of valid data even further, the following lines will "appear" as well.

Summing up

There are two physical file lengths: the size of the file as recorded in the file system and the space it takes up on disk. There are also two logical file lengths: the one marked by the end-of-file sign (the EOF byte, 1Ah) and the valid data length. Empty areas in sparse files can also be considered part of the logical length; remember \$Extend\$UsnJrnl, where a large array of missing data ends with thirty-two megabytes of real data.

So, usually, when people talk about the length of a file, they mean the number stored in the file system. But, as you can see, options are possible!