* [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests
@ 2024-06-05  0:58 Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file Amjad Alsharafi
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-05  0:58 UTC
  To: qemu-devel; +Cc: Hanna Reitz, Kevin Wolf, open list:vvfat, Amjad Alsharafi

These patches fix some bugs found when modifying files in vvfat.
First, there was a bug when writing to cluster 2 or above of a file:
the driver would copy the cluster before it instead, so when writing
to cluster=2, the content of cluster=1 would be copied to disk in its
place.

Another issue concerned modifying the clusters of a file and adding new
clusters; this showed 2 problems:
- If the new cluster is not immediately after the last cluster, reading
from this file will fail later on.
- Generally, the usage of info.file.offset was incorrect, and the
system would crash on abort() when a file was modified and a new
cluster was added.

Also added some iotests for vvfat, covering these fixes as well as
general behavior such as reading, writing, and creating files on the
filesystem, including tests for reading/writing the first cluster,
which would pass even before this patch.

v4:
  Applied some suggestions from Kevin Wolf <kwolf@redhat.com>:
  - Fixed code formatting by following the coding style in `scripts/checkpatch.pl`
  - Reduced changes related to `iotests` by setting `vvfat` format as non-generic.
  - Added another test to cover the fixes done in `PATCH 2/4` and
    `PATCH 3/4` for handling reading/writing files with non-continuous
    clusters.

v3:
  Added test for creating new files in vvfat.

v2:
  Added iotests for `vvfat` driver along with a simple `fat16` module to run the tests.

v1:
  https://patchew.org/QEMU/20240327201231.31046-1-amjadsharafi10@gmail.com/
  Fix the issue of writing to the middle of the file in vvfat

Amjad Alsharafi (4):
  vvfat: Fix bug in writing to middle of file
  vvfat: Fix usage of `info.file.offset`
  vvfat: Fix reading files with non-continuous clusters
  iotests: Add `vvfat` tests

 block/vvfat.c                      |  38 +-
 tests/qemu-iotests/check           |   2 +-
 tests/qemu-iotests/fat16.py        | 635 +++++++++++++++++++++++++++++
 tests/qemu-iotests/testenv.py      |   2 +-
 tests/qemu-iotests/tests/vvfat     | 440 ++++++++++++++++++++
 tests/qemu-iotests/tests/vvfat.out |   5 +
 6 files changed, 1107 insertions(+), 15 deletions(-)
 create mode 100644 tests/qemu-iotests/fat16.py
 create mode 100755 tests/qemu-iotests/tests/vvfat
 create mode 100755 tests/qemu-iotests/tests/vvfat.out

-- 
2.45.1




* [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file
  2024-06-05  0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
@ 2024-06-05  0:58 ` Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` Amjad Alsharafi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-05  0:58 UTC
  To: qemu-devel; +Cc: Hanna Reitz, Kevin Wolf, open list:vvfat, Amjad Alsharafi

Before this commit, when calling `commit_one_file` with, for example,
`offset=0x2000` (the second cluster), we would not fetch the next
cluster from the FAT, and would instead use the first cluster for the
read operation.

This is due to an off-by-one error: the loop starts at
`i = s->cluster_size`, so `i=0x2000 !< offset=0x2000` and the next
cluster is never fetched.
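
To illustrate the loop bounds (a minimal Python sketch, not the
driver's code; `fat` is a hypothetical cluster chain and
`cluster_size` is 0x2000):

    fat = {1: 2, 2: 3}      # hypothetical chain: cluster 1 -> 2 -> 3
    cluster_size = 0x2000

    def cluster_for_offset(first, offset):
        c = first
        # The old loop ran for i in range(cluster_size, offset, ...),
        # which is empty when offset == cluster_size, so the chain
        # never advanced. Starting at 0 advances once per cluster:
        for _ in range(0, offset, cluster_size):
            c = fat[c]      # stand-in for modified_fat_get()
        return c

    assert cluster_for_offset(1, 0x2000) == 2   # second cluster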

Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vvfat.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 9d050ba3ae..19da009a5b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2525,8 +2525,9 @@ commit_one_file(BDRVVVFATState* s, int dir_index, uint32_t offset)
         return -1;
     }
 
-    for (i = s->cluster_size; i < offset; i += s->cluster_size)
+    for (i = 0; i < offset; i += s->cluster_size) {
         c = modified_fat_get(s, c);
+    }
 
     fd = qemu_open_old(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
     if (fd < 0) {
-- 
2.45.1




* [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-05  0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file Amjad Alsharafi
@ 2024-06-05  0:58 ` Amjad Alsharafi
  2024-06-10 16:49   ` Kevin Wolf
  2024-06-05  0:58 ` [PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 4/4] iotests: Add `vvfat` tests Amjad Alsharafi
  3 siblings, 1 reply; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-05  0:58 UTC
  To: qemu-devel; +Cc: Hanna Reitz, Kevin Wolf, open list:vvfat, Amjad Alsharafi

The field is documented as "the offset in the file (in clusters)", but
it was being used as if it were a byte offset, as in
`cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.

Additionally, this removes the `abort` when `first_mapping_index` does
not match: this is exactly the case when adding new clusters to a
file, and it is inevitable that we reach this condition when the new
clusters do not immediately follow the existing ones. There is no
reason to `abort` here; execution continues and the new clusters are
written to disk correctly.
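
To make the unit mismatch concrete (a sketch with made-up numbers;
`info.file.offset` counts clusters, not bytes):

    cluster_size = 0x2000
    begin, cluster_num = 500, 503    # hypothetical mapping and cluster
    file_offset_clusters = 2         # mapping's offset within the file

    # Correct: sum the cluster counts first, then convert to bytes.
    offset = cluster_size * ((cluster_num - begin) + file_offset_clusters)
    assert offset == 0x2000 * 5

    # Old code added a cluster count to a byte value (mixed units):
    wrong = cluster_size * (cluster_num - begin) + file_offset_clusters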

Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
---
 block/vvfat.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 19da009a5b..f0642ac3e4 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1408,7 +1408,9 @@ read_cluster_directory:
 
         assert(s->current_fd);
 
-        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
+        offset = s->cluster_size *
+            ((cluster_num - s->current_mapping->begin)
+            + s->current_mapping->info.file.offset);
         if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
             return -3;
         s->cluster=s->cluster_buffer;
@@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
                         (mapping->mode & MODE_DIRECTORY) == 0) {
 
                     /* was modified in qcow */
-                    if (offset != mapping->info.file.offset + s->cluster_size
-                            * (cluster_num - mapping->begin)) {
+                    if (offset != s->cluster_size
+                            * ((cluster_num - mapping->begin)
+                            + mapping->info.file.offset)) {
                         /* offset of this cluster in file chain has changed */
                         abort();
                         copy_it = 1;
@@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
 
                     if (mapping->first_mapping_index != first_mapping_index
                             && mapping->info.file.offset > 0) {
-                        abort();
                         copy_it = 1;
                     }
 
@@ -2404,7 +2406,7 @@ static int commit_mappings(BDRVVVFATState* s,
                         (mapping->end - mapping->begin);
             } else
                 next_mapping->info.file.offset = mapping->info.file.offset +
-                        mapping->end - mapping->begin;
+                        (mapping->end - mapping->begin);
 
             mapping = next_mapping;
         }
-- 
2.45.1




* [PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters
  2024-06-05  0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` Amjad Alsharafi
@ 2024-06-05  0:58 ` Amjad Alsharafi
  2024-06-05  0:58 ` [PATCH v4 4/4] iotests: Add `vvfat` tests Amjad Alsharafi
  3 siblings, 0 replies; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-05  0:58 UTC
  To: qemu-devel; +Cc: Hanna Reitz, Kevin Wolf, open list:vvfat, Amjad Alsharafi

When reading with `read_cluster` we get the `mapping` with
`find_mapping_for_cluster` and then we call `open_file` for this
mapping.

The issue appears when it is the same file, but a later cluster that
is not immediately after the previous one. Imagine the cluster chain
`500 -> 503`: this gives us 2 mappings, one with the range `500..501`
and another with `503..504`; both point to the same file, but at
different offsets.

Since the path is the same, we don't reopen the file, but we also
don't assign `s->current_mapping`, and thus we access way out of
bounds of the mapping.

From our example above, after `open_file` (which didn't open anything)
we compute the offset into the file with
`s->cluster_size*(cluster_num-s->current_mapping->begin)`, which gives
us `0x2000 * (504-500)`; this is out of bounds for this mapping and
will produce some issues.
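
A sketch of the situation (illustrative Python mirroring the mapping
ranges above):

    cluster_size = 0x2000
    # Two mappings for the same file, hypothetical chain 500 -> 503:
    mappings = [
        {"begin": 500, "end": 501, "path": "/X", "file_offset": 0},
        {"begin": 503, "end": 504, "path": "/X", "file_offset": 1},
    ]

    # If s->current_mapping is not updated, cluster 503 is resolved
    # against the first mapping's base:
    stale = mappings[0]
    offset = cluster_size * (503 - stale["begin"])   # 0x6000
    # ...but that mapping only covers one cluster (0x2000 bytes), so
    # the read lands far outside its part of the file.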

Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
---
 block/vvfat.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index f0642ac3e4..8b4d162aa1 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1360,15 +1360,24 @@ static int open_file(BDRVVVFATState* s,mapping_t* mapping)
 {
     if(!mapping)
         return -1;
+    int new_path = 1;
     if(!s->current_mapping ||
-            strcmp(s->current_mapping->path,mapping->path)) {
-        /* open file */
-        int fd = qemu_open_old(mapping->path,
+            s->current_mapping->first_mapping_index
+                != mapping->first_mapping_index ||
+            (new_path = strcmp(s->current_mapping->path, mapping->path))) {
+
+        if (new_path) {
+            /* open file */
+            int fd = qemu_open_old(mapping->path,
                                O_RDONLY | O_BINARY | O_LARGEFILE);
-        if(fd<0)
-            return -1;
-        vvfat_close_current_file(s);
-        s->current_fd = fd;
+            if (fd < 0) {
+                return -1;
+            }
+            vvfat_close_current_file(s);
+
+            s->current_fd = fd;
+        }
+        assert(s->current_fd);
         s->current_mapping = mapping;
     }
     return 0;
-- 
2.45.1




* [PATCH v4 4/4] iotests: Add `vvfat` tests
  2024-06-05  0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
                   ` (2 preceding siblings ...)
  2024-06-05  0:58 ` [PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters Amjad Alsharafi
@ 2024-06-05  0:58 ` Amjad Alsharafi
  2024-06-10 12:01   ` Kevin Wolf
  3 siblings, 1 reply; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-05  0:58 UTC
  To: qemu-devel; +Cc: Hanna Reitz, Kevin Wolf, open list:vvfat, Amjad Alsharafi

Added several tests to verify the implementation of the vvfat driver.

We needed a way to interact with it, so we created a basic `fat16.py`
driver that handles writing correct sectors for us.

Added `vvfat` to the non-generic formats, as it is not a normal image
format.
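
For example, a test case can drive the filesystem roughly like this (a
usage sketch based on the helpers added below; `read_sectors` and
`write_sectors` stand for the test's sector accessors):

    mbr = MBR(read_sectors(0))
    fat16 = Fat16(mbr.partition_table[0]["start_lba"],
                  mbr.partition_table[0]["size"],
                  read_sectors, write_sectors)
    entry = fat16.find_direntry("/FILE0.TXT")
    fat16.write_file(entry, b"Hello, world!\n")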

Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
---
 tests/qemu-iotests/check           |   2 +-
 tests/qemu-iotests/fat16.py        | 635 +++++++++++++++++++++++++++++
 tests/qemu-iotests/testenv.py      |   2 +-
 tests/qemu-iotests/tests/vvfat     | 440 ++++++++++++++++++++
 tests/qemu-iotests/tests/vvfat.out |   5 +
 5 files changed, 1082 insertions(+), 2 deletions(-)
 create mode 100644 tests/qemu-iotests/fat16.py
 create mode 100755 tests/qemu-iotests/tests/vvfat
 create mode 100755 tests/qemu-iotests/tests/vvfat.out

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 56d88ca423..545f9ec7bd 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -84,7 +84,7 @@ def make_argparser() -> argparse.ArgumentParser:
     p.set_defaults(imgfmt='raw', imgproto='file')
 
     format_list = ['raw', 'bochs', 'cloop', 'parallels', 'qcow', 'qcow2',
-                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg']
+                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg', 'vvfat']
     g_fmt = p.add_argument_group(
         '  image format options',
         'The following options set the IMGFMT environment variable. '
diff --git a/tests/qemu-iotests/fat16.py b/tests/qemu-iotests/fat16.py
new file mode 100644
index 0000000000..baf801b4d5
--- /dev/null
+++ b/tests/qemu-iotests/fat16.py
@@ -0,0 +1,635 @@
+# A simple FAT16 driver that is used to test the `vvfat` driver in QEMU.
+#
+# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+from typing import List
+import string
+
+SECTOR_SIZE = 512
+DIRENTRY_SIZE = 32
+ALLOWED_FILE_CHARS = \
+    set("!#$%&'()-@^_`{}~" + string.digits + string.ascii_uppercase)
+
+
+class MBR:
+    def __init__(self, data: bytes):
+        assert len(data) == 512
+        self.partition_table = []
+        for i in range(4):
+            partition = data[446 + i * 16 : 446 + (i + 1) * 16]
+            self.partition_table.append(
+                {
+                    "status": partition[0],
+                    "start_head": partition[1],
+                    "start_sector": partition[2] & 0x3F,
+                    "start_cylinder":
+                        ((partition[2] & 0xC0) << 2) | partition[3],
+                    "type": partition[4],
+                    "end_head": partition[5],
+                    "end_sector": partition[6] & 0x3F,
+                    "end_cylinder":
+                        ((partition[6] & 0xC0) << 2) | partition[7],
+                    "start_lba": int.from_bytes(partition[8:12], "little"),
+                    "size": int.from_bytes(partition[12:16], "little"),
+                }
+            )
+
+    def __str__(self):
+        return "\n".join(
+            [f"{i}: {partition}"
+                for i, partition in enumerate(self.partition_table)]
+        )
+
+
+class FatBootSector:
+    def __init__(self, data: bytes):
+        assert len(data) == 512
+        self.bytes_per_sector = int.from_bytes(data[11:13], "little")
+        self.sectors_per_cluster = data[13]
+        self.reserved_sectors = int.from_bytes(data[14:16], "little")
+        self.fat_count = data[16]
+        self.root_entries = int.from_bytes(data[17:19], "little")
+        self.media_descriptor = data[21]
+        self.fat_size = int.from_bytes(data[22:24], "little")
+        self.sectors_per_fat = int.from_bytes(data[22:24], "little")
+        self.sectors_per_track = int.from_bytes(data[24:26], "little")
+        self.heads = int.from_bytes(data[26:28], "little")
+        self.hidden_sectors = int.from_bytes(data[28:32], "little")
+        self.total_sectors = int.from_bytes(data[32:36], "little")
+        self.drive_number = data[36]
+        self.volume_id = int.from_bytes(data[39:43], "little")
+        self.volume_label = data[43:54].decode("ascii").strip()
+        self.fs_type = data[54:62].decode("ascii").strip()
+
+    def root_dir_start(self):
+        """
+        Calculate the start sector of the root directory.
+        """
+        return self.reserved_sectors + self.fat_count * self.sectors_per_fat
+
+    def root_dir_size(self):
+        """
+        Calculate the size of the root directory in sectors.
+        """
+        return (
+            self.root_entries * DIRENTRY_SIZE + self.bytes_per_sector - 1
+        ) // self.bytes_per_sector
+
+    def data_sector_start(self):
+        """
+        Calculate the start sector of the data region.
+        """
+        return self.root_dir_start() + self.root_dir_size()
+
+    def first_sector_of_cluster(self, cluster: int):
+        """
+        Calculate the first sector of the given cluster.
+        """
+        return self.data_sector_start() \
+                + (cluster - 2) * self.sectors_per_cluster
+
+    def cluster_bytes(self):
+        """
+        Calculate the number of bytes in a cluster.
+        """
+        return self.bytes_per_sector * self.sectors_per_cluster
+
+    def __str__(self):
+        return (
+            f"Bytes per sector: {self.bytes_per_sector}\n"
+            f"Sectors per cluster: {self.sectors_per_cluster}\n"
+            f"Reserved sectors: {self.reserved_sectors}\n"
+            f"FAT count: {self.fat_count}\n"
+            f"Root entries: {self.root_entries}\n"
+            f"Total sectors: {self.total_sectors}\n"
+            f"Media descriptor: {self.media_descriptor}\n"
+            f"Sectors per FAT: {self.sectors_per_fat}\n"
+            f"Sectors per track: {self.sectors_per_track}\n"
+            f"Heads: {self.heads}\n"
+            f"Hidden sectors: {self.hidden_sectors}\n"
+            f"Drive number: {self.drive_number}\n"
+            f"Volume ID: {self.volume_id}\n"
+            f"Volume label: {self.volume_label}\n"
+            f"FS type: {self.fs_type}\n"
+        )
+
+
+class FatDirectoryEntry:
+    def __init__(self, data: bytes, sector: int, offset: int):
+        self.name = data[0:8].decode("ascii").strip()
+        self.ext = data[8:11].decode("ascii").strip()
+        self.attributes = data[11]
+        self.reserved = data[12]
+        self.create_time_tenth = data[13]
+        self.create_time = int.from_bytes(data[14:16], "little")
+        self.create_date = int.from_bytes(data[16:18], "little")
+        self.last_access_date = int.from_bytes(data[18:20], "little")
+        high_cluster = int.from_bytes(data[20:22], "little")
+        self.last_mod_time = int.from_bytes(data[22:24], "little")
+        self.last_mod_date = int.from_bytes(data[24:26], "little")
+        low_cluster = int.from_bytes(data[26:28], "little")
+        self.cluster = (high_cluster << 16) | low_cluster
+        self.size_bytes = int.from_bytes(data[28:32], "little")
+
+        # extra (to help write back to disk)
+        self.sector = sector
+        self.offset = offset
+
+    def as_bytes(self) -> bytes:
+        return (
+            self.name.ljust(8, " ").encode("ascii")
+            + self.ext.ljust(3, " ").encode("ascii")
+            + self.attributes.to_bytes(1, "little")
+            + self.reserved.to_bytes(1, "little")
+            + self.create_time_tenth.to_bytes(1, "little")
+            + self.create_time.to_bytes(2, "little")
+            + self.create_date.to_bytes(2, "little")
+            + self.last_access_date.to_bytes(2, "little")
+            + (self.cluster >> 16).to_bytes(2, "little")
+            + self.last_mod_time.to_bytes(2, "little")
+            + self.last_mod_date.to_bytes(2, "little")
+            + (self.cluster & 0xFFFF).to_bytes(2, "little")
+            + self.size_bytes.to_bytes(4, "little")
+        )
+
+    def whole_name(self):
+        if self.ext:
+            return f"{self.name}.{self.ext}"
+        else:
+            return self.name
+
+    def __str__(self):
+        return (
+            f"Name: {self.name}\n"
+            f"Ext: {self.ext}\n"
+            f"Attributes: {self.attributes}\n"
+            f"Reserved: {self.reserved}\n"
+            f"Create time tenth: {self.create_time_tenth}\n"
+            f"Create time: {self.create_time}\n"
+            f"Create date: {self.create_date}\n"
+            f"Last access date: {self.last_access_date}\n"
+            f"Last mod time: {self.last_mod_time}\n"
+            f"Last mod date: {self.last_mod_date}\n"
+            f"Cluster: {self.cluster}\n"
+            f"Size: {self.size_bytes}\n"
+        )
+
+    def __repr__(self):
+        # convert to dict
+        return str(vars(self))
+
+
+class Fat16:
+    def __init__(
+        self,
+        start_sector: int,
+        size: int,
+        sector_reader: callable,
+        sector_writer: callable,
+    ):
+        self.start_sector = start_sector
+        self.size_in_sectors = size
+        self.sector_reader = sector_reader
+        self.sector_writer = sector_writer
+
+        self.boot_sector = FatBootSector(self.sector_reader(start_sector))
+
+        fat_size_in_sectors = \
+            self.boot_sector.fat_size * self.boot_sector.fat_count
+        self.fats = self.read_sectors(
+            self.boot_sector.reserved_sectors, fat_size_in_sectors
+        )
+        self.fats_dirty_sectors = set()
+
+    def read_sectors(self, start_sector: int, num_sectors: int) -> bytes:
+        return self.sector_reader(start_sector + self.start_sector, num_sectors)
+
+    def write_sectors(self, start_sector: int, data: bytes):
+        return self.sector_writer(start_sector + self.start_sector, data)
+
+    def directory_from_bytes(
+        self, data: bytes, start_sector: int
+    ) -> List[FatDirectoryEntry]:
+        """
+        Convert `bytes` into a list of `FatDirectoryEntry` objects.
+        Will ignore long file names.
+        Will stop when it encounters a 0x00 byte.
+        """
+
+        entries = []
+        for i in range(0, len(data), DIRENTRY_SIZE):
+            entry = data[i : i + DIRENTRY_SIZE]
+
+            current_sector = start_sector + (i // SECTOR_SIZE)
+            current_offset = i % SECTOR_SIZE
+
+            if entry[0] == 0:
+                break
+            elif entry[0] == 0xE5:
+                # Deleted file
+                continue
+
+            if entry[11] & 0xF == 0xF:
+                # Long file name
+                continue
+
+            entries.append(
+                FatDirectoryEntry(entry, current_sector, current_offset))
+        return entries
+
+    def read_root_directory(self) -> List[FatDirectoryEntry]:
+        root_dir = self.read_sectors(
+            self.boot_sector.root_dir_start(), self.boot_sector.root_dir_size()
+        )
+        return self.directory_from_bytes(root_dir,
+                                         self.boot_sector.root_dir_start())
+
+    def read_fat_entry(self, cluster: int) -> int:
+        """
+        Read the FAT entry for the given cluster.
+        """
+        fat_offset = cluster * 2  # FAT16
+        return int.from_bytes(self.fats[fat_offset : fat_offset + 2], "little")
+
+    def write_fat_entry(self, cluster: int, value: int):
+        """
+        Write the FAT entry for the given cluster.
+        """
+        fat_offset = cluster * 2
+        self.fats = (
+            self.fats[:fat_offset]
+            + value.to_bytes(2, "little")
+            + self.fats[fat_offset + 2 :]
+        )
+        self.fats_dirty_sectors.add(fat_offset // SECTOR_SIZE)
+
+    def flush_fats(self):
+        """
+        Write the FATs back to the disk.
+        """
+        for sector in self.fats_dirty_sectors:
+            data = self.fats[sector * SECTOR_SIZE : (sector + 1) * SECTOR_SIZE]
+            sector = self.boot_sector.reserved_sectors + sector
+            self.write_sectors(sector, data)
+        self.fats_dirty_sectors = set()
+
+    def next_cluster(self, cluster: int) -> int | None:
+        """
+        Get the next cluster in the chain.
+        If it's `None`, then it's the last cluster.
+        The function will raise an exception if the next cluster
+        is `FREE` (unexpected) or an invalid entry.
+        """
+        fat_entry = self.read_fat_entry(cluster)
+        if fat_entry == 0:
+            raise Exception("Unexpected: FREE cluster")
+        elif fat_entry == 1:
+            raise Exception("Unexpected: RESERVED cluster")
+        elif fat_entry >= 0xFFF8:
+            return None
+        elif fat_entry >= 0xFFF7:
+            raise Exception("Invalid FAT entry")
+        else:
+            return fat_entry
+
+    def next_free_cluster(self) -> int:
+        """
+        Find the next free cluster.
+        """
+        # simple linear search
+        for i in range(2, 0xFFFF):
+            if self.read_fat_entry(i) == 0:
+                return i
+        raise Exception("No free clusters")
+
+    def read_cluster(self, cluster: int) -> bytes:
+        """
+        Read the cluster at the given cluster.
+        """
+        return self.read_sectors(
+            self.boot_sector.first_sector_of_cluster(cluster),
+            self.boot_sector.sectors_per_cluster,
+        )
+
+    def write_cluster(self, cluster: int, data: bytes):
+        """
+        Write the cluster at the given cluster.
+        """
+        assert len(data) == self.boot_sector.cluster_bytes()
+        return self.write_sectors(
+            self.boot_sector.first_sector_of_cluster(cluster),
+            data,
+        )
+
+    def read_directory(self, cluster: int) -> List[FatDirectoryEntry]:
+        """
+        Read the directory at the given cluster.
+        """
+        entries = []
+        while cluster is not None:
+            data = self.read_cluster(cluster)
+            entries.extend(
+                self.directory_from_bytes(
+                    data, self.boot_sector.first_sector_of_cluster(cluster)
+                )
+            )
+            cluster = self.next_cluster(cluster)
+        return entries
+
+    def add_direntry(self,
+                     cluster: int | None,
+                     name: str, ext: str,
+                     attributes: int):
+        """
+        Add a new directory entry to the given cluster.
+        If the cluster is `None`, then it will be added to the root directory.
+        """
+
+        def find_free_entry(data: bytes):
+            for i in range(0, len(data), DIRENTRY_SIZE):
+                entry = data[i : i + DIRENTRY_SIZE]
+                if entry[0] == 0 or entry[0] == 0xE5:
+                    return i
+            return None
+
+        assert len(name) <= 8, "Name must be 8 characters or less"
+        assert len(ext) <= 3, "Ext must be 3 characters or less"
+        assert attributes % 0x15 != 0x15, "Invalid attributes"
+
+        # initial dummy data
+        new_entry = FatDirectoryEntry(b"\0" * 32, 0, 0)
+        new_entry.name = name.ljust(8, " ")
+        new_entry.ext = ext.ljust(3, " ")
+        new_entry.attributes = attributes
+        new_entry.reserved = 0
+        new_entry.create_time_tenth = 0
+        new_entry.create_time = 0
+        new_entry.create_date = 0
+        new_entry.last_access_date = 0
+        new_entry.last_mod_time = 0
+        new_entry.last_mod_date = 0
+        new_entry.cluster = self.next_free_cluster()
+        new_entry.size_bytes = 0
+
+        # mark as EOF
+        self.write_fat_entry(new_entry.cluster, 0xFFFF)
+
+        if cluster is None:
+            for i in range(self.boot_sector.root_dir_size()):
+                sector_data = self.read_sectors(
+                    self.boot_sector.root_dir_start() + i, 1
+                )
+                offset = find_free_entry(sector_data)
+                if offset is not None:
+                    new_entry.sector = self.boot_sector.root_dir_start() + i
+                    new_entry.offset = offset
+                    self.update_direntry(new_entry)
+                    return new_entry
+        else:
+            while cluster is not None:
+                data = self.read_cluster(cluster)
+                offset = find_free_entry(data)
+                if offset is not None:
+                    new_entry.sector = self.boot_sector.first_sector_of_cluster(
+                        cluster
+                    ) + (offset // SECTOR_SIZE)
+                    new_entry.offset = offset % SECTOR_SIZE
+                    self.update_direntry(new_entry)
+                    return new_entry
+                cluster = self.next_cluster(cluster)
+
+        raise Exception("No free directory entries")
+
+    def update_direntry(self, entry: FatDirectoryEntry):
+        """
+        Write the directory entry back to the disk.
+        """
+        sector = self.read_sectors(entry.sector, 1)
+        sector = (
+            sector[: entry.offset]
+            + entry.as_bytes()
+            + sector[entry.offset + DIRENTRY_SIZE :]
+        )
+        self.write_sectors(entry.sector, sector)
+
+    def find_direntry(self, path: str) -> FatDirectoryEntry | None:
+        """
+        Find the directory entry for the given path.
+        """
+        assert path[0] == "/", "Path must start with /"
+
+        path = path[1:]  # remove the leading /
+        parts = path.split("/")
+        directory = self.read_root_directory()
+
+        current_entry = None
+
+        for i, part in enumerate(parts):
+            is_last = i == len(parts) - 1
+
+            for entry in directory:
+                if entry.whole_name() == part:
+                    current_entry = entry
+                    break
+            if current_entry is None:
+                return None
+
+            if is_last:
+                return current_entry
+            else:
+                if current_entry.attributes & 0x10 == 0:
+                    raise Exception(
+                        f"{current_entry.whole_name()} is not a directory")
+                else:
+                    directory = self.read_directory(current_entry.cluster)
+
+    def read_file(self, entry: FatDirectoryEntry) -> bytes:
+        """
+        Read the content of the file at the given path.
+        """
+        if entry is None:
+            return None
+        if entry.attributes & 0x10 != 0:
+            raise Exception(f"{entry.whole_name()} is a directory")
+
+        data = b""
+        cluster = entry.cluster
+        while cluster is not None and len(data) <= entry.size_bytes:
+            data += self.read_cluster(cluster)
+            cluster = self.next_cluster(cluster)
+        return data[: entry.size_bytes]
+
+    def truncate_file(self, entry: FatDirectoryEntry, new_size: int):
+        """
+        Truncate the file at the given path to the new size.
+        """
+        if entry is None:
+            return Exception("entry is None")
+        if entry.attributes & 0x10 != 0:
+            raise Exception(f"{entry.whole_name()} is a directory")
+
+        def clusters_from_size(size: int):
+            return (
+                size + self.boot_sector.cluster_bytes() - 1
+            ) // self.boot_sector.cluster_bytes()
+
+        # First, allocate new FATs if we need to
+        required_clusters = clusters_from_size(new_size)
+        current_clusters = clusters_from_size(entry.size_bytes)
+
+        affected_clusters = set()
+
+        # Keep at least one cluster, easier to manage this way
+        if required_clusters == 0:
+            required_clusters = 1
+        if current_clusters == 0:
+            current_clusters = 1
+
+        if required_clusters > current_clusters:
+            # Allocate new clusters
+            cluster = entry.cluster
+            to_add = required_clusters
+            for _ in range(current_clusters - 1):
+                to_add -= 1
+                cluster = self.next_cluster(cluster)
+            assert required_clusters > 0, "No new clusters to allocate"
+            assert cluster is not None, "Cluster is None"
+            assert self.next_cluster(cluster) is None, \
+                   "Cluster is not the last cluster"
+
+            # Allocate new clusters
+            for _ in range(to_add - 1):
+                new_cluster = self.next_free_cluster()
+                self.write_fat_entry(cluster, new_cluster)
+                self.write_fat_entry(new_cluster, 0xFFFF)
+                cluster = new_cluster
+
+        elif required_clusters < current_clusters:
+            # Truncate the file
+            cluster = entry.cluster
+            for _ in range(required_clusters - 1):
+                cluster = self.next_cluster(cluster)
+            assert cluster is not None, "Cluster is None"
+
+            next_cluster = self.next_cluster(cluster)
+            # mark last as EOF
+            self.write_fat_entry(cluster, 0xFFFF)
+            # free the rest
+            while next_cluster is not None:
+                cluster = next_cluster
+                next_cluster = self.next_cluster(next_cluster)
+                self.write_fat_entry(cluster, 0)
+
+        self.flush_fats()
+
+        # verify number of clusters
+        cluster = entry.cluster
+        count = 0
+        while cluster is not None:
+            count += 1
+            affected_clusters.add(cluster)
+            cluster = self.next_cluster(cluster)
+        assert (
+            count == required_clusters
+        ), f"Expected {required_clusters} clusters, got {count}"
+
+        # update the size
+        entry.size_bytes = new_size
+        self.update_direntry(entry)
+
+        # trigger every affected cluster
+        for cluster in affected_clusters:
+            first_sector = self.boot_sector.first_sector_of_cluster(cluster)
+            first_sector_data = self.read_sectors(first_sector, 1)
+            self.write_sectors(first_sector, first_sector_data)
+
+    def write_file(self, entry: FatDirectoryEntry, data: bytes):
+        """
+        Write the content of the file at the given path.
+        """
+        if entry is None:
+            return Exception("entry is None")
+        if entry.attributes & 0x10 != 0:
+            raise Exception(f"{entry.whole_name()} is a directory")
+
+        data_len = len(data)
+
+        self.truncate_file(entry, data_len)
+
+        cluster = entry.cluster
+        while cluster is not None:
+            data_to_write = data[: self.boot_sector.cluster_bytes()]
+            last_data = False
+            if len(data_to_write) < self.boot_sector.cluster_bytes():
+                last_data = True
+                old_data = self.read_cluster(cluster)
+                data_to_write += old_data[len(data_to_write) :]
+
+            self.write_cluster(cluster, data_to_write)
+            data = data[self.boot_sector.cluster_bytes() :]
+            if len(data) == 0:
+                break
+            cluster = self.next_cluster(cluster)
+
+        assert len(data) == 0, \
+               "Data was not written completely, clusters missing"
+
+    def create_file(self, path: str):
+        """
+        Create a new file at the given path.
+        """
+        assert path[0] == "/", "Path must start with /"
+
+        path = path[1:]  # remove the leading /
+
+        parts = path.split("/")
+
+        directory_cluster = None
+        directory = self.read_root_directory()
+
+        parts, filename = parts[:-1], parts[-1]
+
+        for i, part in enumerate(parts):
+            current_entry = None
+            for entry in directory:
+                if entry.whole_name() == part:
+                    current_entry = entry
+                    break
+            if current_entry is None:
+                return None
+
+            if current_entry.attributes & 0x10 == 0:
+                raise Exception(
+                    f"{current_entry.whole_name()} is not a directory")
+            else:
+                directory = self.read_directory(current_entry.cluster)
+                directory_cluster = current_entry.cluster
+
+        # add new entry to the directory
+
+        filename, ext = filename.split(".")
+
+        if len(ext) > 3:
+            raise Exception("Ext must be 3 characters or less")
+        if len(filename) > 8:
+            raise Exception("Name must be 8 characters or less")
+
+        for c in filename + ext:
+
+            if c not in ALLOWED_FILE_CHARS:
+                raise Exception("Invalid character in filename")
+
+        return self.add_direntry(directory_cluster, filename, ext, 0)
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 588f30a4f1..4053d29de4 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -250,7 +250,7 @@ def __init__(self, source_dir: str, build_dir: str,
         self.qemu_img_options = os.getenv('QEMU_IMG_OPTIONS')
         self.qemu_nbd_options = os.getenv('QEMU_NBD_OPTIONS')
 
-        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg']
+        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg', 'vvfat']
         self.imgfmt_generic = 'true' if is_generic else 'false'
 
         self.qemu_io_options = f'--cache {self.cachemode} --aio {self.aiomode}'
diff --git a/tests/qemu-iotests/tests/vvfat b/tests/qemu-iotests/tests/vvfat
new file mode 100755
index 0000000000..113d7d3270
--- /dev/null
+++ b/tests/qemu-iotests/tests/vvfat
@@ -0,0 +1,440 @@
+#!/usr/bin/env python3
+# group: rw vvfat
+#
+# Test vvfat driver implementation
+# Here, we use a simple FAT16 implementation and check the behavior of the vvfat driver.
+#
+# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+import os, shutil
+import iotests
+from iotests import imgfmt, QMPTestCase
+from fat16 import MBR, Fat16, DIRENTRY_SIZE
+
+filesystem = os.path.join(iotests.test_dir, "filesystem")
+
+nbd_sock = iotests.file_path("nbd.sock", base_dir=iotests.sock_dir)
+nbd_uri = "nbd+unix:///disk?socket=" + nbd_sock
+
+SECTOR_SIZE = 512
+
+
+class TestVVFatDriver(QMPTestCase):
+    def setUp(self) -> None:
+        if os.path.exists(filesystem):
+            if os.path.isdir(filesystem):
+                shutil.rmtree(filesystem)
+            else:
+                print(f"Error: {filesystem} exists and is not a directory")
+                exit(1)
+        os.mkdir(filesystem)
+
+        # Add some text files to the filesystem
+        for i in range(10):
+            with open(os.path.join(filesystem, f"file{i}.txt"), "w") as f:
+                f.write(f"Hello, world! {i}\n")
+
+        # Add 2 large files, above the cluster size (8KB)
+        with open(os.path.join(filesystem, "large1.txt"), "wb") as f:
+            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
+            for i in range(8 * 2):  # two clusters
+                f.write(bytes([0x41 + i] * 1024))
+
+        with open(os.path.join(filesystem, "large2.txt"), "wb") as f:
+            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
+            for i in range(8 * 3):  # 3 clusters
+                f.write(bytes([0x41 + i] * 1024))
+
+        self.vm = iotests.VM()
+
+        self.vm.add_blockdev(
+            self.vm.qmp_to_opts(
+                {
+                    "driver": imgfmt,
+                    "node-name": "disk",
+                    "rw": "true",
+                    "fat-type": "16",
+                    "dir": filesystem,
+                }
+            )
+        )
+
+        self.vm.launch()
+
+        self.vm.qmp_log("block-dirty-bitmap-add", **{"node": "disk", "name": "bitmap0"})
+
+        # attach nbd server
+        self.vm.qmp_log(
+            "nbd-server-start",
+            **{"addr": {"type": "unix", "data": {"path": nbd_sock}}},
+            filters=[],
+        )
+
+        self.vm.qmp_log(
+            "nbd-server-add",
+            **{"device": "disk", "writable": True, "bitmap": "bitmap0"},
+        )
+
+        self.qio = iotests.QemuIoInteractive("-f", "raw", nbd_uri)
+
+    def tearDown(self) -> None:
+        self.qio.close()
+        self.vm.shutdown()
+        # print(self.vm.get_log())
+        shutil.rmtree(filesystem)
+
+    def read_sectors(self, sector: int, num: int = 1) -> bytes:
+        """
+        Read `num` sectors starting from `sector` from the `disk`.
+        This uses `QemuIoInteractive` to read the sectors into `stdout` and then parse the output.
+        """
+        self.assertGreater(num, 0)
+        # The output contains the content of the sector in hex dump format
+        # We need to extract the content from it
+        output = self.qio.cmd(f"read -v {sector * SECTOR_SIZE} {num * SECTOR_SIZE}")
+        # Each row is 16 bytes long, and we are writing `num` sectors
+        rows = num * SECTOR_SIZE // 16
+        output_rows = output.split("\n")[:rows]
+
+        hex_content = "".join(
+            [(row.split(": ")[1]).split("  ")[0] for row in output_rows]
+        )
+        bytes_content = bytes.fromhex(hex_content)
+
+        self.assertEqual(len(bytes_content), num * SECTOR_SIZE)
+
+        return bytes_content
+
+    def write_sectors(self, sector: int, data: bytes):
+        """
+        Write `data` to the `disk` starting from `sector`.
+        This uses `QemuIoInteractive` to write the data into the disk.
+        """
+
+        self.assertGreater(len(data), 0)
+        self.assertEqual(len(data) % SECTOR_SIZE, 0)
+
+        temp_file = os.path.join(iotests.test_dir, "temp.bin")
+        with open(temp_file, "wb") as f:
+            f.write(data)
+
+        self.qio.cmd(f"write -s {temp_file} {sector * SECTOR_SIZE} {len(data)}")
+
+        os.remove(temp_file)
+
+    def init_fat16(self):
+        mbr = MBR(self.read_sectors(0))
+        return Fat16(
+            mbr.partition_table[0]["start_lba"],
+            mbr.partition_table[0]["size"],
+            self.read_sectors,
+            self.write_sectors,
+        )
+
+    # Tests
+
+    def test_fat_filesystem(self):
+        """
+        Test that vvfat produces valid FAT16 and MBR sectors
+        """
+        mbr = MBR(self.read_sectors(0))
+
+        self.assertEqual(mbr.partition_table[0]["status"], 0x80)
+        self.assertEqual(mbr.partition_table[0]["type"], 6)
+
+        fat16 = Fat16(
+            mbr.partition_table[0]["start_lba"],
+            mbr.partition_table[0]["size"],
+            self.read_sectors,
+            self.write_sectors,
+        )
+        self.assertEqual(fat16.boot_sector.bytes_per_sector, 512)
+        self.assertEqual(fat16.boot_sector.volume_label, "QEMU VVFAT")
+
+    def test_read_root_directory(self):
+        """
+        Test the content of the root directory
+        """
+        fat16 = self.init_fat16()
+
+        root_dir = fat16.read_root_directory()
+
+        self.assertEqual(len(root_dir), 13)  # 12 + 1 special file
+
+        files = {
+            "QEMU VVF.AT": 0,  # special empty file
+            "FILE0.TXT": 16,
+            "FILE1.TXT": 16,
+            "FILE2.TXT": 16,
+            "FILE3.TXT": 16,
+            "FILE4.TXT": 16,
+            "FILE5.TXT": 16,
+            "FILE6.TXT": 16,
+            "FILE7.TXT": 16,
+            "FILE8.TXT": 16,
+            "FILE9.TXT": 16,
+            "LARGE1.TXT": 0x2000 * 2,
+            "LARGE2.TXT": 0x2000 * 3,
+        }
+
+        for entry in root_dir:
+            self.assertIn(entry.whole_name(), files)
+            self.assertEqual(entry.size_bytes, files[entry.whole_name()])
+
+    def test_direntry_as_bytes(self):
+        """
+        Test if we can convert Direntry back to bytes, so that we can write it back to the disk safely.
+        """
+        fat16 = self.init_fat16()
+
+        root_dir = fat16.read_root_directory()
+        first_entry_bytes = fat16.read_sectors(fat16.boot_sector.root_dir_start(), 1)
+        # The first entry won't be deleted, so we can compare it with the first entry in the root directory
+        self.assertEqual(root_dir[0].as_bytes(), first_entry_bytes[:DIRENTRY_SIZE])
+
+    def test_read_files(self):
+        """
+        Test reading the content of the files
+        """
+        fat16 = self.init_fat16()
+
+        for i in range(10):
+            file = fat16.find_direntry(f"/FILE{i}.TXT")
+            self.assertIsNotNone(file)
+            self.assertEqual(
+                fat16.read_file(file), f"Hello, world! {i}\n".encode("ascii")
+            )
+
+        # test large files
+        large1 = fat16.find_direntry("/LARGE1.TXT")
+        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
+            self.assertEqual(fat16.read_file(large1), f.read())
+
+        large2 = fat16.find_direntry("/LARGE2.TXT")
+        self.assertIsNotNone(large2)
+        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
+            self.assertEqual(fat16.read_file(large2), f.read())
+
+    def test_write_file_same_content_direct(self):
+        """
+        Similar to `test_write_file_in_same_content`, but we write the file's clusters directly
+        and thus we don't go through modifying the direntry.
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/FILE0.TXT")
+        self.assertIsNotNone(file)
+
+        data = fat16.read_cluster(file.cluster)
+        fat16.write_cluster(file.cluster, data)
+
+        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
+            self.assertEqual(fat16.read_file(file), f.read())
+
+    def test_write_file_in_same_content(self):
+        """
+        Test writing the same content to the file back to it
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/FILE0.TXT")
+        self.assertIsNotNone(file)
+
+        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
+
+        fat16.write_file(file, b"Hello, world! 0\n")
+
+        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
+
+        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
+            self.assertEqual(f.read(), b"Hello, world! 0\n")
+
+    def test_modify_content_same_clusters(self):
+        """
+        Test modifying the content of the file without changing the number of clusters
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/FILE0.TXT")
+        self.assertIsNotNone(file)
+
+        new_content = b"Hello, world! Modified\n"
+        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
+
+        fat16.write_file(file, new_content)
+
+        self.assertEqual(fat16.read_file(file), new_content)
+
+        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_truncate_file_same_clusters_less(self):
+        """
+        Test truncating the file without changing number of clusters
+        Test decreasing the file size
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/FILE0.TXT")
+        self.assertIsNotNone(file)
+
+        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
+
+        fat16.truncate_file(file, 5)
+
+        new_content = fat16.read_file(file)
+
+        self.assertEqual(new_content, b"Hello")
+
+        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_truncate_file_same_clusters_more(self):
+        """
+        Test truncating the file without changing number of clusters
+        Test increase the file size
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/FILE0.TXT")
+        self.assertIsNotNone(file)
+
+        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
+
+        fat16.truncate_file(file, 20)
+
+        new_content = fat16.read_file(file)
+
+        # a random pattern will be appended to the file, and it's not always the same
+        self.assertEqual(new_content[:16], b"Hello, world! 0\n")
+        self.assertEqual(len(new_content), 20)
+
+        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_write_large_file(self):
+        """
+        Test writing a large file
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/LARGE1.TXT")
+        self.assertIsNotNone(file)
+
+        # The content of LARGE1 is A * 1KB, B * 1KB, C * 1KB, ..., P * 1KB
+        # Lets change it to be Z * 1KB, Y * 1KB, X * 1KB, ..., K * 1KB
+        # without changing the number of clusters or filesize
+        new_content = b"".join([bytes([0x5A - i] * 1024) for i in range(16)])
+
+        fat16.write_file(file, new_content)
+
+        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_truncate_file_change_clusters_less(self):
+        """
+        Test truncating a file by reducing the number of clusters
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/LARGE1.TXT")
+        self.assertIsNotNone(file)
+
+        fat16.truncate_file(file, 1)
+
+        self.assertEqual(fat16.read_file(file), b"A")
+
+        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
+            self.assertEqual(f.read(), b"A")
+
+    def test_write_file_change_clusters_less(self):
+        """
+        Test truncating a file by reducing the number of clusters
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/LARGE2.TXT")
+        self.assertIsNotNone(file)
+
+        new_content = b"Hello, world! This was a large file\n"
+        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024
+
+        fat16.write_file(file, new_content)
+
+        self.assertEqual(fat16.read_file(file), new_content)
+
+        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_write_file_change_clusters_more(self):
+        """
+        Test truncating a file by increasing the number of clusters
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/LARGE2.TXT")
+        self.assertIsNotNone(file)
+
+        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
+
+        fat16.write_file(file, new_content)
+
+        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_write_file_change_clusters_more_non_last_file(self):
+        """
+        Test truncating a file by increasing the number of clusters
+        This is a special variant of the above test, where we write to
+        a file so that when allocating new clusters, it won't have contiguous clusters
+        """
+        fat16 = self.init_fat16()
+
+        file = fat16.find_direntry("/LARGE1.TXT")
+        self.assertIsNotNone(file)
+
+        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
+
+        fat16.write_file(file, new_content)
+
+        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    def test_create_file(self):
+        """
+        Test creating a new file
+        """
+        fat16 = self.init_fat16()
+
+        new_file = fat16.create_file("/NEWFILE.TXT")
+
+        self.assertIsNotNone(new_file)
+        self.assertEqual(new_file.size_bytes, 0)
+
+        new_content = b"Hello, world! New file\n"
+        fat16.write_file(new_file, new_content)
+
+        self.assertEqual(fat16.read_file(new_file), new_content)
+
+        with open(os.path.join(filesystem, "newfile.txt"), "rb") as f:
+            self.assertEqual(f.read(), new_content)
+
+    # TODO: support deleting files
+
+
+if __name__ == "__main__":
+    # This is a specific test for vvfat driver
+    iotests.main(supported_fmts=["vvfat"], supported_protocols=["file"])
diff --git a/tests/qemu-iotests/tests/vvfat.out b/tests/qemu-iotests/tests/vvfat.out
new file mode 100755
index 0000000000..96961ed0b5
--- /dev/null
+++ b/tests/qemu-iotests/tests/vvfat.out
@@ -0,0 +1,5 @@
+...............
+----------------------------------------------------------------------
+Ran 15 tests
+
+OK
-- 
2.45.1




* Re: [PATCH v4 4/4] iotests: Add `vvfat` tests
  2024-06-05  0:58 ` [PATCH v4 4/4] iotests: Add `vvfat` tests Amjad Alsharafi
@ 2024-06-10 12:01   ` Kevin Wolf
  2024-06-10 14:11     ` Amjad Alsharafi
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2024-06-10 12:01 UTC
  To: Amjad Alsharafi; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

On 05.06.2024 at 02:58, Amjad Alsharafi wrote:
> Added several tests to verify the implementation of the vvfat driver.
> 
> We needed a way to interact with it, so we created a basic `fat16.py`
> driver that handles writing correct sectors for us.
> 
> Added `vvfat` to the non-generic formats, as it is not a normal image
> format.
> 
> Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> ---
>  tests/qemu-iotests/check           |   2 +-
>  tests/qemu-iotests/fat16.py        | 635 +++++++++++++++++++++++++++++
>  tests/qemu-iotests/testenv.py      |   2 +-
>  tests/qemu-iotests/tests/vvfat     | 440 ++++++++++++++++++++
>  tests/qemu-iotests/tests/vvfat.out |   5 +
>  5 files changed, 1082 insertions(+), 2 deletions(-)
>  create mode 100644 tests/qemu-iotests/fat16.py
>  create mode 100755 tests/qemu-iotests/tests/vvfat
>  create mode 100755 tests/qemu-iotests/tests/vvfat.out
> 
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index 56d88ca423..545f9ec7bd 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -84,7 +84,7 @@ def make_argparser() -> argparse.ArgumentParser:
>      p.set_defaults(imgfmt='raw', imgproto='file')
>  
>      format_list = ['raw', 'bochs', 'cloop', 'parallels', 'qcow', 'qcow2',
> -                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg']
> +                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg', 'vvfat']
>      g_fmt = p.add_argument_group(
>          '  image format options',
>          'The following options set the IMGFMT environment variable. '
> diff --git a/tests/qemu-iotests/fat16.py b/tests/qemu-iotests/fat16.py
> new file mode 100644
> index 0000000000..baf801b4d5
> --- /dev/null
> +++ b/tests/qemu-iotests/fat16.py
> @@ -0,0 +1,635 @@
> +# A simple FAT16 driver that is used to test the `vvfat` driver in QEMU.
> +#
> +# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +from typing import List
> +import string
> +
> +SECTOR_SIZE = 512
> +DIRENTRY_SIZE = 32
> +ALLOWED_FILE_CHARS = \
> +    set("!#$%&'()-@^_`{}~" + string.digits + string.ascii_uppercase)
> +
> +
> +class MBR:
> +    def __init__(self, data: bytes):
> +        assert len(data) == 512
> +        self.partition_table = []
> +        for i in range(4):
> +            partition = data[446 + i * 16 : 446 + (i + 1) * 16]
> +            self.partition_table.append(
> +                {
> +                    "status": partition[0],
> +                    "start_head": partition[1],
> +                    "start_sector": partition[2] & 0x3F,
> +                    "start_cylinder":
> +                        ((partition[2] & 0xC0) << 2) | partition[3],
> +                    "type": partition[4],
> +                    "end_head": partition[5],
> +                    "end_sector": partition[6] & 0x3F,
> +                    "end_cylinder":
> +                        ((partition[6] & 0xC0) << 2) | partition[7],
> +                    "start_lba": int.from_bytes(partition[8:12], "little"),
> +                    "size": int.from_bytes(partition[12:16], "little"),
> +                }
> +            )
> +
> +    def __str__(self):
> +        return "\n".join(
> +            [f"{i}: {partition}"
> +                for i, partition in enumerate(self.partition_table)]
> +        )
> +
> +
> +class FatBootSector:
> +    def __init__(self, data: bytes):
> +        assert len(data) == 512
> +        self.bytes_per_sector = int.from_bytes(data[11:13], "little")
> +        self.sectors_per_cluster = data[13]
> +        self.reserved_sectors = int.from_bytes(data[14:16], "little")
> +        self.fat_count = data[16]
> +        self.root_entries = int.from_bytes(data[17:19], "little")
> +        self.media_descriptor = data[21]
> +        self.fat_size = int.from_bytes(data[22:24], "little")
> +        self.sectors_per_fat = int.from_bytes(data[22:24], "little")

Why two different attributes self.fat_size and self.sectors_per_fat that
contain the same value?
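
A minimal sketch of the obvious consolidation (an assumption about the
intended fix, not a confirmed change, and the helper name is
hypothetical): parse the 16-bit field once and keep a single attribute.

    # sketch: single source of truth for the FAT size in sectors
    def sectors_per_fat(boot_sector: bytes) -> int:
        assert len(boot_sector) == 512
        return int.from_bytes(boot_sector[22:24], "little")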

> +        self.sectors_per_track = int.from_bytes(data[24:26], "little")
> +        self.heads = int.from_bytes(data[26:28], "little")
> +        self.hidden_sectors = int.from_bytes(data[28:32], "little")
> +        self.total_sectors = int.from_bytes(data[32:36], "little")

This value should only be considered if the 16 bit value at byte 19
(which you don't store at all) was 0.
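
For illustration, a minimal sketch of the lookup described above
(hypothetical helper, not part of the patch): the 16-bit count at
byte 19 is authoritative, and the 32-bit count at byte 32 only applies
when the 16-bit field is zero.

    def total_sectors(boot_sector: bytes) -> int:
        assert len(boot_sector) == 512
        # 16-bit total sector count (bytes 19..20)
        total_16 = int.from_bytes(boot_sector[19:21], "little")
        # 32-bit total sector count (bytes 32..35), fallback only
        total_32 = int.from_bytes(boot_sector[32:36], "little")
        return total_16 if total_16 != 0 else total_32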

> +        self.drive_number = data[36]
> +        self.volume_id = int.from_bytes(data[39:43], "little")
> +        self.volume_label = data[43:54].decode("ascii").strip()
> +        self.fs_type = data[54:62].decode("ascii").strip()
> +
> +    def root_dir_start(self):
> +        """
> +        Calculate the start sector of the root directory.
> +        """
> +        return self.reserved_sectors + self.fat_count * self.sectors_per_fat
> +
> +    def root_dir_size(self):
> +        """
> +        Calculate the size of the root directory in sectors.
> +        """
> +        return (
> +            self.root_entries * DIRENTRY_SIZE + self.bytes_per_sector - 1
> +        ) // self.bytes_per_sector
> +
> +    def data_sector_start(self):
> +        """
> +        Calculate the start sector of the data region.
> +        """
> +        return self.root_dir_start() + self.root_dir_size()
> +
> +    def first_sector_of_cluster(self, cluster: int):
> +        """
> +        Calculate the first sector of the given cluster.
> +        """
> +        return self.data_sector_start() \
> +                + (cluster - 2) * self.sectors_per_cluster
> +
> +    def cluster_bytes(self):
> +        """
> +        Calculate the number of bytes in a cluster.
> +        """
> +        return self.bytes_per_sector * self.sectors_per_cluster
> +
> +    def __str__(self):
> +        return (
> +            f"Bytes per sector: {self.bytes_per_sector}\n"
> +            f"Sectors per cluster: {self.sectors_per_cluster}\n"
> +            f"Reserved sectors: {self.reserved_sectors}\n"
> +            f"FAT count: {self.fat_count}\n"
> +            f"Root entries: {self.root_entries}\n"
> +            f"Total sectors: {self.total_sectors}\n"
> +            f"Media descriptor: {self.media_descriptor}\n"
> +            f"Sectors per FAT: {self.sectors_per_fat}\n"
> +            f"Sectors per track: {self.sectors_per_track}\n"
> +            f"Heads: {self.heads}\n"
> +            f"Hidden sectors: {self.hidden_sectors}\n"
> +            f"Drive number: {self.drive_number}\n"
> +            f"Volume ID: {self.volume_id}\n"
> +            f"Volume label: {self.volume_label}\n"
> +            f"FS type: {self.fs_type}\n"
> +        )
> +
> +
> +class FatDirectoryEntry:
> +    def __init__(self, data: bytes, sector: int, offset: int):
> +        self.name = data[0:8].decode("ascii").strip()
> +        self.ext = data[8:11].decode("ascii").strip()
> +        self.attributes = data[11]
> +        self.reserved = data[12]
> +        self.create_time_tenth = data[13]
> +        self.create_time = int.from_bytes(data[14:16], "little")
> +        self.create_date = int.from_bytes(data[16:18], "little")
> +        self.last_access_date = int.from_bytes(data[18:20], "little")
> +        high_cluster = int.from_bytes(data[20:22], "little")
> +        self.last_mod_time = int.from_bytes(data[22:24], "little")
> +        self.last_mod_date = int.from_bytes(data[24:26], "little")
> +        low_cluster = int.from_bytes(data[26:28], "little")
> +        self.cluster = (high_cluster << 16) | low_cluster
> +        self.size_bytes = int.from_bytes(data[28:32], "little")
> +
> +        # extra (to help write back to disk)
> +        self.sector = sector
> +        self.offset = offset
> +
> +    def as_bytes(self) -> bytes:
> +        return (
> +            self.name.ljust(8, " ").encode("ascii")
> +            + self.ext.ljust(3, " ").encode("ascii")
> +            + self.attributes.to_bytes(1, "little")
> +            + self.reserved.to_bytes(1, "little")
> +            + self.create_time_tenth.to_bytes(1, "little")
> +            + self.create_time.to_bytes(2, "little")
> +            + self.create_date.to_bytes(2, "little")
> +            + self.last_access_date.to_bytes(2, "little")
> +            + (self.cluster >> 16).to_bytes(2, "little")
> +            + self.last_mod_time.to_bytes(2, "little")
> +            + self.last_mod_date.to_bytes(2, "little")
> +            + (self.cluster & 0xFFFF).to_bytes(2, "little")
> +            + self.size_bytes.to_bytes(4, "little")
> +        )
> +
> +    def whole_name(self):
> +        if self.ext:
> +            return f"{self.name}.{self.ext}"
> +        else:
> +            return self.name
> +
> +    def __str__(self):
> +        return (
> +            f"Name: {self.name}\n"
> +            f"Ext: {self.ext}\n"
> +            f"Attributes: {self.attributes}\n"
> +            f"Reserved: {self.reserved}\n"
> +            f"Create time tenth: {self.create_time_tenth}\n"
> +            f"Create time: {self.create_time}\n"
> +            f"Create date: {self.create_date}\n"
> +            f"Last access date: {self.last_access_date}\n"
> +            f"Last mod time: {self.last_mod_time}\n"
> +            f"Last mod date: {self.last_mod_date}\n"
> +            f"Cluster: {self.cluster}\n"
> +            f"Size: {self.size_bytes}\n"
> +        )
> +
> +    def __repr__(self):
> +        # convert to dict
> +        return str(vars(self))
> +
> +
> +class Fat16:
> +    def __init__(
> +        self,
> +        start_sector: int,
> +        size: int,
> +        sector_reader: callable,
> +        sector_writer: callable,
> +    ):
> +        self.start_sector = start_sector
> +        self.size_in_sectors = size
> +        self.sector_reader = sector_reader
> +        self.sector_writer = sector_writer
> +
> +        self.boot_sector = FatBootSector(self.sector_reader(start_sector))
> +
> +        fat_size_in_sectors = \
> +            self.boot_sector.fat_size * self.boot_sector.fat_count
> +        self.fats = self.read_sectors(
> +            self.boot_sector.reserved_sectors, fat_size_in_sectors
> +        )
> +        self.fats_dirty_sectors = set()
> +
> +    def read_sectors(self, start_sector: int, num_sectors: int) -> bytes:
> +        return self.sector_reader(start_sector + self.start_sector, num_sectors)
> +
> +    def write_sectors(self, start_sector: int, data: bytes):
> +        return self.sector_writer(start_sector + self.start_sector, data)
> +
> +    def directory_from_bytes(
> +        self, data: bytes, start_sector: int
> +    ) -> List[FatDirectoryEntry]:
> +        """
> +        Convert `bytes` into a list of `FatDirectoryEntry` objects.
> +        Will ignore long file names.
> +        Will stop when it encounters a 0x00 byte.
> +        """
> +
> +        entries = []
> +        for i in range(0, len(data), DIRENTRY_SIZE):
> +            entry = data[i : i + DIRENTRY_SIZE]
> +
> +            current_sector = start_sector + (i // SECTOR_SIZE)
> +            current_offset = i % SECTOR_SIZE
> +
> +            if entry[0] == 0:
> +                break
> +            elif entry[0] == 0xE5:
> +                # Deleted file
> +                continue
> +
> +            if entry[11] & 0xF == 0xF:
> +                # Long file name
> +                continue
> +
> +            entries.append(
> +                FatDirectoryEntry(entry, current_sector, current_offset))
> +        return entries
> +
> +    def read_root_directory(self) -> List[FatDirectoryEntry]:
> +        root_dir = self.read_sectors(
> +            self.boot_sector.root_dir_start(), self.boot_sector.root_dir_size()
> +        )
> +        return self.directory_from_bytes(root_dir,
> +                                         self.boot_sector.root_dir_start())
> +
> +    def read_fat_entry(self, cluster: int) -> int:
> +        """
> +        Read the FAT entry for the given cluster.
> +        """
> +        fat_offset = cluster * 2  # FAT16
> +        return int.from_bytes(self.fats[fat_offset : fat_offset + 2], "little")
> +
> +    def write_fat_entry(self, cluster: int, value: int):
> +        """
> +        Write the FAT entry for the given cluster.
> +        """
> +        fat_offset = cluster * 2
> +        self.fats = (
> +            self.fats[:fat_offset]
> +            + value.to_bytes(2, "little")
> +            + self.fats[fat_offset + 2 :]
> +        )
> +        self.fats_dirty_sectors.add(fat_offset // SECTOR_SIZE)
> +
> +    def flush_fats(self):
> +        """
> +        Write the FATs back to the disk.
> +        """
> +        for sector in self.fats_dirty_sectors:
> +            data = self.fats[sector * SECTOR_SIZE : (sector + 1) * SECTOR_SIZE]
> +            sector = self.boot_sector.reserved_sectors + sector
> +            self.write_sectors(sector, data)
> +        self.fats_dirty_sectors = set()
> +
> +    def next_cluster(self, cluster: int) -> int | None:
> +        """
> +        Get the next cluster in the chain.
> +        If it's `None`, then it's the last cluster.
> +        The function will raise an exception if the next cluster
> +        is `FREE` (unexpected) or an invalid entry.
> +        """
> +        fat_entry = self.read_fat_entry(cluster)
> +        if fat_entry == 0:
> +            raise Exception("Unexpected: FREE cluster")
> +        elif fat_entry == 1:
> +            raise Exception("Unexpected: RESERVED cluster")
> +        elif fat_entry >= 0xFFF8:
> +            return None
> +        elif fat_entry >= 0xFFF7:
> +            raise Exception("Invalid FAT entry")
> +        else:
> +            return fat_entry
> +
> +    def next_free_cluster(self) -> int:
> +        """
> +        Find the next free cluster.
> +        """
> +        # simple linear search
> +        for i in range(2, 0xFFFF):
> +            if self.read_fat_entry(i) == 0:
> +                return i
> +        raise Exception("No free clusters")
> +
> +    def read_cluster(self, cluster: int) -> bytes:
> +        """
> +        Read the data of the given cluster.
> +        """
> +        return self.read_sectors(
> +            self.boot_sector.first_sector_of_cluster(cluster),
> +            self.boot_sector.sectors_per_cluster,
> +        )
> +
> +    def write_cluster(self, cluster: int, data: bytes):
> +        """
> +        Write the data to the given cluster.
> +        """
> +        assert len(data) == self.boot_sector.cluster_bytes()
> +        return self.write_sectors(
> +            self.boot_sector.first_sector_of_cluster(cluster),
> +            data,
> +        )
> +
> +    def read_directory(self, cluster: int) -> List[FatDirectoryEntry]:
> +        """
> +        Read the directory at the given cluster.
> +        """
> +        entries = []
> +        while cluster is not None:
> +            data = self.read_cluster(cluster)
> +            entries.extend(
> +                self.directory_from_bytes(
> +                    data, self.boot_sector.first_sector_of_cluster(cluster)
> +                )
> +            )
> +            cluster = self.next_cluster(cluster)
> +        return entries
> +
> +    def add_direntry(self,
> +                     cluster: int | None,
> +                     name: str, ext: str,
> +                     attributes: int):
> +        """
> +        Add a new directory entry to the given cluster.
> +        If the cluster is `None`, then it will be added to the root directory.
> +        """
> +
> +        def find_free_entry(data: bytes):
> +            for i in range(0, len(data), DIRENTRY_SIZE):
> +                entry = data[i : i + DIRENTRY_SIZE]
> +                if entry[0] == 0 or entry[0] == 0xE5:
> +                    return i
> +            return None
> +
> +        assert len(name) <= 8, "Name must be 8 characters or less"
> +        assert len(ext) <= 3, "Ext must be 3 characters or less"
> +        assert attributes % 0x15 != 0x15, "Invalid attributes"
> +
> +        # initial dummy data
> +        new_entry = FatDirectoryEntry(b"\0" * 32, 0, 0)
> +        new_entry.name = name.ljust(8, " ")
> +        new_entry.ext = ext.ljust(3, " ")
> +        new_entry.attributes = attributes
> +        new_entry.reserved = 0
> +        new_entry.create_time_tenth = 0
> +        new_entry.create_time = 0
> +        new_entry.create_date = 0
> +        new_entry.last_access_date = 0
> +        new_entry.last_mod_time = 0
> +        new_entry.last_mod_date = 0
> +        new_entry.cluster = self.next_free_cluster()
> +        new_entry.size_bytes = 0
> +
> +        # mark as EOF
> +        self.write_fat_entry(new_entry.cluster, 0xFFFF)
> +
> +        if cluster is None:
> +            for i in range(self.boot_sector.root_dir_size()):
> +                sector_data = self.read_sectors(
> +                    self.boot_sector.root_dir_start() + i, 1
> +                )
> +                offset = find_free_entry(sector_data)
> +                if offset is not None:
> +                    new_entry.sector = self.boot_sector.root_dir_start() + i
> +                    new_entry.offset = offset
> +                    self.update_direntry(new_entry)
> +                    return new_entry
> +        else:
> +            while cluster is not None:
> +                data = self.read_cluster(cluster)
> +                offset = find_free_entry(data)
> +                if offset is not None:
> +                    new_entry.sector = self.boot_sector.first_sector_of_cluster(
> +                        cluster
> +                    ) + (offset // SECTOR_SIZE)
> +                    new_entry.offset = offset % SECTOR_SIZE
> +                    self.update_direntry(new_entry)
> +                    return new_entry
> +                cluster = self.next_cluster(cluster)
> +
> +        raise Exception("No free directory entries")
> +
> +    def update_direntry(self, entry: FatDirectoryEntry):
> +        """
> +        Write the directory entry back to the disk.
> +        """
> +        sector = self.read_sectors(entry.sector, 1)
> +        sector = (
> +            sector[: entry.offset]
> +            + entry.as_bytes()
> +            + sector[entry.offset + DIRENTRY_SIZE :]
> +        )
> +        self.write_sectors(entry.sector, sector)
> +
> +    def find_direntry(self, path: str) -> FatDirectoryEntry | None:
> +        """
> +        Find the directory entry for the given path.
> +        """
> +        assert path[0] == "/", "Path must start with /"
> +
> +        path = path[1:]  # remove the leading /
> +        parts = path.split("/")
> +        directory = self.read_root_directory()
> +
> +        for i, part in enumerate(parts):
> +            # reset for each path component so a missing one is detected
> +            current_entry = None
> +            is_last = i == len(parts) - 1
> +
> +            for entry in directory:
> +                if entry.whole_name() == part:
> +                    current_entry = entry
> +                    break
> +            if current_entry is None:
> +                return None
> +
> +            if is_last:
> +                return current_entry
> +            else:
> +                if current_entry.attributes & 0x10 == 0:
> +                    raise Exception(
> +                        f"{current_entry.whole_name()} is not a directory")
> +                else:
> +                    directory = self.read_directory(current_entry.cluster)
> +
> +    def read_file(self, entry: FatDirectoryEntry) -> bytes:
> +        """
> +        Read the content of the file at the given direntry.
> +        """
> +        if entry is None:
> +            return None
> +        if entry.attributes & 0x10 != 0:
> +            raise Exception(f"{entry.whole_name()} is a directory")
> +
> +        data = b""
> +        cluster = entry.cluster
> +        while cluster is not None and len(data) <= entry.size_bytes:
> +            data += self.read_cluster(cluster)
> +            cluster = self.next_cluster(cluster)
> +        return data[: entry.size_bytes]
> +
> +    def truncate_file(self, entry: FatDirectoryEntry, new_size: int):
> +        """
> +        Truncate the file at the given direntry to the new size.
> +        """
> +        if entry is None:
> +            raise Exception("entry is None")
> +        if entry.attributes & 0x10 != 0:
> +            raise Exception(f"{entry.whole_name()} is a directory")
> +
> +        def clusters_from_size(size: int):
> +            return (
> +                size + self.boot_sector.cluster_bytes() - 1
> +            ) // self.boot_sector.cluster_bytes()
> +
> +        # First, allocate new FATs if we need to
> +        required_clusters = clusters_from_size(new_size)
> +        current_clusters = clusters_from_size(entry.size_bytes)
> +
> +        affected_clusters = set()
> +
> +        # Keep at least one cluster, easier to manage this way
> +        if required_clusters == 0:
> +            required_clusters = 1
> +        if current_clusters == 0:
> +            current_clusters = 1
> +
> +        if required_clusters > current_clusters:
> +            # Allocate new clusters
> +            cluster = entry.cluster
> +            to_add = required_clusters
> +            for _ in range(current_clusters - 1):
> +                to_add -= 1
> +                cluster = self.next_cluster(cluster)
> +            assert required_clusters > 0, "No new clusters to allocate"
> +            assert cluster is not None, "Cluster is None"
> +            assert self.next_cluster(cluster) is None, \
> +                   "Cluster is not the last cluster"
> +
> +            # Allocate new clusters
> +            for _ in range(to_add - 1):
> +                new_cluster = self.next_free_cluster()
> +                self.write_fat_entry(cluster, new_cluster)
> +                self.write_fat_entry(new_cluster, 0xFFFF)
> +                cluster = new_cluster
> +
> +        elif required_clusters < current_clusters:
> +            # Truncate the file
> +            cluster = entry.cluster
> +            for _ in range(required_clusters - 1):
> +                cluster = self.next_cluster(cluster)
> +            assert cluster is not None, "Cluster is None"
> +
> +            next_cluster = self.next_cluster(cluster)
> +            # mark last as EOF
> +            self.write_fat_entry(cluster, 0xFFFF)
> +            # free the rest
> +            while next_cluster is not None:
> +                cluster = next_cluster
> +                next_cluster = self.next_cluster(next_cluster)
> +                self.write_fat_entry(cluster, 0)
> +
> +        self.flush_fats()
> +
> +        # verify number of clusters
> +        cluster = entry.cluster
> +        count = 0
> +        while cluster is not None:
> +            count += 1
> +            affected_clusters.add(cluster)
> +            cluster = self.next_cluster(cluster)
> +        assert (
> +            count == required_clusters
> +        ), f"Expected {required_clusters} clusters, got {count}"
> +
> +        # update the size
> +        entry.size_bytes = new_size
> +        self.update_direntry(entry)
> +
> +        # trigger every affected cluster
> +        for cluster in affected_clusters:
> +            first_sector = self.boot_sector.first_sector_of_cluster(cluster)
> +            first_sector_data = self.read_sectors(first_sector, 1)
> +            self.write_sectors(first_sector, first_sector_data)
> +
> +    def write_file(self, entry: FatDirectoryEntry, data: bytes):
> +        """
> +        Write `data` to the file at the given direntry.
> +        """
> +        if entry is None:
> +            raise Exception("entry is None")
> +        if entry.attributes & 0x10 != 0:
> +            raise Exception(f"{entry.whole_name()} is a directory")
> +
> +        data_len = len(data)
> +
> +        self.truncate_file(entry, data_len)
> +
> +        cluster = entry.cluster
> +        while cluster is not None:
> +            data_to_write = data[: self.boot_sector.cluster_bytes()]
> +            last_data = False
> +            if len(data_to_write) < self.boot_sector.cluster_bytes():
> +                last_data = True
> +                old_data = self.read_cluster(cluster)
> +                data_to_write += old_data[len(data_to_write) :]
> +
> +            self.write_cluster(cluster, data_to_write)
> +            data = data[self.boot_sector.cluster_bytes() :]
> +            if len(data) == 0:
> +                break
> +            cluster = self.next_cluster(cluster)
> +
> +        assert len(data) == 0, \
> +               "Data was not written completely, clusters missing"
> +
> +    def create_file(self, path: str):
> +        """
> +        Create a new file at the given path.
> +        """
> +        assert path[0] == "/", "Path must start with /"
> +
> +        path = path[1:]  # remove the leading /
> +
> +        parts = path.split("/")
> +
> +        directory_cluster = None
> +        directory = self.read_root_directory()
> +
> +        parts, filename = parts[:-1], parts[-1]
> +
> +        for i, part in enumerate(parts):
> +            current_entry = None
> +            for entry in directory:
> +                if entry.whole_name() == part:
> +                    current_entry = entry
> +                    break
> +            if current_entry is None:
> +                return None
> +
> +            if current_entry.attributes & 0x10 == 0:
> +                raise Exception(
> +                    f"{current_entry.whole_name()} is not a directory")
> +            else:
> +                directory = self.read_directory(current_entry.cluster)
> +                directory_cluster = current_entry.cluster
> +
> +        # add new entry to the directory
> +
> +        filename, ext = filename.split(".")
> +
> +        if len(ext) > 3:
> +            raise Exception("Ext must be 3 characters or less")
> +        if len(filename) > 8:
> +            raise Exception("Name must be 8 characters or less")
> +
> +        for c in filename + ext:
> +            if c not in ALLOWED_FILE_CHARS:
> +                raise Exception("Invalid character in filename")
> +
> +        return self.add_direntry(directory_cluster, filename, ext, 0)
> diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> index 588f30a4f1..4053d29de4 100644
> --- a/tests/qemu-iotests/testenv.py
> +++ b/tests/qemu-iotests/testenv.py
> @@ -250,7 +250,7 @@ def __init__(self, source_dir: str, build_dir: str,
>          self.qemu_img_options = os.getenv('QEMU_IMG_OPTIONS')
>          self.qemu_nbd_options = os.getenv('QEMU_NBD_OPTIONS')
>  
> -        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg']
> +        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg', 'vvfat']
>          self.imgfmt_generic = 'true' if is_generic else 'false'
>  
>          self.qemu_io_options = f'--cache {self.cachemode} --aio {self.aiomode}'
> diff --git a/tests/qemu-iotests/tests/vvfat b/tests/qemu-iotests/tests/vvfat
> new file mode 100755
> index 0000000000..113d7d3270
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/vvfat
> @@ -0,0 +1,440 @@
> +#!/usr/bin/env python3
> +# group: rw vvfat
> +#
> +# Test vvfat driver implementation
> +# Here, we use a simple FAT16 implementation and check the behavior of the vvfat driver.
> +#
> +# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +import os
> +import shutil
> +import iotests
> +from iotests import imgfmt, QMPTestCase
> +from fat16 import MBR, Fat16, DIRENTRY_SIZE
> +
> +filesystem = os.path.join(iotests.test_dir, "filesystem")
> +
> +nbd_sock = iotests.file_path("nbd.sock", base_dir=iotests.sock_dir)
> +nbd_uri = "nbd+unix:///disk?socket=" + nbd_sock
> +
> +SECTOR_SIZE = 512
> +
> +
> +class TestVVFatDriver(QMPTestCase):
> +    def setUp(self) -> None:
> +        if os.path.exists(filesystem):
> +            if os.path.isdir(filesystem):
> +                shutil.rmtree(filesystem)
> +            else:
> +                print(f"Error: {filesystem} exists and is not a directory")
> +                exit(1)
> +        os.mkdir(filesystem)
> +
> +        # Add some text files to the filesystem
> +        for i in range(10):
> +            with open(os.path.join(filesystem, f"file{i}.txt"), "w") as f:
> +                f.write(f"Hello, world! {i}\n")
> +
> +        # Add 2 large files, above the cluster size (8KB)
> +        with open(os.path.join(filesystem, "large1.txt"), "wb") as f:
> +            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
> +            for i in range(8 * 2):  # two clusters
> +                f.write(bytes([0x41 + i] * 1024))
> +
> +        with open(os.path.join(filesystem, "large2.txt"), "wb") as f:
> +            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
> +            for i in range(8 * 3):  # 3 clusters
> +                f.write(bytes([0x41 + i] * 1024))
> +
> +        self.vm = iotests.VM()
> +
> +        self.vm.add_blockdev(
> +            self.vm.qmp_to_opts(
> +                {
> +                    "driver": imgfmt,
> +                    "node-name": "disk",
> +                    "rw": "true",
> +                    "fat-type": "16",
> +                    "dir": filesystem,
> +                }
> +            )
> +        )
> +
> +        self.vm.launch()
> +
> +        self.vm.qmp_log("block-dirty-bitmap-add", **{"node": "disk", "name": "bitmap0"})
> +
> +        # attach nbd server
> +        self.vm.qmp_log(
> +            "nbd-server-start",
> +            **{"addr": {"type": "unix", "data": {"path": nbd_sock}}},
> +            filters=[],
> +        )
> +
> +        self.vm.qmp_log(
> +            "nbd-server-add",
> +            **{"device": "disk", "writable": True, "bitmap": "bitmap0"},
> +        )
> +
> +        self.qio = iotests.QemuIoInteractive("-f", "raw", nbd_uri)
> +
> +    def tearDown(self) -> None:
> +        self.qio.close()
> +        self.vm.shutdown()
> +        # print(self.vm.get_log())
> +        shutil.rmtree(filesystem)
> +
> +    def read_sectors(self, sector: int, num: int = 1) -> bytes:
> +        """
> +        Read `num` sectors starting from `sector` from the `disk`.
> +        This uses `QemuIoInteractive` to read the sectors into `stdout` and then parses the output.
> +        """
> +        self.assertGreater(num, 0)
> +        # The output contains the content of the sector in hex dump format
> +        # We need to extract the content from it
> +        output = self.qio.cmd(f"read -v {sector * SECTOR_SIZE} {num * SECTOR_SIZE}")
> +        # Each row is 16 bytes long, and we are writing `num` sectors
> +        rows = num * SECTOR_SIZE // 16
> +        output_rows = output.split("\n")[:rows]
> +
> +        hex_content = "".join(
> +            [(row.split(": ")[1]).split("  ")[0] for row in output_rows]
> +        )
> +        bytes_content = bytes.fromhex(hex_content)
> +
> +        self.assertEqual(len(bytes_content), num * SECTOR_SIZE)
> +
> +        return bytes_content
> +
> +    def write_sectors(self, sector: int, data: bytes):
> +        """
> +        Write `data` to the `disk` starting from `sector`.
> +        This uses `QemuIoInteractive` to write the data into the disk.
> +        """
> +
> +        self.assertGreater(len(data), 0)
> +        self.assertEqual(len(data) % SECTOR_SIZE, 0)
> +
> +        temp_file = os.path.join(iotests.test_dir, "temp.bin")
> +        with open(temp_file, "wb") as f:
> +            f.write(data)
> +
> +        self.qio.cmd(f"write -s {temp_file} {sector * SECTOR_SIZE} {len(data)}")
> +
> +        os.remove(temp_file)
> +
> +    def init_fat16(self):
> +        mbr = MBR(self.read_sectors(0))
> +        return Fat16(
> +            mbr.partition_table[0]["start_lba"],
> +            mbr.partition_table[0]["size"],
> +            self.read_sectors,
> +            self.write_sectors,
> +        )
> +
> +    # Tests
> +
> +    def test_fat_filesystem(self):
> +        """
> +        Test that vvfat produces a valid MBR and FAT16 boot sector
> +        """
> +        mbr = MBR(self.read_sectors(0))
> +
> +        self.assertEqual(mbr.partition_table[0]["status"], 0x80)
> +        self.assertEqual(mbr.partition_table[0]["type"], 6)
> +
> +        fat16 = Fat16(
> +            mbr.partition_table[0]["start_lba"],
> +            mbr.partition_table[0]["size"],
> +            self.read_sectors,
> +            self.write_sectors,
> +        )
> +        self.assertEqual(fat16.boot_sector.bytes_per_sector, 512)
> +        self.assertEqual(fat16.boot_sector.volume_label, "QEMU VVFAT")
> +
> +    def test_read_root_directory(self):
> +        """
> +        Test the content of the root directory
> +        """
> +        fat16 = self.init_fat16()
> +
> +        root_dir = fat16.read_root_directory()
> +
> +        self.assertEqual(len(root_dir), 13)  # 12 + 1 special file
> +
> +        files = {
> +            "QEMU VVF.AT": 0,  # special empty file
> +            "FILE0.TXT": 16,
> +            "FILE1.TXT": 16,
> +            "FILE2.TXT": 16,
> +            "FILE3.TXT": 16,
> +            "FILE4.TXT": 16,
> +            "FILE5.TXT": 16,
> +            "FILE6.TXT": 16,
> +            "FILE7.TXT": 16,
> +            "FILE8.TXT": 16,
> +            "FILE9.TXT": 16,
> +            "LARGE1.TXT": 0x2000 * 2,
> +            "LARGE2.TXT": 0x2000 * 3,
> +        }
> +
> +        for entry in root_dir:
> +            self.assertIn(entry.whole_name(), files)
> +            self.assertEqual(entry.size_bytes, files[entry.whole_name()])
> +
> +    def test_direntry_as_bytes(self):
> +        """
> +        Test if we can convert a Direntry back to bytes, so that we can write it back to the disk safely.
> +        """
> +        fat16 = self.init_fat16()
> +
> +        root_dir = fat16.read_root_directory()
> +        first_entry_bytes = fat16.read_sectors(fat16.boot_sector.root_dir_start(), 1)
> +        # The first entry won't be deleted, so we can compare it with the first entry in the root directory
> +        self.assertEqual(root_dir[0].as_bytes(), first_entry_bytes[:DIRENTRY_SIZE])
> +
> +    def test_read_files(self):
> +        """
> +        Test reading the content of the files
> +        """
> +        fat16 = self.init_fat16()
> +
> +        for i in range(10):
> +            file = fat16.find_direntry(f"/FILE{i}.TXT")
> +            self.assertIsNotNone(file)
> +            self.assertEqual(
> +                fat16.read_file(file), f"Hello, world! {i}\n".encode("ascii")
> +            )
> +
> +        # test large files
> +        large1 = fat16.find_direntry("/LARGE1.TXT")
> +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> +            self.assertEqual(fat16.read_file(large1), f.read())
> +
> +        large2 = fat16.find_direntry("/LARGE2.TXT")
> +        self.assertIsNotNone(large2)
> +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> +            self.assertEqual(fat16.read_file(large2), f.read())
> +
> +    def test_write_file_same_content_direct(self):
> +        """
> +        Similar to `test_write_file_in_same_content`, but we write the file's clusters directly
> +        and thus we don't go through modifying the direntry.
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/FILE0.TXT")
> +        self.assertIsNotNone(file)
> +
> +        data = fat16.read_cluster(file.cluster)
> +        fat16.write_cluster(file.cluster, data)
> +
> +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> +            self.assertEqual(fat16.read_file(file), f.read())
> +
> +    def test_write_file_in_same_content(self):
> +        """
> +        Test writing the same content back to the file
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/FILE0.TXT")
> +        self.assertIsNotNone(file)
> +
> +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> +
> +        fat16.write_file(file, b"Hello, world! 0\n")
> +
> +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> +
> +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> +            self.assertEqual(f.read(), b"Hello, world! 0\n")
> +
> +    def test_modify_content_same_clusters(self):
> +        """
> +        Test modifying the content of the file without changing the number of clusters
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/FILE0.TXT")
> +        self.assertIsNotNone(file)
> +
> +        new_content = b"Hello, world! Modified\n"
> +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> +
> +        fat16.write_file(file, new_content)
> +
> +        self.assertEqual(fat16.read_file(file), new_content)
> +
> +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_truncate_file_same_clusters_less(self):
> +        """
> +        Test truncating the file without changing the number of clusters
> +        Test decreasing the file size
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/FILE0.TXT")
> +        self.assertIsNotNone(file)
> +
> +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> +
> +        fat16.truncate_file(file, 5)
> +
> +        new_content = fat16.read_file(file)
> +
> +        self.assertEqual(new_content, b"Hello")
> +
> +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_truncate_file_same_clusters_more(self):
> +        """
> +        Test truncating the file without changing the number of clusters
> +        Test increasing the file size
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/FILE0.TXT")
> +        self.assertIsNotNone(file)
> +
> +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> +
> +        fat16.truncate_file(file, 20)
> +
> +        new_content = fat16.read_file(file)
> +
> +        # a random pattern will be appended to the file, and it's not always the same
> +        self.assertEqual(new_content[:16], b"Hello, world! 0\n")
> +        self.assertEqual(len(new_content), 20)
> +
> +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_write_large_file(self):
> +        """
> +        Test writing a large file
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/LARGE1.TXT")
> +        self.assertIsNotNone(file)
> +
> +        # The content of LARGE1 is A * 1KB, B * 1KB, C * 1KB, ..., P * 1KB
> +        # Let's change it to be Z * 1KB, Y * 1KB, X * 1KB, ..., K * 1KB
> +        # without changing the number of clusters or the file size
> +        new_content = b"".join([bytes([0x5A - i] * 1024) for i in range(16)])
> +
> +        fat16.write_file(file, new_content)
> +
> +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_truncate_file_change_clusters_less(self):
> +        """
> +        Test truncating a file by reducing the number of clusters
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/LARGE1.TXT")
> +        self.assertIsNotNone(file)
> +
> +        fat16.truncate_file(file, 1)
> +
> +        self.assertEqual(fat16.read_file(file), b"A")
> +
> +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> +            self.assertEqual(f.read(), b"A")
> +
> +    def test_write_file_change_clusters_less(self):
> +        """
> +        Test writing to a file, reducing the number of clusters
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/LARGE2.TXT")
> +        self.assertIsNotNone(file)
> +
> +        new_content = b"Hello, world! This was a large file\n"
> +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024

This sets and then immediately overwrites new_content. What was intended
here?
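
One possible reading (an assumption on my part, not confirmed intent):
the first assignment is leftover and only the two-cluster payload was
meant, so that LARGE2.TXT shrinks from 3 clusters to 2.

    # hypothetical fix: drop the dead first assignment
    new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024  # exactly 2 clusters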

> +
> +        fat16.write_file(file, new_content)
> +
> +        self.assertEqual(fat16.read_file(file), new_content)
> +
> +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_write_file_change_clusters_more(self):
> +        """
> +        Test writing to a file, increasing the number of clusters
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/LARGE2.TXT")
> +        self.assertIsNotNone(file)
> +
> +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
> +
> +        fat16.write_file(file, new_content)
> +
> +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_write_file_change_clusters_more_non_last_file(self):
> +        """
> +        Test writing to a file, increasing the number of clusters.
> +        This is a special variant of the above test, where we write to
> +        a file so that the newly allocated clusters won't be contiguous
> +        """
> +        fat16 = self.init_fat16()
> +
> +        file = fat16.find_direntry("/LARGE1.TXT")
> +        self.assertIsNotNone(file)
> +
> +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
> +
> +        fat16.write_file(file, new_content)
> +
> +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    def test_create_file(self):
> +        """
> +        Test creating a new file
> +        """
> +        fat16 = self.init_fat16()
> +
> +        new_file = fat16.create_file("/NEWFILE.TXT")
> +
> +        self.assertIsNotNone(new_file)
> +        self.assertEqual(new_file.size_bytes, 0)
> +
> +        new_content = b"Hello, world! New file\n"
> +        fat16.write_file(new_file, new_content)
> +
> +        self.assertEqual(fat16.read_file(new_file), new_content)
> +
> +        with open(os.path.join(filesystem, "newfile.txt"), "rb") as f:
> +            self.assertEqual(f.read(), new_content)
> +
> +    # TODO: support deleting files
> +
> +
> +if __name__ == "__main__":
> +    # This test is specific to the vvfat driver
> +    iotests.main(supported_fmts=["vvfat"], supported_protocols=["file"])
> diff --git a/tests/qemu-iotests/tests/vvfat.out b/tests/qemu-iotests/tests/vvfat.out
> new file mode 100755
> index 0000000000..96961ed0b5
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/vvfat.out
> @@ -0,0 +1,5 @@
> +...............
> +----------------------------------------------------------------------
> +Ran 15 tests
> +
> +OK

With the updated test, I can catch the problems that are fixed by
patches 1 and 2, but it still doesn't need patch 3 to pass.
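
For illustration, a sketch of a follow-up test that might force the
read path through a fragmented chain (hypothetical; it assumes that
fat16.read_file, going through the NBD export, exercises vvfat's read
of non-contiguous clusters, which may not hold):

    def test_read_non_contiguous_clusters(self):
        fat16 = self.init_fat16()
        file = fat16.find_direntry("/LARGE1.TXT")
        self.assertIsNotNone(file)
        # growing LARGE1.TXT while LARGE2.TXT sits right behind it forces
        # the appended third cluster to be allocated out of order
        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
        fat16.write_file(file, new_content)
        # read back through the export instead of the host file
        self.assertEqual(fat16.read_file(file), new_content)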

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/4] iotests: Add `vvfat` tests
  2024-06-10 12:01   ` Kevin Wolf
@ 2024-06-10 14:11     ` Amjad Alsharafi
  2024-06-10 16:50       ` Kevin Wolf
  0 siblings, 1 reply; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-10 14:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

On Mon, Jun 10, 2024 at 02:01:24PM +0200, Kevin Wolf wrote:
> On 05.06.2024 at 02:58, Amjad Alsharafi wrote:
> > Added several tests to verify the implementation of the vvfat driver.
> > 
> > We needed a way to interact with it, so we created a basic `fat16.py` driver that handles writing correct sectors for us.
> > 
> > Added `vvfat` to the non-generic formats, as it's not a normal image format.
> > 
> > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> > ---
> >  tests/qemu-iotests/check           |   2 +-
> >  tests/qemu-iotests/fat16.py        | 635 +++++++++++++++++++++++++++++
> >  tests/qemu-iotests/testenv.py      |   2 +-
> >  tests/qemu-iotests/tests/vvfat     | 440 ++++++++++++++++++++
> >  tests/qemu-iotests/tests/vvfat.out |   5 +
> >  5 files changed, 1082 insertions(+), 2 deletions(-)
> >  create mode 100644 tests/qemu-iotests/fat16.py
> >  create mode 100755 tests/qemu-iotests/tests/vvfat
> >  create mode 100755 tests/qemu-iotests/tests/vvfat.out
> > 
> > diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> > index 56d88ca423..545f9ec7bd 100755
> > --- a/tests/qemu-iotests/check
> > +++ b/tests/qemu-iotests/check
> > @@ -84,7 +84,7 @@ def make_argparser() -> argparse.ArgumentParser:
> >      p.set_defaults(imgfmt='raw', imgproto='file')
> >  
> >      format_list = ['raw', 'bochs', 'cloop', 'parallels', 'qcow', 'qcow2',
> > -                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg']
> > +                   'qed', 'vdi', 'vpc', 'vhdx', 'vmdk', 'luks', 'dmg', 'vvfat']
> >      g_fmt = p.add_argument_group(
> >          '  image format options',
> >          'The following options set the IMGFMT environment variable. '
> > diff --git a/tests/qemu-iotests/fat16.py b/tests/qemu-iotests/fat16.py
> > new file mode 100644
> > index 0000000000..baf801b4d5
> > --- /dev/null
> > +++ b/tests/qemu-iotests/fat16.py
> > @@ -0,0 +1,635 @@
> > +# A simple FAT16 driver that is used to test the `vvfat` driver in QEMU.
> > +#
> > +# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
> > +#
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 2 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +from typing import List
> > +import string
> > +
> > +SECTOR_SIZE = 512
> > +DIRENTRY_SIZE = 32
> > +ALLOWED_FILE_CHARS = \
> > +    set("!#$%&'()-@^_`{}~" + string.digits + string.ascii_uppercase)
> > +
> > +
> > +class MBR:
> > +    def __init__(self, data: bytes):
> > +        assert len(data) == 512
> > +        self.partition_table = []
> > +        for i in range(4):
> > +            partition = data[446 + i * 16 : 446 + (i + 1) * 16]
> > +            self.partition_table.append(
> > +                {
> > +                    "status": partition[0],
> > +                    "start_head": partition[1],
> > +                    "start_sector": partition[2] & 0x3F,
> > +                    "start_cylinder":
> > +                        ((partition[2] & 0xC0) << 2) | partition[3],
> > +                    "type": partition[4],
> > +                    "end_head": partition[5],
> > +                    "end_sector": partition[6] & 0x3F,
> > +                    "end_cylinder":
> > +                        ((partition[6] & 0xC0) << 2) | partition[7],
> > +                    "start_lba": int.from_bytes(partition[8:12], "little"),
> > +                    "size": int.from_bytes(partition[12:16], "little"),
> > +                }
> > +            )
> > +
> > +    def __str__(self):
> > +        return "\n".join(
> > +            [f"{i}: {partition}"
> > +                for i, partition in enumerate(self.partition_table)]
> > +        )
> > +
> > +
> > +class FatBootSector:
> > +    def __init__(self, data: bytes):
> > +        assert len(data) == 512
> > +        self.bytes_per_sector = int.from_bytes(data[11:13], "little")
> > +        self.sectors_per_cluster = data[13]
> > +        self.reserved_sectors = int.from_bytes(data[14:16], "little")
> > +        self.fat_count = data[16]
> > +        self.root_entries = int.from_bytes(data[17:19], "little")
> > +        self.media_descriptor = data[21]
> > +        self.fat_size = int.from_bytes(data[22:24], "little")
> > +        self.sectors_per_fat = int.from_bytes(data[22:24], "little")
> 
> Why two different attributes self.fat_size and self.sectors_per_fat that
> contain the same value?
> 
> > +        self.sectors_per_track = int.from_bytes(data[24:26], "little")
> > +        self.heads = int.from_bytes(data[26:28], "little")
> > +        self.hidden_sectors = int.from_bytes(data[28:32], "little")
> > +        self.total_sectors = int.from_bytes(data[32:36], "little")
> 
> This value should only be considered if the 16 bit value at byte 19
> (which you don't store at all) was 0.
> 
> > +        self.drive_number = data[36]
> > +        self.volume_id = int.from_bytes(data[39:43], "little")
> > +        self.volume_label = data[43:54].decode("ascii").strip()
> > +        self.fs_type = data[54:62].decode("ascii").strip()
> > +
> > +    def root_dir_start(self):
> > +        """
> > +        Calculate the start sector of the root directory.
> > +        """
> > +        return self.reserved_sectors + self.fat_count * self.sectors_per_fat
> > +
> > +    def root_dir_size(self):
> > +        """
> > +        Calculate the size of the root directory in sectors.
> > +        """
> > +        return (
> > +            self.root_entries * DIRENTRY_SIZE + self.bytes_per_sector - 1
> > +        ) // self.bytes_per_sector
> > +
> > +    def data_sector_start(self):
> > +        """
> > +        Calculate the start sector of the data region.
> > +        """
> > +        return self.root_dir_start() + self.root_dir_size()
> > +
> > +    def first_sector_of_cluster(self, cluster: int):
> > +        """
> > +        Calculate the first sector of the given cluster.
> > +        """
> > +        return self.data_sector_start() \
> > +                + (cluster - 2) * self.sectors_per_cluster
> > +
> > +    def cluster_bytes(self):
> > +        """
> > +        Calculate the number of bytes in a cluster.
> > +        """
> > +        return self.bytes_per_sector * self.sectors_per_cluster
> > +
> > +    def __str__(self):
> > +        return (
> > +            f"Bytes per sector: {self.bytes_per_sector}\n"
> > +            f"Sectors per cluster: {self.sectors_per_cluster}\n"
> > +            f"Reserved sectors: {self.reserved_sectors}\n"
> > +            f"FAT count: {self.fat_count}\n"
> > +            f"Root entries: {self.root_entries}\n"
> > +            f"Total sectors: {self.total_sectors}\n"
> > +            f"Media descriptor: {self.media_descriptor}\n"
> > +            f"Sectors per FAT: {self.sectors_per_fat}\n"
> > +            f"Sectors per track: {self.sectors_per_track}\n"
> > +            f"Heads: {self.heads}\n"
> > +            f"Hidden sectors: {self.hidden_sectors}\n"
> > +            f"Drive number: {self.drive_number}\n"
> > +            f"Volume ID: {self.volume_id}\n"
> > +            f"Volume label: {self.volume_label}\n"
> > +            f"FS type: {self.fs_type}\n"
> > +        )
> > +
> > +
> > +class FatDirectoryEntry:
> > +    def __init__(self, data: bytes, sector: int, offset: int):
> > +        self.name = data[0:8].decode("ascii").strip()
> > +        self.ext = data[8:11].decode("ascii").strip()
> > +        self.attributes = data[11]
> > +        self.reserved = data[12]
> > +        self.create_time_tenth = data[13]
> > +        self.create_time = int.from_bytes(data[14:16], "little")
> > +        self.create_date = int.from_bytes(data[16:18], "little")
> > +        self.last_access_date = int.from_bytes(data[18:20], "little")
> > +        high_cluster = int.from_bytes(data[20:22], "little")
> > +        self.last_mod_time = int.from_bytes(data[22:24], "little")
> > +        self.last_mod_date = int.from_bytes(data[24:26], "little")
> > +        low_cluster = int.from_bytes(data[26:28], "little")
> > +        self.cluster = (high_cluster << 16) | low_cluster
> > +        self.size_bytes = int.from_bytes(data[28:32], "little")
> > +
> > +        # extra (to help write back to disk)
> > +        self.sector = sector
> > +        self.offset = offset
> > +
> > +    def as_bytes(self) -> bytes:
> > +        return (
> > +            self.name.ljust(8, " ").encode("ascii")
> > +            + self.ext.ljust(3, " ").encode("ascii")
> > +            + self.attributes.to_bytes(1, "little")
> > +            + self.reserved.to_bytes(1, "little")
> > +            + self.create_time_tenth.to_bytes(1, "little")
> > +            + self.create_time.to_bytes(2, "little")
> > +            + self.create_date.to_bytes(2, "little")
> > +            + self.last_access_date.to_bytes(2, "little")
> > +            + (self.cluster >> 16).to_bytes(2, "little")
> > +            + self.last_mod_time.to_bytes(2, "little")
> > +            + self.last_mod_date.to_bytes(2, "little")
> > +            + (self.cluster & 0xFFFF).to_bytes(2, "little")
> > +            + self.size_bytes.to_bytes(4, "little")
> > +        )
> > +
> > +    def whole_name(self):
> > +        if self.ext:
> > +            return f"{self.name}.{self.ext}"
> > +        else:
> > +            return self.name
> > +
> > +    def __str__(self):
> > +        return (
> > +            f"Name: {self.name}\n"
> > +            f"Ext: {self.ext}\n"
> > +            f"Attributes: {self.attributes}\n"
> > +            f"Reserved: {self.reserved}\n"
> > +            f"Create time tenth: {self.create_time_tenth}\n"
> > +            f"Create time: {self.create_time}\n"
> > +            f"Create date: {self.create_date}\n"
> > +            f"Last access date: {self.last_access_date}\n"
> > +            f"Last mod time: {self.last_mod_time}\n"
> > +            f"Last mod date: {self.last_mod_date}\n"
> > +            f"Cluster: {self.cluster}\n"
> > +            f"Size: {self.size_bytes}\n"
> > +        )
> > +
> > +    def __repr__(self):
> > +        # convert to dict
> > +        return str(vars(self))
> > +
> > +
> > +class Fat16:
> > +    def __init__(
> > +        self,
> > +        start_sector: int,
> > +        size: int,
> > +        sector_reader: callable,
> > +        sector_writer: callable,
> > +    ):
> > +        self.start_sector = start_sector
> > +        self.size_in_sectors = size
> > +        self.sector_reader = sector_reader
> > +        self.sector_writer = sector_writer
> > +
> > +        self.boot_sector = FatBootSector(self.sector_reader(start_sector))
> > +
> > +        fat_size_in_sectors = \
> > +            self.boot_sector.fat_size * self.boot_sector.fat_count
> > +        self.fats = self.read_sectors(
> > +            self.boot_sector.reserved_sectors, fat_size_in_sectors
> > +        )
> > +        self.fats_dirty_sectors = set()
> > +
> > +    def read_sectors(self, start_sector: int, num_sectors: int) -> bytes:
> > +        return self.sector_reader(start_sector + self.start_sector, num_sectors)
> > +
> > +    def write_sectors(self, start_sector: int, data: bytes):
> > +        return self.sector_writer(start_sector + self.start_sector, data)
> > +
> > +    def directory_from_bytes(
> > +        self, data: bytes, start_sector: int
> > +    ) -> List[FatDirectoryEntry]:
> > +        """
> > +        Convert `bytes` into a list of `FatDirectoryEntry` objects.
> > +        Will ignore long file names.
> > +        Will stop when it encounters a 0x00 byte.
> > +        """
> > +
> > +        entries = []
> > +        for i in range(0, len(data), DIRENTRY_SIZE):
> > +            entry = data[i : i + DIRENTRY_SIZE]
> > +
> > +            current_sector = start_sector + (i // SECTOR_SIZE)
> > +            current_offset = i % SECTOR_SIZE
> > +
> > +            if entry[0] == 0:
> > +                break
> > +            elif entry[0] == 0xE5:
> > +                # Deleted file
> > +                continue
> > +
> > +            if entry[11] & 0xF == 0xF:
> > +                # Long file name
> > +                continue
> > +
> > +            entries.append(
> > +                FatDirectoryEntry(entry, current_sector, current_offset))
> > +        return entries
> > +
> > +    def read_root_directory(self) -> List[FatDirectoryEntry]:
> > +        root_dir = self.read_sectors(
> > +            self.boot_sector.root_dir_start(), self.boot_sector.root_dir_size()
> > +        )
> > +        return self.directory_from_bytes(root_dir,
> > +                                         self.boot_sector.root_dir_start())
> > +
> > +    def read_fat_entry(self, cluster: int) -> int:
> > +        """
> > +        Read the FAT entry for the given cluster.
> > +        """
> > +        fat_offset = cluster * 2  # FAT16
> > +        return int.from_bytes(self.fats[fat_offset : fat_offset + 2], "little")
> > +
> > +    def write_fat_entry(self, cluster: int, value: int):
> > +        """
> > +        Write the FAT entry for the given cluster.
> > +        """
> > +        fat_offset = cluster * 2
> > +        self.fats = (
> > +            self.fats[:fat_offset]
> > +            + value.to_bytes(2, "little")
> > +            + self.fats[fat_offset + 2 :]
> > +        )
> > +        self.fats_dirty_sectors.add(fat_offset // SECTOR_SIZE)
> > +
> > +    def flush_fats(self):
> > +        """
> > +        Write the FATs back to the disk.
> > +        """
> > +        for sector in self.fats_dirty_sectors:
> > +            data = self.fats[sector * SECTOR_SIZE : (sector + 1) * SECTOR_SIZE]
> > +            sector = self.boot_sector.reserved_sectors + sector
> > +            self.write_sectors(sector, data)
> > +        self.fats_dirty_sectors = set()
> > +
> > +    def next_cluster(self, cluster: int) -> int | None:
> > +        """
> > +        Get the next cluster in the chain.
> > +        If it's `None`, then it's the last cluster.
> > +        The function will raise an exception if the next cluster
> > +        is `FREE` (unexpected) or an invalid entry.
> > +        """
> > +        fat_entry = self.read_fat_entry(cluster)
> > +        if fat_entry == 0:
> > +            raise Exception("Unexpected: FREE cluster")
> > +        elif fat_entry == 1:
> > +            raise Exception("Unexpected: RESERVED cluster")
> > +        elif fat_entry >= 0xFFF8:
> > +            return None
> > +        elif fat_entry >= 0xFFF7:
> > +            raise Exception("Invalid FAT entry")
> > +        else:
> > +            return fat_entry
> > +
> > +    def next_free_cluster(self) -> int:
> > +        """
> > +        Find the next free cluster.
> > +        """
> > +        # simple linear search
> > +        for i in range(2, 0xFFFF):
> > +            if self.read_fat_entry(i) == 0:
> > +                return i
> > +        raise Exception("No free clusters")
> > +
> > +    def read_cluster(self, cluster: int) -> bytes:
> > +        """
> > +        Read the data of the given cluster.
> > +        """
> > +        return self.read_sectors(
> > +            self.boot_sector.first_sector_of_cluster(cluster),
> > +            self.boot_sector.sectors_per_cluster,
> > +        )
> > +
> > +    def write_cluster(self, cluster: int, data: bytes):
> > +        """
> > +        Write the data to the given cluster.
> > +        """
> > +        assert len(data) == self.boot_sector.cluster_bytes()
> > +        return self.write_sectors(
> > +            self.boot_sector.first_sector_of_cluster(cluster),
> > +            data,
> > +        )
> > +
> > +    def read_directory(self, cluster: int) -> List[FatDirectoryEntry]:
> > +        """
> > +        Read the directory at the given cluster.
> > +        """
> > +        entries = []
> > +        while cluster is not None:
> > +            data = self.read_cluster(cluster)
> > +            entries.extend(
> > +                self.directory_from_bytes(
> > +                    data, self.boot_sector.first_sector_of_cluster(cluster)
> > +                )
> > +            )
> > +            cluster = self.next_cluster(cluster)
> > +        return entries
> > +
> > +    def add_direntry(self,
> > +                     cluster: int | None,
> > +                     name: str, ext: str,
> > +                     attributes: int):
> > +        """
> > +        Add a new directory entry to the given cluster.
> > +        If the cluster is `None`, then it will be added to the root directory.
> > +        """
> > +
> > +        def find_free_entry(data: bytes):
> > +            for i in range(0, len(data), DIRENTRY_SIZE):
> > +                entry = data[i : i + DIRENTRY_SIZE]
> > +                if entry[0] == 0 or entry[0] == 0xE5:
> > +                    return i
> > +            return None
> > +
> > +        assert len(name) <= 8, "Name must be 8 characters or less"
> > +        assert len(ext) <= 3, "Ext must be 3 characters or less"
> > +        assert attributes % 0x15 != 0x15, "Invalid attributes"
> > +
> > +        # initial dummy data
> > +        new_entry = FatDirectoryEntry(b"\0" * 32, 0, 0)
> > +        new_entry.name = name.ljust(8, " ")
> > +        new_entry.ext = ext.ljust(3, " ")
> > +        new_entry.attributes = attributes
> > +        new_entry.reserved = 0
> > +        new_entry.create_time_tenth = 0
> > +        new_entry.create_time = 0
> > +        new_entry.create_date = 0
> > +        new_entry.last_access_date = 0
> > +        new_entry.last_mod_time = 0
> > +        new_entry.last_mod_date = 0
> > +        new_entry.cluster = self.next_free_cluster()
> > +        new_entry.size_bytes = 0
> > +
> > +        # mark as EOF
> > +        self.write_fat_entry(new_entry.cluster, 0xFFFF)
> > +
> > +        if cluster is None:
> > +            for i in range(self.boot_sector.root_dir_size()):
> > +                sector_data = self.read_sectors(
> > +                    self.boot_sector.root_dir_start() + i, 1
> > +                )
> > +                offset = find_free_entry(sector_data)
> > +                if offset is not None:
> > +                    new_entry.sector = self.boot_sector.root_dir_start() + i
> > +                    new_entry.offset = offset
> > +                    self.update_direntry(new_entry)
> > +                    return new_entry
> > +        else:
> > +            while cluster is not None:
> > +                data = self.read_cluster(cluster)
> > +                offset = find_free_entry(data)
> > +                if offset is not None:
> > +                    new_entry.sector = self.boot_sector.first_sector_of_cluster(
> > +                        cluster
> > +                    ) + (offset // SECTOR_SIZE)
> > +                    new_entry.offset = offset % SECTOR_SIZE
> > +                    self.update_direntry(new_entry)
> > +                    return new_entry
> > +                cluster = self.next_cluster(cluster)
> > +
> > +        raise Exception("No free directory entries")
> > +
> > +    def update_direntry(self, entry: FatDirectoryEntry):
> > +        """
> > +        Write the directory entry back to the disk.
> > +        """
> > +        sector = self.read_sectors(entry.sector, 1)
> > +        sector = (
> > +            sector[: entry.offset]
> > +            + entry.as_bytes()
> > +            + sector[entry.offset + DIRENTRY_SIZE :]
> > +        )
> > +        self.write_sectors(entry.sector, sector)
> > +
> > +    def find_direntry(self, path: str) -> FatDirectoryEntry | None:
> > +        """
> > +        Find the directory entry for the given path.
> > +        """
> > +        assert path[0] == "/", "Path must start with /"
> > +
> > +        path = path[1:]  # remove the leading /
> > +        parts = path.split("/")
> > +        directory = self.read_root_directory()
> > +
> > +        for i, part in enumerate(parts):
> > +            is_last = i == len(parts) - 1
> > +            # Reset for each path component, so a miss at a deeper
> > +            # level is not masked by a match from a previous iteration.
> > +            current_entry = None
> > +
> > +            for entry in directory:
> > +                if entry.whole_name() == part:
> > +                    current_entry = entry
> > +                    break
> > +            if current_entry is None:
> > +                return None
> > +
> > +            if is_last:
> > +                return current_entry
> > +            else:
> > +                if current_entry.attributes & 0x10 == 0:
> > +                    raise Exception(
> > +                        f"{current_entry.whole_name()} is not a directory")
> > +                else:
> > +                    directory = self.read_directory(current_entry.cluster)
> > +
> > +    def read_file(self, entry: FatDirectoryEntry) -> bytes:
> > +        """
> > +        Read the content of the file for the given directory entry.
> > +        """
> > +        if entry is None:
> > +            return None
> > +        if entry.attributes & 0x10 != 0:
> > +            raise Exception(f"{entry.whole_name()} is a directory")
> > +
> > +        data = b""
> > +        cluster = entry.cluster
> > +        while cluster is not None and len(data) <= entry.size_bytes:
> > +            data += self.read_cluster(cluster)
> > +            cluster = self.next_cluster(cluster)
> > +        return data[: entry.size_bytes]
> > +
> > +    def truncate_file(self, entry: FatDirectoryEntry, new_size: int):
> > +        """
> > +        Truncate the file for the given directory entry to the new size.
> > +        """
> > +        if entry is None:
> > +            raise Exception("entry is None")
> > +        if entry.attributes & 0x10 != 0:
> > +            raise Exception(f"{entry.whole_name()} is a directory")
> > +
> > +        def clusters_from_size(size: int):
> > +            return (
> > +                size + self.boot_sector.cluster_bytes() - 1
> > +            ) // self.boot_sector.cluster_bytes()
> > +
> > +        # First, allocate new FATs if we need to
> > +        required_clusters = clusters_from_size(new_size)
> > +        current_clusters = clusters_from_size(entry.size_bytes)
> > +
> > +        affected_clusters = set()
> > +
> > +        # Keep at least one cluster, easier to manage this way
> > +        if required_clusters == 0:
> > +            required_clusters = 1
> > +        if current_clusters == 0:
> > +            current_clusters = 1
> > +
> > +        if required_clusters > current_clusters:
> > +            # Allocate new clusters
> > +            cluster = entry.cluster
> > +            to_add = required_clusters
> > +            for _ in range(current_clusters - 1):
> > +                to_add -= 1
> > +                cluster = self.next_cluster(cluster)
> > +            assert required_clusters > 0, "No new clusters to allocate"
> > +            assert cluster is not None, "Cluster is None"
> > +            assert self.next_cluster(cluster) is None, \
> > +                   "Cluster is not the last cluster"
> > +
> > +            # Allocate new clusters
> > +            for _ in range(to_add - 1):
> > +                new_cluster = self.next_free_cluster()
> > +                self.write_fat_entry(cluster, new_cluster)
> > +                self.write_fat_entry(new_cluster, 0xFFFF)
> > +                cluster = new_cluster
> > +
> > +        elif required_clusters < current_clusters:
> > +            # Truncate the file
> > +            cluster = entry.cluster
> > +            for _ in range(required_clusters - 1):
> > +                cluster = self.next_cluster(cluster)
> > +            assert cluster is not None, "Cluster is None"
> > +
> > +            next_cluster = self.next_cluster(cluster)
> > +            # mark last as EOF
> > +            self.write_fat_entry(cluster, 0xFFFF)
> > +            # free the rest
> > +            while next_cluster is not None:
> > +                cluster = next_cluster
> > +                next_cluster = self.next_cluster(next_cluster)
> > +                self.write_fat_entry(cluster, 0)
> > +
> > +        self.flush_fats()
> > +
> > +        # verify number of clusters
> > +        cluster = entry.cluster
> > +        count = 0
> > +        while cluster is not None:
> > +            count += 1
> > +            affected_clusters.add(cluster)
> > +            cluster = self.next_cluster(cluster)
> > +        assert (
> > +            count == required_clusters
> > +        ), f"Expected {required_clusters} clusters, got {count}"
> > +
> > +        # update the size
> > +        entry.size_bytes = new_size
> > +        self.update_direntry(entry)
> > +
> > +        # trigger every affected cluster
> > +        for cluster in affected_clusters:
> > +            first_sector = self.boot_sector.first_sector_of_cluster(cluster)
> > +            first_sector_data = self.read_sectors(first_sector, 1)
> > +            self.write_sectors(first_sector, first_sector_data)
> > +
> > +    def write_file(self, entry: FatDirectoryEntry, data: bytes):
> > +        """
> > +        Write the content of the file for the given directory entry.
> > +        """
> > +        if entry is None:
> > +            raise Exception("entry is None")
> > +        if entry.attributes & 0x10 != 0:
> > +            raise Exception(f"{entry.whole_name()} is a directory")
> > +
> > +        data_len = len(data)
> > +
> > +        self.truncate_file(entry, data_len)
> > +
> > +        cluster = entry.cluster
> > +        while cluster is not None:
> > +            data_to_write = data[: self.boot_sector.cluster_bytes()]
> > +            last_data = False
> > +            if len(data_to_write) < self.boot_sector.cluster_bytes():
> > +                last_data = True
> > +                old_data = self.read_cluster(cluster)
> > +                data_to_write += old_data[len(data_to_write) :]
> > +
> > +            self.write_cluster(cluster, data_to_write)
> > +            data = data[self.boot_sector.cluster_bytes() :]
> > +            if len(data) == 0:
> > +                break
> > +            cluster = self.next_cluster(cluster)
> > +
> > +        assert len(data) == 0, \
> > +               "Data was not written completely, clusters missing"
> > +
> > +    def create_file(self, path: str):
> > +        """
> > +        Create a new file at the given path.
> > +        """
> > +        assert path[0] == "/", "Path must start with /"
> > +
> > +        path = path[1:]  # remove the leading /
> > +
> > +        parts = path.split("/")
> > +
> > +        directory_cluster = None
> > +        directory = self.read_root_directory()
> > +
> > +        parts, filename = parts[:-1], parts[-1]
> > +
> > +        for i, part in enumerate(parts):
> > +            current_entry = None
> > +            for entry in directory:
> > +                if entry.whole_name() == part:
> > +                    current_entry = entry
> > +                    break
> > +            if current_entry is None:
> > +                return None
> > +
> > +            if current_entry.attributes & 0x10 == 0:
> > +                raise Exception(
> > +                    f"{current_entry.whole_name()} is not a directory")
> > +            else:
> > +                directory = self.read_directory(current_entry.cluster)
> > +                directory_cluster = current_entry.cluster
> > +
> > +        # add new entry to the directory
> > +
> > +        filename, ext = filename.split(".")
> > +
> > +        if len(ext) > 3:
> > +            raise Exception("Ext must be 3 characters or less")
> > +        if len(filename) > 8:
> > +            raise Exception("Name must be 8 characters or less")
> > +
> > +        for c in filename + ext:
> > +
> > +            if c not in ALLOWED_FILE_CHARS:
> > +                raise Exception("Invalid character in filename")
> > +
> > +        return self.add_direntry(directory_cluster, filename, ext, 0)
> > diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> > index 588f30a4f1..4053d29de4 100644
> > --- a/tests/qemu-iotests/testenv.py
> > +++ b/tests/qemu-iotests/testenv.py
> > @@ -250,7 +250,7 @@ def __init__(self, source_dir: str, build_dir: str,
> >          self.qemu_img_options = os.getenv('QEMU_IMG_OPTIONS')
> >          self.qemu_nbd_options = os.getenv('QEMU_NBD_OPTIONS')
> >  
> > -        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg']
> > +        is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg', 'vvfat']
> >          self.imgfmt_generic = 'true' if is_generic else 'false'
> >  
> >          self.qemu_io_options = f'--cache {self.cachemode} --aio {self.aiomode}'
> > diff --git a/tests/qemu-iotests/tests/vvfat b/tests/qemu-iotests/tests/vvfat
> > new file mode 100755
> > index 0000000000..113d7d3270
> > --- /dev/null
> > +++ b/tests/qemu-iotests/tests/vvfat
> > @@ -0,0 +1,440 @@
> > +#!/usr/bin/env python3
> > +# group: rw vvfat
> > +#
> > +# Test vvfat driver implementation
> > +# Here, we use a simple FAT16 implementation and check the behavior of the vvfat driver.
> > +#
> > +# Copyright (C) 2024 Amjad Alsharafi <amjadsharafi10@gmail.com>
> > +#
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 2 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +import os, shutil
> > +import iotests
> > +from iotests import imgfmt, QMPTestCase
> > +from fat16 import MBR, Fat16, DIRENTRY_SIZE
> > +
> > +filesystem = os.path.join(iotests.test_dir, "filesystem")
> > +
> > +nbd_sock = iotests.file_path("nbd.sock", base_dir=iotests.sock_dir)
> > +nbd_uri = "nbd+unix:///disk?socket=" + nbd_sock
> > +
> > +SECTOR_SIZE = 512
> > +
> > +
> > +class TestVVFatDriver(QMPTestCase):
> > +    def setUp(self) -> None:
> > +        if os.path.exists(filesystem):
> > +            if os.path.isdir(filesystem):
> > +                shutil.rmtree(filesystem)
> > +            else:
> > +                print(f"Error: {filesystem} exists and is not a directory")
> > +                exit(1)
> > +        os.mkdir(filesystem)
> > +
> > +        # Add some text files to the filesystem
> > +        for i in range(10):
> > +            with open(os.path.join(filesystem, f"file{i}.txt"), "w") as f:
> > +                f.write(f"Hello, world! {i}\n")
> > +
> > +        # Add 2 large files, above the cluster size (8KB)
> > +        with open(os.path.join(filesystem, "large1.txt"), "wb") as f:
> > +            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
> > +            for i in range(8 * 2):  # two clusters
> > +                f.write(bytes([0x41 + i] * 1024))
> > +
> > +        with open(os.path.join(filesystem, "large2.txt"), "wb") as f:
> > +            # write 'A' * 1KB, 'B' * 1KB, 'C' * 1KB, ...
> > +            for i in range(8 * 3):  # 3 clusters
> > +                f.write(bytes([0x41 + i] * 1024))
> > +
> > +        self.vm = iotests.VM()
> > +
> > +        self.vm.add_blockdev(
> > +            self.vm.qmp_to_opts(
> > +                {
> > +                    "driver": imgfmt,
> > +                    "node-name": "disk",
> > +                    "rw": "true",
> > +                    "fat-type": "16",
> > +                    "dir": filesystem,
> > +                }
> > +            )
> > +        )
> > +
> > +        self.vm.launch()
> > +
> > +        self.vm.qmp_log("block-dirty-bitmap-add", **{"node": "disk", "name": "bitmap0"})
> > +
> > +        # attach nbd server
> > +        self.vm.qmp_log(
> > +            "nbd-server-start",
> > +            **{"addr": {"type": "unix", "data": {"path": nbd_sock}}},
> > +            filters=[],
> > +        )
> > +
> > +        self.vm.qmp_log(
> > +            "nbd-server-add",
> > +            **{"device": "disk", "writable": True, "bitmap": "bitmap0"},
> > +        )
> > +
> > +        self.qio = iotests.QemuIoInteractive("-f", "raw", nbd_uri)
> > +
> > +    def tearDown(self) -> None:
> > +        self.qio.close()
> > +        self.vm.shutdown()
> > +        # print(self.vm.get_log())
> > +        shutil.rmtree(filesystem)
> > +
> > +    def read_sectors(self, sector: int, num: int = 1) -> bytes:
> > +        """
> > +        Read `num` sectors starting from `sector` from the `disk`.
> > +        This uses `QemuIoInteractive` to read the sectors into `stdout` and then parse the output.
> > +        """
> > +        self.assertGreater(num, 0)
> > +        # The output contains the content of the sector in hex dump format
> > +        # We need to extract the content from it
> > +        output = self.qio.cmd(f"read -v {sector * SECTOR_SIZE} {num * SECTOR_SIZE}")
> > +        # Each row is 16 bytes long, and we are writing `num` sectors
> > +        rows = num * SECTOR_SIZE // 16
> > +        output_rows = output.split("\n")[:rows]
> > +
> > +        hex_content = "".join(
> > +            [(row.split(": ")[1]).split("  ")[0] for row in output_rows]
> > +        )
> > +        bytes_content = bytes.fromhex(hex_content)
> > +
> > +        self.assertEqual(len(bytes_content), num * SECTOR_SIZE)
> > +
> > +        return bytes_content
> > +
> > +    def write_sectors(self, sector: int, data: bytes):
> > +        """
> > +        Write `data` to the `disk` starting from `sector`.
> > +        This uses `QemuIoInteractive` to write the data into the disk.
> > +        """
> > +
> > +        self.assertGreater(len(data), 0)
> > +        self.assertEqual(len(data) % SECTOR_SIZE, 0)
> > +
> > +        temp_file = os.path.join(iotests.test_dir, "temp.bin")
> > +        with open(temp_file, "wb") as f:
> > +            f.write(data)
> > +
> > +        self.qio.cmd(f"write -s {temp_file} {sector * SECTOR_SIZE} {len(data)}")
> > +
> > +        os.remove(temp_file)
> > +
> > +    def init_fat16(self):
> > +        mbr = MBR(self.read_sectors(0))
> > +        return Fat16(
> > +            mbr.partition_table[0]["start_lba"],
> > +            mbr.partition_table[0]["size"],
> > +            self.read_sectors,
> > +            self.write_sectors,
> > +        )
> > +
> > +    # Tests
> > +
> > +    def test_fat_filesystem(self):
> > +        """
> > +        Test that vvfat produces a valid MBR and FAT16 boot sector
> > +        """
> > +        mbr = MBR(self.read_sectors(0))
> > +
> > +        self.assertEqual(mbr.partition_table[0]["status"], 0x80)
> > +        self.assertEqual(mbr.partition_table[0]["type"], 6)
> > +
> > +        fat16 = Fat16(
> > +            mbr.partition_table[0]["start_lba"],
> > +            mbr.partition_table[0]["size"],
> > +            self.read_sectors,
> > +            self.write_sectors,
> > +        )
> > +        self.assertEqual(fat16.boot_sector.bytes_per_sector, 512)
> > +        self.assertEqual(fat16.boot_sector.volume_label, "QEMU VVFAT")
> > +
> > +    def test_read_root_directory(self):
> > +        """
> > +        Test the content of the root directory
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        root_dir = fat16.read_root_directory()
> > +
> > +        self.assertEqual(len(root_dir), 13)  # 12 + 1 special file
> > +
> > +        files = {
> > +            "QEMU VVF.AT": 0,  # special empty file
> > +            "FILE0.TXT": 16,
> > +            "FILE1.TXT": 16,
> > +            "FILE2.TXT": 16,
> > +            "FILE3.TXT": 16,
> > +            "FILE4.TXT": 16,
> > +            "FILE5.TXT": 16,
> > +            "FILE6.TXT": 16,
> > +            "FILE7.TXT": 16,
> > +            "FILE8.TXT": 16,
> > +            "FILE9.TXT": 16,
> > +            "LARGE1.TXT": 0x2000 * 2,
> > +            "LARGE2.TXT": 0x2000 * 3,
> > +        }
> > +
> > +        for entry in root_dir:
> > +            self.assertIn(entry.whole_name(), files)
> > +            self.assertEqual(entry.size_bytes, files[entry.whole_name()])
> > +
> > +    def test_direntry_as_bytes(self):
> > +        """
> > +        Test if we can convert Direntry back to bytes, so that we can write it back to the disk safely.
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        root_dir = fat16.read_root_directory()
> > +        first_entry_bytes = fat16.read_sectors(fat16.boot_sector.root_dir_start(), 1)
> > +        # The first entry won't be deleted, so we can compare it with the first entry in the root directory
> > +        self.assertEqual(root_dir[0].as_bytes(), first_entry_bytes[:DIRENTRY_SIZE])
> > +
> > +    def test_read_files(self):
> > +        """
> > +        Test reading the content of the files
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        for i in range(10):
> > +            file = fat16.find_direntry(f"/FILE{i}.TXT")
> > +            self.assertIsNotNone(file)
> > +            self.assertEqual(
> > +                fat16.read_file(file), f"Hello, world! {i}\n".encode("ascii")
> > +            )
> > +
> > +        # test large files
> > +        large1 = fat16.find_direntry("/LARGE1.TXT")
> > +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> > +            self.assertEqual(fat16.read_file(large1), f.read())
> > +
> > +        large2 = fat16.find_direntry("/LARGE2.TXT")
> > +        self.assertIsNotNone(large2)
> > +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> > +            self.assertEqual(fat16.read_file(large2), f.read())
> > +
> > +    def test_write_file_same_content_direct(self):
> > +        """
> > +        Similar to `test_write_file_in_same_content`, but we write
> > +        directly to the file's clusters, and thus we don't go through
> > +        modifying the direntry.
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/FILE0.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        data = fat16.read_cluster(file.cluster)
> > +        fat16.write_cluster(file.cluster, data)
> > +
> > +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> > +            self.assertEqual(fat16.read_file(file), f.read())
> > +
> > +    def test_write_file_in_same_content(self):
> > +        """
> > +        Test writing the same content back to the file
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/FILE0.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> > +
> > +        fat16.write_file(file, b"Hello, world! 0\n")
> > +
> > +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> > +
> > +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), b"Hello, world! 0\n")
> > +
> > +    def test_modify_content_same_clusters(self):
> > +        """
> > +        Test modifying the content of the file without changing the number of clusters
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/FILE0.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        new_content = b"Hello, world! Modified\n"
> > +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> > +
> > +        fat16.write_file(file, new_content)
> > +
> > +        self.assertEqual(fat16.read_file(file), new_content)
> > +
> > +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_truncate_file_same_clusters_less(self):
> > +        """
> > +        Test truncating the file without changing the number of clusters,
> > +        decreasing the file size
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/FILE0.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> > +
> > +        fat16.truncate_file(file, 5)
> > +
> > +        new_content = fat16.read_file(file)
> > +
> > +        self.assertEqual(new_content, b"Hello")
> > +
> > +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_truncate_file_same_clusters_more(self):
> > +        """
> > +        Test truncating the file without changing the number of clusters,
> > +        increasing the file size
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/FILE0.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        self.assertEqual(fat16.read_file(file), b"Hello, world! 0\n")
> > +
> > +        fat16.truncate_file(file, 20)
> > +
> > +        new_content = fat16.read_file(file)
> > +
> > +        # a random pattern will be appended to the file, and it's not always the same
> > +        self.assertEqual(new_content[:16], b"Hello, world! 0\n")
> > +        self.assertEqual(len(new_content), 20)
> > +
> > +        with open(os.path.join(filesystem, "file0.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_write_large_file(self):
> > +        """
> > +        Test writing a large file
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/LARGE1.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        # The content of LARGE1 is A * 1KB, B * 1KB, C * 1KB, ..., P * 1KB
> > +        # Let's change it to be Z * 1KB, Y * 1KB, X * 1KB, ..., K * 1KB
> > +        # without changing the number of clusters or filesize
> > +        new_content = b"".join([bytes([0x5A - i] * 1024) for i in range(16)])
> > +
> > +        fat16.write_file(file, new_content)
> > +
> > +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_truncate_file_change_clusters_less(self):
> > +        """
> > +        Test truncating a file by reducing the number of clusters
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/LARGE1.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        fat16.truncate_file(file, 1)
> > +
> > +        self.assertEqual(fat16.read_file(file), b"A")
> > +
> > +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), b"A")
> > +
> > +    def test_write_file_change_clusters_less(self):
> > +        """
> > +        Test writing to a file, reducing the number of clusters
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/LARGE2.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        new_content = b"Hello, world! This was a large file\n"
> > +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024
> 
> This sets and then immediately overwrites new_content. What was intended
> here?
> 
> > +
> > +        fat16.write_file(file, new_content)
> > +
> > +        self.assertEqual(fat16.read_file(file), new_content)
> > +
> > +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_write_file_change_clusters_more(self):
> > +        """
> > +        Test writing to a file, increasing the number of clusters
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/LARGE2.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
> > +
> > +        fat16.write_file(file, new_content)
> > +
> > +        with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_write_file_change_clusters_more_non_last_file(self):
> > +        """
> > +        Test writing to a file, increasing the number of clusters.
> > +        This is a special variant of the above test, where we write to
> > +        a file such that the newly allocated clusters won't be contiguous
> > +        with the existing ones.
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        file = fat16.find_direntry("/LARGE1.TXT")
> > +        self.assertIsNotNone(file)
> > +
> > +        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
> > +
> > +        fat16.write_file(file, new_content)
> > +
> > +        with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    def test_create_file(self):
> > +        """
> > +        Test creating a new file
> > +        """
> > +        fat16 = self.init_fat16()
> > +
> > +        new_file = fat16.create_file("/NEWFILE.TXT")
> > +
> > +        self.assertIsNotNone(new_file)
> > +        self.assertEqual(new_file.size_bytes, 0)
> > +
> > +        new_content = b"Hello, world! New file\n"
> > +        fat16.write_file(new_file, new_content)
> > +
> > +        self.assertEqual(fat16.read_file(new_file), new_content)
> > +
> > +        with open(os.path.join(filesystem, "newfile.txt"), "rb") as f:
> > +            self.assertEqual(f.read(), new_content)
> > +
> > +    # TODO: support deleting files
> > +
> > +
> > +if __name__ == "__main__":
> > +    # This is a specific test for vvfat driver
> > +    iotests.main(supported_fmts=["vvfat"], supported_protocols=["file"])
> > diff --git a/tests/qemu-iotests/tests/vvfat.out b/tests/qemu-iotests/tests/vvfat.out
> > new file mode 100755
> > index 0000000000..96961ed0b5
> > --- /dev/null
> > +++ b/tests/qemu-iotests/tests/vvfat.out
> > @@ -0,0 +1,5 @@
> > +...............
> > +----------------------------------------------------------------------
> > +Ran 15 tests
> > +
> > +OK
> 
> With the updated test, I can catch the problems that are fixed by
> patches 1 and 2, but it still doesn't need patch 3 to pass.
> 
> Kevin
> 

Thanks for reviewing. Those were all mistakes, and I have fixed them (a
small patch fixing these issues is included at the end).

Regarding the failing test: I forgot to also read the files back through
the fat driver, and instead was only reading from the host filesystem.
I'm not sure exactly why reading from the host filesystem works while
reading through the driver (i.e. the guest view) gives the weird buggy
result.
I have updated the test in the patch below to reflect this.
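
For example (a sketch matching the hunks in the patch below; `file` and
`new_content` are as in the affected tests):

    # guest view, through the FAT16 helper on top of the vvfat disk
    self.assertEqual(fat16.read_file(file), new_content)
    # host view, through the exported directory
    with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
        self.assertEqual(f.read(), new_content)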

I would appreciate it if you could test the patch below and let me know
whether the issues are fixed; after that I can send the new series.

Thanks,
Amjad

--- PATCH ---
diff --git a/tests/qemu-iotests/fat16.py b/tests/qemu-iotests/fat16.py
index baf801b4d5..411a277906 100644
--- a/tests/qemu-iotests/fat16.py
+++ b/tests/qemu-iotests/fat16.py
@@ -62,13 +62,16 @@ def __init__(self, data: bytes):
         self.reserved_sectors = int.from_bytes(data[14:16], "little")
         self.fat_count = data[16]
         self.root_entries = int.from_bytes(data[17:19], "little")
+        total_sectors_16 = int.from_bytes(data[19:21], "little")
         self.media_descriptor = data[21]
-        self.fat_size = int.from_bytes(data[22:24], "little")
         self.sectors_per_fat = int.from_bytes(data[22:24], "little")
         self.sectors_per_track = int.from_bytes(data[24:26], "little")
         self.heads = int.from_bytes(data[26:28], "little")
         self.hidden_sectors = int.from_bytes(data[28:32], "little")
-        self.total_sectors = int.from_bytes(data[32:36], "little")
+        total_sectors_32 = int.from_bytes(data[32:36], "little")
+        assert total_sectors_16 == 0 or total_sectors_32 == 0, \
+                "Both total sectors (16 and 32) fields are non-zero"
+        self.total_sectors = total_sectors_16 or total_sectors_32
         self.drive_number = data[36]
         self.volume_id = int.from_bytes(data[39:43], "little")
         self.volume_label = data[43:54].decode("ascii").strip()
@@ -208,7 +211,7 @@ def __init__(
         self.boot_sector = FatBootSector(self.sector_reader(start_sector))
 
         fat_size_in_sectors = \
-            self.boot_sector.fat_size * self.boot_sector.fat_count
+            self.boot_sector.sectors_per_fat * self.boot_sector.fat_count
         self.fats = self.read_sectors(
             self.boot_sector.reserved_sectors, fat_size_in_sectors
         )
diff --git a/tests/qemu-iotests/tests/vvfat b/tests/qemu-iotests/tests/vvfat
index 113d7d3270..8d04f292e3 100755
--- a/tests/qemu-iotests/tests/vvfat
+++ b/tests/qemu-iotests/tests/vvfat
@@ -369,7 +369,6 @@ class TestVVFatDriver(QMPTestCase):
         file = fat16.find_direntry("/LARGE2.TXT")
         self.assertIsNotNone(file)
 
-        new_content = b"Hello, world! This was a large file\n"
         new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024
 
         fat16.write_file(file, new_content)
@@ -388,10 +387,13 @@ class TestVVFatDriver(QMPTestCase):
         file = fat16.find_direntry("/LARGE2.TXT")
         self.assertIsNotNone(file)
 
-        new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
+        # from 3 clusters to 4 clusters
+        new_content = b"W" * 8 * 1024 + b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
 
         fat16.write_file(file, new_content)
 
+        self.assertEqual(fat16.read_file(file), new_content)
+
         with open(os.path.join(filesystem, "large2.txt"), "rb") as f:
             self.assertEqual(f.read(), new_content)
 
@@ -406,10 +408,13 @@ class TestVVFatDriver(QMPTestCase):
         file = fat16.find_direntry("/LARGE1.TXT")
         self.assertIsNotNone(file)
 
+        # from 2 clusters to 3 clusters
         new_content = b"X" * 8 * 1024 + b"Y" * 8 * 1024 + b"Z" * 8 * 1024
 
         fat16.write_file(file, new_content)
 
+        self.assertEqual(fat16.read_file(file), new_content)
+
         with open(os.path.join(filesystem, "large1.txt"), "rb") as f:
             self.assertEqual(f.read(), new_content)
 



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-05  0:58 ` [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` Amjad Alsharafi
@ 2024-06-10 16:49   ` Kevin Wolf
  2024-06-11 12:31     ` Amjad Alsharafi
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2024-06-10 16:49 UTC (permalink / raw)
  To: Amjad Alsharafi; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> The field is marked as "the offset in the file (in clusters)", but it
> was being used like this
> `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> 
> Additionally, removed the `abort` when `first_mapping_index` does not
> match, as this matches the case when adding new clusters for files, and
> it's inevitable that we reach this condition when doing that if the
> clusters are not after one another, so there is no reason to `abort`
> here, execution continues and the new clusters are written to disk
> correctly.
> 
> Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>

Can you help me understand how first_mapping_index really works?

It seems to me that you get a chain of mappings for each file on the FAT
filesystem, which are just the contiguous areas in it, and
first_mapping_index refers to the mapping at the start of the file. But
for much of the time, it actually doesn't seem to be set at all, so you
have mapping->first_mapping_index == -1. Do you understand the rules
around when it's set and when it isn't?

>  block/vvfat.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index 19da009a5b..f0642ac3e4 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -1408,7 +1408,9 @@ read_cluster_directory:
>  
>          assert(s->current_fd);
>  
> -        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> +        offset = s->cluster_size *
> +            ((cluster_num - s->current_mapping->begin)
> +            + s->current_mapping->info.file.offset);
>          if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
>              return -3;
>          s->cluster=s->cluster_buffer;
> @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
>                          (mapping->mode & MODE_DIRECTORY) == 0) {
>  
>                      /* was modified in qcow */
> -                    if (offset != mapping->info.file.offset + s->cluster_size
> -                            * (cluster_num - mapping->begin)) {
> +                    if (offset != s->cluster_size
> +                            * ((cluster_num - mapping->begin)
> +                            + mapping->info.file.offset)) {
>                          /* offset of this cluster in file chain has changed */
>                          abort();
>                          copy_it = 1;
> @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
>  
>                      if (mapping->first_mapping_index != first_mapping_index
>                              && mapping->info.file.offset > 0) {
> -                        abort();
>                          copy_it = 1;
>                      }

I'm unsure which case this represents. If first_mapping_index refers to
the mapping of the first cluster in the file, does this mean we got a
mapping for a different file here? Or is the comparison between -1 and a
real value?

In any case it doesn't seem to be the case that the comment at the
declaration of copy_it describes.

>  
> @@ -2404,7 +2406,7 @@ static int commit_mappings(BDRVVVFATState* s,
>                          (mapping->end - mapping->begin);
>              } else
>                  next_mapping->info.file.offset = mapping->info.file.offset +
> -                        mapping->end - mapping->begin;
> +                        (mapping->end - mapping->begin);
>  
>              mapping = next_mapping;
>          }

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/4] iotests: Add `vvfat` tests
  2024-06-10 14:11     ` Amjad Alsharafi
@ 2024-06-10 16:50       ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2024-06-10 16:50 UTC (permalink / raw)
  To: Amjad Alsharafi; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

Am 10.06.2024 um 16:11 hat Amjad Alsharafi geschrieben:
> On Mon, Jun 10, 2024 at 02:01:24PM +0200, Kevin Wolf wrote:
> > With the updated test, I can catch the problems that are fixed by
> > patches 1 and 2, but it still doesn't need patch 3 to pass.
> > 
> > Kevin
> > 
> 
> Thanks for reviewing, those are all mistakes, and I fixed them (included
> a small patch to fix these issues at the end...).
> 
> Regarding the failing test, I forgot to also read the files from the fat
> driver, and instead I was just reading from the host filesystem.
> I'm not sure exactly, why reading from the filesystem works, but reading
> from the driver (i.e. guest) gives the weird buggy result. 
> I have updated the test in the patch below to reflect this.
> 
> I would love if you can test the patch below and let me know if the
> issues are fixed, after that I can send the new series.

Yes, that looks good to me and reproduces a failure without patch 3.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-10 16:49   ` Kevin Wolf
@ 2024-06-11 12:31     ` Amjad Alsharafi
  2024-06-11 14:30       ` Kevin Wolf
  0 siblings, 1 reply; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-11 12:31 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > The field is marked as "the offset in the file (in clusters)", but it
> > was being used like this
> > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > 
> > Additionally, removed the `abort` when `first_mapping_index` does not
> > match, as this matches the case when adding new clusters for files, and
> > it's inevitable that we reach this condition when doing that if the
> > clusters are not after one another, so there is no reason to `abort`
> > here, execution continues and the new clusters are written to disk
> > correctly.
> > 
> > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> 
> Can you help me understand how first_mapping_index really works?
> 
> It seems to me that you get a chain of mappings for each file on the FAT
> filesystem, which are just the contiguous areas in it, and
> first_mapping_index refers to the mapping at the start of the file. But
> for much of the time, it actually doesn't seem to be set at all, so you
> have mapping->first_mapping_index == -1. Do you understand the rules
> around when it's set and when it isn't?

Yeah. So `first_mapping_index` is the index of the first mapping; each
mapping is a group of clusters that are contiguous in the file.
It's mostly `-1` because the first mapping will have the value set to
`-1` rather than its own index. This value will only be set when the
file contains more than one mapping, and that will only happen when you
add clusters to a file that are not contiguous with the existing
clusters.

And actually, thanks to that I noticed another bug not fixed in PATCH 3.
We are doing the check
`s->current_mapping->first_mapping_index != mapping->first_mapping_index`
to decide whether we should switch to the new mapping or not.
If we were reading from the first mapping (`first_mapping_index == -1`)
and we jumped to the second mapping (`first_mapping_index == n`), we
will catch this condition and switch to the new mapping.

But if the file has more than 2 mappings and we jump from the 2nd to the
3rd mapping, we will not catch this, since `first_mapping_index == n` for
both of them. I think a better check is to compare the `mapping` pointers
directly. (I'll add it in the next series together with a test for it.)
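
As a toy illustration (standalone C, not vvfat code; it only assumes the
rule above, where the first mapping keeps `-1` and later mappings store
the first mapping's index):

    #include <assert.h>

    struct mapping { int first_mapping_index; };

    int main(void)
    {
        /* three mappings of one file; the first keeps -1, the later
         * ones store the index of the first mapping (say, 5) */
        struct mapping m1 = { -1 }, m2 = { 5 }, m3 = { 5 };

        /* the switch from the 1st to the 2nd mapping is detected... */
        assert(m1.first_mapping_index != m2.first_mapping_index);
        /* ...but the switch from the 2nd to the 3rd is not, even
         * though we moved to a different region of the host file */
        assert(m2.first_mapping_index == m3.first_mapping_index);
        /* comparing the mapping pointers catches both transitions */
        assert(&m2 != &m3);
        return 0;
    }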

> 
> >  block/vvfat.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/vvfat.c b/block/vvfat.c
> > index 19da009a5b..f0642ac3e4 100644
> > --- a/block/vvfat.c
> > +++ b/block/vvfat.c
> > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> >  
> >          assert(s->current_fd);
> >  
> > -        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > +        offset = s->cluster_size *
> > +            ((cluster_num - s->current_mapping->begin)
> > +            + s->current_mapping->info.file.offset);
> >          if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> >              return -3;
> >          s->cluster=s->cluster_buffer;
> > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> >                          (mapping->mode & MODE_DIRECTORY) == 0) {
> >  
> >                      /* was modified in qcow */
> > -                    if (offset != mapping->info.file.offset + s->cluster_size
> > -                            * (cluster_num - mapping->begin)) {
> > +                    if (offset != s->cluster_size
> > +                            * ((cluster_num - mapping->begin)
> > +                            + mapping->info.file.offset)) {
> >                          /* offset of this cluster in file chain has changed */
> >                          abort();
> >                          copy_it = 1;
> > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> >  
> >                      if (mapping->first_mapping_index != first_mapping_index
> >                              && mapping->info.file.offset > 0) {
> > -                        abort();
> >                          copy_it = 1;
> >                      }
> 
> I'm unsure which case this represents. If first_mapping_index refers to
> the mapping of the first cluster in the file, does this mean we got a
> mapping for a different file here? Or is the comparison between -1 and a
> real value?

Now that I think more about it, I think this `abort` is actually
correct; the issue, though, is that the handling around this code is
not.

What this `abort` actually checks is:
- whether `mapping->first_mapping_index` differs from
  `first_mapping_index`. This **should** happen only in one case: when
  we are handling the first mapping, where
  `mapping->first_mapping_index == -1`; for all the mappings after the
  first, the two values should match.
- from the above we know that this is the first mapping, so if the
  offset is not `0`, then abort, since this is an invalid state.
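
In code form, the invariant being enforced is roughly this (a sketch
built from the fields in the hunk above, not a proposed patch):

    /* only the first mapping of a file may have a differing (-1)
     * first_mapping_index, and that first mapping must start at
     * file offset 0 */
    if (mapping->first_mapping_index != first_mapping_index) {
        assert(mapping->info.file.offset == 0);
    }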

This is all good; the issue is that `first_mapping_index` is not set if
we start checking from the middle. The variable `first_mapping_index`
is only set if we passed through the `cluster_was_modified` check with
the first mapping and then checked the other mappings in the same
function call.

From what I have seen, that doesn't happen since even if you write the
whole file in one go, you are still writing it cluster by cluster, and
the checks happen at that time.

> 
> In any case it doesn't seem to be the case that the comment at the
> declaration of copy_it describes.
> 
> >  
> > @@ -2404,7 +2406,7 @@ static int commit_mappings(BDRVVVFATState* s,
> >                          (mapping->end - mapping->begin);
> >              } else
> >                  next_mapping->info.file.offset = mapping->info.file.offset +
> > -                        mapping->end - mapping->begin;
> > +                        (mapping->end - mapping->begin);
> >  
> >              mapping = next_mapping;
> >          }
> 
> Kevin
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-11 12:31     ` Amjad Alsharafi
@ 2024-06-11 14:30       ` Kevin Wolf
  2024-06-11 16:22         ` Amjad Alsharafi
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2024-06-11 14:30 UTC (permalink / raw)
  To: Amjad Alsharafi; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > The field is marked as "the offset in the file (in clusters)", but it
> > > was being used like this
> > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > 
> > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > match, as this matches the case when adding new clusters for files, and
> > > it's inevitable that we reach this condition when doing that if the
> > > clusters are not after one another, so there is no reason to `abort`
> > > here, execution continues and the new clusters are written to disk
> > > correctly.
> > > 
> > > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> > 
> > Can you help me understand how first_mapping_index really works?
> > 
> > It seems to me that you get a chain of mappings for each file on the FAT
> > filesystem, which are just the contiguous areas in it, and
> > first_mapping_index refers to the mapping at the start of the file. But
> > for much of the time, it actually doesn't seem to be set at all, so you
> > have mapping->first_mapping_index == -1. Do you understand the rules
> > around when it's set and when it isn't?
> 
> Yeah. So `first_mapping_index` is the index of the first mapping; each
> mapping is a group of clusters that are contiguous in the file.
> It's mostly `-1` because the first mapping will have the value set to
> `-1` rather than its own index. This value will only be set when the
> file contains more than one mapping, and that will only happen when you
> add clusters to a file that are not contiguous with the existing
> clusters.

Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
can work with. So just to confirm, this is the invariant that we think
should always hold true, right?

    assert((mapping->mode & MODE_DIRECTORY) ||
           !mapping->info.file.offset ||
           mapping->first_mapping_index > 0);

> And actually, thanks to that I noticed another bug not fixed in PATCH 3.
> We are doing the check
> `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> to decide whether we should switch to the new mapping or not.
> If we were reading from the first mapping (`first_mapping_index == -1`)
> and we jumped to the second mapping (`first_mapping_index == n`), we
> will catch this condition and switch to the new mapping.
>
> But if the file has more than 2 mappings and we jump from the 2nd to the
> 3rd mapping, we will not catch this, since `first_mapping_index == n` for
> both of them. I think a better check is to compare the `mapping` pointers
> directly. (I'll add it in the next series together with a test for it.)

This comparison is exactly what confused me. I didn't realise that the
first mapping in the chain has a different value here, so I thought this
must mean that we're looking at a different file now - but of course I
couldn't see a reason for that because we're iterating through a single
file in this function.

But even now that I know that the condition triggers when switching from
the first to the second mapping, it doesn't make sense to me. We don't
have to copy things around just because a file is non-contiguous.

What we want to catch is if the order of mappings has changed compared
to the old state. Do we need a linked list, maybe a prev_mapping_index,
instead of first_mapping_index so that we can compare if it is still the
same as before?
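
A rough sketch of that idea, with a hypothetical `prev_mapping_index`
field (this is not existing vvfat code):

    #include <stddef.h>

    struct mapping_sketch {
        int begin, end;         /* cluster range, as in vvfat's mapping_t */
        int prev_mapping_index; /* -1 for the first mapping of a file */
    };

    /* while walking a file's chain, detect reordering by comparing
     * the recorded predecessor with the mapping we actually came from */
    static int chain_order_changed(struct mapping_sketch *all,
                                   struct mapping_sketch *prev_in_walk,
                                   struct mapping_sketch *cur)
    {
        struct mapping_sketch *recorded = cur->prev_mapping_index < 0
            ? NULL : &all[cur->prev_mapping_index];
        return recorded != prev_in_walk;
    }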

Or actually, I suppose that's the first block with an abort() in the
code, just that it doesn't compare mappings, but their offsets.

> > 
> > >  block/vvfat.c | 12 +++++++-----
> > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > index 19da009a5b..f0642ac3e4 100644
> > > --- a/block/vvfat.c
> > > +++ b/block/vvfat.c
> > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > >  
> > >          assert(s->current_fd);
> > >  
> > > -        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > +        offset = s->cluster_size *
> > > +            ((cluster_num - s->current_mapping->begin)
> > > +            + s->current_mapping->info.file.offset);
> > >          if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > >              return -3;
> > >          s->cluster=s->cluster_buffer;
> > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > >                          (mapping->mode & MODE_DIRECTORY) == 0) {
> > >  
> > >                      /* was modified in qcow */
> > > -                    if (offset != mapping->info.file.offset + s->cluster_size
> > > -                            * (cluster_num - mapping->begin)) {
> > > +                    if (offset != s->cluster_size
> > > +                            * ((cluster_num - mapping->begin)
> > > +                            + mapping->info.file.offset)) {
> > >                          /* offset of this cluster in file chain has changed */
> > >                          abort();
> > >                          copy_it = 1;
> > > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > >  
> > >                      if (mapping->first_mapping_index != first_mapping_index
> > >                              && mapping->info.file.offset > 0) {
> > > -                        abort();
> > >                          copy_it = 1;
> > >                      }
> > 
> > I'm unsure which case this represents. If first_mapping_index refers to
> > the mapping of the first cluster in the file, does this mean we got a
> > mapping for a different file here? Or is the comparison between -1 and a
> > real value?
> 
> Now that I think more about it, I think this `abort` is actually
> correct; the issue, though, is that the handling around this code is
> not.
> 
> What this `abort` actually checks is:
> - whether `mapping->first_mapping_index` differs from
>   `first_mapping_index`. This **should** happen only in one case: when
>   we are handling the first mapping, where
>   `mapping->first_mapping_index == -1`; for all the mappings after the
>   first, the two values should match.
> - from the above we know that this is the first mapping, so if the
>   offset is not `0`, then abort, since this is an invalid state.

Yes, makes sense.

> This is all good; the issue is that `first_mapping_index` is not set if
> we start checking from the middle. The variable `first_mapping_index`
> is only set if we passed through the `cluster_was_modified` check with
> the first mapping and then checked the other mappings in the same
> function call.

I think I noticed the same yesterday, but when I tried to write a quick
patch that I could show you and that would update first_mapping_index in
each iteration, I broke something. So I decided I'd first ask you what
all of this even means. :-)

> From what I have seen, that doesn't happen since even if you write the
> whole file in one go, you are still writing it cluster by cluster, and
> the checks happen at that time.

Well, we do trigger the condition, but I suppose updating
first_mapping_index in each loop iteration is really the way to go if
you think the same.
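
Something along these lines inside the cluster loop, perhaps (only a
sketch; `mapping_index_of()` stands in for however the mapping's own
index in `s->mapping` would be obtained):

    /* keep the local up to date on every iteration, so the later
     * comparisons always see the index for the chain being walked */
    first_mapping_index = mapping->first_mapping_index >= 0
        ? mapping->first_mapping_index
        : mapping_index_of(s, mapping); /* hypothetical helper */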

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-11 14:30       ` Kevin Wolf
@ 2024-06-11 16:22         ` Amjad Alsharafi
  2024-06-11 18:11           ` Kevin Wolf
  0 siblings, 1 reply; 13+ messages in thread
From: Amjad Alsharafi @ 2024-06-11 16:22 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

On Tue, Jun 11, 2024 at 04:30:53PM +0200, Kevin Wolf wrote:
> Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> > On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > > The field is marked as "the offset in the file (in clusters)", but it
> > > > was being used like this
> > > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > > 
> > > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > > match, as this matches the case when adding new clusters for files, and
> > > > it's inevitable that we reach this condition when doing that if the
> > > > clusters are not after one another, so there is no reason to `abort`
> > > > here, execution continues and the new clusters are written to disk
> > > > correctly.
> > > > 
> > > > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> > > 
> > > Can you help me understand how first_mapping_index really works?
> > > 
> > > It seems to me that you get a chain of mappings for each file on the FAT
> > > filesystem, which are just the contiguous areas in it, and
> > > first_mapping_index refers to the mapping at the start of the file. But
> > > for much of the time, it actually doesn't seem to be set at all, so you
> > > have mapping->first_mapping_index == -1. Do you understand the rules
> > > around when it's set and when it isn't?
> > 
> > Yeah. So `first_mapping_index` is the index of the first mapping; each
> > mapping is a group of clusters that are contiguous in the file.
> > It's mostly `-1` because the first mapping will have the value set to
> > `-1` rather than its own index. This value will only be set when the
> > file contains more than one mapping, and that will only happen when you
> > add clusters to a file that are not contiguous with the existing
> > clusters.
> 
> Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
> can work with. So just to confirm, this is the invariant that we think
> should always hold true, right?
> 
>     assert((mapping->mode & MODE_DIRECTORY) ||
>            !mapping->info.file.offset ||
>            mapping->first_mapping_index > 0);
> 

Yes.

We can add this to the `get_cluster_count_for_direntry` loop.
I'm also thinking of converting those `abort`s into `assert`s, since
the line `copy_it = 1;` was confusing me, sitting as it does after the `abort`.
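
Something like this, maybe (just an untested sketch of the idea, not a
final patch):

    /* Sketch: in the per-cluster loop of get_cluster_count_for_direntry(),
     * assert the invariant instead of the bare abort(): directories carry
     * no file offset, and for files every mapping with a non-zero offset
     * must point back to the first mapping. */
    assert((mapping->mode & MODE_DIRECTORY) ||
           !mapping->info.file.offset ||
           mapping->first_mapping_index > 0);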

> > And actually, thanks to that I noticed another bug not fixed in PATCH 3:
> > we are doing this check
> > `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> > to know if we should switch to the new mapping or not. 
> > If we were reading from the first mapping (`first_mapping_index == -1`)
> > and we jumped to the second mapping (`first_mapping_index == n`), we
> > will catch this condition and switch to the new mapping.
> > 
> > But if the file has more than 2 mappings, and we jumped to the 3rd
> > mapping, we will not catch this since (`first_mapping_index == n`) for
> > both of them haha. I think a better check is to check the `mapping`
> > pointer directly. (I'll add it also in the next series together with a
> > test for it.)
> 
> This comparison is exactly what confused me. I didn't realise that the
> first mapping in the chain has a different value here, so I thought this
> must mean that we're looking at a different file now - but of course I
> couldn't see a reason for that because we're iterating through a single
> file in this function.
> 
> But even now that I know that the condition triggers when switching from
> the first to the second mapping, it doesn't make sense to me. We don't
> have to copy things around just because a file is non-contiguous.
> 
> What we want to catch is if the order of mappings has changed compared
> to the old state. Do we need a linked list, maybe a prev_mapping_index,
> instead of first_mapping_index so that we can compare if it is still the
> same as before?

I think this would be the better design (tbh, that's what I thought
`first_mapping_index` would do), though I'm not sure whether other
components depend so heavily on the current design that it would be hard
to change.

I'll try to implement this `prev_mapping_index` and see how it goes.
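
As a very rough sketch of what I have in mind (hypothetical names:
`prev_mapping_index` and the local `last_index` don't exist in the
current code; `array_index()` is the existing helper, if I'm reading it
right):

    /* Hypothetical: each mapping remembers its predecessor in the
     * file's cluster chain; -1 would mark the file's first mapping. */
    int prev_mapping_index;   /* would replace first_mapping_index */

    /* While walking the chain, track the index of the mapping we just
     * left, and flag a copy when the recorded predecessor no longer
     * matches, i.e. the order of mappings changed: */
    if (mapping->prev_mapping_index != last_index) {
        copy_it = 1;
    }
    last_index = array_index(&(s->mapping), mapping);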

> 
> Or actually, I suppose that's the first block with an abort() in the
> code, just that it doesn't compare mappings, but their offsets.

I think I'm still confused about the whole logic there; the function
`get_cluster_count_for_direntry` is a mess, and it doesn't just
*get* the cluster count, it also schedules writeouts and may
copy clusters around.

> 
> > > 
> > > >  block/vvfat.c | 12 +++++++-----
> > > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > > index 19da009a5b..f0642ac3e4 100644
> > > > --- a/block/vvfat.c
> > > > +++ b/block/vvfat.c
> > > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > > >  
> > > >          assert(s->current_fd);
> > > >  
> > > > -        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > > +        offset = s->cluster_size *
> > > > +            ((cluster_num - s->current_mapping->begin)
> > > > +            + s->current_mapping->info.file.offset);
> > > >          if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > > >              return -3;
> > > >          s->cluster=s->cluster_buffer;
> > > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > > >                          (mapping->mode & MODE_DIRECTORY) == 0) {
> > > >  
> > > >                      /* was modified in qcow */
> > > > -                    if (offset != mapping->info.file.offset + s->cluster_size
> > > > -                            * (cluster_num - mapping->begin)) {
> > > > +                    if (offset != s->cluster_size
> > > > +                            * ((cluster_num - mapping->begin)
> > > > +                            + mapping->info.file.offset)) {
> > > >                          /* offset of this cluster in file chain has changed */
> > > >                          abort();
> > > >                          copy_it = 1;
> > > > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > > >  
> > > >                      if (mapping->first_mapping_index != first_mapping_index
> > > >                              && mapping->info.file.offset > 0) {
> > > > -                        abort();
> > > >                          copy_it = 1;
> > > >                      }
> > > 
> > > I'm unsure which case this represents. If first_mapping_index refers to
> > > the mapping of the first cluster in the file, does this mean we got a
> > > mapping for a different file here? Or is the comparison between -1 and a
> > > real value?
> > 
> > Now that I think more about it, I think this `abort` is actually
> > correct; the issue, though, is that the handling around this code is not.
> > 
> > What this `abort` actually does is check:
> > - if the `mapping->first_mapping_index` is not the same as
> >   `first_mapping_index`, which **should** happen only in one case: when
> >   we are handling the first mapping, in which case
> >   `mapping->first_mapping_index == -1`; in all other cases, the
> >   mappings after the first should have the condition true.
> > - From above, we know that this is the first mapping, so if the offset
> >   is not `0`, then abort, since this is an invalid state.
> 
> Yes, makes sense.
> 
> > This is all good; the issue is that `first_mapping_index` is not set if
> > we are checking from the middle. The variable `first_mapping_index` is
> > only set if we passed through the `cluster_was_modified` check with the
> > first mapping and, in the same function call, checked the other
> > mappings.
> 
> I think I noticed the same yesterday, but when I tried to write a quick
> patch that I could show you and that would update first_mapping_index in
> each iteration, I broke something. So I decided I'd first ask you what
> all of this even means. :-)
> 
> > From what I have seen, that doesn't happen since even if you write the
> > whole file in one go, you are still writing it cluster by cluster, and
> > the checks happen at that time.
> 
> Well, we do trigger the condition, but I suppose updating
> first_mapping_index in each loop iteration is really the way to go if
> you think the same.

Indeed, I made a quick change: modifying the loop to always go through
and set `first_mapping_index` for the first mapping fixes the issue,
and we can put the `abort` back in place.

I'll also modify the check to instead be 
`mapping->first_mapping_index < 0 && mapping->info.file.offset > 0`.
This will make it clear that this applies only to the first mapping.
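
Putting the two changes together, it would look roughly like this
(untested sketch; the exact placement in the loop is still to be
settled):

    /* Sketch: record the first mapping's index whenever we are at the
     * start of the file chain, not only inside the cluster_was_modified
     * branch, ... */
    if (offset == 0) {
        first_mapping_index = array_index(&(s->mapping), mapping);
    }

    /* ... and restore the abort() for the invalid state where the
     * file's first mapping claims a non-zero offset: */
    if (mapping->first_mapping_index < 0 && mapping->info.file.offset > 0) {
        abort();
    }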

> 
> Kevin
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
  2024-06-11 16:22         ` Amjad Alsharafi
@ 2024-06-11 18:11           ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2024-06-11 18:11 UTC (permalink / raw)
  To: Amjad Alsharafi; +Cc: qemu-devel, Hanna Reitz, open list:vvfat

Am 11.06.2024 um 18:22 hat Amjad Alsharafi geschrieben:
> On Tue, Jun 11, 2024 at 04:30:53PM +0200, Kevin Wolf wrote:
> > Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> > > On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > > > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > > > The field is marked as "the offset in the file (in clusters)", but it
> > > > > was being used like this
> > > > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > > > 
> > > > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > > > match, as this matches the case when adding new clusters for files, and
> > > > > it's inevitable that we reach this condition when doing that if the
> > > > > clusters are not after one another, so there is no reason to `abort`
> > > > > here; execution continues and the new clusters are written to disk
> > > > > correctly.
> > > > > 
> > > > > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> > > > 
> > > > Can you help me understand how first_mapping_index really works?
> > > > 
> > > > It seems to me that you get a chain of mappings for each file on the FAT
> > > > filesystem, which are just the contiguous areas in it, and
> > > > first_mapping_index refers to the mapping at the start of the file. But
> > > > for much of the time, it actually doesn't seem to be set at all, so you
> > > > have mapping->first_mapping_index == -1. Do you understand the rules
> > > > around when it's set and when it isn't?
> > > 
> > > Yeah. So `first_mapping_index` is the index of the first mapping; each
> > > mapping is a group of clusters that are contiguous in the file.
> > > It's mostly `-1` because the first mapping will have the value set as
> > > `-1` and not its own index; this value will only be set when the file
> > > contains more than one mapping, and this will only happen when you add
> > > clusters to a file that are not contiguous with the existing clusters.
> > 
> > Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
> > can work with. So just to confirm, this is the invariant that we think
> > should always hold true, right?
> > 
> >     assert((mapping->mode & MODE_DIRECTORY) ||
> >            !mapping->info.file.offset ||
> >            mapping->first_mapping_index > 0);
> > 
> 
> Yes.
> 
> We can add this to the `get_cluster_count_for_direntry` loop.

Maybe even in find_mapping_for_cluster(), because we think it should
always apply? It's called by get_cluster_count_for_direntry(), but also by
other functions.

Either way, I think this should be a separate patch.
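
Something like this at the end of find_mapping_for_cluster(), perhaps
(untested; the !mapping case may not even be reachable there):

    /* Sketch: check the invariant for every mapping we hand out */
    assert(!mapping ||
           (mapping->mode & MODE_DIRECTORY) ||
           !mapping->info.file.offset ||
           mapping->first_mapping_index > 0);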

> I'm also thinking of converting those `abort`s into `assert`s, since
> the line `copy_it = 1;` was confusing me, sitting as it does after the `abort`.

I agree on the abort() that you removed, but I'm not sure about the
other one. I have a feeling the copy_it = 1 might actually be correct
there (if the copying logic is implemented correctly; I didn't check
that).

> > > And actually, thanks to that I noticed another bug not fixed in PATCH 3:
> > > we are doing this check
> > > `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> > > to know if we should switch to the new mapping or not. 
> > > If we were reading from the first mapping (`first_mapping_index == -1`)
> > > and we jumped to the second mapping (`first_mapping_index == n`), we
> > > will catch this condition and switch to the new mapping.
> > > 
> > > But if the file has more than 2 mappings, and we jumped to the 3rd
> > > mapping, we will not catch this since (`first_mapping_index == n`) for
> > > both of them haha. I think a better check is to check the `mapping`
> > > pointer directly. (I'll add it also in the next series together with a
> > > test for it.)
> > 
> > This comparison is exactly what confused me. I didn't realise that the
> > first mapping in the chain has a different value here, so I thought this
> > must mean that we're looking at a different file now - but of course I
> > couldn't see a reason for that because we're iterating through a single
> > file in this function.
> > 
> > But even now that I know that the condition triggers when switching from
> > the first to the second mapping, it doesn't make sense to me. We don't
> > have to copy things around just because a file is non-contiguous.
> > 
> > What we want to catch is if the order of mappings has changed compared
> > to the old state. Do we need a linked list, maybe a prev_mapping_index,
> > instead of first_mapping_index so that we can compare if it is still the
> > same as before?
> 
> I think this would be the better design (tbh, that's what I thought
> `first_mapping_index` would do), though I'm not sure whether other
> components depend so heavily on the current design that it would be hard
> to change.
> 
> I'll try to implement this `prev_mapping_index` and see how it goes.

Let's try not to do too much at once. We know that vvfat is a mess,
nobody fully understands it, and the write support is the worst part.
One series won't fix it all. Let's move in small incremental steps and
complete this series with the fixes, maybe add more testing coverage,
and then we can start doing more cleanups without having to be afraid
that we'll later have to revert a lot of code that implemented fixes
because we didn't get the cleanups right.

But as another step after this series, this might make sense. On the
other hand, I'm not sure if we have a use for prev_mapping_index when
(if?) using the offset does the job, as I said here:

> > Or actually, I suppose that's the first block with an abort() in the
> > code, just that it doesn't compare mappings, but their offsets.
> 
> I think I'm still confused about the whole logic there; the function
> `get_cluster_count_for_direntry` is a mess, and it doesn't just
> *get* the cluster count, it also schedules writeouts and may
> copy clusters around.

At least the comment says so, but yes, I think much of the vvfat code
could be simpler if it relied less on (badly documented) global state
and side effects.

I think the block with the first abort() implements what the comment for
copy_it describes. This condition seems to check if the mapping was
created for a different offset than where it appears in the cluster
chain now:

    if (offset != mapping->info.file.offset + s->cluster_size
            * (cluster_num - mapping->begin)) {

So some clusters were inserted (or potentially removed?) before it.
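
To make that concrete with made-up numbers (using the corrected formula
from this patch, where info.file.offset counts clusters): say
cluster_size is 0x1000 and a mapping has begin = 10 and
info.file.offset = 2, i.e. cluster 10 used to hold cluster 2 of the
file. When the chain walk reaches cluster_num = 11 at file offset
0x3000, the computed 0x1000 * ((11 - 10) + 2) = 0x3000 matches. If a
cluster got inserted into the chain before this mapping, the walk
arrives at 0x4000 instead and the condition fires.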

> > 
> > > > 
> > > > >  block/vvfat.c | 12 +++++++-----
> > > > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > > > index 19da009a5b..f0642ac3e4 100644
> > > > > --- a/block/vvfat.c
> > > > > +++ b/block/vvfat.c
> > > > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > > > >  
> > > > >          assert(s->current_fd);
> > > > >  
> > > > > -        offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > > > +        offset = s->cluster_size *
> > > > > +            ((cluster_num - s->current_mapping->begin)
> > > > > +            + s->current_mapping->info.file.offset);
> > > > >          if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > > > >              return -3;
> > > > >          s->cluster=s->cluster_buffer;
> > > > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > > > >                          (mapping->mode & MODE_DIRECTORY) == 0) {
> > > > >  
> > > > >                      /* was modified in qcow */
> > > > > -                    if (offset != mapping->info.file.offset + s->cluster_size
> > > > > -                            * (cluster_num - mapping->begin)) {
> > > > > +                    if (offset != s->cluster_size
> > > > > +                            * ((cluster_num - mapping->begin)
> > > > > +                            + mapping->info.file.offset)) {
> > > > >                          /* offset of this cluster in file chain has changed */
> > > > >                          abort();
> > > > >                          copy_it = 1;
> > > > > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > > > >  
> > > > >                      if (mapping->first_mapping_index != first_mapping_index
> > > > >                              && mapping->info.file.offset > 0) {
> > > > > -                        abort();
> > > > >                          copy_it = 1;
> > > > >                      }
> > > > 
> > > > I'm unsure which case this represents. If first_mapping_index refers to
> > > > the mapping of the first cluster in the file, does this mean we got a
> > > > mapping for a different file here? Or is the comparison between -1 and a
> > > > real value?
> > > 
> > > Now that I think more about it, I think this `abort` is actually
> > > correct; the issue, though, is that the handling around this code is not.
> > > 
> > > What this `abort` actually does is check:
> > > - if the `mapping->first_mapping_index` is not the same as
> > >   `first_mapping_index`, which **should** happen only in one case: when
> > >   we are handling the first mapping, in which case
> > >   `mapping->first_mapping_index == -1`; in all other cases, the
> > >   mappings after the first should have the condition true.
> > > - From above, we know that this is the first mapping, so if the offset
> > >   is not `0`, then abort, since this is an invalid state.
> > 
> > Yes, makes sense.
> > 
> > > This is all good; the issue is that `first_mapping_index` is not set if
> > > we are checking from the middle. The variable `first_mapping_index` is
> > > only set if we passed through the `cluster_was_modified` check with the
> > > first mapping and, in the same function call, checked the other
> > > mappings.
> > 
> > I think I noticed the same yesterday, but when I tried to write a quick
> > patch that I could show you and that would update first_mapping_index in
> > each iteration, I broke something. So I decided I'd first ask you what
> > all of this even means. :-)
> > 
> > > From what I have seen, that doesn't happen since even if you write the
> > > whole file in one go, you are still writing it cluster by cluster, and
> > > the checks happen at that time.
> > 
> > Well, we do trigger the condition, but I suppose updating
> > first_mapping_index in each loop iteration is really the way to go if
> > you think the same.
> 
> Indeed, I made a quick change: modifying the loop to always go through
> and set `first_mapping_index` for the first mapping fixes the issue,
> and we can put the `abort` back in place.

Oh, nice, so the approach does work after all.

I'd say make only this change to this patch, then, and send the next
version of the series. Everything else is cleanup for a separate
series.

> I'll also modify the check to instead be 
> `mapping->first_mapping_index < 0 && mapping->info.file.offset > 0`.
> This will make it clear that this applies only to the first mapping.

Right, or the inverted one as an assertion.

I'd actually write == -1 instead of < 0 because we know the exact value.
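
That is, the check would then read (sketch):

    if (mapping->first_mapping_index == -1 && mapping->info.file.offset > 0) {
        abort();
    }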

Separate patch, please; let's keep every patch really limited to one
logical change. We may have to bisect problems later and the smaller our
patches are, the easier the problem will be to find.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-06-11 18:13 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-05  0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
2024-06-05  0:58 ` [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file Amjad Alsharafi
2024-06-05  0:58 ` [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` Amjad Alsharafi
2024-06-10 16:49   ` Kevin Wolf
2024-06-11 12:31     ` Amjad Alsharafi
2024-06-11 14:30       ` Kevin Wolf
2024-06-11 16:22         ` Amjad Alsharafi
2024-06-11 18:11           ` Kevin Wolf
2024-06-05  0:58 ` [PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters Amjad Alsharafi
2024-06-05  0:58 ` [PATCH v4 4/4] iotests: Add `vvfat` tests Amjad Alsharafi
2024-06-10 12:01   ` Kevin Wolf
2024-06-10 14:11     ` Amjad Alsharafi
2024-06-10 16:50       ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).