From: Kevin Wolf <kwolf@redhat.com>
To: Amjad Alsharafi <amjadsharafi10@gmail.com>
Cc: qemu-devel@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
"open list:vvfat" <qemu-block@nongnu.org>
Subject: Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
Date: Tue, 11 Jun 2024 16:30:53 +0200 [thread overview]
Message-ID: <ZmhfnSIe30da03uN@redhat.com> (raw)
In-Reply-To: <ZmhDl7ob8G62FtZb@amjad-pc>
Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > The field is marked as "the offset in the file (in clusters)", but it
> > > was being used like this
> > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > >
> > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > match, as this matches the case when adding new clusters for files, and
> > > its inevitable that we reach this condition when doing that if the
> > > clusters are not after one another, so there is no reason to `abort`
> > > here, execution continues and the new clusters are written to disk
> > > correctly.
> > >
> > > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> >
> > Can you help me understand how first_mapping_index really works?
> >
> > It seems to me that you get a chain of mappings for each file on the FAT
> > filesystem, which are just the contiguous areas in it, and
> > first_mapping_index refers to the mapping at the start of the file. But
> > for much of the time, it actually doesn't seem to be set at all, so you
> > have mapping->first_mapping_index == -1. Do you understand the rules
> > around when it's set and when it isn't?
>
> Yeah. So `first_mapping_index` is the index of the first mapping, each
> mapping is a group of clusters that are contiguous in the file.
> Its mostly `-1` because the first mapping will have the value set as
> `-1` and not its own index, this value will only be set when the file
> contain more than one mapping, and this will only happen when you add
> clusters to a file that are not contiguous with the existing clusters.
Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
can work with. So just to confirm, this is the invariant that we think
should always hold true, right?
assert((mapping->mode & MODE_DIRECTORY) ||
!mapping->info.file.offset ||
mapping->first_mapping_index > 0);
> And actually, thanks to that I noticed another bug not fixed in PATCH 3,
> We are doing this check
> `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> to know if we should switch to the new mapping or not.
> If we were reading from the first mapping (`first_mapping_index == -1`)
> and we jumped to the second mapping (`first_mapping_index == n`), we
> will catch this condition and switch to the new mapping.
>
> But if the file has more than 2 mappings, and we jumped to the 3rd
> mapping, we will not catch this since (`first_mapping_index == n`) for
> both of them haha. I think a better check is to check the `mapping`
> pointer directly. (I'll add it also in the next series together with a
> test for it.)
This comparison is exactly what confused me. I didn't realise that the
first mapping in the chain has a different value here, so I thought this
must mean that we're looking at a different file now - but of course I
couldn't see a reason for that because we're iterating through a single
file in this function.
But even now that I know that the condition triggers when switching from
the first to the second mapping, it doesn't make sense to me. We don't
have to copy things around just because a file is non-contiguous.
What we want to catch is if the order of mappings has changed compared
to the old state. Do we need a linked list, maybe a prev_mapping_index,
instead of first_mapping_index so that we can compare if it is still the
same as before?
Or actually, I suppose that's the first block with an abort() in the
code, just that it doesn't compare mappings, but their offsets.
> >
> > > block/vvfat.c | 12 +++++++-----
> > > 1 file changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > index 19da009a5b..f0642ac3e4 100644
> > > --- a/block/vvfat.c
> > > +++ b/block/vvfat.c
> > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > >
> > > assert(s->current_fd);
> > >
> > > - offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > + offset = s->cluster_size *
> > > + ((cluster_num - s->current_mapping->begin)
> > > + + s->current_mapping->info.file.offset);
> > > if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > > return -3;
> > > s->cluster=s->cluster_buffer;
> > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > > (mapping->mode & MODE_DIRECTORY) == 0) {
> > >
> > > /* was modified in qcow */
> > > - if (offset != mapping->info.file.offset + s->cluster_size
> > > - * (cluster_num - mapping->begin)) {
> > > + if (offset != s->cluster_size
> > > + * ((cluster_num - mapping->begin)
> > > + + mapping->info.file.offset)) {
> > > /* offset of this cluster in file chain has changed */
> > > abort();
> > > copy_it = 1;
> > > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> > >
> > > if (mapping->first_mapping_index != first_mapping_index
> > > && mapping->info.file.offset > 0) {
> > > - abort();
> > > copy_it = 1;
> > > }
> >
> > I'm unsure which case this represents. If first_mapping_index refers to
> > the mapping of the first cluster in the file, does this mean we got a
> > mapping for a different file here? Or is the comparison between -1 and a
> > real value?
>
> Now that I think more about it, I think this `abort` is actually
> correct, the issue though is that the handling around this code is not.
>
> What this `abort` actually does is that it checks.
> - if the `mapping->first_mapping_index` is not the same as
> `first_mapping_index`, which **should** happen only in one case, when
> we are handling the first mapping, in that case
> `mapping->first_mapping_index == -1`, in all other cases, the other
> mappings after the first should have the condition true.
> - From above, we know that this is the first mapping, so if the offset
> is not `0`, then abort, since this is an invalid state.
Yes, make sense.
> This is all good, the issue is that `first_mapping_index` is not set if
> we are checking from the middle, the variable `first_mapping_index` is
> only set if we passed through the check `cluster_was_modified` with the
> first mapping, and in the same function call we checked the other
> mappings.
I think I noticed the same yesterday, but when I tried to write a quick
patch that I could show you and that would update first_mapping_index in
each iteration, I broke something. So I decided I'd first ask you what
all of this even means. :-)
> From what I have seen, that doesn't happen since even if you write the
> whole file in one go, you are still writing it cluster by cluster, and
> the checks happen at that time.
Well, we do trigger the condition, but I suppose updating
first_mapping_index in each loop iteration is really the way to go if
you think the same.
Kevin
next prev parent reply other threads:[~2024-06-11 14:31 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-05 0:58 [PATCH v4 0/4] vvfat: Fix write bugs for large files and add iotests Amjad Alsharafi
2024-06-05 0:58 ` [PATCH v4 1/4] vvfat: Fix bug in writing to middle of file Amjad Alsharafi
2024-06-05 0:58 ` [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` Amjad Alsharafi
2024-06-10 16:49 ` Kevin Wolf
2024-06-11 12:31 ` Amjad Alsharafi
2024-06-11 14:30 ` Kevin Wolf [this message]
2024-06-11 16:22 ` Amjad Alsharafi
2024-06-11 18:11 ` Kevin Wolf
2024-06-05 0:58 ` [PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters Amjad Alsharafi
2024-06-05 0:58 ` [PATCH v4 4/4] iotests: Add `vvfat` tests Amjad Alsharafi
2024-06-10 12:01 ` Kevin Wolf
2024-06-10 14:11 ` Amjad Alsharafi
2024-06-10 16:50 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZmhfnSIe30da03uN@redhat.com \
--to=kwolf@redhat.com \
--cc=amjadsharafi10@gmail.com \
--cc=hreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.