From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: Greg Kurz <groug@kaod.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "Meng, Bin" <Bin.Meng@windriver.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Shi, Guohuai" <Guohuai.Shi@windriver.com>
Subject: Re: [PATCH v4 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
Date: Tue, 07 Feb 2023 11:11:31 +0100
Message-ID: <4342819.egquFENVeL@silver>
In-Reply-To: <MN2PR11MB4173960A520976B15946CCEBEFDA9@MN2PR11MB4173.namprd11.prod.outlook.com>

On Monday, February 6, 2023 6:37:16 AM CET Shi, Guohuai wrote:
[...]
> > I know, it's an n-square performance issue, and that is what I already wrote
> > in the summary of the linked original suggestion [1] in v3 before, quote:
> > 
> >   + Relatively straightforward to implement.
> > 
> >   + No (major) changes in the 9pfs code base required.
> > 
> >   - Still an n-square performance issue (negligible for landing Windows
> >     host support, IMO).
> > 
> >   o Consistency assured in "most" cases, except one: if hardlinks are
> >     inserted in between then it might fail
> 
> readdir() on a Linux host may also return deleted entries.
> And the POSIX specification does not mention this consistency issue.

POSIX explicitly specifies that (1) newly added and (2) deleted entries may or
may not appear in the result, and leaves that implementation specific. That
was never our concern.

And yes, POSIX does not explicitly discuss consistency for entries that have
neither been added nor removed, but this expectation is implied. In practice
duplicate entries are probably less of an issue; a client might be able to
handle them without misbehaving (I haven't checked this at all yet). But if
the implementation leaves a chance that some entries may *never* appear to
clients at all, not even after refreshing periodically, then how could you
work with a file system like that?

> The NTFS file ID is the $MFT index ID. It stays unique until the file is
> deleted, but the index ID may be reused after deleting and re-creating many
> files.
> 
> Caching file IDs instead of names makes consistency better, but may not
> cover all cases, because reading a directory is not an atomic operation.

I don't see an issue with that: these are entries that were either added or
removed in between, and we don't care about them. Their file IDs would not
affect fetching the other directory entries that have not been touched in
between.

And we are also not questioning atomicity here, but consistency.
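
For reference, reopening an entry by its NTFS file ID with the Win32 API looks
roughly like this (a minimal sketch, not code from this series; it assumes the
caller already has some handle open on the same volume to pass as the hint,
and error handling is trimmed):

    #define _WIN32_WINNT 0x0602   /* FILE_ID_INFO requires Windows 8+ */
    #include <windows.h>

    /* Fetch the 128-bit file ID for an open handle, then reopen by ID.
     * On NTFS the ID is derived from the $MFT index, which is why it can
     * be reused after delete + re-create, as noted above. */
    static int reopen_by_id(HANDLE hFile, HANDLE hVolumeHint)
    {
        FILE_ID_INFO info;
        FILE_ID_DESCRIPTOR fid;   /* the 24-byte structure mentioned below */
        HANDLE h;

        if (!GetFileInformationByHandleEx(hFile, FileIdInfo,
                                          &info, sizeof(info))) {
            return -1;
        }

        fid.dwSize = sizeof(fid);
        fid.Type = ExtendedFileIdType;
        fid.ExtendedFileId = info.FileId;

        h = OpenFileById(hVolumeHint, &fid, GENERIC_READ,
                         FILE_SHARE_READ | FILE_SHARE_WRITE |
                         FILE_SHARE_DELETE, NULL,
                         FILE_FLAG_BACKUP_SEMANTICS);
        if (h == INVALID_HANDLE_VALUE) {
            return -1;            /* e.g. the entry was deleted meanwhile */
        }
        CloseHandle(h);
        return 0;
    }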

> > [1] https://lore.kernel.org/qemu-devel/2468168.SvRIHAoRfs@silver/
> > 
> > The idea was to use that just as a starting point to land Windows host support
> > ASAP, slower on large dirs compared to other solutions, yes, but with
> > guaranteed correct and deterministic behaviour. And then, in the long run,
> > we would of course replace it with a more performant solution.
> > 
> > I mean, this is really simple to implement, so I would at least test it. If it really
> > runs horribly slow we could still discuss faster solutions, which are however
> > all much more tricky.
> > 
> 
> I did a basic test on a Windows host; here is the code (it needs <stdio.h>,
> <time.h> and the wrapper declarations):
> 
>     clock_t st, ed;
>     int totals = 0;
>     DIR *pDir;
>     struct dirent *pEnt;
> 
>     st = clock();
>     pDir = opendir_win32(TEST_DIR);
>     if (pDir == NULL)
>         return -1;
> 
>     /* count all directory entries via the win32 readdir wrapper */
>     while ((pEnt = readdir_win32(pDir)) != NULL) {
>         totals++;
>     }
>     closedir_win32(pDir);
>     ed = clock();
> 
>     /* clock_t may be wider than int, so cast for printing */
>     printf("total = %d clocks = %ld %ld\n",
>            totals, (long)(ed - st), (long)CLOCKS_PER_SEC);
> 
> My local storage is an SSD.
> 
> I ran this test with 100, 1000 and 10000 entries (times in ms):
> 
>     entries:           100   1000   10000
>     name cache:          2      9      44
>     file ID cache:       3    438    4338
> 
> I already used OpenFileById() instead of CreateFile() to make the file ID
> solution faster; with CreateFile() it needs more than 80 seconds.
> 
> The performance does not look good. And it would actually be even worse in
> 9pfs, because with the current design 9pfs may seek forward and back several
> times while reading a directory.
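
To illustrate that last point: 9pfs serves Treaddir requests in chunks,
seeking to the offset of the previous reply before reading the next batch. A
simplified sketch of that access pattern (not the actual QEMU code, names made
up for illustration):

    #include <dirent.h>
    #include <stddef.h>

    /* Each chunk starts with a seekdir() to restore the position from the
     * previous Treaddir reply.  If the Windows emulation of seekdir() has
     * to replay the directory from the beginning, every chunk costs O(n),
     * so reading the whole directory degrades to O(n^2). */
    static void read_dir_in_chunks(DIR *dir, size_t chunk)
    {
        long off = telldir(dir);
        struct dirent *e;
        size_t i;

        for (;;) {
            seekdir(dir, off);          /* O(n) replay under the emulation */
            for (i = 0; i < chunk; i++) {
                e = readdir(dir);
                if (e == NULL) {
                    return;             /* end of directory */
                }
                /* ... marshal the entry into the Rreaddir reply ... */
            }
            off = telldir(dir);         /* remember position for next chunk */
        }
    }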

Poor performance, yes, probably a bit worse than I would have expected.

So it is about choosing your poison (potential crash vs. poor performance).

I mean, I am not keen on suggesting any kind of bikeshedding to you on this
issue, but if this gets merged, then people expect it to behave reliably and
not allow a guest to crash the QEMU host process simply by creating a large
number of directory entries on the guest.

I was also considering making it a QEMU option, but OTOH this is a temporary
situation, and such an option would be wiped again once we have an appropriate
solution a bit later.

I am open to suggestions. Could we perhaps just mark Windows host support as
experimental for now? Is that even allowed by QEMU policies?

> > > So I think storing all name entries would be better than storing all file IDs.
> > 
> > As already discussed, NTFS allows up to (2^32 - 1) = 4,294,967,295 entries
> > per directory. So caching even one directory entirely in RAM can already
> > exceed the available RAM, which would crash QEMU. Multiply that by the
> > expected number of concurrent directory lookups by clients and we get into
> > much higher categories, even with much smaller individual directory sizes.
> > 
> 
> The Windows file ID structure is 24 bytes, which is not a small structure
> either.
> If you think the performance is acceptable, I can rework this commit based
> on file IDs.
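
True, the file ID structure is not small either. As a back-of-the-envelope
check: if all entries of a completely populated directory were cached, 24
bytes x 2^32 entries is already about 96 GiB of file IDs, while names of up
to 255 UTF-16 code units (~510 bytes each) could reach about 2 TiB. Caching
an entire directory does not scale in either representation; at best the
constant factor differs.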




