From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: qemu-devel@nongnu.org
Cc: Greg Kurz <groug@kaod.org>,
Christian Schoenebeck <qemu_oss@crudebyte.com>
Subject: Re: [PATCH v3 02/11] 9pfs: require msize >= 4096
Date: Fri, 17 Jan 2020 13:01:30 +0100 [thread overview]
Message-ID: <5046978.lqI2Hv0RVs@silver> (raw)
In-Reply-To: <20200117112421.4311c21f@bahia.lan>
On Freitag, 17. Januar 2020 11:24:21 CET Greg Kurz wrote:
> On Thu, 16 Jan 2020 22:39:19 +0100
>
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Donnerstag, 16. Januar 2020 19:07:48 CET Greg Kurz wrote:
> > > > The point here was that there must be some kind of absolute minimum
> > > > msize,
> > >
> > > Then the absolute minimum size is 7, the size of the header part that is
> > > common to all messages:
> > >
> > > size[4] Message tag[2]
> > >
> > > > even though the protocol specs officially don't mandate one. And if a
> > > > client choses a msize < P9_IOHDRSZ, what useful actions shall it be
> > > > able
> > > > to do?
> > >
> > > I haven't checked but I guess some messages can fit in 24 bytes.
> > >
> > > > That's why I mentioned P9_IOHDRSZ just in case somebody might come up
> > > > later
> > > > asking to lower that minimum msize value for whatever reason.
> > >
> > > That's precisely what I want to avoid. The semantic of P9_IOHDRSZ is
> > > that it is the header size of a Twrite message and as such it is used
> > > to compute the 'iounit' which the guest later uses to compute a
> > > suitable 'count' for Tread/Twrite messages only. It shouldn't be
> > > involved in msize considerations IMHO.
> > >
> > > > But np, IMO this sentence makes the fundamental requirement for this
> > > > patch
> > > > more clear, but I have no problem dropping this sentence.
> > >
> > > Then you can mention 7 has the true minimal size.
> >
> > Ok, I'll drop P9_IOHDRSZ completely and mention literal 7 as absolute
> > theoretical minimum instead.
> >
> > > > > > Furthermore there are some responses which are not allowed by the
> > > > > > 9p
> > > > > > protocol to be truncated like e.g. Rreadlink which may yield up to
> > > > >
> > > > > No message may be truncated in any way actually. The spec just
> > > > > allows
> > > > > an exception with the string part of Rerror.
> > > >
> > > > Rreaddir, Rread, Twrite, Rerror can be truncated. So I suggest I'll
> > > > just
> > > > change that to s/some/most/ semantically.
> > >
> > > I mean truncate with loss of data. Rreaddir can return less than 'count'
> > > but it won't truncate an individual entry. It rewinds to the previous
> > > one and returns its offset for the next Treaddir. Same goes with Rread
> > > and Twrite, we always return a 'count' that can be used by the client
> > > to continue reading or writing.
> >
> > Individual entries are not truncated, correct, but data loss: Example, you
> > have this directory and say server's fs would deliver entries by readdir()
> > in>
> > this order (from top down):
> > .
> > ..
> > a
> > 1234567890
> > b
> > c
> > d
> >
> > and say 37 >= msize < 45, then client would receive 3 entries ".", "..",
> > "a" and that's it.
>
> Yes msize < 45 isn't enough to send the Rreaddir for "1234567890"
> and since it is forbidden to truncate, we bail out... but we
> did not send a truncated message still.
I really think we're discussing semantical interpretation details here which
bring us nowhere.
> > > Rerror is really the only message where we can deliberately drop
> > > data (but we actually don't do that).
> > >
> > > > > Maybe just mention that and say we choose 4096 to be able to send
> > > > > big Rreadlink messages.
> > > >
> > > > That's not the point for this patch. The main purpose is to avoid
> > > > having
> > > > to
> > > > maintain individual error prone minimum msize checks all over the code
> > > > base. If we would allow very small msizes (much smaller than 4096),
> > > > then
> > > > we would need to add numerous error handlers all over the code base,
> > > > each
> > > > one checking for different values (depending on the specific message
> > > > type).
> > >
> > > I'm not sure what you mean by 'minimum msize checks all over the code
> > > base'... Please provide an example.
> >
> > 1. Like done in this patch series: Treaddir has to limit client's 'count'
> >
> > parameter according to msize (by subtracting with msize).
>
> Hmm... this patch does a sanity check on 'count', not on 'msize'...
Yes ... :)
> I mean no matter what msize is, clipping count to msize - 11 gives a
> chance to stop processing the entries before overflowing the transport
> buffer.
... and no, this cannot happen if minimum msize of 4096 is forced already by
Tversion. Maybe you now get my point -> It is about avoiding exactly such kind
of issues in the first place. Most file systems have a name limit of 255
bytes:
https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
So by forcing a minimum 'msize' of 4096 you avoid having to deal with this
issue (and similar ones) on Treaddir request level (and other request type
handlers), including ReiserFS BTW because 4032+35 < 4096.
If you would allow smaller 'msize' values by Tversion, then you would need to
suffer some kind of death when handling Treaddir with certain high file name
length. Either a transport error (with an error message that a normal user
would not be able to understand at all) or by returning an incomplete Treaddir
response sequence with { Rreaddir count=0 }, or ... any other kind of death.
No matter which death you would choose here, it would be horrible from
usability perspective, because the system would most of the time work
flawlessy, and an error case would just be triggered if guest hits a file/dir
beyond some certain name length. It is not worth it! Force 4kiB already at
Tversion and that's it.
> > 2. get_iounit() does this:
> > iounit = stbuf.f_bsize;
> > iounit *= (s->msize - P9_IOHDRSZ) / stbuf.f_bsize;
> >
> > without checking anywhere for a potential negative outcome
> > (i.e. when msize < P9_IOHDRSZ)
>
> This function should even have an assert() for that, just to be
> sure no one tries to call it before s->msize is even set, which
> would clearly be a bug in QEMU. But this can be done in a
> follow-up patch.
>
> > 3. Example of directory entry name length with Rreaddir above.
>
> msize being too small to accommodate an Rreaddir with a single
> entry is a different problem as we cannot do anything about it
> at this point but fail... That's why the minimum msize should
> rather be chosen with file names in mind, which are likely to
> be longer than any message header. Rreadlink being the one with
> the higher potential since it will usually return a string
> containing a path name (while Rreaddir entries only contain
> a single path element).
>
> > Individual issues that can easily be overlooked but also easily avoided by
> > not allowing small msizes in the first place.
>
> My point is that we're not going to check msize in Tversion in
> order to to avoid multiple checks everywhere. We're going to do
> it there because it is the only place where it makes sense to
> do it.
Also yes and no. Of course it just makes sense to handle it already at
Tversion. But no, you could theoretically also allow much smaller minimum
'msize' value << 4096 (i.e. somewhere closely >7 as we discussed), then you
would indeed still need to add msize checks at other places of the code base
as you just found out now. So forcing a minimum 'msize' which is high enough,
avoids having to add such individual checks and having to deal with them in
some kind of unpleasant way.
I hope this makes it more clear now.
Best regards,
Christian Schoenebeck
next prev parent reply other threads:[~2020-01-17 12:02 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-13 23:18 [PATCH v3 00/11] 9pfs: readdir optimization Christian Schoenebeck
2020-01-13 22:20 ` [PATCH v3 01/11] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
2020-01-13 22:21 ` [PATCH v3 02/11] 9pfs: require msize >= 4096 Christian Schoenebeck
2020-01-16 13:15 ` Greg Kurz
2020-01-16 16:16 ` Christian Schoenebeck
2020-01-16 18:07 ` Greg Kurz
2020-01-16 21:39 ` Christian Schoenebeck
2020-01-17 10:24 ` Greg Kurz
2020-01-17 12:01 ` Christian Schoenebeck [this message]
2020-01-17 15:15 ` Greg Kurz
2020-01-17 16:41 ` Christian Schoenebeck
2020-01-13 22:22 ` [PATCH v3 03/11] 9pfs: validate count sent by client with T_readdir Christian Schoenebeck
2020-01-16 13:33 ` Greg Kurz
2020-01-16 16:51 ` Christian Schoenebeck
2020-01-17 15:50 ` Greg Kurz
2020-01-13 22:23 ` [PATCH v3 04/11] hw/9pfs/9p-synth: added directory for readdir test Christian Schoenebeck
2020-01-13 23:08 ` [PATCH v3 05/11] tests/virtio-9p: added " Christian Schoenebeck
2020-01-17 15:51 ` Greg Kurz
2020-01-17 16:44 ` Christian Schoenebeck
2020-01-13 23:11 ` [PATCH v3 06/11] tests/virtio-9p: added splitted " Christian Schoenebeck
2020-01-13 23:13 ` [PATCH v3 07/11] tests/virtio-9p: failing " Christian Schoenebeck
2020-01-13 23:16 ` [PATCH v3 08/11] 9pfs: readdir benchmark Christian Schoenebeck
2020-01-13 23:16 ` [PATCH v3 09/11] hw/9pfs/9p-synth: avoid n-square issue in synth_readdir() Christian Schoenebeck
2020-01-13 23:17 ` [PATCH v3 10/11] 9pfs: T_readdir latency optimization Christian Schoenebeck
2020-01-13 23:17 ` [PATCH v3 11/11] hw/9pfs/9p.c: benchmark time on T_readdir request Christian Schoenebeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5046978.lqI2Hv0RVs@silver \
--to=qemu_oss@crudebyte.com \
--cc=groug@kaod.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).