From: Glenn Washburn <development@efficientek.com>
To: Daniel Kiper <dkiper@net-space.pl>
Cc: grub-devel@gnu.org,
Vladimir 'phcoder' Serbinenko <phcoder@gmail.com>,
Peter Jones <pjones@redhat.com>
Subject: Re: [PATCH] udf: Fix regression which is preventing symlink access
Date: Tue, 14 Sep 2021 18:19:03 +0000 [thread overview]
Message-ID: <20210914181903.79ee0ee2@ubuntu> (raw)
In-Reply-To: <20210914142755.mraorb7nhh3tzbfr@tomti.i.net-space.pl>
On Tue, 14 Sep 2021 16:27:55 +0200
Daniel Kiper <dkiper@net-space.pl> wrote:
> On Fri, Sep 10, 2021 at 04:03:23PM +0000, Glenn Washburn wrote:
> > This code was broken by commit 3f05d693 ("malloc: Use overflow
> > checking primitives where we do complex allocations"), which added
> > overflow checking in many areas. The problem here is that the
> > changes update the local variable sz, which was already in use and
> > which was not updated before the change. So the code using sz was
> > getting a different value of than it would have previously for the
> > same UDF image. This causes the logic getting the destination of
> > the symlink to not realize that its gotten the full destination,
> > but keeps trying to read past the end of the destination. The bytes
> > after the end are generally NULL padding bytes, but that's not a
> > valid component type (ECMA-167 14.16.1.1). So grub_udf_read_symlink
> > branches to error logic, returning NULL, instead of the symlink
> > destination path.
> >
> > The result of this bug is that the UDF filesystem tests were
> > failing in the symlink test with the grub-fstest error message:
> >
> > grub-fstest: error: cannot open `(loop0)/sym': invalid symlink.
> >
> > This change stores the result of doubling sz in another local
> > variable s, so as not to modify sz. Also remove unnecessary
> > grub_add, which increased the output by 1 to account for a NULL
> > byte. This isn't needed because an output buffer of size twice sz
> > is already guaranteed to be more than enough to contain the path
> > components converted to UTF-8. The worst case upper- bound for the
> > needed output buffer size is (sz-4)*1.5, where 4 is the size
>
> I think 4 comes from ECMA-167 spec. Could you add a reference to it
> here? The number of paragraph would be perfect...
Its 14.16.1 basically in the same place as the reference earlier, which
is why I didn't include it. But, yes, I can include it.
> > of a path component header and 1.5 is the maximum growth in bytes
> > when converting from 2-byte unicode code-points to UTF-8 (from 2
> > bytes to 3).
>
> Could you explain how did you come up with the 1.5 value? It would be
> nice if you refer to a spec or something like that.
There is no spec that I know of (but would be happy to know of one). Its
something I've deduced based on my understanding of Unicode, UTF-8, and
UTF-16. All unicode code points less than or equal to 2 bytes (code
points <0x10000) can be represented in UTF-8 by a maximum of 3 bytes
[1]. Longer UTF-16 encodings don't matter because those will be 4 bytes
or longer. The maximum number of bytes for a UTF-8 encoding of a unicode
code point is 4 bytes. So the longer UTF-16 encodings can only be equal
to or longer than the UTF-8 encoding, thus the UTF-16 -> UTF-8 would be
shrinking or maintaining the length of the original byte string. Since
the worst case growth is 2 bytes to 3, that's 1.5 times the original
string size. QED.
Do you want all that in there? I could just remove that part about the
1.5 too.
Here's an SO question that addresses this. Yes, unofficial, but I think
adds some weight as to the correctness of the logic above.
https://stackoverflow.com/questions/55056322/maximum-utf-8-string-size-given-utf-16-size
Glenn
[1] https://en.wikipedia.org/wiki/UTF-8#Encoding
>
> Daniel
>
> > Signed-off-by: Glenn Washburn <development@efficientek.com>
> > ---
> > grub-core/fs/udf.c | 17 ++++++++++++-----
> > 1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/grub-core/fs/udf.c b/grub-core/fs/udf.c
> > index 2ac5c1d00..197e60d12 100644
> > --- a/grub-core/fs/udf.c
> > +++ b/grub-core/fs/udf.c
> > @@ -1022,7 +1022,7 @@ grub_udf_iterate_dir (grub_fshelp_node_t dir,
> > static char *
> > grub_udf_read_symlink (grub_fshelp_node_t node)
> > {
> > - grub_size_t sz = U64 (node->block.fe.file_size);
> > + grub_size_t s, sz = U64 (node->block.fe.file_size);
> > grub_uint8_t *raw;
> > const grub_uint8_t *ptr;
> > char *out = NULL, *optr;
> > @@ -1035,11 +1035,19 @@ grub_udf_read_symlink (grub_fshelp_node_t
> > node) if (grub_udf_read_file (node, NULL, NULL, 0, sz, (char *)
> > raw) < 0) goto fail_1;
> >
> > - if (grub_mul (sz, 2, &sz) ||
> > - grub_add (sz, 1, &sz))
> > + /*
> > + * Local sz is the size of the symlink file data, which contains
> > a sequence
> > + * of path components (ECMA-167 14.16.1) representing the link
> > destination.
> > + * This size is an upper-bound on the number of bytes of a
> > contained and
> > + * potentially compressed 2-byte character string. Allocate 2*sz
> > for the
> > + * output buffer contained the string converted to UTF-8 because
> > the
> > + * resulting string can not be more than double the size (2-byte
> > unicode
> > + * points will be converted to a maximum of 3 bytes in UTF-8).
> > + */
> > + if (grub_mul (sz, 2, &s))
> > goto fail_0;
> >
> > - out = grub_malloc (sz);
> > + out = grub_malloc (s);
> > if (!out)
> > {
> > fail_0:
> > @@ -1051,7 +1059,6 @@ grub_udf_read_symlink (grub_fshelp_node_t
> > node)
> >
> > for (ptr = raw; ptr < raw + sz; )
> > {
> > - grub_size_t s;
> > if ((grub_size_t) (ptr - raw + 4) > sz)
> > goto fail_1;
> > if (!(ptr[2] == 0 && ptr[3] == 0))
> > --
> > 2.32.0
next prev parent reply other threads:[~2021-09-14 18:19 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-10 16:03 [PATCH] udf: Fix regression which is preventing symlink access Glenn Washburn
2021-09-14 14:27 ` Daniel Kiper
2021-09-14 18:19 ` Glenn Washburn [this message]
2021-09-15 14:52 ` Daniel Kiper
2021-09-15 22:52 ` Glenn Washburn
2021-09-16 21:34 ` Daniel Kiper
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210914181903.79ee0ee2@ubuntu \
--to=development@efficientek.com \
--cc=dkiper@net-space.pl \
--cc=grub-devel@gnu.org \
--cc=phcoder@gmail.com \
--cc=pjones@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.