From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1mQD1V-00033M-AG for mharc-grub-devel@gnu.org; Tue, 14 Sep 2021 14:19:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54912) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mQD1S-00030M-Pg for grub-devel@gnu.org; Tue, 14 Sep 2021 14:19:11 -0400 Received: from mail-ot1-x334.google.com ([2607:f8b0:4864:20::334]:43779) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1mQD1Q-0003zF-52 for grub-devel@gnu.org; Tue, 14 Sep 2021 14:19:10 -0400 Received: by mail-ot1-x334.google.com with SMTP id x10-20020a056830408a00b004f26cead745so19821773ott.10 for ; Tue, 14 Sep 2021 11:19:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficientek-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; bh=vpc9I2rj5/+z5H1hW5ZC1G5PESKPjiGC1Ooo3ShydoU=; b=IAR2+AsmhVH/TEyKcmMXP7DGNFIvFntAkhmGpc1X1v2qsA8urKHAvMi0uTlI6MTaOX UWaOnWwPza3pyV9bj+TYsJpOY+Rp3vlTgytkbQxbFZNA9mhhKG9UQy4uTlQs7z6p72Rf 2XQl5zQq7COCdbbMOtpd3WEq38WvUZw4EpxD+hRTZKGVXv7xUzQFKhb3ajl/UCD+lxzW D/o7TpKvpuqyjxY1EgRqQ3RPnK7XqJ8M9QAcj1ep7Co8k5s/+3Lvoxq06iTrd9ZHS1+V eiJflUHyDZVIMbdMZypiLWIwIRyUngXC+IDkoQz3FgAB6bDWRr6XpdQoE7TdIvWLvGk8 AMEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:reply-to:mime-version:content-transfer-encoding; bh=vpc9I2rj5/+z5H1hW5ZC1G5PESKPjiGC1Ooo3ShydoU=; b=yfY+cp73ohQRvncB/S0gY/CFoAQlotrDAtFLEkDzqZNRvOzd02+ikN2Cb9FUeh3nel 99ziJ3fA3qMFc8a7oA5KFG4iDXk/WJ09QveROJcnUMg1HwT55TacptqYcfF4+fc8OIKo T6QPyApQb7Tr5k5EMFc8dR3hCb1FGvmV1kOCF3NWkQbkBxuQKE15152JVgnjdl2EyV7E rTr05W6Zbl6MFUtSvPbwxCSltZEE10IGtfPfrTmISJrwYGW8Zdap28vY1pK/rq3VYL46 wbJkWzbav1Za067x15u904WDbxBD1GQmfnGvGy1a9d5GH5BGIiRIYzMIU+jDQCV+h6Ji cCBQ== X-Gm-Message-State: AOAM530OmOcr/Vcj3dBNQAP/W9PX23QoDrdD7lTtZMoTlyZFyLKbuh9Y XfPbXgSLWVI+PNONkozYlecT5Q== X-Google-Smtp-Source: ABdhPJyVjAnc0CL/aPVuyAFDvfeaC8I8tX0i3urDfzZlg+iR9EnI6iI9mOiRvtrnjhz/xbXrtgZuyg== X-Received: by 2002:a05:6830:438d:: with SMTP id s13mr16134944otv.308.1631643546513; Tue, 14 Sep 2021 11:19:06 -0700 (PDT) Received: from ubuntu ([2806:103e:1d:2421:bf4a:e8ea:b57:7e94]) by smtp.gmail.com with ESMTPSA id w10sm2512454oiv.14.2021.09.14.11.19.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 14 Sep 2021 11:19:06 -0700 (PDT) Date: Tue, 14 Sep 2021 18:19:03 +0000 From: Glenn Washburn To: Daniel Kiper Cc: grub-devel@gnu.org, Vladimir 'phcoder' Serbinenko , Peter Jones Subject: Re: [PATCH] udf: Fix regression which is preventing symlink access Message-ID: <20210914181903.79ee0ee2@ubuntu> In-Reply-To: <20210914142755.mraorb7nhh3tzbfr@tomti.i.net-space.pl> References: <20210910160323.1247372-1-development@efficientek.com> <20210914142755.mraorb7nhh3tzbfr@tomti.i.net-space.pl> Reply-To: development@efficientek.com X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Received-SPF: pass client-ip=2607:f8b0:4864:20::334; envelope-from=development@efficientek.com; helo=mail-ot1-x334.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: grub-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: The development of GNU GRUB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Sep 2021 18:19:11 -0000 On Tue, 14 Sep 2021 16:27:55 +0200 Daniel Kiper wrote: > On Fri, Sep 10, 2021 at 04:03:23PM +0000, Glenn Washburn wrote: > > This code was broken by commit 3f05d693 ("malloc: Use overflow > > checking primitives where we do complex allocations"), which added > > overflow checking in many areas. The problem here is that the > > changes update the local variable sz, which was already in use and > > which was not updated before the change. So the code using sz was > > getting a different value of than it would have previously for the > > same UDF image. This causes the logic getting the destination of > > the symlink to not realize that its gotten the full destination, > > but keeps trying to read past the end of the destination. The bytes > > after the end are generally NULL padding bytes, but that's not a > > valid component type (ECMA-167 14.16.1.1). So grub_udf_read_symlink > > branches to error logic, returning NULL, instead of the symlink > > destination path. > > > > The result of this bug is that the UDF filesystem tests were > > failing in the symlink test with the grub-fstest error message: > > > > grub-fstest: error: cannot open `(loop0)/sym': invalid symlink. > > > > This change stores the result of doubling sz in another local > > variable s, so as not to modify sz. Also remove unnecessary > > grub_add, which increased the output by 1 to account for a NULL > > byte. This isn't needed because an output buffer of size twice sz > > is already guaranteed to be more than enough to contain the path > > components converted to UTF-8. The worst case upper- bound for the > > needed output buffer size is (sz-4)*1.5, where 4 is the size > > I think 4 comes from ECMA-167 spec. Could you add a reference to it > here? The number of paragraph would be perfect... Its 14.16.1 basically in the same place as the reference earlier, which is why I didn't include it. But, yes, I can include it. > > of a path component header and 1.5 is the maximum growth in bytes > > when converting from 2-byte unicode code-points to UTF-8 (from 2 > > bytes to 3). > > Could you explain how did you come up with the 1.5 value? It would be > nice if you refer to a spec or something like that. There is no spec that I know of (but would be happy to know of one). Its something I've deduced based on my understanding of Unicode, UTF-8, and UTF-16. All unicode code points less than or equal to 2 bytes (code points <0x10000) can be represented in UTF-8 by a maximum of 3 bytes [1]. Longer UTF-16 encodings don't matter because those will be 4 bytes or longer. The maximum number of bytes for a UTF-8 encoding of a unicode code point is 4 bytes. So the longer UTF-16 encodings can only be equal to or longer than the UTF-8 encoding, thus the UTF-16 -> UTF-8 would be shrinking or maintaining the length of the original byte string. Since the worst case growth is 2 bytes to 3, that's 1.5 times the original string size. QED. Do you want all that in there? I could just remove that part about the 1.5 too. Here's an SO question that addresses this. Yes, unofficial, but I think adds some weight as to the correctness of the logic above. https://stackoverflow.com/questions/55056322/maximum-utf-8-string-size-given-utf-16-size Glenn [1] https://en.wikipedia.org/wiki/UTF-8#Encoding > > Daniel > > > Signed-off-by: Glenn Washburn > > --- > > grub-core/fs/udf.c | 17 ++++++++++++----- > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > diff --git a/grub-core/fs/udf.c b/grub-core/fs/udf.c > > index 2ac5c1d00..197e60d12 100644 > > --- a/grub-core/fs/udf.c > > +++ b/grub-core/fs/udf.c > > @@ -1022,7 +1022,7 @@ grub_udf_iterate_dir (grub_fshelp_node_t dir, > > static char * > > grub_udf_read_symlink (grub_fshelp_node_t node) > > { > > - grub_size_t sz = U64 (node->block.fe.file_size); > > + grub_size_t s, sz = U64 (node->block.fe.file_size); > > grub_uint8_t *raw; > > const grub_uint8_t *ptr; > > char *out = NULL, *optr; > > @@ -1035,11 +1035,19 @@ grub_udf_read_symlink (grub_fshelp_node_t > > node) if (grub_udf_read_file (node, NULL, NULL, 0, sz, (char *) > > raw) < 0) goto fail_1; > > > > - if (grub_mul (sz, 2, &sz) || > > - grub_add (sz, 1, &sz)) > > + /* > > + * Local sz is the size of the symlink file data, which contains > > a sequence > > + * of path components (ECMA-167 14.16.1) representing the link > > destination. > > + * This size is an upper-bound on the number of bytes of a > > contained and > > + * potentially compressed 2-byte character string. Allocate 2*sz > > for the > > + * output buffer contained the string converted to UTF-8 because > > the > > + * resulting string can not be more than double the size (2-byte > > unicode > > + * points will be converted to a maximum of 3 bytes in UTF-8). > > + */ > > + if (grub_mul (sz, 2, &s)) > > goto fail_0; > > > > - out = grub_malloc (sz); > > + out = grub_malloc (s); > > if (!out) > > { > > fail_0: > > @@ -1051,7 +1059,6 @@ grub_udf_read_symlink (grub_fshelp_node_t > > node) > > > > for (ptr = raw; ptr < raw + sz; ) > > { > > - grub_size_t s; > > if ((grub_size_t) (ptr - raw + 4) > sz) > > goto fail_1; > > if (!(ptr[2] == 0 && ptr[3] == 0)) > > -- > > 2.32.0