From: Sean Anderson <sean.anderson@linux.dev>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Heinrich Schuchardt" <heinrich.schuchardt@canonical.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Luc Michel" <lmichel@kalray.eu>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
qemu-devel@nongnu.org
Subject: Re: [PATCH 3/3] semihosting: Check for overflow in FLEN on 32-bit systems
Date: Mon, 20 Oct 2025 15:31:28 -0400 [thread overview]
Message-ID: <91710089-5b3a-4be8-9554-90e46b991011@linux.dev> (raw)
In-Reply-To: <CAFEAcA_-08zVV6U2jVhNbYAwkLYXdjOzmRP-ZutjAKPuiGQ-_w@mail.gmail.com>
On 10/20/25 12:33, Peter Maydell wrote:
> On Mon, 20 Oct 2025 at 16:01, Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> On 10/18/25 03:21, Heinrich Schuchardt wrote:
>> > On 10/17/25 23:35, Sean Anderson wrote:
>> >> When semihosting 32-bit systems, the return value of FLEN will be stored
>> >> in a 32-bit integer. To prevent wraparound, return -1 and set EOVERFLOW.
>> >> This matches the behavior of stat(2). Static files don't need to be
>> >> checked, since are always small.
>> >>
>> >> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> >> ---
>> >>
>> >> semihosting/arm-compat-semi.c | 17 ++++++++++++++---
>> >> 1 file changed, 14 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/semihosting/arm-compat-semi.c b/semihosting/arm-compat-semi.c
>> >> index c5a07cb947..57453ca6be 100644
>> >> --- a/semihosting/arm-compat-semi.c
>> >> +++ b/semihosting/arm-compat-semi.c
>> >> @@ -305,8 +305,19 @@ static uint64_t common_semi_flen_buf(CPUState *cs)
>> >> return sp - 64;
>> >> }
>> >> +static void common_semi_flen_cb(CPUState *cs, uint64_t ret, int err)
>> >> +{
>> >> + CPUArchState *env = cpu_env(cs);
>> >> +
>> >> + if (!err && !is_64bit_semihosting(env) && ret > INT32_MAX) {
>> >
>> >
>> > The issue with the current implementation is that files with file sizes over 4 GiB will be reported as file size < 4 -GiB on 32bit systems. Thanks for addressing this.
>> >
>> > But unfortunately with your change you are additionally dropping support for file sizes 2 GiB to 4 GiB on 32bit devices. This should be avoided.
>> >
>> > The semihosting specification specifies that the value returned in r0 should be -1 if an error occurs. So on 32 bit systems 0xffffffff should be returned.
>> >
>> > As file sizes cannot be negative there is not reason to assume that the value in r0 has to be interpreted by semihosting clients as signed.
>> >
>> > Please, change your commit to check against 0xffffffff.
>> >
>> > It might make sense to contact ARM to make their specification clearer.
>>
>> stat(2) will return -1/EOVERFLOW on 32-bit hosts for files over 2 GiB. I believe we should be consistent.
>
> I can see both sides of this one -- the semihosting spec
> is pretty old and was designed (to the extent it was designed)
> back in an era when 2GB of RAM or a 2GB file was pretty
> implausible sounding. (And today there's not much appetite
> for making updates to it for AArch32, because 32-bit is
> the past, not the future, and in any case you have to deal
> with all the existing implementations of the spec so
> changes are super painful to try to promulgate.)
>
> Our QEMU implementation of SYS_ISERROR() says "anything that's
> a negative 32-bit integer is an error number" for 32-bit hosts,
> which I suppose you might count on the side of "check
> against INT32_MAX".
>
> I think overall that if we think that anybody is or might be
> using semihosting with files in that 2..4GB size, we should
> err on the side of preserving that functionality. Otherwise
> somebody will report it as a bug and we'll want to fix it
> as a regression. And it doesn't seem impossible that somebody
> out there is doing so.
>
> If we report 2..4GB file sizes as if the file size value
> was an unsigned integer, then clients that expect that will
> work, and clients that treat any negative 32-bit value as
> an error will also work in the sense that they'll handle it
> as an error in the same way they would have done for -1.
> So that is the safest approach from a back-compat point
> of view, and I think that's what I lean towards doing.
Actually, some existing targets don't handle "negative" file sizes
properly. In particular, newlib generally treats the result as an int or
an off_t, which is a long (except on cygwin). Both of those types are
32 bits or smaller on (I)LP32. So if you return a 3 GiB size it will be
treated as -1 GiB. This will break lseek with SEEK_END, since newlib
uses the signed result of flen to recompute a new absolute offset.
IMO the spec is unclear, and this is reflected in the varying semantics
of differing implementations. I think returning any negative number
other than -1 is just inviting bugs.
--Sean
next prev parent reply other threads:[~2025-10-20 19:32 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-17 21:35 [PATCH 0/3] semihosting: Fix a few semihosting bugs Sean Anderson
2025-10-17 21:35 ` [PATCH 1/3] gdbstub: Fix %s formatting Sean Anderson
2025-10-18 0:07 ` Richard Henderson
2025-10-20 15:05 ` Alex Bennée
2025-10-17 21:35 ` [PATCH 2/3] semihosting: Fix GDB File-I/O FLEN Sean Anderson
2025-10-20 16:25 ` Alex Bennée
2025-10-17 21:35 ` [PATCH 3/3] semihosting: Check for overflow in FLEN on 32-bit systems Sean Anderson
2025-10-18 7:21 ` Heinrich Schuchardt
2025-10-20 14:21 ` Sean Anderson
2025-10-20 15:33 ` Heinrich Schuchardt
2025-10-20 15:39 ` Sean Anderson
2025-10-20 16:33 ` Peter Maydell
2025-10-20 19:31 ` Sean Anderson [this message]
2025-10-20 15:03 ` [PATCH 0/3] semihosting: Fix a few semihosting bugs Alex Bennée
2025-10-20 15:06 ` Sean Anderson
2025-10-27 10:54 ` Alex Bennée
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=91710089-5b3a-4be8-9554-90e46b991011@linux.dev \
--to=sean.anderson@linux.dev \
--cc=alex.bennee@linaro.org \
--cc=heinrich.schuchardt@canonical.com \
--cc=lmichel@kalray.eu \
--cc=peter.maydell@linaro.org \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).