All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Anderson <sean.anderson@linux.dev>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Heinrich Schuchardt" <heinrich.schuchardt@canonical.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Luc Michel" <lmichel@kalray.eu>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 3/3] semihosting: Check for overflow in FLEN on 32-bit systems
Date: Mon, 20 Oct 2025 15:31:28 -0400	[thread overview]
Message-ID: <91710089-5b3a-4be8-9554-90e46b991011@linux.dev> (raw)
In-Reply-To: <CAFEAcA_-08zVV6U2jVhNbYAwkLYXdjOzmRP-ZutjAKPuiGQ-_w@mail.gmail.com>

On 10/20/25 12:33, Peter Maydell wrote:
> On Mon, 20 Oct 2025 at 16:01, Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> On 10/18/25 03:21, Heinrich Schuchardt wrote:
>> > On 10/17/25 23:35, Sean Anderson wrote:
>> >> When semihosting 32-bit systems, the return value of FLEN will be stored
>> >> in a 32-bit integer. To prevent wraparound, return -1 and set EOVERFLOW.
>> >> This matches the behavior of stat(2). Static files don't need to be
>> >> checked, since are always small.
>> >>
>> >> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> >> ---
>> >>
>> >>   semihosting/arm-compat-semi.c | 17 ++++++++++++++---
>> >>   1 file changed, 14 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/semihosting/arm-compat-semi.c b/semihosting/arm-compat-semi.c
>> >> index c5a07cb947..57453ca6be 100644
>> >> --- a/semihosting/arm-compat-semi.c
>> >> +++ b/semihosting/arm-compat-semi.c
>> >> @@ -305,8 +305,19 @@ static uint64_t common_semi_flen_buf(CPUState *cs)
>> >>       return sp - 64;
>> >>   }
>> >>   +static void common_semi_flen_cb(CPUState *cs, uint64_t ret, int err)
>> >> +{
>> >> +    CPUArchState *env = cpu_env(cs);
>> >> +
>> >> +    if (!err && !is_64bit_semihosting(env) && ret > INT32_MAX) {
>> >
>> >
>> > The issue with the current implementation is that files with file sizes over 4 GiB will be reported as file size < 4 -GiB on 32bit systems. Thanks for addressing this.
>> >
>> > But unfortunately with your change you are additionally dropping support for file sizes 2 GiB to 4 GiB on 32bit devices. This should be avoided.
>> >
>> > The semihosting specification specifies that the value returned in r0 should be -1 if an error occurs. So on 32 bit systems 0xffffffff should be returned.
>> >
>> > As file sizes cannot be negative there is not reason to assume that the value in r0 has to be interpreted by semihosting clients as signed.
>> >
>> > Please, change your commit to check against 0xffffffff.
>> >
>> > It might make sense to contact ARM to make their specification clearer.
>>
>> stat(2) will return -1/EOVERFLOW on 32-bit hosts for files over 2 GiB. I believe we should be consistent.
> 
> I can see both sides of this one -- the semihosting spec
> is pretty old and was designed (to the extent it was designed)
> back in an era when 2GB of RAM or a 2GB file was pretty
> implausible sounding. (And today there's not much appetite
> for making updates to it for AArch32, because 32-bit is
> the past, not the future, and in any case you have to deal
> with all the existing implementations of the spec so
> changes are super painful to try to promulgate.)
> 
> Our QEMU implementation of SYS_ISERROR() says "anything that's
> a negative 32-bit integer is an error number" for 32-bit hosts,
> which I suppose you might count on the side of "check
> against INT32_MAX".
> 
> I think overall that if we think that anybody is or might be
> using semihosting with files in that 2..4GB size, we should
> err on the side of preserving that functionality. Otherwise
> somebody will report it as a bug and we'll want to fix it
> as a regression. And it doesn't seem impossible that somebody
> out there is doing so.
> 
> If we report 2..4GB file sizes as if the file size value
> was an unsigned integer, then clients that expect that will
> work, and clients that treat any negative 32-bit value as
> an error will also work in the sense that they'll handle it
> as an error in the same way they would have done for -1.
> So that is the safest approach from a back-compat point
> of view, and I think that's what I lean towards doing.

Actually, some existing targets don't handle "negative" file sizes
properly. In particular, newlib generally treats the result as an int or
an off_t, which is a long (except on cygwin). Both of those types are
32 bits or smaller on (I)LP32. So if you return a 3 GiB size it will be
treated as -1 GiB. This will break lseek with SEEK_END, since newlib
uses the signed result of flen to recompute a new absolute offset.

IMO the spec is unclear, and this is reflected in the varying semantics
of differing implementations. I think returning any negative number
other than -1 is just inviting bugs.

--Sean


  reply	other threads:[~2025-10-20 19:32 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-17 21:35 [PATCH 0/3] semihosting: Fix a few semihosting bugs Sean Anderson
2025-10-17 21:35 ` [PATCH 1/3] gdbstub: Fix %s formatting Sean Anderson
2025-10-18  0:07   ` Richard Henderson
2025-10-20 15:05   ` Alex Bennée
2025-10-17 21:35 ` [PATCH 2/3] semihosting: Fix GDB File-I/O FLEN Sean Anderson
2025-10-20 16:25   ` Alex Bennée
2025-10-17 21:35 ` [PATCH 3/3] semihosting: Check for overflow in FLEN on 32-bit systems Sean Anderson
2025-10-18  7:21   ` Heinrich Schuchardt
2025-10-20 14:21     ` Sean Anderson
2025-10-20 15:33       ` Heinrich Schuchardt
2025-10-20 15:39         ` Sean Anderson
2025-10-20 16:33       ` Peter Maydell
2025-10-20 19:31         ` Sean Anderson [this message]
2025-10-20 15:03 ` [PATCH 0/3] semihosting: Fix a few semihosting bugs Alex Bennée
2025-10-20 15:06   ` Sean Anderson
2025-10-27 10:54 ` Alex Bennée
2025-10-31 10:31 ` Michael Tokarev
2025-10-31 11:44   ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91710089-5b3a-4be8-9554-90e46b991011@linux.dev \
    --to=sean.anderson@linux.dev \
    --cc=alex.bennee@linaro.org \
    --cc=heinrich.schuchardt@canonical.com \
    --cc=lmichel@kalray.eu \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.