public inbox for live-patching@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end
       [not found]                 ` <YVRRWzXqhMIpwelm@hirez.programming.kicks-ass.net>
@ 2021-09-30 19:26                   ` Josh Poimboeuf
  2021-10-01 12:27                     ` Mark Rutland
  0 siblings, 1 reply; 3+ messages in thread
From: Josh Poimboeuf @ 2021-09-30 19:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Dmitry Vyukov, syzbot, Linux ARM, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro, will, x86, live-patching,
	Thomas Gleixner

On Wed, Sep 29, 2021 at 01:43:23PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 29, 2021 at 11:37:30AM +0100, Mark Rutland wrote:
> 
> > > This is because _ASM_EXTABLE only generates data for another section.
> > > There doesn't need to be code continuity between these two asm
> > > statements.
> > 
> > I think you've missed my point. It doesn't matter that the
> > asm_volatile_goto() doesn't contain code, and this is solely about the
> > *state* expected at entry/exit from each asm block being different.
> 
> Urgh.. indeed :/

So much for that idea :-/

To fix the issue of the wrong .fixup code symbol names getting printed,
we could (as Mark suggested) add a '__fixup_text_start' symbol at the
start of the .fixup section.  And then remove all other symbols in the
.fixup section.

For x86, that means removing the kvm_fastop_exception symbol and a few
others.  That way it's all anonymous code, displayed by the kernel as
"__fixup_text_start+0x1234".  Which isn't all that useful, but still
better than printing the wrong symbol.

But there's still a bigger problem: the function with the faulting
instruction doesn't get reported in the stack trace.

For example, in the up-thread bug report, __d_lookup() bug report
doesn't get printed, even though its anonymous .fixup code is running in
the context of the function and will be branching back to it shortly.

Even worse, this means livepatch is broken, because if for example
__d_lookup()'s .fixup code gets preempted, __d_lookup() can get skipped
by a reliable stack trace.

So we may need to get rid of .fixup altogether.  Especially for arches
which support livepatch.

We can replace some of the custom .fixup handlers with generic handlers
like x86 does, which do the fixup work in exception context.  This
generally works better for more generic work like putting an error code
in a certain register and resuming execution at the subsequent
instruction.

However a lot of the .fixup code is rather custom and doesn't
necessarily work well with that model.

In such cases we could just move the .fixup code into the function
(inline for older compilers; out-of-line for compilers that support
CC_HAS_ASM_GOTO_OUTPUT).

Alternatively we could convert each .fixup code fragment into a proper
function which returns to a specified resume point in the function, and
then have the exception handler emulate a call to it like we do with
int3_emulate_call().

-- 
Josh


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end
  2021-09-30 19:26                   ` [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end Josh Poimboeuf
@ 2021-10-01 12:27                     ` Mark Rutland
  2021-10-02  5:10                       ` Josh Poimboeuf
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Rutland @ 2021-10-01 12:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Linux ARM, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro, will, x86, live-patching,
	Thomas Gleixner

On Thu, Sep 30, 2021 at 12:26:38PM -0700, Josh Poimboeuf wrote:
> On Wed, Sep 29, 2021 at 01:43:23PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 29, 2021 at 11:37:30AM +0100, Mark Rutland wrote:
> > 
> > > > This is because _ASM_EXTABLE only generates data for another section.
> > > > There doesn't need to be code continuity between these two asm
> > > > statements.
> > > 
> > > I think you've missed my point. It doesn't matter that the
> > > asm_volatile_goto() doesn't contain code, and this is solely about the
> > > *state* expected at entry/exit from each asm block being different.
> > 
> > Urgh.. indeed :/
> 
> So much for that idea :-/
> 
> To fix the issue of the wrong .fixup code symbol names getting printed,
> we could (as Mark suggested) add a '__fixup_text_start' symbol at the
> start of the .fixup section.  And then remove all other symbols in the
> .fixup section.

Just to be clear, that was just as a "make debugging slightly less
painful" aid, not as a fix for reliable stackrtrace and all that.

> For x86, that means removing the kvm_fastop_exception symbol and a few
> others.  That way it's all anonymous code, displayed by the kernel as
> "__fixup_text_start+0x1234".  Which isn't all that useful, but still
> better than printing the wrong symbol.
> 
> But there's still a bigger problem: the function with the faulting
> instruction doesn't get reported in the stack trace.
> 
> For example, in the up-thread bug report, __d_lookup() bug report
> doesn't get printed, even though its anonymous .fixup code is running in
> the context of the function and will be branching back to it shortly.
> 
> Even worse, this means livepatch is broken, because if for example
> __d_lookup()'s .fixup code gets preempted, __d_lookup() can get skipped
> by a reliable stack trace.
> 
> So we may need to get rid of .fixup altogether.  Especially for arches
> which support livepatch.
> 
> We can replace some of the custom .fixup handlers with generic handlers
> like x86 does, which do the fixup work in exception context.  This
> generally works better for more generic work like putting an error code
> in a certain register and resuming execution at the subsequent
> instruction.

I reckon even ignoring the unwind problems this'd be a good thing since
it'd save on redundant copies of the fixup logic that happen to be
identical, and the common cases like uaccess all fall into this shape.

As for how to do that, in the past Peter and I had come up with some
assembler trickery to get the name of the error code register encoded
into the extable info:

  https://lore.kernel.org/lkml/20170207111011.GB28790@leverpostej/
  https://lore.kernel.org/lkml/20170207160300.GB26173@leverpostej/
  https://lore.kernel.org/lkml/20170208091250.GT6515@twins.programming.kicks-ass.net/

... but maybe that's already solved on x86 in a different way?

> However a lot of the .fixup code is rather custom and doesn't
> necessarily work well with that model.

Looking at arm64, even where we'd need custom handlers it does appear we
could mostly do that out-of-line in the exception handler. The more
exotic cases are largely in out-of-line asm functions, where we can move
the fixups within the function, after the usual return.

I reckon we can handle the fixups for load_unaligned_zeropad() in the
exception handler.

Is there anything specific that you think is painful in the exception
handler?

> In such cases we could just move the .fixup code into the function
> (inline for older compilers; out-of-line for compilers that support
> CC_HAS_ASM_GOTO_OUTPUT).
> 
> Alternatively we could convert each .fixup code fragment into a proper
> function which returns to a specified resume point in the function, and
> then have the exception handler emulate a call to it like we do with
> int3_emulate_call().

For arm64 this would be somewhat unfortunate for inline asm due to our
calling convention -- we'd have to clobber the LR, and we'd need to
force the creation of a frame record in the caller which would otherwise
not be necessary.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end
  2021-10-01 12:27                     ` Mark Rutland
@ 2021-10-02  5:10                       ` Josh Poimboeuf
  0 siblings, 0 replies; 3+ messages in thread
From: Josh Poimboeuf @ 2021-10-02  5:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Linux ARM, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro, will, x86, live-patching,
	Thomas Gleixner

On Fri, Oct 01, 2021 at 01:27:06PM +0100, Mark Rutland wrote:
> > So we may need to get rid of .fixup altogether.  Especially for arches
> > which support livepatch.
> > 
> > We can replace some of the custom .fixup handlers with generic handlers
> > like x86 does, which do the fixup work in exception context.  This
> > generally works better for more generic work like putting an error code
> > in a certain register and resuming execution at the subsequent
> > instruction.
> 
> I reckon even ignoring the unwind problems this'd be a good thing since
> it'd save on redundant copies of the fixup logic that happen to be
> identical, and the common cases like uaccess all fall into this shape.
> 
> As for how to do that, in the past Peter and I had come up with some
> assembler trickery to get the name of the error code register encoded
> into the extable info:
> 
>   https://lore.kernel.org/lkml/20170207111011.GB28790@leverpostej/
>   https://lore.kernel.org/lkml/20170207160300.GB26173@leverpostej/
>   https://lore.kernel.org/lkml/20170208091250.GT6515@twins.programming.kicks-ass.net/
>
> ... but maybe that's already solved on x86 in a different way?

That's really cool :-) But it might be overkill for x86's needs.  For
the exceptions which rely on handlers rather than anonymous .fixup code,
the register assumptions are hard-coded in the assembler constraints.  I
think that works well enough.

> > However a lot of the .fixup code is rather custom and doesn't
> > necessarily work well with that model.
> 
> Looking at arm64, even where we'd need custom handlers it does appear we
> could mostly do that out-of-line in the exception handler. The more
> exotic cases are largely in out-of-line asm functions, where we can move
> the fixups within the function, after the usual return.
> 
> I reckon we can handle the fixups for load_unaligned_zeropad() in the
> exception handler.
> 
> Is there anything specific that you think is painful in the exception
> handler?

Actually, after looking at all the x86 .fixup usage, I think we can make
this two-pronged approach work.  Either move the .fixup code to an
exception handler (with a hard-coded assembler constraint register) or
put it in the function (out-of-line where possible).  I'll try to work
up some patches (x86 only of course).

-- 
Josh


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-02  5:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20210927170122.GA9201@C02TD0UTHF1T.local>
     [not found] ` <20210927171812.GB9201@C02TD0UTHF1T.local>
     [not found]   ` <CACT4Y+actfuftwMMOGXmEsLYbnCnqcZ2gJGeoMLsFCUNE-AxcQ@mail.gmail.com>
     [not found]     ` <20210928103543.GF1924@C02TD0UTHF1T.local>
     [not found]       ` <20210929013637.bcarm56e4mqo3ndt@treble>
     [not found]         ` <YVQYQzP/vqNWm/hO@hirez.programming.kicks-ass.net>
     [not found]           ` <20210929085035.GA33284@C02TD0UTHF1T.local>
     [not found]             ` <YVQ5F9aT7oSEKenh@hirez.programming.kicks-ass.net>
     [not found]               ` <20210929103730.GC33284@C02TD0UTHF1T.local>
     [not found]                 ` <YVRRWzXqhMIpwelm@hirez.programming.kicks-ass.net>
2021-09-30 19:26                   ` [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end Josh Poimboeuf
2021-10-01 12:27                     ` Mark Rutland
2021-10-02  5:10                       ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox