* Endless loop on execution attempt on non-executable page
@ 2016-05-12 10:46 Florian Weimer
  2016-05-12 12:53 ` Ralf Baechle
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2016-05-12 10:46 UTC
  To: linux-mips

The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:

Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
14 12:33:22 PST 2015 mips64 GNU/Linux

This is a vendor kernel for the EdgeRouter Pro-8.

/proc/cpuinfo reports:

system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
machine                 : Unknown
processor               : 0
cpu model               : Cavium Octeon II V0.1

While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that 
once I try to execute code off a non-executable page, I do not get a 
signal, but the code appears to enter an infinite loop.  The generated 
function starts with a jump instruction to return to the caller, but 
instead, the program counter does not seem to change at all.

“si” in GDB also hangs (but can be interrupted with ^C).

My test code is here:

   https://pagure.io/execmod-tests

Is this a kernel bug or an issue with the silicon?

Thanks,
Florian
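
The three outcomes discussed in this report (a signal, a normal return, or an endless loop) can be told apart with a small harness. The following is a minimal sketch and is not taken from execmod-tests: the names `probe_nx` and `enum probe_result` are hypothetical, and the approach assumes that a hung child can still be killed by SIGALRM, which is plausible given that GDB can interrupt the hang with ^C. It copies a caller-supplied stub into a page mapped without PROT_EXEC, jumps to it in a forked child, and classifies how the child ended.

```c
#define _GNU_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum probe_result {
    PROBE_ERROR = -1,  /* setup failed or unexpected child status */
    PROBE_FAULTED,     /* SIGSEGV or SIGBUS: the fetch was refused */
    PROBE_EXECUTED,    /* normal return or SIGILL: the fetch was allowed */
    PROBE_HUNG         /* SIGALRM: the child never made progress */
};

enum probe_result probe_nx(const void *code, size_t len, unsigned timeout_s)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(code != NULL && timeout_s > 0);
    if (page <= 0 || len > (size_t)page)
        return PROBE_ERROR;

    /* Readable and writable, but deliberately without PROT_EXEC. */
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return PROBE_ERROR;
    memcpy(p, code, len);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, (size_t)page);
        return PROBE_ERROR;
    }
    if (pid == 0) {
        void (*fn)(void);
        signal(SIGALRM, SIG_DFL);    /* make sure the watchdog is fatal */
        alarm(timeout_s);            /* kill the child if it loops forever */
        memcpy(&fn, &p, sizeof fn);  /* data pointer to function pointer;
                                        assumes both have the same size */
        fn();                        /* attempt to execute the stub */
        _exit(0);                    /* reached only if the stub returned */
    }

    int status = 0;
    pid_t done = waitpid(pid, &status, 0);
    munmap(p, (size_t)page);
    if (done != pid)
        return PROBE_ERROR;
    if (WIFEXITED(status))
        return PROBE_EXECUTED;
    if (WIFSIGNALED(status)) {
        switch (WTERMSIG(status)) {
        case SIGSEGV:
        case SIGBUS:
            return PROBE_FAULTED;
        case SIGILL:
            return PROBE_EXECUTED;
        case SIGALRM:
            return PROBE_HUNG;
        }
    }
    return PROBE_ERROR;
}
```

The stub is architecture-specific: on MIPS it would be a `jr $ra` followed by a `nop` for the delay slot, on x86-64 a single `0xC3` (`ret`). With working enforcement the expected result is `PROBE_FAULTED`; the behaviour reported above would show up as `PROBE_HUNG`.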

* Re: Endless loop on execution attempt on non-executable page
  2016-05-12 10:46 Endless loop on execution attempt on non-executable page Florian Weimer
@ 2016-05-12 12:53 ` Ralf Baechle
  2016-05-12 13:07   ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: Ralf Baechle @ 2016-05-12 12:53 UTC
  To: Florian Weimer; +Cc: linux-mips

On Thu, May 12, 2016 at 12:46:37PM +0200, Florian Weimer wrote:

> The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:
> 
> Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
> 14 12:33:22 PST 2015 mips64 GNU/Linux
> 
> Which is a vendor kernel for the EdgeRouter Pro-8.
> 
> /proc/cpuinfo reports:
> 
> system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
> machine                 : Unknown
> processor               : 0
> cpu model               : Cavium Octeon II V0.1
> 
> While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that once
> I try to execute code off a non-executable page, I do not get a signal, but
> the code appears to enter an infinite loop.  The generated function starts
> with a jump instruction to return to the caller, but instead, the program
> counter does not seem to change at all.
> 
> “si” in GDB also hangs (but can be interrupted with ^C).
> 
> My test code is here:
> 
>   https://pagure.io/execmod-tests
> 
> Is this a kernel bug or an issue with the silicon?

I see the test case uses mprotect to add PROT_EXEC after writing the code
to memory.  However, I don't think mprotect gives any guarantee that this
will make the I-cache coherent with the D-cache, that is, that the CPU will
actually fetch and execute the instructions that were just written to memory.
For that you have to do something architecture-specific such as dancing
around a fire waving a dead chicken.  Or, on MIPS, call cacheflush(); see
the man page for details.
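
The write, mprotect, cache-sync sequence described above might look roughly like the following sketch. The function name `install_code` is hypothetical and this is not the code from execmod-tests. On MIPS it calls `cacheflush()` with `BCACHE` from `<sys/cachectl.h>`; elsewhere it falls back to the GCC/Clang builtin `__builtin___clear_cache`, which is an assumption about the toolchain rather than something the thread prescribes.

```c
#define _GNU_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#if defined(__mips__)
#include <sys/cachectl.h>   /* cacheflush(), BCACHE */
#endif

/* Copy len bytes of machine code into a fresh page, make the page
 * executable, and synchronize the instruction cache with what was
 * written.  Returns the page address, or NULL on failure. */
void *install_code(const void *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(code != NULL);
    if (page <= 0 || len > (size_t)page)
        return NULL;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memcpy(p, code, len);   /* the stores go through the D-cache */

    /* Changing the protection alone is not a documented cache barrier. */
    if (mprotect(p, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, (size_t)page);
        return NULL;
    }

#if defined(__mips__)
    /* Write back the D-cache and invalidate the I-cache for the range. */
    if (cacheflush(p, (int)len, BCACHE) != 0) {
        munmap(p, (size_t)page);
        return NULL;
    }
#else
    /* Compiler builtin; a no-op on architectures with coherent caches. */
    __builtin___clear_cache((char *)p, (char *)p + len);
#endif
    return p;
}
```

The returned pointer can then be converted to a function pointer and called. Whether the explicit flush is strictly required depends on the CPU and kernel, so the sketch performs it unconditionally.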

For the sake of portability to some broken processors you should also ensure
that a 32-byte cache line is entirely filled with valid instructions by
padding the two test instructions with another six no-ops (opcode 0).
The test case as it is guarantees this implicitly by using a freshly
allocated page but I thought I should mention it.
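
As an illustration of the padding suggested above, the sketch below fills one 32-byte line with eight 32-bit words. It assumes the two test instructions are `jr $ra` and its delay-slot `nop` (the thread does not spell them out), and the name `fill_stub_line` is hypothetical. The encodings `0x03e00008` and `0x00000000` are the standard MIPS ones; the words are stored in host byte order, which matches the instruction stream when the code is built for the same MIPS target that runs it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u   /* line size named in the mail */
#define WORDS_PER_LINE   (CACHE_LINE_BYTES / sizeof(uint32_t))

#define MIPS_JR_RA 0x03e00008u /* jr $ra */
#define MIPS_NOP   0x00000000u /* sll $zero, $zero, 0 */

/* Fill one cache line with a return stub followed by no-ops, so that no
 * word in the line is left as stale or invalid data.  The buffer must
 * hold WORDS_PER_LINE words.  Returns the number of words written. */
size_t fill_stub_line(uint32_t *line)
{
    size_t i = 0;

    line[i++] = MIPS_JR_RA;     /* return to the caller */
    line[i++] = MIPS_NOP;       /* branch delay slot */
    while (i < WORDS_PER_LINE)  /* six more no-ops as padding */
        line[i++] = MIPS_NOP;

    assert(i == WORDS_PER_LINE);
    return i;
}
```

For the padding to cover exactly one line, the destination should itself be aligned to 32 bytes, which a page-aligned buffer satisfies.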

  Ralf

* Re: Endless loop on execution attempt on non-executable page
  2016-05-12 12:53 ` Ralf Baechle
@ 2016-05-12 13:07   ` Florian Weimer
  2016-05-12 14:23     ` Ralf Baechle
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2016-05-12 13:07 UTC
  To: Ralf Baechle; +Cc: linux-mips

On 05/12/2016 02:53 PM, Ralf Baechle wrote:
> On Thu, May 12, 2016 at 12:46:37PM +0200, Florian Weimer wrote:
>
>> The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:
>>
>> Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
>> 14 12:33:22 PST 2015 mips64 GNU/Linux
>>
>> Which is a vendor kernel for the EdgeRouter Pro-8.
>>
>> /proc/cpuinfo reports:
>>
>> system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
>> machine                 : Unknown
>> processor               : 0
>> cpu model               : Cavium Octeon II V0.1
>>
>> While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that once
>> I try to execute code off a non-executable page, I do not get a signal, but
>> the code appears to enter an infinite loop.  The generated function starts
>> with a jump instruction to return to the caller, but instead, the program
>> counter does not seem to change at all.
>>
>> “si” in GDB also hangs (but can be interrupted with ^C).
>>
>> My test code is here:
>>
>>   https://pagure.io/execmod-tests
>>
>> Is this a kernel bug or an issue with the silicon?
>
> I see the test case uses mprotect to add PROT_EXEC after writing the code
> to memory.  However, I don't think mprotect gives any guarantee that this
> will make the I-cache coherent with the D-cache, that is, that the CPU will
> actually fetch and execute the instructions that were just written to memory.
> For that you have to do something architecture-specific such as dancing
> around a fire waving a dead chicken.  Or, on MIPS, call cacheflush(); see
> the man page for details.

There is a fork between the write and the execute.  It is somewhat 
unlikely that that's not a barrier, but it did happen on POWER.

However, I can successfully execute code without the barrier, so this 
whole thing goes in the wrong direction. :)

I have added it, just to be on the safe side.

> For the sake of portability to some broken processors you should also ensure
> that a 32-byte cache line is entirely filled with valid instructions by
> padding the two test instructions with another six no-ops (opcode 0).

Added as well.

> The test case as it is guarantees this implicitly by using a freshly
> allocated page but I thought I should mention it.

There are some tests that don't (the stack variable might be clobbered, 
for example).

Anyway, neither change fixed things for me.  Given the peculiar “si” 
behavior in GDB, that's not entirely unexpected ...

Thanks,
Florian

* Re: Endless loop on execution attempt on non-executable page
  2016-05-12 13:07   ` Florian Weimer
@ 2016-05-12 14:23     ` Ralf Baechle
  2016-05-12 15:57       ` David Daney
  0 siblings, 1 reply; 6+ messages in thread
From: Ralf Baechle @ 2016-05-12 14:23 UTC
  To: Florian Weimer; +Cc: linux-mips

On Thu, May 12, 2016 at 03:07:51PM +0200, Florian Weimer wrote:

> On 05/12/2016 02:53 PM, Ralf Baechle wrote:
> > On Thu, May 12, 2016 at 12:46:37PM +0200, Florian Weimer wrote:
> > 
> > > The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:
> > > 
> > > Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
> > > 14 12:33:22 PST 2015 mips64 GNU/Linux
> > > 
> > > Which is a vendor kernel for the EdgeRouter Pro-8.
> > > 
> > > /proc/cpuinfo reports:
> > > 
> > > system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
> > > machine                 : Unknown
> > > processor               : 0
> > > cpu model               : Cavium Octeon II V0.1
> > > 
> > > While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that once
> > > I try to execute code off a non-executable page, I do not get a signal, but
> > > the code appears to enter an infinite loop.  The generated function starts
> > > with a jump instruction to return to the caller, but instead, the program
> > > counter does not seem to change at all.
> > > 
> > > “si” in GDB also hangs (but can be interrupted with ^C).
> > > 
> > > My test code is here:
> > > 
> > >   https://pagure.io/execmod-tests
> > > 
> > > Is this a kernel bug or an issue with the silicon?
> > 
> > I see the test case uses mprotect to add PROT_EXEC after writing the code
> > to memory.  However, I don't think mprotect gives any guarantee that this
> > will make the I-cache coherent with the D-cache, that is, that the CPU will
> > actually fetch and execute the instructions that were just written to memory.
> > For that you have to do something architecture-specific such as dancing
> > around a fire waving a dead chicken.  Or, on MIPS, call cacheflush(); see
> > the man page for details.
> 
> There is a fork between the write and the execute.  It is somewhat unlikely
> that that's not a barrier, but it did happen on POWER.
> 
> However, I can successfully execute code without the barrier, so this whole
> thing goes in the wrong direction. :)
> 
> I have added it, just to be on the safe side.
> 
> > For the sake of portability to some broken processors you should also ensure
> > that a 32-byte cache line is entirely filled with valid instructions by
> > padding the two test instructions with another six no-ops (opcode 0).
> 
> Added as well.
> 
> > The test case as it is guarantees this implicitly by using a freshly
> > allocated page but I thought I should mention it.
> 
> There are some tests that don't (the stack variable might be clobbered, for
> example).
> 
> Anyway, neither change fixed things for me.  Given the peculiar “si”
> behavior in GDB, that's not entirely unexpected ...

Thanks for fixing and testing these obvious things.  Now let's look one
or two levels deeper ...

  Ralf

* Re: Endless loop on execution attempt on non-executable page
  2016-05-12 14:23     ` Ralf Baechle
@ 2016-05-12 15:57       ` David Daney
  2016-05-17 10:15         ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: David Daney @ 2016-05-12 15:57 UTC
  To: Ralf Baechle, Hill, Steven; +Cc: Florian Weimer, linux-mips

On 05/12/2016 07:23 AM, Ralf Baechle wrote:
> On Thu, May 12, 2016 at 03:07:51PM +0200, Florian Weimer wrote:
>
>> On 05/12/2016 02:53 PM, Ralf Baechle wrote:
>>> On Thu, May 12, 2016 at 12:46:37PM +0200, Florian Weimer wrote:
>>>
>>>> The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:
>>>>
>>>> Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
>>>> 14 12:33:22 PST 2015 mips64 GNU/Linux
>>>>
>>>> Which is a vendor kernel for the EdgeRouter Pro-8.
>>>>
>>>> /proc/cpuinfo reports:
>>>>
>>>> system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
>>>> machine                 : Unknown
>>>> processor               : 0
>>>> cpu model               : Cavium Octeon II V0.1
>>>>
>>>> While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that once
>>>> I try to execute code off a non-executable page, I do not get a signal, but
>>>> the code appears to enter an infinite loop.  The generated function starts
>>>> with a jump instruction to return to the caller, but instead, the program
>>>> counter does not seem to change at all.
>>>>
>>>> “si” in GDB also hangs (but can be interrupted with ^C).
>>>>
>>>> My test code is here:
>>>>
>>>>    https://pagure.io/execmod-tests
>>>>
>>>> Is this a kernel bug or an issue with the silicon?
>>>
>>> I see the test case uses mprotect to add PROT_EXEC after writing the code
>>> to memory.  However, I don't think mprotect gives any guarantee that this
>>> will make the I-cache coherent with the D-cache, that is, that the CPU will
>>> actually fetch and execute the instructions that were just written to memory.
>>> For that you have to do something architecture-specific such as dancing
>>> around a fire waving a dead chicken.  Or, on MIPS, call cacheflush(); see
>>> the man page for details.
>>
>> There is a fork between the write and the execute.  It is somewhat unlikely
>> that that's not a barrier, but it did happen on POWER.
>>
>> However, I can successfully execute code without the barrier, so this whole
>> thing goes in the wrong direction. :)
>>
>> I have added it, just to be on the safe side.
>>
>>> For the sake of portability to some broken processors you should also ensure
>>> that a 32-byte cache line is entirely filled with valid instructions by
>>> padding the two test instructions with another six no-ops (opcode 0).
>>
>> Added as well.
>>
>>> The test case as it is guarantees this implicitly by using a freshly
>>> allocated page but I thought I should mention it.
>>
>> There are some tests that don't (the stack variable might be clobbered, for
>> example).
>>
>> Anyway, neither change fixed things for me.  Given the peculiar “si”
>> behavior in GDB, that's not entirely unexpected ...
>
> Thanks for fixing and testing these obvious things.  Now let's look one
> or two levels deeper ...
>

This is something that would be easy to diagnose on the OCTEON simulator...

Before spending time doing that, has anyone tried this on current 
kernels rather than the 3.14 indicated above?

It might also be interesting to know if it still happens when booting on
only a single CPU rather than on all available CPUs, which I assume is the
default on this platform.

David Daney

* Re: Endless loop on execution attempt on non-executable page
  2016-05-12 15:57       ` David Daney
@ 2016-05-17 10:15         ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2016-05-17 10:15 UTC
  To: David Daney, Ralf Baechle, Hill, Steven; +Cc: linux-mips

On 05/12/2016 05:57 PM, David Daney wrote:

> This is something that would be easy to diagnose on the OCTEON simulator...
>
> Before spending time doing that, has anyone tried this on current
> kernels rather than the 3.14 indicated above?

I can't swap kernels on this device, and I suspect it's running a vendor 
kernel for a reason (lack of Debian/upstream support, presumably). 
Sorry about that.

Florian
