[RFC 0/4] Create infrastructure for running C code from SRAM.

linux-arm-kernel.lists.infradead.org archive mirror

* [RFC 0/4] Create infrastructure for running C code from SRAM.
@ 2013-09-03 16:44 Russ Dill
       [not found] ` <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-03 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

This RFC patchset explores an idea for loading C code into SRAM.
Currently, all the code I'm aware of that needs to run from SRAM is written
in assembler. The most common reason for code needing to run from SRAM is
that the memory controller is being disabled/enabled or is already
disabled. arch/arm has by far the most examples, but code also exists in
powerpc and sh.

The code is written in asm for two primary reasons. First, so that markers
can be placed to indicate the size of the code so that it can be copied.
Second, so that data can be placed along with text and accessed in a
position-independent manner.

SRAM handling code is in the process of being moved from arch directories
into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
patchset builds on that, including the limitation that the SRAM address is
not known at compile time. Because the SRAM address is not known at compile
time, the code that runs from SRAM must be compiled with -fPIC. Even if
the code were loaded to a fixed virtual address, portions of the code must
often be run with the MMU disabled.

The general idea is for each SRAM user (such as an SoC-specific
suspend/resume mechanism) to create a group of sections. The section group
is created with a single macro for each user, but ends up looking like this:

.sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
  __sram_am33xx_start = .;
  *(.sram.am33xx.*)
  __sram_am33xx_end = .;
}

Any data or functions that should be copied to SRAM for this use should be
marked with an appropriate __section() attribute. A helper is then added for
translating between the original kernel symbol and the address of that
function or variable once it has been copied into SRAM. Once control is
passed to a function within the SRAM section grouping, it can access any
variables or functions within that same SRAM section grouping without
translation.
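
As an illustration of the translation step described above, here is a
minimal sketch that builds outside the kernel. It is not the helper from
the patchset: struct sram_blob, sram_blob_load() and sram_blob_translate()
are hypothetical names, and an ordinary byte array stands in for both the
section group and the SRAM allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for one section group.  In the kernel the
 * bounds would be the linker-generated __sram_am33xx_start/_end
 * markers; here they are the bounds of an ordinary byte array.
 */
struct sram_blob {
	const unsigned char *start;	/* original location (link address) */
	const unsigned char *end;	/* one past the last byte */
	unsigned char *sram_base;	/* where the copy was placed */
};

/* Copy the whole group to its destination in one go. */
static void sram_blob_load(struct sram_blob *blob, unsigned char *dst)
{
	memcpy(dst, blob->start, (size_t)(blob->end - blob->start));
	blob->sram_base = dst;
}

/*
 * Hypothetical translation helper: map the address of a symbol in the
 * original group to the address of its copy, or return NULL if the
 * symbol is not part of this group.
 */
static void *sram_blob_translate(const struct sram_blob *blob, const void *sym)
{
	uintptr_t p = (uintptr_t)sym;

	if (p < (uintptr_t)blob->start || p >= (uintptr_t)blob->end)
		return NULL;
	return blob->sram_base + (p - (uintptr_t)blob->start);
}
```

The translation is only an offset calculation, which is why it works for
both functions and data as long as the whole group is copied as one unit.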

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4984c6
[2] http://www.spinics.net/lists/linux-omap/msg96504.html

Russ Dill (4):
  Misc: SRAM: Create helpers for loading C code into SRAM
  ARM: SRAM: Add macro for generating SRAM resume trampoline
  Misc: SRAM: Hack for allowing executable code in SRAM.
  ARM: AM33XX: Move suspend/resume assembly to C

 arch/arm/include/asm/suspend.h    |  14 ++
 arch/arm/kernel/vmlinux.lds.S     |   2 +
 arch/arm/mach-omap2/Makefile      |   2 +-
 arch/arm/mach-omap2/pm33xx.c      |  50 ++---
 arch/arm/mach-omap2/pm33xx.h      |  23 +--
 arch/arm/mach-omap2/sleep33xx.S   | 394 --------------------------------------
 arch/arm/mach-omap2/sleep33xx.c   | 309 ++++++++++++++++++++++++++++++
 arch/arm/mach-omap2/sram.c        |  15 --
 drivers/misc/sram.c               | 106 +++++++++-
 include/asm-generic/vmlinux.lds.h |   7 +
 include/linux/sram.h              |  44 +++++
 11 files changed, 509 insertions(+), 457 deletions(-)
 delete mode 100644 arch/arm/mach-omap2/sleep33xx.S
 create mode 100644 arch/arm/mach-omap2/sleep33xx.c
 create mode 100644 include/linux/sram.h

-- 
1.8.3.2


[parent not found: <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>]

* [RFC 3/4] Misc: SRAM: Hack for allowing executable code in SRAM.
       [not found] ` <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>
@ 2013-09-04 18:06   ` Tony Lindgren
  2013-09-06 20:50     ` Russ Dill
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2013-09-04 18:06 UTC (permalink / raw)
  To: linux-arm-kernel

* Russ Dill <Russ.Dill@ti.com> [130903 09:52]:
> The generic SRAM mechanism does not ioremap memory in a
> manner that allows code to be executed from SRAM. There is
> currently no generic way to request ioremap to return a
> memory area with execution allowed.
> 
> Insert a temporary hack for proof of concept on ARM.
> 
> Signed-off-by: Russ Dill <Russ.Dill@ti.com>
> ---
>  drivers/misc/sram.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
> index 08baaab..e059a23 100644
> --- a/drivers/misc/sram.c
> +++ b/drivers/misc/sram.c
> @@ -31,6 +31,7 @@
>  #include <linux/genalloc.h>
>  #include <linux/sram.h>
>  #include <asm-generic/cacheflush.h>
> +#include <asm/io.h>
>  
>  #define SRAM_GRANULARITY	32
>  
> @@ -138,7 +139,7 @@ static int sram_probe(struct platform_device *pdev)
>  	int ret;
>  
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> -	virt_base = devm_ioremap_resource(&pdev->dev, res);
> +	virt_base = __arm_ioremap_exec(res->start, resource_size(res), false);
>  	if (IS_ERR(virt_base))
>  		return PTR_ERR(virt_base);

You can get rid of this hack by defining ioremap_exec in
include/asm-generic/io.h the same way as ioremap_nocache
is done:

#ifndef ioremap_exec
#define ioremap_exec ioremap
#endif

Then the arches that need ioremap_exec can define and
implement it. Needs to be reviewed on LKML naturally :)

Regards,

Tony


* [RFC 3/4] Misc: SRAM: Hack for allowing executable code in SRAM.
  2013-09-04 18:06   ` [RFC 3/4] Misc: SRAM: Hack for allowing executable code in SRAM Tony Lindgren
@ 2013-09-06 20:50     ` Russ Dill
  0 siblings, 0 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-06 20:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 4, 2013 at 11:06 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Russ Dill <Russ.Dill@ti.com> [130903 09:52]:
>> The generic SRAM mechanism does not ioremap memory in a
>> manner that allows code to be executed from SRAM. There is
>> currently no generic way to request ioremap to return a
>> memory area with execution allowed.
>>
>> Insert a temporary hack for proof of concept on ARM.
>>
>> Signed-off-by: Russ Dill <Russ.Dill@ti.com>
>> ---
>>  drivers/misc/sram.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
>> index 08baaab..e059a23 100644
>> --- a/drivers/misc/sram.c
>> +++ b/drivers/misc/sram.c
>> @@ -31,6 +31,7 @@
>>  #include <linux/genalloc.h>
>>  #include <linux/sram.h>
>>  #include <asm-generic/cacheflush.h>
>> +#include <asm/io.h>
>>
>>  #define SRAM_GRANULARITY     32
>>
>> @@ -138,7 +139,7 @@ static int sram_probe(struct platform_device *pdev)
>>       int ret;
>>
>>       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> -     virt_base = devm_ioremap_resource(&pdev->dev, res);
>> +     virt_base = __arm_ioremap_exec(res->start, resource_size(res), false);
>>       if (IS_ERR(virt_base))
>>               return PTR_ERR(virt_base);
>
> You can get rid of this hack by defining ioremap_exec in
> include/asm-generic/io.h the same way as ioremap_nocache
> is done:
>
> #ifndef ioremap_exec
> #define ioremap_exec ioremap
> #endif
>
> Then the arch that need ioremap_exec can define and
> implement it. Needs to be reviewed on LKML naturally :)

The similar statement for nocache in asm-generic/io.h appears in an
#ifndef CONFIG_MMU block. I think the better example would be
ioremap_wc, which looks like:

#ifndef ARCH_HAS_IOREMAP_WC
#define ioremap_wc ioremap_nocache
#endif

Of course, ioremap_exec on ARM has a slight complication since it has an
extra bool nocache argument.
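
One way the extra argument could be hidden, sketched so that it builds
outside the kernel: the stub_* functions below are hypothetical stand-ins
for ioremap() and __arm_ioremap_exec(), and the value passed for the bool
simply mirrors the "false" used in the hack above. This only illustrates
the override-plus-fallback macro pattern under discussion, not a reviewed
interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Records which stub was used last, so the effect can be observed. */
static int last_mapping_was_exec;

/* Hypothetical stand-in for a plain ioremap(). */
static void *stub_ioremap(unsigned long phys, size_t size)
{
	(void)size;
	last_mapping_was_exec = 0;
	return (void *)phys;
}

/* Hypothetical stand-in for the three-argument ARM function. */
static void *stub_arm_ioremap_exec(unsigned long phys, size_t size, bool extra)
{
	(void)size;
	(void)extra;
	last_mapping_was_exec = 1;
	return (void *)phys;
}

#define ioremap(phys, size) stub_ioremap(phys, size)

/*
 * What an arch header might do: hide the extra bool behind a
 * two-argument macro.  Remove this define to get the fallback below.
 */
#define ioremap_exec(phys, size) stub_arm_ioremap_exec(phys, size, false)

/* What a generic header might do: fall back to a plain ioremap. */
#ifndef ioremap_exec
#define ioremap_exec ioremap
#endif
```

With this shape, generic callers only ever see the two-argument form, and
the ARM-specific argument stays inside the arch header.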


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-03 16:44 [RFC 0/4] Create infrastructure for running C code from SRAM Russ Dill
       [not found] ` <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>
@ 2013-09-04 19:52 ` Emilio López
  2013-09-04 21:47   ` Russ Dill
  2013-09-06 11:12 ` Russell King - ARM Linux
  2 siblings, 1 reply; 17+ messages in thread
From: Emilio López @ 2013-09-04 19:52 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 03/09/13 13:44, Russ Dill wrote:
> This RFC patchset explores an idea for loading C code into SRAM.
> Currently, all the code I'm aware of that needs to run from SRAM is written
> in assembler. The most common reason for code needing to run from SRAM is
> that the memory controller is being disabled/ enabled or is already
> disabled. arch/arm has by far the most examples, but code also exists in
> powerpc and sh.
>
> The code is written in asm for two primary reasons. First so that markers
> can be put in indicating the size of the code they it can be copied. Second
> so that data can be placed along with text and accessed in a position
> independant manner.
>
> SRAM handling code is in the process of being moved from arch directories
> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
> patchset builds on that, including the limitation that the SRAM address is
> not known at compile time. Because the SRAM address is not known at compile
> time, the code that runs from SRAM must be compiled with -fPIC. Even if
> the code were loaded to a fixed virtual address, portions of the code must
> often be run with the MMU disabled.
>
> The general idea is that for each SRAM user (such as an SoC specific
> suspend/resume mechanism) to create a group of sections. The section group
> is created with a single macro for each user, but end up looking like this:
>
> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>    __sram_am33xx_start = .;
>    *(.sram.am33xx.*)
>    __sram_am33xx_end = .;
> }
>
> Any data or functions that should be copied to SRAM for this use should be
> maked with an appropriate __section() attribute. A helper is then added for
> translating between the original kernel symbol, and the address of that
> function or variable once it has been copied into SRAM. Once control is
> passed to a function within the SRAM section grouping, it can access any
> variables or functions within that same SRAM section grouping without
> translation.
>
> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4984c6
> [2] http://www.spinics.net/lists/linux-omap/msg96504.html
>
> Russ Dill (4):
>    Misc: SRAM: Create helpers for loading C code into SRAM
>    ARM: SRAM: Add macro for generating SRAM resume trampoline
>    Misc: SRAM: Hack for allowing executable code in SRAM.
>    ARM: AM33XX: Move suspend/resume assembly to C
>
>   arch/arm/include/asm/suspend.h    |  14 ++
>   arch/arm/kernel/vmlinux.lds.S     |   2 +
>   arch/arm/mach-omap2/Makefile      |   2 +-
>   arch/arm/mach-omap2/pm33xx.c      |  50 ++---
>   arch/arm/mach-omap2/pm33xx.h      |  23 +--
>   arch/arm/mach-omap2/sleep33xx.S   | 394 --------------------------------------
>   arch/arm/mach-omap2/sleep33xx.c   | 309 ++++++++++++++++++++++++++++++
>   arch/arm/mach-omap2/sram.c        |  15 --
>   drivers/misc/sram.c               | 106 +++++++++-
>   include/asm-generic/vmlinux.lds.h |   7 +
>   include/linux/sram.h              |  44 +++++
>   11 files changed, 509 insertions(+), 457 deletions(-)
>   delete mode 100644 arch/arm/mach-omap2/sleep33xx.S
>   create mode 100644 arch/arm/mach-omap2/sleep33xx.c
>   create mode 100644 include/linux/sram.h
>

I'm interested in this, as I'll need something like it for 
suspend/resume on sunxi. Unfortunately, I only got the cover letter on 
my email, and the web lakml archives don't seem to have the rest either. 
After a bit of searching on Google I found a copy on linux-omap[1], but 
it'd be great if I didn't have to hunt for the patches :)

I only have one comment, from a quick look at the code:

+       memcpy((void *) chunk->addr, data, sz);
+       flush_icache_range(chunk->addr, chunk->addr + sz);

How would that behave in Thumb-2 mode? I believe that's the reason why 
fncpy() got introduced[2] some time ago.

Thanks for working on this!

Emilio

[1] http://www.mail-archive.com/linux-omap@vger.kernel.org/msg94995.html
[2] http://www.spinics.net/lists/arm-kernel/msg110706.html


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-04 19:52 ` [RFC 0/4] Create infrastructure for running C code from SRAM Emilio López
@ 2013-09-04 21:47   ` Russ Dill
  2013-09-06 11:02     ` Sekhar Nori
  2013-09-06 11:14     ` Russell King - ARM Linux
  0 siblings, 2 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-04 21:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 4, 2013 at 12:52 PM, Emilio López <emilio@elopez.com.ar> wrote:
> Hi,
>
> On 03/09/13 13:44, Russ Dill wrote:
>
>> This RFC patchset explores an idea for loading C code into SRAM.
>> Currently, all the code I'm aware of that needs to run from SRAM is
>> written
>> in assembler. The most common reason for code needing to run from SRAM is
>> that the memory controller is being disabled/ enabled or is already
>> disabled. arch/arm has by far the most examples, but code also exists in
>> powerpc and sh.
>>
>> The code is written in asm for two primary reasons. First so that markers
>> can be put in indicating the size of the code they it can be copied.
>> Second
>> so that data can be placed along with text and accessed in a position
>> independant manner.
>>
>> SRAM handling code is in the process of being moved from arch directories
>> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
>> patchset builds on that, including the limitation that the SRAM address is
>> not known at compile time. Because the SRAM address is not known at
>> compile
>> time, the code that runs from SRAM must be compiled with -fPIC. Even if
>> the code were loaded to a fixed virtual address, portions of the code must
>> often be run with the MMU disabled.
>>
>> The general idea is that for each SRAM user (such as an SoC specific
>> suspend/resume mechanism) to create a group of sections. The section group
>> is created with a single macro for each user, but end up looking like
>> this:
>>
>> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>>    __sram_am33xx_start = .;
>>    *(.sram.am33xx.*)
>>    __sram_am33xx_end = .;
>> }
>>
>> Any data or functions that should be copied to SRAM for this use should be
>> maked with an appropriate __section() attribute. A helper is then added
>> for
>> translating between the original kernel symbol, and the address of that
>> function or variable once it has been copied into SRAM. Once control is
>> passed to a function within the SRAM section grouping, it can access any
>> variables or functions within that same SRAM section grouping without
>> translation.
>>
>> [1]
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4984c6
>> [2] http://www.spinics.net/lists/linux-omap/msg96504.html
>>
>> Russ Dill (4):
>>    Misc: SRAM: Create helpers for loading C code into SRAM
>>    ARM: SRAM: Add macro for generating SRAM resume trampoline
>>    Misc: SRAM: Hack for allowing executable code in SRAM.
>>    ARM: AM33XX: Move suspend/resume assembly to C
>>
>>   arch/arm/include/asm/suspend.h    |  14 ++
>>   arch/arm/kernel/vmlinux.lds.S     |   2 +
>>   arch/arm/mach-omap2/Makefile      |   2 +-
>>   arch/arm/mach-omap2/pm33xx.c      |  50 ++---
>>   arch/arm/mach-omap2/pm33xx.h      |  23 +--
>>   arch/arm/mach-omap2/sleep33xx.S   | 394 --------------------------------------
>>   arch/arm/mach-omap2/sleep33xx.c   | 309 ++++++++++++++++++++++++++++++
>>   arch/arm/mach-omap2/sram.c        |  15 --
>>   drivers/misc/sram.c               | 106 +++++++++-
>>   include/asm-generic/vmlinux.lds.h |   7 +
>>   include/linux/sram.h              |  44 +++++
>>   11 files changed, 509 insertions(+), 457 deletions(-)
>>   delete mode 100644 arch/arm/mach-omap2/sleep33xx.S
>>   create mode 100644 arch/arm/mach-omap2/sleep33xx.c
>>   create mode 100644 include/linux/sram.h
>>
>
> I'm interested in this, as I'll need something like it for suspend/resume on
> sunxi. Unfortunately, I only got the cover letter on my email, and the web
> lakml archives don't seem to have the rest either. After a bit of searching
> on Google I found a copy on linux-omap[1], but it'd be great if I didn't
> have to hunt for the patches :)

The mails to arm-kernel are "awaiting moderation".

> I only have one comment, from a quick look at the code
>
> +       memcpy((void *) chunk->addr, data, sz);
> +       flush_icache_range(chunk->addr, chunk->addr + sz);
>
> How would that behave on Thumb-2 mode? I believe that's the reason why
> fncpy() got introduced[2] some time ago.
>
> Thanks for working on this!

I think this is already taken care of by the way sram.c is using
genalloc. The allocation returned should be aligned to 32 bytes. The
thumb bit shouldn't be an issue as code is copied based on the start
and end markers made by the linker. I may need to add .align statements
in the linker script so that the start and end markers for the copied
code are aligned to at least 8 bytes.

Thanks!

> Emilio
>
> [1] http://www.mail-archive.com/linux-omap@vger.kernel.org/msg94995.html
> [2] http://www.spinics.net/lists/arm-kernel/msg110706.html


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-04 21:47   ` Russ Dill
@ 2013-09-06 11:02     ` Sekhar Nori
  2013-09-06 11:14     ` Russell King - ARM Linux
  1 sibling, 0 replies; 17+ messages in thread
From: Sekhar Nori @ 2013-09-06 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 05 September 2013 03:17 AM, Russ Dill wrote:
> On Wed, Sep 4, 2013 at 12:52 PM, Emilio López <emilio@elopez.com.ar> wrote:

>> I'm interested in this, as I'll need something like it for suspend/resume on
>> sunxi. Unfortunately, I only got the cover letter on my email, and the web
>> lakml archives don't seem to have the rest either. After a bit of searching
>> on Google I found a copy on linux-omap[1], but it'd be great if I didn't
>> have to hunt for the patches :)
> 
> The mails to arm-kernel are "awaiting moderation".

That is because you have RFC in the subject line instead of PATCH RFC.

Thanks,
Sekhar


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-04 21:47   ` Russ Dill
  2013-09-06 11:02     ` Sekhar Nori
@ 2013-09-06 11:14     ` Russell King - ARM Linux
  2013-09-06 16:40       ` Dave Martin
  2013-09-06 18:40       ` Russ Dill
  1 sibling, 2 replies; 17+ messages in thread
From: Russell King - ARM Linux @ 2013-09-06 11:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 04, 2013 at 02:47:51PM -0700, Russ Dill wrote:
> I think this is already taken care of by the way sram.c is using
> genalloc. The allocation returned should be aligned to 32 bytes. The
> thumb bit shouldn't be an issue as code is copied based on the start
> and end makers made by the linker. I may need to add .align statements
> in the linker so that the start and end markers for the copied code
> are aligned to at least 8 bytes.

I think you need to read up on what fncpy does... there's more to it
than just merely copying code at an appropriate alignment.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 11:14     ` Russell King - ARM Linux
@ 2013-09-06 16:40       ` Dave Martin
  2013-09-06 18:50         ` Russ Dill
  2013-09-07  8:57         ` Russell King - ARM Linux
  2013-09-06 18:40       ` Russ Dill
  1 sibling, 2 replies; 17+ messages in thread
From: Dave Martin @ 2013-09-06 16:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 06, 2013 at 12:14:08PM +0100, Russell King - ARM Linux wrote:
> On Wed, Sep 04, 2013 at 02:47:51PM -0700, Russ Dill wrote:
> > I think this is already taken care of by the way sram.c is using
> > genalloc. The allocation returned should be aligned to 32 bytes. The
> > thumb bit shouldn't be an issue as code is copied based on the start
> > and end makers made by the linker. I may need to add .align statements
> > in the linker so that the start and end markers for the copied code
> > are aligned to at least 8 bytes.
> 
> I think you need to read up on what fncpy does... there's more to it
> than just merely copying code at an appropriate alignment.

The technique of putting each loadable blob in a specific vmlinux
section, and then adjusting entry-point symbols by adding/subtracting
the appropriate offset, probably does work.

This relies on the functions' code alignment requirement being
honoured by both the vmlinux link map, and the allocator used to find
SRAM space to copy the functions to.

Searching the entire list of known blobs every time we want to convert
a symbol seems unnecessary though.  Surely the caller could know the
blob<->symbol mapping anyway?

One thing fncpy() doesn't provide is a way to copy groups of functions
that call each other, if vmlinux needs to know about any symbol other
than the one at the start.  We might need a better mechanism if that is
needed.

I actually wonder whether fncpy() contains a buglet, whereby
flush_icache_range() is used instead of coherent_kern_range().
The SRAM is probably not mapped cached, but at least a DSB would be
needed before flushing the relevant lines from the I-cache.

However, flush_icache_range() seems to be implemented by a call to
coherent_kern_range() anyway, so perhaps that's not a problem.

Cheers
---Dave


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 16:40       ` Dave Martin
@ 2013-09-06 18:50         ` Russ Dill
  2013-09-07  8:57         ` Russell King - ARM Linux
  1 sibling, 0 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-06 18:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 6, 2013 at 9:40 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Sep 06, 2013 at 12:14:08PM +0100, Russell King - ARM Linux wrote:
>> On Wed, Sep 04, 2013 at 02:47:51PM -0700, Russ Dill wrote:
>> > I think this is already taken care of by the way sram.c is using
>> > genalloc. The allocation returned should be aligned to 32 bytes. The
>> > thumb bit shouldn't be an issue as code is copied based on the start
>> > and end makers made by the linker. I may need to add .align statements
>> > in the linker so that the start and end markers for the copied code
>> > are aligned to at least 8 bytes.
>>
>> I think you need to read up on what fncpy does... there's more to it
>> than just merely copying code at an appropriate alignment.
>
> The technique of putting each loadable blob in a specific vmlinux
> section, and then adjusting entry-point symbols by adding/subtracting
> the appropriate offset, probably does work.
>
> This relies on the functions' code alignment requirement being
> honoured by both the vmlinux link map, and the allocator used to find
> SRAM space to copy the functions to.
>
> Searching the entire list of known blobs every time we want to convert
> a symbol seems unnecessary though.  Surely the caller could know the
> blob<->symbol mapping anyway?

It doesn't search the list of known blobs, only loaded blobs. On all
the platforms I'm aware of, only one SRAM section is loaded with code.

> One thing fncpy() doesn't provide is a way to copy groups of functions
> that call each other, if vmlinux needs to know about any symbol other
> than the one at the start.  We might need a better mechanism if that is
> needed.
>
>
> I actually wonder whether fncpy() contains a buglet, whereby
> flush_icache_range() is used instead of coherent_kern_range().
> The SRAM is probably not mapped cached, but at least a DSB would be
> needed before flushing the relevant lines from the I-cache.

It is mapped cached on most platforms.

> However, flush_icache_range() seems to be implemented by a call to
> coherent_kern_range() anyway, so perhaps that's not a problem.
>
> Cheers
> ---Dave


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 16:40       ` Dave Martin
  2013-09-06 18:50         ` Russ Dill
@ 2013-09-07  8:57         ` Russell King - ARM Linux
  1 sibling, 0 replies; 17+ messages in thread
From: Russell King - ARM Linux @ 2013-09-07  8:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 06, 2013 at 05:40:59PM +0100, Dave Martin wrote:
> I actually wonder whether fncpy() contains a buglet, whereby
> flush_icache_range() is used instead of coherent_kern_range().
> The SRAM is probably not mapped cached, but at least a DSB would be
> needed before flushing the relevant lines from the I-cache.

flush_icache_range() is correct - it's there to ensure that memory which
has been written will be readable to the instruction stream.  That's its
whole purpose, and it's used when modules are loaded too.

You're reading too much into the name: it doesn't just touch the I cache.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 11:14     ` Russell King - ARM Linux
  2013-09-06 16:40       ` Dave Martin
@ 2013-09-06 18:40       ` Russ Dill
  1 sibling, 0 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-06 18:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 6, 2013 at 4:14 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Sep 04, 2013 at 02:47:51PM -0700, Russ Dill wrote:
>> I think this is already taken care of by the way sram.c is using
>> genalloc. The allocation returned should be aligned to 32 bytes. The
>> thumb bit shouldn't be an issue as code is copied based on the start
>> and end makers made by the linker. I may need to add .align statements
>> in the linker so that the start and end markers for the copied code
>> are aligned to at least 8 bytes.
>
> I think you need to read up on what fncpy does... there's more to it
> than just merely copying code at an appropriate alignment.

Yes, I need to add a pair of inlines that do the asm trickery to/from
function addresses. Thanks.
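
For reference, the address arithmetic involved can be sketched in plain
C. This is only an approximation under stated assumptions: fncpy() does
the pointer conversions through empty asm statements so the compiler
cannot second-guess them, and it flushes the caches after the copy. The
helper names and CODE_ALIGN below are hypothetical, and ordinary data
buffers stand in for code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CODE_ALIGN 8	/* alignment assumed for the destination buffer */

/*
 * Address of the first instruction: the function pointer with the
 * Thumb bit (bit 0) cleared.
 */
static uintptr_t fn_to_code_addr(uintptr_t fn)
{
	return fn & ~(uintptr_t)1;
}

/*
 * Function pointer for a copy placed at dest, carrying over the Thumb
 * bit of the original pointer.
 */
static uintptr_t code_addr_to_fn(uintptr_t dest, uintptr_t orig_fn)
{
	return dest | (orig_fn & 1);
}

/* Returns the callable address of the copy, or 0 if dest is misaligned. */
static uintptr_t copy_fn(void *dest, uintptr_t orig_fn, size_t size)
{
	if ((uintptr_t)dest & (CODE_ALIGN - 1))
		return 0;
	memcpy(dest, (const void *)fn_to_code_addr(orig_fn), size);
	/* a real implementation would call flush_icache_range() here */
	return code_addr_to_fn((uintptr_t)dest, orig_fn);
}
```

The point is that the copy must start from the pointer with bit 0 cleared,
while the pointer handed back to callers must have bit 0 restored.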


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-03 16:44 [RFC 0/4] Create infrastructure for running C code from SRAM Russ Dill
       [not found] ` <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>
  2013-09-04 19:52 ` [RFC 0/4] Create infrastructure for running C code from SRAM Emilio López
@ 2013-09-06 11:12 ` Russell King - ARM Linux
  2013-09-06 16:19   ` Dave Martin
  2013-09-06 19:32   ` Russ Dill
  2 siblings, 2 replies; 17+ messages in thread
From: Russell King - ARM Linux @ 2013-09-06 11:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
> SRAM handling code is in the process of being moved from arch directories
> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
> patchset builds on that, including the limitation that the SRAM address is
> not known at compile time. Because the SRAM address is not known at compile
> time, the code that runs from SRAM must be compiled with -fPIC. Even if
> the code were loaded to a fixed virtual address, portions of the code must
> often be run with the MMU disabled.

What are you doing about the various gcc utility functions that may be
implicitly called from C code such as memcpy and memset?
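
To make the concern concrete, here is a small sketch of C that contains
no explicit library calls but for which a compiler may still emit calls
to memcpy() or memset(), depending on target and optimisation level. The
struct and function names are hypothetical.

```c
#include <stddef.h>

struct regs_snapshot {		/* hypothetical register save area */
	unsigned int r[32];
};

/*
 * A plain struct assignment: the compiler is free to lower this to a
 * call to memcpy(), which would live in vmlinux rather than in SRAM.
 */
static void save_regs(struct regs_snapshot *dst,
		      const struct regs_snapshot *src)
{
	*dst = *src;
}

/* Zero-initialising an aggregate may similarly become a memset() call. */
static unsigned int sum_scratch(void)
{
	unsigned int scratch[64] = { 0 };
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < 64; i++)
		sum += scratch[i];
	return sum;
}
```

Whether a call is actually emitted can only be checked by inspecting the
generated code, which is part of what makes this hard to guarantee.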

> The general idea is that for each SRAM user (such as an SoC specific
> suspend/resume mechanism) to create a group of sections. The section group
> is created with a single macro for each user, but end up looking like this:
> 
> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>   __sram_am33xx_start = .;
>   *(.sram.am33xx.*)
>   __sram_am33xx_end = .;
> }
> 
> Any data or functions that should be copied to SRAM for this use should be
> maked with an appropriate __section() attribute. A helper is then added for
> translating between the original kernel symbol, and the address of that
> function or variable once it has been copied into SRAM. Once control is
> passed to a function within the SRAM section grouping, it can access any
> variables or functions within that same SRAM section grouping without
> translation.

What about the relocations which will need to be fixed up - eg, addresses
in the literal pool, the GOT table contents, etc?  You say nothing about
this.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 11:12 ` Russell King - ARM Linux
@ 2013-09-06 16:19   ` Dave Martin
  2013-09-06 19:42     ` Russ Dill
  2013-09-06 19:32   ` Russ Dill
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Martin @ 2013-09-06 16:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 06, 2013 at 12:12:21PM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
> > SRAM handling code is in the process of being moved from arch directories
> > into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
> > patchset builds on that, including the limitation that the SRAM address is
> > not known at compile time. Because the SRAM address is not known at compile
> > time, the code that runs from SRAM must be compiled with -fPIC. Even if
> > the code were loaded to a fixed virtual address, portions of the code must
> > often be run with the MMU disabled.
> 
> What are you doing about the various gcc utility functions that may be
> implicitly called from C code such as memcpy and memset?
> 
> > The general idea is that for each SRAM user (such as an SoC specific
> > suspend/resume mechanism) to create a group of sections. The section group
> > is created with a single macro for each user, but end up looking like this:
> > 
> > .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
> >   __sram_am33xx_start = .;
> >   *(.sram.am33xx.*)
> >   __sram_am33xx_end = .;
> > }
> > 
> > Any data or functions that should be copied to SRAM for this use should be
> > maked with an appropriate __section() attribute. A helper is then added for
> > translating between the original kernel symbol, and the address of that
> > function or variable once it has been copied into SRAM. Once control is
> > passed to a function within the SRAM section grouping, it can access any
> > variables or functions within that same SRAM section grouping without
> > translation.
> 
> What about the relocations which will need to be fixed up - eg, addresses
> in the literal pool, the GOT table contents, etc?  You say nothing about
> this.

I was also thinking about this, and there are more problems.

As well as what has already been mentioned:

 * Calls from inside the SRAM code to vmlinux (including lib1funcs etc.)
   will typically break, except on architectures where function calls are
   absolute by default (not ARM).

 * The compiler/linker won't detect unsafe constructs or code generation,
   because it assumes that anything built with -fPIC is going to be patched
   up later by ld.so or equivalent.

 * The GOT is generated by the linker, and is a single table.  Yet each
   SRAM blob needs to be able to refer to its own GOT entries position-
   independently.  Moving the blobs independently won't work.

In other words: -fPIC does not generate position-independent code.

It generates position-dependent code that is easier to move around than
non-fPIC code, but you still need a dynamic linker (or equivalent) to
make it all work.

There are various "correct" ways to handle this, the simplest of which
is probably to build each SRAM blob as a kernel module, embed the result
in the kernel somehow, and then use the module loader infrastructure
to handle fixing the module up to the right address.

But this is still likely to be overkill, given the small scale of the
SRAM code.

Restricting such code to carefully-written assembler (as now) may be
the more practical approach, unless there's a good example of somewhere
that C code would provide a big benefit.

Cheers
---Dave


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 16:19   ` Dave Martin
@ 2013-09-06 19:42     ` Russ Dill
  0 siblings, 0 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-06 19:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 6, 2013 at 9:19 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Sep 06, 2013 at 12:12:21PM +0100, Russell King - ARM Linux wrote:
>> On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
>> > SRAM handling code is in the process of being moved from arch directories
>> > into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
>> > patchset builds on that, including the limitation that the SRAM address is
>> > not known at compile time. Because the SRAM address is not known at compile
>> > time, the code that runs from SRAM must be compiled with -fPIC. Even if
>> > the code were loaded to a fixed virtual address, portions of the code must
>> > often be run with the MMU disabled.
>>
>> What are you doing about the various gcc utility functions that may be
>> implicitly called from C code such as memcpy and memset?
>>
>> > The general idea is that for each SRAM user (such as an SoC specific
>> > suspend/resume mechanism) to create a group of sections. The section group
>> > is created with a single macro for each user, but end up looking like this:
>> >
>> > .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>> >   __sram_am33xx_start = .;
>> >   *(.sram.am33xx.*)
>> >   __sram_am33xx_end = .;
>> > }
>> >
>> > Any data or functions that should be copied to SRAM for this use should be
>> > marked with an appropriate __section() attribute. A helper is then added for
>> > translating between the original kernel symbol, and the address of that
>> > function or variable once it has been copied into SRAM. Once control is
>> > passed to a function within the SRAM section grouping, it can access any
>> > variables or functions within that same SRAM section grouping without
>> > translation.
>>
>> What about the relocations which will need to be fixed up - eg, addresses
>> in the literal pool, the GOT table contents, etc?  You say nothing about
>> this.
>
> I was also thinking about this, and there are more problems.
>
> As well as what has already been mentioned:
>
>  * Calls from inside the SRAM code to vmlinux (including lib1funcs etc.)
>    will typically break, except on architectures where function calls are
>    absolute by default (not ARM).

As in the response to RMK, I think compiler flags are enough to
prevent implicit memcpy/memset calls. The code would not be allowed to
do divisions, modulo, or 64-bit multiplication. A make rule would
check the sram sections for any dynamically relocatable symbols.

>  * The compiler/linker won't detect unsafe constructs or code generation,
>    because it assumes that anything built with -fPIC is going to be patched
>    up later by ld.so or equivalent.

Can you provide examples of what some of these other unsafe constructs might be?

>  * The GOT is generated by the linker, and is a single table.  Yet each
>    SRAM blob needs to be able to refer to its own GOT entries position-
>    independently.  Moving the blobs independently won't work.

Would GOT entries only exist if there are accesses to .data or .bss?
The SRAM C code would not support such a thing, only access to data
and text within the SRAM grouping is allowed. Is there a way to make
the compiler or linker complain if such an access is done? If not,
it'd be another make rule as above.

> In other words: -fPIC does not generate position-independent code.
>
> It generates position-dependent code that is easier to move around than
> non-fPIC code, but you still need a dynamic linker (or equivalent) to
> make it all work.

arch/arm/boot/compressed/ seems to manage it. Hopefully, by allowing
only more limited code, I can get by with fewer tricks.

> There are various "correct" ways to handle this, the simplest of which
> is probably to build each SRAM blob as a kernel module, embed the result
> in the kernel somehow, and then use the module loader infrastructure
> to handle fixing the module up to the right address.
>
> But this is still likely to be overkill, given the small scale of the
> SRAM code.

Yes, I'm pretty sure several people would scream rather loudly if
getting suspend/resume support on their platform required
CONFIG_MODULES=y.

> Restricting such code to carefully-written assembler (as now) may be
> the more practical approach, unless there's a good example of somewhere
> that C code would provide a big benefit.

There are currently about 5000 or so lines of assembly code in
arch/arm that are used for suspend/resume stubs. In one stage of
am335x development, the sleep/resume stub for am335x was about 1200
lines long. Since then, a lot of that code has been moved to a
firmware blob, but there has been some pushback on that, which is why
I'm investigating this path. Especially given that there are some
future platforms that will follow the am335x pm model.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 11:12 ` Russell King - ARM Linux
  2013-09-06 16:19   ` Dave Martin
@ 2013-09-06 19:32   ` Russ Dill
  2013-09-07 16:21     ` Ard Biesheuvel
  1 sibling, 1 reply; 17+ messages in thread
From: Russ Dill @ 2013-09-06 19:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 6, 2013 at 4:12 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
>> SRAM handling code is in the process of being moved from arch directories
>> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
>> patchset builds on that, including the limitation that the SRAM address is
>> not known at compile time. Because the SRAM address is not known at compile
>> time, the code that runs from SRAM must be compiled with -fPIC. Even if
>> the code were loaded to a fixed virtual address, portions of the code must
>> often be run with the MMU disabled.
>
> What are you doing about the various gcc utility functions that may be
> implicitly called from C code such as memcpy and memset?

That would create a problem. Would '-ffreestanding' be the correct
flag to add? As far as the family of __aeabi_*, I need to add
documentation stating that on ARM, you can't divide, perform modulo,
and can't do 64 bit multiplications. I can then add a make rule that
will grep the symbol lists of .sram sections for ^__aeabi_. Is this
enough?
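
The check proposed above could be a small host-side tool run from a
make rule. The following is only a sketch: it assumes the symbol list
has already been limited to the extracted SRAM object and is fed in as
nm-style lines, and the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the last whitespace-separated field of one nm-style line,
 * which is the symbol name ("         U __aeabi_uidiv" -> "__aeabi_uidiv"). */
static const char *nm_symbol_name(const char *line)
{
	const char *name = line;
	const char *p;

	for (p = line; *p; p++) {
		bool ws = (*p == ' ' || *p == '\t');
		bool next_is_text = p[1] != '\0' && p[1] != ' ' &&
				    p[1] != '\t' && p[1] != '\n';

		if (ws && next_is_text)
			name = p + 1;
	}
	return name;
}

/* True if the line names a libgcc EABI helper such as __aeabi_uidiv. */
static bool is_forbidden_helper(const char *nm_line)
{
	static const char prefix[] = "__aeabi_";

	return strncmp(nm_symbol_name(nm_line), prefix,
		       sizeof(prefix) - 1) == 0;
}

/* Count offending lines; a make rule would fail the build if this is
 * non-zero. */
static size_t count_forbidden(const char *const *lines, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (is_forbidden_helper(lines[i]))
			hits++;
	return hits;
}
```

This only catches helpers that show up by name. It says nothing about
other relocations, which need a separate check.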

>> The general idea is that for each SRAM user (such as an SoC specific
>> suspend/resume mechanism) to create a group of sections. The section group
>> is created with a single macro for each user, but end up looking like this:
>>
>> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>>   __sram_am33xx_start = .;
>>   *(.sram.am33xx.*)
>>   __sram_am33xx_end = .;
>> }
>>
>> Any data or functions that should be copied to SRAM for this use should be
>> marked with an appropriate __section() attribute. A helper is then added for
>> translating between the original kernel symbol, and the address of that
>> function or variable once it has been copied into SRAM. Once control is
>> passed to a function within the SRAM section grouping, it can access any
>> variables or functions within that same SRAM section grouping without
>> translation.
>
> What about the relocations which will need to be fixed up - eg, addresses
> in the literal pool, the GOT table contents, etc?  You say nothing about
> this.

The C code would need to be written so that such accesses do not
occur. From functions that are in the sram text section, only accesses
to other sram sections in their group would be allowed. And above, a
compilation step could be added to make the compilation fail when such
things happen.
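
For reference, the symbol translation described in the quoted cover
letter boils down to offset arithmetic within one section group. A
minimal sketch follows, assuming a hypothetical struct sram_group
filled in from the __sram_*_start/__sram_*_end linker symbols and the
address returned by the SRAM allocator. None of these names come from
the patchset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one SRAM section group. */
struct sram_group {
	uintptr_t link_start;	/* e.g. address of __sram_am33xx_start */
	uintptr_t link_end;	/* e.g. address of __sram_am33xx_end */
	uintptr_t sram_base;	/* where the group was copied in SRAM */
};

/* Translate the kernel (link-time) address of a symbol in the group
 * to its address in the SRAM copy. Returns 0 if the symbol is not
 * part of this group, since such a symbol was never copied. */
static inline uintptr_t sram_translate(const struct sram_group *g,
				       uintptr_t sym)
{
	if (sym < g->link_start || sym >= g->link_end)
		return 0;
	return g->sram_base + (sym - g->link_start);
}
```

Code already running inside the group does not need this, as long as
it only uses PC-relative references to its own group.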

The direction you are going in is good, though: if this is fragile, it
doesn't put us in a better place than having fragile asm that at least
compiles reliably. As I'm looking towards the future and platforms
that are similar to am335x in that they are placing more and more of
the PM state machine burden on the CPU, I'd really like to try to make
this happen.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-06 19:32   ` Russ Dill
@ 2013-09-07 16:21     ` Ard Biesheuvel
  2013-09-09 23:10       ` Russ Dill
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2013-09-07 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 6 September 2013 21:32, Russ Dill <Russ.Dill@ti.com> wrote:
> On Fri, Sep 6, 2013 at 4:12 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
>>> SRAM handling code is in the process of being moved from arch directories
>>> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
>>> patchset builds on that, including the limitation that the SRAM address is
>>> not known at compile time. Because the SRAM address is not known at compile
>>> time, the code that runs from SRAM must be compiled with -fPIC. Even if
>>> the code were loaded to a fixed virtual address, portions of the code must
>>> often be run with the MMU disabled.
>>
>> What are you doing about the various gcc utility functions that may be
>> implicitly called from C code such as memcpy and memset?
>
> That would create a problem. Would '-ffreestanding' be the correct
> flag to add?

No, unfortunately, -ffreestanding won't prevent GCC from generating
implicit calls to memzero() et al. These are mainly issued when using
initialized non-POD stack variables so avoiding those might help you
there.
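
As an illustration of avoiding such initializers, the sketch below
leaves the stack variable uninitialized and clears it with explicit
volatile stores, which the compiler has to emit as written rather than
turn into a library call. struct pm_regs and both function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register shadow; contains only 32-bit words. */
struct pm_regs {
	uint32_t ctrl;
	uint32_t timing[6];
	uint32_t status;
};

/* Zero a word-aligned region one volatile store at a time. */
static inline void sram_zero_words(volatile uint32_t *dst, size_t nwords)
{
	while (nwords--)
		*dst++ = 0;
}

static uint32_t sram_prepare(void)
{
	struct pm_regs regs;	/* deliberately no "= { 0 }" initializer */

	sram_zero_words((volatile uint32_t *)&regs,
			sizeof(regs) / sizeof(uint32_t));
	regs.ctrl = 1;

	/* Read back a few fields so the effect is observable. */
	return regs.ctrl + regs.timing[5] + regs.status;
}
```

Whether a given initializer or struct copy becomes a call depends on
the GCC version, the optimization level and the object size, so the
generated code still needs to be checked.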

> As far as the family of __aeabi_*, I need to add
> documentation stating that on ARM, you can't divide, perform modulo,
> and can't do 64 bit multiplications. I can then add a make rule that
> will grep the symbol lists of .sram sections for ^__aeabi_. Is this
> enough?
>

Well, even printk() needs integer division for its %d/%u modifiers, so
this is really not so easy to achieve.

>>> The general idea is that for each SRAM user (such as an SoC specific
>>> suspend/resume mechanism) to create a group of sections. The section group
>>> is created with a single macro for each user, but end up looking like this:
>>>
>>> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>>>   __sram_am33xx_start = .;
>>>   *(.sram.am33xx.*)
>>>   __sram_am33xx_end = .;
>>> }
>>>
>>> Any data or functions that should be copied to SRAM for this use should be
>>> marked with an appropriate __section() attribute. A helper is then added for
>>> translating between the original kernel symbol, and the address of that
>>> function or variable once it has been copied into SRAM. Once control is
>>> passed to a function within the SRAM section grouping, it can access any
>>> variables or functions within that same SRAM section grouping without
>>> translation.
>>
>> What about the relocations which will need to be fixed up - eg, addresses
>> in the literal pool, the GOT table contents, etc?  You say nothing about
>> this.
>
> The C code would need to be written so that such accesses do not
> occur. From functions that are in the sram text section, only accesses
> to other sram sections in their group would be allowed. And above, a
> compilation step could be added to make the compilation fail when such
> things happen.
>

The point is that, sadly, GCC is just not very good at generating
relocatable code for embedded targets. Playing with -fvisibility may
result in code that contains fewer dynamic relocations, but you will
always end up with a few that need to be fixed up before the code can
run. Another thing to note is that usually, these relocations can only
be fixed up once, as the addend is overwritten by the fixed-up
address. This means that the code can only run in SRAM, and you should
probably avoid the module loader machinery, as it may clobber the
addends before you get to process them.

One thing that remains implicit in this discussion is that you are
executing from SRAM because DRAM is not available (I presume).
Wouldn't it be better to treat the code that lives in the SRAM as a
completely separate executable? You can generate a PIE executable that
supplies minimal memzero et al., fix up the relocations yourself (look
at the U-Boot sources for an example of this) and you will be
absolutely sure that the code can run completely autonomously. In
fact, some of this stuff could potentially be reused for other
disjoint execution domains such as TZ secure world.
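
To make the fixup step concrete, here is a rough sketch of applying
R_ARM_RELATIVE entries to a copied image. It is not taken from U-Boot.
It assumes REL-style entries (addend stored in place), an image linked
at address 0 so that r_offset is an offset into the copy, and locally
defined types with made-up names so that it does not depend on any
particular header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_R_ARM_RELATIVE	23u	/* value from the ARM ELF ABI */

/* Same layout as an ELF32 REL entry; local name for illustration. */
struct sketch_rel {
	uint32_t r_offset;	/* location to patch */
	uint32_t r_info;	/* low 8 bits: relocation type */
};

/* Add delta (load address minus link address) to every word named by
 * an R_ARM_RELATIVE entry. Returns 0 on success, -1 on an unsupported
 * relocation type or an out-of-range offset. */
static int sketch_apply_relative(uint8_t *image, size_t image_size,
				 const struct sketch_rel *rel, size_t nrel,
				 uint32_t delta)
{
	size_t i;

	for (i = 0; i < nrel; i++) {
		uint32_t word;

		if ((rel[i].r_info & 0xff) != SKETCH_R_ARM_RELATIVE)
			return -1;
		if (image_size < sizeof(word) ||
		    rel[i].r_offset > image_size - sizeof(word))
			return -1;

		/* The addend lives in the word itself and is overwritten
		 * by the result. */
		memcpy(&word, image + rel[i].r_offset, sizeof(word));
		word += delta;
		memcpy(image + rel[i].r_offset, &word, sizeof(word));
	}
	return 0;
}
```

Running it a second time on the same copy adds delta again, which is
the "can only be fixed up once" problem: fixups should be applied to a
fresh copy of the pristine image.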

Regards,
Ard.

> The direction you are going though is good, if this is fragile, it
> doesn't put us in a better place than having fragile asm that at least
> compiles reliably. As I'm looking towards the future and platforms
> that are similar to am335x in that they are placing more and more of
> the PM state machine burden on the CPU, I'd really like to try to make
> this happen.


* [RFC 0/4] Create infrastructure for running C code from SRAM.
  2013-09-07 16:21     ` Ard Biesheuvel
@ 2013-09-09 23:10       ` Russ Dill
  0 siblings, 0 replies; 17+ messages in thread
From: Russ Dill @ 2013-09-09 23:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Sep 7, 2013 at 9:21 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 6 September 2013 21:32, Russ Dill <Russ.Dill@ti.com> wrote:
>> On Fri, Sep 6, 2013 at 4:12 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Tue, Sep 03, 2013 at 09:44:21AM -0700, Russ Dill wrote:
>>>> SRAM handling code is in the process of being moved from arch directories
>>>> into drivers/misc/sram.c using device tree and genalloc [1] [2]. This RFC
>>>> patchset builds on that, including the limitation that the SRAM address is
>>>> not known at compile time. Because the SRAM address is not known at compile
>>>> time, the code that runs from SRAM must be compiled with -fPIC. Even if
>>>> the code were loaded to a fixed virtual address, portions of the code must
>>>> often be run with the MMU disabled.
>>>
>>> What are you doing about the various gcc utility functions that may be
>>> implicitly called from C code such as memcpy and memset?
>>
>> That would create a problem. Would '-ffreestanding' be the correct
>> flag to add?
>
> No, unfortunately, -ffreestanding won't prevent GCC from generating
> implicit calls to memzero() et al. These are mainly issued when using
> initialized non-POD stack variables so avoiding those might help you
> there.
>> As far as the family of __aeabi_*, I need to add
>> documentation stating that on ARM, you can't divide, perform modulo,
>> and can't do 64 bit multiplications. I can then add a make rule that
>> will grep the symbol lists of .sram sections for ^__aeabi_. Is this
>> enough?
>>
>
> Well, even printk() needs integer division for its %d/%u modifiers, so
> this is really not so easy to achieve.
>
>>>> The general idea is that for each SRAM user (such as an SoC specific
>>>> suspend/resume mechanism) to create a group of sections. The section group
>>>> is created with a single macro for each user, but end up looking like this:
>>>>
>>>> .sram.am33xx : AT(ADDR(.sram.am33xx) - 0) {
>>>>   __sram_am33xx_start = .;
>>>>   *(.sram.am33xx.*)
>>>>   __sram_am33xx_end = .;
>>>> }
>>>>
>>>> Any data or functions that should be copied to SRAM for this use should be
>>>> marked with an appropriate __section() attribute. A helper is then added for
>>>> translating between the original kernel symbol, and the address of that
>>>> function or variable once it has been copied into SRAM. Once control is
>>>> passed to a function within the SRAM section grouping, it can access any
>>>> variables or functions within that same SRAM section grouping without
>>>> translation.
>>>
>>> What about the relocations which will need to be fixed up - eg, addresses
>>> in the literal pool, the GOT table contents, etc?  You say nothing about
>>> this.
>>
>> The C code would need to be written so that such accesses do not
>> occur. From functions that are in the sram text section, only accesses
>> to other sram sections in their group would be allowed. And above, a
>> compilation step could be added to make the compilation fail when such
>> things happen.
>>
>
> The point is that, sadly, GCC is just not very good at generating
> relocatable code for embedded targets. Playing with -fvisibility may
> result in code that contains fewer dynamic relocations, but you will
> always end up with a few that need to be fixed up before the code can
> run. Another thing to note is that usually, these relocations can only
> be fixed up once, as the addend is overwritten by the fixed-up
> address. This means that the code can only run in SRAM, and you should
> probably best avoid the module loader machinery as it may clobber the
> addends before you get to process them.
>
> One thing that remains implicit in this discussion is that you are
> executing from SRAM because DRAM is not available (I presume).
> Wouldn't it be better to treat the code that lives in the SRAM as a
> completely separate executable? You can generate a PIE executable that
> supplies minimal memzero et al,  fixup the relocations yourself (look
> at the uboot sources for an example of this) and you will be
> absolutely sure that the code can run completely autonomously. In
> fact, some of this stuff could potentially be reused for other
> disjoint execution domains such as TZ secure world.

This is the path I'm going down, but I'm trying to do it without
relocations. I'm following the model of arch/arm/boot/compressed and
generating a relocatable gcc builtin library with weak symbols
containing lib1funcs.S, string.c, ashldi3.S, and some stubs for div0
and the unwind symbols, all in sramlib.o.

I'm then doing an objcopy of the .sramlib section, and the .sram.*
sections into a single object file and performing a link with a linker
script like:

SECTIONS
{
    .text : { *(.sramlib) }

    OVERLAY ALIGN(32) : NOCROSSREFS
    {
        .sram.am33xx { *(.sram.am33xx.*) }
        .sram.am437x { *(.sram.am437x.*) }
    }
}

It produces output without any relocations, but from there I'm a
little fuzzy on how to get the symbols of functions and variables into
the kernel. In the meantime, I'll look into the u-boot methods.


Thread overview: 17+ messages
2013-09-03 16:44 [RFC 0/4] Create infrastructure for running C code from SRAM Russ Dill
     [not found] ` <1378226665-27090-4-git-send-email-Russ.Dill@ti.com>
2013-09-04 18:06   ` [RFC 3/4] Misc: SRAM: Hack for allowing executable code in SRAM Tony Lindgren
2013-09-06 20:50     ` Russ Dill
2013-09-04 19:52 ` [RFC 0/4] Create infrastructure for running C code from SRAM Emilio López
2013-09-04 21:47   ` Russ Dill
2013-09-06 11:02     ` Sekhar Nori
2013-09-06 11:14     ` Russell King - ARM Linux
2013-09-06 16:40       ` Dave Martin
2013-09-06 18:50         ` Russ Dill
2013-09-07  8:57         ` Russell King - ARM Linux
2013-09-06 18:40       ` Russ Dill
2013-09-06 11:12 ` Russell King - ARM Linux
2013-09-06 16:19   ` Dave Martin
2013-09-06 19:42     ` Russ Dill
2013-09-06 19:32   ` Russ Dill
2013-09-07 16:21     ` Ard Biesheuvel
2013-09-09 23:10       ` Russ Dill