[RFC 0/8] Introducing a generic AMP/IPC framework

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [RFC 0/8] Introducing a generic AMP/IPC framework
@ 2011-06-21  7:18 Ohad Ben-Cohen
       [not found] ` <BANLkTimn-THKipuQVHLA9i3QSoO5RTqMtA@mail.gmail.com>
                   ` (8 more replies)
  0 siblings, 9 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-21  7:18 UTC (permalink / raw)
  To: linux-arm-kernel

Modern SoCs typically employ a central symmetric multiprocessing (SMP)
application processor running Linux, with several other asymmetric
multiprocessing (AMP) heterogeneous processors running different instances
of operating system, whether Linux or any other flavor of real-time OS.

OMAP4, for example, has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP.
Typically, the dual cortex-A9 is running Linux in a SMP configuration, and
each of the other three cores (two M3 cores and a DSP) is running its own
instance of RTOS in an AMP configuration.

AMP remote processors typically employ dedicated DSP codecs and multimedia
hardware accelerators, and therefore are often used to offload cpu-intensive
multimedia tasks from the main application processor. They could also be
used to control latency-sensitive sensors, drive "random" hardware blocks,
or just perform background tasks while the main CPU is idling.

Users of those remote processors can either be userland apps (e.g.
multimedia frameworks talking with remote OMX components) or kernel drivers
(controlling hardware accessible only by the remote processor, reserving
kernel-controlled resources on behalf of the remote processor, etc..).

This patch set adds a generic AMP/IPC framework which makes it possible to
control (power on, boot, power off) and communicate (simply send and receive
messages) with those remote processors.

Specifically, we're adding:

* Rpmsg: a virtio-based messaging bus that allows kernel drivers to
communicate with remote processors available on the system. In turn,
drivers could then expose appropriate user space interfaces, if needed
(tasks running on remote processors often have direct access to sensitive
resources like the system's physical memory, gpios, i2c buses, dma
controllers, etc..  so one normally wouldn't want to allow userland to
send everything/everywhere it wants).

Every rpmsg device is a communication channel with a service running on a
remote processor (thus rpmsg devices are called channels). Channels are
identified by a textual name (which is used to match drivers to devices)
and have a local ("source") rpmsg address, and remote ("destination") rpmsg
address. When a driver starts listening on a channel (most commonly when it
is probed), the bus assigns the driver a unique rpmsg src address (a 32 bit
integer) and binds it with the driver's rx callback handler. This way
when inbound messages arrive to this src address, the rpmsg core dispatches
them to that driver, by invoking the driver's rx handler with the payload
of the incoming message.

Once probed, rpmsg drivers can also immediately start sending messages to the
remote rpmsg service by using simple sending API; no need even to specify
a destination address, since that's part of the rpmsg channel, and the rpmsg
bus uses the channel's dst address when it constructs the message (for
more demanding use cases, there's also an extended API, which does allow
full control of both the src and dst addresses).

The rpmsg bus is using virtio to send and receive messages: every pair
of processors share two vrings, which are used to send and receive the
messages over shared memory (one vring is used for tx, and the other one
for rx). Kicking the remote processor (i.e. letting it know it has a pending
message on its vring) is accomplished by means available on the platform we
run on (e.g. OMAP is using its mailbox to both interrupt the remote processor
and tell it which vring is kicked at the same time). The header of every
message sent on the rpmsg bus contains src and dst addresses, which make it
possible to multiplex several rpmsg channels on the same vring.

One nice property of the rpmsg bus is that device creation is completely
dynamic: remote processors can announce the existence of remote rpmsg
services by sending a "name service" messages (which contain the name and
rpmsg addr of the remote service). Those messages are picked by the rpmsg
bus, which in turn dynamically creates and registers the rpmsg channels
(i.e devices) which represents the remote services. If/when a relevant rpmsg
driver is registered, it will be immediately probed by the bus, and can then
start "talking" to the remote service.

Similarly, we can use this technique to dynamically create virtio devices
(and new vrings) which would then represent e.g. remote network, console
and block devices that will be driven by the existing virtio drivers
(this is still not implemented though; it requires some RTOS work as we're
not booting Linux on OMAP's remote processors). Creating new vrings might
also be desired by users who just don't want to use the shared rpmsg vrings
(for performance or any other functionality reasons).

In addition to dynamic creation of rpmsg channels, the rpmsg bus also
supports creation of static channels. This is needed in two cases:
- when a certain remote processor doesn't support sending those "name
  service" announcements. In that case, a static table of remote rpmsg
  services must be used to create the rpmsg channels.
- to support rpmsg server drivers, which aren't bound to a specific remote
  rpmsg address. Instead, they just listen on a local address, waiting for
  incoming messages. To send a message, those server drivers need to use
  the rpmsg_sendto() API, so they can explicitly indicate the dst address
  every time.

There are already several immediate use cases for rpmsg drivers: OMX
offloading (already being used on OMAP4), hardware resource manager (remote
processors on OMAP4 need to ask Linux to enable/disable hardware resources
on its behalf), remote display driver on Netra (dm8168), where the display
is controlled by a remote M3 processor (and a Linux v4l2/fbdev driver will
use rpmsg to communicate with that remote display driver).

* Remoteproc: a generic driver that maintains the state of the remote
processor(s). Simple rproc_get() and rproc_put() API is exposed, which
drivers can use when needed (first driver to call get() will load a firmware,
configure an iommu if needed, and boot the remote processor, while last
driver to call put() will power it down).

Hardware differences are abstracted as usual: a platform-specific driver
registers its own start/stop handlers in the generic remoteproc driver,
and those are invoked when its time to power up/down the processor. As a
reference, this patch set include remoteproc support for both OMAP4's
cortex-M3 and Davinci's DSP, tested on the pandaboard and hawkboard,
respectively.

The gory part of remoteproc is the firmware handling. We tried to come up
with a simple binary format that will require minimum kernel code to handle,
but at the same time be generic enough in the hopes that it would prove
useful to others as well. We're not at all hang onto the binary format
we picked: if there's any technical reason to change it to support other
platforms, please let us know. We do realize that a single binary firmware
structure might eventually not work for everyone. it did prove useful for
us though; we adopted both the OMAP and Davinci platforms (and their
completely different remote processor devices) to this simple binary
structure, so we don't have to duplicate the firmware handling code.

It's also not entirely clear whether remoteproc should really be an
independent layer, or if it should just be squashed with the host-specific
component of the rpmsg framework (there isn't really a remoteproc use case
that doesn't need rpmsg anyway). Looking ahead, functionality like error
recovery and power management might require coupling those two together
as well.

Important stuff:

* Thanks Brian Swetland for great design ideas and fruitful meetings and
  Arnd Bergmann for pointing us at virtio, which is just awesome.
* Thanks Bhavin Shah, Mark Grosen, Suman Anna, Fernando Guzman Lugo,
  Shreyas Prasad, Gilbert Pitney, Armando Uribe De Leon, Robert Tivy and
  Alan DeMars for all your help. You know what you did.
* Patches are based on 3.0-rc3. Code was tested on OMAP4 using a panda
  board (and remoteproc was also tested on Davinci da850-evm and hawkboard).
* I will be replying to this email with an attachment of the M3 firmware
  image I work with, so anyone with a panda board can take the code for a
  ride (put the image in /lib/firmware, and tell me if stuff doesn't work
  for you: there are a few omap iommu fixes that are circulating but not
  merged yet).
* Licensing: definitions that needs to be shared with remote processors
  were put in BSD-licensed header files, so anyone can use them to develop
  compatible peers.
* The M3 RTOS source code itself is BSD licensed and will be publicly
  released too (it's in the works).
* This work is not complete; there are still several things to implement,
  improve and clean up, but the intention is to start the review, and not
  wait until everything is finished.

Guzman Lugo, Fernando (2):
  remoteproc: add omap implementation
  omap: add remoteproc devices

Mark Grosen (2):
  remoteproc: add davinci implementation
  davinci: da850: add remoteproc dsp device

Ohad Ben-Cohen (4):
  drivers: add generic remoteproc framework
  omap: add carveout memory support for remoteproc
  drivers: introduce rpmsg, a remote-processor messaging bus
  rpmsg: add omap host backend

 Documentation/ABI/testing/sysfs-bus-rpmsg       |   75 ++
 Documentation/remoteproc.txt                    |  170 ++++
 Documentation/rpmsg.txt                         |  308 +++++++
 arch/arm/mach-davinci/board-da850-evm.c         |    4 +
 arch/arm/mach-davinci/board-omapl138-hawk.c     |    4 +
 arch/arm/mach-davinci/da850.c                   |   14 +
 arch/arm/mach-davinci/devices-da8xx.c           |   15 +
 arch/arm/mach-davinci/include/mach/da8xx.h      |    1 +
 arch/arm/mach-davinci/include/mach/remoteproc.h |   28 +
 arch/arm/mach-omap2/Makefile                    |    2 +
 arch/arm/mach-omap2/remoteproc.c                |  159 ++++
 arch/arm/plat-omap/Kconfig                      |    8 +
 arch/arm/plat-omap/devices.c                    |   14 +-
 arch/arm/plat-omap/include/plat/dsp.h           |    6 +-
 arch/arm/plat-omap/include/plat/remoteproc.h    |   41 +
 drivers/Kconfig                                 |    3 +
 drivers/Makefile                                |    2 +
 drivers/remoteproc/Kconfig                      |   44 +
 drivers/remoteproc/Makefile                     |    7 +
 drivers/remoteproc/davinci_remoteproc.c         |  147 ++++
 drivers/remoteproc/omap_remoteproc.c            |  243 ++++++
 drivers/remoteproc/remoteproc.c                 |  780 +++++++++++++++++
 drivers/rpmsg/Kconfig                           |   16 +
 drivers/rpmsg/Makefile                          |    3 +
 drivers/rpmsg/host/Kconfig                      |   11 +
 drivers/rpmsg/host/Makefile                     |    1 +
 drivers/rpmsg/host/omap_rpmsg.c                 |  540 ++++++++++++
 drivers/rpmsg/host/omap_rpmsg.h                 |   69 ++
 drivers/rpmsg/virtio_rpmsg_bus.c                | 1036 +++++++++++++++++++++++
 include/linux/mod_devicetable.h                 |   10 +
 include/linux/remoteproc.h                      |  273 ++++++
 include/linux/rpmsg.h                           |  421 +++++++++
 include/linux/virtio_ids.h                      |    1 +
 33 files changed, 4453 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-rpmsg
 create mode 100644 Documentation/remoteproc.txt
 create mode 100644 Documentation/rpmsg.txt
 create mode 100644 arch/arm/mach-davinci/include/mach/remoteproc.h
 create mode 100644 arch/arm/mach-omap2/remoteproc.c
 create mode 100644 arch/arm/plat-omap/include/plat/remoteproc.h
 create mode 100644 drivers/remoteproc/Kconfig
 create mode 100644 drivers/remoteproc/Makefile
 create mode 100644 drivers/remoteproc/davinci_remoteproc.c
 create mode 100644 drivers/remoteproc/omap_remoteproc.c
 create mode 100644 drivers/remoteproc/remoteproc.c
 create mode 100644 drivers/rpmsg/Kconfig
 create mode 100644 drivers/rpmsg/Makefile
 create mode 100644 drivers/rpmsg/host/Kconfig
 create mode 100644 drivers/rpmsg/host/Makefile
 create mode 100644 drivers/rpmsg/host/omap_rpmsg.c
 create mode 100644 drivers/rpmsg/host/omap_rpmsg.h
 create mode 100644 drivers/rpmsg/virtio_rpmsg_bus.c
 create mode 100644 include/linux/remoteproc.h
 create mode 100644 include/linux/rpmsg.h

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <BANLkTimn-THKipuQVHLA9i3QSoO5RTqMtA@mail.gmail.com>]

* [RFC 0/8] Introducing a generic AMP/IPC framework
       [not found] ` <BANLkTimn-THKipuQVHLA9i3QSoO5RTqMtA@mail.gmail.com>
@ 2011-06-21  9:30   ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-21  9:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 10:50 AM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> root at omap4430-panda:~# cat /debug/remoteproc/omap-rproc.1/state
> running (2)

At this point, the two remote M3 cores also start dumping trace logs to
shared memory buffers, which are exposed by debugfs entries:

root at omap4430-panda:~# cat /debug/remoteproc/omap-rproc.1/trace0
CORE0 starting..
...

root at omap4430-panda:~# cat /debug/remoteproc/omap-rproc.1/trace1
CORE1 starting..
...

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-21  7:18 [RFC 0/8] Introducing a generic AMP/IPC framework Ohad Ben-Cohen
       [not found] ` <BANLkTimn-THKipuQVHLA9i3QSoO5RTqMtA@mail.gmail.com>
@ 2011-06-21 14:20 ` Arnd Bergmann
  2011-06-21 15:54   ` Grant Likely
  2011-06-22 11:41   ` Ohad Ben-Cohen
       [not found] ` <1308640714-17961-2-git-send-email-ohad@wizery.com>
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 63+ messages in thread
From: Arnd Bergmann @ 2011-06-21 14:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ohad,

On Tuesday 21 June 2011, Ohad Ben-Cohen wrote:

> This patch set adds a generic AMP/IPC framework which makes it possible to
> control (power on, boot, power off) and communicate (simply send and receive
> messages) with those remote processors.

This looks really nice overall, but I don't currently have time for a
more in-depth review. My feeling is that you are definitely on the right
track here, and the plans you list as TODO to continue are all good.

One point I noticed is the use of debugfs, which you should probably
replace at some point with a stable API, e.g. your own debugfs-like
file system, but there is no hurry to do that now.

> The gory part of remoteproc is the firmware handling. We tried to come up
> with a simple binary format that will require minimum kernel code to handle,
> but at the same time be generic enough in the hopes that it would prove
> useful to others as well. We're not at all hang onto the binary format
> we picked: if there's any technical reason to change it to support other
> platforms, please let us know. We do realize that a single binary firmware
> structure might eventually not work for everyone. it did prove useful for
> us though; we adopted both the OMAP and Davinci platforms (and their
> completely different remote processor devices) to this simple binary
> structure, so we don't have to duplicate the firmware handling code.

Regarding the firmware, I think it would be nice to have the option
to stick the firmware blob into the flattened device tree as a bininclude,
so you can build systems using the external processors without special
user space support.

The binary format looks reasonable, but if you need anything more complex,
you should probably go straight for ELF files ;-)

The structures you use like

+struct fw_section {
+       u32 type;
+       u64 da;
+       u32 len;
+       char content[0];
+} __packed;

Unfortunately require __packed. It would be better to sort the members
differently so that each member is naturally aligned in order to
avoid the need for padding or __packed attributes, like:

struct fw_section {
       u32 type;
       u32 len;
       u64 da;
       char content[0];
};

	Arnd

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-21 14:20 ` Arnd Bergmann
@ 2011-06-21 15:54   ` Grant Likely
  2011-06-22 11:41   ` Ohad Ben-Cohen
  1 sibling, 0 replies; 63+ messages in thread
From: Grant Likely @ 2011-06-21 15:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 8:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> Hi Ohad,
>
> On Tuesday 21 June 2011, Ohad Ben-Cohen wrote:
>
>> This patch set adds a generic AMP/IPC framework which makes it possible to
>> control (power on, boot, power off) and communicate (simply send and receive
>> messages) with those remote processors.
>
> This looks really nice overall, but I don't currently have time for a
> more in-depth review. My feeling is that you are definitely on the right
> track here, and the plans you list as TODO to continue are all good.

Second that.  I'm happy to see the shift over to virtio, I think that
is the right direction overall.  I'll try to carve out some time for a
detailed review.

g.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-21 14:20 ` Arnd Bergmann
  2011-06-21 15:54   ` Grant Likely
@ 2011-06-22 11:41   ` Ohad Ben-Cohen
  2011-06-22 13:05     ` Arnd Bergmann
  1 sibling, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-22 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

On Tue, Jun 21, 2011 at 5:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> This looks really nice overall, but I don't currently have time for a
> more in-depth review. My feeling is that you are definitely on the right
> track here, and the plans you list as TODO to continue are all good.

Thanks !

> One point I noticed is the use of debugfs, which you should probably
> replace at some point with a stable API, e.g. your own debugfs-like
> file system, but there is no hurry to do that now.

debugfs is only being used to expose debugging info (namely the power
state of the remote processor and its trace log messages), which is
mostly useful for developers trying to understand what went wrong.

It seems like debugfs fits quite nicely here (e.g. it's perfectly fine
if this is completely compiled out on production systems), but sure,
we can always revisit this later too.

> Regarding the firmware, I think it would be nice to have the option
> to stick the firmware blob into the flattened device tree as a bininclude,
> so you can build systems using the external processors without special
> user space support.

That's interesting. we'll definitely explore this - thanks !

I was also hoping that DT will simplify remote resource allocations as
well (just like any other device, remote processors usually needs some
hw resources) and planned to try it out soon to see how it all works
out together. It might completely eliminate the preliminary "resource
section" thingy we have in the firmware structure currently.

> The binary format looks reasonable, but if you need anything more complex,
> you should probably go straight for ELF files ;-)

Hopefully we won't need to :)

> +struct fw_section {
> + ? ? ? u32 type;
> + ? ? ? u64 da;
> + ? ? ? u32 len;
> + ? ? ? char content[0];
> +} __packed;
>
> Unfortunately require __packed. It would be better to sort the members
> differently so that each member is naturally aligned in order to
> avoid the need for padding or __packed attributes

Definitely. __packed is being used just to be on the safe side; we
didn't intend to introduce unnatural alignment intentionally. will be
fixed.

Thanks!
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-22 11:41   ` Ohad Ben-Cohen
@ 2011-06-22 13:05     ` Arnd Bergmann
  2011-06-22 13:09       ` Ohad Ben-Cohen
  0 siblings, 1 reply; 63+ messages in thread
From: Arnd Bergmann @ 2011-06-22 13:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 22 June 2011, Ohad Ben-Cohen wrote:
> > One point I noticed is the use of debugfs, which you should probably
> > replace at some point with a stable API, e.g. your own debugfs-like
> > file system, but there is no hurry to do that now.
> 
> debugfs is only being used to expose debugging info (namely the power
> state of the remote processor and its trace log messages), which is
> mostly useful for developers trying to understand what went wrong.
> 
> It seems like debugfs fits quite nicely here (e.g. it's perfectly fine
> if this is completely compiled out on production systems), but sure,
> we can always revisit this later too.

Ok, I see. In that case I agree that using debugfs is fine, but I would
recommend trying to use fewer macros and just open-coding the file
operations for better readability.

> > Unfortunately require __packed. It would be better to sort the members
> > differently so that each member is naturally aligned in order to
> > avoid the need for padding or __packed attributes
> 
> Definitely. __packed is being used just to be on the safe side; we
> didn't intend to introduce unnatural alignment intentionally. will be
> fixed.

Ok.

	Arnd

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-22 13:05     ` Arnd Bergmann
@ 2011-06-22 13:09       ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-22 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 22, 2011 at 4:05 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> Ok, I see. In that case I agree that using debugfs is fine, but I would
> recommend trying to use fewer macros and just open-coding the file
> operations for better readability.

Sure thing. It didn't end up saving much code eventually, too, so I'll
just remove it.

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <1308640714-17961-2-git-send-email-ohad@wizery.com>]

* [RFC 1/8] drivers: add generic remoteproc framework
       [not found] ` <1308640714-17961-2-git-send-email-ohad@wizery.com>
@ 2011-06-22 17:55   ` Randy Dunlap
  2011-06-22 19:11     ` Ohad Ben-Cohen
  2011-06-27 20:49   ` Grant Likely
  1 sibling, 1 reply; 63+ messages in thread
From: Randy Dunlap @ 2011-06-22 17:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 21 Jun 2011 10:18:27 +0300 Ohad Ben-Cohen wrote:

Hi,
Just a few minor nits inline...


> diff --git a/Documentation/remoteproc.txt b/Documentation/remoteproc.txt
> new file mode 100644
> index 0000000..3075813
> --- /dev/null
> +++ b/Documentation/remoteproc.txt
> @@ -0,0 +1,170 @@
> +Remote Processor Framework
> +
> +1. Introduction
> +
> +Modern SoCs typically have heterogeneous remote processor devices in asymmetric
> +multiprocessing (AMP) configurations, which may be running different instances
> +of operating system, whether it's Linux or any other flavor of real-time OS.
> +
> +OMAP4, for example, has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP.
> +In a typical configuration, the dual cortex-A9 is running Linux in a SMP
> +configuration, and each of the other three cores (two M3 cores and a DSP)
> +is running its own instance of RTOS in an AMP configuration.
> +
> +The generic remoteproc driver allows different platforms/architectures to
> +control (power on, load firmware, power off) those remote processors while
> +abstracting the hardware differences, so the entire driver doesn't need to be
> +duplicated.
> +
> +2. User API
> +
> +  struct rproc *rproc_get(const char *name);
> +   - power up the remote processor, identified by the 'name' argument,
> +     and boot it. If the remote processor is already powered on, the
> +     function immediately succeeds.
> +     On success, returns the rproc handle. On failure, NULL is returned.
> +
> +  void rproc_put(struct rproc *rproc);
> +   - power off the remote processor, identified by the rproc handle.
> +     Every call to rproc_get() must be (eventually) accompanied by a call
> +     to rproc_put(). Calling rproc_put() redundantly is a bug.
> +     Note: the remote processor will actually be powered off only when the
> +     last user calls rproc_put().
> +
> +3. Typical usage
> +
> +#include <linux/remoteproc.h>
> +
> +int dummy_rproc_example(void)
> +{
> +	struct rproc *my_rproc;
> +
> +	/* let's power on and boot the image processing unit */
> +	my_rproc = rproc_get("ipu");
> +	if (!my_rproc) {
> +		/*
> +		 * something went wrong. handle it and leave.
> +		 */
> +	}
> +
> +	/*
> +	 * the 'ipu' remote processor is now powered on... let it work !
> +	 */
> +
> +	/* if we no longer need ipu's services, power it down */
> +	rproc_put(my_rproc);
> +}
> +
> +4. API for implementors
> +
> +  int rproc_register(struct device *dev, const char *name,
> +				const struct rproc_ops *ops,
> +				const char *firmware,
> +				const struct rproc_mem_entry *memory_maps,
> +				struct module *owner);
> +   - should be called from the underlying platform-specific implementation, in
> +     order to register a new remoteproc device. 'dev' is the underlying
> +     device, 'name' is the name of the remote processor, which will be
> +     specified by users calling rproc_get(), 'ops' is the platform-specific
> +     start/stop handlers, 'firmware' is the name of the firmware file to
> +     boot the processor with, 'memory_maps' is a table of da<->pa memory
> +     mappings which should be used to configure the IOMMU (if not relevant,
> +     just pass NULL here), 'owner' is the underlying module that should
> +     not be removed while the remote processor is in use.
> +
> +     Returns 0 on success, or an appropriate error code on failure.
> +
> +  int rproc_unregister(const char *name);
> +   - should be called from the underlying platform-specific implementation, in
> +     order to unregister a remoteproc device that was previously registered
> +     with rproc_register().
> +
> +5. Implementation callbacks
> +
> +Every remoteproc implementation must provide these handlers:
> +
> +struct rproc_ops {
> +	int (*start)(struct rproc *rproc, u64 bootaddr);
> +	int (*stop)(struct rproc *rproc);
> +};
> +
> +The ->start() handler takes a rproc handle and an optional bootaddr argument,

                               an rproc

> +and should power on the device and boot it (using the bootaddr argument
> +if the hardware requires one).
> +On success, 0 is returned, and on failure, an appropriate error code.
> +
> +The ->stop() handler takes a rproc handle and powers the device off.

                              an rproc

> +On success, 0 is returned, and on failure, an appropriate error code.
> +
> +6. Binary Firmware Structure
> +
> +The following enums and structures define the binary format of the images
> +remoteproc loads and boot the remote processors with.

                        boots

> +
> +The general binary format is as follows:
> +
> +struct {
> +      char magic[4] = { 'R', 'P', 'R', 'C' };
> +      u32 version;
> +      u32 header_len;
> +      char header[...] = { header_len bytes of unformatted, textual header };
> +      struct section {
> +          u32 type;
> +          u64 da;
> +          u32 len;
> +          u8 content[...] = { len bytes of binary data };
> +      } [ no limit on number of sections ];
> +} __packed;
> +
> +The image begins with a 4-bytes "RPRC" magic, a version number, and a
> +free-style textual header that users can easily read.
> +
> +After the header, the firmware contains several sections that should be
> +loaded to memory so the remote processor can access them.
> +
> +Every section begins with its type, device address (da) where the remote
> +processor expects to find this section at (exact meaning depends whether

                            drop:         at

> +the device accesses memory through an IOMMU or not. if not, da might just
> +be physical addresses), the section length and its content.
> +
> +Most of the sections are either text or data (which currently are treated
> +exactly the same), but there is one special "resource" section that allows
> +the remote processor to announce/request certain resources from the host.
> +
> +A resource section is just a packed array of the following struct:
> +
> +struct fw_resource {
> +	u32 type;
> +	u64 da;
> +	u64 pa;
> +	u32 len;
> +	u32 flags;
> +	u8 name[48];
> +} __packed;
> +
> +The way a resource is really handled strongly depends on its type.
> +Some resources are just one-way announcements, e.g., a RSC_TRACE type means
> +that the remote processor will be writing log messages into a trace buffer
> +which is located at the address specified in 'da'. In that case, 'len' is
> +the size of that buffer. A RSC_BOOTADDR resource type announces the boot
> +address (i.e. the first instruction the remote processor should be booted with)
> +in 'da'.
> +
> +Other resources entries might be a two-way request/respond negotiation where
> +a certain resource (memory or any other hardware resource) is requested
> +by specifying the appropriate type and name. The host should then allocate
> +such a resource and "reply" by writing the identifier (physical address
> +or any other device id that will be meaningful to the remote processor)
> +back into the relevant member of the resource structure. Obviously this
> +approach can only be used _before_ booting the remote processor. After
> +the remote processor is powered up, the resource section is expected
> +to stay static. Runtime resource management (i.e. handling requests after
> +the remote processor has booted) will be achieved using a dedicated rpmsg
> +driver.
> +
> +The latter two-way approach is still preliminary and has not been implemented
> +yet. It's left to see how this all works out.
> +
> +Most likely this kind of static allocations of hardware resources for
> +remote processors can also use DT, so it's interesting to see how
> +this all work out when DT materializes.

            works out


thanks,
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-22 17:55   ` [RFC 1/8] drivers: add generic remoteproc framework Randy Dunlap
@ 2011-06-22 19:11     ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-22 19:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Randy,

On Wed, Jun 22, 2011 at 8:55 PM, Randy Dunlap <rdunlap@xenotime.net> wrote:
> On Tue, 21 Jun 2011 10:18:27 +0300 Ohad Ben-Cohen wrote:
>
> Hi,
> Just a few minor nits inline...

Thanks!

Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
       [not found] ` <1308640714-17961-2-git-send-email-ohad@wizery.com>
  2011-06-22 17:55   ` [RFC 1/8] drivers: add generic remoteproc framework Randy Dunlap
@ 2011-06-27 20:49   ` Grant Likely
  2011-06-27 21:52     ` Grosen, Mark
                       ` (2 more replies)
  1 sibling, 3 replies; 63+ messages in thread
From: Grant Likely @ 2011-06-27 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 10:18:27AM +0300, Ohad Ben-Cohen wrote:
> Some systems have slave heterogeneous remote processor devices,
> that are usually used to offload cpu-intensive computations
> (e.g. multimedia codec tasks).
> 
> Booting a remote processor typically involves:
> - Loading a firmware which contains the OS image (mainly text and data)
> - If needed, programming an IOMMU
> - Powering on the device
> 
> This patch introduces a generic remoteproc framework that allows drivers
> to start and stop those remote processor devices, load up their firmware
> (which might not necessarily be Linux-based), and in the future also
> support power management and error recovery.
> 
> It's still not clear how much this is really reusable for other
> platforms/architectures, especially the part that deals with the
> firmware.
> 
> Moreover, it's not entirely clear whether this should really be an
> independent layer, or if it should just be squashed with the host-specific
> component of the rpmsg framework (there isn't really a remoteproc use case
> that doesn't require rpmsg).
> 
> That said, it did prove useful for us on two completely different
> platforms: OMAP and Davinci, each with its different remote
> processor (Cortex-M3 and a C674x DSP, respectively). So to avoid
> egregious duplication of code, remoteproc must not be omap-only.
> 
> Firmware loader is based on code by Mark Grosen <mgrosen@ti.com>.
> 
> TODO:
> - drop rproc_da_to_pa(), use iommu_iova_to_phys() instead
>   (requires completion of omap's iommu migration and some generic iommu
>    API work)
> - instead of ioremapping reserved memory and handling IOMMUs, consider
>   moving to the generic DMA mapping API (with a CMA backend)
> 
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>

Hi Ohad,

Overall, looks pretty nice to me.  Comments below...

> ---
>  Documentation/remoteproc.txt    |  170 +++++++++
>  drivers/Kconfig                 |    2 +
>  drivers/Makefile                |    1 +
>  drivers/remoteproc/Kconfig      |    7 +
>  drivers/remoteproc/Makefile     |    5 +
>  drivers/remoteproc/remoteproc.c |  780 +++++++++++++++++++++++++++++++++++++++
>  include/linux/remoteproc.h      |  273 ++++++++++++++
>  7 files changed, 1238 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/remoteproc.txt
>  create mode 100644 drivers/remoteproc/Kconfig
>  create mode 100644 drivers/remoteproc/Makefile
>  create mode 100644 drivers/remoteproc/remoteproc.c
>  create mode 100644 include/linux/remoteproc.h
> 
> diff --git a/Documentation/remoteproc.txt b/Documentation/remoteproc.txt
> new file mode 100644
> index 0000000..3075813
> --- /dev/null
> +++ b/Documentation/remoteproc.txt
> @@ -0,0 +1,170 @@
> +Remote Processor Framework
> +
> +1. Introduction
> +
> +Modern SoCs typically have heterogeneous remote processor devices in asymmetric
> +multiprocessing (AMP) configurations, which may be running different instances
> +of operating system, whether it's Linux or any other flavor of real-time OS.
> +
> +OMAP4, for example, has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP.
> +In a typical configuration, the dual cortex-A9 is running Linux in a SMP
> +configuration, and each of the other three cores (two M3 cores and a DSP)
> +is running its own instance of RTOS in an AMP configuration.
> +
> +The generic remoteproc driver allows different platforms/architectures to
> +control (power on, load firmware, power off) those remote processors while
> +abstracting the hardware differences, so the entire driver doesn't need to be
> +duplicated.
> +
> +2. User API
> +
> +  struct rproc *rproc_get(const char *name);
> +   - power up the remote processor, identified by the 'name' argument,
> +     and boot it. If the remote processor is already powered on, the
> +     function immediately succeeds.
> +     On success, returns the rproc handle. On failure, NULL is returned.
> +
> +  void rproc_put(struct rproc *rproc);
> +   - power off the remote processor, identified by the rproc handle.
> +     Every call to rproc_get() must be (eventually) accompanied by a call
> +     to rproc_put(). Calling rproc_put() redundantly is a bug.
> +     Note: the remote processor will actually be powered off only when the
> +     last user calls rproc_put().
> +
> +3. Typical usage
> +
> +#include <linux/remoteproc.h>
> +
> +int dummy_rproc_example(void)
> +{
> +	struct rproc *my_rproc;
> +
> +	/* let's power on and boot the image processing unit */
> +	my_rproc = rproc_get("ipu");

I tend to be suspicious of apis whose primary interface is by-name
lookup.  It works fine when the system is small, but it can get
unwieldy when the client driver doesn't have a direct relation to the
setup code that chooses the name.  At some point I suspect that there
will need to be different lookup mechanism, such as which AMP
processor is currently available (if there are multiple of the same
type).

It also leaves no option for drivers to obtain a reference to the
rproc instance, and bring it up/down as needed (without the name
lookup every time).

That said, it looks like only the rproc_get() api is using by-name
lookup, and everything else is via the structure.  Can (should) the
by-name lookup part be factored out into a rproc_get_by_name()
accessor?

> +	if (!my_rproc) {
> +		/*
> +		 * something went wrong. handle it and leave.
> +		 */
> +	}
> +
> +	/*
> +	 * the 'ipu' remote processor is now powered on... let it work !
> +	 */
> +
> +	/* if we no longer need ipu's services, power it down */
> +	rproc_put(my_rproc);
> +}
> +
> +4. API for implementors
> +
> +  int rproc_register(struct device *dev, const char *name,
> +				const struct rproc_ops *ops,
> +				const char *firmware,
> +				const struct rproc_mem_entry *memory_maps,
> +				struct module *owner);
> +   - should be called from the underlying platform-specific implementation, in
> +     order to register a new remoteproc device. 'dev' is the underlying
> +     device, 'name' is the name of the remote processor, which will be
> +     specified by users calling rproc_get(), 'ops' is the platform-specific
> +     start/stop handlers, 'firmware' is the name of the firmware file to
> +     boot the processor with, 'memory_maps' is a table of da<->pa memory
> +     mappings which should be used to configure the IOMMU (if not relevant,
> +     just pass NULL here), 'owner' is the underlying module that should
> +     not be removed while the remote processor is in use.

Since rproc_register is allocating a struct rproc instance that
represent the device, shouldn't the pointer to that device be returned
to the caller?  Also, consider the use case that at some point someone
will need separate rproc_alloc and rproc_add steps so that it can
modify the structure between allocating and adding.  Otherwise you're
stuck in the model of having to modify the function signature to
rproc_register() every time a new feature is added that required
additional data; regardless of whether or not all drivers will use it.

> +
> +     Returns 0 on success, or an appropriate error code on failure.
> +
> +  int rproc_unregister(const char *name);

I definitely would not do this by name.  I think it is better to pass
the actual instance pointer to rproc_unregister.

> +   - should be called from the underlying platform-specific implementation, in
> +     order to unregister a remoteproc device that was previously registered
> +     with rproc_register().
> +
> +5. Implementation callbacks
> +
> +Every remoteproc implementation must provide these handlers:
> +
> +struct rproc_ops {
> +	int (*start)(struct rproc *rproc, u64 bootaddr);
> +	int (*stop)(struct rproc *rproc);
> +};
> +
> +The ->start() handler takes a rproc handle and an optional bootaddr argument,
> +and should power on the device and boot it (using the bootaddr argument
> +if the hardware requires one).

Naive question: Why is bootaddr an argument?  Wouldn't rproc drivers
keep track of the boot address in their driver private data?

> +On success, 0 is returned, and on failure, an appropriate error code.
> +
> +The ->stop() handler takes a rproc handle and powers the device off.
> +On success, 0 is returned, and on failure, an appropriate error code.
> +
> +6. Binary Firmware Structure
> +
> +The following enums and structures define the binary format of the images
> +remoteproc loads and boot the remote processors with.
> +
> +The general binary format is as follows:
> +
> +struct {
> +      char magic[4] = { 'R', 'P', 'R', 'C' };
> +      u32 version;
> +      u32 header_len;
> +      char header[...] = { header_len bytes of unformatted, textual header };
> +      struct section {
> +          u32 type;
> +          u64 da;
> +          u32 len;
> +          u8 content[...] = { len bytes of binary data };
> +      } [ no limit on number of sections ];
> +} __packed;

Other have commented on the image format, so I'll skip this bit other
than saying that I agree it would be great to have a common format.

> +Most likely this kind of static allocations of hardware resources for
> +remote processors can also use DT, so it's interesting to see how
> +this all work out when DT materializes.

I imagine that it will be quite straight forward.  There will probably
be a node in the tree to represent each slave AMP processor, and other
devices attached to it could be represented using 'phandle' links
between the nodes.  Any configuration of the AMP process can be
handled with arbitrary device-specific properties in the AMP
processor's node.

> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 3bb154d..1f6d6d3 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -126,4 +126,6 @@ source "drivers/hwspinlock/Kconfig"
>  
>  source "drivers/clocksource/Kconfig"
>  
> +source "drivers/remoteproc/Kconfig"
> +

Hmmm, I wonder if the end of the drivers list is the best place for
this.  The drivers menu in kconfig is getting quite unwieldy.

>  endmenu
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 09f3232..4d53a18 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -122,3 +122,4 @@ obj-y				+= ieee802154/
>  obj-y				+= clk/
>  
>  obj-$(CONFIG_HWSPINLOCK)	+= hwspinlock/
> +obj-$(CONFIG_REMOTE_PROC)	+= remoteproc/
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> new file mode 100644
> index 0000000..a60bb12
> --- /dev/null
> +++ b/drivers/remoteproc/Kconfig
> @@ -0,0 +1,7 @@
> +#
> +# Generic framework for controlling remote processors
> +#
> +
> +# REMOTE_PROC gets selected by whoever wants it
> +config REMOTE_PROC
> +	tristate
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> new file mode 100644
> index 0000000..d0f60c7
> --- /dev/null
> +++ b/drivers/remoteproc/Makefile
> @@ -0,0 +1,5 @@
> +#
> +# Generic framework for controlling remote processors
> +#
> +
> +obj-$(CONFIG_REMOTE_PROC)		+= remoteproc.o
> diff --git a/drivers/remoteproc/remoteproc.c b/drivers/remoteproc/remoteproc.c
> new file mode 100644
> index 0000000..2b0514b
> --- /dev/null
> +++ b/drivers/remoteproc/remoteproc.c
> @@ -0,0 +1,780 @@
> +/*
> + * Remote Processor Framework
> + *
> + * Copyright (C) 2011 Texas Instruments, Inc.
> + * Copyright (C) 2011 Google, Inc.
> + *
> + * Ohad Ben-Cohen <ohad@wizery.com>
> + * Mark Grosen <mgrosen@ti.com>
> + * Brian Swetland <swetland@google.com>
> + * Fernando Guzman Lugo <fernando.lugo@ti.com>
> + * Robert Tivy <rtivy@ti.com>
> + * Armando Uribe De Leon <x0095078@ti.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/delay.h>
> +#include <linux/slab.h>
> +#include <linux/platform_device.h>
> +#include <linux/firmware.h>
> +#include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/debugfs.h>
> +#include <linux/remoteproc.h>
> +
> +/* list of the available remote processors */
> +static LIST_HEAD(rprocs);
> +/*
> + * This lock should be taken when the list of rprocs is accessed.
> + * Consider using RCU instead, since remote processors only get registered
> + * once (usually at boot), and then the list is only read accessed.
> + * Though right now the list is pretty short, and only rarely accessed.
> + */
> +static DEFINE_SPINLOCK(rprocs_lock);
> +
> +/* debugfs parent dir */
> +static struct dentry *rproc_dbg;
> +
> +/*
> + * Some remote processors may support dumping trace logs into a shared
> + * memory buffer. We expose this trace buffer using debugfs, so users
> + * can easily tell what's going on remotely.
> + */
> +static ssize_t rproc_format_trace_buf(char __user *userbuf, size_t count,
> +				    loff_t *ppos, const void *src, int size)
> +{
> +	const char *buf = (const char *) src;
> +	int i;
> +
> +	/*
> +	 * find the end of trace buffer (does not account for wrapping).
> +	 * desirable improvement: use a ring buffer instead.
> +	 */
> +	for (i = 0; i < size && buf[i]; i++);

Hmmm, I wonder if this could make use of the ftrace ring buffer.

> +
> +	return simple_read_from_buffer(userbuf, count, ppos, src, i);
> +}
> +
> +static int rproc_open_generic(struct inode *inode, struct file *file)
> +{
> +	file->private_data = inode->i_private;
> +	return 0;
> +}
> +
> +#define DEBUGFS_READONLY_FILE(name, value, len)				\
> +static ssize_t name## _rproc_read(struct file *filp,			\
> +		char __user *userbuf, size_t count, loff_t *ppos)	\
> +{									\
> +	struct rproc *rproc = filp->private_data;			\
> +	return rproc_format_trace_buf(userbuf, count, ppos, value, len);\
> +}									\
> +									\
> +static const struct file_operations name ##_rproc_ops = {		\
> +	.read = name ##_rproc_read,					\
> +	.open = rproc_open_generic,					\
> +	.llseek	= generic_file_llseek,					\
> +};
> +
> +/*
> + * Currently we allow two trace buffers for each remote processor.
> + * This is helpful in case a single remote processor has two independent
> + * cores, each of which is running an independent OS image.
> + * The current implementation is straightforward and simple, and is
> + * rather limited to 2 trace buffers. If, in time, we'd realize we
> + * need additional trace buffers, then the code should be refactored
> + * and generalized.
> + */
> +DEBUGFS_READONLY_FILE(trace0, rproc->trace_buf0, rproc->trace_len0);
> +DEBUGFS_READONLY_FILE(trace1, rproc->trace_buf1, rproc->trace_len1);
> +
> +/* The state of the remote processor is exposed via debugfs, too */
> +const char *rproc_state_string(int state)
> +{
> +	const char *result;
> +
> +	switch (state) {
> +	case RPROC_OFFLINE:
> +		result = "offline";
> +		break;
> +	case RPROC_SUSPENDED:
> +		result = "suspended";
> +		break;
> +	case RPROC_RUNNING:
> +		result = "running";
> +		break;
> +	case RPROC_LOADING:
> +		result = "loading";
> +		break;
> +	case RPROC_CRASHED:
> +		result = "crashed";
> +		break;
> +	default:
> +		result = "invalid state";
> +		break;
> +	}

Me thinks this is asking for a lookup table.

> +
> +	return result;
> +}
> +
> +static ssize_t rproc_state_read(struct file *filp, char __user *userbuf,
> +						size_t count, loff_t *ppos)
> +{
> +	struct rproc *rproc = filp->private_data;
> +	int state = rproc->state;
> +	char buf[100];

100 bytes?  I count at most ~30.

> +	int i;
> +
> +	i = snprintf(buf, 100, "%s (%d)\n", rproc_state_string(state), state);
> +
> +	return simple_read_from_buffer(userbuf, count, ppos, buf, i);
> +}
> +
> +static const struct file_operations rproc_state_ops = {
> +	.read = rproc_state_read,
> +	.open = rproc_open_generic,
> +	.llseek	= generic_file_llseek,
> +};
> +
> +/* The name of the remote processor is exposed via debugfs, too */
> +static ssize_t rproc_name_read(struct file *filp, char __user *userbuf,
> +						size_t count, loff_t *ppos)
> +{
> +	struct rproc *rproc = filp->private_data;
> +	/* need room for the name, a newline and a terminating null */
> +	char buf[RPROC_MAX_NAME + 2];
> +	int i;
> +
> +	i = snprintf(buf, RPROC_MAX_NAME + 2, "%s\n", rproc->name);
> +
> +	return simple_read_from_buffer(userbuf, count, ppos, buf, i);
> +}
> +
> +static const struct file_operations rproc_name_ops = {
> +	.read = rproc_name_read,
> +	.open = rproc_open_generic,
> +	.llseek	= generic_file_llseek,
> +};
> +
> +#define DEBUGFS_ADD(name)						\
> +	debugfs_create_file(#name, 0400, rproc->dbg_dir,		\
> +			rproc, &name## _rproc_ops)

You might want to split the debug stuff off into a separate patch,
just to keep the review load down.  (up to you though).

> +
> +/**
> + * __find_rproc_by_name() - find a registered remote processor by name
> + * @name: name of the remote processor
> + *
> + * Internal function that returns the rproc @name, or NULL if @name does
> + * not exists.
> + */
> +static struct rproc *__find_rproc_by_name(const char *name)
> +{
> +	struct rproc *rproc;
> +	struct list_head *tmp;
> +
> +	spin_lock(&rprocs_lock);
> +
> +	list_for_each(tmp, &rprocs) {
> +		rproc = list_entry(tmp, struct rproc, next);
> +		if (!strcmp(rproc->name, name))
> +			break;
> +		rproc = NULL;
> +	}
> +
> +	spin_unlock(&rprocs_lock);

Unless you're going to be looking up the device at irq time, a mutex
is probably a better choice here.

> +
> +	return rproc;
> +}
> +

[ignoring da_to_pa bits because they are subject to change]
> +
> +/**
> + * rproc_start() - boot the remote processor
> + * @rproc: the remote processor
> + * @bootaddr: address of first instruction to execute (optional)
> + *
> + * Boot a remote processor (i.e. power it on, take it out of reset, etc..)
> + */
> +static void rproc_start(struct rproc *rproc, u64 bootaddr)
> +{
> +	struct device *dev = rproc->dev;
> +	int err;
> +
> +	err = mutex_lock_interruptible(&rproc->lock);
> +	if (err) {
> +		dev_err(dev, "can't lock remote processor %d\n", err);
> +		return;
> +	}
> +
> +	err = rproc->ops->start(rproc, bootaddr);
> +	if (err) {
> +		dev_err(dev, "can't start rproc %s: %d\n", rproc->name, err);
> +		goto unlock_mutex;
> +	}
> +
> +	rproc->state = RPROC_RUNNING;
> +
> +	dev_info(dev, "remote processor %s is now up\n", rproc->name);

How often are remote processors likely to be brought up/down?  Do PM
events hard stop remote processors?  I only ask because I wonder if
this dev_info() will end up flooding the kernel log.

> +/**
> + * rproc_get() - boot the remote processor
> + * @name: name of the remote processor
> + *
> + * Boot a remote processor (i.e. load its firmware, power it on, take it
> + * out of reset, etc..).
> + *
> + * If the remote processor is already powered on, immediately return
> + * its rproc handle.
> + *
> + * On success, returns the rproc handle. On failure, NULL is returned.
> + */
> +struct rproc *rproc_get(const char *name)
> +{
> +	struct rproc *rproc, *ret = NULL;
> +	struct device *dev;
> +	int err;
> +
> +	rproc = __find_rproc_by_name(name);
> +	if (!rproc) {
> +		pr_err("can't find remote processor %s\n", name);
> +		return NULL;
> +	}
> +
> +	dev = rproc->dev;
> +
> +	err = mutex_lock_interruptible(&rproc->lock);
> +	if (err) {
> +		dev_err(dev, "can't lock remote processor %s\n", name);
> +		return NULL;
> +	}
> +
> +	/* prevent underlying implementation from being removed */
> +	if (!try_module_get(rproc->owner)) {
> +		dev_err(dev, "%s: can't get owner\n", __func__);
> +		goto unlock_mutex;
> +	}
> +
> +	/* skip the boot process if rproc is already (being) powered up */
> +	if (rproc->count++) {
> +		ret = rproc;
> +		goto unlock_mutex;
> +	}
> +
> +	/* rproc_put() calls should wait until async loader completes */
> +	init_completion(&rproc->firmware_loading_complete);
> +
> +	dev_info(dev, "powering up %s\n", name);
> +
> +	/* loading a firmware is required */
> +	if (!rproc->firmware) {
> +		dev_err(dev, "%s: no firmware to load\n", __func__);
> +		goto deref_rproc;
> +	}
> +
> +	/*
> +	 * Initiate an asynchronous firmware loading, to allow building
> +	 * remoteproc as built-in kernel code, without hanging the boot process
> +	 */
> +	err = request_firmware_nowait(THIS_MODULE, FW_ACTION_HOTPLUG,
> +			rproc->firmware, dev, GFP_KERNEL, rproc, rproc_load_fw);
> +	if (err < 0) {
> +		dev_err(dev, "request_firmware_nowait failed: %d\n", err);
> +		goto deref_rproc;
> +	}
> +
> +	rproc->state = RPROC_LOADING;
> +	ret = rproc;
> +	goto unlock_mutex;
> +
> +deref_rproc:
> +	complete_all(&rproc->firmware_loading_complete);
> +	module_put(rproc->owner);
> +	--rproc->count;
> +unlock_mutex:
> +	mutex_unlock(&rproc->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL(rproc_get);
> +
> +/**
> + * rproc_put() - power off the remote processor
> + * @rproc: the remote processor
> + *
> + * Release an rproc handle previously acquired with rproc_get(),
> + * and if we're the last user, power the processor off.
> + *
> + * Every call to rproc_get() must be (eventually) accompanied by a call
> + * to rproc_put(). Calling rproc_put() redundantly is a bug.
> + */
> +void rproc_put(struct rproc *rproc)
> +{
> +	struct device *dev = rproc->dev;
> +	int ret;
> +
> +	/*
> +	 * make sure rproc_get() was called beforehand.
> +	 * it should be safe to check for zero without taking the lock.
> +	 */

However, it may be non-zero here, but drop to zero by the time you
take the lock.  Best be safe and put it inside the mutex.  Having it
under the mutex shouldn't be a performance hit since only buggy code
will get this test wrong.  In fact, it is probably appropriate to
WARN_ON() on the !rproc->count condition.

Actually, using a hand coded reference count like this shouldn't be
done.  Use a kobject or a kref instead.  Looking at the code, I
suspect you'll want separate reference counting for object references
and power up/down count so that clients can control power to a device
without giving up the pointer to the rproc instance.

> +	if (!rproc->count) {
> +		dev_err(dev, "asymmetric put (fogot to call rproc_get ?)\n");
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	/* if rproc is just being loaded now, wait */
> +	wait_for_completion(&rproc->firmware_loading_complete);
> +
> +	ret = mutex_lock_interruptible(&rproc->lock);
> +	if (ret) {
> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> +		return;
> +	}
> +
> +	/* if the remote proc is still needed, bail out */
> +	if (--rproc->count)
> +		goto out;
> +
> +	if (rproc->trace_buf0)
> +		/* iounmap normal memory, so make sparse happy */
> +		iounmap((__force void __iomem *) rproc->trace_buf0);
> +	if (rproc->trace_buf1)
> +		/* iounmap normal memory, so make sparse happy */
> +		iounmap((__force void __iomem *) rproc->trace_buf1);

Icky casting!  That suggests that how the trace buffer pointer is
managed needs work.

> +
> +	rproc->trace_buf0 = rproc->trace_buf1 = NULL;
> +
> +	/*
> +	 * make sure rproc is really running before powering it off.
> +	 * this is important, because the fw loading might have failed.
> +	 */
> +	if (rproc->state == RPROC_RUNNING) {
> +		ret = rproc->ops->stop(rproc);
> +		if (ret) {
> +			dev_err(dev, "can't stop rproc: %d\n", ret);
> +			goto out;
> +		}
> +	}
> +
> +	rproc->state = RPROC_OFFLINE;
> +
> +	dev_info(dev, "stopped remote processor %s\n", rproc->name);
> +
> +out:
> +	mutex_unlock(&rproc->lock);
> +	if (!ret)
> +		module_put(rproc->owner);
> +}
> +EXPORT_SYMBOL(rproc_put);
> +
> +/**
> + * rproc_register() - register a remote processor
> + * @dev: the underlying device
> + * @name: name of this remote processor
> + * @ops: platform-specific handlers (mainly start/stop)
> + * @firmware: name of firmware file to load
> + * @memory_maps: IOMMU settings for this rproc (optional)
> + * @owner: owning module
> + *
> + * Registers a new remote processor in the remoteproc framework.
> + *
> + * This is called by the underlying platform-specific implementation,
> + * whenever a new remote processor device is probed.
> + *
> + * On succes, 0 is return, and on failure an appropriate error code.
> + */
> +int rproc_register(struct device *dev, const char *name,
> +				const struct rproc_ops *ops,
> +				const char *firmware,
> +				const struct rproc_mem_entry *memory_maps,
> +				struct module *owner)
> +{
> +	struct rproc *rproc;
> +
> +	if (!dev || !name || !ops)
> +		return -EINVAL;
> +
> +	rproc = kzalloc(sizeof(struct rproc), GFP_KERNEL);
> +	if (!rproc) {
> +		dev_err(dev, "%s: kzalloc failed\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	rproc->dev = dev;
> +	rproc->name = name;
> +	rproc->ops = ops;
> +	rproc->firmware = firmware;
> +	rproc->memory_maps = memory_maps;
> +	rproc->owner = owner;
> +
> +	mutex_init(&rproc->lock);
> +
> +	rproc->state = RPROC_OFFLINE;
> +
> +	spin_lock(&rprocs_lock);
> +	list_add_tail(&rproc->next, &rprocs);
> +	spin_unlock(&rprocs_lock);
> +
> +	dev_info(dev, "%s is available\n", name);
> +
> +	if (!rproc_dbg)
> +		goto out;
> +
> +	rproc->dbg_dir = debugfs_create_dir(dev_name(dev), rproc_dbg);
> +	if (!rproc->dbg_dir) {
> +		dev_err(dev, "can't create debugfs dir\n");
> +		goto out;
> +	}
> +
> +	debugfs_create_file("name", 0400, rproc->dbg_dir, rproc,
> +							&rproc_name_ops);
> +	debugfs_create_file("state", 0400, rproc->dbg_dir, rproc,
> +							&rproc_state_ops);
> +
> +out:
> +	return 0;
> +}
> +EXPORT_SYMBOL(rproc_register);
> +
> +/**
> + * rproc_unregister() - unregister a remote processor
> + * @name: name of this remote processor
> + *
> + * Unregisters a remote processor.
> + *
> + * On succes, 0 is return. If this remote processor isn't found, -EINVAL
> + * is returned.
> + */
> +int rproc_unregister(const char *name)
> +{
> +	struct rproc *rproc;
> +
> +	rproc = __find_rproc_by_name(name);
> +	if (!rproc) {
> +		pr_err("can't find remote processor %s\n", name);
> +		return -EINVAL;
> +	}
> +
> +	dev_info(rproc->dev, "removing %s\n", name);
> +
> +	if (rproc->dbg_dir)
> +		debugfs_remove_recursive(rproc->dbg_dir);
> +
> +	spin_lock(&rprocs_lock);
> +	list_del(&rproc->next);
> +	spin_unlock(&rprocs_lock);
> +
> +	kfree(rproc);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(rproc_unregister);
> +
> +static int __init remoteproc_init(void)
> +{
> +	if (debugfs_initialized()) {
> +		rproc_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
> +		if (!rproc_dbg)
> +			pr_err("can't create debugfs dir\n");
> +	}
> +
> +	return 0;
> +}
> +/* must be ready in time for device_initcall users */
> +subsys_initcall(remoteproc_init);
> +
> +static void __exit remoteproc_exit(void)
> +{
> +	if (rproc_dbg)
> +		debugfs_remove(rproc_dbg);
> +}
> +module_exit(remoteproc_exit);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("Generic Remote Processor Framework");
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> new file mode 100644
> index 0000000..6cdb966
> --- /dev/null
> +++ b/include/linux/remoteproc.h
> @@ -0,0 +1,273 @@
> +/*
> + * Remote Processor Framework
> + *
> + * Copyright(c) 2011 Texas Instruments, Inc.
> + * Copyright(c) 2011 Google, Inc.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name Texas Instruments nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef REMOTEPROC_H
> +#define REMOTEPROC_H
> +
> +#include <linux/mutex.h>
> +#include <linux/completion.h>
> +
> +/**
> + * DOC: The Binary Structure of the Firmware
> + *
> + * The following enums and structures define the binary format of the image
> + * we load and run the remote processors with.
> + *
> + * The binary format is as follows:
> + *
> + * struct {
> + *     char magic[4] = { 'R', 'P', 'R', 'C' };
> + *     u32 version;
> + *     u32 header_len;
> + *     char header[...] = { header_len bytes of unformatted, textual header };
> + *     struct section {
> + *         u32 type;
> + *         u64 da;
> + *         u32 len;
> + *         u8 content[...] = { len bytes of binary data };
> + *     } [ no limit on number of sections ];
> + * } __packed;
> + */
> +
> +/**
> + * struct fw_header - header of the firmware image
> + * @magic: 4-bytes magic (should contain "RPRC")
> + * @version: version number, should be bumped on binary changes
> + * @header_len: length, in bytes, of the following text header
> + * @header: free-style textual header, users can read with 'head'
> + *
> + * This structure defines the header of the remoteproc firmware.
> + */
> +struct fw_header {
> +	char magic[4];
> +	u32 version;
> +	u32 header_len;
> +	char header[0];
> +} __packed;
> +
> +/**
> + * struct fw_section - header of a firmware section
> + * @type: section type
> + * @da: device address that the rproc expects to find this section at.
> + * @len: length of the section (in bytes)
> + * @content: the section data
> + *
> + * This structure defines the header of a firmware section. All sections
> + * should be loaded to the address specified by @da, so the remote processor
> + * will find them.
> + *
> + * Note: if the remote processor is not behind an IOMMU, then da is a
> + * mere physical address
> + */
> +struct fw_section {
> +	u32 type;
> +	u64 da;
> +	u32 len;
> +	char content[0];
> +} __packed;
> +
> +/**
> + * enum fw_section_type - section type values
> + *
> + * @FW_RESOURCE: a resource section. this section contains static
> + *		resource requests (/announcements) that the remote
> + *		processor requires (/supports). Most of these requests
> + *		require that the host fulfill them (and usually
> + *		"reply" with a result) before the remote processor
> + *		is booted. See Documentation/remoteproc.h for more info
> + * @FW_TEXT: a text section
> + * @FW_DATA: a data section
> + *
> + * Note: text and data sections have different types so we can support stuff
> + * like crash dumps (which only requires dumping data sections) or loading
> + * text sections into faster memory. Currently, though, both section types
> + * are treated exactly the same.
> + */
> +enum fw_section_type {
> +	FW_RESOURCE	= 0,
> +	FW_TEXT		= 1,
> +	FW_DATA		= 2,
> +};
> +
> +/**
> + * struct fw_resource - describes an entry from the resource section
> + * @type: resource type
> + * @da: depends on the resource type
> + * @pa: depends on the resource type
> + * @len: depends on the resource type
> + * @flags: depends on the resource type
> + * @name: name of resource
> + *
> + * Some resources entries are mere announcements, where the host is informed
> + * of specific remoteproc configuration. Other entries require the host to
> + * do something (e.g. reserve a requested resource) and reply by overwriting
> + * a member inside struct fw_resource with the id of the allocated resource.
> + * There could also be resource entries where the remoteproc's image suggests
> + * a configuration, but the host may overwrite it with its own preference.
> + *
> + * Note: the vast majority of the resource types are not implemented yet,
> + * and this is all very much preliminary.
> + */
> +struct fw_resource {
> +	u32 type;
> +	u64 da;
> +	u64 pa;
> +	u32 len;
> +	u32 flags;
> +	u8 name[48];
> +} __packed;
> +
> +/**
> + * enum fw_resource_type - types of resource entries
> + *
> + * @RSC_TRACE: announces the availability of a trace buffer into which
> + *		the remote processor will be writing logs. In this case,
> + *		'da' indicates the device address where logs are written to,
> + *		and 'len' is the size of the trace buffer.
> + *		Currently we support two trace buffers per remote processor,
> + *		to support two autonomous cores running in a single rproc
> + *		device.
> + *		If additional trace buffers are needed, this should be
> + *		extended/generalized.
> + * @RSC_BOOTADDR: announces the address of the first instruction the remote
> + *		processor should be booted with (address indicated in 'da').
> + *
> + * Note: most of the resource types are not implemented yet, so they are
> + * not documented yet.
> + */
> +enum fw_resource_type {
> +	RSC_CARVEOUT	= 0,
> +	RSC_DEVMEM	= 1,
> +	RSC_DEVICE	= 2,
> +	RSC_IRQ		= 3,
> +	RSC_TRACE	= 4,
> +	RSC_BOOTADDR	= 5,
> +};
> +
> +/**
> + * struct rproc_mem_entry - memory mapping descriptor
> + * @da:		device address as seen by the remote processor
> + * @pa:		physical address
> + * @size:	size of this memory region
> + *
> + * Board file will use this struct to define the IOMMU configuration
> + * for this remote processor. If the rproc device accesses physical memory
> + * directly (and not through an IOMMU), this is not needed.
> + */
> +struct rproc_mem_entry {
> +	u64 da;
> +	phys_addr_t pa;
> +	u32 size;
> +};
> +
> +struct rproc;
> +
> +/**
> + * struct rproc_ops - platform-specific device handlers
> + * @start:	power on the device and boot it. implementation may require
> + *		specifyng a boot address
> + * @stop:	power off the device
> + */
> +struct rproc_ops {
> +	int (*start)(struct rproc *rproc, u64 bootaddr);
> +	int (*stop)(struct rproc *rproc);
> +};
> +
> +/*
> + * enum rproc_state - remote processor states
> + *
> + * @RPROC_OFFLINE:	device is powered off
> + * @RPROC_SUSPENDED:	device is suspended; needs to be woken up to receive
> + *			a message.
> + * @RPROC_RUNNING:	device is up and running
> + * @RPROC_LOADING:	asynchronous firmware loading has started
> + * @RPROC_CRASHED:	device has crashed; need to start recovery
> + */
> +enum rproc_state {
> +	RPROC_OFFLINE,
> +	RPROC_SUSPENDED,
> +	RPROC_RUNNING,
> +	RPROC_LOADING,
> +	RPROC_CRASHED,
> +};
> +
> +#define RPROC_MAX_NAME	100

I wouldn't even bother with this.  The only place it is used is in one
of the debugfs files, and you can protect against too large a static
buffer by using %100s (or whatever) in the snprintf().

> +
> +/*
> + * struct rproc - represents a physical remote processor device
> + *
> + * @next: next rproc entry in the list
> + * @name: human readable name of the rproc, cannot exceed RPROC_MAX_NAME bytes
> + * @memory_maps: table of da-to-pa memory maps (relevant if device is behind
> + *               an iommu)
> + * @firmware: name of firmware file to be loaded
> + * @owner: reference to the platform-specific rproc module
> + * @priv: private data which belongs to the platform-specific rproc module
> + * @ops: platform-specific start/stop rproc handlers
> + * @dev: underlying device
> + * @count: usage refcount
> + * @state: state of the device
> + * @lock: lock which protects concurrent manipulations of the rproc
> + * @dbg_dir: debugfs directory of this rproc device
> + * @trace_buf0: main trace buffer of the remote processor
> + * @trace_buf1: second, optional, trace buffer of the remote processor
> + * @trace_len0: length of main trace buffer of the remote processor
> + * @trace_len1: length of the second (and optional) trace buffer
> + * @firmware_loading_complete: marks e/o asynchronous firmware loading
> + */
> +struct rproc {
> +	struct list_head next;
> +	const char *name;
> +	const struct rproc_mem_entry *memory_maps;
> +	const char *firmware;
> +	struct module *owner;
> +	void *priv;
> +	const struct rproc_ops *ops;
> +	struct device *dev;
> +	int count;
> +	int state;
> +	struct mutex lock;
> +	struct dentry *dbg_dir;
> +	char *trace_buf0, *trace_buf1;
> +	int trace_len0, trace_len1;
> +	struct completion firmware_loading_complete;
> +};
> +
> +struct rproc *rproc_get(const char *);
> +void rproc_put(struct rproc *);
> +int rproc_register(struct device *, const char *, const struct rproc_ops *,
> +		const char *, const struct rproc_mem_entry *, struct module *);
> +int rproc_unregister(const char *);
> +
> +#endif /* REMOTEPROC_H */
> -- 
> 1.7.1
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 20:49   ` Grant Likely
@ 2011-06-27 21:52     ` Grosen, Mark
  2011-06-27 22:24       ` Grant Likely
  2011-06-27 23:29     ` Russell King - ARM Linux
  2011-06-28 21:41     ` Ohad Ben-Cohen
  2 siblings, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-06-27 21:52 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Grant Likely
> Sent: Monday, June 27, 2011 1:50 PM

Grant, thanks for the feedback. I'll try to answer one of your
questions below and leave the rest for Ohad.

Mark
 
> > +Every remoteproc implementation must provide these handlers:
> > +
> > +struct rproc_ops {
> > +	int (*start)(struct rproc *rproc, u64 bootaddr);
> > +	int (*stop)(struct rproc *rproc);
> > +};
> > +
> > +The ->start() handler takes a rproc handle and an optional bootaddr
> argument,
> > +and should power on the device and boot it (using the bootaddr
> argument
> > +if the hardware requires one).
> 
> Naive question: Why is bootaddr an argument?  Wouldn't rproc drivers
> keep track of the boot address in their driver private data?
 
Our AMPs (remote processors) have a variety of boot mechanisms that vary
across the different SoCs (yes, TI likes HW diversity). In some cases, the
boot address is more like an entry point and that comes from the firmware,
so it is not a static attribute of a driver. Correct me if I misunderstood
your question.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 21:52     ` Grosen, Mark
@ 2011-06-27 22:24       ` Grant Likely
  2011-06-27 23:54         ` Grosen, Mark
  0 siblings, 1 reply; 63+ messages in thread
From: Grant Likely @ 2011-06-27 22:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2011 at 09:52:30PM +0000, Grosen, Mark wrote:
> > From: Grant Likely
> > Sent: Monday, June 27, 2011 1:50 PM
> 
> Grant, thanks for the feedback. I'll try to answer one of your
> questions below and leave the rest for Ohad.
> 
> Mark
>  
> > > +Every remoteproc implementation must provide these handlers:
> > > +
> > > +struct rproc_ops {
> > > +	int (*start)(struct rproc *rproc, u64 bootaddr);
> > > +	int (*stop)(struct rproc *rproc);
> > > +};
> > > +
> > > +The ->start() handler takes a rproc handle and an optional bootaddr
> > argument,
> > > +and should power on the device and boot it (using the bootaddr
> > argument
> > > +if the hardware requires one).
> > 
> > Naive question: Why is bootaddr an argument?  Wouldn't rproc drivers
> > keep track of the boot address in their driver private data?
>  
> Our AMPs (remote processors) have a variety of boot mechanisms that vary
> across the different SoCs (yes, TI likes HW diversity). In some cases, the
> boot address is more like an entry point and that comes from the firmware,
> so it is not a static attribute of a driver. Correct me if I misunderstood
> your question.

More to the point, I would expect the boot_address to be a property of
the rproc instance because it represents the configuration of the
remote processor.  It seems odd that the caller of ->start would know
better than the rproc driver about the entry point of the processor.

g.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 22:24       ` Grant Likely
@ 2011-06-27 23:54         ` Grosen, Mark
  0 siblings, 0 replies; 63+ messages in thread
From: Grosen, Mark @ 2011-06-27 23:54 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Grant Likely
> Sent: Monday, June 27, 2011 3:24 PM

> > Our AMPs (remote processors) have a variety of boot mechanisms that vary
> > across the different SoCs (yes, TI likes HW diversity). In some cases, the
> > boot address is more like an entry point and that comes from the firmware,
> > so it is not a static attribute of a driver. Correct me if I misunderstood
> > your question.
> 
> More to the point, I would expect the boot_address to be a property of
> the rproc instance because it represents the configuration of the
> remote processor.  It seems odd that the caller of ->start would know
> better than the rproc driver about the entry point of the processor.
> 
> g.

Yes, in many cases the boot_address will be defined by the HW. However, we have
processors that are "soft" - the boot_address comes from the particular firmware
being loaded and can (will) be different with each firmware image. We factored
out the firmware loader to be device-independent (in remoteproc.c) so it's not
repeated in each device-specific implementation like omap_remoteproc.c and
davinci_remoteproc.c. In the cases where the HW dictates what happens, the start()
method should just ignore the boot_address.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 20:49   ` Grant Likely
  2011-06-27 21:52     ` Grosen, Mark
@ 2011-06-27 23:29     ` Russell King - ARM Linux
  2011-06-27 23:35       ` Grant Likely
  2011-06-28 21:55       ` Ohad Ben-Cohen
  2011-06-28 21:41     ` Ohad Ben-Cohen
  2 siblings, 2 replies; 63+ messages in thread
From: Russell King - ARM Linux @ 2011-06-27 23:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2011 at 02:49:58PM -0600, Grant Likely wrote:
> > +struct {
> > +      char magic[4] = { 'R', 'P', 'R', 'C' };
> > +      u32 version;
> > +      u32 header_len;
> > +      char header[...] = { header_len bytes of unformatted, textual header };
> > +      struct section {
> > +          u32 type;
> > +          u64 da;
> > +          u32 len;
> > +          u8 content[...] = { len bytes of binary data };
> > +      } [ no limit on number of sections ];
> > +} __packed;
> 
> Other have commented on the image format, so I'll skip this bit other
> than saying that I agree it would be great to have a common format.

(Don't have the original message to reply to...)

Do we really want to end up with header being 5 bytes, header_len set
as 5, and having to load/store all this data using byte loads/stores ?

If we don't want that, then I suggest we get rid of the packed attribute,
and require stuff to be naturally aligned.

First issue is that struct section could be much better layed out:

struct section {
	u32 type;
	u32 len;
	u64 da;
	u8 content[];
};

and require sizeof(struct section) % sizeof(u64) == 0 - iow, to find
the next section, round len up to sizeof(u64).  Ditto for the header.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 23:29     ` Russell King - ARM Linux
@ 2011-06-27 23:35       ` Grant Likely
  2011-06-28 21:55       ` Ohad Ben-Cohen
  1 sibling, 0 replies; 63+ messages in thread
From: Grant Likely @ 2011-06-27 23:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2011 at 5:29 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jun 27, 2011 at 02:49:58PM -0600, Grant Likely wrote:
>> > +struct {
>> > + ? ? ?char magic[4] = { 'R', 'P', 'R', 'C' };
>> > + ? ? ?u32 version;
>> > + ? ? ?u32 header_len;
>> > + ? ? ?char header[...] = { header_len bytes of unformatted, textual header };
>> > + ? ? ?struct section {
>> > + ? ? ? ? ?u32 type;
>> > + ? ? ? ? ?u64 da;
>> > + ? ? ? ? ?u32 len;
>> > + ? ? ? ? ?u8 content[...] = { len bytes of binary data };
>> > + ? ? ?} [ no limit on number of sections ];
>> > +} __packed;
>>
>> Other have commented on the image format, so I'll skip this bit other
>> than saying that I agree it would be great to have a common format.
>
> (Don't have the original message to reply to...)
>
> Do we really want to end up with header being 5 bytes, header_len set
> as 5, and having to load/store all this data using byte loads/stores ?
>
> If we don't want that, then I suggest we get rid of the packed attribute,
> and require stuff to be naturally aligned.
>
> First issue is that struct section could be much better layed out:
>
> struct section {
> ? ? ? ?u32 type;
> ? ? ? ?u32 len;
> ? ? ? ?u64 da;
> ? ? ? ?u8 content[];
> };
>
> and require sizeof(struct section) % sizeof(u64) == 0 - iow, to find
> the next section, round len up to sizeof(u64). ?Ditto for the header.

Hopefully this will all be moot since it has been proposed to use elf
images directly.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 23:29     ` Russell King - ARM Linux
  2011-06-27 23:35       ` Grant Likely
@ 2011-06-28 21:55       ` Ohad Ben-Cohen
  1 sibling, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-28 21:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 2:29 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> (Don't have the original message to reply to...)

Sorry about that.

My recent emails to linux-arm-kernel were bounced with a "Message has
a suspicious header" reason. not sure what am I doing wrong..

> Do we really want to end up with header being 5 bytes, header_len set
> as 5, and having to load/store all this data using byte loads/stores ?

As Grant said, we're trying to move to ELF now, but if we do end up
sticking with this custom format eventually, we'll sure do take care
of all the alignment issues.

Thanks!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 1/8] drivers: add generic remoteproc framework
  2011-06-27 20:49   ` Grant Likely
  2011-06-27 21:52     ` Grosen, Mark
  2011-06-27 23:29     ` Russell King - ARM Linux
@ 2011-06-28 21:41     ` Ohad Ben-Cohen
  2 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-28 21:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Grant,

Thanks a lot for the exhaustive review and comments !

On Mon, Jun 27, 2011 at 11:49 PM, Grant Likely
<grant.likely@secretlab.ca> wrote:
>> + ? ? my_rproc = rproc_get("ipu");
>
> I tend to be suspicious of apis whose primary interface is by-name
> lookup. ?It works fine when the system is small, but it can get
> unwieldy when the client driver doesn't have a direct relation to the
> setup code that chooses the name. ?At some point I suspect that there
> will need to be different lookup mechanism, such as which AMP
> processor is currently available (if there are multiple of the same
> type).

Yeah, this might be too limiting on some systems. I gave this a little
thought, but decided to wait until those systems show up first, so I
we can better understand them/their requirements/use-cases. For now,
I've just followed this simple name-based API model (which still seem
a bit popular in several SoC drivers I've looked at, probably due to
the general simplicity of it and its use cases).

> It also leaves no option for drivers to obtain a reference to the
> rproc instance, and bring it up/down as needed (without the name
> lookup every time).
..
> That said, it looks like only the rproc_get() api is using by-name
> lookup, and everything else is via the structure. ?Can (should) the
> by-name lookup part be factored out into a rproc_get_by_name()
> accessor?

I think you are looking for a different set of API here, probably
something that is better implemented using runtime PM.

When a driver calls rproc_get(), not only does it power on the remote
processor, but it also makes sure the underlying implementation cannot
go away (i.e. platform-specific remoteproc module cannot be removed,
and the rproc cannot be unregistered).

After it calls rproc_put(), it cannot rely anymore on the remote
processor to stick around (the rproc can be unregistered at this
point), so the next time it needs it, it must go through the full
get-by-name (or any other get API we will come up with eventually)
getter API.

If drivers need to hold onto the rproc instance, but still explicitly
allow it to power off at times, they should probably call something
like pm_runtime_put(rproc->dev).
(remoteproc runtime PM support is not implemented yet, but is
definitely planned, so we can suspend remote processors on
inactivity).

> Since rproc_register is allocating a struct rproc instance that
> represent the device, shouldn't the pointer to that device be returned
> to the caller?

Yes, it definitely should. We will have the underlying implementation
remember it, and then pass it to rproc_unregister when needed.

>> + ?int rproc_unregister(const char *name);
>
> I definitely would not do this by name. ?I think it is better to pass
> the actual instance pointer to rproc_unregister.

Much better, yeah.

> Naive question: Why is bootaddr an argument? ?Wouldn't rproc drivers
> keep track of the boot address in their driver private data?

Mark already got that one, but basically the boot address comes from
the firmware image: we need to let the implementation know the
physical address where the text section is mapped. This is ignored on
implementations where that address is fixed (e.g. OMAP's M3).

> Other have commented on the image format, so I'll skip this bit other
> than saying that I agree it would be great to have a common format.

We are evaluating now moving to ELF; let's see how it goes. Using a
standard format is an advantage (as long as it's not overly
complicated), but I wonder if achieving a common format is really
feasible and whether eventually different platforms will need
different binary formats anyway, and we'll have to abstract this out
of remoteproc (guess that as usual, we just need to start off with
something, and then evolve as requirements show up).

>> +Most likely this kind of static allocations of hardware resources for
>> +remote processors can also use DT, so it's interesting to see how
>> +this all work out when DT materializes.
>
> I imagine that it will be quite straight forward. ?There will probably
> be a node in the tree to represent each slave AMP processor, and other
> devices attached to it could be represented using 'phandle' links
> between the nodes. ?Any configuration of the AMP process can be
> handled with arbitrary device-specific properties in the AMP
> processor's node.

That sounds good. The dilemma is bigger, though.

The kind of stuff we need to synchronize about are not really
describing the hardware; it's more a runtime policy/configuration than
a hardware description.

As Brian mentioned in the other thread:

> The resource information is a description of
> what resources the firmware requires to work properly (it needs
> certain amounts of working memory, timers, peripheral interfaces like
> i2c to control camera hw, etc), which will be specific to a given
> firmware build.

Some of those resources will be allocated dynamically using an rpmsg
driver (developed by Fernando Guzman Lugo), but some must be supplied
before booting the firmware (memory ?). We're also using the existing
resource table today to announce the boot address and the trace buffer
address.

So the question is whether/if DT can help here.

On one hand, we're not describing the hardware here. it's pure
configuration, but that seem fine, as DT seem to be taking runtime
configuration, too (e.g. bootargs, initrd addresses, etc..). Moreover,
some of those remoteproc configurations should handed early to the
host, too (e.g. we might need to reserve specific physical memory that
must be used by the remote processor, and this can't wait until the
firmware is loaded).

OTOH, as Brian mentioned, it does make sense to couple those
configurations with the specific firmware image, so risk of breaking
stuff when the firmware is changed is minimized. Maybe we can have a
secondary .dts file as part of the firmware sources, and have it
included in the primary .dts (and let the remoteproc access that
respective secondary .dtb) ?

These are just raw ideas - I never tried working with DT yet myself.

>> +source "drivers/remoteproc/Kconfig"
>> +
>
> Hmmm, I wonder if the end of the drivers list is the best place for
> this. ?The drivers menu in kconfig is getting quite unwieldy.

We can arbitrarily choose a better location in that file but I'm not
sure I can objectively justify it :)

(alternatively, we can source that Kconfig from the relevant
platform's Kconfig, like virtio does).

>> + ? ? /*
>> + ? ? ?* find the end of trace buffer (does not account for wrapping).
>> + ? ? ?* desirable improvement: use a ring buffer instead.
>> + ? ? ?*/
>> + ? ? for (i = 0; i < size && buf[i]; i++);
>
> Hmmm, I wonder if this could make use of the ftrace ring buffer.

I thought about it, but I'm not sure we want to.

To do that, we'd need the remote processor to send us a message (via
rpmsg probably...) for every trace log we want to write into that ring
buffer. That would mean significant overhead for every remote trace
message, but would also mean you can't debug low level issues with
rpmsg, because you need it to deliver the debug messages themselves.

Instead, we just use a 'dumb' non-cacheable memory region into which
the remote processor unilaterally writes its trace messages. If/when
we're interested in the last remote log messages, we just read that
shared buffer (e.g. cat /debug/remoteproc/omap-rproc.1/trace0).

This means zero overhead on the host, and the ability to debug very
low level remote issues: all you need is a shared memory buffer and
remote traces work.

Currently this shared buffer is really dumb: we just dump its entire
content when asked. One nice improvement we can do is handling the
inevitable wrapping, by maintaining a shared "head" offset into the
buffer.

>> + ? ? switch (state) {
..
>> + ? ? }
>
> Me thinks this is asking for a lookup table.

sounds good.

>> +static ssize_t rproc_state_read(struct file *filp, char __user *userbuf,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? size_t count, loff_t *ppos)
>> +{
>> + ? ? struct rproc *rproc = filp->private_data;
>> + ? ? int state = rproc->state;
>> + ? ? char buf[100];
>
> 100 bytes? ?I count at most ~30.

30 it is.

>> +#define DEBUGFS_ADD(name) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?\
>> + ? ? debugfs_create_file(#name, 0400, rproc->dbg_dir, ? ? ? ? ? ? ? ?\
>> + ? ? ? ? ? ? ? ? ? ? rproc, &name## _rproc_ops)
>
> You might want to split the debug stuff off into a separate patch,
> just to keep the review load down. ?(up to you though).

Sure. I thought maybe to even split it to a separate file as well.

>> + ? ? spin_unlock(&rprocs_lock);
>
> Unless you're going to be looking up the device at irq time, a mutex
> is probably a better choice here.

mutex it is.

We can also completely remove the lock and just use RCU, as the list
is rarely changed. Since it's so short today, and rarely accessed at
all (even read access is pretty rare), it probably won't matter too
much.

>> + ? ? dev_info(dev, "remote processor %s is now up\n", rproc->name);
>
> How often are remote processors likely to be brought up/down?

Very rarely. Today we bring it up on boot, and keep it loaded (it will
then be suspended on inactivity and won't consume power when we don't
need it to do anything).

> However, it may be non-zero here, but drop to zero by the time you
> take the lock. ?Best be safe and put it inside the mutex. ?Having it
> under the mutex shouldn't be a performance hit since only buggy code
> will get this test wrong. ?In fact, it is probably appropriate to
> WARN_ON() on the !rproc->count condition.

good points, thanks.

> Actually, using a hand coded reference count like this shouldn't be
> done.

yeah, i planned to switch to an atomic variable here.

> Looking at the code, I
> suspect you'll want separate reference counting for object references
> and power up/down count so that clients can control power to a device
> without giving up the pointer to the rproc instance.

Eventually the plan is to use runtime PM for the second refcount, so
we get all this plumbing for free.

>> + ? ? ? ? ? ? /* iounmap normal memory, so make sparse happy */
>> + ? ? ? ? ? ? iounmap((__force void __iomem *) rproc->trace_buf1);
>
> Icky casting! ?That suggests that how the trace buffer pointer is
> managed needs work.

The plan is to replace those ioremaps with dma coherent memory, and
then we don't need no casting. We just need the generic dma API (which
is in the works) to handle omap's iommu transparently (in the works
too), and then tell the remoteproc where to write logs to. It might
take some time, but it sounds very clean.

>> +#define RPROC_MAX_NAME ? ? ? 100
>
> I wouldn't even bother with this. ?The only place it is used is in one
> of the debugfs files, and you can protect against too large a static
> buffer by using %100s (or whatever) in the snprintf().

cool, thanks!

Again, many thanks for the review,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-21  7:18 [RFC 0/8] Introducing a generic AMP/IPC framework Ohad Ben-Cohen
                   ` (2 preceding siblings ...)
       [not found] ` <1308640714-17961-2-git-send-email-ohad@wizery.com>
@ 2011-06-23 12:23 ` Michael Williamson
  2011-06-23 16:27   ` Grosen, Mark
       [not found] ` <1308640714-17961-6-git-send-email-ohad@wizery.com>
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Michael Williamson @ 2011-06-23 12:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 6/21/2011 3:18 AM, Ohad Ben-Cohen wrote:
> Modern SoCs typically employ a central symmetric multiprocessing (SMP)
> application processor running Linux, with several other asymmetric
> multiprocessing (AMP) heterogeneous processors running different instances
> of operating system, whether Linux or any other flavor of real-time OS.
>
> OMAP4, for example, has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP.
> Typically, the dual cortex-A9 is running Linux in a SMP configuration, and
> each of the other three cores (two M3 cores and a DSP) is running its own
> instance of RTOS in an AMP configuration.
>
> AMP remote processors typically employ dedicated DSP codecs and multimedia
> hardware accelerators, and therefore are often used to offload cpu-intensive
> multimedia tasks from the main application processor. They could also be
> used to control latency-sensitive sensors, drive "random" hardware blocks,
> or just perform background tasks while the main CPU is idling.
>
> Users of those remote processors can either be userland apps (e.g.
> multimedia frameworks talking with remote OMX components) or kernel drivers
> (controlling hardware accessible only by the remote processor, reserving
> kernel-controlled resources on behalf of the remote processor, etc..).
>
> This patch set adds a generic AMP/IPC framework which makes it possible to
> control (power on, boot, power off) and communicate (simply send and receive
> messages) with those remote processors.
>
> Specifically, we're adding:
>
> * Rpmsg: a virtio-based messaging bus that allows kernel drivers to
> communicate with remote processors available on the system. In turn,
> drivers could then expose appropriate user space interfaces, if needed
> (tasks running on remote processors often have direct access to sensitive
> resources like the system's physical memory, gpios, i2c buses, dma
> controllers, etc..  so one normally wouldn't want to allow userland to
> send everything/everywhere it wants).
>
> Every rpmsg device is a communication channel with a service running on a
> remote processor (thus rpmsg devices are called channels). Channels are
> identified by a textual name (which is used to match drivers to devices)
> and have a local ("source") rpmsg address, and remote ("destination") rpmsg
> address. When a driver starts listening on a channel (most commonly when it
> is probed), the bus assigns the driver a unique rpmsg src address (a 32 bit
> integer) and binds it with the driver's rx callback handler. This way
> when inbound messages arrive to this src address, the rpmsg core dispatches
> them to that driver, by invoking the driver's rx handler with the payload
> of the incoming message.
>
> Once probed, rpmsg drivers can also immediately start sending messages to the
> remote rpmsg service by using simple sending API; no need even to specify
> a destination address, since that's part of the rpmsg channel, and the rpmsg
> bus uses the channel's dst address when it constructs the message (for
> more demanding use cases, there's also an extended API, which does allow
> full control of both the src and dst addresses).
>
> The rpmsg bus is using virtio to send and receive messages: every pair
> of processors share two vrings, which are used to send and receive the
> messages over shared memory (one vring is used for tx, and the other one
> for rx). Kicking the remote processor (i.e. letting it know it has a pending
> message on its vring) is accomplished by means available on the platform we
> run on (e.g. OMAP is using its mailbox to both interrupt the remote processor
> and tell it which vring is kicked at the same time). The header of every
> message sent on the rpmsg bus contains src and dst addresses, which make it
> possible to multiplex several rpmsg channels on the same vring.
>
> One nice property of the rpmsg bus is that device creation is completely
> dynamic: remote processors can announce the existence of remote rpmsg
> services by sending a "name service" messages (which contain the name and
> rpmsg addr of the remote service). Those messages are picked by the rpmsg
> bus, which in turn dynamically creates and registers the rpmsg channels
> (i.e devices) which represents the remote services. If/when a relevant rpmsg
> driver is registered, it will be immediately probed by the bus, and can then
> start "talking" to the remote service.
>
> Similarly, we can use this technique to dynamically create virtio devices
> (and new vrings) which would then represent e.g. remote network, console
> and block devices that will be driven by the existing virtio drivers
> (this is still not implemented though; it requires some RTOS work as we're
> not booting Linux on OMAP's remote processors). Creating new vrings might
> also be desired by users who just don't want to use the shared rpmsg vrings
> (for performance or any other functionality reasons).
>
> In addition to dynamic creation of rpmsg channels, the rpmsg bus also
> supports creation of static channels. This is needed in two cases:
> - when a certain remote processor doesn't support sending those "name
>   service" announcements. In that case, a static table of remote rpmsg
>   services must be used to create the rpmsg channels.
> - to support rpmsg server drivers, which aren't bound to a specific remote
>   rpmsg address. Instead, they just listen on a local address, waiting for
>   incoming messages. To send a message, those server drivers need to use
>   the rpmsg_sendto() API, so they can explicitly indicate the dst address
>   every time.
>
> There are already several immediate use cases for rpmsg drivers: OMX
> offloading (already being used on OMAP4), hardware resource manager (remote
> processors on OMAP4 need to ask Linux to enable/disable hardware resources
> on its behalf), remote display driver on Netra (dm8168), where the display
> is controlled by a remote M3 processor (and a Linux v4l2/fbdev driver will
> use rpmsg to communicate with that remote display driver).
>
> * Remoteproc: a generic driver that maintains the state of the remote
> processor(s). Simple rproc_get() and rproc_put() API is exposed, which
> drivers can use when needed (first driver to call get() will load a firmware,
> configure an iommu if needed, and boot the remote processor, while last
> driver to call put() will power it down).
>
> Hardware differences are abstracted as usual: a platform-specific driver
> registers its own start/stop handlers in the generic remoteproc driver,
> and those are invoked when its time to power up/down the processor. As a
> reference, this patch set include remoteproc support for both OMAP4's
> cortex-M3 and Davinci's DSP, tested on the pandaboard and hawkboard,
> respectively.
>
> The gory part of remoteproc is the firmware handling. We tried to come up
> with a simple binary format that will require minimum kernel code to handle,
> but at the same time be generic enough in the hopes that it would prove
> useful to others as well. We're not at all hang onto the binary format
> we picked: if there's any technical reason to change it to support other
> platforms, please let us know. We do realize that a single binary firmware
> structure might eventually not work for everyone. it did prove useful for
> us though; we adopted both the OMAP and Davinci platforms (and their
> completely different remote processor devices) to this simple binary
> structure, so we don't have to duplicate the firmware handling code.
>

I'd like to kick the tires on this with a da850 based platform (MityDSP-L138).
Any chance you might be able to share the stuff you did on the remote side
(DSP/BIOS adaptations for rpmsg,  utils for ELF or COFF conversion to
firmware format, etc.) for the DSP side of your tests done with the hawkboard?

It looks like, at least for the da850, this subsumes or obsoletes DSPLINK in order
to drive a more general purpose architecture (which looks great, so far, BTW). 
Is that the intent?

-Mike

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-23 12:23 ` [RFC 0/8] Introducing a generic AMP/IPC framework Michael Williamson
@ 2011-06-23 16:27   ` Grosen, Mark
  2011-06-23 18:46     ` Arnd Bergmann
  0 siblings, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-06-23 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Michael Williamson
> Sent: Thursday, June 23, 2011 5:23 AM
...
> I'd like to kick the tires on this with a da850 based platform (MityDSP-L138).
> Any chance you might be able to share the stuff you did on the remote side
> (DSP/BIOS adaptations for rpmsg,  utils for ELF or COFF conversion to
> firmware format, etc.) for the DSP side of your tests done with the
> hawkboard?

We have only implemented the remoteproc (load, start, stop) part for
OMAPL138 so far, i.e., not the rpmsg part (communications). I am not sure
when we will get to the rpmsg part.

We will be publishing a Git tree soon with the "remote processor" code under
a BSD license. This code works with our TI RTOS, SYS/BIOS (also BSD), on
both M3 and DSP CPUs. This will include the utility to generate the RPRC
firmware format from an ELF file. Note that the Linux side does not have any
explicit binding or dependency on the runtime environment on the remote
processor, so alternate RTOS or bare-metal implementations could also be
done. We will post a follow-up to the list with a URL soon.

> It looks like, at least for the da850, this subsumes or obsoletes
> DSPLINK in order to drive a more general purpose architecture
> (which looks great, so far, BTW).
> Is that the intent?

First, we are not abandoning DSPLINK. We have many users of this, even
though it is out-of-tree, and we will continue to support it. That said, we
do intend to make this new design the basis for DSPLINK-like
functionality. It's designed to be done "the right way" for Linux (and we
are looking for feedback to make it better).  It is also designed to be more
scalable and extensible in userspace. With a solid kernel foundation, we can
provide lots of functionality in userspace, or users can implement their own
custom solutions. One of the key things to do is map our existing DSPLINK
APIs, like MessageQ, to the new rpmsg transport.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-23 16:27   ` Grosen, Mark
@ 2011-06-23 18:46     ` Arnd Bergmann
  2011-06-24  4:32       ` Grosen, Mark
  0 siblings, 1 reply; 63+ messages in thread
From: Arnd Bergmann @ 2011-06-23 18:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 23 June 2011 18:27:10 Grosen, Mark wrote:
> First, we are not abandoning DSPLINK. We have many users of this, even
> though it is out-of-tree, and we will continue to support it. That said, we
> do intend to make this new design the basis for DSPLINK-like
> functionality. It's designed to be done "the right way" for Linux (and we
> are looking for feedback to make it better).  It is also designed to be more
> scalable and extensible in userspace. With a solid kernel foundation, we can
> provide lots of functionality in userspace, or users can implement their own
> custom solutions. One of the key things to do is map our existing DSPLINK
> APIs, like MessageQ, to the new rpmsg transport.

Sounds all good. What about the PRUSS code? Does that fit into the new model
as well?

	Arnd

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-23 18:46     ` Arnd Bergmann
@ 2011-06-24  4:32       ` Grosen, Mark
  0 siblings, 0 replies; 63+ messages in thread
From: Grosen, Mark @ 2011-06-24  4:32 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Arnd Bergmann
> Sent: Thursday, June 23, 2011 11:47 AM
> Subject: Re: [RFC 0/8] Introducing a generic AMP/IPC framework
> 
> On Thursday 23 June 2011 18:27:10 Grosen, Mark wrote:
> > First, we are not abandoning DSPLINK. We have many users of this, even
> > though it is out-of-tree, and we will continue to support it. That said, we
> > do intend to make this new design the basis for DSPLINK-like
> > functionality. It's designed to be done "the right way" for Linux (and we
> > are looking for feedback to make it better).  It is also designed to be more
> > scalable and extensible in userspace. With a solid kernel foundation, we can
> > provide lots of functionality in userspace, or users can implement their own
> > custom solutions. One of the key things to do is map our existing DSPLINK
> > APIs, like MessageQ, to the new rpmsg transport.
> 
> Sounds all good. What about the PRUSS code? Does that fit into the new
> model as well?
> 
> 	Arnd

Arnd,

Yes, I have been following some of the PRUSS discussion. I think the
remoteproc driver could be used to manage the basic load/start/stop of the
PRUSS processor. I am not sure if the virio/rpmsg part would be a good
fit. The PRUSS processor is pretty limited, so the generality of virtio
might be too much to fit and too much overhead. However, one of the good
things about remoteproc currently is that it is standalone, so other
transports could use it via the rproc_get/put methods.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <1308640714-17961-6-git-send-email-ohad@wizery.com>]

* [RFC 5/8] remoteproc: add davinci implementation
       [not found] ` <1308640714-17961-6-git-send-email-ohad@wizery.com>
@ 2011-06-23 15:27   ` Sergei Shtylyov
  2011-06-24  4:25     ` Grosen, Mark
  0 siblings, 1 reply; 63+ messages in thread
From: Sergei Shtylyov @ 2011-06-23 15:27 UTC (permalink / raw)
  To: linux-arm-kernel

Hello.

Ohad Ben-Cohen wrote:

> From: Mark Grosen <mgrosen@ti.com>

> Add remoteproc implementation for da850, so we can boot its DSP.

> Signed-off-by: Mark Grosen <mgrosen@ti.com>
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
> ---
>  arch/arm/mach-davinci/include/mach/remoteproc.h |   28 +++++
>  drivers/remoteproc/Kconfig                      |   16 +++
>  drivers/remoteproc/Makefile                     |    1 +
>  drivers/remoteproc/davinci_remoteproc.c         |  147 +++++++++++++++++++++++
>  4 files changed, 192 insertions(+), 0 deletions(-)
>  create mode 100644 arch/arm/mach-davinci/include/mach/remoteproc.h
>  create mode 100644 drivers/remoteproc/davinci_remoteproc.c

> diff --git a/arch/arm/mach-davinci/include/mach/remoteproc.h b/arch/arm/mach-davinci/include/mach/remoteproc.h
> new file mode 100644
> index 0000000..af6e88c
> --- /dev/null
> +++ b/arch/arm/mach-davinci/include/mach/remoteproc.h
> @@ -0,0 +1,28 @@
> +/*
> + * Remote Processor
> + *
> + * Copyright (C) 2011 Texas Instruments, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef _DAVINCI_REMOTEPROC_H
> +#define _DAVINCI_REMOTEPROC_H
> +
> +#include <linux/remoteproc.h>
> +
> +struct davinci_rproc_pdata {
> +	struct rproc_ops *ops;
> +	char *name;
> +	char *clk_name;
> +	char *firmware;
> +};
> +
> +#endif /* _DAVINCI_REMOTEPROC_H */
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 88421bd..1e594b5 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -26,3 +26,19 @@ config OMAP_REMOTE_PROC
>  
>  	  It's safe to say n here if you're not interested in multimedia
>  	  offloading or just want a bare minimum kernel.
> +
> +config DAVINCI_REMOTE_PROC
> +	tristate "Davinci remoteproc support"
> +	depends on ARCH_DAVINCI_DA850

    It should work on DA830 as well, but not on real DaVinci, so the name is 
misleading...

[...]
> diff --git a/drivers/remoteproc/davinci_remoteproc.c b/drivers/remoteproc/davinci_remoteproc.c
> new file mode 100644
> index 0000000..0e26fe9
> --- /dev/null
> +++ b/drivers/remoteproc/davinci_remoteproc.c
> @@ -0,0 +1,147 @@
> +/*
> + * Remote processor machine-specific module for Davinci
> + *
> + * Copyright (C) 2011 Texas Instruments, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> +
> +#include <linux/kernel.h>
> +#include <linux/err.h>
> +#include <linux/printk.h>
> +#include <linux/bitops.h>
> +#include <linux/platform_device.h>
> +#include <linux/remoteproc.h>
> +#include <linux/clk.h>
> +#include <linux/io.h>
> +
> +#include <mach/da8xx.h>
> +#include <mach/cputype.h>
> +#include <mach/psc.h>
> +#include <mach/remoteproc.h>
> +
> +/*
> + * Technical Reference:
> + * OMAP-L138 Applications Processor System Reference Guide
> + * http://www.ti.com/litv/pdf/sprugm7d
> + */
> +
> +/* local reset bit (0 is asserted) in MDCTL15 register (section 9.6.18) */
> +#define LRST		BIT(8)

    Perhaps this should be named nLRST or something if the sense is inverted?

> +/* next state bits in MDCTL15 register (section 9.6.18) */
> +#define NEXT_ENABLED	0x3

    Isn't this already declared in <mach/psc.h> as PSC_STATE_ENABLED?

> +/* register for DSP boot address in SYSCFG0 module (section 11.5.6) */
> +#define HOST1CFG	0x44

    Worth declaring in <mach/da8xx.h> instead...

> +static inline int davinci_rproc_start(struct rproc *rproc, u64 bootaddr)
> +{
> +	struct device *dev = rproc->dev;
> +	struct davinci_rproc_pdata *pdata = dev->platform_data;
> +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> +	void __iomem *psc_base;
> +	struct clk *dsp_clk;
> +
> +	/* hw requires the start (boot) address be on 1KB boundary */
> +	if (bootaddr & 0x3ff) {
> +		dev_err(dev, "invalid boot address: must be aligned to 1KB\n");
> +		return -EINVAL;
> +	}
> +
> +	dsp_clk = clk_get(dev, pdata->clk_name);

    We could match using clkdev functionality, but the clock entry would need to 
be changed then...

> +	if (IS_ERR_OR_NULL(dsp_clk)) {
> +		dev_err(dev, "clk_get error: %ld\n", PTR_ERR(dsp_clk));
> +		return PTR_ERR(dsp_clk);
> +	}
> +
> +	clk_enable(dsp_clk);

     This seems rather senseless activity as on DA8xx the DSP core boots the ARM 
core, so the DSP clock will be already enabled.

> +	rproc->priv = dsp_clk;
> +
> +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
> +
> +	/* insure local reset is asserted before writing start address */
> +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 * DA8XX_LPSC0_GEM);
> +
> +	__raw_writel(bootaddr, DA8XX_SYSCFG0_VIRT(HOST1CFG));

    DA8XX_SYSCFG0_VIRT() is not supposed to be used outside mach-davinci. The 
variable it refers is not exported, so driver module won't work.

> +	/* de-assert local reset to start the dsp running */
> +	__raw_writel(LRST | NEXT_ENABLED, psc_base + MDCTL +
> +		4 * DA8XX_LPSC0_GEM);
> +
> +	iounmap(psc_base);
> +
> +	return 0;
> +}
> +
> +static inline int davinci_rproc_stop(struct rproc *rproc)
> +{
> +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> +	void __iomem *psc_base;
> +	struct clk *dsp_clk = rproc->priv;
> +
> +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
> +
> +	/* halt the dsp by asserting local reset */
> +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 * DA8XX_LPSC0_GEM);
> +
> +	clk_disable(dsp_clk);
> +	clk_put(dsp_clk);
> +
> +	iounmap(psc_base);
> +
> +	return 0;
> +}

    All this is DA8xx specific code which won't fly on real DaVincis, so I 
suggest that you rename the file to da8xx_remoteproc.c for clarity; and rename 
the patch as well...

WBR, Sergei

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-23 15:27   ` [RFC 5/8] remoteproc: add davinci implementation Sergei Shtylyov
@ 2011-06-24  4:25     ` Grosen, Mark
  2011-06-24 15:13       ` Sergei Shtylyov
  0 siblings, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-06-24  4:25 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Sergei Shtylyov
> Sent: Thursday, June 23, 2011 8:28 AM
> Subject: Re: [RFC 5/8] remoteproc: add davinci implementation
> 
> Hello.

Sergei, thanks for the feedback. Comments below.

Mark

> 
>     It should work on DA830 as well, but not on real DaVinci, so the
> name is misleading...

Yes, we debated calling it da8xx, but felt that with minor changes it could
accomodate the other SoCs in the davinci family. However, it may be better
to start with just the da8xx/omapl13x parts and then rename if we add the
others. 

> [...]
> > +
> > +/*
> > + * Technical Reference:
> > + * OMAP-L138 Applications Processor System Reference Guide
> > + * http://www.ti.com/litv/pdf/sprugm7d
> > + */
> > +
> > +/* local reset bit (0 is asserted) in MDCTL15 register (section
> 9.6.18) */
> > +#define LRST		BIT(8)
> 
>     Perhaps this should be named nLRST or something if the sense is inverted?

If there is an established naming convention for this, I'll adopt it.

> 
> > +/* next state bits in MDCTL15 register (section 9.6.18) */
> > +#define NEXT_ENABLED	0x3
> 
>     Isn't this already declared in <mach/psc.h> as PSC_STATE_ENABLED?

Yes, thanks, I missed it.

> > +/* register for DSP boot address in SYSCFG0 module (section 11.5.6)
> */
> > +#define HOST1CFG	0x44
> 
>     Worth declaring in <mach/da8xx.h> instead...

Possibly - since it is only used for the DSP, I thought it would be better
to keep local to this implementation. I'll adopt whichever approach is the
convention. 

> > +static inline int davinci_rproc_start(struct rproc *rproc, u64
> bootaddr)
> > +{
> > +	struct device *dev = rproc->dev;
> > +	struct davinci_rproc_pdata *pdata = dev->platform_data;
> > +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> > +	void __iomem *psc_base;
> > +	struct clk *dsp_clk;
> > +
> > +	/* hw requires the start (boot) address be on 1KB boundary */
> > +	if (bootaddr & 0x3ff) {
> > +		dev_err(dev, "invalid boot address: must be aligned to
> 1KB\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	dsp_clk = clk_get(dev, pdata->clk_name);
> 
>     We could match using clkdev functionality, but the clock entry
> would need to be changed then...

I followed the existing pattern I saw in other drivers. If there is a new,
better way, please point me to an example.

> 
> > +	if (IS_ERR_OR_NULL(dsp_clk)) {
> > +		dev_err(dev, "clk_get error: %ld\n", PTR_ERR(dsp_clk));
> > +		return PTR_ERR(dsp_clk);
> > +	}
> > +
> > +	clk_enable(dsp_clk);
> 
>      This seems rather senseless activity as on DA8xx the DSP core
> boots the ARM core, so the DSP clock will be already enabled.

I think it is needed. It's true that the DSP initiates the boot, but then it is
reset and the clock disabled. See Section 13.2 of
http://focus.ti.com/lit/ug/sprugm7e/sprugm7e.pdf:

13.2 DSP Wake Up

Following deassertion of device reset, the DSP intializes the ARM296 so that
it can execute the ARM ROM bootloader. Upon successful wake up, the ARM
places the DSP in a reset and clock gated (SwRstDisable) state that is
controlled by the LPSC and the SYSCFG modules.

Besides, the boot loader could have disabled it to save power. The ARM and
DSP are clocked independently, so I think it's best to use clock management.

> > +	rproc->priv = dsp_clk;
> > +
> > +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
> > +
> > +	/* insure local reset is asserted before writing start address */
> > +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 *
> DA8XX_LPSC0_GEM);
> > +
> > +	__raw_writel(bootaddr, DA8XX_SYSCFG0_VIRT(HOST1CFG));
> 
>     DA8XX_SYSCFG0_VIRT() is not supposed to be used outside mach-davinci. The
> variable it refers is not exported, so driver module won't work.

Ooops, I clearly did not build this as a module. Suggestion how to fix this?

> > +	/* de-assert local reset to start the dsp running */
> > +	__raw_writel(LRST | NEXT_ENABLED, psc_base + MDCTL +
> > +		4 * DA8XX_LPSC0_GEM);
> > +
> > +	iounmap(psc_base);
> > +
> > +	return 0;
> > +}
> > +
> > +static inline int davinci_rproc_stop(struct rproc *rproc)
> > +{
> > +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> > +	void __iomem *psc_base;
> > +	struct clk *dsp_clk = rproc->priv;
> > +
> > +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
> > +
> > +	/* halt the dsp by asserting local reset */
> > +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 *
> DA8XX_LPSC0_GEM);
> > +
> > +	clk_disable(dsp_clk);
> > +	clk_put(dsp_clk);
> > +
> > +	iounmap(psc_base);
> > +
> > +	return 0;
> > +}
> 
>     All this is DA8xx specific code which won't fly on real DaVincis, so I
> suggest that you rename the file to da8xx_remoteproc.c for clarity; and
> rename the patch as well...

This is probably the right thing to do ...

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-24  4:25     ` Grosen, Mark
@ 2011-06-24 15:13       ` Sergei Shtylyov
  2011-06-24 15:43         ` Nori, Sekhar
  2011-06-27 18:20         ` Grosen, Mark
  0 siblings, 2 replies; 63+ messages in thread
From: Sergei Shtylyov @ 2011-06-24 15:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hello.

Grosen, Mark wrote:

>>     It should work on DA830 as well,

    So please make it dependent on ARCH_DAVINCI_DA8XX.

>> but not on real DaVinci, so the name is misleading...

> Yes, we debated calling it da8xx, but felt that with minor changes it could
> accomodate the other SoCs in the davinci family.

    I don't think it's a good idea. Using cpu_is_*() is drivers is bad. Using 
#ifdef's is not an option either.

> However, it may be better
> to start with just the da8xx/omapl13x parts and then rename if we add the
> others. 

>> [...]
>>> +
>>> +/*
>>> + * Technical Reference:
>>> + * OMAP-L138 Applications Processor System Reference Guide
>>> + * http://www.ti.com/litv/pdf/sprugm7d
>>> + */
>>> +
>>> +/* local reset bit (0 is asserted) in MDCTL15 register (section
>> 9.6.18) */
>>> +#define LRST		BIT(8)

>>     Perhaps this should be named nLRST or something if the sense is inverted?

> If there is an established naming convention for this, I'll adopt it.

    Looking into my old PSC manual (can't get the recent documentation from TI's 
site right now), the bit is called LRSTz.
    It's worth moving this #define into <mach/psc.h> as well.

>>> +/* register for DSP boot address in SYSCFG0 module (section 11.5.6)
>> */
>>> +#define HOST1CFG	0x44

>>     Worth declaring in <mach/da8xx.h> instead...

> Possibly - since it is only used for the DSP, I thought it would be better
> to keep local to this implementation. I'll adopt whichever approach is the
> convention. 

    Well, the general approach is to keep the #define's where they are used, so 
maybe we should keep this #define here.

>>> +static inline int davinci_rproc_start(struct rproc *rproc, u64
>> bootaddr)
>>> +{
>>> +	struct device *dev = rproc->dev;
>>> +	struct davinci_rproc_pdata *pdata = dev->platform_data;
>>> +	struct davinci_soc_info *soc_info = &davinci_soc_info;
>>> +	void __iomem *psc_base;
>>> +	struct clk *dsp_clk;
>>> +
>>> +	/* hw requires the start (boot) address be on 1KB boundary */
>>> +	if (bootaddr & 0x3ff) {
>>> +		dev_err(dev, "invalid boot address: must be aligned to
>> 1KB\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	dsp_clk = clk_get(dev, pdata->clk_name);

>>     We could match using clkdev functionality, but the clock entry
>> would need to be changed then...

> I followed the existing pattern I saw in other drivers.

    Probably MUSB? We're trying to move away from passing the clock name to thge 
drivers, using match by device instead.

> If there is a new, better way, please point me to an example.

    Look at the da850_clks[] in mach-davinci/da850.c: if the device name is 
specified as a first argument to CLK() macro (instead of clock name the second 
argument), the matching is done by device, so you don't need to specify the 
clock name to clk_get(), passing NULL instead.

>>> +	if (IS_ERR_OR_NULL(dsp_clk)) {
>>> +		dev_err(dev, "clk_get error: %ld\n", PTR_ERR(dsp_clk));
>>> +		return PTR_ERR(dsp_clk);
>>> +	}
>>> +
>>> +	clk_enable(dsp_clk);

>>      This seems rather senseless activity as on DA8xx the DSP core
>> boots the ARM core, so the DSP clock will be already enabled.

> I think it is needed. It's true that the DSP initiates the boot, but then it is
> reset and the clock disabled. See Section 13.2 of

    Hm, didn't know that. Contrarywise, we had to work around the races between 
ARM and DSP cores on accessing locked SYSCFG registers -- in kernel. So I was 
under impression that DSP code continues running some stuff of its own.

> http://focus.ti.com/lit/ug/sprugm7e/sprugm7e.pdf:

> 13.2 DSP Wake Up

> Following deassertion of device reset, the DSP intializes the ARM296 so that
> it can execute the ARM ROM bootloader. Upon successful wake up, the ARM
> places the DSP in a reset and clock gated (SwRstDisable) state that is
> controlled by the LPSC and the SYSCFG modules.

> Besides, the boot loader could have disabled it to save power. The ARM and
> DSP are clocked independently, so I think it's best to use clock management.

    OK, agreed.

>>> +	rproc->priv = dsp_clk;
>>> +
>>> +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
>>> +
>>> +	/* insure local reset is asserted before writing start address */
>>> +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 *
>> DA8XX_LPSC0_GEM);
>>> +
>>> +	__raw_writel(bootaddr, DA8XX_SYSCFG0_VIRT(HOST1CFG));

>>     DA8XX_SYSCFG0_VIRT() is not supposed to be used outside mach-davinci. The
>> variable it refers is not exported, so driver module won't work.

> Ooops, I clearly did not build this as a module. Suggestion how to fix this?

    Using the normal ioremap() of SYSCFG0 space, I suppose.

> Mark

WBR, Sergei

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-24 15:13       ` Sergei Shtylyov
@ 2011-06-24 15:43         ` Nori, Sekhar
  2011-06-27 18:31           ` Grosen, Mark
  2011-06-27 18:20         ` Grosen, Mark
  1 sibling, 1 reply; 63+ messages in thread
From: Nori, Sekhar @ 2011-06-24 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Fri, Jun 24, 2011 at 20:43:50, Sergei Shtylyov wrote:

> >>> +	rproc->priv = dsp_clk;
> >>> +
> >>> +	psc_base = ioremap(soc_info->psc_bases[0], SZ_4K);
> >>> +
> >>> +	/* insure local reset is asserted before writing start address */
> >>> +	__raw_writel(NEXT_ENABLED, psc_base + MDCTL + 4 *
> >> DA8XX_LPSC0_GEM);
> >>> +
> >>> +	__raw_writel(bootaddr, DA8XX_SYSCFG0_VIRT(HOST1CFG));
> 
> >>     DA8XX_SYSCFG0_VIRT() is not supposed to be used outside mach-davinci. The
> >> variable it refers is not exported, so driver module won't work.
> 
> > Ooops, I clearly did not build this as a module. Suggestion how to fix this?
> 
>     Using the normal ioremap() of SYSCFG0 space, I suppose.

Since procedure to set the boot address varies across DaVinci
platforms, you could have a callback populated in platform data
which will be implemented differently for original DaVinci and
DA8xx devices.

Also, all PSC accesses are better off going through clock
framework to ensure proper locking and modularity.

To assert/de-assert local reset when enabling or disabling PSC,
you could use a flag in the clock structure to indicate the need
for this. This way, if there is any other module needing a local
reset, it can just define the same flag. Similarly, if the DSP
does not need a local reset on a particular platform, that
platform can simply skip the flag.

This can be done in a manner similar to how the support for
a forced transition PSC was added here:

https://patchwork.kernel.org/patch/662941/

Thanks,
Sekhar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-24 15:43         ` Nori, Sekhar
@ 2011-06-27 18:31           ` Grosen, Mark
  2011-07-05  5:29             ` Nori, Sekhar
  2011-07-05  5:34             ` Nori, Sekhar
  0 siblings, 2 replies; 63+ messages in thread
From: Grosen, Mark @ 2011-06-27 18:31 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Nori, Sekhar
> Sent: Friday, June 24, 2011 8:44 AM
> 
> Hi Mark,

Sekhar, thanks for your feedback and ideas. Comments below.

Mark

> Since procedure to set the boot address varies across DaVinci
> platforms, you could have a callback populated in platform data
> which will be implemented differently for original DaVinci and
> DA8xx devices.

I looked at DM6467 and it's the same as OMAPL13x, except at a different
address. Rather than a callback, it could be just an address in the
platform data.

> 
> Also, all PSC accesses are better off going through clock
> framework to ensure proper locking and modularity.
> 
> To assert/de-assert local reset when enabling or disabling PSC,
> you could use a flag in the clock structure to indicate the need
> for this. This way, if there is any other module needing a local
> reset, it can just define the same flag. Similarly, if the DSP
> does not need a local reset on a particular platform, that
> platform can simply skip the flag.
> 
> This can be done in a manner similar to how the support for
> a forced transition PSC was added here:
> 
> https://patchwork.kernel.org/patch/662941/

Yes, I like this idea - much cleaner. For example, the start() method
becomes (pseudo-code):

start()
{
    /* bootaddrreg derived from platform data */
    bootaddrreg = boot_address;

    clk_enable();
}

Referring to your patch above, would it be better to just pass
the flags into the davinci_psc_config() function rather than breaking out more
arguments for each flag?

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-27 18:31           ` Grosen, Mark
@ 2011-07-05  5:29             ` Nori, Sekhar
  2011-07-05 16:54               ` Grosen, Mark
  2011-07-05  5:34             ` Nori, Sekhar
  1 sibling, 1 reply; 63+ messages in thread
From: Nori, Sekhar @ 2011-07-05  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Tue, Jun 28, 2011 at 00:01:41, Grosen, Mark wrote:
> > From: Nori, Sekhar
> > Sent: Friday, June 24, 2011 8:44 AM

> > To assert/de-assert local reset when enabling or disabling PSC,
> > you could use a flag in the clock structure to indicate the need
> > for this. This way, if there is any other module needing a local
> > reset, it can just define the same flag. Similarly, if the DSP
> > does not need a local reset on a particular platform, that
> > platform can simply skip the flag.
> > 
> > This can be done in a manner similar to how the support for
> > a forced transition PSC was added here:
> > 
> > https://patchwork.kernel.org/patch/662941/
> 
> Yes, I like this idea - much cleaner. For example, the start() method
> becomes (pseudo-code):
> 
> start()
> {
>     /* bootaddrreg derived from platform data */
>     bootaddrreg = boot_address;
> 
>     clk_enable();
> }
> 
> Referring to your patch above, would it be better to just pass
> the flags into the davinci_psc_config() function rather than breaking out more
> arguments for each flag?

I did dwell on this quite a bit while writing that
patch, but finally decided on passing an argument
instead since I was not sure if there are going
to be more modifiers required.

Now that you have the need for one more flag, I
think we should go ahead and pass the flags directly
to davinci_psc_config(). My original patch is not
upstream yet. I will re-spin it and you can build
on top of it.

Thanks,
Sekhar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-07-05  5:29             ` Nori, Sekhar
@ 2011-07-05 16:54               ` Grosen, Mark
  0 siblings, 0 replies; 63+ messages in thread
From: Grosen, Mark @ 2011-07-05 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Nori, Sekhar
> Sent: Monday, July 04, 2011 10:30 PM

...
> > > https://patchwork.kernel.org/patch/662941/
> >
> > Yes, I like this idea - much cleaner. For example, the start() method
> > becomes (pseudo-code):
> >
> > start()
> > {
> >     /* bootaddrreg derived from platform data */
> >     bootaddrreg = boot_address;
> >
> >     clk_enable();
> > }
> >
> > Referring to your patch above, would it be better to just pass
> > the flags into the davinci_psc_config() function rather than breaking
> out more
> > arguments for each flag?
> 
> I did dwell on this quite a bit while writing that
> patch, but finally decided on passing an argument
> instead since I was not sure if there are going
> to be more modifiers required.
> 
> Now that you have the need for one more flag, I
> think we should go ahead and pass the flags directly
> to davinci_psc_config(). My original patch is not
> upstream yet. I will re-spin it and you can build
> on top of it.

Thanks, Sekhar, this sounds good. I'll look for your patch and utilize
it in the next rev of this patch.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-27 18:31           ` Grosen, Mark
  2011-07-05  5:29             ` Nori, Sekhar
@ 2011-07-05  5:34             ` Nori, Sekhar
  2011-07-05 16:54               ` Grosen, Mark
  1 sibling, 1 reply; 63+ messages in thread
From: Nori, Sekhar @ 2011-07-05  5:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Tue, Jun 28, 2011 at 00:01:41, Grosen, Mark wrote:
> > From: Nori, Sekhar
> > Sent: Friday, June 24, 2011 8:44 AM
> > 
> > Hi Mark,
> 
> Sekhar, thanks for your feedback and ideas. Comments below.
> 
> Mark
> 
> > Since procedure to set the boot address varies across DaVinci
> > platforms, you could have a callback populated in platform data
> > which will be implemented differently for original DaVinci and
> > DA8xx devices.
> 
> I looked at DM6467 and it's the same as OMAPL13x, except at a different
> address. Rather than a callback, it could be just an address in the
> platform data.

Sounds okay as long as _all_ the DaVinci devices have the same
bit to be set. Plus, I hope there are no other users of the
register so that there is no race with other platform code using
the same register.

Thanks,
Sekhar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-07-05  5:34             ` Nori, Sekhar
@ 2011-07-05 16:54               ` Grosen, Mark
  2011-07-06  4:36                 ` Nori, Sekhar
  0 siblings, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-07-05 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Nori, Sekhar
> Sent: Monday, July 04, 2011 10:35 PM
> To: Grosen, Mark; Sergei Shtylyov

...
> > > Since procedure to set the boot address varies across DaVinci
> > > platforms, you could have a callback populated in platform data
> > > which will be implemented differently for original DaVinci and
> > > DA8xx devices.
> >
> > I looked at DM6467 and it's the same as OMAPL13x, except at a
> different
> > address. Rather than a callback, it could be just an address in the
> > platform data.
> 
> Sounds okay as long as _all_ the DaVinci devices have the same
> bit to be set. Plus, I hope there are no other users of the
> register so that there is no race with other platform code using
> the same register.

Sekhar,

The register is a dedicated 32-bit register that holds the start/boot
address for the DSP, so no other platform code should be using it. Once
the LRST is de-asserted (via the PSC code enhancement), the DSP starts
execution at the address in this register.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-07-05 16:54               ` Grosen, Mark
@ 2011-07-06  4:36                 ` Nori, Sekhar
  0 siblings, 0 replies; 63+ messages in thread
From: Nori, Sekhar @ 2011-07-06  4:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Tue, Jul 05, 2011 at 22:24:21, Grosen, Mark wrote:
> > From: Nori, Sekhar
> > Sent: Monday, July 04, 2011 10:35 PM
> > To: Grosen, Mark; Sergei Shtylyov
> 
> ...
> > > > Since procedure to set the boot address varies across DaVinci
> > > > platforms, you could have a callback populated in platform data
> > > > which will be implemented differently for original DaVinci and
> > > > DA8xx devices.
> > >
> > > I looked at DM6467 and it's the same as OMAPL13x, except at a
> > different
> > > address. Rather than a callback, it could be just an address in the
> > > platform data.
> > 
> > Sounds okay as long as _all_ the DaVinci devices have the same
> > bit to be set. Plus, I hope there are no other users of the
> > register so that there is no race with other platform code using
> > the same register.
> 
> Sekhar,
> 
> The register is a dedicated 32-bit register that holds the start/boot
> address for the DSP, so no other platform code should be using it. Once
> the LRST is de-asserted (via the PSC code enhancement), the DSP starts
> execution at the address in this register.

Okay. I had misunderstood this as a bit which is used to reset the DSP.
Thanks for clarifying.
 
Regards,
Sekhar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-24 15:13       ` Sergei Shtylyov
  2011-06-24 15:43         ` Nori, Sekhar
@ 2011-06-27 18:20         ` Grosen, Mark
  2011-06-28  9:36           ` Nori, Sekhar
  1 sibling, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-06-27 18:20 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Sergei Shtylyov
> Sent: Friday, June 24, 2011 8:14 AM
> 
> Grosen, Mark wrote:
> 
> >>     It should work on DA830 as well,
> 
>     So please make it dependent on ARCH_DAVINCI_DA8XX.
> 
> >> but not on real DaVinci, so the name is misleading...
> 
> > Yes, we debated calling it da8xx, but felt that with minor changes it could
> > accomodate the other SoCs in the davinci family.
> 
>     I don't think it's a good idea. Using cpu_is_*() is drivers is bad. Using
> #ifdef's is not an option either.
> 
> > However, it may be better
> > to start with just the da8xx/omapl13x parts and then rename if we add the
> > others.

Sergei, I'll respond more to this in a response to Sekhar's ideas. We may be
able to make this work generically for davinci w/o idef's.

> Looking into my old PSC manual (can't get the recent documentation from TI's
> site right now), the bit is called LRSTz.
>     It's worth moving this #define into <mach/psc.h> as well.

Ok, I agree we should try to match the HW names as much as possible

> >>> +/* register for DSP boot address in SYSCFG0 module (section 11.5.6) */
> >>> +#define HOST1CFG	0x44
> 
> >>     Worth declaring in <mach/da8xx.h> instead...
> 
> > Possibly - since it is only used for the DSP, I thought it would be better
> > to keep local to this implementation. I'll adopt whichever approach is the
> > convention.
> 
>     Well, the general approach is to keep the #define's where they are
> used, so maybe we should keep this #define here.

Will review as part of the general cleanup.

> 
> >>> +static inline int davinci_rproc_start(struct rproc *rproc, u64
> >> bootaddr)
> >>> +{
> >>> +	struct device *dev = rproc->dev;
> >>> +	struct davinci_rproc_pdata *pdata = dev->platform_data;
> >>> +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> >>> +	void __iomem *psc_base;
> >>> +	struct clk *dsp_clk;
> >>> +
> >>> +	/* hw requires the start (boot) address be on 1KB boundary */
> >>> +	if (bootaddr & 0x3ff) {
> >>> +		dev_err(dev, "invalid boot address: must be aligned to
> >> 1KB\n");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	dsp_clk = clk_get(dev, pdata->clk_name);
> 
> >>     We could match using clkdev functionality, but the clock entry
> >> would need to be changed then...
> 
> > I followed the existing pattern I saw in other drivers.
> 
> Probably MUSB? We're trying to move away from passing the clock name to thge
> drivers, using match by device instead.
> 
> > If there is a new, better way, please point me to an example.
> 
>     Look at the da850_clks[] in mach-davinci/da850.c: if the device name is
> specified as a first argument to CLK() macro (instead of clock name the second
> argument), the matching is done by device, so you don't need to specify the
> clock name to clk_get(), passing NULL instead.

Thanks, I'll look at this and ask for Sekhar and Kevin's preferences.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 5/8] remoteproc: add davinci implementation
  2011-06-27 18:20         ` Grosen, Mark
@ 2011-06-28  9:36           ` Nori, Sekhar
  0 siblings, 0 replies; 63+ messages in thread
From: Nori, Sekhar @ 2011-06-28  9:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Mon, Jun 27, 2011 at 23:50:25, Grosen, Mark wrote:

> > >>> +static inline int davinci_rproc_start(struct rproc *rproc, u64
> > >> bootaddr)
> > >>> +{
> > >>> +	struct device *dev = rproc->dev;
> > >>> +	struct davinci_rproc_pdata *pdata = dev->platform_data;
> > >>> +	struct davinci_soc_info *soc_info = &davinci_soc_info;
> > >>> +	void __iomem *psc_base;
> > >>> +	struct clk *dsp_clk;
> > >>> +
> > >>> +	/* hw requires the start (boot) address be on 1KB boundary */
> > >>> +	if (bootaddr & 0x3ff) {
> > >>> +		dev_err(dev, "invalid boot address: must be aligned to
> > >> 1KB\n");
> > >>> +		return -EINVAL;
> > >>> +	}
> > >>> +
> > >>> +	dsp_clk = clk_get(dev, pdata->clk_name);
> > 
> > >>     We could match using clkdev functionality, but the clock entry
> > >> would need to be changed then...
> > 
> > > I followed the existing pattern I saw in other drivers.
> > 
> > Probably MUSB? We're trying to move away from passing the clock name to thge
> > drivers, using match by device instead.
> > 
> > > If there is a new, better way, please point me to an example.
> > 
> >     Look at the da850_clks[] in mach-davinci/da850.c: if the device name is
> > specified as a first argument to CLK() macro (instead of clock name the second
> > argument), the matching is done by device, so you don't need to specify the
> > clock name to clk_get(), passing NULL instead.
> 
> Thanks, I'll look at this and ask for Sekhar and Kevin's preferences.

The second argument is not a (SoC specific)
clock name, it is the "connection id" which
is specific to the IP (not to the SoC).

For example, the EMAC IP may want to deal
with interface clock, functional clock and
mii clock. So, it passes a connection identifier
to tell the clock framework which particular
clock it is interested in. The names of these
connection identifiers will not change with SoC
as they are IP property. So, it will not be
right to get this through platform data.

If there is no connection identifier in
your case, you can just pass the second
argument as NULL. The clock matching will
happen using device name (which should
remain same across SoCs too).

Of course, this means that any incorrectly defined
clock lookup structures for dsp clock in platform
code needs to be corrected.

Thanks,
Sekhar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-21  7:18 [RFC 0/8] Introducing a generic AMP/IPC framework Ohad Ben-Cohen
                   ` (4 preceding siblings ...)
       [not found] ` <1308640714-17961-6-git-send-email-ohad@wizery.com>
@ 2011-06-24 20:16 ` Stephen Boyd
  2011-06-26  1:11   ` Ohad Ben-Cohen
       [not found] ` <1308640714-17961-3-git-send-email-ohad@wizery.com>
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Stephen Boyd @ 2011-06-24 20:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/21/2011 12:18 AM, Ohad Ben-Cohen wrote:
> * Rpmsg: a virtio-based messaging bus that allows kernel drivers to
> communicate with remote processors available on the system. In turn,
> drivers could then expose appropriate user space interfaces, if needed
> (tasks running on remote processors often have direct access to sensitive
> resources like the system's physical memory, gpios, i2c buses, dma
> controllers, etc..  so one normally wouldn't want to allow userland to
> send everything/everywhere it wants).
>
> Every rpmsg device is a communication channel with a service running on a
> remote processor (thus rpmsg devices are called channels). Channels are
> identified by a textual name (which is used to match drivers to devices)
> and have a local ("source") rpmsg address, and remote ("destination") rpmsg
> address. When a driver starts listening on a channel (most commonly when it
> is probed), the bus assigns the driver a unique rpmsg src address (a 32 bit
> integer) and binds it with the driver's rx callback handler. This way
> when inbound messages arrive to this src address, the rpmsg core dispatches
> them to that driver, by invoking the driver's rx handler with the payload
> of the incoming message.
>
[snip]
>
> In addition to dynamic creation of rpmsg channels, the rpmsg bus also
> supports creation of static channels. This is needed in two cases:
> - when a certain remote processor doesn't support sending those "name
>   service" announcements. In that case, a static table of remote rpmsg
>   services must be used to create the rpmsg channels.
> - to support rpmsg server drivers, which aren't bound to a specific remote
>   rpmsg address. Instead, they just listen on a local address, waiting for
>   incoming messages. To send a message, those server drivers need to use
>   the rpmsg_sendto() API, so they can explicitly indicate the dst address
>   every time.
>

This sounds a lot like SMD (shared memory driver) on MSM. The main
difference I see is that SMD uses the platform bus instead of the virtio
bus and it has its own protocol for channel allocation.

>
> * Remoteproc: a generic driver that maintains the state of the remote
> processor(s). Simple rproc_get() and rproc_put() API is exposed, which
> drivers can use when needed (first driver to call get() will load a firmware,
> configure an iommu if needed, and boot the remote processor, while last
> driver to call put() will power it down).
>
> Hardware differences are abstracted as usual: a platform-specific driver
> registers its own start/stop handlers in the generic remoteproc driver,
> and those are invoked when its time to power up/down the processor. As a
> reference, this patch set include remoteproc support for both OMAP4's
> cortex-M3 and Davinci's DSP, tested on the pandaboard and hawkboard,
> respectively.

This remote proc code is eerily similar to PIL (peripheral image loader,
yes we love our acronyms) which I posted a few months back[1]. Was it
inspiration for this patch series?

In terms of API, s/pil/rproc/ and it would be 95% identical. There are
some low-level differences though (see below).

>
> The gory part of remoteproc is the firmware handling. We tried to come up
> with a simple binary format that will require minimum kernel code to handle,
> but at the same time be generic enough in the hopes that it would prove
> useful to others as well. We're not at all hang onto the binary format
> we picked: if there's any technical reason to change it to support other
> platforms, please let us know. We do realize that a single binary firmware
> structure might eventually not work for everyone. it did prove useful for
> us though; we adopted both the OMAP and Davinci platforms (and their
> completely different remote processor devices) to this simple binary
> structure, so we don't have to duplicate the firmware handling code.

This is an important difference between remote proc and PIL. In PIL, we
do image authentication in addition to processor boot. In this case, we
need to give the authentication engine the elf header and program
headers for the firmware so that it knows what parts of memory need to
be authenticated. Instead of devising a new firmware format, we decided
to just stick with elf and parse the headers in the kernel because we
needed them for authentication anyway. Is this reason enough to move to
an ELF format instead?

Another difference is inter-processor dependencies. For example, on
msm8660 the modem can't boot until the dsp has been booted. I suppose we
could hide this detail in the platform specific get() implementation by
calling rproc_get() on the dependent processor (hopefully no locking
issues arise). I'd rather have it built into the core though as it isn't
really specific to the hardware.

If we can resolve these differences I think we can easily support remote
processor boot on MSM via remoteproc.

[1] https://lkml.org/lkml/2011/3/9/490

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-24 20:16 ` [RFC 0/8] Introducing a generic AMP/IPC framework Stephen Boyd
@ 2011-06-26  1:11   ` Ohad Ben-Cohen
  2011-06-26  1:17     ` Brian Swetland
  2011-06-27 21:22     ` Grosen, Mark
  0 siblings, 2 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-26  1:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Stephen,

On Fri, Jun 24, 2011 at 11:16 PM, Stephen Boyd <sboyd@codeaurora.org> wrote:
> This sounds a lot like SMD (shared memory driver) on MSM. The main
> difference I see is that SMD uses the platform bus instead of the virtio
> bus and it has its own protocol for channel allocation.

Yeah, virtio is a key factor in this work; it was suggested to us by
Arnd at the AMP plumbers discussions last year, where it was apparent
that many vendors have their own IPC drivers/buses/channels over
shared memory with some vendor-ish binary protocol. I must say we
really liked virtio: it considerably simplified the code (we're not
adding any new binary protocol), it's very nicely optimized and
flexible, and it comes with a set of virtio drivers (e.g. network,
console, block) so we don't have to write our own.

We also cared about adding this functionality as an IPC bus, so the
driver core will help matching drivers to channels - it simplified the
code (in both setup and tear down of channels) and kept it flexible.
It will also facilitate error recovery (on remote crash, we just
remove the virtio device, and then the driver core will in turn start
->remove()ing the rpmsg drivers) and power management (via runtime
PM).

About SMD: I'm not familiar with it too much, but Brian naturally is
(just for the sake of everyone who are not reading headers - Brian
Swetland wrote the Linux SMD driver, and is also an author of this
Google+TI joint work).

Btw, I'm sure SMD is conceptually not MSM-specific, and have wondered
whether you guys would like to use rpmsg/virtio (I know you have
several drivers like network/tty/etc over SMD, somewhat similarly to
virtio). Probably the biggest reason why not to is the pain in
changing the binary protocol with the modem/dsp side. If you ever do
think about it, I'd be happy to work with you to make it happen.

> This remote proc code is eerily similar to PIL (peripheral image loader,
> yes we love our acronyms) which I posted a few months back[1]. Was it
> inspiration for this patch series?

No, we weren't (or at least I wasn't) aware of PIL.

> In terms of API, s/pil/rproc/ and it would be 95% identical. There are
> some low-level differences though (see below).

Indeed, eerily similar :O

I just guess the API is so simple that probably most kernel hackers
would use refcounting get/put semantics here.

> This is an important difference between remote proc and PIL. In PIL, we
> do image authentication in addition to processor boot.

Yes, we have this too (secure code will need to authenticate the image
in some use cases) - but that's not ready yet for submission and we
decided to start off with the basics first and then evolve.

> Instead of devising a new firmware format, we decided
> to just stick with elf and parse the headers in the kernel because we
> needed them for authentication anyway. Is this reason enough to move to
> an ELF format instead?

I think that consolidation of code is enough reason to make an effort.
I know that our firmware format was chosen for simplicity, but I'm not
sure if we have the tools yet to build standard ELF files for the
remote processors (IIRC it's in the works though). I'll let Mark
comment this one.

> Another difference is inter-processor dependencies. For example, on
> msm8660 the modem can't boot until the dsp has been booted. I suppose we
> could hide this detail in the platform specific get() implementation by
> calling rproc_get() on the dependent processor (hopefully no locking
> issues arise). I'd rather have it built into the core though as it isn't
> really specific to the hardware.

No problems, I'm sure we can solve this one easily.

> If we can resolve these differences I think we can easily support remote
> processor boot on MSM via remoteproc.

That'd be very cool, I sure do hope we can work together.

Thanks for your comments !
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-26  1:11   ` Ohad Ben-Cohen
@ 2011-06-26  1:17     ` Brian Swetland
  2011-06-27 21:22     ` Grosen, Mark
  1 sibling, 0 replies; 63+ messages in thread
From: Brian Swetland @ 2011-06-26  1:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jun 25, 2011 at 6:11 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> Hi Stephen,
>
> On Fri, Jun 24, 2011 at 11:16 PM, Stephen Boyd <sboyd@codeaurora.org> wrote:
>> This sounds a lot like SMD (shared memory driver) on MSM. The main
>> difference I see is that SMD uses the platform bus instead of the virtio
>> bus and it has its own protocol for channel allocation.
>
> Yeah, virtio is a key factor in this work; it was suggested to us by
> Arnd at the AMP plumbers discussions last year, where it was apparent
> that many vendors have their own IPC drivers/buses/channels over
> shared memory with some vendor-ish binary protocol. I must say we
> really liked virtio: it considerably simplified the code (we're not
> adding any new binary protocol), it's very nicely optimized and
> flexible, and it comes with a set of virtio drivers (e.g. network,
> console, block) so we don't have to write our own.
>
> We also cared about adding this functionality as an IPC bus, so the
> driver core will help matching drivers to channels - it simplified the
> code (in both setup and tear down of channels) and kept it flexible.
> It will also facilitate error recovery (on remote crash, we just
> remove the virtio device, and then the driver core will in turn start
> ->remove()ing the rpmsg drivers) and power management (via runtime
> PM).
>
> About SMD: I'm not familiar with it too much, but Brian naturally is
> (just for the sake of everyone who are not reading headers - Brian
> Swetland wrote the Linux SMD driver, and is also an author of this
> Google+TI joint work).

rpmsg definitely shares some design features with SMD (given that I
wrote the linux SMD driver and was involved in the design of rpmsg,
this is maybe unsurprising), but whereas in SMD we had to be
compatible with existing AMSS modems (to a degree), we had more
flexibility in the "wire protocol" on rpmsg and virtio looked like a
really nice fit that already was in the kernel.

Ohad had a number of great ideas for making the dynamic discovery,
publishing, and binding of remote services very clean and
straightforward -- I wish I had a chance to pick his brain about this
stuff back when building the SMD interfaces, which started out fairly
static, but then evolved into publishing platform devices, etc.  Of
course some of this is the benefit of hindsight.  It's easier to get
it right (er?) the second or third time around.

The TI SYS/BIOS folks were quite helpful and patient as we redesigned
the wire protocol and publishing/resource model several times along
the way here.

Brian

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-26  1:11   ` Ohad Ben-Cohen
  2011-06-26  1:17     ` Brian Swetland
@ 2011-06-27 21:22     ` Grosen, Mark
  2011-06-28 11:26       ` Ohad Ben-Cohen
  1 sibling, 1 reply; 63+ messages in thread
From: Grosen, Mark @ 2011-06-27 21:22 UTC (permalink / raw)
  To: linux-arm-kernel

> From: Ohad Ben-Cohen
> Sent: Saturday, June 25, 2011 6:12 PM
> 
> Hi Stephen,
> 
> On Fri, Jun 24, 2011 at 11:16 PM, Stephen Boyd <sboyd@codeaurora.org> wrote:
> > Instead of devising a new firmware format, we decided
> > to just stick with elf and parse the headers in the kernel because we
> > needed them for authentication anyway. Is this reason enough to move to
> > an ELF format instead?
> 
> I think that consolidation of code is enough reason to make an effort.
> I know that our firmware format was chosen for simplicity, but I'm not
> sure if we have the tools yet to build standard ELF files for the
> remote processors (IIRC it's in the works though). I'll let Mark
> comment this one.

Yes, we are converting from "standard" ELF to the simple format. I've used the
GNU binutils to work with our ELF files. There were a few motivations:

1. Concern about complexity of parsing ELF files in kernel; however, the PIL
implementation looks pretty clean to me.

2. We added a special section (resource table) that is interpreted as part
of the loading process. The tool that generates our simple format just
recognizes a named section (".resource_table"), so perhaps that could be
done with the PIL ELF loader.

3. Smaller firmware file sizes. Our ELF files are large relative to the
payload, but this might be addressed by a better ELF "strip" utility.

> > Another difference is inter-processor dependencies. For example, on
> > msm8660 the modem can't boot until the dsp has been booted. I suppose we
> > could hide this detail in the platform specific get() implementation by
> > calling rproc_get() on the dependent processor (hopefully no locking
> > issues arise). I'd rather have it built into the core though as it isn't
> > really specific to the hardware.
> 
> No problems, I'm sure we can solve this one easily.
> 
> > If we can resolve these differences I think we can easily support remote
> > processor boot on MSM via remoteproc.
> 
> That'd be very cool, I sure do hope we can work together.

Yes, I hope we can merge our efforts on PIL and remoteproc since they seem
quite close in function and design.

Mark

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-27 21:22     ` Grosen, Mark
@ 2011-06-28 11:26       ` Ohad Ben-Cohen
  2011-06-28 11:36         ` Brian Swetland
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-28 11:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 12:22 AM, Grosen, Mark <mgrosen@ti.com> wrote:
> 2. We added a special section (resource table) that is interpreted as part
> of the loading process. The tool that generates our simple format just
> recognizes a named section (".resource_table"), so perhaps that could be
> done with the PIL ELF loader.

I hope that DT will completely replace that "resource table" section
eventually. We still need to try it out (as soon as DT materializes on
ARM/pandaboard) and see how it all fits together, but in general it
looks like DT could provide all the necessary static resource
configuration, and then we just need to expose it to the remote
processor.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 0/8] Introducing a generic AMP/IPC framework
  2011-06-28 11:26       ` Ohad Ben-Cohen
@ 2011-06-28 11:36         ` Brian Swetland
  0 siblings, 0 replies; 63+ messages in thread
From: Brian Swetland @ 2011-06-28 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 4:26 AM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Tue, Jun 28, 2011 at 12:22 AM, Grosen, Mark <mgrosen@ti.com> wrote:
>> 2. We added a special section (resource table) that is interpreted as part
>> of the loading process. The tool that generates our simple format just
>> recognizes a named section (".resource_table"), so perhaps that could be
>> done with the PIL ELF loader.
>
> I hope that DT will completely replace that "resource table" section
> eventually. We still need to try it out (as soon as DT materializes on
> ARM/pandaboard) and see how it all fits together, but in general it
> looks like DT could provide all the necessary static resource
> configuration, and then we just need to expose it to the remote
> processor.

I'm not sure I see the benefit to pulling the resource information out
of the firmware image.  The resource information is a description of
what resources the firmware requires to work properly (it needs
certain amounts of working memory, timers, peripheral interfaces like
i2c to control camera hw, etc), which will be specific to a given
firmware build.  Pulling that information out into another place adds
new ways for things to break as the firmware and its resource
requirements are now separate and could get out of sync, etc.

Brian

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <1308640714-17961-3-git-send-email-ohad@wizery.com>]

* [RFC 2/8] remoteproc: add omap implementation
       [not found] ` <1308640714-17961-3-git-send-email-ohad@wizery.com>
@ 2011-06-22 10:05   ` Will Newton
  2011-06-22 10:50     ` Ohad Ben-Cohen
  2011-06-27 21:00   ` Grant Likely
  1 sibling, 1 reply; 63+ messages in thread
From: Will Newton @ 2011-06-22 10:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 8:18 AM, Ohad Ben-Cohen <ohad@wizery.com> wrote:

> +/* bootaddr isn't needed for the dual M3's */
> +static inline int omap_rproc_start(struct rproc *rproc, u64 bootaddr)

> +static inline int omap_rproc_stop(struct rproc *rproc)

These two functions don't need to be inline as far as I can see.

> +static struct rproc_ops omap_rproc_ops = {
> + ? ? ? .start = omap_rproc_start,
> + ? ? ? .stop = omap_rproc_stop,
> +};

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 2/8] remoteproc: add omap implementation
  2011-06-22 10:05   ` [RFC 2/8] remoteproc: add omap implementation Will Newton
@ 2011-06-22 10:50     ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-22 10:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 22, 2011 at 1:05 PM, Will Newton <will.newton@gmail.com> wrote:
> On Tue, Jun 21, 2011 at 8:18 AM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
>
>> +/* bootaddr isn't needed for the dual M3's */
>> +static inline int omap_rproc_start(struct rproc *rproc, u64 bootaddr)
>
>> +static inline int omap_rproc_stop(struct rproc *rproc)
>
> These two functions don't need to be inline as far as I can see.

They definitely don't need to. Thanks !

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 2/8] remoteproc: add omap implementation
       [not found] ` <1308640714-17961-3-git-send-email-ohad@wizery.com>
  2011-06-22 10:05   ` [RFC 2/8] remoteproc: add omap implementation Will Newton
@ 2011-06-27 21:00   ` Grant Likely
  2011-06-29 15:04     ` Ohad Ben-Cohen
  1 sibling, 1 reply; 63+ messages in thread
From: Grant Likely @ 2011-06-27 21:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 10:18:28AM +0300, Ohad Ben-Cohen wrote:
> From: Guzman Lugo, Fernando <fernando.lugo@ti.com>
> 
> Add remoteproc implementation for OMAP4, so we can load the M3 and DSP
> remote processors.
> 
> Based on code by Hari Kanigeri <h-kanigeri2@ti.com>
> 
> While this code is functional, and works on OMAP4 & its M3's,
> it is still preliminary and going to substantially change:
> 
> 1. Splitting physically contiguous regions to pages should happen
>    inside the IOMMU API framework, and not here (so omap_rproc_map_unmap
>    should go away).
> 2. IOMMU programming in general should go away too; it should be handled by
>    generic code (preferably using the DMA mapping API), and not by an OMAP
>    specific component.
> 
> This patch depends on OMAP's IOMMU API migration, as posted here:
> https://lkml.org/lkml/2011/6/2/369
> 
> [ohad at wizery.com: commit log, generic iommu api migration, refactoring, cleanups]
> 
> Signed-off-by: Guzman Lugo, Fernando <fernando.lugo@ti.com>
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
> ---
>  arch/arm/plat-omap/include/plat/remoteproc.h |   41 +++++
>  drivers/remoteproc/Kconfig                   |   21 +++
>  drivers/remoteproc/Makefile                  |    1 +
>  drivers/remoteproc/omap_remoteproc.c         |  243 ++++++++++++++++++++++++++
>  4 files changed, 306 insertions(+), 0 deletions(-)
>  create mode 100644 arch/arm/plat-omap/include/plat/remoteproc.h
>  create mode 100644 drivers/remoteproc/omap_remoteproc.c
> 
> diff --git a/arch/arm/plat-omap/include/plat/remoteproc.h b/arch/arm/plat-omap/include/plat/remoteproc.h
> new file mode 100644
> index 0000000..6494570
> --- /dev/null
> +++ b/arch/arm/plat-omap/include/plat/remoteproc.h
> @@ -0,0 +1,41 @@
> +/*
> + * Remote Processor - omap-specific bits
> + *
> + * Copyright (C) 2011 Texas Instruments, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef _PLAT_REMOTEPROC_H
> +#define _PLAT_REMOTEPROC_H
> +
> +#include <linux/remoteproc.h>
> +
> +/*
> + * struct omap_rproc_pdata - omap remoteproc's platform data
> + * @name: the remoteproc's name, cannot exceed RPROC_MAX_NAME bytes
> + * @iommu_name: iommu device we're behind of
> + * @oh_name: omap hwmod device
> + * @oh_name_opt: optional, secondary omap hwmod device
> + * @firmware: name of firmware file to load
> + * @ops: start/stop rproc handlers
> + * @memory_maps: table of da-to-pa iommu memory maps
> + */
> +struct omap_rproc_pdata {
> +	const char *name;
> +	const char *iommu_name;
> +	const char *oh_name;
> +	const char *oh_name_opt;
> +	const char *firmware;
> +	const struct rproc_ops *ops;
> +	const struct rproc_mem_entry *memory_maps;
> +};
> +
> +#endif /* _PLAT_REMOTEPROC_H */
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index a60bb12..88421bd 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -5,3 +5,24 @@
>  # REMOTE_PROC gets selected by whoever wants it
>  config REMOTE_PROC
>  	tristate
> +
> +# for tristate, either expose omap_device_* or pass it via pdata
> +config OMAP_REMOTE_PROC
> +	bool "OMAP remoteproc support"
> +	depends on ARCH_OMAP4
> +	select OMAP_IOMMU
> +	select REMOTE_PROC
> +	select OMAP_MBOX_FWK
> +	default y
> +	help
> +	  Say y here to support OMAP's remote processors (dual M3
> +	  and DSP on OMAP4) via the remote processor framework.
> +
> +	  Currently only supported on OMAP4.
> +
> +	  Usually you want to say y here, in order to enable multimedia
> +	  use-cases to run on your platform (multimedia codecs are
> +	  offloaded to remote DSP processors using this framework).
> +
> +	  It's safe to say n here if you're not interested in multimedia
> +	  offloading or just want a bare minimum kernel.
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index d0f60c7..0198b1d 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -3,3 +3,4 @@
>  #
>  
>  obj-$(CONFIG_REMOTE_PROC)		+= remoteproc.o
> +obj-$(CONFIG_OMAP_REMOTE_PROC)		+= omap_remoteproc.o
> diff --git a/drivers/remoteproc/omap_remoteproc.c b/drivers/remoteproc/omap_remoteproc.c
> new file mode 100644
> index 0000000..91f7f11
> --- /dev/null
> +++ b/drivers/remoteproc/omap_remoteproc.c
> @@ -0,0 +1,243 @@
> +/*
> + * Remote processor machine-specific module for OMAP4
> + *
> + * Copyright (C) 2011 Texas Instruments, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> +
> +#include <linux/kernel.h>
> +#include <linux/err.h>
> +#include <linux/platform_device.h>
> +#include <linux/iommu.h>
> +#include <linux/slab.h>
> +#include <linux/remoteproc.h>
> +
> +#include <plat/iommu.h>
> +#include <plat/omap_device.h>
> +#include <plat/remoteproc.h>
> +
> +struct omap_rproc_priv {
> +	struct iommu_domain *domain;
> +	struct device *iommu;
> +};
> +
> +/*
> + * Map a physically contiguous memory region using biggest pages possible.
> + * TODO: this code should go away; it belongs in the generic IOMMU layer
> + */
> +static int omap_rproc_map_unmap(struct iommu_domain *domain,
> +				const struct rproc_mem_entry *me, bool map)
> +{
> +	u32 all_bits;
> +	/* these are the page sizes supported by OMAP's IOMMU */
> +	u32 pg_size[] = {SZ_16M, SZ_1M, SZ_64K, SZ_4K};
> +	int i, ret, size = me->size;
> +	u32 da = me->da;
> +	u32 pa = me->pa;
> +	int order;
> +	int flags;
> +
> +	/* must be aligned at least with the smallest supported iommu page */
> +	if (!IS_ALIGNED(size, SZ_4K) || !IS_ALIGNED(da | pa, SZ_4K)) {
> +		pr_err("misaligned: size %x da 0x%x pa 0x%x\n", size, da, pa);
> +		return -EINVAL;
> +	}
> +
> +	while (size) {
> +		/* find the max page size with which both pa, da are aligned */
> +		all_bits = pa | da;
> +		for (i = 0; i < ARRAY_SIZE(pg_size); i++) {
> +			if ((size >= pg_size[i]) &&
> +				((all_bits & (pg_size[i] - 1)) == 0)) {
> +				break;
> +			}
> +		}
> +
> +		order = get_order(pg_size[i]);
> +
> +		/* OMAP4's M3 is little endian, so no need for conversions */
> +		flags = MMU_RAM_ENDIAN_LITTLE | MMU_RAM_ELSZ_NONE;
> +
> +		if (map)
> +			ret = iommu_map(domain, da, pa, order, flags);
> +		else
> +			ret = iommu_unmap(domain, da, order);
> +
> +		if (ret)
> +			return ret;
> +
> +		size -= pg_size[i];
> +		da += pg_size[i];
> +		pa += pg_size[i];
> +	}
> +
> +	return 0;
> +}
> +
> +/* bootaddr isn't needed for the dual M3's */
> +static inline int omap_rproc_start(struct rproc *rproc, u64 bootaddr)
> +{
> +	struct device *dev = rproc->dev;
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct omap_rproc_pdata *pdata = dev->platform_data;
> +	struct iommu_domain *domain;
> +	struct device *iommu;
> +	struct omap_rproc_priv *priv;
> +	int ret, i;
> +
> +	if (!iommu_found()) {
> +		dev_err(&pdev->dev, "iommu not found\n");
> +		return -ENODEV;
> +	}
> +
> +	priv = kmalloc(sizeof(*priv), GFP_KERNEL);
> +	if (!priv) {
> +		dev_err(&pdev->dev, "kzalloc failed\n");
> +		return -ENOMEM;
> +	}
> +
> +	/*
> +	 * Use the specified iommu name to find our iommu device.
> +	 * TODO: solve this generically so other platforms can use it, too
> +	 */
> +	iommu = omap_find_iommu_device(pdata->iommu_name);
> +	if (!iommu) {
> +		dev_err(dev, "omap_find_iommu_device failed\n");
> +		ret = -ENODEV;
> +		goto free_mem;
> +	}
> +
> +	domain = iommu_domain_alloc();
> +	if (!domain) {
> +		dev_err(&pdev->dev, "can't alloc iommu domain\n");
> +		ret = -ENODEV;
> +		goto free_mem;
> +	}
> +
> +	priv->iommu = iommu;
> +	priv->domain = domain;
> +	rproc->priv = priv;
> +
> +	ret = iommu_attach_device(domain, iommu);
> +	if (ret) {
> +		dev_err(&pdev->dev, "can't attach iommu device: %d\n", ret);
> +		goto free_domain;
> +	}
> +
> +	for (i = 0; rproc->memory_maps[i].size; i++) {
> +		const struct rproc_mem_entry *me = &rproc->memory_maps[i];
> +
> +		ret = omap_rproc_map_unmap(domain, me, true);
> +		if (ret) {
> +			dev_err(&pdev->dev, "iommu_map failed: %d\n", ret);
> +			goto unmap_regions;
> +		}
> +	}
> +
> +	/* power on the remote processor itself */
> +	ret = omap_device_enable(pdev);
> +	if (ret)
> +		goto unmap_regions;
> +
> +	return 0;
> +
> +unmap_regions:
> +	for (--i; i >= 0 && rproc->memory_maps[i].size; i--) {
> +		const struct rproc_mem_entry *me = &rproc->memory_maps[i];
> +		omap_rproc_map_unmap(domain, me, false);
> +	}
> +free_domain:
> +	iommu_domain_free(domain);
> +free_mem:
> +	kfree(priv);
> +	return ret;
> +}
> +
> +static inline int omap_rproc_stop(struct rproc *rproc)
> +{
> +	struct device *dev = rproc->dev;
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct omap_rproc_priv *priv = rproc->priv;
> +	struct iommu_domain *domain = priv->domain;
> +	struct device *iommu = priv->iommu;
> +	int ret, i;
> +
> +	/* power off the remote processor itself */
> +	ret = omap_device_shutdown(pdev);
> +	if (ret) {
> +		dev_err(dev, "failed to shutdown: %d\n", ret);
> +		goto out;
> +	}
> +
> +	for (i = 0; rproc->memory_maps[i].size; i++) {
> +		const struct rproc_mem_entry *me = &rproc->memory_maps[i];
> +
> +		ret = omap_rproc_map_unmap(domain, me, false);
> +		if (ret) {
> +			dev_err(&pdev->dev, "iommu_unmap failed: %d\n", ret);
> +			goto out;
> +		}
> +	}
> +
> +	iommu_detach_device(domain, iommu);
> +	iommu_domain_free(domain);
> +
> +out:
> +	return ret;
> +}
> +
> +static struct rproc_ops omap_rproc_ops = {
> +	.start = omap_rproc_start,
> +	.stop = omap_rproc_stop,
> +};
> +
> +static int omap_rproc_probe(struct platform_device *pdev)
> +{
> +	struct omap_rproc_pdata *pdata = pdev->dev.platform_data;
> +
> +	return rproc_register(&pdev->dev, pdata->name, &omap_rproc_ops,
> +				pdata->firmware, pdata->memory_maps,
> +				THIS_MODULE);

Very little for me to comment on here.  However, something I just
noticed.  Why is it necessary to pass in THIS_MODULE to the
rproc_register function?  Having a reference to the pdev gives you the
pointer to the driver, which has the THIS_MODULE value in it.  That
should be sufficient.

/me also isn't sure if incrementing the refcount on the module is the
best way to prevent the rproc from going away, but I haven't dug into
the details in the driver code to find out.  Drivers can get unbound
from devices without the driver being unloaded, so I imagine there is
an in-use count on the device itself that would prevent driver
unbinding.

> +}
> +
> +static int __devexit omap_rproc_remove(struct platform_device *pdev)
> +{
> +	struct omap_rproc_pdata *pdata = pdev->dev.platform_data;
> +
> +	return rproc_unregister(pdata->name);
> +}
> +
> +static struct platform_driver omap_rproc_driver = {
> +	.probe = omap_rproc_probe,
> +	.remove = __devexit_p(omap_rproc_remove),
> +	.driver = {
> +		.name = "omap-rproc",
> +		.owner = THIS_MODULE,
> +	},
> +};
> +
> +static int __init omap_rproc_init(void)
> +{
> +	return platform_driver_register(&omap_rproc_driver);
> +}
> +/* must be ready in time for device_initcall users */
> +subsys_initcall(omap_rproc_init);
> +
> +static void __exit omap_rproc_exit(void)
> +{
> +	platform_driver_unregister(&omap_rproc_driver);
> +}
> +module_exit(omap_rproc_exit);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("OMAP Remote Processor control driver");
> -- 
> 1.7.1
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 2/8] remoteproc: add omap implementation
  2011-06-27 21:00   ` Grant Likely
@ 2011-06-29 15:04     ` Ohad Ben-Cohen
  2011-06-29 15:31       ` Grant Likely
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-29 15:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 12:00 AM, Grant Likely
<grant.likely@secretlab.ca> wrote:
> Very little for me to comment on here. ?However, something I just
> noticed. ?Why is it necessary to pass in THIS_MODULE to the
> rproc_register function? ?Having a reference to the pdev gives you the
> pointer to the driver, which has the THIS_MODULE value in it. ?That
> should be sufficient.

Nice one, thanks !

> /me also isn't sure if incrementing the refcount on the module is the
> best way to prevent the rproc from going away, but I haven't dug into
> the details in the driver code to find out. ?Drivers can get unbound
> from devices without the driver being unloaded, so I imagine there is
> an in-use count on the device itself that would prevent driver
> unbinding.

Yes, increasing the module refcount is necessary to prevent the user
from removing the driver when the rproc is used.

If the underlying device goes away while rproc is used, then
rproc_unregister should return -EBUSY, which would fail the underlying
driver's ->remove() handler (gpiolib is doing something very similar).
I have forgotten to add this check, and will add it now.

Thanks !

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 2/8] remoteproc: add omap implementation
  2011-06-29 15:04     ` Ohad Ben-Cohen
@ 2011-06-29 15:31       ` Grant Likely
  0 siblings, 0 replies; 63+ messages in thread
From: Grant Likely @ 2011-06-29 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 29, 2011 at 9:04 AM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Tue, Jun 28, 2011 at 12:00 AM, Grant Likely
> <grant.likely@secretlab.ca> wrote:
>> Very little for me to comment on here. ?However, something I just
>> noticed. ?Why is it necessary to pass in THIS_MODULE to the
>> rproc_register function? ?Having a reference to the pdev gives you the
>> pointer to the driver, which has the THIS_MODULE value in it. ?That
>> should be sufficient.
>
> Nice one, thanks !
>
>> /me also isn't sure if incrementing the refcount on the module is the
>> best way to prevent the rproc from going away, but I haven't dug into
>> the details in the driver code to find out. ?Drivers can get unbound
>> from devices without the driver being unloaded, so I imagine there is
>> an in-use count on the device itself that would prevent driver
>> unbinding.
>
> Yes, increasing the module refcount is necessary to prevent the user
> from removing the driver when the rproc is used.

That prevents removing the module which necessitates unbinding the
device.  However, I believe it is possible to unbind a driver
/without/ the module being unloaded.  My question (for which I don't
have an answer) is whether or not there is a way to increment a
refcount on users of the driver bound to the device..

g.

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <1308640714-17961-8-git-send-email-ohad@wizery.com>]

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
       [not found] ` <1308640714-17961-8-git-send-email-ohad@wizery.com>
@ 2011-06-22  2:42   ` Rusty Russell
  2011-06-22  3:11     ` Sasha Levin
  2011-06-22 10:46     ` Ohad Ben-Cohen
  2011-06-27 22:21   ` Grant Likely
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 63+ messages in thread
From: Rusty Russell @ 2011-06-22  2:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 21 Jun 2011 10:18:33 +0300, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> Add a virtio-based IPC bus, which enables kernel users to communicate
> with remote processors over shared memory using a simple messaging
> protocol.

Wow, sometimes one writes a standard and people use it.  Thanks!

I'm still digesting exactly what you're doing, but this is a bit
unconventional:

> +	/* Platform must supply pre-allocated uncached buffers for now */
> +	vdev->config->get(vdev, VPROC_BUF_ADDR, &addr, sizeof(addr));
> +	vdev->config->get(vdev, VPROC_BUF_NUM, &num_bufs, sizeof(num_bufs));
> +	vdev->config->get(vdev, VPROC_BUF_SZ, &buf_size, sizeof(buf_size));
> +	vdev->config->get(vdev, VPROC_BUF_PADDR, &vrp->phys_base,
> +						sizeof(vrp->phys_base));

The normal way is to think of the config space as a structure, and use
offsets rather than using an enum value to distinguish the fields.

> +#define RPMSG_NAME_SIZE			32
> +#define RPMSG_DEVICE_MODALIAS_FMT	"rpmsg:%s"
> +
> +struct rpmsg_device_id {
> +	char name[RPMSG_NAME_SIZE];
> +	kernel_ulong_t driver_data	/* Data private to the driver */
> +			__attribute__((aligned(sizeof(kernel_ulong_t))));
> +};

This alignment directive seems overkill...

> +#define VIRTIO_ID_RPMSG		10 /* virtio remote processor messaging */

I think you want 6.  Plan 9 jumped ahead to grab 9 :)

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-22  2:42   ` [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus Rusty Russell
@ 2011-06-22  3:11     ` Sasha Levin
  2011-06-22 10:46     ` Ohad Ben-Cohen
  1 sibling, 0 replies; 63+ messages in thread
From: Sasha Levin @ 2011-06-22  3:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2011-06-22 at 12:12 +0930, Rusty Russell wrote:
> > +#define VIRTIO_ID_RPMSG		10 /* virtio remote processor messaging */
> 
> I think you want 6.  Plan 9 jumped ahead to grab 9 :)

Maybe it's worth adding an appendix to the virtio spec as part of this
(and really any) patch series which takes a virtio ID.

Also, I understand that virtio itself as an idea isn't Linux specific,
but maybe it's worth including the spec (or at least a variation of it)
as part of the kernel documentation.

-- 

Sasha.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-22  2:42   ` [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus Rusty Russell
  2011-06-22  3:11     ` Sasha Levin
@ 2011-06-22 10:46     ` Ohad Ben-Cohen
  2011-09-14 18:38       ` Ohad Ben-Cohen
  1 sibling, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-22 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 22, 2011 at 5:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> On Tue, 21 Jun 2011 10:18:33 +0300, Ohad Ben-Cohen <ohad@wizery.com> wrote:
>> Add a virtio-based IPC bus, which enables kernel users to communicate
>> with remote processors over shared memory using a simple messaging
>> protocol.
>
> Wow, sometimes one writes a standard and people use it. ?Thanks!

And we really liked it: virtio_rpmsg_bus.c, the virtio driver which
does most of the magic here ended up pretty small thanks to virtio.
and the performance numbers are really good, too.

>> + ? ? /* Platform must supply pre-allocated uncached buffers for now */
>> + ? ? vdev->config->get(vdev, VPROC_BUF_ADDR, &addr, sizeof(addr));
>> + ? ? vdev->config->get(vdev, VPROC_BUF_NUM, &num_bufs, sizeof(num_bufs));
>> + ? ? vdev->config->get(vdev, VPROC_BUF_SZ, &buf_size, sizeof(buf_size));
>> + ? ? vdev->config->get(vdev, VPROC_BUF_PADDR, &vrp->phys_base,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sizeof(vrp->phys_base));
>
> The normal way is to think of the config space as a structure, and use
> offsets rather than using an enum value to distinguish the fields.

Yes, I was (mis-)using the config space for now to talk with
platform-specific code (on the host), and not with the peer, so I
opted for simplicity.

But this is definitely one thing that is going away: I don't see any
reason why not just use dma_alloc_coherent (or even dma_pool_create)
directly from the driver here in order to get those buffers.

>> +#define RPMSG_NAME_SIZE ? ? ? ? ? ? ? ? ? ? ?32
>> +#define RPMSG_DEVICE_MODALIAS_FMT ? ?"rpmsg:%s"
>> +
>> +struct rpmsg_device_id {
>> + ? ? char name[RPMSG_NAME_SIZE];
>> + ? ? kernel_ulong_t driver_data ? ? ?/* Data private to the driver */
>> + ? ? ? ? ? ? ? ? ? ? __attribute__((aligned(sizeof(kernel_ulong_t))));
>> +};
>
> This alignment directive seems overkill...

Yes, looks like I can remove this. thanks.

>> +#define VIRTIO_ID_RPMSG ? ? ? ? ? ? ?10 /* virtio remote processor messaging */
>
> I think you want 6. ?Plan 9 jumped ahead to grab 9 :)

6 it is :)

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-22 10:46     ` Ohad Ben-Cohen
@ 2011-09-14 18:38       ` Ohad Ben-Cohen
  2011-09-15  0:12         ` Rusty Russell
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-09-14 18:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Rusty,

On Wed, Jun 22, 2011 at 1:46 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Wed, Jun 22, 2011 at 5:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On Tue, 21 Jun 2011 10:18:33 +0300, Ohad Ben-Cohen <ohad@wizery.com> wrote:
>>> +#define VIRTIO_ID_RPMSG ? ? ? ? ? ? ?10 /* virtio remote processor messaging */
>>
>> I think you want 6. ?Plan 9 jumped ahead to grab 9 :)
>
> 6 it is :)

Looking at the virtio-spec, it seems that 6 is taken by 'ioMemory'.
There's no indication for it in virtio_ids.h though, and the
virtio-spec has no appendix for this id, so maybe it's available after
all ?

If it is available, I'll take it for rpmsg, and will also update
virtio-spec appropriately. Otherwise, we'll settle on 7 ? :)

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-09-14 18:38       ` Ohad Ben-Cohen
@ 2011-09-15  0:12         ` Rusty Russell
  2011-09-15 14:56           ` Ohad Ben-Cohen
  0 siblings, 1 reply; 63+ messages in thread
From: Rusty Russell @ 2011-09-15  0:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 14 Sep 2011 21:38:43 +0300, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> Hi Rusty,
> 
> On Wed, Jun 22, 2011 at 1:46 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> > On Wed, Jun 22, 2011 at 5:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> >> On Tue, 21 Jun 2011 10:18:33 +0300, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> >>> +#define VIRTIO_ID_RPMSG ? ? ? ? ? ? ?10 /* virtio remote processor messaging */
> >>
> >> I think you want 6. ?Plan 9 jumped ahead to grab 9 :)
> >
> > 6 it is :)
> 
> Looking at the virtio-spec, it seems that 6 is taken by 'ioMemory'.
> There's no indication for it in virtio_ids.h though, and the
> virtio-spec has no appendix for this id, so maybe it's available after
> all ?
> 
> If it is available, I'll take it for rpmsg, and will also update
> virtio-spec appropriately. Otherwise, we'll settle on 7 ? :)

7... numbers are cheap :)

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-09-15  0:12         ` Rusty Russell
@ 2011-09-15 14:56           ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-09-15 14:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 15, 2011 at 3:12 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> 7... numbers are cheap :)

7 it is :)

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
       [not found] ` <1308640714-17961-8-git-send-email-ohad@wizery.com>
  2011-06-22  2:42   ` [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus Rusty Russell
@ 2011-06-27 22:21   ` Grant Likely
  2011-06-28 22:46     ` Ohad Ben-Cohen
  2011-06-28 23:44   ` Randy Dunlap
  2011-07-18 22:07   ` Pavel Machek
  3 siblings, 1 reply; 63+ messages in thread
From: Grant Likely @ 2011-06-27 22:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 21, 2011 at 10:18:33AM +0300, Ohad Ben-Cohen wrote:
> Add a virtio-based IPC bus, which enables kernel users to communicate
> with remote processors over shared memory using a simple messaging
> protocol.
> 
> Assign a local address for every local endpoint that is created,
> and bind it to the user's callback. Invoke that callback when the
> destination of an inbound message is the user's address.
> 
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>

Hey Ohad,

Nice patch.  I'm quite thrilled to see this implemented.  Some
comments below, but otherwise I think it looks pretty good.


> ---
>  Documentation/ABI/testing/sysfs-bus-rpmsg |   75 +++
>  Documentation/rpmsg.txt                   |  308 +++++++++
>  drivers/Kconfig                           |    1 +
>  drivers/Makefile                          |    1 +
>  drivers/rpmsg/Kconfig                     |   14 +
>  drivers/rpmsg/Makefile                    |    1 +
>  drivers/rpmsg/virtio_rpmsg_bus.c          | 1036 +++++++++++++++++++++++++++++
>  include/linux/mod_devicetable.h           |   10 +
>  include/linux/rpmsg.h                     |  421 ++++++++++++
>  include/linux/virtio_ids.h                |    1 +
>  10 files changed, 1868 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-rpmsg
>  create mode 100644 Documentation/rpmsg.txt
>  create mode 100644 drivers/rpmsg/Kconfig
>  create mode 100644 drivers/rpmsg/Makefile
>  create mode 100644 drivers/rpmsg/virtio_rpmsg_bus.c
>  create mode 100644 include/linux/rpmsg.h
> 
> diff --git a/Documentation/rpmsg.txt b/Documentation/rpmsg.txt
> new file mode 100644
> index 0000000..0a7c820
> --- /dev/null
> +++ b/Documentation/rpmsg.txt
[...]

Great documentation!  Thanks!

> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 1f6d6d3..840f835 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -128,4 +128,5 @@ source "drivers/clocksource/Kconfig"
>  
>  source "drivers/remoteproc/Kconfig"
>  
> +source "drivers/rpmsg/Kconfig"
>  endmenu
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 4d53a18..2980a15 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -22,6 +22,7 @@ obj-$(CONFIG_ARM_AMBA)		+= amba/
>  obj-$(CONFIG_DMA_ENGINE)	+= dma/
>  
>  obj-$(CONFIG_VIRTIO)		+= virtio/
> +obj-$(CONFIG_RPMSG)		+= rpmsg/
>  obj-$(CONFIG_XEN)		+= xen/
>  
>  # regulators early, since some subsystems rely on them to initialize
> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> new file mode 100644
> index 0000000..41303f5
> --- /dev/null
> +++ b/drivers/rpmsg/Kconfig
> @@ -0,0 +1,14 @@
> +# RPMSG always gets selected by whoever wants it
> +config RPMSG
> +	tristate
> +	select VIRTIO
> +	select VIRTIO_RING
> +
> +if RPMSG
> +
> +# OK, it's a little counter-intuitive to do this, but it puts it neatly under
> +# the rpmsg menu (and it's the approach preferred by the virtio folks).
> +
> +source "drivers/virtio/Kconfig"

What happens when kvm and rpmsg both get enabled on the same kernel.
ARM virtualization is currently being worked on, so it will happen.
Also, I can see this finding use in the x86 world to talk to
coprocessor boards (like the Xilinx FPGA plugin board which can have a
soft core on it).

> +
> +endif # RPMSG
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> new file mode 100644
> index 0000000..7617fcb
> --- /dev/null
> +++ b/drivers/rpmsg/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_RPMSG)	+= virtio_rpmsg_bus.o
> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
> new file mode 100644
> index 0000000..3e98b02
> --- /dev/null
> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
[...]
> +/* rpmsg devices and drivers are matched using the service name */
> +static inline int rpmsg_id_match(const struct rpmsg_channel *rpdev,
> +				  const struct rpmsg_device_id *id)
> +{
> +	if (strncmp(id->name, rpdev->id.name, RPMSG_NAME_SIZE))
> +		return 0;
> +
> +	return 1;
> +}

or simply: 'return strncmp(id->name, rpdev->id.name, RPMSG_NAME_SIZE) == 0;'

:-)

> +/* for more info, see below documentation of rpmsg_create_ept() */
> +static struct rpmsg_endpoint *__rpmsg_create_ept(struct virtproc_info *vrp,
> +		struct rpmsg_channel *rpdev,
> +		void (*cb)(struct rpmsg_channel *, void *, int, void *, u32),
> +		void *priv, u32 addr)
> +{
> +	int err, tmpaddr, request;
> +	struct rpmsg_endpoint *ept;
> +	struct device *dev = rpdev ? &rpdev->dev : &vrp->vdev->dev;
> +
> +	if (!idr_pre_get(&vrp->endpoints, GFP_KERNEL))
> +		return NULL;
> +
> +	ept = kzalloc(sizeof(*ept), GFP_KERNEL);
> +	if (!ept) {
> +		dev_err(dev, "failed to kzalloc a new ept\n");
> +		return NULL;
> +	}
> +
> +	ept->rpdev = rpdev;
> +	ept->cb = cb;
> +	ept->priv = priv;
> +
> +	/* do we need to allocate a local address ? */
> +	request = addr == RPMSG_ADDR_ANY ? RPMSG_RESERVED_ADDRESSES : addr;
> +
> +	spin_lock(&vrp->endpoints_lock);

Is a spin_lock the right choice for endpoints, or should it be a
mutex (do the endpoints operations need to be atomic)?

> +/*
> + * find an existing channel using its name + address properties,
> + * and destroy it
> + */
> +static int rpmsg_destroy_channel(struct virtproc_info *vrp,
> +					struct rpmsg_channel_info *chinfo)
> +{
> +	struct virtio_device *vdev = vrp->vdev;
> +	struct device *dev;
> +
> +	dev = device_find_child(&vdev->dev, chinfo, rpmsg_channel_match);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	device_unregister(dev);
> +
> +	put_device(dev);
> +
> +	kfree(to_rpmsg_channel(dev));

At put_device time, it is conceivable that the dev pointer is no
longer valid.  You'll need to get do the to_rpmsg_channel() before
putting the dev.

> +
> +	return 0;
> +}
> +
> +/* super simple buffer "allocator" that is just enough for now */
> +static void *get_a_tx_buf(struct virtproc_info *vrp)
> +{
> +	unsigned int len;
> +	void *ret;
> +
> +	/* support multiple concurrent senders */
> +	spin_lock(&vrp->tx_lock);
> +
> +	/*
> +	 * either pick the next unused tx buffer
> +	 * (half of our buffers are used for sending messages)
> +	 */
> +	if (vrp->last_sbuf < vrp->num_bufs / 2)
> +		ret = vrp->sbufs + vrp->buf_size * vrp->last_sbuf++;
> +	/* or recycle a used one */
> +	else
> +		ret = virtqueue_get_buf(vrp->svq, &len);
> +
> +	spin_unlock(&vrp->tx_lock);
> +
> +	return ret;
> +}
> +
> +/**
> + * rpmsg_upref_sleepers() - enable "tx-complete" interrupts, if needed
> + * @vrp: virtual remote processor state
> + *
> + * This function is called before a sender is blocked, waiting for
> + * a tx buffer to become available.
> + *
> + * If we already have blocking senders, this function merely increases
> + * the "sleepers" reference count, and exits.
> + *
> + * Otherwise, if this is the first sender to block, we also enable
> + * virtio's tx callbacks, so we'd be immediately notified when a tx
> + * buffer is consumed (we rely on virtio's tx callback in order
> + * to wake up sleeping senders as soon as a tx buffer is used by the
> + * remote processor).
> + */
> +static void rpmsg_upref_sleepers(struct virtproc_info *vrp)
> +{
> +	/* support multiple concurrent senders */
> +	spin_lock(&vrp->tx_lock);
> +
> +	/* are we the first sleeping context waiting for tx buffers ? */
> +	if (!vrp->sleepers++)

Maybe use a kref?  It might be useful to have a kref_get_first() that
takes a callback used for the first increment.

> +static int rpmsg_remove_device(struct device *dev, void *data)
> +{
> +	struct rpmsg_channel *rpdev = to_rpmsg_channel(dev);
> +
> +	device_unregister(dev);
> +
> +	kfree(rpdev);

put_device() I think.

> +
> +	return 0;
> +}
> +
> +static void __devexit rpmsg_remove(struct virtio_device *vdev)
> +{
> +	struct virtproc_info *vrp = vdev->priv;
> +	int ret;
> +
> +	ret = device_for_each_child(&vdev->dev, NULL, rpmsg_remove_device);
> +	if (ret)
> +		dev_warn(&vdev->dev, "can't remove rpmsg device: %d\n", ret);
> +
> +	idr_remove_all(&vrp->endpoints);
> +	idr_destroy(&vrp->endpoints);
> +
> +	vdev->config->del_vqs(vrp->vdev);
> +
> +	kfree(vrp);
> +}
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_RPMSG, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static unsigned int features[] = {
> +	VIRTIO_RPMSG_F_NS,
> +};
> +
> +static struct virtio_driver virtio_ipc_driver = {
> +	.feature_table	= features,
> +	.feature_table_size = ARRAY_SIZE(features),
> +	.driver.name	= KBUILD_MODNAME,
> +	.driver.owner	= THIS_MODULE,
> +	.id_table	= id_table,
> +	.probe		= rpmsg_probe,
> +	.remove		= __devexit_p(rpmsg_remove),
> +};
> +
> +static int __init init(void)

Even for static functions, it's a really good idea to prefix all
function names and file scoped symbol with a common prefix like
"rpmsg_".  Doing so avoids even the outside chance of a namespace
conflict.

> +{
> +	int ret;
> +
> +	ret = bus_register(&rpmsg_bus);
> +	if (ret) {
> +		pr_err("failed to register rpmsg bus: %d\n", ret);
> +		return ret;
> +	}
> +
> +	return register_virtio_driver(&virtio_ipc_driver);

And if register_virtio_driver fails?

> +}
> +module_init(init);
> +
> +static void __exit fini(void)
> +{
> +	unregister_virtio_driver(&virtio_ipc_driver);
> +	bus_unregister(&rpmsg_bus);
> +}
> +module_exit(fini);
> +
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +MODULE_DESCRIPTION("Virtio-based remote processor messaging bus");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
> index ae28e93..561567e 100644
> --- a/include/linux/mod_devicetable.h
> +++ b/include/linux/mod_devicetable.h
> @@ -533,4 +533,14 @@ struct isapnp_device_id {
>  	kernel_ulong_t driver_data;	/* data private to the driver */
>  };
>  
> +/* rpmsg */
> +
> +#define RPMSG_NAME_SIZE			32
> +#define RPMSG_DEVICE_MODALIAS_FMT	"rpmsg:%s"
> +
> +struct rpmsg_device_id {
> +	char name[RPMSG_NAME_SIZE];
> +	kernel_ulong_t driver_data	/* Data private to the driver */
> +			__attribute__((aligned(sizeof(kernel_ulong_t))));
> +};

Should this be co-located with vio_device_id?

It makes it easier to dereference in kernel code if you do:

#ifdef __KERNEL__
	void *data;
#else
	kernel_ulong_t data;
#endif

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-27 22:21   ` Grant Likely
@ 2011-06-28 22:46     ` Ohad Ben-Cohen
  2011-06-28 22:51       ` Grant Likely
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-28 22:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Grant,

On Tue, Jun 28, 2011 at 1:21 AM, Grant Likely <grant.likely@secretlab.ca> wrote:
> Nice patch. ?I'm quite thrilled to see this implemented. ?Some
> comments below, but otherwise I think it looks pretty good.

Thanks !

>> +source "drivers/virtio/Kconfig"
>
> What happens when kvm and rpmsg both get enabled on the same kernel.

Sounds to me that eventually we'll have to source virtio's Kconfig in
drivers/Kconfig, and remove all the other locations it is sourced at
(currently it is directly sourced by several arch Kconfigs when
VIRTUALIZATION is selected).

>> + ? ? if (strncmp(id->name, rpdev->id.name, RPMSG_NAME_SIZE))
>> + ? ? ? ? ? ? return 0;
>> +
>> + ? ? return 1;
>> +}
>
> or simply: 'return strncmp(id->name, rpdev->id.name, RPMSG_NAME_SIZE) == 0;'
>
> :-)

that works too ;)

>> + ? ? spin_lock(&vrp->endpoints_lock);
>
> Is a spin_lock the right choice for endpoints, or should it be a
> mutex (do the endpoints operations need to be atomic)?

Today it can be a mutex: it protects idr operations invoked either by
users (creating new endpoints, definitely not requiring atomicity) or
by platform code indicating that a new message has arrived (today it's
omap's mailbox driver, which calls from a process context).

I can change that, and if someone requires atomic operations later on,
move to spinlocks again.

But it probably won't matter too much as there's no contention on that
lock today (notifications coming from omap's mailbox are serialized,
and users creating new endpoints show up very infrequently).

> At put_device time, it is conceivable that the dev pointer is no
> longer valid. ?You'll need to get do the to_rpmsg_channel() before
> putting the dev.

sure.

>> +static void rpmsg_upref_sleepers(struct virtproc_info *vrp)
>> +{
>> + ? ? /* support multiple concurrent senders */
>> + ? ? spin_lock(&vrp->tx_lock);
>> +
>> + ? ? /* are we the first sleeping context waiting for tx buffers ? */
>> + ? ? if (!vrp->sleepers++)
>
> Maybe use a kref?

I can change it to an atomic variable, but I don't think kref's memory
barriers are needed here: we already have memory barriers induced by
the spinlock.

>> +static int rpmsg_remove_device(struct device *dev, void *data)
>> +{
>> + ? ? struct rpmsg_channel *rpdev = to_rpmsg_channel(dev);
>> +
>> + ? ? device_unregister(dev);
>> +
>> + ? ? kfree(rpdev);
>
> put_device() I think.

Don't think so, we get the device handle from device_for_each_child
here, which doesn't call get_device (unlike device_find_child).

>> +static int __init init(void)
>
> Even for static functions, it's a really good idea to prefix all
> function names and file scoped symbol with a common prefix like
> "rpmsg_". ?Doing so avoids even the outside chance of a namespace
> conflict.

Sure thing.

>> + ? ? ret = bus_register(&rpmsg_bus);
>> + ? ? if (ret) {
>> + ? ? ? ? ? ? pr_err("failed to register rpmsg bus: %d\n", ret);
>> + ? ? ? ? ? ? return ret;
>> + ? ? }
>> +
>> + ? ? return register_virtio_driver(&virtio_ipc_driver);
>
> And if register_virtio_driver fails?

I'll handle that, thanks.

>> +struct rpmsg_device_id {
>> + ? ? char name[RPMSG_NAME_SIZE];
>> + ? ? kernel_ulong_t driver_data ? ? ?/* Data private to the driver */
>> + ? ? ? ? ? ? ? ? ? ? __attribute__((aligned(sizeof(kernel_ulong_t))));
>> +};
>
> Should this be co-located with vio_device_id?

Sure, can't hurt (I assume you meant virtio_device_id here).

> It makes it easier to dereference in kernel code if you do:
>
> #ifdef __KERNEL__
> ? ? ? ?void *data;
> #else
> ? ? ? ?kernel_ulong_t data;
> #endif

Sure thing.

Thanks!
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-28 22:46     ` Ohad Ben-Cohen
@ 2011-06-28 22:51       ` Grant Likely
  2011-06-28 23:00         ` Ohad Ben-Cohen
  0 siblings, 1 reply; 63+ messages in thread
From: Grant Likely @ 2011-06-28 22:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 4:46 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> Hi Grant,
>
> On Tue, Jun 28, 2011 at 1:21 AM, Grant Likely <grant.likely@secretlab.ca> wrote:
>>> +static int rpmsg_remove_device(struct device *dev, void *data)
>>> +{
>>> + ? ? struct rpmsg_channel *rpdev = to_rpmsg_channel(dev);
>>> +
>>> + ? ? device_unregister(dev);
>>> +
>>> + ? ? kfree(rpdev);
>>
>> put_device() I think.
>
> Don't think so, we get the device handle from device_for_each_child
> here, which doesn't call get_device (unlike device_find_child).

It's not the device_for_each_child() that you're 'putting' back from
here.  Its the original kref initialization when the device was
created.  Once a device is initialized, it must never be directly
kfree()'d.

g.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-28 22:51       ` Grant Likely
@ 2011-06-28 23:00         ` Ohad Ben-Cohen
  2011-06-29 15:43           ` Grant Likely
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-28 23:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 29, 2011 at 1:51 AM, Grant Likely <grant.likely@secretlab.ca> wrote:
> It's not the device_for_each_child() that you're 'putting' back from
> here. ?Its the original kref initialization when the device was
> created.

device_unregister() is already calling put_device(), doesn't that deal
with the original kref init for us ?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-28 23:00         ` Ohad Ben-Cohen
@ 2011-06-29 15:43           ` Grant Likely
  2011-07-01 15:13             ` Ohad Ben-Cohen
  0 siblings, 1 reply; 63+ messages in thread
From: Grant Likely @ 2011-06-29 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 28, 2011 at 5:00 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Wed, Jun 29, 2011 at 1:51 AM, Grant Likely <grant.likely@secretlab.ca> wrote:
>> It's not the device_for_each_child() that you're 'putting' back from
>> here. ?Its the original kref initialization when the device was
>> created.
>
> device_unregister() is already calling put_device(), doesn't that deal
> with the original kref init for us ?

/me digs deeper:

device_register() has 2 parts; device_initialize() and device_add()
device_init() initialized the kref to 1 (via kobject_init()
device_add() calls get_device() to increment it to 2

Then similarly for device_unregister():
device_del() calls put_device() to decrement the kref to 1
a final put_device() call decrements the kref to 0 - which triggers a
call to the release method that kfrees the object.

So you are right that device_unregister drops the refcount to zero,
but the code still needs to be fixed to not call kfree() directly.

It also looks like rpmsg_destroy_channel() needs to be fixed to remove
the kfree call and an extra put_device() call.  This is important
because the last put_device() call above might /not/ decrement the
refcount to zero is for some reason something still holds a reference
to the device.  But the device will still get freed correctly when the
other holder finally calls device_put().

g.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-29 15:43           ` Grant Likely
@ 2011-07-01 15:13             ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-07-01 15:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 29, 2011 at 6:43 PM, Grant Likely <grant.likely@secretlab.ca> wrote:
> So you are right that device_unregister drops the refcount to zero,
> but the code still needs to be fixed to not call kfree() directly.

Good point, thanks !

> It also looks like rpmsg_destroy_channel() needs to be fixed to remove
> the kfree call

Yes, I need to remove this direct kfree as well, and indeed just let
rpmsg_release_device do its thing when the last reference is dropped.

I should also remove the direct kfree when device_register fails: in
that case, I need only to put_device and let the release handler do
its thing too.

> and an extra put_device() call.

We need that extra put_device in rpmsg_destroy_channel because
device_find_child() is doing get_device before returning it.

Thanks, Grant!

Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
       [not found] ` <1308640714-17961-8-git-send-email-ohad@wizery.com>
  2011-06-22  2:42   ` [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus Rusty Russell
  2011-06-27 22:21   ` Grant Likely
@ 2011-06-28 23:44   ` Randy Dunlap
  2011-06-29  6:30     ` Ohad Ben-Cohen
  2011-07-18 22:07   ` Pavel Machek
  3 siblings, 1 reply; 63+ messages in thread
From: Randy Dunlap @ 2011-06-28 23:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 21 Jun 2011 10:18:33 +0300 Ohad Ben-Cohen wrote:

> Add a virtio-based IPC bus, which enables kernel users to communicate
> with remote processors over shared memory using a simple messaging
> protocol.
> 
> Assign a local address for every local endpoint that is created,
> and bind it to the user's callback. Invoke that callback when the
> destination of an inbound message is the user's address.
> 
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-rpmsg |   75 +++
>  Documentation/rpmsg.txt                   |  308 +++++++++
>  drivers/Kconfig                           |    1 +
>  drivers/Makefile                          |    1 +
>  drivers/rpmsg/Kconfig                     |   14 +
>  drivers/rpmsg/Makefile                    |    1 +
>  drivers/rpmsg/virtio_rpmsg_bus.c          | 1036 +++++++++++++++++++++++++++++
>  include/linux/mod_devicetable.h           |   10 +
>  include/linux/rpmsg.h                     |  421 ++++++++++++
>  include/linux/virtio_ids.h                |    1 +
>  10 files changed, 1868 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-rpmsg
>  create mode 100644 Documentation/rpmsg.txt
>  create mode 100644 drivers/rpmsg/Kconfig
>  create mode 100644 drivers/rpmsg/Makefile
>  create mode 100644 drivers/rpmsg/virtio_rpmsg_bus.c
>  create mode 100644 include/linux/rpmsg.h


> diff --git a/Documentation/rpmsg.txt b/Documentation/rpmsg.txt
> new file mode 100644
> index 0000000..0a7c820
> --- /dev/null
> +++ b/Documentation/rpmsg.txt
> @@ -0,0 +1,308 @@
> +Remote Processor Messaging (rpmsg) Framework
> +
> +1. Introduction
> +
> +Modern SoCs typically employ heterogeneous remote processor devices in
> +asymmetric multiprocessing (AMP) configurations, which may be running
> +different instances of operating system, whether it's Linux or any other
> +flavor of real-time OS.
> +
> +OMAP4, for example, has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP.
> +Typically, the dual cortex-A9 is running Linux in a SMP configuration,
> +and each of the other three cores (two M3 cores and a DSP) is running
> +its own instance of RTOS in an AMP configuration.
> +
> +Typically AMP remote processors employ dedicated DSP codecs and multimedia
> +hardware accelerators, and therefore are often used to offload cpu-intensive

prefer:                                                           CPU-
throughout.

> +multimedia tasks from the main application processor.
> +
> +These remote processors could also be used to control latency-sensitive
> +sensors, drive random hardware blocks, or just perform background tasks
> +while the main CPU is idling.
> +
> +Users of those remote processors can either be userland apps (e.g. multimedia
> +frameworks talking with remote OMX components) or kernel drivers (controlling
> +hardware accessible only by the remote processor, reserving kernel-controlled
> +resources on behalf of the remote processor, etc..).
> +
> +Rpmsg is a virtio-based messaging bus that allows kernel drivers to communicate
> +with remote processors available on the system. In turn, drivers could then
> +expose appropriate user space interfaces, if needed.
> +
> +When writing a driver that exposes rpmsg communication to userland, please
> +keep in mind that remote processors might have direct access to the
> +system's physical memory and/or other sensitive hardware resources (e.g. on
> +OMAP4, remote cores (/hardware accelerators) may have direct access to the
> +physical memory, gpio banks, dma controllers, i2c bus, gptimers, mailbox
> +devices, hwspinlocks, etc..). Moreover, those remote processors might be
> +running RTOS where every task can access the entire memory/devices exposed
> +to the processor. To minimize the risks of rogue (/buggy) userland code

What is with the leading / here and above (/hardware) and below?
Looks like they can all be dropped.

> +exploiting (/triggering) remote bugs, and by that taking over (/down) the
> +system, it is often desired to limit userland to specific rpmsg channels (see
> +definition below) it is allowed to send messages on, and if possible/relevant,
> +minimize the amount of control it has over the content of the messages.
> +
> +Every rpmsg device is a communication channel with a remote processor (thus
> +rpmsg devices are called channels). Channels are identified by a textual name
> +and have a local ("source") rpmsg address, and remote ("destination") rpmsg
> +address.
> +
> +When a driver starts listening on a channel, it binds it with a unique
> +rpmsg src address (a 32 bits integer). This way when inbound messages arrive

                     (a 32-bit integer).

> +to this src address, the rpmsg core dispatches them to that driver (by invoking
> +the driver's rx handler with the payload of the incoming message).
> +
> +
> +2. User API
> +
> +  int rpmsg_send(struct rpmsg_channel *rpdev, void *data, int len);
> +   - sends a message across to the remote processor on a given channel.
> +     The caller should specify the channel, the data it wants to send,
> +     and its length (in bytes). The message will be sent on the specified
> +     channel, i.e. its source and destination address fields will be
> +     set to the channel's src and dst addresses.
> +
> +     In case there are no TX buffers available, the function will block until
> +     one becomes available (i.e. until the remote processor will consume

prefer:                             until the remote processor consumes
                    and puts it back on

> +     a tx buffer and put it back on virtio's used descriptor ring),
> +     or a timeout of 15 seconds elapses. When the latter happens,
> +     -ERESTARTSYS is returned.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  int rpmsg_sendto(struct rpmsg_channel *rpdev, void *data, int len, u32 dst);
> +   - sends a message across to the remote processor on a given channel,
> +     to a destination address provided by the caller.
> +     The caller should specify the channel, the data it wants to send,
> +     its length (in bytes), and an explicit destination address.
> +     The message will then be sent to the remote processor to which the
> +     channel belongs to, using the channel's src address, and the user-provided

        channel belongs,

> +     dst address (thus the channel's dst address will be ignored).
> +
> +     In case there are no TX buffers available, the function will block until
> +     one becomes available (i.e. until the remote processor will consume

                                    until the remote processor consumes
        a tx buffer and puts it back on

> +     a tx buffer and put it back on virtio's used descriptor ring),
> +     or a timeout of 15 seconds elapses. When the latter happens,
> +     -ERESTARTSYS is returned.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  int rpmsg_send_offchannel(struct rpmsg_channel *rpdev, u32 src, u32 dst,
> +							void *data, int len);
> +   - sends a message across to the remote processor, using the src and dst
> +     addresses provided by the user.
> +     The caller should specify the channel, the data it wants to send,
> +     its length (in bytes), and explicit source and destination addresses.
> +     The message will then be sent to the remote processor to which the
> +     channel belongs to, but the channel's src and dst addresses will be

        channel belongs,

> +     ignored (and the user-provided addresses will be used instead).
> +
> +     In case there are no TX buffers available, the function will block until
> +     one becomes available (i.e. until the remote processor will consume

                                    until the remote processor consumes
        a tx buffer and puts it back on

> +     a tx buffer and put it back on virtio's used descriptor ring),
> +     or a timeout of 15 seconds elapses. When the latter happens,
> +     -ERESTARTSYS is returned.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  int rpmsg_trysend(struct rpmsg_channel *rpdev, void *data, int len);
> +   - sends a message across to the remote processor on a given channel.
> +     The caller should specify the channel, the data it wants to send,
> +     and its length (in bytes). The message will be sent on the specified
> +     channel, i.e. its source and destination address fields will be
> +     set to the channel's src and dst addresses.
> +
> +     In case there are no TX buffers available, the function will immediately
> +     return -ENOMEM without waiting until one becomes available.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  int rpmsg_trysendto(struct rpmsg_channel *rpdev, void *data, int len, u32 dst)
> +   - sends a message across to the remote processor on a given channel,
> +     to a destination address provided by the user.
> +     The user should specify the channel, the data it wants to send,
> +     its length (in bytes), and an explicit destination address.
> +     The message will then be sent to the remote processor to which the
> +     channel belongs to, using the channel's src address, and the user-provided

        channel belongs,

> +     dst address (thus the channel's dst address will be ignored).
> +
> +     In case there are no TX buffers available, the function will immediately
> +     return -ENOMEM without waiting until one becomes available.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  int rpmsg_trysend_offchannel(struct rpmsg_channel *rpdev, u32 src, u32 dst,
> +							void *data, int len);
> +   - sends a message across to the remote processor, using source and
> +     destination addresses provided by the user.
> +     The user should specify the channel, the data it wants to send,
> +     its length (in bytes), and explicit source and destination addresses.
> +     The message will then be sent to the remote processor to which the
> +     channel belongs to, but the channel's src and dst addresses will be

        channel belongs,

> +     ignored (and the user-provided addresses will be used instead).
> +
> +     In case there are no TX buffers available, the function will immediately
> +     return -ENOMEM without waiting until one becomes available.
> +     The function can only be called from a process context (for now).
> +     Returns 0 on success and an appropriate error value on failure.
> +
> +  struct rpmsg_endpoint *rpmsg_create_ept(struct rpmsg_channel *rpdev,
> +		void (*cb)(struct rpmsg_channel *, void *, int, void *, u32),
> +		void *priv, u32 addr);
> +   - every rpmsg address in the system is bound to an rx callback (so when
> +     inbound messages arrive, they are dispatched by the rpmsg bus using the
> +     appropriate callback handler) by means of an rpmsg_endpoint struct.
> +
> +     This function allows drivers to create such an endpoint, and by that,
> +     bind a callback, and possibly some private data too, to an rpmsg address
> +     (either one that is known in advance, or one that will be dynamically
> +     assigned for them).
> +
> +     Simple rpmsg drivers need not call rpmsg_create_ept, because an endpoint
> +     is already created for them when they are probed by the rpmsg bus
> +     (using the rx callback they provide when they registered to the rpmsg bus).
> +
> +     So things should just work for simple drivers: they already have an
> +     endpoint, their rx callback is bound to their rpmsg address, and when
> +     relevant inbound messages arrive (i.e. messages which their dst address
> +     equals to the src address of their rpmsg channel), the driver's handler
> +     is invoked to process it.
> +
> +     That said, more complicated drivers might do need to allocate
> +     additional rpmsg addresses, and bind them to different rx callbacks.
> +     To accomplish that, those drivers need to call this function.
> +     Driver should provide their channel (so the new endpoint would bind

        Drivers

> +     to the same remote processor their channel belongs to), an rx callback
> +     function, an optional private data (which is provided back when the
> +     rx callback is invoked), and an address they want to bind with the
> +     callback. If addr is RPMSG_ADDR_ANY, then rpmsg_create_ept will
> +     dynamically assign them an available rpmsg address (drivers should have
> +     a very good reason why not to always use RPMSG_ADDR_ANY here).
> +
> +     Returns a pointer to the endpoint on success, or NULL on error.
> +
> +  void rpmsg_destroy_ept(struct rpmsg_endpoint *ept);
> +   - destroys an existing rpmsg endpoint. user should provide a pointer
> +     to an rpmsg endpoint that was previously created with rpmsg_create_ept().
> +
> +  int register_rpmsg_driver(struct rpmsg_driver *rpdrv);
> +   - registers an rpmsg driver with the rpmsg bus. user should provide
> +     a pointer to an rpmsg_driver struct, which contains the driver's
> +     ->probe() and ->remove() functions, an rx callback, and an id_table
> +     specifying the names of the channels this driver is interested to
> +     be probed with.
> +
> +  void unregister_rpmsg_driver(struct rpmsg_driver *rpdrv);
> +   - unregisters an rpmsg driver from the rpmsg bus. user should provide
> +     a pointer to a previously-registerd rpmsg_driver struct.

                                  registered

> +     Returns 0 on success, and an appropriate error value on failure.
> +
> +
> +3. Typical usage
> +
> +The following is a simple rpmsg driver, that sends an "hello!" message
> +on probe(), and whenever it receives an incoming message, it dumps its
> +content to the console.
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/rpmsg.h>
> +
> +static void rpmsg_sample_cb(struct rpmsg_channel *rpdev, void *data, int len,
> +						void *priv, u32 src)
> +{
> +	print_hex_dump(KERN_INFO, "incoming message:", DUMP_PREFIX_NONE,
> +						16, 1, data, len, true);
> +}
> +
> +static int rpmsg_sample_probe(struct rpmsg_channel *rpdev)
> +{
> +	int err;
> +
> +	dev_info(&rpdev->dev, "chnl: 0x%x -> 0x%x\n", rpdev->src, rpdev->dst);
> +
> +	/* send a message on our channel */
> +	err = rpmsg_send(rpdev, "hello!", 6);
> +	if (err) {
> +		pr_err("rpmsg_send failed: %d\n", err);
> +		return err;
> +	}
> +
> +	return 0;
> +}
> +
> +static void __devexit rpmsg_sample_remove(struct rpmsg_channel *rpdev)
> +{
> +	dev_info(&rpdev->dev, "rpmsg sample client driver is removed\n");
> +}
> +
> +static struct rpmsg_device_id rpmsg_driver_sample_id_table[] = {
> +	{ .name	= "rpmsg-client-sample" },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(rpmsg, rpmsg_driver_sample_id_table);
> +
> +static struct rpmsg_driver rpmsg_sample_client = {
> +	.drv.name	= KBUILD_MODNAME,
> +	.drv.owner	= THIS_MODULE,
> +	.id_table	= rpmsg_driver_sample_id_table,
> +	.probe		= rpmsg_sample_probe,
> +	.callback	= rpmsg_sample_cb,
> +	.remove		= __devexit_p(rpmsg_sample_remove),
> +};
> +
> +static int __init init(void)
> +{
> +	return register_rpmsg_driver(&rpmsg_sample_client);
> +}
> +
> +static void __exit fini(void)
> +{
> +	unregister_rpmsg_driver(&rpmsg_sample_client);
> +}
> +module_init(init);
> +module_exit(fini);
> +
> +
> +4. API for implementors
> +
> +Adding rpmsg support for a new platform is relatively easy; one just needs
> +to register a VIRTIO_ID_RPMSG virtio device with the usual virtio_config_ops
> +handlers.
> +
> +For simplicity, it is recommended to register a single virtio device for
> +every physical remote processor we have in the system, but there are no

                  remote processor in the system,

> +hard rules here, and this decision largely depends on the use cases,
> +platform capabilities, performance requirements, etc.
> +
> +OMAP4, e.g., registers two virtio devices to communicate with the remote dual
> +Cortex-M3 processor, because each of the M3 cores executes its own OS instance
> +(see omap_rpmsg.c for reference).
> +This way each of the remote cores may have different rpmsg channels, and the
> +rpmsg core will treat them as completely independent processors (despite
> +the fact that both of are part of the same physical device, and they are

                 both of them are part

> +powered on/off together).
> +
> +Notable virtio implementation bits:
> +
> +* virtio features: VIRTIO_RPMSG_F_NS should be enabled if the remote
> +  processor supports dynamic name service announcement messages.
> +
> +  Enabling this means that rpmsg device (i.e. channel) creation is
> +  completely dynamic: the remote processor announces the existence of a
> +  remote rpmsg service by sending a name service message (which contains
> +  the name and rpmsg addr of the remote service, see struct rpmsg_ns_msg).
> +
> +  This message is then handled by the rpmsg bus, which in turn dynamically
> +  creates and registers an rpmsg channel (which represents the remote service).
> +  If/when a relevant rpmsg driver is registered, it will be immediately probed
> +  by the bus, and can then start sending messages to the remote service.
> +
> +* virtqueue's notify handler: should inform the remote processor whenever
> +  it is kicked by virtio. OMAP4 is using its mailbox device to interrupt
> +  the remote processor, and inform it which virtqueue number is kicked
> +  (the index of the kicked virtqueue is written to the mailbox register).
> +
> +* virtio_config_ops's ->get() handler: the rpmsg bus uses this handler
> +  to request for platform-specific configuration values.
> +  see enum rpmsg_platform_requests for more info.

> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> new file mode 100644
> index 0000000..41303f5
> --- /dev/null
> +++ b/drivers/rpmsg/Kconfig
> @@ -0,0 +1,14 @@
> +# RPMSG always gets selected by whoever wants it
> +config RPMSG
> +	tristate
> +	select VIRTIO
> +	select VIRTIO_RING
> +
> +if RPMSG
> +
> +# OK, it's a little counter-intuitive to do this, but it puts it neatly under
> +# the rpmsg menu (and it's the approach preferred by the virtio folks).
> +
> +source "drivers/virtio/Kconfig"

It seems odd to have that Kconfig file sourced in multiple places.
Are the kconfig tools happy with that?

> +
> +endif # RPMSG


Sorry about the delay.  I had most of this in my drafts folder and
forgot about it...

HTH.
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-28 23:44   ` Randy Dunlap
@ 2011-06-29  6:30     ` Ohad Ben-Cohen
  2011-06-29 11:59       ` Arnd Bergmann
  0 siblings, 1 reply; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-29  6:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Randy,

Thanks for your comments !

On Wed, Jun 29, 2011 at 2:44 AM, Randy Dunlap <rdunlap@xenotime.net> wrote:
>> +hardware accelerators, and therefore are often used to offload cpu-intensive
>
> prefer: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? CPU-
> throughout.

Isn't that changing the meaning a bit ? Let's stick with the original
version, I think it's more clear.

>> +system's physical memory and/or other sensitive hardware resources (e.g. on
>> +OMAP4, remote cores (/hardware accelerators) may have direct access to the
>> +physical memory, gpio banks, dma controllers, i2c bus, gptimers, mailbox
>> +devices, hwspinlocks, etc..). Moreover, those remote processors might be
>> +running RTOS where every task can access the entire memory/devices exposed
>> +to the processor. To minimize the risks of rogue (/buggy) userland code
>
> What is with the leading / here and above (/hardware) and below?

/ here means "or". You can read the sentence twice, either without the
(/ ..) options or with them, and then you get two (different)
examples.

Any idea how to make it more readable ? I prefer not to drop the
second example, as it's adding information.

>> +if RPMSG
>> +
>> +# OK, it's a little counter-intuitive to do this, but it puts it neatly under
>> +# the rpmsg menu (and it's the approach preferred by the virtio folks).
>> +
>> +source "drivers/virtio/Kconfig"
>
> It seems odd to have that Kconfig file sourced in multiple places.
> Are the kconfig tools happy with that?

They are, probably because these places are mutually exclusive today:

$ git grep "drivers/virtio/Kconfig"
arch/ia64/kvm/Kconfig:source drivers/virtio/Kconfig
arch/mips/Kconfig:source drivers/virtio/Kconfig
arch/powerpc/kvm/Kconfig:source drivers/virtio/Kconfig
arch/s390/kvm/Kconfig:source drivers/virtio/Kconfig
arch/sh/Kconfig:source drivers/virtio/Kconfig
arch/tile/kvm/Kconfig:source drivers/virtio/Kconfig
arch/x86/kvm/Kconfig:source drivers/virtio/Kconfig

Now that we start using virtio for inter-processor communications,
too, we might soon bump into a situation where virtio will be sourced
twice.

Probably the solution is to move 'source "drivers/virtio/Kconfig"'
into drivers/Kconfig, and remove all other instances.

Rusty, are you ok with that ?

Thanks,
Ohad.

> Sorry about the delay. ?I had most of this in my drafts folder and
> forgot about it...

Np, thanks a lot !

Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-29  6:30     ` Ohad Ben-Cohen
@ 2011-06-29 11:59       ` Arnd Bergmann
  2011-06-29 12:29         ` Ohad Ben-Cohen
  0 siblings, 1 reply; 63+ messages in thread
From: Arnd Bergmann @ 2011-06-29 11:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 29 June 2011, Ohad Ben-Cohen wrote:
> On Wed, Jun 29, 2011 at 2:44 AM, Randy Dunlap <rdunlap@xenotime.net> wrote:
> >> +hardware accelerators, and therefore are often used to offload cpu-intensive
> >
> > prefer:                                                           CPU-
> > throughout.
> 
> Isn't that changing the meaning a bit ? Let's stick with the original
> version, I think it's more clear.

I think you misunderstood Randy, he meant you should do 's/cpu/CPU/' globally,
which does not change the meaning.

> >> +system's physical memory and/or other sensitive hardware resources (e.g. on
> >> +OMAP4, remote cores (/hardware accelerators) may have direct access to the
> >> +physical memory, gpio banks, dma controllers, i2c bus, gptimers, mailbox
> >> +devices, hwspinlocks, etc..). Moreover, those remote processors might be
> >> +running RTOS where every task can access the entire memory/devices exposed
> >> +to the processor. To minimize the risks of rogue (/buggy) userland code
> >
> > What is with the leading / here and above (/hardware) and below?
> 
> / here means "or". You can read the sentence twice, either without the
> (/ ..) options or with them, and then you get two (different)
> examples.
> 
> Any idea how to make it more readable ? I prefer not to drop the
> second example, as it's adding information.

The easiest way would be to replace it with 'or', as in

... remote cores (or hardware accelerators) may have ...

I would also suggest you drop the parentheses, especially in the first
case where you have two levels of them:

system's physical memory and/or other sensitive hardware resources, e.g. on
OMAP4, remote cores and hardware accelerators may have direct access to the
specific hardware blocks such as physical memory, gpio banks or dma
controllers.
Moreover, those remote processors might be...

> >> +if RPMSG
> >> +
> >> +# OK, it's a little counter-intuitive to do this, but it puts it neatly under
> >> +# the rpmsg menu (and it's the approach preferred by the virtio folks).
> >> +
> >> +source "drivers/virtio/Kconfig"
> >
> > It seems odd to have that Kconfig file sourced in multiple places.
> > Are the kconfig tools happy with that?
> 
> They are, probably because these places are mutually exclusive today:
> 
> $ git grep "drivers/virtio/Kconfig"
> arch/ia64/kvm/Kconfig:source drivers/virtio/Kconfig
> arch/mips/Kconfig:source drivers/virtio/Kconfig
> arch/powerpc/kvm/Kconfig:source drivers/virtio/Kconfig
> arch/s390/kvm/Kconfig:source drivers/virtio/Kconfig
> arch/sh/Kconfig:source drivers/virtio/Kconfig
> arch/tile/kvm/Kconfig:source drivers/virtio/Kconfig
> arch/x86/kvm/Kconfig:source drivers/virtio/Kconfig
> 
> Now that we start using virtio for inter-processor communications,
> too, we might soon bump into a situation where virtio will be sourced
> twice.
> 
> Probably the solution is to move 'source "drivers/virtio/Kconfig"'
> into drivers/Kconfig, and remove all other instances.

I think changing that would be good. However, you need to at least
restructure the contents, or they will show up in the main driver menu.

	Arnd

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-06-29 11:59       ` Arnd Bergmann
@ 2011-06-29 12:29         ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-06-29 12:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 29, 2011 at 2:59 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 29 June 2011, Ohad Ben-Cohen wrote:
>> On Wed, Jun 29, 2011 at 2:44 AM, Randy Dunlap <rdunlap@xenotime.net> wrote:
>> >> +hardware accelerators, and therefore are often used to offload cpu-intensive
>> >
>> > prefer: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? CPU-
>> > throughout.
>>
>> Isn't that changing the meaning a bit ? Let's stick with the original
>> version, I think it's more clear.
>
> I think you misunderstood Randy, he meant you should do 's/cpu/CPU/' globally,

Oh, sorry, Randy. For some reason I thought you meant
s/cpu-intensive/CPU-throughout/ which didn't make a lot of sense to me
:)

s/cpu/CPU/ is of course nicer. thanks !

> The easiest way would be to replace it with 'or', as in
>
> ... remote cores (or hardware accelerators) may have ...

yeah, i'll do it, thanks.

It's a bit harder to get rid of the parentheses in the second
sentence, but I'll think of something too.

>> Probably the solution is to move 'source "drivers/virtio/Kconfig"'
>> into drivers/Kconfig, and remove all other instances.
>
> I think changing that would be good. However, you need to at least
> restructure the contents, or they will show up in the main driver menu.

I'll do that.

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
       [not found] ` <1308640714-17961-8-git-send-email-ohad@wizery.com>
                     ` (2 preceding siblings ...)
  2011-06-28 23:44   ` Randy Dunlap
@ 2011-07-18 22:07   ` Pavel Machek
  2011-07-19  5:38     ` Ohad Ben-Cohen
  3 siblings, 1 reply; 63+ messages in thread
From: Pavel Machek @ 2011-07-18 22:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hi!

> @@ -0,0 +1,75 @@
> +What:		/sys/bus/rpmsg/devices/.../name
> +Date:		June 2011
> +KernelVersion:	3.2
> +Contact:	Ohad Ben-Cohen <ohad@wizery.com>
> +Description:
> +		Every rpmsg device is a communication channel with a remote
> +		processor. Channels are identified with a (textual) name,
> +		which is maximum 32 bytes long (defined as RPMSG_NAME_SIZE in
> +		rpmsg.h).
> +
> +		This sysfs entry contains the name of this channel.
> +
> +What:		/sys/bus/rpmsg/devices/.../src
> +Date:		June 2011
> +KernelVersion:	3.2
> +Contact:	Ohad Ben-Cohen <ohad@wizery.com>
> +Description:
> +		Every rpmsg device is a communication channel with a remote
> +		processor. Channels have a local ("source") rpmsg address,
> +		and remote ("destination") rpmsg address. When an entity
> +		starts listening on one end of a channel, it assigns it with
> +		a unique rpmsg address (a 32 bits integer). This way when
> +		inbound messages arrive to this address, the rpmsg core
> +		dispatches them to the listening entity (a kernel driver).
> +
> +		This sysfs entry contains the src (local) rpmsg address
> +		of this channel. If it contains 0xffffffff, then an address
> +		wasn't assigned (can happen if no driver exists for this
> +		channel).

So this is basically networking... right? Why not implement it as
sockets? (accept, connect, read, write)?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus
  2011-07-18 22:07   ` Pavel Machek
@ 2011-07-19  5:38     ` Ohad Ben-Cohen
  0 siblings, 0 replies; 63+ messages in thread
From: Ohad Ben-Cohen @ 2011-07-19  5:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pavel,

On Thu, Jun 9, 2011 at 8:12 PM, Pavel Machek <pavel@ucw.cz> wrote:
> So this is basically networking... right? Why not implement it as
> sockets? (accept, connect, read, write)?

This patch focuses on adding the core rpmsg kernel infrastructure. The
next step, after getting the basic stuff in, would be adding rpmsg
drivers, and exposing user space API.

For some use cases, where userland talks directly with remote entities
(and otherwise requires no kernel involvement besides exposing the
transport), socket API is very nice as it's flexible and prevalent.

We already have several rpmsg drivers and a rpmsg-based socket API
implementation too, but we'll get to it only after the core stuff gets
in.

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <1308640714-17961-7-git-send-email-ohad@wizery.com>]

* [RFC 6/8] davinci: da850: add remoteproc dsp device
       [not found] ` <1308640714-17961-7-git-send-email-ohad@wizery.com>
@ 2011-06-28 10:18   ` Sergei Shtylyov
  0 siblings, 0 replies; 63+ messages in thread
From: Sergei Shtylyov @ 2011-06-28 10:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hello.

On 21.06.2011 11:18, Ohad Ben-Cohen wrote:

> From: Mark Grosen <mgrosen@ti.com>

> Add davinci remoteproc device for the da850's C674x dsp remote
> processor, and support it on the da850-evm and omapl138-hawk boards.

> Signed-off-by: Mark Grosen<mgrosen@ti.com>
> Signed-off-by: Ohad Ben-Cohen<ohad@wizery.com>
[...]

> diff --git a/arch/arm/mach-davinci/da850.c b/arch/arm/mach-davinci/da850.c
> index 133aac4..9280b1e 100644
> --- a/arch/arm/mach-davinci/da850.c
> +++ b/arch/arm/mach-davinci/da850.c
> @@ -74,6 +74,13 @@ static struct clk pll0_aux_clk = {
>   	.flags		= CLK_PLL | PRE_PLL,
>   };
>
> +static struct clk pll0_sysclk1 = {
> +	.name		= "pll0_sysclk1",
> +	.parent		= &pll0_clk,
> +	.flags		= CLK_PLL,
> +	.div_reg	= PLLDIV1,
> +};
> +
>   static struct clk pll0_sysclk2 = {
>   	.name		= "pll0_sysclk2",
>   	.parent		= &pll0_clk,
> @@ -373,6 +380,12 @@ static struct clk spi1_clk = {
>   	.flags		= DA850_CLK_ASYNC3,
>   };
>
> +static struct clk dsp_clk = {
> +	.name		= "dsp",
> +	.parent		= &pll0_sysclk1,
> +	.lpsc		= DA8XX_LPSC0_GEM,
> +};
> +
>   static struct clk_lookup da850_clks[] = {
>   	CLK(NULL,		"ref",		&ref_clk),
>   	CLK(NULL,		"pll0",		&pll0_clk),
> @@ -419,6 +432,7 @@ static struct clk_lookup da850_clks[] = {
>   	CLK(NULL,		"usb20",	&usb20_clk),
>   	CLK("spi_davinci.0",	NULL,		&spi0_clk),
>   	CLK("spi_davinci.1",	NULL,		&spi1_clk),
> +	CLK(NULL,		"dsp",		&dsp_clk),
>   	CLK(NULL,		NULL,		NULL),
>   };

    How about also adding the clock for DA830?

> diff --git a/arch/arm/mach-davinci/devices-da8xx.c b/arch/arm/mach-davinci/devices-da8xx.c
> index fc4e98e..3becfd1 100644
> --- a/arch/arm/mach-davinci/devices-da8xx.c
> +++ b/arch/arm/mach-davinci/devices-da8xx.c
[...]
> @@ -647,6 +648,20 @@ int __init da850_register_mmcsd1(struct davinci_mmc_config *config)
>   	da850_mmcsd1_device.dev.platform_data = config;
>   	return platform_device_register(&da850_mmcsd1_device);
>   }
> +
> +int __init da850_register_rproc(void)

    Please rename to da8xx_register_rproc() -- there's nothing DA850-specific 
here.

> +{
> +	struct platform_device *rproc;
> +	struct davinci_rproc_pdata rproc_pdata = {
> +		.name		= "dsp",
> +		.firmware	= "davinci-dsp.bin",

    Isn't the firmware DA8xx-specific?

WBR, Sergei

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2011-09-15 14:56 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-21  7:18 [RFC 0/8] Introducing a generic AMP/IPC framework Ohad Ben-Cohen
     [not found] ` <BANLkTimn-THKipuQVHLA9i3QSoO5RTqMtA@mail.gmail.com>
2011-06-21  9:30   ` Ohad Ben-Cohen
2011-06-21 14:20 ` Arnd Bergmann
2011-06-21 15:54   ` Grant Likely
2011-06-22 11:41   ` Ohad Ben-Cohen
2011-06-22 13:05     ` Arnd Bergmann
2011-06-22 13:09       ` Ohad Ben-Cohen
     [not found] ` <1308640714-17961-2-git-send-email-ohad@wizery.com>
2011-06-22 17:55   ` [RFC 1/8] drivers: add generic remoteproc framework Randy Dunlap
2011-06-22 19:11     ` Ohad Ben-Cohen
2011-06-27 20:49   ` Grant Likely
2011-06-27 21:52     ` Grosen, Mark
2011-06-27 22:24       ` Grant Likely
2011-06-27 23:54         ` Grosen, Mark
2011-06-27 23:29     ` Russell King - ARM Linux
2011-06-27 23:35       ` Grant Likely
2011-06-28 21:55       ` Ohad Ben-Cohen
2011-06-28 21:41     ` Ohad Ben-Cohen
2011-06-23 12:23 ` [RFC 0/8] Introducing a generic AMP/IPC framework Michael Williamson
2011-06-23 16:27   ` Grosen, Mark
2011-06-23 18:46     ` Arnd Bergmann
2011-06-24  4:32       ` Grosen, Mark
     [not found] ` <1308640714-17961-6-git-send-email-ohad@wizery.com>
2011-06-23 15:27   ` [RFC 5/8] remoteproc: add davinci implementation Sergei Shtylyov
2011-06-24  4:25     ` Grosen, Mark
2011-06-24 15:13       ` Sergei Shtylyov
2011-06-24 15:43         ` Nori, Sekhar
2011-06-27 18:31           ` Grosen, Mark
2011-07-05  5:29             ` Nori, Sekhar
2011-07-05 16:54               ` Grosen, Mark
2011-07-05  5:34             ` Nori, Sekhar
2011-07-05 16:54               ` Grosen, Mark
2011-07-06  4:36                 ` Nori, Sekhar
2011-06-27 18:20         ` Grosen, Mark
2011-06-28  9:36           ` Nori, Sekhar
2011-06-24 20:16 ` [RFC 0/8] Introducing a generic AMP/IPC framework Stephen Boyd
2011-06-26  1:11   ` Ohad Ben-Cohen
2011-06-26  1:17     ` Brian Swetland
2011-06-27 21:22     ` Grosen, Mark
2011-06-28 11:26       ` Ohad Ben-Cohen
2011-06-28 11:36         ` Brian Swetland
     [not found] ` <1308640714-17961-3-git-send-email-ohad@wizery.com>
2011-06-22 10:05   ` [RFC 2/8] remoteproc: add omap implementation Will Newton
2011-06-22 10:50     ` Ohad Ben-Cohen
2011-06-27 21:00   ` Grant Likely
2011-06-29 15:04     ` Ohad Ben-Cohen
2011-06-29 15:31       ` Grant Likely
     [not found] ` <1308640714-17961-8-git-send-email-ohad@wizery.com>
2011-06-22  2:42   ` [RFC 7/8] drivers: introduce rpmsg, a remote-processor messaging bus Rusty Russell
2011-06-22  3:11     ` Sasha Levin
2011-06-22 10:46     ` Ohad Ben-Cohen
2011-09-14 18:38       ` Ohad Ben-Cohen
2011-09-15  0:12         ` Rusty Russell
2011-09-15 14:56           ` Ohad Ben-Cohen
2011-06-27 22:21   ` Grant Likely
2011-06-28 22:46     ` Ohad Ben-Cohen
2011-06-28 22:51       ` Grant Likely
2011-06-28 23:00         ` Ohad Ben-Cohen
2011-06-29 15:43           ` Grant Likely
2011-07-01 15:13             ` Ohad Ben-Cohen
2011-06-28 23:44   ` Randy Dunlap
2011-06-29  6:30     ` Ohad Ben-Cohen
2011-06-29 11:59       ` Arnd Bergmann
2011-06-29 12:29         ` Ohad Ben-Cohen
2011-07-18 22:07   ` Pavel Machek
2011-07-19  5:38     ` Ohad Ben-Cohen
     [not found] ` <1308640714-17961-7-git-send-email-ohad@wizery.com>
2011-06-28 10:18   ` [RFC 6/8] davinci: da850: add remoteproc dsp device Sergei Shtylyov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).