* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Thomas Petazzoni @ 2009-06-10 9:42 UTC
To: James Bottomley
Cc: Catalin Marinas, ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1244045997.21423.303.camel@mulgrave.site>
On Wed, 03 Jun 2009 12:19:57 -0400,
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> So ZONE_DMA and coherent memory allocation as represented by the
> coherent mask are really totally separate things. The idea of
> ZONE_DMA was really that if you had an ISA device, allocations from
> ZONE_DMA would be able to access the allocated memory without
> bouncing. Since ISA is really going away, this definition has been
> hijacked. If your problem is just that you need memory allocated on
> a certain physical mask and neither GFP_DMA or GFP_DMA32 cut it for
> you, then we could revisit the kmalloc_mask() proposal again ... but
> the consensus last time was that no-one really had a compelling use
> case that couldn't be covered by GFP_DMA32.
A few years ago, I was working on a MIPS platform which had 256 MB
of RAM attached to the CPU memory controller and 128 MB attached to an
external memory controller. The layout of the memories was: 256 MB
CPU-attached memory first, and then the 128 MB
external-controller-attached memory.
Now, back to the DMA discussion: the Ethernet controller, which was
part of that external controller also driving the 128 MB bank of
memory, could only DMA to and from memory controlled by that same
controller (i.e. only to the *top* 128 MB of the physical address
space). I'm far from being an mm expert, but as far as I could understand
the zone mechanism, it was not possible to describe such a
physical memory configuration where DMA-able memory is only at the top.
In the end, I passed mem=..., manually managed a few megabytes of
memory at the top of the physical address space, and hacked the
Ethernet driver to copy the skb contents back and forth between
main memory and the DMA-reserved memory.
So when Catalin Marinas says « currently ZONE_DMA is assumed to be in
the bottom part of the memory which isn't always the case », I cannot
agree more.
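The workaround described above can be sketched in plain user-space C. This is only an illustration under stated assumptions: RAM is assumed to start at physical address 0, so the DMA-able bank covers 0x10000000 to 0x17ffffff, and `dma_reachable()` and `tx_prepare()` are hypothetical names, not functions from the original driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: RAM starts at physical address 0, with the 256 MB
 * CPU-attached bank first and the 128 MB external bank right after it. */
#define EXT_BANK_START 0x10000000u /* 256 MB */
#define EXT_BANK_SIZE  0x08000000u /* 128 MB */

/* True if [phys, phys + len) lies entirely inside the DMA-able top bank. */
static bool dma_reachable(uint32_t phys, uint32_t len)
{
	return phys >= EXT_BANK_START &&
	       len <= EXT_BANK_SIZE &&
	       phys - EXT_BANK_START <= EXT_BANK_SIZE - len;
}

/*
 * Hypothetical transmit helper: return the buffer the device should DMA
 * from, copying into the reserved bounce area only when needed.
 */
static const void *tx_prepare(const void *data, uint32_t data_phys, size_t len,
			      void *bounce, size_t bounce_len)
{
	assert(len <= bounce_len);
	if (dma_reachable(data_phys, (uint32_t)len))
		return data;		/* device can read it in place */
	memcpy(bounce, data, len);	/* bounce through reserved memory */
	return bounce;
}
```

The receive path would do the reverse copy once the device has written into the reserved area.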
Reference:
http://www.linux-mips.org/archives/linux-mips/2004-09/msg00152.html
Sincerely,
Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers and embedded Linux development,
consulting, training and support.
http://free-electrons.com
* USB : Problem in backporting C67x00 Driver to kernel 2.6.20
From: Childhood Hwang @ 2009-06-09 12:45 UTC
To: linux-embedded
I am trying to backport the C67x00 driver to kernel 2.6.20 right now, but I
can't make it work.
I saw that there was some discussion of this before, and it seems that it
covers exactly the problem I am running into.
I found the discussion at: http://marc.info/?t=122849175200004&r=1&w=2
But there is no answer on how to get past this, only that the problem lies
in the different status handling in older kernels.
I am quite new to Linux.
I insert a kernel module that registers the platform device for USB, and
I think the interrupt works well.
The registration code looks like this:
static struct resource c67x00_resources[] = {
	[0] = {
		.start = 0x80800000,
		.end = 0x8080ffff,
		.flags = IORESOURCE_MEM,
	},
	[1] = {
		.start = 0,
		.end = 0,
		.flags = IORESOURCE_IRQ,
	},
};

static struct c67x00_platform_data c67x00_data = {
	.sie_config = C67X00_SIE1_HOST | C67X00_SIE2_HOST,
	.hpi_regstep = 0x02,
};

static struct platform_device c67x00_device = {
	.name = "c67x00",
	.id = -1,
	.num_resources = ARRAY_SIZE(c67x00_resources),
	.resource = c67x00_resources,
	.dev.platform_data = &c67x00_data,
};

static int __init c67x00_device_reg_init(void)
{
	printk("======== Call : c67x00_device_reg_init() ========\n");
	/* A non 0 return means init_module failed; module can't be loaded. */
	return platform_device_register(&c67x00_device);
}

static void __exit c67x00_device_reg_exit(void)
{
	printk(KERN_ALERT "======== Goodbye c67x00_device_reg_exit(). ========\n");
	platform_device_unregister(&c67x00_device);
}

module_init(c67x00_device_reg_init);
module_exit(c67x00_device_reg_exit);
The log is listed below:
# insmod /lib/modules/kernel/drivers/misc/c67x00_device_reg.ko
======== Call : c67x00_device_reg_init() ========
============ USB driver : Call c67x00_drv_probe() ============
c67x00 c67x00: Cypress C67X00 Host Controller
drivers/usb/core/inode.c: creating file 'devices'
drivers/usb/core/inode.c: creating file '001'
c67x00 c67x00: new USB bus registered, assigned bus number 1
usb usb1: default language 0x0409
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: Cypress C67X00 Host Controller
usb usb1: Manufacturer: Linux 2.6.20-uc0 c67x00-hcd
usb usb1: SerialNumber: c67x00_sie
usb usb1: usb_probe_device
usb usb1: configuration #1 chosen from 1 choice
usb usb1: adding 1-0:1.0 (config #1, interface 0)
hub 1-0:1.0: usb_probe_interface
hub 1-0:1.0: usb_probe_interface - got id
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
hub 1-0:1.0: standalone hub
hub 1-0:1.0: ganged power switching
hub 1-0:1.0: global over-current protection
hub 1-0:1.0: power on to power good time: 100ms
hub 1-0:1.0: local power source is good
hub 1-0:1.0: no over-current condition exists
hub 1-0:1.0: enabling power on all ports
drivers/usb/core/inode.c: creating file '001'
hub 1-0:1.0: state 7 ports 2 chg 0000 evt 0000
c67x00 c67x00: Cypress C67X00 Host Controller
drivers/usb/core/inode.c: creating file '002'
c67x00 c67x00: new USB bus registered, assigned bus number 2
usb usb2: default language 0x0409
usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: Cypress C67X00 Host Controller
usb usb2: Manufacturer: Linux 2.6.20-uc0 c67x00-hcd
usb usb2: SerialNumber: c67x00_sie
usb usb2: usb_probe_device
usb usb2: configuration #1 chosen from 1 choice
usb usb2: adding 2-0:1.0 (config #1, interface 0)
hub 2-0:1.0: usb_probe_interface
hub 2-0:1.0: usb_probe_interface - got id
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
hub 2-0:1.0: standalone hub
hub 2-0:1.0: ganged power switching
hub 2-0:1.0: global over-current protection
hub 2-0:1.0: power on to power good time: 100ms
hub 2-0:1.0: local power source is good
hub 2-0:1.0: no over-current condition exists
hub 2-0:1.0: enabling power on all ports
drivers/usb/core/inode.c: creating file '001'
hub 2-0:1.0: state 7 ports 2 chg 0000 evt 0000
# hub 1-0:1.0: state 7 ports 2 chg 0000 evt 0002
hub 1-0:1.0: port 1, status 0101, change 0001, 12 Mb/s
hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x101
c67x00 c67x00: ClearPortFeature (0): C_RESET
usb 1-1: new full speed USB device using c67x00 and address 2
c67x00 c67x00: ClearPortFeature (0): C_RESET
c67x00 c67x00: ### TIMEOUT at 0x0500
c67x00 c67x00: urb: 0x902390cc
c67x00 c67x00: endpoint: 0
c67x00 c67x00: pipeout: 1
c67x00 c67x00: ly_base_addr: 0x0700
c67x00 c67x00: port_length: 0x0008
c67x00 c67x00: pid_ep: 0xd0
c67x00 c67x00: dev_addr: 0x00
c67x00 c67x00: ctrl_reg: 0x01
c67x00 c67x00: status: 0x14
c67x00 c67x00: retry_cnt: 0x00
c67x00 c67x00: residue: 0x00
c67x00 c67x00: next_td_addr: 0x0000
c67x00 c67x00: data:<3>usb 1-1: device not accepting address 2, error -115
c67x00 c67x00: ClearPortFeature (0): C_RESET
usb 1-1: new full speed USB device using c67x00 and address 3
c67x00 c67x00: ClearPortFeature (0): C_RESET
c67x00 c67x00: ### TIMEOUT at 0x0500
c67x00 c67x00: urb: 0x902390cc
c67x00 c67x00: endpoint: 0
c67x00 c67x00: pipeout: 1
c67x00 c67x00: ly_base_addr: 0x0700
c67x00 c67x00: port_length: 0x0008
c67x00 c67x00: pid_ep: 0xd0
c67x00 c67x00: dev_addr: 0x00
c67x00 c67x00: ctrl_reg: 0x01
c67x00 c67x00: status: 0x14
c67x00 c67x00: retry_cnt: 0x00
c67x00 c67x00: residue: 0x00
c67x00 c67x00: next_td_addr: 0x0000
c67x00 c67x00: data:<3>usb 1-1: device not accepting address 3, error -115
c67x00 c67x00: ClearPortFeature (0): C_RESET
usb 1-1: new full speed USB device using c67x00 and address 4
c67x00 c67x00: ### TIMEOUT at 0x0500
c67x00 c67x00: urb: 0x902390cc
c67x00 c67x00: endpoint: 0
c67x00 c67x00: pipeout: 1
c67x00 c67x00: ly_base_addr: 0x0700
c67x00 c67x00: port_length: 0x0008
c67x00 c67x00: pid_ep: 0xd0
c67x00 c67x00: dev_addr: 0x00
c67x00 c67x00: ctrl_reg: 0x01
c67x00 c67x00: status: 0x14
c67x00 c67x00: retry_cnt: 0x00
c67x00 c67x00: residue: 0x00
c67x00 c67x00: next_td_addr: 0x0000
c67x00 c67x00: data:<3>usb 1-1: device not accepting address 4, error -115
c67x00 c67x00: ClearPortFeature (0): C_RESET
usb 1-1: new full speed USB device using c67x00 and address 5
c67x00 c67x00: ### TIMEOUT at 0x0500
c67x00 c67x00: urb: 0x902390cc
c67x00 c67x00: endpoint: 0
c67x00 c67x00: pipeout: 1
c67x00 c67x00: ly_base_addr: 0x0700
c67x00 c67x00: port_length: 0x0008
c67x00 c67x00: pid_ep: 0xd0
c67x00 c67x00: dev_addr: 0x00
c67x00 c67x00: ctrl_reg: 0x01
c67x00 c67x00: status: 0x14
c67x00 c67x00: retry_cnt: 0x00
c67x00 c67x00: residue: 0x00
c67x00 c67x00: next_td_addr: 0x0000
c67x00 c67x00: data:<3>usb 1-1: device not accepting address 5, error -115
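For readers following the log, the hub line "port 1, status 0101, change 0001" can be decoded with a small table. The sketch below is a user-space illustration, not kernel code; the bit positions follow the wPortStatus layout of the USB 2.0 hub specification, while `decode_port_status()` and the table are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct bit_name {
	uint16_t bit;
	const char *name;
};

/* wPortStatus bits as laid out in the USB 2.0 hub specification. */
static const struct bit_name port_status_bits[] = {
	{ 0x0001, "CONNECTION" },
	{ 0x0002, "ENABLE" },
	{ 0x0004, "SUSPEND" },
	{ 0x0008, "OVERCURRENT" },
	{ 0x0010, "RESET" },
	{ 0x0100, "POWER" },
	{ 0x0200, "LOW_SPEED" },
	{ 0x0400, "HIGH_SPEED" },
};

/* Write the space-separated names of the set bits into out; returns out. */
static char *decode_port_status(uint16_t status, char *out, size_t out_len)
{
	size_t used = 0;

	assert(out != NULL && out_len > 0);
	out[0] = '\0';
	for (size_t i = 0; i < sizeof(port_status_bits) / sizeof(port_status_bits[0]); i++) {
		int n;

		if (!(status & port_status_bits[i].bit))
			continue;
		n = snprintf(out + used, out_len - used, "%s%s",
			     used ? " " : "", port_status_bits[i].name);
		if (n < 0 || (size_t)n >= out_len - used)
			break;	/* output buffer is full, stop here */
		used += (size_t)n;
	}
	return out;
}
```

Decoding 0x0101 this way gives CONNECTION and POWER with neither speed bit set, which is consistent with the "12 Mb/s" (full speed) reported on the same log line.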
* Re: Representing Embedded Architectures at the Kernel Summit
From: Grant Likely @ 2009-06-04 20:24 UTC
To: steve.langstaff
Cc: Russell King, James Bottomley, ksummit-2009-discuss, linux-arch,
linux-embedded, Josh Boyer, Tim Bird
In-Reply-To: <3340601010994331832@unknownmsgid>
On Wed, Jun 3, 2009 at 3:08 AM, Steve Langstaff
<steve.langstaff@pebblebay.com> wrote:
>> From: linux-embedded-owner@vger.kernel.org [mailto:linux-embedded-
>> owner@vger.kernel.org] On Behalf Of Russell King
>
>> The big problem we have is that the only commonality between different
>> SoCs is that the CPU executes ARM instructions. Everything else is
>> entirely up to the SoC designer - eg location of memory, spacing of
>> memory banks, type of interrupt controller, etc is all highly SoC
>> specific. Nothing outside of the ARM CPU itself is standardized.
>
> To my naive ears it sounds like this problem is crying out for ARM and the
> SoC designers to add a standardized "autoprobe" interface to the core to
> allow discovery of machine type and/or "location of memory, spacing of
> memory banks, type of interrupt controller, etc".
>
> The benefits of such mechanisms are well known, but what are the drawbacks?
Local bus probing probably imposes a lot of assumptions on a bus
designed to be as open as possible. How are chip selects wired up?
What base addresses do devices respond to? How do you know what the
device is? What IRQ lines are used? PCI solves this by exporting
configuration space which defines all of this, but PCI is considerably
more complex and not as fast as a CPU's local bus. Similarly busses
like spi and i2c either have no probing protocol defined (spi), or
cannot always be reliably probed (i2c).
In short, the drawbacks are complexity on devices which cannot afford
the complexity.
g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
* Re: Representing Embedded Architectures at the Kernel Summit
From: Grant Likely @ 2009-06-04 20:15 UTC
To: Bill Gatliff
Cc: James Bottomley, ksummit-2009-discuss, linux-arch, linux-embedded,
Josh Boyer, Tim Bird
In-Reply-To: <4A2596B4.3020309@billgatliff.com>
On Tue, Jun 2, 2009 at 3:16 PM, Bill Gatliff <bgat@billgatliff.com> wrote:
> Russell King wrote:
>>
>> The big problem we have is that the only commonality between different
>> SoCs is that the CPU executes ARM instructions. Everything else is
>> entirely up to the SoC designer - eg location of memory, spacing of
>> memory banks, type of interrupt controller, etc is all highly SoC
>> specific. Nothing outside of the ARM CPU itself is standardized.
>
> And that diversity is precisely because of the diversity in ARM-based
> embedded platforms.
>
> Such diversity means that kernel/driver development is a constant activity,
> which suggests that we shouldn't bother the effort to come up with a
> comprehensive solution because none will exist. Rather, we should maintain
> and improve the ability to rapidly prototype and adapt. Things like
> furthering the deployment of platform_device, clocksource/clockdevice, and
> so on.
No, not comprehensive; just common. It makes sense to spend the
effort on the patterns and devices which are common. It may not cover
everything, but it doesn't have to to be valuable.
g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
* Re: Representing Embedded Architectures at the Kernel Summit
From: Grant Likely @ 2009-06-04 20:08 UTC
To: Grant Likely, James Bottomley, ksummit-2009-discuss, linux-arch,
linux-e
In-Reply-To: <20090602211057.GA10800@flint.arm.linux.org.uk>
On Tue, Jun 2, 2009 at 3:10 PM, Russell King <rmk@arm.linux.org.uk> wrote:
> On Tue, Jun 02, 2009 at 11:29:46AM -0600, Grant Likely wrote:
>> Embedded PowerPC and Microblaze has tackled this problem with the
>> "Flattened Device Tree" data format which is derived from the
>> OpenFirmware specifications, and there is some interest and debate (as
>> discussed recently on the ARM mailing list) about making flattened
>> device tree usable by ARM also (which I'm currently
>> proof-of-concepting).
>
> Note that I have to point out that ARM will probably never be in a
> situation where you can have a one kernel image boots on everything.
> That _is_ practical today (and does happen with all PXA now) with what
> we have within a very big restriction - which is that the kernel must
> be built to support PXA and not Atmel SoCs.
Agreed, and that isn't really my position either. I do understand that
some things, like different CPU cores requiring different code
generation options, will probably never be feasible to support in a
single image. Others may be doable, but not worthwhile because of
memory constraints or impact on hot paths.
> I really don't think combining SoC support is going to be realistic,
> device tree or not. When we had just four machine types (RiscPC,
> EBSA110, EBSA285, Netwinder) I did look into this and came to the
> conclusion that it would be far too inefficient for there to be any
> win.
>
> The big problem we have is that the only commonality between different
> SoCs is that the CPU executes ARM instructions. Everything else is
> entirely up to the SoC designer - eg location of memory, spacing of
> memory banks, type of interrupt controller, etc is all highly SoC
> specific. Nothing outside of the ARM CPU itself is standardized.
I've been working on the Xilinx Virtex platform which is a CPU core
surrounded by a big FPGA fabric. It doesn't get much more non-standardized
than that. :-) It is trivial to build an image which boots on all of
the ppc440 platforms; the SoCs, and the Virtex5 FPGA.
Most of the 60x based SoCs have been merged too; all with very
different bus architectures, interrupt controllers and embedded
devices, but I do grant you that it isn't the same level of diversity
as in ARM land.
However, there has been no attempt to make a single powerpc image
which boots on everything; the cores are too different. Instead, we
break it up by CPU type: 40x, 44x, 60x, 64 bit and the 8xx stuff that
everyone tries to forget. IIUC, in ARM the MMU architecture is quite
varied, and that would have significant impact too.
Within each sub arch pretty much anything goes. The kernel
initializes itself to the point of setting up the MMU and then probes
the machine definitions (arch/powerpc/platforms/*) until it finds one
which matches the device tree data. At that point it is able to
branch off and set up the correct interrupt controller, register
devices and configure busses, etc. As long as the correct machine
definition is compiled in, then everything goes.
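The probing loop described above can be sketched as a table of machine definitions, each with a probe callback that inspects an identifier taken from the device tree. This is a user-space illustration only: the board names, the "acme,..." strings, and `find_machine()` are hypothetical and are not the real arch/powerpc/platforms code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per compiled-in machine definition (hypothetical layout). */
struct machine_desc {
	const char *name;
	int (*probe)(const char *root_compat);	/* non-zero on a match */
};

static int board_a_probe(const char *root_compat)
{
	return strcmp(root_compat, "acme,board-a") == 0;
}

static int board_b_probe(const char *root_compat)
{
	return strcmp(root_compat, "acme,board-b") == 0;
}

static const struct machine_desc machines[] = {
	{ "Acme Board A", board_a_probe },
	{ "Acme Board B", board_b_probe },
};
#define NR_MACHINES (sizeof(machines) / sizeof(machines[0]))

/* Walk the compiled-in definitions until one claims the device tree. */
static const struct machine_desc *find_machine(const char *root_compat)
{
	assert(root_compat != NULL);
	for (size_t i = 0; i < NR_MACHINES; i++)
		if (machines[i].probe(root_compat))
			return &machines[i];
	return NULL;	/* no matching machine definition compiled in */
}
```

Once a definition matches, its remaining hooks (interrupt controller setup, device registration) would run; a NULL result corresponds to the case where the correct machine definition was not compiled in.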
As I'm sure you're about to point out :-), PowerPC machine probing
really isn't all that different from ARM machine probing except that
ARM uses the machine number and PowerPC uses the "model" property in
the device tree. The real difference is the source of data describing
which devices are present; it just happens to be in an external data
structure instead of compiled in. A lot of what was originally
thought to be machine specific code has just gone away because the
device tree infrastructure handles it elegantly. Only the parts that
truly are machine specific continue to hang around.
I'm *not* suggesting that all ARM machines, or even most ARM machines
should be reworked to use something like the device tree. What I am
suggesting is that there are many use cases where it does make sense.
For example, on the range of ARM netbooks rumored to come out near
the end of the year, the prospect of having a single kernel image boot
on multiple netbooks from multiple vendors is attractive to
distribution vendors like Canonical. Device Tree certainly isn't the
only way to do this, but it is a viable one.
g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
* Re: Representing Embedded Architectures at the Kernel Summit
From: Grant Likely @ 2009-06-04 18:18 UTC
To: James Bottomley
Cc: ksummit-2009-discuss, linux-arch, linux-embedded, Josh Boyer,
Tim Bird, Greg KH
In-Reply-To: <1243964917.4229.54.camel@mulgrave.int.hansenpartnership.com>
On Tue, Jun 2, 2009 at 11:48 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Tue, 2009-06-02 at 11:29 -0600, Grant Likely wrote:
>> One topic that seems to garner debate is the issue of decoupling the
>> kernel image from the target platform. ie. On x86, PowerPC and Sparc
>> a kernel image will boot on any machine (assuming the needed drivers
>> are enabled), but this is rarely the case in embedded. Most embedded
>> kernels require explicit board support selected at compile time with
>> no way to produce a generic kernel which will boot on a whole family
>> of devices, let alone the whole architecture. Part of this is a
>> firmware issue, where existing firmware passes very little in the way
>> of hardware description to the kernel, but part is also not making
>> available any form of common language for describing the machine.
>
> OK, so my minimal understanding in this area lead me to believe this was
> because most embedded systems didn't have properly discoverable busses
> and therefore you had to tailor the kernel configuration exactly to the
> devices the system had.
Yes, mostly true. The kernel must be explicitly told the layout of
the non-discoverable busses and interconnects. One method is to use
per-machine statically compiled tables of platform devices, but
nothing forces embedded to do it that way...
>> I think that in the absence of any established standard like the PC
>> BIOS/EFI or a real Open Firmware interface, then the kernel should at
>> least offer a recommended interface so that multiplatform kernels are
>> possible without explicitly having the machine layout described to it
>> at compile time. I know that some of the embedded distros are
>> interested in such a thing since it gets them away from shipping
>> separate images for each supported board. ie. It's really hard to do
>> a generic live-cd without some form of multiplatform. FDT is a great
>> approach, but it probably isn't the only option. It would be worth
>> debating.
>
> It sounds interesting ... however, it also sounds like an area which
> might not impact the core kernel much ... or am I wrong about that? The
> topics we're really looking for the Kernel Summit are ones that require
> cross system input and which can't simply be sorted out by organising an
> Embedded mini-summit.
Hmmm... in reading this thread and thinking about it more, I'm
beginning to think that it might really be a core kernel issue; or at
least a device driver policy one. Regardless of architecture, at boot
time Linux must use some method to discover the system layout, be it:
1) Reading BIOS/EFI/ACPI/OpenFirmware/FDT data
2) Probing the bus (PCI, USB, etc)
3) Compiled into the kernel (tables of platform devices, machine specific code)
Many types of devices could end up being discovered using any of
the above methods. Ignoring for the time being the complexities and
history of the Linux UART drivers, I'm going to use 16550 serial ports
as an example. On ARM, a platform device for a 16550 serial port
may be instantiated by machine specific init code, on PowerPC it will
be discovered by a device tree parser, on a PC it could be a legacy
port, and on all three it could hang off a PCI device. The bus
connection and source of data are different in each case, but the same
core driver will handle all of them. The real differences are in
discovery and decoding the configuration.
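To make the 16550 example concrete, here is a user-space sketch of two discovery front ends feeding one core registration routine. Everything in it is an assumption for illustration: `struct uart_cfg`, `core_register_uart()`, the property names, and the legacy-PC values (0x3f8, IRQ 4, 1843200 Hz) in the static table stand in for whatever a real platform would supply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Configuration the core driver needs, whatever its source. */
struct uart_cfg {
	unsigned long base;
	int irq;
	unsigned long clock_hz;
};

#define MAX_UARTS 4
static struct uart_cfg registered[MAX_UARTS];
static size_t nr_registered;

/* The single core driver entry point shared by every discovery method. */
static int core_register_uart(const struct uart_cfg *cfg)
{
	assert(cfg != NULL);
	if (nr_registered == MAX_UARTS)
		return -1;
	registered[nr_registered++] = *cfg;
	return 0;
}

/* Front end 1: a table compiled into machine-specific code. */
static const struct uart_cfg board_uarts[] = {
	{ 0x3f8, 4, 1843200 },
};

static int discover_static(void)
{
	for (size_t i = 0; i < sizeof(board_uarts) / sizeof(board_uarts[0]); i++)
		if (core_register_uart(&board_uarts[i]))
			return -1;
	return 0;
}

/* Front end 2: name/value properties, as a firmware description might give. */
struct prop {
	const char *name;
	unsigned long value;
};

static int discover_from_props(const struct prop *props, size_t n)
{
	struct uart_cfg cfg = { 0, -1, 0 };
	int have_base = 0;

	for (size_t i = 0; i < n; i++) {
		if (!strcmp(props[i].name, "reg")) {
			cfg.base = props[i].value;
			have_base = 1;
		} else if (!strcmp(props[i].name, "interrupts")) {
			cfg.irq = (int)props[i].value;
		} else if (!strcmp(props[i].name, "clock-frequency")) {
			cfg.clock_hz = props[i].value;
		}
	}
	/* Without a base address there is nothing to register. */
	return have_base ? core_register_uart(&cfg) : -1;
}
```

Only the front ends differ; the core routine never learns where the configuration came from.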
SPI devices (struct spi_device) are possibly a better example.
spi_device drivers that need additional configuration go looking at
the platform_data pointer in the struct device. This is easy when the
device is hard coded into the kernel because the correct pdata struct
is initialized statically at build time. When the device is
discovered via one of the other mechanisms, the question remains of
where should the code live that does the translation and fills in the
correct pdata? The mmc_spi driver handles this by calling out to an
mmc_spi_get_pdata() function (drivers/spi/{mmc_spi,mmc_spi_of}.c). If
running on an OF platform, mmc_spi_get_pdata() has the knowledge to
decode the device tree data and munge it into the pdata form needed by
the driver. Both statically compiled and Device Tree described
mmc_spi configurations must be handled, and driver specific decode
methods must exist, but I don't think there is any desire to
write multiple probe routines for each device driver. The same issue
stands for i2c, MMIO, and other non-discoverable busses.
For drivers which require pdata, writing decode functions is
unavoidable, but it is unclear how to hook in that code with as little
impact on a device driver as possible. To me the issue is, where
should that code live? and how should it get executed? (which is why
I think it is a device driver policy issue) I've used the example of
OF device tree bindings vs. static configuration, but it applies just
as readily to something like UEFI (ie. I see that ARM is a member of
the UEFI forum). Here are some of my unsorted thoughts on the issue:
- Translation code is driver specific, so it should live as part of,
or in the vicinity of, the driver it works with
- My guess is that The Future(tm) will probably bring more methods of
describing machine configuration, not fewer; it is worth debating now
how to have multiple decoders for a single device driver.
- Devices on non-discoverable busses appear in both desktop and
embedded machines. (sensors anyone?) This is not just an embedded
issue.
- driver authors will probably implement only the decode methods that
they actually need. It is likely that different people will develop
additional decoders. These will need to co-exist peacefully.
- Some things are just hard and just require machine specific setup
code. Things like weird CS selections on SPI busses, or clock
routing. Decoding to pdata won't always be feasible and sometimes
machine specific hooks must be used. Need a method for machine setup
code to provide pdata.
- Binding algorithms are problematic. Naming convention in the data
source won't always match Linux internal kernel naming so there must
be some logic for matching to the correct drivers. Currently, an
exceptions table is used for i2c and spi busses (drivers/of/base.c:
of_modalias_table). There aren't many entries in there now, but I'm
not sure it is a scalable solution in the long term.
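The binding problem in the last point can be illustrated with a small name-translation helper. This is a sketch under assumptions, not the actual drivers/of/base.c code: it assumes the default rule is to strip the vendor prefix before the comma, and the exception entry, the "acme,..." names, and `compat_to_modalias()` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical exceptions: data-source name -> Linux driver name. */
struct alias_exception {
	const char *compatible;
	const char *modalias;
};

static const struct alias_exception exceptions[] = {
	{ "acme,temp-sensor-v2", "tempsensor" },
};

/*
 * Translate a "vendor,device" name into a driver name. Exceptions win;
 * otherwise the vendor prefix is stripped. Returns 0 on success, -1 if
 * the result does not fit in out.
 */
static int compat_to_modalias(const char *compat, char *out, size_t out_len)
{
	const char *name = NULL;
	const char *comma;

	assert(compat != NULL && out != NULL);

	for (size_t i = 0; i < sizeof(exceptions) / sizeof(exceptions[0]); i++)
		if (strcmp(compat, exceptions[i].compatible) == 0)
			name = exceptions[i].modalias;

	if (name == NULL) {
		comma = strchr(compat, ',');
		name = comma ? comma + 1 : compat;
	}

	if (strlen(name) >= out_len)
		return -1;
	strcpy(out, name);
	return 0;
}
```

Every device whose data-source name does not reduce to its Linux driver name needs its own row in the table, which is the scalability worry raised above.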
And some possible approaches:
1) One option is to link a list of per-device decoder methods into the
kernel so that generic bus discovery code can call the correct pdata
decoder for each device when it is discovered. Doing this ensures
that pdata is available before the driver's probe method is called and
completely isolates the driver code from the decode method. However,
it also means that the entire list of decoders must be statically
linked into the kernel, and it is of no use at all to out-of-tree
drivers because it provides no method of linking in additional decode
hooks. Many drivers used to do this in arch/powerpc, but we moved
away from it for the problems listed above.
2) It could be handled with wrapper drivers which have their own
struct device and get bound to data elements within the Device Tree or
other data source. The wrapper driver could generate the pdata and
register a child platform device which gets bound to the generic
driver. Doing it this way lets a decode method live in a module.
This works well for things which are currently struct platform_device,
and ensures that all data is available at driver probe time, but
doesn't fit well into the structure of other bus types (SPI, I2C,
MDIO, etc) without creating an indirection layer for decoding on each
bus. It also means that for every real device, 2 struct device get
registered; one to bind against the decoder, and one to bind against
the real driver. Maybe this isn't a significant memory consumption;
but it doesn't *feel* right to me.
3) The decoders could be linked into the drivers themselves. The
mmc_spi driver uses this approach, though it is a bit crude, and the
mmc_spi_get_pdata routine must be modified for each. I've been
thinking about the possibility of having a decoder function list
attached to the driver and using a common helper function to walk the
list until one of them is able to provide valid pdata. This would
keep the decode method with the device driver (where it belongs
IMNSHO), but minimize the impact on the core of the driver as a whole
(only a function would be added). But this is still an impact on the
driver, which may meet resistance.
4) Write separate probe() routines for each type of discovery (static
pdata, device tree, etc). This solves the same problems as 3, but I think
it results in more code and possibly a bunch of #ifdef'ry.
5) other options I haven't thought of? ....
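Option 3, a decoder list walked by a common helper, could look roughly like the following user-space sketch. It is not an existing kernel interface: `struct example_pdata`, `struct example_device`, the "detect-gpio=" text format, and `get_pdata()` are all hypothetical stand-ins chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical driver-specific platform data. */
struct example_pdata {
	int detect_gpio;
};

/* Hypothetical device: pdata may come from a board file or from firmware. */
struct example_device {
	const struct example_pdata *static_pdata;	/* board file, or NULL */
	const char *fw_props;				/* firmware text, or NULL */
};

typedef int (*pdata_decoder)(const struct example_device *dev,
			     struct example_pdata *out);

/* Decoder 1: pdata was compiled in statically. */
static int decode_static(const struct example_device *dev,
			 struct example_pdata *out)
{
	if (dev->static_pdata == NULL)
		return -1;
	*out = *dev->static_pdata;
	return 0;
}

/* Decoder 2: pdata is parsed from a firmware-provided description. */
static int decode_fw_props(const struct example_device *dev,
			   struct example_pdata *out)
{
	int gpio;

	if (dev->fw_props == NULL ||
	    sscanf(dev->fw_props, "detect-gpio=%d", &gpio) != 1)
		return -1;
	out->detect_gpio = gpio;
	return 0;
}

/* The list lives with the driver; new decoders are appended here. */
static const pdata_decoder decoders[] = { decode_static, decode_fw_props };

/* Common helper: try each decoder until one provides valid pdata. */
static int get_pdata(const struct example_device *dev,
		     struct example_pdata *out)
{
	assert(dev != NULL && out != NULL);
	for (size_t i = 0; i < sizeof(decoders) / sizeof(decoders[0]); i++)
		if (decoders[i](dev, out) == 0)
			return 0;
	return -1;
}
```

With this shape the driver's probe routine makes a single `get_pdata()` call, and adding a decoder for another data source touches only the list.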
All of the options listed above have been talked about and
implemented to a lesser or greater degree without really coming to
much of a conclusion on how it should be approached and what the
impact on device drivers will be. It is worth some debate.
> Now if flattened device tree could help us out with BIOS, ACPI, EFI and
> the other myriad boot and identification standards that seem designed to
> hide system information rather than reveal it, then we might be all
> ears ...
:-)
Interesting, but probably not much help here. This would just be
translating (and imperfectly at that) from one machine representation
to another without a whole lot of benefit. It is conceivable that
data sourced from multiple locations (probing, ACPI, EFI, and known
quirks) could be all funneled into a single FDT image and then that
data used for creating and registering device structures, but I don't
really see any benefit here.
g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Mel Gorman @ 2009-06-04 9:23 UTC
To: David VomLehn (dvomlehn)
Cc: Andrew Morton, Russell King, James.Bottomley, linux-arch,
linux-embedded, ksummit-2009-discuss
In-Reply-To: <FF038EB85946AA46B18DFEE6E6F8A2890151520C@xmb-rtp-218.amer.cisco.com>
On Wed, Jun 03, 2009 at 11:11:13PM -0400, David VomLehn (dvomlehn) wrote:
> David Delaney has a proof-of-concept of an idea of his which was
> presented at the last CELF, which is basically to put the kernel and
> loadable kernel modules closely enough together that you can avoid the
> use of long jumps. He sees a better than 1% improvement in performance,
> which we've duplicated using a slightly different approach. This is nice
> payback for little work and, though it doesn't help on all processors,
> it helps on several.
>
> The problem is: how do you allocate memory with the magical "close to
> the kernel" attribute? We have something that adds a new ZONE_KERNEL
> (this name has some problems, actually).
As this is about addresses of text, I imagine that you really care about the
virtual address the module is loaded at which is what the virtual address
space is responsible for and not the physical addresses which zones are
concerned with. Would that be right?
If that were the case, this could potentially be done by moving where
the vmalloc address space is located or possibly splitting it in two. By
locating some portion of the vmalloc address space above the kernel
image, the kernel modules could be loaded in there.
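Whether a module placed at a given virtual address can avoid long jumps comes down to a distance check. The sketch below is a simplified user-space illustration: it treats the branch range as symmetric, `module_reachable()` is a hypothetical name, and the range is a parameter because it differs per architecture (an ARM BL, for example, reaches roughly 32 MB either way).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if every address in the module can reach every address in the
 * kernel text (and vice versa) with a short branch, i.e. the span that
 * covers both regions is no larger than the branch range.
 */
static bool module_reachable(uintptr_t ktext_start, uintptr_t ktext_end,
			     uintptr_t mod_start, uintptr_t mod_end,
			     uintptr_t range)
{
	uintptr_t lo = ktext_start < mod_start ? ktext_start : mod_start;
	uintptr_t hi = ktext_end > mod_end ? ktext_end : mod_end;

	assert(ktext_start < ktext_end && mod_start < mod_end);
	return hi - lo <= range;
}
```

A module loader could use such a check to decide whether a candidate virtual address is acceptable for a module built without long calls.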
The DMA problem is different: it really requires particular physical
addresses. That no one has tried to implement Andrew's suggestion is a bit of
a surprise because, as far as I can see, it would do the job; it is just not
an issue that hurts me, so I never sat down to try implementing it. Granted,
increasing the number of zones adds its own problem but it's for large numbers
of zones and there are other things that could be done to the allocator to
reduce its cache footprint. The big plus is that it plays very well with
reclaim and I would expect it to perform better than searching the
free lists for a suitable page which would be a bit of a hatchet job.
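For comparison, the trial-allocation approach Russell describes in the quoted text below (allocate, check the address against the mask, retry from the DMA zone) can be sketched in user-space C. This is an illustration only: the page pools are hypothetical stand-ins for the real allocator, and `phys_fits_mask()` and `alloc_for_mask()` are names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A page is usable if it sets no address bit outside the device's mask.
 * The mask need not be contiguous: ones, zeros, then ones is allowed. */
static bool phys_fits_mask(uint32_t phys, uint32_t dma_mask)
{
	return (phys & ~dma_mask) == 0;
}

/* Hypothetical stand-in for an allocator: hands out listed pages in order. */
struct page_pool {
	const uint32_t *pages;
	size_t count;
	size_t next;
};

static bool pool_alloc(struct page_pool *pool, uint32_t *phys)
{
	assert(pool != NULL && phys != NULL);
	if (pool->next >= pool->count)
		return false;
	*phys = pool->pages[pool->next++];
	return true;
}

/* Trial allocation: try the normal pool, fall back to the DMA pool. */
static bool alloc_for_mask(struct page_pool *normal, struct page_pool *dma,
			   uint32_t dma_mask, uint32_t *phys)
{
	if (pool_alloc(normal, phys)) {
		if (phys_fits_mask(*phys, dma_mask))
			return true;
		/* A real allocator would free the rejected page here. */
	}
	return pool_alloc(dma, phys) && phys_fits_mask(*phys, dma_mask);
}
```

The one-zone-per-address-bit idea would replace this try-and-retry loop with a direct request for pages that satisfy the mask.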
> It seems like a pretty good
> solution if you look at zones as conceptually concentric usages, but
> with the current zone implementation, each zone must be contiguous. So,
> if we're talking about changing what zones are done, I'd like to throw
> this into the pot.
>
> > -----Original Message-----
> > From: linux-embedded-owner@vger.kernel.org
> > [mailto:linux-embedded-owner@vger.kernel.org] On Behalf Of
> > Andrew Morton
> > Sent: Wednesday, June 03, 2009 11:44 AM
> > To: Russell King
> > Cc: James.Bottomley@HansenPartnership.com;
> > linux-arch@vger.kernel.org; linux-embedded@vger.kernel.org;
> > ksummit-2009-discuss@lists.linux-foundation.org
> > Subject: Re: [Ksummit-2009-discuss] Representing Embedded
> > Architectures at the Kernel Summit
> >
> > On Wed, 3 Jun 2009 18:09:25 +0100
> > Russell King <rmk@arm.linux.org.uk> wrote:
> >
> > > In
> > > fact, on ARM the DMA mask is exactly that - it's a 100%
> > proper mask. It's
> > > not a bunch of zeros in the MSB followed by a bunch of ones
> > down to the
> > > LSB. It can be a bunch of ones, a bunch of zeros, followed
> > by a bunch of
> > > ones.
> > >
> > > The way we occasionally have to deal with this is to trial
> > an allocation,
> > > see if the physical address fits, if not free the page and
> > try again with
> > > GFP_DMA set.
> >
> > A couple of times I've suggested that we have the ability to allocate
> > one zone per address bit, so a 32-bit machine with 4k pages would end
> > up having 20 zones. Then, your funny DMA mask can be directly passed
> > into the page allocator as a zone mask and voila, I think.
> >
> > > There's many stories I've heard on what is supposed to take
> > care of the
> > > coherency that I now just close my ears to the problem and chant "it
> > > doesn't exist, people aren't seeing it, mainline folk just
> > don't give
> > > a damn". Really. It is a problem on _some_ ARM devices
> > and has been
> > > for several years now, and I've 100% given up caring about it.
> >
> > I wasn't even aware that there was an issue here. Please don't blame
> > "mainline folk" for something they weren't told about!
> >
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Mike Frysinger @ 2009-06-04 3:24 UTC (permalink / raw)
To: David VomLehn (dvomlehn)
Cc: Andrew Morton, Russell King, James.Bottomley, linux-arch,
linux-embedded, ksummit-2009-discuss
In-Reply-To: <FF038EB85946AA46B18DFEE6E6F8A2890151520C@xmb-rtp-218.amer.cisco.com>
On Wed, Jun 3, 2009 at 23:11, David VomLehn (dvomlehn) wrote:
> David Delaney has a proof-of-concept of an idea of his which was
> presented at the last CELF, which is basically to put the kernel and
> loadable kernel modules closely enough together that you can avoid the
> use of long jumps. He sees a better than 1% improvement in performance,
> which we've duplicated using a slightly different approach. This is nice
> payback for little work and, though it doesn't help on all processors,
> it helps on several.
it would help on the Blackfin architecture. we compile all kernel
modules with -mlong-calls because of this issue.
> The problem is: how do you allocate memory with the magical "close to
> the kernel" attribute? We have something that adds a new ZONE_KERNEL
> (this name has some problems, actually). It seems like a pretty good
> solution if you look at zones as conceptually concentric usages, but
> with the current zone implementation, each zone must be contiguous. So,
> if we're talking about changing how zones are done, I'd like to throw
> this into the pot.
what do you do if the alloc fails ? return back to userspace with
something like ENOMEM and have it retry with a module that was
compiled with -mlong-calls ?
-mike
* RE: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: David VomLehn (dvomlehn) @ 2009-06-04 3:11 UTC (permalink / raw)
To: Andrew Morton, Russell King
Cc: James.Bottomley, linux-arch, linux-embedded, ksummit-2009-discuss
In-Reply-To: <20090603114344.bf852654.akpm@linux-foundation.org>
David Delaney has a proof-of-concept of an idea of his which was
presented at the last CELF, which is basically to put the kernel and
loadable kernel modules closely enough together that you can avoid the
use of long jumps. He sees a better than 1% improvement in performance,
which we've duplicated using a slightly different approach. This is nice
payback for little work and, though it doesn't help on all processors,
it helps on several.
The problem is: how do you allocate memory with the magical "close to
the kernel" attribute? We have something that adds a new ZONE_KERNEL
(this name has some problems, actually). It seems like a pretty good
solution if you look at zones as conceptually concentric usages, but
with the current zone implementation, each zone must be contiguous. So,
if we're talking about changing how zones are done, I'd like to throw
this into the pot.
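
As a rough illustration of the "close to the kernel" constraint, the sketch below checks whether a module placed at a given address could still reach every kernel address with a short branch. It is a userspace Python model, not kernel code: the function names are made up, and the 16 MiB reach is a placeholder assumption rather than a statement about any particular CPU.

```python
# Hypothetical short-branch reach; real values are architecture specific.
ASSUMED_BRANCH_REACH = 16 * 1024 * 1024

def can_use_short_calls(kernel_start, kernel_end, module_start, module_end,
                        branch_reach=ASSUMED_BRANCH_REACH):
    """True if the kernel and the module both fit inside one branch_reach window."""
    span = max(kernel_end, module_end) - min(kernel_start, module_start)
    return span <= branch_reach

def pick_module_build(kernel_start, kernel_end, module_start, module_end,
                      branch_reach=ASSUMED_BRANCH_REACH):
    """Choose which flavour of module would be loadable at this address."""
    if can_use_short_calls(kernel_start, kernel_end, module_start, module_end,
                           branch_reach):
        return "short-call"
    return "long-call"

# Kernel text at 1 MiB..5 MiB, module placed just above it or far away.
print(pick_module_build(0x00100000, 0x00500000, 0x00600000, 0x00610000))
print(pick_module_build(0x00100000, 0x00500000, 0x08000000, 0x08010000))
```

The second call models the failure case: if no memory near the kernel is available, the only option in this model is a module built with long calls.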
> -----Original Message-----
> From: linux-embedded-owner@vger.kernel.org
> [mailto:linux-embedded-owner@vger.kernel.org] On Behalf Of
> Andrew Morton
> Sent: Wednesday, June 03, 2009 11:44 AM
> To: Russell King
> Cc: James.Bottomley@HansenPartnership.com;
> linux-arch@vger.kernel.org; linux-embedded@vger.kernel.org;
> ksummit-2009-discuss@lists.linux-foundation.org
> Subject: Re: [Ksummit-2009-discuss] Representing Embedded
> Architectures at the Kernel Summit
>
> On Wed, 3 Jun 2009 18:09:25 +0100
> Russell King <rmk@arm.linux.org.uk> wrote:
>
> > In
> > fact, on ARM the DMA mask is exactly that - it's a 100% proper mask. It's
> > not a bunch of zeros in the MSB followed by a bunch of ones down to the
> > LSB. It can be a bunch of ones, a bunch of zeros, followed by a bunch of
> > ones.
> >
> > The way we occasionally have to deal with this is to trial an allocation,
> > see if the physical address fits, if not free the page and try again with
> > GFP_DMA set.
>
> A couple of times I've suggested that we have the ability to allocate
> one zone per address bit, so a 32-bit machine with 4k pages would end
> up having 20 zones. Then, your funny DMA mask can be directly passed
> into the page allocator as a zone mask and voila, I think.
>
> > There's many stories I've heard on what is supposed to take care of the
> > coherency that I now just close my ears to the problem and chant "it
> > doesn't exist, people aren't seeing it, mainline folk just don't give
> > a damn". Really. It is a problem on _some_ ARM devices and has been
> > for several years now, and I've 100% given up caring about it.
>
> I wasn't even aware that there was an issue here. Please don't blame
> "mainline folk" for something they weren't told about!
>
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Catalin Marinas @ 2009-06-03 19:08 UTC (permalink / raw)
To: James Bottomley; +Cc: ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1244045997.21423.303.camel@mulgrave.site>
On Wed, 2009-06-03 at 12:19 -0400, James Bottomley wrote:
> On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
> > On Tue, 2009-06-02 at 15:22 +0000, James Bottomley wrote:
> > > So what we're looking for is a proposal to discuss the issues
> > > most affecting embedded architectures, or preview any features affecting
> > > the main kernel which embedded architectures might need ... or any other
> > > topics from embedded architectures which might need discussion or
> > > debate.
> >
> > Some issues that come up on embedded systems (and not only):
> >
> > * Multiple coherency domains for devices - the system may have
> > multiple bus levels, coherency ports, cache levels etc. Some
> > devices in the system (but not all) may be able to "see" various
> > cache levels but the DMA API (at least on ARM) cannot handle
> > this. It may be useful to discuss how other embedded
> > architectures handle this and come up with a unified solution
>
> So this is partially what the dma_sync_for_{device|cpu} is supposed to
> be helping with. By and large, the DMA API tries to hide the
> complexities of coherency domains from the user. The actual API, as far
> as it goes, seems to do this OK.
Yes, the dma_sync_* API is probably OK. The actual implementation should
become aware of various coherency domains on the same system (it could
hold this information in one of the bus-related structures). Currently,
devices that can access the CPU (inner or outer) cache have drivers
modified to avoid calling the dma_sync_* functions (since other devices
need such functions).
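
One way to picture an implementation that knows about coherency domains is a per-device attribute consulted inside the sync call, so that drivers never have to skip the call themselves. The sketch below is a userspace Python model under that assumption; `Device`, `sees_cpu_cache` and `model_dma_sync_for_device` are hypothetical names, not the kernel's DMA API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Device:
    name: str
    sees_cpu_cache: bool  # hypothetical attribute, e.g. stored in a bus structure

def model_dma_sync_for_device(dev: Device, clean_cache: Callable[[], None]) -> bool:
    """Perform cache maintenance only for devices outside the CPU cache domain.

    Returns True if maintenance was performed.
    """
    if dev.sees_cpu_cache:
        return False  # device snoops the cache, nothing to do
    clean_cache()
    return True

log = []
for dev in (Device("eth0", False), Device("gpu0", True)):
    done = model_dma_sync_for_device(dev, lambda: log.append("clean"))
    print(dev.name, "maintenance performed" if done else "skipped")
```

With this shape, both kinds of device call the same function and the decision lives in one place.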
If other embedded architectures face similar issues, it is worth
discussing and maybe coming up with a common solution (of course, like
most topics, they could simply be discussed on the mailing lists rather
than at the KS).
> > * Better support for coherent DMA mask - currently ZONE_DMA is
> > assumed to be in the bottom part of the memory which isn't
> > always the case. Enabling NUMA may help but it is overkill for
> > some systems. As above, a more unified solution across
> > architectures would help
>
> So ZONE_DMA and coherent memory allocation as represented by the
> coherent mask are really totally separate things. The idea of ZONE_DMA
> was really that if you had an ISA device, allocations from ZONE_DMA
> would be able to access the allocated memory without bouncing. Since
> ISA is really going away, this definition has been hijacked. If your
> problem is just that you need memory allocated on a certain physical
> mask and neither GFP_DMA nor GFP_DMA32 cuts it for you, then we could
> revisit the kmalloc_mask() proposal again ... but the consensus last
> time was that no-one really had a compelling use case that couldn't be
> covered by GFP_DMA32.
Russell already commented on this. As an example, I have a platform with
two blocks of RAM - 512MB @ 0x20000000 and 512MB @ 0x70000000 - but only
the higher one allows DMA.
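
To see why a "memory below some limit" scheme cannot describe this layout, the Python sketch below computes the smallest mask of the form 2**n - 1 that covers the DMA-capable bank and checks whether it also admits the other bank. It is illustrative arithmetic only; the helper names are made up, and the model assumes masks that are a contiguous run of low bits.

```python
MIB = 1 << 20

# Bank layout from the example above: (base address, size in bytes).
LOW_BANK = (0x20000000, 512 * MIB)   # not DMA-capable
HIGH_BANK = (0x70000000, 512 * MIB)  # DMA-capable

def last_byte(bank):
    base, size = bank
    return base + size - 1

def min_prefix_mask(bank):
    """Smallest mask of the form 2**n - 1 that covers every byte of the bank."""
    return (1 << last_byte(bank).bit_length()) - 1

def mask_covers(mask, bank):
    """True if the whole bank lies at or below the mask's top address."""
    return last_byte(bank) <= mask

mask = min_prefix_mask(HIGH_BANK)
print(hex(mask))                    # 0xffffffff
print(mask_covers(mask, LOW_BANK))  # True
```

Any low-bits mask wide enough for the high bank therefore also matches the low bank, so the restriction has to be expressed some other way.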
> > * PIO block devices and non-coherent hardware - code like mpage.c
> > assumes that the either the hardware is coherent or the device
> > driver performs the cache flushing. The latter is true for
> > DMA-capable device but not for PIO. The issue becomes visible
> > with write-allocate caches and the device driver may not have
> > the struct page information to call flush_dcache_page(). A
> > proposed solution on the ARM lists was to differentiate (via
> > some flags) between PIO and DMA block devices and use this
> > information in mpage.c
>
> flush_dcache_page() is supposed to be for making the data visible to the
> user ... that coherency is supposed to be managed by the block layer.
I'm referring to kernel<->user coherency issues and yes,
flush_dcache_page() is the function supposed to handle this. It's only
that it isn't always called in the block or VFS layers (for example, to
be able to use ext2 over CompactFlash using PATA I had to add a hack so
that flush_dcache_page() is called from mpage_end_io_read()).
Some devices like Russell's mmci.c use scatter lists and they have
access to the page structure and perform the flushing. I noticed that
for some block devices you can't easily retrieve the page structure (I
would need to check the code for more precise references). But if the
driver is somehow marked as PIO, the VFS layer could ensure the
coherency.
> > * Mixed endianness devices in the same system - this may only need
> > dedicated readl_be/writel_be etc. macros but it could also be
> > done by having bus-aware readl/writel-like macros
>
> We have ioreadXbe for this exact case (similar problem on parisc)
OK, probably not worth a new topic. As it was mentioned on
linux-embedded already, it may just need better documentation (there is no
reference to ioread* in Documentation/ and most devices seem to use
readl/writel etc.).
> > * Asymmetric MP:
> > * Different CPU frequencies
> > * Different CPU features (e.g. floating point only on
> > some CPUs): scheduler awareness, per-CPU hwcap bits (in
> > case user space wants to set the affinity)
> > * Asymmetric workload balancing for power consumption (may
> > be better to load 1 CPU at 60% than 4 at 15%)
>
> This actually just works(tm) for me on a voyager system running SMP with
> a mixed 486/586 set of processors ... what's the problem? The only
> issue I see is that you have to set the capabilities of the boot CPU to
> the intersection of the mixture otherwise setup goes wrong, but
> otherwise it seems to work OK.
You can set the capabilities to the intersection of the CPU features but
that's not the goal. We'll see multiprocessor systems with only one (out
of 2, 4 etc.) of the CPUs having some features (like media processing
instructions). That's the case on embedded where the number of gates is
limited and the battery saving is important but you want to use the
extra features and not limit them. So the code I currently have for such
a configuration traps the undefined instructions and sets the CPU
affinity of the faulting threads (the affinity could be reset after some
time). Could it be done better? I think that's worth discussing.
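
The difference between the two approaches can be sketched with per-CPU feature bitmasks: the intersection hides any feature that is not universal, while an affinity mask keeps the feature usable on the CPUs that have it. This is a userspace Python model; the feature bits and function names are invented for illustration and do not correspond to real hwcap values.

```python
from functools import reduce

# Hypothetical feature bits.
FEATURE_FPU = 1 << 0
FEATURE_MEDIA = 1 << 1

def common_features(per_cpu_features):
    """Features present on every CPU (the 'intersection' approach)."""
    return reduce(lambda a, b: a & b, per_cpu_features)

def cpus_with(per_cpu_features, feature):
    """Affinity bitmask (bit N = CPU N) of the CPUs implementing `feature`."""
    mask = 0
    for cpu, feats in enumerate(per_cpu_features):
        if (feats & feature) == feature:
            mask |= 1 << cpu
    return mask

# Four CPUs, only CPU 0 has the media instructions.
per_cpu = [FEATURE_FPU | FEATURE_MEDIA, FEATURE_FPU, FEATURE_FPU, FEATURE_FPU]

print(common_features(per_cpu) == FEATURE_FPU)  # True: media is hidden
print(bin(cpus_with(per_cpu, FEATURE_MEDIA)))   # 0b1: pin to CPU 0
```

In the trap-based scheme, the mask returned by `cpus_with` is what a faulting thread's affinity would be set to.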
--
Catalin
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: James Bottomley @ 2009-06-03 19:01 UTC (permalink / raw)
To: Andrew Morton
Cc: Russell King, linux-arch, linux-embedded, ksummit-2009-discuss
In-Reply-To: <20090603114344.bf852654.akpm@linux-foundation.org>
On Wed, 2009-06-03 at 11:43 -0700, Andrew Morton wrote:
> On Wed, 3 Jun 2009 18:09:25 +0100
> Russell King <rmk@arm.linux.org.uk> wrote:
>
> > In
> > fact, on ARM the DMA mask is exactly that - it's a 100% proper mask. It's
> > not a bunch of zeros in the MSB followed by a bunch of ones down to the
> > LSB. It can be a bunch of ones, a bunch of zeros, followed by a bunch of
> > ones.
> >
> > The way we occasionally have to deal with this is to trial an allocation,
> > see if the physical address fits, if not free the page and try again with
> > GFP_DMA set.
>
> A couple of times I've suggested that we have the ability to allocate
> one zone per address bit, so a 32-bit machine with 4k pages would end
> up having 20 zones. Then, your funny DMA mask can be directly passed
> into the page allocator as a zone mask and voila, I think.
The objection I heard to that one is that the zone machinery works
better with fewer zones ... but we could certainly align them along
known boundaries for allocations (if it's only bit X that's the problem,
say, you only need an additional zone covering that one).
Based on this, I dug up the initial proposal, it was the Ottawa Kernel
Summit in 2005 (I'm a packrat; I keep all my old presentations):
http://www.hansenpartnership.com/sites/hansenpartnership.com/files/jejb/kernel_summit_iommu.pdf
kmalloc_mask() is right at the end. It basically died for lack of
interest and the fact that GFP_DMA32 satisfied 99% of the actual use
cases.
James
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Andrew Morton @ 2009-06-03 18:43 UTC (permalink / raw)
To: Russell King
Cc: James.Bottomley, linux-arch, linux-embedded, ksummit-2009-discuss
In-Reply-To: <20090603170925.GA8330@flint.arm.linux.org.uk>
On Wed, 3 Jun 2009 18:09:25 +0100
Russell King <rmk@arm.linux.org.uk> wrote:
> In
> fact, on ARM the DMA mask is exactly that - it's a 100% proper mask. It's
> not a bunch of zeros in the MSB followed by a bunch of ones down to the
> LSB. It can be a bunch of ones, a bunch of zeros, followed by a bunch of
> ones.
>
> The way we occasionally have to deal with this is to trial an allocation,
> see if the physical address fits, if not free the page and try again with
> GFP_DMA set.
A couple of times I've suggested that we have the ability to allocate
one zone per address bit, so a 32-bit machine with 4k pages would end
up having 20 zones. Then, your funny DMA mask can be directly passed
into the page allocator as a zone mask and voila, I think.
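
The zone count in that suggestion is simply the number of address bits minus the number of page-offset bits. A minimal Python sketch of the arithmetic (illustrative only, not kernel code; `zones_per_address_bit` is a made-up name):

```python
def zones_per_address_bit(address_bits: int, page_size: int) -> int:
    """Zones needed if every address bit above the page offset gets its own zone."""
    page_shift = page_size.bit_length() - 1  # log2 for a power of two
    if (1 << page_shift) != page_size:
        raise ValueError("page size must be a power of two")
    return address_bits - page_shift

print(zones_per_address_bit(32, 4096))  # 32 - 12 = 20
```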
> There's many stories I've heard on what is supposed to take care of the
> coherency that I now just close my ears to the problem and chant "it
> doesn't exist, people aren't seeing it, mainline folk just don't give
> a damn". Really. It is a problem on _some_ ARM devices and has been
> for several years now, and I've 100% given up caring about it.
I wasn't even aware that there was an issue here. Please don't blame
"mainline folk" for something they weren't told about!
* can initrd and initramfs be made independent?
From: Robert P. J. Day @ 2009-06-03 17:10 UTC (permalink / raw)
To: Embedded Linux mailing list
as it stands, when configuring the kernel, you can't select
initramfs capability without also selecting initrd. but isn't it
feasible that you might want initramfs and have no need for initrd?
if you make those selections independent, the help info suggests you
can save 15Kb by not having RAM disk (initrd) support.
or is there an interdependence here that's not so obvious?
rday
--
========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA
Linux Consulting, Training and Annoying Kernel Pedantry.
Web page: http://crashcourse.ca
Linked In: http://www.linkedin.com/in/rpjday
Twitter: http://twitter.com/rpjday
========================================================================
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Russell King @ 2009-06-03 17:09 UTC (permalink / raw)
To: James Bottomley
Cc: Catalin Marinas, ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1244045997.21423.303.camel@mulgrave.site>
On Wed, Jun 03, 2009 at 12:19:57PM -0400, James Bottomley wrote:
> On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
> > * Better support for coherent DMA mask - currently ZONE_DMA is
> > assumed to be in the bottom part of the memory which isn't
> > always the case. Enabling NUMA may help but it is overkill for
> > some systems. As above, a more unified solution across
> > architectures would help
>
> So ZONE_DMA and coherent memory allocation as represented by the
> coherent mask are really totally separate things. The idea of ZONE_DMA
> was really that if you had an ISA device, allocations from ZONE_DMA
> would be able to access the allocated memory without bouncing. Since
> ISA is really going away, this definition has been hijacked. If your
> problem is just that you need memory allocated on a certain physical
> mask and neither GFP_DMA nor GFP_DMA32 cuts it for you, then we could
> revisit the kmalloc_mask() proposal again ... but the consensus last
> time was that no-one really had a compelling use case that couldn't be
> covered by GFP_DMA32.
I'm not aware of such a discussion; I keep running into issues here. In
fact, on ARM the DMA mask is exactly that - it's a 100% proper mask. It's
not a bunch of zeros in the MSB followed by a bunch of ones down to the
LSB. It can be a bunch of ones, a bunch of zeros, followed by a bunch of
ones.
The way we occasionally have to deal with this is to trial an allocation,
see if the physical address fits, if not free the page and try again with
GFP_DMA set.
We do certain checks on the DMA mask - notably that a GFP_DMA allocation
will satisfy the mask which has been passed.
I've never submitted the patch which does this in the ARM coherent DMA
allocator, but it's something that occasionally crops up as being
necessary - I've always thought the allocate-by-mask stuff would
eventually be merged.
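
The trial-allocation fallback described above can be modelled in a few lines. This is a userspace Python sketch, not the unsubmitted ARM patch: the allocator callbacks, the function names and the example mask 0xF0FFFFFF (ones, zeros, ones) are all assumptions chosen for illustration.

```python
def addr_fits_mask(phys_addr: int, dma_mask: int) -> bool:
    """True if the address has no bit set outside the (possibly sparse) mask."""
    return (phys_addr & ~dma_mask) == 0

def alloc_page_for_mask(alloc_normal, alloc_dma, free_page, dma_mask):
    """Try a normal allocation first, then fall back to the DMA allocator.

    alloc_normal/alloc_dma return a physical address or None;
    free_page releases an address that turned out to be unusable.
    """
    for alloc in (alloc_normal, alloc_dma):
        addr = alloc()
        if addr is None:
            continue
        if addr_fits_mask(addr, dma_mask):
            return addr
        free_page(addr)  # wrong address bits: give the page back
    return None

mask = 0xF0FFFFFF                      # non-contiguous: bits 24..27 must be zero
freed = []
page = alloc_page_for_mask(lambda: 0x01234000,  # normal zone: hits a forbidden bit
                           lambda: 0x00234000,  # DMA zone: fits
                           freed.append, mask)
print(hex(page), [hex(a) for a in freed])
```

The final check on the fallback result mirrors the point that a GFP_DMA allocation is itself expected to satisfy the mask.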
> > * PIO block devices and non-coherent hardware - code like mpage.c
> > assumes that either the hardware is coherent or the device
> > driver performs the cache flushing. The latter is true for
> > DMA-capable device but not for PIO. The issue becomes visible
> > with write-allocate caches and the device driver may not have
> > the struct page information to call flush_dcache_page(). A
> > proposed solution on the ARM lists was to differentiate (via
> > some flags) between PIO and DMA block devices and use this
> > information in mpage.c
>
> flush_dcache_page() is supposed to be for making the data visible to the
> user ... that coherency is supposed to be managed by the block layer.
> The DMA API is specifically aimed at device to kernel space
> coherency ... although if you line up all your aliases, that can also be
> device to userspace. Technically though we have two separate APIs for
> user<->kernel coherency and device<->kernel coherency. What's the path
> you're seeing this problem down? SG_IO to a device doing PIO should be
> handling this correctly.
There's many stories I've heard on what is supposed to take care of the
coherency that I now just close my ears to the problem and chant "it
doesn't exist, people aren't seeing it, mainline folk just don't give
a damn". Really. It is a problem on _some_ ARM devices and has been
for several years now, and I've 100% given up caring about it.
So people who see the problem just have to suffer with it, and they have
to accept that the Linux kernel sucks with PIO on ARM hardware.
Unless they use a driver I've written which has the necessary callbacks
in it to ensure cache coherency (like MMC). IDE... forget it.
Yes, that taste you're experiencing is my bitterness on this subject.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: James Bottomley @ 2009-06-03 16:19 UTC (permalink / raw)
To: Catalin Marinas; +Cc: ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1244034286.24482.84.camel@pc1117.cambridge.arm.com>
On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
> Hi,
>
> On Tue, 2009-06-02 at 15:22 +0000, James Bottomley wrote:
> > So what we're looking for is a proposal to discuss the issues
> > most affecting embedded architectures, or preview any features affecting
> > the main kernel which embedded architectures might need ... or any other
> > topics from embedded architectures which might need discussion or
> > debate.
>
> Some issues that come up on embedded systems (and not only):
>
> * Multiple coherency domains for devices - the system may have
> multiple bus levels, coherency ports, cache levels etc. Some
> devices in the system (but not all) may be able to "see" various
> cache levels but the DMA API (at least on ARM) cannot handle
> this. It may be useful to discuss how other embedded
> architectures handle this and come up with a unified solution
So this is partially what the dma_sync_for_{device|cpu} is supposed to
be helping with. By and large, the DMA API tries to hide the
complexities of coherency domains from the user. The actual API, as far
as it goes, seems to do this OK. We have synchronisation issues that
mmiowb() and friends help with ... what's the actual problem here?
> * Better support for coherent DMA mask - currently ZONE_DMA is
> assumed to be in the bottom part of the memory which isn't
> always the case. Enabling NUMA may help but it is overkill for
> some systems. As above, a more unified solution across
> architectures would help
So ZONE_DMA and coherent memory allocation as represented by the
coherent mask are really totally separate things. The idea of ZONE_DMA
was really that if you had an ISA device, allocations from ZONE_DMA
would be able to access the allocated memory without bouncing. Since
ISA is really going away, this definition has been hijacked. If your
problem is just that you need memory allocated on a certain physical
mask and neither GFP_DMA nor GFP_DMA32 cuts it for you, then we could
revisit the kmalloc_mask() proposal again ... but the consensus last
time was that no-one really had a compelling use case that couldn't be
covered by GFP_DMA32.
> * PIO block devices and non-coherent hardware - code like mpage.c
> assumes that either the hardware is coherent or the device
> driver performs the cache flushing. The latter is true for
> DMA-capable device but not for PIO. The issue becomes visible
> with write-allocate caches and the device driver may not have
> the struct page information to call flush_dcache_page(). A
> proposed solution on the ARM lists was to differentiate (via
> some flags) between PIO and DMA block devices and use this
> information in mpage.c
flush_dcache_page() is supposed to be for making the data visible to the
user ... that coherency is supposed to be managed by the block layer.
The DMA API is specifically aimed at device to kernel space
coherency ... although if you line up all your aliases, that can also be
device to userspace. Technically though we have two separate APIs for
user<->kernel coherency and device<->kernel coherency. What's the path
you're seeing this problem down? SG_IO to a device doing PIO should be
handling this correctly.
> * Mixed endianness devices in the same system - this may only need
> dedicated readl_be/writel_be etc. macros but it could also be
> done by having bus-aware readl/writel-like macros
We have ioreadXbe for this exact case (similar problem on parisc)
> * Asymmetric MP:
> * Different CPU frequencies
> * Different CPU features (e.g. floating point only on
> some CPUs): scheduler awareness, per-CPU hwcap bits (in
> case user space wants to set the affinity)
> * Asymmetric workload balancing for power consumption (may
> be better to load 1 CPU at 60% than 4 at 15%)
This actually just works(tm) for me on a voyager system running SMP with
a mixed 486/586 set of processors ... what's the problem? The only
issue I see is that you have to set the capabilities of the boot CPU to
the intersection of the mixture otherwise setup goes wrong, but
otherwise it seems to work OK.
James
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Josh Boyer @ 2009-06-03 14:11 UTC (permalink / raw)
To: Catalin Marinas; +Cc: linux-embedded
In-Reply-To: <1244036737.24482.96.camel@pc1117.cambridge.arm.com>
On Wed, Jun 03, 2009 at 02:45:37PM +0100, Catalin Marinas wrote:
>On Wed, 2009-06-03 at 09:18 -0400, Josh Boyer wrote:
>> On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
>> > * Mixed endianness devices in the same system - this may only need
>> > dedicated readl_be/writel_be etc. macros but it could also be
>> > done by having bus-aware readl/writel-like macros
>>
>> ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?
>
>Yes, but there are many drivers that only use readl/writel (and
>arch/arm makes the assumption, maybe correctly, that this is little
>endian only).
readl/writel are little-endian only.
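
What that means for a big-endian device can be shown by decoding the same four register bytes both ways. The Python sketch below only models the byte-order interpretation; `read32_le` and `read32_be` are made-up stand-ins for a little-endian accessor such as readl and a big-endian one such as ioread32be, not the kernel functions themselves.

```python
import struct

def read32_le(regs: bytes, offset: int) -> int:
    """Interpret 4 bytes as a little-endian 32-bit value."""
    return struct.unpack_from("<I", regs, offset)[0]

def read32_be(regs: bytes, offset: int) -> int:
    """Interpret 4 bytes as a big-endian 32-bit value."""
    return struct.unpack_from(">I", regs, offset)[0]

# A device register holding the bytes 12 34 56 78 in ascending address order.
regs = bytes([0x12, 0x34, 0x56, 0x78])
print(hex(read32_le(regs, 0)))  # 0x78563412
print(hex(read32_be(regs, 0)))  # 0x12345678
```

A driver that uses the little-endian accessor on a big-endian device gets the byte-swapped value unless it swaps explicitly.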
>I think that's useful even if the outcome of such discussion is better
>documentation on the above functions/macros (grepping Documentation/
>doesn't show any reference).
I think we could perhaps start with just writing some of that documentation
and trying to get it into the kernel. I'm not sure this specific item is
really worthy of a KS topic.
josh
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Jean-Christophe PLAGNIOL-VILLARD @ 2009-06-03 14:06 UTC (permalink / raw)
To: Catalin Marinas
Cc: James Bottomley, ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1244034286.24482.84.camel@pc1117.cambridge.arm.com>
>
> * Asymmetric MP:
> * Different CPU frequencies
> * Different CPU features (e.g. floating point only on
> some CPUs): scheduler awareness, per-CPU hwcap bits (in
> case user space wants to set the affinity)
> * Asymmetric workload balancing for power consumption (may
> be better to load 1 CPU at 60% than 4 at 15%)
I'll add
* Different Core ARCH
* FDT or similar to describe I/O (MEM, PCI, GPIO) accessible for
each instance
* Mailbox Architecture
* boot procedure (bootloader as example done by Kumar Gala
for the mpc8572ds in linux & U-Boot)
* sharing rootfs (RO) (reduce the rootfs size on embedded)
Best Regards,
J.
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Catalin Marinas @ 2009-06-03 13:45 UTC (permalink / raw)
To: Josh Boyer; +Cc: linux-embedded
In-Reply-To: <20090603131842.GT3095@hansolo.jdub.homelinux.org>
On Wed, 2009-06-03 at 09:18 -0400, Josh Boyer wrote:
> On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
> > * Mixed endianness devices in the same system - this may only need
> > dedicated readl_be/writel_be etc. macros but it could also be
> > done by having bus-aware readl/writel-like macros
>
> ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?
Yes, but there are many drivers that only use readl/writel (and
arch/arm makes the assumption, maybe correctly, that this is little
endian only).
I think that's useful even if the outcome of such discussion is better
documentation on the above functions/macros (grepping Documentation/
doesn't show any reference).
Thanks.
--
Catalin
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Josh Boyer @ 2009-06-03 13:18 UTC (permalink / raw)
To: Catalin Marinas; +Cc: linux-embedded
In-Reply-To: <1244034286.24482.84.camel@pc1117.cambridge.arm.com>
On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
> * Mixed endianness devices in the same system - this may only need
> dedicated readl_be/writel_be etc. macros but it could also be
> done by having bus-aware readl/writel-like macros
ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?
josh
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Catalin Marinas @ 2009-06-03 13:04 UTC (permalink / raw)
To: James Bottomley; +Cc: ksummit-2009-discuss, linux-arch, linux-embedded
In-Reply-To: <1243956140.4229.25.camel@mulgrave.int.hansenpartnership.com>
Hi,
On Tue, 2009-06-02 at 15:22 +0000, James Bottomley wrote:
> So what we're looking for is a proposal to discuss the issues
> most affecting embedded architectures, or preview any features affecting
> the main kernel which embedded architectures might need ... or any other
> topics from embedded architectures which might need discussion or
> debate.
Some issues that come up on embedded systems (and not only):
* Multiple coherency domains for devices - the system may have
multiple bus levels, coherency ports, cache levels etc. Some
devices in the system (but not all) may be able to "see" various
cache levels but the DMA API (at least on ARM) cannot handle
this. It may be useful to discuss how other embedded
architectures handle this and come up with a unified solution
* Better support for coherent DMA mask - currently ZONE_DMA is
assumed to be in the bottom part of the memory which isn't
always the case. Enabling NUMA may help but it is overkill for
some systems. As above, a more unified solution across
architectures would help
* PIO block devices and non-coherent hardware - code like mpage.c
assumes that either the hardware is coherent or the device
driver performs the cache flushing. The latter is true for
DMA-capable device but not for PIO. The issue becomes visible
with write-allocate caches and the device driver may not have
the struct page information to call flush_dcache_page(). A
proposed solution on the ARM lists was to differentiate (via
some flags) between PIO and DMA block devices and use this
information in mpage.c
* Mixed endianness devices in the same system - this may only need
dedicated readl_be/writel_be etc. macros but it could also be
done by having bus-aware readl/writel-like macros
* Asymmetric MP:
* Different CPU frequencies
* Different CPU features (e.g. floating point only on
some CPUs): scheduler awareness, per-CPU hwcap bits (in
case user space wants to set the affinity)
* Asymmetric workload balancing for power consumption (may
be better to load 1 CPU at 60% than 4 at 15%)
Thanks.
--
Catalin
* Re: Representing Embedded Architectures at the Kernel Summit
From: Mark Brown @ 2009-06-03 12:17 UTC (permalink / raw)
To: James Bottomley
Cc: Grant Likely, ksummit-2009-discuss, linux-arch, linux-embedded,
Josh Boyer, Tim Bird
In-Reply-To: <1243964917.4229.54.camel@mulgrave.int.hansenpartnership.com>
On Tue, Jun 02, 2009 at 05:48:37PM +0000, James Bottomley wrote:
> On Tue, 2009-06-02 at 11:29 -0600, Grant Likely wrote:
> > firmware issue, where existing firmware passes very little in the way
> > of hardware description to the kernel, but part is also not making
> > available any form of common language for describing the machine.
> OK, so my minimal understanding in this area led me to believe this was
> because most embedded systems didn't have properly discoverable busses
> and therefore you had to tailor the kernel configuration exactly to the
> devices the system had.
Ish; essentially it's pushing the description of the non-enumerable bits
of the hardware out of kernel code and into a separate bit of data that
can be passed in to the kernel.
> It sounds interesting ... however, it also sounds like an area which
> might not impact the core kernel much ... or am I wrong about that? The
> topics we're really looking for the Kernel Summit are ones that require
> cross system input and which can't simply be sorted out by organising an
> Embedded mini-summit.
One issue that does have wider impact is that the OpenFirmware bindings
can affect any driver level code - it means that drivers may need to
parse configuration out of the device tree as well as the mechanisms
they normally use. This is already happening due to the current use but
will become more visible if more platforms adopt the device tree. As
someone primarily working on driver/subsystem side stuff this is my
primary concern with expanded use of device tree - it's another set of
platform data code that needs writing in addition to the other schemes
currently in use.
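As a rough model of that concern (plain Python rather than kernel code), the sketch below shows a driver that needs one parsing path per configuration scheme before it can probe. The codec, the `CodecConfig` fields and the property names such as `vendor,mic-bias-millivolt` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodecConfig:
    """Settings a hypothetical codec driver needs at probe time."""
    mic_bias_mv: int
    irq: int


def config_from_platform_data(pdata: dict) -> CodecConfig:
    # Board-file scheme: board code hands over a ready-made structure.
    return CodecConfig(mic_bias_mv=pdata["mic_bias_mv"], irq=pdata["irq"])


def config_from_device_tree(node: dict) -> CodecConfig:
    # Device tree scheme: the driver parses named properties from its node.
    return CodecConfig(
        mic_bias_mv=node["vendor,mic-bias-millivolt"],
        irq=node["interrupts"][0],
    )


def probe(pdata: Optional[dict] = None,
          dt_node: Optional[dict] = None) -> CodecConfig:
    """Use whichever configuration source the platform supplied."""
    if dt_node is not None:
        return config_from_device_tree(dt_node)
    if pdata is not None:
        return config_from_platform_data(pdata)
    raise ValueError("no configuration supplied")
```

Both paths end up producing the same structure, but each has to be written, reviewed and kept in step with the other, and that duplicated effort is the cost described above.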
On the other hand, if all the embedded architectures got together and
agreed to move in this direction it'd be pretty much equivalent to
some new BIOS standard being introduced for PCs so perhaps not worth
worrying about at a general kernel level.
> Now if flattened device tree could help us out with BIOS, ACPI, EFI and
> the other myriad boot and identification standards that seem designed to
> hide system information rather than reveal it, then we might be all
> ears ...
Well, you could potentially try to render other BIOS data tables into
device tree format. I'm not sure that the translation would be less
effort than parsing the existing data, though.
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Ralf Baechle @ 2009-06-03 7:07 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Grant Likely, James Bottomley, ksummit-2009-discuss, linux-arch,
linux-embedded, Josh Boyer, Tim Bird
In-Reply-To: <10f740e80906021418i1d58f5eer940e7a8ec9fb8b9e@mail.gmail.com>
On Tue, Jun 02, 2009 at 11:18:04PM +0200, Geert Uytterhoeven wrote:
> > The big problem we have is that the only commonality between different
> > SoCs is that the CPU executes ARM instructions. Everything else is
> > entirely up to the SoC designer - eg location of memory, spacing of
> > memory banks, type of interrupt controller, etc is all highly SoC
> > specific. Nothing outside of the ARM CPU itself is standardized.
>
> That sounds very similar to m68k, which does support generic kernels
> (except for Sun-3, which uses a completely different MMU)?
For MIPS we currently have a concept of system families. The user selects
one in the "System type" Kconfig menu. With few exceptions that are not
worth considering, these system types are incompatible, so there are
currently 37 families of incompatible platforms, with more being added.
Ralf
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Ralf Baechle @ 2009-06-03 6:53 UTC (permalink / raw)
To: Grant Likely
Cc: James Bottomley, linux-arch, Josh Boyer, Tim Bird, linux-embedded,
ksummit-2009-discuss
In-Reply-To: <fa686aa40906021029macf3c72g5261f8d4dbea935a@mail.gmail.com>
On Tue, Jun 02, 2009 at 11:29:46AM -0600, Grant Likely wrote:
> One topic that seems to garner debate is the issue of decoupling the
> kernel image from the target platform. ie. On x86, PowerPC and Sparc
> a kernel image will boot on any machine (assuming the needed drivers
> are enabled), but this is rarely the case in embedded. Most embedded
> kernels require explicit board support selected at compile time with
> no way to produce a generic kernel which will boot on a whole family
> of devices, let alone the whole architecture. Part of this is a
> firmware issue, where existing firmware passes very little in the way
> of hardware description to the kernel, but part is also not making
> available any form of common language for describing the machine.
Hardware is simply too incompatible to allow the generation of a single
"one size fits all" image. To list a few reasons from the MIPS world:
o little vs. big endian
o 32-bit vs. 64-bit
o different system firmwares
o processors and peripherals requiring creative workarounds which, for
code size or performance reasons, one wants to limit to those systems
suffering from the issue.
o often claustrophobically small memory sizes.
o exactly no commonality across all systems except the processor
architecture.
o vendors coming up with their own instruction set enhancements not
supported by any competitor and insisting on their use for the extra
bit of performance and product differentiation.
o many users have a long standing "roll your own" attitude.
o peripherals that are even less probeable than ISA cards
"Flattened Device Tree" can tackle only a small part of this but it's a
step.
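The first item in the list above is perhaps the easiest to illustrate. The snippet below is only a sketch with an arbitrary example value. It shows that the same 32-bit word is laid out differently in memory on little-endian and big-endian systems, so data built for one is misread by the other.

```python
import struct

value = 0x12345678  # arbitrary example word

little = struct.pack("<I", value)  # byte layout on a little-endian system
big = struct.pack(">I", value)     # byte layout on a big-endian system

# A little-endian CPU reading data laid out for big-endian sees a
# different number, which is one reason a single binary image cannot
# serve both kinds of system.
misread = struct.unpack("<I", big)[0]
```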
> Embedded PowerPC and Microblaze have tackled this problem with the
> "Flattened Device Tree" data format which is derived from the
> OpenFirmware specifications, and there is some interest and debate (as
> discussed recently on the ARM mailing list) about making flattened
> device tree usable by ARM also (which I'm currently
> proof-of-concepting). Josh Boyer has already touched on discussing
> flattened device tree support at kernel summit in an email to the
> ksummit list last week (quoted below), and I'm wondering if a broader
> discussion would be warranted.
Agreed.
Ralf
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Ralf Baechle @ 2009-06-03 6:24 UTC (permalink / raw)
To: Jean-Christophe PLAGNIOL-VILLARD
Cc: Josh Boyer, James Bottomley, linux-arch, linux-embedded,
ksummit-2009-discuss
In-Reply-To: <20090602222132.GK6399@game.jcrosoft.org>
On Wed, Jun 03, 2009 at 12:21:32AM +0200, Jean-Christophe PLAGNIOL-VILLARD wrote:
> I'd like to propose AMP and kernel relocate
> as more and more SoCs will come with multiple cores, with or without the same arch
AMP is very interesting. It's not uncommon to have only one of, say,
16 cores running Linux and the others running bare-metal code. A typical
example would be networking code. Each vendor provides their own AMP
infrastructure and so far it all looks crude. With an ever-increasing
number of cores there is also the question of whether AMP could be
generally useful for the kernel.
Ralf
* Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
From: Greg KH @ 2009-06-03 3:35 UTC (permalink / raw)
To: Robert Schwebel
Cc: James Bottomley, linux-arch, linux-embedded, David VomLehn,
Josh Boyer, Tim Bird, ksummit-2009-discuss
In-Reply-To: <20090602213452.GK32630@pengutronix.de>
On Tue, Jun 02, 2009 at 11:34:52PM +0200, Robert Schwebel wrote:
> Could flickerfree-bootsplash be a topic? Or is that completely pushed
> into the userspace these fastboot days?
We have that working today, no in-kernel work needed other than the
already-present KMS stuff. See the recent Moblin images for proof of
this.
thanks,
greg k-h