LinuxPPC-Dev Archive on lore.kernel.org
* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Andrey Volkov @ 2005-12-22 18:18 UTC (permalink / raw)
  To: Jes Sorensen, Pantelis Antoniou
  Cc: Andrew Morton, linux-kernel, linuxppc-embedded
In-Reply-To: <yq0d5jpuoqe.fsf@jaguar.mkp.net>

Hi Jes,

Jes Sorensen wrote:
>>>>>>"Andrey" == Andrey Volkov <avolkov@varma-el.com> writes:
> 
> 
> Andrey> Hello Jes and all I try to use your allocator (gen_pool_xxx),
> Andrey> idea of which is a cute nice thing. But current implementation
> Andrey> of it is inappropriate for a _device_ (aka onchip, like
> Andrey> framebuffer) memory allocation, by next reasons:
> 
> Andrey,
> 
> Keep in mind that genalloc was meant to be simple for basic memory
> allocations. It was never meant to be an over complex super high
> performance allocation mechanism.
> 
> Andrey>  1) Device memory is expensive resource by access time and/or
> Andrey> size cost.  So we couldn't use (usually) this memory for the
> Andrey> free blocks lists.
> 
> This really is irrelevant, the space is only used within the object
> when it's on the free list. Ie. if all memory is handed out there's
> no space used for this purpose.

I pointed out 2 reasons: ACCESS TIME was first :). Let's take a very
widespread case: a PCI device with some onboard memory and any
N GHz processor. The result may be terrible: each access to device memory
(which is usually uncached) will slow this super fast processor down to
33 MHz, i.e. the same as if we busy-waited with interrupts disabled after
each read/write...

I was possibly unclear when I used 'control structures' in 2): I meant the
allocator's control structures (size/next etc.), not device-specific
control structs.

> 
> Andrey> 3) Obvious (IMHO) workflow of mem. allocator
> Andrey> look like: - at startup time, driver allocate some big
> Andrey> (almost) static mem. chunk(s) for a control/data structures.
> Andrey> - during work of the device, driver allocate many small
> Andrey> mem. blocks with almost identical size.  such behavior lead to
> Andrey> degeneration of buddy method and transform it to the
> Andrey> first/best fit method (with long seek by the free node list).
> 
> This is only really valid for network devices, and even then it's not
> quite so. For things like uncached allocations your observation is
> completely off.

Could you give me some examples? Possibly I overlooked something
significant.

> 
> For the case of more traditional devices, the control structures will
> be allocated from one end of the block, the rest will be used for
> packet descriptors which will be going in and out of the memory pool
> on a regular basis. 

This was the main reason why I tried to modify genalloc: I needed a
generic allocator for both short-lived, strictly aligned blocks and
long-lived blocks with size restrictions.

> In most normal cases these will all be of the same
> size and it doesn't matter where in the memory space they were
> allocated.

And that's also why I consider that 'buddy' is not appropriate to be
'generic' (most cases == generic, isn't it :)?): when you allocate
mainly same-sized blocks, 'buddy' degrades to first-fit.

A possible solution I see is a mix: first-fit with lazy coalescing for
short-lived blocks and first-fit with immediate coalescing for
long-lived blocks. But, again, I may be overlooking something significant.
And, certainly, I could have overlooked someone else's allocator
implementation in some driver.

> 
> Andrey> 4) The simple binary buddy method is far away from perfect for
> Andrey> a device due to a big internal fragmentation. Especially for a
> Andrey> network/mfd devices, for which, size of allocated data very
> Andrey> often is not a power of 2.
> 
snip
> 
> Andrey> I start to modify your code to satisfy above demands, but
> Andrey> firstly I wish to know your, or somebody else, opinion.
> 
> I honestly don't think the majority of your demands are valid.
> genalloc was meant to be simple, not an ultra fast at any random
> block size allocator. So far I don't see any reason for changing to
> the allocation algorithm into anything much more complex - doesn't
> mean there couldn't be a reason for doing so, but I don't think you
> have described any so far.
I disagree here: a generic allocator can't be very simple and slow, because
in that case simply no one will use it, and hence we'll get today's
picture: reimplemented allocators in many drivers.

> 
> You mentioned frame buffers, but what is the kernel supposed to do
> with those allocation wise? If you have a frame buffer console, the
> memory is allocated once and handed to the frame buffer driver.
> Ie. you don't need a ton of on demand allocations for that and for
> X, the memory management is handled in the X server, not by the
> kernel.

For a video-only device this is true, but if the device is multifunctional,
which is a frequent case in embedded systems, then the kernel must control
device memory allocation. Nowadays, however, even video cards for
desktops are becoming more and more multifunctional (VIVO/audio etc.).

> 
> The only thing I think would make sense to implement is to allow it to
> use indirect descriptor blocks for the memory it manages. This is not
> because it's wrong to use the memory for the free list, as it will
> only be used for this when the chunk is not in use, but because access
> to certain types of memory isn't always valid through normal direct
> access. Ie. if one used descriptor blocks residing in normal
> GFP_KERNEL memory, it would be possible to use the allocator to manage
> memory sitting on the other side of a PCI bus.
I described above why we couldn't/wouldn't use onboard memory for
allocator-specific data.

Pantelis, have I answered your question (...what are you trying to
do...) too?

-- 
Regards
Andrey Volkov


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Jes Sorensen @ 2005-12-22 15:37 UTC (permalink / raw)
  To: Andrey Volkov; +Cc: Andrew Morton, linux-kernel, linuxppc-embedded
In-Reply-To: <43A98F90.9010001@varma-el.com>

>>>>> "Andrey" == Andrey Volkov <avolkov@varma-el.com> writes:

Andrey> Hello Jes and all I try to use your allocator (gen_pool_xxx),
Andrey> idea of which is a cute nice thing. But current implementation
Andrey> of it is inappropriate for a _device_ (aka onchip, like
Andrey> framebuffer) memory allocation, by next reasons:

Andrey,

Keep in mind that genalloc was meant to be simple for basic memory
allocations. It was never meant to be an over complex super high
performance allocation mechanism.

Andrey>  1) Device memory is expensive resource by access time and/or
Andrey> size cost.  So we couldn't use (usually) this memory for the
Andrey> free blocks lists.

This really is irrelevant, the space is only used within the object
when it's on the free list. Ie. if all memory is handed out there's
no space used for this purpose.

Andrey> 3) Obvious (IMHO) workflow of mem. allocator
Andrey> look like: - at startup time, driver allocate some big
Andrey> (almost) static mem. chunk(s) for a control/data structures.
Andrey> - during work of the device, driver allocate many small
Andrey> mem. blocks with almost identical size.  such behavior lead to
Andrey> degeneration of buddy method and transform it to the
Andrey> first/best fit method (with long seek by the free node list).

This is only really valid for network devices, and even then it's not
quite so. For things like uncached allocations your observation is
completely off.

For the case of more traditional devices, the control structures will
be allocated from one end of the block, the rest will be used for
packet descriptors which will be going in and out of the memory pool
on a regular basis. In most normal cases these will all be of the same
size and it doesn't matter where in the memory space they were
allocated.

Andrey> 4) The simple binary buddy method is far away from perfect for
Andrey> a device due to a big internal fragmentation. Especially for a
Andrey> network/mfd devices, for which, size of allocated data very
Andrey> often is not a power of 2.

For network devices it's perfectly adequate as it will almost always
satisfy what I described above. Incoming packets will always be
allocated for a full MTU sized packet hence all allocated blocks will
be of the same size. For outgoing packets, the allocation is short
lived and while it may be that a good chunk of packets aren't all full
MTU sized, it is rarely worth the hassle of trying to make the
allocator allow to-the-byte sized allocations as the number of
outstanding outgoing packets will be very limited.
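
The internal fragmentation being argued about here can be put in numbers with a few lines of C. This is only a sketch under stated assumptions: it models a plain binary buddy allocator that rounds every request up to the next power of two, the helper names are hypothetical (they are not part of genalloc), and 1518 bytes is used purely as an illustrative request size (a maximum-length untagged Ethernet frame), not as a measurement from any driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Block size a simple binary buddy allocator would hand out for a
 * request of n bytes: the smallest power of two that is >= n.
 * Hypothetical helper, not a genalloc function.
 */
static size_t buddy_block_size(size_t n)
{
    size_t size = 1;

    assert(n >= 1);          /* a zero-byte request makes no sense here */
    while (size < n)
        size <<= 1;
    return size;
}

/* Bytes lost to internal fragmentation for a single request of n bytes. */
static size_t buddy_waste(size_t n)
{
    return buddy_block_size(n) - n;
}
```

With these helpers, buddy_waste(1518) is 530 bytes out of a 2048-byte block, while a request of exactly 2048 bytes wastes nothing. Whether that loss matters depends on how many such blocks are outstanding at once, which is the point under discussion.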

Andrey> I start to modify your code to satisfy above demands, but
Andrey> firstly I wish to know your, or somebody else, opinion.

I honestly don't think the majority of your demands are valid.
genalloc was meant to be simple, not an ultra fast at any random
block size allocator. So far I don't see any reason for changing
the allocation algorithm into anything much more complex - doesn't
mean there couldn't be a reason for doing so, but I don't think you
have described any so far.

You mentioned frame buffers, but what is the kernel supposed to do
with those allocation wise? If you have a frame buffer console, the
memory is allocated once and handed to the frame buffer driver.
Ie. you don't need a ton of on demand allocations for that and for
X, the memory management is handled in the X server, not by the
kernel.

The only thing I think would make sense to implement is to allow it to
use indirect descriptor blocks for the memory it manages. This is not
because it's wrong to use the memory for the free list, as it will
only be used for this when the chunk is not in use, but because access
to certain types of memory isn't always valid through normal direct
access. Ie. if one used descriptor blocks residing in normal
GFP_KERNEL memory, it would be possible to use the allocator to manage
memory sitting on the other side of a PCI bus.
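
A minimal sketch of what such indirect descriptors could look like, assuming a first-fit policy with immediate coalescing (that policy is a choice made for this illustration, not something genalloc does). All names (oob_pool, oob_alloc, oob_free, MAX_DESC) are hypothetical. The descriptor table lives in ordinary memory and the allocator only does arithmetic on addresses, so the managed region itself is never read or written.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DESC 32                 /* illustrative limit on descriptors */

struct oob_desc {                   /* kept in normal memory, not in the region */
    uintptr_t start;                /* first address covered by this block */
    size_t    size;                 /* block length in bytes */
    int       used;                 /* 1 = allocated, 0 = free */
};

struct oob_pool {
    struct oob_desc d[MAX_DESC];    /* sorted by start, covers the whole region */
    int n;                          /* number of descriptors in use */
};

static void oob_init(struct oob_pool *p, uintptr_t base, size_t size)
{
    p->d[0].start = base;
    p->d[0].size = size;
    p->d[0].used = 0;
    p->n = 1;
}

/* First fit: returns 0 and stores the address in *out, or -1 on failure. */
static int oob_alloc(struct oob_pool *p, size_t size, uintptr_t *out)
{
    int i, j;

    if (size == 0)
        return -1;
    for (i = 0; i < p->n; i++) {
        if (p->d[i].used || p->d[i].size < size)
            continue;
        if (p->d[i].size > size) {
            if (p->n == MAX_DESC)
                continue;           /* cannot split; keep looking for an exact fit */
            for (j = p->n; j > i + 1; j--)
                p->d[j] = p->d[j - 1];
            p->d[i + 1].start = p->d[i].start + size;
            p->d[i + 1].size = p->d[i].size - size;
            p->d[i + 1].used = 0;
            p->d[i].size = size;
            p->n++;
        }
        p->d[i].used = 1;
        *out = p->d[i].start;
        return 0;
    }
    return -1;
}

/* Drop descriptor i and close the gap in the table. */
static void oob_remove(struct oob_pool *p, int i)
{
    for (; i < p->n - 1; i++)
        p->d[i] = p->d[i + 1];
    p->n--;
}

/* Free by address only; the size comes from the descriptor table. */
static int oob_free(struct oob_pool *p, uintptr_t addr)
{
    int i;

    for (i = 0; i < p->n; i++) {
        if (!p->d[i].used || p->d[i].start != addr)
            continue;
        p->d[i].used = 0;
        if (i + 1 < p->n && !p->d[i + 1].used) {    /* merge with next block */
            p->d[i].size += p->d[i + 1].size;
            oob_remove(p, i + 1);
        }
        if (i > 0 && !p->d[i - 1].used) {           /* merge with previous block */
            p->d[i - 1].size += p->d[i].size;
            oob_remove(p, i);
        }
        return 0;
    }
    return -1;                      /* unknown address or double free */
}
```

A fixed-size table keeps the sketch short; a real implementation would more likely allocate descriptors dynamically and would need locking, neither of which is shown.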

Regards,
Jes


* AW: AW: Kernel cmdline
From: Achim Machura @ 2005-12-22 16:01 UTC (permalink / raw)
  To: 'Kumar Gala'; +Cc: Linuxppc-Embedded (E-Mail)
In-Reply-To: <85C67BBD-0072-4B64-A7B1-71871E755B76@kernel.crashing.org>

Hello, 

> > I' ve to parse the commandline in a dynamic loaded modul.
> 
> Why would you want to do this.  Anyways, the cmd_line is to exported  
> to modules.  You can provide modules there own params

I want to check an argument given by the bootloader via the command line.
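
The matching step itself, once the command-line string is in hand, can be sketched in plain C. This is a user-space illustration only: the helper name cmdline_get_arg and the option name used in the example are hypothetical, how a loadable module would obtain the string is deliberately left out, and quoted values containing spaces are not handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find a whitespace-delimited "name=value" token in cmdline and copy the
 * value into out (NUL-terminated).
 * Returns 0 on success, -1 if the option is absent or out is too small.
 * Hypothetical helper for illustration; not a kernel interface.
 */
static int cmdline_get_arg(const char *cmdline, const char *name,
                           char *out, size_t outlen)
{
    size_t namelen = strlen(name);
    const char *p = cmdline;

    while (*p) {
        const char *tok;
        size_t toklen;

        /* skip separators between tokens */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;
        if (!*p)
            break;

        /* find the end of the current token */
        tok = p;
        while (*p && *p != ' ' && *p != '\t' && *p != '\n')
            p++;
        toklen = (size_t)(p - tok);

        /* the token must be exactly "name" followed by '=' */
        if (toklen > namelen && strncmp(tok, name, namelen) == 0 &&
            tok[namelen] == '=') {
            size_t vallen = toklen - namelen - 1;

            if (vallen >= outlen)
                return -1;
            memcpy(out, tok + namelen + 1, vallen);
            out[vallen] = '\0';
            return 0;
        }
    }
    return -1;
}
```

For a string such as "console=ttyS0,115200 root=/dev/ram myboard=rev2", looking up "myboard" yields "rev2", while a name that only matches the tail of another option (for example "oot") is not found.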

Achim


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Pantelis Antoniou @ 2005-12-22 16:09 UTC (permalink / raw)
  To: linuxppc-embedded; +Cc: Andrew Morton, jes, linux-kernel
In-Reply-To: <43AAC9E8.2060105@varma-el.com>

>

[snip]

> I'm sure lib/ will be appropriate place. and something like
> "DON'T TRY REINVENT WHEEL, TRY FIX EXISTS" in documentation/ :).
> 
> Now couple word about rheap: I understand why you are use static
> alignment in allocator, but its very specialized for CPM. IMO, align
> must be a param of xx_alloc. For ex: device may demand alignment by
> 8 bytes, which ok until... you are try map this memory to the user
> space (don't shoot at me, remember about framebuffer & co).
>

It is trivial to align to a given alignment in a call. Please search
the archives since this was needed for CPM2 and I've committed a patch.

As for mapping to user space, since rheap only deals with addresses and never
touches the memory it's supposed to control, you can do pretty much everything.

I still don't understand what you are trying to do, however.

Mind explaining?
 
> -- 
> Regards
> Andrey Volkov
>

Regards

Pantelis


* MPC5200 Cache issue with Bestcomm
From: Amir Bukhari @ 2005-12-22 15:39 UTC (permalink / raw)
  To: linuxppc-embedded

Sorry, this question is not related to the Linux kernel, but I can't find any
sources with tips for the issue I encountered.
I am writing a standalone application with a TCP stack (I use the lwIP stack).
I want to enable the cache to increase the performance of my application.

------------------------
Here is my configuration of various registers:

#define CORE_HID0_INIT                    0x8010C000
#define CORE_HID2_INIT                    0x00000000
#define CORE_IBAT0U_INIT                  0x000007FF
#define CORE_IBAT0L_INIT                  0x00000001

#define CORE_DBAT0U_INIT                  0x000007FF
#define CORE_DBAT0L_INIT                  0x00000052
#define CORE_DBAT1U_INIT                  0xF000000F  // for MBAR
#define CORE_DBAT1L_INIT                  0xF000002A  // for MBAR
#define CORE_DBAT2U_INIT                  0x40001FFF
#define CORE_DBAT2L_INIT                  0x40000022  // for PCI
#define CORE_DBAT3U_INIT                  0x50001FFF  //
#define CORE_DBAT3L_INIT                  0x50000022  //

XLARB configuration is:
#define XLARB_CONFIG_INIT                 0x8000A006 // snoop window is enabled
#define XLARB_PRIORITY_ENABLE_INIT        0x0000000F
#define XLARB_PRIORITY_INIT               0x11111010
#define XLARB_SNOOP_WINDOW_INIT           0x00000019 // base address is 0 and 64Mbytes length

----------------

Now, as soon as I enable the cache, the Bestcomm keeps signalling an Ethernet
packet receive to me. It never stops, and this makes my system hang. When
running the system without cache, everything works well.

I would be happy if someone could give me a tip about what I may have missed!

-Amir


* Re: AW: Kernel cmdline
From: Kumar Gala @ 2005-12-22 15:50 UTC (permalink / raw)
  To: achim.machura; +Cc: Linuxppc-Embedded (E-Mail), 'Jenkins, Clive'
In-Reply-To: <001f01c6070d$649e8f00$34f1ff0a@beint.local>


On Dec 22, 2005, at 9:35 AM, Achim Machura wrote:

>
>> Yes. Look at:
>> http://lxr.linux.no/source/init/main.c?v=2.6.10;a=ppc#L123
>> Clive
>
> Thanks,
>
> but this exaple seems to work only with modules staticly linked  
> into the
> kernel.
>
> I' ve to parse the commandline in a dynamic loaded modul.

Why would you want to do this?  Anyways, the cmd_line is to exported
to modules.  You can provide modules their own params

- kumar


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Andrey Volkov @ 2005-12-22 15:44 UTC (permalink / raw)
  To: Pantelis Antoniou; +Cc: Andrew Morton, jes, linux-kernel, linuxppc-embedded
In-Reply-To: <43AAB508.7000007@intracom.gr>

Pantelis Antoniou wrote:
> Andrey Volkov wrote:
> 
>> Hi Pantelis,
>>
>> Pantelis Antoniou wrote:
>>
>>> Andrey Volkov wrote:
>>>
> 
> [snip]
> 
>>>
>>> Hi Andrey,
>>>
>>> FYI, on arch/ppc/lib/rheap.c theres an implementation of a remote heap.
>>>
>>> It is currently used for the management of freescale's CPM1 & CPM2
>>> internal
>>> dual port RAM.
>>>
>>> Take a look, it might be what you have in mind.
>>>
>>> Regards
>>>
>>> Pantelis
>>
>>
>>
>> Thanks I missed it (and small wonder! :( ).
>>
>> Andrew, Is somebody count HOW MANY dev specific implementation
>> of buddy/first-fit allocators now in kernel?
>>
> 
> Yes, it is indeed messy.
> 
> The rheap implementation is generic enough and I believe can fit most of
> the
> special memory allocators needs. If you'd like I could move it somewhere
> generic and test it.
> 
I'm sure lib/ will be an appropriate place, and something like
"DON'T TRY TO REINVENT THE WHEEL, TRY TO FIX WHAT EXISTS" in Documentation/ :).

Now a couple of words about rheap: I understand why you use static
alignment in the allocator, but it's very specialized for CPM. IMO, alignment
must be a parameter of xx_alloc. For example: a device may demand alignment to
8 bytes, which is OK until... you try to map this memory to user
space (don't shoot me, remember the framebuffer & co).
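
A minimal sketch of alignment passed per call rather than fixed at init time. The names align_up, bump_region and bump_alloc are hypothetical and this is not the rheap interface; it is a trivial bump allocator over an address range, used only to show the same region serving an 8-byte-aligned request and a page-aligned one. The 4096-byte page size is an assumption for the example.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical region descriptor: addresses only, the memory is never touched. */
struct bump_region {
    uintptr_t base;   /* first address of the managed region */
    uintptr_t size;   /* region length in bytes */
    uintptr_t next;   /* offset of the first unallocated byte */
};

/*
 * Allocate size bytes whose start address is a multiple of align.
 * Returns 0 and stores the address in *out, or -1 if it does not fit.
 */
static int bump_alloc(struct bump_region *r, uintptr_t size,
                      uintptr_t align, uintptr_t *out)
{
    uintptr_t start = align_up(r->base + r->next, align);
    uintptr_t off = start - r->base;

    if (off > r->size || size > r->size - off)
        return -1;
    r->next = off + size;
    *out = start;
    return 0;
}
```

A descriptor ring could then be requested with align = 8 and a buffer meant for mmap() with align = 4096 from the same region. Address wrap-around near the top of the address space is not checked in this sketch.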

-- 
Regards
Andrey Volkov


* console problem on 8247
From: srideep.devireddy @ 2005-12-22 15:17 UTC (permalink / raw)
  To: linuxppc-dev




Hi,

I am trying to use a RAM disk with my MPC8247 board. I compiled both the Linux kernel image and the ramdisk image and tried to load them onto the board. When I do this I get a familiar error that is known to all: it hangs after uncompressing the Linux kernel. Please tell me what the problem might be; we are using the SMC as the console.

=> bootm 100000 b00000
## Booting image at 00100000 ...
   Image Name:   Linux-2.4.20_mvl31-8272ads
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:    680207 Bytes = 664.3 kB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
## Current stack ends at 0x0FBC18D8 => set upper limit to 0x00800000
## cmdline at 0x007FFF00 ... 0x007FFF49
bd address  = 0x0FBC1BB4
memstart    = 0x00000000
memsize     = 0x10000000
flashstart  = 0xFE000000
flashsize   = 0x02000000
flashoffset = 0x00000000
sramstart   = 0x00000000
sramsize    = 0x00000000
immr_base   = 0xF0000000
bootflags   = 0x00000001
vco         =    396 MHz
sccfreq     =     99 MHz
brgfreq     = 24.750 MHz
intfreq     =    264 MHz
cpmfreq     =    198 MHz
busfreq     =     66 MHz
ethaddr     = 00:11:22:33:44:55
IP addr     = 192.168.80.196
baudrate    = 115200 bps
## Loading RAMDisk Image at 00b00000 ...
   Image Name:   uImage.ramdisk
   Image Type:   PowerPC Linux RAMDisk Image (gzip compressed)
   Data Size:    1827099 Bytes =  1.7 MB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## initrd at 0x00B00040 ... 0x00CBE15A (len=1827099=0x1BE11B)
   Loading Ramdisk to 0fa02000, end 0fbc011b ... OK
## Transferring control to Linux (at address 00000000) ...






* AW: Kernel cmdline
From: Achim Machura @ 2005-12-22 15:35 UTC (permalink / raw)
  To: 'Jenkins, Clive'; +Cc: Linuxppc-Embedded (E-Mail)
In-Reply-To: <35786B99AB3FDC45A8215724617919736D91B7@gbrwgceumf01.eu.xerox.net>


> Yes. Look at:
> http://lxr.linux.no/source/init/main.c?v=2.6.10;a=ppc#L123
> Clive

Thanks,

but this example seems to work only with modules statically linked into the
kernel.

I have to parse the command line in a dynamically loaded module.

Achim


* RE: Kernel cmdline
From: Jenkins, Clive @ 2005-12-22 14:59 UTC (permalink / raw)
  To: achim.machura, Linuxppc-Embedded (E-Mail)

> From: linuxppc-embedded-bounces@ozlabs.org
> [mailto:linuxppc-embedded-bounces@ozlabs.org] On Behalf Of Achim Machura
> Sent: 22 December 2005 13:48
> To: Linuxppc-Embedded (E-Mail)
> Subject: Kernel cmdline
>
>
> Hello,
>
> is there a possibility to read (parse) the cmdline in a driver, similar
> to cat /proc/cmdline?
>
> best regards
>
> Achim

Yes. Look at:
http://lxr.linux.no/source/init/main.c?v=2.6.10;a=ppc#L123
Clive

* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Pantelis Antoniou @ 2005-12-22 14:15 UTC (permalink / raw)
  To: Andrey Volkov; +Cc: Andrew Morton, jes, linux-kernel, linuxppc-embedded
In-Reply-To: <43AAAEA2.8030200@varma-el.com>

Andrey Volkov wrote:
> Hi Pantelis,
> 
> Pantelis Antoniou wrote:
> 
>>Andrey Volkov wrote:
>>

[snip]

>>
>>Hi Andrey,
>>
>>FYI, on arch/ppc/lib/rheap.c theres an implementation of a remote heap.
>>
>>It is currently used for the management of freescale's CPM1 & CPM2 internal
>>dual port RAM.
>>
>>Take a look, it might be what you have in mind.
>>
>>Regards
>>
>>Pantelis
> 
> 
> Thanks I missed it (and small wonder! :( ).
> 
> Andrew, Is somebody count HOW MANY dev specific implementation
> of buddy/first-fit allocators now in kernel?
> 

Yes, it is indeed messy.

The rheap implementation is generic enough and I believe it can fit the needs
of most special memory allocators. If you'd like I could move it somewhere
generic and test it.

Regards

Pantelis


* Kernel cmdline
From: Achim Machura @ 2005-12-22 13:48 UTC (permalink / raw)
  To: Linuxppc-Embedded (E-Mail)

Hello,

is there a possibility to read (parse) the cmdline in a driver, similar to cat
/proc/cmdline?

best regards

Achim


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Andrey Volkov @ 2005-12-22 13:48 UTC (permalink / raw)
  To: Pantelis Antoniou; +Cc: Andrew Morton, jes, linux-kernel, linuxppc-embedded
In-Reply-To: <43AA65F4.10409@intracom.gr>

Hi Pantelis,

Pantelis Antoniou wrote:
> Andrey Volkov wrote:
> 
>> Hello Jes and all
>>
>> I try to use your allocator (gen_pool_xxx), idea of which
>> is a cute nice thing. But current implementation of it is
>> inappropriate for a _device_ (aka onchip, like framebuffer) memory
>> allocation, by next reasons:
>>
>>  1) Device memory is expensive resource by access time and/or size cost.
>>     So we couldn't use (usually) this memory for the free blocks lists.
>>  2) Device memory usually have special requirement of access to it
>>     (alignment/special insn). So we couldn't use part of allocated
>>     blocks for some control structures (this problem solved in your
>>     implementation, it's common remark)
>>  3) Obvious (IMHO) workflow of mem. allocator look like:
>>      - at startup time, driver allocate some big
>>       (almost) static mem. chunk(s) for a control/data structures.
>>         - during work of the device, driver allocate many small
>>       mem. blocks with almost identical size.
>>     such behavior lead to degeneration of buddy method and
>>     transform it to the first/best fit method (with long seek
>>     by the free node list).
>>  4) The simple binary buddy method is far away from perfect for a device
>>     due to a big internal fragmentation. Especially for a
>>     network/mfd devices, for which, size of allocated data very
>>     often is not a power of 2.
>>
>> I start to modify your code to satisfy above demands,
>> but firstly I wish to know your, or somebody else, opinion.
>>
>> Especially I will very happy if somebody have and could
>> provide to all, some device specific memory usage statistics.
>>
> 
> Hi Andrey,
> 
> FYI, on arch/ppc/lib/rheap.c theres an implementation of a remote heap.
> 
> It is currently used for the management of freescale's CPM1 & CPM2 internal
> dual port RAM.
> 
> Take a look, it might be what you have in mind.
> 
> Regards
> 
> Pantelis

Thanks, I missed it (and small wonder! :( ).

Andrew, has anybody counted HOW MANY device-specific implementations
of buddy/first-fit allocators are now in the kernel?

-- 
Regards
Andrey Volkov


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Andrey Volkov @ 2005-12-22 13:41 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Andrew Morton, jes, linux-kernel, linuxppc-embedded
In-Reply-To: <43A9B2F1.8090402@246tNt.com>

Hi Sylvain,

Sylvain Munaut wrote:
> Hi Andrey,
> 
> 
> Didn't I sent you the memory allocator I wrote a few month back for 5200
> SRAM ?
Yes, I received it and currently I use it for Bestcomm, but,
as I wrote before, I am also writing another driver for which I need an
allocator too, and sram_xxx/gen_pool_xxx are completely inappropriate for it
(since the device is PCI based). Also, trust me, it will be the 6th or 7th
allocator implementation that I have written, which is more than enough to
make me sick of allocators.

As well, IMHO, yet another allocator in the kernel (currently almost every
driver for a device with onboard dynamically allocated memory implements
a buddy/first-fit allocator one way or another) will cause yet more bugs in
the kernel that are ALREADY FIXED in a driver in the neighbouring directory.

> 
> It uses the sram itself for the free block list but without using any
> (iow, you could allocate the whole SRAM, no memory is wasted). The SRAM
> is on-chip so pretty fast access. That kind of allocator is no good for
> memory on a PCI board or such though (bad access time ! using main
> memory would be better)
>
> Sylvain

Completely agree, but only for the BESTCOMM case. This is what I had in mind
when I wrote 'usually' in 1) ;). Also don't forget about storage
for the size of allocated blocks (which is later passed to free): in the
sram_xxx case main memory is used for it indirectly (when you pass a constant
as the parameter) or directly (when you store it in a data struct). So, IMO,
it is better to use it directly and control it in one place than to try to
catch bugs with an invalid size passed to free.


> 
> 
> Andrey Volkov wrote:
> 
>>Hello Jes and all
>>
>>I try to use your allocator (gen_pool_xxx), idea of which
>>is a cute nice thing. But current implementation of it is
>>inappropriate for a _device_ (aka onchip, like framebuffer) memory
>>allocation, by next reasons:
>>
>> 1) Device memory is expensive resource by access time and/or size cost.
>>    So we couldn't use (usually) this memory for the free blocks lists.
>> 2) Device memory usually have special requirement of access to it
>>    (alignment/special insn). So we couldn't use part of allocated
>>    blocks for some control structures (this problem solved in your
>>    implementation, it's common remark)
>> 3) Obvious (IMHO) workflow of mem. allocator look like:
>> 	- at startup time, driver allocate some big
>>	  (almost) static mem. chunk(s) for a control/data structures.
>>        - during work of the device, driver allocate many small
>>	  mem. blocks with almost identical size.
>>    such behavior lead to degeneration of buddy method and
>>    transform it to the first/best fit method (with long seek
>>    by the free node list).
>> 4) The simple binary buddy method is far away from perfect for a device
>>    due to a big internal fragmentation. Especially for a
>>    network/mfd devices, for which, size of allocated data very
>>    often is not a power of 2.
>>
>>I start to modify your code to satisfy above demands,
>>but firstly I wish to know your, or somebody else, opinion.
>>
>>Especially I will very happy if somebody have and could
>>provide to all, some device specific memory usage statistics.
>>

-- 
Regards
Andrey Volkov

P.S. Oops, sorry for the duplication, I forgot to insert the CC in the previous reply.


* Bridge function at Linux 2.4.25
From: 徐小威的EMAIL @ 2005-12-22  9:40 UTC (permalink / raw)
  To: linuxppc-embedded

Hi:
     I am using Linux 2.4.25 on my custom MPC852T board. I ran into a problem
after enabling the 802.1d bridge function. Why?

    Does anybody know which version of brctl is suitable for Linux 2.4.25? (I
used bridge-utils-1.0.6.)

NETDEV WATCHDOG: eth1: transmit timed out
eth1: transmit timed out.
Ring data dump: cur_tx c30bd100, tx_free 0, dirty_tx c30bd100, cur_rx c30bd000
 tx: 16 buffers
  c30bd100: 9c00 003c 01ba5a70
  c30bd108: 9c00 003c 01ba5b70
  c30bd110: 9c00 003c 01ba5c70
  c30bd118: 9c00 003c 01ba5d70
  c30bd120: 9c00 0071 01ba5e70
  c30bd128: 9c00 006e 00880080
  c30bd130: 9c00 0076 00880180


* Re: [RFC] genalloc != generic DEVICE memory allocator
From: Pantelis Antoniou @ 2005-12-22  8:38 UTC (permalink / raw)
  To: Andrey Volkov; +Cc: Andrew Morton, jes, linux-kernel, linuxppc-embedded
In-Reply-To: <43A98F90.9010001@varma-el.com>

Andrey Volkov wrote:
> Hello Jes and all
> 
> I try to use your allocator (gen_pool_xxx), idea of which
> is a cute nice thing. But current implementation of it is
> inappropriate for a _device_ (aka onchip, like framebuffer) memory
> allocation, by next reasons:
> 
>  1) Device memory is expensive resource by access time and/or size cost.
>     So we couldn't use (usually) this memory for the free blocks lists.
>  2) Device memory usually have special requirement of access to it
>     (alignment/special insn). So we couldn't use part of allocated
>     blocks for some control structures (this problem solved in your
>     implementation, it's common remark)
>  3) Obvious (IMHO) workflow of mem. allocator look like:
>  	- at startup time, driver allocate some big
> 	  (almost) static mem. chunk(s) for a control/data structures.
>         - during work of the device, driver allocate many small
> 	  mem. blocks with almost identical size.
>     such behavior lead to degeneration of buddy method and
>     transform it to the first/best fit method (with long seek
>     by the free node list).
>  4) The simple binary buddy method is far away from perfect for a device
>     due to a big internal fragmentation. Especially for a
>     network/mfd devices, for which, size of allocated data very
>     often is not a power of 2.
> 
> I start to modify your code to satisfy above demands,
> but firstly I wish to know your, or somebody else, opinion.
> 
> Especially I will very happy if somebody have and could
> provide to all, some device specific memory usage statistics.
> 

Hi Andrey,

FYI, in arch/ppc/lib/rheap.c there's an implementation of a remote heap.

It is currently used for the management of Freescale's CPM1 & CPM2 internal
dual port RAM.

Take a look, it might be what you have in mind.

Regards

Pantelis


* Re: [RFC] RTC subsystem
From: Alessandro Zummo @ 2005-12-22  6:48 UTC (permalink / raw)
  To: Simon Richter; +Cc: linuxppc-dev, lm-sensors, linux-kernel, linuxppc-embedded
In-Reply-To: <43A9E2C9.7080300@hogyros.de>

On Thu, 22 Dec 2005 00:18:33 +0100
Simon Richter <Simon.Richter@hogyros.de> wrote:

> A good ntpd will adjust the speed rather than write to the clock; the
> ntpd shipped by most distributions can already handle multiple time sources.

 Yes, but there's also the kernel, which writes to the clock;
 see for example http://lxr.linux.no/source/arch/arm/kernel/time.c?a=arm#L103 .

> >  later. The /dev/rtc interface only supports one clock.
> >  It can either be extended to have /dev/rtcX or we
> >  can extend the sysfs one to allow clock updating.
> 
> /dev is the way to go IMO. As far as I've understood sysfs, it carries
> meta information about devices and drivers only, the actual
> communication then happens through device nodes still.

 Ok. We can use dynamic device numbers and go for the /dev
 interface. 

> 
> >  NTP mode could then be adjusted to update one or more
> >  of the rtcs. Maybe each RTC could have an attribute
> >  (let's say /sys/class/rtc/rtcX/ntp) which tells the
> >  kernel whether to update it or not.
> 
> That's entirely a userspace thing. What the userspace needs to know from
> the kernel is whether the clock is writable and whether its speed can be
> adjusted.

 Agreed. There are also some variables of interest in
 http://lxr.linux.no/source/include/linux/timex.h?a=arm#L188 ;
 some of them may be usefully exported in sysfs.
 
 
-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Turin, Italy

  http://www.towertech.it


* port prpmc610
From: siman @ 2005-12-22  3:29 UTC (permalink / raw)
  To: linuxppc-embedded
In-Reply-To: <20051222010003.6CEF16895E@ozlabs.org>

Hi All:
I am porting Linux to the prpmc610 board; powerpc610 has the ppc5debug
system. I have tried several defconfigs, but they all failed: the system
cannot run the kernel when I execute it. Does anybody have any experience
porting to this system? Please tell me.
Thank you so much.



-----Original Message-----
From: linuxppc-embedded-bounces@ozlabs.org
[mailto:linuxppc-embedded-bounces@ozlabs.org] On Behalf Of
linuxppc-embedded-request@ozlabs.org
Sent: 22 December 2005 9:00
To: linuxppc-embedded@ozlabs.org
Subject: Linuxppc-embedded Digest, Vol 16, Issue 55


Today's Topics:

   1. Re: [RFC] RTC subsystem (Simon Richter)
   2. Re: [RFC] RTC subsystem (Alessandro Zummo)
   3. Re: [RFC] RTC subsystem (Simon Richter)
   4. [RFC] genalloc != generic DEVICE memory allocator (Andrey Volkov)
   5. Re: [RFC] RTC subsystem (Alessandro Zummo)
   6. Re: [RFC] RTC subsystem (Simon Richter)


----------------------------------------------------------------------

Message: 1
Date: Wed, 21 Dec 2005 13:18:25 +0100
From: Simon Richter <Simon.Richter@hogyros.de>
Subject: Re: [RFC] RTC subsystem
To: Alessandro Zummo <azummo-lists@towertech.it>
Cc: linuxppc-dev@ozlabs.org, linuxppc-embedded@ozlabs.org,
	lm-sensors@lm-sensors.org, linux-arm-kernel@lists.arm.linux.org.uk,
	nslu2-developers@yahoogroups.com
Message-ID: <43A94811.4010704@hogyros.de>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

Alessandro Zummo wrote:

>   I've posted a proposal for a new RTC subsystem on lkml (
> http://lkml.org/lkml/2005/12/20/220 ) .

I agree that there is room for improvement. Do you have a specific
structure in mind? Specifically,

  - which functions do you believe to be generic,
  - how should multiple RTCs be handled,
  - are read-only (radio controlled) RTCs taken care of?

At present, I don't have time to help the cause, but I can provide
hosting for a git tree if desired.

    Simon

------------------------------

Message: 2
Date: Wed, 21 Dec 2005 16:07:12 +0100
From: Alessandro Zummo <azummo-lists@towertech.it>
Subject: Re: [RFC] RTC subsystem
To: Simon Richter <Simon.Richter@hogyros.de>
Cc: linuxppc-dev@ozlabs.org, linuxppc-embedded@ozlabs.org,
	lm-sensors@lm-sensors.org, linux-arm-kernel@lists.arm.linux.org.uk,
	nslu2-developers@yahoogroups.com
Message-ID: <20051221160712.2d322f42@inspiron>
Content-Type: text/plain; charset=US-ASCII

On Wed, 21 Dec 2005 13:18:25 +0100
Simon Richter <Simon.Richter@hogyros.de> wrote:

> >   I've posted a proposal for a new RTC subsystem on lkml (
> > http://lkml.org/lkml/2005/12/20/220 ) .
>
> I agree that there is room for improvement. Do you have a specific
> structure in mind? Specifically,

 Hi Simon,
  the proposal actually had a fully-working patch attached :)

>   - which functions do you believe to be generic,
>   - how should multiple RTCs be handled,

 In my code, the first RTC that registers is bound to /proc/driver/rtc
 and /dev/rtc (if those interfaces are compiled in, as they are all
 selectable).

 The other RTCs are available through /sys/class/rtc/rtcX (again, if
 compiled in).

>   - are read-only (radio controlled) RTCs taken care of?

 You have full control of which functions you will provide to the upper
 layer. Obviously, if you try to set the time on a read-only RTC, you
 will get an error.

> At present, I don't have time to help the cause, but I can provide
> hosting for a git tree if desired.

 Thanks, I'll consider it if the need arises.

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Turin, Italy

  http://www.towertech.it



------------------------------

Message: 3
Date: Wed, 21 Dec 2005 17:02:55 +0100
From: Simon Richter <Simon.Richter@hogyros.de>
Subject: Re: [RFC] RTC subsystem
To: Alessandro Zummo <azummo-lists@towertech.it>
Cc: linuxppc-dev@ozlabs.org, lm-sensors@lm-sensors.org,
	linuxppc-embedded@ozlabs.org
Message-ID: <43A97CAF.50301@hogyros.de>
Content-Type: text/plain; charset="iso-8859-1"

Hello,

Alessandro Zummo wrote:

>   the proposal actually had a fully-working patch attached :)

Ah, didn't see that, as I just skimmed over the web archive page you
linked to, which has no link to the actual patch (or I'm too stupid to
find it).

>  In my code, the first rtc that register is bound to /proc/driver/rtc
> and /dev/rtc (if those interfaces are compiled in, as they are all
> selectable).

It would be good to have a way to change which clock is the "primary"
one from userspace later (userspace because this is clearly site policy).

>  You have full control of which functions you will provide to the
> upper layer. Obviously if you try to set the time on a read-only rtc,
> you will get an error.

Sure. I was thinking of the question which error that should be.

    Simon

------------------------------

Message: 4
Date: Wed, 21 Dec 2005 20:23:28 +0300
From: Andrey Volkov <avolkov@varma-el.com>
Subject: [RFC] genalloc != generic DEVICE memory allocator
To: jes@trained-monkey.org
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org,
	linuxppc-embedded@ozlabs.org
Message-ID: <43A98F90.9010001@varma-el.com>
Content-Type: text/plain; charset=KOI8-R

Hello Jes and all

I am trying to use your allocator (gen_pool_xxx); the idea behind it is
a neat one. But the current implementation is unsuitable for _device_
(aka on-chip, like framebuffer) memory allocation, for the following
reasons:

 1) Device memory is an expensive resource in terms of access time
    and/or size. So we (usually) cannot use this memory for the
    free-block lists.
 2) Device memory usually has special access requirements
    (alignment/special insn). So we cannot use part of the allocated
    blocks for control structures (this problem is solved in your
    implementation; it is a general remark).
 3) The obvious (IMHO) workflow of a mem. allocator looks like this:
        - at startup time, the driver allocates some big
          (almost) static mem. chunk(s) for control/data structures;
        - while the device is running, the driver allocates many small
          mem. blocks of almost identical size.
    Such behavior makes the buddy method degenerate into a
    first/best-fit method (with a long search through the free node
    list).
 4) The simple binary buddy method is far from perfect for a device
    because of its large internal fragmentation, especially for
    network/mfd devices, where the size of the allocated data is very
    often not a power of 2.

I have started to modify your code to satisfy the above demands,
but first I would like to know your opinion, or somebody else's.

I would be especially happy if somebody has, and could share with
everyone, some device-specific memory usage statistics.

--
Regards
Andrey Volkov



------------------------------

Message: 5
Date: Wed, 21 Dec 2005 18:41:22 +0100
From: Alessandro Zummo <azummo-lists@towertech.it>
Subject: Re: [RFC] RTC subsystem
To: Simon Richter <Simon.Richter@hogyros.de>
Cc: linuxppc-dev@ozlabs.org, lm-sensors@lm-sensors.org,
	linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org
Message-ID: <20051221184122.5253df01@inspiron>
Content-Type: text/plain; charset=US-ASCII

On Wed, 21 Dec 2005 17:02:55 +0100
Simon Richter <Simon.Richter@hogyros.de> wrote:

> >   the proposal actually had a fully-working patch attached :)
>
> Ah, didn't see that, as I just skimmed over the web archive page you
> linked to, which has no link to the actual patch (or I'm too stupid to
> find it).

 right.. the link was to 0/6 of the patchset, which is
 actually only the introduction. real patch was in subsequent
 messages.

> >  In my code, the first rtc that register is bound
> >  to /proc/driver/rtc and /dev/rtc (if those interfaces
> >  are compiled in, as they are all selectable).
>
> It would be good to have a way to change which clock is the "primary"
> one from userspace later (userspace because this is clearly site policy).

 If I'm not wrong, the RTC is usually queried at bootup
 and written to on shutdown. If NTP mode is active,
 it is also written every 11 minutes.

 So my intention was to emulate that interface as a starting
 point. Then we can update the userspace utilities (hwclock)
 to let the user choose which clock he wants to use.

 I guess /proc/driver/rtc will be deprecated sooner or
 later. The /dev/rtc interface only supports one clock.
 It can either be extended to have /dev/rtcX or we
 can extend the sysfs one to allow clock updating.

 NTP mode could then be adjusted to update one or more
 of the rtcs. Maybe each RTC could have an attribute
 (let's say /sys/class/rtc/rtcX/ntp) which tells the
 kernel whether to update it or not.

 This way we will not have a primary clock anymore.

> >  You have full control of which functions you will provide
> >  to the upper layer. Obviously if you try to set the
> >  time on a read-only rtc, you will get an error.
>
> Sure. I was thinking of the question which error that should be.

 -EPERM ? -EACCES? :)

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Turin, Italy

  http://www.towertech.it



------------------------------

Message: 6
Date: Thu, 22 Dec 2005 00:18:33 +0100
From: Simon Richter <Simon.Richter@hogyros.de>
Subject: Re: [RFC] RTC subsystem
To: Alessandro Zummo <azummo-lists@towertech.it>
Cc: linuxppc-dev@ozlabs.org, lm-sensors@lm-sensors.org,
	linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org
Message-ID: <43A9E2C9.7080300@hogyros.de>
Content-Type: text/plain; charset="iso-8859-1"

Hello,

Alessandro Zummo wrote:

>>It would be good to have a way to change which clock is the "primary"
>>one from userspace later (userspace because this is clearly site policy).

>  If I'm not wrong, the RTC is usually queried at bootup
>  and written to on shutdown. If NTP mode is active,
>  it is also written every 11 minutes.

A good ntpd will adjust the speed rather than write to the clock; the
ntpd shipped by most distributions can already handle multiple time sources.

I'm thinking of the case where a computer is not attached to a network
but needs accurate time; in this case I'd give it a battery powered RTC
and a time signal receiver. As most time signals are low-bandwidth, they
may not carry full time information in each tick so it may take several
minutes to fully synchronize. In this case I'd like to use the battery
backed up clock first and switch later on when synchronized.

>  I guess /proc/driver/rtc will be deprecated sooner or
>  later. The /dev/rtc interface only supports one clock.
>  It can either be extended to have /dev/rtcX or we
>  can extend the sysfs one to allow clock updating.

/dev is the way to go IMO. As far as I've understood sysfs, it carries
meta information about devices and drivers only, the actual
communication then happens through device nodes still.

>  NTP mode could then be adjusted to update one or more
>  of the rtcs. Maybe each RTC could have an attribute
>  (let's say /sys/class/rtc/rtcX/ntp) which tells the
>  kernel whether to update it or not.

That's entirely a userspace thing. What the userspace needs to know from
the kernel is whether the clock is writable and whether its speed can be
adjusted.

>  -EPERM ? -EACCES? :)

-EIO or -ENOSYS would also be possible options.

   Simon

------------------------------

End of Linuxppc-embedded Digest, Vol 16, Issue 55
*************************************************

^ permalink raw reply

* Re: linux DMA capabilities in MV64460
From: KokHow Teh @ 2005-12-22  2:59 UTC (permalink / raw)
  To: Phil.Nitschke, Linuxppc-embedded



>Currently there is a 2M aperture on the device, but it is not being seen
>as "prefetchable", so when I try to get data from the device using
>repetitive reads, they are very slow.  Hence my efforts to get DMA
>happening.

>Presumably the CPU/bridge discovers PCI device memory regions during bus
>enumeration.  What characteristic of a device determines whether the
>memory region is going to be marked as "prefetchable"?

Whether a region is "prefetchable" is determined by bit 3 of its PCI memory BAR.
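
As a rough illustration of that bit test (a sketch only, not code from this
thread): in a memory BAR, bit 0 selects I/O versus memory space, bits 2:1
give the address width, and bit 3 is the prefetchable flag. The macro and
helper names below are made up for the example; inside a Linux driver one
would normally check the resource flags the PCI core has already decoded
(IORESOURCE_PREFETCH) rather than parse the raw register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of a PCI Base Address Register (names are illustrative). */
#define DEMO_BAR_SPACE_IO      0x1u  /* bit 0: 1 = I/O space, 0 = memory   */
#define DEMO_BAR_MEM_TYPE_MASK 0x6u  /* bits 2:1: 00 = 32-bit, 10 = 64-bit */
#define DEMO_BAR_MEM_PREFETCH  0x8u  /* bit 3: 1 = prefetchable            */

/* True if the raw BAR value describes a memory (not I/O) region. */
static bool bar_is_memory(uint32_t bar)
{
    return (bar & DEMO_BAR_SPACE_IO) == 0;
}

/* Bit 3 only has this meaning for memory BARs, so check the space first. */
static bool bar_is_prefetchable(uint32_t bar)
{
    return bar_is_memory(bar) && (bar & DEMO_BAR_MEM_PREFETCH) != 0;
}

/* True if the memory BAR is the low half of a 64-bit address. */
static bool bar_is_64bit(uint32_t bar)
{
    return bar_is_memory(bar) && (bar & DEMO_BAR_MEM_TYPE_MASK) == 0x4u;
}
```

So for an FPGA-based device, the firmware would have to set bit 3 in the
BAR it presents for the 2M aperture before the region can be treated as
prefetchable.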

>Does this attribute also affect whether DMA will work?


  MAG> You may want to pick up "PCI System Architecture" from Mindshare,
  MAG> Inc.  There are ones for PCI-X and PCI-Express too, I think.
  MAG> Well worth the money.

>Sounds like a good idea.  I'd hoped not to have to become a PCI expert,
>but it seems that there is a lot for me to learn just to determine how
>best to design my driver.

Here is a good online reference, but it does not cover DMA and
cache coherency in great detail.
http://www.tldp.org/LDP/tlk/dd/pci.html

^ permalink raw reply

* Re: linux DMA capabilities in MV64460
From: Phil Nitschke @ 2005-12-22  0:54 UTC (permalink / raw)
  To: linuxppc-embedded
In-Reply-To: <20051220010136.GA31165@mag.az.mvista.com>

>>>>> "MAG" == Mark A Greer <mgreer@mvista.com> writes:

  MAG> Hi Phil,
  MAG> [Note: I'm cc'ing linuxppc-embedded for others to reference and to
  MAG> add their thoughts.]

OK, I've just subscribed...

  MAG> On Tue, Dec 20, 2005 at 10:49:58AM +1030, Phil Nitschke wrote:
  >> Hi Mark,
  >>
  >> I'm developing a device driver to run in the 2.6.10 kernel.  I want to

  MAG> That's a pretty old kernel.  Do you have the option of using a more
  MAG> recent one like 2.6.14?

That might be possible if I reverse-engineer a patch file by comparing
the Artesyn reference kernel (2.6.10) with the kernel.org version, then
trying to apply that patch to the latest kernel.  I'll try this later...

  >> get large amounts of data from a custom peripheral on the PCI bus.  The
  >> software is running on an Artesyn PmPPC7448, which includes a Discovery
  >> III bridge.

  MAG> Can you share exact platform you're using?

I'm using a PMC processor on a custom carrier card (not made by Avalon).
Here are the respective links:

  Carrier:   http://www.tenix.com.au/Main.asp?ID=938
  Processor: http://www.artesyncp.com/products/PmPPC7448.html

  MAG> The bridge supports bursting on the PCI bus as long as the bridge
  MAG> is configured correctly and the PCI device is making an
  MAG> appropriate request.  Note, however, that there are many errata
  MAG> for the Marvell parts including some with cache coherency.  If
  MAG> your system is running with coherency on, you may have to limit
  MAG> your bursts to 32 bytes (i.e., the size of one cache line).

  MAG> You can see how the bursting is set up on the bridge by looking
  MAG> at the platform file for your board (e.g.,
  MAG> <file:arch/ppc/platforms/katana.c> in the latest linux
  MAG> kernel)--search for 'BURST'.

As far as I can tell, there is no platform file for this board in the
mainstream kernel.

In the reference kernel provided by Artesyn, there is a file named
arch/ppc/configs/pmppc7447_defconfig, where CONFIG_NOT_COHERENT_CACHE=y

Therefore in arch/ppc/platforms/pmppc7447.c, there is some code which
does this:

#if defined(CONFIG_NOT_COHERENT_CACHE)
        mv64x60_write(&bh, MV64360_SRAM_CONFIG, 0x00160000);
#else
        mv64x60_write(&bh, MV64360_SRAM_CONFIG, 0x001600b2);
#endif

... and later ...

        for (i = 0; i < MV64x60_CPU2MEM_WINDOWS; i++) {
#if defined(CONFIG_NOT_COHERENT_CACHE)
                si.cpu_prot_options[i] = 0;
                si.enet_options[i] = MV64360_ENET2MEM_SNOOP_NONE;
                si.mpsc_options[i] = MV64360_MPSC2MEM_SNOOP_NONE;
                si.idma_options[i] = MV64360_IDMA2MEM_SNOOP_NONE;
                si.pci_0.acc_cntl_options[i] =
                    MV64360_PCI_ACC_CNTL_SNOOP_NONE |
                    MV64360_PCI_ACC_CNTL_SWAP_NONE |
                    MV64360_PCI_ACC_CNTL_MBURST_128_BYTES |
                    MV64360_PCI_ACC_CNTL_RDSIZE_256_BYTES;
#else
                si.cpu_prot_options[i] = 0;
                si.enet_options[i] = MV64360_ENET2MEM_SNOOP_NONE;       /* errata */
                si.mpsc_options[i] = MV64360_MPSC2MEM_SNOOP_NONE;       /* errata */
                si.idma_options[i] = MV64360_IDMA2MEM_SNOOP_NONE;       /* errata */
                si.pci_0.acc_cntl_options[i] =
                    MV64360_PCI_ACC_CNTL_SNOOP_WB |
                    MV64360_PCI_ACC_CNTL_SWAP_NONE |
                    MV64360_PCI_ACC_CNTL_MBURST_32_BYTES |
                    MV64360_PCI_ACC_CNTL_RDSIZE_32_BYTES;
#endif
        }

But I'm yet to learn what all this means...

  >> Is there a summary of what is possible and/or not possible with the 4
  >> IDMA channels on the mv64460?

  MAG> The only real documentation is the bridge's user manual from Marvell.
  MAG> Unfortunately, you must sign an NDA to get access to it so I can't share
  MAG> mine with you.  You will need access to that info to get very far so I
  MAG> recommend you contact the people in your company that can make that
  MAG> happen, ASAP.

I talked with a person from Marvell's only Australian distributor, who
told me that they'd not be too keen to give us an NDA, since we're not
developing a project specifically for the Marvell, rather we're using a
Marvell which has already been integrated in the Artesyn card.
Therefore, he argued, Marvell would tell me to go to Artesyn for the
info, as they already have the NDA.  So for now, assume no NDA, no errata.

  >> For example, if the device that I'm trying to get data from supported a
  >> DMA engine capable of initiating bursts on the PCI bus (it currently
  >> can't do this), does the current kernel code support that?

  MAG> That's a hardware feature so it's not really an issue of kernel support
  MAG> other than ensuring that the firmware and/or kernel configures the bridge
  MAG> correctly.  IOW, it can be supported by software but it's an issue of
  MAG> whether your hardware supports it (and it actually works).

I'm not sure here whether you're talking about the hardware in the
CPU/bridge, or the hardware in the device.  Since the device interfaces
to the PCI bus using firmware inside an FPGA, this is configurable (to a
certain extent).

Currently there is a 2M aperture on the device, but it is not being seen
as "prefetchable", so when I try to get data from the device using
repetitive reads, they are very slow.  Hence my efforts to get DMA
happening.

Presumably the CPU/bridge discovers PCI device memory regions during bus
enumeration.  What characteristic of a device determines whether the
memory region is going to be marked as "prefetchable"?

Does this attribute also affect whether DMA will work?

  >> Or if I wanted to suck the data into main memory using the mv64460 IDMA
  >> controller (assuming the device couldn't initiate its own burst writes),
  >> is there a standard kernel interface to allow me to do this?

  MAG> Yes.  You would make a "dma ctlr driver" for the dma ctlr(s).  I
  MAG> don't know what the best example would be but hopefully someone
  MAG> else has a suggestion.

OK, I'll look into this.  I've been using the O'Reilly book "Linux
Device Drivers, Third Edition" by Jonathan Corbet, Alessandro Rubini,
and Greg Kroah-Hartman.  They say "The kernel developers recommend the
use of streaming mappings over coherent mappings whenever possible."

I'm not sure how the H/W vs S/W coherency discussion has anything to do
with their assertion.  I had previously thought that allocating a huge
buffer (for example at boot time) would be the way to go, but perhaps
getting the CPU to collect the data in smaller amounts into cache
coherent memory will give me the best performance?

  MAG> You may want to pick up "PCI System Architecture" from Mindshare,
  MAG> Inc.  There are ones for PCI-X and PCI-Express too, I think.
  MAG> Well worth the money.

Sounds like a good idea.  I'd hoped not to have to become a PCI expert,
but it seems that there is a lot for me to learn just to determine how
best to design my driver.

Thanks for your input.

--
Phil

^ permalink raw reply

* Re: [RFC] RTC subsystem
From: Simon Richter @ 2005-12-21 23:18 UTC (permalink / raw)
  To: Alessandro Zummo
  Cc: linuxppc-dev, lm-sensors, linux-kernel, linuxppc-embedded
In-Reply-To: <20051221184122.5253df01@inspiron>

[-- Attachment #1: Type: text/plain, Size: 1764 bytes --]

Hello,

Alessandro Zummo wrote:

>>It would be good to have a way to change which clock is the "primary" 
>>one from userspace later (userspace because this is clearly site policy).

>  If I'm not wrong, the RTC is usually queried at bootup
>  and written to on shutdown. If NTP mode is active, 
>  it is also written every 11 minutes.

A good ntpd will adjust the speed rather than write to the clock; the
ntpd shipped by most distributions can already handle multiple time sources.

I'm thinking of the case where a computer is not attached to a network
but needs accurate time; in this case I'd give it a battery powered RTC
and a time signal receiver. As most time signals are low-bandwidth, they
may not carry full time information in each tick so it may take several
minutes to fully synchronize. In this case I'd like to use the battery
backed up clock first and switch later on when synchronized.

>  I guess /proc/driver/rtc will be deprecated sooner or
>  later. The /dev/rtc interface only supports one clock.
>  It can either be extended to have /dev/rtcX or we
>  can extend the sysfs one to allow clock updating.

/dev is the way to go IMO. As far as I've understood sysfs, it carries
meta information about devices and drivers only, the actual
communication then happens through device nodes still.

>  NTP mode could then be adjusted to update one or more
>  of the rtcs. Maybe each RTC could have an attribute
>  (let's say /sys/class/rtc/rtcX/ntp) which tells the
>  kernel whether to update it or not.

That's entirely a userspace thing. What the userspace needs to know from
the kernel is whether the clock is writable and whether its speed can be
adjusted.

>  -EPERM ? -EACCES? :)

-EIO or -ENOSYS would also be possible options.
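
One way to picture the question (a hypothetical sketch: `demo_rtc_ops`,
`demo_rtc_set_time` and `demo_store_time` are invented here and are not the
structures from the proposed patch): a read-only clock leaves its set-time
hook empty, and the common code picks one errno to report. -EACCES is used
below only as a placeholder for whichever code is eventually agreed on.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified operations table for an RTC driver. */
struct demo_rtc_ops {
    int (*read_time)(long *seconds);
    int (*set_time)(long seconds);  /* NULL for a read-only clock */
};

/* Last value written by the example writable clock below. */
static long demo_last_set;

/* Example set_time hook for a writable clock: just remember the value. */
static int demo_store_time(long seconds)
{
    demo_last_set = seconds;
    return 0;
}

/*
 * Common entry point: refuse to set a clock that has no set_time hook.
 * Candidates raised in the discussion: -EPERM, -EACCES, -EIO, -ENOSYS.
 */
static int demo_rtc_set_time(const struct demo_rtc_ops *ops, long seconds)
{
    if (ops == NULL)
        return -EINVAL;
    if (ops->set_time == NULL)
        return -EACCES;  /* placeholder choice, see the lead-in */
    return ops->set_time(seconds);
}
```

Whatever code is chosen, returning it from one common place would let
userspace tell "this clock can never be set" apart from a transient
failure reported by the driver itself.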

   Simon

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 374 bytes --]

^ permalink raw reply

* Re: [PATCH] Don't allow CONFIG_PMAC_BACKLIGHT on pmac64
From: Paul Mackerras @ 2005-12-21 22:20 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linuxppc-dev
In-Reply-To: <20051221193709.GA49171@muc.de>

Andi Kleen writes:

> It didn't work - I could select it.

Are you sure you don't have CONFIG_BROKEN=y?

Paul.

^ permalink raw reply

* Re: [PATCH] Don't allow CONFIG_PMAC_BACKLIGHT on pmac64
From: Andi Kleen @ 2005-12-21 19:37 UTC (permalink / raw)
  To: Simon Richter; +Cc: linuxppc-dev
In-Reply-To: <43A95FE5.8060504@hogyros.de>

On Wed, Dec 21, 2005 at 03:00:05PM +0100, Simon Richter wrote:
> Hi,
> 
> Andi Kleen wrote:
> 
> >Don't allow setting CONFIG_PMAC_BACKLIGHT on pmac64. It won't compile.
> 
> > -	depends on ADB_PMU && (BROKEN || !PPC64)
> > +	depends on ADB_PMU && !PPC64
> 
> That's why it's marked BROKEN, perhaps?

It didn't work - I could select it.
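
The difference between the two "depends on" expressions quoted above can
be spelled out as plain boolean logic (an illustrative C sketch, not
Kconfig; the function names are made up):

```c
#include <stdbool.h>

/* Old rule: depends on ADB_PMU && (BROKEN || !PPC64) */
static bool backlight_selectable_old(bool adb_pmu, bool broken, bool ppc64)
{
    return adb_pmu && (broken || !ppc64);
}

/* New rule: depends on ADB_PMU && !PPC64 */
static bool backlight_selectable_new(bool adb_pmu, bool ppc64)
{
    return adb_pmu && !ppc64;
}
```

Under the old rule a PPC64 configuration can still select the option
whenever BROKEN is enabled; under the new rule it never can. With BROKEN
disabled the two rules agree.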

-Andi

^ permalink raw reply

* Re: [RFC] RTC subsystem
From: Alessandro Zummo @ 2005-12-21 17:41 UTC (permalink / raw)
  To: Simon Richter; +Cc: linuxppc-dev, lm-sensors, linux-kernel, linuxppc-embedded
In-Reply-To: <43A97CAF.50301@hogyros.de>

On Wed, 21 Dec 2005 17:02:55 +0100
Simon Richter <Simon.Richter@hogyros.de> wrote:

> >   the proposal actually had a fully-working patch attached :)
> 
> Ah, didn't see that, as I just skimmed over the web archive page you 
> linked to, which has no link to the actual patch (or I'm too stupid to 
> find it).

 right.. the link was to 0/6 of the patchset, which is
 actually only the introduction. real patch was in subsequent
 messages.

> >  In my code, the first rtc that register is bound
> >  to /proc/driver/rtc and /dev/rtc (if those interfaces
> >  are compiled in, as they are all selectable).
> 
> It would be good to have a way to change which clock is the "primary" 
> one from userspace later (userspace because this is clearly site policy).

 If I'm not wrong, the RTC is usually queried at bootup
 and written to on shutdown. If NTP mode is active, 
 it is also written every 11 minutes.

 So my intention was to emulate that interface as a starting
 point. Then we can update the userspace utilities (hwclock)
 to let the user choose which clock he wants to use.

 I guess /proc/driver/rtc will be deprecated sooner or
 later. The /dev/rtc interface only supports one clock.
 It can either be extended to have /dev/rtcX or we
 can extend the sysfs one to allow clock updating.

 NTP mode could then be adjusted to update one or more
 of the rtcs. Maybe each RTC could have an attribute
 (let's say /sys/class/rtc/rtcX/ntp) which tells the
 kernel whether to update it or not.
  
 This way we will not have a primary clock anymore.

> >  You have full control of which functions you will provide
> >  to the upper layer. Obviously if you try to set the
> >  time on a read-only rtc, you will get an error.
> 
> Sure. I was thinking of the question which error that should be.

 -EPERM ? -EACCES? :)

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Turin, Italy

  http://www.towertech.it

^ permalink raw reply

* [RFC] genalloc != generic DEVICE memory allocator
From: Andrey Volkov @ 2005-12-21 17:23 UTC (permalink / raw)
  To: jes; +Cc: Andrew Morton, linux-kernel, linuxppc-embedded

Hello Jes and all

I am trying to use your allocator (gen_pool_xxx); the idea behind it is
a neat one. But the current implementation is unsuitable for _device_
(aka on-chip, like framebuffer) memory allocation, for the following
reasons:

 1) Device memory is an expensive resource in terms of access time
    and/or size. So we (usually) cannot use this memory for the
    free-block lists.
 2) Device memory usually has special access requirements
    (alignment/special insn). So we cannot use part of the allocated
    blocks for control structures (this problem is solved in your
    implementation; it is a general remark).
 3) The obvious (IMHO) workflow of a mem. allocator looks like this:
        - at startup time, the driver allocates some big
          (almost) static mem. chunk(s) for control/data structures;
        - while the device is running, the driver allocates many small
          mem. blocks of almost identical size.
    Such behavior makes the buddy method degenerate into a
    first/best-fit method (with a long search through the free node
    list).
 4) The simple binary buddy method is far from perfect for a device
    because of its large internal fragmentation, especially for
    network/mfd devices, where the size of the allocated data is very
    often not a power of 2.
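
To put a number on point 4 (an illustrative sketch, not a measurement):
a binary buddy allocator rounds each request up to the next power of
two, so the waste per block is the difference between the two. The
function names are invented for this example, and 1518 bytes is used
only as a familiar size (a maximum-length untagged Ethernet frame).

```c
#include <stddef.h>
#include <stdint.h>

/* Block size a binary buddy allocator would hand out for a request. */
static size_t buddy_block_size(size_t request)
{
    size_t block = 1;

    /* Stop doubling before the shift could overflow size_t. */
    while (block < request && block <= SIZE_MAX / 2)
        block <<= 1;
    return block;
}

/* Bytes lost inside the block (internal fragmentation). */
static size_t buddy_internal_waste(size_t request)
{
    size_t block = buddy_block_size(request);

    return block >= request ? block - request : 0;
}
```

For a 1518-byte request the block is 2048 bytes, so 530 bytes (about
26% of the block) are lost, while a 2048-byte request wastes nothing.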

I have started to modify your code to satisfy the above demands,
but first I would like to know your opinion, or somebody else's.

I would be especially happy if somebody has, and could share with
everyone, some device-specific memory usage statistics.

-- 
Regards
Andrey Volkov

^ permalink raw reply

