LinuxPPC-Dev Archive on lore.kernel.org
* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Adam Zilkie @ 2009-09-03 16:04 UTC (permalink / raw)
  To: benh; +Cc: Tom Burns, Chris Pringle, Andrea Zypchen, linuxppc-dev
In-Reply-To: <1251971849.15089.28.camel@pasglop>

Ben,

Thanks for your info.

Are you sure there is L2 cache on the 440?

I am seeing this problem with our custom IDE driver which is based on
pretty old code. Our driver uses pci_alloc_consistent() to allocate the
physical DMA memory and alloc_pages() to allocate a virtual page. It
then uses pci_map_sg() to map to a scatter/gather buffer. Perhaps I
should convert these to the DMA API calls as you suggest.

Regards,
Adam

On Thu, 2009-09-03 at 19:57 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2009-09-03 at 09:05 +0100, Chris Pringle wrote:
> > Hi Adam,
> > 
> > If you have a look in include/asm-ppc/pgtable.h for the following section:
> > #ifdef CONFIG_44x
> > #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
> > #else
> > #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
> > #endif
> > 
> > Try adding _PAGE_COHERENT to the appropriate line above and see if that 
> > fixes your issue - this causes the 'M' bit to be set on the page which 
> > should enforce cache coherency. If it doesn't, you'll need to check the 
> > 'M' bit isn't being masked out in head_44x.S (it was originally masked 
> > out on arch/powerpc, but was fixed in later kernels when the cache 
> > coherency issues with non-SMP systems were resolved).
> 
> I have some doubts about the usefulness of doing that for 4xx. AFAIK,
> the 440 core just ignores M.
> 
> The problem lies probably elsewhere. Maybe the L2 cache coherency isn't
> enabled or not working ?
> 
> The L1 cache on 440 is simply not coherent, so drivers have to make sure
> they use the appropriate DMA APIs which will do cache flushing when
> needed.
> 
> Adam, what driver is causing you that sort of problems ?
> 
> Cheers,
> Ben.
> 
> 
-- 
Adam Zilkie
Software Designer,
International Datacasting Corp.

This message and the documents attached hereto are intended only for the addressee and may contain privileged or confidential information. Any unauthorized disclosure is strictly prohibited. If you have received this message in error, please notify us immediately so that we may correct our internal records. Please then delete the original message. Thank you.


* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Adam Zilkie @ 2009-09-03 15:54 UTC (permalink / raw)
  To: chris.pringle; +Cc: Tom Burns, Andrea Zypchen, linuxppc-dev
In-Reply-To: <4A9F78AF.4010206@oxtel.com>

Chris,

I noticed the following comment in pgtable.h: 

* - CACHE COHERENT bit (M) has no effect on PPC440 core, because it
 *     doesn't support SMP. So we can use this as software bit, like
 *     DIRTY.

And _PAGE_COHERENT is not defined for the 44x (giving a compile error
when I add it to the _PAGE_BASE line as you suggested). This would confirm
that the M bit is meaningless for the PPC440.

Regards,
Adam


On Thu, 2009-09-03 at 09:05 +0100, Chris Pringle wrote:
> Hi Adam,
> 
> If you have a look in include/asm-ppc/pgtable.h for the following section:
> #ifdef CONFIG_44x
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
> #else
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
> #endif
> 
> Try adding _PAGE_COHERENT to the appropriate line above and see if that 
> fixes your issue - this causes the 'M' bit to be set on the page which 
> should enforce cache coherency. If it doesn't, you'll need to check the 
> 'M' bit isn't being masked out in head_44x.S (it was originally masked 
> out on arch/powerpc, but was fixed in later kernels when the cache 
> coherency issues with non-SMP systems were resolved).
> 
> The patch I had fixed two problems on 2.6.26 for 'powerpc':
> 1) It stopped the 'M' bit being masked out (head_32.S)
> 2) It set the cache coherency ('M' bit) flag on each page table entry 
> (pgtable-ppc32.h)
> 
> Hope this helps!
> 
> Cheers,
> Chris
> 
> Adam Zilkie wrote:
> > Hi Chris,
> >
> > I am having a problem similar to what you described in this discussion.
> > We are using the ppc arch with 2.6.24 and CONFIG_SEQUOIA, which compiles
> > arch/ppc/kernel/head_44x.S (quite different
> > from /arch/powerpc/kernel/head_32.S). I would like to apply your
> > backporting patch to this architecture. Any help would be appreciated.
> >
> > Regards,
> > Adam 
> >
> >   
> 
> 
-- 
Adam Zilkie
Software Designer,
International Datacasting Corp.



* [PATCH] powerpc: Fix i8259 interrupt driver kernel crash on ML510
From: Grant Likely @ 2009-09-03 15:57 UTC (permalink / raw)
  To: linuxppc-dev, benh, linux-kernel, torvalds; +Cc: Roderick Colenbrander

From: Roderick Colenbrander <thunderbird2k@gmail.com>

This patch fixes a NULL pointer dereference caused by the removal of
'ack()' for level interrupts in the Xilinx interrupt driver.  A recent
change to the Xilinx interrupt controller removed the ack hook for
level IRQs.

Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
---

Hi Ben & Linus,

This is a last-minute bug fix that must go into 2.6.31.  This patch
is needed to prevent a kernel panic on Xilinx ML510 boards.

I've also pushed the patch out to my git tree if you'd prefer to pull:

The following changes since commit 326ba5010a5429a5a528b268b36a5900d4ab0eba:
  Linus Torvalds (1):
        Linux 2.6.31-rc8

are available in the git repository at:

  git://git.secretlab.ca/git/linux-2.6 merge

Roderick Colenbrander (1):
      powerpc: Fix i8259 interrupt driver kernel crash on ML510

 arch/powerpc/sysdev/xilinx_intc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)



diff --git a/arch/powerpc/sysdev/xilinx_intc.c b/arch/powerpc/sysdev/xilinx_intc.c
index 3ee1fd3..40edad5 100644
--- a/arch/powerpc/sysdev/xilinx_intc.c
+++ b/arch/powerpc/sysdev/xilinx_intc.c
@@ -234,7 +234,6 @@ static void xilinx_i8259_cascade(unsigned int irq, struct irq_desc *desc)
 		generic_handle_irq(cascade_irq);
 
 	/* Let xilinx_intc end the interrupt */
-	desc->chip->ack(irq);
 	desc->chip->unmask(irq);
 }
 

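The failure mode described in the patch above can be illustrated with a small user-space model. This is only a sketch: struct toy_irq_chip, toy_level_chip and the helper names are hypothetical and are not the kernel's irq_chip interface. It assumes, as the commit message states, that the chip used for level interrupts no longer provides an ack hook.

```c
#include <stddef.h>

/* Hypothetical stand-in for an interrupt chip: a table of optional hooks. */
struct toy_irq_chip {
    void (*ack)(unsigned int irq);    /* may be NULL */
    void (*unmask)(unsigned int irq);
};

static unsigned int toy_last_unmasked;

static void toy_unmask(unsigned int irq)
{
    toy_last_unmasked = irq;
}

/* A level-triggered chip that provides no ack hook. */
static const struct toy_irq_chip toy_level_chip = {
    .ack = NULL,
    .unmask = toy_unmask,
};

/* Same shape as the patched cascade tail: only unmask is called. */
static void toy_cascade_end(const struct toy_irq_chip *chip, unsigned int irq)
{
    /* Calling chip->ack(irq) here would jump through NULL for toy_level_chip. */
    chip->unmask(irq);
}

/* Defensive alternative: call the hook only if it exists. Returns 1 if called. */
static int toy_ack_if_present(const struct toy_irq_chip *chip, unsigned int irq)
{
    if (chip->ack == NULL)
        return 0;
    chip->ack(irq);
    return 1;
}
```

The patch takes the first approach and simply drops the call. The guarded variant is shown only as a contrast and is not something the patch proposes.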

* Fix i8259 kernel crash on ML510
From: Roderick Colenbrander @ 2009-09-03 13:14 UTC (permalink / raw)
  To: grant.likely, linuxppc-dev

From 11a2072b285c2eb0f19980ad729229d4ebf22291 Mon Sep 17 00:00:00 2001
From: Roderick Colenbrander <colenbrander@CE202.(none)>
Date: Thu, 3 Sep 2009 15:11:08 +0200
Subject: [PATCH] This patch fixes a null pointer exception caused by
removal of 'ack()' for level interrupts in the Xilinx interrupt driver.

---
 arch/powerpc/sysdev/xilinx_intc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/sysdev/xilinx_intc.c b/arch/powerpc/sysdev/xilinx_intc.c
index 3ee1fd3..40edad5 100644
--- a/arch/powerpc/sysdev/xilinx_intc.c
+++ b/arch/powerpc/sysdev/xilinx_intc.c
@@ -234,7 +234,6 @@ static void xilinx_i8259_cascade(unsigned int irq, struct irq_desc *desc)
 		generic_handle_irq(cascade_irq);
 
 	/* Let xilinx_intc end the interrupt */
-	desc->chip->ack(irq);
 	desc->chip->unmask(irq);
 }
 
-- 
1.6.0.4


* Fix i8259 kernel crash on ML510 [with signed-off]
From: Roderick Colenbrander @ 2009-09-03 13:18 UTC (permalink / raw)
  To: grant.likely, linuxppc-dev

Hi,

This is the same patch but with a signed-off message which I forgot.

Regards,
Roderick Colenbrander

Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com>

From 11a2072b285c2eb0f19980ad729229d4ebf22291 Mon Sep 17 00:00:00 2001
From: Roderick Colenbrander <colenbrander@CE202.(none)>
Date: Thu, 3 Sep 2009 15:11:08 +0200
Subject: [PATCH] This patch fixes a null pointer exception caused by
removal of 'ack()' for level interrupts in the Xilinx interrupt driver.

---
 arch/powerpc/sysdev/xilinx_intc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/sysdev/xilinx_intc.c b/arch/powerpc/sysdev/xilinx_intc.c
index 3ee1fd3..40edad5 100644
--- a/arch/powerpc/sysdev/xilinx_intc.c
+++ b/arch/powerpc/sysdev/xilinx_intc.c
@@ -234,7 +234,6 @@ static void xilinx_i8259_cascade(unsigned int irq, struct irq_desc *desc)
 		generic_handle_irq(cascade_irq);
 
 	/* Let xilinx_intc end the interrupt */
-	desc->chip->ack(irq);
 	desc->chip->unmask(irq);
 }
 
-- 
1.6.0.4


* Re: time jumps forward/backwards
From: Ben Gamsa @ 2009-09-03 12:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras, Sean MacLennan
In-Reply-To: <4A9D038F.90605@somanetworks.com>

Benjamin Gamsa wrote:
> Benjamin Herrenschmidt wrote:
>> On Mon, 2009-08-31 at 23:57 -0400, Benjamin Gamsa wrote:
>>> Sean MacLennan wrote:
>>>> On Mon, 31 Aug 2009 22:20:00 -0400
>>>> Benjamin Gamsa <ben@somanetworks.com> wrote:
>>>>
>>>>> For what it's worth, the problem occurs even when ntp is not even
>>>>> started.
>>>> This is grasping, but could it have anything to do with the jiffies
>>>> wrapping near startup?
>>>>
>>> I don't know how to test it, but I don't think so, since there are 
>>> multiple of these glitches over an extended period of time.
>>
>> I'm not familiar with all the FSL processor variants, but is this
>> a UP or an SMP platform? In the latter case, are all the core timebases
>> properly synchronized?
>>
> 
> This is a UP with a single e500 core.
> 

I take it from the lack of follow-ups that no one has any good ideas as 
to what might be going wrong?

Since the problem seems to be confined to situations where the date is 
around the epoch, I guess I'll just work around the problem by setting 
the date to a more recent date on startup.

-- 
Ben Gamsa       ben@somanetworks.com
SOMA Networks   312 Adelaide St. W. Suite 600 Toronto, Ontario, M5V1R2


* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Chris Pringle @ 2009-09-03 12:43 UTC (permalink / raw)
  To: Wrobel Heinz-R39252; +Cc: Tom Burns, Andrea Zypchen, linuxppc-dev, azilkie
In-Reply-To: <AAE514D00E55E6438B7F5186462A545202750F9B@zuk35exm20.fsl.freescale.net>

In our case, we were suffering coherency issues on an 8260 when using 
DMA with PCI. Setting the 'M' bit cured all of our DMA coherency issues.

There is a comment in "pgtable-ppc32.h" on 2.6.29.6 that says:
"We always set _PAGE_COHERENT when SMP is enabled *or* the processor 
might need it for DMA coherency". Freescale had also suggested setting 
the 'M' bit when we submitted a support request.

I've no idea how this bit affects other PowerPC chips. Looking briefly 
through some of the header files, it looks as if the 'M' bit should not 
be set for 44x, so the issue is probably not the same as the one I had.

Cheers,
Chris

Wrobel Heinz-R39252 wrote:
> Hi,
>
> This doesn't seem right. If we are talking about a single CPU core chip,
> i.e., just one data cache, then setting M is typically a) useless and
> could even b) cause a performance penalty depending on a chip's
> implementation.
> The M bit is required if *other* cores with caches need to see changes
> for coherency of their caches. You wouldn't set it for one core only
> because your own core knows about its own cache.
> The possible performance penalty could happen because you need some way
> to tell the others that they better intercept a transaction. And that
> could, depending on the chip, be an extra clock or so per transaction.
> Now, in theory, a DMA engine could have caches, read from cache content
> first, and could snoop the bus on global transactions like another core,
> but I have never heard of such a beast. 
>
> Hope this helps,
>
> Heinz
>
> -----Original Message-----
> From: linuxppc-dev-bounces+heinz.wrobel=freescale.com@lists.ozlabs.org
> [mailto:linuxppc-dev-bounces+heinz.wrobel=freescale.com@lists.ozlabs.org
> ] On Behalf Of Chris Pringle
> Sent: Thursday, 3 September 2009 10:05
> To: azilkie@datacast.com
> Cc: Tom Burns; Andrea Zypchen; linuxppc-dev@lists.ozlabs.org
> Subject: Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
>
> Hi Adam,
>
> If you have a look in include/asm-ppc/pgtable.h for the following
> section:
> #ifdef CONFIG_44x
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
> #else
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
> #endif
>
> Try adding _PAGE_COHERENT to the appropriate line above and see if that
> fixes your issue - this causes the 'M' bit to be set on the page which
> should enforce cache coherency. If it doesn't, you'll need to check the
> 'M' bit isn't being masked out in head_44x.S (it was originally masked
> out on arch/powerpc, but was fixed in later kernels when the cache
> coherency issues with non-SMP systems were resolved).
>
> The patch I had fixed two problems on 2.6.26 for 'powerpc':
> 1) It stopped the 'M' bit being masked out (head_32.S)
> 2) It set the cache coherency ('M' bit) flag on each page table entry
> (pgtable-ppc32.h)
>
> Hope this helps!
>
> Cheers,
> Chris
>
> Adam Zilkie wrote:
>> Hi Chris,
>>
>> I am having a problem similar to what you described in this discussion.
>> We are using the ppc arch with 2.6.24 and CONFIG_SEQUOIA, which
>> compiles arch/ppc/kernel/head_44x.S (quite different from
>> /arch/powerpc/kernel/head_32.S). I would like to apply your
>> backporting patch to this architecture. Any help would be appreciated.
>>
>> Regards,
>> Adam
>
____________________________

Miranda Technologies Limited
Registered in England and Wales CN 02017053
Registered Office: James House, Mere Park, Dedmere Road, Marlow, Bucks, SL7 1FJ


* RE: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Wrobel Heinz-R39252 @ 2009-09-03 12:20 UTC (permalink / raw)
  To: Chris Pringle, linuxppc-dev; +Cc: Tom Burns, Andrea Zypchen, azilkie
In-Reply-To: <4A9F78AF.4010206@oxtel.com>

Hi,

This doesn't seem right. If we are talking about a single CPU core chip,
i.e., just one data cache, then setting M is typically a) useless and
could even b) cause a performance penalty depending on a chip's
implementation.
The M bit is required if *other* cores with caches need to see changes
for coherency of their caches. You wouldn't set it for one core only
because your own core knows about its own cache.
The possible performance penalty could happen because you need some way
to tell the others that they better intercept a transaction. And that
could, depending on the chip, be an extra clock or so per transaction.
Now, in theory, a DMA engine could have caches, read from cache content
first, and could snoop the bus on global transactions like another core,
but I have never heard of such a beast.

Hope this helps,

Heinz

-----Original Message-----
From: linuxppc-dev-bounces+heinz.wrobel=freescale.com@lists.ozlabs.org
[mailto:linuxppc-dev-bounces+heinz.wrobel=freescale.com@lists.ozlabs.org]
On Behalf Of Chris Pringle
Sent: Thursday, 3 September 2009 10:05
To: azilkie@datacast.com
Cc: Tom Burns; Andrea Zypchen; linuxppc-dev@lists.ozlabs.org
Subject: Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)

Hi Adam,

If you have a look in include/asm-ppc/pgtable.h for the following
section:
#ifdef CONFIG_44x
#define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
#else
#define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
#endif

Try adding _PAGE_COHERENT to the appropriate line above and see if that
fixes your issue - this causes the 'M' bit to be set on the page which
should enforce cache coherency. If it doesn't, you'll need to check the
'M' bit isn't being masked out in head_44x.S (it was originally masked
out on arch/powerpc, but was fixed in later kernels when the cache
coherency issues with non-SMP systems were resolved).

The patch I had fixed two problems on 2.6.26 for 'powerpc':
1) It stopped the 'M' bit being masked out (head_32.S)
2) It set the cache coherency ('M' bit) flag on each page table entry
(pgtable-ppc32.h)

Hope this helps!

Cheers,
Chris

Adam Zilkie wrote:
> Hi Chris,
>
> I am having a problem similar to what you described in this discussion.
> We are using the ppc arch with 2.6.24 and CONFIG_SEQUOIA, which
> compiles arch/ppc/kernel/head_44x.S (quite different from
> /arch/powerpc/kernel/head_32.S). I would like to apply your
> backporting patch to this architecture. Any help would be appreciated.
>
> Regards,
> Adam
>


-- 

______________________________
Chris Pringle
Software Design Engineer

Miranda Technologies Ltd.
Hithercroft Road
Wallingford
Oxfordshire OX10 9DG
UK

Tel. +44 1491 820206
Fax. +44 1491 820001
www.miranda.com

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


* powerpc test branch build failure with 6xx defconfig + PERF_CTRS
From: Michael Ellerman @ 2009-09-03 12:10 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev list


With benh's test branch, I'm seeing this trying to build a 6xx defconfig
with CONFIG_PPC_PERF_CTRS=y:

arch/powerpc/kernel/perf_counter.c: In function 'power_check_constraints':
arch/powerpc/kernel/perf_counter.c:352: error: the frame size of 1152 bytes is larger than 1024 bytes

cheers



* Re: PPC PCI bus registers
From: Benjamin Herrenschmidt @ 2009-09-03 10:00 UTC (permalink / raw)
  To: Eddie Dawydiuk; +Cc: linuxppc-dev
In-Reply-To: <4A9F0377.1070606@embeddedarm.com>

On Wed, 2009-09-02 at 16:44 -0700, Eddie Dawydiuk wrote:
> Hello,
> 
> I have a question regarding reading PCI bus registers from a user space 
> application running on a PPC SBC. Seeing as though the PCI bus is little endian 
> and PPC is big endian is it typical that one must perform a byte swap on all 16 
> and 32 bit register reads?
> 
> I've found this is true on a custom board I am working on(with an FPGA connected 
> via the PCI bus) and as a result I've added a byte swap command in busybox to 
> accommodate this feature...

Note that powerpc has efficient load/store reverse instructions that
perform the byteswap for you. We use them for IOs in the kernel for
example.

Also, if you're going to access a PCI device directly, beware of other
issues such as ordering. PPC is an out-of-order architecture, so you need
to add the appropriate memory barriers if you want to ensure that
your accesses are done in the order you write them in your program.

For "standard" stuff that doesn't involve DMA or locks, an eieio after
both MMIO loads and stores should do the trick.

If you need to order vs. DMA and/or locks, you may want to look at what
the kernel does in io.h
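
As a rough illustration of the two points above (byte-reversed loads/stores and an eieio after each MMIO access), here is a minimal user-space sketch. The names toy_read_le32 and toy_write_le32 are hypothetical and this is not the kernel's io.h code. It assumes a GCC-compatible compiler, a 4-byte aligned register address, and that the register window has already been mapped somehow (the mapping itself is not shown). The inline assembly is only used on big-endian PowerPC; everywhere else a portable byte-by-byte fallback is used.

```c
#include <stdint.h>

#if defined(__powerpc__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define TOY_USE_PPC_BRX 1
#else
#define TOY_USE_PPC_BRX 0
#endif

/* Read a 32-bit little-endian register; addr must be 4-byte aligned. */
static inline uint32_t toy_read_le32(const volatile void *addr)
{
#if TOY_USE_PPC_BRX
    uint32_t val;
    /* lwbrx loads the word byte-reversed; eieio orders this access
     * against later MMIO accesses. */
    __asm__ __volatile__("lwbrx %0,0,%1\n\teieio"
                         : "=r"(val) : "r"(addr) : "memory");
    return val;
#else
    /* Portable fallback: assemble the value from individual bytes. */
    const volatile uint8_t *p = (const volatile uint8_t *)addr;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
#endif
}

/* Write a 32-bit value to a little-endian register. */
static inline void toy_write_le32(volatile void *addr, uint32_t val)
{
#if TOY_USE_PPC_BRX
    /* stwbrx stores the word byte-reversed, followed by eieio. */
    __asm__ __volatile__("stwbrx %0,0,%1\n\teieio"
                         : : "r"(val), "r"(addr) : "memory");
#else
    volatile uint8_t *p = (volatile uint8_t *)addr;
    p[0] = (uint8_t)(val & 0xff);
    p[1] = (uint8_t)((val >> 8) & 0xff);
    p[2] = (uint8_t)((val >> 16) & 0xff);
    p[3] = (uint8_t)((val >> 24) & 0xff);
#endif
}
```

This only covers the "standard" case without DMA or locks; anything stronger would need the heavier barriers that the kernel uses.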

Cheers,
Ben.


* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Benjamin Herrenschmidt @ 2009-09-03  9:57 UTC (permalink / raw)
  To: Chris Pringle; +Cc: Tom Burns, Andrea Zypchen, linuxppc-dev, azilkie
In-Reply-To: <4A9F78AF.4010206@oxtel.com>

On Thu, 2009-09-03 at 09:05 +0100, Chris Pringle wrote:
> Hi Adam,
> 
> If you have a look in include/asm-ppc/pgtable.h for the following section:
> #ifdef CONFIG_44x
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
> #else
> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
> #endif
> 
> Try adding _PAGE_COHERENT to the appropriate line above and see if that 
> fixes your issue - this causes the 'M' bit to be set on the page which 
> should enforce cache coherency. If it doesn't, you'll need to check the 
> 'M' bit isn't being masked out in head_44x.S (it was originally masked 
> out on arch/powerpc, but was fixed in later kernels when the cache 
> coherency issues with non-SMP systems were resolved).

I have some doubts about the usefulness of doing that for 4xx. AFAIK,
the 440 core just ignores M.

The problem probably lies elsewhere. Maybe the L2 cache coherency isn't
enabled or isn't working?

The L1 cache on 440 is simply not coherent, so drivers have to make sure
they use the appropriate DMA APIs which will do cache flushing when
needed.
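
To make the consequence of a non-coherent cache concrete, here is a toy user-space model. It is not kernel code and does not use the DMA API: struct toy_mem and all toy_*, cpu_* and dev_* names are hypothetical. It assumes a simple write-back cache in which the CPU only sees the cached copy and the DMA device only sees RAM, so a flush is needed before the device reads and an invalidate is needed before the CPU reads what the device wrote.

```c
#include <stdint.h>
#include <string.h>

#define TOY_LEN 4

struct toy_mem {
    uint8_t ram[TOY_LEN];    /* what the DMA device reads and writes */
    uint8_t cache[TOY_LEN];  /* what the CPU reads and writes */
    int valid;               /* cache currently holds a copy */
    int dirty;               /* cache holds data not yet written to ram */
};

static void cpu_write(struct toy_mem *m, const uint8_t *src)
{
    memcpy(m->cache, src, TOY_LEN);
    m->valid = 1;
    m->dirty = 1;
}

static void cpu_read(struct toy_mem *m, uint8_t *dst)
{
    if (!m->valid) {                        /* miss: fill from ram */
        memcpy(m->cache, m->ram, TOY_LEN);
        m->valid = 1;
    }
    memcpy(dst, m->cache, TOY_LEN);
}

/* The device never sees the cache. */
static void dev_write(struct toy_mem *m, const uint8_t *src)
{
    memcpy(m->ram, src, TOY_LEN);
}

static void dev_read(const struct toy_mem *m, uint8_t *dst)
{
    memcpy(dst, m->ram, TOY_LEN);
}

/* Write dirty data back to ram: needed before the device reads. */
static void toy_flush(struct toy_mem *m)
{
    if (m->valid && m->dirty) {
        memcpy(m->ram, m->cache, TOY_LEN);
        m->dirty = 0;
    }
}

/* Drop the cached copy: needed before the CPU reads device-written data. */
static void toy_invalidate(struct toy_mem *m)
{
    m->valid = 0;
    m->dirty = 0;
}
```

In this model, a dev_read() right after cpu_write() returns the old RAM contents until toy_flush() is called, and a cpu_read() after dev_write() returns the old cached contents until toy_invalidate() is called. That is the work the DMA mapping calls take off the driver's hands.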

Adam, which driver is causing you these problems?

Cheers,
Ben.


* Re: [v4 PATCH 1/5]: cpuidle: Cleanup drivers/cpuidle/cpuidle.c
From: Peter Zijlstra @ 2009-09-03  9:40 UTC (permalink / raw)
  To: arun
  Cc: Gautham R Shenoy, linux-kernel, Paul Mackerras, Ingo Molnar,
	linuxppc-dev
In-Reply-To: <20090903044253.GA31928@linux.vnet.ibm.com>

On Thu, 2009-09-03 at 10:12 +0530, Arun R Bharadwaj wrote:

> > OK, that's a start I guess. Best would be to replace all of pm_idle with
> > cpuidle, which is what should have been done from the very start.
> > 
> > If cpuidle cannot fully replace the pm_idle functionality, then it needs
> > to fix that. But having two layers of idle functions is just silly.
> > 
> > Looking at patch 2 and 3, you're making the same mistake on power, after
> > those patches there are multiple ways of registering idle functions, one
> > through some native interface and one through cpuidle, this strikes me
> > as undesirable.
> > 
> > If cpuidle is a good idle function manager, then it should be good
> > enough to be the sole one; if it's not, then why bother with it at all.
> > 
> 
> Okay, I'm giving this approach a shot now. i.e. trying to make cpuidle
> as _the_ sole idle function manager. This would mean doing away with
> pm_idle and ppc_md.power_save. And, cpuidle_idle_call() which is the
> main idle loop of cpuidle, present in drivers/cpuidle/cpuidle.c will
> have to be called from arch specific code of cpu_idle()
> 
> Also this would mean enabling cpuidle for all platforms, even if the
> platform doesn't have multiple idle states. So if a platform doesn't
> have multiple states, it wouldn't want the bloated code of cpuidle
> governors, and would want just a simple cpuidle loop.

Do talk to the powerpc maintainers about this. But yes, something like
that should be doable.

AFAICT the whole governor thing is optional and cpuidle provides a
spinning idle loop by default, and platforms can always register a
simple alternative when they set up bits -- the only thing to be careful
about is not creating a chicken-egg problem where the platform setup
runs before cpuidle is able to register a new handler or something.
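
The idea of a single registration point with a built-in default can be sketched as follows. This is a user-space toy, not the cpuidle interface: idle_fn, toy_idle_register and the other names are hypothetical, and the counters exist only so that the behaviour is observable.

```c
#include <stddef.h>

typedef void (*idle_fn)(void);

static int toy_spin_calls;       /* times the default loop body ran */
static int toy_platform_calls;   /* times the platform handler ran */

/* Stand-in for a default spinning idle loop. */
static void toy_default_spin_idle(void)
{
    toy_spin_calls++;
}

/* Stand-in for a simple platform-specific low-power wait. */
static void toy_platform_idle(void)
{
    toy_platform_calls++;
}

/* Exactly one current handler; the default is always available. */
static idle_fn toy_current_idle = toy_default_spin_idle;

/* Single registration point. Passing NULL restores the default. */
static void toy_idle_register(idle_fn fn)
{
    toy_current_idle = fn ? fn : toy_default_spin_idle;
}

/* What the arch idle loop would call on every pass. */
static void toy_idle_call(void)
{
    toy_current_idle();
}
```

Because the default is installed statically, the idle path is valid even before any platform code has registered a handler, which avoids the ordering problem mentioned above in this toy setting.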

I'd be delighted to see the end of pm_idle on x86.


* ucc_geth.c - NETDEV WATCHDOG: eth2 Tx transmit timeout
From: Shailesh Panchal @ 2009-09-03  9:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: 'Ronak Shah', 'Shailesh Panchal'


Dear All,

We are currently using the MPC8360E processor and have a problem with an
Ethernet (UCC) port that is configured in RMII mode. When more load is
applied to the port, we get the error "NETDEV WATCHDOG: eth2 tx transmit
timeout". I have looked through all the patches related to this but have
not found a solution. Does anyone have an idea how to resolve it?

Awaiting your reply.

Regards,
Shailesh


* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Chris Pringle @ 2009-09-03  8:05 UTC (permalink / raw)
  To: azilkie; +Cc: Tom Burns, Andrea Zypchen, linuxppc-dev
In-Reply-To: <1251926572.10090.17.camel@Adam>

Hi Adam,

If you have a look in include/asm-ppc/pgtable.h for the following section:
#ifdef CONFIG_44x
#define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
#else
#define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
#endif

Try adding _PAGE_COHERENT to the appropriate line above and see if that 
fixes your issue - this causes the 'M' bit to be set on the page which 
should enforce cache coherency. If it doesn't, you'll need to check the 
'M' bit isn't being masked out in head_44x.S (it was originally masked 
out on arch/powerpc, but was fixed in later kernels when the cache 
coherency issues with non-SMP systems were resolved).

The patch I had fixed two problems on 2.6.26 for 'powerpc':
1) It stopped the 'M' bit being masked out (head_32.S)
2) It set the cache coherency ('M' bit) flag on each page table entry 
(pgtable-ppc32.h)

Hope this helps!

Cheers,
Chris

Adam Zilkie wrote:
> Hi Chris,
>
> I am having a problem similar to what you described in this discussion.
> We are using the ppc arch with 2.6.24 and CONFIG_SEQUOIA, which compiles
> arch/ppc/kernel/head_44x.S (quite different
> from /arch/powerpc/kernel/head_32.S). I would like to apply your
> backporting patch to this architecture. Any help would be appreciated.
>
> Regards,
> Adam 
>
>   


-- 

______________________________
Chris Pringle
Software Design Engineer

Miranda Technologies Ltd.
Hithercroft Road
Wallingford
Oxfordshire OX10 9DG
UK

Tel. +44 1491 820206
Fax. +44 1491 820001
www.miranda.com



* Re: MPC866 FEC's Receive processing thru pre allocated buffers
From: Joakim Tjernlund @ 2009-09-03  7:21 UTC (permalink / raw)
  To: Ganesh Kumar; +Cc: linuxppc-dev
In-Reply-To: <200909031015.15080.ganeshkumar@signal-networks.com>

Ganesh Kumar <ganeshkumar@signal-networks.com> wrote on 03/09/2009 06:45:14:
>
> Hi Tjernlund,
>
>     Thanks a lot for the reply.
>
> I checked my code with regard to invalidating/flushing the
> data cache. In fec_init it is done by the following sequence:
>
>        /* Make it uncached.
>         */
>         pte = va_to_pte(mem_addr);
>         pte_val(*pte) |= _PAGE_NO_CACHE;
>         flush_tlb_page(init_mm.mmap, mem_addr);
> So I did the same thing whenever I allocated a new skb, but the
> problems still showed up. Then I saw a comment in the FEC code that
> says:
>
>         /* This does 16 byte alignment, exactly what we need.
>          * The packet length includes FCS, but we don't want to
>          * include that when passing upstream as it messes up
>          * bridging applications.
>          */
> That comment applies to receiving frames, so I checked my modified code
> with respect to the length. Since I did not know the receive length when
> allocating buffers for the RX ring, I allocated the maximum of 2048 bytes
> and called skb_put() to reserve 2048 bytes for data. The skb_put() call
> also set the skb->len field to 2048, and this was causing the problem: the
> bridge module was trying to send the frame with 2048 bytes even though the
> actual length was smaller, so even after sending it to the FEC, the frame
> was getting transmitted successfully. I now write the actual length to the
> skb->len field in the RX ISR, and the problem is solved.
>
> However, I am now facing problems under load in bridge mode:
>  PC-1 ---->eth0  [Bridge machine] eth1 ----> PC-2
> With the above setup I start 1500 pings of 1400 bytes each from PC1 to
> PC2. The ping sequences start, but after some 25-35 sequences (across
> all 1500 instances) all of a sudden no ping reply is received for any
> request. If I check cat /proc/interrupts on the bridge machine at that
> point, the FEC interrupt count is no longer increasing (initially it
> was). It resumes after some 45-60 seconds and the pattern repeats.
> I don't know what is happening within the FEC when it is configured in
> bridge mode. Any clue on this? Thanks a lakh in advance.

If I remember correctly, this is what you get when the invalidation
of the skb buffers isn't working properly.

Guessing again, but you seem to split up the page into two buffers of len 2048,
but you flush/invalidate the whole page. That won't work.

You are much better off by just using plain skb allocation and invalidate
the buffer before passing it to the CPM/FEC. Just make sure that the allocated
buffer has a cache aligned length. This is what I did a long time ago and it worked
out perfectly.
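
Rounding a buffer length up to a whole number of cache lines can be sketched as below. This is only an illustration: TOY_CACHE_LINE and cache_align_len are hypothetical names, and the 16-byte line size is an assumption based on the "16 byte alignment" comment quoted above, so check the value for your core. The invalidate call itself is not shown.

```c
#include <stddef.h>

/* Assumed cache line size in bytes; must be a power of two. */
#define TOY_CACHE_LINE 16u

/* Round len up to the next multiple of the cache line size. */
static size_t cache_align_len(size_t len)
{
    return (len + TOY_CACHE_LINE - 1) & ~(size_t)(TOY_CACHE_LINE - 1);
}
```

The reason this matters is that invalidation works on whole cache lines. If a buffer ends in the middle of a line, invalidating it can also throw away cached data belonging to whatever shares that line.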

 Jocke


* Re: [v4 PATCH 1/5]: cpuidle: Cleanup drivers/cpuidle/cpuidle.c
From: Arun R Bharadwaj @ 2009-09-03  4:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Gautham R Shenoy, linux-kernel, Paul Mackerras, Arun Bharadwaj,
	Ingo Molnar, linuxppc-dev
In-Reply-To: <1251870144.7547.48.camel@twins>

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2009-09-02 07:42:24]:

> On Tue, 2009-09-01 at 17:08 +0530, Arun R Bharadwaj wrote:
> > * Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-09-01 17:07:04]:
> > 
> > Cleanup drivers/cpuidle/cpuidle.c
> > 
> > Cpuidle maintains a pm_idle_old void pointer because, currently in x86
> > there is no clean way of registering and unregistering a idle function.
> 
> Right, and instead of fixing that, they build this cpuidle crap on top,
> instead of replacing the current crap with it.
> 
> > So remove pm_idle_old and leave the responsibility of maintaining the
> > list of registered idle loops to the architecture specific code. If the
> > architecture registers cpuidle_idle_call as its idle loop, only then
> > this loop is called.
> 
> OK, that's a start I guess. Best would be to replace all of pm_idle with
> cpuidle, which is what should have been done from the very start.
> 
> If cpuidle cannot fully replace the pm_idle functionality, then it needs
> to fix that. But having two layers of idle functions is just silly.
> 
> Looking at patch 2 and 3, you're making the same mistake on power, after
> those patches there are multiple ways of registering idle functions, one
> through some native interface and one through cpuidle, this strikes me
> as undesirable.
> 
> If cpuidle is a good idle function manager, then it should be good
> enough to be the sole one; if it's not, then why bother with it at all.
> 

Okay, I'm giving this approach a shot now. i.e. trying to make cpuidle
as _the_ sole idle function manager. This would mean doing away with
pm_idle and ppc_md.power_save. And, cpuidle_idle_call() which is the
main idle loop of cpuidle, present in drivers/cpuidle/cpuidle.c will
have to be called from arch specific code of cpu_idle()

Also this would mean enabling cpuidle for all platforms, even if the
platform doesn't have multiple idle states. So if a platform doesn't
have multiple states, it wouldn't want the bloated code of cpuidle
governors, and would want just a simple cpuidle loop.

--arun


* Re: MPC866 FEC's Receive processing thru pre allocated buffers
From: Ganesh Kumar @ 2009-09-03  4:45 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linuxppc-dev
In-Reply-To: <OF0B184DAC.3E68F19C-ONC1257620.0044C588-C1257620.00466F8B@transmode.se>

Hi Tjernlund,

    Thanks a lot for the reply.

I checked my code with regard to invalidating/flushing the
data cache. In fec_init it is done by the following sequence:

       /* Make it uncached.
        */
        pte = va_to_pte(mem_addr);
        pte_val(*pte) |= _PAGE_NO_CACHE;
        flush_tlb_page(init_mm.mmap, mem_addr);
So I did the same thing whenever I allocated a new skb, but the
problems still showed up. Then I saw a comment in the FEC code that
says:

        /* This does 16 byte alignment, exactly what we need.
         * The packet length includes FCS, but we don't want to
         * include that when passing upstream as it messes up
         * bridging applications.
         */
while receiving the frames, so I checked my modified code w.r.t. the length.
Since I did not know the receive length while allocating buffers for the
RX ring, I allocated a maximum of 2048 bytes and called skb_put
to reserve 2048 bytes for data. Calling skb_put also set the
skb->len field to 2048, and this was causing the problem: the bridge module
was trying to send the frame with 2048 bytes even though the actual length
was fewer bytes, so even after sending it to the FEC, the frame was
getting transmitted successfully. So I now write the actual length to the
skb->len field in the rx ISR, and the problem is solved.
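
The length fix described above can be modelled outside the kernel. What
follows is only a sketch under stated assumptions: struct mock_skb,
mock_skb_put(), rx_ring_attach() and rx_frame_done() are hypothetical
stand-ins, not the kernel's sk_buff API or the FEC driver's code, and the
4-byte FCS is dropped as the driver comment quoted above describes.

```c
#include <assert.h>
#include <stddef.h>

#define RX_BUF_SIZE 2048u   /* buffer size handed to each RX descriptor */
#define FCS_LEN     4u      /* trailing frame check sequence */

/* Hypothetical stand-in for the fields of interest in struct sk_buff. */
struct mock_skb {
    unsigned char data[RX_BUF_SIZE];
    size_t len;             /* bytes of valid payload seen by upper layers */
};

/* Models skb_put(): extends the valid-data length by n bytes. */
static void mock_skb_put(struct mock_skb *skb, size_t n)
{
    assert(skb->len + n <= RX_BUF_SIZE);
    skb->len += n;
}

/* Ring setup: attach an empty buffer; do not claim 2048 bytes of data yet. */
static void rx_ring_attach(struct mock_skb *skb)
{
    skb->len = 0;
}

/* RX completion: only now is the real frame length (including FCS) known. */
static void rx_frame_done(struct mock_skb *skb, size_t hw_len)
{
    assert(hw_len >= FCS_LEN);
    mock_skb_put(skb, hw_len - FCS_LEN);
}
```

The point of the split is that the buffer capacity (2048) and the payload
length are tracked separately, so a bridge forwarding the buffer sees only
the bytes that were actually received.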

But I'm facing problems during load time in bridge mode
 PC-1 ---->eth0  [Bridge machine] eth1 ----> PC-2
With the above setup I initiate 1500 pings, each of 1400 bytes,
from PC1 to PC2. The ping sequence starts, but after some time,
say some 25-35 sequences (all 1500 instances), all of a sudden no
ping reply is received for any request.
If at that time I check cat /proc/interrupts on the bridge machine,
the FEC interrupt counts are no longer updated (initially they were).
It resumes after some 45-60 seconds and the sequence repeats.
Dunno what's happening within the FEC when configured in bridge mode.
Any clue on this? Thanks a lakh in advance.

--Ganesh

On Friday 28 August 2009 18:19, you wrote:
> > Hi All,
> >
> > I've already sent this almost before 6-7 hours, but the
> > mail did not appear on the Aug 2009 archives, So I'm sending
> > it again. Sorry for this!!. Thanks in advance.
> >
> >         I'm working on MPC860 with Linux Kernel 2.4.18.
> > As I'm fine tuning the FEC(Fast Ethernet Controller) driver,
> > I came across the receive side processing of the ethernet frames
> > where in the Rx BD rings are preallocated with the buffers and each time
> > a new frame is received, the whole frame will get copied from the Buffer
> > Descriptors to the external memory by allocating the skb.
> > Is this the right way to do that ?, as memcpy is not efficient inside the
> > ISRs.
> > So I did some changes in the RX BDs initialization, like allocate the skb
> > and initialize the BD's address pointer with the skb->data(using __pa)
> > and then on reception of the frame I take out the skb from the BD and
> > allocate a new skb and reinit the BD address with the newly allocated
> > skb->data.
> >
> > It works for normal conditions, but if I load the driver then
> > I receive lots of corrupted frames, So I tried increasing the
> > RX_RING_SIZE(16) and also enabling the receive descriptor active only
> > after I come out of the while loop (inside fec_enet_rx)
> > Increasing the Rx ring eliminated the frame corruption and runs fine on
> > load test.
> >
> > But if I configure my Linux box in bridge mode then it doesn't work,
> > i.e., the bridging doesn't happen,
> >
> >    PC-1 ---->eth0  [Bridge machine] eth1 ----> PC-2
> > What I mean here is if we initiate a ping from the
> > PC-1 to PC-2, I don't get any response,
> > it continuously tries to resolve the ARP.
> >
> > What may be the reason??
> > Thanks in advance
>
> A guess, you are missing invalidating the dcache when handing a skb to
> the CPM:
>
> #define CPM_ENET_RX_FRSIZE	L1_CACHE_ALIGN(PKT_MAXBUF_SIZE) /* This is
> needed so that invalidate_xxx wont invalidate too much */
>
> static inline void invalidate_dcache_region(void *adr, unsigned long len)
> {
> 	/* if(len == 0) return; len will never be zero */
> 	len = ((len-1) >> LG_L1_CACHE_LINE_SIZE) +1;
> 	do {
> 		asm  ("dcbi     0,%0" : : "r" (adr) : "memory");
> 		adr += L1_CACHE_LINE_SIZE;
> 	} while(--len);
> }
> then:
>   invalidate_dcache_region(skb->data, CPM_ENET_RX_FRSIZE);
>   bdp->cbd_bufaddr = __pa(skb->data);
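
The length arithmetic in the quoted helper rounds a byte count up to whole
cache lines. A minimal sketch of just that calculation, runnable in user
space: the 16-byte line size (a shift of 4) is an assumption chosen for
illustration, and cache_lines_spanned() is a hypothetical name, not a kernel
function. Like the quoted code, it assumes a line-aligned start address and
a non-zero length.

```c
#include <assert.h>

#define LG_L1_CACHE_LINE_SIZE 4   /* assumed for illustration: 16-byte lines */

/* Number of cache lines covering len bytes from a line-aligned address.
 * len must be non-zero, as in the quoted helper. */
static unsigned long cache_lines_spanned(unsigned long len)
{
    assert(len != 0);
    return ((len - 1) >> LG_L1_CACHE_LINE_SIZE) + 1;
}
```

Subtracting one before the shift is what keeps an exact multiple of the line
size from being counted as one line too many.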

^ permalink raw reply

* Re: why do we need reloc_offset ??
From: Michael Ellerman @ 2009-09-03  2:19 UTC (permalink / raw)
  To: HongWoo Lee; +Cc: linuxppc-dev
In-Reply-To: <5e2889710909012333q69874b24qf6e3c0abfceb8dfd@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1902 bytes --]

On Wed, 2009-09-02 at 15:33 +0900, HongWoo Lee wrote:
> Hi everyone~ 
> 
> In the Linux kernel code, I found the reloc_offset. 
> 
> {{{
> // file : misc.S 
> /* Returns (address we are running at) - (address we were linked at)
>  * for use before the text and data are mapped to KERNELBASE.
>  */
> _GLOBAL(reloc_offset)
> }}}
> 
> I couldn't understand the comment saying "Returns (address we are
> running at) - (address we were linked at)". 
> For now, I'm studying each instruction. 
> 
> And below is best comment I can explain for each instruction. 
> 
> _GLOBAL(reloc_offset)
>         mflr    r0                // move from link register, save the return address
>         bl      1f                 // bl 1f
> 1:     mflr    r3                // move from link register, r3 is just return address pointing itself 

At this point r3 contains the value of LR based on the branch we just
did. So it's the address of the current instruction, based on where the
code is _running_.

>         LOAD_REG_IMMEDIATE(r4,1b)    // get the 1b address, r4 is the address 

Here we load into r4 the address of the previous instruction, but based
on the label "1b". The address of the label is calculated by the linker,
so r4 contains the address the instruction was linked at.

>         subf    r3,r4,r3        // r3 = r3 – r4 

So here we calculate any difference between the address the code was
linked at and the address it's running at.

>         mtlr    r0                // restore return address 
>         blr
> 
> After this, I still don't know why "r3-r4" is the offset. 
> And what does it mean ?? 

The offset is just the difference between the address the code was
linked at and the address it's running at. It's used in places where the
code might be (or is always) running at an address other than the
address it was linked at.
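
A small user-space illustration of that arithmetic, under assumed example
addresses (code linked at 0xC0000000 but executing from address 0 before the
final mapping is set up). The helper names below are hypothetical and only
mirror the idea; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* (address we are running at) - (address we were linked at), modulo 2^32 */
static uint32_t reloc_offset(uint32_t running_at, uint32_t linked_at)
{
    return running_at - linked_at;
}

/* Turn a link-time address into one that is valid where the code runs now. */
static uint32_t add_reloc_offset(uint32_t linked_addr, uint32_t offset)
{
    return linked_addr + offset;
}
```

With a label linked at 0xC0001000 and observed at 0x00001000, the offset is
0x40000000, and adding it to a link-time address such as 0xC0002000 wraps
around to 0x00002000. When the code runs where it was linked, the offset is
zero.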

cheers

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: PPC PCI bus registers
From: Grant Likely @ 2009-09-02 23:56 UTC (permalink / raw)
  To: Eddie Dawydiuk; +Cc: linuxppc-dev
In-Reply-To: <4A9F0377.1070606@embeddedarm.com>

On Wed, Sep 2, 2009 at 5:44 PM, Eddie Dawydiuk<eddie@embeddedarm.com> wrote:
> Hello,
>
> I have a question regarding reading PCI bus registers from a user space
> application running on a PPC SBC. Seeing as though the PCI bus is little
> endian and PPC is big endian is it typical that one must perform a byte swap
> on all 16 and 32 bit register reads?

Yes, this is correct.
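
A minimal sketch of such a swap in C, for a user-space program on a
big-endian CPU that has read a raw value from a little-endian register (for
example through a memory mapping). swap16() and swap32() are illustrative
names, not an existing library API.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

Byte-wide (8-bit) reads need no swap, and applying either function twice
returns the original value.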

g.

^ permalink raw reply

* PPC PCI bus registers
From: Eddie Dawydiuk @ 2009-09-02 23:44 UTC (permalink / raw)
  To: linuxppc-dev, Grant Likely

Hello,

I have a question regarding reading PCI bus registers from a user space 
application running on a PPC SBC. Seeing as though the PCI bus is little endian 
and PPC is big endian is it typical that one must perform a byte swap on all 16 
and 32 bit register reads?

I've found this is true on a custom board I am working on(with an FPGA connected 
via the PCI bus) and as a result I've added a byte swap command in busybox to 
accommodate this feature...

-- 
Best Regards,
________________________________________________________________
  Eddie Dawydiuk, Technologic Systems | voice:  (480) 837-5200
  16525 East Laser Drive              | fax:    (480) 837-5300
  Fountain Hills, AZ 85268            | web: www.embeddedARM.com

^ permalink raw reply

* [PATCH 2/2] powerpc: Change archdata dma_data to a union
From: Becky Bruce @ 2009-09-02 22:23 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1251930200-4796-1-git-send-email-beckyb@kernel.crashing.org>

Sometimes this is used to hold a simple offset, and sometimes
it is used to hold a pointer.  This patch changes it to a union containing
void * and dma_addr_t.  get/set accessors are also provided, because it was
getting a bit ugly to get to the actual data.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
---
 arch/powerpc/include/asm/device.h        |   11 ++++++++++-
 arch/powerpc/include/asm/dma-mapping.h   |   10 ++++++++--
 arch/powerpc/include/asm/iommu.h         |   10 ++++++++++
 arch/powerpc/kernel/dma-iommu.c          |   16 ++++++++--------
 arch/powerpc/kernel/pci-common.c         |    2 +-
 arch/powerpc/kernel/vio.c                |    2 +-
 arch/powerpc/platforms/cell/beat_iommu.c |    2 +-
 arch/powerpc/platforms/cell/iommu.c      |    9 +++------
 arch/powerpc/platforms/iseries/iommu.c   |    2 +-
 arch/powerpc/platforms/pasemi/iommu.c    |    2 +-
 arch/powerpc/platforms/pseries/iommu.c   |    8 ++++----
 arch/powerpc/sysdev/dart_iommu.c         |    2 +-
 12 files changed, 49 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index 67fcd7f..07ca8b5 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -15,7 +15,16 @@ struct dev_archdata {
 
 	/* DMA operations on that device */
 	struct dma_map_ops	*dma_ops;
-	void			*dma_data;
+
+	/*
+	 * When an iommu is in use, dma_data is used as a ptr to the base of the
+	 * iommu_table.  Otherwise, it is a simple numerical offset.
+	 */
+	union {
+		dma_addr_t	dma_offset;
+		void		*iommu_table_base;
+	} dma_data;
+
 #ifdef CONFIG_SWIOTLB
 	dma_addr_t		max_direct_dma_addr;
 #endif
diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index eef4db1..e9f4fe9 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -89,14 +89,20 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
 	dev->archdata.dma_ops = ops;
 }
 
-static inline unsigned long get_dma_offset(struct device *dev)
+static inline dma_addr_t get_dma_offset(struct device *dev)
 {
 	if (dev)
-		return (unsigned long)dev->archdata.dma_data;
+		return dev->archdata.dma_data.dma_offset;
 
 	return PCI_DRAM_OFFSET;
 }
 
+static inline void set_dma_offset(struct device *dev, dma_addr_t off)
+{
+	if (dev)
+		dev->archdata.dma_data.dma_offset = off;
+}
+
 /* this will be removed soon */
 #define flush_write_buffers()
 
diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 7464c0d..edfc980 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -70,6 +70,16 @@ struct iommu_table {
 
 struct scatterlist;
 
+static inline void set_iommu_table_base(struct device *dev, void *base)
+{
+	dev->archdata.dma_data.iommu_table_base = base;
+}
+
+static inline void *get_iommu_table_base(struct device *dev)
+{
+	return dev->archdata.dma_data.iommu_table_base;
+}
+
 /* Frees table for an individual device node */
 extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
 
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 87ddb3f..37771a5 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -18,7 +18,7 @@
 static void *dma_iommu_alloc_coherent(struct device *dev, size_t size,
 				      dma_addr_t *dma_handle, gfp_t flag)
 {
-	return iommu_alloc_coherent(dev, dev->archdata.dma_data, size,
+	return iommu_alloc_coherent(dev, get_iommu_table_base(dev), size,
 				    dma_handle, device_to_mask(dev), flag,
 				    dev_to_node(dev));
 }
@@ -26,7 +26,7 @@ static void *dma_iommu_alloc_coherent(struct device *dev, size_t size,
 static void dma_iommu_free_coherent(struct device *dev, size_t size,
 				    void *vaddr, dma_addr_t dma_handle)
 {
-	iommu_free_coherent(dev->archdata.dma_data, size, vaddr, dma_handle);
+	iommu_free_coherent(get_iommu_table_base(dev), size, vaddr, dma_handle);
 }
 
 /* Creates TCEs for a user provided buffer.  The user buffer must be
@@ -39,8 +39,8 @@ static dma_addr_t dma_iommu_map_page(struct device *dev, struct page *page,
 				     enum dma_data_direction direction,
 				     struct dma_attrs *attrs)
 {
-	return iommu_map_page(dev, dev->archdata.dma_data, page, offset, size,
-			      device_to_mask(dev), direction, attrs);
+	return iommu_map_page(dev, get_iommu_table_base(dev), page, offset,
+			      size, device_to_mask(dev), direction, attrs);
 }
 
 
@@ -48,7 +48,7 @@ static void dma_iommu_unmap_page(struct device *dev, dma_addr_t dma_handle,
 				 size_t size, enum dma_data_direction direction,
 				 struct dma_attrs *attrs)
 {
-	iommu_unmap_page(dev->archdata.dma_data, dma_handle, size, direction,
+	iommu_unmap_page(get_iommu_table_base(dev), dma_handle, size, direction,
 			 attrs);
 }
 
@@ -57,7 +57,7 @@ static int dma_iommu_map_sg(struct device *dev, struct scatterlist *sglist,
 			    int nelems, enum dma_data_direction direction,
 			    struct dma_attrs *attrs)
 {
-	return iommu_map_sg(dev, dev->archdata.dma_data, sglist, nelems,
+	return iommu_map_sg(dev, get_iommu_table_base(dev), sglist, nelems,
 			    device_to_mask(dev), direction, attrs);
 }
 
@@ -65,14 +65,14 @@ static void dma_iommu_unmap_sg(struct device *dev, struct scatterlist *sglist,
 		int nelems, enum dma_data_direction direction,
 		struct dma_attrs *attrs)
 {
-	iommu_unmap_sg(dev->archdata.dma_data, sglist, nelems, direction,
+	iommu_unmap_sg(get_iommu_table_base(dev), sglist, nelems, direction,
 		       attrs);
 }
 
 /* We support DMA to/from any memory page via the iommu */
 static int dma_iommu_dma_supported(struct device *dev, u64 mask)
 {
-	struct iommu_table *tbl = dev->archdata.dma_data;
+	struct iommu_table *tbl = get_iommu_table_base(dev);
 
 	if (!tbl || tbl->it_offset > mask) {
 		printk(KERN_INFO
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index e9f4840..bb8209e 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1117,7 +1117,7 @@ void __devinit pcibios_setup_bus_devices(struct pci_bus *bus)
 
 		/* Hook up default DMA ops */
 		sd->dma_ops = pci_dma_ops;
-		sd->dma_data = (void *)PCI_DRAM_OFFSET;
+		set_dma_offset(&dev->dev, PCI_DRAM_OFFSET);
 
 		/* Additional platform DMA/iommu setup */
 		if (ppc_md.pci_dma_dev_setup)
diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c
index bc7b41e..8d9275f 100644
--- a/arch/powerpc/kernel/vio.c
+++ b/arch/powerpc/kernel/vio.c
@@ -1233,7 +1233,7 @@ struct vio_dev *vio_register_device_node(struct device_node *of_node)
 		vio_cmo_set_dma_ops(viodev);
 	else
 		viodev->dev.archdata.dma_ops = &dma_iommu_ops;
-	viodev->dev.archdata.dma_data = vio_build_iommu_table(viodev);
+	set_iommu_table_base(&viodev->dev, vio_build_iommu_table(viodev));
 	set_dev_node(&viodev->dev, of_node_to_nid(of_node));
 
 	/* init generic 'struct device' fields: */
diff --git a/arch/powerpc/platforms/cell/beat_iommu.c b/arch/powerpc/platforms/cell/beat_iommu.c
index 93b0efd..39d361c 100644
--- a/arch/powerpc/platforms/cell/beat_iommu.c
+++ b/arch/powerpc/platforms/cell/beat_iommu.c
@@ -77,7 +77,7 @@ static void __init celleb_init_direct_mapping(void)
 static void celleb_dma_dev_setup(struct device *dev)
 {
 	dev->archdata.dma_ops = get_pci_dma_ops();
-	dev->archdata.dma_data = (void *)celleb_dma_direct_offset;
+	set_dma_offset(dev, celleb_dma_direct_offset);
 }
 
 static void celleb_pci_dma_dev_setup(struct pci_dev *pdev)
diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
index 416db17..ca5bfdf 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -657,15 +657,13 @@ static void cell_dma_dev_setup_fixed(struct device *dev);
 
 static void cell_dma_dev_setup(struct device *dev)
 {
-	struct dev_archdata *archdata = &dev->archdata;
-
 	/* Order is important here, these are not mutually exclusive */
 	if (get_dma_ops(dev) == &dma_iommu_fixed_ops)
 		cell_dma_dev_setup_fixed(dev);
 	else if (get_pci_dma_ops() == &dma_iommu_ops)
-		archdata->dma_data = cell_get_iommu_table(dev);
+		set_iommu_table_base(dev, cell_get_iommu_table(dev));
 	else if (get_pci_dma_ops() == &dma_direct_ops)
-		archdata->dma_data = (void *)cell_dma_direct_offset;
+		set_dma_offset(dev, cell_dma_direct_offset);
 	else
 		BUG();
 }
@@ -973,11 +971,10 @@ static int dma_set_mask_and_switch(struct device *dev, u64 dma_mask)
 
 static void cell_dma_dev_setup_fixed(struct device *dev)
 {
-	struct dev_archdata *archdata = &dev->archdata;
 	u64 addr;
 
 	addr = cell_iommu_get_fixed_address(dev) + dma_iommu_fixed_base;
-	archdata->dma_data = (void *)addr;
+	set_dma_offset(dev, addr);
 
 	dev_dbg(dev, "iommu: fixed addr = %llx\n", addr);
 }
diff --git a/arch/powerpc/platforms/iseries/iommu.c b/arch/powerpc/platforms/iseries/iommu.c
index 6c1e101..9d53cb4 100644
--- a/arch/powerpc/platforms/iseries/iommu.c
+++ b/arch/powerpc/platforms/iseries/iommu.c
@@ -193,7 +193,7 @@ static void pci_dma_dev_setup_iseries(struct pci_dev *pdev)
 		pdn->iommu_table = iommu_init_table(tbl, -1);
 	else
 		kfree(tbl);
-	pdev->dev.archdata.dma_data = pdn->iommu_table;
+	set_iommu_table_base(&pdev->dev, pdn->iommu_table);
 }
 #else
 #define pci_dma_dev_setup_iseries	NULL
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index a0ff03a..7b1d608 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -189,7 +189,7 @@ static void pci_dma_dev_setup_pasemi(struct pci_dev *dev)
 	}
 #endif
 
-	dev->dev.archdata.dma_data = &iommu_table_iobmap;
+	set_iommu_table_base(&dev->dev, &iommu_table_iobmap);
 }
 
 static void pci_dma_bus_setup_null(struct pci_bus *b) { }
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 661c8e0..1a0000a 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -482,7 +482,7 @@ static void pci_dma_dev_setup_pSeries(struct pci_dev *dev)
 				   phb->node);
 		iommu_table_setparms(phb, dn, tbl);
 		PCI_DN(dn)->iommu_table = iommu_init_table(tbl, phb->node);
-		dev->dev.archdata.dma_data = PCI_DN(dn)->iommu_table;
+		set_iommu_table_base(&dev->dev, PCI_DN(dn)->iommu_table);
 		return;
 	}
 
@@ -494,7 +494,7 @@ static void pci_dma_dev_setup_pSeries(struct pci_dev *dev)
 		dn = dn->parent;
 
 	if (dn && PCI_DN(dn))
-		dev->dev.archdata.dma_data = PCI_DN(dn)->iommu_table;
+		set_iommu_table_base(&dev->dev, PCI_DN(dn)->iommu_table);
 	else
 		printk(KERN_WARNING "iommu: Device %s has no iommu table\n",
 		       pci_name(dev));
@@ -538,7 +538,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
 	 */
 	if (dma_window == NULL || pdn->parent == NULL) {
 		pr_debug("  no dma window for device, linking to parent\n");
-		dev->dev.archdata.dma_data = PCI_DN(pdn)->iommu_table;
+		set_iommu_table_base(&dev->dev, PCI_DN(pdn)->iommu_table);
 		return;
 	}
 
@@ -554,7 +554,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
 		pr_debug("  found DMA window, table: %p\n", pci->iommu_table);
 	}
 
-	dev->dev.archdata.dma_data = pci->iommu_table;
+	set_iommu_table_base(&dev->dev, pci->iommu_table);
 }
 #else  /* CONFIG_PCI */
 #define pci_dma_bus_setup_pSeries	NULL
diff --git a/arch/powerpc/sysdev/dart_iommu.c b/arch/powerpc/sysdev/dart_iommu.c
index 89639ec..ae3c4db 100644
--- a/arch/powerpc/sysdev/dart_iommu.c
+++ b/arch/powerpc/sysdev/dart_iommu.c
@@ -297,7 +297,7 @@ static void pci_dma_dev_setup_dart(struct pci_dev *dev)
 	/* We only have one iommu table on the mac for now, which makes
 	 * things simple. Setup all PCI devices to point to this table
 	 */
-	dev->dev.archdata.dma_data = &iommu_table_dart;
+	set_iommu_table_base(&dev->dev, &iommu_table_dart);
 }
 
 static void pci_dma_bus_setup_dart(struct pci_bus *bus)
-- 
1.6.0.6

^ permalink raw reply related

* [PATCH 1/2] powerpc: rename get_dma_direct_offset get_dma_offset
From: Becky Bruce @ 2009-09-02 22:23 UTC (permalink / raw)
  To: linuxppc-dev

The former is no longer really accurate with the swiotlb case now
a possibility.  I also move it into dma-mapping.h - it no longer
needs to be in dma.c, and there are about to be some more accessors
that should all end up in the same place.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
---
 arch/powerpc/include/asm/dma-mapping.h |   13 ++++++++++---
 arch/powerpc/kernel/dma.c              |   15 ++++-----------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index cb2ca41..eef4db1 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -26,7 +26,6 @@ extern void *dma_direct_alloc_coherent(struct device *dev, size_t size,
 extern void dma_direct_free_coherent(struct device *dev, size_t size,
 				     void *vaddr, dma_addr_t dma_handle);
 
-extern unsigned long get_dma_direct_offset(struct device *dev);
 
 #ifdef CONFIG_NOT_COHERENT_CACHE
 /*
@@ -90,6 +89,14 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
 	dev->archdata.dma_ops = ops;
 }
 
+static inline unsigned long get_dma_offset(struct device *dev)
+{
+	if (dev)
+		return (unsigned long)dev->archdata.dma_data;
+
+	return PCI_DRAM_OFFSET;
+}
+
 /* this will be removed soon */
 #define flush_write_buffers()
 
@@ -181,12 +188,12 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
 
 static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-	return paddr + get_dma_direct_offset(dev);
+	return paddr + get_dma_offset(dev);
 }
 
 static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
 {
-	return daddr - get_dma_direct_offset(dev);
+	return daddr - get_dma_offset(dev);
 }
 
 #define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 21b784d..6215062 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -21,13 +21,6 @@
  * default the offset is PCI_DRAM_OFFSET.
  */
 
-unsigned long get_dma_direct_offset(struct device *dev)
-{
-	if (dev)
-		return (unsigned long)dev->archdata.dma_data;
-
-	return PCI_DRAM_OFFSET;
-}
 
 void *dma_direct_alloc_coherent(struct device *dev, size_t size,
 				dma_addr_t *dma_handle, gfp_t flag)
@@ -37,7 +30,7 @@ void *dma_direct_alloc_coherent(struct device *dev, size_t size,
 	ret = __dma_alloc_coherent(dev, size, dma_handle, flag);
 	if (ret == NULL)
 		return NULL;
-	*dma_handle += get_dma_direct_offset(dev);
+	*dma_handle += get_dma_offset(dev);
 	return ret;
 #else
 	struct page *page;
@@ -51,7 +44,7 @@ void *dma_direct_alloc_coherent(struct device *dev, size_t size,
 		return NULL;
 	ret = page_address(page);
 	memset(ret, 0, size);
-	*dma_handle = virt_to_abs(ret) + get_dma_direct_offset(dev);
+	*dma_handle = virt_to_abs(ret) + get_dma_offset(dev);
 
 	return ret;
 #endif
@@ -75,7 +68,7 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
 	int i;
 
 	for_each_sg(sgl, sg, nents, i) {
-		sg->dma_address = sg_phys(sg) + get_dma_direct_offset(dev);
+		sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
 		sg->dma_length = sg->length;
 		__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
 	}
@@ -110,7 +103,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 {
 	BUG_ON(dir == DMA_NONE);
 	__dma_sync_page(page, offset, size, dir);
-	return page_to_phys(page) + offset + get_dma_direct_offset(dev);
+	return page_to_phys(page) + offset + get_dma_offset(dev);
 }
 
 static inline void dma_direct_unmap_page(struct device *dev,
-- 
1.6.0.6

^ permalink raw reply related

* Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
From: Adam Zilkie @ 2009-09-02 21:22 UTC (permalink / raw)
  To: chris.pringle, linuxppc-dev; +Cc: Tom Burns, Andrea Zypchen

Hi Chris,

I am having a problem similar to what you described in this discussion.
We are using the ppc arch with 2.6.24 and CONFIG_SEQUOIA, which compiles
arch/ppc/kernel/head_44x.c (quite different
from /arch/powerpc/kernel/head_32.S). I would like to apply your
backporting patch to this architecture. Any help would be appreciated.

Regards,
Adam 

-- 
Adam Zilkie
Software Designer,
International Datacasting Corp.


^ permalink raw reply

* Re: [PATCH v2] Fix fake numa on ppc
From: David Rientjes @ 2009-09-02 20:09 UTC (permalink / raw)
  To: Balbir Singh; +Cc: linuxppc-dev, Ankita Garg, LKML
In-Reply-To: <661de9470909021256i569261bxbe1523d8e37b5b14@mail.gmail.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1023 bytes --]

On Thu, 3 Sep 2009, Balbir Singh wrote:

> > Right, I'm proposing an alternate mapping scheme (which we've used for
> > years) for both platforms such that a cpu is bound (and is set in
> > cpumask_of_node()) to each fake node with which it has physical affinity.
> > That is the only way for zonelist ordering in node order, task migration
> > from offlined cpus, correct sched domains, etc.  I can propose a patchset
> > for x86_64 to do exactly this if there aren't any objections and I hope
> > you'll help do ppc.
> 
> Sounds interesting, I'd definitely be interested in seeing your
> proposal, but I would think of that as additional development on top
> of this patch
> 

Absolutely.  I'm not familiar with numa=fake on ppc, but if cpus are being 
bound to nodes with which they don't have affinity, it definitely warrants 
a fix such as this (although the initial value for fake_enabled looks 
wrong and fake_numa_node_mapping[] can be __cpuinitdata).  I'll cc you, 
Ben, and Ankita on the x86_64 patches.  Thanks.
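
The mapping idea discussed above, where every fake node reports the cpus of
the physical node it was carved from, can be sketched in plain C. This is
only an illustration under assumed values: the two-node, eight-cpu topology,
the even split into four fake nodes per physical node, and the names
fake_to_phys_node() and cpumask_of_fake_node() are all hypothetical, and it
is not the kernel's cpumask_of_node() nor the proposed patchset.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PHYS_NODES 2
#define NR_FAKE_NODES 8

/* Assumed example topology: cpus 0-3 on physical node 0, cpus 4-7 on node 1.
 * Bit n set means cpu n belongs to the node. */
static const uint32_t phys_node_cpumask[NR_PHYS_NODES] = { 0x0Fu, 0xF0u };

/* Assumed split: each physical node is carved into four fake nodes. */
static int fake_to_phys_node(int fake_node)
{
    return fake_node / (NR_FAKE_NODES / NR_PHYS_NODES);
}

/* A fake node reports every cpu of the physical node it was carved from. */
static uint32_t cpumask_of_fake_node(int fake_node)
{
    return phys_node_cpumask[fake_to_phys_node(fake_node)];
}
```

Under this scheme no fake node ends up with an empty cpulist, in contrast
with the x86 output quoted earlier in the thread where all cpus appear only
under the last fake node.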

^ permalink raw reply

* Re: [PATCH v2] Fix fake numa on ppc
From: Balbir Singh @ 2009-09-02 19:56 UTC (permalink / raw)
  To: David Rientjes; +Cc: linuxppc-dev, Ankita Garg, LKML
In-Reply-To: <alpine.DEB.1.00.0909021226160.10279@chino.kir.corp.google.com>

On Thu, Sep 3, 2009 at 1:06 AM, David Rientjes<rientjes@google.com> wrote:
> On Wed, 2 Sep 2009, Ankita Garg wrote:
>
>> Currently, the behavior of fake numa is not so on x86 as well? Below is
>> a sample output from a single node x86 system booted with numa=fake=8:
>>
>> # cat node0/cpulist
>>
>> # cat node1/cpulist
>>
>> ...
>> # cat node6/cpulist
>>
>> # cat node7/cpulist
>> 0-7
>>
>> Presently, just fixing the cpu association issue with ppc, as explained
>> in my previous mail.
>>
>
> Right, I'm proposing an alternate mapping scheme (which we've used for
> years) for both platforms such that a cpu is bound (and is set in
> cpumask_of_node()) to each fake node with which it has physical affinity.
> That is the only way for zonelist ordering in node order, task migration
> from offlined cpus, correct sched domains, etc.  I can propose a patchset
> for x86_64 to do exactly this if there aren't any objections and I hope
> you'll help do ppc.

Sounds interesting, I'd definitely be interested in seeing your
proposal, but I would think of that as additional development on top
of this patch

Balbir Singh.

^ permalink raw reply

