* [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Yuri Tikhonov @ 2007-11-06 22:40 UTC
To: linuxppc-dev; +Cc: sr, dzu

 Hello all,

 Here is a patch-set to support L2-cache synchronization routines for
the ppc44x processor family. I know that the "ppc" branch is for bug-fixing only, thus
the patch-set is just FYI [though an enabled but non-coherent L2-cache may appear as a bug to
someone who uses one of the boards listed below :)].

[PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for ppc44x;
[PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based boards: ALPR,
Katmai, Ocotea, and Taishan.

 Regards, Yuri

--
Yuri Tikhonov, Senior Software Engineer
Emcraft Systems, www.emcraft.com
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Eugene Surovegin @ 2007-11-28 19:50 UTC
To: Yuri Tikhonov; +Cc: linuxppc-dev, sr, dzu

On Wed, Nov 07, 2007 at 01:40:10AM +0300, Yuri Tikhonov wrote:
>
>  Hello all,
>
>  Here is a patch-set to support L2-cache synchronization routines for
> the ppc44x processor family. I know that the "ppc" branch is for bug-fixing only, thus
> the patch-set is just FYI [though an enabled but non-coherent L2-cache may appear as a bug to
> someone who uses one of the boards listed below :)].
>
> [PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for ppc44x;
> [PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based boards: ALPR,
> Katmai, Ocotea, and Taishan.

Why is this all needed?

IIRC ibm440gx_l2c_enable() configures a 64G snoop region for L2C.

Did AMCC make non-only-coherent L2C chips recently?

--
Eugene
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Yuri Tikhonov @ 2008-01-11 15:24 UTC
To: Eugene Surovegin; +Cc: linuxppc-dev, sr, dzu

 Hello, Eugene,

 The h/w snooping mechanism you are talking about is limited to the Low
Latency (LL) segment of the PLB bus in ppc440sp and ppc440spe chips (see
section "7.2.7 L2 Cache Coherency" of the ppc440spe spec), whereas the DMA and
XOR engines use the High Bandwidth (HB) segment of the PLB bus (see
section "1.1.2 Internal Buses" of the ppc440spe spec).

 Thus, the h/w snooping mechanism is not able to trace the results of
operations performed by the DMA and XOR engines and keep the L2-cache coherent
with SDRAM, because the data flows through the HB PLB segment. This leads to, for
example, incorrect results of RAID-parity calculations if one uses the h/w
accelerated ppc440spe ADMA driver with L2-cache enabled.

 The s/w synchronization algorithms proposed in my patches have no LL PLB
limitation, as opposed to h/w snooping, but this is probably not the best
way to implement it. Even though with these patches the h/w
accelerated RAID starts to operate correctly (with L2-cache enabled), there is
a performance degradation (induced by loops in the L2-cache synchronization
routines) observed in most cases. So, as a result, there is no benefit
from using L2-cache for these RAID cases at all.

 Regards, Yuri

On Wednesday 28 November 2007 22:50, Eugene Surovegin wrote:
> On Wed, Nov 07, 2007 at 01:40:10AM +0300, Yuri Tikhonov wrote:
> >
> >  Hello all,
> >
> >  Here is a patch-set to support L2-cache synchronization routines for
> > the ppc44x processor family. I know that the "ppc" branch is for bug-fixing only, thus
> > the patch-set is just FYI [though an enabled but non-coherent L2-cache may appear as a bug to
> > someone who uses one of the boards listed below :)].
> >
> > [PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for ppc44x;
> > [PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based boards: ALPR,
> > Katmai, Ocotea, and Taishan.
>
> Why is this all needed?
>
> IIRC ibm440gx_l2c_enable() configures a 64G snoop region for L2C.
>
> Did AMCC make non-only-coherent L2C chips recently?
>
> --
> Eugene
>

--
Yuri Tikhonov, Senior Software Engineer
Emcraft Systems, www.emcraft.com
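Editorial illustration of the point above about loops in the s/w synchronization routines: a minimal userspace sketch, not the code from the patches. It models a tiny direct-mapped cache and walks an address range one line at a time, which is why the cost grows with the size of the DMA buffer. All names (`toy_cache`, `toy_touch`, `toy_invalidate_range`) and the 32-byte line size are assumptions made for this sketch; the real invalidate_l2cache_range() and the real L2 geometry are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LINE_SIZE 32u   /* assumed line size, for illustration only */
#define TOY_NUM_LINES 64u   /* tiny direct-mapped toy cache */

struct toy_line {
    int valid;
    uintptr_t tag;          /* line-aligned address held in this slot */
};

static struct toy_line toy_cache[TOY_NUM_LINES];

/* Round an address down to the start of its cache line. */
static uintptr_t toy_line_base(uintptr_t addr)
{
    return addr & ~(uintptr_t)(TOY_LINE_SIZE - 1u);
}

/* Direct-mapped slot for a line-aligned address. */
static size_t toy_slot(uintptr_t line)
{
    return (size_t)((line / TOY_LINE_SIZE) % TOY_NUM_LINES);
}

/* Model a CPU access: the line containing addr becomes cached. */
static void toy_touch(uintptr_t addr)
{
    uintptr_t line = toy_line_base(addr);
    struct toy_line *l = &toy_cache[toy_slot(line)];

    l->valid = 1;
    l->tag = line;
}

/* Report whether the line containing addr is currently cached. */
static int toy_is_cached(uintptr_t addr)
{
    uintptr_t line = toy_line_base(addr);
    const struct toy_line *l = &toy_cache[toy_slot(line)];

    return l->valid && l->tag == line;
}

/*
 * Drop every cached line that overlaps [start, end).
 * Returns the number of loop iterations, i.e. the per-line work
 * that a software walk has to do and hardware snooping avoids.
 */
static unsigned long toy_invalidate_range(uintptr_t start, uintptr_t end)
{
    unsigned long steps = 0;
    uintptr_t line;

    assert(start <= end);
    for (line = toy_line_base(start); line < end; line += TOY_LINE_SIZE) {
        struct toy_line *l = &toy_cache[toy_slot(line)];

        if (l->valid && l->tag == line)
            l->valid = 0;
        steps++;
    }
    return steps;
}
```

In this model, invalidating a 4 KiB buffer takes 128 iterations regardless of how many of those lines were actually cached, which is one plausible reading of why the RAID workloads in the thread saw no net benefit from the L2.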
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Eugene Surovegin @ 2008-01-11 17:41 UTC
To: Yuri Tikhonov; +Cc: linuxppc-dev, sr, dzu

On Fri, Jan 11, 2008 at 06:24:46PM +0300, Yuri Tikhonov wrote:
>
>  Hello, Eugene,
>
>  The h/w snooping mechanism you are talking about is limited to the Low
> Latency (LL) segment of the PLB bus in ppc440sp and ppc440spe chips (see
> section "7.2.7 L2 Cache Coherency" of the ppc440spe spec), whereas the DMA and
> XOR engines use the High Bandwidth (HB) segment of the PLB bus (see
> section "1.1.2 Internal Buses" of the ppc440spe spec).
>
>  Thus, the h/w snooping mechanism is not able to trace the results of
> operations performed by the DMA and XOR engines and keep the L2-cache coherent
> with SDRAM, because the data flows through the HB PLB segment. This leads to, for
> example, incorrect results of RAID-parity calculations if one uses the h/w
> accelerated ppc440spe ADMA driver with L2-cache enabled.
>
>  The s/w synchronization algorithms proposed in my patches have no LL PLB
> limitation, as opposed to h/w snooping, but this is probably not the best
> way to implement it. Even though with these patches the h/w
> accelerated RAID starts to operate correctly (with L2-cache enabled), there is
> a performance degradation (induced by loops in the L2-cache synchronization
> routines) observed in most cases. So, as a result, there is no benefit
> from using L2-cache for these RAID cases at all.

Thanks a lot for the explanation, Yuri. I'd never have imagined they were so
stupid as to make new chips with such behaviour.

--
Eugene
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Benjamin Herrenschmidt @ 2008-01-11 22:05 UTC
To: Eugene Surovegin; +Cc: linuxppc-dev, sr, dzu

> >  The s/w synchronization algorithms proposed in my patches have no LL PLB
> > limitation, as opposed to h/w snooping, but this is probably not the best
> > way to implement it. Even though with these patches the h/w
> > accelerated RAID starts to operate correctly (with L2-cache enabled), there is
> > a performance degradation (induced by loops in the L2-cache synchronization
> > routines) observed in most cases. So, as a result, there is no benefit
> > from using L2-cache for these RAID cases at all.
>
> Thanks a lot for the explanation, Yuri. I'd never have imagined they were so
> stupid as to make new chips with such behaviour.

Indeed. Now the question is: do we want to make that configurable by the
platform, so it can select whether to enable snooping or use this
mechanism (in which case we can disable snooping on the L2)?

Another option would be to make the dma_ops smart enough to know whether
a given device is on the snooped portion of the bus, which would be
easier to do after I merge the 32-bit and 64-bit DMA ops, so we get the ability
to change the dma-ops per bus or even per device.

What do you guys think?

Cheers,
Ben.
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Eugene Surovegin @ 2008-01-11 22:38 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, sr, dzu

On Sat, Jan 12, 2008 at 09:05:35AM +1100, Benjamin Herrenschmidt wrote:
>
> > >  The s/w synchronization algorithms proposed in my patches have no LL PLB
> > > limitation, as opposed to h/w snooping, but this is probably not the best
> > > way to implement it. Even though with these patches the h/w
> > > accelerated RAID starts to operate correctly (with L2-cache enabled), there is
> > > a performance degradation (induced by loops in the L2-cache synchronization
> > > routines) observed in most cases. So, as a result, there is no benefit
> > > from using L2-cache for these RAID cases at all.
> >
> > Thanks a lot for the explanation, Yuri. I'd never have imagined they were so
> > stupid as to make new chips with such behaviour.
>
> Indeed. Now the question is: do we want to make that configurable by the
> platform, so it can select whether to enable snooping or use this
> mechanism (in which case we can disable snooping on the L2)?

I don't think we should punish platforms with sane L2 caches just because
there are some brain-dead ones.

> Another option would be to make the dma_ops smart enough to know whether
> a given device is on the snooped portion of the bus, which would be
> easier to do after I merge the 32-bit and 64-bit DMA ops, so we get the ability
> to change the dma-ops per bus or even per device.
>
> What do you guys think?

I like the idea of having smart DMA routines with different
per-bus/device behaviour.

--
Eugene
* Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x
From: Benjamin Herrenschmidt @ 2008-01-12 1:52 UTC
To: Eugene Surovegin; +Cc: linuxppc-dev, sr, dzu

On Fri, 2008-01-11 at 14:38 -0800, Eugene Surovegin wrote:
> On Sat, Jan 12, 2008 at 09:05:35AM +1100, Benjamin Herrenschmidt wrote:
> >
> > > >  The s/w synchronization algorithms proposed in my patches have no LL PLB
> > > > limitation, as opposed to h/w snooping, but this is probably not the best
> > > > way to implement it. Even though with these patches the h/w
> > > > accelerated RAID starts to operate correctly (with L2-cache enabled), there is
> > > > a performance degradation (induced by loops in the L2-cache synchronization
> > > > routines) observed in most cases. So, as a result, there is no benefit
> > > > from using L2-cache for these RAID cases at all.
> > >
> > > Thanks a lot for the explanation, Yuri. I'd never have imagined they were so
> > > stupid as to make new chips with such behaviour.
> >
> > Indeed. Now the question is: do we want to make that configurable by the
> > platform, so it can select whether to enable snooping or use this
> > mechanism (in which case we can disable snooping on the L2)?
>
> I don't think we should punish platforms with sane L2 caches just because
> there are some brain-dead ones.

I agree, which is why I'm thinking about making it some kind of explicit
thing that a given platform would call from its setup_arch() callback to
turn on manual L2 synchronization.

> > Another option would be to make the dma_ops smart enough to know whether
> > a given device is on the snooped portion of the bus, which would be
> > easier to do after I merge the 32-bit and 64-bit DMA ops, so we get the ability
> > to change the dma-ops per bus or even per device.
> >
> > What do you guys think?
>
> I like the idea of having smart DMA routines with different
> per-bus/device behaviour.

That would be longer term. When I merge the dma ops, I'll look into a
way to provide 44x-specific DMA ops that handle that case, and then a way
for devices to be tagged (maybe via the device-tree) on whether they are
on an L2-coherent or non-L2-coherent segment of the bus.

Cheers,
Ben.
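Editorial illustration of the per-device idea discussed above: a minimal userspace sketch, assuming each device carries a flag saying whether it sits on the snooped (L2-coherent) bus segment, and an ops table is chosen from that flag. Every name here (`struct sketch_dma_ops`, `struct sketch_device`, `sketch_dma_sync_for_cpu`, the 32-byte line size) is hypothetical and invented for this sketch; it does not reproduce the kernel's dma_ops interface or any code that was later merged.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LINE_SIZE 32u   /* assumed cache line size, illustration only */

struct sketch_device;

/* Hypothetical per-device operations table. */
struct sketch_dma_ops {
    void (*sync_for_cpu)(struct sketch_device *dev, size_t len);
};

/* Hypothetical device record with a coherency tag. */
struct sketch_device {
    const char *name;
    int l2_coherent;                    /* 1: on the snooped bus segment */
    const struct sketch_dma_ops *ops;
    unsigned long lines_invalidated;    /* bookkeeping for the sketch */
};

/* Snooped segment: hardware keeps the L2 coherent, nothing to do. */
static void sketch_sync_snooped(struct sketch_device *dev, size_t len)
{
    (void)dev;
    (void)len;
}

/* Non-snooped segment: account for one invalidation per touched line. */
static void sketch_sync_manual(struct sketch_device *dev, size_t len)
{
    dev->lines_invalidated += (len + SKETCH_LINE_SIZE - 1u) / SKETCH_LINE_SIZE;
}

static const struct sketch_dma_ops sketch_snooped_ops = {
    .sync_for_cpu = sketch_sync_snooped,
};

static const struct sketch_dma_ops sketch_manual_ops = {
    .sync_for_cpu = sketch_sync_manual,
};

/* Pick the ops table once, from the device's coherency tag. */
static void sketch_device_init(struct sketch_device *dev, const char *name,
                               int l2_coherent)
{
    dev->name = name;
    dev->l2_coherent = l2_coherent;
    dev->ops = l2_coherent ? &sketch_snooped_ops : &sketch_manual_ops;
    dev->lines_invalidated = 0;
}

/* Callers use one entry point; the per-device table decides the cost. */
static void sketch_dma_sync_for_cpu(struct sketch_device *dev, size_t len)
{
    assert(dev->ops != NULL);
    dev->ops->sync_for_cpu(dev, len);
}
```

With this shape, a device on the snooped segment pays nothing on each sync, while only devices tagged as non-coherent pay for the manual walk, which matches the goal stated in the thread of not penalizing platforms whose L2 is fully coherent.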