* Fwd: [PATCH 002/002] de2104x: support for systems lacking cache coherence [not found] <46e1c7760902071330i5362fe4fvd99fc7075fc666d3@mail.gmail.com> @ 2009-02-09 7:27 ` Risto Suominen 2009-02-09 7:45 ` David Miller 2009-02-10 1:01 ` Andrew Morton 1 sibling, 1 reply; 18+ messages in thread From: Risto Suominen @ 2009-02-09 7:27 UTC (permalink / raw) To: netdev [-- Attachment #1: Type: text/plain, Size: 2844 bytes --] ---------- Forwarded message ---------- From: Risto Suominen <risto.suominen@gmail.com> Date: 2009/2/7 Subject: [PATCH 002/002] de2104x: support for systems lacking cache coherence To: Jeff Garzik <jgarzik@pobox.com>, lkml <linux-kernel@vger.kernel.org> Add a configurable Descriptor Skip Length for systems that lack cache coherence. Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> --- The testing is done on kernel version 2.6.18. --- drivers/net/tulip/Kconfig.org 2006-09-20 06:42:06.000000000 +0300 +++ drivers/net/tulip/Kconfig 2009-02-07 20:48:17.000000000 +0200 @@ -27,6 +27,18 @@ config DE2104X To compile this driver as a module, choose M here. The module will be called de2104x. +config DE2104X_DSL + int "Descriptor Skip Length in 32 bit longwords" + depends on DE2104X + range 0 31 + default 0 + help + Setting this value allows to align ring buffer descriptors into their + own cache lines. Value of 4 corresponds to the typical 32 byte line + (the descriptor is 16 bytes). This is necessary on systems that lack + cache coherence, an example is PowerMac 5500. Otherwise 0 is safe. + Default is 0, and range is 0 to 31. + config TULIP tristate "DECchip Tulip (dc2114x) PCI support" depends on PCI --- drivers/net/tulip/de2104x.c.org 2006-09-20 06:42:06.000000000 +0300 +++ drivers/net/tulip/de2104x.c 2009-02-07 15:04:04.000000000 +0200 @@ -82,6 +82,13 @@ MODULE_PARM_DESC (rx_copybreak, "de2104x NETIF_MSG_RX_ERR | \ NETIF_MSG_TX_ERR) +/* Descriptor skip length in 32 bit longwords. */ +#ifndef CONFIG_DE2104X_DSL +#define DSL 0 +#else +#define DSL CONFIG_DE2104X_DSL +#endif + #define DE_RX_RING_SIZE 64 #define DE_TX_RING_SIZE 64 #define DE_RING_BYTES \ @@ -153,6 +160,7 @@ enum { CmdReset = (1 << 0), CacheAlign16 = 0x00008000, BurstLen4 = 0x00000400, + DescSkipLen = (DSL << 2), /* Rx/TxPoll bits */ NormalTxPoll = (1 << 0), @@ -246,7 +254,7 @@ static const u32 de_intr_mask = * Set the programmable burst length to 4 longwords for all: * DMA errors result without these values. Cache align 16 long. */ -static const u32 de_bus_mode = CacheAlign16 | BurstLen4; +static const u32 de_bus_mode = CacheAlign16 | BurstLen4 | DescSkipLen; struct de_srom_media_block { u8 opts; @@ -266,6 +274,9 @@ struct de_desc { __le32 opts2; __le32 addr1; __le32 addr2; +#if DSL + __le32 skip[DSL]; +#endif }; struct media_info { [-- Attachment #2: de2104x-dsl.patch --] [-- Type: text/x-diff, Size: 2182 bytes --] Add a configurable Descriptor Skip Length for systems that lack cache coherence. Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> --- The testing is done on kernel version 2.6.18. --- drivers/net/tulip/Kconfig.org 2006-09-20 06:42:06.000000000 +0300 +++ drivers/net/tulip/Kconfig 2009-02-07 20:48:17.000000000 +0200 @@ -27,6 +27,18 @@ config DE2104X To compile this driver as a module, choose M here. The module will be called de2104x. +config DE2104X_DSL + int "Descriptor Skip Length in 32 bit longwords" + depends on DE2104X + range 0 31 + default 0 + help + Setting this value allows to align ring buffer descriptors into their + own cache lines. Value of 4 corresponds to the typical 32 byte line + (the descriptor is 16 bytes). This is necessary on systems that lack + cache coherence, an example is PowerMac 5500. Otherwise 0 is safe. + Default is 0, and range is 0 to 31. + config TULIP tristate "DECchip Tulip (dc2114x) PCI support" depends on PCI --- drivers/net/tulip/de2104x.c.org 2006-09-20 06:42:06.000000000 +0300 +++ drivers/net/tulip/de2104x.c 2009-02-07 15:04:04.000000000 +0200 @@ -82,6 +82,13 @@ MODULE_PARM_DESC (rx_copybreak, "de2104x NETIF_MSG_RX_ERR | \ NETIF_MSG_TX_ERR) +/* Descriptor skip length in 32 bit longwords. */ +#ifndef CONFIG_DE2104X_DSL +#define DSL 0 +#else +#define DSL CONFIG_DE2104X_DSL +#endif + #define DE_RX_RING_SIZE 64 #define DE_TX_RING_SIZE 64 #define DE_RING_BYTES \ @@ -153,6 +160,7 @@ enum { CmdReset = (1 << 0), CacheAlign16 = 0x00008000, BurstLen4 = 0x00000400, + DescSkipLen = (DSL << 2), /* Rx/TxPoll bits */ NormalTxPoll = (1 << 0), @@ -246,7 +254,7 @@ static const u32 de_intr_mask = * Set the programmable burst length to 4 longwords for all: * DMA errors result without these values. Cache align 16 long. */ -static const u32 de_bus_mode = CacheAlign16 | BurstLen4; +static const u32 de_bus_mode = CacheAlign16 | BurstLen4 | DescSkipLen; struct de_srom_media_block { u8 opts; @@ -266,6 +274,9 @@ struct de_desc { __le32 opts2; __le32 addr1; __le32 addr2; +#if DSL + __le32 skip[DSL]; +#endif }; struct media_info { ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 7:27 ` Fwd: [PATCH 002/002] de2104x: support for systems lacking cache coherence Risto Suominen @ 2009-02-09 7:45 ` David Miller 2009-02-09 8:22 ` Risto Suominen 2009-02-13 3:42 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 18+ messages in thread From: David Miller @ 2009-02-09 7:45 UTC (permalink / raw) To: risto.suominen; +Cc: netdev From: Risto Suominen <risto.suominen@gmail.com> Date: Mon, 9 Feb 2009 09:27:49 +0200 > Add a configurable Descriptor Skip Length for systems that lack cache coherence. > > Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> I really don't see why this patch could possibly be necessary. On systems that lack cache coherence: 1) {pci,dma}_alloc_{consistent,coherent}() give kernel mappings of the buffer with the cache disabled. Therefore the device and and cpu see the correct data. 2) {pci,dma}_{map,unmap}_{single,sg}() do the appropriate cache flushing. Explicit syncing between cpu and device can be performed using {pci,dma}_sync_{single,sg}() as needed. Therefore, this patch is superfluous. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 7:45 ` David Miller @ 2009-02-09 8:22 ` Risto Suominen 2009-02-09 8:29 ` David Miller 2009-02-09 16:58 ` Krzysztof Halasa 2009-02-13 3:42 ` Benjamin Herrenschmidt 1 sibling, 2 replies; 18+ messages in thread From: Risto Suominen @ 2009-02-09 8:22 UTC (permalink / raw) To: David Miller; +Cc: netdev 2009/2/9 David Miller <davem@davemloft.net>: > From: Risto Suominen <risto.suominen@gmail.com> > Date: Mon, 9 Feb 2009 09:27:49 +0200 > >> Add a configurable Descriptor Skip Length for systems that lack cache coherence. >> >> Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> > > I really don't see why this patch could possibly be necessary. > Because it makes it work on my PowerMac 5500 ;) > On systems that lack cache coherence: > > 1) {pci,dma}_alloc_{consistent,coherent}() give kernel mappings of > the buffer with the cache disabled. Therefore the device and > and cpu see the correct data. > > 2) {pci,dma}_{map,unmap}_{single,sg}() do the appropriate cache > flushing. > > Explicit syncing between cpu and device can be performed > using {pci,dma}_sync_{single,sg}() as needed. > Sounds good, but does not seem to help. My theory is that when the cpu is writing to one descriptor, it accidentally overwrites another descriptor, that has already been written to by the device, as there is only a single dirty bit, that makes the whole cacheline to be flushed. > Therefore, this patch is superfluous. > Or everything else is. My solution does not cost a penny. Risto ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 8:22 ` Risto Suominen @ 2009-02-09 8:29 ` David Miller 2009-02-09 8:35 ` Risto Suominen 2009-02-09 16:58 ` Krzysztof Halasa 1 sibling, 1 reply; 18+ messages in thread From: David Miller @ 2009-02-09 8:29 UTC (permalink / raw) To: risto.suominen; +Cc: netdev From: Risto Suominen <risto.suominen@gmail.com> Date: Mon, 9 Feb 2009 10:22:15 +0200 > Sounds good, but does not seem to help. My theory is that when the cpu > is writing to one descriptor, it accidentally overwrites another > descriptor, that has already been written to by the device, as there > is only a single dirty bit, that makes the whole cacheline to be > flushed. You tested with 2.6.18 but you want me to apply this to current kernels, that won't work. Therefore you'll need to verify that the problem still exists with current kernels before I'll consider this seriously. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 8:29 ` David Miller @ 2009-02-09 8:35 ` Risto Suominen 0 siblings, 0 replies; 18+ messages in thread From: Risto Suominen @ 2009-02-09 8:35 UTC (permalink / raw) To: David Miller; +Cc: netdev I see your point, was kind of expecting this. I'll see what I can do. Risto ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 8:22 ` Risto Suominen 2009-02-09 8:29 ` David Miller @ 2009-02-09 16:58 ` Krzysztof Halasa 2009-02-09 19:22 ` Risto Suominen 1 sibling, 1 reply; 18+ messages in thread From: Krzysztof Halasa @ 2009-02-09 16:58 UTC (permalink / raw) To: Risto Suominen; +Cc: David Miller, netdev Risto Suominen <risto.suominen@gmail.com> writes: >> 1) {pci,dma}_alloc_{consistent,coherent}() give kernel mappings of >> the buffer with the cache disabled. ^^^^^^^^^^^^^^^^^^^^^^^ > Sounds good, but does not seem to help. My theory is that when the cpu > is writing to one descriptor, it accidentally overwrites another > descriptor, that has already been written to by the device, as there > is only a single dirty bit, that makes the whole cacheline to be > flushed. That means the consistent/coherent mapping isn't really consistent/coherent (uncached), right? Perhaps there is some way to fix this instead of changing the drivers to avoid the problematic area? Potentially any driver is affected by such coherency problem, this can't be specific to 21040. -- Krzysztof Halasa ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 16:58 ` Krzysztof Halasa @ 2009-02-09 19:22 ` Risto Suominen 2009-02-09 22:51 ` David Miller 2009-02-10 23:21 ` David Miller 0 siblings, 2 replies; 18+ messages in thread From: Risto Suominen @ 2009-02-09 19:22 UTC (permalink / raw) To: Krzysztof Halasa; +Cc: David Miller, netdev 2009/2/9 Krzysztof Halasa <khc@pm.waw.pl>: > > That means the consistent/coherent mapping isn't really > consistent/coherent (uncached), right? Perhaps there is some way to fix > this instead of changing the drivers to avoid the problematic area? > Yes. The other way could be to add a config parameter that allows to explicitly set CONFIG_NOT_COHERENT_CACHE. I have not tried this, probably because the necessary code wasn't in arch/powerpc until 2.6.21. Anyway, now I have tested with 2.6.24, and the problem still exists. Apropos changing the driver, I would like to point out, that leaving my proposed config parameter to its default value means not changing the driver. > Potentially any driver is affected by such coherency problem, this can't > be specific to 21040. > I agree. That talks for the config solution. Risto ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 19:22 ` Risto Suominen @ 2009-02-09 22:51 ` David Miller 2009-02-10 1:45 ` Krzysztof Halasa 2009-02-10 23:21 ` David Miller 1 sibling, 1 reply; 18+ messages in thread From: David Miller @ 2009-02-09 22:51 UTC (permalink / raw) To: risto.suominen; +Cc: khc, netdev From: Risto Suominen <risto.suominen@gmail.com> Date: Mon, 9 Feb 2009 21:22:03 +0200 > 2009/2/9 Krzysztof Halasa <khc@pm.waw.pl>: > > > > That means the consistent/coherent mapping isn't really > > consistent/coherent (uncached), right? Perhaps there is some way to fix > > this instead of changing the drivers to avoid the problematic area? > > > Yes. > > The other way could be to add a config parameter that allows to > explicitly set CONFIG_NOT_COHERENT_CACHE. I have not tried this, > probably because the necessary code wasn't in arch/powerpc until > 2.6.21. > > Anyway, now I have tested with 2.6.24, and the problem still exists. Ok. The issue are descriptors that are _written_ by both the cpu and the device. That is the problematic case here. e100 had a similar problem and changes were made there too to accomodate such system types. If only the cpu or only the device writes to the descriptor (which is the most common design these days) there are no problems. So as promised I'll have to relook at this patch :-) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 22:51 ` David Miller @ 2009-02-10 1:45 ` Krzysztof Halasa 2009-02-10 1:50 ` David Miller 0 siblings, 1 reply; 18+ messages in thread From: Krzysztof Halasa @ 2009-02-10 1:45 UTC (permalink / raw) To: David Miller; +Cc: risto.suominen, netdev David Miller <davem@davemloft.net> writes: > The issue are descriptors that are _written_ by both the cpu > and the device. That is the problematic case here. Do you mean both CPU and 21040 write to the same descriptor at (nearly) the same time? Is it TX, RX or both? I wonder, how would the patch help it? The patch seems to align the descriptors on cache line boundary. That IMHO means the corruption is caused by the 21040 writing to e.g. desc #0, CPU writing to desc #1, which causes the cache line write bringing the old desc #0 back. Is it possible to use uncached memory for coherent allocations (with no write side effects) on this machine? -- Krzysztof Halasa ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-10 1:45 ` Krzysztof Halasa @ 2009-02-10 1:50 ` David Miller 0 siblings, 0 replies; 18+ messages in thread From: David Miller @ 2009-02-10 1:50 UTC (permalink / raw) To: khc; +Cc: risto.suominen, netdev From: Krzysztof Halasa <khc@pm.waw.pl> Date: Tue, 10 Feb 2009 02:45:46 +0100 > David Miller <davem@davemloft.net> writes: > > > The issue are descriptors that are _written_ by both the cpu > > and the device. That is the problematic case here. > > Do you mean both CPU and 21040 write to the same descriptor at (nearly) > the same time? Is it TX, RX or both? > > I wonder, how would the patch help it? The problem is when the chip is writing to one neighbouring descriptor of one which the cpu is writing to at the same time. > The patch seems to align the descriptors on cache line boundary. That > IMHO means the corruption is caused by the 21040 writing to e.g. desc > #0, CPU writing to desc #1, which causes the cache line write bringing > the old desc #0 back. Right. > Is it possible to use uncached memory for coherent allocations (with no > write side effects) on this machine? Good question. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 19:22 ` Risto Suominen 2009-02-09 22:51 ` David Miller @ 2009-02-10 23:21 ` David Miller 2009-02-11 12:18 ` Risto Suominen 1 sibling, 1 reply; 18+ messages in thread From: David Miller @ 2009-02-10 23:21 UTC (permalink / raw) To: risto.suominen; +Cc: khc, netdev From: Risto Suominen <risto.suominen@gmail.com> Date: Mon, 9 Feb 2009 21:22:03 +0200 > 2009/2/9 Krzysztof Halasa <khc@pm.waw.pl>: > > Potentially any driver is affected by such coherency problem, this can't > > be specific to 21040. > > > I agree. That talks for the config solution. I think the pci_alloc_consistent() implementation for your particular platform should be fixed instead :-) It should be using uncacheable cpu mappings of the returned memory if the cpu lacks cache coherency with DMA. Peppering all kinds of drivers with this kind of change being proposed here is not the way to handle this problem. It makes the creation of the abstractions we created with pci_alloc_consistent() and friends totally useless. Drivers have to be able to say "what I write to this memory the device will see, and what the device writes the cpu will see" and they have to be able to say this regardless of details like cache alignment and other things that they should have no business knowing about. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-10 23:21 ` David Miller @ 2009-02-11 12:18 ` Risto Suominen 2009-02-11 12:31 ` Risto Suominen 0 siblings, 1 reply; 18+ messages in thread From: Risto Suominen @ 2009-02-11 12:18 UTC (permalink / raw) To: David Miller; +Cc: khc, netdev [-- Attachment #1: Type: text/plain, Size: 987 bytes --] 2009/2/11 David Miller <davem@davemloft.net>: > > I think the pci_alloc_consistent() implementation for your particular > platform should be fixed instead :-) > Looks like it works as expected with 2.6.24. This patch is all that is needed. Still I think that my first solution was beautiful in its simplicity :) (and usable with older kernels) Risto Allow setting NOT_COHERENT_CACHE explicitly. Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> --- The testing is done on kernel version 2.6.24. --- a/arch/powerpc/platforms/powermac/Kconfig.org 2008-01-25 00:58:37.000000000 +0200 +++ b/arch/powerpc/platforms/powermac/Kconfig 2009-02-10 17:44:24.000000000 +0200 @@ -18,4 +18,10 @@ config PPC_PMAC64 select PPC_970_NAP default y - +config NOT_COHERENT_CACHE + bool "Incoherent cache" + default n + help + Setting this option may be necessary for avoiding cache-related + problems with some network cards on some platforms. An example is + 2104x and PowerMac 5500. [-- Attachment #2: incoherent_cache.patch --] [-- Type: text/x-diff, Size: 635 bytes --] Allow setting NOT_COHERENT_CACHE explicitly. Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> --- The testing is done on kernel version 2.6.24. --- a/arch/powerpc/platforms/powermac/Kconfig.org 2008-01-25 00:58:37.000000000 +0200 +++ b/arch/powerpc/platforms/powermac/Kconfig 2009-02-10 17:44:24.000000000 +0200 @@ -18,4 +18,10 @@ config PPC_PMAC64 select PPC_970_NAP default y - +config NOT_COHERENT_CACHE + bool "Incoherent cache" + default n + help + Setting this option may be necessary for avoiding cache-related + problems with some network cards on some platforms. An example is + 2104x and PowerMac 5500. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-11 12:18 ` Risto Suominen @ 2009-02-11 12:31 ` Risto Suominen 2009-02-11 21:39 ` David Miller 0 siblings, 1 reply; 18+ messages in thread From: Risto Suominen @ 2009-02-11 12:31 UTC (permalink / raw) To: David Miller; +Cc: khc, netdev As an interesting side effect, the patch caused my mesh module to stop working. It crashes. Risto ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-11 12:31 ` Risto Suominen @ 2009-02-11 21:39 ` David Miller 0 siblings, 0 replies; 18+ messages in thread From: David Miller @ 2009-02-11 21:39 UTC (permalink / raw) To: risto.suominen; +Cc: khc, netdev From: Risto Suominen <risto.suominen@gmail.com> Date: Wed, 11 Feb 2009 14:31:43 +0200 > As an interesting side effect, the patch caused my mesh module to stop > working. It crashes. I think you need to take this discussion to linuxppc-dev at this point. Nobody amongst the networking developers can say whether your NOT_COHERENT_CACHE change is a valid or desirable way to solve the problem, nor what side effects it might have. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-09 7:45 ` David Miller 2009-02-09 8:22 ` Risto Suominen @ 2009-02-13 3:42 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-13 3:42 UTC (permalink / raw) To: David Miller; +Cc: risto.suominen, netdev > I really don't see why this patch could possibly be necessary. > > On systems that lack cache coherence: > > 1) {pci,dma}_alloc_{consistent,coherent}() give kernel mappings of > the buffer with the cache disabled. Therefore the device and > and cpu see the correct data. > > 2) {pci,dma}_{map,unmap}_{single,sg}() do the appropriate cache > flushing. > > Explicit syncing between cpu and device can be performed > using {pci,dma}_sync_{single,sg}() as needed. > > Therefore, this patch is superfluous. Allright so in that case, I'm not entirely sure... There is some history behind that (and btw, de4x5 and tulip probably need the same "workaround"), see below. First what I believe the problem to be is some kind of coherency bug in a bridge in some of these old systems. I'm not 100% sure of the details but it does have to do with partial cache line access iirc. By moving the descriptors into separate cache lines, we avoid the bug. It would be possible, I suppose, to work around it by treating those machines as non-coherent but with two significant drawbacks: - One is simple and you may want to ignore it :-) Basically, those are already pretty slow machines, and thus we would slow them down even more by doing manual cache flush/invalidates all over the place (as a matter of fact, I'm not even sure those CPUs support dcbi instructions for cache inval, I need to dbl check). - The other is more subtle but also harder to avoid: It would be fairly hard to add support for non-coherent mappings on these because of the way the whole MMU stuff works for those 32-bit hash based powerpc's. Basically, we cannot have something mapped both cached and non-cached (in different virtual addresses), and the use of the BAT mappings in the processor means that most of the linear mapping -will- be mapped cached in chunks of 256MB. The code pretty much relies on that, it's not just an optimisation that can be turned off. So I'm of mixed opinion here... It looks like since only those 2 or 3 drivers are affected (ie, they are the only thing ever found behind those weirdo bridges) and since those chips happen to support this padding between descriptor, it looks like a good compromise to just add this feature to the drivers. In fact, de4x5.c at least used to have it, I don't know if it's still the case. Now, there are other cache coherency related bugs I think on those old machines, or at least what look like it -could- be cache coherency related bugs, or maybe just bugs in the DBDMA engine. Mesh for example gets unhappy if we give it a buffer that is not aligned on a cache line boundary and corrupts the beginning of that cache line. But I don't think that treating the platform as non-coherent will fix that, ie, it seems to happens regardless of the line actually being in the cache or not. This is typically triggered by the SCSI ID at boot when using SLAB debug which de-aligns kmalloc, or at least that's how I observed it a while ago. Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence [not found] <46e1c7760902071330i5362fe4fvd99fc7075fc666d3@mail.gmail.com> 2009-02-09 7:27 ` Fwd: [PATCH 002/002] de2104x: support for systems lacking cache coherence Risto Suominen @ 2009-02-10 1:01 ` Andrew Morton 2009-02-10 1:07 ` David Miller 2009-02-10 7:16 ` Risto Suominen 1 sibling, 2 replies; 18+ messages in thread From: Andrew Morton @ 2009-02-10 1:01 UTC (permalink / raw) To: Risto Suominen; +Cc: jgarzik, linux-kernel, netdev, Grant Grundler (cc netdev, and maintainer) On Sat, 7 Feb 2009 23:30:40 +0200 Risto Suominen <risto.suominen@gmail.com> wrote: > Add a configurable Descriptor Skip Length for systems that lack cache coherence. > > Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> > --- > The testing is done on kernel version 2.6.18. > > --- drivers/net/tulip/Kconfig.org 2006-09-20 06:42:06.000000000 +0300 > +++ drivers/net/tulip/Kconfig 2009-02-07 20:48:17.000000000 +0200 Please prefer to prepare patches in `patch -p1' form: --- a/drivers/net/tulip/Kconfig +++ a/drivers/net/tulip/Kconfig > @@ -27,6 +27,18 @@ config DE2104X > To compile this driver as a module, choose M here. The module will > be called de2104x. > > +config DE2104X_DSL > + int "Descriptor Skip Length in 32 bit longwords" > + depends on DE2104X > + range 0 31 > + default 0 > + help > + Setting this value allows to align ring buffer descriptors into their > + own cache lines. Value of 4 corresponds to the typical 32 byte line > + (the descriptor is 16 bytes). This is necessary on systems that lack > + cache coherence, an example is PowerMac 5500. Otherwise 0 is safe. > + Default is 0, and range is 0 to 31. > + > config TULIP > tristate "DECchip Tulip (dc2114x) PCI support" > depends on PCI > --- drivers/net/tulip/de2104x.c.org 2006-09-20 06:42:06.000000000 +0300 > +++ drivers/net/tulip/de2104x.c 2009-02-07 15:04:04.000000000 +0200 > @@ -82,6 +82,13 @@ MODULE_PARM_DESC (rx_copybreak, "de2104x > NETIF_MSG_RX_ERR | \ > NETIF_MSG_TX_ERR) > > +/* Descriptor skip length in 32 bit longwords. */ > +#ifndef CONFIG_DE2104X_DSL > +#define DSL 0 > +#else > +#define DSL CONFIG_DE2104X_DSL > +#endif I think CONFIG_DE2104X_DSL is always defined if this driver is being compiled. So the Kconfig `default' should suffice here? In which case we can do #define DSL CONFIG_DE2104X_DSL and leave it at that. > #define DE_RX_RING_SIZE 64 > #define DE_TX_RING_SIZE 64 > #define DE_RING_BYTES \ > @@ -153,6 +160,7 @@ enum { > CmdReset = (1 << 0), > CacheAlign16 = 0x00008000, > BurstLen4 = 0x00000400, > + DescSkipLen = (DSL << 2), In which case we can do away with DSL everywhere and just use CONFIG_DE2104X_DSL here. But really, we shouldn't do this at configuration time at all. It would be much better to do it at runtime, via a module parameter. And it would be much^2 better to do it automatically, based upon the chip probing information or whatever. That's probably hard. > /* Rx/TxPoll bits */ > NormalTxPoll = (1 << 0), > @@ -246,7 +254,7 @@ static const u32 de_intr_mask = > * Set the programmable burst length to 4 longwords for all: > * DMA errors result without these values. Cache align 16 long. > */ > -static const u32 de_bus_mode = CacheAlign16 | BurstLen4; > +static const u32 de_bus_mode = CacheAlign16 | BurstLen4 | DescSkipLen; > > struct de_srom_media_block { > u8 opts; > @@ -266,6 +274,9 @@ struct de_desc { > __le32 opts2; > __le32 addr1; > __le32 addr2; > +#if DSL > + __le32 skip[DSL]; > +#endif > }; Can remove the ifdefs here. A zero-length array is OK, and will consume zero space. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-10 1:01 ` Andrew Morton @ 2009-02-10 1:07 ` David Miller 2009-02-10 7:16 ` Risto Suominen 1 sibling, 0 replies; 18+ messages in thread From: David Miller @ 2009-02-10 1:07 UTC (permalink / raw) To: akpm; +Cc: risto.suominen, jgarzik, linux-kernel, netdev, grundler From: Andrew Morton <akpm@linux-foundation.org> Date: Mon, 9 Feb 2009 17:01:20 -0800 > > (cc netdev, and maintainer) We're already discussing this patch on netdev, he resent it there. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 002/002] de2104x: support for systems lacking cache coherence 2009-02-10 1:01 ` Andrew Morton 2009-02-10 1:07 ` David Miller @ 2009-02-10 7:16 ` Risto Suominen 1 sibling, 0 replies; 18+ messages in thread From: Risto Suominen @ 2009-02-10 7:16 UTC (permalink / raw) To: Andrew Morton; +Cc: jgarzik, linux-kernel, netdev, Grant Grundler 2009/2/10 Andrew Morton <akpm@linux-foundation.org>: > > Please prefer to prepare patches in `patch -p1' form: > > --- a/drivers/net/tulip/Kconfig > +++ a/drivers/net/tulip/Kconfig > I'll try to remember that. > I think CONFIG_DE2104X_DSL is always defined if this driver is being > compiled. So the Kconfig `default' should suffice here? In which case > we can do > > #define DSL CONFIG_DE2104X_DSL > > and leave it at that. > > In which case we can do away with DSL everywhere and just use > CONFIG_DE2104X_DSL here. > I was thinking of the case where someone just compiles without reconfiguring. > But really, we shouldn't do this at configuration time at all. It > would be much better to do it at runtime, via a module parameter. > Well, yeah, but that would probably cost cpu cycles. At least it would mean lots of changes. > And it would be much^2 better to do it automatically, based upon the > chip probing information or whatever. That's probably hard. > I agree. > Can remove the ifdefs here. A zero-length array is OK, and will > consume zero space. > I was uncertain whether this applies to all compilers. I kind of remember opposite cases in the past. Risto ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2009-02-13 3:42 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <46e1c7760902071330i5362fe4fvd99fc7075fc666d3@mail.gmail.com> 2009-02-09 7:27 ` Fwd: [PATCH 002/002] de2104x: support for systems lacking cache coherence Risto Suominen 2009-02-09 7:45 ` David Miller 2009-02-09 8:22 ` Risto Suominen 2009-02-09 8:29 ` David Miller 2009-02-09 8:35 ` Risto Suominen 2009-02-09 16:58 ` Krzysztof Halasa 2009-02-09 19:22 ` Risto Suominen 2009-02-09 22:51 ` David Miller 2009-02-10 1:45 ` Krzysztof Halasa 2009-02-10 1:50 ` David Miller 2009-02-10 23:21 ` David Miller 2009-02-11 12:18 ` Risto Suominen 2009-02-11 12:31 ` Risto Suominen 2009-02-11 21:39 ` David Miller 2009-02-13 3:42 ` Benjamin Herrenschmidt 2009-02-10 1:01 ` Andrew Morton 2009-02-10 1:07 ` David Miller 2009-02-10 7:16 ` Risto Suominen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).