* Breakage caused by unreviewed patch in x86 tree
@ 2008-04-27 20:51 James Bottomley
2008-04-27 20:53 ` David Miller
` (3 more replies)
0 siblings, 4 replies; 51+ messages in thread
From: James Bottomley @ 2008-04-27 20:51 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner; +Cc: linux-kernel
This patch:
commit 6371b495991debfd1417b17c2bc4f7d7bae05739
Author: Ingo Molnar <mingo@elte.hu>
Date: Wed Jan 30 13:33:40 2008 +0100
x86: change ioremap() to default to uncached
As far as I can tell went blindly into the x86 tree without being shared
on any mailing list at all. How can something that completely alters
the semantics of ioremap on x86 platforms go in without any review.
I'm pissed off because it broke a class of voyager machines: those which
rely on the quad interrupt controller (QIC). The precis of why they
broke is because the QIC does IPIs (or CPIs in its terminology) via
cache line interference: you interrupt a processor by moving a
designated memory area to write exclusive in the cache (by simply
writing to the line) and the CPU acks the interrupt by moving it back to
read shared (by reading from it). That area, is, of course, mapped by
ioremap, so reversing the ioremap semantics and adding the uncached bit
completely breaks the QIC. I might add that the intel SAPIC functions
in roughly the same manner, so this might break more than just voyager.
This patch was originally proposed my Matthew Wilcox in March 2006:
http://lkml.org/lkml/2006/3/30/251
And after a reasonable discussion, it was decided that there was too
much risk of breakage to take it. How come you two decided that an
identical patch with different authorship could go in to your tree
without discussion this time?
James
^ permalink raw reply [flat|nested] 51+ messages in thread* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 20:51 Breakage caused by unreviewed patch in x86 tree James Bottomley @ 2008-04-27 20:53 ` David Miller 2008-04-27 21:48 ` [patch] x86, voyager: fix ioremap_nocache() Ingo Molnar ` (2 subsequent siblings) 3 siblings, 0 replies; 51+ messages in thread From: David Miller @ 2008-04-27 20:53 UTC (permalink / raw) To: James.Bottomley; +Cc: mingo, tglx, linux-kernel From: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Sun, 27 Apr 2008 16:51:25 -0400 > I'm pissed off because it broke a class of voyager machines: those which Join the very large and constantly growing club, Ingo is upsetting a lot of people lately. ^ permalink raw reply [flat|nested] 51+ messages in thread
* [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 20:51 Breakage caused by unreviewed patch in x86 tree James Bottomley 2008-04-27 20:53 ` David Miller @ 2008-04-27 21:48 ` Ingo Molnar 2008-04-27 22:05 ` James Bottomley 2008-04-27 22:34 ` James Bottomley 2008-04-27 22:00 ` Breakage caused by unreviewed patch in x86 tree H. Peter Anvin 2008-04-27 22:58 ` Arjan van de Ven 3 siblings, 2 replies; 51+ messages in thread From: Ingo Molnar @ 2008-04-27 21:48 UTC (permalink / raw) To: James Bottomley Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller * James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > This patch: > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > Author: Ingo Molnar <mingo@elte.hu> > Date: Wed Jan 30 13:33:40 2008 +0100 > > x86: change ioremap() to default to uncached > > As far as I can tell went blindly into the x86 tree without being > shared on any mailing list at all. How can something that completely > alters the semantics of ioremap on x86 platforms go in without any > review. what happened before was that on x86 ioremap() was "lax" about the PTE cachability and just said "writeback cached". That was utterly false for most of the real ioremap()s done by drivers, but it happened to work out fine due to the courtesy of BIOSes setting up UC MTRRs that forced those areas into uncachable. with the PAT changes, what used to be this default and careless ioremap() behavior by x86 turned into a hard cache aliasing conflict. So we pretty much _have to_ default to uncached - like all other sane architectures already did forever. This is a direct consequence of the PAT changes which were discussed on lkml. But with PAT we take over from the MTRRs on x86 and using a cacheable ioremap() would give us aliasing conflicts and trouble all around the place. In the Voyager case we should use ioremap_cache(), and thanks for pointing out that dependency of the QIC. Does the patch below fix it for you? Ingo ---------------> Subject: x86, voyager: fix ioremap_nocache() From: Ingo Molnar <mingo@elte.hu> Date: Sun Apr 27 23:21:03 CEST 2008 James Bottomley reported that the following PAT related commit: | commit 6371b495991debfd1417b17c2bc4f7d7bae05739 | Author: Ingo Molnar <mingo@elte.hu> | Date: Wed Jan 30 13:33:40 2008 +0100 | | x86: change ioremap() to default to uncached broke Voyager. James says: " it broke a class of voyager machines: those which rely on the quad interrupt controller (QIC). The precis of why they broke is because the QIC does IPIs (or CPIs in its terminology) via cache line interference: you interrupt a processor by moving a designated memory area to write exclusive in the cache (by simply writing to the line) and the CPU acks the interrupt by moving it back to read shared (by reading from it). That area, is, of course, mapped by ioremap, so reversing the ioremap semantics and adding the uncached bit completely breaks the QIC. " Sorry about that! Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/mach-voyager/voyager_cat.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Index: linux-x86.q/arch/x86/mach-voyager/voyager_cat.c =================================================================== --- linux-x86.q.orig/arch/x86/mach-voyager/voyager_cat.c +++ linux-x86.q/arch/x86/mach-voyager/voyager_cat.c @@ -602,7 +602,7 @@ void __init voyager_cat_init(void) request_resource(&iomem_resource, &res); voyager_SUS = (struct voyager_SUS *) - ioremap(addr, 0x400); + ioremap_cache(addr, 0x400); printk(KERN_NOTICE "Voyager SUS mailbox version 0x%x\n", voyager_SUS->SUS_version); voyager_SUS->kernel_version = VOYAGER_MAILBOX_VERSION; @@ -877,7 +877,7 @@ void __init voyager_cat_init(void) request_resource(&iomem_resource, res); } - qic_addr = (unsigned long)ioremap(qic_addr, 0x400); + qic_addr = (unsigned long)ioremap_cache(qic_addr, 0x400); for (j = 0; j < 4; j++) { __u8 cpu; ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 21:48 ` [patch] x86, voyager: fix ioremap_nocache() Ingo Molnar @ 2008-04-27 22:05 ` James Bottomley 2008-04-27 22:36 ` Willy Tarreau ` (2 more replies) 2008-04-27 22:34 ` James Bottomley 1 sibling, 3 replies; 51+ messages in thread From: James Bottomley @ 2008-04-27 22:05 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller On Sun, 2008-04-27 at 23:48 +0200, Ingo Molnar wrote: > * James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > This patch: > > > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > > Author: Ingo Molnar <mingo@elte.hu> > > Date: Wed Jan 30 13:33:40 2008 +0100 > > > > x86: change ioremap() to default to uncached > > > > As far as I can tell went blindly into the x86 tree without being > > shared on any mailing list at all. How can something that completely > > alters the semantics of ioremap on x86 platforms go in without any > > review. > > what happened before was that on x86 ioremap() was "lax" about the PTE > cachability and just said "writeback cached". That was utterly false for > most of the real ioremap()s done by drivers, but it happened to work out > fine due to the courtesy of BIOSes setting up UC MTRRs that forced those > areas into uncachable. > > with the PAT changes, what used to be this default and careless > ioremap() behavior by x86 turned into a hard cache aliasing conflict. So > we pretty much _have to_ default to uncached - like all other sane > architectures already did forever. This is a direct consequence of the > PAT changes which were discussed on lkml. > > But with PAT we take over from the MTRRs on x86 and using a cacheable > ioremap() would give us aliasing conflicts and trouble all around the > place. I'm not saying the patch is wrong ... or that just because it broke voyager it shouldn't be done. What I'm saying is that it shouldn't have been put into the x86 tree without mailing list review. Running a git tree isn't a private fiefdom, it's a public trust; to keep the trust of other developers, you have to run the tree in a transparent fashion ... and making the mailing list the only input to it is one way of ensuring this. It also helps with review that we're all so worried about so little being done ... > In the Voyager case we should use ioremap_cache(), and thanks for > pointing out that dependency of the QIC. Does the patch below fix it for > you? > > Ingo > > ---------------> > Subject: x86, voyager: fix ioremap_nocache() > From: Ingo Molnar <mingo@elte.hu> > Date: Sun Apr 27 23:21:03 CEST 2008 > > James Bottomley reported that the following PAT related commit: > > | commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > | Author: Ingo Molnar <mingo@elte.hu> > | Date: Wed Jan 30 13:33:40 2008 +0100 > | > | x86: change ioremap() to default to uncached > > broke Voyager. > > James says: > > " it broke a class of voyager machines: those which > rely on the quad interrupt controller (QIC). The precis of why they > broke is because the QIC does IPIs (or CPIs in its terminology) via > cache line interference: you interrupt a processor by moving a > designated memory area to write exclusive in the cache (by simply > writing to the line) and the CPU acks the interrupt by moving it back to > read shared (by reading from it). That area, is, of course, mapped by > ioremap, so reversing the ioremap semantics and adding the uncached bit > completely breaks the QIC. " > > Sorry about that! > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > arch/x86/mach-voyager/voyager_cat.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > Index: linux-x86.q/arch/x86/mach-voyager/voyager_cat.c > =================================================================== > --- linux-x86.q.orig/arch/x86/mach-voyager/voyager_cat.c > +++ linux-x86.q/arch/x86/mach-voyager/voyager_cat.c > @@ -602,7 +602,7 @@ void __init voyager_cat_init(void) > > request_resource(&iomem_resource, &res); > voyager_SUS = (struct voyager_SUS *) > - ioremap(addr, 0x400); > + ioremap_cache(addr, 0x400); Actually, no ... wrong ioremap. That one is the SUS clickmap remap, which is irrelevant whether it's cached or not. The QIC cacheline area is the next one below it. I've already got the actual patch (with a nice explanation) in the voyager tree. I'll send them all out to the mailing list shortly; I'm just trying to chase down a problem with TSCs before the voyager tree is ready. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:05 ` James Bottomley @ 2008-04-27 22:36 ` Willy Tarreau 2008-04-27 22:41 ` Ingo Molnar 2008-04-27 23:18 ` Ingo Molnar 2 siblings, 0 replies; 51+ messages in thread From: Willy Tarreau @ 2008-04-27 22:36 UTC (permalink / raw) To: James Bottomley Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller On Sun, Apr 27, 2008 at 06:05:59PM -0400, James Bottomley wrote: > > Index: linux-x86.q/arch/x86/mach-voyager/voyager_cat.c > > =================================================================== > > --- linux-x86.q.orig/arch/x86/mach-voyager/voyager_cat.c > > +++ linux-x86.q/arch/x86/mach-voyager/voyager_cat.c > > @@ -602,7 +602,7 @@ void __init voyager_cat_init(void) > > > > request_resource(&iomem_resource, &res); > > voyager_SUS = (struct voyager_SUS *) > > - ioremap(addr, 0x400); > > + ioremap_cache(addr, 0x400); > > Actually, no ... wrong ioremap. That one is the SUS clickmap remap, > which is irrelevant whether it's cached or not. The QIC cacheline area > is the next one below it. > > I've already got the actual patch (with a nice explanation) in the > voyager tree. I'll send them all out to the mailing list shortly; I'm > just trying to chase down a problem with TSCs before the voyager tree is > ready. James, whatever the fix, please add a comment on the sensible line above so that the same error does not happen again. Code must be readable without comments, but magic choices must be commented if there's an underlying reason. Thanks, Willy ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:05 ` James Bottomley 2008-04-27 22:36 ` Willy Tarreau @ 2008-04-27 22:41 ` Ingo Molnar 2008-04-27 23:18 ` Ingo Molnar 2 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2008-04-27 22:41 UTC (permalink / raw) To: James Bottomley Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller * James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > request_resource(&iomem_resource, &res); > > voyager_SUS = (struct voyager_SUS *) > > - ioremap(addr, 0x400); > > + ioremap_cache(addr, 0x400); > > Actually, no ... wrong ioremap. That one is the SUS clickmap remap, > which is irrelevant whether it's cached or not. The QIC cacheline area > is the next one below it. yes, but i assumed that since you had all cached ioremaps before, you might wanted it for all of the devices in voyager_cat.c (or at least it wouldnt hurt). Btw., the QIC ioremap i changed in my patch too, just in case you didnt notice: > > - qic_addr = (unsigned long)ioremap(qic_addr, 0x400); > > + qic_addr = (unsigned long)ioremap_cache(qic_addr, 0x400); so my patch too should do the trick. Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:05 ` James Bottomley 2008-04-27 22:36 ` Willy Tarreau 2008-04-27 22:41 ` Ingo Molnar @ 2008-04-27 23:18 ` Ingo Molnar 2008-04-27 23:31 ` David Miller 2008-04-28 6:10 ` Christoph Hellwig 2 siblings, 2 replies; 51+ messages in thread From: Ingo Molnar @ 2008-04-27 23:18 UTC (permalink / raw) To: James Bottomley Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller * James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > with the PAT changes, what used to be this default and careless > > ioremap() behavior by x86 turned into a hard cache aliasing > > conflict. So we pretty much _have to_ default to uncached - like all > > other sane architectures already did forever. This is a direct > > consequence of the PAT changes which were discussed on lkml. > > > > But with PAT we take over from the MTRRs on x86 and using a > > cacheable ioremap() would give us aliasing conflicts and trouble all > > around the place. > > I'm not saying the patch is wrong ... or that just because it broke > voyager it shouldn't be done. [...] ok, but that rather crutial piece of information was not at all clear from your mail ... ;-) > [...] What I'm saying is that it shouldn't have been put into the x86 > tree without mailing list review. ... but the PAT patchset was all around lkml. The ioremap change was trivial, went into v2.6.25 and was discussed on lkml: http://lkml.org/lkml/2008/1/25/287 " that's why we changed all ioremaps to default to cache-disabled (PCD) in latest x86.git as well. For years the x86 architecture set the ioremap pagetable entries to cacheable by default and only the MTRRs (and the BIOS writers) saved us from trouble. Now we try to be a bit more defensive and avoid "BIOS bug causes only Linux to crash and burn while other OSs work fine" type of scenarios. " i could have included the actual patch in that mail i guess - but this was hardly an unreviewed change and you never showed interest in the PAT threads on lkml (or much interest in x86 threads for that matter) so would it have made any difference in practice? The commit was mentioned in the pull request to Linus as well, which was Cc:-ed to lkml. The commit was upstream about 3 months before you noticed that it broke Voyager. But, we'd not mind at all posting 1000 x86.git patches to lkml (or another list) every 3 months (or more frequently), if people request that. Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:18 ` Ingo Molnar @ 2008-04-27 23:31 ` David Miller 2008-04-28 0:31 ` Rik van Riel 2008-04-28 9:01 ` Alan Cox 2008-04-28 6:10 ` Christoph Hellwig 1 sibling, 2 replies; 51+ messages in thread From: David Miller @ 2008-04-27 23:31 UTC (permalink / raw) To: mingo; +Cc: James.Bottomley, tglx, linux-kernel, hpa From: Ingo Molnar <mingo@elte.hu> Date: Mon, 28 Apr 2008 01:18:09 +0200 > But, we'd not mind at all posting 1000 x86.git patches to lkml (or > another list) every 3 months (or more frequently), if people request > that. Keep defending yourself Ingo. You can post whatever patches you like a million times to lkml. That's not the problem. It's that the patches don't get reviewed, posting them more or to a different place doesn't help that. And you add patches before they are properly reviewed. Look at all of the postings people have made this merge window about x86 merge fallout(s). This is exactly the chief gripe of everyone, that you merge patches with little or no review into your tree. People paying attention can see what your goal is. You've expressed in the past that you think we're not merging patches in fast enough, which is quite a laugh, and now that you can do x86 maintainence you're going to ramp things up and show everyone "how it's done." And at what a great expense this experiment is being performed. Just ask anyone who has tried to get any work done this past week or so. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:31 ` David Miller @ 2008-04-28 0:31 ` Rik van Riel 2008-04-28 0:45 ` Al Viro 2008-04-28 9:01 ` Alan Cox 1 sibling, 1 reply; 51+ messages in thread From: Rik van Riel @ 2008-04-28 0:31 UTC (permalink / raw) To: David Miller; +Cc: mingo, James.Bottomley, tglx, linux-kernel, hpa On Sun, 27 Apr 2008 16:31:06 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > You can post whatever patches you like a million times to lkml. > That's not the problem. > > It's that the patches don't get reviewed, posting them more or to a > different place doesn't help that. If you really want to enforce this, I bet it could be automated with scripts around git. Simply refuse to apply a patch that does not have at least two Signed-off-by/Reviewed-by/Acked-by lines and refuse to apply a "git pull" if there is a changeset like that in the tree. Of course, this could end up screwing rare architectures like Voyager, so I'm not convinced it is a good idea... -- All rights reversed. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 0:31 ` Rik van Riel @ 2008-04-28 0:45 ` Al Viro 2008-04-28 0:52 ` H. Peter Anvin 0 siblings, 1 reply; 51+ messages in thread From: Al Viro @ 2008-04-28 0:45 UTC (permalink / raw) To: Rik van Riel Cc: David Miller, mingo, James.Bottomley, tglx, linux-kernel, hpa On Sun, Apr 27, 2008 at 08:31:27PM -0400, Rik van Riel wrote: > On Sun, 27 Apr 2008 16:31:06 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > > > You can post whatever patches you like a million times to lkml. > > That's not the problem. > > > > It's that the patches don't get reviewed, posting them more or to a > > different place doesn't help that. > > If you really want to enforce this, I bet it could be automated > with scripts around git. > > Simply refuse to apply a patch that does not have at least two > Signed-off-by/Reviewed-by/Acked-by lines and refuse to apply > a "git pull" if there is a changeset like that in the tree. Yeah, right. commit 138fe4e069798d9aa948a5402ff15e58f483ee4e Author: Konrad Rzeszutek <ketuzsezr@darnok.org> Date: Wed Apr 9 19:50:41 2008 -0700 Firmware: add iSCSI iBFT Support ... [akpm@linux-foundation.org: fix build] Signed-off-by: Konrad Rzeszutek <konradr@linux.vnet.ibm.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Peter Jones <pjones@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> ... +static ssize_t sprintf_ipaddr(char *buf, u8 *ip) +{ ... + str += sprintf(str, NIP6_FMT, ntohs(ip[0]), ntohs(ip[1]), + ntohs(ip[2]), ntohs(ip[3]), ntohs(ip[4]), + ntohs(ip[5]), ntohs(ip[6]), ntohs(ip[7])); are you going to tell me that this had been reviewed? Note these ntohs() applied to 8bit values in there. Two signed-off-by, including Greg "two s-o-b are enough to guarantee review"... ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 0:45 ` Al Viro @ 2008-04-28 0:52 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-28 0:52 UTC (permalink / raw) To: Al Viro Cc: Rik van Riel, David Miller, mingo, James.Bottomley, tglx, linux-kernel Al Viro wrote: > are you going to tell me that this had been reviewed? Note these > ntohs() applied to 8bit values in there. Two signed-off-by, including > Greg "two s-o-b are enough to guarantee review"... I thought it was well established that Signed-off-by was about copyright and patch flow, *nothing else*. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:31 ` David Miller 2008-04-28 0:31 ` Rik van Riel @ 2008-04-28 9:01 ` Alan Cox 2008-04-28 9:17 ` David Miller 1 sibling, 1 reply; 51+ messages in thread From: Alan Cox @ 2008-04-28 9:01 UTC (permalink / raw) To: David Miller; +Cc: mingo, James.Bottomley, tglx, linux-kernel, hpa > You can post whatever patches you like a million times to lkml. > That's not the problem. > > It's that the patches don't get reviewed, posting them more or to a > different place doesn't help that. So review them. Your comments strike me as the pot calling the kettle black given the way the network people like to live on their own mailing list. Sorting x86 arch code is inevitably going to break a few eggs, but I suspect the time cost has been more in Dave v Ingo (12 rounds, two falls, two submissions or a knockout) than actually sorting out the fallout of a couple of problem cases. Alan ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 9:01 ` Alan Cox @ 2008-04-28 9:17 ` David Miller 2008-04-28 9:48 ` Adrian Bunk 2008-04-28 11:50 ` Ingo Molnar 0 siblings, 2 replies; 51+ messages in thread From: David Miller @ 2008-04-28 9:17 UTC (permalink / raw) To: alan; +Cc: mingo, James.Bottomley, tglx, linux-kernel, hpa From: Alan Cox <alan@lxorguk.ukuu.org.uk> Date: Mon, 28 Apr 2008 10:01:54 +0100 > > You can post whatever patches you like a million times to lkml. > > That's not the problem. > > > > It's that the patches don't get reviewed, posting them more or to a > > different place doesn't help that. > > So review them. Your comments strike me as the pot calling the kettle > black given the way the network people like to live on their own mailing > list. Oh contraire. Because we networking folks use a seperate mailing list with a lower signal to noise ratio than lkml, and as a result more specialization, more patches get more review by more specialists. It's the point I'm trying to make every time the "post everything to lkml" argument gets fronted. Telling me to review all of this crud just ignores the problem. And the implication is that it's OK for unreviewed patches to go into the tree, and it's most certainly not. You might want to know that linux-next mainly exists because of how much of this has been going on over the past half year or so. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 9:17 ` David Miller @ 2008-04-28 9:48 ` Adrian Bunk 2008-04-28 11:50 ` Ingo Molnar 1 sibling, 0 replies; 51+ messages in thread From: Adrian Bunk @ 2008-04-28 9:48 UTC (permalink / raw) To: David Miller; +Cc: alan, mingo, James.Bottomley, tglx, linux-kernel, hpa On Mon, Apr 28, 2008 at 02:17:45AM -0700, David Miller wrote: > From: Alan Cox <alan@lxorguk.ukuu.org.uk> > Date: Mon, 28 Apr 2008 10:01:54 +0100 > > > > You can post whatever patches you like a million times to lkml. > > > That's not the problem. > > > > > > It's that the patches don't get reviewed, posting them more or to a > > > different place doesn't help that. > > > > So review them. Your comments strike me as the pot calling the kettle > > black given the way the network people like to live on their own mailing > > list. > > Oh contraire. Because we networking folks use a seperate mailing list > with a lower signal to noise ratio than lkml, and as a result more > specialization, more patches get more review by more specialists. > > It's the point I'm trying to make every time the "post everything to > lkml" argument gets fronted. >... I'd say the best solution is to do both. Many people active only in some area want to read only stuff affecting their area and not linux-kernel that averaged at 525 emails per day in February. But when I have time I skim over patches on linux-kernel, and when I see in the diffstat that a Makefile or Kconfig file gets touched I check whether I can spot any bugs there (and other people sometimes also spot bugs in patches for areas they aren't active in). And starting regression tracking was only possible since most bug reports went to linux-kernel, not only to a gazillion different lists. That's why it might make sense to have everything on both specialized lists and linux-kernel. If it is wanted to reduce the volume on linux-kernel, offloading the sending of patches to some new kernel-patches mailing list might be an option. OTOH, patches and the discussion of patches seems to be the signal on linux-kernel, not the noise.. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 9:17 ` David Miller 2008-04-28 9:48 ` Adrian Bunk @ 2008-04-28 11:50 ` Ingo Molnar 1 sibling, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2008-04-28 11:50 UTC (permalink / raw) To: David Miller; +Cc: alan, James.Bottomley, tglx, linux-kernel, hpa * David Miller <davem@davemloft.net> wrote: > > So review them. Your comments strike me as the pot calling the > > kettle black given the way the network people like to live on their > > own mailing list. > > Oh contraire. Because we networking folks use a seperate mailing list > with a lower signal to noise ratio than lkml, and as a result more > specialization, more patches get more review by more specialists. well, then lets go back to the very basis of this whole ... box match. :) It was about a broken networking patch that i stumbled upon (unwillingly, via testing), which commit was _not_ posted on netdev and _not_ posted on lkml: http://lkml.org/lkml/2008/4/19/51 I simply hit a (trivial) regression in the networking code, but i simply found no existing discussion of the suspect patch (commit 5e8fbe2a). the development process is an integral part of the source code of this OS and not a private matter of developers or maintainers. It is not a religion and it is not taboo to criticise it, it is a crutial part of our technology. So i will continue to criticise the development process in the future too when i think it has aspects that hurts us. [and will try to address all incoming criticism as well.] > You might want to know that linux-next mainly exists because of how > much of this has been going on over the past half year or so. the problem is that linux-next alone would not have helped much in this specific matter. For example the softlockup warnings annoyance you reported would have triggered immediately had you booted your Sparc64 box with linux-next or -mm even just _once_ :-) But the same holds for me: had i ran linux-next i could have reported some of the networking regressions sooner. So how about making mutual use of linux-next and 'promise' to each other to at least minimally build/boot the integrated tree, or at least promise to not complain too loudly about bugs that could have been found and reported there via reasonable mutual testing of linux-next? Does that sound reasonable? Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:18 ` Ingo Molnar 2008-04-27 23:31 ` David Miller @ 2008-04-28 6:10 ` Christoph Hellwig 2008-04-28 16:55 ` H. Peter Anvin 1 sibling, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2008-04-28 6:10 UTC (permalink / raw) To: Ingo Molnar Cc: James Bottomley, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller On Mon, Apr 28, 2008 at 01:18:09AM +0200, Ingo Molnar wrote: > ... but the PAT patchset was all around lkml. The ioremap change was > trivial, went into v2.6.25 and was discussed on lkml: > > http://lkml.org/lkml/2008/1/25/287 lkml is a hell-hole with a signal/noise ratio worse than slashdot. Any chance you could create a linux-x86 list for x86 specific changes so that a focussed review and discussion can happen? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 6:10 ` Christoph Hellwig @ 2008-04-28 16:55 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-28 16:55 UTC (permalink / raw) To: Christoph Hellwig Cc: Ingo Molnar, James Bottomley, Thomas Gleixner, linux-kernel, David S. Miller Christoph Hellwig wrote: > On Mon, Apr 28, 2008 at 01:18:09AM +0200, Ingo Molnar wrote: >> ... but the PAT patchset was all around lkml. The ioremap change was >> trivial, went into v2.6.25 and was discussed on lkml: >> >> http://lkml.org/lkml/2008/1/25/287 > > lkml is a hell-hole with a signal/noise ratio worse than slashdot. Any > chance you could create a linux-x86 list for x86 specific changes so > that a focussed review and discussion can happen? We have been talking about it between us. Originally, Ingo was opposed to it exactly *because* he wanted to keep the development as open as possible. After the merge window craziness is over, we may want to revisit this. I'm not saying we're doing everything right at the moment; heck, we're still figuring out how to work this on three people. However, I want to set the record straight - Ingo *especially* has been concerned to keep development open from the very beginning. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 21:48 ` [patch] x86, voyager: fix ioremap_nocache() Ingo Molnar 2008-04-27 22:05 ` James Bottomley @ 2008-04-27 22:34 ` James Bottomley 2008-04-27 22:39 ` Jeff Garzik 1 sibling, 1 reply; 51+ messages in thread From: James Bottomley @ 2008-04-27 22:34 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller Here's another piece of the x86 API that's designed to be cached. The dma_declare_coherent_memory() usually represents behind bridge memory that's fully participatory in the coherence model. Making it uncached damages the utility of this memory because doing cacheline sized burst cycles when needed to it is far faster than individual byte/word/quad writes. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> --- diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index 388b113..df83ffd 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr, /* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */ - mem_base = ioremap(bus_addr, size); + mem_base = ioremap_cache(bus_addr, size); if (!mem_base) goto out; ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:34 ` James Bottomley @ 2008-04-27 22:39 ` Jeff Garzik 2008-04-27 22:44 ` H. Peter Anvin ` (3 more replies) 0 siblings, 4 replies; 51+ messages in thread From: Jeff Garzik @ 2008-04-27 22:39 UTC (permalink / raw) To: James Bottomley Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller, Linus Torvalds James Bottomley wrote: > Here's another piece of the x86 API that's designed to be cached. The > dma_declare_coherent_memory() usually represents behind bridge memory > that's fully participatory in the coherence model. > > Making it uncached damages the utility of this memory because doing > cacheline sized burst cycles when needed to it is far faster than > individual byte/word/quad writes. > > Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> > > --- > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > index 388b113..df83ffd 100644 > --- a/arch/x86/kernel/pci-dma.c > +++ b/arch/x86/kernel/pci-dma.c > @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr, > > /* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */ > > - mem_base = ioremap(bus_addr, size); > + mem_base = ioremap_cache(bus_addr, size); > if (!mem_base) > goto out; Surely the change in semantics will find other places to creep out like this? I would rather change drivers to use ioremap_nocache(), and leave the API as-is. Isn't there Yet More Breakage in lib/iomap.c, given these new semantics? if (flags & IORESOURCE_MEM) { if (flags & IORESOURCE_CACHEABLE) return ioremap(start, len); return ioremap_nocache(start, len); } Any driver using pci_iomap() (libata, and others) is affected. I disagree with this semantics change. A number of code places _and drivers_ GET IT RIGHT, and these are all broken now? Jeff ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:39 ` Jeff Garzik @ 2008-04-27 22:44 ` H. Peter Anvin 2008-04-27 22:46 ` David Miller ` (2 subsequent siblings) 3 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:44 UTC (permalink / raw) To: Jeff Garzik Cc: James Bottomley, Ingo Molnar, Thomas Gleixner, linux-kernel, David S. Miller, Linus Torvalds Jeff Garzik wrote: > > Isn't there Yet More Breakage in lib/iomap.c, given these new semantics? > > if (flags & IORESOURCE_MEM) { > if (flags & IORESOURCE_CACHEABLE) > return ioremap(start, len); > return ioremap_nocache(start, len); > } > > Any driver using pci_iomap() (libata, and others) is affected. > > I disagree with this semantics change. A number of code places _and > drivers_ GET IT RIGHT, and these are all broken now? > They don't, though. They rely on the MTRRs to get it right for them, and the MTRRs will say "uncached" in nearly all cases. Consider the case above: you would have gotten the same result *regardless*, because the MTRRs would not have made the memory cacheable. Yes, it's broken, but it was broken before as well. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:39 ` Jeff Garzik 2008-04-27 22:44 ` H. Peter Anvin @ 2008-04-27 22:46 ` David Miller 2008-04-27 22:52 ` H. Peter Anvin ` (2 more replies) 2008-04-27 23:01 ` Thomas Gleixner 2008-04-28 14:10 ` Arjan van de Ven 3 siblings, 3 replies; 51+ messages in thread From: David Miller @ 2008-04-27 22:46 UTC (permalink / raw) To: jeff; +Cc: James.Bottomley, mingo, tglx, linux-kernel, hpa, torvalds From: Jeff Garzik <jeff@garzik.org> Date: Sun, 27 Apr 2008 18:39:24 -0400 > I disagree with this semantics change. A number of code places _and > drivers_ GET IT RIGHT, and these are all broken now? [ Note, James's patch that you quoted is about mapping DMA memory, in dma_declare_coherent_memory(), rather than devices. But I know what you are trying to talk about Jeff. :-) ] Wrt. ioremap() semanics, it is important to realize that if the implementation of this on x86 has been giving non-cached I/O mappings out up until recently, you can expect that there are hundreds of drivers that might now be broken. That's the sad fact of the ubiquity of x86, and it doesn't matter how we defined the API is some document. Anyways, my point is that this angle should be strongly considered in any discussion about ioremap() behavior. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:46 ` David Miller @ 2008-04-27 22:52 ` H. Peter Anvin 2008-04-27 22:58 ` David Miller 2008-04-27 23:34 ` Jeff Garzik 2008-04-27 22:53 ` Jeff Garzik 2008-04-27 23:01 ` Arjan van de Ven 2 siblings, 2 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:52 UTC (permalink / raw) To: David Miller; +Cc: jeff, James.Bottomley, mingo, tglx, linux-kernel, torvalds David Miller wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Sun, 27 Apr 2008 18:39:24 -0400 > >> I disagree with this semantics change. A number of code places _and >> drivers_ GET IT RIGHT, and these are all broken now? > > [ Note, James's patch that you quoted is about mapping DMA > memory, in dma_declare_coherent_memory(), rather than devices. > But I know what you are trying to talk about Jeff. :-) ] > > Wrt. ioremap() semanics, it is important to realize that if > the implementation of this on x86 has been giving non-cached > I/O mappings out up until recently, you can expect that there > are hundreds of drivers that might now be broken. > > That's the sad fact of the ubiquity of x86, and it doesn't matter how > we defined the API is some document. > > Anyways, my point is that this angle should be strongly considered in > any discussion about ioremap() behavior. > There are pretty much three reasons to default to uncached, as far as I understand: 1. De facto historical practice (follow the MTRRs); 2. Conservatism (failure mode less drastic); 3. Compatibility with other architectures. Arguably, the right thing is to not even have ioremap() anymore, just ioremap_{cache,nocache,wc} and consider any unconverted ioremap() as a flag to audit that particular piece of code. I don't think that can be done "instantly", though, so defaulting ioremap() to uncached (as it apparently has been on other arches already?) is the conservative option in the meantime. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:52 ` H. Peter Anvin @ 2008-04-27 22:58 ` David Miller 2008-04-27 23:04 ` H. Peter Anvin 2008-04-27 23:34 ` Jeff Garzik 1 sibling, 1 reply; 51+ messages in thread From: David Miller @ 2008-04-27 22:58 UTC (permalink / raw) To: hpa; +Cc: jeff, James.Bottomley, mingo, tglx, linux-kernel, torvalds From: "H. Peter Anvin" <hpa@zytor.com> Date: Sun, 27 Apr 2008 15:52:02 -0700 > Arguably, the right thing is to not even have ioremap() anymore, just > ioremap_{cache,nocache,wc} and consider any unconverted ioremap() as a > flag to audit that particular piece of code. > > I don't think that can be done "instantly", though, so defaulting > ioremap() to uncached (as it apparently has been on other arches > already?) is the conservative option in the meantime. I think this is a reasonable course of action. That leaves one of Jeff's concerns, what to do with pci_iomap(). That was designed to give mappings with caching enabled, and as a result we probably should make it behave that way. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:58 ` David Miller @ 2008-04-27 23:04 ` H. Peter Anvin 2008-04-30 20:35 ` Eric W. Biederman 0 siblings, 1 reply; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 23:04 UTC (permalink / raw) To: David Miller; +Cc: jeff, James.Bottomley, mingo, tglx, linux-kernel, torvalds David Miller wrote: > I think this is a reasonable course of action. > > That leaves one of Jeff's concerns, what to do with pci_iomap(). That > was designed to give mappings with caching enabled, and as a result we > probably should make it behave that way. Yes, it should. Just be clear that *that* is a semantic change over what the code currently does; it would appear that that is what the code is *trying* to do. I believe on x86 it will still get clamped by the MTRRs, at least initially (I don't think we have flipped the default MTRR type to WB yet) but that would immediately change the behaviour on non-x86 architectures. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:04 ` H. Peter Anvin @ 2008-04-30 20:35 ` Eric W. Biederman 0 siblings, 0 replies; 51+ messages in thread From: Eric W. Biederman @ 2008-04-30 20:35 UTC (permalink / raw) To: H. Peter Anvin Cc: David Miller, jeff, James.Bottomley, mingo, tglx, linux-kernel, torvalds "H. Peter Anvin" <hpa@zytor.com> writes: > David Miller wrote: >> I think this is a reasonable course of action. >> >> That leaves one of Jeff's concerns, what to do with pci_iomap(). That >> was designed to give mappings with caching enabled, and as a result we >> probably should make it behave that way. > > Yes, it should. Just be clear that *that* is a semantic change over what the > code currently does; it would appear that that is what the code is *trying* to > do. > > I believe on x86 it will still get clamped by the MTRRs, at least initially (I > don't think we have flipped the default MTRR type to WB yet) but that would > immediately change the behaviour on non-x86 architectures. Flipping the default MTRR type to WB has the potential to break an SMM monitor which does I/O and isn't using our page tables. So we need to be very careful about that direction. Finally I remember what the specific and really scary failure mode with SMM and mucking with the MTRRs is. There don't appear to be many bars that set IORESOURCE_CACHABLE for use with pci_iomap so it doesn't appear to be much of an issue. In addition it looks like we should map IORESOURCE_PREFETCH to ioremap_wc in pci_iomap, for those places that can take advantage of it. It might not always work (hardware is like that) but semantically pci prefetch is the same as the WC attribute (reads and writes have no side effects). I believe on x86 if we want to get cacheline sized reads (read prefetch) there is special instruction in the latest processors we need to use on a wc area. Eric ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:52 ` H. Peter Anvin 2008-04-27 22:58 ` David Miller @ 2008-04-27 23:34 ` Jeff Garzik 2008-04-27 23:39 ` H. Peter Anvin 1 sibling, 1 reply; 51+ messages in thread From: Jeff Garzik @ 2008-04-27 23:34 UTC (permalink / raw) To: H. Peter Anvin Cc: David Miller, James.Bottomley, mingo, tglx, linux-kernel, torvalds H. Peter Anvin wrote: > Arguably, the right thing is to not even have ioremap() anymore, just > ioremap_{cache,nocache,wc} and consider any unconverted ioremap() as a > flag to audit that particular piece of code. Seems like a sane plan... last handful might be tough, though. Jeff ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:34 ` Jeff Garzik @ 2008-04-27 23:39 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 23:39 UTC (permalink / raw) To: Jeff Garzik Cc: David Miller, James.Bottomley, mingo, tglx, linux-kernel, torvalds Jeff Garzik wrote: > H. Peter Anvin wrote: >> Arguably, the right thing is to not even have ioremap() anymore, just >> ioremap_{cache,nocache,wc} and consider any unconverted ioremap() as a >> flag to audit that particular piece of code. > > Seems like a sane plan... last handful might be tough, though. No doubt, although the vast majority of all call sites can probably be trivially changed to ioremap_nocache(). -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:46 ` David Miller 2008-04-27 22:52 ` H. Peter Anvin @ 2008-04-27 22:53 ` Jeff Garzik 2008-04-27 22:56 ` H. Peter Anvin 2008-04-27 23:01 ` Arjan van de Ven 2 siblings, 1 reply; 51+ messages in thread From: Jeff Garzik @ 2008-04-27 22:53 UTC (permalink / raw) To: David Miller; +Cc: James.Bottomley, mingo, tglx, linux-kernel, hpa, torvalds David Miller wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Sun, 27 Apr 2008 18:39:24 -0400 > >> I disagree with this semantics change. A number of code places _and >> drivers_ GET IT RIGHT, and these are all broken now? > > [ Note, James's patch that you quoted is about mapping DMA > memory, in dma_declare_coherent_memory(), rather than devices. > But I know what you are trying to talk about Jeff. :-) ] > > Wrt. ioremap() semanics, it is important to realize that if > the implementation of this on x86 has been giving non-cached > I/O mappings out up until recently, you can expect that there > are hundreds of drivers that might now be broken. > > That's the sad fact of the ubiquity of x86, and it doesn't matter how > we defined the API is some document. > > Anyways, my point is that this angle should be strongly considered in > any discussion about ioremap() behavior. Understood. I guess I am more annoyed that this stealth semantics change appears to have broken everything that depends on pci_iomap(), including 90%+ of all libata drivers, unless I am missing something. That one piece of code (pci_iomap) was correct under the old semantics, on x86 and elsewhere. It's tested and working nicely, and depended upon by many drivers. Jeff ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:53 ` Jeff Garzik @ 2008-04-27 22:56 ` H. Peter Anvin 2008-04-27 22:59 ` David Miller 2008-04-27 23:02 ` Jeff Garzik 0 siblings, 2 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:56 UTC (permalink / raw) To: Jeff Garzik Cc: David Miller, James.Bottomley, mingo, tglx, linux-kernel, torvalds Jeff Garzik wrote: > > Understood. > > I guess I am more annoyed that this stealth semantics change appears to > have broken everything that depends on pci_iomap(), including 90%+ of > all libata drivers, unless I am missing something. > > That one piece of code (pci_iomap) was correct under the old semantics, > on x86 and elsewhere. It's tested and working nicely, and depended upon > by many drivers. > That one piece of code has had no effective change. Under both the old and the new code, both branches functionally because ioremap_nocache(), in one case because of MTRR and in one case because of PAT. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:56 ` H. Peter Anvin @ 2008-04-27 22:59 ` David Miller 2008-04-27 23:02 ` Jeff Garzik 1 sibling, 0 replies; 51+ messages in thread From: David Miller @ 2008-04-27 22:59 UTC (permalink / raw) To: hpa; +Cc: jeff, James.Bottomley, mingo, tglx, linux-kernel, torvalds From: "H. Peter Anvin" <hpa@zytor.com> Date: Sun, 27 Apr 2008 15:56:58 -0700 > That one piece of code has had no effective change. Under both the old > and the new code, both branches functionally because ioremap_nocache(), > in one case because of MTRR and in one case because of PAT. Ho hum.. forget the suggestion I made in the reply I just sent out then. :-/ ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:56 ` H. Peter Anvin 2008-04-27 22:59 ` David Miller @ 2008-04-27 23:02 ` Jeff Garzik 2008-04-27 23:14 ` Arjan van de Ven 1 sibling, 1 reply; 51+ messages in thread From: Jeff Garzik @ 2008-04-27 23:02 UTC (permalink / raw) To: H. Peter Anvin Cc: David Miller, James.Bottomley, mingo, tglx, linux-kernel, torvalds H. Peter Anvin wrote: > Jeff Garzik wrote: >> >> Understood. >> >> I guess I am more annoyed that this stealth semantics change appears >> to have broken everything that depends on pci_iomap(), including 90%+ >> of all libata drivers, unless I am missing something. >> >> That one piece of code (pci_iomap) was correct under the old >> semantics, on x86 and elsewhere. It's tested and working nicely, and >> depended upon by many drivers. >> > > That one piece of code has had no effective change. Under both the old > and the new code, both branches functionally because ioremap_nocache(), > in one case because of MTRR and in one case because of PAT. OK good... libata uses it for controller registers exclusively, so that should be fine from an operational standpoint. The lack of discussion is still a bit irksome... at the very least I could have pointed out that lib/iomap.c wanted an update, and the 2.5 yo discussion could have resurfaced. Jeff ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:02 ` Jeff Garzik @ 2008-04-27 23:14 ` Arjan van de Ven 0 siblings, 0 replies; 51+ messages in thread From: Arjan van de Ven @ 2008-04-27 23:14 UTC (permalink / raw) To: Jeff Garzik Cc: H. Peter Anvin, David Miller, James.Bottomley, mingo, tglx, linux-kernel, torvalds On Sun, 27 Apr 2008 19:02:05 -0400 Jeff Garzik <jeff@garzik.org> wrote: > H. Peter Anvin wrote: > > Jeff Garzik wrote: > >> > >> Understood. > >> > >> I guess I am more annoyed that this stealth semantics change > >> appears to have broken everything that depends on pci_iomap(), > >> including 90%+ of all libata drivers, unless I am missing > >> something. > >> > >> That one piece of code (pci_iomap) was correct under the old > >> semantics, on x86 and elsewhere. It's tested and working nicely, > >> and depended upon by many drivers. > >> > > > > That one piece of code has had no effective change. Under both the > > old and the new code, both branches functionally because > > ioremap_nocache(), in one case because of MTRR and in one case > > because of PAT. > > OK good... libata uses it for controller registers exclusively, so > that should be fine from an operational standpoint. > > at the very least I > could have pointed out that lib/iomap.c wanted an update, and the 2.5 > yo discussion could have resurfaced. > btw as a technical point: ioremap_cached() isn't guaranteed to give you cached memory. In fact, on a PC, if you use ioremap_cached() on PCI bars, you will STILL get uncached access, just because right now, it's not possible to give a cached mapping out! We might want to try to change that in the future (by rewriting the MTRRs), but that would need a LOT of auditing of ioremap (ab)users to make sure they actually mean "cached" when they ask for cached (pci_iomap... I'm not convinced it's users really can deal with cached) and it also will take a LOT of testing. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:46 ` David Miller 2008-04-27 22:52 ` H. Peter Anvin 2008-04-27 22:53 ` Jeff Garzik @ 2008-04-27 23:01 ` Arjan van de Ven 2008-04-30 21:44 ` James Bottomley 2 siblings, 1 reply; 51+ messages in thread From: Arjan van de Ven @ 2008-04-27 23:01 UTC (permalink / raw) To: David Miller Cc: jeff, James.Bottomley, mingo, tglx, linux-kernel, hpa, torvalds On Sun, 27 Apr 2008 15:46:20 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Sun, 27 Apr 2008 18:39:24 -0400 > > > I disagree with this semantics change. A number of code places > > _and drivers_ GET IT RIGHT, and these are all broken now? > > [ Note, James's patch that you quoted is about mapping DMA > memory, in dma_declare_coherent_memory(), rather than devices. > But I know what you are trying to talk about Jeff. :-) ] > > Wrt. ioremap() semanics, it is important to realize that if > the implementation of this on x86 has been giving non-cached > I/O mappings out up until recently, you can expect that there > are hundreds of drivers that might now be broken. it's even worse than that. 99% of the time it gave out uncached memory, but if the bios was iffy, it would suddenly give out something else. The changes to ioremap() recently turned that into "100% uncached". For me, that's the right (because safe) behavior, exactly for the reason you mentioned: it's what happened in practice, and driver writers would assume that to happen. Anything except uncached would just be insanity as default. (well the old "whatever mood the bios writer was in, usually uncached" is effectively uncached except for weird behaving boxes) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 23:01 ` Arjan van de Ven @ 2008-04-30 21:44 ` James Bottomley 2008-04-30 22:39 ` H. Peter Anvin 0 siblings, 1 reply; 51+ messages in thread From: James Bottomley @ 2008-04-30 21:44 UTC (permalink / raw) To: Arjan van de Ven Cc: David Miller, jeff, mingo, tglx, linux-kernel, hpa, torvalds On Sun, 2008-04-27 at 16:01 -0700, Arjan van de Ven wrote: > On Sun, 27 Apr 2008 15:46:20 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > > > From: Jeff Garzik <jeff@garzik.org> > > Date: Sun, 27 Apr 2008 18:39:24 -0400 > > > > > I disagree with this semantics change. A number of code places > > > _and drivers_ GET IT RIGHT, and these are all broken now? > > > > [ Note, James's patch that you quoted is about mapping DMA > > memory, in dma_declare_coherent_memory(), rather than devices. > > But I know what you are trying to talk about Jeff. :-) ] > > > > Wrt. ioremap() semanics, it is important to realize that if > > the implementation of this on x86 has been giving non-cached > > I/O mappings out up until recently, you can expect that there > > are hundreds of drivers that might now be broken. > > it's even worse than that. > 99% of the time it gave out uncached memory, but if the bios was iffy, > it would suddenly give out something else. > > The changes to ioremap() recently turned that into "100% uncached". > For me, that's the right (because safe) behavior, exactly for the > reason you mentioned: it's what happened in practice, and driver writers > would assume that to happen. > > Anything except uncached would just be insanity as default. > (well the old "whatever mood the bios writer was in, usually uncached" is > effectively uncached except for weird behaving boxes) OK, but look; this is why you broke me. The former meaning of ioremap() was give me whatever caching the mtrrs set up (assuming you actually have mtrrs. For voyager, we don't so it always meant give me cached memory). The change broke the QIC ioremap area because it absolutely *requires* cached memory. It drops performance on the Q720 SCSI cards because caching was used essentially like write combining to reduce the unbursted traffic across the MCA bus. When the Voyagers were designed, the only way to get them better performace with the slow bus technology was to do a massive amount of caching, so they're optimised to the point where every element on the voyager bus (sort of like the intel frontside bus) is a primary participant in the caching algorithm, and this includes the MCA controllers. So for me, I want every invocation of ioremap to mean ioremap_cached() otherwise I don't benefit from the added caches. I can fully understand that bus technology has got worse in terms of caching since the voyager heydays. However, I'd rather not be penalised for everyone else's hardware failings. Since the old ioremap is only used in the IORESOURCE_CACHABLE case for PCI, and since that used to defer to the mtrr settings anyway, it probably does make sense to keep it as ioremap_cache() as long as that still defers to the mtrr settings. I suppose I could think about an ioremap platform override for voyager since, by and large, ioremap_nocache() is the wrong thing on the platform. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-30 21:44 ` James Bottomley @ 2008-04-30 22:39 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-30 22:39 UTC (permalink / raw) To: James Bottomley Cc: Arjan van de Ven, David Miller, jeff, mingo, tglx, linux-kernel, torvalds James Bottomley wrote: > > OK, but look; this is why you broke me. > > The former meaning of ioremap() was give me whatever caching the mtrrs > set up (assuming you actually have mtrrs. For voyager, we don't so it > always meant give me cached memory). > Pre-686 generations of x86 had chipset logic (KEN logic) that did the equivalent. The only change in the 686 was moving this logic into the CPU itself. So no, it didn't always mean "give me cached memory". > The change broke the QIC ioremap area because it absolutely *requires* > cached memory. It drops performance on the Q720 SCSI cards because > caching was used essentially like write combining to reduce the > unbursted traffic across the MCA bus. > > When the Voyagers were designed, the only way to get them better > performace with the slow bus technology was to do a massive amount of > caching, so they're optimised to the point where every element on the > voyager bus (sort of like the intel frontside bus) is a primary > participant in the caching algorithm, and this includes the MCA > controllers. So for me, I want every invocation of ioremap to mean > ioremap_cached() otherwise I don't benefit from the added caches. > > I can fully understand that bus technology has got worse in terms of > caching since the voyager heydays. However, I'd rather not be penalised > for everyone else's hardware failings. Since the old ioremap is only > used in the IORESOURCE_CACHABLE case for PCI, and since that used to > defer to the mtrr settings anyway, it probably does make sense to keep > it as ioremap_cache() as long as that still defers to the mtrr settings. > > I suppose I could think about an ioremap platform override for voyager > since, by and large, ioremap_nocache() is the wrong thing on the > platform. Sounds like it, or you should just do a global replace of ioremap() with ioremap_cache() for the Voyager drivers. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:39 ` Jeff Garzik 2008-04-27 22:44 ` H. Peter Anvin 2008-04-27 22:46 ` David Miller @ 2008-04-27 23:01 ` Thomas Gleixner 2008-04-28 14:10 ` Arjan van de Ven 3 siblings, 0 replies; 51+ messages in thread From: Thomas Gleixner @ 2008-04-27 23:01 UTC (permalink / raw) To: Jeff Garzik Cc: James Bottomley, Ingo Molnar, linux-kernel, H. Peter Anvin, David S. Miller, Linus Torvalds On Sun, 27 Apr 2008, Jeff Garzik wrote: > > Isn't there Yet More Breakage in lib/iomap.c, given these new semantics? > > if (flags & IORESOURCE_MEM) { > if (flags & IORESOURCE_CACHEABLE) > return ioremap(start, len); > return ioremap_nocache(start, len); > } That's broken anyway as we are inconsistent across architectures. ioremap defaults to uncached at least on: POWERPC, ARM, MIPS, SH, FRV, PARISC, H8300, M68K, BLACKFIN, M68KNOMMU So changing the above to ioremap_cache() would provide the desired results across architectures. Thanks, tglx ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-27 22:39 ` Jeff Garzik ` (2 preceding siblings ...) 2008-04-27 23:01 ` Thomas Gleixner @ 2008-04-28 14:10 ` Arjan van de Ven 2008-04-28 14:29 ` James Bottomley 3 siblings, 1 reply; 51+ messages in thread From: Arjan van de Ven @ 2008-04-28 14:10 UTC (permalink / raw) To: Jeff Garzik Cc: James Bottomley, Ingo Molnar, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller, Linus Torvalds On Sun, 27 Apr 2008 18:39:24 -0400 Jeff Garzik <jeff@garzik.org> wrote: > James Bottomley wrote: > > Here's another piece of the x86 API that's designed to be cached. > > The dma_declare_coherent_memory() usually represents behind bridge > > memory that's fully participatory in the coherence model. > > > > Making it uncached damages the utility of this memory because doing > > cacheline sized burst cycles when needed to it is far faster than > > individual byte/word/quad writes. > > > > Signed-off-by: James Bottomley > > <James.Bottomley@HansenPartnership.com> > > > > --- > > > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > > index 388b113..df83ffd 100644 > > --- a/arch/x86/kernel/pci-dma.c > > +++ b/arch/x86/kernel/pci-dma.c > > @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct device > > *dev, dma_addr_t bus_addr, > > /* FIXME: this routine just ignores > > DMA_MEMORY_INCLUDES_CHILDREN */ > > - mem_base = ioremap(bus_addr, size); > > + mem_base = ioremap_cache(bus_addr, size); > > if (!mem_base) > > goto out; this patch patch is likely broken on x86; or rather, anyone who uses it is... thinking you can find cache coherent memory on a PCI or similar bus that is actually cachable... keep dreaming. (for now; there's talk about extending PCI) > I would rather change drivers to use ioremap_nocache(), and leave the > API as-is. > > Isn't there Yet More Breakage in lib/iomap.c, given these new > semantics? > > if (flags & IORESOURCE_MEM) { > if (flags & IORESOURCE_CACHEABLE) > return ioremap(start, len); > return ioremap_nocache(start, len); > } > > Any driver using pci_iomap() (libata, and others) is affected. only if you map ROM's. Anything but ROMs you cannot set IORESOURCE_CACHEABLE on... since PCI MMIO memory isn't cache coherent per se. (it's cache coherent on x86 by virtue of being uncachable ;-) Thankfully Linux doesn't do that. > I disagree with this semantics change. A number of code places _and > drivers_ GET IT RIGHT, and these are all broken now? Can you list one that gets it actually right ? (cachable pretty much means: "the cpu is the only one changing it AND there is no side effect of reading or writing") -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 14:10 ` Arjan van de Ven @ 2008-04-28 14:29 ` James Bottomley 2008-04-28 15:07 ` Arjan van de Ven 0 siblings, 1 reply; 51+ messages in thread From: James Bottomley @ 2008-04-28 14:29 UTC (permalink / raw) To: Arjan van de Ven Cc: Jeff Garzik, Ingo Molnar, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller, Linus Torvalds On Mon, 2008-04-28 at 07:10 -0700, Arjan van de Ven wrote: > On Sun, 27 Apr 2008 18:39:24 -0400 > Jeff Garzik <jeff@garzik.org> wrote: > > > James Bottomley wrote: > > > Here's another piece of the x86 API that's designed to be cached. > > > The dma_declare_coherent_memory() usually represents behind bridge > > > memory that's fully participatory in the coherence model. > > > > > > Making it uncached damages the utility of this memory because doing > > > cacheline sized burst cycles when needed to it is far faster than > > > individual byte/word/quad writes. > > > > > > Signed-off-by: James Bottomley > > > <James.Bottomley@HansenPartnership.com> > > > > > > --- > > > > > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > > > index 388b113..df83ffd 100644 > > > --- a/arch/x86/kernel/pci-dma.c > > > +++ b/arch/x86/kernel/pci-dma.c > > > @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct device > > > *dev, dma_addr_t bus_addr, > > > /* FIXME: this routine just ignores > > > DMA_MEMORY_INCLUDES_CHILDREN */ > > > - mem_base = ioremap(bus_addr, size); > > > + mem_base = ioremap_cache(bus_addr, size); > > > if (!mem_base) > > > goto out; > > this patch patch is likely broken on x86; or rather, anyone who uses it is... > thinking you can find cache coherent memory on a PCI or similar bus that is actually > cachable... keep dreaming. (for now; there's talk about extending PCI) No ... it works for me, and caching is a performance advantage for me too. The only current consumer of this API is the NCR_Q720 SCSI card which keeps a bunch of cacheable memory remote across the MCA bus. If you think about it logically, most busses are second citizens in the caching hierarchy: they really only get to force a flush and invalidate of the CPU cache line rather than being fully participatory in the coherence protocol. However, even being second class is enough of a speed up on slow busses because it allows bursting of the cache line for the bus transfers. The other consumers are SoC embedded ... so yes, perhaps I should ask about this on linux-arch. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 14:29 ` James Bottomley @ 2008-04-28 15:07 ` Arjan van de Ven 2008-04-28 19:59 ` H. Peter Anvin 0 siblings, 1 reply; 51+ messages in thread From: Arjan van de Ven @ 2008-04-28 15:07 UTC (permalink / raw) To: James Bottomley Cc: Jeff Garzik, Ingo Molnar, Thomas Gleixner, linux-kernel, H. Peter Anvin, David S. Miller, Linus Torvalds On Mon, 28 Apr 2008 10:29:08 -0400 James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > On Mon, 2008-04-28 at 07:10 -0700, Arjan van de Ven wrote: > > On Sun, 27 Apr 2008 18:39:24 -0400 > > Jeff Garzik <jeff@garzik.org> wrote: > > > > > James Bottomley wrote: > > > > Here's another piece of the x86 API that's designed to be > > > > cached. The dma_declare_coherent_memory() usually represents > > > > behind bridge memory that's fully participatory in the > > > > coherence model. > > > > > > > > Making it uncached damages the utility of this memory because > > > > doing cacheline sized burst cycles when needed to it is far > > > > faster than individual byte/word/quad writes. > > > > > > > > Signed-off-by: James Bottomley > > > > <James.Bottomley@HansenPartnership.com> > > > > > > > > --- > > > > > > > > diff --git a/arch/x86/kernel/pci-dma.c > > > > b/arch/x86/kernel/pci-dma.c index 388b113..df83ffd 100644 > > > > --- a/arch/x86/kernel/pci-dma.c > > > > +++ b/arch/x86/kernel/pci-dma.c > > > > @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct > > > > device *dev, dma_addr_t bus_addr, > > > > /* FIXME: this routine just ignores > > > > DMA_MEMORY_INCLUDES_CHILDREN */ > > > > - mem_base = ioremap(bus_addr, size); > > > > + mem_base = ioremap_cache(bus_addr, size); > > > > if (!mem_base) > > > > goto out; > > > > this patch patch is likely broken on x86; or rather, anyone who > > uses it is... thinking you can find cache coherent memory on a PCI > > or similar bus that is actually cachable... keep dreaming. (for > > now; there's talk about extending PCI) > > No ... it works for me, and caching is a performance advantage for me > too. The only current consumer of this API is the NCR_Q720 SCSI card > which keeps a bunch of cacheable memory remote across the MCA bus. > > If you think about it logically, most busses are second citizens in > the caching hierarchy: they really only get to force a flush and > invalidate of the CPU cache line rather than being fully > participatory in the coherence protocol. Cached means that the cpu, at any time, can do a speculative read to the memory. It also means that the cpu can then write the speculated cacheline back at any time later, if some speculation was going to write to the cacheline but didn't actually happen. (before you think this is bogus, at least AMD cpus do this and I can't vouch for Intel cpus never doing this). If the on-the-bus hardware *ever* writes to the memory without being part of the full cache coherence protocol it's in trouble. Big time. Even if it sends an invalidate first (which PCI and others just don't allow, not sure about MCA though), it's not enough because the cpu can just read it right back... one needs a "take for ownership" not an "invalidate" for this to work, and that means being part of the full protocol. > However, even being second > class is enough of a speed up on slow busses because it allows > bursting of the cache line for the bus transfers. that's write combining not cachability I suspect... at least for writes it is. > > The other consumers are SoC embedded ... so yes, perhaps I should ask > about this on linux-arch. x86 SoC ? (we're talking about an arch/x86 file here) Esp in SoC people don't bother doing cache coherency if they can get away with it. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [patch] x86, voyager: fix ioremap_nocache() 2008-04-28 15:07 ` Arjan van de Ven @ 2008-04-28 19:59 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-28 19:59 UTC (permalink / raw) To: Arjan van de Ven Cc: James Bottomley, Jeff Garzik, Ingo Molnar, Thomas Gleixner, linux-kernel, David S. Miller, Linus Torvalds Arjan van de Ven wrote: > > Cached means that the cpu, at any time, can do a speculative read to the memory. > It also means that the cpu can then write the speculated cacheline back at any time later, > if some speculation was going to write to the cacheline but didn't actually happen. > (before you think this is bogus, at least AMD cpus do this and I can't vouch for Intel > cpus never doing this). > If the on-the-bus hardware *ever* writes to the memory without being part of the full > cache coherence protocol it's in trouble. Big time. > Even if it sends an invalidate first (which PCI and others just don't allow, not sure about MCA though), > it's not enough because the cpu can just read it right back... one needs a "take for ownership" not > an "invalidate" for this to work, and that means being part of the full protocol. > Although realistically speaking, the OS *can* generally know that the only transactor is the CPU -- in fact, that will be the normal condition. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 20:51 Breakage caused by unreviewed patch in x86 tree James Bottomley 2008-04-27 20:53 ` David Miller 2008-04-27 21:48 ` [patch] x86, voyager: fix ioremap_nocache() Ingo Molnar @ 2008-04-27 22:00 ` H. Peter Anvin 2008-04-27 22:10 ` James Bottomley 2008-04-27 22:58 ` Arjan van de Ven 3 siblings, 1 reply; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:00 UTC (permalink / raw) To: James Bottomley; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel James Bottomley wrote: > I might add that the intel SAPIC functions > in roughly the same manner, so this might break more than just voyager. Are you referring to the IA64 SAPIC here, or something else? The only mention of SAPIC in the x86 tree appear to be naming of fields in ACPI tables. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:00 ` Breakage caused by unreviewed patch in x86 tree H. Peter Anvin @ 2008-04-27 22:10 ` James Bottomley 2008-04-27 22:13 ` H. Peter Anvin 0 siblings, 1 reply; 51+ messages in thread From: James Bottomley @ 2008-04-27 22:10 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel On Sun, 2008-04-27 at 15:00 -0700, H. Peter Anvin wrote: > James Bottomley wrote: > > I might add that the intel SAPIC functions > > in roughly the same manner, so this might break more than just voyager. > > Are you referring to the IA64 SAPIC here, or something else? The only > mention of SAPIC in the x86 tree appear to be naming of fields in ACPI > tables. Yes, that's the one ... but I believe a class of the xAPICs also used a similar principle. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:10 ` James Bottomley @ 2008-04-27 22:13 ` H. Peter Anvin 2008-04-27 22:18 ` James Bottomley 0 siblings, 1 reply; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:13 UTC (permalink / raw) To: James Bottomley; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel James Bottomley wrote: > On Sun, 2008-04-27 at 15:00 -0700, H. Peter Anvin wrote: >> James Bottomley wrote: >>> I might add that the intel SAPIC functions >>> in roughly the same manner, so this might break more than just voyager. >> Are you referring to the IA64 SAPIC here, or something else? The only >> mention of SAPIC in the x86 tree appear to be naming of fields in ACPI >> tables. > > Yes, that's the one ... but I believe a class of the xAPICs also used a > similar principle. I certainly have never seen a system on which the APIC has been mapped cacheable. I would be very interested in the details, so if you could elaborate that would be extremely useful. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:13 ` H. Peter Anvin @ 2008-04-27 22:18 ` James Bottomley 2008-04-27 22:31 ` H. Peter Anvin 0 siblings, 1 reply; 51+ messages in thread From: James Bottomley @ 2008-04-27 22:18 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel On Sun, 2008-04-27 at 15:13 -0700, H. Peter Anvin wrote: > James Bottomley wrote: > > On Sun, 2008-04-27 at 15:00 -0700, H. Peter Anvin wrote: > >> James Bottomley wrote: > >>> I might add that the intel SAPIC functions > >>> in roughly the same manner, so this might break more than just voyager. > >> Are you referring to the IA64 SAPIC here, or something else? The only > >> mention of SAPIC in the x86 tree appear to be naming of fields in ACPI > >> tables. > > > > Yes, that's the one ... but I believe a class of the xAPICs also used a > > similar principle. > > I certainly have never seen a system on which the APIC has been mapped > cacheable. I would be very interested in the details, so if you could > elaborate that would be extremely useful. Not really ... I just remember when the SAPIC and later the xAPIC details were published as novel nearly a decade ago, I remember saying that some of the voyager interrupt controllers had been using a similar method for years. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:18 ` James Bottomley @ 2008-04-27 22:31 ` H. Peter Anvin 0 siblings, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 22:31 UTC (permalink / raw) To: James Bottomley; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel James Bottomley wrote: >>> Yes, that's the one ... but I believe a class of the xAPICs also used a >>> similar principle. >> I certainly have never seen a system on which the APIC has been mapped >> cacheable. I would be very interested in the details, so if you could >> elaborate that would be extremely useful. > > Not really ... I just remember when the SAPIC and later the xAPIC > details were published as novel nearly a decade ago, I remember saying > that some of the voyager interrupt controllers had been using a similar > method for years. > Well, I just looked up the xAPIC spec, and it states very clearly: APIC registers are memory-mapped to a 4-KByte region of the processor’s physical address space with an initial starting address of FEE00000H. For correct APIC operation, this address space must be mapped to an area of memory that has been designated as strong uncacheable (UC). See Section 10.3, “Methods of Caching Available.” So any use of cacheline-related bus cycles is generated by the LAPIC and doesn't affect the CPU <-> LAPIC interface. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 20:51 Breakage caused by unreviewed patch in x86 tree James Bottomley ` (2 preceding siblings ...) 2008-04-27 22:00 ` Breakage caused by unreviewed patch in x86 tree H. Peter Anvin @ 2008-04-27 22:58 ` Arjan van de Ven 2008-04-27 23:00 ` David Miller 2008-04-27 23:03 ` James Bottomley 3 siblings, 2 replies; 51+ messages in thread From: Arjan van de Ven @ 2008-04-27 22:58 UTC (permalink / raw) To: James Bottomley; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel On Sun, 27 Apr 2008 16:51:25 -0400 James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > This patch: > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > Author: Ingo Molnar <mingo@elte.hu> > Date: Wed Jan 30 13:33:40 2008 +0100 > > x86: change ioremap() to default to uncached > > As far as I can tell went blindly into the x86 tree without being > shared on any mailing list at all. How can something that completely > alters the semantics of ioremap on x86 platforms go in without any > review. it changed from "whatever coinflip you got" to "predictable outcome". What you got before was uncached (most of the time), or if the bios was creative, write combining. Or if the bios was broken in how it set up MTRR's, you could suddenly get "cached". When you're mapping device memory, uncached is the safe default. With the switch to PAT (and phasing out of MTRR), the kernel needs to pick one of the three (cached, writecombining, uncached) since you can no longer really depend on MTRRs saving your bacon there. Drivers in general, with VERY few exceptions, want uncached. Any other choice would have been deadly... I'd like to ask you which one you would pick... you maintain a whole bunch of drivers as scsi maintainer, what would you have picked? The answer "whatever the MTRR set up" no longer holds ;( > > I might add that the intel > SAPIC functions in roughly the same manner, so this might break more > than just voyager. Apics are uncached... -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:58 ` Arjan van de Ven @ 2008-04-27 23:00 ` David Miller 2008-04-27 23:07 ` Arjan van de Ven 2008-04-27 23:03 ` James Bottomley 1 sibling, 1 reply; 51+ messages in thread From: David Miller @ 2008-04-27 23:00 UTC (permalink / raw) To: arjan; +Cc: James.Bottomley, mingo, tglx, linux-kernel From: Arjan van de Ven <arjan@infradead.org> Date: Sun, 27 Apr 2008 15:58:03 -0700 > On Sun, 27 Apr 2008 16:51:25 -0400 > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > This patch: > > > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > > Author: Ingo Molnar <mingo@elte.hu> > > Date: Wed Jan 30 13:33:40 2008 +0100 > > > > x86: change ioremap() to default to uncached > > > > As far as I can tell went blindly into the x86 tree without being > > shared on any mailing list at all. How can something that completely > > alters the semantics of ioremap on x86 platforms go in without any > > review. > > it changed from "whatever coinflip you got" to "predictable outcome". You're making technical responses to a question about process. The core issue how the change in question was handled. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 23:00 ` David Miller @ 2008-04-27 23:07 ` Arjan van de Ven 0 siblings, 0 replies; 51+ messages in thread From: Arjan van de Ven @ 2008-04-27 23:07 UTC (permalink / raw) To: David Miller; +Cc: James.Bottomley, mingo, tglx, linux-kernel On Sun, 27 Apr 2008 16:00:41 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Arjan van de Ven <arjan@infradead.org> > Date: Sun, 27 Apr 2008 15:58:03 -0700 > > > On Sun, 27 Apr 2008 16:51:25 -0400 > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > > > This patch: > > > > > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > > > Author: Ingo Molnar <mingo@elte.hu> > > > Date: Wed Jan 30 13:33:40 2008 +0100 > > > > > > x86: change ioremap() to default to uncached > > > > > > As far as I can tell went blindly into the x86 tree without being > > > shared on any mailing list at all. How can something that > > > completely alters the semantics of ioremap on x86 platforms go in > > > without any review. > > > > it changed from "whatever coinflip you got" to "predictable > > outcome". > > You're making technical responses to a question about process. ok on a process side it is basically a "go from 99.9% to 100%" bugfix ;( Anyone who claims that ioremap() on x86 was cached before doesn't understand the issue, it was NOT cached. It was "almost always uncached". The bugfix here is to take away the "almost" part. Should that have been posted to lkml more vocally? Wouldn't have hurt. No argument about that. Is it as big a deal as this thread makes it sound? Absolutely not! The change in behavior or even semantics is almost zero. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 22:58 ` Arjan van de Ven 2008-04-27 23:00 ` David Miller @ 2008-04-27 23:03 ` James Bottomley 2008-04-27 23:11 ` Arjan van de Ven 2008-04-27 23:17 ` H. Peter Anvin 1 sibling, 2 replies; 51+ messages in thread From: James Bottomley @ 2008-04-27 23:03 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel On Sun, 2008-04-27 at 15:58 -0700, Arjan van de Ven wrote: > On Sun, 27 Apr 2008 16:51:25 -0400 > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > This patch: > > > > commit 6371b495991debfd1417b17c2bc4f7d7bae05739 > > Author: Ingo Molnar <mingo@elte.hu> > > Date: Wed Jan 30 13:33:40 2008 +0100 > > > > x86: change ioremap() to default to uncached > > > > As far as I can tell went blindly into the x86 tree without being > > shared on any mailing list at all. How can something that completely > > alters the semantics of ioremap on x86 platforms go in without any > > review. > it changed from "whatever coinflip you got" to "predictable outcome". > What you got before was uncached (most of the time), or if the bios was > creative, write combining. Or if the bios was broken in how it set up MTRR's, you could suddenly > get "cached". My major complaint is the lack of review and notice, not the actual patch. > When you're mapping device memory, uncached is the safe default. Well, for certain device mailboxes, uncached does mean less performant. The voyager breakage was exceptional ... I expect other problems just to result in a loss of performance that caching gave by improving the bursting. If we're lucky, the PCI bridge cache might hide a lot of this. > With the switch to PAT (and phasing out of MTRR), the kernel needs to pick one of the three > (cached, writecombining, uncached) since you can no longer really depend on MTRRs saving > your bacon there. > Drivers in general, with VERY few exceptions, want uncached. Any other choice would have been deadly... > > I'd like to ask you which one you would pick... you maintain a whole bunch of drivers as scsi maintainer, what would > you have picked? > The answer "whatever the MTRR set up" no longer holds ;( I wouldn't have picked ... I'd have asked for input. James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 23:03 ` James Bottomley @ 2008-04-27 23:11 ` Arjan van de Ven 2008-04-27 23:17 ` H. Peter Anvin 1 sibling, 0 replies; 51+ messages in thread From: Arjan van de Ven @ 2008-04-27 23:11 UTC (permalink / raw) To: James Bottomley; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel On Sun, 27 Apr 2008 19:03:59 -0400 James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > On Sun, 2008-04-27 at 15:58 -0700, Arjan van de Ven wrote: > > On Sun, 27 Apr 2008 16:51:25 -0400 > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > When you're mapping device memory, uncached is the safe default. > > Well, for certain device mailboxes, uncached does mean less > performant. The voyager breakage was exceptional ... I expect other > problems just to result in a loss of performance that caching gave by > improving the bursting. If we're lucky, the PCI bridge cache might > hide a lot of this. on a PC, NONE of the ioremap()s of PCI stuff were cached before, with the exception of prefetchable ranges, that some bioses set up a write-combining MTRR for. (and write-combining is effectively "uncached but with write buffering") If you look at at what kind of devices have prefetchable ranges, you get two answers: 1) video cards 2) some IB adapters So your argument that this might change PCI stuff is just false. Your argument that this "99.9%->100% uncached" should have been announced/discussed in public, sure. But to make this big a deal out of it? -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Breakage caused by unreviewed patch in x86 tree 2008-04-27 23:03 ` James Bottomley 2008-04-27 23:11 ` Arjan van de Ven @ 2008-04-27 23:17 ` H. Peter Anvin 1 sibling, 0 replies; 51+ messages in thread From: H. Peter Anvin @ 2008-04-27 23:17 UTC (permalink / raw) To: James Bottomley Cc: Arjan van de Ven, Ingo Molnar, Thomas Gleixner, linux-kernel James Bottomley wrote: > My major complaint is the lack of review and notice, not the actual > patch. [...] > I wouldn't have picked ... I'd have asked for input. I can't speak for anyone else, but I definitely have discussed this issue with a large number of people. I can't tell you in which specific fora, because it usually became a very brief conversation since there really is only one plausible answer (*especially* in the light of the precedents set by other architectures.) Ingo's mention of it on LKML in this email: http://lkml.org/lkml/2008/1/25/287 ... is pretty typical of how the conversation usually went. -hpa ^ permalink raw reply [flat|nested] 51+ messages in thread
end of thread, other threads:[~2008-04-30 22:43 UTC | newest] Thread overview: 51+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-04-27 20:51 Breakage caused by unreviewed patch in x86 tree James Bottomley 2008-04-27 20:53 ` David Miller 2008-04-27 21:48 ` [patch] x86, voyager: fix ioremap_nocache() Ingo Molnar 2008-04-27 22:05 ` James Bottomley 2008-04-27 22:36 ` Willy Tarreau 2008-04-27 22:41 ` Ingo Molnar 2008-04-27 23:18 ` Ingo Molnar 2008-04-27 23:31 ` David Miller 2008-04-28 0:31 ` Rik van Riel 2008-04-28 0:45 ` Al Viro 2008-04-28 0:52 ` H. Peter Anvin 2008-04-28 9:01 ` Alan Cox 2008-04-28 9:17 ` David Miller 2008-04-28 9:48 ` Adrian Bunk 2008-04-28 11:50 ` Ingo Molnar 2008-04-28 6:10 ` Christoph Hellwig 2008-04-28 16:55 ` H. Peter Anvin 2008-04-27 22:34 ` James Bottomley 2008-04-27 22:39 ` Jeff Garzik 2008-04-27 22:44 ` H. Peter Anvin 2008-04-27 22:46 ` David Miller 2008-04-27 22:52 ` H. Peter Anvin 2008-04-27 22:58 ` David Miller 2008-04-27 23:04 ` H. Peter Anvin 2008-04-30 20:35 ` Eric W. Biederman 2008-04-27 23:34 ` Jeff Garzik 2008-04-27 23:39 ` H. Peter Anvin 2008-04-27 22:53 ` Jeff Garzik 2008-04-27 22:56 ` H. Peter Anvin 2008-04-27 22:59 ` David Miller 2008-04-27 23:02 ` Jeff Garzik 2008-04-27 23:14 ` Arjan van de Ven 2008-04-27 23:01 ` Arjan van de Ven 2008-04-30 21:44 ` James Bottomley 2008-04-30 22:39 ` H. Peter Anvin 2008-04-27 23:01 ` Thomas Gleixner 2008-04-28 14:10 ` Arjan van de Ven 2008-04-28 14:29 ` James Bottomley 2008-04-28 15:07 ` Arjan van de Ven 2008-04-28 19:59 ` H. Peter Anvin 2008-04-27 22:00 ` Breakage caused by unreviewed patch in x86 tree H. Peter Anvin 2008-04-27 22:10 ` James Bottomley 2008-04-27 22:13 ` H. Peter Anvin 2008-04-27 22:18 ` James Bottomley 2008-04-27 22:31 ` H. Peter Anvin 2008-04-27 22:58 ` Arjan van de Ven 2008-04-27 23:00 ` David Miller 2008-04-27 23:07 ` Arjan van de Ven 2008-04-27 23:03 ` James Bottomley 2008-04-27 23:11 ` Arjan van de Ven 2008-04-27 23:17 ` H. Peter Anvin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).