* msi_free_irqs #2 @ 2007-05-24 16:07 Mike Miller (OS Dev) 2007-05-24 17:27 ` Andrew Morton 0 siblings, 1 reply; 14+ messages in thread From: Mike Miller (OS Dev) @ 2007-05-24 16:07 UTC (permalink / raw) To: linux-kernel, tom.l.nguyen; +Cc: iss_storagedev, akpm, jens.axboe So I guess I found the answer to my own question. msi_free_irqs was apparently added in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So somebody broke a couple of things. The most noticable is cciss hangs after turning on interrupts. The reason for that is the kernel now looks at my array of MSI-X vectors in reverse order. We have 4 ways of generating an interrupt on Smart Array hw. They are: # define DOORBELL_INT 0 # define PERF_MODE_INT 1 # define SIMPLE_MODE_INT 2 # define MEMQ_MODE_INT 3 For INTx these four lines are OR'ed together and run to one interrupt pin. MSI-X breaks this hardware OR'ing so we must register either all 4 or at the least the correct interrupt. When I first submitted the MSI/X support I was registering all 4. Someone changed that to only register a single MSI-X vector. That worked fine until 2.6.22-something. Now it appears that the kernel is looking at the array in reverse order. IOW, I must register PERF_MODE_INT in order for cciss to work. That's messed up. Anybody want to `fess up to making these changes? :) I'll keep working this, but I'm going to undo someones change when I figure out where it's broke. mikem ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 16:07 msi_free_irqs #2 Mike Miller (OS Dev) @ 2007-05-24 17:27 ` Andrew Morton 2007-05-24 19:42 ` Eric W. Biederman 2007-05-24 20:07 ` msi_free_irqs #2 Mike Miller (OS Dev) 0 siblings, 2 replies; 14+ messages in thread From: Andrew Morton @ 2007-05-24 17:27 UTC (permalink / raw) To: Mike Miller (OS Dev) Cc: linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Eric W. Biederman, Michael Ellerman On Thu, 24 May 2007 11:07:56 -0500 "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> wrote: > So I guess I found the answer to my own question. msi_free_irqs was apparently added > in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So somebody broke a > couple of things. > The most noticable is cciss hangs after turning on interrupts. The reason for that is > the kernel now looks at my array of MSI-X vectors in reverse order. We have 4 ways of > generating an interrupt on Smart Array hw. They are: > > # define DOORBELL_INT 0 > # define PERF_MODE_INT 1 > # define SIMPLE_MODE_INT 2 > # define MEMQ_MODE_INT 3 > > For INTx these four lines are OR'ed together and run to one interrupt pin. MSI-X > breaks this hardware OR'ing so we must register either all 4 or at the least the > correct interrupt. When I first submitted the MSI/X support I was registering all 4. > Someone changed that to only register a single MSI-X vector. That worked fine until > 2.6.22-something. > Now it appears that the kernel is looking at the array in reverse order. IOW, I must > register PERF_MODE_INT in order for cciss to work. That's messed up. Anybody want to > `fess up to making these changes? :) > I'll keep working this, but I'm going to undo someones change when I figure out where > it's broke. > I'd guess that you're referring to Michael's changes. If you can identify the offending code in a less vague fashion, more confident answers can be given ;) I canot find any sign of anyone altering the IRQ handling in cciss.c after your initial MSI-support merge. But that's perhaps isn't what you meant. it's all rather foggy. Please either quote file-n-line, or grab a copy of git-blame. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 17:27 ` Andrew Morton @ 2007-05-24 19:42 ` Eric W. Biederman 2007-05-24 20:59 ` Mike Miller (OS Dev) 2007-05-24 20:07 ` msi_free_irqs #2 Mike Miller (OS Dev) 1 sibling, 1 reply; 14+ messages in thread From: Eric W. Biederman @ 2007-05-24 19:42 UTC (permalink / raw) To: Andrew Morton Cc: Mike Miller (OS Dev), linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman Andrew Morton <akpm@linux-foundation.org> writes: > On Thu, 24 May 2007 11:07:56 -0500 > "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> wrote: > >> So I guess I found the answer to my own question. msi_free_irqs was apparently > added >> in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So somebody > broke a >> couple of things. >> The most noticable is cciss hangs after turning on interrupts. The reason for > that is >> the kernel now looks at my array of MSI-X vectors in reverse order. We have 4 > ways of >> generating an interrupt on Smart Array hw. They are: >> >> # define DOORBELL_INT 0 >> # define PERF_MODE_INT 1 >> # define SIMPLE_MODE_INT 2 >> # define MEMQ_MODE_INT 3 >> >> For INTx these four lines are OR'ed together and run to one interrupt > pin. MSI-X >> breaks this hardware OR'ing so we must register either all 4 or at the least > the >> correct interrupt. When I first submitted the MSI/X support I was registering > all 4. >> Someone changed that to only register a single MSI-X vector. That worked fine > until >> 2.6.22-something. >> Now it appears that the kernel is looking at the array in reverse order. IOW, > I must >> register PERF_MODE_INT in order for cciss to work. That's messed up. Anybody > want to >> `fess up to making these changes? :) >> I'll keep working this, but I'm going to undo someones change when I figure > out where >> it's broke. >> > > I'd guess that you're referring to Michael's changes. If you can identify > the offending code in a less vague fashion, more confident answers can > be given ;) > > I canot find any sign of anyone altering the IRQ handling in cciss.c after > your initial MSI-support merge. But that's perhaps isn't what you meant. > it's all rather foggy. Please either quote file-n-line, or grab a copy > of git-blame. Or perhaps git-bisect to find the offending patch. I don't recall seeing anything that looked to bad but there was a fair amount of change needed to get the last bit of portability into the msi code, so it is possible something slipped through. Possibly someone changed the default enable or disable state? .... Which reminds me. Now that we have a reasonable list, we really need to reduce pci_enable_msix: - int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec); + int pci_enable_msix(struct pci_dev* dev, int nvec); And just have drivers that use more the one irq walk the list off of pci_dev of all of the msi irqs. I did a little review a while ago and only 0-(nvec -1) are allocated and the are always in order in entries so it shouldn't be to bad to generate a patch for that case, and not having to worry about out of order or holes in the irq allocator would be good. Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 19:42 ` Eric W. Biederman @ 2007-05-24 20:59 ` Mike Miller (OS Dev) 2007-05-24 21:08 ` [PATCH] msi: Fix the ordering of msix irqs Eric W. Biederman 0 siblings, 1 reply; 14+ messages in thread From: Mike Miller (OS Dev) @ 2007-05-24 20:59 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman On Thu, May 24, 2007 at 01:42:58PM -0600, Eric W. Biederman wrote: > Andrew Morton <akpm@linux-foundation.org> writes: > > > On Thu, 24 May 2007 11:07:56 -0500 > > "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> wrote: > > > >> So I guess I found the answer to my own question. msi_free_irqs was apparently > > added > >> in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So somebody > > broke a > >> couple of things. > >> The most noticable is cciss hangs after turning on interrupts. The reason for > > that is > >> the kernel now looks at my array of MSI-X vectors in reverse order. We have 4 > > ways of > >> generating an interrupt on Smart Array hw. They are: > >> > >> # define DOORBELL_INT 0 > >> # define PERF_MODE_INT 1 > >> # define SIMPLE_MODE_INT 2 > >> # define MEMQ_MODE_INT 3 > >> > >> For INTx these four lines are OR'ed together and run to one interrupt > > pin. MSI-X > >> breaks this hardware OR'ing so we must register either all 4 or at the least > > the > >> correct interrupt. When I first submitted the MSI/X support I was registering > > all 4. > >> Someone changed that to only register a single MSI-X vector. That worked fine > > until > >> 2.6.22-something. > >> Now it appears that the kernel is looking at the array in reverse order. IOW, > > I must > >> register PERF_MODE_INT in order for cciss to work. That's messed up. Anybody > > want to > >> `fess up to making these changes? :) > >> I'll keep working this, but I'm going to undo someones change when I figure > > out where > >> it's broke. > >> > > > > I'd guess that you're referring to Michael's changes. If you can identify > > the offending code in a less vague fashion, more confident answers can > > be given ;) > > > > I canot find any sign of anyone altering the IRQ handling in cciss.c after > > your initial MSI-support merge. But that's perhaps isn't what you meant. > > it's all rather foggy. Please either quote file-n-line, or grab a copy > > of git-blame. > > Or perhaps git-bisect to find the offending patch. > > I don't recall seeing anything that looked to bad but there was a fair > amount of change needed to get the last bit of portability into the msi > code, so it is possible something slipped through. > > Possibly someone changed the default enable or disable state? > > .... > > Which reminds me. Now that we have a reasonable list, we really need > to reduce pci_enable_msix: > > - int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec); > + int pci_enable_msix(struct pci_dev* dev, int nvec); > > And just have drivers that use more the one irq walk the list off of pci_dev > of all of the msi irqs. I did a little review a while ago and only > 0-(nvec -1) are allocated and the are always in order in entries so it > shouldn't be to bad to generate a patch for that case, and not having > to worry about out of order or holes in the irq allocator would be > good. > > Eric Found what seems the problem with our vectors being listed backward. In drivers/pci/msi.c we should be using list_add_tail rather than list_add to preserve the ordering across various kernels. Please consider this for inclusion. diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 0e67723..d74975d 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -333,7 +333,7 @@ static int msi_capability_init(struct pci_dev *dev) msi_mask_bits_reg(pos, is_64bit_address(control)), maskbits); } - list_add(&entry->list, &dev->msi_list); + list_add_tail(&entry->list, &dev->msi_list); /* Configure MSI capability structure */ ret = arch_setup_msi_irqs(dev, 1, PCI_CAP_ID_MSI); @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, entry->dev = dev; entry->mask_base = base; - list_add(&entry->list, &dev->msi_list); + list_add_tail(&entry->list, &dev->msi_list); } ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); -------------------------------------------------------------------------------- This patch undoes my dirty little hack. diff --git a/drivers/block/cciss.h b/drivers/block/cciss.h index 26b5866..b70988d 100644 --- a/drivers/block/cciss.h +++ b/drivers/block/cciss.h @@ -70,8 +70,8 @@ struct ctlr_info int highest_lun; int usage_count; /* number of opens all all minor devices */ # define DOORBELL_INT 0 -# define PERF_MODE_INT 2 -# define SIMPLE_MODE_INT 1 +# define PERF_MODE_INT 1 +# define SIMPLE_MODE_INT 2 # define MEMQ_MODE_INT 3 unsigned int intr[4]; unsigned int msix_vector; ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH] msi: Fix the ordering of msix irqs. 2007-05-24 20:59 ` Mike Miller (OS Dev) @ 2007-05-24 21:08 ` Eric W. Biederman 2007-05-24 21:17 ` Andrew Morton 2007-05-25 2:32 ` Michael Ellerman 0 siblings, 2 replies; 14+ messages in thread From: Eric W. Biederman @ 2007-05-24 21:08 UTC (permalink / raw) To: Mike Miller (OS Dev) Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman Yours looks more complete then my test patch so: From: "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> writes: Found what seems the problem with our vectors being listed backward. In drivers/pci/msi.c we should be using list_add_tail rather than list_add to preserve the ordering across various kernels. Please consider this for inclusion. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 0e67723..d74975d 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -333,7 +333,7 @@ static int msi_capability_init(struct pci_dev *dev) msi_mask_bits_reg(pos, is_64bit_address(control)), maskbits); } - list_add(&entry->list, &dev->msi_list); + list_add_tail(&entry->list, &dev->msi_list); /* Configure MSI capability structure */ ret = arch_setup_msi_irqs(dev, 1, PCI_CAP_ID_MSI); @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, entry->dev = dev; entry->mask_base = base; - list_add(&entry->list, &dev->msi_list); + list_add_tail(&entry->list, &dev->msi_list); } ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] msi: Fix the ordering of msix irqs. 2007-05-24 21:08 ` [PATCH] msi: Fix the ordering of msix irqs Eric W. Biederman @ 2007-05-24 21:17 ` Andrew Morton 2007-05-24 21:36 ` Eric W. Biederman 2007-05-25 2:32 ` Michael Ellerman 1 sibling, 1 reply; 14+ messages in thread From: Andrew Morton @ 2007-05-24 21:17 UTC (permalink / raw) To: Eric W. Biederman Cc: Mike Miller (OS Dev), linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman On Thu, 24 May 2007 15:08:21 -0600 ebiederm@xmission.com (Eric W. Biederman) wrote: > Yours looks more complete then my test patch so: > > From: "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> writes: > > Found what seems the problem with our vectors being listed backward. In > drivers/pci/msi.c we should be using list_add_tail rather than list_add to > preserve the ordering across various kernels. Please consider this for > inclusion. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index 0e67723..d74975d 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -333,7 +333,7 @@ static int msi_capability_init(struct pci_dev *dev) > msi_mask_bits_reg(pos, is_64bit_address(control)), > maskbits); > } > - list_add(&entry->list, &dev->msi_list); > + list_add_tail(&entry->list, &dev->msi_list); > > /* Configure MSI capability structure */ > ret = arch_setup_msi_irqs(dev, 1, PCI_CAP_ID_MSI); > @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, > entry->dev = dev; > entry->mask_base = base; > > - list_add(&entry->list, &dev->msi_list); > + list_add_tail(&entry->list, &dev->msi_list); > } > > ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); Is this as fragile as I think it is? What happens when close+open and rmmod+modprobe happen? The list gets reordered then? If this is important then perhaps a big-fat-comment which explains wtf is going on is needed? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] msi: Fix the ordering of msix irqs. 2007-05-24 21:17 ` Andrew Morton @ 2007-05-24 21:36 ` Eric W. Biederman 0 siblings, 0 replies; 14+ messages in thread From: Eric W. Biederman @ 2007-05-24 21:36 UTC (permalink / raw) To: Andrew Morton Cc: Mike Miller (OS Dev), linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman Andrew Morton <akpm@linux-foundation.org> writes: > On Thu, 24 May 2007 15:08:21 -0600 > ebiederm@xmission.com (Eric W. Biederman) wrote: > >> Yours looks more complete then my test patch so: >> >> From: "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> writes: >> >> Found what seems the problem with our vectors being listed backward. In >> drivers/pci/msi.c we should be using list_add_tail rather than list_add to >> preserve the ordering across various kernels. Please consider this for >> inclusion. >> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >> index 0e67723..d74975d 100644 >> --- a/drivers/pci/msi.c >> +++ b/drivers/pci/msi.c >> @@ -333,7 +333,7 @@ static int msi_capability_init(struct pci_dev *dev) >> msi_mask_bits_reg(pos, is_64bit_address(control)), >> maskbits); >> } >> - list_add(&entry->list, &dev->msi_list); >> + list_add_tail(&entry->list, &dev->msi_list); >> >> /* Configure MSI capability structure */ >> ret = arch_setup_msi_irqs(dev, 1, PCI_CAP_ID_MSI); >> @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, >> entry->dev = dev; >> entry->mask_base = base; >> >> - list_add(&entry->list, &dev->msi_list); >> + list_add_tail(&entry->list, &dev->msi_list); >> } >> >> ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); > > Is this as fragile as I think it is? What happens when close+open and > rmmod+modprobe happen? The list gets reordered then? No this patch is not that fragile. We destroy and recreate the list each time we call pci_disable_msix, pci_enable_msix. The order only matters for the duration of pci_enable_msix. Yes that silly struct msi_entries *entries vector passed into pci_enable_msix is that fragile. The right fix is to just kill that vector and have the driver walk the list. That is fundamentally more robust. > If this is important then perhaps a big-fat-comment which explains wtf is > going on is needed? When I get a moment I will remove the need to keep the list in any sort of order. That will permanently remove the need for explanations. The truly problematic piece of code is this one below: i = 0; list_for_each_entry(entry, &dev->msi_list, list) { entries[i].vector = entry->irq; set_irq_msi(entry->irq, entry); i++; } If we don't want to care about order we should really should make it look something like: list_for_each_entry(entry, &dev->msi_list, list) { for (i = 0; i < nvecs; i++) { if (entries[i].vector == entry->entry_nr) entries[i].vector = entry->irq; } set_irq_msi(entry->irq, entry); } But that is overkill for a trivial bug fix, and under kill for a proper fix because the drivers still need to deal with that complexity. Letting the drivers walk dev->msi_list should be noticeably more maintainable long term. Especially when drivers start using a lot of msi irqs. Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] msi: Fix the ordering of msix irqs. 2007-05-24 21:08 ` [PATCH] msi: Fix the ordering of msix irqs Eric W. Biederman 2007-05-24 21:17 ` Andrew Morton @ 2007-05-25 2:32 ` Michael Ellerman 2007-05-25 3:04 ` kernel crash in timer interrupt handler gshan 1 sibling, 1 reply; 14+ messages in thread From: Michael Ellerman @ 2007-05-25 2:32 UTC (permalink / raw) To: Mike Miller (OS Dev) Cc: Eric W. Biederman, Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe [-- Attachment #1: Type: text/plain, Size: 920 bytes --] On Thu, 2007-05-24 at 15:08 -0600, Eric W. Biederman wrote: > Yours looks more complete then my test patch so: > > From: "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> writes: > > Found what seems the problem with our vectors being listed backward. In > drivers/pci/msi.c we should be using list_add_tail rather than list_add to > preserve the ordering across various kernels. Please consider this for > inclusion. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Screwed-up-by: Michael Ellerman <michael@ellerman.id.au> The sad part is my earlier patches did use list_add_tail(), doh. :( Thanks for debugging it. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* kernel crash in timer interrupt handler 2007-05-25 2:32 ` Michael Ellerman @ 2007-05-25 3:04 ` gshan 0 siblings, 0 replies; 14+ messages in thread From: gshan @ 2007-05-25 3:04 UTC (permalink / raw) To: linux-kernel The kernel crashed inside timer handler. Anyone has ideas? The Linux I'm using is 2.6.19. Thanks in advance! BUG: soft lockup detected on CPU#0! Call Trace: [C0395EA0] [C000C308] show_stack+0x58/0x180 (unreliable) [C0395ED0] [C0043A18] softlockup_tick+0xac/0xc8 [C0395EF0] [C00223C4] run_local_timers+0x18/0x28 [C0395F00] [C0022414] update_process_times+0x40/0x7c [C0395F10] [C0005800] timer_interrupt+0x70/0x208 [C0395F40] [C0004874] ret_from_except+0x0/0x14 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 17:27 ` Andrew Morton 2007-05-24 19:42 ` Eric W. Biederman @ 2007-05-24 20:07 ` Mike Miller (OS Dev) 2007-05-24 20:53 ` Eric W. Biederman 1 sibling, 1 reply; 14+ messages in thread From: Mike Miller (OS Dev) @ 2007-05-24 20:07 UTC (permalink / raw) To: Andrew Morton Cc: linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Eric W. Biederman, Michael Ellerman On Thu, May 24, 2007 at 10:27:02AM -0700, Andrew Morton wrote: > On Thu, 24 May 2007 11:07:56 -0500 > "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> wrote: > > > So I guess I found the answer to my own question. msi_free_irqs was apparently added > > in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So somebody broke a > > couple of things. > > The most noticable is cciss hangs after turning on interrupts. The reason for that is > > the kernel now looks at my array of MSI-X vectors in reverse order. We have 4 ways of > > generating an interrupt on Smart Array hw. They are: > > > > # define DOORBELL_INT 0 > > # define PERF_MODE_INT 1 > > # define SIMPLE_MODE_INT 2 > > # define MEMQ_MODE_INT 3 > > I apologize for the vagueness of the message. This dirty hack makes cciss work in the .22-rc kernel. I have not yet figured out what broke. ----------------------------------------------------------------------------------------- diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index 5acc6c4..7383483 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -3494,7 +3494,7 @@ static void cciss_shutdown(struct pci_dev *pdev) } else { printk(KERN_WARNING "Error flushing cache on controller %d\n", i); } - free_irq(hba[i]->intr[2], hba[i]); + free_irq(hba[i]->intr[SIMPLE_MODE_INT], hba[i]); } static void __devexit cciss_remove_one(struct pci_dev *pdev) diff --git a/drivers/block/cciss.h b/drivers/block/cciss.h index b70988d..26b5866 100644 --- a/drivers/block/cciss.h +++ b/drivers/block/cciss.h @@ -70,8 +70,8 @@ struct ctlr_info int highest_lun; int usage_count; /* number of opens all all minor devices */ # define DOORBELL_INT 0 -# define PERF_MODE_INT 1 -# define SIMPLE_MODE_INT 2 +# define PERF_MODE_INT 2 +# define SIMPLE_MODE_INT 1 # define MEMQ_MODE_INT 3 unsigned int intr[4]; unsigned int msix_vector; ----------------------------------------------------------------------------------------- > > For INTx these four lines are OR'ed together and run to one interrupt pin. MSI-X > > breaks this hardware OR'ing so we must register either all 4 or at the least the > > correct interrupt. When I first submitted the MSI/X support I was registering all 4. > > Someone changed that to only register a single MSI-X vector. That worked fine until > > 2.6.22-something. > > Now it appears that the kernel is looking at the array in reverse order. IOW, I must > > register PERF_MODE_INT in order for cciss to work. That's messed up. Anybody want to Have not yet found the change that caused this but my nasty little hack gets around it for my testing. After I return from the long weekend I'll try to hunt this down. > > `fess up to making these changes? :) > > I'll keep working this, but I'm going to undo someones change when I figure out where > > it's broke. > > > > I'd guess that you're referring to Michael's changes. If you can identify > the offending code in a less vague fashion, more confident answers can > be given ;) > > I canot find any sign of anyone altering the IRQ handling in cciss.c after > your initial MSI-support merge. But that's perhaps isn't what you meant. > it's all rather foggy. Please either quote file-n-line, or grab a copy > of git-blame. Now for my original mail where the driver Oops'ed on rmmod. This patch prevents the Oops but I'm not 100% sure it's right. Here's the Oops: Completed flushing cache on controller 2 BUG: unable to handle kernel paging request at virtual address f8b2200c printing eip: c01e9cc7 *pdpt = 0000000000003001 *pde = 0000000037e48067 *pte = 0000000000000000 Oops: 0002 [#1] SMP Modules linked in: cciss ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc loop dm_multipath button battery asus_acpi ac tg3 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptsas scsi_transport_sas mptspi scsi_transport_spi mptscsih mptbase sd_mod scsi_mod CPU: 1 EIP: 0060:[<c01e9cc7>] Not tainted VLI EFLAGS: 00010286 (2.6.22-rc2-gd2579053 #1) EIP is at msi_free_irqs+0x81/0xbe eax: f8b22000 ebx: f71f3180 ecx: f7fff280 edx: c1886eb8 esi: f7c4e800 edi: f7c4ec48 ebp: 00000002 esp: f5a0dec8 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process rmmod (pid: 5286, ti=f5a0d000 task=c47d2550 task.ti=f5a0d000) Stack: 00000002 f8b72294 00000400 f8b69ca7 f8b6bc6c 00000002 00000000 00000000 00000000 00000000 00000000 f5a997f4 f8b69d61 f7c5a4b0 f7c4e848 f7c4e848 f7c4e800 f7c4e800 f8b72294 f7c4e848 f8b72294 c01e3cdf f7c4e848 c024c469 Call Trace: [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss] [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss] [<c01e3cdf>] pci_device_remove+0x16/0x35 [<c024c469>] __device_release_driver+0x71/0x8e [<c024c56e>] driver_detach+0xa0/0xde [<c024bc5c>] bus_remove_driver+0x27/0x41 [<c01e3ef3>] pci_unregister_driver+0xb/0x13 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss] [<c0139ced>] sys_delete_module+0x110/0x135 [<c0104c7a>] sysenter_past_esp+0x5f/0x85 ======================= Code: 00 39 c2 74 5d 0f b6 03 24 1f 3c 11 75 24 8d 86 54 04 00 00 39 43 0c 75 08 8b 43 14 e8 32 df f2 ff 0f b7 43 02 c1 e0 04 03 43 14 <c7> 40 0c 01 00 00 00 8d 4b 0c 8b 43 0c 8b 51 04 89 50 04 89 02 EIP: [<c01e9cc7>] msi_free_irqs+0x81/0xbe SS:ESP 0068:f5a0dec8 BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 [<c011d16a>] __might_sleep+0xa7/0xad [<c013239a>] down_read+0x12/0x25 [<c013db2f>] acct_collect+0x32/0x128 [<c0121dc4>] do_exit+0x198/0x37e [<c0105f7b>] die+0x1da/0x1e2 [<c03123ad>] do_page_fault+0x5b9/0x69c [<c01127b6>] smp_call_function+0x1a/0x1d [<c0311df4>] do_page_fault+0x0/0x69c [<c0310b1a>] error_code+0x72/0x78 [<c011007b>] single_msr_reserve+0xf/0x3b [<c01e9cc7>] msi_free_irqs+0x81/0xbe [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss] [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss] [<c01e3cdf>] pci_device_remove+0x16/0x35 [<c024c469>] __device_release_driver+0x71/0x8e [<c024c56e>] driver_detach+0xa0/0xde [<c024bc5c>] bus_remove_driver+0x27/0x41 [<c01e3ef3>] pci_unregister_driver+0xb/0x13 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss] [<c0139ced>] sys_delete_module+0x110/0x135 [<c0104c7a>] sysenter_past_esp+0x5f/0x85 ======================= Michael or Eric, would you please review this patch and see if it's OK? Adding an else between the the if (list_is....) and the writel resolved the Oops. I'm not sure how this is supposed to work but using entry->mask_base after iounmap'ing seems wrong. diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d9cbd58..0e67723 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -560,10 +560,11 @@ static int msi_free_irqs(struct pci_dev* dev) if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { if (list_is_last(&entry->list, &dev->msi_list)) iounmap(entry->mask_base); - - writel(1, entry->mask_base + entry->msi_attrib.entry_nr - * PCI_MSIX_ENTRY_SIZE - + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); + else + writel(1, entry->mask_base + + entry->msi_attrib.entry_nr + * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); } list_del(&entry->list); kfree(entry); ----------------------------------------------------------------------------------------- I hope this clears up a little of the fog. Thanks, mikem ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 20:07 ` msi_free_irqs #2 Mike Miller (OS Dev) @ 2007-05-24 20:53 ` Eric W. Biederman 2007-05-24 21:01 ` Mike Miller (OS Dev) 2007-05-24 21:11 ` Mike Miller (OS Dev) 0 siblings, 2 replies; 14+ messages in thread From: Eric W. Biederman @ 2007-05-24 20:53 UTC (permalink / raw) To: Mike Miller (OS Dev) Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman Could you please try the patch below. Unless I have misread something this should fix your problem.... diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 000c9ae..2e1d4af 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, entry->dev = dev; entry->mask_base = base; - list_add(&entry->list, &dev->msi_list); + list_add_tail(&entry->list, &dev->msi_list); } ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); > Michael or Eric, would you please review this patch and see if it's OK? Adding > an else > between the the if (list_is....) and the writel resolved the Oops. I'm not sure > how this > is supposed to work but using entry->mask_base after iounmap'ing seems wrong. I think I would rather just swap those two lines of code. We are manually setting the mask bit to make certain the irq doesn't fire after we free it, and we clearly want to set the mask bit for all the irq entries. > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index d9cbd58..0e67723 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -560,10 +560,11 @@ static int msi_free_irqs(struct pci_dev* dev) > if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { > if (list_is_last(&entry->list, &dev->msi_list)) > iounmap(entry->mask_base); > - > - writel(1, entry->mask_base + entry->msi_attrib.entry_nr > - * PCI_MSIX_ENTRY_SIZE > - + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > + else > + writel(1, entry->mask_base > + + entry->msi_attrib.entry_nr > + * PCI_MSIX_ENTRY_SIZE > + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > } > list_del(&entry->list); > kfree(entry); > ----------------------------------------------------------------------------------------- > > I hope this clears up a little of the fog. Yes it does thanks. Eric ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 20:53 ` Eric W. Biederman @ 2007-05-24 21:01 ` Mike Miller (OS Dev) 2007-05-24 21:11 ` Mike Miller (OS Dev) 1 sibling, 0 replies; 14+ messages in thread From: Mike Miller (OS Dev) @ 2007-05-24 21:01 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman On Thu, May 24, 2007 at 02:53:08PM -0600, Eric W. Biederman wrote: > > > Could you please try the patch below. Unless I have misread something > this should fix your problem.... > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index 000c9ae..2e1d4af 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, > entry->dev = dev; > entry->mask_base = base; > > - list_add(&entry->list, &dev->msi_list); > + list_add_tail(&entry->list, &dev->msi_list); > } Yes, this is what we found. But we made 2 changes to list_add. Just sent the patch. > > ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); > > > Michael or Eric, would you please review this patch and see if it's OK? Adding > > an else > > between the the if (list_is....) and the writel resolved the Oops. I'm not sure > > how this > > is supposed to work but using entry->mask_base after iounmap'ing seems wrong. > > I think I would rather just swap those two lines of code. > We are manually setting the mask bit to make certain the irq doesn't > fire after we free it, and we clearly want to set the mask bit for > all the irq entries. OK, new patch on the way. mikem > > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > > index d9cbd58..0e67723 100644 > > --- a/drivers/pci/msi.c > > +++ b/drivers/pci/msi.c > > @@ -560,10 +560,11 @@ static int msi_free_irqs(struct pci_dev* dev) > > if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { > > if (list_is_last(&entry->list, &dev->msi_list)) > > iounmap(entry->mask_base); > > - > > - writel(1, entry->mask_base + entry->msi_attrib.entry_nr > > - * PCI_MSIX_ENTRY_SIZE > > - + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > > + else > > + writel(1, entry->mask_base > > + + entry->msi_attrib.entry_nr > > + * PCI_MSIX_ENTRY_SIZE > > + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > > } > > list_del(&entry->list); > > kfree(entry); > > ----------------------------------------------------------------------------------------- > > > > I hope this clears up a little of the fog. > > Yes it does thanks. > > Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: msi_free_irqs #2 2007-05-24 20:53 ` Eric W. Biederman 2007-05-24 21:01 ` Mike Miller (OS Dev) @ 2007-05-24 21:11 ` Mike Miller (OS Dev) 2007-05-25 3:16 ` [PATCH] msi: mask the msix vector before we unmap it Eric W. Biederman 1 sibling, 1 reply; 14+ messages in thread From: Mike Miller (OS Dev) @ 2007-05-24 21:11 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman On Thu, May 24, 2007 at 02:53:08PM -0600, Eric W. Biederman wrote: > > > Could you please try the patch below. Unless I have misread something > this should fix your problem.... > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index 000c9ae..2e1d4af 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -404,7 +404,7 @@ static int msix_capability_init(struct pci_dev *dev, > entry->dev = dev; > entry->mask_base = base; > > - list_add(&entry->list, &dev->msi_list); > + list_add_tail(&entry->list, &dev->msi_list); > } > > ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); > > > Michael or Eric, would you please review this patch and see if it's OK? Adding > > an else > > between the the if (list_is....) and the writel resolved the Oops. I'm not sure > > how this > > is supposed to work but using entry->mask_base after iounmap'ing seems wrong. > > I think I would rather just swap those two lines of code. > We are manually setting the mask bit to make certain the irq doesn't > fire after we free it, and we clearly want to set the mask bit for > all the irq entries. > > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > > index d9cbd58..0e67723 100644 > > --- a/drivers/pci/msi.c > > +++ b/drivers/pci/msi.c > > @@ -560,10 +560,11 @@ static int msi_free_irqs(struct pci_dev* dev) > > if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { > > if (list_is_last(&entry->list, &dev->msi_list)) > > iounmap(entry->mask_base); > > - > > - writel(1, entry->mask_base + entry->msi_attrib.entry_nr > > - * PCI_MSIX_ENTRY_SIZE > > - + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > > + else > > + writel(1, entry->mask_base > > + + entry->msi_attrib.entry_nr > > + * PCI_MSIX_ENTRY_SIZE > > + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); > > } > > list_del(&entry->list); > > kfree(entry); Here's a patch that just reverses the 2 lines of code as Eric suggests. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Chase Maupin <chase.maupin@hp.com> Thanks, mikem ---------------------------------------------------------------------------------------- diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index e01380b..6632150 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -558,12 +558,12 @@ static int msi_free_irqs(struct pci_dev* dev) list_for_each_entry_safe(entry, tmp, &dev->msi_list, list) { if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { - if (list_is_last(&entry->list, &dev->msi_list)) - iounmap(entry->mask_base); - writel(1, entry->mask_base + entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); + + if (list_is_last(&entry->list, &dev->msi_list)) + iounmap(entry->mask_base); } list_del(&entry->list); kfree(entry); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH] msi: mask the msix vector before we unmap it. 2007-05-24 21:11 ` Mike Miller (OS Dev) @ 2007-05-25 3:16 ` Eric W. Biederman 0 siblings, 0 replies; 14+ messages in thread From: Eric W. Biederman @ 2007-05-25 3:16 UTC (permalink / raw) To: Mike Miller (OS Dev) Cc: Andrew Morton, linux-kernel, tom.l.nguyen, iss_storagedev, jens.axboe, Michael Ellerman With these two lines in the reverse order the drives/block/ccis.c was oopsing in msi_free_irqs. Silly us calling writel on an area after we unmap it. BUG: unable to handle kernel paging request at virtual address f8b2200c printing eip: c01e9cc7 *pdpt = 0000000000003001 *pde = 0000000037e48067 *pte = 0000000000000000 Oops: 0002 [#1] SMP Modules linked in: cciss ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc loop dm_multipath button battery asus_acpi ac tg3 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptsas scsi_transport_sas mptspi scsi_transport_spi mptscsih mptbase sd_mod scsi_mod CPU: 1 EIP: 0060:[<c01e9cc7>] Not tainted VLI EFLAGS: 00010286 (2.6.22-rc2-gd2579053 #1) EIP is at msi_free_irqs+0x81/0xbe eax: f8b22000 ebx: f71f3180 ecx: f7fff280 edx: c1886eb8 esi: f7c4e800 edi: f7c4ec48 ebp: 00000002 esp: f5a0dec8 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process rmmod (pid: 5286, ti=f5a0d000 task=c47d2550 task.ti=f5a0d000) Stack: 00000002 f8b72294 00000400 f8b69ca7 f8b6bc6c 00000002 00000000 00000000 00000000 00000000 00000000 f5a997f4 f8b69d61 f7c5a4b0 f7c4e848 f7c4e848 f7c4e800 f7c4e800 f8b72294 f7c4e848 f8b72294 c01e3cdf f7c4e848 c024c469 Call Trace: [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss] [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss] [<c01e3cdf>] pci_device_remove+0x16/0x35 [<c024c469>] __device_release_driver+0x71/0x8e [<c024c56e>] driver_detach+0xa0/0xde [<c024bc5c>] bus_remove_driver+0x27/0x41 [<c01e3ef3>] pci_unregister_driver+0xb/0x13 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss] [<c0139ced>] sys_delete_module+0x110/0x135 [<c0104c7a>] sysenter_past_esp+0x5f/0x85 Here's a patch that just reverses the 2 lines of code as Eric suggests. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Chase Maupin <chase.maupin@hp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index e01380b..6632150 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -558,12 +558,12 @@ static int msi_free_irqs(struct pci_dev* dev) list_for_each_entry_safe(entry, tmp, &dev->msi_list, list) { if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) { - if (list_is_last(&entry->list, &dev->msi_list)) - iounmap(entry->mask_base); - writel(1, entry->mask_base + entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET); + + if (list_is_last(&entry->list, &dev->msi_list)) + iounmap(entry->mask_base); } list_del(&entry->list); kfree(entry); ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2007-05-25 3:21 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-05-24 16:07 msi_free_irqs #2 Mike Miller (OS Dev) 2007-05-24 17:27 ` Andrew Morton 2007-05-24 19:42 ` Eric W. Biederman 2007-05-24 20:59 ` Mike Miller (OS Dev) 2007-05-24 21:08 ` [PATCH] msi: Fix the ordering of msix irqs Eric W. Biederman 2007-05-24 21:17 ` Andrew Morton 2007-05-24 21:36 ` Eric W. Biederman 2007-05-25 2:32 ` Michael Ellerman 2007-05-25 3:04 ` kernel crash in timer interrupt handler gshan 2007-05-24 20:07 ` msi_free_irqs #2 Mike Miller (OS Dev) 2007-05-24 20:53 ` Eric W. Biederman 2007-05-24 21:01 ` Mike Miller (OS Dev) 2007-05-24 21:11 ` Mike Miller (OS Dev) 2007-05-25 3:16 ` [PATCH] msi: mask the msix vector before we unmap it Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox