* [PATCH 0/3] msi: fix kobject/sysfs removal from msi_list @ 2013-09-17 1:47 Veaceslav Falico 2013-09-17 1:47 ` [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() Veaceslav Falico ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-17 1:47 UTC (permalink / raw) To: linux-kernel; +Cc: Bjorn Helgaas, linux-pci, Veaceslav Falico Currently, while removing msi_list's ->kobj, we just do kobject_put() on it and after that free the entry itself. However, kobject_put() doesn't guarantee that the kobject itself is freed - it can be used by someone else and thus, when we'll free the entry, we'll free the kobject too - leading to bugs in the other users (or when we'll finally release it). Also, in some cases we might fail to register the kobjects, but we forget to remove pdev->msi_kset, and this can lead to errors if we try to re-register it - cause we already have that kset initialized. Fix both issues by moving msi_kset/kobject deinitialization code completely to free_msi_irqs(), which is called every time we fail and need to roll back (and on the proper device irqs deinit). Also, move kfree-ing of the msi_list entry to kobject->release (msi_kobj_release()), so that the entry containing kobject will only be delisted in free_msi_irqs(), and free only when there are no other users of its kobject. CC: Bjorn Helgaas <bhelgaas@google.com> CC: linux-pci@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Veaceslav Falico <vfalico@redhat.com> --- drivers/pci/msi.c | 38 +++++++++++++++----------------------- 1 file changed, 15 insertions(+), 23 deletions(-) ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-17 1:47 [PATCH 0/3] msi: fix kobject/sysfs removal from msi_list Veaceslav Falico @ 2013-09-17 1:47 ` Veaceslav Falico 2013-09-25 21:08 ` Bjorn Helgaas 2013-09-17 1:47 ` [PATCH 2/3] msi: always unregister ->msi_kset within free_msi_irqs() Veaceslav Falico 2013-09-17 1:47 ` [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject Veaceslav Falico 2 siblings, 1 reply; 17+ messages in thread From: Veaceslav Falico @ 2013-09-17 1:47 UTC (permalink / raw) To: linux-kernel; +Cc: Veaceslav Falico, Bjorn Helgaas, linux-pci Before trying to kobject_init_and_add(), we add a reference to pdev via pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we don't return it back - even on out_unroll. Fix this by adding pci_dev_put(pdev) before going to unrolling section. CC: Bjorn Helgaas <bhelgaas@google.com> CC: linux-pci@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Veaceslav Falico <vfalico@redhat.com> --- drivers/pci/msi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d5f90d6..14bf578 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) pci_dev_get(pdev); ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, "%u", entry->irq); - if (ret) + if (ret) { + pci_dev_put(pdev); goto out_unroll; + } count++; } -- 1.8.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-17 1:47 ` [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() Veaceslav Falico @ 2013-09-25 21:08 ` Bjorn Helgaas 2013-09-25 21:30 ` Bjorn Helgaas 2013-09-25 23:23 ` Neil Horman 0 siblings, 2 replies; 17+ messages in thread From: Bjorn Helgaas @ 2013-09-25 21:08 UTC (permalink / raw) To: Veaceslav Falico Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Neil Horman, Greg Kroah-Hartman [+cc Neil (he added this code in da8d1c8ba4), Greg] On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: > Before trying to kobject_init_and_add(), we add a reference to pdev via > pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we > don't return it back - even on out_unroll. > > Fix this by adding pci_dev_put(pdev) before going to unrolling section. > > CC: Bjorn Helgaas <bhelgaas@google.com> > CC: linux-pci@vger.kernel.org > CC: linux-kernel@vger.kernel.org > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > --- > drivers/pci/msi.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index d5f90d6..14bf578 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > pci_dev_get(pdev); > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, > "%u", entry->irq); > - if (ret) > + if (ret) { > + pci_dev_put(pdev); > goto out_unroll; > + } > > count++; > } I don't understand why this code does the pci_dev_get() in the first place. The pdev->msi_list of msi_desc structs is private to the pci_dev, and even without bumping the refcount, there should be no way for the pci_dev to be freed before the msi_desc. I also don't understand this nearby code (the same pattern appears in free_msi_irqs()): out_unroll: list_for_each_entry(entry, &pdev->msi_list, list) { if (!count) break; kobject_del(&entry->kobj); kobject_put(&entry->kobj); count--; } Why do we call kobject_del() here? The kobject_put() will call kobject_del() anyway, so it looks redundant. Documentation/kobject.txt says kobject_del() must be called explicitly to break a circular reference, but I don't think we have that here. Also, I think it is incorrect that free_msi_irqs() does this: if (entry->kobj.parent) { kobject_del(&entry->kobj); kobject_put(&entry->kobj); } list_del(&entry->list); kfree(entry); I think the "kfree(entry)" should be in msi_kobj_release() instead. Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 21:08 ` Bjorn Helgaas @ 2013-09-25 21:30 ` Bjorn Helgaas 2013-09-25 22:09 ` Veaceslav Falico 2013-09-25 23:23 ` Neil Horman 1 sibling, 1 reply; 17+ messages in thread From: Bjorn Helgaas @ 2013-09-25 21:30 UTC (permalink / raw) To: Veaceslav Falico Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Neil Horman, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 3:08 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > Also, I think it is incorrect that free_msi_irqs() does this: > > if (entry->kobj.parent) { > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > } > > list_del(&entry->list); > kfree(entry); > > I think the "kfree(entry)" should be in msi_kobj_release() instead. Oh, I see you fixed this in patch 3/3. I hadn't read that far yet :) Did you find these problems by inspection, or is there some easy way to trigger bad behavior? Just wondering if this is some way I can reproduce the problem. Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 21:30 ` Bjorn Helgaas @ 2013-09-25 22:09 ` Veaceslav Falico 0 siblings, 0 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-25 22:09 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Neil Horman, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 03:30:14PM -0600, Bjorn Helgaas wrote: >On Wed, Sep 25, 2013 at 3:08 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> Also, I think it is incorrect that free_msi_irqs() does this: >> >> if (entry->kobj.parent) { >> kobject_del(&entry->kobj); >> kobject_put(&entry->kobj); >> } >> >> list_del(&entry->list); >> kfree(entry); >> >> I think the "kfree(entry)" should be in msi_kobj_release() instead. > >Oh, I see you fixed this in patch 3/3. I hadn't read that far yet :) > >Did you find these problems by inspection, or is there some easy way >to trigger bad behavior? Just wondering if this is some way I can >reproduce the problem. Hi, I've first found it by building with CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS - it shows that it's freeing an object in an active state (I'm just running insmod/rmmod tg3 concurently, but I guess it's reproducible with any driver that uses msi/x). Without CONFIG_DEBUG_OBJECTS it's also reproducible, and without CONFIG_DEBUG_KOBJECT_RELEASE it's really hard to reproduce, but still reproducible (I've hit it with tg3 as a slave of bonding and concurently running rmmod bonding/ifup bond0 - though it's really hard). It just panics in kobject_put(), iirc. So, with those CONFIG_DEBUG_* it's easily triggerable, without - quite hard. Hope that helps. p.s. I'll adress your other comments tomorrow already, it's quite late here and I don't want to do something stupid now :). Thanks a lot! > >Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 21:08 ` Bjorn Helgaas 2013-09-25 21:30 ` Bjorn Helgaas @ 2013-09-25 23:23 ` Neil Horman 2013-09-25 23:35 ` Bjorn Helgaas 1 sibling, 1 reply; 17+ messages in thread From: Neil Horman @ 2013-09-25 23:23 UTC (permalink / raw) To: Bjorn Helgaas Cc: Veaceslav Falico, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: > [+cc Neil (he added this code in da8d1c8ba4), Greg] > > On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: > > Before trying to kobject_init_and_add(), we add a reference to pdev via > > pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we > > don't return it back - even on out_unroll. > > > > Fix this by adding pci_dev_put(pdev) before going to unrolling section. > > > > CC: Bjorn Helgaas <bhelgaas@google.com> > > CC: linux-pci@vger.kernel.org > > CC: linux-kernel@vger.kernel.org > > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > > --- > > drivers/pci/msi.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > > index d5f90d6..14bf578 100644 > > --- a/drivers/pci/msi.c > > +++ b/drivers/pci/msi.c > > @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > > pci_dev_get(pdev); > > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, > > "%u", entry->irq); > > - if (ret) > > + if (ret) { > > + pci_dev_put(pdev); > > goto out_unroll; > > + } > > > > count++; > > } > > I don't understand why this code does the pci_dev_get() in the first > place. The pdev->msi_list of msi_desc structs is private to the > pci_dev, and even without bumping the refcount, there should be no way > for the pci_dev to be freed before the msi_desc. > Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that people didn't try to remove the device prior to freeing all their interrupts (i.e I didn't want a broken driver to go through its remove routine without freeing all its irqs). That might have been the wrong thing to do, but thats what bubbles to the front of my head when looking at this. > I also don't understand this nearby code (the same pattern appears in > free_msi_irqs()): > > out_unroll: > list_for_each_entry(entry, &pdev->msi_list, list) { > if (!count) > break; > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > count--; > } > > Why do we call kobject_del() here? The kobject_put() will call > kobject_del() anyway, so it looks redundant. > Documentation/kobject.txt says kobject_del() must be called explicitly > to break a circular reference, but I don't think we have that here. > I think thats exactly why I did it, because of the documentation. I agree however, it does look redundant. Harmless, but redundant. > Also, I think it is incorrect that free_msi_irqs() does this: > > if (entry->kobj.parent) { > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > } > > list_del(&entry->list); > kfree(entry); > > I think the "kfree(entry)" should be in msi_kobj_release() instead. > As far as this change goes, I think it looks ok Acked-by: Neil Horman <nhorman@tuxdriver.com> > Bjorn > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 23:23 ` Neil Horman @ 2013-09-25 23:35 ` Bjorn Helgaas 2013-09-26 9:27 ` Veaceslav Falico 2013-09-26 12:25 ` Veaceslav Falico 0 siblings, 2 replies; 17+ messages in thread From: Bjorn Helgaas @ 2013-09-25 23:35 UTC (permalink / raw) To: Neil Horman Cc: Veaceslav Falico, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: > On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: >> [+cc Neil (he added this code in da8d1c8ba4), Greg] >> >> On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >> > Before trying to kobject_init_and_add(), we add a reference to pdev via >> > pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we >> > don't return it back - even on out_unroll. >> > >> > Fix this by adding pci_dev_put(pdev) before going to unrolling section. >> > >> > CC: Bjorn Helgaas <bhelgaas@google.com> >> > CC: linux-pci@vger.kernel.org >> > CC: linux-kernel@vger.kernel.org >> > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> >> > --- >> > drivers/pci/msi.c | 4 +++- >> > 1 file changed, 3 insertions(+), 1 deletion(-) >> > >> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >> > index d5f90d6..14bf578 100644 >> > --- a/drivers/pci/msi.c >> > +++ b/drivers/pci/msi.c >> > @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >> > pci_dev_get(pdev); >> > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, >> > "%u", entry->irq); >> > - if (ret) >> > + if (ret) { >> > + pci_dev_put(pdev); >> > goto out_unroll; >> > + } >> > >> > count++; >> > } >> >> I don't understand why this code does the pci_dev_get() in the first >> place. The pdev->msi_list of msi_desc structs is private to the >> pci_dev, and even without bumping the refcount, there should be no way >> for the pci_dev to be freed before the msi_desc. >> > Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that > people didn't try to remove the device prior to freeing all their interrupts > (i.e I didn't want a broken driver to go through its remove routine without > freeing all its irqs). That might have been the wrong thing to do, but thats > what bubbles to the front of my head when looking at this. That sounds plausible, but I think I'd rather deal with that by having the PCI core remove logic free all the interrupts. I *think* that's already in place, i.e., pci_free_resources() calls msi_remove_pci_irq_vectors(). So I propose that we remove the pci_dev_get()/put() unless we come up with a more compelling reason for it. >> I also don't understand this nearby code (the same pattern appears in >> free_msi_irqs()): >> >> out_unroll: >> list_for_each_entry(entry, &pdev->msi_list, list) { >> if (!count) >> break; >> kobject_del(&entry->kobj); >> kobject_put(&entry->kobj); >> count--; >> } >> >> Why do we call kobject_del() here? The kobject_put() will call >> kobject_del() anyway, so it looks redundant. >> Documentation/kobject.txt says kobject_del() must be called explicitly >> to break a circular reference, but I don't think we have that here. >> > I think thats exactly why I did it, because of the documentation. I agree > however, it does look redundant. Harmless, but redundant. OK, thanks. I think we should remove it on the grounds that it's not needed and removing it will make this code look more similar to other callers of kobject_init_and_add(), which means bugs will have fewer places to hide. Thanks, Neil! Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 23:35 ` Bjorn Helgaas @ 2013-09-26 9:27 ` Veaceslav Falico 2013-09-26 12:25 ` Veaceslav Falico 1 sibling, 0 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-26 9:27 UTC (permalink / raw) To: Bjorn Helgaas Cc: Neil Horman, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: >On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: >> On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: >>> [+cc Neil (he added this code in da8d1c8ba4), Greg] >>> >>> On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >>> > Before trying to kobject_init_and_add(), we add a reference to pdev via >>> > pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we >>> > don't return it back - even on out_unroll. >>> > >>> > Fix this by adding pci_dev_put(pdev) before going to unrolling section. >>> > >>> > CC: Bjorn Helgaas <bhelgaas@google.com> >>> > CC: linux-pci@vger.kernel.org >>> > CC: linux-kernel@vger.kernel.org >>> > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> >>> > --- >>> > drivers/pci/msi.c | 4 +++- >>> > 1 file changed, 3 insertions(+), 1 deletion(-) >>> > >>> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >>> > index d5f90d6..14bf578 100644 >>> > --- a/drivers/pci/msi.c >>> > +++ b/drivers/pci/msi.c >>> > @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >>> > pci_dev_get(pdev); >>> > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, >>> > "%u", entry->irq); >>> > - if (ret) >>> > + if (ret) { >>> > + pci_dev_put(pdev); >>> > goto out_unroll; >>> > + } >>> > >>> > count++; >>> > } >>> >>> I don't understand why this code does the pci_dev_get() in the first >>> place. The pdev->msi_list of msi_desc structs is private to the >>> pci_dev, and even without bumping the refcount, there should be no way >>> for the pci_dev to be freed before the msi_desc. >>> >> Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that >> people didn't try to remove the device prior to freeing all their interrupts >> (i.e I didn't want a broken driver to go through its remove routine without >> freeing all its irqs). That might have been the wrong thing to do, but thats >> what bubbles to the front of my head when looking at this. > >That sounds plausible, but I think I'd rather deal with that by having >the PCI core remove logic free all the interrupts. I *think* that's >already in place, i.e., pci_free_resources() calls >msi_remove_pci_irq_vectors(). So I propose that we remove the >pci_dev_get()/put() unless we come up with a more compelling reason >for it. > >>> I also don't understand this nearby code (the same pattern appears in >>> free_msi_irqs()): >>> >>> out_unroll: >>> list_for_each_entry(entry, &pdev->msi_list, list) { >>> if (!count) >>> break; >>> kobject_del(&entry->kobj); >>> kobject_put(&entry->kobj); >>> count--; >>> } >>> >>> Why do we call kobject_del() here? The kobject_put() will call >>> kobject_del() anyway, so it looks redundant. >>> Documentation/kobject.txt says kobject_del() must be called explicitly >>> to break a circular reference, but I don't think we have that here. >>> >> I think thats exactly why I did it, because of the documentation. I agree >> however, it does look redundant. Harmless, but redundant. > >OK, thanks. I think we should remove it on the grounds that it's not >needed and removing it will make this code look more similar to other >callers of kobject_init_and_add(), which means bugs will have fewer >places to hide. Hi, Sounds great, I'll add this as new patches and send this in a separate patchset with the unregister msi_kset in free_msi_irqs patch, so that it won't depend on the fix (both removing kobject_del() and pci_dev_get/put()). Thank you! > >Thanks, Neil! > >Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-25 23:35 ` Bjorn Helgaas 2013-09-26 9:27 ` Veaceslav Falico @ 2013-09-26 12:25 ` Veaceslav Falico 2013-09-26 14:07 ` Veaceslav Falico 2013-09-26 14:40 ` Neil Horman 1 sibling, 2 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-26 12:25 UTC (permalink / raw) To: Bjorn Helgaas Cc: Neil Horman, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: >On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: >> On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: >>> [+cc Neil (he added this code in da8d1c8ba4), Greg] >>> >>> On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >>> > Before trying to kobject_init_and_add(), we add a reference to pdev via >>> > pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we >>> > don't return it back - even on out_unroll. >>> > >>> > Fix this by adding pci_dev_put(pdev) before going to unrolling section. >>> > >>> > CC: Bjorn Helgaas <bhelgaas@google.com> >>> > CC: linux-pci@vger.kernel.org >>> > CC: linux-kernel@vger.kernel.org >>> > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> >>> > --- >>> > drivers/pci/msi.c | 4 +++- >>> > 1 file changed, 3 insertions(+), 1 deletion(-) >>> > >>> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >>> > index d5f90d6..14bf578 100644 >>> > --- a/drivers/pci/msi.c >>> > +++ b/drivers/pci/msi.c >>> > @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >>> > pci_dev_get(pdev); >>> > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, >>> > "%u", entry->irq); >>> > - if (ret) >>> > + if (ret) { >>> > + pci_dev_put(pdev); >>> > goto out_unroll; >>> > + } >>> > >>> > count++; >>> > } >>> >>> I don't understand why this code does the pci_dev_get() in the first >>> place. The pdev->msi_list of msi_desc structs is private to the >>> pci_dev, and even without bumping the refcount, there should be no way >>> for the pci_dev to be freed before the msi_desc. >>> >> Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that >> people didn't try to remove the device prior to freeing all their interrupts >> (i.e I didn't want a broken driver to go through its remove routine without >> freeing all its irqs). That might have been the wrong thing to do, but thats >> what bubbles to the front of my head when looking at this. > >That sounds plausible, but I think I'd rather deal with that by having >the PCI core remove logic free all the interrupts. I *think* that's >already in place, i.e., pci_free_resources() calls >msi_remove_pci_irq_vectors(). So I propose that we remove the >pci_dev_get()/put() unless we come up with a more compelling reason >for it. As an update - I've found an interesting case why exactly that kobject_del() might be needed: in kobject_del() it removes instantly the link to kset - via kobj_kset_leave(), so that our kset remains without links and, thus, might be instantly removed. So, with kobject_del(), our kset (msi_irqs sysfs dir) remains instantly without any links (i.e. other kobjects) and, when we call kset_unregister() - it exits instantly (if it's not being hold somewhere elsewhere...). Without it, kset_unregister() will wait till all the kobjects will be gone. Now, the fun part starts - if we quickly call pci_disable_msi() and, afterwards, pci_enable_msi() - we might fail because the msi_irqs kset is still there, waiting to unregister, and the sysfs dir is still active. It's used, for example, in tg3_open/tg3_close, which are ndo_open/close, and are called on enslave/deslave in bonding. What I get: [ 60.458319] WARNING: CPU: 0 PID: 5552 at fs/sysfs/dir.c:526 sysfs_add_one+0xbb/0xe0() [ 60.458350] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.5/0000:3f:00.0/msi_irqs' I'll take a deeper look at the issue, though any feedback/advise is welcome. And I'll hold on with the patchset that removes pci_dev_get/put and kobject_del. > >>> I also don't understand this nearby code (the same pattern appears in >>> free_msi_irqs()): >>> >>> out_unroll: >>> list_for_each_entry(entry, &pdev->msi_list, list) { >>> if (!count) >>> break; >>> kobject_del(&entry->kobj); >>> kobject_put(&entry->kobj); >>> count--; >>> } >>> >>> Why do we call kobject_del() here? The kobject_put() will call >>> kobject_del() anyway, so it looks redundant. >>> Documentation/kobject.txt says kobject_del() must be called explicitly >>> to break a circular reference, but I don't think we have that here. >>> >> I think thats exactly why I did it, because of the documentation. I agree >> however, it does look redundant. Harmless, but redundant. > >OK, thanks. I think we should remove it on the grounds that it's not >needed and removing it will make this code look more similar to other >callers of kobject_init_and_add(), which means bugs will have fewer >places to hide. > >Thanks, Neil! > >Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-26 12:25 ` Veaceslav Falico @ 2013-09-26 14:07 ` Veaceslav Falico 2013-09-26 22:16 ` Bjorn Helgaas 2013-09-26 14:40 ` Neil Horman 1 sibling, 1 reply; 17+ messages in thread From: Veaceslav Falico @ 2013-09-26 14:07 UTC (permalink / raw) To: Bjorn Helgaas Cc: Neil Horman, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Thu, Sep 26, 2013 at 02:25:52PM +0200, Veaceslav Falico wrote: >On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: >>On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: >>>On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: >>>>[+cc Neil (he added this code in da8d1c8ba4), Greg] >>>> >>>>On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >>>>> Before trying to kobject_init_and_add(), we add a reference to pdev via >>>>> pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we >>>>> don't return it back - even on out_unroll. >>>>> >>>>> Fix this by adding pci_dev_put(pdev) before going to unrolling section. >>>>> >>>>> CC: Bjorn Helgaas <bhelgaas@google.com> >>>>> CC: linux-pci@vger.kernel.org >>>>> CC: linux-kernel@vger.kernel.org >>>>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> >>>>> --- >>>>> drivers/pci/msi.c | 4 +++- >>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >>>>> index d5f90d6..14bf578 100644 >>>>> --- a/drivers/pci/msi.c >>>>> +++ b/drivers/pci/msi.c >>>>> @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >>>>> pci_dev_get(pdev); >>>>> ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, >>>>> "%u", entry->irq); >>>>> - if (ret) >>>>> + if (ret) { >>>>> + pci_dev_put(pdev); >>>>> goto out_unroll; >>>>> + } >>>>> >>>>> count++; >>>>> } >>>> >>>>I don't understand why this code does the pci_dev_get() in the first >>>>place. The pdev->msi_list of msi_desc structs is private to the >>>>pci_dev, and even without bumping the refcount, there should be no way >>>>for the pci_dev to be freed before the msi_desc. >>>> >>>Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that >>>people didn't try to remove the device prior to freeing all their interrupts >>>(i.e I didn't want a broken driver to go through its remove routine without >>>freeing all its irqs). That might have been the wrong thing to do, but thats >>>what bubbles to the front of my head when looking at this. >> >>That sounds plausible, but I think I'd rather deal with that by having >>the PCI core remove logic free all the interrupts. I *think* that's >>already in place, i.e., pci_free_resources() calls >>msi_remove_pci_irq_vectors(). So I propose that we remove the >>pci_dev_get()/put() unless we come up with a more compelling reason >>for it. > >As an update - I've found an interesting case why exactly that >kobject_del() might be needed: > >in kobject_del() it removes instantly the link to kset - via >kobj_kset_leave(), so that our kset remains without links and, thus, might >be instantly removed. > >So, with kobject_del(), our kset (msi_irqs sysfs dir) remains instantly >without any links (i.e. other kobjects) and, when we call kset_unregister() >- it exits instantly (if it's not being hold somewhere elsewhere...). > >Without it, kset_unregister() will wait till all the kobjects will be gone. > >Now, the fun part starts - if we quickly call pci_disable_msi() and, >afterwards, pci_enable_msi() - we might fail because the msi_irqs kset is >still there, waiting to unregister, and the sysfs dir is still active. > >It's used, for example, in tg3_open/tg3_close, which are ndo_open/close, >and are called on enslave/deslave in bonding. > >What I get: >[ 60.458319] WARNING: CPU: 0 PID: 5552 at fs/sysfs/dir.c:526 sysfs_add_one+0xbb/0xe0() >[ 60.458350] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.5/0000:3f:00.0/msi_irqs' > >I'll take a deeper look at the issue, though any feedback/advise is >welcome. And I'll hold on with the patchset that removes pci_dev_get/put >and kobject_del. Ok, this is only reproducible with CONFIG_DEBUG_KOBJECT_RELEASE, with or without removing kobject_del() (though it's harder to reproduce). I could not trigger it without CONFIG_DEBUG_KOBJECT_RELEASE, even with constantly poking /sys/class/net/tg3_device/device/msi_irqs/* . Though it's, obviously, possible, with and without cleanup and my previous bugfix. I'll then send now the cleanup, however this theoretical issue was, is and won't be fixed by it :-/. And I don't know how can we possible fix it without something like kobject_put_wait(). > > >> >>>>I also don't understand this nearby code (the same pattern appears in >>>>free_msi_irqs()): >>>> >>>> out_unroll: >>>> list_for_each_entry(entry, &pdev->msi_list, list) { >>>> if (!count) >>>> break; >>>> kobject_del(&entry->kobj); >>>> kobject_put(&entry->kobj); >>>> count--; >>>> } >>>> >>>>Why do we call kobject_del() here? The kobject_put() will call >>>>kobject_del() anyway, so it looks redundant. >>>>Documentation/kobject.txt says kobject_del() must be called explicitly >>>>to break a circular reference, but I don't think we have that here. >>>> >>> I think thats exactly why I did it, because of the documentation. I agree >>>however, it does look redundant. Harmless, but redundant. >> >>OK, thanks. I think we should remove it on the grounds that it's not >>needed and removing it will make this code look more similar to other >>callers of kobject_init_and_add(), which means bugs will have fewer >>places to hide. >> >>Thanks, Neil! >> >>Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-26 14:07 ` Veaceslav Falico @ 2013-09-26 22:16 ` Bjorn Helgaas 2013-09-26 23:05 ` Veaceslav Falico 0 siblings, 1 reply; 17+ messages in thread From: Bjorn Helgaas @ 2013-09-26 22:16 UTC (permalink / raw) To: Veaceslav Falico Cc: Neil Horman, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman, Russell King [+cc Russell] On Thu, Sep 26, 2013 at 04:07:51PM +0200, Veaceslav Falico wrote: > On Thu, Sep 26, 2013 at 02:25:52PM +0200, Veaceslav Falico wrote: > >On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: > >>On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: > >>>On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: > >>>>On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: ... > >As an update - I've found an interesting case why exactly that > >kobject_del() might be needed: > > > >in kobject_del() it removes instantly the link to kset - via > >kobj_kset_leave(), so that our kset remains without links and, thus, might > >be instantly removed. > > > >So, with kobject_del(), our kset (msi_irqs sysfs dir) remains instantly > >without any links (i.e. other kobjects) and, when we call kset_unregister() > >- it exits instantly (if it's not being hold somewhere elsewhere...). > > > >Without it, kset_unregister() will wait till all the kobjects will be gone. I don't see any waiting in kset_unregister(); all it does is a kobject_put(). > >Now, the fun part starts - if we quickly call pci_disable_msi() and, > >afterwards, pci_enable_msi() - we might fail because the msi_irqs kset is > >still there, waiting to unregister, and the sysfs dir is still active. > > > >It's used, for example, in tg3_open/tg3_close, which are ndo_open/close, > >and are called on enslave/deslave in bonding. > > > >What I get: > >[ 60.458319] WARNING: CPU: 0 PID: 5552 at fs/sysfs/dir.c:526 sysfs_add_one+0xbb/0xe0() > >[ 60.458350] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.5/0000:3f:00.0/msi_irqs' > > > >I'll take a deeper look at the issue, though any feedback/advise is > >welcome. And I'll hold on with the patchset that removes pci_dev_get/put > >and kobject_del. > > Ok, this is only reproducible with CONFIG_DEBUG_KOBJECT_RELEASE, with or > without removing kobject_del() (though it's harder to reproduce). I could > not trigger it without CONFIG_DEBUG_KOBJECT_RELEASE, even with constantly > poking /sys/class/net/tg3_device/device/msi_irqs/* . Though it's, > obviously, possible, with and without cleanup and my previous bugfix. > > I'll then send now the cleanup, however this theoretical issue was, is and > won't be fixed by it :-/. And I don't know how can we possible fix it > without something like kobject_put_wait(). I still think we should remove the kobject_del(). We don't want to make a race harder to hit; we want to remove it completely. What we really want is to make races *easier* to hit so we can find them, which seems to be the point of KOBJECT_RELEASE :) That said, I think I see why you see the warning in this case. You're calling pci_enable_msi(), pci_disable_msi(), pci_enable_msi() as in this call chain: pci_enable_msi msi_capability_init populate_msi_sysfs dev->msi_kset = kset_create_and_add("msi_irqs", ...) list_for_each_entry(entry, &dev->msi_list, ...) kobj->kset = dev->msi_kset kobject_init_and_add(kobj, &msi_irq_ktype, NULL, ...) kobject_add_internal kobject_get(&kobj->kset->kobj) # dev->msi_kset pci_disable_msi free_msi_irqs list_for_each_entry_safe(entry, ..., &dev->msi_list, ...) kobject_put(&entry->kobj) kobject_release ... <delayed> ... kobject_cleanup kobject_del kobj_kset_leave kset_put(kobj->kset) # dev->msi_kset kobject_put # happens AFTER pci_enable_msi() below t->release list_del(&entry->list) kset_unregister(dev->msi_kset) kobject_put kobject_release ... <delayed> ... kobject_cleanup # happens AFTER pci_enable_msi() below dev->msi_kset = NULL pci_enable_msi msi_capability_init populate_msi_sysfs dev->msi_kset = kset_create_and_add("msi_irqs", ...) "sysfs: cannot create duplicate filename .../msi_irqs" When we kset_create_and_add("msi_irqs") the second time, the delayed kobject_cleanups() for the msi_desc entries and the msi_kset have not yet occurred, so the msi_desc entries still hold references to the msi_kset, etc. I'm not sure if this is a design problem in the way PCI manages msi_kset and msi_desc entries, or if there's something wrong in the way KOBJECT_RELEASE is implemented. I could imagine changing it so the bulk of kobject_cleanup(), including the sysfs cleanup, is executed immediately when the last reference is dropped, but the kobj_type->release() function itself is delayed. Calling kobject_del() explicitly sort of side-steps this problem by doing the sysfs cleanup before the last put. But it is quite subtle, and it feels error-prone to rely on that. Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-26 22:16 ` Bjorn Helgaas @ 2013-09-26 23:05 ` Veaceslav Falico 0 siblings, 0 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-26 23:05 UTC (permalink / raw) To: Bjorn Helgaas Cc: Neil Horman, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman, Russell King On Thu, Sep 26, 2013 at 04:16:13PM -0600, Bjorn Helgaas wrote: >[+cc Russell] > >On Thu, Sep 26, 2013 at 04:07:51PM +0200, Veaceslav Falico wrote: >> On Thu, Sep 26, 2013 at 02:25:52PM +0200, Veaceslav Falico wrote: >> >On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: >> >>On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: >> >>>On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: >> >>>>On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >... >> >As an update - I've found an interesting case why exactly that >> >kobject_del() might be needed: >> > >> >in kobject_del() it removes instantly the link to kset - via >> >kobj_kset_leave(), so that our kset remains without links and, thus, might >> >be instantly removed. >> > >> >So, with kobject_del(), our kset (msi_irqs sysfs dir) remains instantly >> >without any links (i.e. other kobjects) and, when we call kset_unregister() >> >- it exits instantly (if it's not being hold somewhere elsewhere...). >> > >> >Without it, kset_unregister() will wait till all the kobjects will be gone. > >I don't see any waiting in kset_unregister(); all it does is a >kobject_put(). Sorry, didn't word it correctly - my thought was that kset_unregister() (which is, basicly, kobject_put()) will NOT unregister it instantly without kobject_del() in place, because the 'child' kobjects are still tied to this kset. > >> >Now, the fun part starts - if we quickly call pci_disable_msi() and, >> >afterwards, pci_enable_msi() - we might fail because the msi_irqs kset is >> >still there, waiting to unregister, and the sysfs dir is still active. >> > >> >It's used, for example, in tg3_open/tg3_close, which are ndo_open/close, >> >and are called on enslave/deslave in bonding. >> > >> >What I get: >> >[ 60.458319] WARNING: CPU: 0 PID: 5552 at fs/sysfs/dir.c:526 sysfs_add_one+0xbb/0xe0() >> >[ 60.458350] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.5/0000:3f:00.0/msi_irqs' >> > >> >I'll take a deeper look at the issue, though any feedback/advise is >> >welcome. And I'll hold on with the patchset that removes pci_dev_get/put >> >and kobject_del. >> >> Ok, this is only reproducible with CONFIG_DEBUG_KOBJECT_RELEASE, with or >> without removing kobject_del() (though it's harder to reproduce). I could >> not trigger it without CONFIG_DEBUG_KOBJECT_RELEASE, even with constantly >> poking /sys/class/net/tg3_device/device/msi_irqs/* . Though it's, >> obviously, possible, with and without cleanup and my previous bugfix. >> >> I'll then send now the cleanup, however this theoretical issue was, is and >> won't be fixed by it :-/. And I don't know how can we possible fix it >> without something like kobject_put_wait(). > >I still think we should remove the kobject_del(). We don't want to >make a race harder to hit; we want to remove it completely. What we >really want is to make races *easier* to hit so we can find them, >which seems to be the point of KOBJECT_RELEASE :) Agree, that's what I've done in the v2 patchset :). > >That said, I think I see why you see the warning in this case. >You're calling pci_enable_msi(), pci_disable_msi(), pci_enable_msi() >as in this call chain: > > pci_enable_msi > msi_capability_init > populate_msi_sysfs > dev->msi_kset = kset_create_and_add("msi_irqs", ...) > list_for_each_entry(entry, &dev->msi_list, ...) > kobj->kset = dev->msi_kset > kobject_init_and_add(kobj, &msi_irq_ktype, NULL, ...) > kobject_add_internal > kobject_get(&kobj->kset->kobj) # dev->msi_kset > > pci_disable_msi > free_msi_irqs > list_for_each_entry_safe(entry, ..., &dev->msi_list, ...) > kobject_put(&entry->kobj) > kobject_release > ... <delayed> ... > kobject_cleanup > kobject_del > kobj_kset_leave > kset_put(kobj->kset) # dev->msi_kset > kobject_put # happens AFTER pci_enable_msi() below > t->release > list_del(&entry->list) > kset_unregister(dev->msi_kset) > kobject_put > kobject_release > ... <delayed> ... > kobject_cleanup # happens AFTER pci_enable_msi() below > dev->msi_kset = NULL > > pci_enable_msi > msi_capability_init > populate_msi_sysfs > dev->msi_kset = kset_create_and_add("msi_irqs", ...) > "sysfs: cannot create duplicate filename .../msi_irqs" > >When we kset_create_and_add("msi_irqs") the second time, the delayed >kobject_cleanups() for the msi_desc entries and the msi_kset have >not yet occurred, so the msi_desc entries still hold references to >the msi_kset, etc. Exactly! > >I'm not sure if this is a design problem in the way PCI manages >msi_kset and msi_desc entries, or if there's something wrong in >the way KOBJECT_RELEASE is implemented. I could imagine changing >it so the bulk of kobject_cleanup(), including the sysfs cleanup, >is executed immediately when the last reference is dropped, but >the kobj_type->release() function itself is delayed. Yep, could be one of the possibilities - and it actually resembles what kobject_del() does :). But anyway, I agree that we shouldn't leave kobject_del(), cause it only hides the real problem, and not completely (if there are some other users who kobject_get(kset/kobject) - we're in trouble). > >Calling kobject_del() explicitly sort of side-steps this problem >by doing the sysfs cleanup before the last put. But it is quite >subtle, and it feels error-prone to rely on that. So, in my v2 patchset, I've removed the kobject_del(). It's really hard to trigger the bug without KOBJECT_DEBUG_RELEASE, still. > >Bjorn ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() 2013-09-26 12:25 ` Veaceslav Falico 2013-09-26 14:07 ` Veaceslav Falico @ 2013-09-26 14:40 ` Neil Horman 1 sibling, 0 replies; 17+ messages in thread From: Neil Horman @ 2013-09-26 14:40 UTC (permalink / raw) To: Veaceslav Falico Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Greg Kroah-Hartman On Thu, Sep 26, 2013 at 02:25:52PM +0200, Veaceslav Falico wrote: > On Wed, Sep 25, 2013 at 05:35:54PM -0600, Bjorn Helgaas wrote: > >On Wed, Sep 25, 2013 at 5:23 PM, Neil Horman <nhorman@tuxdriver.com> wrote: > >>On Wed, Sep 25, 2013 at 03:08:05PM -0600, Bjorn Helgaas wrote: > >>>[+cc Neil (he added this code in da8d1c8ba4), Greg] > >>> > >>>On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: > >>>> Before trying to kobject_init_and_add(), we add a reference to pdev via > >>>> pci_dev_get(pdev). However, if it fails to init and/or add the kobject, we > >>>> don't return it back - even on out_unroll. > >>>> > >>>> Fix this by adding pci_dev_put(pdev) before going to unrolling section. > >>>> > >>>> CC: Bjorn Helgaas <bhelgaas@google.com> > >>>> CC: linux-pci@vger.kernel.org > >>>> CC: linux-kernel@vger.kernel.org > >>>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > >>>> --- > >>>> drivers/pci/msi.c | 4 +++- > >>>> 1 file changed, 3 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > >>>> index d5f90d6..14bf578 100644 > >>>> --- a/drivers/pci/msi.c > >>>> +++ b/drivers/pci/msi.c > >>>> @@ -534,8 +534,10 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > >>>> pci_dev_get(pdev); > >>>> ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, > >>>> "%u", entry->irq); > >>>> - if (ret) > >>>> + if (ret) { > >>>> + pci_dev_put(pdev); > >>>> goto out_unroll; > >>>> + } > >>>> > >>>> count++; > >>>> } > >>> > >>>I don't understand why this code does the pci_dev_get() in the first > >>>place. The pdev->msi_list of msi_desc structs is private to the > >>>pci_dev, and even without bumping the refcount, there should be no way > >>>for the pci_dev to be freed before the msi_desc. > >>> > >>Its been a few years now, but IIRC I did the pci_dev_get/put here to ensure that > >>people didn't try to remove the device prior to freeing all their interrupts > >>(i.e I didn't want a broken driver to go through its remove routine without > >>freeing all its irqs). That might have been the wrong thing to do, but thats > >>what bubbles to the front of my head when looking at this. > > > >That sounds plausible, but I think I'd rather deal with that by having > >the PCI core remove logic free all the interrupts. I *think* that's > >already in place, i.e., pci_free_resources() calls > >msi_remove_pci_irq_vectors(). So I propose that we remove the > >pci_dev_get()/put() unless we come up with a more compelling reason > >for it. > > As an update - I've found an interesting case why exactly that > kobject_del() might be needed: > > in kobject_del() it removes instantly the link to kset - via > kobj_kset_leave(), so that our kset remains without links and, thus, might > be instantly removed. > > So, with kobject_del(), our kset (msi_irqs sysfs dir) remains instantly > without any links (i.e. other kobjects) and, when we call kset_unregister() > - it exits instantly (if it's not being hold somewhere elsewhere...). > > Without it, kset_unregister() will wait till all the kobjects will be gone. > > Now, the fun part starts - if we quickly call pci_disable_msi() and, > afterwards, pci_enable_msi() - we might fail because the msi_irqs kset is > still there, waiting to unregister, and the sysfs dir is still active. > > It's used, for example, in tg3_open/tg3_close, which are ndo_open/close, > and are called on enslave/deslave in bonding. > > What I get: > [ 60.458319] WARNING: CPU: 0 PID: 5552 at fs/sysfs/dir.c:526 sysfs_add_one+0xbb/0xe0() > [ 60.458350] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.5/0000:3f:00.0/msi_irqs' > > I'll take a deeper look at the issue, though any feedback/advise is > welcome. And I'll hold on with the patchset that removes pci_dev_get/put > and kobject_del. > > The origional post may offer some guidance here: https://lkml.org/lkml/2011/9/29/220 In particular the v3 update I think is relevant. Neil > > > >>>I also don't understand this nearby code (the same pattern appears in > >>>free_msi_irqs()): > >>> > >>> out_unroll: > >>> list_for_each_entry(entry, &pdev->msi_list, list) { > >>> if (!count) > >>> break; > >>> kobject_del(&entry->kobj); > >>> kobject_put(&entry->kobj); > >>> count--; > >>> } > >>> > >>>Why do we call kobject_del() here? The kobject_put() will call > >>>kobject_del() anyway, so it looks redundant. > >>>Documentation/kobject.txt says kobject_del() must be called explicitly > >>>to break a circular reference, but I don't think we have that here. > >>> > >> I think thats exactly why I did it, because of the documentation. I agree > >>however, it does look redundant. Harmless, but redundant. > > > >OK, thanks. I think we should remove it on the grounds that it's not > >needed and removing it will make this code look more similar to other > >callers of kobject_init_and_add(), which means bugs will have fewer > >places to hide. > > > >Thanks, Neil! > > > >Bjorn > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/3] msi: always unregister ->msi_kset within free_msi_irqs() 2013-09-17 1:47 [PATCH 0/3] msi: fix kobject/sysfs removal from msi_list Veaceslav Falico 2013-09-17 1:47 ` [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() Veaceslav Falico @ 2013-09-17 1:47 ` Veaceslav Falico 2013-09-17 1:47 ` [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject Veaceslav Falico 2 siblings, 0 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-17 1:47 UTC (permalink / raw) To: linux-kernel; +Cc: Veaceslav Falico, Bjorn Helgaas, linux-pci Currently we create and populate ->msi_kset while allocating irqs in populate_msi_sysfs(), however if it fails and/or we want to free the entries - we don't always remove it, and we might have problems if we've failed to allocate irqs and try it again. To fix that, move the unregister part to free_msi_irqs() and remove already existing ones. CC: Bjorn Helgaas <bhelgaas@google.com> CC: linux-pci@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Veaceslav Falico <vfalico@redhat.com> --- drivers/pci/msi.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 14bf578..68da921 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -388,6 +388,9 @@ static void free_msi_irqs(struct pci_dev *dev) list_del(&entry->list); kfree(entry); } + + kset_unregister(dev->msi_kset); + dev->msi_kset = NULL; } static struct msi_desc *alloc_msi_entry(struct pci_dev *dev) @@ -917,8 +920,6 @@ void pci_disable_msi(struct pci_dev *dev) pci_msi_shutdown(dev); free_msi_irqs(dev); - kset_unregister(dev->msi_kset); - dev->msi_kset = NULL; } EXPORT_SYMBOL(pci_disable_msi); @@ -1015,8 +1016,6 @@ void pci_disable_msix(struct pci_dev *dev) pci_msix_shutdown(dev); free_msi_irqs(dev); - kset_unregister(dev->msi_kset); - dev->msi_kset = NULL; } EXPORT_SYMBOL(pci_disable_msix); -- 1.8.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject 2013-09-17 1:47 [PATCH 0/3] msi: fix kobject/sysfs removal from msi_list Veaceslav Falico 2013-09-17 1:47 ` [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() Veaceslav Falico 2013-09-17 1:47 ` [PATCH 2/3] msi: always unregister ->msi_kset within free_msi_irqs() Veaceslav Falico @ 2013-09-17 1:47 ` Veaceslav Falico 2013-09-25 21:34 ` Bjorn Helgaas 2 siblings, 1 reply; 17+ messages in thread From: Veaceslav Falico @ 2013-09-17 1:47 UTC (permalink / raw) To: linux-kernel; +Cc: Veaceslav Falico, Bjorn Helgaas, linux-pci Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), however kobject_put() doesn't guarantee us that it was the last reference and that the kobj isn't used currently by someone else, so after we kfree(entry) with the struct kobject - other users will begin using the freed memory, instead of the actual kobject. Fix this by using the kobject->release callback, which is called last when the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the kobj itself after ->release() was called, so we're safe). In case we've failed to create the sysfs directories - just kfree() it - cause we don't have the kobjects attached. Also, remove the same functionality from populate_msi_sysfs(), cause on failure we anyway call free_msi_irqs(), which will take care of all the kobjects properly. CC: Bjorn Helgaas <bhelgaas@google.com> CC: linux-pci@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Veaceslav Falico <vfalico@redhat.com> --- drivers/pci/msi.c | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 68da921..c9236e4 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -374,19 +374,22 @@ static void free_msi_irqs(struct pci_dev *dev) iounmap(entry->mask_base); } + list_del(&entry->list); + /* * Its possible that we get into this path * When populate_msi_sysfs fails, which means the entries * were not registered with sysfs. In that case don't - * unregister them. + * unregister them, and just free. Otherwise the + * kobject->release will take care of freeing the entry via + * msi_kobj_release(). */ if (entry->kobj.parent) { kobject_del(&entry->kobj); kobject_put(&entry->kobj); + } else { + kfree(entry); } - - list_del(&entry->list); - kfree(entry); } kset_unregister(dev->msi_kset); @@ -512,6 +515,7 @@ static void msi_kobj_release(struct kobject *kobj) struct msi_desc *entry = to_msi_desc(kobj); pci_dev_put(entry->dev); + kfree(entry); } static struct kobj_type msi_irq_ktype = { @@ -525,7 +529,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev) struct msi_desc *entry; struct kobject *kobj; int ret; - int count = 0; pdev->msi_kset = kset_create_and_add("msi_irqs", NULL, &pdev->dev.kobj); if (!pdev->msi_kset) @@ -539,23 +542,11 @@ static int populate_msi_sysfs(struct pci_dev *pdev) "%u", entry->irq); if (ret) { pci_dev_put(pdev); - goto out_unroll; + return ret; } - - count++; } return 0; - -out_unroll: - list_for_each_entry(entry, &pdev->msi_list, list) { - if (!count) - break; - kobject_del(&entry->kobj); - kobject_put(&entry->kobj); - count--; - } - return ret; } /** -- 1.8.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject 2013-09-17 1:47 ` [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject Veaceslav Falico @ 2013-09-25 21:34 ` Bjorn Helgaas 2013-09-25 22:12 ` Veaceslav Falico 0 siblings, 1 reply; 17+ messages in thread From: Bjorn Helgaas @ 2013-09-25 21:34 UTC (permalink / raw) To: Veaceslav Falico; +Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: > Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), > however kobject_put() doesn't guarantee us that it was the last reference > and that the kobj isn't used currently by someone else, so after we > kfree(entry) with the struct kobject - other users will begin using the > freed memory, instead of the actual kobject. > > Fix this by using the kobject->release callback, which is called last when > the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), > which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the > kobj itself after ->release() was called, so we're safe). > > In case we've failed to create the sysfs directories - just kfree() > it - cause we don't have the kobjects attached. > > Also, remove the same functionality from populate_msi_sysfs(), cause on > failure we anyway call free_msi_irqs(), which will take care of all the > kobjects properly. I agree; it looks like populate_msi_sysfs() doesn't need to have the cleanup in it. Can you split that into a separate patch, since I don't think it depends on the kfree() fix? Bjorn > CC: Bjorn Helgaas <bhelgaas@google.com> > CC: linux-pci@vger.kernel.org > CC: linux-kernel@vger.kernel.org > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > --- > drivers/pci/msi.c | 27 +++++++++------------------ > 1 file changed, 9 insertions(+), 18 deletions(-) > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index 68da921..c9236e4 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -374,19 +374,22 @@ static void free_msi_irqs(struct pci_dev *dev) > iounmap(entry->mask_base); > } > > + list_del(&entry->list); > + > /* > * Its possible that we get into this path > * When populate_msi_sysfs fails, which means the entries > * were not registered with sysfs. In that case don't > - * unregister them. > + * unregister them, and just free. Otherwise the > + * kobject->release will take care of freeing the entry via > + * msi_kobj_release(). > */ > if (entry->kobj.parent) { > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > + } else { > + kfree(entry); > } > - > - list_del(&entry->list); > - kfree(entry); > } > > kset_unregister(dev->msi_kset); > @@ -512,6 +515,7 @@ static void msi_kobj_release(struct kobject *kobj) > struct msi_desc *entry = to_msi_desc(kobj); > > pci_dev_put(entry->dev); > + kfree(entry); > } > > static struct kobj_type msi_irq_ktype = { > @@ -525,7 +529,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > struct msi_desc *entry; > struct kobject *kobj; > int ret; > - int count = 0; > > pdev->msi_kset = kset_create_and_add("msi_irqs", NULL, &pdev->dev.kobj); > if (!pdev->msi_kset) > @@ -539,23 +542,11 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > "%u", entry->irq); > if (ret) { > pci_dev_put(pdev); > - goto out_unroll; > + return ret; > } > - > - count++; > } > > return 0; > - > -out_unroll: > - list_for_each_entry(entry, &pdev->msi_list, list) { > - if (!count) > - break; > - kobject_del(&entry->kobj); > - kobject_put(&entry->kobj); > - count--; > - } > - return ret; > } > > /** > -- > 1.8.4 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject 2013-09-25 21:34 ` Bjorn Helgaas @ 2013-09-25 22:12 ` Veaceslav Falico 0 siblings, 0 replies; 17+ messages in thread From: Veaceslav Falico @ 2013-09-25 22:12 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Wed, Sep 25, 2013 at 03:34:58PM -0600, Bjorn Helgaas wrote: >On Mon, Sep 16, 2013 at 7:47 PM, Veaceslav Falico <vfalico@redhat.com> wrote: >> Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), >> however kobject_put() doesn't guarantee us that it was the last reference >> and that the kobj isn't used currently by someone else, so after we >> kfree(entry) with the struct kobject - other users will begin using the >> freed memory, instead of the actual kobject. >> >> Fix this by using the kobject->release callback, which is called last when >> the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), >> which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the >> kobj itself after ->release() was called, so we're safe). >> >> In case we've failed to create the sysfs directories - just kfree() >> it - cause we don't have the kobjects attached. >> >> Also, remove the same functionality from populate_msi_sysfs(), cause on >> failure we anyway call free_msi_irqs(), which will take care of all the >> kobjects properly. > >I agree; it looks like populate_msi_sysfs() doesn't need to have the >cleanup in it. Can you split that into a separate patch, since I >don't think it depends on the kfree() fix? Yep, will do and re-send in two patchsets. Thank you! > >Bjorn > >> CC: Bjorn Helgaas <bhelgaas@google.com> >> CC: linux-pci@vger.kernel.org >> CC: linux-kernel@vger.kernel.org >> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> >> --- >> drivers/pci/msi.c | 27 +++++++++------------------ >> 1 file changed, 9 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >> index 68da921..c9236e4 100644 >> --- a/drivers/pci/msi.c >> +++ b/drivers/pci/msi.c >> @@ -374,19 +374,22 @@ static void free_msi_irqs(struct pci_dev *dev) >> iounmap(entry->mask_base); >> } >> >> + list_del(&entry->list); >> + >> /* >> * Its possible that we get into this path >> * When populate_msi_sysfs fails, which means the entries >> * were not registered with sysfs. In that case don't >> - * unregister them. >> + * unregister them, and just free. Otherwise the >> + * kobject->release will take care of freeing the entry via >> + * msi_kobj_release(). >> */ >> if (entry->kobj.parent) { >> kobject_del(&entry->kobj); >> kobject_put(&entry->kobj); >> + } else { >> + kfree(entry); >> } >> - >> - list_del(&entry->list); >> - kfree(entry); >> } >> >> kset_unregister(dev->msi_kset); >> @@ -512,6 +515,7 @@ static void msi_kobj_release(struct kobject *kobj) >> struct msi_desc *entry = to_msi_desc(kobj); >> >> pci_dev_put(entry->dev); >> + kfree(entry); >> } >> >> static struct kobj_type msi_irq_ktype = { >> @@ -525,7 +529,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >> struct msi_desc *entry; >> struct kobject *kobj; >> int ret; >> - int count = 0; >> >> pdev->msi_kset = kset_create_and_add("msi_irqs", NULL, &pdev->dev.kobj); >> if (!pdev->msi_kset) >> @@ -539,23 +542,11 @@ static int populate_msi_sysfs(struct pci_dev *pdev) >> "%u", entry->irq); >> if (ret) { >> pci_dev_put(pdev); >> - goto out_unroll; >> + return ret; >> } >> - >> - count++; >> } >> >> return 0; >> - >> -out_unroll: >> - list_for_each_entry(entry, &pdev->msi_list, list) { >> - if (!count) >> - break; >> - kobject_del(&entry->kobj); >> - kobject_put(&entry->kobj); >> - count--; >> - } >> - return ret; >> } >> >> /** >> -- >> 1.8.4 >> ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-09-26 23:07 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-09-17 1:47 [PATCH 0/3] msi: fix kobject/sysfs removal from msi_list Veaceslav Falico 2013-09-17 1:47 ` [PATCH 1/3] msi: add forgotten pci_dev_put(pdev) to populate_msi_sysfs() Veaceslav Falico 2013-09-25 21:08 ` Bjorn Helgaas 2013-09-25 21:30 ` Bjorn Helgaas 2013-09-25 22:09 ` Veaceslav Falico 2013-09-25 23:23 ` Neil Horman 2013-09-25 23:35 ` Bjorn Helgaas 2013-09-26 9:27 ` Veaceslav Falico 2013-09-26 12:25 ` Veaceslav Falico 2013-09-26 14:07 ` Veaceslav Falico 2013-09-26 22:16 ` Bjorn Helgaas 2013-09-26 23:05 ` Veaceslav Falico 2013-09-26 14:40 ` Neil Horman 2013-09-17 1:47 ` [PATCH 2/3] msi: always unregister ->msi_kset within free_msi_irqs() Veaceslav Falico 2013-09-17 1:47 ` [PATCH 3/3] msi: free msi_desc entry only after we've released the kobject Veaceslav Falico 2013-09-25 21:34 ` Bjorn Helgaas 2013-09-25 22:12 ` Veaceslav Falico
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).