* In place kexec @ 2010-07-28 21:57 H. Peter Anvin 2010-07-28 22:02 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-28 21:57 UTC (permalink / raw) To: Eric Biederman, Simon Horman, kexec@lists.infradead.org We are getting a claim that the qla driver corrupts memory after a kexec, apparently due to a DMA engine left running in the before-kernel. For an in-place kexec (as opposed to a crash dump kexec, where we switch into dedicated memory), what shutdown paths get executed? -hpa _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-28 21:57 In place kexec H. Peter Anvin @ 2010-07-28 22:02 ` Eric W. Biederman 2010-07-29 13:43 ` Neil Horman 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-28 22:02 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Simon Horman, kexec@lists.infradead.org "H. Peter Anvin" <hpa@zytor.com> writes: > We are getting a claim that the qla driver corrupts memory after a > kexec, apparently due to a DMA engine left running in the before-kernel. > > For an in-place kexec (as opposed to a crash dump kexec, where we switch > into dedicated memory), what shutdown paths get executed? It is the normal reboot path, so the device shutdown method gets executed. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-28 22:02 ` Eric W. Biederman @ 2010-07-29 13:43 ` Neil Horman 2010-07-29 15:03 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: Neil Horman @ 2010-07-29 13:43 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Simon Horman, kexec@lists.infradead.org, H. Peter Anvin On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote: > "H. Peter Anvin" <hpa@zytor.com> writes: > > > We are getting a claim that the qla driver corrupts memory after a > > kexec, apparently due to a DMA engine left running in the before-kernel. > > > > For an in-place kexec (as opposed to a crash dump kexec, where we switch > > into dedicated memory), what shutdown paths get executed? > > It is the normal reboot path, so the device shutdown method gets > executed. > > Eric > Check your iommu. We've had lots of problems with them in the past, and in the crash path we explicitly leave the iommu on now, whereas the normal shutdown path turns it off. If some other dma-capable device doesn't shut down properly and keeps dma operations going, you're liable to get memory corruption when the iommu re-initializes. Neil _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 13:43 ` Neil Horman @ 2010-07-29 15:03 ` H. Peter Anvin 2010-07-29 15:06 ` Neil Horman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-29 15:03 UTC (permalink / raw) To: Neil Horman; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman On 07/29/2010 06:43 AM, Neil Horman wrote: > On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote: >> "H. Peter Anvin" <hpa@zytor.com> writes: >> >>> We are getting a claim that the qla driver corrupts memory after a >>> kexec, apparently due to a DMA engine left running in the before-kernel. >>> >>> For an in-place kexec (as opposed to a crash dump kexec, where we switch >>> into dedicated memory), what shutdown paths get executed? >> >> It is the normal reboot path, so the device shutdown method gets >> executed. >> >> Eric >> > Check your iommu. We've had lots of problems with them in the past, and in the > crash path we explicitly leave the iommu on now, whereas the normal shutdown path > turns it off. If some other dma-capable device doesn't shut down properly and > keeps dma operations going, you're liable to get memory corruption when the > iommu re-initializes. > Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place clearly separated, please? The qla driver issue is supposed to be kexec-in-place, and it sounds like you're talking about kexec-on-crash. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 15:03 ` H. Peter Anvin @ 2010-07-29 15:06 ` Neil Horman 2010-07-29 17:51 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: Neil Horman @ 2010-07-29 15:06 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman On Thu, Jul 29, 2010 at 08:03:47AM -0700, H. Peter Anvin wrote: > On 07/29/2010 06:43 AM, Neil Horman wrote: > > On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote: > >> "H. Peter Anvin" <hpa@zytor.com> writes: > >> > >>> We are getting a claim that the qla driver corrupts memory after a > >>> kexec, apparently due to a DMA engine left running in the before-kernel. > >>> > >>> For an in-place kexec (as opposed to a crash dump kexec, where we switch > >>> into dedicated memory), what shutdown paths get executed? > >> > >> It is the normal reboot path, so the device shutdown method gets > >> executed. > >> > >> Eric > >> > > Check your iommu. We've had lots of problems with them in the past, and in the > > crash path we explicitly leave the iommu on now, whereas the normal shutdown path > > turns it off. If some other dma-capable device doesn't shut down properly and > > keeps dma operations going, you're liable to get memory corruption when the > > iommu re-initializes. > > > > Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place > clearly separated, please? The qla driver issue is supposed to be > kexec-in-place, and it sounds like you're talking about kexec-on-crash. > No, I'm just indicating a difference between the two paths, and I'm doing so because we used to have similar dma problems in the crash path, which we resolved by not turning off the iommu during shutdown, which is different from the in-place path. Just trying to give you some thoughts about where to look for your problem. Neil > -hpa > > -- > H. Peter Anvin, Intel Open Source Technology Center > I work for Intel. I don't speak on their behalf.
> _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 15:06 ` Neil Horman @ 2010-07-29 17:51 ` H. Peter Anvin 2010-07-29 18:06 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-29 17:51 UTC (permalink / raw) To: Neil Horman; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman On 07/29/2010 08:06 AM, Neil Horman wrote: >> >> Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place >> clearly separated, please? The qla driver issue is supposed to be >> kexec-in-place, and it sounds like you're talking about kexec-on-crash. >> > No, I'm just indicating a difference between the two paths, and I'm doing so > because we used to have similar dma problems in the crash path, which we > resolved by not turning off the iommu during shutdown, which is different from > the in-place path. Just trying to give you some thoughts about where to look Fair enough... just wanted to flag this as a problem, because it has already been the source of a lot of confusion. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 17:51 ` H. Peter Anvin @ 2010-07-29 18:06 ` Eric W. Biederman 2010-07-29 18:29 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-29 18:06 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Neil Horman, Simon Horman, kexec@lists.infradead.org "H. Peter Anvin" <hpa@zytor.com> writes: > On 07/29/2010 08:06 AM, Neil Horman wrote: >>> >>> Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place >>> clearly separated, please? The qla driver issue is supposed to be >>> kexec-in-place, and it sounds like you're talking about kexec-on-crash. >>> >> No, I'm just indicating a difference between the two paths, and I'm doing so >> because we used to have similar dma problems in the crash path, which we >> resolved by not turning off the iommu during shutdown, which is different from >> the in-place path. Just trying to give you some thoughts about where to look > > Fair enough... just wanted to flag this as a problem, because it has > already been the source of a lot of confusion. Thinking about this I am a bit surprised that you would find DMA left on from a disk driver. Historically disks have been pretty good about shutting off in this scenario. Added to that typically we unmount all filesystems. Calling rmmod on the driver before the final kexec --exec could be interesting, and drivers much more reliably implement .remove than .shutdown. Network drivers are more likely to be a problem, but we should be downing all of the network interfaces before something happens. All of which is to say kexec-in-place has generally been a lot less hassle, because it is so similar to the normal case. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 18:06 ` Eric W. Biederman @ 2010-07-29 18:29 ` H. Peter Anvin 2010-07-29 19:16 ` Vivek Goyal 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-29 18:29 UTC (permalink / raw) To: Eric W. Biederman Cc: linux-driver, Neil Horman, Simon Horman, kexec@lists.infradead.org, Andrew Vasquez On 07/29/2010 11:06 AM, Eric W. Biederman wrote: > > Thinking about this I am a bit surprised that you would find > DMA left on from a disk driver. Historically disks have been > pretty good about shutting off in this scenario. > > Added to that typically we unmount all filesystems. > > Calling rmmod on the driver before the final kexec --exec > could be interesting, and drivers much more reliably implement > .remove than .shutdown. > > Network drivers are more likely to be a problem, but we should be > downing all of the network interfaces before something happens. > > All of which is to say kexec-in-place has generally been a lot > less hassle, because it is so similar to the normal case. > In particular, the supposed corruption comes from the "firmware logging" feature in the qla2xxx driver. I'd really like to understand if this is a kexec problem or a qla2xxx problem. -hpa _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 18:29 ` H. Peter Anvin @ 2010-07-29 19:16 ` Vivek Goyal 2010-07-29 19:51 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: Vivek Goyal @ 2010-07-29 19:16 UTC (permalink / raw) To: H. Peter Anvin Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver, Andrew Vasquez On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote: > On 07/29/2010 11:06 AM, Eric W. Biederman wrote: > > > > Thinking about this I am a bit surprised that you would find > > DMA left on from a disk driver. Historically disks have been > > pretty good about shutting off in this scenario. > > > > Added to that typically we unmount all filesystems. > > > > Calling rmmod on the driver before the final kexec --exec > > could be interesting, and drivers much more reliably implement > > .remove than .shutdown. > > > > Network drivers are more likely to be a problem, but we should be > > downing all of the network interfaces before something happens. > > > > All of which is to say kexec-in-place has generally been a lot > > less hassle, because it is so similar to the normal case. > > > > In particular, the supposed corruption comes from the "firmware logging" > feature in the qla2xxx driver. I'd really like to understand if this is > a kexec problem or a qla2xxx problem. > kernel_kexec() kernel_restart_prepare() device_shutdown() I would suspect it to be a qla2xxx driver problem that it did not shut down the device properly. Vivek _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 19:16 ` Vivek Goyal @ 2010-07-29 19:51 ` Eric W. Biederman 2010-07-29 19:55 ` Randy Dunlap 2010-07-29 20:06 ` H. Peter Anvin 0 siblings, 2 replies; 36+ messages in thread From: Eric W. Biederman @ 2010-07-29 19:51 UTC (permalink / raw) To: Vivek Goyal Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver Vivek Goyal <vgoyal@redhat.com> writes: > On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote: >> On 07/29/2010 11:06 AM, Eric W. Biederman wrote: >> > >> > Thinking about this I am a bit surprised that you would find >> > DMA left on from a disk driver. Historically disks have been >> > pretty good about shutting off in this scenario. >> > >> > Added to that typically we unmount all filesystems. >> > >> > Calling rmmod on the driver before the final kexec --exec >> > could be interesting, and drivers much more reliably implement >> > .remove than .shutdown. >> > >> > Network drivers are more likely to be a problem, but we should be >> > downing all of the network interfaces before something happens. >> > >> > All of which is to say kexec-in-place has generally been a lot >> > less hassle, because it is so similar to the normal case. >> > >> >> In particular, the supposed corruption comes from the "firmware logging" >> feature in the qla2xxx driver. I'd really like to understand if this is >> a kexec problem or a qla2xxx problem. >> > > kernel_kexec() > kernel_restart_prepare() > device_shutdown() > > I would suspect it to be a qla2xxx driver problem that it did not shut > down the device properly. And device_shutdown calls every driver's .shutdown method. Things like this are always a driver problem. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
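The call chain Vivek lists, together with Eric's point that device_shutdown calls each driver's .shutdown method, can be sketched as a toy user-space model. This is only an illustration under stated assumptions: the `model_*` names and the `dma_running` flag are hypothetical, none of this is kernel code, and the real device_shutdown() does more (for instance it dispatches through the bus as well as the driver). The point it tries to show is that a device whose driver registers no hook is simply skipped and keeps doing whatever it was doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; these types are NOT the kernel's. */
struct model_device {
    const char *name;
    bool dma_running;                           /* device still bus mastering? */
    void (*shutdown)(struct model_device *dev); /* NULL if the driver has none */
};

/* What a driver's .shutdown is expected to do: quiesce the hardware. */
static void model_quiesce(struct model_device *dev)
{
    dev->dma_running = false;
}

/*
 * Stand-in for device_shutdown(): walk every device and call its
 * .shutdown hook when one is registered. There is no fallback in this
 * model, so a device without a hook is left exactly as it was.
 * Returns the number of devices still doing DMA afterwards.
 */
static size_t model_device_shutdown(struct model_device *devs, size_t n)
{
    size_t still_running = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].shutdown)
            devs[i].shutdown(&devs[i]);
        if (devs[i].dma_running)
            still_running++;
    }
    return still_running;
}

/* Two devices: one driver with a hook, one without. */
static size_t model_demo(void)
{
    struct model_device devs[] = {
        { "disk_with_hook",   true, model_quiesce },
        { "hba_without_hook", true, NULL },
    };
    return model_device_shutdown(devs, 2);
}
```

Calling `model_demo()` leaves one device with DMA still running, which is the situation the next kernel would inherit after an in-place kexec.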
* Re: In place kexec 2010-07-29 19:51 ` Eric W. Biederman @ 2010-07-29 19:55 ` Randy Dunlap 2010-07-30 3:38 ` H. Peter Anvin 2010-07-29 20:06 ` H. Peter Anvin 1 sibling, 1 reply; 36+ messages in thread From: Randy Dunlap @ 2010-07-29 19:55 UTC (permalink / raw) To: Eric W. Biederman Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal On Thu, 29 Jul 2010 12:51:09 -0700 Eric W. Biederman wrote: > Vivek Goyal <vgoyal@redhat.com> writes: > > > On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote: > >> On 07/29/2010 11:06 AM, Eric W. Biederman wrote: > >> > > >> > Thinking about this I am a bit surprised that you would find > >> > DMA left on from a disk driver. Historically disks have been > >> > pretty good about shutting off in this scenario. > >> > > >> > Added to that typically we unmount all filesystems. > >> > > >> > Calling rmmod on the driver before the final kexec --exec > >> > could be interesting, and drivers much more reliably implement > >> > .remove than .shutdown. > >> > > >> > Network drivers are more likely to be a problem, but we should be > >> > downing all of the network interfaces before something happens. > >> > > >> > All of which is to say kexec-in-place has generally been a lot > >> > less hassle, because it is so similar to the normal case. > >> > > >> > >> In particular, the supposed corruption comes from the "firmware logging" > >> feature in the qla2xxx driver. I'd really like to understand if this is > >> a kexec problem or a qla2xxx problem. > >> > > > > kernel_kexec() > > kernel_restart_prepare() > > device_shutdown() > > > > I would suspect it to be a qla2xxx driver problem that it did not shut > > down the device properly. > > And device_shutdown calls every driver's .shutdown method. > > Things like this are always a driver problem. so is there a default .shutdown method for drivers that do not specify one? like the qla2xxx driver does not.
--- ~Randy *** Remember to use Documentation/SubmitChecklist when testing your code *** _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-29 19:55 ` Randy Dunlap @ 2010-07-30 3:38 ` H. Peter Anvin 2010-07-30 4:41 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 3:38 UTC (permalink / raw) To: Randy Dunlap Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver, Vivek Goyal, Andrew Vasquez On 07/29/2010 12:55 PM, Randy Dunlap wrote: >> >> And device_shutdown calls every driver's .shutdown method. >> >> Things like this are always a driver problem. > > so is there a default .shutdown method for drivers that do not specify one? > > like the qla2xxx driver does not. > If it doesn't, even if bus mastering gets shut off at the core level, there is a risk that it clobbers data when it turns it back on if the initialization sequence is problematic. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 3:38 ` H. Peter Anvin @ 2010-07-30 4:41 ` Eric W. Biederman 2010-07-30 5:04 ` H. Peter Anvin 2010-07-30 16:53 ` David Woodhouse 0 siblings, 2 replies; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 4:41 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal "H. Peter Anvin" <hpa@zytor.com> writes: > On 07/29/2010 12:55 PM, Randy Dunlap wrote: >>> >>> And device_shutdown calls every driver's .shutdown method. >>> >>> Things like this are always a driver problem. >> >> so is there a default .shutdown method for drivers that do not specify one? >> >> like the qla2xxx driver does not. >> > > If it doesn't, even if bus mastering gets shut off at the core level, > there is a risk that it clobbers data when it turns it back on if the > initialization sequence is problematic. There isn't a bus master shut off at the core level. When we did the original analysis it turned out that the bus mastering bit was implemented on a lot of devices in an advisory way, so it didn't make sense to count on it. That said it looks like the code to do the shutdown is in qla2x00_remove_one so it shouldn't be too hard if someone cared to extract just the hardware bits. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 4:41 ` Eric W. Biederman @ 2010-07-30 5:04 ` H. Peter Anvin 2010-07-30 16:30 ` Eric W. Biederman 2010-07-30 16:53 ` David Woodhouse 1 sibling, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 5:04 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal On 07/29/2010 09:41 PM, Eric W. Biederman wrote: > > There isn't a bus master shut off at the core level. When we did > the original analysis it turned out that the bus mastering bit > was implemented on a lot of devices in an advisory way, so it didn't > make sense to count on it. > But does it make sense to not flip the bit for the cases where it is implemented properly? > That said it looks like the code to do the shutdown is in > qla2x00_remove_one so it shouldn't be too hard if someone cared to > extract just the hardware bits. Charming. Code is there, just not hooked up. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 5:04 ` H. Peter Anvin @ 2010-07-30 16:30 ` Eric W. Biederman 2010-07-30 16:41 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 16:30 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal "H. Peter Anvin" <hpa@zytor.com> writes: > On 07/29/2010 09:41 PM, Eric W. Biederman wrote: >> >> There isn't a bus master shut off at the core level. When we did >> the original analysis it turned out that the bus mastering bit >> was implemented on a lot of devices in an advisory way, so it didn't >> make sense to count on it. >> > > But does it make sense to not flip the bit for the cases where it is > implemented properly? It is probably worth looking into again. I think it was 5+ years ago when that determination was made. >> That said it looks like the code to do the shutdown is in >> qla2x00_remove_one so it shouldn't be too hard if someone cared to >> extract just the hardware bits. > > Charming. Code is there, just not hooked up. Using the .remove method in reboot is a fight I lost long ago. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 16:30 ` Eric W. Biederman @ 2010-07-30 16:41 ` H. Peter Anvin 2010-07-30 18:36 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 16:41 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal On 07/30/2010 09:30 AM, Eric W. Biederman wrote: > >>> That said it looks like the code to do the shutdown is in >>> qla2x00_remove_one so it shouldn't be too hard if someone cared to >>> extract just the hardware bits. >> >> Charming. Code is there, just not hooked up. > > Using the .remove method in reboot is a fight I lost long ago. > Could you elucidate, please? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 16:41 ` H. Peter Anvin @ 2010-07-30 18:36 ` Eric W. Biederman 2010-07-30 22:52 ` Andrew Vasquez 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 18:36 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal "H. Peter Anvin" <hpa@zytor.com> writes: > On 07/30/2010 09:30 AM, Eric W. Biederman wrote: >> >>>> That said it looks like the code to do the shutdown is in >>>> qla2x00_remove_one so it shouldn't be too hard if someone cared to >>>> extract just the hardware bits. >>> >>> Charming. Code is there, just not hooked up. >> >> Using the .remove method in reboot is a fight I lost long ago. >> > > Could you elucidate, please? My original proposal was for device_shutdown to call the .remove methods as those are well exercised and tested in development. aka rmmod. It was argued (with some merit) that for a system reboot we don't want to perform all of the subsystem registration work, to make it more likely that reboot -f will reboot even if there is a kernel oops. What I proposed and unfortunately failed to write the patch for at the time was to have the device remove path call shutdown before calling remove, so drivers wouldn't have to code it all up twice. A lot of the disk drivers implement .shutdown these days and there aren't many bug reports about kexec failing. So I would be reluctant to change things other than on a driver by driver basis unless I had a lot of time for testing etc. It might be worth playing with adding a pci_clear_master in pci_device_shutdown. It has the potential to break things like usb keyboards, so I would be careful. If it doesn't break fundamental things like usb a pci_clear_master when shutting down devices should improve reliability somewhat. And of course there is the old staple of workarounds: "rmmod <driver>" before calling kexec --exec.
Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 18:36 ` Eric W. Biederman @ 2010-07-30 22:52 ` Andrew Vasquez 2010-07-30 23:25 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: Andrew Vasquez @ 2010-07-30 22:52 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, H. Peter Anvin, Linux Driver, Vivek Goyal On Fri, 30 Jul 2010, Eric W. Biederman wrote: > "H. Peter Anvin" <hpa@zytor.com> writes: > > > On 07/30/2010 09:30 AM, Eric W. Biederman wrote: > >> > >>>> That said it looks like the code to do the shutdown is in > >>>> qla2x00_remove_one so it shouldn't be too hard if someone cared to > >>>> extract just the hardware bits. > >>> > >>> Charming. Code is there, just not hooked up. > >> > >> Using the .remove method in reboot is a fight I lost long ago. > >> > > > > Could you elucidate, please? > > My original proposal was for device_shutdown to call the .remove > methods as those are well exercised and tested in development. aka > rmmod. > > It was argued (with some merit) that for a system reboot we don't want > to perform all of the subsystem registration work, to make it more > likely that reboot -f will reboot even if there is a kernel oops. > > What I proposed and unfortunately failed to write the patch for at the > time was to have the device remove path call shutdown before calling > remove, so drivers wouldn't have to code it all up twice. > > A lot of the disk drivers implement .shutdown these days and there aren't > many bug reports about kexec failing. So I would be reluctant to change > things other than on a driver by driver basis unless I had a lot of time > for testing etc. > > It might be worth playing with adding a pci_clear_master in > pci_device_shutdown. It has the potential to break things like usb > keyboards, so I would be careful. If it doesn't break fundamental > things like usb a pci_clear_master when shutting down devices should > improve reliability somewhat.
> > And of course there is the old staple of workarounds: "rmmod <driver>" > before calling kexec --exec. Looking through all these emails, what's the upshot here? Is the expectation for all storage drivers to start implementing some 'minimal' level of shutdown with the hardware/firmware during the .shutdown callback? -- Andrew Vasquez _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 22:52 ` Andrew Vasquez @ 2010-07-30 23:25 ` H. Peter Anvin 2010-07-30 23:40 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 23:25 UTC (permalink / raw) To: Andrew Vasquez Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, Linux Driver, Vivek Goyal On 07/30/2010 03:52 PM, Andrew Vasquez wrote: > > Looking through all these emails, what's the upshot here? Is the > expectation for all storage drivers to start implementing some > 'minimal' level of shutdown with the hardware/firmware during the > .shutdown callback? > I believe so. It seems to be a fundamental requirement for kexec to function. -hpa _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 23:25 ` H. Peter Anvin @ 2010-07-30 23:40 ` Eric W. Biederman 0 siblings, 0 replies; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 23:40 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, Linux Driver, Vivek Goyal "H. Peter Anvin" <hpa@zytor.com> writes: > On 07/30/2010 03:52 PM, Andrew Vasquez wrote: >> >> Looking through all these emails, what's the upshot here? Is the >> expectation for all storage drivers to start implementing some >> 'minimal' level of shutdown with the hardware/firmware during the >> .shutdown callback? >> > > I believe so. It seems to be a fundamental requirement for kexec to > function. Yes. Implementing a .shutdown method is the solution we have, and the requirement has been stable for several years. I did a quick grep through drivers/scsi and a lot of the storage drivers already implement the .shutdown method. Beyond not leaving DMAs running, which can foul up kexec, there is also the need to ensure any drive caches are flushed on reboot. I know ide/sata drivers have been handling this case in ide_gd_shutdown for a long time to ensure the drives' write-back caches are flushed. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 4:41 ` Eric W. Biederman 2010-07-30 5:04 ` H. Peter Anvin @ 2010-07-30 16:53 ` David Woodhouse 2010-07-30 18:21 ` Eric W. Biederman 2010-07-30 20:42 ` H. Peter Anvin 1 sibling, 2 replies; 36+ messages in thread From: David Woodhouse @ 2010-07-30 16:53 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: > There isn't a bus master shut off at the core level. Effectively, there is if you have an IOMMU. -- dwmw2 _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 16:53 ` David Woodhouse @ 2010-07-30 18:21 ` Eric W. Biederman 2010-07-30 18:34 ` Vivek Goyal 2010-07-30 20:42 ` H. Peter Anvin 1 sibling, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 18:21 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal David Woodhouse <dwmw2@infradead.org> writes: > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: >> There isn't a bus master shut off at the core level. > > Effectively, there is if you have an IOMMU. Depends on the IOMMU. There are several dinky IOMMUs that when you shut them off DMA simply goes around them, and is not stopped. Eric _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: In place kexec 2010-07-30 18:21 ` Eric W. Biederman @ 2010-07-30 18:34 ` Vivek Goyal 2010-07-30 18:50 ` David Woodhouse 0 siblings, 1 reply; 36+ messages in thread From: Vivek Goyal @ 2010-07-30 18:34 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, David Woodhouse On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote: > David Woodhouse <dwmw2@infradead.org> writes: > > > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: > >> There isn't a bus master shut off at the core level. > > > > Effectively, there is if you have an IOMMU. > > Depends on the IOMMU. There are several dinky IOMMUs that when you > shut them off DMA simply goes around them, and is not stopped. I think the last time we discussed this it was for the AMD IOMMU, where if you disable the IOMMU it just kind of becomes pass-through with a 1:1 mapping of addresses. Vivek
* Re: In place kexec 2010-07-30 18:34 ` Vivek Goyal @ 2010-07-30 18:50 ` David Woodhouse 2010-07-30 18:56 ` Vivek Goyal 0 siblings, 1 reply; 36+ messages in thread From: David Woodhouse @ 2010-07-30 18:50 UTC (permalink / raw) To: Vivek Goyal Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver, Andrew Vasquez On Fri, 30 Jul 2010, Vivek Goyal wrote: > On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote: >> David Woodhouse <dwmw2@infradead.org> writes: >> >>> On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: >>>> There isn't a bus master shut off at the core level. >>> >>> Effectively, there is if you have an IOMMU. >> >> Depends on the IOMMU. There are several dinky IOMMUs that when you >> shut them off DMA simply goes around them, and is not stopped. > > I think last time we were discussing this for AMD IOMMU where if you > disable IOMMU, it just kind of become pass through with 1:1 mapping of > addresses. Yeah, don't do that. The IOMMU should be *on*, but without any active mappings set up. Which is exactly how Linux will set it up at boot. -- dwmw2
* Re: In place kexec 2010-07-30 18:50 ` David Woodhouse @ 2010-07-30 18:56 ` Vivek Goyal 2010-07-30 19:17 ` David Woodhouse 0 siblings, 1 reply; 36+ messages in thread From: Vivek Goyal @ 2010-07-30 18:56 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver, Andrew Vasquez On Fri, Jul 30, 2010 at 07:50:19PM +0100, David Woodhouse wrote: > On Fri, 30 Jul 2010, Vivek Goyal wrote: > > >On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote: > >>David Woodhouse <dwmw2@infradead.org> writes: > >> > >>>On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: > >>>>There isn't a bus master shut off at the core level. > >>> > >>>Effectively, there is if you have an IOMMU. > >> > >>Depends on the IOMMU. There are several dinky IOMMUs that when you > >>shut them off DMA simply goes around them, and is not stopped. > > > >I think last time we were discussing this for AMD IOMMU where if you > >disable IOMMU, it just kind of become pass through with 1:1 mapping of > >addresses. > > Yeah, don't do that. The IOMMU should be *on*, but without any > active mappings set up. Which is exactly how Linux will set it up at > boot. > So what happens if we tear down the mappings while DMA is still running? Vivek
* Re: In place kexec 2010-07-30 18:56 ` Vivek Goyal @ 2010-07-30 19:17 ` David Woodhouse 2010-07-30 19:39 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: David Woodhouse @ 2010-07-30 19:17 UTC (permalink / raw) To: Vivek Goyal Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver, Andrew Vasquez On Fri, 30 Jul 2010, Vivek Goyal wrote: > So what happens if we tear down the mapping while DMA is on. The DMA gets blocked, and you don't have to worry about whether the device was shut down cleanly or not. The device may be unhappy, but when the new kernel's driver loads and reinitialises it, all should be forgiven. -- dwmw2
* Re: In place kexec 2010-07-30 19:17 ` David Woodhouse @ 2010-07-30 19:39 ` Eric W. Biederman 2010-07-30 19:46 ` David Woodhouse 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 19:39 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal David Woodhouse <dwmw2@infradead.org> writes: > On Fri, 30 Jul 2010, Vivek Goyal wrote: > >> So what happens if we tear down the mapping while DMA is on. > > The DMA gets blocked, and you don't have to worry about whether the device was > shut down cleanly or not. The device may be unhappy, but when the new kernel's > driver loads and reinitialises it, all should be forgiven. Assuming IOMMU page faults don't cause pain. I seem to remember that also being a nasty issue. Eric
* Re: In place kexec 2010-07-30 19:39 ` Eric W. Biederman @ 2010-07-30 19:46 ` David Woodhouse 2010-07-30 20:08 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: David Woodhouse @ 2010-07-30 19:46 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal On Fri, 30 Jul 2010, Eric W. Biederman wrote: > David Woodhouse <dwmw2@infradead.org> writes: >> The DMA gets blocked, and you don't have to worry about whether the device was >> shut down cleanly or not. The device may be unhappy, but when the new kernel's >> driver loads and reinitialises it, all should be forgiven. > > Assuming IOMMU page faults don't cause pain. I seem to remember that > also being a nasty issue. Only if the driver (or the hardware) is so broken that it can't recover. There's very little excuse for a driver to have that problem even at runtime (and fail to recover from such an error)... for a driver to fail to initialise the hardware even when that driver is first being loaded is *entirely* fucked. Not that it doesn't happen, of course. But do we care? I lump those broken drivers in the same class as the ones which only work after a warm start from Windows or Mac OS. -- dwmw2
* Re: In place kexec 2010-07-30 19:46 ` David Woodhouse @ 2010-07-30 20:08 ` Eric W. Biederman 2010-07-30 20:15 ` David Woodhouse 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2010-07-30 20:08 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal David Woodhouse <dwmw2@infradead.org> writes: > On Fri, 30 Jul 2010, Eric W. Biederman wrote: > >> David Woodhouse <dwmw2@infradead.org> writes: >>> The DMA gets blocked, and you don't have to worry about whether the device was >>> shut down cleanly or not. The device may be unhappy, but when the new kernel's >>> driver loads and reinitialises it, all should be forgiven. >> >> Assuming IOMMU page faults don't cause pain. I seem to remember that >> also being a nasty issue. > > Only if the driver (or the hardware) is so broken that it can't > recover. There's very little excuse for a driver to have that problem even at > runtime (and fail to recover from such an error)... for a driver to fail to > initialise the hardware even when that driver is first being loaded is > *entirely* fucked. > > Not that it doesn't happen, of course. But do we care? I lump those broken > drivers in the same class as the ones which only work after a warm start from > Windows or Mac OS. The issue is what happens if you take an IOMMU page fault between shutdown and restart. I seem to remember an IOMMU page fault triggering a machine check on AMD CPUs. So maybe it works, but my gut impression is that simply leaving the IOMMU in a state that is on but not responding could actually make a reboot or kexec less stable than having ongoing DMAs stomping on memory. If you can leave it on, without translations and not trapping to software, that is a different story.
Eric
* Re: In place kexec 2010-07-30 20:08 ` Eric W. Biederman @ 2010-07-30 20:15 ` David Woodhouse 2010-07-30 21:11 ` H. Peter Anvin 0 siblings, 1 reply; 36+ messages in thread From: David Woodhouse @ 2010-07-30 20:15 UTC (permalink / raw) To: Eric W. Biederman Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal

On Fri, 2010-07-30 at 13:08 -0700, Eric W. Biederman wrote:
> The issue is what happens if you take an IOMMU page fault
> between shutdown and restart. I seem to remember an IOMMU page fault
> triggering a machine check on AMD cpus. So maybe it works but my gut
> impression is simply leaving the IOMMU in a state that is on but not
> responding could actually make a reboot or kexec less stable than having
> on-going DMAs stomping on memory. If you can leave it on, without
> translations and not trapping to software that is a different story.

Speaking of the Intel IOMMU, I know nothing of any 'on but not responding' state. You have:

 - 'off', which gives a 1:1 mapping and thus if you do this during kexec any still-running devices could be scribbling *anywhere* in memory, using their previously-allocated virtual DMA addresses which are now interpreted as physical addresses.

 - 'on with page tables cleared', in which case you are safe but some devices might get upset when their DMA is aborted, so their driver needs not to be a pile of shit, and needs to recover from that.

 - 'on and we preserve the virt->phys mappings of the previous kernel', which is just crack-inspired. You'd have to find the physical pages which were mapped by the previous kernel and steal them away from the new kernel's memory map, just in case they get scribbled on by a device which hasn't been properly shut down by the previous kernel, through the still-extant DMA mappings.

I mention the last of these only because it's been suggested by someone who was dealing with a broken driver/hardware combination where it *didn't* get properly reset after a fault, even when the driver was loaded anew. Not because anyone in their right mind would ever *do* it.

-- 
dwmw2
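The three IOMMU states listed in this message can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions, not kernel code: the names `iommu_mode`, `dma_mapping` and `iommu_translate` are hypothetical, and real hardware reports a blocked access as a fault rather than a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; a userspace toy, not a kernel interface. */
enum iommu_mode {
    IOMMU_OFF,          /* translation disabled: DMA address used as physical */
    IOMMU_ON_CLEARED,   /* translation enabled, no mappings installed */
    IOMMU_ON_PRESERVED  /* translation enabled, previous kernel's mappings kept */
};

struct dma_mapping {
    uint64_t iova;  /* address the device puts on the bus */
    uint64_t phys;  /* physical address the previous kernel mapped it to */
};

/*
 * Resolve one device access. Returns true and stores the physical address
 * in *phys if the access reaches memory, false if the IOMMU blocks it.
 */
static bool iommu_translate(enum iommu_mode mode,
                            const struct dma_mapping *old_maps, size_t n,
                            uint64_t iova, uint64_t *phys)
{
    assert(phys != NULL);

    switch (mode) {
    case IOMMU_OFF:
        /* 1:1 pass-through: a stale DMA address lands on arbitrary memory. */
        *phys = iova;
        return true;
    case IOMMU_ON_CLEARED:
        /* Nothing is mapped, so every stale DMA is blocked. */
        return false;
    case IOMMU_ON_PRESERVED:
        /* Stale DMA still hits the pages the previous kernel mapped. */
        for (size_t i = 0; i < n; i++) {
            if (old_maps[i].iova == iova) {
                *phys = old_maps[i].phys;
                return true;
            }
        }
        return false;
    }
    return false;
}
```

With a single stale mapping of 0x1000 to 0x7f000, the 'off' case lets the device write to physical 0x1000, the 'cleared' case blocks the access, and the 'preserved' case still reaches 0x7f000, which is why the new kernel would have to keep those pages out of its memory map.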
* Re: In place kexec 2010-07-30 20:15 ` David Woodhouse @ 2010-07-30 21:11 ` H. Peter Anvin 0 siblings, 0 replies; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 21:11 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver, Vivek Goyal, Andrew Vasquez On 07/30/2010 01:15 PM, David Woodhouse wrote: > > - 'on with page tables cleared', in which case you are safe but some > devices might get upset when their DMA is aborted, so their driver > needs not to be a pile of shit, and needs to recover from that. > I presume this is the state he's referring to. Now, if this means there are page tables in memory those tables are still subject to being overwritten. -hpa
* Re: In place kexec 2010-07-30 16:53 ` David Woodhouse 2010-07-30 18:21 ` Eric W. Biederman @ 2010-07-30 20:42 ` H. Peter Anvin 2010-07-30 21:18 ` Khalid Aziz 1 sibling, 1 reply; 36+ messages in thread From: H. Peter Anvin @ 2010-07-30 20:42 UTC (permalink / raw) To: David Woodhouse Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver, Vivek Goyal, Andrew Vasquez On 07/30/2010 09:53 AM, David Woodhouse wrote: > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote: >> There isn't a bus master shut off at the core level. > > Effectively, there is if you have an IOMMU. By "core level" I meant Linux kernel code, as opposed to the hardware level, which is slightly different; I meant it would make sense to at least set the bus master control bit (PCI_COMMAND_MASTER) to zero before kexec. I also agree that reboot and kexec are different; the requirements for kexec are really much more strict. -hpa
* Re: In place kexec 2010-07-30 20:42 ` H. Peter Anvin @ 2010-07-30 21:18 ` Khalid Aziz 2010-07-30 21:44 ` Khalid Aziz 0 siblings, 1 reply; 36+ messages in thread From: Khalid Aziz @ 2010-07-30 21:18 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver@qlogic.com, David Woodhouse, Vivek Goyal, Andrew Vasquez

On Fri, 2010-07-30 at 20:42 +0000, H. Peter Anvin wrote:
> On 07/30/2010 09:53 AM, David Woodhouse wrote:
> > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
> >> There isn't a bus master shut off at the core level.
> >
> > Effectively, there is if you have an IOMMU.
>
> With the "core level" I meant Linux kernel code, as opposed to hardware
> level which is slightly different; I meant it would make sense to at
> least set the bus master control bit (PCI_COMMAND_MASTER) to zero before
> kexec.

Before the kexec patch for ia64 was merged into the mainline kernel, Zou Nan Hai and I had added a device_shootdown() routine to arch/ia64/kernel/crash.c that was called from machine_crash_shutdown(). device_shootdown() did exactly what you are proposing:

+static void device_shootdown(void)
+{
+	struct pci_dev *dev;
+	irq_desc_t *desc;
+	u16 pci_command;
+
+	list_for_each_entry(dev, &pci_devices, global_list) {
+		desc = irq_descp(dev->irq);
+		if (!desc->action)
+			continue;
+		pci_read_config_word(dev, PCI_COMMAND, &pci_command);
+		if (pci_command & PCI_COMMAND_MASTER) {
+			pci_command &= ~PCI_COMMAND_MASTER;
+			pci_write_config_word(dev, PCI_COMMAND, pci_command);
+		}
+		disable_irq_nosync(dev->irq);
+		desc->handler->end(dev->irq);
+	}
+}

There were some discussions regarding this, and this code was removed by the time the patch was merged into the mainline kernel. I can't remember the details of why. I remember one report of a kernel hang on kexec that seemed to happen in device_shootdown(). I will look for any discussion threads I can find.

-- 
Khalid

====================================================================
Khalid Aziz Telco Platform Software, ISB
(970)898-9214 Hewlett-Packard
khalid.aziz@hp.com Fort Collins, CO

"The Linux kernel is subject to relentless development"
 - Alessandro Rubini
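The bit manipulation at the heart of device_shootdown() can be tried outside the kernel. The following is a minimal userspace sketch, not the removed ia64 code: `fake_pci_dev` and `fake_device_shootdown` are hypothetical names, the command register is a plain struct field instead of a config-space access, and the interrupt handling is left out. Only the value of PCI_COMMAND_MASTER (0x4) is taken from the kernel's PCI register definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4  /* bus-master enable bit of the PCI command register */

/* Hypothetical stand-in for a PCI device: only its command register. */
struct fake_pci_dev {
    uint16_t command;
};

/*
 * Clear the bus-master bit on every device that has it set, so the device
 * can no longer initiate DMA. Returns how many devices were changed.
 */
static size_t fake_device_shootdown(struct fake_pci_dev *devs, size_t n)
{
    size_t changed = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].command & PCI_COMMAND_MASTER) {
            devs[i].command &= (uint16_t)~PCI_COMMAND_MASTER;
            changed++;
        }
    }
    return changed;
}
```

A device whose command word is 0x0007 (I/O, memory and bus master enabled) ends up at 0x0003, while one already at 0x0003 is left untouched. The sketch says nothing about how a particular device reacts to losing bus mastering in the middle of a transfer.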
* Re: In place kexec 2010-07-30 21:18 ` Khalid Aziz @ 2010-07-30 21:44 ` Khalid Aziz [not found] ` <20120425211512.GA8583@ldl.usa.hp.com> 0 siblings, 1 reply; 36+ messages in thread From: Khalid Aziz @ 2010-07-30 21:44 UTC (permalink / raw) To: H. Peter Anvin Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, linux-driver@qlogic.com, David Woodhouse, Vivek Goyal, Andrew Vasquez On Fri, 2010-07-30 at 21:18 +0000, Aziz, Khalid wrote: > There were some discussions regarding this and this code was removed by > the time it was merged into mainline kernel. I can't remember the > details of why. I remember one report of kernel hang on kexec that > seemed to happen in device_shootdown(). I will look for any discussion > threads I can find. > These are the messages I could find discussing the code that disables the PCI Master bit, from before that code was removed from the ia64 tree: <https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010175.html> <https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010176.html> <https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010178.html> <https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010214.html> Maybe it is time to revisit this, for more than just ia64. -- Khalid ==================================================================== Khalid Aziz Telco Platform Software, ISB (970)898-9214 Hewlett-Packard khalid.aziz@hp.com Fort Collins, CO "The Linux kernel is subject to relentless development" - Alessandro Rubini
[parent not found: <20120425211512.GA8583@ldl.usa.hp.com>]
* Re: In place kexec [not found] ` <20120425211512.GA8583@ldl.usa.hp.com> @ 2012-04-25 22:06 ` Vivek Goyal 0 siblings, 0 replies; 36+ messages in thread From: Vivek Goyal @ 2012-04-25 22:06 UTC (permalink / raw) To: Khalid Aziz Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org, Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver@qlogic.com, David Woodhouse, Andrew Vasquez On Wed, Apr 25, 2012 at 03:15:12PM -0600, Khalid Aziz wrote: [..] > I would appreciate if others could test it and give a thumbs up or thumbs > down before I send it to LKML. Kexec still seems to work on my x86_64 box with this patch applied. So from a testing point of view, thumbs up. Thanks Vivek
* Re: In place kexec 2010-07-29 19:51 ` Eric W. Biederman 2010-07-29 19:55 ` Randy Dunlap @ 2010-07-29 20:06 ` H. Peter Anvin 1 sibling, 0 replies; 36+ messages in thread From: H. Peter Anvin @ 2010-07-29 20:06 UTC (permalink / raw) To: Eric W. Biederman Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal On 07/29/2010 12:51 PM, Eric W. Biederman wrote: >> >> kernel_kexec() >> kernel_restart_prepare() >> device_shutdown() >> >> I would suspect it to be a qla2xxx driver problem that it did not shut >> down the device properly. > > And device_shutdown calls every driver's .shutdown method. > > Things like this are always a driver problem. > Anyone from Qlogic who can comment/confirm/deny/investigate? -hpa
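The quoted call chain ends in device_shutdown(), which invokes each driver's .shutdown method. A toy userspace model makes the consequence of a missing or incomplete callback visible. Everything here is an assumption made for illustration: `toy_device`, `toy_quiesce` and `toy_device_shutdown` are hypothetical names, and the model ignores the ordering and bus-level callbacks of the real driver core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record; a userspace toy, not the driver-core struct. */
struct toy_device {
    const char *name;
    bool dma_running;
    void (*shutdown)(struct toy_device *dev);  /* may be NULL */
};

/* What a well-behaved .shutdown is expected to do: stop the DMA engine. */
static void toy_quiesce(struct toy_device *dev)
{
    dev->dma_running = false;
}

/*
 * Call every device's shutdown callback, as the restart path does, and
 * report how many devices would still be doing DMA when the new kernel
 * starts.
 */
static size_t toy_device_shutdown(struct toy_device *devs, size_t n)
{
    size_t still_running = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].shutdown)
            devs[i].shutdown(&devs[i]);
        if (devs[i].dma_running)
            still_running++;
    }
    return still_running;
}
```

With one device that registers `toy_quiesce` and one that registers nothing, the function returns 1: the second device keeps its DMA engine running across the kexec, which is the kind of failure suspected of the qla2xxx driver in this thread.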