public inbox for kexec@lists.infradead.org
* In place kexec
@ 2010-07-28 21:57 H. Peter Anvin
  2010-07-28 22:02 ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-28 21:57 UTC (permalink / raw)
  To: Eric Biederman, Simon Horman, kexec@lists.infradead.org

We are getting a claim that the qla driver corrupts memory after a
kexec, apparently due to a DMA engine left running in the before-kernel.

For an in-place kexec (as opposed to a crash dump kexec, where we switch
into dedicated memory), what shutdown paths get executed?

	-hpa

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re: In place kexec
  2010-07-28 21:57 In place kexec H. Peter Anvin
@ 2010-07-28 22:02 ` Eric W. Biederman
  2010-07-29 13:43   ` Neil Horman
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-28 22:02 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Simon Horman, kexec@lists.infradead.org

"H. Peter Anvin" <hpa@zytor.com> writes:

> We are getting a claim that the qla driver corrupts memory after a
> kexec, apparently due to a DMA engine left running in the before-kernel.
>
> For an in-place kexec (as opposed to a crash dump kexec, where we switch
> into dedicated memory), what shutdown paths get executed?

It is the normal reboot path, so the device shutdown method gets
executed.

Eric


* Re: In place kexec
  2010-07-28 22:02 ` Eric W. Biederman
@ 2010-07-29 13:43   ` Neil Horman
  2010-07-29 15:03     ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Neil Horman @ 2010-07-29 13:43 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Simon Horman, kexec@lists.infradead.org, H. Peter Anvin

On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> > We are getting a claim that the qla driver corrupts memory after a
> > kexec, apparently due to a DMA engine left running in the before-kernel.
> >
> > For an in-place kexec (as opposed to a crash dump kexec, where we switch
> > into dedicated memory), what shutdown paths get executed?
> 
> It is the normal reboot path, so the device shutdown method gets
> executed.
> 
> Eric
> 
Check your iommu.  We've had lots of problems with them in the past, and in the
crash path we explicitly leave the iommu on now, whereas the normal shutdown path
turns it off.  If some other dma-capable device doesn't shut down properly and
keeps dma operations going, you're liable to get memory corruption when the
iommu re-initializes.

Neil


* Re: In place kexec
  2010-07-29 13:43   ` Neil Horman
@ 2010-07-29 15:03     ` H. Peter Anvin
  2010-07-29 15:06       ` Neil Horman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-29 15:03 UTC (permalink / raw)
  To: Neil Horman; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman

On 07/29/2010 06:43 AM, Neil Horman wrote:
> On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:
>>
>>> We are getting a claim that the qla driver corrupts memory after a
>>> kexec, apparently due to a DMA engine left running in the before-kernel.
>>>
>>> For an in-place kexec (as opposed to a crash dump kexec, where we switch
>>> into dedicated memory), what shutdown paths get executed?
>>
>> It is the normal reboot path, so the device shutdown method gets
>> executed.
>>
>> Eric
>>
> Check your iommu.  We've had lots of problems with them in the past, and in the
> crash path we explicitly leave the iommu on now, whereas the normal shutdown path
> turns it off.  If some other dma-capable device doesn't shut down properly and
> keeps dma operations going, you're liable to get memory corruption when the
> iommu re-initializes.
> 

Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place
clearly separated, please?  The qla driver issue is supposed to be
kexec-in-place, and it sounds like you're talking about kexec-on-crash.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: In place kexec
  2010-07-29 15:03     ` H. Peter Anvin
@ 2010-07-29 15:06       ` Neil Horman
  2010-07-29 17:51         ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Neil Horman @ 2010-07-29 15:06 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman

On Thu, Jul 29, 2010 at 08:03:47AM -0700, H. Peter Anvin wrote:
> On 07/29/2010 06:43 AM, Neil Horman wrote:
> > On Wed, Jul 28, 2010 at 03:02:19PM -0700, Eric W. Biederman wrote:
> >> "H. Peter Anvin" <hpa@zytor.com> writes:
> >>
> >>> We are getting a claim that the qla driver corrupts memory after a
> >>> kexec, apparently due to a DMA engine left running in the before-kernel.
> >>>
> >>> For an in-place kexec (as opposed to a crash dump kexec, where we switch
> >>> into dedicated memory), what shutdown paths get executed?
> >>
> >> It is the normal reboot path, so the device shutdown method gets
> >> executed.
> >>
> >> Eric
> >>
> > Check your iommu.  We've had lots of problems with them in the past, and in the
> > crash path we explicitly leave the iommu on now, whereas the normal shutdown path
> > turns it off.  If some other dma-capable device doesn't shut down properly and
> > keeps dma operations going, you're liable to get memory corruption when the
> > iommu re-initializes.
> > 
> 
> Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place
> clearly separated, please?  The qla driver issue is supposed to be
> kexec-in-place, and it sounds like you're talking about kexec-on-crash.
> 
No, I'm just indicating a difference between the two paths, and I'm doing so
because we used to have similar dma problems in the crash path, which we
resolved by not turning off the iommu during shutdown, which is different from
the in-place path.  Just trying to give you some thoughts about where to look
for your problem.
Neil

> 	-hpa
> 
> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.
> 


* Re: In place kexec
  2010-07-29 15:06       ` Neil Horman
@ 2010-07-29 17:51         ` H. Peter Anvin
  2010-07-29 18:06           ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-29 17:51 UTC (permalink / raw)
  To: Neil Horman; +Cc: Simon Horman, kexec@lists.infradead.org, Eric W. Biederman

On 07/29/2010 08:06 AM, Neil Horman wrote:
>>
>> Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place
>> clearly separated, please?  The qla driver issue is supposed to be
>> kexec-in-place, and it sounds like you're talking about kexec-on-crash.
>>
> No, I'm just indicating a difference between the two paths, and I'm doing so
> because we used to have similar dma problems in the crash path, which we
> resolved by not turning off the iommu during shutdown, which is different from
> the in-place path.  Just trying to give you some thoughts about where to look

Fair enough... just wanted to flag this as a problem, because it has
already been the source of a lot of confusion.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: In place kexec
  2010-07-29 17:51         ` H. Peter Anvin
@ 2010-07-29 18:06           ` Eric W. Biederman
  2010-07-29 18:29             ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-29 18:06 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Neil Horman, Simon Horman, kexec@lists.infradead.org

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 07/29/2010 08:06 AM, Neil Horman wrote:
>>>
>>> Sorry, can we keep the discussions of kexec-on-crash and kexec-in-place
>>> clearly separated, please?  The qla driver issue is supposed to be
>>> kexec-in-place, and it sounds like you're talking about kexec-on-crash.
>>>
>> No, I'm just indicating a difference between the two paths, and I'm doing so
>> because we used to have similar dma problems in the crash path, which we
>> resolved by not turning off the iommu during shutdown, which is different from
>> the in-place path.  Just trying to give you some thoughts about where to look
>
> Fair enough... just wanted to flag this as a problem, because it has
> already been the source of a lot of confusion.

Thinking about this I am a bit surprised that you would find
DMA left on from a disk driver.  Historically disks have been
pretty good about shutting off in this scenario.

Added to that typically we unmount all filesystems.

Calling rmmod on the driver before the final kexec --exec
could be interesting, and drivers much more reliably implement
.remove than .shutdown.

Network drivers are more likely to be a problem, but we should be
downing all of the network interfaces before something happens.

All of which is to say kexec-in-place has generally been a lot
less hassle, because it is so similar to the normal case.

Eric



* Re: In place kexec
  2010-07-29 18:06           ` Eric W. Biederman
@ 2010-07-29 18:29             ` H. Peter Anvin
  2010-07-29 19:16               ` Vivek Goyal
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-29 18:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-driver, Neil Horman, Simon Horman,
	kexec@lists.infradead.org, Andrew Vasquez

On 07/29/2010 11:06 AM, Eric W. Biederman wrote:
> 
> Thinking about this I am a bit surprised that you would find
> DMA left on from a disk driver.  Historically disks have been
> pretty good about shutting off in this scenario.
> 
> Added to that typically we unmount all filesystems.
> 
> Calling rmmod on the driver before the final kexec --exec
> could be interesting, and drivers much more reliably implement
> .remove than .shutdown.
> 
> Network drivers are more likely to be a problem, but we should be
> downing all of the network interfaces before something happens.
> 
> All of which is to say kexec-in-place has generally been a lot
> less hassle, because it is so similar to the normal case.
> 

In particular, the supposed corruption comes from the "firmware logging"
feature in the qla2xxx driver.  I'd really like to understand if this is
a kexec problem or a qla2xxx problem.
	
	-hpa


* Re: In place kexec
  2010-07-29 18:29             ` H. Peter Anvin
@ 2010-07-29 19:16               ` Vivek Goyal
  2010-07-29 19:51                 ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: Vivek Goyal @ 2010-07-29 19:16 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman,
	Eric W. Biederman, linux-driver, Andrew Vasquez

On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote:
> On 07/29/2010 11:06 AM, Eric W. Biederman wrote:
> > 
> > Thinking about this I am a bit surprised that you would find
> > DMA left on from a disk driver.  Historically disks have been
> > pretty good about shutting off in this scenario.
> > 
> > Added to that typically we unmount all filesystems.
> > 
> > Calling rmmod on the driver before the final kexec --exec
> > could be interesting, and drivers much more reliably implement
> > .remove than .shutdown.
> > 
> > Network drivers are more likely to be a problem, but we should be
> > downing all of the network interfaces before something happens.
> > 
> > All of which is to say kexec-in-place has generally been a lot
> > less hassle, because it is so similar to the normal case.
> > 
> 
> In particular, the supposed corruption comes from the "firmware logging"
> feature in the qla2xxx driver.  I'd really like to understand if this is
> a kexec problem or a qla2xxx problem.
> 	

kernel_kexec()
   kernel_restart_prepare()
	device_shutdown()

I would suspect it to be a qla2xxx driver problem that it did not shut
down the device properly.

Vivek


* Re: In place kexec
  2010-07-29 19:16               ` Vivek Goyal
@ 2010-07-29 19:51                 ` Eric W. Biederman
  2010-07-29 19:55                   ` Randy Dunlap
  2010-07-29 20:06                   ` H. Peter Anvin
  0 siblings, 2 replies; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-29 19:51 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman,
	Andrew Vasquez, H. Peter Anvin, linux-driver

Vivek Goyal <vgoyal@redhat.com> writes:

> On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote:
>> On 07/29/2010 11:06 AM, Eric W. Biederman wrote:
>> > 
>> > Thinking about this I am a bit surprised that you would find
>> > DMA left on from a disk driver.  Historically disks have been
>> > pretty good about shutting off in this scenario.
>> > 
>> > Added to that typically we unmount all filesystems.
>> > 
>> > Calling rmmod on the driver before the final kexec --exec
>> > could be interesting, and drivers much more reliably implement
>> > .remove than .shutdown.
>> > 
>> > Network drivers are more likely to be a problem, but we should be
>> > downing all of the network interfaces before something happens.
>> > 
>> > All of which is to say kexec-in-place has generally been a lot
>> > less hassle, because it is so similar to the normal case.
>> > 
>> 
>> In particular, the supposed corruption comes from the "firmware logging"
>> feature in the qla2xxx driver.  I'd really like to understand if this is
>> a kexec problem or a qla2xxx problem.
>> 	
>
> kernel_kexec()
>    kernel_restart_prepare()
> 	device_shutdown()
>
> I would suspect it to be a qla2xxx driver problem that it did not shut
> down the device properly.

And device_shutdown calls every driver's .shutdown method.

Things like this are always a driver problem.


Eric


* Re: In place kexec
  2010-07-29 19:51                 ` Eric W. Biederman
@ 2010-07-29 19:55                   ` Randy Dunlap
  2010-07-30  3:38                     ` H. Peter Anvin
  2010-07-29 20:06                   ` H. Peter Anvin
  1 sibling, 1 reply; 36+ messages in thread
From: Randy Dunlap @ 2010-07-29 19:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman,
	Andrew Vasquez, H. Peter Anvin, linux-driver, Vivek Goyal

On Thu, 29 Jul 2010 12:51:09 -0700 Eric W. Biederman wrote:

> Vivek Goyal <vgoyal@redhat.com> writes:
> 
> > On Thu, Jul 29, 2010 at 11:29:12AM -0700, H. Peter Anvin wrote:
> >> On 07/29/2010 11:06 AM, Eric W. Biederman wrote:
> >> > 
> >> > Thinking about this I am a bit surprised that you would find
> >> > DMA left on from a disk driver.  Historically disks have been
> >> > pretty good about shutting off in this scenario.
> >> > 
> >> > Added to that typically we unmount all filesystems.
> >> > 
> >> > Calling rmmod on the driver before the final kexec --exec
> >> > could be interesting, and drivers much more reliably implement
> >> > .remove than .shutdown.
> >> > 
> >> > Network drivers are more likely to be a problem, but we should be
> >> > downing all of the network interfaces before something happens.
> >> > 
> >> > All of which is to say kexec-in-place has generally been a lot
> >> > less hassle, because it is so similar to the normal case.
> >> > 
> >> 
> >> In particular, the supposed corruption comes from the "firmware logging"
> >> feature in the qla2xxx driver.  I'd really like to understand if this is
> >> a kexec problem or a qla2xxx problem.
> >> 	
> >
> > kernel_kexec()
> >    kernel_restart_prepare()
> > 	device_shutdown()
> >
> > I would suspect it to be a qla2xxx driver problem that it did not shut
> > down the device properly.
> 
> And device_shutdown calls every driver's .shutdown method.
> 
> Things like this are always a driver problem.


so is there a default .shutdown method for drivers that do not specify one?

like the qla2xxx driver does not.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
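
The call chain quoted above, and the question of what happens to a
driver that leaves .shutdown unset, can be sketched as a small userspace
model.  This is only an illustration: the model_* names and structures
are hypothetical and are not the kernel's real types, and the sketch
assumes (as the thread implies) that a driver without the hook is simply
skipped rather than given a default.

```c
#include <stddef.h>

/* Simplified model; these are NOT the kernel's real structures. */
struct model_driver {
    const char *name;
    void (*shutdown)(void);   /* may be NULL if the driver provides none */
    void (*remove)(void);     /* rmmod path; not used on reboot/kexec */
};

static int shutdown_calls;    /* counts .shutdown hooks that actually ran */

static void disk_shutdown(void) { shutdown_calls++; }
static void hba_remove(void) { }

/* Walk the device list and call .shutdown only where one is provided. */
static int model_device_shutdown(const struct model_driver *drv, size_t n)
{
    int skipped = 0;

    for (size_t i = 0; i < n; i++) {
        if (drv[i].shutdown)
            drv[i].shutdown();
        else
            skipped++;        /* hardware is left in whatever state it was in */
    }
    return skipped;
}

static int model_kernel_restart_prepare(const struct model_driver *drv, size_t n)
{
    return model_device_shutdown(drv, n);
}

/* Mirrors the quoted call chain; returns how many devices were skipped. */
static int model_kernel_kexec(void)
{
    static const struct model_driver drivers[] = {
        { "disk", disk_shutdown, NULL },
        { "hba-without-shutdown", NULL, hba_remove },
    };

    return model_kernel_restart_prepare(drivers,
                                        sizeof drivers / sizeof drivers[0]);
}
```

In this model, calling model_kernel_kexec() runs one .shutdown hook and
reports one skipped device: the second driver has a .remove method, but
nothing on this path ever calls it.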


* Re: In place kexec
  2010-07-29 19:51                 ` Eric W. Biederman
  2010-07-29 19:55                   ` Randy Dunlap
@ 2010-07-29 20:06                   ` H. Peter Anvin
  1 sibling, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-29 20:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman,
	Andrew Vasquez, linux-driver, Vivek Goyal

On 07/29/2010 12:51 PM, Eric W. Biederman wrote:
>>
>> kernel_kexec()
>>    kernel_restart_prepare()
>> 	device_shutdown()
>>
>> I would suspect it to be a qla2xxx driver problem that it did not shut
>> down the device properly.
> 
> And device_shutdown calls every driver's .shutdown method.
> 
> Things like this are always a driver problem.
> 

Anyone from Qlogic who can comment/confirm/deny/investigate?

	-hpa


* Re: In place kexec
  2010-07-29 19:55                   ` Randy Dunlap
@ 2010-07-30  3:38                     ` H. Peter Anvin
  2010-07-30  4:41                       ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30  3:38 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Neil Horman, kexec@lists.infradead.org, Simon Horman,
	Eric W. Biederman, linux-driver, Vivek Goyal, Andrew Vasquez

On 07/29/2010 12:55 PM, Randy Dunlap wrote:
>>
>> And device_shutdown calls every driver's .shutdown method.
>>
>> Things like this are always a driver problem.
> 
> so is there a default .shutdown method for drivers that do not specify one?
> 
> like the qla2xxx driver does not.
> 

If it doesn't, even if bus mastering gets shut off at the core level,
there is a risk that it clobbers data when it turns it back on if the
initialization sequence is problematic.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: In place kexec
  2010-07-30  3:38                     ` H. Peter Anvin
@ 2010-07-30  4:41                       ` Eric W. Biederman
  2010-07-30  5:04                         ` H. Peter Anvin
  2010-07-30 16:53                         ` David Woodhouse
  0 siblings, 2 replies; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30  4:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 07/29/2010 12:55 PM, Randy Dunlap wrote:
>>>
>>> And device_shutdown calls every driver's .shutdown method.
>>>
>>> Things like this are always a driver problem.
>> 
>> so is there a default .shutdown method for drivers that do not specify one?
>> 
>> like the qla2xxx driver does not.
>> 
>
> If it doesn't, even if bus mastering gets shut off at the core level,
> there is a risk that it clobbers data when it turns it back on if the
> initialization sequence is problematic.

There isn't a bus master shut off at the core level.  When we did
the original analysis it turned out that the bus mastering bit
was implemented on a lot of devices in an advisory way, so it didn't
make sense to count on it.

That said, it looks like the code to do the shutdown is in
qla2x00_remove_one, so it shouldn't be too hard if someone cared to
extract just the hardware bits.

Eric



* Re: In place kexec
  2010-07-30  4:41                       ` Eric W. Biederman
@ 2010-07-30  5:04                         ` H. Peter Anvin
  2010-07-30 16:30                           ` Eric W. Biederman
  2010-07-30 16:53                         ` David Woodhouse
  1 sibling, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30  5:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal

On 07/29/2010 09:41 PM, Eric W. Biederman wrote:
> 
> There isn't a bus master shut off at the core level.  When we did
> the original analysis it turned out that the bus mastering bit
> was implemented on a lot of devices in an advisory way, so it didn't
> make sense to count on it.
> 

But does it make sense to not flip the bit for the cases where it is
implemented properly?

> That said it looks like the code to do the shutdown is in
> qla2x00_remove_one so it shouldn't be too hard if someone cared to
> extract just the hardware bits.

Charming.  Code is there, just not hooked up.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: In place kexec
  2010-07-30  5:04                         ` H. Peter Anvin
@ 2010-07-30 16:30                           ` Eric W. Biederman
  2010-07-30 16:41                             ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 16:30 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 07/29/2010 09:41 PM, Eric W. Biederman wrote:
>> 
>> There isn't a bus master shut off at the core level.  When we did
>> the original analysis it turned out that the bus mastering bit
>> was implemented on a lot of devices in an advisory way, so it didn't
>> make sense to count on it.
>> 
>
> But does it make sense to not flip the bit for the cases where it is
> implemented properly?

It is probably worth looking into again.  I think it was 5+ years
ago when that determination was made.

>> That said it looks like the code to do the shutdown is in
>> qla2x00_remove_one so it shouldn't be too hard if someone cared to
>> extract just the hardware bits.
>
> Charming.  Code is there, just not hooked up.

Using the .remove method in reboot is a fight I lost long ago.

Eric


* Re: In place kexec
  2010-07-30 16:30                           ` Eric W. Biederman
@ 2010-07-30 16:41                             ` H. Peter Anvin
  2010-07-30 18:36                               ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30 16:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal

On 07/30/2010 09:30 AM, Eric W. Biederman wrote:
> 
>>> That said it looks like the code to do the shutdown is in
>>> qla2x00_remove_one so it shouldn't be too hard if someone cared to
>>> extract just the hardware bits.
>>
>> Charming.  Code is there, just not hooked up.
> 
> Using the .remove method in reboot is a fight I lost long ago.
> 

Could you elucidate, please?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: In place kexec
  2010-07-30  4:41                       ` Eric W. Biederman
  2010-07-30  5:04                         ` H. Peter Anvin
@ 2010-07-30 16:53                         ` David Woodhouse
  2010-07-30 18:21                           ` Eric W. Biederman
  2010-07-30 20:42                           ` H. Peter Anvin
  1 sibling, 2 replies; 36+ messages in thread
From: David Woodhouse @ 2010-07-30 16:53 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
> There isn't a bus master shut off at the core level.  

Effectively, there is if you have an IOMMU.

-- 
dwmw2



* Re: In place kexec
  2010-07-30 16:53                         ` David Woodhouse
@ 2010-07-30 18:21                           ` Eric W. Biederman
  2010-07-30 18:34                             ` Vivek Goyal
  2010-07-30 20:42                           ` H. Peter Anvin
  1 sibling, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 18:21 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

David Woodhouse <dwmw2@infradead.org> writes:

> On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
>> There isn't a bus master shut off at the core level.  
>
> Effectively, there is if you have an IOMMU.

Depends on the IOMMU.  There are several dinky IOMMUs that when you
shut them off DMA simply goes around them, and is not stopped.

Eric


* Re: In place kexec
  2010-07-30 18:21                           ` Eric W. Biederman
@ 2010-07-30 18:34                             ` Vivek Goyal
  2010-07-30 18:50                               ` David Woodhouse
  0 siblings, 1 reply; 36+ messages in thread
From: Vivek Goyal @ 2010-07-30 18:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	David Woodhouse

On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote:
> David Woodhouse <dwmw2@infradead.org> writes:
> 
> > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
> >> There isn't a bus master shut off at the core level.  
> >
> > Effectively, there is if you have an IOMMU.
> 
> Depends on the IOMMU.  There are several dinky IOMMUs that when you
> shut them off DMA simply goes around them, and is not stopped.

I think last time we were discussing this for the AMD IOMMU, where if you
disable the IOMMU, it just kind of becomes pass-through with a 1:1 mapping
of addresses.

Vivek


* Re: In place kexec
  2010-07-30 16:41                             ` H. Peter Anvin
@ 2010-07-30 18:36                               ` Eric W. Biederman
  2010-07-30 22:52                                 ` Andrew Vasquez
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 18:36 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, linux-driver, Vivek Goyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 07/30/2010 09:30 AM, Eric W. Biederman wrote:
>> 
>>>> That said it looks like the code to do the shutdown is in
>>>> qla2x00_remove_one so it shouldn't be too hard if someone cared to
>>>> extract just the hardware bits.
>>>
>>> Charming.  Code is there, just not hooked up.
>> 
>> Using the .remove method in reboot is a fight I lost long ago.
>> 
>
> Could you elucidate, please?

My original proposal was for device_shutdown to call the .remove
methods as those are well exercised and tested in development. aka
rmmod.

It was argued (with some merit) that for a system reboot we don't want
to perform all of the subsystem registration work, to make it more
likely that reboot -f will reboot even if there is a kernel oops.

What I proposed, and unfortunately failed to write the patch for at the
time, was to have the device remove path call shutdown before calling
remove, so drivers wouldn't have to code it all up twice.
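
A minimal sketch of that proposal, as a userspace model rather than
kernel code: the removal path runs the hardware-only shutdown hook first
and then the rest of the teardown, so the quiesce logic is written once.
All names here (model_dev, model_ops, core_shutdown, core_remove) are
hypothetical and exist only for illustration.

```c
/* Hypothetical device state, for illustration only. */
struct model_dev {
    int dma_running;   /* 1 while the device may still write to memory */
    int registered;    /* 1 while subsystem registrations are in place */
};

struct model_ops {
    void (*shutdown)(struct model_dev *);  /* hardware quiesce only */
    void (*remove)(struct model_dev *);    /* software teardown */
};

static void hw_quiesce(struct model_dev *dev)  { dev->dma_running = 0; }
static void sw_teardown(struct model_dev *dev) { dev->registered = 0; }

/* Reboot/kexec path: stop the hardware, skip the registration work. */
static void core_shutdown(const struct model_ops *ops, struct model_dev *dev)
{
    if (ops->shutdown)
        ops->shutdown(dev);
}

/* Removal path: reuse the shutdown hook, then finish the teardown. */
static void core_remove(const struct model_ops *ops, struct model_dev *dev)
{
    core_shutdown(ops, dev);
    if (ops->remove)
        ops->remove(dev);
}
```

With ops set to { hw_quiesce, sw_teardown }, core_shutdown() stops DMA
but leaves the device registered, while core_remove() does both.  The
reboot path stays short, and the quiesce code still gets exercised on
every rmmod.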

A lot of the disk drivers implement .shutdown these days and there aren't
many bug reports about kexec failing.  So I would be reluctant to change
things other than on a driver by driver basis unless I had a lot of time
for testing etc.

It might be worth playing with adding a pci_clear_master in
pci_device_shutdown.  It has the potential to break things like usb
keyboards, so I would be careful.  If it doesn't break fundamental
things like usb a pci_clear_master when shutting down devices should
improve reliability somewhat.
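
What clearing bus mastering amounts to at the register level can be
sketched as follows.  This models only the bit manipulation on the PCI
command register (the Bus Master Enable bit is bit 2, value 0x4); the
model_* names are hypothetical, and it is not the kernel's
pci_clear_master() implementation.  As noted earlier in the thread, some
devices treat the bit as advisory, so clearing it is no guarantee that
DMA stops.

```c
#include <stdint.h>

/* Bit 2 of the PCI command register: Bus Master Enable. */
#define MODEL_PCI_COMMAND_MASTER 0x4u

/* Return the command word with bus mastering disabled; other bits kept. */
static uint16_t model_clear_master(uint16_t command)
{
    return (uint16_t)(command & ~MODEL_PCI_COMMAND_MASTER);
}

/* Report whether the command word still has bus mastering enabled. */
static int model_is_master(uint16_t command)
{
    return (command & MODEL_PCI_COMMAND_MASTER) != 0;
}
```

For a command word of 0x0007 (I/O, memory and bus master enabled),
model_clear_master() yields 0x0003: the device can still be accessed by
the CPU, but a well-behaved device would no longer initiate DMA.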

And of course there is the old staple of workarounds: "rmmod <driver>"
before calling kexec --exec.

Eric


* Re: In place kexec
  2010-07-30 18:34                             ` Vivek Goyal
@ 2010-07-30 18:50                               ` David Woodhouse
  2010-07-30 18:56                                 ` Vivek Goyal
  0 siblings, 1 reply; 36+ messages in thread
From: David Woodhouse @ 2010-07-30 18:50 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver,
	Andrew Vasquez

On Fri, 30 Jul 2010, Vivek Goyal wrote:

> On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote:
>> David Woodhouse <dwmw2@infradead.org> writes:
>>
>>> On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
>>>> There isn't a bus master shut off at the core level.
>>>
>>> Effectively, there is if you have an IOMMU.
>>
>> Depends on the IOMMU.  There are several dinky IOMMUs that when you
>> shut them off DMA simply goes around them, and is not stopped.
>
> I think last time we were discussing this for AMD IOMMU where if you
> disable IOMMU, it just kind of becomes pass through with 1:1 mapping of
> addresses.

Yeah, don't do that. The IOMMU should be *on*, but without any active 
mappings set up. Which is exactly how Linux will set it up at boot.

-- 
dwmw2



* Re: In place kexec
  2010-07-30 18:50                               ` David Woodhouse
@ 2010-07-30 18:56                                 ` Vivek Goyal
  2010-07-30 19:17                                   ` David Woodhouse
  0 siblings, 1 reply; 36+ messages in thread
From: Vivek Goyal @ 2010-07-30 18:56 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver,
	Andrew Vasquez

On Fri, Jul 30, 2010 at 07:50:19PM +0100, David Woodhouse wrote:
> On Fri, 30 Jul 2010, Vivek Goyal wrote:
> 
> >On Fri, Jul 30, 2010 at 11:21:42AM -0700, Eric W. Biederman wrote:
> >>David Woodhouse <dwmw2@infradead.org> writes:
> >>
> >>>On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
> >>>>There isn't a bus master shut off at the core level.
> >>>
> >>>Effectively, there is if you have an IOMMU.
> >>
> >>Depends on the IOMMU.  There are several dinky IOMMUs that when you
> >>shut them off DMA simply goes around them, and is not stopped.
> >
> >I think last time we were discussing this for AMD IOMMU where if you
> >disable IOMMU, it just kind of becomes pass through with 1:1 mapping of
> >addresses.
> 
> Yeah, don't do that. The IOMMU should be *on*, but without any
> active mappings set up. Which is exactly how Linux will set it up at
> boot.
> 

So what happens if we tear down the mapping while DMA is on?

Vivek


* Re: In place kexec
  2010-07-30 18:56                                 ` Vivek Goyal
@ 2010-07-30 19:17                                   ` David Woodhouse
  2010-07-30 19:39                                     ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: David Woodhouse @ 2010-07-30 19:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, H. Peter Anvin, linux-driver,
	Andrew Vasquez

On Fri, 30 Jul 2010, Vivek Goyal wrote:

> So what happens if we tear down the mapping while DMA is on?

The DMA gets blocked, and you don't have to worry about whether the device 
was shut down cleanly or not. The device may be unhappy, but when the new 
kernel's driver loads and reinitialises it, all should be forgiven.

-- 
dwmw2



* Re: In place kexec
  2010-07-30 19:17                                   ` David Woodhouse
@ 2010-07-30 19:39                                     ` Eric W. Biederman
  2010-07-30 19:46                                       ` David Woodhouse
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 19:39 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

David Woodhouse <dwmw2@infradead.org> writes:

> On Fri, 30 Jul 2010, Vivek Goyal wrote:
>
>> So what happens if we tear down the mapping while DMA is on?
>
> The DMA gets blocked, and you don't have to worry about whether the device was
> shut down cleanly or not. The device may be unhappy, but when the new kernel's
> driver loads and reinitialises it, all should be forgiven.

Assuming IOMMU page faults don't cause pain.  I seem to remember that
also being a nasty issue.

Eric


* Re: In place kexec
  2010-07-30 19:39                                     ` Eric W. Biederman
@ 2010-07-30 19:46                                       ` David Woodhouse
  2010-07-30 20:08                                         ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: David Woodhouse @ 2010-07-30 19:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

On Fri, 30 Jul 2010, Eric W. Biederman wrote:

> David Woodhouse <dwmw2@infradead.org> writes:
>> The DMA gets blocked, and you don't have to worry about whether the device was
>> shut down cleanly or not. The device may be unhappy, but when the new kernel's
>> driver loads and reinitialises it, all should be forgiven.
>
> Assuming IOMMU page faults don't cause pain.  I seem to remember that
> also being a nasty issue.

Only if the driver (or the hardware) is so broken that it can't recover.
There's very little excuse for a driver to have that problem even at 
runtime (and fail to recover from such an error)... for a driver to fail 
to initialise the hardware even when that driver is first being loaded is 
*entirely* fucked.

Not that it doesn't happen, of course. But do we care? I lump those 
broken drivers in the same class as the ones which only work after a warm
start from Windows or Mac OS.

-- 
dwmw2



* Re: In place kexec
  2010-07-30 19:46                                       ` David Woodhouse
@ 2010-07-30 20:08                                         ` Eric W. Biederman
  2010-07-30 20:15                                           ` David Woodhouse
  0 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 20:08 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

David Woodhouse <dwmw2@infradead.org> writes:

> On Fri, 30 Jul 2010, Eric W. Biederman wrote:
>
>> David Woodhouse <dwmw2@infradead.org> writes:
>>> The DMA gets blocked, and you don't have to worry about whether the device was
>>> shut down cleanly or not. The device may be unhappy, but when the new kernel's
>>> driver loads and reinitialises it, all should be forgiven.
>>
>> Assuming IOMMU page faults don't cause pain.  I seem to remember that
>> also being a nasty issue.
>
> Only if the driver (or the hardware) is so broken that it can't
> recover. There's very little excuse for a driver to have that problem even at
> runtime (and fail to recover from such an error)... for a driver to fail to
> initialise the hardware even when that driver is first being loaded is
> *entirely* fucked.
>
> Not that it doesn't happen, of course. But do we care? I lump those broken
> drivers in the same class as the ones which only work after a warm start from
> Windows or Mac OS.

The issue is what happens if you take an IOMMU page fault between
shutdown and restart.  I seem to remember an IOMMU page fault
triggering a machine check on AMD cpus.  So maybe it works but my gut
impression is simply leaving the IOMMU in a state that is on but not
responding could actually make a reboot or kexec less stable than having
on-going DMAs stomping on memory.  If you can leave it on, without
translations and not trapping to software that is a different story.

Eric


* Re: In place kexec
  2010-07-30 20:08                                         ` Eric W. Biederman
@ 2010-07-30 20:15                                           ` David Woodhouse
  2010-07-30 21:11                                             ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: David Woodhouse @ 2010-07-30 20:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, H. Peter Anvin, linux-driver,
	Vivek Goyal

On Fri, 2010-07-30 at 13:08 -0700, Eric W. Biederman wrote:
> The issue is what happens if you take an IOMMU page fault between
> shutdown and restart.  I seem to remember an IOMMU page fault
> triggering a machine check on AMD cpus.  So maybe it works but my gut
> impression is simply leaving the IOMMU in a state that is on but not
> responding could actually make a reboot or kexec less stable than having
> on-going DMAs stomping on memory.  If you can leave it on, without
> translations and not trapping to software that is a different story. 

Speaking of the Intel IOMMU, I know nothing of any 'on but not
responding' state. You have:

 - 'off', which gives a 1:1 mapping and thus if you do this during kexec
    any still-running devices could be scribbling *anywhere* in memory,
    using their previously-allocated virtual DMA addresses which are
    now interpreted as physical addresses.

 - 'on with page tables cleared', in which case you are safe but some
    devices might get upset when their DMA is aborted, so their driver
    needs not to be a pile of shit, and needs to recover from that.

 - 'on and we preserve the virt->phys mappings of the previous kernel',
    which is just crack-inspired. You'd have to find the physical pages
    which were mapped by the previous kernel and steal them away from
    the new kernel's memory map, just in case they get scribbled on by
    a device which hasn't been properly shut down by the previous
    kernel, through the still-extant DMA mappings.

I mention the latter only because it's been suggested by someone who was
dealing with a broken driver/hardware combination where it *didn't* get
properly reset after a fault, even when the driver was loaded anew. Not
because anyone in their right mind would ever *do* it.
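
To make the difference between those states concrete, here is a toy
userspace model, a sketch only and nothing like the real Intel IOMMU
programming interface. struct toy_iommu, toy_iommu_clear() and
toy_iommu_translate() are invented for illustration. It shows a stale
device address passing straight through when translation is off, and
being blocked when translation is on with no mappings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IOMMU_ENTRIES 4

/* One hypothetical mapping from a virtual DMA address to a physical one. */
struct toy_mapping {
	bool valid;
	uint64_t iova;
	uint64_t phys;
};

struct toy_iommu {
	bool enabled;
	struct toy_mapping map[TOY_IOMMU_ENTRIES];
};

/* Drop every mapping but leave translation enabled. */
static void toy_iommu_clear(struct toy_iommu *mmu)
{
	for (size_t i = 0; i < TOY_IOMMU_ENTRIES; i++)
		mmu->map[i].valid = false;
}

/*
 * Resolve a device DMA address.  Returns true and stores the physical
 * address in *phys if the access goes through, false if it is blocked.
 */
static bool toy_iommu_translate(const struct toy_iommu *mmu, uint64_t iova,
				uint64_t *phys)
{
	assert(phys != NULL);

	if (!mmu->enabled) {
		/* 'off': 1:1 pass-through, the stale address hits memory. */
		*phys = iova;
		return true;
	}
	for (size_t i = 0; i < TOY_IOMMU_ENTRIES; i++) {
		if (mmu->map[i].valid && mmu->map[i].iova == iova) {
			/* 'on' with the old mapping preserved. */
			*phys = mmu->map[i].phys;
			return true;
		}
	}
	/* 'on' with no matching mapping: the DMA is aborted. */
	return false;
}
```

In this model only the cleared-tables state returns false for a stale
address; both other states let the write land somewhere in memory.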

-- 
dwmw2



* Re: In place kexec
  2010-07-30 16:53                         ` David Woodhouse
  2010-07-30 18:21                           ` Eric W. Biederman
@ 2010-07-30 20:42                           ` H. Peter Anvin
  2010-07-30 21:18                             ` Khalid Aziz
  1 sibling, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30 20:42 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, linux-driver, Vivek Goyal,
	Andrew Vasquez

On 07/30/2010 09:53 AM, David Woodhouse wrote:
> On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
>> There isn't a bus master shut off at the core level.  
> 
> Effectively, there is if you have an IOMMU.

With the "core level" I meant Linux kernel code, as opposed to hardware
level which is slightly different; I meant it would make sense to at
least set the bus master control bit (PCI_COMMAND_MASTER) to zero before
kexec.
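
As a rough illustration of what clearing that bit amounts to, here is a
userspace sketch against a simulated config space. The PCI_COMMAND offset
and PCI_COMMAND_MASTER bit value are the standard ones; struct toy_pci_dev
and the toy_* accessors are hypothetical stand-ins for the kernel's config
space helpers, not real APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI values, using the same names as the Linux headers. */
#define PCI_COMMAND		0x04	/* offset of the 16-bit command register */
#define PCI_COMMAND_MASTER	0x4	/* bus master enable bit */

/* Hypothetical stand-in for one function's 256-byte config space. */
struct toy_pci_dev {
	uint8_t config[256];
};

static uint16_t toy_read_config_word(const struct toy_pci_dev *dev, int where)
{
	assert(where >= 0 && where + 1 < 256);
	/* Config space is little-endian. */
	return (uint16_t)(dev->config[where] | (dev->config[where + 1] << 8));
}

static void toy_write_config_word(struct toy_pci_dev *dev, int where,
				  uint16_t val)
{
	assert(where >= 0 && where + 1 < 256);
	dev->config[where] = (uint8_t)(val & 0xff);
	dev->config[where + 1] = (uint8_t)(val >> 8);
}

/* Clear bus mastering; returns true if the bit was set beforehand. */
static bool toy_clear_master(struct toy_pci_dev *dev)
{
	uint16_t cmd = toy_read_config_word(dev, PCI_COMMAND);

	if (!(cmd & PCI_COMMAND_MASTER))
		return false;
	cmd = (uint16_t)(cmd & ~PCI_COMMAND_MASTER);
	toy_write_config_word(dev, PCI_COMMAND, cmd);
	return true;
}
```

Only the one bit changes; the I/O and memory decode enables in the same
register are left alone, so the device still responds to the CPU.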

I also agree that reboot and kexec are different; the requirements for
kexec are really much more strict.

	-hpa



* Re: In place kexec
  2010-07-30 20:15                                           ` David Woodhouse
@ 2010-07-30 21:11                                             ` H. Peter Anvin
  0 siblings, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30 21:11 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, linux-driver, Vivek Goyal,
	Andrew Vasquez

On 07/30/2010 01:15 PM, David Woodhouse wrote:
> 
>  - 'on with page tables cleared', in which case you are safe but some
>     devices might get upset when their DMA is aborted, so their driver
>     needs not to be a pile of shit, and needs to recover from that.
> 

I presume this is the state he's referring to.

Now, if this means there are page tables in memory those tables are
still subject to being overwritten.

	-hpa


* Re: In place kexec
  2010-07-30 20:42                           ` H. Peter Anvin
@ 2010-07-30 21:18                             ` Khalid Aziz
  2010-07-30 21:44                               ` Khalid Aziz
  0 siblings, 1 reply; 36+ messages in thread
From: Khalid Aziz @ 2010-07-30 21:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, linux-driver@qlogic.com,
	David Woodhouse, Vivek Goyal, Andrew Vasquez

On Fri, 2010-07-30 at 20:42 +0000, H. Peter Anvin wrote:
> On 07/30/2010 09:53 AM, David Woodhouse wrote:
> > On Thu, 2010-07-29 at 21:41 -0700, Eric W. Biederman wrote:
> >> There isn't a bus master shut off at the core level.  
> > 
> > Effectively, there is if you have an IOMMU.
> 
> With the "core level" I meant Linux kernel code, as opposed to hardware
> level which is slightly different; I meant it would make sense to at
> least set the bus master control bit (PCI_COMMAND_MASTER) to zero before
> kexec.

Before the kexec patch for ia64 was merged into the mainline kernel, Zou
Nan Hai and I had added a device_shootdown() routine to
arch/ia64/kernel/crash.c that was called from machine_crash_shutdown().
device_shootdown() did exactly what you are proposing:

+static void device_shootdown(void)
+{
+       struct pci_dev *dev;
+       irq_desc_t *desc;
+       u16 pci_command;
+
+       list_for_each_entry(dev, &pci_devices, global_list) {
+               desc = irq_descp(dev->irq);
+               if (!desc->action)
+                       continue;
+               pci_read_config_word(dev, PCI_COMMAND, &pci_command);
+               if (pci_command & PCI_COMMAND_MASTER) {
+                       pci_command &= ~PCI_COMMAND_MASTER;
+                       pci_write_config_word(dev, PCI_COMMAND, pci_command);
+               }
+               disable_irq_nosync(dev->irq);
+               desc->handler->end(dev->irq);
+       }
+}

There was some discussion of this, and the code had been removed by the
time the patch was merged into the mainline kernel. I can't remember the
details of why. I remember one report of a kernel hang on kexec that
seemed to happen in device_shootdown(). I will look for any discussion
threads I can find.

-- 
Khalid
====================================================================
Khalid Aziz                             Telco Platform Software, ISB
(970)898-9214                                        Hewlett-Packard
khalid.aziz@hp.com                                  Fort Collins, CO

"The Linux kernel is subject to relentless development" 
                                - Alessandro Rubini



* Re: In place kexec
  2010-07-30 21:18                             ` Khalid Aziz
@ 2010-07-30 21:44                               ` Khalid Aziz
       [not found]                                 ` <20120425211512.GA8583@ldl.usa.hp.com>
  0 siblings, 1 reply; 36+ messages in thread
From: Khalid Aziz @ 2010-07-30 21:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, linux-driver@qlogic.com,
	David Woodhouse, Vivek Goyal, Andrew Vasquez

On Fri, 2010-07-30 at 21:18 +0000, Aziz, Khalid wrote:
> There were some discussions regarding this and this code was removed by
> the time it was merged into mainline kernel. I can't remember the
> details of why. I remember one report of kernel hang on kexec that
> seemed to happen in device_shootdown(). I will look for any discussion
> threads I can find.
> 

These are the messages I could find discussing the code that disables the
PCI bus master bit, from before that code was removed from the ia64 tree:

<https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010175.html>
<https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010176.html>
<https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010178.html>
<https://lists.linux-foundation.org/pipermail/fastboot/2006-June/010214.html>

Maybe it is time to revisit this, for more than just ia64.

-- 
Khalid
====================================================================
Khalid Aziz                             Telco Platform Software, ISB
(970)898-9214                                        Hewlett-Packard
khalid.aziz@hp.com                                  Fort Collins, CO

"The Linux kernel is subject to relentless development" 
                                - Alessandro Rubini



* Re: In place kexec
  2010-07-30 18:36                               ` Eric W. Biederman
@ 2010-07-30 22:52                                 ` Andrew Vasquez
  2010-07-30 23:25                                   ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew Vasquez @ 2010-07-30 22:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, H. Peter Anvin, Linux Driver, Vivek Goyal

On Fri, 30 Jul 2010, Eric W. Biederman wrote:

> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> > On 07/30/2010 09:30 AM, Eric W. Biederman wrote:
> >> 
> >>>> That said it looks like the code to do the shutdown is in
> >>>> qla2x00_remove_one so it shouldn't be too hard if someone cared to
> >>>> extract just the hardware bits.
> >>>
> >>> Charming.  Code is there, just not hooked up.
> >> 
> >> Using the .remove method in reboot is a fight I lost long ago.
> >> 
> >
> > Could you elucidate, please?
> 
> My original proposal was for device_shutdown to call the .remove
> methods as those are well exercised and tested in development. aka
> rmmod.
> 
> It was argued (with some merit) that for a system reboot we don't want
> to perform all of the subsystem registration work, to make it more
> likely that reboot -f will reboot even if there is a kernel oops.
> 
> What I proposed, and unfortunately failed to write the patch for at the
> time, was to have the device remove path call shutdown before calling
> remove, so drivers wouldn't have to code it all up twice.
> 
> A lot of the disk drivers implement .shutdown these days and there aren't
> many bug reports about kexec failing.  So I would be reluctant to change
> things other than on a driver by driver basis unless I had a lot of time
> for testing etc.
> 
> It might be worth playing with adding a pci_clear_master in
> pci_device_shutdown.  It has the potential to break things like usb
> keyboards, so I would be careful.  If it doesn't break fundamental
> things like usb a pci_clear_master when shutting down devices should
> improve reliability somewhat.
> 
> And of course there is the old staple of workarounds: "rmmod <driver>"
> before calling kexec --exec.

Looking through all these emails, what's the upshot here?  Is the
expectation for all storage drivers to start implementing some
'minimal' level of shutdown with the hardware/firmware during the
.shutdown callback?


--
Andrew Vasquez


* Re: In place kexec
  2010-07-30 22:52                                 ` Andrew Vasquez
@ 2010-07-30 23:25                                   ` H. Peter Anvin
  2010-07-30 23:40                                     ` Eric W. Biederman
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2010-07-30 23:25 UTC (permalink / raw)
  To: Andrew Vasquez
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, Linux Driver, Vivek Goyal

On 07/30/2010 03:52 PM, Andrew Vasquez wrote:
> 
> Looking through all these emails, what's the upshot here?  Is the
> expectation for all storage drivers to start implementing some
> 'minimal' level of shutdown with the hardware/firmware during the
> .shutdown callback?
> 

I believe so.  It seems to be a fundamental requirement for kexec to
function.

	-hpa


* Re: In place kexec
  2010-07-30 23:25                                   ` H. Peter Anvin
@ 2010-07-30 23:40                                     ` Eric W. Biederman
  0 siblings, 0 replies; 36+ messages in thread
From: Eric W. Biederman @ 2010-07-30 23:40 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Andrew Vasquez, Linux Driver, Vivek Goyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 07/30/2010 03:52 PM, Andrew Vasquez wrote:
>> 
>> Looking through all these emails, what's the upshot here?  Is the
>> expectation for all storage drivers to start implementing some
>> 'minimal' level of shutdown with the hardware/firmware during the
>> .shutdown callback?
>> 
>
> I believe so.  It seems to be a fundamental requirement for kexec to
> function.

Yes.  Implementing a .shutdown method is the solution we have, and the
requirement has been stable for several years.  I did a quick grep
through drivers/scsi and a lot of the storage drivers already
implement the .shutdown method.

Beyond not leaving DMAs running, which can foul up kexec, there is also
the need to ensure any drive caches are flushed on reboot.  I know
ide/sata drivers have been handling this case in ide_gd_shutdown for a
long time to ensure the drives' write-back caches are flushed.

Eric


* Re: In place kexec
       [not found]                                 ` <20120425211512.GA8583@ldl.usa.hp.com>
@ 2012-04-25 22:06                                   ` Vivek Goyal
  0 siblings, 0 replies; 36+ messages in thread
From: Vivek Goyal @ 2012-04-25 22:06 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: Randy Dunlap, Neil Horman, kexec@lists.infradead.org,
	Simon Horman, Eric W. Biederman, H. Peter Anvin,
	linux-driver@qlogic.com, David Woodhouse, Andrew Vasquez

On Wed, Apr 25, 2012 at 03:15:12PM -0600, Khalid Aziz wrote:

[..]
> I would appreciate if others could test it and give a thumbs up or thumbs 
> down before I send it to LKML.

Kexec still seems to work on my x86_64 box with this patch applied. So
from testing point of view thumbs up.

Thanks
Vivek

