* [Qemu-devel] e1000e migration
From: Dr. David Alan Gilbert @ 2017-05-11 12:36 UTC
To: dmitry; +Cc: qemu-devel

Hi Dmitry,
  Have you seen any problems with e1000e migration under Windows?
I've got a repeatable case where, after migration with e1000e, Windows
hangs or almost hangs.
I'm seeing the e1000e generate interrupts at a very high
rate (maybe ~1000 per second or so?) after migration.

Some versions of qemu do it and some don't, but my attempts
at bisection lead me to code that should be irrelevant.

Prior to migration I see:

36461@1494504466.711929:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
36461@1494504466.711992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
36461@1494504466.712076:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)
36461@1494504466.712245:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
36461@1494504466.712332:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)

which I think means the following ICR bits are set (decoding 0x80100082):
   31 - int asserted
   20 - RxQ0 - receive queue 0 interrupt
    7 - RXT0 - receiver timer interrupt
    1 - TXQE - transmit queue empty

After migration it varies more; I'm seeing mostly:
21977@1494504516.320707:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
   31 - int asserted
   24 - 'Other'
   22 - TxQ0 interrupt
   20 - RxQ0 interrupt
   07 - RXT0 - receiver timer interrupt
   06 - RXO - receiver overrun
   01 - TXQE - transmit queue empty

For reference this is https://bugzilla.redhat.com/show_bug.cgi?id=1447935

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

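
The bit-by-bit reading of the ICR values in the message above can be reproduced mechanically. The following is a minimal sketch in Python, not code from QEMU: the table covers only the bit positions listed in that message, and both the `ICR_BITS` table and the `decode_icr` helper are hypothetical names introduced here for illustration.

```python
# Bit positions and meanings as listed in the message above. This table is an
# illustration only and is deliberately limited to the bits named in the thread.
ICR_BITS = {
    31: "INT_ASSERTED",  # int asserted
    24: "OTHER",         # 'Other'
    22: "TXQ0",          # TxQ0 interrupt
    20: "RXQ0",          # RxQ0 interrupt
    7: "RXT0",           # receiver timer interrupt
    6: "RXO",            # receiver overrun
    1: "TXQE",           # transmit queue empty
}


def decode_icr(value):
    """Return names of the set bits, highest bit first.

    Bits that are not in ICR_BITS are reported as 'bit<N>' rather than guessed.
    """
    names = []
    for bit in range(31, -1, -1):
        if value & (1 << bit):
            names.append(ICR_BITS.get(bit, f"bit{bit}"))
    return names


if __name__ == "__main__":
    # The two ICR values discussed above: before and after migration.
    for icr in (0x80100082, 0x815000C2):
        print(f"{icr:#010x}: {', '.join(decode_icr(icr))}")
```

Reporting unknown bits as `bit<N>` keeps the sketch honest about values that appear in the traces but are not named in the thread.
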
* Re: [Qemu-devel] e1000e migration
From: Dmitry Fleytman @ 2017-05-14 13:14 UTC
To: Dr. David Alan Gilbert; +Cc: qemu-devel

Hello Dave,

We are trying to reproduce this issue on our systems but with no luck so far…

From what you describe it looks like some bit in ICR is not being cleared by the driver.
This usually means that this bit should never be set in that specific interrupt mode.

Could you please check which bit is not cleared and who sets it?

Regards,
Dmitry

* Re: [Qemu-devel] e1000e migration
From: Dr. David Alan Gilbert @ 2017-05-15 9:22 UTC
To: Dmitry Fleytman; +Cc: qemu-devel

* Dmitry Fleytman (dmitry@daynix.com) wrote:
> Hello Dave,

Hi Dmitry,
  Thanks for the reply.

> We are trying to reproduce this issue on our systems but with no luck so far…

Note our QE hit this with both a Win8.1 and a Win2012r2 guest, although
the 2012r2 guest is reported to have recovered after a few minutes.
2016 apparently works OK.

> From what you describe it looks like some bit in ICR is not being cleared by the driver.
> This usually means that this bit should never be set in that specific interrupt mode.
>
> Could you please check which bit is not cleared and who sets it?

The full set of e1000e_irq_pending_interrupts after migration is:
23004@1494519346.673905:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
23004@1494519346.674787:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
23004@1494519346.674946:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
23004@1494519346.675119:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x80300082, IMS: 0x1e00004)
23004@1494519346.675302:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1c00004)
<repeated lots>
23004@1494519346.716279:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80300082, IMS: 0x1c00004)
23004@1494519346.716380:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1c00004)
23004@1494519346.717040:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1400004)
23004@1494519346.717276:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1000004)
23004@1494519346.717443:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.717567:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.717782:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.717918:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.718319:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.718523:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
23004@1494519346.718684:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
23004@1494519346.718890:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
23004@1494519346.719034:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
23004@1494519346.719130:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
<repeats>
23004@1494519346.722699:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
23004@1494519346.722868:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
23004@1494519346.723068:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x800004)
<repeats>
23004@1494519346.731198:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x800004)
23004@1494519346.731422:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
23004@1494519346.731930:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
23004@1494519346.732082:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
23004@1494519346.732274:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
23004@1494519346.732404:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
23004@1494519346.732504:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.786506:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.786534:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.789644:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
23004@1494519346.789864:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.789992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.790413:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.790539:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.792593:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.792620:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)

and then I think we get stuck in a cycle where this last one is always the
one that fires repeatedly. I think that's the 'other' interrupt firing,
probably because of the receive overrun. One thing I've not
figured out is why the receive overrun happens: is it because we
really have a very heavy packet rate, or because something has
stopped receiving packets?
The network I'm testing on does have a fair amount of broadcast traffic.

Dave

> Regards,
> Dmitry

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

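
One way to approach the question "which bit is not cleared" is to AND together the ICR values across a run of trace lines, so that only bits set in every sample survive. The sketch below assumes the trace line format quoted in this thread. `TRACE_RE`, `parse_trace`, `never_cleared`, and `SAMPLE` are hypothetical names introduced here, not QEMU tooling. It also checks an observation that holds for the samples quoted in the thread, namely that the PENDING field equals ICR & IMS.

```python
import re

# One e1000e_irq_pending_interrupts line, in the format quoted in this thread:
#   <pid>@<timestamp>:<event> ICR PENDING: <hex> (ICR: <hex>, IMS: <hex>)
TRACE_RE = re.compile(
    r"(\d+)@(\d+\.\d+):e1000e_irq_pending_interrupts\s+"
    r"ICR PENDING:\s+(0x[0-9a-fA-F]+)\s+"
    r"\(ICR:\s+(0x[0-9a-fA-F]+),\s+IMS:\s+(0x[0-9a-fA-F]+)\)"
)


def parse_trace(text):
    """Return a list of (timestamp, pending, icr, ims) tuples found in text."""
    samples = []
    for _pid, ts, pending, icr, ims in TRACE_RE.findall(text):
        samples.append((float(ts), int(pending, 16), int(icr, 16), int(ims, 16)))
    return samples


def never_cleared(samples):
    """Mask of ICR bits that are set in every sample (0 if there are none)."""
    if not samples:
        return 0
    mask = 0xFFFFFFFF
    for _ts, _pending, icr, _ims in samples:
        mask &= icr
    return mask


# Three lines copied from the trace above.
SAMPLE = """\
23004@1494519346.732504:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
"""

if __name__ == "__main__":
    samples = parse_trace(SAMPLE)
    print(f"bits set in every sample: {never_cleared(samples):#010x}")
    for ts, pending, icr, ims in samples:
        # In the samples quoted in this thread, PENDING matches ICR & IMS.
        print(f"{ts:.6f} pending={pending:#x} icr&ims={icr & ims:#x}")
```

A bit that survives the AND is only a candidate: the trace samples ICR at specific points, so a bit could be cleared and set again between two lines without showing up here.
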
* Re: [Qemu-devel] e1000e migration
From: Dmitry Fleytman @ 2017-05-15 16:13 UTC
To: Dr. David Alan Gilbert; +Cc: qemu-devel

Hello Dave,

It looks like we identified the problem.

We are working on a fix and will send it as soon as it is ready.

~Dmitry.

* Re: [Qemu-devel] e1000e migration
From: Dr. David Alan Gilbert @ 2017-05-15 16:14 UTC
To: Dmitry Fleytman; +Cc: qemu-devel

* Dmitry Fleytman (dmitry@daynix.com) wrote:
> Hello Dave,
>
> It looks like we identified the problem.
>
> We are working on a fix and will send it as soon as it is ready.

Thanks!

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK