* Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY [not found] <20230628091148.62256-1-florian.kauer@linutronix.de> @ 2023-06-28 21:34 ` Vinicius Costa Gomes 2023-06-29 7:07 ` Florian Kauer 0 siblings, 1 reply; 4+ messages in thread From: Vinicius Costa Gomes @ 2023-06-28 21:34 UTC (permalink / raw) To: Florian Kauer, Jesse Brandeburg, Tony Nguyen, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Vedang Patel, Maciej Fijalkowski, Jithu Joseph, Andre Guedes, Simon Horman Cc: netdev, kurt, intel-wired-lan, linux-kernel Florian Kauer <florian.kauer@linutronix.de> writes: > In normal operation, each populated queue item has > next_to_watch pointing to the last TX desc of the packet, > while each cleaned item has it set to 0. In particular, > next_to_use that points to the next (necessarily clean) > item to use has next_to_watch set to 0. > > When the TX queue is used both by an application using > AF_XDP with ZEROCOPY as well as a second non-XDP application > generating high traffic, the queue pointers can get in > an invalid state where next_to_use points to an item > where next_to_watch is NOT set to 0. > > However, the implementation assumes at several places > that this is never the case, so if it does hold, > bad things happen. In particular, within the loop inside > of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. > Finally, this prevents any further transmission via > this queue and it never gets unblocked or signaled. > Secondly, if the queue is in this garbled state, > the inner loop of igc_clean_tx_ring() will never terminate, > completely hogging a CPU core. > > The reason is that igc_xdp_xmit_zc() reads next_to_use > before acquiring the lock, and writing it back > (potentially unmodified) later. If it got modified > before locking, the outdated next_to_use is written > pointing to an item that was already used elsewhere > (and thus next_to_watch got written). > > Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") > Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> > Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> > Tested-by: Kurt Kanzenbach <kurt@linutronix.de> > --- This patch doesn't directly apply because there's a small conflict with commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"), but really easy to solve. Anyway, good catch: Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY 2023-06-28 21:34 ` [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY Vinicius Costa Gomes @ 2023-06-29 7:07 ` Florian Kauer 2023-06-29 16:20 ` Tony Nguyen 2023-06-29 16:25 ` Vinicius Costa Gomes 0 siblings, 2 replies; 4+ messages in thread From: Florian Kauer @ 2023-06-29 7:07 UTC (permalink / raw) To: Vinicius Costa Gomes, Jesse Brandeburg, Tony Nguyen, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Vedang Patel, Maciej Fijalkowski, Jithu Joseph, Andre Guedes, Simon Horman Cc: netdev, kurt, intel-wired-lan, linux-kernel Hi Vinicius, On 28.06.23 23:34, Vinicius Costa Gomes wrote: > Florian Kauer <florian.kauer@linutronix.de> writes: > >> In normal operation, each populated queue item has >> next_to_watch pointing to the last TX desc of the packet, >> while each cleaned item has it set to 0. In particular, >> next_to_use that points to the next (necessarily clean) >> item to use has next_to_watch set to 0. >> >> When the TX queue is used both by an application using >> AF_XDP with ZEROCOPY as well as a second non-XDP application >> generating high traffic, the queue pointers can get in >> an invalid state where next_to_use points to an item >> where next_to_watch is NOT set to 0. >> >> However, the implementation assumes at several places >> that this is never the case, so if it does hold, >> bad things happen. In particular, within the loop inside >> of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. >> Finally, this prevents any further transmission via >> this queue and it never gets unblocked or signaled. >> Secondly, if the queue is in this garbled state, >> the inner loop of igc_clean_tx_ring() will never terminate, >> completely hogging a CPU core. >> >> The reason is that igc_xdp_xmit_zc() reads next_to_use >> before acquiring the lock, and writing it back >> (potentially unmodified) later. If it got modified >> before locking, the outdated next_to_use is written >> pointing to an item that was already used elsewhere >> (and thus next_to_watch got written). >> >> Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") >> Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> >> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> >> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> >> --- > > This patch doesn't directly apply because there's a small conflict with > commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"), > but really easy to solve. > > Anyway, good catch: > > Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> I am sorry, that was bad timing. I prepared the initial patch on Friday and overlooked the merge. Shall I send a v3 or will someone else take care of the conflict resolution? Greetings, Florian ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY 2023-06-29 7:07 ` Florian Kauer @ 2023-06-29 16:20 ` Tony Nguyen 2023-06-29 16:25 ` Vinicius Costa Gomes 1 sibling, 0 replies; 4+ messages in thread From: Tony Nguyen @ 2023-06-29 16:20 UTC (permalink / raw) To: Florian Kauer, Vinicius Costa Gomes, Jesse Brandeburg, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Vedang Patel, Maciej Fijalkowski, Jithu Joseph, Andre Guedes, Simon Horman Cc: netdev, kurt, intel-wired-lan, linux-kernel On 6/29/2023 12:07 AM, Florian Kauer wrote: ... >> This patch doesn't directly apply because there's a small conflict with >> commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"), >> but really easy to solve. >> >> Anyway, good catch: >> >> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> > > I am sorry, that was bad timing. I prepared the initial patch on Friday and overlooked the merge. > Shall I send a v3 or will someone else take care of the conflict resolution? A v3 would be great. Thanks, Tony ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY 2023-06-29 7:07 ` Florian Kauer 2023-06-29 16:20 ` Tony Nguyen @ 2023-06-29 16:25 ` Vinicius Costa Gomes 1 sibling, 0 replies; 4+ messages in thread From: Vinicius Costa Gomes @ 2023-06-29 16:25 UTC (permalink / raw) To: Florian Kauer, Jesse Brandeburg, Tony Nguyen, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Vedang Patel, Maciej Fijalkowski, Jithu Joseph, Andre Guedes, Simon Horman Cc: netdev, kurt, intel-wired-lan, linux-kernel Florian Kauer <florian.kauer@linutronix.de> writes: > Hi Vinicius, > > On 28.06.23 23:34, Vinicius Costa Gomes wrote: >> Florian Kauer <florian.kauer@linutronix.de> writes: >> >>> In normal operation, each populated queue item has >>> next_to_watch pointing to the last TX desc of the packet, >>> while each cleaned item has it set to 0. In particular, >>> next_to_use that points to the next (necessarily clean) >>> item to use has next_to_watch set to 0. >>> >>> When the TX queue is used both by an application using >>> AF_XDP with ZEROCOPY as well as a second non-XDP application >>> generating high traffic, the queue pointers can get in >>> an invalid state where next_to_use points to an item >>> where next_to_watch is NOT set to 0. >>> >>> However, the implementation assumes at several places >>> that this is never the case, so if it does hold, >>> bad things happen. In particular, within the loop inside >>> of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. >>> Finally, this prevents any further transmission via >>> this queue and it never gets unblocked or signaled. >>> Secondly, if the queue is in this garbled state, >>> the inner loop of igc_clean_tx_ring() will never terminate, >>> completely hogging a CPU core. >>> >>> The reason is that igc_xdp_xmit_zc() reads next_to_use >>> before acquiring the lock, and writing it back >>> (potentially unmodified) later. If it got modified >>> before locking, the outdated next_to_use is written >>> pointing to an item that was already used elsewhere >>> (and thus next_to_watch got written). >>> >>> Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") >>> Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> >>> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> >>> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> >>> --- >> >> This patch doesn't directly apply because there's a small conflict with >> commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"), >> but really easy to solve. >> >> Anyway, good catch: >> >> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> > > I am sorry, that was bad timing. I prepared the initial patch on Friday and overlooked the merge. > Shall I send a v3 or will someone else take care of the conflict > resolution? I think it's easier if you send a v3. Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-06-29 16:27 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20230628091148.62256-1-florian.kauer@linutronix.de>
2023-06-28 21:34 ` [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY Vinicius Costa Gomes
2023-06-29 7:07 ` Florian Kauer
2023-06-29 16:20 ` Tony Nguyen
2023-06-29 16:25 ` Vinicius Costa Gomes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).