netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx
@ 2023-10-18 19:34 Mirsad Goran Todorovac
  2023-10-18 19:34 ` [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 Mirsad Goran Todorovac
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Mirsad Goran Todorovac @ 2023-10-18 19:34 UTC (permalink / raw)
  To: Heiner Kallweit, netdev, linux-kernel
  Cc: nic_swsd, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mirsad Goran Todorovac, Marco Elver

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]

write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
sch_direct_xmit (net/sched/sch_generic.c:342)
__dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
__ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
ip_finish_output (net/ipv4/ip_output.c:328)
ip_output (net/ipv4/ip_output.c:435)
ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
udp_send_skb (net/ipv4/udp.c:963)
udp_sendmsg (net/ipv4/udp.c:1246)
inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
sock_sendmsg (net/socket.c:730 net/socket.c:753)
__sys_sendto (net/socket.c:2177)
__x64_sys_sendto (net/socket.c:2185)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x002f4815 -> 0x002f4816

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The write side of drivers/net/ethernet/realtek/r8169_main.c is:
==================
   4251         /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
   4252         smp_wmb();
   4253
 → 4254         WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
   4255
   4256         stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
   4257                                                 R8169_TX_STOP_THRS,
   4258                                                 R8169_TX_START_THRS);

The read side is the function rtl_tx():

   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
   4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
 → 4397                 if (tp->cur_tx != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

Obviously from the code, an earlier detected data-race for tp->cur_tx was fixed in the
line 4363:

   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {

but the same solution is required for protecting the other access to tp->cur_tx:

 → 4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);

The write in the line 4254 is protected with WRITE_ONCE(), but the read in the line 4397
might have suffered read tearing under some compiler optimisations.

The fix eliminated the KCSAN data-race report for this bug.

It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363
and line 4397. This test should certainly not be cached by the compiler in some register
for such a long time, while asynchronous writes to tp->cur_tx might have occurred in line
4254 in the meantime.

Fixes: 94d8a98e6235c ("r8169: reduce number of workaround doorbell rings")
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
---
v4:
 fixed the Fixes: tag for 2/3.
 
v3:
 fixed the Fixes: tag for 3/3.

v2:
 fixed double Signed-off-by: tag

 drivers/net/ethernet/realtek/r8169_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 6351a2dc13bc..281aaa851847 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4394,7 +4394,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
 		 * If skb is NULL then we come here again once a tx irq is
 		 * triggered after the last fragment is marked transmitted.
 		 */
-		if (tp->cur_tx != dirty_tx && skb)
+		if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
 			rtl8169_doorbell(tp);
 	}
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1
  2023-10-18 19:34 [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Mirsad Goran Todorovac
@ 2023-10-18 19:34 ` Mirsad Goran Todorovac
  2023-10-20 13:00   ` Simon Horman
  2023-10-18 19:34 ` [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1 Mirsad Goran Todorovac
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Mirsad Goran Todorovac @ 2023-10-18 19:34 UTC (permalink / raw)
  To: Heiner Kallweit, netdev, linux-kernel
  Cc: nic_swsd, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mirsad Goran Todorovac, Marco Elver

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169

race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0xb0000042 -> 0x00000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The read side is in

drivers/net/ethernet/realtek/r8169_main.c
=========================================
   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
 → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
   4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
this KCSAN warning.

   4366
 → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
   4368                 if (status & DescOwn)
   4369                         break;
   4370

Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
---
v4:
 fixed the Fixes: tag for 2/3.

v3:
 fixed the Fixes: tag for 3/3.

v2:
 fixed double Signed-off-by: tag
 
 drivers/net/ethernet/realtek/r8169_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 81be6085a480..361b90007148 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4364,7 +4364,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
 		unsigned int entry = dirty_tx % NUM_TX_DESC;
 		u32 status;
 
-		status = le32_to_cpu(tp->TxDescArray[entry].opts1);
+		status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
 		if (status & DescOwn)
 			break;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1
  2023-10-18 19:34 [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Mirsad Goran Todorovac
  2023-10-18 19:34 ` [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 Mirsad Goran Todorovac
@ 2023-10-18 19:34 ` Mirsad Goran Todorovac
  2023-10-20 13:00   ` Simon Horman
  2023-10-20 12:59 ` [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Simon Horman
  2023-10-20 23:50 ` Jakub Kicinski
  3 siblings, 1 reply; 8+ messages in thread
From: Mirsad Goran Todorovac @ 2023-10-18 19:34 UTC (permalink / raw)
  To: Heiner Kallweit, netdev, linux-kernel
  Cc: nic_swsd, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mirsad Goran Todorovac, Marco Elver

KCSAN reported the following data-race bug:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169

race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x80003fff -> 0x3402805f

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

drivers/net/ethernet/realtek/r8169_main.c:
==========================================
   4429
 → 4430                 status = le32_to_cpu(desc->opts1);
   4431                 if (status & DescOwn)
   4432                         break;
   4433
   4434                 /* This barrier is needed to keep us from reading
   4435                  * any other fields out of the Rx descriptor until
   4436                  * we know the status of DescOwn
   4437                  */
   4438                 dma_rmb();
   4439
   4440                 if (unlikely(status & RxRES)) {
   4441                         if (net_ratelimit())
   4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",

Marco Elver explained that dma_rmb() doesn't prevent the compiler to tear up the access to
desc->opts1 which can be written to concurrently. READ_ONCE() should prevent that from
happening:

   4429
 → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
   4431                 if (status & DescOwn)
   4432                         break;
   4433

As the consequence of this fix, this KCSAN warning was eliminated.

Fixes: 6202806e7c03a ("r8169: drop member opts1_mask from struct rtl8169_private")
Suggested-by: Marco Elver <elver@google.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
---
v4:
 fixed the Fixes: tag for 2/3.

v3:
 fixed the Fixes: tag for 3/3.

v2:
 fixed double Signed-off-by: tag

 drivers/net/ethernet/realtek/r8169_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 281aaa851847..81be6085a480 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4427,7 +4427,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
 		dma_addr_t addr;
 		u32 status;
 
-		status = le32_to_cpu(desc->opts1);
+		status = le32_to_cpu(READ_ONCE(desc->opts1));
 		if (status & DescOwn)
 			break;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx
  2023-10-18 19:34 [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Mirsad Goran Todorovac
  2023-10-18 19:34 ` [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 Mirsad Goran Todorovac
  2023-10-18 19:34 ` [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1 Mirsad Goran Todorovac
@ 2023-10-20 12:59 ` Simon Horman
  2023-10-20 23:50 ` Jakub Kicinski
  3 siblings, 0 replies; 8+ messages in thread
From: Simon Horman @ 2023-10-20 12:59 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Heiner Kallweit, netdev, linux-kernel, nic_swsd, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Marco Elver

On Wed, Oct 18, 2023 at 09:34:34PM +0200, Mirsad Goran Todorovac wrote:
> KCSAN reported the following data-race:
> 
> ==================================================================
> BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]
> 
> write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
> rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
> dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
> sch_direct_xmit (net/sched/sch_generic.c:342)
> __dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
> ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
> __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
> ip_finish_output (net/ipv4/ip_output.c:328)
> ip_output (net/ipv4/ip_output.c:435)
> ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
> udp_send_skb (net/ipv4/udp.c:963)
> udp_sendmsg (net/ipv4/udp.c:1246)
> inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
> sock_sendmsg (net/socket.c:730 net/socket.c:753)
> __sys_sendto (net/socket.c:2177)
> __x64_sys_sendto (net/socket.c:2185)
> do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
> 
> read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
> rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
> __napi_poll (net/core/dev.c:6527)
> net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
> __do_softirq (kernel/softirq.c:553)
> __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
> irq_exit_rcu (kernel/softirq.c:647)
> common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
> cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
> cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
> call_cpuidle (kernel/sched/idle.c:135)
> do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
> cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
> start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
> secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
> 
> value changed: 0x002f4815 -> 0x002f4816
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
> Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> ==================================================================
> 
> The write side of drivers/net/ethernet/realtek/r8169_main.c is:
> ==================
>    4251         /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
>    4252         smp_wmb();
>    4253
>  → 4254         WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
>    4255
>    4256         stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
>    4257                                                 R8169_TX_STOP_THRS,
>    4258                                                 R8169_TX_START_THRS);
> 
> The read side is the function rtl_tx():
> 
>    4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
>    4356                    int budget)
>    4357 {
>    4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
>    4359         struct sk_buff *skb;
>    4360
>    4361         dirty_tx = tp->dirty_tx;
>    4362
>    4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
>    4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
>    4365                 u32 status;
>    4366
>    4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
>    4368                 if (status & DescOwn)
>    4369                         break;
>    4370
>    4371                 skb = tp->tx_skb[entry].skb;
>    4372                 rtl8169_unmap_tx_skb(tp, entry);
>    4373
>    4374                 if (skb) {
>    4375                         pkts_compl++;
>    4376                         bytes_compl += skb->len;
>    4377                         napi_consume_skb(skb, budget);
>    4378                 }
>    4379                 dirty_tx++;
>    4380         }
>    4381
>    4382         if (tp->dirty_tx != dirty_tx) {
>    4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
>    4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
>    4385
>    4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
>    4387                                               rtl_tx_slots_avail(tp),
>    4388                                               R8169_TX_START_THRS);
>    4389                 /*
>    4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
>    4391                  * too close. Let's kick an extra TxPoll request when a burst
>    4392                  * of start_xmit activity is detected (if it is not detected,
>    4393                  * it is slow enough). -- FR
>    4394                  * If skb is NULL then we come here again once a tx irq is
>    4395                  * triggered after the last fragment is marked transmitted.
>    4396                  */
>  → 4397                 if (tp->cur_tx != dirty_tx && skb)
>    4398                         rtl8169_doorbell(tp);
>    4399         }
>    4400 }
> 
> Obviously from the code, an earlier detected data-race for tp->cur_tx was fixed in the
> line 4363:
> 
>    4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
> 
> but the same solution is required for protecting the other access to tp->cur_tx:
> 
>  → 4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
>    4398                         rtl8169_doorbell(tp);
> 
> The write in the line 4254 is protected with WRITE_ONCE(), but the read in the line 4397
> might have suffered read tearing under some compiler optimisations.
> 
> The fix eliminated the KCSAN data-race report for this bug.
> 
> It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363
> and line 4397. This test should certainly not be cached by the compiler in some register
> for such a long time, while asynchronous writes to tp->cur_tx might have occurred in line
> 4254 in the meantime.
> 
> Fixes: 94d8a98e6235c ("r8169: reduce number of workaround doorbell rings")
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> Cc: nic_swsd@realtek.com
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Marco Elver <elver@google.com>
> Cc: netdev@vger.kernel.org
> Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> Acked-by: Marco Elver <elver@google.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1
  2023-10-18 19:34 ` [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 Mirsad Goran Todorovac
@ 2023-10-20 13:00   ` Simon Horman
  0 siblings, 0 replies; 8+ messages in thread
From: Simon Horman @ 2023-10-20 13:00 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Heiner Kallweit, netdev, linux-kernel, nic_swsd, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Marco Elver

On Wed, Oct 18, 2023 at 09:34:36PM +0200, Mirsad Goran Todorovac wrote:
> KCSAN reported the following data-race:
> 
> ==================================================================
> BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
> 
> race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
> rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
> __napi_poll (net/core/dev.c:6527)
> net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
> __do_softirq (kernel/softirq.c:553)
> __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
> irq_exit_rcu (kernel/softirq.c:647)
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
> cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
> cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
> call_cpuidle (kernel/sched/idle.c:135)
> do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
> cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
> start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
> secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
> 
> value changed: 0xb0000042 -> 0x00000000
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
> Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> ==================================================================
> 
> The read side is in
> 
> drivers/net/ethernet/realtek/r8169_main.c
> =========================================
>    4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
>    4356                    int budget)
>    4357 {
>    4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
>    4359         struct sk_buff *skb;
>    4360
>    4361         dirty_tx = tp->dirty_tx;
>    4362
>    4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
>    4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
>    4365                 u32 status;
>    4366
>  → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
>    4368                 if (status & DescOwn)
>    4369                         break;
>    4370
>    4371                 skb = tp->tx_skb[entry].skb;
>    4372                 rtl8169_unmap_tx_skb(tp, entry);
>    4373
>    4374                 if (skb) {
>    4375                         pkts_compl++;
>    4376                         bytes_compl += skb->len;
>    4377                         napi_consume_skb(skb, budget);
>    4378                 }
>    4379                 dirty_tx++;
>    4380         }
>    4381
>    4382         if (tp->dirty_tx != dirty_tx) {
>    4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
>    4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
>    4385
>    4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
>    4387                                               rtl_tx_slots_avail(tp),
>    4388                                               R8169_TX_START_THRS);
>    4389                 /*
>    4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
>    4391                  * too close. Let's kick an extra TxPoll request when a burst
>    4392                  * of start_xmit activity is detected (if it is not detected,
>    4393                  * it is slow enough). -- FR
>    4394                  * If skb is NULL then we come here again once a tx irq is
>    4395                  * triggered after the last fragment is marked transmitted.
>    4396                  */
>    4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
>    4398                         rtl8169_doorbell(tp);
>    4399         }
>    4400 }
> 
> tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
> this KCSAN warning.
> 
>    4366
>  → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
>    4368                 if (status & DescOwn)
>    4369                         break;
>    4370
> 
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> Cc: nic_swsd@realtek.com
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Marco Elver <elver@google.com>
> Cc: netdev@vger.kernel.org
> Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> Acked-by: Marco Elver <elver@google.com>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1
  2023-10-18 19:34 ` [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1 Mirsad Goran Todorovac
@ 2023-10-20 13:00   ` Simon Horman
  0 siblings, 0 replies; 8+ messages in thread
From: Simon Horman @ 2023-10-20 13:00 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Heiner Kallweit, netdev, linux-kernel, nic_swsd, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Marco Elver

On Wed, Oct 18, 2023 at 09:34:38PM +0200, Mirsad Goran Todorovac wrote:
> KCSAN reported the following data-race bug:
> 
> ==================================================================
> BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
> 
> race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
> rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
> __napi_poll (net/core/dev.c:6527)
> net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
> __do_softirq (kernel/softirq.c:553)
> __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
> irq_exit_rcu (kernel/softirq.c:647)
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
> cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
> cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
> call_cpuidle (kernel/sched/idle.c:135)
> do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
> cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
> start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
> secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
> 
> value changed: 0x80003fff -> 0x3402805f
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
> Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> ==================================================================
> 
> drivers/net/ethernet/realtek/r8169_main.c:
> ==========================================
>    4429
>  → 4430                 status = le32_to_cpu(desc->opts1);
>    4431                 if (status & DescOwn)
>    4432                         break;
>    4433
>    4434                 /* This barrier is needed to keep us from reading
>    4435                  * any other fields out of the Rx descriptor until
>    4436                  * we know the status of DescOwn
>    4437                  */
>    4438                 dma_rmb();
>    4439
>    4440                 if (unlikely(status & RxRES)) {
>    4441                         if (net_ratelimit())
>    4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",
> 
> Marco Elver explained that dma_rmb() doesn't prevent the compiler to tear up the access to
> desc->opts1 which can be written to concurrently. READ_ONCE() should prevent that from
> happening:
> 
>    4429
>  → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
>    4431                 if (status & DescOwn)
>    4432                         break;
>    4433
> 
> As the consequence of this fix, this KCSAN warning was eliminated.
> 
> Fixes: 6202806e7c03a ("r8169: drop member opts1_mask from struct rtl8169_private")
> Suggested-by: Marco Elver <elver@google.com>
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> Cc: nic_swsd@realtek.com
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: netdev@vger.kernel.org
> Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> Acked-by: Marco Elver <elver@google.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx
  2023-10-18 19:34 [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Mirsad Goran Todorovac
                   ` (2 preceding siblings ...)
  2023-10-20 12:59 ` [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Simon Horman
@ 2023-10-20 23:50 ` Jakub Kicinski
  2023-10-21 16:51   ` Mirsad Goran Todorovac
  3 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2023-10-20 23:50 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Heiner Kallweit, netdev, linux-kernel, nic_swsd, David S. Miller,
	Eric Dumazet, Paolo Abeni, Marco Elver

On Wed, 18 Oct 2023 21:34:34 +0200 Mirsad Goran Todorovac wrote:
> KCSAN reported the following data-race:

All 3 patches seem to have been applied to net, thank you!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx
  2023-10-20 23:50 ` Jakub Kicinski
@ 2023-10-21 16:51   ` Mirsad Goran Todorovac
  0 siblings, 0 replies; 8+ messages in thread
From: Mirsad Goran Todorovac @ 2023-10-21 16:51 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Heiner Kallweit, netdev, linux-kernel, nic_swsd, David S. Miller,
	Eric Dumazet, Paolo Abeni, Marco Elver

On 10/21/2023 1:50 AM, Jakub Kicinski wrote:
> On Wed, 18 Oct 2023 21:34:34 +0200 Mirsad Goran Todorovac wrote:
>> KCSAN reported the following data-race:
> 
> All 3 patches seem to have been applied to net, thank you!

That sounds like great news. Many thanks to Heiner, Marco, and Simon as well for reviewing
and advice.

Best regards,
Mirsad Todorovac

-- 
Mirsad Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
tel. +385 (0)1 3711 451
mob. +385 91 57 88 355

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-10-21 16:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-18 19:34 [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Mirsad Goran Todorovac
2023-10-18 19:34 ` [PATCH v4 2/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 Mirsad Goran Todorovac
2023-10-20 13:00   ` Simon Horman
2023-10-18 19:34 ` [PATCH v4 3/3] r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1 Mirsad Goran Todorovac
2023-10-20 13:00   ` Simon Horman
2023-10-20 12:59 ` [PATCH v4 1/3] r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx Simon Horman
2023-10-20 23:50 ` Jakub Kicinski
2023-10-21 16:51   ` Mirsad Goran Todorovac

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).