Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net v4 1/2] r8169: add handling DASH when DASH is disabled
From: Heiner Kallweit @ 2023-11-09 17:39 UTC (permalink / raw)
  To: ChunHao Lin
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	stable
In-Reply-To: <20231109173400.4573-2-hau@realtek.com>

On 09.11.2023 18:33, ChunHao Lin wrote:
> For devices that support DASH, even DASH is disabled, there may still
> exist a default firmware that will influence device behavior.
> So driver needs to handle DASH for devices that support DASH, no
> matter the DASH status is.
> 
> This patch also prepares for "fix network lost after resume on DASH
> systems".
> 
> Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
> Cc: stable@vger.kernel.org
> Signed-off-by: ChunHao Lin <hau@realtek.com>
> ---

Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>


^ permalink raw reply

* Re: [PATCH 02/41] rxrpc: Fix two connection reaping bugs
From: Jeffrey E Altman @ 2023-11-09 17:27 UTC (permalink / raw)
  To: David Howells, Marc Dionne
  Cc: linux-afs, linux-fsdevel, linux-kernel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
In-Reply-To: <20231109154004.3317227-3-dhowells@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]

On 11/9/2023 10:39 AM, David Howells wrote:
> Fix two connection reaping bugs:
>
>   (1) rxrpc_connection_expiry is in units of seconds, so
>       rxrpc_disconnect_call() needs to multiply it by HZ when adding it to
>       jiffies.
>
>   (2) rxrpc_client_conn_reap_timeout() should set RXRPC_CLIENT_REAP_TIMER if
>       local->kill_all_client_conns is clear, not if it is set (in which case
>       we don't need the timer).  Without this, old client connections don't
>       get cleaned up until the local endpoint is cleaned up.
>
> Fixes: 5040011d073d ("rxrpc: Make the local endpoint hold a ref on a connected call")
> Fixes: 0d6bf319bc5a ("rxrpc: Move the client conn cache management to the I/O thread")
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: netdev@vger.kernel.org
> cc: linux-afs@lists.infradead.org
> ---
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4039 bytes --]

^ permalink raw reply

* [PATCH net v4 1/2] r8169: add handling DASH when DASH is disabled
From: ChunHao Lin @ 2023-11-09 17:33 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin, stable
In-Reply-To: <20231109173400.4573-1-hau@realtek.com>

For devices that support DASH, even DASH is disabled, there may still
exist a default firmware that will influence device behavior.
So driver needs to handle DASH for devices that support DASH, no
matter the DASH status is.

This patch also prepares for "fix network lost after resume on DASH
systems".

Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
Cc: stable@vger.kernel.org
Signed-off-by: ChunHao Lin <hau@realtek.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 36 ++++++++++++++++-------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 0c76c162b8a9..cfcb40d90920 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -624,6 +624,7 @@ struct rtl8169_private {
 
 	unsigned supports_gmii:1;
 	unsigned aspm_manageable:1;
+	unsigned dash_enabled:1;
 	dma_addr_t counters_phys_addr;
 	struct rtl8169_counters *counters;
 	struct rtl8169_tc_offsets tc_offset;
@@ -1253,14 +1254,26 @@ static bool r8168ep_check_dash(struct rtl8169_private *tp)
 	return r8168ep_ocp_read(tp, 0x128) & BIT(0);
 }
 
-static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp)
+static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
+{
+	switch (tp->dash_type) {
+	case RTL_DASH_DP:
+		return r8168dp_check_dash(tp);
+	case RTL_DASH_EP:
+		return r8168ep_check_dash(tp);
+	default:
+		return false;
+	}
+}
+
+static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
 {
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_28:
 	case RTL_GIGA_MAC_VER_31:
-		return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE;
+		return RTL_DASH_DP;
 	case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53:
-		return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE;
+		return RTL_DASH_EP;
 	default:
 		return RTL_DASH_NONE;
 	}
@@ -1453,7 +1466,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 
 	device_set_wakeup_enable(tp_to_dev(tp), wolopts);
 
-	if (tp->dash_type == RTL_DASH_NONE) {
+	if (!tp->dash_enabled) {
 		rtl_set_d3_pll_down(tp, !wolopts);
 		tp->dev->wol_enabled = wolopts ? 1 : 0;
 	}
@@ -2512,7 +2525,7 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp)
 
 static void rtl_prepare_power_down(struct rtl8169_private *tp)
 {
-	if (tp->dash_type != RTL_DASH_NONE)
+	if (tp->dash_enabled)
 		return;
 
 	if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
@@ -4869,7 +4882,7 @@ static int rtl8169_runtime_idle(struct device *device)
 {
 	struct rtl8169_private *tp = dev_get_drvdata(device);
 
-	if (tp->dash_type != RTL_DASH_NONE)
+	if (tp->dash_enabled)
 		return -EBUSY;
 
 	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
@@ -4895,8 +4908,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
 	/* Restore original MAC address */
 	rtl_rar_set(tp, tp->dev->perm_addr);
 
-	if (system_state == SYSTEM_POWER_OFF &&
-	    tp->dash_type == RTL_DASH_NONE) {
+	if (system_state == SYSTEM_POWER_OFF && !tp->dash_enabled) {
 		pci_wake_from_d3(pdev, tp->saved_wolopts);
 		pci_set_power_state(pdev, PCI_D3hot);
 	}
@@ -5254,7 +5266,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
 	tp->aspm_manageable = !rc;
 
-	tp->dash_type = rtl_check_dash(tp);
+	tp->dash_type = rtl_get_dash_type(tp);
+	tp->dash_enabled = rtl_dash_is_enabled(tp);
 
 	tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
 
@@ -5325,7 +5338,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* configure chip for default features */
 	rtl8169_set_features(dev, dev->features);
 
-	if (tp->dash_type == RTL_DASH_NONE) {
+	if (!tp->dash_enabled) {
 		rtl_set_d3_pll_down(tp, true);
 	} else {
 		rtl_set_d3_pll_down(tp, false);
@@ -5365,7 +5378,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 			    "ok" : "ko");
 
 	if (tp->dash_type != RTL_DASH_NONE) {
-		netdev_info(dev, "DASH enabled\n");
+		netdev_info(dev, "DASH %s\n",
+			    tp->dash_enabled ? "enabled" : "disabled");
 		rtl8168_driver_start(tp);
 	}
 
-- 
2.39.2


^ permalink raw reply related

* [PATCH net v4 0/2] r8169: fix DASH devices network lost issue
From: ChunHao Lin @ 2023-11-09 17:33 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin

This series are used to fix network lost issue on systems that support 
DASH. It has been tested on rtl8168ep and rtl8168fp.

V3 -> V4: Fix a coding style issue.
V2 -> V3: Add 'Fixes' tag and correct indentation.
V1 -> V2: Change variable and function name. And update DASH info message.

ChunHao Lin (2):
  r8169: add handling DASH when DASH is disabled
  r8169: fix network lost after resume on DASH systems

 drivers/net/ethernet/realtek/r8169_main.c | 42 +++++++++++++++++------
 1 file changed, 31 insertions(+), 11 deletions(-)

-- 
2.39.2

^ permalink raw reply

* [PATCH net v4 2/2] r8169: fix network lost after resume on DASH systems
From: ChunHao Lin @ 2023-11-09 17:34 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin, stable
In-Reply-To: <20231109173400.4573-1-hau@realtek.com>

Device that support DASH may be reseted or powered off during suspend.
So driver needs to handle DASH during system suspend and resume. Or
DASH firmware will influence device behavior and causes network lost.

Fixes: b646d90053f8 ("r8169: magic.")
Cc: stable@vger.kernel.org
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: ChunHao Lin <hau@realtek.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index cfcb40d90920..b9bb1d2f0237 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4661,10 +4661,16 @@ static void rtl8169_down(struct rtl8169_private *tp)
 	rtl8169_cleanup(tp);
 	rtl_disable_exit_l1(tp);
 	rtl_prepare_power_down(tp);
+
+	if (tp->dash_type != RTL_DASH_NONE)
+		rtl8168_driver_stop(tp);
 }
 
 static void rtl8169_up(struct rtl8169_private *tp)
 {
+	if (tp->dash_type != RTL_DASH_NONE)
+		rtl8168_driver_start(tp);
+
 	pci_set_master(tp->pci_dev);
 	phy_init_hw(tp->phydev);
 	phy_resume(tp->phydev);
-- 
2.39.2


^ permalink raw reply related

* RE: [PATCH net v3 1/2] r8169: add handling DASH when DASH is disabled
From: Hau @ 2023-11-09 17:29 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: nic_swsd, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <a1f703f4-a0da-4f88-9c1c-e5f6b6e2ce50@gmail.com>

> >       if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev)) @@
> > -4896,7 +4909,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
> >       rtl_rar_set(tp, tp->dev->perm_addr);
> >
> >       if (system_state == SYSTEM_POWER_OFF &&
> > -         tp->dash_type == RTL_DASH_NONE) {
> > +         !tp->dash_enabled) {
> 
> Why break the line at all? Now the check fits the 80 char line limit.

I will correct this. Thanks.

^ permalink raw reply

* Re: [PATCH iwl-net] ice: Restore fix disabling RX VLAN filtering
From: Simon Horman @ 2023-11-09 17:24 UTC (permalink / raw)
  To: Marcin Szycik; +Cc: intel-wired-lan, netdev, Wojciech Drewek
In-Reply-To: <20231107135138.10692-1-marcin.szycik@linux.intel.com>

On Tue, Nov 07, 2023 at 02:51:38PM +0100, Marcin Szycik wrote:
> Fix setting dis_rx_filtering depending on whether port vlan is being
> turned on or off. This was originally fixed in commit c793f8ea15e3 ("ice:
> Fix disabling Rx VLAN filtering with port VLAN enabled"), but while
> refactoring ice_vf_vsi_init_vlan_ops(), the fix has been lost. Restore the
> fix along with the original comment from that change.
> 
> Also delete duplicate lines in ice_port_vlan_on().
> 
> Fixes: 2946204b3fa8 ("ice: implement bridge port vlan")
> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* RE: [PATCH net 0/3] dpll: fix unordered unbind/bind registerer issues
From: Kubalewski, Arkadiusz @ 2023-11-09 17:20 UTC (permalink / raw)
  To: Vadim Fedorenko, netdev@vger.kernel.org
  Cc: jiri@resnulli.us, Michalik, Michal, Olech, Milena,
	pabeni@redhat.com, kuba@kernel.org
In-Reply-To: <4c251905-308b-4709-8e08-39cda85678f9@linux.dev>

>From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>Sent: Thursday, November 9, 2023 11:51 AM
>
>On 08/11/2023 10:32, Arkadiusz Kubalewski wrote:
>> Fix issues when performing unordered unbind/bind of a kernel modules
>> which are using a dpll device with DPLL_PIN_TYPE_MUX pins.
>> Currently only serialized bind/unbind of such use case works, fix
>> the issues and allow for unserialized kernel module bind order.
>>
>> The issues are observed on the ice driver, i.e.,
>>
>> $ echo 0000:af:00.0 > /sys/bus/pci/drivers/ice/unbind
>> $ echo 0000:af:00.1 > /sys/bus/pci/drivers/ice/unbind
>>
>> results in:
>>
>> ice 0000:af:00.0: Removed PTP clock
>> BUG: kernel NULL pointer dereference, address: 0000000000000010
>> PF: supervisor read access in kernel mode
>> PF: error_code(0x0000) - not-present page
>> PGD 0 P4D 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> CPU: 7 PID: 71848 Comm: bash Kdump: loaded Not tainted 6.6.0-rc5_next-
>>queue_19th-Oct-2023-01625-g039e5d15e451 #1
>> Hardware name: Intel Corporation S2600STB/S2600STB, BIOS
>>SE5C620.86B.02.01.0008.031920191559 03/19/2019
>> RIP: 0010:ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
>> Code: 41 57 4d 89 cf 41 56 41 55 4d 89 c5 41 54 55 48 89 f5 53 4c 8b 66
>>08 48 89 cb 4d 8d b4 24 f0 49 00 00 4c 89 f7 e8 71 ec 1f c5 <0f> b6 5b 10
>>41 0f b6 84 24 30 4b 00 00 29 c3 41 0f b6 84 24 28 4b
>> RSP: 0018:ffffc902b179fb60 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: ffff8882c1398000 RSI: ffff888c7435cc60 RDI: ffff888c7435cb90
>> RBP: ffff888c7435cc60 R08: ffffc902b179fbb0 R09: 0000000000000000
>> R10: ffff888ef1fc8050 R11: fffffffffff82700 R12: ffff888c743581a0
>> R13: ffffc902b179fbb0 R14: ffff888c7435cb90 R15: 0000000000000000
>> FS:  00007fdc7dae0740(0000) GS:ffff888c105c0000(0000)
>knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000010 CR3: 0000000132c24002 CR4: 00000000007706e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> PKRU: 55555554
>> Call Trace:
>>   <TASK>
>>   ? __die+0x20/0x70
>>   ? page_fault_oops+0x76/0x170
>>   ? exc_page_fault+0x65/0x150
>>   ? asm_exc_page_fault+0x22/0x30
>>   ? ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
>>   ? __pfx_ice_dpll_rclk_state_on_pin_get+0x10/0x10 [ice]
>>   dpll_msg_add_pin_parents+0x142/0x1d0
>>   dpll_pin_event_send+0x7d/0x150
>>   dpll_pin_on_pin_unregister+0x3f/0x100
>>   ice_dpll_deinit_pins+0xa1/0x230 [ice]
>>   ice_dpll_deinit+0x29/0xe0 [ice]
>>   ice_remove+0xcd/0x200 [ice]
>>   pci_device_remove+0x33/0xa0
>>   device_release_driver_internal+0x193/0x200
>>   unbind_store+0x9d/0xb0
>>   kernfs_fop_write_iter+0x128/0x1c0
>>   vfs_write+0x2bb/0x3e0
>>   ksys_write+0x5f/0xe0
>>   do_syscall_64+0x59/0x90
>>   ? filp_close+0x1b/0x30
>>   ? do_dup2+0x7d/0xd0
>>   ? syscall_exit_work+0x103/0x130
>>   ? syscall_exit_to_user_mode+0x22/0x40
>>   ? do_syscall_64+0x69/0x90
>>   ? syscall_exit_work+0x103/0x130
>>   ? syscall_exit_to_user_mode+0x22/0x40
>>   ? do_syscall_64+0x69/0x90
>>   entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>> RIP: 0033:0x7fdc7d93eb97
>> Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e
>fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0
>ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
>> RSP: 002b:00007fff2aa91028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fdc7d93eb97
>> RDX: 000000000000000d RSI: 00005644814ec9b0 RDI: 0000000000000001
>> RBP: 00005644814ec9b0 R08: 0000000000000000 R09: 00007fdc7d9b14e0
>> R10: 00007fdc7d9b13e0 R11: 0000000000000246 R12: 000000000000000d
>> R13: 00007fdc7d9fb780 R14: 000000000000000d R15: 00007fdc7d9f69e0
>>   </TASK>
>> Modules linked in: uinput vfio_pci vfio_pci_core vfio_iommu_type1 vfio
>>irqbypass ixgbevf snd_seq_dummy snd_hrtimer snd_seq snd_timer
>>snd_seq_device snd soundcore overlay qrtr rfkill vfat fat xfs libcrc32c
>>rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod
>>ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm intel_rapl_msr
>>intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common
>>isst_if_common skx_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal
>>intel_powerclamp coretemp irdma rapl intel_cstate ib_uverbs iTCO_wdt
>>iTCO_vendor_support acpi_ipmi intel_uncore mei_me ipmi_si pcspkr i2c_i801
>>ib_core mei ipmi_devintf intel_pch_thermal ioatdma i2c_smbus
>>ipmi_msghandler lpc_ich joydev acpi_power_meter acpi_pad ext4 mbcache jbd2
>>sd_mod t10_pi sg ast i2c_algo_bit drm_shmem_helper drm_kms_helper ice
>>crct10dif_pclmul ixgbe crc32_pclmul drm crc32c_intel ahci i40e libahci
>>ghash_clmulni_intel libata mdio dca gnss wmi fuse [last unloaded: iavf]
>> CR2: 0000000000000010
>>
>> Arkadiusz Kubalewski (3):
>>    dpll: fix pin dump crash after module unbind
>>    dpll: fix pin dump crash for rebound module
>>    dpll: fix register pin with unregistered parent pin
>>
>>   drivers/dpll/dpll_core.c    |  8 ++------
>>   drivers/dpll/dpll_core.h    |  4 ++--
>>   drivers/dpll/dpll_netlink.c | 37 ++++++++++++++++++++++---------------
>>   3 files changed, 26 insertions(+), 23 deletions(-)
>>
>
>
>I still don't get how can we end up with unregistered pin. And shouldn't
>drivers do unregister of dpll/pin during release procedure? I thought it
>was kind of agreement we reached while developing the subsystem.
>

It's definitely not about ending up with unregistered pins.

Usually the driver is loaded for PF0, PF1, PF2, PF3 and unloaded in opposite
order: PF3, PF2, PF1, PF0. And this is working without any issues.

Above crash is caused because of unordered driver unload, where dpll subsystem
tries to notify muxed pin was deleted, but at that time the parent is already
gone, thus data points to memory which is no longer available, thus crash
happens when trying to dump pin parents.

This series fixes all issues I could find connected to the situation where
muxed-pins are trying to access their parents, when parent registerer was removed
in the meantime.

Thank you!
Arkadiusz

^ permalink raw reply

* Re: [PATCH net v3 2/2] r8169: fix network lost after resume on DASH systems
From: Heiner Kallweit @ 2023-11-09 17:15 UTC (permalink / raw)
  To: ChunHao Lin
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	stable
In-Reply-To: <20231109164327.3577-3-hau@realtek.com>

On 09.11.2023 17:43, ChunHao Lin wrote:
> Device that support DASH may be reseted or powered off during suspend.
> So driver needs to handle DASH during system suspend and resume. Or
> DASH firmware will influence device behavior and causes network lost.
> 
> Fixes: b646d90053f8 ("r8169: magic.")
> Cc: stable@vger.kernel.org
> Signed-off-by: ChunHao Lin <hau@realtek.com>

Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>



^ permalink raw reply

* Re: [PATCH net v3 1/2] r8169: add handling DASH when DASH is disabled
From: Heiner Kallweit @ 2023-11-09 17:14 UTC (permalink / raw)
  To: ChunHao Lin
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	stable
In-Reply-To: <20231109164327.3577-2-hau@realtek.com>

On 09.11.2023 17:43, ChunHao Lin wrote:
> For devices that support DASH, even DASH is disabled, there may still
> exist a default firmware that will influence device behavior.
> So driver needs to handle DASH for devices that support DASH, no
> matter the DASH status is.
> 
> This patch also prepares for "fix network lost after resume on DASH
> systems".
> 
> Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
> Cc: stable@vger.kernel.org
> Signed-off-by: ChunHao Lin <hau@realtek.com>
> ---
>  drivers/net/ethernet/realtek/r8169_main.c | 35 ++++++++++++++++-------
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 0c76c162b8a9..4954ff0f72b1 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -624,6 +624,7 @@ struct rtl8169_private {
>  
>  	unsigned supports_gmii:1;
>  	unsigned aspm_manageable:1;
> +	unsigned dash_enabled:1;
>  	dma_addr_t counters_phys_addr;
>  	struct rtl8169_counters *counters;
>  	struct rtl8169_tc_offsets tc_offset;
> @@ -1253,14 +1254,26 @@ static bool r8168ep_check_dash(struct rtl8169_private *tp)
>  	return r8168ep_ocp_read(tp, 0x128) & BIT(0);
>  }
>  
> -static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp)
> +static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
> +{
> +	switch (tp->dash_type) {
> +	case RTL_DASH_DP:
> +		return r8168dp_check_dash(tp);
> +	case RTL_DASH_EP:
> +		return r8168ep_check_dash(tp);
> +	default:
> +		return false;
> +	}
> +}
> +
> +static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
>  {
>  	switch (tp->mac_version) {
>  	case RTL_GIGA_MAC_VER_28:
>  	case RTL_GIGA_MAC_VER_31:
> -		return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE;
> +		return RTL_DASH_DP;
>  	case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53:
> -		return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE;
> +		return RTL_DASH_EP;
>  	default:
>  		return RTL_DASH_NONE;
>  	}
> @@ -1453,7 +1466,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
>  
>  	device_set_wakeup_enable(tp_to_dev(tp), wolopts);
>  
> -	if (tp->dash_type == RTL_DASH_NONE) {
> +	if (!tp->dash_enabled) {
>  		rtl_set_d3_pll_down(tp, !wolopts);
>  		tp->dev->wol_enabled = wolopts ? 1 : 0;
>  	}
> @@ -2512,7 +2525,7 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp)
>  
>  static void rtl_prepare_power_down(struct rtl8169_private *tp)
>  {
> -	if (tp->dash_type != RTL_DASH_NONE)
> +	if (tp->dash_enabled)
>  		return;
>  
>  	if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
> @@ -4869,7 +4882,7 @@ static int rtl8169_runtime_idle(struct device *device)
>  {
>  	struct rtl8169_private *tp = dev_get_drvdata(device);
>  
> -	if (tp->dash_type != RTL_DASH_NONE)
> +	if (tp->dash_enabled)
>  		return -EBUSY;
>  
>  	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
> @@ -4896,7 +4909,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
>  	rtl_rar_set(tp, tp->dev->perm_addr);
>  
>  	if (system_state == SYSTEM_POWER_OFF &&
> -	    tp->dash_type == RTL_DASH_NONE) {
> +	    !tp->dash_enabled) {

Why break the line at all? Now the check fits the 80 char line limit.

>  		pci_wake_from_d3(pdev, tp->saved_wolopts);
>  		pci_set_power_state(pdev, PCI_D3hot);
>  	}
> @@ -5254,7 +5267,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
>  	tp->aspm_manageable = !rc;
>  
> -	tp->dash_type = rtl_check_dash(tp);
> +	tp->dash_type = rtl_get_dash_type(tp);
> +	tp->dash_enabled = rtl_dash_is_enabled(tp);
>  
>  	tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
>  
> @@ -5325,7 +5339,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	/* configure chip for default features */
>  	rtl8169_set_features(dev, dev->features);
>  
> -	if (tp->dash_type == RTL_DASH_NONE) {
> +	if (!tp->dash_enabled) {
>  		rtl_set_d3_pll_down(tp, true);
>  	} else {
>  		rtl_set_d3_pll_down(tp, false);
> @@ -5365,7 +5379,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  			    "ok" : "ko");
>  
>  	if (tp->dash_type != RTL_DASH_NONE) {
> -		netdev_info(dev, "DASH enabled\n");
> +		netdev_info(dev, "DASH %s\n",
> +			    tp->dash_enabled ? "enabled" : "disabled");
>  		rtl8168_driver_start(tp);
>  	}
>  


^ permalink raw reply

* Re: [PATCH net v4 0/3] Fix large frames in the Gemini ethernet driver
From: Vladimir Oltean @ 2023-11-09 17:13 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linus Walleij, Hans Ulli Kroll, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Michał Mirosław, Andrew Lunn,
	linux-arm-kernel, netdev, linux-kernel
In-Reply-To: <ad66b532d1702c36adecd944e25f84e4497ef8b3.camel@redhat.com>

On Thu, Nov 09, 2023 at 01:26:17PM +0100, Paolo Abeni wrote:
> I fear this is a bit too late for today's PR. I hope it should not be a
> big problem, since we are very early in the release cycle.

No problem from my side.

^ permalink raw reply

* Re: [PATCH v9 bpf-next 02/17] bpf: add BPF token delegation mount options to BPF FS
From: Andrii Nakryiko @ 2023-11-09 17:09 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrii Nakryiko, bpf, netdev, paul, linux-fsdevel,
	linux-security-module, keescook, kernel-team, sargun
In-Reply-To: <20231109-linden-kursprogramm-15c2cbd860b3@brauner>

On Thu, Nov 9, 2023 at 12:48 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Nov 08, 2023 at 01:09:27PM -0800, Andrii Nakryiko wrote:
> > On Wed, Nov 8, 2023 at 5:51 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Fri, Nov 03, 2023 at 12:05:08PM -0700, Andrii Nakryiko wrote:
> > > > Add few new mount options to BPF FS that allow to specify that a given
> > > > BPF FS instance allows creation of BPF token (added in the next patch),
> > > > and what sort of operations are allowed under BPF token. As such, we get
> > > > 4 new mount options, each is a bit mask
> > > >   - `delegate_cmds` allow to specify which bpf() syscall commands are
> > > >     allowed with BPF token derived from this BPF FS instance;
> > > >   - if BPF_MAP_CREATE command is allowed, `delegate_maps` specifies
> > > >     a set of allowable BPF map types that could be created with BPF token;
> > > >   - if BPF_PROG_LOAD command is allowed, `delegate_progs` specifies
> > > >     a set of allowable BPF program types that could be loaded with BPF token;
> > > >   - if BPF_PROG_LOAD command is allowed, `delegate_attachs` specifies
> > > >     a set of allowable BPF program attach types that could be loaded with
> > > >     BPF token; delegate_progs and delegate_attachs are meant to be used
> > > >     together, as full BPF program type is, in general, determined
> > > >     through both program type and program attach type.
> > > >
> > > > Currently, these mount options accept the following forms of values:
> > > >   - a special value "any", that enables all possible values of a given
> > > >   bit set;
> > > >   - numeric value (decimal or hexadecimal, determined by kernel
> > > >   automatically) that specifies a bit mask value directly;
> > > >   - all the values for a given mount option are combined, if specified
> > > >   multiple times. E.g., `mount -t bpf nodev /path/to/mount -o
> > > >   delegate_maps=0x1 -o delegate_maps=0x2` will result in a combined 0x3
> > > >   mask.
> > > >
> > > > Ideally, more convenient (for humans) symbolic form derived from
> > > > corresponding UAPI enums would be accepted (e.g., `-o
> > > > delegate_progs=kprobe|tracepoint`) and I intend to implement this, but
> > > > it requires a bunch of UAPI header churn, so I postponed it until this
> > > > feature lands upstream or at least there is a definite consensus that
> > > > this feature is acceptable and is going to make it, just to minimize
> > > > amount of wasted effort and not increase amount of non-essential code to
> > > > be reviewed.
> > > >
> > > > Attentive reader will notice that BPF FS is now marked as
> > > > FS_USERNS_MOUNT, which theoretically makes it mountable inside non-init
> > > > user namespace as long as the process has sufficient *namespaced*
> > > > capabilities within that user namespace. But in reality we still
> > > > restrict BPF FS to be mountable only by processes with CAP_SYS_ADMIN *in
> > > > init userns* (extra check in bpf_fill_super()). FS_USERNS_MOUNT is added
> > > > to allow creating BPF FS context object (i.e., fsopen("bpf")) from
> > > > inside unprivileged process inside non-init userns, to capture that
> > > > userns as the owning userns. It will still be required to pass this
> > > > context object back to privileged process to instantiate and mount it.
> > > >
> > > > This manipulation is important, because capturing non-init userns as the
> > > > owning userns of BPF FS instance (super block) allows to use that userns
> > > > to constraint BPF token to that userns later on (see next patch). So
> > > > creating BPF FS with delegation inside unprivileged userns will restrict
> > > > derived BPF token objects to only "work" inside that intended userns,
> > > > making it scoped to a intended "container".
> > > >
> > > > There is a set of selftests at the end of the patch set that simulates
> > > > this sequence of steps and validates that everything works as intended.
> > > > But careful review is requested to make sure there are no missed gaps in
> > > > the implementation and testing.
> > > >
> > > > All this is based on suggestions and discussions with Christian Brauner
> > > > ([0]), to the best of my ability to follow all the implications.
> > >
> > > "who will not be held responsible for any CVE future or present as he's
> > >  not sure whether bpf token is a good idea in general"
> > >
> > > I'm not opposing it because it's really not my subsystem. But it'd be
> > > nice if you also added a disclaimer that I'm not endorsing this. :)
> > >
> >
> > Sure, I'll clarify. I still appreciate your reviewing everything and
> > pointing out all the gotchas (like the reconfiguration and other
> > stuff), thanks!
> >
> > > A comment below.
> > >
> > > >
> > > >   [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > >  include/linux/bpf.h | 10 ++++++
> > > >  kernel/bpf/inode.c  | 88 +++++++++++++++++++++++++++++++++++++++------
> > > >  2 files changed, 88 insertions(+), 10 deletions(-)
> > > >
> >
> > [...]
> >
> > > >       opt = fs_parse(fc, bpf_fs_parameters, param, &result);
> > > >       if (opt < 0) {
> > > > @@ -665,6 +692,25 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param)
> > > >       case OPT_MODE:
> > > >               opts->mode = result.uint_32 & S_IALLUGO;
> > > >               break;
> > > > +     case OPT_DELEGATE_CMDS:
> > > > +     case OPT_DELEGATE_MAPS:
> > > > +     case OPT_DELEGATE_PROGS:
> > > > +     case OPT_DELEGATE_ATTACHS:
> > > > +             if (strcmp(param->string, "any") == 0) {
> > > > +                     msk = ~0ULL;
> > > > +             } else {
> > > > +                     err = kstrtou64(param->string, 0, &msk);
> > > > +                     if (err)
> > > > +                             return err;
> > > > +             }
> > > > +             switch (opt) {
> > > > +             case OPT_DELEGATE_CMDS: opts->delegate_cmds |= msk; break;
> > > > +             case OPT_DELEGATE_MAPS: opts->delegate_maps |= msk; break;
> > > > +             case OPT_DELEGATE_PROGS: opts->delegate_progs |= msk; break;
> > > > +             case OPT_DELEGATE_ATTACHS: opts->delegate_attachs |= msk; break;
> > > > +             default: return -EINVAL;
> > > > +             }
> > > > +             break;
> > > >       }
> > >
> > > So just to repeat that this will allow a container to set it's own
> > > delegation options:
> > >
> > >         # unprivileged container
> > >
> > >         fd_fs = fsopen();
> > >         fsconfig(fd_fs, FSCONFIG_BLA_BLA, "give-me-all-the-delegation");
> > >
> > >         # Now hand of that fd_fs to a privileged process
> > >
> > >         fsconfig(fd_fs, FSCONFIG_CREATE_CMD, ...)
> > >
> > > This means the container manager can't be part of your threat model
> > > because you need to trust it to set delegation options.
> > >
> > > But if the container manager is part of your threat model then you can
> > > never trust an fd_fs handed to you because the container manager might
> > > have enabled arbitrary delegation privileges.
> > >
> > > There's ways around this:
> > >
> > > (1) kernel: Account for this in the kernel and require privileges when
> > >     setting delegation options.
> >
> > What sort of privilege would that be? We are in an unprivileged user
> > namespace, so that would have to be some ns_capable() checks or
> > something? I can add ns_capable(CAP_BPF), but what else did you have
> > in mind?
>
> You would require privileges in the initial namespace aka capable()
> checks similar to what you require for superblock creation.

ok, I was just wondering if I'm missing something non-obvious.
capable(CAP_SYS_ADMIN) makes sense and doesn't really hurt intended
use case. Privileged parent will set these config values and then do
FSCONFIG_CREATE_CMD.

For reconfiguration I'll enforce same capable(CAP_SYS_ADMIN) checks,
unless unprivileged user drops permissions to more restrictive ones
(but I haven't had a chance to look at exact callback API, so we'll
see if that's easy to support).

Thanks for feedback!

>
> >
> > I think even if we say that privileged parent does FSCONFIG_SET_STRING
> > and unprivileged child just does sys_fsopen("bpf", 0) and nothing
> > more, we still can't be sure that child won't race with parent and set
> > FSCONFIG_SET_STRING at the same time. Because they both have access to
> > the same fs_fd.
>
> Unless you require privileges as outlined above to set delegation
> options in which case an unprivileged container cannot change delegation
> options at all.

Yep, makes sense, that's what I'm going to do.

>
> >
> > > (2) userspace: A trusted helper that allocates an fs_context fd in
> > >     the target user namespace, then sets delegation options and creates
> > >     superblock.
> > >
> > > (1) Is more restrictive but also more secure. (2) is less restrictive
> > > but requires more care from userspace.
> > >
> > > Either way I would probably consider writing a document detailing
> > > various delegation scenarios and possible pitfalls and implications
> > > before advertising it.
> > >
> > > If you choose (2) then you also need to be aware that the security of
> > > this also hinges on bpffs not allowing to reconfigure parameters once it
> > > has been mounted. Otherwise an unprivileged container can change
> > > delegation options.
> > >
> > > I would recommend that you either add a dummy bpf_reconfigure() method
> > > with a comment in it or you add a comment on top of bpf_context_ops.
> > > Something like:
> > >
> > > /*
> > >  * Unprivileged mounts of bpffs are owned by the user namespace they are
> > >  * mounted in. That means unprivileged users can change vfs mount
> > >  * options (ro<->rw, nosuid, etc.).
> > >  *
> > >  * They currently cannot change bpffs specific mount options such as
> > >  * delegation settings. If that is ever implemented it is necessary to
> > >  * require rivileges in the initial namespace. Otherwise unprivileged
> > >  * users can change delegation options to whatever they want.
> > >  */
> >
> > Yep, I will add a custom callback. I think we can allow reconfiguring
> > towards less permissive delegation subset, but I'll need to look at
> > the specifics to see if we can support that easily.

^ permalink raw reply

* Re: [PATCH net-next 12/15] net: page_pool: report when page pool was destroyed
From: Dragos Tatulea @ 2023-11-09 17:05 UTC (permalink / raw)
  To: kuba@kernel.org, davem@davemloft.net
  Cc: ilias.apalodimas@linaro.org, edumazet@google.com,
	netdev@vger.kernel.org, hawk@kernel.org, pabeni@redhat.com,
	almasrymina@google.com
In-Reply-To: <20231024160220.3973311-13-kuba@kernel.org>

On Tue, 2023-10-24 at 09:02 -0700, Jakub Kicinski wrote:
> Report when page pool was destroyed. Together with the inflight
> / memory use reporting this can serve as a replacement for the
> warning about leaked page pools we currently print to dmesg.
> 
> Example output for a fake leaked page pool using some hacks
> in netdevsim (one "live" pool, and one "leaked" on the same dev):
> 
> $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
>            --dump page-pool-get
> [{'id': 2, 'ifindex': 3},
>  {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}]
> 
The destroyed ts really helps to narrow down which tests/test ranges are
triggering the leaks. Thanks!

I was planning to add a per page_pool "name" where the driver would encode the
rq index + creation timestamp and the printout would show this name. This
approach is much cleaner though.

For what it's worth:

Tested-by: Dragos Tatulea <dtatulea@nvidia.com>

> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  Documentation/netlink/specs/netdev.yaml |  9 +++++++++
>  include/net/page_pool/types.h           |  1 +
>  include/uapi/linux/netdev.h             |  1 +
>  net/core/page_pool.c                    |  1 +
>  net/core/page_pool_priv.h               |  1 +
>  net/core/page_pool_user.c               | 12 ++++++++++++
>  6 files changed, 25 insertions(+)
> 
> diff --git a/Documentation/netlink/specs/netdev.yaml
> b/Documentation/netlink/specs/netdev.yaml
> index 8d995760a14a..8be8f249bed3 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -125,6 +125,14 @@ name: netdev
>          type: uint
>          doc: |
>            Amount of memory held by inflight pages.
> +      -
> +        name: destroyed
> +        type: uint
> +        doc: |
> +          Seconds in CLOCK_BOOTTIME of when Page Pool was destroyed.
> +          Page Pools wait for all the memory allocated from them to be freed
> +          before truly disappearing.
> +          Absent if Page Pool hasn't been destroyed.
>  
>  operations:
>    list:
> @@ -176,6 +184,7 @@ name: netdev
>              - napi-id
>              - inflight
>              - inflight-mem
> +            - destroyed
>        dump:
>          reply: *pp-reply
>      -
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index 7e47d7bb2c1e..f0c51ef5e345 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -193,6 +193,7 @@ struct page_pool {
>         /* User-facing fields, protected by page_pools_lock */
>         struct {
>                 struct hlist_node list;
> +               u64 destroyed;
>                 u32 napi_id;
>                 u32 id;
>         } user;
> diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
> index 26ae5bdd3187..e5bf66d2aa31 100644
> --- a/include/uapi/linux/netdev.h
> +++ b/include/uapi/linux/netdev.h
> @@ -70,6 +70,7 @@ enum {
>         NETDEV_A_PAGE_POOL_NAPI_ID,
>         NETDEV_A_PAGE_POOL_INFLIGHT,
>         NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
> +       NETDEV_A_PAGE_POOL_DESTROYED,
>  
>         __NETDEV_A_PAGE_POOL_MAX,
>         NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 30c8fc91fa66..57847fbb76a0 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -949,6 +949,7 @@ void page_pool_destroy(struct page_pool *pool)
>         if (!page_pool_release(pool))
>                 return;
>  
> +       page_pool_destroyed(pool);
>         pool->defer_start = jiffies;
>         pool->defer_warn  = jiffies + DEFER_WARN_INTERVAL;
>  
> diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h
> index 72fb21ea1ddc..7fe6f842a270 100644
> --- a/net/core/page_pool_priv.h
> +++ b/net/core/page_pool_priv.h
> @@ -6,6 +6,7 @@
>  s32 page_pool_inflight(const struct page_pool *pool, bool strict);
>  
>  int page_pool_list(struct page_pool *pool);
> +void page_pool_destroyed(struct page_pool *pool);
>  void page_pool_unlist(struct page_pool *pool);
>  
>  #endif
> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
> index c971fe9eeb01..1fb5c3cbe412 100644
> --- a/net/core/page_pool_user.c
> +++ b/net/core/page_pool_user.c
> @@ -134,6 +134,10 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct
> page_pool *pool,
>             nla_put_uint(rsp, NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
>                          inflight * refsz))
>                 goto err_cancel;
> +       if (pool->user.destroyed &&
> +           nla_put_uint(rsp, NETDEV_A_PAGE_POOL_DESTROYED,
> +                        pool->user.destroyed))
> +               goto err_cancel;
>  
>         genlmsg_end(rsp, hdr);
>  
> @@ -219,6 +223,14 @@ int page_pool_list(struct page_pool *pool)
>         return err;
>  }
>  
> +void page_pool_destroyed(struct page_pool *pool)
> +{
> +       mutex_lock(&page_pools_lock);
> +       pool->user.destroyed = ktime_get_boottime_seconds();
> +       netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_CHANGE_NTF);
> +       mutex_unlock(&page_pools_lock);
> +}
> +
>  void page_pool_unlist(struct page_pool *pool)
>  {
>         mutex_lock(&page_pools_lock);


^ permalink raw reply

* Re: [PATCH net-next 07/15] eth: link netdev to page_pools in drivers
From: Ilias Apalodimas @ 2023-11-09 16:51 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, almasrymina, hawk
In-Reply-To: <20231109082630.77f74839@kernel.org>

On Thu, 9 Nov 2023 at 18:26, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 9 Nov 2023 11:11:04 +0200 Ilias Apalodimas wrote:
> > > ---
> > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 1 +
> > >  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
> > >  drivers/net/ethernet/microsoft/mana/mana_en.c     | 1 +
> > >  3 files changed, 3 insertions(+)
> >
> > Mind add a line for netsec.c (probably in netsec_setup_rx_dring),
> > since that's the only NIC I currently have around with pp support?
>
> It's a single queue / single napi device? So like this?

Yes it is

>
> diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
> index 0891e9e49ecb..384c506bb930 100644
> --- a/drivers/net/ethernet/socionext/netsec.c
> +++ b/drivers/net/ethernet/socionext/netsec.c
> @@ -1302,6 +1302,8 @@ static int netsec_setup_rx_dring(struct netsec_priv *priv)
>                 .dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
>                 .offset = NETSEC_RXBUF_HEADROOM,
>                 .max_len = NETSEC_RX_BUF_SIZE,
> +               .napi = priv->napi,
> +               .netdev = priv->ndev,

Yea

Thanks
/Ilias
>         };
>         int i, err;
>

^ permalink raw reply

* Re: [PATCH net-next 04/15] net: page_pool: id the page pools
From: Ilias Apalodimas @ 2023-11-09 16:48 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, almasrymina, hawk
In-Reply-To: <20231109082219.5ee1d0cf@kernel.org>

On Thu, 9 Nov 2023 at 18:22, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 9 Nov 2023 11:21:32 +0200 Ilias Apalodimas wrote:
> > > +       mutex_lock(&page_pools_lock);
> > > +       err = xa_alloc_cyclic(&page_pools, &pool->user.id, pool, xa_limit_32b,
> > > +                             &id_alloc_next, GFP_KERNEL);
> > > +       if (err < 0)
> > > +               goto err_unlock;
> >
> > A nit really, but get rid of the if/goto and just let this return err; ?
>
> There's more stuff added here by a subsequent patch. It ends up like
> this:
>
> int page_pool_list(struct page_pool *pool)
> {
>         static u32 id_alloc_next;
>         int err;
>
>         mutex_lock(&page_pools_lock);
>         err = xa_alloc_cyclic(&page_pools, &pool->user.id, pool, xa_limit_32b,
>                               &id_alloc_next, GFP_KERNEL);
>         if (err < 0)
>                 goto err_unlock;
>
>         if (pool->slow.netdev) {
>                 hlist_add_head(&pool->user.list,
>                                &pool->slow.netdev->page_pools);
>                 pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0;
>
>                 netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_ADD_NTF);
>         }
>
>         mutex_unlock(&page_pools_lock);
>         return 0;
>
> err_unlock:
>         mutex_unlock(&page_pools_lock);
>         return err;
> }
>
> Do you want me to combine the error and non-error paths?
> I have a weak preference for not mixing, sometimes err gets set
> to a positive value and that starts to propagate, unlikely to
> happen here tho.

Nop that's fine -- I actually prefer it myself. I just haven't got the
follow up patches yet


Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>

^ permalink raw reply

* [PATCH net v3 0/2] r8169: fix DASH devices network lost issue
From: ChunHao Lin @ 2023-11-09 16:43 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin

This series are used to fix network lost issue on systems that support 
DASH. It has been tested on rtl8168ep and rtl8168fp.

V2 -> V3: Add 'Fixes' tag and correct indentation.
V1 -> V2: Change variable and function name. And update DASH info message.

ChunHao Lin (2):
  r8169: add handling DASH when DASH is disabled
  r8169: fix network lost after resume on DASH systems

 drivers/net/ethernet/realtek/r8169_main.c | 41 +++++++++++++++++------
 1 file changed, 31 insertions(+), 10 deletions(-)

-- 
2.39.2

^ permalink raw reply

* [PATCH net v3 1/2] r8169: add handling DASH when DASH is disabled
From: ChunHao Lin @ 2023-11-09 16:43 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin, stable
In-Reply-To: <20231109164327.3577-1-hau@realtek.com>

For devices that support DASH, even DASH is disabled, there may still
exist a default firmware that will influence device behavior.
So driver needs to handle DASH for devices that support DASH, no
matter the DASH status is.

This patch also prepares for "fix network lost after resume on DASH
systems".

Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
Cc: stable@vger.kernel.org
Signed-off-by: ChunHao Lin <hau@realtek.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 35 ++++++++++++++++-------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 0c76c162b8a9..4954ff0f72b1 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -624,6 +624,7 @@ struct rtl8169_private {
 
 	unsigned supports_gmii:1;
 	unsigned aspm_manageable:1;
+	unsigned dash_enabled:1;
 	dma_addr_t counters_phys_addr;
 	struct rtl8169_counters *counters;
 	struct rtl8169_tc_offsets tc_offset;
@@ -1253,14 +1254,26 @@ static bool r8168ep_check_dash(struct rtl8169_private *tp)
 	return r8168ep_ocp_read(tp, 0x128) & BIT(0);
 }
 
-static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp)
+static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
+{
+	switch (tp->dash_type) {
+	case RTL_DASH_DP:
+		return r8168dp_check_dash(tp);
+	case RTL_DASH_EP:
+		return r8168ep_check_dash(tp);
+	default:
+		return false;
+	}
+}
+
+static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
 {
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_28:
 	case RTL_GIGA_MAC_VER_31:
-		return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE;
+		return RTL_DASH_DP;
 	case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53:
-		return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE;
+		return RTL_DASH_EP;
 	default:
 		return RTL_DASH_NONE;
 	}
@@ -1453,7 +1466,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 
 	device_set_wakeup_enable(tp_to_dev(tp), wolopts);
 
-	if (tp->dash_type == RTL_DASH_NONE) {
+	if (!tp->dash_enabled) {
 		rtl_set_d3_pll_down(tp, !wolopts);
 		tp->dev->wol_enabled = wolopts ? 1 : 0;
 	}
@@ -2512,7 +2525,7 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp)
 
 static void rtl_prepare_power_down(struct rtl8169_private *tp)
 {
-	if (tp->dash_type != RTL_DASH_NONE)
+	if (tp->dash_enabled)
 		return;
 
 	if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
@@ -4869,7 +4882,7 @@ static int rtl8169_runtime_idle(struct device *device)
 {
 	struct rtl8169_private *tp = dev_get_drvdata(device);
 
-	if (tp->dash_type != RTL_DASH_NONE)
+	if (tp->dash_enabled)
 		return -EBUSY;
 
 	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
@@ -4896,7 +4909,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
 	rtl_rar_set(tp, tp->dev->perm_addr);
 
 	if (system_state == SYSTEM_POWER_OFF &&
-	    tp->dash_type == RTL_DASH_NONE) {
+	    !tp->dash_enabled) {
 		pci_wake_from_d3(pdev, tp->saved_wolopts);
 		pci_set_power_state(pdev, PCI_D3hot);
 	}
@@ -5254,7 +5267,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
 	tp->aspm_manageable = !rc;
 
-	tp->dash_type = rtl_check_dash(tp);
+	tp->dash_type = rtl_get_dash_type(tp);
+	tp->dash_enabled = rtl_dash_is_enabled(tp);
 
 	tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
 
@@ -5325,7 +5339,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* configure chip for default features */
 	rtl8169_set_features(dev, dev->features);
 
-	if (tp->dash_type == RTL_DASH_NONE) {
+	if (!tp->dash_enabled) {
 		rtl_set_d3_pll_down(tp, true);
 	} else {
 		rtl_set_d3_pll_down(tp, false);
@@ -5365,7 +5379,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 			    "ok" : "ko");
 
 	if (tp->dash_type != RTL_DASH_NONE) {
-		netdev_info(dev, "DASH enabled\n");
+		netdev_info(dev, "DASH %s\n",
+			    tp->dash_enabled ? "enabled" : "disabled");
 		rtl8168_driver_start(tp);
 	}
 
-- 
2.39.2


^ permalink raw reply related

* [PATCH net v3 2/2] r8169: fix network lost after resume on DASH systems
From: ChunHao Lin @ 2023-11-09 16:43 UTC (permalink / raw)
  To: hkallweit1
  Cc: nic_swsd, davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	ChunHao Lin, stable
In-Reply-To: <20231109164327.3577-1-hau@realtek.com>

Device that support DASH may be reseted or powered off during suspend.
So driver needs to handle DASH during system suspend and resume. Or
DASH firmware will influence device behavior and causes network lost.

Fixes: b646d90053f8 ("r8169: magic.")
Cc: stable@vger.kernel.org
Signed-off-by: ChunHao Lin <hau@realtek.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 4954ff0f72b1..7e90dac2d97d 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4661,10 +4661,16 @@ static void rtl8169_down(struct rtl8169_private *tp)
 	rtl8169_cleanup(tp);
 	rtl_disable_exit_l1(tp);
 	rtl_prepare_power_down(tp);
+
+	if (tp->dash_type != RTL_DASH_NONE)
+		rtl8168_driver_stop(tp);
 }
 
 static void rtl8169_up(struct rtl8169_private *tp)
 {
+	if (tp->dash_type != RTL_DASH_NONE)
+		rtl8168_driver_start(tp);
+
 	pci_set_master(tp->pci_dev);
 	phy_init_hw(tp->phydev);
 	phy_resume(tp->phydev);
-- 
2.39.2


^ permalink raw reply related

* Re: [PATCH net v2 1/2] r8169: add handling DASH when DASH is disabled
From: Heiner Kallweit @ 2023-11-09 16:37 UTC (permalink / raw)
  To: Paolo Abeni, ChunHao Lin
  Cc: nic_swsd, davem, edumazet, kuba, netdev, linux-kernel, stable
In-Reply-To: <5783a6f8819a741f0f299602ff615e6a03368246.camel@redhat.com>

On 09.11.2023 12:49, Paolo Abeni wrote:
> On Thu, 2023-11-09 at 02:48 +0800, ChunHao Lin wrote:
>> For devices that support DASH, even DASH is disabled, there may still
>> exist a default firmware that will influence device behavior.
>> So driver needs to handle DASH for devices that support DASH, no
>> matter the DASH status is.
>>
>> This patch also prepare for "fix DASH deviceis network lost issue".
>>
>> Signed-off-by: ChunHao Lin <hau@realtek.com>
> 
> You should include the fixes tag you already added in v1 and your Sob
> should come as the last tag
> 
> The same applies to the next patch 
> 
>> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
> 
> It's not clear where/when Heiner provided the above tag for this patch.
> I hope that was off-list.
> 
Right, so far I added my Rb for patch 2 only. Will have a look at patch 1
again once there's a version with your review comments having been addressed.

>> Cc: stable@vger.kernel.org
>> ---
>>  drivers/net/ethernet/realtek/r8169_main.c | 35 ++++++++++++++++-------
>>  1 file changed, 25 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>> index 0c76c162b8a9..108dc75050ba 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -624,6 +624,7 @@ struct rtl8169_private {
>>  
>>  	unsigned supports_gmii:1;
>>  	unsigned aspm_manageable:1;
>> +	unsigned dash_enabled:1;
>>  	dma_addr_t counters_phys_addr;
>>  	struct rtl8169_counters *counters;
>>  	struct rtl8169_tc_offsets tc_offset;
>> @@ -1253,14 +1254,26 @@ static bool r8168ep_check_dash(struct rtl8169_private *tp)
>>  	return r8168ep_ocp_read(tp, 0x128) & BIT(0);
>>  }
>>  
>> -static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp)
>> +static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
>> +{
>> +	switch (tp->dash_type) {
>> +	case RTL_DASH_DP:
>> +		return r8168dp_check_dash(tp);
>> +	case RTL_DASH_EP:
>> +		return r8168ep_check_dash(tp);
>> +	default:
>> +		return false;
>> +	}
>> +}
>> +
>> +static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
>>  {
>>  	switch (tp->mac_version) {
>>  	case RTL_GIGA_MAC_VER_28:
>>  	case RTL_GIGA_MAC_VER_31:
>> -		return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE;
>> +		return RTL_DASH_DP;
>>  	case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53:
>> -		return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE;
>> +		return RTL_DASH_EP;
>>  	default:
>>  		return RTL_DASH_NONE;
>>  	}
>> @@ -1453,7 +1466,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
>>  
>>  	device_set_wakeup_enable(tp_to_dev(tp), wolopts);
>>  
>> -	if (tp->dash_type == RTL_DASH_NONE) {
>> +	if (!tp->dash_enabled) {
>>  		rtl_set_d3_pll_down(tp, !wolopts);
>>  		tp->dev->wol_enabled = wolopts ? 1 : 0;
>>  	}
>> @@ -2512,7 +2525,7 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp)
>>  
>>  static void rtl_prepare_power_down(struct rtl8169_private *tp)
>>  {
>> -	if (tp->dash_type != RTL_DASH_NONE)
>> +	if (tp->dash_enabled)
>>  		return;
>>  
>>  	if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
>> @@ -4869,7 +4882,7 @@ static int rtl8169_runtime_idle(struct device *device)
>>  {
>>  	struct rtl8169_private *tp = dev_get_drvdata(device);
>>  
>> -	if (tp->dash_type != RTL_DASH_NONE)
>> +	if (tp->dash_enabled)
>>  		return -EBUSY;
>>  
>>  	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
>> @@ -4896,7 +4909,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
>>  	rtl_rar_set(tp, tp->dev->perm_addr);
>>  
>>  	if (system_state == SYSTEM_POWER_OFF &&
>> -	    tp->dash_type == RTL_DASH_NONE) {
>> +		!tp->dash_enabled) {
> 
> Since you have to repost, please maintain the correct indentation
> above:
> 
> 	if (system_state == SYSTEM_POWER_OFF &&
> 	    !tp->dash_enabled) {
> 
>         ^^^^
> spaces here.
> 
> 
> Cheers,
> 
> Paolo
> 


^ permalink raw reply

* RE: [PATCH net 1/3] dpll: fix pin dump crash after module unbind
From: Kubalewski, Arkadiusz @ 2023-11-09 16:33 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev@vger.kernel.org, vadim.fedorenko@linux.dev,
	Michalik, Michal, Olech, Milena, pabeni@redhat.com,
	kuba@kernel.org
In-Reply-To: <ZUzcEdhmnBVdXsBD@nanopsycho>

>From: Jiri Pirko <jiri@resnulli.us>
>Sent: Thursday, November 9, 2023 2:18 PM
>
>Thu, Nov 09, 2023 at 10:49:49AM CET, arkadiusz.kubalewski@intel.com wrote:
>>>From: Jiri Pirko <jiri@resnulli.us>
>>>Sent: Wednesday, November 8, 2023 4:09 PM
>>>
>>>Wed, Nov 08, 2023 at 11:32:24AM CET, arkadiusz.kubalewski@intel.com
>>>wrote:
>>>>Disallow dump of unregistered parent pins, it is possible when parent
>>>>pin and dpll device registerer kernel module instance unbinds, and
>>>>other kernel module instances of the same dpll device have pins
>>>>registered with the parent pin. The user can invoke a pin-dump but as
>>>>the parent was unregistered, thus shall not be accessed by the
>>>>userspace, prevent that by checking if parent pin is still registered.
>>>>
>>>>Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
>>>>Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
>>>>---
>>>> drivers/dpll/dpll_netlink.c | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>>diff --git a/drivers/dpll/dpll_netlink.c b/drivers/dpll/dpll_netlink.c
>>>>index a6dc3997bf5c..93fc6c4b8a78 100644
>>>>--- a/drivers/dpll/dpll_netlink.c
>>>>+++ b/drivers/dpll/dpll_netlink.c
>>>>@@ -328,6 +328,13 @@ dpll_msg_add_pin_parents(struct sk_buff *msg,
>>>>        struct dpll_pin *pin,
>>>> 		void *parent_priv;
>>>>
>>>> 		ppin = ref->pin;
>>>>+		/*
>>>>+		 * dump parent only if it is registered, thus prevent crash on
>>>>+		 * pin dump called when driver which registered the pin unbinds
>>>>+		 * and different instance registered pin on that parent pin
>>>
>>>Read this sentence like 10 times, still don't get what you mean.
>>>Shouldn't comments be easy to understand?
>>>
>>
>>Hi,
>>
>>Hmm, wondering isn't it better to remove this comment at all?
>>If you think it is needed I will rephrase it somehow..
>
>I don't know if it is needed as I don't understand it :)
>Just remove it.
>

Sure, will do.

Thank you!
Arkadiusz

>
>>
>>Thank you!
>>Arkadiusz
>>
>>>
>>>>+		 */
>>>>+		if (!xa_get_mark(&dpll_pin_xa, ppin->id, DPLL_REGISTERED))
>>>>+			continue;
>>>> 		parent_priv = dpll_pin_on_dpll_priv(dpll_ref->dpll, ppin);
>>>> 		ret = ops->state_on_pin_get(pin,
>>>> 					    dpll_pin_on_pin_priv(ppin, pin),
>>>>--
>>>>2.38.1
>>>>
>>

^ permalink raw reply

* RE: [PATCH net 2/3] dpll: fix pin dump crash for rebound module
From: Kubalewski, Arkadiusz @ 2023-11-09 16:30 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev@vger.kernel.org, vadim.fedorenko@linux.dev,
	Michalik, Michal, Olech, Milena, pabeni@redhat.com,
	kuba@kernel.org
In-Reply-To: <ZUzcTBmSPxIs5iH3@nanopsycho>

>From: Jiri Pirko <jiri@resnulli.us>
>Sent: Thursday, November 9, 2023 2:19 PM
>
>Thu, Nov 09, 2023 at 01:20:48PM CET, arkadiusz.kubalewski@intel.com wrote:
>>>From: Jiri Pirko <jiri@resnulli.us>
>>>Sent: Wednesday, November 8, 2023 3:30 PM
>>>
>>>Wed, Nov 08, 2023 at 11:32:25AM CET, arkadiusz.kubalewski@intel.com
>>>wrote:
>>>>When a kernel module is unbound but the pin resources were not entirely
>>>>freed (other kernel module instance have had kept the reference to that
>>>>pin), and kernel module is again bound, the pin properties would not be
>>>>updated (the properties are only assigned when memory for the pin is
>>>>allocated), prop pointer still points to the kernel module memory of
>>>>the kernel module which was deallocated on the unbind.
>>>>
>>>>If the pin dump is invoked in this state, the result is a kernel crash.
>>>>Prevent the crash by storing persistent pin properties in dpll
>>>>subsystem,
>>>>copy the content from the kernel module when pin is allocated, instead
>>>>of
>>>>using memory of the kernel module.
>>>>
>>>>Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions")
>>>>Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
>>>>Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
>>>>---
>>>> drivers/dpll/dpll_core.c    |  4 ++--
>>>> drivers/dpll/dpll_core.h    |  4 ++--
>>>> drivers/dpll/dpll_netlink.c | 28 ++++++++++++++--------------
>>>> 3 files changed, 18 insertions(+), 18 deletions(-)
>>>>
>>>>diff --git a/drivers/dpll/dpll_core.c b/drivers/dpll/dpll_core.c
>>>>index 3568149b9562..4077b562ba3b 100644
>>>>--- a/drivers/dpll/dpll_core.c
>>>>+++ b/drivers/dpll/dpll_core.c
>>>>@@ -442,7 +442,7 @@ dpll_pin_alloc(u64 clock_id, u32 pin_idx, struct
>>>>module *module,
>>>> 		ret = -EINVAL;
>>>> 		goto err;
>>>> 	}
>>>>-	pin->prop = prop;
>>>>+	memcpy(&pin->prop, prop, sizeof(pin->prop));
>>>
>>>Odd, you don't care about the pointer within this structure?
>>>
>>
>>Well, true. Need a fix.
>>Wondering if copying idea is better than just assigning prop pointer on
>>each call to dpll_pin_get(..) function (when pin already exists)?
>
>Not sure what do you mean. Examples please.
>

Sure,

Basically this change:

diff --git a/drivers/dpll/dpll_core.c b/drivers/dpll/dpll_core.c
index ae884b92d68c..06b72d5877c3 100644
--- a/drivers/dpll/dpll_core.c
+++ b/drivers/dpll/dpll_core.c
@@ -483,6 +483,7 @@ dpll_pin_get(u64 clock_id, u32 pin_idx, struct module *module,
                    pos->pin_idx == pin_idx &&
                    pos->module == module) {
                        ret = pos;
+                       pos->prop = prop;
                        refcount_inc(&ret->refcount);
                        break;
                }

would replace whole of this patch changes, although seems a bit hacky.

Thank you!
Arkadiusz

>
>>
>>Thank you!
>>Arkadiusz
>>
>>>
>>>> 	refcount_set(&pin->refcount, 1);
>>>> 	xa_init_flags(&pin->dpll_refs, XA_FLAGS_ALLOC);
>>>> 	xa_init_flags(&pin->parent_refs, XA_FLAGS_ALLOC);
>>>>@@ -634,7 +634,7 @@ int dpll_pin_on_pin_register(struct dpll_pin
>>>>*parent,
>>>>struct dpll_pin *pin,
>>>> 	unsigned long i, stop;
>>>> 	int ret;
>>>>
>>>>-	if (WARN_ON(parent->prop->type != DPLL_PIN_TYPE_MUX))
>>>>+	if (WARN_ON(parent->prop.type != DPLL_PIN_TYPE_MUX))
>>>> 		return -EINVAL;
>>>>
>>>> 	if (WARN_ON(!ops) ||
>>>>diff --git a/drivers/dpll/dpll_core.h b/drivers/dpll/dpll_core.h
>>>>index 5585873c5c1b..717f715015c7 100644
>>>>--- a/drivers/dpll/dpll_core.h
>>>>+++ b/drivers/dpll/dpll_core.h
>>>>@@ -44,7 +44,7 @@ struct dpll_device {
>>>>  * @module:		module of creator
>>>>  * @dpll_refs:		hold referencees to dplls pin was registered
>>>>with
>>>>  * @parent_refs:	hold references to parent pins pin was registered
>>>>with
>>>>- * @prop:		pointer to pin properties given by registerer
>>>>+ * @prop:		pin properties copied from the registerer
>>>>  * @rclk_dev_name:	holds name of device when pin can recover clock
>>>>from it
>>>>  * @refcount:		refcount
>>>>  **/
>>>>@@ -55,7 +55,7 @@ struct dpll_pin {
>>>> 	struct module *module;
>>>> 	struct xarray dpll_refs;
>>>> 	struct xarray parent_refs;
>>>>-	const struct dpll_pin_properties *prop;
>>>>+	struct dpll_pin_properties prop;
>>>> 	refcount_t refcount;
>>>> };
>>>>
>>>>diff --git a/drivers/dpll/dpll_netlink.c b/drivers/dpll/dpll_netlink.c
>>>>index 93fc6c4b8a78..963bbbbe6660 100644
>>>>--- a/drivers/dpll/dpll_netlink.c
>>>>+++ b/drivers/dpll/dpll_netlink.c
>>>>@@ -278,17 +278,17 @@ dpll_msg_add_pin_freq(struct sk_buff *msg, struct
>>>>dpll_pin *pin,
>>>> 	if (nla_put_64bit(msg, DPLL_A_PIN_FREQUENCY, sizeof(freq), &freq,
>>>> 			  DPLL_A_PIN_PAD))
>>>> 		return -EMSGSIZE;
>>>>-	for (fs = 0; fs < pin->prop->freq_supported_num; fs++) {
>>>>+	for (fs = 0; fs < pin->prop.freq_supported_num; fs++) {
>>>> 		nest = nla_nest_start(msg, DPLL_A_PIN_FREQUENCY_SUPPORTED);
>>>> 		if (!nest)
>>>> 			return -EMSGSIZE;
>>>>-		freq = pin->prop->freq_supported[fs].min;
>>>>+		freq = pin->prop.freq_supported[fs].min;
>>>> 		if (nla_put_64bit(msg, DPLL_A_PIN_FREQUENCY_MIN, sizeof(freq),
>>>> 				  &freq, DPLL_A_PIN_PAD)) {
>>>> 			nla_nest_cancel(msg, nest);
>>>> 			return -EMSGSIZE;
>>>> 		}
>>>>-		freq = pin->prop->freq_supported[fs].max;
>>>>+		freq = pin->prop.freq_supported[fs].max;
>>>> 		if (nla_put_64bit(msg, DPLL_A_PIN_FREQUENCY_MAX, sizeof(freq),
>>>> 				  &freq, DPLL_A_PIN_PAD)) {
>>>> 			nla_nest_cancel(msg, nest);
>>>>@@ -304,9 +304,9 @@ static bool dpll_pin_is_freq_supported(struct
>>>>dpll_pin
>>>>*pin, u32 freq)
>>>> {
>>>> 	int fs;
>>>>
>>>>-	for (fs = 0; fs < pin->prop->freq_supported_num; fs++)
>>>>-		if (freq >= pin->prop->freq_supported[fs].min &&
>>>>-		    freq <= pin->prop->freq_supported[fs].max)
>>>>+	for (fs = 0; fs < pin->prop.freq_supported_num; fs++)
>>>>+		if (freq >= pin->prop.freq_supported[fs].min &&
>>>>+		    freq <= pin->prop.freq_supported[fs].max)
>>>> 			return true;
>>>> 	return false;
>>>> }
>>>>@@ -403,7 +403,7 @@ static int
>>>> dpll_cmd_pin_get_one(struct sk_buff *msg, struct dpll_pin *pin,
>>>> 		     struct netlink_ext_ack *extack)
>>>> {
>>>>-	const struct dpll_pin_properties *prop = pin->prop;
>>>>+	const struct dpll_pin_properties *prop = &pin->prop;
>>>> 	struct dpll_pin_ref *ref;
>>>> 	int ret;
>>>>
>>>>@@ -696,7 +696,7 @@ dpll_pin_on_pin_state_set(struct dpll_pin *pin, u32
>>>>parent_idx,
>>>> 	int ret;
>>>>
>>>> 	if (!(DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE &
>>>>-	      pin->prop->capabilities)) {
>>>>+	      pin->prop.capabilities)) {
>>>> 		NL_SET_ERR_MSG(extack, "state changing is not allowed");
>>>> 		return -EOPNOTSUPP;
>>>> 	}
>>>>@@ -732,7 +732,7 @@ dpll_pin_state_set(struct dpll_device *dpll, struct
>>>>dpll_pin *pin,
>>>> 	int ret;
>>>>
>>>> 	if (!(DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE &
>>>>-	      pin->prop->capabilities)) {
>>>>+	      pin->prop.capabilities)) {
>>>> 		NL_SET_ERR_MSG(extack, "state changing is not allowed");
>>>> 		return -EOPNOTSUPP;
>>>> 	}
>>>>@@ -759,7 +759,7 @@ dpll_pin_prio_set(struct dpll_device *dpll, struct
>>>>dpll_pin *pin,
>>>> 	int ret;
>>>>
>>>> 	if (!(DPLL_PIN_CAPABILITIES_PRIORITY_CAN_CHANGE &
>>>>-	      pin->prop->capabilities)) {
>>>>+	      pin->prop.capabilities)) {
>>>> 		NL_SET_ERR_MSG(extack, "prio changing is not allowed");
>>>> 		return -EOPNOTSUPP;
>>>> 	}
>>>>@@ -787,7 +787,7 @@ dpll_pin_direction_set(struct dpll_pin *pin, struct
>>>>dpll_device *dpll,
>>>> 	int ret;
>>>>
>>>> 	if (!(DPLL_PIN_CAPABILITIES_DIRECTION_CAN_CHANGE &
>>>>-	      pin->prop->capabilities)) {
>>>>+	      pin->prop.capabilities)) {
>>>> 		NL_SET_ERR_MSG(extack, "direction changing is not allowed");
>>>> 		return -EOPNOTSUPP;
>>>> 	}
>>>>@@ -817,8 +817,8 @@ dpll_pin_phase_adj_set(struct dpll_pin *pin, struct
>>>>nlattr *phase_adj_attr,
>>>> 	int ret;
>>>>
>>>> 	phase_adj = nla_get_s32(phase_adj_attr);
>>>>-	if (phase_adj > pin->prop->phase_range.max ||
>>>>-	    phase_adj < pin->prop->phase_range.min) {
>>>>+	if (phase_adj > pin->prop.phase_range.max ||
>>>>+	    phase_adj < pin->prop.phase_range.min) {
>>>> 		NL_SET_ERR_MSG_ATTR(extack, phase_adj_attr,
>>>> 				    "phase adjust value not supported");
>>>> 		return -EINVAL;
>>>>@@ -999,7 +999,7 @@ dpll_pin_find(u64 clock_id, struct nlattr
>>>>*mod_name_attr,
>>>> 	unsigned long i;
>>>>
>>>> 	xa_for_each_marked(&dpll_pin_xa, i, pin, DPLL_REGISTERED) {
>>>>-		prop = pin->prop;
>>>>+		prop = &pin->prop;
>>>> 		cid_match = clock_id ? pin->clock_id == clock_id : true;
>>>> 		mod_match = mod_name_attr && module_name(pin->module) ?
>>>> 			!nla_strcmp(mod_name_attr,
>>>>--
>>>>2.38.1
>>>>
>>

^ permalink raw reply related

* Re: [PATCH net-next v2 16/21] virtio_net: xsk: rx: introduce add_recvbuf_xsk()
From: Maciej Fijalkowski @ 2023-11-09 16:26 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, netdev, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Jason Wang, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	virtualization, bpf
In-Reply-To: <1699528306.7236402-5-xuanzhuo@linux.alibaba.com>

On Thu, Nov 09, 2023 at 07:11:46PM +0800, Xuan Zhuo wrote:
> On Thu, 9 Nov 2023 03:12:27 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Tue, Nov 07, 2023 at 11:12:22AM +0800, Xuan Zhuo wrote:
> > > Implement the logic of filling rq with XSK buffers.
> > >
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  drivers/net/virtio/main.c       |  4 ++-
> > >  drivers/net/virtio/virtio_net.h |  5 ++++
> > >  drivers/net/virtio/xsk.c        | 49 ++++++++++++++++++++++++++++++++-
> > >  drivers/net/virtio/xsk.h        |  2 ++
> > >  4 files changed, 58 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio/main.c b/drivers/net/virtio/main.c
> > > index 6210a6e37396..15943a22e17d 100644
> > > --- a/drivers/net/virtio/main.c
> > > +++ b/drivers/net/virtio/main.c
> > > @@ -1798,7 +1798,9 @@ static bool try_fill_recv(struct virtnet_info *vi, struct virtnet_rq *rq,
> > >  	bool oom;
> > >
> > >  	do {
> > > -		if (vi->mergeable_rx_bufs)
> > > +		if (rq->xsk.pool)
> > > +			err = virtnet_add_recvbuf_xsk(vi, rq, rq->xsk.pool, gfp);
> > > +		else if (vi->mergeable_rx_bufs)
> > >  			err = add_recvbuf_mergeable(vi, rq, gfp);
> > >  		else if (vi->big_packets)
> > >  			err = add_recvbuf_big(vi, rq, gfp);
> >
> > I'm not sure I understand. How does this handle mergeable flag still being set?
> 
> 
> You has the same question as Jason.
> 
> So I think maybe I should put the handle into the
> add_recvbuf_mergeable and add_recvbuf_small.
> 
> Let me think about this.
> 
> 
> >
> >
> > > diff --git a/drivers/net/virtio/virtio_net.h b/drivers/net/virtio/virtio_net.h
> > > index a13d6d301fdb..1242785e311e 100644
> > > --- a/drivers/net/virtio/virtio_net.h
> > > +++ b/drivers/net/virtio/virtio_net.h
> > > @@ -140,6 +140,11 @@ struct virtnet_rq {
> > >
> > >  		/* xdp rxq used by xsk */
> > >  		struct xdp_rxq_info xdp_rxq;
> > > +
> > > +		struct xdp_buff **xsk_buffs;
> > > +		u32 nxt_idx;
> > > +		u32 num;
> > > +		u32 size;
> > >  	} xsk;
> > >  };
> > >
> > > diff --git a/drivers/net/virtio/xsk.c b/drivers/net/virtio/xsk.c
> > > index ea5804ddd44e..e737c3353212 100644
> > > --- a/drivers/net/virtio/xsk.c
> > > +++ b/drivers/net/virtio/xsk.c
> > > @@ -38,6 +38,41 @@ static void virtnet_xsk_check_queue(struct virtnet_sq *sq)
> > >  		netif_stop_subqueue(dev, qnum);
> > >  }
> > >
> > > +int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct virtnet_rq *rq,
> > > +			    struct xsk_buff_pool *pool, gfp_t gfp)
> > > +{
> > > +	struct xdp_buff **xsk_buffs;
> > > +	dma_addr_t addr;
> > > +	u32 len, i;
> > > +	int err = 0;
> > > +
> > > +	xsk_buffs = rq->xsk.xsk_buffs;
> > > +
> > > +	if (rq->xsk.nxt_idx >= rq->xsk.num) {
> > > +		rq->xsk.num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->xsk.size);
> > > +		if (!rq->xsk.num)
> > > +			return -ENOMEM;
> > > +		rq->xsk.nxt_idx = 0;
> > > +	}
> >
> > Another manually rolled linked list implementation.
> > Please, don't.
> 
> 
> The array is for speedup.
> 
> xsk_buff_alloc_batch will return many xsk_buff that will be more efficient than
> the xsk_buff_alloc.

But your sg list just contains a single entry?
I think that you have to walk through the xsk_buffs array, retrieve dma
addrs from there and have sg list sized to the value
xsk_buff_alloc_batch() returned.

I don't think your logic based on nxt_idx is needed. Please take a look
how other drivers use xsk_buff_alloc_batch().

I don't see callsites of virtnet_add_recvbuf_xsk() though.

> 
> Thanks.
> 
> >
> >
> > > +
> > > +	i = rq->xsk.nxt_idx;
> > > +
> > > +	/* use the part of XDP_PACKET_HEADROOM as the virtnet hdr space */
> > > +	addr = xsk_buff_xdp_get_dma(xsk_buffs[i]) - vi->hdr_len;
> > > +	len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
> > > +
> > > +	sg_init_table(rq->sg, 1);
> > > +	sg_fill_dma(rq->sg, addr, len);
> > > +
> > > +	err = virtqueue_add_inbuf(rq->vq, rq->sg, 1, xsk_buffs[i], gfp);
> > > +	if (err)
> > > +		return err;
> > > +
> > > +	rq->xsk.nxt_idx++;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int virtnet_xsk_xmit_one(struct virtnet_sq *sq,
> > >  				struct xsk_buff_pool *pool,
> > >  				struct xdp_desc *desc)
> > > @@ -213,7 +248,7 @@ static int virtnet_xsk_pool_enable(struct net_device *dev,
> > >  	struct virtnet_sq *sq;
> > >  	struct device *dma_dev;
> > >  	dma_addr_t hdr_dma;
> > > -	int err;
> > > +	int err, size;
> > >
> > >  	/* In big_packets mode, xdp cannot work, so there is no need to
> > >  	 * initialize xsk of rq.
> > > @@ -249,6 +284,16 @@ static int virtnet_xsk_pool_enable(struct net_device *dev,
> > >  	if (!dma_dev)
> > >  		return -EPERM;
> > >
> > > +	size = virtqueue_get_vring_size(rq->vq);
> > > +
> > > +	rq->xsk.xsk_buffs = kcalloc(size, sizeof(*rq->xsk.xsk_buffs), GFP_KERNEL);
> > > +	if (!rq->xsk.xsk_buffs)
> > > +		return -ENOMEM;
> > > +
> > > +	rq->xsk.size = size;
> > > +	rq->xsk.nxt_idx = 0;
> > > +	rq->xsk.num = 0;
> > > +
> > >  	hdr_dma = dma_map_single(dma_dev, &xsk_hdr, vi->hdr_len, DMA_TO_DEVICE);
> > >  	if (dma_mapping_error(dma_dev, hdr_dma))
> > >  		return -ENOMEM;
> > > @@ -307,6 +352,8 @@ static int virtnet_xsk_pool_disable(struct net_device *dev, u16 qid)
> > >
> > >  	dma_unmap_single(dma_dev, sq->xsk.hdr_dma_address, vi->hdr_len, DMA_TO_DEVICE);
> > >
> > > +	kfree(rq->xsk.xsk_buffs);
> > > +
> > >  	return err1 | err2;
> > >  }
> > >
> > > diff --git a/drivers/net/virtio/xsk.h b/drivers/net/virtio/xsk.h
> > > index 7ebc9bda7aee..bef41a3f954e 100644
> > > --- a/drivers/net/virtio/xsk.h
> > > +++ b/drivers/net/virtio/xsk.h
> > > @@ -23,4 +23,6 @@ int virtnet_xsk_pool_setup(struct net_device *dev, struct netdev_bpf *xdp);
> > >  bool virtnet_xsk_xmit(struct virtnet_sq *sq, struct xsk_buff_pool *pool,
> > >  		      int budget);
> > >  int virtnet_xsk_wakeup(struct net_device *dev, u32 qid, u32 flag);
> > > +int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct virtnet_rq *rq,
> > > +			    struct xsk_buff_pool *pool, gfp_t gfp);
> > >  #endif
> > > --
> > > 2.32.0.3.g01195cf9f
> >
> >
> 

^ permalink raw reply

* Re: [PATCH net-next 07/15] eth: link netdev to page_pools in drivers
From: Jakub Kicinski @ 2023-11-09 16:26 UTC (permalink / raw)
  To: Ilias Apalodimas; +Cc: davem, netdev, edumazet, pabeni, almasrymina, hawk
In-Reply-To: <CAC_iWjJzJp+QWCY8ES=yOr4WrKXqF_AWGJjzNdCQmGpa=5dbyQ@mail.gmail.com>

On Thu, 9 Nov 2023 11:11:04 +0200 Ilias Apalodimas wrote:
> > ---
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 1 +
> >  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
> >  drivers/net/ethernet/microsoft/mana/mana_en.c     | 1 +
> >  3 files changed, 3 insertions(+)  
> 
> Mind add a line for netsec.c (probably in netsec_setup_rx_dring),
> since that's the only NIC I currently have around with pp support?

It's a single queue / single napi device? So like this?

diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
index 0891e9e49ecb..384c506bb930 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -1302,6 +1302,8 @@ static int netsec_setup_rx_dring(struct netsec_priv *priv)
 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
 		.offset = NETSEC_RXBUF_HEADROOM,
 		.max_len = NETSEC_RX_BUF_SIZE,
+		.napi = priv->napi,
+		.netdev = priv->ndev,
 	};
 	int i, err;
 

^ permalink raw reply related

* Re: [PATCH net-next 1/1] net: stmmac: Add support for HW-accelerated VLAN stripping
From: Andrew Lunn @ 2023-11-09 16:22 UTC (permalink / raw)
  To: Gan Yi Fang
  Cc: Alexandre Torgue, Jose Abreu, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Maxime Coquelin, Russell King,
	Andrew Halaney, Simon Horman, Bartosz Golaszewski, Shenwei Wang,
	Russell King, Johannes Zink, Jochen Henneberg, netdev,
	linux-stm32, linux-arm-kernel, linux-kernel, Looi Hong Aun,
	Voon Weifeng, Song Yoong Siang
In-Reply-To: <20231109053831.2572699-1-yi.fang.gan@intel.com>

> +static void dwmac4_rx_hw_vlan(struct mac_device_info *hw,
> +			      struct dma_desc *rx_desc, struct sk_buff *skb)
> +{
> +	if (hw->desc->get_rx_vlan_valid(rx_desc)) {
> +		u16 vid = (u16)hw->desc->get_rx_vlan_tci(rx_desc);
> +
> +		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
> +	}
> +}

> +static inline int dwmac4_wrback_get_rx_vlan_tci(struct dma_desc *p)
> +{
> +	return (le32_to_cpu(p->des0) & RDES0_VLAN_TAG_MASK);
> +}

A tci is a u16. If you make this function return a u16, you can avoid
the cast.

  Andrew

^ permalink raw reply

* Re: [PATCH net-next 04/15] net: page_pool: id the page pools
From: Jakub Kicinski @ 2023-11-09 16:22 UTC (permalink / raw)
  To: Ilias Apalodimas; +Cc: davem, netdev, edumazet, pabeni, almasrymina, hawk
In-Reply-To: <CAC_iWj+gdrsyumk77mR60o6rw=pUmnXgrkmwJXK_04KPJCMhAw@mail.gmail.com>

On Thu, 9 Nov 2023 11:21:32 +0200 Ilias Apalodimas wrote:
> > +       mutex_lock(&page_pools_lock);
> > +       err = xa_alloc_cyclic(&page_pools, &pool->user.id, pool, xa_limit_32b,
> > +                             &id_alloc_next, GFP_KERNEL);
> > +       if (err < 0)
> > +               goto err_unlock;  
> 
> A nit really, but get rid of the if/goto and just let this return err; ?

There's more stuff added here by a subsequent patch. It ends up like
this:

int page_pool_list(struct page_pool *pool)
{
	static u32 id_alloc_next;
	int err;

	mutex_lock(&page_pools_lock);
	err = xa_alloc_cyclic(&page_pools, &pool->user.id, pool, xa_limit_32b,
			      &id_alloc_next, GFP_KERNEL);
	if (err < 0)
		goto err_unlock;

	if (pool->slow.netdev) {
		hlist_add_head(&pool->user.list,
			       &pool->slow.netdev->page_pools);
		pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0;

		netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_ADD_NTF);
	}

	mutex_unlock(&page_pools_lock);
	return 0;

err_unlock:
	mutex_unlock(&page_pools_lock);
	return err;
}

Do you want me to combine the error and non-error paths?
I have a weak preference for not mixing, sometimes err gets set 
to a positive value and that starts to propagate, unlikely to
happen here tho.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox