Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next v6 0/5] Coalesce mac ocp write/modify calls to reduce spinlock contention
From: Andrew Lunn @ 2023-11-05 15:33 UTC (permalink / raw)
  To: Mirsad Todorovac
  Cc: Heiner Kallweit, linux-kernel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, nic_swsd
In-Reply-To: <edee64f4-442d-4670-a91b-e5b83117dd40@alu.unizg.hr>

> > > With about 130 of these sequential calls to r8168_mac_ocp_write() this looks like
> > > a lock storm that will stall all of the cores and CPUs on the same memory controller
> > > for certain time I/O takes to finish.

Please provide benchmark data to show this is a real issue, and the
patch fixes it.

> Additionally, I would like to "inline" many functions, as I think that call/return
> sequences with stack frame generation /destruction are more expensive than inlining the
> small one liners.

Please provide benchmarks to show the compiler is getting this wrong,
and inline really is needed.

Until there are benchmarks: NACK.

    Andrew

^ permalink raw reply

* [GIT PULL] vhost,virtio,vdpa: features, fixes, cleanups
From: Michael S. Tsirkin @ 2023-11-05 15:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kvm, virtualization, netdev, linux-kernel, dtatulea, eperezma,
	geert+renesas, gregkh, jasowang, leiyang, leon, mst, pizhenwei,
	sgarzare, shannon.nelson, shawn.shao, simon.horman, si-wei.liu,
	xieyongji, xuanzhuo, xueshi.hu

The following changes since commit ffc253263a1375a65fa6c9f62a893e9767fbebfa:

  Linux 6.6 (2023-10-29 16:31:08 -1000)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 86f6c224c97911b4392cb7b402e6a4ed323a449e:

  vdpa_sim: implement .reset_map support (2023-11-01 09:20:00 -0400)

----------------------------------------------------------------
vhost,virtio,vdpa: features, fixes, cleanups

vdpa/mlx5:
	VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
	new maintainer
vdpa:
	support for vq descriptor mappings
	decouple reset of iotlb mapping from device reset

fixes, cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

----------------------------------------------------------------
Dragos Tatulea (14):
      vdpa/mlx5: Expose descriptor group mkey hw capability
      vdpa/mlx5: Create helper function for dma mappings
      vdpa/mlx5: Decouple cvq iotlb handling from hw mapping code
      vdpa/mlx5: Take cvq iotlb lock during refresh
      vdpa/mlx5: Collapse "dvq" mr add/delete functions
      vdpa/mlx5: Rename mr destroy functions
      vdpa/mlx5: Allow creation/deletion of any given mr struct
      vdpa/mlx5: Move mr mutex out of mr struct
      vdpa/mlx5: Improve mr update flow
      vdpa/mlx5: Introduce mr for vq descriptor
      vdpa/mlx5: Enable hw support for vq descriptor mapping
      vdpa/mlx5: Make iotlb helper functions more generic
      vdpa/mlx5: Update cvq iotlb mapping on ASID change
      MAINTAINERS: Add myself as mlx5_vdpa driver

Eugenio Pérez (1):
      mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK

Geert Uytterhoeven (1):
      vhost-scsi: Spelling s/preceeding/preceding/g

Greg Kroah-Hartman (1):
      vduse: make vduse_class constant

Michael S. Tsirkin (1):
      Merge branch 'mlx5-vhost' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

Shannon Nelson (1):
      virtio: kdoc for struct virtio_pci_modern_device

Shawn.Shao (1):
      vdpa: Update sysfs ABI documentation

Si-Wei Liu (10):
      vdpa: introduce dedicated descriptor group for virtqueue
      vhost-vdpa: introduce descriptor group backend feature
      vhost-vdpa: uAPI to get dedicated descriptor group id
      vdpa: introduce .reset_map operation callback
      vhost-vdpa: reset vendor specific mapping to initial state in .release
      vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
      vdpa: introduce .compat_reset operation callback
      vhost-vdpa: clean iotlb map during reset for older userspace
      vdpa/mlx5: implement .reset_map driver op
      vdpa_sim: implement .reset_map support

Xuan Zhuo (3):
      virtio: add definition of VIRTIO_F_NOTIF_CONFIG_DATA feature bit
      virtio_pci: add build offset check for the new common cfg items
      virtio_pci: add check for common cfg size

Xueshi Hu (1):
      virtio-balloon: correct the comment of virtballoon_migratepage()

zhenwei pi (1):
      virtio-blk: fix implicit overflow on virtio_max_dma_size

 Documentation/ABI/testing/sysfs-bus-vdpa |   4 +-
 MAINTAINERS                              |   6 +
 drivers/block/virtio_blk.c               |   4 +-
 drivers/vdpa/mlx5/core/mlx5_vdpa.h       |  32 +++--
 drivers/vdpa/mlx5/core/mr.c              | 213 +++++++++++++++++++------------
 drivers/vdpa/mlx5/core/resources.c       |   6 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c        | 137 +++++++++++++++-----
 drivers/vdpa/vdpa_sim/vdpa_sim.c         |  52 ++++++--
 drivers/vdpa/vdpa_user/vduse_dev.c       |  40 +++---
 drivers/vhost/scsi.c                     |   2 +-
 drivers/vhost/vdpa.c                     |  79 +++++++++++-
 drivers/virtio/virtio_balloon.c          |   2 +-
 drivers/virtio/virtio_pci_modern.c       |  36 ++++++
 drivers/virtio/virtio_pci_modern_dev.c   |   6 +-
 drivers/virtio/virtio_vdpa.c             |   2 +-
 include/linux/mlx5/mlx5_ifc.h            |   8 +-
 include/linux/mlx5/mlx5_ifc_vdpa.h       |   7 +-
 include/linux/vdpa.h                     |  41 +++++-
 include/linux/virtio_pci_modern.h        |  35 +++--
 include/uapi/linux/vhost.h               |   8 ++
 include/uapi/linux/vhost_types.h         |   7 +
 include/uapi/linux/virtio_config.h       |   5 +
 22 files changed, 546 insertions(+), 186 deletions(-)


^ permalink raw reply

* [PATCH] net: bcmasp: Use common error handling code in bcmasp_probe()
From: Markus Elfring @ 2023-11-05 16:33 UTC (permalink / raw)
  To: Julia Lawall, David S. Miller, Eric Dumazet, Florian Fainelli,
	Jakub Kicinski, Justin Chen, Paolo Abeni,
	bcm-kernel-feedback-list, netdev, kernel-janitors
  Cc: cocci, LKML

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sun, 5 Nov 2023 17:24:01 +0100

Add a jump target so that a bit of exception handling can be better
reused at the end of this function.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/net/ethernet/broadcom/asp2/bcmasp.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/asp2/bcmasp.c b/drivers/net/ethernet/broadcom/asp2/bcmasp.c
index 29b04a274d07..675437e44b94 100644
--- a/drivers/net/ethernet/broadcom/asp2/bcmasp.c
+++ b/drivers/net/ethernet/broadcom/asp2/bcmasp.c
@@ -1304,9 +1304,8 @@ static int bcmasp_probe(struct platform_device *pdev)
 		intf = bcmasp_interface_create(priv, intf_node, i);
 		if (!intf) {
 			dev_err(dev, "Cannot create eth interface %d\n", i);
-			bcmasp_remove_intfs(priv);
 			of_node_put(intf_node);
-			goto of_put_exit;
+			goto remove_intfs;
 		}
 		list_add_tail(&intf->list, &priv->intfs);
 		i++;
@@ -1331,8 +1330,7 @@ static int bcmasp_probe(struct platform_device *pdev)
 			netdev_err(intf->ndev,
 				   "failed to register net_device: %d\n", ret);
 			priv->destroy_wol(priv);
-			bcmasp_remove_intfs(priv);
-			goto of_put_exit;
+			goto remove_intfs;
 		}
 		count++;
 	}
@@ -1342,6 +1340,10 @@ static int bcmasp_probe(struct platform_device *pdev)
 of_put_exit:
 	of_node_put(ports_node);
 	return ret;
+
+remove_intfs:
+	bcmasp_remove_intfs(priv);
+	goto of_put_exit;
 }

 static void bcmasp_remove(struct platform_device *pdev)
--
2.42.0


^ permalink raw reply related

* [PATCH net v2] i40e: Fix adding unsupported cloud filters
From: Ivan Vecera @ 2023-11-05 16:46 UTC (permalink / raw)
  To: netdev
  Cc: Jesse Brandeburg, Tony Nguyen, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, intel-wired-lan, linux-kernel,
	Jacob Keller, Wojciech Drewek, Simon Horman

If a VF tries to add unsupported cloud filter through virchnl
then i40e_add_del_cloud_filter(_big_buf) returns -ENOTSUPP but
this error code is stored in 'ret' instead of 'aq_ret' that
is used as error code sent back to VF. In this scenario where
one of the mentioned functions fails the value of 'aq_ret'
is zero so the VF will incorrectly receive a 'success'.

Use 'aq_ret' to store return value and remove 'ret' local
variable. Additionally fix the issue when filter allocation
fails, in this case no notification is sent back to the VF.

Fixes: e284fc280473be ("i40e: Add and delete cloud filter")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
---
 .../net/ethernet/intel/i40e/i40e_virtchnl_pf.c   | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 08d7edccfb8ddb..3f99eb19824527 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -3844,7 +3844,7 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg)
 	struct i40e_pf *pf = vf->pf;
 	struct i40e_vsi *vsi = NULL;
 	int aq_ret = 0;
-	int i, ret;
+	int i;
 
 	if (!i40e_sync_vf_state(vf, I40E_VF_STATE_ACTIVE)) {
 		aq_ret = -EINVAL;
@@ -3868,8 +3868,10 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg)
 	}
 
 	cfilter = kzalloc(sizeof(*cfilter), GFP_KERNEL);
-	if (!cfilter)
-		return -ENOMEM;
+	if (!cfilter) {
+		aq_ret = -ENOMEM;
+		goto err_out;
+	}
 
 	/* parse destination mac address */
 	for (i = 0; i < ETH_ALEN; i++)
@@ -3917,13 +3919,13 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg)
 
 	/* Adding cloud filter programmed as TC filter */
 	if (tcf.dst_port)
-		ret = i40e_add_del_cloud_filter_big_buf(vsi, cfilter, true);
+		aq_ret = i40e_add_del_cloud_filter_big_buf(vsi, cfilter, true);
 	else
-		ret = i40e_add_del_cloud_filter(vsi, cfilter, true);
-	if (ret) {
+		aq_ret = i40e_add_del_cloud_filter(vsi, cfilter, true);
+	if (aq_ret) {
 		dev_err(&pf->pdev->dev,
 			"VF %d: Failed to add cloud filter, err %pe aq_err %s\n",
-			vf->vf_id, ERR_PTR(ret),
+			vf->vf_id, ERR_PTR(aq_ret),
 			i40e_aq_str(&pf->hw, pf->hw.aq.asq_last_status));
 		goto err_free;
 	}
-- 
2.41.0


^ permalink raw reply related

* Re: [syzbot] [nfc?] [net?] BUG: corrupted list in nfc_llcp_unregister_device
From: syzbot @ 2023-11-05 17:34 UTC (permalink / raw)
  To: 309386628, davem, dominic.coppola, dvyukov, edumazet, hdanton,
	johan.hedberg, krzysztof.kozlowski, kuba, linma, linux-bluetooth,
	linux-kernel, luiz.dentz, luiz.von.dentz, marcel, netdev, pabeni,
	syzkaller-bugs, zahiabdelmalak0
In-Reply-To: <0000000000006f759505ee84d8d7@google.com>

syzbot suspects this issue was fixed by commit:

commit b938790e70540bf4f2e653dcd74b232494d06c8f
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Fri Sep 15 20:24:47 2023 +0000

    Bluetooth: hci_codec: Fix leaking content of local_codecs

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15cb8f17680000
start commit:   e1c04510f521 Merge tag 'pm-6.2-rc9' of git://git.kernel.or..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=fe56f7d193926860
dashboard link: https://syzkaller.appspot.com/bug?extid=81232c4a81a886e2b580
userspace arch: i386
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14e90568c80000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: Bluetooth: hci_codec: Fix leaking content of local_codecs

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* [RFC Draft net-next] docs: netdev: add section on using lei to manage netdev mail volume
From: David Wei @ 2023-11-05 18:50 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev

As a beginner to netdev I found the volume of mail to be overwhelming. I only
want to focus on core netdev changes and ignore most driver changes. I found a
way to do this using lei, filtering the mailing list using lore's query
language and writing the results into an IMAP server.

This patch is an RFC draft of updating the maintainer-netdev documentation with
this information in the hope of helping out others in the future.

Signed-off-by: David Wei <dw@davidwei.uk>
---
 Documentation/process/maintainer-netdev.rst | 39 +++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst
index 7feacc20835e..93851783de6f 100644
--- a/Documentation/process/maintainer-netdev.rst
+++ b/Documentation/process/maintainer-netdev.rst
@@ -33,6 +33,45 @@ Aside from subsystems like those mentioned above, all network-related
 Linux development (i.e. RFC, review, comments, etc.) takes place on
 netdev.
 
+Managing emails
+~~~~~~~~~~~~~~~
+
+netdev is a busy mailing list with on average over 200 emails received per day,
+which can be overwhelming to beginners. Rather than subscribing to the entire
+list, considering using ``lei`` to only subscribe to topics that you are
+interested in. Konstantin Ryabitsev wrote excellent tutorials on using ``lei``:
+
+ - https://people.kernel.org/monsieuricon/lore-lei-part-1-getting-started
+ - https://people.kernel.org/monsieuricon/lore-lei-part-2-now-with-imap
+
+As a netdev beginner, you may want to filter out driver changes and only focus
+on core netdev changes. Try using the following query with ``lei q``::
+
+  lei q -o ~/Mail/netdev \
+    -I https://lore.kernel.org/all \
+    -t '(b:b/net/* AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'
+
+This query will only match threads containing messages with patches that modify
+files in ``net/*``. For more information on the query language, see:
+
+  https://lore.kernel.org/linux-btrfs/_/text/help/
+
+By default ``lei`` will output to a Maildir, but it also supports Mbox and IMAP
+by adding a prefix to the output directory ``-o``. For a list of supported
+formats and prefix strings, see:
+
+  https://www.mankier.com/1/lei-q
+
+If you would like to use IMAP, Konstantin’s blog is slightly outdated and you
+no longer need to use here strings i.e. ``<<<`` or ``<<EOF``. You can simply
+point lei at an IMAP server e.g. ``imaps://imap.gmail.com``::
+
+  lei q -o imaps://imap.gmail.com/netdev \
+    -I https://lore.kernel.org/all \
+    -t '(b:b/net/* AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'
+
+You need to set up ``git-credentials`` properly first.
+
 Development cycle
 -----------------
 
-- 
2.39.3


^ permalink raw reply related

* [PATCH v3] tg3: Fix the TX ring stall
From: alexey.pakhunov @ 2023-11-05 18:58 UTC (permalink / raw)
  To: mchan
  Cc: vincent.wong2, netdev, linux-kernel, siva.kallam, prashant,
	Alex Pakhunov

From: Alex Pakhunov <alexey.pakhunov@spacex.com>

The TX ring maintained by the tg3 driver can end up in the state, when it
has packets queued for sending but the NIC hardware is not informed, so no
progress is made. This leads to a multi-second interruption in network
traffic followed by dev_watchdog() firing and resetting the queue.

The specific sequence of steps is:

1. tg3_start_xmit() is called at least once and queues packet(s) without
   updating tnapi->prodmbox (netdev_xmit_more() returns true)
2. tg3_start_xmit() is called with an SKB which causes tg3_tso_bug() to be
   called.
3. tg3_tso_bug() determines that the SKB is too large, ...

        if (unlikely(tg3_tx_avail(tnapi) <= frag_cnt_est)) {

   ... stops the queue, and returns NETDEV_TX_BUSY:

        netif_tx_stop_queue(txq);
        ...
        if (tg3_tx_avail(tnapi) <= frag_cnt_est)
                return NETDEV_TX_BUSY;

4. Since all tg3_tso_bug() call sites directly return, the code updating
   tnapi->prodmbox is skipped.

5. The queue is stuck now. tg3_start_xmit() is not called while the queue
   is stopped. The NIC is not processing new packets because
   tnapi->prodmbox wasn't updated. tg3_tx() is not called by
   tg3_poll_work() because the all TX descriptions that could be freed has
   been freed:

        /* run TX completion thread */
        if (tnapi->hw_status->idx[0].tx_consumer != tnapi->tx_cons) {
                tg3_tx(tnapi);

6. Eventually, dev_watchdog() fires triggering a reset of the queue.

This fix makes sure that the tnapi->prodmbox update happens regardless of
the reason tg3_start_xmit() returned.

Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com>
Signed-off-by: Vincent Wong <vincent.wong2@spacex.com>
---
v3: Split "Fix the TX ring stall" into a standalone patch. No code changes from v2.
v2: https://lore.kernel.org/netdev/CACKFLi=ZLAb1Y92LwvqjOGPCuinka7qbHwDP2pkG4-_a7DMorQ@mail.gmail.com/T/#t
    - Sort the local variables in tg3_start_xmit() in the RCS order
v1: https://lore.kernel.org/netdev/20231101191858.2611154-1-alexey.pakhunov@spacex.com/T/#t
---
 drivers/net/ethernet/broadcom/tg3.c | 53 +++++++++++++++++++++++------
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 14b311196b8f..310f3323fba1 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6603,9 +6603,9 @@ static void tg3_tx(struct tg3_napi *tnapi)
 
 	tnapi->tx_cons = sw_idx;
 
-	/* Need to make the tx_cons update visible to tg3_start_xmit()
+	/* Need to make the tx_cons update visible to __tg3_start_xmit()
 	 * before checking for netif_queue_stopped().  Without the
-	 * memory barrier, there is a small possibility that tg3_start_xmit()
+	 * memory barrier, there is a small possibility that __tg3_start_xmit()
 	 * will miss it and cause the queue to be stopped forever.
 	 */
 	smp_mb();
@@ -7845,7 +7845,7 @@ static bool tg3_tso_bug_gso_check(struct tg3_napi *tnapi, struct sk_buff *skb)
 	return skb_shinfo(skb)->gso_segs < tnapi->tx_pending / 3;
 }
 
-static netdev_tx_t tg3_start_xmit(struct sk_buff *, struct net_device *);
+static netdev_tx_t __tg3_start_xmit(struct sk_buff *, struct net_device *);
 
 /* Use GSO to workaround all TSO packets that meet HW bug conditions
  * indicated in tg3_tx_frag_set()
@@ -7879,7 +7879,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
 
 	skb_list_walk_safe(segs, seg, next) {
 		skb_mark_not_on_list(seg);
-		tg3_start_xmit(seg, tp->dev);
+		__tg3_start_xmit(seg, tp->dev);
 	}
 
 tg3_tso_bug_end:
@@ -7889,7 +7889,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
 }
 
 /* hard_start_xmit for all devices */
-static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
+static netdev_tx_t __tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct tg3 *tp = netdev_priv(dev);
 	u32 len, entry, base_flags, mss, vlan = 0;
@@ -8133,11 +8133,6 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			netif_tx_wake_queue(txq);
 	}
 
-	if (!netdev_xmit_more() || netif_xmit_stopped(txq)) {
-		/* Packets are ready, update Tx producer idx on card. */
-		tw32_tx_mbox(tnapi->prodmbox, entry);
-	}
-
 	return NETDEV_TX_OK;
 
 dma_error:
@@ -8150,6 +8145,42 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
+static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct netdev_queue *txq;
+	u16 skb_queue_mapping;
+	netdev_tx_t ret;
+
+	skb_queue_mapping = skb_get_queue_mapping(skb);
+	txq = netdev_get_tx_queue(dev, skb_queue_mapping);
+
+	ret = __tg3_start_xmit(skb, dev);
+
+	/* Notify the hardware that packets are ready by updating the TX ring
+	 * tail pointer. We respect netdev_xmit_more() thus avoiding poking
+	 * the hardware for every packet. To guarantee forward progress the TX
+	 * ring must be drained when it is full as indicated by
+	 * netif_xmit_stopped(). This needs to happen even when the current
+	 * skb was dropped or rejected with NETDEV_TX_BUSY. Otherwise packets
+	 * queued by previous __tg3_start_xmit() calls might get stuck in
+	 * the queue forever.
+	 */
+	if (!netdev_xmit_more() || netif_xmit_stopped(txq)) {
+		struct tg3_napi *tnapi;
+		struct tg3 *tp;
+
+		tp = netdev_priv(dev);
+		tnapi = &tp->napi[skb_queue_mapping];
+
+		if (tg3_flag(tp, ENABLE_TSS))
+			tnapi++;
+
+		tw32_tx_mbox(tnapi->prodmbox, tnapi->tx_prod);
+	}
+
+	return ret;
+}
+
 static void tg3_mac_loopback(struct tg3 *tp, bool enable)
 {
 	if (enable) {
@@ -17680,7 +17711,7 @@ static int tg3_init_one(struct pci_dev *pdev,
 	 * device behind the EPB cannot support DMA addresses > 40-bit.
 	 * On 64-bit systems with IOMMU, use 40-bit dma_mask.
 	 * On 64-bit systems without IOMMU, use 64-bit dma_mask and
-	 * do DMA address check in tg3_start_xmit().
+	 * do DMA address check in __tg3_start_xmit().
 	 */
 	if (tg3_flag(tp, IS_5788))
 		persist_dma_mask = dma_mask = DMA_BIT_MASK(32);

base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa
-- 
2.39.3


^ permalink raw reply related

* [syzbot] [net?] general protection fault in ptp_ioctl
From: syzbot @ 2023-11-05 19:10 UTC (permalink / raw)
  To: davem, linux-kernel, netdev, reibax, richardcochran, rrameshbabu,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    4652b8e4f3ff Merge tag '6.7-rc-ksmbd-server-fixes' of git:..
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=11aa125f680000
kernel config:  https://syzkaller.appspot.com/x/.config?x=423e70610024fd6b
dashboard link: https://syzkaller.appspot.com/bug?extid=8a78ecea7ac1a2ea26e5
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16193ef7680000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17e035d7680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a9cb6d5a8c4b/disk-4652b8e4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/363795681962/vmlinux-4652b8e4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/113d96b73fef/bzImage-4652b8e4.xz

The issue was bisected to:

commit c5a445b1e9347b14752b01f1a304bd7a2f260acc
Author: Xabier Marquiegui <reibax@gmail.com>
Date:   Wed Oct 11 22:39:56 2023 +0000

    ptp: support event queue reader channel masks

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=122491ef680000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=112491ef680000
console output: https://syzkaller.appspot.com/x/log.txt?x=162491ef680000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8a78ecea7ac1a2ea26e5@syzkaller.appspotmail.com
Fixes: c5a445b1e934 ("ptp: support event queue reader channel masks")

general protection fault, probably for non-canonical address 0xdffffc000000020b: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x0000000000001058-0x000000000000105f]
CPU: 0 PID: 5053 Comm: syz-executor353 Not tainted 6.6.0-syzkaller-10396-g4652b8e4f3ff #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:ptp_ioctl+0xcb7/0x1d10 drivers/ptp/ptp_chardev.c:476
Code: 81 fe 13 3d 00 00 0f 85 9c 02 00 00 e8 c2 83 23 fa 49 8d bc 24 58 10 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 dc 0e 00 00 49 8b bc 24 58 10 00 00 ba 00 01 00
RSP: 0018:ffffc90003a37ba0 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88814a78a000 RCX: ffffffff8764f81f
RDX: 000000000000020b RSI: ffffffff8765028e RDI: 0000000000001058
RBP: ffffc90003a37ec0 R08: 0000000000000005 R09: ffffc90003a37c40
R10: 0000000000003d13 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc90003a37c80 R14: 0000000000003d13 R15: ffffffff92ac78e8
FS:  00005555569a9380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000040 CR3: 0000000076e09000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 posix_clock_ioctl+0xf8/0x160 kernel/time/posix-clock.c:86
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:871 [inline]
 __se_sys_ioctl fs/ioctl.c:857 [inline]
 __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f710ac4a2a9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffda288c4c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffda288c698 RCX: 00007f710ac4a2a9
RDX: 0000000000000000 RSI: 0000000000003d13 RDI: 0000000000000003
RBP: 00007f710acbd610 R08: 00007ffda288c698 R09: 00007ffda288c698
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffda288c688 R14: 0000000000000001 R15: 0000000000000001
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:ptp_ioctl+0xcb7/0x1d10 drivers/ptp/ptp_chardev.c:476
Code: 81 fe 13 3d 00 00 0f 85 9c 02 00 00 e8 c2 83 23 fa 49 8d bc 24 58 10 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 dc 0e 00 00 49 8b bc 24 58 10 00 00 ba 00 01 00
RSP: 0018:ffffc90003a37ba0 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88814a78a000 RCX: ffffffff8764f81f
RDX: 000000000000020b RSI: ffffffff8765028e RDI: 0000000000001058
RBP: ffffc90003a37ec0 R08: 0000000000000005 R09: ffffc90003a37c40
R10: 0000000000003d13 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc90003a37c80 R14: 0000000000003d13 R15: ffffffff92ac78e8
FS:  00005555569a9380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000005fdeb8 CR3: 0000000076e09000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
   0:	81 fe 13 3d 00 00    	cmp    $0x3d13,%esi
   6:	0f 85 9c 02 00 00    	jne    0x2a8
   c:	e8 c2 83 23 fa       	call   0xfa2383d3
  11:	49 8d bc 24 58 10 00 	lea    0x1058(%r12),%rdi
  18:	00
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
  2e:	0f 85 dc 0e 00 00    	jne    0xf10
  34:	49 8b bc 24 58 10 00 	mov    0x1058(%r12),%rdi
  3b:	00
  3c:	ba                   	.byte 0xba
  3d:	00 01                	add    %al,(%rcx)


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply

* Re: [GIT PULL] vhost,virtio,vdpa: features, fixes, cleanups
From: pr-tracker-bot @ 2023-11-05 19:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Linus Torvalds, xuanzhuo, geert+renesas, kvm, mst, simon.horman,
	netdev, xieyongji, xueshi.hu, pizhenwei, linux-kernel, eperezma,
	leiyang, gregkh, shawn.shao, virtualization, leon
In-Reply-To: <20231105105806-mutt-send-email-mst@kernel.org>

The pull request you sent on Sun, 5 Nov 2023 10:58:06 -0500:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/77fa2fbe87fc605c4bfa87dff87be9bfded0e9a3

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply

* Re: [PATCH v2 1/2] tg3: Increment tx_dropped in tg3_tso_bug()
From: Alex Pakhunov @ 2023-11-05 19:26 UTC (permalink / raw)
  To: michael.chan
  Cc: alexey.pakhunov, linux-kernel, mchan, netdev, prashant,
	siva.kallam, vincent.wong2
In-Reply-To: <CACKFLi=ZLAb1Y92LwvqjOGPCuinka7qbHwDP2pkG4-_a7DMorQ@mail.gmail.com>

> I recommend using per queue counters as briefly mentioned in my
> earlier reply...
> tg3_get_stats64() can just loop and sum all the tx_dropped and
> rx_dropped counters in each tg3_napi struct.  We don't worry about
> locks here since we are just reading.

Got it. So the core idea is to make sure there is a single writer for each
counter which will make updating the counter race-free. It does not keep
reading the counters from multiple queues completely race free, but, I
guess, the assumption is that computing the aggregate counter to be
slightly wrong is acceptable - it will be recomputed correctly next time.

There is still some gotchas on 32 bit machines though. 64 bit reads are not
atomic there, so we have to make the counters 32bit to compensate:

====
@@ -11895,6 +11898,9 @@ static void tg3_get_nstats(struct tg3 *tp, struct rtnl_link_stats64 *stats)
 {
        struct rtnl_link_stats64 *old_stats = &tp->net_stats_prev;
        struct tg3_hw_stats *hw_stats = tp->hw_stats;
+       uintptr_t rx_dropped = 0;
+       uintptr_t tx_dropped = 0;
+       int i;

        stats->rx_packets = old_stats->rx_packets +
                get_stat64(&hw_stats->rx_ucast_packets) +
@@ -11941,8 +11947,27 @@ static void tg3_get_nstats(struct tg3 *tp, struct rtnl_link_stats64 *stats)
        stats->rx_missed_errors = old_stats->rx_missed_errors +
                get_stat64(&hw_stats->rx_discards);

-       stats->rx_dropped = tp->rx_dropped;
-       stats->tx_dropped = tp->tx_dropped;
+       /* Aggregate per-queue counters. Each per-queue counter is updated by
+        * a single writer, race-free. The aggregare counters might be not
+        * completely accurate (if an update happens in the middle of the loop)
+        * but they will be recomputed correctly the next time this function is
+        * called. This avoids explicit synchronization between this function
+        * and tg3_rx()/tg3_start_xmit().
+        **/
+       for (i = 0; i < tp->irq_cnt; i++) {
+               struct tg3_napi *tnapi = &tp->napi[i];
+
+               rx_dropped += tnapi->rx_dropped;
+               tx_dropped += tnapi->tx_dropped;
+       }
+
+       /* Since we are using uintptr_t, these counters wrap around at 4G on
+        * a 32bit machine. This seems like an acceptable price for being
+        * able to read them atomically in the loop above.
+        */
+       stats->rx_dropped = rx_dropped;
+       stats->tx_dropped = tx_dropped;
+
 }
====

An alternative implementation would use atomic64_add to update
tg3::[rt]x_dropped. It would allow the counters to be 64 bit even on 32 bit
machines. The downside is that updating the counter will be slightly more
expensive. There counters are not updated often, so the cost is negligible.
ALthough it also means that preactically speaking we don't care if
the counters are effectively 32 bits wide.

I'll assume you prefer the former implementation for now, but let me know
if this not the case.

> Yes, we can merge patch #2 first which fixes the stall.  Please repost
> just patch #2 standalone if you want to do that.

OK, I posted "[PATCH v3] tg3: Fix the TX ring stall".

Alex.

^ permalink raw reply

* Re: [PATCH net-next v6 0/5] Coalesce mac ocp write/modify calls to reduce spinlock contention
From: Mirsad Todorovac @ 2023-11-05 19:27 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Heiner Kallweit, linux-kernel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, nic_swsd
In-Reply-To: <344fc5c2-4447-4481-843f-9d7720e55a77@lunn.ch>

[-- Attachment #1: Type: text/plain, Size: 10863 bytes --]

On 11/5/23 16:33, Andrew Lunn wrote:

>>>> With about 130 of these sequential calls to r8168_mac_ocp_write() this looks like
>>>> a lock storm that will stall all of the cores and CPUs on the same memory controller
>>>> for certain time I/O takes to finish.

> Please provide benchmark data to show this is a real issue, and the
> patch fixes it.

Certainly, Sir, but I have to figure out what to measure.

To better think of it, I actually do not have a system with a physical RTL8411b, this patch
was made by the finding in the visual inspection of the code.

FWIW, separation of code and data is the design principle that is strongly endorsed lately
and it seems like a good practice that prevents a range of security breaches and attacks:

[1] https://blog.klipse.tech/databook/2020/10/02/separate-code-data.html

[2] Reduce system complexity by separating Code from Data,
     https://livebook.manning.com/book/data-oriented-programming/chapter-2/v-2/

In this case, data is hard-coded.

The resulting code would be smaller in size and execution time, and IMHO more readable,
(in a table), but I will not enter much discussion for you have made your mind already.

>> Additionally, I would like to "inline" many functions, as I think that call/return
>> sequences with stack frame generation /destruction are more expensive than inlining the
>> small one liners.
> 
> Please provide benchmarks to show the compiler is getting this wrong,
> and inline really is needed.

I think I am by now technologically up to that:

"drivers/net/ethernet/realtek/r8169_main.s" 302034 lines
-------------------------------------------------------------------------------------
   7500 r8168_mac_ocp_write:
   7501 .LVL488:
   7502 .LFB6322:
   7503         .loc 1 900 1 is_stmt 1 view -0
   7504         .cfi_startproc
   7505 1:      call    __fentry__
   7506         .section __mcount_loc, "a",@progbits
   7507         .quad 1b
   7508         .previous
   7509         .loc 1 901 2 view .LVU1955
   7510         .loc 1 903 2 view .LVU1956
   7511         .loc 1 903 7 view .LVU1957
   7512         .loc 1 903 10 view .LVU1958
   7513         .loc 1 903 33 view .LVU1959
   7514         .loc 1 903 57 view .LVU1960
   7515         .loc 1 903 88 view .LVU1961
   7516         .loc 1 903 95 view .LVU1962
   7517         .loc 1 900 1 is_stmt 0 view .LVU1963
   7518         pushq   %rbp
   7519         .cfi_def_cfa_offset 16
   7520         .cfi_offset 6, -16
   7521         movq    %rsp, %rbp
   7522         .cfi_def_cfa_register 6
   7523         pushq   %r15
   7524         .cfi_offset 15, -24
   7525         .loc 1 903 103 view .LVU1964
   7526         leaq    6692(%rdi), %r15
   7527         .loc 1 900 1 view .LVU1965
   7528         pushq   %r14
   7529         .cfi_offset 14, -32
   7530         movl    %edx, %r14d
   7531         pushq   %r13
   7532         pushq   %r12
   7533         .cfi_offset 13, -40
   7534         .cfi_offset 12, -48
   7535         movq    %rdi, %r12
   7536         .loc 1 903 103 view .LVU1966
   7537         movq    %r15, %rdi
   7538 .LVL489:
   7539         .loc 1 900 1 view .LVU1967
   7540         pushq   %rbx
   7541         .cfi_offset 3, -56
   7542         .loc 1 900 1 view .LVU1968
   7543         movl    %esi, %ebx
   7544         .loc 1 903 103 view .LVU1969
   7545         call    _raw_spin_lock_irqsave
   7546 .LVL490:
   7547 .LBB3557:
   7548 .LBB3558:
   7549         .loc 1 893 6 view .LVU1970
   7550         movl    %ebx, %edi
   7551 .LBE3558:
   7552 .LBE3557:
   7553         .loc 1 903 103 view .LVU1971
   7554         movq    %rax, %r13
   7555 .LVL491:
   7556         .loc 1 903 5 is_stmt 1 view .LVU1972
   7557         .loc 1 904 2 view .LVU1973
   7558 .LBB3564:
   7559 .LBI3557:
   7560         .loc 1 891 13 view .LVU1974
   7561 .LBB3563:
   7562         .loc 1 893 2 view .LVU1975
   7563         .loc 1 893 6 is_stmt 0 view .LVU1976
   7564         call    rtl_ocp_reg_failure
   7565 .LVL492:
   7566         .loc 1 893 5 view .LVU1977
   7567         testb   %al, %al
   7568         jne     .L375
   7569 .LVL493:
   7570 .LBB3559:
   7571 .LBI3559:
   7572         .loc 1 891 13 is_stmt 1 view .LVU1978
   7573 .LBB3560:
   7574         .loc 1 896 2 view .LVU1979
   7575         .loc 1 896 28 is_stmt 0 view .LVU1980
   7576         sall    $15, %ebx
   7577 .LVL494:
   7578         .loc 1 896 58 view .LVU1981
   7579         movq    (%r12), %rax
   7580         .loc 1 896 35 view .LVU1982
   7581         orl     %r14d, %ebx
   7582         .loc 1 896 2 view .LVU1983
   7583         orl     $-2147483648, %ebx
   7584 .LVL495:
   7585 .LBB3561:
   7586 .LBI3561:
   7587         .loc 2 66 120 is_stmt 1 view .LVU1984
   7588 .LBB3562:
   7589         .loc 2 66 168 view .LVU1985
   7590 #APP
   7591 # 66 "./arch/x86/include/asm/io.h" 1
   7592         movl %ebx,176(%rax)
   7593 # 0 "" 2
   7594 .LVL496:
   7595 #NO_APP
   7596 .L375:
   7597         .loc 2 66 168 is_stmt 0 view .LVU1986
   7598 .LBE3562:
   7599 .LBE3561:
   7600 .LBE3560:
   7601 .LBE3559:
   7602 .LBE3563:
   7603 .LBE3564:
   7604         .loc 1 905 2 is_stmt 1 view .LVU1987
   7605         .loc 1 905 7 view .LVU1988
   7606         .loc 1 905 10 view .LVU1989
   7607         .loc 1 905 33 view .LVU1990
   7608         .loc 1 905 57 view .LVU1991
   7609         .loc 1 905 88 view .LVU1992
   7610         .loc 1 905 95 view .LVU1993
   7611         movq    %r13, %rsi
   7612         movq    %r15, %rdi
   7613         call    _raw_spin_unlock_irqrestore
   7614 .LVL497:
   7615         .loc 1 905 5 view .LVU1994
   7616         .loc 1 906 1 is_stmt 0 view .LVU1995
   7617         popq    %rbx
   7618         .cfi_restore 3
   7619         popq    %r12
   7620         .cfi_restore 12
   7621 .LVL498:
   7622         .loc 1 906 1 view .LVU1996
   7623         popq    %r13
   7624         .cfi_restore 13
   7625 .LVL499:
   7626         .loc 1 906 1 view .LVU1997
   7627         popq    %r14
   7628         .cfi_restore 14
   7629 .LVL500:
   7630         .loc 1 906 1 view .LVU1998
   7631         popq    %r15
   7632         .cfi_restore 15
   7633 .LVL501:
   7634         .loc 1 906 1 view .LVU1999
   7635         popq    %rbp
   7636         .cfi_restore 6
   7637         .cfi_def_cfa 7, 8
   7638         xorl    %eax, %eax
   7639         xorl    %edx, %edx
   7640         xorl    %esi, %esi
   7641         xorl    %edi, %edi
   7642         jmp     __x86_return_thunk
   7643         .cfi_endproc
   7644 .LFE6322:
   7645         .size   r8168_mac_ocp_write, .-r8168_mac_ocp_write
   7646         .p2align 4
   7647         .section        __patchable_function_entries,"awo",@progbits,rtl_eriar_cond_check
   7648         .align 8
   7649         .quad   .LPFE44
   7650         .text
   7651 .LPFE44:
   7652         nop
   7653         nop
   7654         nop
   7655         nop
-------------------------------------------------------------------------------------

The call of the function is the actual call:

-------------------------------------------------------------------------------------
  39334 .LBE11119:
  39335         .loc 1 3112 2 is_stmt 1 view .LVU10399
  39336         xorl    %edx, %edx
  39337         movl    $64556, %esi
  39338         movq    %r13, %rdi
  39339         call    r8168_mac_ocp_write
-------------------------------------------------------------------------------------

The command used for generating the assembly was taken from .o.cmd file and
added -save-temps as the only change:

$ gcc -Wp,-MMD,drivers/net/ethernet/realtek/.r8169_main.o.d -save-temps -nostdinc \
-I./arch/x86/include -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi \
-I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi \
-include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h \
-include ./include/linux/compiler_types.h -D__KERNEL__ -fmacro-prefix-map=./= \
-std=gnu11 -fshort-wchar -funsigned-char -fno-common -fno-PIE -fno-strict-aliasing \
-mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -fcf-protection=none -m64 -falign-jumps=1 \
-falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup \
-mtune=generic -mno-red-zone -mcmodel=kernel -Wno-sign-compare -fno-asynchronous-unwind-tables \
-mindirect-branch=thunk-extern -mindirect-branch-register -mindirect-branch-cs-prefix \
-mfunction-return=thunk-extern -fno-jump-tables -mharden-sls=all -fpatchable-function-entry=16,16 \
-fno-delete-null-pointer-checks -O2 -fno-allow-store-data-races -fstack-protector-strong \
-fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-stack-clash-protection \
-fzero-call-used-regs=used-gpr -pg -mrecord-mcount -mfentry -DCC_USING_FENTRY -falign-functions=16 \
-fno-strict-overflow -fno-stack-check -fconserve-stack -Wall -Wundef -Werror=implicit-function-declaration \
-Werror=implicit-int -Werror=return-type -Werror=strict-prototypes -Wno-format-security -Wno-trigraphs \
-Wno-frame-address -Wno-address-of-packed-member -Wframe-larger-than=1024 -Wno-main \
-Wno-unused-but-set-variable -Wno-unused-const-variable -Wvla -Wno-pointer-sign -Wcast-function-type \
-Wno-array-bounds -Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -Werror=date-time \
-Werror=incompatible-pointer-types -Werror=designated-init -Wenum-conversion -Wno-unused-but-set-variable \
-Wno-unused-const-variable -Wno-restrict -Wno-packed-not-aligned -Wno-format-overflow -Wno-format-truncation \
-Wno-stringop-overflow -Wno-stringop-truncation -Wno-missing-field-initializers -Wno-type-limits \
-Wno-shift-negative-value -Wno-maybe-uninitialized -Wno-sign-compare -g -gdwarf-5  -fsanitize=bounds-strict \
-fsanitize=shift -fsanitize=bool -fsanitize=enum  -DMODULE  -DKBUILD_BASENAME='"r8169_main"' \
-DKBUILD_MODNAME='"r8169"' -D__KBUILD_MODNAME=kmod_r8169 -c -o drivers/net/ethernet/realtek/r8169_main.o \
drivers/net/ethernet/realtek/r8169_main.c   ; ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
--hacks=skylake --retpoline --rethunk --sls --stackval --static-call --uaccess --prefix=16 \
   --module drivers/net/ethernet/realtek/r8169_main.o

This is a build against net-next. Please find the attached config.

RTL_(R|W)(8|16|32) family are obviously macros:

#define RTL_W32(tp, reg, val32)	writel((val32), tp->mmio_addr + (reg))

This is the writel():

   7590 #APP
   7591 # 66 "./arch/x86/include/asm/io.h" 1
   7592         movl %ebx,176(%rax)
   7593 # 0 "" 2
   7594 .LVL496:
   7595 #NO_APP

writel() looks optimal.

Hope this helps.

> Until there are benchmarks: NACK.

That means I've got a Reviewed-by: Jacob Keller and two NACKs.

I am voted out. :-/

I suppose one NACK from a maintainer is sufficient to halt the patch.

Going back to the documentation, the drawing board, and of course, the Source. :-)

Best regards,
Mirsad Todorovac

[-- Attachment #2: net-next-config.xz --]
[-- Type: application/x-xz, Size: 58464 bytes --]

^ permalink raw reply

* [PATCH net v2] netfilter: xt_recent: fix (increase) ipv6 literal buffer length
From: Maciej Żenczykowski @ 2023-11-05 19:56 UTC (permalink / raw)
  To: Maciej Żenczykowski, David S . Miller, Pablo Neira Ayuso,
	Florian Westphal
  Cc: Linux Network Development Mailing List,
	Netfilter Development Mailing List, Jan Engelhardt,
	Patrick McHardy

From: Maciej Żenczykowski <zenczykowski@gmail.com>

in6_pton() supports 'low-32-bit dot-decimal representation'
(this is useful with DNS64/NAT64 networks for example):

  # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:1.2.3.4 > /proc/self/net/xt_recent/DEFAULT
  # cat /proc/self/net/xt_recent/DEFAULT
  src=aaaa:bbbb:cccc:dddd:eeee:ffff:0102:0304 ttl: 0 last_seen: 9733848829 oldest_pkt: 1 9733848829

but the provided buffer is too short:

  # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:255.255.255.255 > /proc/self/net/xt_recent/DEFAULT
  -bash: echo: write error: Invalid argument

Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Patrick McHardy <kaber@trash.net>
Fixes: 079aa88fe717 ("netfilter: xt_recent: IPv6 support")
Signed-off-by: Maciej Żenczykowski <zenczykowski@gmail.com>
---
 net/netfilter/xt_recent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index 7ddb9a78e3fc..ef93e0d3bee0 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -561,7 +561,7 @@ recent_mt_proc_write(struct file *file, const char __user *input,
 {
 	struct recent_table *t = pde_data(file_inode(file));
 	struct recent_entry *e;
-	char buf[sizeof("+b335:1d35:1e55:dead:c0de:1715:5afe:c0de")];
+	char buf[sizeof("+b335:1d35:1e55:dead:c0de:1715:255.255.255.255")];
 	const char *c = buf;
 	union nf_inet_addr addr = {};
 	u_int16_t family;
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related

* Re: [PATCH net] net: xt_recent: fix (increase) ipv6 literal buffer length
From: Maciej Żenczykowski @ 2023-11-05 19:59 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: David S . Miller, Pablo Neira Ayuso, Florian Westphal,
	Linux Network Development Mailing List,
	Netfilter Development Mailing List, Patrick McHardy
In-Reply-To: <1654342n-nn1q-959p-s6r0-3846psss5on6@vanv.qr>

On Sun, Nov 5, 2023 at 12:08 AM Jan Engelhardt <jengelh@inai.de> wrote:
>
>
> On Saturday 2023-11-04 22:00, Maciej Żenczykowski wrote:
> >
> >IPv4 in IPv6 is supported by in6_pton [...]
> >but the provided buffer is too short:
>
> If in6_pton were to support tunnel traffic.. wait that sounds
> unusual, and would require dst to be at least 20 bytes, which the
> function documentation contradicts.
>
> As the RFCs make no precise name proposition
>
>         (IPv6 Text Representation, third alternative,
>         IPv4 "decimal value" of the "four low-order 8-bit pieces")
>
> so let's just call it
>
>         "low-32-bit dot-decimal representation"
>
> which should avoid the tunnel term.

Resent [ https://patchwork.kernel.org/project/netdevbpf/patch/20231105195600.522779-1-maze@google.com/
], hopefully this is better.
Also:
- used your (Jan's) new email in the CC.
- changed net to netfilter in the commit title
(but as it is such a trivial bug fix, it does still feel like it
should go straight into net/main... rather than via netfilter repos)

Cheers,
Maciej

^ permalink raw reply

* Re: [PATCH net-next v6 0/5] Coalesce mac ocp write/modify calls to reduce spinlock contention
From: Andrew Lunn @ 2023-11-05 20:36 UTC (permalink / raw)
  To: Mirsad Todorovac
  Cc: Heiner Kallweit, linux-kernel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, nic_swsd
In-Reply-To: <b9573c0e-3cdb-4444-b8f2-579aa699b2e1@alu.unizg.hr>

> The command used for generating the assembly was taken from .o.cmd file and
> added -save-temps as the only change:

make drivers/net/ethernet/realtek/r8169_main.lst

is simpler. You get the C and the generated assembler listed together.

   Andrew

^ permalink raw reply

* Re: [PATCH net 4/4] net: ethernet: cortina: Handle large frames
From: Linus Walleij @ 2023-11-05 20:56 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	linux-arm-kernel, netdev, linux-kernel
In-Reply-To: <036b481e-ac5b-4e77-b93a-4badaf19e185@lunn.ch>

On Sat, Nov 4, 2023 at 3:57 PM Andrew Lunn <andrew@lunn.ch> wrote:

> > +              * Just bypass on bigger frames.
> > +              */
> > +             word1 |= TSS_BYPASS_BIT;
> > +     } else if (skb->ip_summed != CHECKSUM_NONE) {
>
> I've never looked at how the network stack does checksums. But looking
> at this patch, it made me wounder, how do you tell the stack it needs
> to do a software checksum because the hardware cannot?

I read up on it: the documentation is in
Documentation/networking/checksum-offloads.rst
and in the header for skbuff, include/linux/skbuff.h

Actually we should check for == CHECKSUM_PARTIAL which means
we need to do the checksum (!= CHECKSUM_NONE is not inclusive)
then I call a software fallback directly from the driver if I need to.

> Or for this
> driver, is it always calculating a checksum, which is then ignored?
> Maybe you can improve performance a little but disabling software
> checksum when it is not needed?

The ping was somehow working without proper checksum
before, but I think I'm doing the right thing now, also tested with
HTTP traffic, check out v2.

Thanks for pointing it out, the patch looks way better now.

Yours,
Linus Walleij

^ permalink raw reply

* [PATCH net v2 0/4] Fix large frames in the Gemini ethernet driver
From: Linus Walleij @ 2023-11-05 20:57 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	Andrew Lunn
  Cc: linux-arm-kernel, netdev, linux-kernel, Linus Walleij

This is the result of a bug hunt for a problem with the
RTL8366RB DSA switch leading me wrong all over the place.

I am indebted to Vladimir Oltean who as usual pointed
out where the real problem was, many thanks!

Tryig to actually use big ("jumbo") frames on this
hardware uncovered the real bugs. Then I tested it on
the DSA switch and it indeed fixes the issue.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
Changes in v2:
- Don't check for oversized MTU request: the framework makes sure it doesn't
  happen.
- Drop unrelated BIT() macro cleanups (I might send these later for net-next)
- Use a special error code if the skbuff is too big and fail gracefully
  is this happens.
- Do proper checksum of the frame using a software fallback when the frame
  is too long for hardware checksumming.
- Link to v1: https://lore.kernel.org/r/20231104-gemini-largeframe-fix-v1-0-9c5513f22f33@linaro.org

---
Linus Walleij (4):
      net: ethernet: cortina: Fix MTU max setting
      net: ethernet: cortina: Fix max RX frame define
      net: ethernet: cortina: Protect against oversized frames
      net: ethernet: cortina: Handle large frames

 drivers/net/ethernet/cortina/gemini.c | 39 ++++++++++++++++++++++++++++-------
 drivers/net/ethernet/cortina/gemini.h |  4 ++--
 2 files changed, 34 insertions(+), 9 deletions(-)
---
base-commit: e85fd73c7d9630d392f451fcf69a457c8e3f21dd
change-id: 20231104-gemini-largeframe-fix-c143d2c781b5

Best regards,
-- 
Linus Walleij <linus.walleij@linaro.org>


^ permalink raw reply

* [PATCH net v2 1/4] net: ethernet: cortina: Fix MTU max setting
From: Linus Walleij @ 2023-11-05 20:57 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	Andrew Lunn
  Cc: linux-arm-kernel, netdev, linux-kernel, Linus Walleij
In-Reply-To: <20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org>

The RX max frame size is over 10000 for the Gemini ethernet,
but the TX max frame size is actually just 2047 (0x7ff after
checking the datasheet). Reflect this in what we offer to Linux,
cap the MTU at the TX max frame minus ethernet headers.

Use the BIT() macro for related bit flags so these TX settings
are consistent.

Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 7 ++++---
 drivers/net/ethernet/cortina/gemini.h | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 5423fe26b4ef..ed9701f8ad9a 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -2464,11 +2464,12 @@ static int gemini_ethernet_port_probe(struct platform_device *pdev)
 
 	netdev->hw_features = GMAC_OFFLOAD_FEATURES;
 	netdev->features |= GMAC_OFFLOAD_FEATURES | NETIF_F_GRO;
-	/* We can handle jumbo frames up to 10236 bytes so, let's accept
-	 * payloads of 10236 bytes minus VLAN and ethernet header
+	/* We can receive jumbo frames up to 10236 bytes but only
+	 * transmit 2047 bytes so, let's accept payloads of 2047
+	 * bytes minus VLAN and ethernet header
 	 */
 	netdev->min_mtu = ETH_MIN_MTU;
-	netdev->max_mtu = 10236 - VLAN_ETH_HLEN;
+	netdev->max_mtu = MTU_SIZE_BIT_MASK - VLAN_ETH_HLEN;
 
 	port->freeq_refill = 0;
 	netif_napi_add(netdev, &port->napi, gmac_napi_poll);
diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h
index 9fdf77d5eb37..201b4efe2937 100644
--- a/drivers/net/ethernet/cortina/gemini.h
+++ b/drivers/net/ethernet/cortina/gemini.h
@@ -502,7 +502,7 @@ union gmac_txdesc_3 {
 #define SOF_BIT			0x80000000
 #define EOF_BIT			0x40000000
 #define EOFIE_BIT		BIT(29)
-#define MTU_SIZE_BIT_MASK	0x1fff
+#define MTU_SIZE_BIT_MASK	0x7ff /* Max MTU 2047 bytes */
 
 /* GMAC Tx Descriptor */
 struct gmac_txdesc {

-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 2/4] net: ethernet: cortina: Fix max RX frame define
From: Linus Walleij @ 2023-11-05 20:57 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	Andrew Lunn
  Cc: linux-arm-kernel, netdev, linux-kernel, Linus Walleij
In-Reply-To: <20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org>

Enumerator 3 is 1548 bytes according to the datasheet.
Not 1542.

Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 4 ++--
 drivers/net/ethernet/cortina/gemini.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index ed9701f8ad9a..b21a94b4ab5c 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -432,8 +432,8 @@ static const struct gmac_max_framelen gmac_maxlens[] = {
 		.val = CONFIG0_MAXLEN_1536,
 	},
 	{
-		.max_l3_len = 1542,
-		.val = CONFIG0_MAXLEN_1542,
+		.max_l3_len = 1548,
+		.val = CONFIG0_MAXLEN_1548,
 	},
 	{
 		.max_l3_len = 9212,
diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h
index 201b4efe2937..24bb989981f2 100644
--- a/drivers/net/ethernet/cortina/gemini.h
+++ b/drivers/net/ethernet/cortina/gemini.h
@@ -787,7 +787,7 @@ union gmac_config0 {
 #define  CONFIG0_MAXLEN_1536	0
 #define  CONFIG0_MAXLEN_1518	1
 #define  CONFIG0_MAXLEN_1522	2
-#define  CONFIG0_MAXLEN_1542	3
+#define  CONFIG0_MAXLEN_1548	3
 #define  CONFIG0_MAXLEN_9k	4	/* 9212 */
 #define  CONFIG0_MAXLEN_10k	5	/* 10236 */
 #define  CONFIG0_MAXLEN_1518__6	6

-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 3/4] net: ethernet: cortina: Protect against oversized frames
From: Linus Walleij @ 2023-11-05 20:57 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	Andrew Lunn
  Cc: linux-arm-kernel, netdev, linux-kernel, Linus Walleij
In-Reply-To: <20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org>

The max size of a transfer no matter the MTU is 64KB-1 so immediately
bail out if the skb exceeds that.

The calling site tries to linearize the skbuff on error so return a
special error code -E2BIG to indicate that this will not work in
any way and bail out immediately if this happens.

Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index b21a94b4ab5c..576174a862a9 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -1151,6 +1151,12 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 	if (skb->protocol == htons(ETH_P_8021Q))
 		mtu += VLAN_HLEN;
 
+	if (skb->len > 65535) {
+		/* The field for length is only 16 bits */
+		netdev_err(netdev, "%s: frame too big, max size 65535 bytes\n", __func__);
+		return -E2BIG;
+	}
+
 	word1 = skb->len;
 	word3 = SOF_BIT;
 
@@ -1232,6 +1238,7 @@ static netdev_tx_t gmac_start_xmit(struct sk_buff *skb,
 	struct gmac_txq *txq;
 	int txq_num, nfrags;
 	union dma_rwptr rw;
+	int ret;
 
 	if (skb->len >= 0x10000)
 		goto out_drop_free;
@@ -1269,7 +1276,11 @@ static netdev_tx_t gmac_start_xmit(struct sk_buff *skb,
 		}
 	}
 
-	if (gmac_map_tx_bufs(netdev, skb, txq, &w)) {
+	ret = gmac_map_tx_bufs(netdev, skb, txq, &w);
+	if (ret == -E2BIG)
+		goto out_drop;
+	if (ret) {
+		/* Linearize and retry */
 		if (skb_linearize(skb))
 			goto out_drop;
 

-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 4/4] net: ethernet: cortina: Handle large frames
From: Linus Walleij @ 2023-11-05 20:57 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	Andrew Lunn
  Cc: linux-arm-kernel, netdev, linux-kernel, Linus Walleij
In-Reply-To: <20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org>

The Gemini ethernet controller provides hardware checksumming
for frames up to 1514 bytes including ethernet headers but not
FCS.

If we start sending bigger frames (after first bumping up the MTU
on both interfaces sending and receiveing the frames), truncated
packets start to appear on the target such as in this tcpdump
resulting from ping -s 1474:

23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown),
ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing!
(tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502)
OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482

If we bypass the hardware checksumming and provide a software
fallback, everything starts working fine up to the max TX MTU
of 2047 bytes, for example ping -s2000 192.168.1.2:

00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown),
ethertype IPv4 (0x0800), length 2042:
(tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028)
Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008

The bit enabling to bypass hardware checksum (or any of the
"TSS" bits) are undocumented in the hardware reference manual.
The entire hardware checksum unit appears undocumented. The
conclusion that we need to use the "bypass" bit was found by
trial-and-error.

Since no hardware checksum will happen, we slot in a software
checksum fallback.

Check for the condition where we need to compute checksum on the
skb with either hardware or software using == CHECKSUM_PARTIAL instead
of != CHECKSUM_NONE which is an incomplete check according to
<linux/skbuff.h>.

On the D-Link DIR-685 router this fixes a bug on the conduit
interface to the RTL8366RB DSA switch: as the switch needs to add
space for its tag it increases the MTU on the conduit interface
to 1504 and that means that when the router sends packages
of 1500 bytes these get an extra 4 bytes of DSA tag and the
transfer fails because of the erroneous hardware checksumming,
affecting such basic functionality as the LuCI web interface.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 576174a862a9..84295c1b87e6 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -1145,6 +1145,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 	dma_addr_t mapping;
 	unsigned short mtu;
 	void *buffer;
+	int ret;
 
 	mtu  = ETH_HLEN;
 	mtu += netdev->mtu;
@@ -1165,7 +1166,19 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 		word3 |= mtu;
 	}
 
-	if (skb->ip_summed != CHECKSUM_NONE) {
+	if (skb->len >= ETH_FRAME_LEN) {
+		/* Hardware offloaded checksumming isn't working on frames
+		 * bigger than 1514 bytes. Perhaps the buffer is only 1518
+		 * bytes fitting a normal frame and a checksum?
+		 * Just use software checksumming and bypass on bigger frames.
+		 */
+		if (skb->ip_summed == CHECKSUM_PARTIAL) {
+			ret = skb_checksum_help(skb);
+			if (ret)
+				return ret;
+		}
+		word1 |= TSS_BYPASS_BIT;
+	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		int tcp = 0;
 
 		if (skb->protocol == htons(ETH_P_IP)) {

-- 
2.34.1


^ permalink raw reply related

* [PATCH net] r8169: respect userspace disabling IFF_MULTICAST
From: Heiner Kallweit @ 2023-11-05 22:43 UTC (permalink / raw)
  To: Paolo Abeni, Eric Dumazet, David Miller, Jakub Kicinski,
	Realtek linux nic maintainers
  Cc: netdev@vger.kernel.org

So far we ignore the setting of IFF_MULTICAST. Fix this and clear bit
AcceptMulticast if IFF_MULTICAST isn't set.

Note: Based on the implementations I've seen it doesn't seem to be 100% clear
what a driver is supposed to do if IFF_ALLMULTI is set but IFF_MULTICAST
is not. This patch is based on the understanding that IFF_MULTICAST has
precedence.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 4b8251cdb..0c76c162b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2582,6 +2582,8 @@ static void rtl_set_rx_mode(struct net_device *dev)
 
 	if (dev->flags & IFF_PROMISC) {
 		rx_mode |= AcceptAllPhys;
+	} else if (!(dev->flags & IFF_MULTICAST)) {
+		rx_mode &= ~AcceptMulticast;
 	} else if (netdev_mc_count(dev) > MC_FILTER_LIMIT ||
 		   dev->flags & IFF_ALLMULTI ||
 		   tp->mac_version == RTL_GIGA_MAC_VER_35 ||
-- 
2.42.1


^ permalink raw reply related

* Re: [PATCH net-next v10 12/13] net:ethernet:realtek: Update the Makefile and Kconfig in the realtek folder
From: kernel test robot @ 2023-11-05 22:57 UTC (permalink / raw)
  To: Justin Lai, kuba
  Cc: llvm, oe-kbuild-all, davem, edumazet, pabeni, linux-kernel,
	netdev, andrew, pkshih, larry.chiu, Justin Lai
In-Reply-To: <20231102154505.940783-13-justinlai0215@realtek.com>

Hi Justin,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Justin-Lai/net-ethernet-realtek-rtase-Add-pci-table-supported-in-this-module/20231103-032946
base:   net-next/main
patch link:    https://lore.kernel.org/r/20231102154505.940783-13-justinlai0215%40realtek.com
patch subject: [PATCH net-next v10 12/13] net:ethernet:realtek: Update the Makefile and Kconfig in the realtek folder
config: powerpc64-allyesconfig (https://download.01.org/0day-ci/archive/20231106/202311060633.814zKJdK-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231106/202311060633.814zKJdK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311060633.814zKJdK-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/net/ethernet/realtek/rtase/rtase_main.c:68:10: fatal error: 'net/page_pool.h' file not found
      68 | #include <net/page_pool.h>
         |          ^~~~~~~~~~~~~~~~~
   1 error generated.


vim +68 drivers/net/ethernet/realtek/rtase/rtase_main.c

db2657d0fa3a98 Justin Lai 2023-11-02 @68  #include <net/page_pool.h>
db2657d0fa3a98 Justin Lai 2023-11-02  69  #include <net/pkt_cls.h>
db2657d0fa3a98 Justin Lai 2023-11-02  70  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* Re: [PATCH net v2 1/4] net: ethernet: cortina: Fix MTU max setting
From: Andrew Lunn @ 2023-11-05 23:39 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Vladimir Oltean,
	linux-arm-kernel, netdev, linux-kernel
In-Reply-To: <20231105-gemini-largeframe-fix-v2-1-cd3a5aa6c496@linaro.org>

On Sun, Nov 05, 2023 at 09:57:23PM +0100, Linus Walleij wrote:
> The RX max frame size is over 10000 for the Gemini ethernet,
> but the TX max frame size is actually just 2047 (0x7ff after
> checking the datasheet). Reflect this in what we offer to Linux,
> cap the MTU at the TX max frame minus ethernet headers.
> 
> Use the BIT() macro for related bit flags so these TX settings
> are consistent.
> 
> Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* [PATCH iproute2] Revert "Makefile: ensure CONF_USR_DIR honours the libdir config"
From: luca.boccassi @ 2023-11-06  0:14 UTC (permalink / raw)
  To: netdev; +Cc: stephen, aclaudi

From: Luca Boccassi <bluca@debian.org>

LIBDIR in Debian and derivatives is not /usr/lib/, it's
/usr/lib/<architecture triplet>/, which is different, and it's the
wrong location where to install architecture-independent default
configuration files, which should always go to /usr/lib/ instead.
Installing these files to the per-architecture directory is not
the right thing, hence revert the change.

This reverts commit 946753a4459bd035132a27bb2eb87529c1979b90.

Signed-off-by: Luca Boccassi <bluca@debian.org>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 54539ce4..7d1819ce 100644
--- a/Makefile
+++ b/Makefile
@@ -17,7 +17,7 @@ endif
 PREFIX?=/usr
 SBINDIR?=/sbin
 CONF_ETC_DIR?=/etc/iproute2
-CONF_USR_DIR?=$(LIBDIR)/iproute2
+CONF_USR_DIR?=$(PREFIX)/lib/iproute2
 NETNS_RUN_DIR?=/var/run/netns
 NETNS_ETC_DIR?=/etc/netns
 DATADIR?=$(PREFIX)/share
-- 
2.39.2

^ permalink raw reply related

* [PATCH] Fixes a null pointer dereference in ptp_ioctl
From: Yuran Pereira @ 2023-11-06  0:48 UTC (permalink / raw)
  To: richardcochran, netdev
  Cc: Yuran Pereira, linux-kernel, linux-kernel-mentees,
	syzbot+8a78ecea7ac1a2ea26e5

Syzkaller found a null pointer dereference in ptp_ioctl
originating from the lack of a null check for tsevq.

```
general protection fault, probably for non-canonical
	address 0xdffffc000000020b: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range
	[0x0000000000001058-0x000000000000105f]
CPU: 0 PID: 5053 Comm: syz-executor353 Not tainted
	6.6.0-syzkaller-10396-g4652b8e4f3ff #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
	BIOS Google 10/09/2023
RIP: 0010:ptp_ioctl+0xcb7/0x1d10 drivers/ptp/ptp_chardev.c:476
...
Call Trace:
 <TASK>
 posix_clock_ioctl+0xf8/0x160 kernel/time/posix-clock.c:86
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:871 [inline]
 __se_sys_ioctl fs/ioctl.c:857 [inline]
 __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b
```

This patch fixes the issue by adding a check for tsevq and
ensuring ptp_ioctl returns with an error if tsevq is null.

Dashboard link: https://syzkaller.appspot.com/bug?extid=8a78ecea7ac1a2ea26e5
Reported-by: syzbot+8a78ecea7ac1a2ea26e5@syzkaller.appspotmail.com
Fixes: c5a445b1e934 ("ptp: support event queue reader channel masks")

Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
---
 drivers/ptp/ptp_chardev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index 282cd7d24077..5b36c34629a0 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -173,6 +173,8 @@ long ptp_ioctl(struct posix_clock_context *pccontext, unsigned int cmd,
 	int enable, err = 0;
 
 	tsevq = pccontext->private_clkdata;
+	if (!tsevq)
+		return -EINVAL;
 
 	switch (cmd) {
 
-- 
2.25.1


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox