Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next v1 1/4] net: phy: broadcom: Add PTP support for some Broadcom PHYs.
From: Richard Cochran @ 2022-04-26  2:53 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: f.fainelli, bcm-kernel-feedback-list, andrew, hkallweit1, linux,
	netdev, kernel-team
In-Reply-To: <20220425235540.vuacu26xb6bzpxob@bsd-mbp.dhcp.thefacebook.com>

On Mon, Apr 25, 2022 at 04:55:40PM -0700, Jonathan Lemon wrote:

> We could just have a chip-specific version of this function.  The
> recovered timestamp is passed back in a structure, so the rest of the
> code would be unchanged.

Yeah, but it means that I'll have to check each and every bit of every
register to see what other random changes are there...

> Jonathan    (no, not volunteering to do this...)

For now, just get your chip merged, and then the next chip's driver
will refactor/reuse as needed.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH memcg v3] net: set proper memcg for net_init hooks allocations
From: Roman Gushchin @ 2022-04-26  2:50 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Vlastimil Babka, Shakeel Butt, kernel, Florian Westphal,
	linux-kernel, Michal Hocko, cgroups, netdev, David S. Miller,
	Jakub Kicinski, Paolo Abeni
In-Reply-To: <c2f0139a-62e2-5985-34e9-d42faac81960@openvz.org>

On Mon, Apr 25, 2022 at 01:56:02PM +0300, Vasily Averin wrote:

Hello, Vasily!

> __register_pernet_operations() executes init hook of registered
> pernet_operation structure in all existing net namespaces.
> 
> Typically, these hooks are called by a process associated with
> the specified net namespace, and all __GFP_ACCOUNTING marked
> allocation are accounted for corresponding container/memcg.

__GFP_ACCOUNT

> 
> However __register_pernet_operations() calls the hooks in the same
> context, and as a result all marked allocations are accounted
> to one memcg for all processed net namespaces.
> 
> This patch adjusts active memcg for each net namespace and helps
> to account memory allocated inside ops_init() into the proper memcg.
> 
> Signed-off-by: Vasily Averin <vvs@openvz.org>
> ---
> v3: put_net_memcg() replaced by an alreay existing mem_cgroup_put()
>     It checks memcg before accessing it, this is required for
>     __register_pernet_operations() called before memcg initialization.
>     Additionally fixed leading whitespaces in non-memcg_kmem version
>     of mem_cgroup_from_obj().
> 
> v2: introduced get/put_net_memcg(),
>     new functions are moved under CONFIG_MEMCG_KMEM
>     to fix compilation issues reported by Intel's kernel test robot
> 
> v1: introduced get_mem_cgroup_from_kmem(), which takes the refcount
>     for the found memcg, suggested by Shakeel
> ---
>  include/linux/memcontrol.h | 29 ++++++++++++++++++++++++++++-
>  net/core/net_namespace.c   |  7 +++++++
>  2 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0abbd685703b..cfb68a3f7015 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1714,6 +1714,29 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
>  
>  struct mem_cgroup *mem_cgroup_from_obj(void *p);
>  
> +static inline struct mem_cgroup *get_mem_cgroup_from_kmem(void *p)
> +{
> +	struct mem_cgroup *memcg;
> +
> +	rcu_read_lock();
> +	do {
> +		memcg = mem_cgroup_from_obj(p);
> +	} while (memcg && !css_tryget(&memcg->css));
> +	rcu_read_unlock();
> +	return memcg;
> +}

Please, rename it to get_mem_cgroup_from_obj() for consistency.

> +
> +static inline struct mem_cgroup *get_net_memcg(void *p)
> +{
> +	struct mem_cgroup *memcg;
> +
> +	memcg = get_mem_cgroup_from_kmem(p);
> +
> +	if (!memcg)
> +		memcg = root_mem_cgroup;
> +
> +	return memcg;
> +}

I'm not a fan of this helper: it has nothing to do with the networking,
actually it's a wrapper of get_mem_cgroup_from_kmem() replacing NULL
with root_mem_cgroup.

Overall the handling of root_mem_cgroup is very messy, I don't blame
this patch. But I wonder if it's better to simple move this code
to the call site without introducing a new function?

Alternatively, you can introduce something like:
struct mem_cgroup *mem_cgroup_or_root(struct mem_cgroup *memcg)
{
	return memcg ? memcg : root_mem_cgroup;
}

>  #else
>  static inline bool mem_cgroup_kmem_disabled(void)
>  {
> @@ -1763,9 +1786,13 @@ static inline void memcg_put_cache_ids(void)
>  
>  static inline struct mem_cgroup *mem_cgroup_from_obj(void *p)
>  {
> -       return NULL;
> +	return NULL;
>  }
>  
> +static inline struct mem_cgroup *get_net_memcg(void *p)
> +{
> +	return NULL;
> +}
>  #endif /* CONFIG_MEMCG_KMEM */
>  
>  #endif /* _LINUX_MEMCONTROL_H */
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index a5b5bb99c644..3093b4d5b2b9 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -26,6 +26,7 @@
>  #include <net/net_namespace.h>
>  #include <net/netns/generic.h>
>  
> +#include <linux/sched/mm.h>
>  /*
>   *	Our network namespace constructor/destructor lists
>   */
> @@ -1147,7 +1148,13 @@ static int __register_pernet_operations(struct list_head *list,
>  		 * setup_net() and cleanup_net() are not possible.
>  		 */
>  		for_each_net(net) {
> +			struct mem_cgroup *old, *memcg;
> +
> +			memcg = get_net_memcg(net);
> +			old = set_active_memcg(memcg);
>  			error = ops_init(ops, net);
> +			set_active_memcg(old);
> +			mem_cgroup_put(memcg);
>  			if (error)
>  				goto out_undo;
>  			list_add_tail(&net->exit_list, &net_exit_list);
> -- 
> 2.31.1
> 

^ permalink raw reply

* Re: [PATCH net-next v1 1/4] net: phy: broadcom: Add PTP support for some Broadcom PHYs.
From: Richard Cochran @ 2022-04-26  2:49 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: f.fainelli, bcm-kernel-feedback-list, andrew, hkallweit1, linux,
	netdev, kernel-team
In-Reply-To: <20220425233043.q5335cvto5c6zcck@bsd-mbp.dhcp.thefacebook.com>

On Mon, Apr 25, 2022 at 04:30:43PM -0700, Jonathan Lemon wrote:
> On Sun, Apr 24, 2022 at 06:19:31PM -0700, Richard Cochran wrote:
> > On Sat, Apr 23, 2022 at 07:23:53PM -0700, Jonathan Lemon wrote:
> > 
> > > +static bool bcm_ptp_rxtstamp(struct mii_timestamper *mii_ts,
> > > +			     struct sk_buff *skb, int type)
> > > +{
> > > +	struct bcm_ptp_private *priv = mii2priv(mii_ts);
> > > +	struct skb_shared_hwtstamps *hwts;
> > > +	struct ptp_header *header;
> > > +	u32 sec, nsec;
> > > +	u8 *data;
> > > +
> > > +	if (!priv->hwts_rx)
> > > +		return false;
> > > +
> > > +	header = ptp_parse_header(skb, type);
> > > +	if (!header)
> > > +		return false;
> > > +
> > > +	data = (u8 *)(header + 1);
> > 
> > No need to pointer math, as ptp_header already has reserved1 and reserved2.
> > 
> > > +	sec = get_unaligned_be32(data);
> > 
> > Something is missing here.  The seconds field is only four bits, so
> > the code needs to read the 80 bit counter once in a while and augment
> > the time stamp with the upper bits.
> 
> The BCM chip inserts a 64-bit sec.nsec RX timestamp immediately after
> the PTP header.  So I'm recovering it here.  I'll also update the patch
> to memmove() the tail of the skb up in order to remove it, just in case
> it makes a difference.

Okay, this is something different.  This won't work because that
corrupts the PTP message format.

I'm talking about a different mode where the PHY places the time stamp
into the reserved1/2 fields.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH 4.19 000/338] 4.19.238-rc1 review
From: NeilBrown @ 2022-04-26  2:29 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Greg Kroah-Hartman, linux-kernel, stable, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, slade, Netdev, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Trond Myklebust, linux-nfs, Anna Schumaker
In-Reply-To: <CA+G9fYscMP+DTzaQGw1p-KxyhPi0JB64ABDu_aNSU0r+_VgBHg@mail.gmail.com>

On Thu, 21 Apr 2022, Naresh Kamboju wrote:
> On Mon, 18 Apr 2022 at 14:09, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > On Thu, 14 Apr 2022 at 18:45, Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > This is the start of the stable review cycle for the 4.19.238 release.
> > > There are 338 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Sat, 16 Apr 2022 11:07:54 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > >         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.238-rc1.gz
> > > or in the git tree and branch at:
> > >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> >
> > Following kernel warning noticed on arm64 Juno-r2 while booting
> > stable-rc 4.19.238. Here is the full test log link [1].
> >
> > [    0.000000] Booting Linux on physical CPU 0x0000000100 [0x410fd033]
> > [    0.000000] Linux version 4.19.238 (tuxmake@tuxmake) (gcc version
> > 11.2.0 (Debian 11.2.0-18)) #1 SMP PREEMPT @1650206156
> > [    0.000000] Machine model: ARM Juno development board (r2)
> > <trim>
> > [   18.499895] ================================
> > [   18.504172] WARNING: inconsistent lock state
> > [   18.508451] 4.19.238 #1 Not tainted
> > [   18.511944] --------------------------------
> > [   18.516222] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> > [   18.522242] kworker/u12:3/60 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [   18.527826] (____ptrval____)
> > (&(&xprt->transport_lock)->rlock){+.?.}, at: xprt_destroy+0x70/0xe0
> > [   18.536648] {IN-SOFTIRQ-W} state was registered at:
> > [   18.541543]   lock_acquire+0xc8/0x23c

Prior to Linux 5.3, ->transport_lock needs spin_lock_bh() and 
spin_unlock_bh().

Thanks,
NeilBrown

^ permalink raw reply

* Re: [PATCH iproute2-next 1/3] libbpf: Use bpf_object__load instead of bpf_object__load_xattr
From: Hangbin Liu @ 2022-04-26  2:29 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, stephen, toke, Paul Chaignon
In-Reply-To: <cde7f66a-7a23-0d98-d5fe-04ecc8cd5f6b@kernel.org>

On Sun, Apr 24, 2022 at 10:10:44AM -0600, David Ahern wrote:
> On 4/23/22 7:56 PM, Hangbin Liu wrote:
> > Hi David,
> > 
> > This patch revert c04e45d0 lib/bpf: fix verbose flag when using libbpf,
> > Should we set prog->log_level directly before it loaded, like
> > bpf_program__set_log_level() does?
> > 
> 
> that API is new - Dec 2021 so it will not be present across relevant
> libbpf versions. Detecting what exists in a libbpf version and adding
> compat wrappers needs to be added. That's an undertaking I do not have
> time for at the moment. If you or someone else does it would be appreciated.
> 
Ah, yes, I forgot we can't set prog log_level directly. Looks we need to
add a new flag similar with HAVE_LIBBPF_SECTION_NAME... It's really a pain to
add more flags for libbfp...

Hi Paul, will you do that?

Thanks
Hangbin


^ permalink raw reply

* Re: [PATCH net v3] virtio_net: fix wrong buf address calculation when using xdp
From: Jason Wang @ 2022-04-26  2:22 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Jakub Kicinski, davem, stable, Xuan Zhuo, Daniel Borkmann,
	Michael S. Tsirkin, virtualization
In-Reply-To: <20220425103703.3067292-1-razor@blackwall.org>

On Mon, Apr 25, 2022 at 6:38 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> We received a report[1] of kernel crashes when Cilium is used in XDP
> mode with virtio_net after updating to newer kernels. After
> investigating the reason it turned out that when using mergeable bufs
> with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
> calculates the build_skb address wrong because the offset can become less
> than the headroom so it gets the address of the previous page (-X bytes
> depending on how lower offset is):
>  page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
>
> This is a pr_err() I added in the beginning of page_to_skb which clearly
> shows offset that is less than headroom by adding 4 bytes of metadata
> via an xdp prog. The calculations done are:
>  receive_mergeable():
>  headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
>  offset = xdp.data - page_address(xdp_page) -
>           vi->hdr_len - metasize;
>
>  page_to_skb():
>  p = page_address(page) + offset;
>  ...
>  buf = p - headroom;
>
> Now buf goes -4 bytes from the page's starting address as can be seen
> above which is set as skb->head and skb->data by build_skb later. Depending
> on what's done with the skb (when it's freed most often) we get all kinds
> of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
> the new headroom after the xdp program has run, similar to how offset
> and len are recalculated. Headroom is directly related to
> data_hard_start, data and data_meta, so we use them to get the new size.
> The result is correct (similar pr_err() in page_to_skb, one case of
> xdp_page and one case of virtnet buf):
>  a) Case with 4 bytes of metadata
>  [  115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
>  [  121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
>  b) Case of pushing data +32 bytes
>  [  153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
>  [  158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
>  c) Case of pushing data -33 bytes
>  [  835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
>  [  840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
>
> Offset and headroom are equal because offset points to the start of
> reserved bytes for the virtio_net header which are at buf start +
> headroom, while data points at buf start + vnet hdr size + headroom so
> when data or data_meta are adjusted by the xdp prog both the headroom size
> and the offset change equally. We can use data_hard_start to compute the
> new headroom after the xdp prog (linearized / page start case, the
> virtnet buf case is similar just with bigger base offset):
>  xdp.data_hard_start = page_address + vnet_hdr
>  xdp.data = page_address + vnet_hdr + headroom
>  new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize
>
> An example reproducer xdp prog[3] is below.
>
> [1] https://github.com/cilium/cilium/issues/19453
>
> [2] Two of the many traces:
>  [   40.437400] BUG: Bad page state in process swapper/0  pfn:14940
>  [   40.916726] BUG: Bad page state in process systemd-resolve  pfn:053b7
>  [   41.300891] kernel BUG at include/linux/mm.h:720!
>  [   41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>  [   41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G    B   W         5.18.0-rc1+ #37
>  [   41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
>  [   41.306018] RIP: 0010:page_frag_free+0x79/0xe0
>  [   41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
>  [   41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
>  [   41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
>  [   41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
>  [   41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
>  [   41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
>  [   41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
>  [   41.317700] FS:  00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
>  [   41.319150] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [   41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
>  [   41.321387] Call Trace:
>  [   41.321819]  <TASK>
>  [   41.322193]  skb_release_data+0x13f/0x1c0
>  [   41.322902]  __kfree_skb+0x20/0x30
>  [   41.343870]  tcp_recvmsg_locked+0x671/0x880
>  [   41.363764]  tcp_recvmsg+0x5e/0x1c0
>  [   41.384102]  inet_recvmsg+0x42/0x100
>  [   41.406783]  ? sock_recvmsg+0x1d/0x70
>  [   41.428201]  sock_read_iter+0x84/0xd0
>  [   41.445592]  ? 0xffffffffa3000000
>  [   41.462442]  new_sync_read+0x148/0x160
>  [   41.479314]  ? 0xffffffffa3000000
>  [   41.496937]  vfs_read+0x138/0x190
>  [   41.517198]  ksys_read+0x87/0xc0
>  [   41.535336]  do_syscall_64+0x3b/0x90
>  [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>  [   41.568050] RIP: 0033:0x48765b
>  [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
>  [   41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
>  [   41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
>  [   41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
>  [   41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
>  [   41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
>  [   41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
>  [   41.744254]  </TASK>
>  [   41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
>
>  and
>
>  [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
>  [   33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
>  [   33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
>  [   33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
>  [   33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
>  [   33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
>  [   33.532471] page dumped because: nonzero mapcount
>  [   33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
>  [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
>  [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
>  [   33.532484] Call Trace:
>  [   33.532496]  <TASK>
>  [   33.532500]  dump_stack_lvl+0x45/0x5a
>  [   33.532506]  bad_page.cold+0x63/0x94
>  [   33.532510]  free_pcp_prepare+0x290/0x420
>  [   33.532515]  free_unref_page+0x1b/0x100
>  [   33.532518]  skb_release_data+0x13f/0x1c0
>  [   33.532524]  kfree_skb_reason+0x3e/0xc0
>  [   33.532527]  ip6_mc_input+0x23c/0x2b0
>  [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
>  [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0
>
> [3] XDP program to reproduce(xdp_pass.c):
>  #include <linux/bpf.h>
>  #include <bpf/bpf_helpers.h>
>
>  SEC("xdp_pass")
>  int xdp_pkt_pass(struct xdp_md *ctx)
>  {
>           bpf_xdp_adjust_head(ctx, -(int)32);
>           return XDP_PASS;
>  }
>
>  char _license[] SEC("license") = "GPL";
>
>  compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
>  load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
>
> CC: stable@vger.kernel.org
> CC: Jason Wang <jasowang@redhat.com>
> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> CC: Daniel Borkmann <daniel@iogearbox.net>
> CC: "Michael S. Tsirkin" <mst@redhat.com>
> CC: virtualization@lists.linux-foundation.org
> Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
> ---
> v3: Add a comment explaining why offset and headroom are equal,
>     no code changes
> v2: Recalculate headroom based on data, data_hard_start and data_meta

Acked-by: Jason Wang <jasowang@redhat.com>

>
>  drivers/net/virtio_net.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 87838cbe38cf..cbba9d2e8f32 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1005,6 +1005,24 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>                          * xdp.data_meta were adjusted
>                          */
>                         len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
> +
> +                       /* recalculate headroom if xdp.data or xdp_data_meta
> +                        * were adjusted, note that offset should always point
> +                        * to the start of the reserved bytes for virtio_net
> +                        * header which are followed by xdp.data, that means
> +                        * that offset is equal to the headroom (when buf is
> +                        * starting at the beginning of the page, otherwise
> +                        * there is a base offset inside the page) but it's used
> +                        * with a different starting point (buf start) than
> +                        * xdp.data (buf start + vnet hdr size). If xdp.data or
> +                        * data_meta were adjusted by the xdp prog then the
> +                        * headroom size has changed and so has the offset, we
> +                        * can use data_hard_start, which points at buf start +
> +                        * vnet hdr size, to calculate the new headroom and use
> +                        * it later to compute buf start in page_to_skb()
> +                        */
> +                       headroom = xdp.data - xdp.data_hard_start - metasize;
> +
>                         /* We can only create skb based on xdp_page. */
>                         if (unlikely(xdp_page != page)) {
>                                 rcu_read_unlock();
> @@ -1012,7 +1030,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>                                 head_skb = page_to_skb(vi, rq, xdp_page, offset,
>                                                        len, PAGE_SIZE, false,
>                                                        metasize,
> -                                                      VIRTIO_XDP_HEADROOM);
> +                                                      headroom);
>                                 return head_skb;
>                         }
>                         break;
> --
> 2.35.1
>


^ permalink raw reply

* Re: [PATCH iproute2-next v2 2/2] f_flower: Check args with num_of_vlans
From: David Ahern @ 2022-04-26  2:11 UTC (permalink / raw)
  To: Boris Sukholitko
  Cc: netdev, David S . Miller, Jakub Kicinski, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, Gustavo A . R . Silva, Vladimir Oltean,
	Eric Dumazet, zhang kai, Yoshiki Komachi, Ilya Lifshits
In-Reply-To: <20220412100343.27387-3-boris.sukholitko@broadcom.com>

On Tue, Apr 12, 2022 at 01:03:43PM +0300, Boris Sukholitko wrote:
> Having more than one vlan allows matching on the vlan tag parameters.
> This patch changes vlan key validation to take number of vlan tags into
> account.
> 
> Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
> ---
>  tc/f_flower.c | 41 +++++++++++++++++++++++------------------
>  1 file changed, 23 insertions(+), 18 deletions(-)
> 

does not apply cleanly to iproute2-next

^ permalink raw reply

* [syzbot] WARNING: suspicious RCU usage in hsr_node_get_first
From: syzbot @ 2022-04-26  2:08 UTC (permalink / raw)
  To: claudiajkang, davem, ennoerlangen, george.mccollister, kuba,
	linux-kernel, netdev, pabeni, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    22da5264abf4 Merge tag '5.18-rc3-ksmbd-fixes' of git://git..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=177ed38af00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=71bf5c8488a4e33a
dashboard link: https://syzkaller.appspot.com/bug?extid=d4de7030f60c07837e60
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15200242f00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13437c44f00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d4de7030f60c07837e60@syzkaller.appspotmail.com

netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
netlink: 'syz-executor156': attribute type 2 has an invalid length.
netlink: 194488 bytes leftover after parsing attributes in process `syz-executor156'.
=============================
WARNING: suspicious RCU usage
5.18.0-rc3-syzkaller-00235-g22da5264abf4 #0 Not tainted
-----------------------------
net/hsr/hsr_framereg.c:41 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syz-executor156/3589:
 #0: ffffffff8d5ff210 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40 net/netlink/genetlink.c:802
 #1: ffffffff8d5ff2c8 (genl_mutex){+.+.}-{3:3}, at: genl_lock net/netlink/genetlink.c:33 [inline]
 #1: ffffffff8d5ff2c8 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x3e0/0x580 net/netlink/genetlink.c:790
 #2: ffffffff8bd820e0 (rcu_read_lock){....}-{1:2}, at: hsr_get_node_list+0xcb/0xa80 net/hsr/hsr_netlink.c:425

stack backtrace:
CPU: 1 PID: 3589 Comm: syz-executor156 Not tainted 5.18.0-rc3-syzkaller-00235-g22da5264abf4 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 hsr_node_get_first+0xc7/0xe0 net/hsr/hsr_framereg.c:41
 hsr_get_next_node+0x1d0/0x320 net/hsr/hsr_framereg.c:608
 hsr_get_node_list+0x333/0xa80 net/hsr/hsr_netlink.c:461
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
 genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2503
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:725
 ____sys_sendmsg+0x6e2/0x800 net/socket.c:2413
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fb779cfc8d9
Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe75a56d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe75a56d78 RCX: 00007fb779cfc8d9
RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000003
RBP: 0000000000000003 R08: bb1414ac00000000 R09: bb1414ac00000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe75a56d80
R13: 00007ffe75a56d74 R14: 0000000000000003 R15: 0000000000000000
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* RE: [PATCH net-next v9 2/2] net: ethernet: Add driver for Sunplus SP7021
From: Wells Lu 呂芳騰 @ 2022-04-26  1:37 UTC (permalink / raw)
  To: Stephen Hemminger, Wells Lu
  Cc: davem@davemloft.net, kuba@kernel.org, robh+dt@kernel.org,
	netdev@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, p.zabel@pengutronix.de,
	pabeni@redhat.com, krzk+dt@kernel.org, roopa@nvidia.com,
	andrew@lunn.ch, edumazet@google.com
In-Reply-To: <20220425092446.477bd8f5@hermes.local>

Hi Stephen,


> > diff --git a/drivers/net/ethernet/sunplus/spl2sw_driver.h
> > b/drivers/net/ethernet/sunplus/spl2sw_driver.h
> > new file mode 100644
> > index 000000000..5f177b3af
> > --- /dev/null
> > +++ b/drivers/net/ethernet/sunplus/spl2sw_driver.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Copyright Sunplus Technology Co., Ltd.
> > + *       All rights reserved.
> > + */
> > +
> > +#ifndef __SPL2SW_DRIVER_H__
> > +#define __SPL2SW_DRIVER_H__
> > +
> > +#define SPL2SW_RX_NAPI_WEIGHT	16
> > +#define SPL2SW_TX_NAPI_WEIGHT	16
> 
> Why define your own? there is NAPI_POLL_WEIGHT already defined in netdevice.h

I didn't know there is NAPI_POLL_WEIGHT defined in netdevice.h.
I'll remove my own define and use it next patch.

Thank you for your review.


Best regards,
Wells

^ permalink raw reply

* Re: [PATCH net-next v6 02/13] net: wwan: t7xx: Add control DMA interface
From: Sergey Ryazanov @ 2022-04-26  0:19 UTC (permalink / raw)
  To: Ricardo Martinez, Loic Poulain
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, M Chetan Kumar, chandrashekar.devegowda,
	Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-3-ricardo.martinez@linux.intel.com>

Hello Ricardo, Loic, Ilpo,

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> ...
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

This line with "From a WWAN framework perspective" looks confusing to
me. Anyone not familiar with all of the iterations will be in doubt as
to whether it belongs only to Loic's review or to both of them.

How about to format this block like this:

> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> (WWAN framework)
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

or like this:

> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> # WWAN framework
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Parentheses vs. comment sign. I saw people use both of these formats,
I just do not know which is better. What do you think?

--
Sergey

^ permalink raw reply

* Re: [PATCH bpf-next v6 5/6] bpf: Add selftests for raw syncookie helpers
From: Alexei Starovoitov @ 2022-04-26  0:12 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev,
	Tariq Toukan, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S. Miller, Jakub Kicinski,
	Petar Penkov, Lorenz Bauer, Eric Dumazet, Hideaki YOSHIFUJI,
	David Ahern, Shuah Khan, Jesper Dangaard Brouer,
	Nathan Chancellor, Nick Desaulniers, Joe Stringer, Florent Revest,
	linux-kselftest, Toke Høiland-Jørgensen,
	Kumar Kartikeya Dwivedi, Florian Westphal, pabeni
In-Reply-To: <20220422172422.4037988-6-maximmi@nvidia.com>

On Fri, Apr 22, 2022 at 08:24:21PM +0300, Maxim Mikityanskiy wrote:
> +void test_xdp_synproxy(void)
> +{
> +	int server_fd = -1, client_fd = -1, accept_fd = -1;
> +	struct nstoken *ns = NULL;
> +	FILE *ctrl_file = NULL;
> +	char buf[1024];
> +	size_t size;
> +
> +	SYS("ip netns add synproxy");
> +
> +	SYS("ip link add tmp0 type veth peer name tmp1");
> +	SYS("ip link set tmp1 netns synproxy");
> +	SYS("ip link set tmp0 up");
> +	SYS("ip addr replace 198.18.0.1/24 dev tmp0");
> +
> +	// When checksum offload is enabled, the XDP program sees wrong
> +	// checksums and drops packets.
> +	SYS("ethtool -K tmp0 tx off");

BPF CI image doesn't have ethtool installed.
It will take some time to get it updated. Until then we cannot land the patch set.
Can you think of a way to run this test without shelling to ethtool?

^ permalink raw reply

* Re: [PATCH bpf-next v2] bpf: use bpf_prog_run_array_cg_flags everywhere
From: patchwork-bot+netdevbpf @ 2022-04-26  0:10 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii, kafai
In-Reply-To: <20220425220448.3669032-1-sdf@google.com>

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 25 Apr 2022 15:04:48 -0700 you wrote:
> Rename bpf_prog_run_array_cg_flags to bpf_prog_run_array_cg and
> use it everywhere. check_return_code already enforces sane
> return ranges for all cgroup types. (only egress and bind hooks have
> uncanonical return ranges, the rest is using [0, 1])
> 
> No functional changes.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v2] bpf: use bpf_prog_run_array_cg_flags everywhere
    https://git.kernel.org/bpf/bpf-next/c/d9d31cf88702

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v6 13/13] net: wwan: t7xx: Add maintainers and documentation
From: Sergey Ryazanov @ 2022-04-25 23:55 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-14-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Adds maintainers and documentation for MediaTek t7xx 5G WWAN modem
> device driver.
>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v1 1/4] net: phy: broadcom: Add PTP support for some Broadcom PHYs.
From: Jonathan Lemon @ 2022-04-25 23:55 UTC (permalink / raw)
  To: Richard Cochran
  Cc: f.fainelli, bcm-kernel-feedback-list, andrew, hkallweit1, linux,
	netdev, kernel-team
In-Reply-To: <20220425013800.GC4472@hoboy.vegasvil.org>

On Sun, Apr 24, 2022 at 06:38:00PM -0700, Richard Cochran wrote:
> On Sat, Apr 23, 2022 at 07:23:53PM -0700, Jonathan Lemon wrote:
> 
> > +static bool bcm_ptp_get_tstamp(struct bcm_ptp_private *priv,
> > +			       struct bcm_ptp_capture *capts)
> > +{
> > +	struct phy_device *phydev = priv->phydev;
> > +	u16 ts[4], reg;
> > +	u32 sec, nsec;
> > +
> > +	mutex_lock(&priv->mutex);
> > +
> > +	reg = bcm_phy_read_exp(phydev, INTR_STATUS);
> > +	if ((reg & INTC_SOP) == 0) {
> > +		mutex_unlock(&priv->mutex);
> > +		return false;
> > +	}
> > +
> > +	bcm_phy_write_exp(phydev, TS_READ_CTRL, TS_READ_START);
> > +
> > +	ts[0] = bcm_phy_read_exp(phydev, TS_REG_0);
> > +	ts[1] = bcm_phy_read_exp(phydev, TS_REG_1);
> > +	ts[2] = bcm_phy_read_exp(phydev, TS_REG_2);
> > +	ts[3] = bcm_phy_read_exp(phydev, TS_REG_3);
> > +
> > +	/* not in be32 format for some reason */
> > +	capts->seq_id = bcm_phy_read_exp(priv->phydev, TS_INFO_0);
> > +
> > +	reg = bcm_phy_read_exp(phydev, TS_INFO_1);
> > +	capts->msgtype = reg >> 12;
> > +	capts->tx_dir = !!(reg & BIT(11));
> 
> Okay, so now I am sad.  The 541xx has:
> 
>   TIMESTAMP_INFO_1 0xA8C  bit 0 DIR, bits 1-2 msg_type, etc
>   TIMESTAMP_INFO_2 0xA8D  sequence ID
> 
> It is the same info, but randomly shuffled among the two registers in
> a different way.
> 
> So much for supporting multiple devices with a common code base.  :(

We could just have a chip-specific version of this function.  The
recovered timestamp is passed back in a structure, so the rest of the
code would be unchanged.
-- 
Jonathan    (no, not volunteering to do this...)

^ permalink raw reply

* Re: [PATCH net-next v6 09/13] net: wwan: t7xx: Add WWAN network interface
From: Sergey Ryazanov @ 2022-04-25 23:55 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-10-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Creates the Cross Core Modem Network Interface (CCMNI) which implements
> the wwan_ops for registration with the WWAN framework, CCMNI also
> implements the net_device_ops functions used by the network device.
> Network device operations include open, close, start transmission, TX
> timeout and change MTU.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Co-developed-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v6 08/13] net: wwan: t7xx: Add data path interface
From: Sergey Ryazanov @ 2022-04-25 23:55 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-9-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Data Path Modem AP Interface (DPMAIF) HIF layer provides methods
> for initialization, ISR, control and event handling of TX/RX flows.
>
> DPMAIF TX
> Exposes the 'dmpaif_tx_send_skb' function which can be used by the
> network device to transmit packets.
> The uplink data management uses a Descriptor Ring Buffer (DRB).
> First DRB entry is a message type that will be followed by 1 or more
> normal DRB entries. Message type DRB will hold the skb information
> and each normal DRB entry holds a pointer to the skb payload.
>
> DPMAIF RX
> The downlink buffer management uses Buffer Address Table (BAT) and
> Packet Information Table (PIT) rings.
> The BAT ring holds the address of skb data buffer for the HW to use,
> while the PIT contains metadata about a whole network packet including
> a reference to the BAT entry holding the data buffer address.
> The driver reads the PIT and BAT entries written by the modem, when
> reaching a threshold, the driver will reload the PIT and BAT rings.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

and a small question below.

> diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c
> ...
> +static bool t7xx_alloc_and_map_skb_info(const struct dpmaif_ctrl *dpmaif_ctrl,
> +                                       const unsigned int size, struct dpmaif_bat_skb *cur_skb)
> +{
> +       dma_addr_t data_bus_addr;
> +       struct sk_buff *skb;
> +       size_t data_len;
> +
> +       skb = __dev_alloc_skb(size, GFP_KERNEL);
> +       if (!skb)
> +               return false;
> +
> +       data_len = skb_end_pointer(skb) - skb->data;

Earlier you use a nice t7xx_skb_data_area_size() function here, but
now you opencode it. Is it a consequence of t7xx_common.h removing?

I would even encourage you to make this function common and place it
into include/linux/skbuff.h with a dedicated patch and call it
something like skb_data_size(). Surprisingly, there is no such helper
function in the kernel, and several other drivers will benefit from
it:

$ grep -rn 'skb_end_pointer(.*) [-]' drivers/net/
drivers/net/ethernet/marvell/mv643xx_eth.c:628: size =
skb_end_pointer(skb) - skb->data;
drivers/net/ethernet/marvell/pxa168_eth.c:322: size =
skb_end_pointer(skb) - skb->data;
drivers/net/ethernet/micrel/ksz884x.c:4764: if (skb_end_pointer(skb) -
skb->data >= 50) {
drivers/net/ethernet/netronome/nfp/ccm_mbox.c:492: undersize =
max_reply_size - (skb_end_pointer(skb) - skb->data);
drivers/net/ethernet/nvidia/forcedeth.c:2073:
(skb_end_pointer(np->rx_skb[i].skb) -
drivers/net/ethernet/nvidia/forcedeth.c:5238: (skb_end_pointer(tx_skb)
- tx_skb->data),
drivers/net/veth.c:767: frame_sz = skb_end_pointer(skb) - skb->head;
drivers/net/wwan/t7xx/t7xx_hif_cldma.c:106: return
skb_end_pointer(skb) - skb->data;
drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c:160: data_len =
skb_end_pointer(skb) - skb->data;

> +       data_bus_addr = dma_map_single(dpmaif_ctrl->dev, skb->data, data_len, DMA_FROM_DEVICE);
> +       if (dma_mapping_error(dpmaif_ctrl->dev, data_bus_addr)) {
> +               dev_err_ratelimited(dpmaif_ctrl->dev, "DMA mapping error\n");
> +               dev_kfree_skb_any(skb);
> +               return false;
> +       }
> +
> +       cur_skb->skb = skb;
> +       cur_skb->data_bus_addr = data_bus_addr;
> +       cur_skb->data_len = data_len;
> +
> +       return true;
> +}

--
Sergey

^ permalink raw reply

* Re: [PATCH net-next v6 07/13] net: wwan: t7xx: Data path HW layer
From: Sergey Ryazanov @ 2022-04-25 23:54 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-8-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Data Path Modem AP Interface (DPMAIF) HW layer provides HW abstraction
> for the upper layer (DPMAIF HIF). It implements functions to do the HW
> configuration, TX/RX control and interrupt handling.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v6 06/13] net: wwan: t7xx: Add AT and MBIM WWAN ports
From: Sergey Ryazanov @ 2022-04-25 23:54 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-7-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Adds AT and MBIM ports to the port proxy infrastructure.
> The initialization method is responsible for creating the corresponding
> ports using the WWAN framework infrastructure. The implemented WWAN port
> operations are start, stop, and TX.
>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v6 05/13] net: wwan: t7xx: Add control port
From: Sergey Ryazanov @ 2022-04-25 23:54 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-6-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Control Port implements driver control messages such as modem-host
> handshaking, controls port enumeration, and handles exception messages.
>
> The handshaking process between the driver and the modem happens during
> the init sequence. The process involves the exchange of a list of
> supported runtime features to make sure that modem and host are ready
> to provide proper feature lists including port enumeration. Further
> features can be enabled and controlled in this handshaking process.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v6 04/13] net: wwan: t7xx: Add port proxy infrastructure
From: Sergey Ryazanov @ 2022-04-25 23:53 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-5-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Port-proxy provides a common interface to interact with different types
> of ports. Ports export their configuration via `struct t7xx_port` and
> operate as defined by `struct port_ops`.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Co-developed-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

[skipped]

> diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.c b/drivers/net/wwan/t7xx/t7xx_port_proxy.c

[skipped]

> +static struct t7xx_port_conf t7xx_md_port_conf[] = {};

Please spell this definition in two lines (i.e. move closing brace
into next line):

+static struct t7xx_port_conf t7xx_md_port_conf[] = {
+};

Such spelling in this patch will help you avoid editing the line when
you add the first entry in the next (control port introducing) patch:

-static struct t7xx_port_conf t7xx_md_port_conf[] = {};
+static struct t7xx_port_conf t7xx_md_port_conf[] = {
+       {
+               ...
+               .name = "t7xx_ctrl",
+       },
+};

It will become as simple as:

 static struct t7xx_port_conf t7xx_md_port_conf[] = {
+       {
+               ...
+               .name = "t7xx_ctrl",
+       },
 };

BTW, if the t7xx_md_port_conf contents are not expected to change at
run-time, should this array, as well as all pointers to it, be const?

[skipped]

> +int t7xx_port_enqueue_skb(struct t7xx_port *port, struct sk_buff *skb)
> +{
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&port->rx_wq.lock, flags);
> +       if (port->rx_skb_list.qlen >= port->rx_length_th) {
> +               spin_unlock_irqrestore(&port->rx_wq.lock, flags);

Probably skb should be freed here before returning. The caller assumes
that skb will be consumed even in case of error.

> +               return -ENOBUFS;
> +       }
> +       __skb_queue_tail(&port->rx_skb_list, skb);
> +       spin_unlock_irqrestore(&port->rx_wq.lock, flags);
> +
> +       wake_up_all(&port->rx_wq);
> +       return 0;
> +}

[skipped]

> +static int t7xx_port_proxy_recv_skb(struct cldma_queue *queue, struct sk_buff *skb)
> +{
> +       struct ccci_header *ccci_h = (struct ccci_header *)skb->data;
> +       struct t7xx_pci_dev *t7xx_dev = queue->md_ctrl->t7xx_dev;
> +       struct t7xx_fsm_ctl *ctl = t7xx_dev->md->fsm_ctl;
> +       struct device *dev = queue->md_ctrl->dev;
> +       struct t7xx_port_conf *port_conf;
> +       struct t7xx_port *port;
> +       u16 seq_num, channel;
> +       int ret;
> +
> +       if (!skb)
> +               return -EINVAL;
> +
> +       channel = FIELD_GET(CCCI_H_CHN_FLD, le32_to_cpu(ccci_h->status));
> +       if (t7xx_fsm_get_md_state(ctl) == MD_STATE_INVALID) {
> +               dev_err_ratelimited(dev, "Packet drop on channel 0x%x, modem not ready\n", channel);
> +               goto drop_skb;
> +       }
> +
> +       port = t7xx_port_proxy_find_port(t7xx_dev, queue, channel);
> +       if (!port) {
> +               dev_err_ratelimited(dev, "Packet drop on channel 0x%x, port not found\n", channel);
> +               goto drop_skb;
> +       }
> +
> +       seq_num = t7xx_port_next_rx_seq_num(port, ccci_h);
> +       port_conf = port->port_conf;
> +       skb_pull(skb, sizeof(*ccci_h));
> +
> +       ret = port_conf->ops->recv_skb(port, skb);
> +       if (ret) {
> +               skb_push(skb, sizeof(*ccci_h));

Header can not be pushed back here, since the .recv_skb() callback
consumes (frees) skb even in case of error. See
t7xx_port_wwan_recv_skb() and my comment in t7xx_port_enqueue_skb().
Pushing the header back after failure will trigger the use-after-free
error.

> +               return ret;
> +       }
> +
> +       port->seq_nums[MTK_RX] = seq_num;

The expected sequence number updated only on successful .recv_skb()
exit. This will trigger the out-of-order seqno warning on a next
message after a .recv_skb() failure. Is this intentional behaviour?

Maybe the expected sequence number should be updated before the
.recv_skb() call? Or even the sequence number update should be moved
to t7xx_port_next_rx_seq_num()?

> +       return 0;
> +
> +drop_skb:
> +       dev_kfree_skb_any(skb);
> +       return 0;
> +}

--
Sergey

^ permalink raw reply

* Re: [PATCH net-next] nfp: VF rate limit support
From: Jakub Kicinski @ 2022-04-25 23:53 UTC (permalink / raw)
  To: Simon Horman; +Cc: David Miller, netdev, oss-drivers, Bin Chen
In-Reply-To: <20220422131945.948311-1-simon.horman@corigine.com>

On Fri, 22 Apr 2022 15:19:45 +0200 Simon Horman wrote:
> +	if (max_tx_rate > 0 || min_tx_rate > 0) {
> +		if (max_tx_rate > 0 && max_tx_rate < min_tx_rate) {
> +			nfp_warn(app->cpp, "min-tx-rate exceeds max_tx_rate.\n");
> +			return -EINVAL;
> +		}

This check should be moved to the core, I reckon.

> +		if (max_tx_rate > NFP_NET_VF_RATE_MAX || min_tx_rate > NFP_NET_VF_RATE_MAX) {

Please wrap the lines at 80 chars, it's actually going to be easier 
to read here.

> +			nfp_warn(app->cpp, "tx-rate exceeds 0x%x.\n", NFP_NET_VF_RATE_MAX);

Does it really make sense to print the rate as hex?

> +			return -EINVAL;
> +		}

> @@ -261,5 +294,18 @@ int nfp_app_get_vf_config(struct net_device *netdev, int vf,
>  	ivi->trusted = FIELD_GET(NFP_NET_VF_CFG_CTRL_TRUST, flags);
>  	ivi->linkstate = FIELD_GET(NFP_NET_VF_CFG_CTRL_LINK_STATE, flags);
>  
> +	err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_RATE, "rate");
> +	if (!err) {
> +		rate = readl(app->pf->vfcfg_tbl2 + vf_offset + NFP_NET_VF_CFG_RATE);
> +
> +		ivi->max_tx_rate = FIELD_GET(NFP_NET_VF_CFG_MAX_RATE, rate);
> +		ivi->min_tx_rate = FIELD_GET(NFP_NET_VF_CFG_MIN_RATE, rate);
> +
> +		if (ivi->max_tx_rate == NFP_NET_VF_RATE_MAX)
> +			ivi->max_tx_rate = 0;

If rate == NFP_NET_VF_RATE_MAX means unset then the check on set should
disallow it, IOW:

	if (max_tx_rate >= NFP_NET_VF_RATE_MAX || 
            min_tx_rate >= NFP_NET_VF_RATE_MAX) {
		nfp_war(...

no?

> +		if (ivi->min_tx_rate == NFP_NET_VF_RATE_MAX)
> +			ivi->max_tx_rate = 0;

*squint* you check min and clear max, is this intentional?

^ permalink raw reply

* Re: [PATCH net-next v1 1/4] net: phy: broadcom: Add PTP support for some Broadcom PHYs.
From: Jonathan Lemon @ 2022-04-25 23:53 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: bcm-kernel-feedback-list, andrew, hkallweit1, linux,
	richardcochran, netdev, kernel-team
In-Reply-To: <91d60847-4721-971d-7e86-22e1dd3c694e@gmail.com>

On Sun, Apr 24, 2022 at 04:27:06PM -0700, Florian Fainelli wrote:
> 
> 
> On 4/23/2022 7:23 PM, Jonathan Lemon wrote:
> > This adds PTP support for BCM54210E Broadcom PHYs, in particular,
> > the BCM54213PE, as used in the Rasperry PI CM4.  It has only been
> > tested on that hardware.
> > 
> > Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> > ---
> 
> [snip]
> 
> Mostly checking register names/offsets/usage because I am not familiar
> enough with how PTP is supposed to work.
> 
> > +#define MODE_SEL_SHIFT_PORT		0
> > +#define MODE_SEL_SHIFT_CPU		8
> > +
> > +#define rx_mode(sel, evt, act) \
> > +	(((MODE_RX_##act) << (MODE_EVT_SHIFT_##evt)) << (MODE_SEL_SHIFT_##sel))
> > +
> > +#define tx_mode(sel, evt, act) \
> > +	(((MODE_TX_##act) << (MODE_EVT_SHIFT_##evt)) << (MODE_SEL_SHIFT_##sel))
> 
> I would capitalize these two macros to make it clear that they are exactly
> that, macros.

Ack.


> > +/* needs global TS capture first */
> > +#define TX_TS_CAPTURE		0x0821
> > +#define  TX_TS_CAP_EN			BIT(0)
> > +#define RX_TS_CAPTURE		0x0822
> > +#define  RX_TS_CAP_EN			BIT(0)
> > +
> > +#define TIME_CODE_0		0x0854
> 
> Maybe add a macro here as well:
> 
> #define TIME_CODE(x)		(TIME_CODE0 + (x))

I'd prrefer keep them separate, as not all of the registers are
sequential - the heartbeat registers for example.


> > +#define SHADOW_LOAD		0x085d
> > +#define  TIME_CODE_LOAD			BIT(10)
> > +#define  SYNC_OUT_LOAD			BIT(9)
> > +#define  NCO_TIME_LOAD			BIT(7)
> 
> NCO Divider load is bit 8 and Local time Load bit is 7, can you check which
> one you need?

Local time load.  NCO divider counts the SYNC_IN pulses and generates a
timestamp after <n> events (as I understand things).

 
> > +#define  FREQ_LOAD			BIT(6)
> > +#define INTR_MASK		0x085e
> > +#define INTR_STATUS		0x085f
> > +#define  INTC_FSYNC			BIT(0)
> > +#define  INTC_SOP			BIT(1)
> > +
> > +#define FREQ_REG_LSB		0x0873
> > +#define FREQ_REG_MSB		0x0874
> 
> Those two hold the NCO frequency, can you rename accordingly?


> > +#define TS_READ_CTRL		0x0885
> > +#define  TS_READ_START			BIT(0)
> > +#define  TS_READ_END			BIT(1)
> > +
> > +#define TIMECODE_CTRL		0x08c3
> > +#define  TX_TIMECODE_SEL		GENMASK(7, 0)
> > +#define  RX_TIMECODE_SEL		GENMASK(15, 8)
> 
> This one looks out of order compared to the other registers

Will rearrange the groups a bit.


> > +struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)
> > +{
> > +	struct bcm_ptp_private *priv;
> > +	struct ptp_clock *clock;
> > +
> > +	switch (BRCM_PHY_MODEL(phydev)) {
> > +	case PHY_ID_BCM54210E:
> > +#ifdef PHY_ID_BCM54213PE
> > +	case PHY_ID_BCM54213PE:
> > +#endif
> 
> This does not exist upstream.

Yes, hence the #ifdef.  Will remove this for the next patch.  It's here
since I can just copy it over to the rpi-5.15 tree and have it still
work.
-- 
Jonathan

^ permalink raw reply

* Re: [PATCH net-next v6 03/13] net: wwan: t7xx: Add core components
From: Sergey Ryazanov @ 2022-04-25 23:52 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-4-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Registers the t7xx device driver with the kernel. Setup all the core
> components: PCIe layer, Modem Host Cross Core Interface (MHCCIF),
> modem control operations, modem state machine, and build
> infrastructure.
>
> * PCIe layer code implements driver probe and removal.
> * MHCCIF provides interrupt channels to communicate events
>   such as handshake, PM and port enumeration.
> * Modem control implements the entry point for modem init,
>   reset and exit.
> * The modem status monitor is a state machine used by modem control
>   to complete initialization and stop. It is used also to propagate
>   exception events reported by other components.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v6 02/13] net: wwan: t7xx: Add control DMA interface
From: Sergey Ryazanov @ 2022-04-25 23:52 UTC (permalink / raw)
  To: Ricardo Martinez
  Cc: netdev, linux-wireless, Jakub Kicinski, David Miller,
	Johannes Berg, Loic Poulain, M Chetan Kumar,
	chandrashekar.devegowda, Intel Corporation, chiranjeevi.rapolu,
	Haijun Liu (刘海军), amir.hanania,
	Andy Shevchenko, dinesh.sharma, eliot.lee, ilpo.johannes.jarvinen,
	moises.veleta, pierre-louis.bossart, muralidharan.sethuraman,
	Soumya.Prakash.Mishra, sreehari.kancharla, madhusmita.sahu
In-Reply-To: <20220407223629.21487-3-ricardo.martinez@linux.intel.com>

On Fri, Apr 8, 2022 at 1:37 AM Ricardo Martinez
<ricardo.martinez@linux.intel.com> wrote:
> Cross Layer DMA (CLDMA) Hardware interface (HIF) enables the control
> path of Host-Modem data transfers. CLDMA HIF layer provides a common
> interface to the Port Layer.
>
> CLDMA manages 8 independent RX/TX physical channels with data flow
> control in HW queues. CLDMA uses ring buffers of General Packet
> Descriptors (GPD) for TX/RX. GPDs can represent multiple or single
> data buffers (DB).
>
> CLDMA HIF initializes GPD rings, registers ISR handlers for CLDMA
> interrupts, and initializes CLDMA HW registers.
>
> CLDMA TX flow:
> 1. Port Layer write
> 2. Get DB address
> 3. Configure GPD
> 4. Triggering processing via HW register write
>
> CLDMA RX flow:
> 1. CLDMA HW sends a RX "done" to host
> 2. Driver starts thread to safely read GPD
> 3. DB is sent to Port layer
> 4. Create a new buffer for GPD ring
>
> Note: This patch does not enable compilation since it has dependencies
> such as t7xx_pcie_mac_clear_int()/t7xx_pcie_mac_set_int() and
> struct t7xx_pci_dev which are added by the core patch.
>
> Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
> Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Co-developed-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
>
> From a WWAN framework perspective:
> Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

[skipped]

> diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c

[skipped]

> +static int t7xx_cldma_gpd_rx_from_q(struct cldma_queue *queue, int budget, bool *over_budget)
> +{
> +       struct cldma_ctrl *md_ctrl = queue->md_ctrl;
> +       unsigned int hwo_polling_count = 0;
> +       struct t7xx_cldma_hw *hw_info;
> +       bool rx_not_done = true;
> +       unsigned long flags;
> +       int count = 0;
> +
> +       hw_info = &md_ctrl->hw_info;
> +
> +       do {
> +               struct cldma_request *req;
> +               struct cldma_rgpd *rgpd;
> +               struct sk_buff *skb;
> +               int ret;
> +
> +               req = queue->tr_done;
> +               if (!req)
> +                       return -ENODATA;
> +
> +               rgpd = req->gpd;
> +               if ((rgpd->gpd_flags & GPD_FLAGS_HWO) || !req->skb) {
> +                       dma_addr_t gpd_addr;
> +
> +                       if (!pci_device_is_present(to_pci_dev(md_ctrl->dev))) {
> +                               dev_err(md_ctrl->dev, "PCIe Link disconnected\n");
> +                               return -ENODEV;
> +                       }
> +
> +                       gpd_addr = ioread64(hw_info->ap_pdn_base + REG_CLDMA_DL_CURRENT_ADDRL_0 +
> +                                           queue->index * sizeof(u64));
> +                       if (req->gpd_addr == gpd_addr || hwo_polling_count++ >= 100)
> +                               return 0;
> +
> +                       udelay(1);
> +                       continue;
> +               }
> +
> +               hwo_polling_count = 0;
> +               skb = req->skb;
> +
> +               if (req->mapped_buff) {
> +                       dma_unmap_single(md_ctrl->dev, req->mapped_buff,
> +                                        t7xx_skb_data_area_size(skb), DMA_FROM_DEVICE);
> +                       req->mapped_buff = 0;
> +               }
> +
> +               skb->len = 0;
> +               skb_reset_tail_pointer(skb);
> +               skb_put(skb, le16_to_cpu(rgpd->data_buff_len));
> +
> +               ret = md_ctrl->recv_skb(queue, skb);
> +               if (ret < 0)
> +                       return ret;

The execution flow can not be broken here even in case of error. The
.recv_skb() callback consumes (frees) skb even if there is an error.
But the skb field in req (struct cldma_request) will keep the skb
pointer if the function returns from here. And this stale skb pointer
will cause a use-after-free or double-free error.

I have not dug too deeply into the CLDMA layer and can not suggest any
solution. But the error handling path needs to be rechecked.

> +               req->skb = NULL;
> +               t7xx_cldma_rgpd_set_data_ptr(rgpd, 0);
> +
> +               spin_lock_irqsave(&queue->ring_lock, flags);
> +               queue->tr_done = list_next_entry_circular(req, &queue->tr_ring->gpd_ring, entry);
> +               spin_unlock_irqrestore(&queue->ring_lock, flags);
> +               req = queue->rx_refill;
> +
> +               ret = t7xx_cldma_alloc_and_map_skb(md_ctrl, req, queue->tr_ring->pkt_size);
> +               if (ret)
> +                       return ret;
> +
> +               rgpd = req->gpd;
> +               t7xx_cldma_rgpd_set_data_ptr(rgpd, req->mapped_buff);
> +               rgpd->data_buff_len = 0;
> +               rgpd->gpd_flags = GPD_FLAGS_IOC | GPD_FLAGS_HWO;
> +
> +               spin_lock_irqsave(&queue->ring_lock, flags);
> +               queue->rx_refill = list_next_entry_circular(req, &queue->tr_ring->gpd_ring, entry);
> +               spin_unlock_irqrestore(&queue->ring_lock, flags);
> +
> +               rx_not_done = ++count < budget || !need_resched();
> +       } while (rx_not_done);
> +
> +       *over_budget = true;
> +       return 0;
> +}

--
Sergey

^ permalink raw reply

* Re: [PATCH net-next v1 1/4] net: phy: broadcom: Add PTP support for some Broadcom PHYs.
From: Jonathan Lemon @ 2022-04-25 23:47 UTC (permalink / raw)
  To: Richard Cochran
  Cc: f.fainelli, bcm-kernel-feedback-list, andrew, hkallweit1, linux,
	netdev, kernel-team
In-Reply-To: <20220425031239.GA6294@hoboy.vegasvil.org>

On Sun, Apr 24, 2022 at 08:12:39PM -0700, Richard Cochran wrote:
> On Sat, Apr 23, 2022 at 07:23:53PM -0700, Jonathan Lemon wrote:
> 
> > +static int bcm_ptp_settime_locked(struct bcm_ptp_private *priv,
> > +				  const struct timespec64 *ts)
> > +{
> > +	struct phy_device *phydev = priv->phydev;
> > +	u16 ctrl;
> > +
> > +	/* set up time code */
> > +	bcm_phy_write_exp(phydev, TIME_CODE_0, ts->tv_nsec);
> > +	bcm_phy_write_exp(phydev, TIME_CODE_1, ts->tv_nsec >> 16);
> > +	bcm_phy_write_exp(phydev, TIME_CODE_2, ts->tv_sec);
> > +	bcm_phy_write_exp(phydev, TIME_CODE_3, ts->tv_sec >> 16);
> > +	bcm_phy_write_exp(phydev, TIME_CODE_4, ts->tv_sec >> 32);
> > +
> > +	/* zero out NCO counter */
> > +	bcm_phy_write_exp(phydev, NCO_TIME_0, 0);
> > +	bcm_phy_write_exp(phydev, NCO_TIME_1, 0);
> > +	bcm_phy_write_exp(phydev, NCO_TIME_2_CTRL, 0);
> 
> You are setting the 48 bit counter to zero.
> 
> But Lasse's version does this:
> 
> 	// Assign original time codes (48 bit)
> 	local_time_codes[2] = 0x4000;
> 	local_time_codes[1] = (u16)(ts->tv_nsec >> 20);
> 	local_time_codes[0] = (u16)(ts->tv_nsec >> 4);
> 
> 	...
> 
> 	// Write Local Time Code Register
> 	bcm_phy_write_exp(phydev, NSE_DPPL_NCO_2_0_REG, local_time_codes[0]);
> 	bcm_phy_write_exp(phydev, NSE_DPPL_NCO_2_1_REG, local_time_codes[1]);
> 	bcm_phy_write_exp(phydev, NSE_DPPL_NCO_2_2_REG, local_time_codes[2]);
> 
> My understanding is that the PPS output function uses the 48 bit
> counter, and so it ought to be set to a non-zero value.

I'm not sure what this is doing.  Setting BIT(14) says this is a
frequency control adjustment.  From my understanding, the local timer is
used for generating a oneshot output pulse, which the driver currently
doesn't do.


> In any case, it would be nice to have the 80/48 bit register usage
> clearly explained.

You and me both.
-- 
Jonathan

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox