Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next] bridge: mrp: Implement LC mode for MRP
From: Horatiu Vultur @ 2020-11-23 22:31 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Nikolay Aleksandrov, roopa, davem, linux-kernel, bridge, netdev
In-Reply-To: <20201123140519.3bb3db16@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

The 11/23/2020 14:05, Jakub Kicinski wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> On Mon, 23 Nov 2020 16:25:53 +0200 Nikolay Aleksandrov wrote:
> > >>> @@ -156,4 +157,10 @@ struct br_mrp_in_link_hdr {
> > >>>       __be16 interval;
> > >>>  };
> > >>>
> > >>> +struct br_mrp_in_link_status_hdr {
> > >>> +     __u8 sa[ETH_ALEN];
> > >>> +     __be16 port_role;
> > >>> +     __be16 id;
> > >>> +};
> > >>> +
> > >>
> > >> I didn't see this struct used anywhere, am I missing anything?
> > >
> > > Yes, you are right, the struct is not used any. But I put it there as I
> > > put the other frame types for MRP.
> > >
> >
> > I see, we don't usually add unused code. The patch is fine as-is and since
> > this is already the case for other MRP parts I'm not strictly against it, so:
> >
> > Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
> >
> > If Jakub decides to adhere to that rule you can keep my acked-by and just remove
> > the struct for v2.

Hi Jakub,

> 
> Yes, good catch, let's drop it, we don't want to make too much of
> a precedent for using kernel uAPI headers as a place to provide
> protocol-related structs if the kernel doesn't need them.

OK, I see. I will send a new version for this patch where I will drop
the struct 'br_mrp_in_link_stauts_hdr'.

> 
> The existing structs are only present in net-next as well, so if you
> don't mind Horatiu it'd be good to follow up and remove the unused ones
> and move the ones (if any) which are only used by the kernel but not by
> the user space <-> kernel API communication out of include/uapi.

Maybe we don't refer to the same structs, but I could see that they are
already in net and in Linus' tree. For example the struct
'br_mrp_ring_topo_hdr'. Or am I missunderstanding something?

-- 
/Horatiu

^ permalink raw reply

* Re: [PATCH v5 4/9] task_isolation: Add task isolation hooks to arch-independent code
From: Thomas Gleixner @ 2020-11-23 22:31 UTC (permalink / raw)
  To: Alex Belits, nitesh@redhat.com, frederic@kernel.org
  Cc: Prasun Kapoor, linux-api@vger.kernel.org, davem@davemloft.net,
	trix@redhat.com, mingo@kernel.org, catalin.marinas@arm.com,
	rostedt@goodmis.org, linux-kernel@vger.kernel.org,
	peterx@redhat.com, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <ec4bacce635fed4e77ab46752d41432f270edf83.camel@marvell.com>

Alex,

On Mon, Nov 23 2020 at 17:57, Alex Belits wrote:
> Kernel entry and exit functions for task isolation are added to context
> tracking and common entry points. Common handling of pending work on exit
> to userspace now processes isolation breaking, cleanup and start.

Again: You fail to explain the rationale and just explain what the patch
is doing. I can see what the patch is doing from the patch itself.

> ---
>  include/linux/hardirq.h   |  2 ++
>  include/linux/sched.h     |  2 ++
>  kernel/context_tracking.c |  5 +++++
>  kernel/entry/common.c     | 10 +++++++++-
>  kernel/irq/irqdesc.c      |  5 +++++

At least 3 different subsystems, which means this again failed to be
split into seperate patches.

>  extern void synchronize_irq(unsigned int irq);
> @@ -115,6 +116,7 @@ extern void rcu_nmi_exit(void);
>  	do {							\
>  		lockdep_off();					\
>  		arch_nmi_enter();				\
> +		task_isolation_kernel_enter();			\

Where is the explanation why this is safe and correct vs. this fragile
code path?

> @@ -1762,6 +1763,7 @@ extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
>  #ifdef CONFIG_SMP
>  static __always_inline void scheduler_ipi(void)
>  {
> +	task_isolation_kernel_enter();

Why is the scheduler_ipi() special? Just because everything else cannot
happen at all? Oh well...

>  #define CREATE_TRACE_POINTS
>  #include <trace/events/context_tracking.h>
> @@ -100,6 +101,8 @@ void noinstr __context_tracking_enter(enum ctx_state state)
>  		__this_cpu_write(context_tracking.state, state);
>  	}
>  	context_tracking_recursion_exit();
> +
> +	task_isolation_exit_to_user_mode();

Why is this here at all and why is it outside of the recursion
protection

>  }
>  EXPORT_SYMBOL_GPL(__context_tracking_enter);
>  
> @@ -148,6 +151,8 @@ void noinstr __context_tracking_exit(enum ctx_state state)
>  	if (!context_tracking_recursion_enter())
>  		return;
>  
> +	task_isolation_kernel_enter();

while this is inside?

And why has the scheduler_ipi() on x86 call this twice? Just because?

>  	if (__this_cpu_read(context_tracking.state) == state) {
>  		if (__this_cpu_read(context_tracking.active)) {
>  			/*
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
>  static void exit_to_user_mode_prepare(struct pt_regs *regs)
>  {
> -	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
> +	unsigned long ti_work;
>  
>  	lockdep_assert_irqs_disabled();
>  
> +	task_isolation_before_pending_work_check();
> +
> +	ti_work = READ_ONCE(current_thread_info()->flags);
> +
>  	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
>  		ti_work = exit_to_user_mode_loop(regs, ti_work);
>  
> +	if (unlikely(ti_work & _TIF_TASK_ISOLATION))
> +		task_isolation_start();

Where is the explaination of this change?

Aside of that how does anything of this compile on x86 at all?

Answer: It does not ...

Stop this frenzy right now. It's going nowhere and all you achieve is to
make people more grumpy than they are already.

Thanks,

        tglx

^ permalink raw reply

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
From: Jakub Kicinski @ 2020-11-23 22:30 UTC (permalink / raw)
  To: Thomas Karlsson; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <147b704ac1d5426fbaa8617289dad648@paneda.se>

On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:
> Hello,
> 
> There is a special queue handling in macvlan.c for broadcast and
> multicast packages that was arbitrarily set to 1000 in commit
> 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably
> sufficient for most uses cases it is insufficient to support high
> packet rates. I currently have a setup with 144 000 multicast packets
> incoming per second (144 different live audio RTP streams) and suffer
> very frequent packet loss. With unicast this is not an issue and I
> can in addition to the 144kpps load the macvlan interface with
> another 450mbit/s using iperf.
> 
> In order to verify that the queue is the problem I edited the define
> to 100000 and recompiled the kernel module. After replacing it with
> rmmod/insmod I get 0 packet loss (measured over 2 days where I before
> had losses every other second or so) and can also load an additional
> 450 mbit/s multicast traffic using iperf without losses. So basically
> no change in performance between unicast/multicast when it comes to
> lost packets on my machine.
> 
> I think It would be best if this queue length was configurable
> somehow. Either an option when creating the macvlan (like how
> bridge/passthrough/etc are set) or at least when loading the module
> (for instance by using a config in /etc/modprobe.d). One size does
> not fit all in this situation.

The former please. You can add a netlink attribute, should be
reasonably straightforward. The other macvlan attrs are defined
under "MACVLAN section" in if_link.h.

^ permalink raw reply

* Re: [PATCH v5 9/9] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
From: Frederic Weisbecker @ 2020-11-23 22:29 UTC (permalink / raw)
  To: Alex Belits
  Cc: nitesh@redhat.com, Prasun Kapoor, linux-api@vger.kernel.org,
	davem@davemloft.net, trix@redhat.com, mingo@kernel.org,
	catalin.marinas@arm.com, rostedt@goodmis.org,
	linux-kernel@vger.kernel.org, peterx@redhat.com,
	tglx@linutronix.de, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <3236b13f42679031960c5605be20664e90e75223.camel@marvell.com>

On Mon, Nov 23, 2020 at 05:58:42PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> Make sure that kick_all_cpus_sync() does not call CPUs that are running
> isolated tasks.
> 
> Signed-off-by: Yuri Norov <ynorov@marvell.com>
> [abelits@marvell.com: use safe task_isolation_cpumask() implementation]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/smp.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 4d17501433be..b2faecf58ed0 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -932,9 +932,21 @@ static void do_nothing(void *unused)
>   */
>  void kick_all_cpus_sync(void)
>  {
> +	struct cpumask mask;
> +
>  	/* Make sure the change is visible before we kick the cpus */
>  	smp_mb();
> -	smp_call_function(do_nothing, NULL, 1);
> +
> +	preempt_disable();
> +#ifdef CONFIG_TASK_ISOLATION
> +	cpumask_clear(&mask);
> +	task_isolation_cpumask(&mask);
> +	cpumask_complement(&mask, &mask);
> +#else
> +	cpumask_setall(&mask);
> +#endif
> +	smp_call_function_many(&mask, do_nothing, NULL, 1);
> +	preempt_enable();

Same comment about IPIs here.

>  }
>  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
>  
> -- 
> 2.20.1
> 

^ permalink raw reply

* Re: [PATCH net-next v3 4/7] dpaa_eth: add XDP_TX support
From: Maciej Fijalkowski @ 2020-11-23 22:18 UTC (permalink / raw)
  To: Camelia Alexandra Groza
  Cc: kuba@kernel.org, brouer@redhat.com, saeed@kernel.org,
	davem@davemloft.net, Madalin Bucur (OSS), Ioana Ciornei,
	netdev@vger.kernel.org
In-Reply-To: <VI1PR04MB5807F56500D25ECD20657618F2FF0@VI1PR04MB5807.eurprd04.prod.outlook.com>

On Fri, Nov 20, 2020 at 06:54:42PM +0000, Camelia Alexandra Groza wrote:
> > -----Original Message-----
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Sent: Friday, November 20, 2020 01:51
> > To: Camelia Alexandra Groza <camelia.groza@nxp.com>
> > Cc: kuba@kernel.org; brouer@redhat.com; saeed@kernel.org;
> > davem@davemloft.net; Madalin Bucur (OSS)
> > <madalin.bucur@oss.nxp.com>; Ioana Ciornei <ioana.ciornei@nxp.com>;
> > netdev@vger.kernel.org
> > Subject: Re: [PATCH net-next v3 4/7] dpaa_eth: add XDP_TX support
> > 
> > On Thu, Nov 19, 2020 at 06:29:33PM +0200, Camelia Groza wrote:
> > > Use an xdp_frame structure for managing the frame. Store a backpointer
> > to
> > > the structure at the start of the buffer before enqueueing. Use the XDP
> > > API for freeing the buffer when it returns to the driver on the TX
> > > confirmation path.
> > 

[...]

> > >  static u32 dpaa_run_xdp(struct dpaa_priv *priv, struct qm_fd *fd, void
> > *vaddr,
> > > -			unsigned int *xdp_meta_len)
> > > +			struct dpaa_fq *dpaa_fq, unsigned int
> > *xdp_meta_len)
> > >  {
> > >  	ssize_t fd_off = qm_fd_get_offset(fd);
> > >  	struct bpf_prog *xdp_prog;
> > > +	struct xdp_frame *xdpf;
> > >  	struct xdp_buff xdp;
> > >  	u32 xdp_act;
> > >
> > > @@ -2370,7 +2470,8 @@ static u32 dpaa_run_xdp(struct dpaa_priv *priv,
> > struct qm_fd *fd, void *vaddr,
> > >  	xdp.data_meta = xdp.data;
> > >  	xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM;
> > >  	xdp.data_end = xdp.data + qm_fd_get_length(fd);
> > > -	xdp.frame_sz = DPAA_BP_RAW_SIZE;
> > > +	xdp.frame_sz = DPAA_BP_RAW_SIZE - DPAA_TX_PRIV_DATA_SIZE;
> > > +	xdp.rxq = &dpaa_fq->xdp_rxq;
> > >
> > >  	xdp_act = bpf_prog_run_xdp(xdp_prog, &xdp);
> > >
> > > @@ -2381,6 +2482,22 @@ static u32 dpaa_run_xdp(struct dpaa_priv *priv,
> > struct qm_fd *fd, void *vaddr,
> > >  	case XDP_PASS:
> > >  		*xdp_meta_len = xdp.data - xdp.data_meta;
> > >  		break;
> > > +	case XDP_TX:
> > > +		/* We can access the full headroom when sending the frame
> > > +		 * back out
> > 
> > And normally why a piece of headroom is taken away? I probably should
> > have
> > started from the basic XDP support patch, but if you don't mind, please
> > explain it a bit.
> 
> I mentioned we require DPAA_TX_PRIV_DATA_SIZE bytes at the start of the buffer in order to make sure we have enough space for our private info.

What is your private info?

> 
> When setting up the xdp_buff, this area is reserved from the frame size exposed to the XDP program.
>  -	xdp.frame_sz = DPAA_BP_RAW_SIZE;
>  +	xdp.frame_sz = DPAA_BP_RAW_SIZE - DPAA_TX_PRIV_DATA_SIZE;
> 
> After the XDP_TX verdict, we're sure that DPAA_TX_PRIV_DATA_SIZE bytes at the start of the buffer are free and we can use the full headroom how it suits us, hence the increase of the frame size back to DPAA_BP_RAW_SIZE.

Not at the *end* of the buffer?

> 
> Thanks for all your feedback.
> 
> > > +		 */
> > > +		xdp.data_hard_start = vaddr;
> > > +		xdp.frame_sz = DPAA_BP_RAW_SIZE;
> > > +		xdpf = xdp_convert_buff_to_frame(&xdp);
> > > +		if (unlikely(!xdpf)) {
> > > +			free_pages((unsigned long)vaddr, 0);
> > > +			break;
> > > +		}
> > > +
> > > +		if (dpaa_xdp_xmit_frame(priv->net_dev, xdpf))
> > > +			xdp_return_frame_rx_napi(xdpf);
> > > +
> > > +		break;
> > >  	default:
> > >  		bpf_warn_invalid_xdp_action(xdp_act);
> > >  		fallthrough;
> > > @@ -2415,6 +2532,7 @@ static enum qman_cb_dqrr_result

^ permalink raw reply

* Re: [PATCH net-next] ip_gre: remove CRC flag from dev features in gre_gso_segment
From: Alexander Duyck @ 2020-11-23 22:23 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner, David Miller, Jakub Kicinski,
	Guillaume Nault, Paolo Abeni, Lorenzo Bianconi
In-Reply-To: <CADvbK_cY3y-DonBDp7DjKdxbnxkP1r18v1dggW_b3q9cih5coA@mail.gmail.com>

On Mon, Nov 23, 2020 at 1:14 AM Xin Long <lucien.xin@gmail.com> wrote:
>
> On Sat, Nov 21, 2020 at 12:10 AM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
> >
> > On Fri, Nov 20, 2020 at 2:23 AM Xin Long <lucien.xin@gmail.com> wrote:
> > >
> > > On Fri, Nov 20, 2020 at 1:24 AM Alexander Duyck
> > > <alexander.duyck@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 18, 2020 at 9:53 PM Xin Long <lucien.xin@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 19, 2020 at 4:35 AM Alexander Duyck
> > > > > <alexander.duyck@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Nov 16, 2020 at 1:17 AM Xin Long <lucien.xin@gmail.com> wrote:
> > > > > > >

<snip>

> > > > > @@ -27,7 +27,11 @@ static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
> > > > >  {
> > > > >         skb->ip_summed = CHECKSUM_NONE;
> > > > >         skb->csum_not_inet = 0;
> > > > > -       gso_reset_checksum(skb, ~0);
> > > > > +       /* csum and csum_start in GSO CB may be needed to do the UDP
> > > > > +        * checksum when it's a UDP tunneling packet.
> > > > > +        */
> > > > > +       SKB_GSO_CB(skb)->csum = (__force __wsum)~0;
> > > > > +       SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len;
> > > > >         return sctp_compute_cksum(skb, skb_transport_offset(skb));
> > > > >  }
> > > > >
> > > > > And yes, this patch has been tested with GRE tunnel checksums enabled.
> > > > > (... icsum ocsum).
> > > > > And yes, it was segmented in the lower NIC level, and we can make it by:
> > > > >
> > > > > # ethtool -K gre1 tx-sctp-segmentation on
> > > > > # ethtool -K veth0 tx-sctp-segmentation off
> > > > >
> > > > > (Note: "tx-checksum-sctp" and "gso" are on for both devices)
> > > > >
> > > > > Thanks.
> > > >
> > > > I would also turn off Tx and Rx checksum offload on your veth device
> > > > in order to make certain you aren't falsely sending data across
> > > > indicating that the checksums are valid when they are not. It might be
> > > > better if you were to run this over an actual NIC as that could then
> > > > provide independent verification as it would be a fixed checksum test.
> > > >
> > > > I'm still not convinced this is working correctly. Basically a crc32c
> > > > is not the same thing as a 1's complement checksum so you should need
> > > > to compute both if you have an SCTP packet tunneled inside a UDP or
> > > > GRE packet with a checksum. I don't see how computing a crc32c should
> > > > automatically give you a 1's complement checksum of ~0.
> > >
> > > On the tx Path [1] below, the sctp crc checksum is calculated in
> > > sctp_gso_make_checksum() [a], where it calls *sctp_compute_cksum()*
> > > to do that, and as for the code in it:
> > >
> > >     SKB_GSO_CB(skb)->csum = (__force __wsum)~0;
> > >     SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len;
> >
> > Okay, so I think I know how this is working, but the number of things
> > relied on is ugly. Normally assuming skb_headroom(skb) + skb->len
> > being valid for this would be a non-starter. I was assuming you
> > weren't doing the 1's compliment checksum because you weren't using
> > __skb_checksum to generate it.
> >
> > If I am not mistaken SCTP GSO uses the GSO_BY_FRAGS and apparently
> > none of the frags are using page fragments within the skb. Am I
> > understanding correctly? One thing that would help to address some of
> > my concerns would be to clear instead of set NETIF_F_SG in
> > sctp_gso_segment since your checksum depends on linear skbs.
> Right, no frag is using page fragments for SCTP GSO.
> NETIF_F_SG is set here, because in skb_segment():
>
>                 if (hsize > len || !sg)
>                         hsize = len;
>
>                 if (!hsize && i >= nfrags && skb_headlen(list_skb) &&
>                     (skb_headlen(list_skb) == len || sg)) { <------
> for flag_list
>
> without sg set, it won't go to this 'if' block, which is the process
> of frag_list, see

I don't think that is processing frag_list, it is processing frags. It
is just updating list_skb as needed, however it is also configured
outside of that block.

> commit 89319d3801d1d3ac29c7df1f067038986f267d29
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date:   Mon Dec 15 23:26:06 2008 -0800
>
>     net: Add frag_list support to skb_segment
>
> do you think this might be a bug in skb_segment()?

I would say it is assuming your logic is correct. Basically it should
be able to segment the frame regardless of if the lower device
supports NETIF_F_SG or not.

> I was also thinking if SCTP GSO could go with the way of UDP's
> with skb_segment_list() instead of GSO_BY_FRAGS things.
> the different is that the head skb does not only include header,
> but may also include userdata/payload with skb_segment_list().

The problem is right now the way the checksum is being configured you
would have to keep the payload and data all in one logical data
segment starting at skb->data. We cannot have any data stored in
shinfo->frags, nor shinfo->frag_list.

> >
> > > is prepared for doing 1's complement checksum (for outer UDP/GRE), and used
> > > in gre_gso_segment() [b], where it calls gso_make_checksum() to do that
> > > when need_csum is set. Note that, here it played a TRICK:
> > >
> > > I set SKB_GSO_CB->csum_start to the end of this packet and
> > > SKB_GSO_CB->csum = ~0 manually, so that in gso_make_checksum() it will do
> > > the 1's complement checksum for outer UDP/GRE by summing all packet bits up.
> > > See gso_make_checksum() (called by gre_gso_segment()):
> > >
> > >  unsigned char *csum_start = skb_transport_header(skb);
> > >  int plen = (skb->head + SKB_GSO_CB(skb)->csum_start) - csum_start;
> > >  /* now plen is from udp header to the END of packet. */
> > >  __wsum partial = SKB_GSO_CB(skb)->csum;
> > >
> > >  return csum_fold(csum_partial(csum_start, plen, partial));
> > >
> > > So yes, here it does compute both if I have an SCTP packet tunnelled inside
> > > a UDP or GRE packet with a checksum.
> >
> > Assuming you have the payload data in the skb->data section. Normally
> > payload is in page frags. That is why I was concerned. You have to
> > have guarantees in place that there will not be page fragments
> > attached to the skb.
> On SCTP TX path, sctp_packet_transmit() will always alloc linear skbs
> and reserve headers for lower-layer protocols. I think this will guarantee it.

That ends up being my big concern. We need to make certain that is
true for all GRO and GSO cases if we are going to operate on the
assumption that just doing a linear checksum will work in the GSO
code. Otherwise we need to make certain that segmentation will correct
this for us if it cannot be guaranteed. That is why I would be much
more comfortable if we were able to just drop the NETIF_F_SG bit when
doing the segmentation since that would guarantee the results we are
looking for.

^ permalink raw reply

* Re: [PATCH net-next 2/3] net: dsa: add Arrow SpeedChips XRS700x driver
From: George McCollister @ 2020-11-23 22:23 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S . Miller, netdev, open list:OPEN FIRMWARE AND...
In-Reply-To: <20201123220914.GC2036992@lunn.ch>

On Mon, Nov 23, 2020 at 4:09 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > > > https://www.flexibilis.com/downloads/xrs/SpeedChip_XRS7000_3000_User_Manual.pdf
>
> Section 6.1.4
>
> The forwarding decision is presented in Figure 19. Note that also
> frames coming into a disabled port are received to the buffer memory,
> but because their forwarding decision is not to forward them to any
> port, they are dropped.  This behavior however can be changed, and
> frames can be forwarded from disabled ports to other ports by using
> Inbound Policy (see Chapter 6.1.5).
>
> Sounds promising. And Section 6.1.5:

Indeed, I missed this.

>
> Inbound Policy checks the source and the destination MAC addresses of
> all the received frames. The user can configure through the register
> interface what kind of a treatment should frames with certain source
> or destination MAC addresses get. Many protocols use protocol specific
> multicast MAC addresses and the destination MAC address can therefore
> be used for forwarding those frames to CPU port and not to other
> ports.
>
> Looking at table 36, i think you can add a match for the BPDU
> destination MAC address, and have i forwarded to the CPU port.

I'll give it a try.

>
> Looks like you can add 15 such filters. So you might want to think
> about how you want to use these, what other special packets do you
> want to allow through?

Okay.

>
>             Andrew

Thanks!

^ permalink raw reply

* Re: [PATCH net-next 1/3] dsa: add support for Arrow XRS700x tag trailer
From: Florian Fainelli @ 2020-11-23 22:18 UTC (permalink / raw)
  To: George McCollister, Andrew Lunn, Vivien Didelot, Vladimir Oltean
  Cc: David S . Miller, netdev, devicetree
In-Reply-To: <20201120181627.21382-2-george.mccollister@gmail.com>



On 11/20/2020 10:16 AM, George McCollister wrote:
> Add support for Arrow SpeedChips XRS700x single byte tag trailer. This
> is modeled on tag_trailer.c which works in a similar way.
> 
> Signed-off-by: George McCollister <george.mccollister@gmail.com>

One question below:

[snip]

> +	if (pskb_trim_rcsum(skb, skb->len - 1))
> +		return NULL;
> +
> +	/* Frame is forwarded by hardware, don't forward in software. */
> +	skb->offload_fwd_mark = 1;

Given the switch does not give you a forwarding reason, I suppose this
is fine, but do you possibly have to qualify this against different
source MAC addresses?
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next v3 2/7] dpaa_eth: add basic XDP support
From: Maciej Fijalkowski @ 2020-11-23 22:07 UTC (permalink / raw)
  To: Camelia Alexandra Groza
  Cc: kuba@kernel.org, brouer@redhat.com, saeed@kernel.org,
	davem@davemloft.net, Madalin Bucur (OSS), Ioana Ciornei,
	netdev@vger.kernel.org
In-Reply-To: <VI1PR04MB58076B6D16C76E71247BA12FF2FF0@VI1PR04MB5807.eurprd04.prod.outlook.com>

On Fri, Nov 20, 2020 at 06:50:28PM +0000, Camelia Alexandra Groza wrote:
> > -----Original Message-----
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Sent: Friday, November 20, 2020 02:19
> > To: Camelia Alexandra Groza <camelia.groza@nxp.com>
> > Cc: kuba@kernel.org; brouer@redhat.com; saeed@kernel.org;
> > davem@davemloft.net; Madalin Bucur (OSS)
> > <madalin.bucur@oss.nxp.com>; Ioana Ciornei <ioana.ciornei@nxp.com>;
> > netdev@vger.kernel.org
> > Subject: Re: [PATCH net-next v3 2/7] dpaa_eth: add basic XDP support
> > 
> > On Thu, Nov 19, 2020 at 06:29:31PM +0200, Camelia Groza wrote:
> > > Implement the XDP_DROP and XDP_PASS actions.
> > >
> > > Avoid draining and reconfiguring the buffer pool at each XDP
> > > setup/teardown by increasing the frame headroom and reserving
> > > XDP_PACKET_HEADROOM bytes from the start. Since we always reserve
> > an
> > > entire page per buffer, this change only impacts Jumbo frame scenarios
> > > where the maximum linear frame size is reduced by 256 bytes. Multi
> > > buffer Scatter/Gather frames are now used instead in these scenarios.
> > >
> > > Allow XDP programs to access the entire buffer.
> > >
> > > The data in the received frame's headroom can be overwritten by the XDP
> > > program. Extract the relevant fields from the headroom while they are
> > > still available, before running the XDP program.
> > >
> > > Since the headroom might be resized before the frame is passed up to the
> > > stack, remove the check for a fixed headroom value when building an skb.
> > >
> > > Allow the meta data to be updated and pass the information up the stack.
> > >
> > > Scatter/Gather frames are dropped when XDP is enabled.
> > >
> > > Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
> > > Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
> > > ---
> > > Changes in v2:
> > > - warn only once if extracting the timestamp from a received frame fails
> > >
> > > Changes in v3:
> > > - drop received S/G frames when XDP is enabled
> > >
> > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 158
> > ++++++++++++++++++++++---
> > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.h |   2 +
> > >  2 files changed, 144 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > index 88533a2..102023c 100644
> > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > @@ -53,6 +53,8 @@
> > >  #include <linux/dma-mapping.h>
> > >  #include <linux/sort.h>
> > >  #include <linux/phy_fixed.h>
> > > +#include <linux/bpf.h>
> > > +#include <linux/bpf_trace.h>
> > >  #include <soc/fsl/bman.h>
> > >  #include <soc/fsl/qman.h>
> > >  #include "fman.h"
> > > @@ -177,7 +179,7 @@
> > >  #define DPAA_HWA_SIZE (DPAA_PARSE_RESULTS_SIZE +
> > DPAA_TIME_STAMP_SIZE \
> > >  		       + DPAA_HASH_RESULTS_SIZE)
> > >  #define DPAA_RX_PRIV_DATA_DEFAULT_SIZE
> > (DPAA_TX_PRIV_DATA_SIZE + \
> > > -					dpaa_rx_extra_headroom)
> > > +					XDP_PACKET_HEADROOM -
> > DPAA_HWA_SIZE)
> > >  #ifdef CONFIG_DPAA_ERRATUM_A050385
> > >  #define DPAA_RX_PRIV_DATA_A050385_SIZE (DPAA_A050385_ALIGN -
> > DPAA_HWA_SIZE)
> > >  #define DPAA_RX_PRIV_DATA_SIZE (fman_has_errata_a050385() ? \
> > > @@ -1733,7 +1735,6 @@ static struct sk_buff *contig_fd_to_skb(const
> > struct dpaa_priv *priv,
> > >  			SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> > >  	if (WARN_ONCE(!skb, "Build skb failure on Rx\n"))
> > >  		goto free_buffer;
> > > -	WARN_ON(fd_off != priv->rx_headroom);
> > >  	skb_reserve(skb, fd_off);
> > >  	skb_put(skb, qm_fd_get_length(fd));
> > >
> > > @@ -2349,12 +2350,62 @@ static enum qman_cb_dqrr_result
> > rx_error_dqrr(struct qman_portal *portal,
> > >  	return qman_cb_dqrr_consume;
> > >  }
> > >
> > > +static u32 dpaa_run_xdp(struct dpaa_priv *priv, struct qm_fd *fd, void
> > *vaddr,
> > > +			unsigned int *xdp_meta_len)
> > > +{
> > > +	ssize_t fd_off = qm_fd_get_offset(fd);
> > > +	struct bpf_prog *xdp_prog;
> > > +	struct xdp_buff xdp;
> > > +	u32 xdp_act;
> > > +
> > > +	rcu_read_lock();
> > > +
> > > +	xdp_prog = READ_ONCE(priv->xdp_prog);
> > > +	if (!xdp_prog) {
> > > +		rcu_read_unlock();
> > > +		return XDP_PASS;
> > > +	}
> > > +
> > > +	xdp.data = vaddr + fd_off;
> > 
> > I feel like a little drawing of xdp_buff layout would help me with
> > understanding what is going on over here :)
> 
> A region at the start of the buffer is reserved for storing hardware annotations and room for the XDP_PACKET_HEADROOM, before the actual data starts. So vaddr points to the start of the buffer, while fd offset provides the offset of the data inside the buffer. I don't feel that we are filling the xdp_buff in a majorly different way from other drivers, so please mention what is unclear here and I can provide more details.

Okay, so fd_off tells me where the frame starts, from vaddr to vaddr +
fd_off there might be some HW provided data, so you extract it and then
you are free to go with setting the data_hard_start?

> 
> > > +	xdp.data_meta = xdp.data;
> > > +	xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM;
> > > +	xdp.data_end = xdp.data + qm_fd_get_length(fd);
> > > +	xdp.frame_sz = DPAA_BP_RAW_SIZE;
> > 
> > Maybe you could fill xdp_buff outside of this function so that later on
> > you could set xdp.rxq once per napi?
> 
> I admit I haven't looked into exactly how much performance we would gain from this, but I don't think it would be enough to justify the code churn. We don't have a clean loop for processing the received frames like I see the Intel and ENA drivers have.
> 
> > > +
> > > +	xdp_act = bpf_prog_run_xdp(xdp_prog, &xdp);
> > > +
> > > +	/* Update the length and the offset of the FD */
> > > +	qm_fd_set_contig(fd, xdp.data - vaddr, xdp.data_end - xdp.data);
> > > +
> > > +	switch (xdp_act) {
> > > +	case XDP_PASS:
> > > +		*xdp_meta_len = xdp.data - xdp.data_meta;
> > > +		break;
> > > +	default:
> > > +		bpf_warn_invalid_xdp_action(xdp_act);
> > > +		fallthrough;
> > > +	case XDP_ABORTED:
> > > +		trace_xdp_exception(priv->net_dev, xdp_prog, xdp_act);
> > > +		fallthrough;
> > > +	case XDP_DROP:
> > > +		/* Free the buffer */
> > > +		free_pages((unsigned long)vaddr, 0);
> > > +		break;
> > > +	}
> > > +
> > > +	rcu_read_unlock();
> > > +
> > > +	return xdp_act;
> > > +}
> > > +
> > >  static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal
> > *portal,
> > >  						struct qman_fq *fq,
> > >  						const struct qm_dqrr_entry
> > *dq,
> > >  						bool sched_napi)
> > >  {
> > > +	bool ts_valid = false, hash_valid = false;
> > >  	struct skb_shared_hwtstamps *shhwtstamps;
> > > +	unsigned int skb_len, xdp_meta_len = 0;
> > >  	struct rtnl_link_stats64 *percpu_stats;
> > >  	struct dpaa_percpu_priv *percpu_priv;
> > >  	const struct qm_fd *fd = &dq->fd;
> > > @@ -2362,12 +2413,14 @@ static enum qman_cb_dqrr_result
> > rx_default_dqrr(struct qman_portal *portal,
> > >  	enum qm_fd_format fd_format;
> > >  	struct net_device *net_dev;
> > >  	u32 fd_status, hash_offset;
> > > +	struct qm_sg_entry *sgt;
> > >  	struct dpaa_bp *dpaa_bp;
> > >  	struct dpaa_priv *priv;
> > > -	unsigned int skb_len;
> > >  	struct sk_buff *skb;
> > >  	int *count_ptr;
> > > +	u32 xdp_act;
> > >  	void *vaddr;
> > > +	u32 hash;
> > >  	u64 ns;
> > >
> > >  	fd_status = be32_to_cpu(fd->status);
> > > @@ -2423,35 +2476,67 @@ static enum qman_cb_dqrr_result
> > rx_default_dqrr(struct qman_portal *portal,
> > >  	count_ptr = this_cpu_ptr(dpaa_bp->percpu_count);
> > >  	(*count_ptr)--;
> > >
> > > -	if (likely(fd_format == qm_fd_contig))
> > > +	/* Extract the timestamp stored in the headroom before running
> > XDP */
> > > +	if (priv->rx_tstamp) {
> > > +		if (!fman_port_get_tstamp(priv->mac_dev->port[RX], vaddr,
> > &ns))
> > > +			ts_valid = true;
> > > +		else
> > > +			WARN_ONCE(1, "fman_port_get_tstamp failed!\n");
> > > +	}
> > > +
> > > +	/* Extract the hash stored in the headroom before running XDP */
> > > +	if (net_dev->features & NETIF_F_RXHASH && priv->keygen_in_use
> > &&
> > > +	    !fman_port_get_hash_result_offset(priv->mac_dev->port[RX],
> > > +					      &hash_offset)) {
> > > +		hash = be32_to_cpu(*(u32 *)(vaddr + hash_offset));
> > > +		hash_valid = true;
> > > +	}
> > > +
> > > +	if (likely(fd_format == qm_fd_contig)) {
> > > +		xdp_act = dpaa_run_xdp(priv, (struct qm_fd *)fd, vaddr,
> > > +				       &xdp_meta_len);
> > > +		if (xdp_act != XDP_PASS) {
> > > +			percpu_stats->rx_packets++;
> > > +			percpu_stats->rx_bytes += qm_fd_get_length(fd);
> > > +			return qman_cb_dqrr_consume;
> > > +		}
> > >  		skb = contig_fd_to_skb(priv, fd);
> > > -	else
> > > +	} else {
> > > +		/* XDP doesn't support S/G frames. Return the fragments to
> > the
> > > +		 * buffer pool and release the SGT.
> > > +		 */
> > > +		if (READ_ONCE(priv->xdp_prog)) {
> > > +			WARN_ONCE(1, "S/G frames not supported under
> > XDP\n");
> > > +			sgt = vaddr + qm_fd_get_offset(fd);
> > > +			dpaa_release_sgt_members(sgt);
> > > +			free_pages((unsigned long)vaddr, 0);
> > > +			return qman_cb_dqrr_consume;
> > > +		}
> > >  		skb = sg_fd_to_skb(priv, fd);
> > > +	}
> > >  	if (!skb)
> > >  		return qman_cb_dqrr_consume;
> > >
> > > -	if (priv->rx_tstamp) {
> > > +	if (xdp_meta_len)
> > > +		skb_metadata_set(skb, xdp_meta_len);
> > 
> > This is working on a single buffer, right? So there's no need to clear
> > xdp_meta_len?
> 
> I don't think I understand what you mean. Are you saying I shouldn't be initializing xdp_meta_len to 0? This receive path is used when XDP is disabled as well, in which case we don't propagate the metadata.

What I meant was that if this function would operate on many buffers then
we would have to clear the xdp_meta_len, so that next buffers wouldn't get
the value from previous bufs, but I suppose that this rx_default_dqrr
callback is called once per each buffer.

> 
> > > +
> > > +	/* Set the previously extracted timestamp */
> > > +	if (ts_valid) {
> > >  		shhwtstamps = skb_hwtstamps(skb);
> > >  		memset(shhwtstamps, 0, sizeof(*shhwtstamps));
> > > -
> > > -		if (!fman_port_get_tstamp(priv->mac_dev->port[RX], vaddr,
> > &ns))
> > > -			shhwtstamps->hwtstamp = ns_to_ktime(ns);
> > > -		else
> > > -			dev_warn(net_dev->dev.parent,
> > "fman_port_get_tstamp failed!\n");
> > > +		shhwtstamps->hwtstamp = ns_to_ktime(ns);
> > >  	}
> > >
> > >  	skb->protocol = eth_type_trans(skb, net_dev);
> > >
> > > -	if (net_dev->features & NETIF_F_RXHASH && priv->keygen_in_use
> > &&
> > > -	    !fman_port_get_hash_result_offset(priv->mac_dev->port[RX],
> > > -					      &hash_offset)) {
> > > +	/* Set the previously extracted hash */
> > > +	if (hash_valid) {
> > >  		enum pkt_hash_types type;
> > >
> > >  		/* if L4 exists, it was used in the hash generation */
> > >  		type = be32_to_cpu(fd->status) & FM_FD_STAT_L4CV ?
> > >  			PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3;
> > > -		skb_set_hash(skb, be32_to_cpu(*(u32 *)(vaddr +
> > hash_offset)),
> > > -			     type);
> > > +		skb_set_hash(skb, hash, type);
> > >  	}
> > >
> > >  	skb_len = skb->len;
> > > @@ -2671,6 +2756,46 @@ static int dpaa_eth_stop(struct net_device
> > *net_dev)
> > >  	return err;
> > >  }

^ permalink raw reply

* Re: [PATCH net-next 3/3] dt-bindings: net: dsa: add bindings for xrs700x switches
From: Florian Fainelli @ 2020-11-23 22:15 UTC (permalink / raw)
  To: George McCollister, Andrew Lunn, Vivien Didelot, Vladimir Oltean
  Cc: David S . Miller, netdev, devicetree
In-Reply-To: <20201120181627.21382-4-george.mccollister@gmail.com>



On 11/20/2020 10:16 AM, George McCollister wrote:
> Add documentation and an example for Arrow SpeedChips XRS7000 Series
> single chip Ethernet switches.
> 
> Signed-off-by: George McCollister <george.mccollister@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply

* Re: [PATCH v5 7/9] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
From: Frederic Weisbecker @ 2020-11-23 22:13 UTC (permalink / raw)
  To: Alex Belits
  Cc: nitesh@redhat.com, Prasun Kapoor, linux-api@vger.kernel.org,
	davem@davemloft.net, trix@redhat.com, mingo@kernel.org,
	catalin.marinas@arm.com, rostedt@goodmis.org,
	linux-kernel@vger.kernel.org, peterx@redhat.com,
	tglx@linutronix.de, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <76ed0b222d2f16fb5aebd144ac0222a7f3b87fa1.camel@marvell.com>

Hi Alex,

On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> For nohz_full CPUs the desirable behavior is to receive interrupts
> generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> obviously not desirable because it breaks isolation.
> 
> This patch adds check for it.
> 
> Signed-off-by: Yuri Norov <ynorov@marvell.com>
> [abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/time/tick-sched.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a213952541db..6c8679e200f0 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -20,6 +20,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched/stat.h>
>  #include <linux/sched/nohz.h>
> +#include <linux/isolation.h>
>  #include <linux/module.h>
>  #include <linux/irq_work.h>
>  #include <linux/posix-timers.h>
> @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
>   */
>  void tick_nohz_full_kick_cpu(int cpu)
>  {
> -	if (!tick_nohz_full_cpu(cpu))
> +	smp_rmb();
> +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
>  		return;

Like I said in subsequent reviews, we are not going to ignore IPIs.
We must fix the sources of these IPIs instead.

Thanks.

>  
>  	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> -- 
> 2.20.1
> 

^ permalink raw reply

* Re: [PATCH net-next 00/15] net: phy: add support for shared interrupts (part 3)
From: Martin Blumenstingl @ 2020-11-23 22:13 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: netdev, linux-kernel
In-Reply-To: <20201123153817.1616814-1-ciorneiioana@gmail.com>

Hello Ioana,

On Mon, Nov 23, 2020 at 4:38 PM Ioana Ciornei <ciorneiioana@gmail.com> wrote:
[...]
> Ioana Ciornei (15):
>   net: phy: intel-xway: implement generic .handle_interrupt() callback
>   net: phy: intel-xway: remove the use of .ack_interrupt()
>   net: phy: icplus: implement generic .handle_interrupt() callback
>   net: phy: icplus: remove the use .ack_interrupt()
>   net: phy: meson-gxl: implement generic .handle_interrupt() callback
>   net: phy: meson-gxl: remove the use of .ack_callback()
I will check the six patches above on Saturday (due to me being very
busy with my daytime job)
if that's too late for the netdev maintainers then I'm not worried
about it. at first glance this looks fine to me. and we can always fix
things afterwards (but still before -rc1).


Best regards,
Martin

^ permalink raw reply

* Re: [PATCH net-next v4 2/5] net/lapb: support netdev events
From: Xie He @ 2020-11-23 22:09 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Martin Schiller, Andrew Hendry, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML
In-Reply-To: <20201123113622.115c474b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Mon, Nov 23, 2020 at 11:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> > >  From this point of view it will be the best to handle the NETDEV_UP in
> > > the lapb event handler and establish the link analog to the
> > > NETDEV_CHANGE event if the carrier is UP.
> >
> > Thanks! This way we can make sure LAPB would automatically connect in
> > all situations.
> >
> > Since we'll have a netif_carrier_ok check in NETDEV_UP handing, it
> > might make the code look prettier to also have a netif_carrier_ok
> > check in NETDEV_GOING_DOWN handing (for symmetry). Just a suggestion.
> > You can do whatever looks good to you :)
>
> Xie other than this the patches look good to you?
>
> Martin should I expect a respin to follow Xie's suggestion
> or should I apply v4?

There should be a respin because we need to handle the NETDEV_UP
event. The lapbether driver (and possibly some HDLC WAN drivers)
doesn't generate carrier events so we need to do auto-connect in the
NETDEV_UP event.

^ permalink raw reply

* Re: [PATCH net-next 2/3] net: dsa: add Arrow SpeedChips XRS700x driver
From: Andrew Lunn @ 2020-11-23 22:09 UTC (permalink / raw)
  To: George McCollister
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S . Miller, netdev, open list:OPEN FIRMWARE AND...
In-Reply-To: <CAFSKS=M-2rwM2UC58xf8n0ORuwxHq06BjLj7QP=JuU19-tCpGg@mail.gmail.com>

> > > https://www.flexibilis.com/downloads/xrs/SpeedChip_XRS7000_3000_User_Manual.pdf

Section 6.1.4

The forwarding decision is presented in Figure 19. Note that also
frames coming into a disabled port are received to the buffer memory,
but because their forwarding decision is not to forward them to any
port, they are dropped.  This behavior however can be changed, and
frames can be forwarded from disabled ports to other ports by using
Inbound Policy (see Chapter 6.1.5).

Sounds promising. And Section 6.1.5:

Inbound Policy checks the source and the destination MAC addresses of
all the received frames. The user can configure through the register
interface what kind of a treatment should frames with certain source
or destination MAC addresses get. Many protocols use protocol specific
multicast MAC addresses and the destination MAC address can therefore
be used for forwarding those frames to CPU port and not to other
ports.

Looking at table 36, i think you can add a match for the BPDU
destination MAC address, and have i forwarded to the CPU port.

Looks like you can add 15 such filters. So you might want to think
about how you want to use these, what other special packets do you
want to allow through?

	    Andrew

^ permalink raw reply

* [PATCH net-next] net: sfp: add debugfs support
From: Russell King @ 2020-11-23 22:06 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit; +Cc: David S. Miller, netdev, Jakub Kicinski

Add debugfs support to SFP so that the internal state of the SFP state
machines and hardware signal state can be viewed from userspace, rather
than having to compile a debug kernel to view state state transitions
in the kernel log.  The 'state' output looks like:

Module state: empty
Module probe attempts: 0 0
Device state: up
Main state: down
Fault recovery remaining retries: 5
PHY probe remaining retries: 12
moddef0: 0
rx_los: 1
tx_fault: 1
tx_disable: 1

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
This is useful to view the state of the SFP cage when e.g., debugging
a module that the link doesn't seem to be coming up for. Rather than
sending this patch each time there is a query, it seems sensible to
merge it into mainline instead.

 drivers/net/phy/sfp.c | 55 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 1e347afa951e..2c32aa891f17 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/acpi.h>
 #include <linux/ctype.h>
+#include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/gpio/consumer.h>
 #include <linux/hwmon.h>
@@ -258,6 +259,9 @@ struct sfp {
 	char *hwmon_name;
 #endif
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	struct dentry *debugfs_dir;
+#endif
 };
 
 static bool sff_module_supported(const struct sfp_eeprom_id *id)
@@ -1390,6 +1394,54 @@ static void sfp_module_tx_enable(struct sfp *sfp)
 	sfp_set_state(sfp, sfp->state);
 }
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static int sfp_debug_state_show(struct seq_file *s, void *data)
+{
+	struct sfp *sfp = s->private;
+
+	seq_printf(s, "Module state: %s\n",
+		   mod_state_to_str(sfp->sm_mod_state));
+	seq_printf(s, "Module probe attempts: %d %d\n",
+		   R_PROBE_RETRY_INIT - sfp->sm_mod_tries_init,
+		   R_PROBE_RETRY_SLOW - sfp->sm_mod_tries);
+	seq_printf(s, "Device state: %s\n",
+		   dev_state_to_str(sfp->sm_dev_state));
+	seq_printf(s, "Main state: %s\n",
+		   sm_state_to_str(sfp->sm_state));
+	seq_printf(s, "Fault recovery remaining retries: %d\n",
+		   sfp->sm_fault_retries);
+	seq_printf(s, "PHY probe remaining retries: %d\n",
+		   sfp->sm_phy_retries);
+	seq_printf(s, "moddef0: %d\n", !!(sfp->state & SFP_F_PRESENT));
+	seq_printf(s, "rx_los: %d\n", !!(sfp->state & SFP_F_LOS));
+	seq_printf(s, "tx_fault: %d\n", !!(sfp->state & SFP_F_TX_FAULT));
+	seq_printf(s, "tx_disable: %d\n", !!(sfp->state & SFP_F_TX_DISABLE));
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(sfp_debug_state);
+
+static void sfp_debugfs_init(struct sfp *sfp)
+{
+	sfp->debugfs_dir = debugfs_create_dir(dev_name(sfp->dev), NULL);
+
+	debugfs_create_file("state", 0600, sfp->debugfs_dir, sfp,
+			    &sfp_debug_state_fops);
+}
+
+static void sfp_debugfs_exit(struct sfp *sfp)
+{
+	debugfs_remove_recursive(sfp->debugfs_dir);
+}
+#else
+static void sfp_debugfs_init(struct sfp *sfp)
+{
+}
+
+static void sfp_debugfs_exit(struct sfp *sfp)
+{
+}
+#endif
+
 static void sfp_module_tx_fault_reset(struct sfp *sfp)
 {
 	unsigned int state = sfp->state;
@@ -2472,6 +2524,8 @@ static int sfp_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 
+	sfp_debugfs_init(sfp);
+
 	return 0;
 }
 
@@ -2479,6 +2533,7 @@ static int sfp_remove(struct platform_device *pdev)
 {
 	struct sfp *sfp = platform_get_drvdata(pdev);
 
+	sfp_debugfs_exit(sfp);
 	sfp_unregister_socket(sfp->sfp_bus);
 
 	rtnl_lock();
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net-next] bridge: mrp: Implement LC mode for MRP
From: Jakub Kicinski @ 2020-11-23 22:05 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Horatiu Vultur, roopa, davem, linux-kernel, bridge, netdev
In-Reply-To: <13cef7c2-cacc-2c24-c0d5-e462b0e3b4df@nvidia.com>

On Mon, 23 Nov 2020 16:25:53 +0200 Nikolay Aleksandrov wrote:
> >>> @@ -156,4 +157,10 @@ struct br_mrp_in_link_hdr {
> >>>       __be16 interval;
> >>>  };
> >>>
> >>> +struct br_mrp_in_link_status_hdr {
> >>> +     __u8 sa[ETH_ALEN];
> >>> +     __be16 port_role;
> >>> +     __be16 id;
> >>> +};
> >>> +  
> >>
> >> I didn't see this struct used anywhere, am I missing anything?  
> > 
> > Yes, you are right, the struct is not used any. But I put it there as I
> > put the other frame types for MRP.
> >   
> 
> I see, we don't usually add unused code. The patch is fine as-is and since
> this is already the case for other MRP parts I'm not strictly against it, so:
> 
> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
> 
> If Jakub decides to adhere to that rule you can keep my acked-by and just remove
> the struct for v2.

Yes, good catch, let's drop it, we don't want to make too much of 
a precedent for using kernel uAPI headers as a place to provide
protocol-related structs if the kernel doesn't need them.

The existing structs are only present in net-next as well, so if you
don't mind Horatiu it'd be good to follow up and remove the unused ones
and move the ones (if any) which are only used by the kernel but not by
the user space <-> kernel API communication out of include/uapi.

^ permalink raw reply

* Re: [PATCH net-next 2/3] mlxsw: spectrum_ptp: use PTP wide message type definitions
From: Jakub Kicinski @ 2020-11-23 22:03 UTC (permalink / raw)
  To: Christian Eggers
  Cc: Ido Schimmel, Richard Cochran, Andrew Lunn, Heiner Kallweit,
	Vladimir Oltean, Russell King, David S . Miller, netdev,
	linux-kernel, Petr Machata, Jiri Pirko, Ido Schimmel
In-Reply-To: <20201122200154.GA668367@shredder.lan>

On Sun, 22 Nov 2020 22:01:54 +0200 Ido Schimmel wrote:
> > > I don't know what are Jakub's preferences, but had this happened on our
> > > internal patchwork instance, I would just ask the author to submit
> > > another version with all the patches.  
> > Please let me know how I shall proceed...  
> 
> Jakub has the final say, so I assume he will comment on that.

The pre-requisite was just merged, so please collect all the review tags
you got so far and repost.

^ permalink raw reply

* Re: [PATCH net-next v3 0/3] net: ptp: introduce common defines for PTP message types
From: Jakub Kicinski @ 2020-11-23 22:02 UTC (permalink / raw)
  To: Christian Eggers
  Cc: Richard Cochran, Andrew Lunn, Heiner Kallweit, Vladimir Oltean,
	Russell King, David S . Miller, netdev, linux-kernel
In-Reply-To: <20201120084106.10046-1-ceggers@arri.de>

On Fri, 20 Nov 2020 09:41:03 +0100 Christian Eggers wrote:
> This series introduces commen defines for PTP event messages. Driver
> internal defines are removed and some uses of magic numbers are replaced
> by the new defines.

Applied, thanks!

^ permalink raw reply

* Re: [PATCH v5 3/9] task_isolation: userspace hard isolation from kernel
From: Thomas Gleixner @ 2020-11-23 22:01 UTC (permalink / raw)
  To: Alex Belits, nitesh@redhat.com, frederic@kernel.org
  Cc: Prasun Kapoor, linux-api@vger.kernel.org, davem@davemloft.net,
	trix@redhat.com, mingo@kernel.org, catalin.marinas@arm.com,
	rostedt@goodmis.org, linux-kernel@vger.kernel.org,
	peterx@redhat.com, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <5d882681867ed43636e22d265d61afbbac1b5a62.camel@marvell.com>

Alex,

On Mon, Nov 23 2020 at 17:56, Alex Belits wrote:
>  .../admin-guide/kernel-parameters.txt         |   6 +
>  drivers/base/cpu.c                            |  23 +
>  include/linux/hrtimer.h                       |   4 +
>  include/linux/isolation.h                     | 326 ++++++++
>  include/linux/sched.h                         |   5 +
>  include/linux/tick.h                          |   3 +
>  include/uapi/linux/prctl.h                    |   6 +
>  init/Kconfig                                  |  27 +
>  kernel/Makefile                               |   2 +
>  kernel/isolation.c                            | 714 ++++++++++++++++++
>  kernel/signal.c                               |   2 +
>  kernel/sys.c                                  |   6 +
>  kernel/time/hrtimer.c                         |  27 +
>  kernel/time/tick-sched.c                      |  18 +

I asked you before to split this up into bits and pieces and argue and
justify each change. Throwing this wholesale over the fence is going
nowhere. It's not revieable at all.

Aside of that ignoring review comments is a sure path to make yourself
ignored:

> +/*
> + * Logging
> + */
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
> +
> +#define pr_task_isol_emerg(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)

The comments various people made about that are not going away and none
of this is going near anything I'm responsible for unless you provide
these independent of the rest and with a reasonable justification why
you can't use any other existing mechanism or extend it for your use
case.

Thanks,

        tglx

^ permalink raw reply

* Re: [PATCH net-next 2/3] net: dsa: add Arrow SpeedChips XRS700x driver
From: Andrew Lunn @ 2020-11-23 21:55 UTC (permalink / raw)
  To: George McCollister
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S . Miller, netdev, open list:OPEN FIRMWARE AND...
In-Reply-To: <CAFSKS=M-2rwM2UC58xf8n0ORuwxHq06BjLj7QP=JuU19-tCpGg@mail.gmail.com>

> > If it cannot send/receive BPDUs, it might get into an oscillating
> > state. They see each other via BPDUs, decide there is a loop, and
> > block a port. The BPDUs stop, they think the loop has been broken and
> > so unblock. They see each other via BPUS, decide there is a loop,...
> 
> Yeah, this is messed up. The switch doesn't seem to pass up BPDUs in
> either disabled or learning mode, only forward mode.
> Can I just replace .port_stp_state_set with .port_enable (setting
> switch port to forward mode) and .port_disable (setting switch port to
> disabled mode)? I don't see any other way around this. It looks like
> rtl8366rb.c also has no .port_stp_state_set.

Do you have access to a 'vendor crap driver'? Anything about STP in
it? Maybe you need to add special entries to its forwarding database
for the BPDU destination MAC address?

If you cannot get BPDUs to be passed, i think that means you cannot
offload bridging to the switch. So you also need to remove bridge join
and bridge leave. I've no idea if you can still do HSR under those
conditions!

	Andrew

^ permalink raw reply

* [PATCH net-next] net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 workaround
From: Russell King @ 2020-11-23 21:53 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit; +Cc: David S. Miller, netdev, Jakub Kicinski

Add a workaround for the VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0
GPON module which the manufacturer states needs single byte I2C reads
to the EEPROM.

Reported-by: Thomas Schreiber <tschreibe@gmail.com>
Tested-by: Thomas Schreiber <tschreibe@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 45 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index debf91412a72..1e347afa951e 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -219,6 +219,7 @@ struct sfp {
 	struct sfp_bus *sfp_bus;
 	struct phy_device *mod_phy;
 	const struct sff_data *type;
+	size_t i2c_block_size;
 	u32 max_power_mW;
 
 	unsigned int (*get_state)(struct sfp *);
@@ -335,10 +336,19 @@ static int sfp_i2c_read(struct sfp *sfp, bool a2, u8 dev_addr, void *buf,
 			size_t len)
 {
 	struct i2c_msg msgs[2];
-	u8 bus_addr = a2 ? 0x51 : 0x50;
+	size_t block_size;
 	size_t this_len;
+	u8 bus_addr;
 	int ret;
 
+	if (a2) {
+		block_size = 16;
+		bus_addr = 0x51;
+	} else {
+		block_size = sfp->i2c_block_size;
+		bus_addr = 0x50;
+	}
+
 	msgs[0].addr = bus_addr;
 	msgs[0].flags = 0;
 	msgs[0].len = 1;
@@ -350,8 +360,8 @@ static int sfp_i2c_read(struct sfp *sfp, bool a2, u8 dev_addr, void *buf,
 
 	while (len) {
 		this_len = len;
-		if (this_len > 16)
-			this_len = 16;
+		if (this_len > sfp->i2c_block_size)
+			this_len = sfp->i2c_block_size;
 
 		msgs[1].len = this_len;
 
@@ -1673,14 +1683,20 @@ static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
 	u8 check;
 	int ret;
 
-	ret = sfp_read(sfp, false, 0, &id, sizeof(id));
+	/* Some modules (CarlitoxxPro CPGOS03-0490) do not support multibyte
+	 * reads from the EEPROM, so start by reading the base identifying
+	 * information one byte at a time.
+	 */
+	sfp->i2c_block_size = 1;
+
+	ret = sfp_read(sfp, false, 0, &id.base, sizeof(id.base));
 	if (ret < 0) {
 		if (report)
 			dev_err(sfp->dev, "failed to read EEPROM: %d\n", ret);
 		return -EAGAIN;
 	}
 
-	if (ret != sizeof(id)) {
+	if (ret != sizeof(id.base)) {
 		dev_err(sfp->dev, "EEPROM short read: %d\n", ret);
 		return -EAGAIN;
 	}
@@ -1719,6 +1735,25 @@ static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
 		}
 	}
 
+	/* Some modules (Nokia 3FE46541AA) lock up if byte 0x51 is read as a
+	 * single read. Switch back to reading 16 byte blocks unless we have
+	 * a CarlitoxxPro module (rebranded VSOL V2801F).
+	 */
+	if (memcmp(id.base.vendor_name, "VSOL            ", 16))
+		sfp->i2c_block_size = 16;
+
+	ret = sfp_read(sfp, false, SFP_CC_BASE + 1, &id.ext, sizeof(id.ext));
+	if (ret < 0) {
+		if (report)
+			dev_err(sfp->dev, "failed to read EEPROM: %d\n", ret);
+		return -EAGAIN;
+	}
+
+	if (ret != sizeof(id.ext)) {
+		dev_err(sfp->dev, "EEPROM short read: %d\n", ret);
+		return -EAGAIN;
+	}
+
 	check = sfp_check(&id.ext, sizeof(id.ext) - 1);
 	if (check != id.ext.cc_ext) {
 		if (cotsworks) {
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net-next 00/13] Add mlx5 subfunction support
From: Saeed Mahameed @ 2020-11-23 21:51 UTC (permalink / raw)
  To: Samudrala, Sridhar, Alexander Duyck, Jakub Kicinski
  Cc: David Ahern, Jason Gunthorpe, Parav Pandit,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	gregkh@linuxfoundation.org, Jiri Pirko, dledford@redhat.com,
	Leon Romanovsky, davem@davemloft.net
In-Reply-To: <102d20e1-c78f-09cb-fabb-efdc59f61eb8@intel.com>

On Fri, 2020-11-20 at 11:04 -0800, Samudrala, Sridhar wrote:
> 
> On 11/20/2020 9:58 AM, Alexander Duyck wrote:
> > On Thu, Nov 19, 2020 at 5:29 PM Jakub Kicinski <kuba@kernel.org>
> > wrote:
> > > On Wed, 18 Nov 2020 21:35:29 -0700 David Ahern wrote:
> > > > On 11/18/20 7:14 PM, Jakub Kicinski wrote:
> > > > > On Tue, 17 Nov 2020 14:49:54 -0400 Jason Gunthorpe wrote:
> > > > > > On Tue, Nov 17, 2020 at 09:11:20AM -0800, Jakub Kicinski
> > > > > > wrote:
> > > > > > 
> > > > > > > > Just to refresh all our memory, we discussed and
> > > > > > > > settled on the flow
> > > > > > > > in [2]; RFC [1] followed this discussion.
> > > > > > > > 
> > > > > > > > vdpa tool of [3] can add one or more vdpa device(s) on
> > > > > > > > top of already
> > > > > > > > spawned PF, VF, SF device.
> > > > > > > 
> > > > > > > Nack for the networking part of that. It'd basically be
> > > > > > > VMDq.
> > > > > > 
> > > > > > What are you NAK'ing?
> > > > > 
> > > > > Spawning multiple netdevs from one device by slicing up its
> > > > > queues.
> > > > 
> > > > Why do you object to that? Slicing up h/w resources for virtual
> > > > what
> > > > ever has been common practice for a long time.
> > > 
> > > My memory of the VMDq debate is hazy, let me rope in Alex into
> > > this.
> > > I believe the argument was that we should offload software
> > > constructs,
> > > not create HW-specific APIs which depend on HW availability and
> > > implementation. So the path we took was offloading macvlan.
> > 
> > I think it somewhat depends on the type of interface we are talking
> > about. What we were wanting to avoid was drivers spawning their own
> > unique VMDq netdevs and each having a different way of doing it. 

Agreed, but SF netdevs are not a VMDq netdevs, they are avaiable in the
switchdev model where they correspond to a full blown port (HW domain).

> > The
> > approach Intel went with was to use a MACVLAN offload to approach
> > it.
> > Although I would imagine many would argue the approach is somewhat
> > dated and limiting since you cannot do many offloads on a MACVLAN
> > interface.
> 
> Yes. We talked about this at netdev 0x14 and the limitations of
> macvlan 
> based offloads.
> https://netdevconf.info/0x14/session.html?talk-hardware-acceleration-of-container-networking-interfaces
> 
> Subfunction seems to be a good model to expose VMDq VSI or SIOV ADI
> as a 

Exactly, Subfunctions is the most generic model to overcome any SW
model limitations e.g macvtap offload, all HW vendors are already
creating netdevs on a given PF/VF .. all we need is to model the SF and
all the rest is the same! most likely every thing else comes for free
like in the mlx5 model where the netdev/rmda interfaces are abstracted
from the underlying HW, same netdev loads on a PF/VF/SF or even an
embedded function !


> netdev for kernel containers. AF_XDP ZC in a container is one of the 
> usecase this would address. Today we have to pass the entire PF/VF to
> a 
> container to do AF_XDP.
> 

this will be supported out of the box for free with SFs.

> Looks like the current model is to create a subfunction of a
> specific 
> type on auxiliary bus, do some configuration to assign resources and 
> then activate the subfunction.
> 
> > With the VDPA case I believe there is a set of predefined virtio
> > devices that are being emulated and presented so it isn't as if
> > they
> > are creating a totally new interface for this.
> > 
> > What I would be interested in seeing is if there are any other
> > vendors
> > that have reviewed this and sign off on this approach. What we
> > don't
> > want to see is Nivida/Mellanox do this one way, then Broadcom or
> > Intel
> > come along later and have yet another way of doing this. We need an
> > interface and feature set that will work for everyone in terms of
> > how
> > this will look going forward.

Well, the vdpa interface was created by the virtio community and
especially redhat, i am not sure mellanox were even involved in the
initial development stages :-)

anyway historically speaking vDPA was originally created for DPDK, but
same API applies to device drivers who can deliver the same set of
queues and API while bypassing the whole DPDK stack, enters Kernel vDPA
which was created to overcome some of the userspace limitations and
complexity and to leverage some of the kernel great feature such as
eBPF.

https://www.redhat.com/en/blog/introduction-vdpa-kernel-framework






^ permalink raw reply

* Re: [PATCH v5 2/9] task_isolation: vmstat: add vmstat_idle function
From: Thomas Gleixner @ 2020-11-23 21:49 UTC (permalink / raw)
  To: Alex Belits, nitesh@redhat.com, frederic@kernel.org
  Cc: Prasun Kapoor, linux-api@vger.kernel.org, davem@davemloft.net,
	trix@redhat.com, mingo@kernel.org, catalin.marinas@arm.com,
	rostedt@goodmis.org, linux-kernel@vger.kernel.org,
	peterx@redhat.com, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <6ac7143e5038614e3950636456cef67b5bc0c9e4.camel@marvell.com>

On Mon, Nov 23 2020 at 17:56, Alex Belits wrote:
> This function checks to see if a vmstat worker is not running,
> and the vmstat diffs don't require an update.  The function is
> called from the task-isolation code to see if we need to
> actually do some work to quiet vmstat.

A changelog has to explain _WHY_ this change is necessary and not _WHAT_
the patch is doing.

Thanks,

        tglx

^ permalink raw reply

* Re: [PATCH v5 1/9] task_isolation: vmstat: add quiet_vmstat_sync function
From: Thomas Gleixner @ 2020-11-23 21:48 UTC (permalink / raw)
  To: Alex Belits, nitesh@redhat.com, frederic@kernel.org
  Cc: Prasun Kapoor, linux-api@vger.kernel.org, davem@davemloft.net,
	trix@redhat.com, mingo@kernel.org, catalin.marinas@arm.com,
	rostedt@goodmis.org, linux-kernel@vger.kernel.org,
	peterx@redhat.com, linux-arch@vger.kernel.org,
	mtosatti@redhat.com, will@kernel.org, peterz@infradead.org,
	leon@sidebranch.com, linux-arm-kernel@lists.infradead.org,
	pauld@redhat.com, netdev@vger.kernel.org
In-Reply-To: <0e07e5bf6f65dc89d263683c81b4a19bcc6d4b60.camel@marvell.com>

Alex,

On Mon, Nov 23 2020 at 17:56, Alex Belits wrote:

why are you insisting on adding 'task_isolation: ' as prefix to every
single patch? That's wrong as I explained before.

The prefix denotes the affected subsystem and 'task_isolation' is _NOT_
a subsystem. It's the project name you are using but the affected code
belongs to the memory management subsystem and if you run

 git log mm/vmstat.c

you might get a hint what the proper prefix is, i.e. 'mm/vmstat: '

> In commit f01f17d3705b ("mm, vmstat: make quiet_vmstat lighter")
> the quiet_vmstat() function became asynchronous, in the sense that
> the vmstat work was still scheduled to run on the core when the
> function returned.  For task isolation, we need a synchronous

This changelog is useless because how should someone not familiar with
the term 'task isolation' figure out what that means?

It's not the reviewers job to figure that out. Again: Go read and adhere
to Documentation/process/*

Aside of that your patches are CR/LF inflicted. Please fix your work
flow and tools.

Thanks,

        tglx




^ permalink raw reply

* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
From: drt @ 2020-11-23 21:46 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Lijun Pan, netdev, sukadev, drt
In-Reply-To: <20201123114328.00c0b933@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On 2020-11-23 11:43, Jakub Kicinski wrote:
> On Sun, 22 Nov 2020 07:12:38 -0800 drt wrote:
>> On 2020-11-21 15:36, Jakub Kicinski wrote:
>> > On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:
>> >> From: Dany Madden <drt@linux.ibm.com>
>> >>
>> >> Currently ibmvnic does not support the disable vnic command from the
>> >> Hardware Management Console. This patch enables ibmvnic to process
>> >> CRQ message 0x07, disable vnic adapter.
>> >
>> > What user-visible problem does this one solve?
>> This allows HMC to disconnect a Linux client from the network if the
>> vNIC adapter is misbehaving and/or sending malicious traffic. The 
>> effect
>> is the same as when a sysadmin sets a link down (ibmvnic_close()) on 
>> the
>> Linux client. This patch extends this ability to the HMC.
> 
> Okay, sounds to me like net-next material, then.
> 
> IIUC we don't need to fix this ASAP and backport to stable.

Yes, I will submit v2 net-next. Thank you.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox