Netdev List
 help / color / mirror / Atom feed
* Re: kernel BUG at net/rxrpc/local_object.c:LINE!
From: David Howells @ 2019-07-31 14:30 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: dhowells, syzbot, Eric Biggers, David Miller, linux-afs, LKML,
	netdev, syzkaller-bugs
In-Reply-To: <CACT4Y+YjdV8CqX5=PzKsHnLsJOzsydqiq3igYDm_=nSdmFo2YQ@mail.gmail.com>

Dmitry Vyukov <dvyukov@google.com> wrote:

> Re bisection, I don't know if there are some more subtle things as
> play (you are in the better position to judge that), but bisection log
> looks good, it tracked the target crash throughout and wasn't
> distracted by any unrelated bugs, etc. So I don't see any obvious
> reasons to not trust it.

Can you turn on:

	echo 1 > /sys/kernel/debug/tracing/events/rxrpc/rxrpc_local/enable

and capture the trace log at the point it crashes?

David

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH bpf-next v4 03/11] libbpf: add flags to umem config
From: Björn Töpel @ 2019-07-31 14:25 UTC (permalink / raw)
  To: Kevin Laatz
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, Karlsson, Magnus, Jakub Kicinski,
	Jonathan Lemon, Saeed Mahameed, Maxim Mikityanskiy,
	Stephen Hemminger, Bruce Richardson, ciara.loftus,
	intel-wired-lan, bpf
In-Reply-To: <20190730085400.10376-4-kevin.laatz@intel.com>

On Tue, 30 Jul 2019 at 19:43, Kevin Laatz <kevin.laatz@intel.com> wrote:
>
> This patch adds a 'flags' field to the umem_config and umem_reg structs.
> This will allow for more options to be added for configuring umems.
>
> The first use for the flags field is to add a flag for unaligned chunks
> mode. These flags can either be user-provided or filled with a default.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>
> ---
> v2:
>   - Removed the headroom check from this patch. It has moved to the
>     previous patch.
>
> v4:
>   - modified chunk flag define
> ---
>  tools/include/uapi/linux/if_xdp.h | 9 +++++++--
>  tools/lib/bpf/xsk.c               | 3 +++
>  tools/lib/bpf/xsk.h               | 2 ++
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
> index faaa5ca2a117..a691802d7915 100644
> --- a/tools/include/uapi/linux/if_xdp.h
> +++ b/tools/include/uapi/linux/if_xdp.h
> @@ -17,6 +17,10 @@
>  #define XDP_COPY       (1 << 1) /* Force copy-mode */
>  #define XDP_ZEROCOPY   (1 << 2) /* Force zero-copy mode */
>
> +/* Flags for xsk_umem_config flags */
> +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG_SHIFT 15
> +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << XDP_UMEM_UNALIGNED_CHUNK_FLAG_SHIFT)
> +
>  struct sockaddr_xdp {
>         __u16 sxdp_family;
>         __u16 sxdp_flags;
> @@ -49,8 +53,9 @@ struct xdp_mmap_offsets {
>  #define XDP_OPTIONS                    8
>
>  struct xdp_umem_reg {
> -       __u64 addr; /* Start of packet data area */
> -       __u64 len; /* Length of packet data area */
> +       __u64 addr;     /* Start of packet data area */
> +       __u64 len:48;   /* Length of packet data area */
> +       __u64 flags:16; /* Flags for umem */
>         __u32 chunk_size;
>         __u32 headroom;
>  };
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 5007b5d4fd2c..5e7e4d420ee0 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -116,6 +116,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
>                 cfg->comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>                 cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
>                 cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM;
> +               cfg->flags = XSK_UMEM__DEFAULT_FLAGS;
>                 return;
>         }
>
> @@ -123,6 +124,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
>         cfg->comp_size = usr_cfg->comp_size;
>         cfg->frame_size = usr_cfg->frame_size;
>         cfg->frame_headroom = usr_cfg->frame_headroom;
> +       cfg->flags = usr_cfg->flags;
>  }
>
>  static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
> @@ -182,6 +184,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
>         mr.len = size;
>         mr.chunk_size = umem->config.frame_size;
>         mr.headroom = umem->config.frame_headroom;
> +       mr.flags = umem->config.flags;
>
>         err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr));
>         if (err) {
> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> index 833a6e60d065..44a03d8c34b9 100644
> --- a/tools/lib/bpf/xsk.h
> +++ b/tools/lib/bpf/xsk.h
> @@ -170,12 +170,14 @@ LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
>  #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
>  #define XSK_UMEM__DEFAULT_FRAME_SIZE     (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
>  #define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
> +#define XSK_UMEM__DEFAULT_FLAGS 0
>
>  struct xsk_umem_config {
>         __u32 fill_size;
>         __u32 comp_size;
>         __u32 frame_size;
>         __u32 frame_headroom;
> +       __u32 flags;

And the flags addition here, unfortunately, requires symbol versioning
of xsk_umem__create(). That'll be the first in libbpf! :-)


Björn

>  };
>
>  /* Flags for the libbpf_flags field. */
> --
> 2.17.1
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply

* Re: [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: add PTP support for MV88E6250 family
From: Richard Cochran @ 2019-07-31 14:06 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, David S. Miller, Rasmus Villemoes
In-Reply-To: <20190731082351.3157-7-h.feurstein@gmail.com>

On Wed, Jul 31, 2019 at 10:23:51AM +0200, Hubert Feurstein wrote:
> This adds PTP support for the MV88E6250 family.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c |  4 ++
>  drivers/net/dsa/mv88e6xxx/chip.h |  4 ++
>  drivers/net/dsa/mv88e6xxx/ptp.c  | 79 +++++++++++++++++++++++++++-----
>  drivers/net/dsa/mv88e6xxx/ptp.h  |  2 +
>  4 files changed, 78 insertions(+), 11 deletions(-)

Acked-by: Richard Cochran <richardcochran@gmail.com>

^ permalink raw reply

* Re: KASAN: use-after-free Read in release_sock
From: syzbot @ 2019-07-31 13:58 UTC (permalink / raw)
  To: davem, linux-hams, linux-kernel, netdev, ralf, syzkaller-bugs,
	xiyou.wangcong
In-Reply-To: <000000000000de000a058e9025db@google.com>

syzbot has bisected this bug to:

commit c8c8218ec5af5d2598381883acbefbf604e56b5e
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Thu Jun 27 21:30:58 2019 +0000

     netrom: fix a memory leak in nr_rx_frame()

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13b52242600000
start commit:   629f8205 Merge tag 'for-linus-20190730' of git://git.kerne..
git tree:       upstream
final crash:    https://syzkaller.appspot.com/x/report.txt?x=10752242600000
console output: https://syzkaller.appspot.com/x/log.txt?x=17b52242600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4c7b914a2680c9c6
dashboard link: https://syzkaller.appspot.com/bug?extid=107429d62fb1d293602f
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16a7c8dc600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1498cfa2600000

Reported-by: syzbot+107429d62fb1d293602f@syzkaller.appspotmail.com
Fixes: c8c8218ec5af ("netrom: fix a memory leak in nr_rx_frame()")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
From: Boris Pismenny @ 2019-07-31 13:57 UTC (permalink / raw)
  To: Jakub Kicinski, davem@davemloft.net
  Cc: netdev@vger.kernel.org, oss-drivers@netronome.com,
	edumazet@google.com, Aviad Yehezkel, davejwatson@fb.com,
	john.fastabend@gmail.com, daniel@iogearbox.net,
	willemb@google.com, xiyou.wangcong@gmail.com, fw@strlen.de,
	alexei.starovoitov@gmail.com
In-Reply-To: <20190730211258.13748-1-jakub.kicinski@netronome.com>

Hi Jacub,

On 7/31/2019 12:12 AM, Jakub Kicinski wrote:
> sk_validate_xmit_skb() and drivers depend on the sk member of
> struct sk_buff to identify segments requiring encryption.
> Any operation which removes or does not preserve the original TLS
> socket such as skb_orphan() or skb_clone() will cause clear text
> leaks.
>
> Make the TCP socket underlying an offloaded TLS connection
> mark all skbs as decrypted, if TLS TX is in offload mode.
> Then in sk_validate_xmit_skb() catch skbs which have no socket
> (or a socket with no validation) and decrypted flag set.
>
> Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> sk->sk_validate_xmit_skb are slightly interchangeable right now,
> they all imply TLS offload. The new checks are guarded by
> CONFIG_TLS_DEVICE because that's the option guarding the
> sk_buff->decrypted member.
>
> Second, smaller issue with orphaning is that it breaks
> the guarantee that packets will be delivered to device
> queues in-order. All TLS offload drivers depend on that
> scheduling property. This means skb_orphan_partial()'s
> trick of preserving partial socket references will cause
> issues in the drivers. We need a full orphan, and as a
> result netem delay/throttling will cause all TLS offload
> skbs to be dropped.
>
> Reusing the sk_buff->decrypted flag also protects from
> leaking clear text when incoming, decrypted skb is redirected
> (e.g. by TC).

Nice!

>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> I'm sending this for net-next because of lack of confidence
> in my own abilities. It should apply cleanly to net... :)
>
>   Documentation/networking/tls-offload.rst |  9 --------
>   include/net/sock.h                       | 28 +++++++++++++++++++++++-
>   net/core/skbuff.c                        |  3 +++
>   net/core/sock.c                          | 22 ++++++++++++++-----
>   net/ipv4/tcp.c                           |  4 +++-
>   net/ipv4/tcp_output.c                    |  3 +++
>   net/tls/tls_device.c                     |  2 ++
>   7 files changed, 55 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
> index 048e5ca44824..2bc3ab5515d8 100644
> --- a/Documentation/networking/tls-offload.rst
> +++ b/Documentation/networking/tls-offload.rst
> @@ -499,15 +499,6 @@ offloads, old connections will remain active after flags are cleared.
>   Known bugs
>   ==========
>   
> -skb_orphan() leaks clear text
> ------------------------------
> -
> -Currently drivers depend on the :c:member:`sk` member of
> -:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
> -encryption. Any operation which removes or does not preserve the socket
> -association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
> -will cause the driver to miss the packets and lead to clear text leaks.
> -
>   Redirects leak clear text
>   -------------------------
Doesn't this patch cover the redirect case as well?
>   
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 228db3998e46..90f3f552c789 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -814,6 +814,7 @@ enum sock_flags {
>   	SOCK_TXTIME,
>   	SOCK_XDP, /* XDP is attached */
>   	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
> +	SOCK_CRYPTO_TX_PLAIN_TEXT, /* Generate skbs with decrypted flag set */
>   };
>   
>   #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> @@ -2480,8 +2481,26 @@ static inline bool sk_fullsock(const struct sock *sk)
>   	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
>   }
>   
> +static inline void sk_set_tx_crypto(const struct sock *sk, struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	skb->decrypted = sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
> +#endif
> +}

Have you considered the following alternative to calling 
sk_set_tx_crypto() - Add a new MSG_DECRYPTE flag that will set the 
skb->decrypted bit in the do_tcp_sendpages function.

Then, the rest of the TCP code can propagate this bit from the existing skb.

> +
> +static inline bool sk_tx_crypto_match(const struct sock *sk,
> +				      const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	return sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT) == !!skb->decrypted;
> +#else
> +	return true;
> +#endif
> +}
> +
>   /* Checks if this SKB belongs to an HW offloaded socket
>    * and whether any SW fallbacks are required based on dev.
> + * Check decrypted mark in case skb_orphan() cleared socket.
>    */
>   static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
>   						   struct net_device *dev)
> @@ -2489,8 +2508,15 @@ static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
>   #ifdef CONFIG_SOCK_VALIDATE_XMIT
>   	struct sock *sk = skb->sk;
>   
> -	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
> +	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb) {
>   		skb = sk->sk_validate_xmit_skb(sk, dev, skb);
> +#ifdef CONFIG_TLS_DEVICE
> +	} else if (unlikely(skb->decrypted)) {
> +		pr_warn_ratelimited("unencrypted skb with no associated socket - dropping\n");
> +		kfree_skb(skb);
> +		skb = NULL;
> +#endif
> +	}
>   #endif
>   
>   	return skb;
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0b788df5a75b..9e92684479b9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3794,6 +3794,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>   
>   			skb_reserve(nskb, headroom);
>   			__skb_put(nskb, doffset);
> +#ifdef CONFIG_TLS_DEVICE
> +			nskb->decrypted = head_skb->decrypted;
> +#endif
>   		}
>   
>   		if (segs)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..b0c10b518e65 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
>   }
>   EXPORT_SYMBOL(skb_set_owner_w);
>   
> +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	/* Drivers depend on in-order delivery for crypto offload,
> +	 * partial orphan breaks out-of-order-OK logic.
> +	 */
> +	if (skb->decrypted)
> +		return false;
> +#endif
> +#ifdef CONFIG_INET
> +	if (skb->destructor == tcp_wfree)
> +		return true;
> +#endif
> +	return skb->destructor == sock_wfree;
> +}
> +
>   /* This helper is used by netem, as it can hold packets in its
>    * delay queue. We want to allow the owner socket to send more
>    * packets, as if they were already TX completed by a typical driver.
> @@ -2003,11 +2019,7 @@ void skb_orphan_partial(struct sk_buff *skb)
>   	if (skb_is_tcp_pure_ack(skb))
>   		return;
>   
> -	if (skb->destructor == sock_wfree
> -#ifdef CONFIG_INET
> -	    || skb->destructor == tcp_wfree
> -#endif
> -		) {
> +	if (can_skb_orphan_partial(skb)) {
>   		struct sock *sk = skb->sk;
>   
>   		if (refcount_inc_not_zero(&sk->sk_refcnt)) {
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index f62f0e7e3cdd..dee93eab02fe 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -984,6 +984,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>   			if (!skb)
>   				goto wait_for_memory;
>   
> +			sk_set_tx_crypto(sk, skb);
>   			skb_entail(sk, skb);
>   			copy = size_goal;
>   		}
> @@ -993,7 +994,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>   
>   		i = skb_shinfo(skb)->nr_frags;
>   		can_coalesce = skb_can_coalesce(skb, i, page, offset);
> -		if (!can_coalesce && i >= sysctl_max_skb_frags) {
> +		if ((!can_coalesce && i >= sysctl_max_skb_frags) ||
> +		    !sk_tx_crypto_match(sk, skb)) {

I see that sk_tx_crypto_match is called only here to handle cases where 
the socket expected crypto offload, while the skb is not marked 
decrypted. AFAIU, this should not be possible, because we set the 
skb->eor bit for the last plaintext skb before sending any traffic that 
expects crypto offload.



^ permalink raw reply

* [PATCH 2/2] cxgb4: smt: Use refcount_t for refcount
From: Chuhong Yuan @ 2019-07-31 13:55 UTC (permalink / raw)
  Cc: Vishal Kulkarni, David S . Miller, netdev, linux-kernel,
	Chuhong Yuan

refcount_t is better for reference counters since its
implementation can prevent overflows.
So convert atomic_t ref counters to refcount_t.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/chelsio/cxgb4/smt.c | 14 +++++++-------
 drivers/net/ethernet/chelsio/cxgb4/smt.h |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/smt.c b/drivers/net/ethernet/chelsio/cxgb4/smt.c
index eaf1fb74689c..b008d522bef6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/smt.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/smt.c
@@ -57,7 +57,7 @@ struct smt_data *t4_init_smt(void)
 		s->smtab[i].state = SMT_STATE_UNUSED;
 		memset(&s->smtab[i].src_mac, 0, ETH_ALEN);
 		spin_lock_init(&s->smtab[i].lock);
-		atomic_set(&s->smtab[i].refcnt, 0);
+		refcount_set(&s->smtab[i].refcnt, 0);
 	}
 	return s;
 }
@@ -68,7 +68,7 @@ static struct smt_entry *find_or_alloc_smte(struct smt_data *s, u8 *smac)
 	struct smt_entry *e, *end;
 
 	for (e = &s->smtab[0], end = &s->smtab[s->smt_size]; e != end; ++e) {
-		if (atomic_read(&e->refcnt) == 0) {
+		if (refcount_read(&e->refcnt) == 0) {
 			if (!first_free)
 				first_free = e;
 		} else {
@@ -98,7 +98,7 @@ static struct smt_entry *find_or_alloc_smte(struct smt_data *s, u8 *smac)
 static void t4_smte_free(struct smt_entry *e)
 {
 	spin_lock_bh(&e->lock);
-	if (atomic_read(&e->refcnt) == 0) {  /* hasn't been recycled */
+	if (refcount_read(&e->refcnt) == 0) {  /* hasn't been recycled */
 		e->state = SMT_STATE_UNUSED;
 	}
 	spin_unlock_bh(&e->lock);
@@ -111,7 +111,7 @@ static void t4_smte_free(struct smt_entry *e)
  */
 void cxgb4_smt_release(struct smt_entry *e)
 {
-	if (atomic_dec_and_test(&e->refcnt))
+	if (refcount_dec_and_test(&e->refcnt))
 		t4_smte_free(e);
 }
 EXPORT_SYMBOL(cxgb4_smt_release);
@@ -215,14 +215,14 @@ static struct smt_entry *t4_smt_alloc_switching(struct adapter *adap, u16 pfvf,
 	e = find_or_alloc_smte(s, smac);
 	if (e) {
 		spin_lock(&e->lock);
-		if (!atomic_read(&e->refcnt)) {
-			atomic_set(&e->refcnt, 1);
+		if (!refcount_read(&e->refcnt)) {
+			refcount_set(&e->refcnt, 1);
 			e->state = SMT_STATE_SWITCHING;
 			e->pfvf = pfvf;
 			memcpy(e->src_mac, smac, ETH_ALEN);
 			write_smt_entry(adap, e);
 		} else {
-			atomic_inc(&e->refcnt);
+			refcount_inc(&e->refcnt);
 		}
 		spin_unlock(&e->lock);
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/smt.h b/drivers/net/ethernet/chelsio/cxgb4/smt.h
index d6c2cc271398..4774606a0101 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/smt.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/smt.h
@@ -59,7 +59,7 @@ struct smt_entry {
 	u16 idx;
 	u16 pfvf;
 	u8 src_mac[ETH_ALEN];
-	atomic_t refcnt;
+	refcount_t refcnt;
 	spinlock_t lock;	/* protect smt entry add,removal */
 };
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH 1/2] cxgb4: sched: Use refcount_t for refcount
From: Chuhong Yuan @ 2019-07-31 13:55 UTC (permalink / raw)
  Cc: Vishal Kulkarni, David S . Miller, netdev, linux-kernel,
	Chuhong Yuan

refcount_t is better for reference counters since its
implementation can prevent overflows.
So convert atomic_t ref counters to refcount_t.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/chelsio/cxgb4/sched.c | 8 ++++----
 drivers/net/ethernet/chelsio/cxgb4/sched.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sched.c b/drivers/net/ethernet/chelsio/cxgb4/sched.c
index 60218dc676a8..2d04ffb31528 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sched.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sched.c
@@ -173,7 +173,7 @@ static int t4_sched_queue_unbind(struct port_info *pi, struct ch_sched_queue *p)
 
 		list_del(&qe->list);
 		kvfree(qe);
-		if (atomic_dec_and_test(&e->refcnt)) {
+		if (refcount_dec_and_test(&e->refcnt)) {
 			e->state = SCHED_STATE_UNUSED;
 			memset(&e->info, 0, sizeof(e->info));
 		}
@@ -216,7 +216,7 @@ static int t4_sched_queue_bind(struct port_info *pi, struct ch_sched_queue *p)
 		goto out_err;
 
 	list_add_tail(&qe->list, &e->queue_list);
-	atomic_inc(&e->refcnt);
+	refcount_inc(&e->refcnt);
 	return err;
 
 out_err:
@@ -434,7 +434,7 @@ static struct sched_class *t4_sched_class_alloc(struct port_info *pi,
 		if (err)
 			return NULL;
 		memcpy(&e->info, &np, sizeof(e->info));
-		atomic_set(&e->refcnt, 0);
+		refcount_set(&e->refcnt, 0);
 		e->state = SCHED_STATE_ACTIVE;
 	}
 
@@ -488,7 +488,7 @@ struct sched_table *t4_init_sched(unsigned int sched_size)
 		s->tab[i].idx = i;
 		s->tab[i].state = SCHED_STATE_UNUSED;
 		INIT_LIST_HEAD(&s->tab[i].queue_list);
-		atomic_set(&s->tab[i].refcnt, 0);
+		refcount_set(&s->tab[i].refcnt, 0);
 	}
 	return s;
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sched.h b/drivers/net/ethernet/chelsio/cxgb4/sched.h
index 168fb4ce3759..23a6ca1e6d3e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sched.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/sched.h
@@ -69,7 +69,7 @@ struct sched_class {
 	u8 idx;
 	struct ch_sched_params info;
 	struct list_head queue_list;
-	atomic_t refcnt;
+	refcount_t refcnt;
 };
 
 struct sched_table {      /* per port scheduling table */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] net: bridge: Allow bridge to joing multicast groups
From: Andrew Lunn @ 2019-07-31 13:45 UTC (permalink / raw)
  To: Allan W. Nielsen
  Cc: Ido Schimmel, Nikolay Aleksandrov, Horatiu Vultur, roopa, davem,
	bridge, netdev, linux-kernel
In-Reply-To: <20190731080149.oyqcrw42utxjizsx@lx-anielsen.microsemi.net>

> > 2) The interface is part of a bridge. There are a few sub-cases
> > 
> > a) IGMP snooping is being performed. We can block multicast where
> > there is no interest in the group. But this is limited to IP
> > multicast.
> Agree. And this is done today by installing an explicit offload rule to limit
> the flooding of a specific group.
> 
> > b) IGMP snooping is not being used and all interfaces in the bridge
> > are ports of the switch. IP Multicast can be blocked to the CPU.
> Does it matter if IGMP snooping is used or not? A more general statement could
> be:
> 
> - "All interfaces in the bridge are ports of the switch. IP Multicast can be
>   blocked to the CPU."

We have seen IPv6 neighbour discovery break in conditions like this. I
don't know the exact details.

You also need to watch out for 224.0.0.0/24. This is the link local
multicast range. There is no need to join MC addresses in that
range. It is assumed they will always be received. So even if IGMP is
enabled, you still need to pass some multicast traffic to the CPU.

> > So one possibility here is to teach the SW bridge about non-IP
> > multicast addresses. Initially the switch should forward all MAC
> > multicast frames to the CPU. If the frame is not an IPv4 or IPv6
> > frame, and there has not been a call to set_rx_mode() for the MAC
> > address on the br0 interface, and the bridge only contains switch
> > ports, switchdev could be used to block the multicast to the CPU
> > frame, but forward it out all other ports of the bridge.
> Close, but not exactly (due to the arguments above).
> 
> Here is how I see it:
> 
> Teach the SW bridge about non-IP multicast addresses. Initially the switch
> should forward all MAC multicast frames to the CPU. Today MDB rules can be
> installed (either static or dynamic by IGMP), which limit the flooding of IPv4/6
> multicast streams. In the same way, we should have a way to install a rule
> (FDM/ or MDB) to limit the flooding of L2 multicast frames.
> 
> If foreign interfaces (or br0 it self) is part of the destination list, then
> traffic also needs to go to the CPU.
> 
> By doing this, we can for explicitly configured dst mac address:
> - limit the flooding on the on the SW bridge interfaces
> - limit the flooding on the on the HW bridge interfaces
> - prevent them to go to the CPU if they are not needed

This is all very complex because of all the different corner cases. So
i don't think we want a user API to do the CPU part, we want the
network stack to do it. Otherwise the user is going to get is wrong,
break their network, and then come running to the list for help.

This also fits with how we do things in DSA. There is deliberately no
user space concept for configuring the DSA CPU port. To user space,
the switch is just a bunch of Linux interfaces. Everything to do with
the CPU port is hidden away in the DSA core layer, the DSA drivers,
and a little bit in the bridge.

       Andrew

^ permalink raw reply

* Re: [PATCH V2 4/9] vhost: reset invalidate_count in vhost_set_vring_num_addr()
From: Jason Wang @ 2019-07-31 13:29 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: mst, kvm, virtualization, netdev, linux-kernel, linux-mm
In-Reply-To: <20190731124124.GD3946@ziepe.ca>


On 2019/7/31 下午8:41, Jason Gunthorpe wrote:
> On Wed, Jul 31, 2019 at 04:46:50AM -0400, Jason Wang wrote:
>> The vhost_set_vring_num_addr() could be called in the middle of
>> invalidate_range_start() and invalidate_range_end(). If we don't reset
>> invalidate_count after the un-registering of MMU notifier, the
>> invalidate_cont will run out of sync (e.g never reach zero). This will
>> in fact disable the fast accessor path. Fixing by reset the count to
>> zero.
>>
>> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Did Michael report this as well?


Correct me if I was wrong. I think it's point 4 described in 
https://lkml.org/lkml/2019/7/21/25.

Thanks


>
>> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>   drivers/vhost/vhost.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 2a3154976277..2a7217c33668 100644
>> +++ b/drivers/vhost/vhost.c
>> @@ -2073,6 +2073,10 @@ static long vhost_vring_set_num_addr(struct vhost_dev *d,
>>   		d->has_notifier = false;
>>   	}
>>   
>> +	/* reset invalidate_count in case we are in the middle of
>> +	 * invalidate_start() and invalidate_end().
>> +	 */
>> +	vq->invalidate_count = 0;
>>   	vhost_uninit_vq_maps(vq);
>>   #endif
>>   

^ permalink raw reply

* Re: [PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
From: Jason Wang @ 2019-07-31 13:28 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: mst, kvm, virtualization, netdev, linux-kernel, linux-mm
In-Reply-To: <20190731123935.GC3946@ziepe.ca>


On 2019/7/31 下午8:39, Jason Gunthorpe wrote:
> On Wed, Jul 31, 2019 at 04:46:53AM -0400, Jason Wang wrote:
>> We used to use RCU to synchronize MMU notifier with worker. This leads
>> calling synchronize_rcu() in invalidate_range_start(). But on a busy
>> system, there would be many factors that may slow down the
>> synchronize_rcu() which makes it unsuitable to be called in MMU
>> notifier.
>>
>> A solution is SRCU but its overhead is obvious with the expensive full
>> memory barrier. Another choice is to use seqlock, but it doesn't
>> provide a synchronization method between readers and writers. The last
>> choice is to use vq mutex, but it need to deal with the worst case
>> that MMU notifier must be blocked and wait for the finish of swap in.
>>
>> So this patch switches use a counter to track whether or not the map
>> was used. The counter was increased when vq try to start or finish
>> uses the map. This means, when it was even, we're sure there's no
>> readers and MMU notifier is synchronized. When it was odd, it means
>> there's a reader we need to wait it to be even again then we are
>> synchronized.
> You just described a seqlock.


Kind of, see my explanation below.


>
> We've been talking about providing this as some core service from mmu
> notifiers because nearly every use of this API needs it.


That would be very helpful.


>
> IMHO this gets the whole thing backwards, the common pattern is to
> protect the 'shadow pte' data with a seqlock (usually open coded),
> such that the mmu notififer side has the write side of that lock and
> the read side is consumed by the thread accessing or updating the SPTE.


Yes, I've considered something like that. But the problem is, mmu 
notifier (writer) need to wait for the vhost worker to finish the read 
before it can do things like setting dirty pages and unmapping page.  It 
looks to me seqlock doesn't provide things like this.  Or are you 
suggesting that taking writer seq lock in vhost worker and busy wait for 
seqcount to be even in MMU notifier (something similar to what this 
patch did)? I don't do this because e.g:


write_seqcount_begin()

map = vq->map[X]

write or read through map->addr directly

write_seqcount_end()


There's no rmb() in write_seqcount_begin(), so map could be read before 
write_seqcount_begin(), but it looks to me now that this doesn't harm at 
all, maybe we can try this way.


>
>
>> Reported-by: Michael S. Tsirkin <mst@redhat.com>
>> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>   drivers/vhost/vhost.c | 145 ++++++++++++++++++++++++++----------------
>>   drivers/vhost/vhost.h |   7 +-
>>   2 files changed, 94 insertions(+), 58 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index cfc11f9ed9c9..db2c81cb1e90 100644
>> +++ b/drivers/vhost/vhost.c
>> @@ -324,17 +324,16 @@ static void vhost_uninit_vq_maps(struct vhost_virtqueue *vq)
>>   
>>   	spin_lock(&vq->mmu_lock);
>>   	for (i = 0; i < VHOST_NUM_ADDRS; i++) {
>> -		map[i] = rcu_dereference_protected(vq->maps[i],
>> -				  lockdep_is_held(&vq->mmu_lock));
>> +		map[i] = vq->maps[i];
>>   		if (map[i]) {
>>   			vhost_set_map_dirty(vq, map[i], i);
>> -			rcu_assign_pointer(vq->maps[i], NULL);
>> +			vq->maps[i] = NULL;
>>   		}
>>   	}
>>   	spin_unlock(&vq->mmu_lock);
>>   
>> -	/* No need for synchronize_rcu() or kfree_rcu() since we are
>> -	 * serialized with memory accessors (e.g vq mutex held).
>> +	/* No need for synchronization since we are serialized with
>> +	 * memory accessors (e.g vq mutex held).
>>   	 */
>>   
>>   	for (i = 0; i < VHOST_NUM_ADDRS; i++)
>> @@ -362,6 +361,44 @@ static bool vhost_map_range_overlap(struct vhost_uaddr *uaddr,
>>   	return !(end < uaddr->uaddr || start > uaddr->uaddr - 1 + uaddr->size);
>>   }
>>   
>> +static void inline vhost_vq_access_map_begin(struct vhost_virtqueue *vq)
>> +{
>> +	int ref = READ_ONCE(vq->ref);
> Is a lock/single threaded supposed to be held for this?


Yes, only vhost worker kthread can accept this.


>
>> +
>> +	smp_store_release(&vq->ref, ref + 1);
>> +	/* Make sure ref counter is visible before accessing the map */
>> +	smp_load_acquire(&vq->ref);
> release/acquire semantics are intended to protect blocks of related
> data, so reading something with acquire and throwing away the result
> is nonsense.


Actually I want to use smp_mb() here, so I admit it's a trick that even 
won't work. But now I think I can just use write_seqcount_begin() here.


>
>> +}
>> +
>> +static void inline vhost_vq_access_map_end(struct vhost_virtqueue *vq)
>> +{
>> +	int ref = READ_ONCE(vq->ref);
> If the write to vq->ref is not locked this algorithm won't work, if it
> is locked the READ_ONCE is not needed.


Yes.


>
>> +	/* Make sure vq access is done before increasing ref counter */
>> +	smp_store_release(&vq->ref, ref + 1);
>> +}
>> +
>> +static void inline vhost_vq_sync_access(struct vhost_virtqueue *vq)
>> +{
>> +	int ref;
>> +
>> +	/* Make sure map change was done before checking ref counter */
>> +	smp_mb();
> This is probably smp_rmb after reading ref, and if you are setting ref
> with smp_store_release then this should be smp_load_acquire() without
> an explicit mb.


We had something like:

spin_lock();

vq->maps[index] = NULL;

spin_unlock();

vhost_vq_sync_access(vq);

we need to make sure the read of ref is done after setting 
vq->maps[index] to NULL. It looks to me neither smp_load_acquire() nor 
smp_store_release() can help in this case.


>
>> +	ref = READ_ONCE(vq->ref);
>> +	if (ref & 0x1) {
>> +		/* When ref change, we are sure no reader can see
>> +		 * previous map */
>> +		while (READ_ONCE(vq->ref) == ref) {
>> +			set_current_state(TASK_RUNNING);
>> +			schedule();
>> +		}
>> +	}
> This is basically read_seqcount_begin()' with a schedule instead of
> cpu_relax


Yes it is.


>
>
>> +	/* Make sure ref counter was checked before any other
>> +	 * operations that was dene on map. */
>> +	smp_mb();
> should be in a smp_load_acquire()


Right, if we use smp_load_acquire() to load the counter.


>
>> +}
>> +
>>   static void vhost_invalidate_vq_start(struct vhost_virtqueue *vq,
>>   				      int index,
>>   				      unsigned long start,
>> @@ -376,16 +413,15 @@ static void vhost_invalidate_vq_start(struct vhost_virtqueue *vq,
>>   	spin_lock(&vq->mmu_lock);
>>   	++vq->invalidate_count;
>>   
>> -	map = rcu_dereference_protected(vq->maps[index],
>> -					lockdep_is_held(&vq->mmu_lock));
>> +	map = vq->maps[index];
>>   	if (map) {
>>   		vhost_set_map_dirty(vq, map, index);
>> -		rcu_assign_pointer(vq->maps[index], NULL);
>> +		vq->maps[index] = NULL;
>>   	}
>>   	spin_unlock(&vq->mmu_lock);
>>   
>>   	if (map) {
>> -		synchronize_rcu();
>> +		vhost_vq_sync_access(vq);
> What prevents racing with vhost_vq_access_map_end here?


vhost_vq_access_map_end() uses smp_store_release() for the counter. Is 
this not sufficient?


>
>>   		vhost_map_unprefetch(map);
>>   	}
>>   }
> Overall I don't like it.
>
> We are trying to get rid of these botique mmu notifier patterns in
> drivers.


I agree, so do you think we can take write lock in vhost worker then 
wait for the counter to be even in MMU notifier? It looks much cleaner 
than this patch.

Thanks


>
> Jason

^ permalink raw reply

* [PATCH 1/8] net: 8390: Fix manufacturer name in Kconfig help text
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

The help text refers to Western Digital instead of National
Semiconductor 8390, presumably because it was copied from the former.

Fixes: 644570b830266ff3 ("8390: Move the 8390 related drivers")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/8390/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/8390/Kconfig b/drivers/net/ethernet/8390/Kconfig
index 2a3e2450968eeb06..a9478577b49560f2 100644
--- a/drivers/net/ethernet/8390/Kconfig
+++ b/drivers/net/ethernet/8390/Kconfig
@@ -12,8 +12,8 @@ config NET_VENDOR_8390
 
 	  Note that the answer to this question doesn't directly affect the
 	  kernel: saying N will just cause the configurator to skip all
-	  the questions about Western Digital cards. If you say Y, you will be
-	  asked for your specific card in the following questions.
+	  the questions about National Semiconductor 8390 cards. If you say Y,
+	  you will be asked for your specific card in the following questions.
 
 if NET_VENDOR_8390
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 3/8] net: apple: Fix manufacturer name in Kconfig help text
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

The help text refers to IBM instead of Apple, presumably because it was
copied from the former.

Fixes: 8fb6b0908176704a ("bmac/mace/macmace/mac89x0/cs89x0: Move the Macintosh (Apple) drivers")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/apple/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/apple/Kconfig b/drivers/net/ethernet/apple/Kconfig
index fde7ae33e302b6bc..f78b9c841296c94b 100644
--- a/drivers/net/ethernet/apple/Kconfig
+++ b/drivers/net/ethernet/apple/Kconfig
@@ -11,8 +11,8 @@ config NET_VENDOR_APPLE
 	  If you have a network (Ethernet) card belonging to this class, say Y.
 
 	  Note that the answer to this question doesn't directly affect the
-	  kernel: saying N will just cause the configurator to skip all
-	  the questions about IBM devices. If you say Y, you will be asked for
+	  kernel: saying N will just cause the configurator to skip all the
+	  questions about Apple devices. If you say Y, you will be asked for
 	  your specific card in the following questions.
 
 if NET_VENDOR_APPLE
-- 
2.17.1


^ permalink raw reply related

* [PATCH 5/8] net: ixp4xx: Spelling s/XSacle/XScale/
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/xscale/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/xscale/Kconfig b/drivers/net/ethernet/xscale/Kconfig
index 2f354ba029a61491..cd0a8f46e7c6c143 100644
--- a/drivers/net/ethernet/xscale/Kconfig
+++ b/drivers/net/ethernet/xscale/Kconfig
@@ -13,7 +13,7 @@ config NET_VENDOR_XSCALE
 
 	  Note that the answer to this question does not directly affect the
 	  kernel: saying N will just cause the configurator to skip all
-	  the questions about XSacle IXP devices. If you say Y, you will be
+	  the questions about XScale IXP devices. If you say Y, you will be
 	  asked for your specific card in the following questions.
 
 if NET_VENDOR_XSCALE
-- 
2.17.1


^ permalink raw reply related

* [PATCH 2/8] net: amd: Spelling s/case/cause/
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/amd/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/Kconfig b/drivers/net/ethernet/amd/Kconfig
index de4950d2022e0a6d..9f965cdfff5c99db 100644
--- a/drivers/net/ethernet/amd/Kconfig
+++ b/drivers/net/ethernet/amd/Kconfig
@@ -14,7 +14,7 @@ config NET_VENDOR_AMD
 	  say Y.
 
 	  Note that the answer to this question does not directly affect
-	  the kernel: saying N will just case the configurator to skip all
+	  the kernel: saying N will just cause the configurator to skip all
 	  the questions regarding AMD chipsets. If you say Y, you will be asked
 	  for your specific chipset/driver in the following questions.
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 7/8] net: packetengines: Fix manufacturer spelling and capitalization
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

Use "Packet Engines" consistently.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/packetengines/Kconfig  | 6 +++---
 drivers/net/ethernet/packetengines/Makefile | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/packetengines/Kconfig b/drivers/net/ethernet/packetengines/Kconfig
index 8161e308e64b0f16..ead3750b4489d1bf 100644
--- a/drivers/net/ethernet/packetengines/Kconfig
+++ b/drivers/net/ethernet/packetengines/Kconfig
@@ -1,10 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only
 #
-# Packet engine device configuration
+# Packet Engines device configuration
 #
 
 config NET_VENDOR_PACKET_ENGINES
-	bool "Packet Engine devices"
+	bool "Packet Engines devices"
 	default y
 	depends on PCI
 	---help---
@@ -12,7 +12,7 @@ config NET_VENDOR_PACKET_ENGINES
 
 	  Note that the answer to this question doesn't directly affect the
 	  kernel: saying N will just cause the configurator to skip all
-	  the questions about packet engine devices. If you say Y, you will
+	  the questions about Packet Engines devices. If you say Y, you will
 	  be asked for your specific card in the following questions.
 
 if NET_VENDOR_PACKET_ENGINES
diff --git a/drivers/net/ethernet/packetengines/Makefile b/drivers/net/ethernet/packetengines/Makefile
index 1553c9cfc254d6f8..cf054b796d111231 100644
--- a/drivers/net/ethernet/packetengines/Makefile
+++ b/drivers/net/ethernet/packetengines/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 #
-# Makefile for the Packet Engine network device drivers.
+# Makefile for the Packet Engines network device drivers.
 #
 
 obj-$(CONFIG_HAMACHI) += hamachi.o
-- 
2.17.1


^ permalink raw reply related

* [PATCH 4/8] net: broadcom: Fix manufacturer name in Kconfig help text
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

The help text refers to AMD instead of Broadcom, presumably because it
was copied from the former.

Fixes: adfc5217e9db68d3 ("broadcom: Move the Broadcom drivers")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/broadcom/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig
index e9017caf024dcf33..e24f5d2b6afe3547 100644
--- a/drivers/net/ethernet/broadcom/Kconfig
+++ b/drivers/net/ethernet/broadcom/Kconfig
@@ -14,9 +14,9 @@ config NET_VENDOR_BROADCOM
 	  say Y.
 
 	  Note that the answer to this question does not directly affect
-	  the kernel: saying N will just case the configurator to skip all
-	  the questions regarding AMD chipsets. If you say Y, you will be asked
-	  for your specific chipset/driver in the following questions.
+	  the kernel: saying N will just cause the configurator to skip all
+	  the questions regarding Broadcom chipsets. If you say Y, you will
+	  be asked for your specific chipset/driver in the following questions.
 
 if NET_VENDOR_BROADCOM
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 8/8] net: samsung: Spelling s/case/cause/
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/samsung/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/samsung/Kconfig b/drivers/net/ethernet/samsung/Kconfig
index 027938017579130f..e92a178a76df0849 100644
--- a/drivers/net/ethernet/samsung/Kconfig
+++ b/drivers/net/ethernet/samsung/Kconfig
@@ -11,7 +11,7 @@ config NET_VENDOR_SAMSUNG
 	  say Y.
 
 	  Note that the answer to this question does not directly affect
-	  the kernel: saying N will just case the configurator to skip all
+	  the kernel: saying N will just cause the configurator to skip all
 	  the questions about Samsung chipsets. If you say Y, you will be asked
 	  for your specific chipset/driver in the following questions.
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 6/8] net: nixge: Spelling s/Instrument/Instruments/
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20190731132216.17194-1-geert+renesas@glider.be>

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/ni/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ni/Kconfig b/drivers/net/ethernet/ni/Kconfig
index 70b1a03c0953e696..01229190132d484f 100644
--- a/drivers/net/ethernet/ni/Kconfig
+++ b/drivers/net/ethernet/ni/Kconfig
@@ -11,7 +11,7 @@ config NET_VENDOR_NI
 
 	  Note that the answer to this question doesn't directly affect the
 	  kernel: saying N will just cause the configurator to skip all
-	  the questions about National Instrument devices.
+	  the questions about National Instruments devices.
 	  If you say Y, you will be asked for your specific device in the
 	  following questions.
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 0/8] net: Manufacturer names and spelling fixes
From: Geert Uytterhoeven @ 2019-07-31 13:22 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, linux-kernel, Geert Uytterhoeven

	Hi David,

This is a set of fixes for (some blatantly) wrong manufacturer names and
various spelling issues, mostly in Kconfig help texts.

Thanks!

Geert Uytterhoeven (8):
  net: 8390: Fix manufacturer name in Kconfig help text
  net: amd: Spelling s/case/cause/
  net: apple: Fix manufacturer name in Kconfig help text
  net: broadcom: Fix manufacturer name in Kconfig help text
  net: ixp4xx: Spelling s/XSacle/XScale/
  net: nixge: Spelling s/Instrument/Instruments/
  net: packetengines: Fix manufacturer spelling and capitalization
  net: samsung: Spelling s/case/cause/

 drivers/net/ethernet/8390/Kconfig           | 4 ++--
 drivers/net/ethernet/amd/Kconfig            | 2 +-
 drivers/net/ethernet/apple/Kconfig          | 4 ++--
 drivers/net/ethernet/broadcom/Kconfig       | 6 +++---
 drivers/net/ethernet/ni/Kconfig             | 2 +-
 drivers/net/ethernet/packetengines/Kconfig  | 6 +++---
 drivers/net/ethernet/packetengines/Makefile | 2 +-
 drivers/net/ethernet/samsung/Kconfig        | 2 +-
 drivers/net/ethernet/xscale/Kconfig         | 2 +-
 9 files changed, 15 insertions(+), 15 deletions(-)

-- 
2.17.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH bpf-next 1/9] selftests/bpf: prevent headers to be compiled as C code
From: Ilya Leoshkevich @ 2019-07-31 13:21 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Andrii Nakryiko, bpf, Networking,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team
In-Reply-To: <CAEf4Bzbj6RWUo8Q7wgMnbL=T7V2yK2=gbdO3sSfxJ71Gp6jeYA@mail.gmail.com>

> Am 27.07.2019 um 20:53 schrieb Andrii Nakryiko <andrii.nakryiko@gmail.com>:
> 
> On Fri, Jul 26, 2019 at 3:01 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>> 
>> On 07/26, Andrii Nakryiko wrote:
>>> On Fri, Jul 26, 2019 at 2:21 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>>>> 
>>>> On 07/26, Andrii Nakryiko wrote:
>>>>> Apprently listing header as a normal dependency for a binary output
>>>>> makes it go through compilation as if it was C code. This currently
>>>>> works without a problem, but in subsequent commits causes problems for
>>>>> differently generated test.h for test_progs. Marking those headers as
>>>>> order-only dependency solves the issue.
>>>> Are you sure it will not result in a situation where
>>>> test_progs/test_maps is not regenerated if tests.h is updated.
>>>> 
>>>> If I read the following doc correctly, order deps make sense for
>>>> directories only:
>>>> https://www.gnu.org/software/make/manual/html_node/Prerequisite-Types.html
>>>> 
>>>> Can you maybe double check it with:
>>>> * make
>>>> * add new prog_tests/test_something.c
>>>> * make
>>>> to see if the binary is regenerated with test_something.c?
>>> 
>>> Yeah, tested that, it triggers test_progs rebuild.
>>> 
>>> Ordering is still preserved, because test.h is dependency of
>>> test_progs.c, which is dependency of test_progs binary, so that's why
>>> it works.
>>> 
>>> As to why .h file is compiled as C file, I have no idea and ideally
>>> that should be fixed somehow.
>> I guess that's because it's a prerequisite and we have a target that
>> puts all prerequisites when calling CC:
>> 
>> test_progs: a.c b.c tests.h
>>        gcc a.c b.c tests.h -o test_progs
>> 
>> So gcc compiles each input file.
> 
> If that's really a definition of the rule, then it seems not exactly
> correct. What if some of the prerequisites are some object files,
> directories, etc. I'd say the rule should only include .c files. I'll
> add it to my TODO list to go and check how all this is defined
> somewhere, but for now I'm leaving everything as is in this patch.
> 

I believe it’s an implicit built-in rule, which is defined by make
itself here:

https://git.savannah.gnu.org/cgit/make.git/tree/default.c?h=4.2.1#n131

The observed behavior arises because these rules use $^ all over the
place. My 2c is that it would be better to make the rule explicit,
because it would cost just an extra line, but it would be immediately
obvious why the code is correct w.r.t. rebuilds.

Best regards,
Ilya

^ permalink raw reply

* [PATCH v2] isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
From: Juliana Rodrigueiro @ 2019-07-31 13:17 UTC (permalink / raw)
  To: isdn, netdev; +Cc: thomas.jarosch

Since linux 4.9 it is not possible to use buffers on the stack for DMA transfers.

During usb probe the driver crashes with "transfer buffer is on stack" message.

This fix k-allocates a buffer to be used on "read_reg_atomic", which is a macro
that calls "usb_control_msg" under the hood.

Kernel 4.19 backtrace:

usb_hcd_submit_urb+0x3e5/0x900
? sched_clock+0x9/0x10
? log_store+0x203/0x270
? get_random_u32+0x6f/0x90
? cache_alloc_refill+0x784/0x8a0
usb_submit_urb+0x3b4/0x550
usb_start_wait_urb+0x4e/0xd0
usb_control_msg+0xb8/0x120
hfcsusb_probe+0x6bc/0xb40 [hfcsusb]
usb_probe_interface+0xc2/0x260
really_probe+0x176/0x280
driver_probe_device+0x49/0x130
__driver_attach+0xa9/0xb0
? driver_probe_device+0x130/0x130
bus_for_each_dev+0x5a/0x90
driver_attach+0x14/0x20
? driver_probe_device+0x130/0x130
bus_add_driver+0x157/0x1e0
driver_register+0x51/0xe0
usb_register_driver+0x5d/0x120
? 0xf81ed000
hfcsusb_drv_init+0x17/0x1000 [hfcsusb]
do_one_initcall+0x44/0x190
? free_unref_page_commit+0x6a/0xd0
do_init_module+0x46/0x1c0
load_module+0x1dc1/0x2400
sys_init_module+0xed/0x120
do_fast_syscall_32+0x7a/0x200
entry_SYSENTER_32+0x6b/0xbe

Signed-off-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
---
Changes in v2:
   - Ordered the local variables declaration from longest to shortest.

 drivers/isdn/hardware/mISDN/hfcsusb.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
index 6d05946b445e..fe1117a21f6b 100644
--- a/drivers/isdn/hardware/mISDN/hfcsusb.c
+++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
@@ -1704,13 +1704,23 @@ hfcsusb_stop_endpoint(struct hfcsusb *hw, int channel)
 static int
 setup_hfcsusb(struct hfcsusb *hw)
 {
+	void *dmabuf = kmalloc(sizeof(u_char), GFP_KERNEL);
 	u_char b;
+	int ret;
 
 	if (debug & DBG_HFC_CALL_TRACE)
 		printk(KERN_DEBUG "%s: %s\n", hw->name, __func__);
 
+	if (!dmabuf)
+		return -ENOMEM;
+
+	ret = read_reg_atomic(hw, HFCUSB_CHIP_ID, dmabuf);
+
+	memcpy(&b, dmabuf, sizeof(u_char));
+	kfree(dmabuf);
+
 	/* check the chip id */
-	if (read_reg_atomic(hw, HFCUSB_CHIP_ID, &b) != 1) {
+	if (ret != 1) {
 		printk(KERN_DEBUG "%s: %s: cannot read chip id\n",
 		       hw->name, __func__);
 		return 1;
-- 
2.17.2





^ permalink raw reply related

* [PATCH] net: mediatek: Drop unneeded dependency on NET_VENDOR_MEDIATEK
From: Geert Uytterhoeven @ 2019-07-31 13:12 UTC (permalink / raw)
  To: Felix Fietkau, John Crispin, Sean Wang, Nelson Chang,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, Geert Uytterhoeven

The whole block is protected by "if NET_VENDOR_MEDIATEK", so there is
no need for individual driver config symbols to duplicate this
dependency.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 drivers/net/ethernet/mediatek/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/Kconfig b/drivers/net/ethernet/mediatek/Kconfig
index 263cd0909fe0de39..1f7fff81f24dbb96 100644
--- a/drivers/net/ethernet/mediatek/Kconfig
+++ b/drivers/net/ethernet/mediatek/Kconfig
@@ -9,7 +9,6 @@ if NET_VENDOR_MEDIATEK
 
 config NET_MEDIATEK_SOC
 	tristate "MediaTek SoC Gigabit Ethernet support"
-	depends on NET_VENDOR_MEDIATEK
 	select PHYLIB
 	---help---
 	  This driver supports the gigabit ethernet MACs in the
-- 
2.17.1


^ permalink raw reply related

* pull-request: mac80211 2019-07-31
From: Johannes Berg @ 2019-07-31 12:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-wireless

Hi Dave,

We have few fixes, most importantly probably the NETIF_F_LLTX revert,
we thought we were now more layered like VLAN or such since we do all
of the queue control internally, but it caused problems, evidently not.

Please pull and let me know if there's any problem.

Thanks,
johannes



The following changes since commit 47d858d0bdcd47cc1c6c9eeca91b091dd9e55637:

  ipip: validate header length in ipip_tunnel_xmit (2019-07-25 17:23:40 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git tags/mac80211-for-davem-2019-07-31

for you to fetch changes up to eef347f846ee8f7296a6f84e3866c057ca6bcce0:

  Revert "mac80211: set NETIF_F_LLTX when using intermediate tx queues" (2019-07-30 14:52:50 +0200)

----------------------------------------------------------------
Just a few fixes:
 * revert NETIF_F_LLTX usage as it caused problems
 * avoid warning on WMM parameters from AP that are too short
 * fix possible null-ptr dereference in hwsim
 * fix interface combinations with 4-addr and crypto control

----------------------------------------------------------------
Brian Norris (1):
      mac80211: don't WARN on short WMM parameters from AP

Jia-Ju Bai (1):
      mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl()

Johannes Berg (1):
      Revert "mac80211: set NETIF_F_LLTX when using intermediate tx queues"

Manikanta Pubbisetty (1):
      {nl,mac}80211: fix interface combinations on crypto controlled devices

 drivers/net/wireless/mac80211_hwsim.c |  8 +++++---
 include/net/cfg80211.h                | 15 +++++++++++++++
 net/mac80211/iface.c                  |  1 -
 net/mac80211/mlme.c                   | 10 ++++++++++
 net/mac80211/util.c                   |  7 +++----
 net/wireless/core.c                   |  6 ++----
 net/wireless/nl80211.c                |  4 +---
 net/wireless/util.c                   | 27 +++++++++++++++++++++++++--
 8 files changed, 61 insertions(+), 17 deletions(-)


^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH bpf-next v4 03/11] libbpf: add flags to umem config
From: Björn Töpel @ 2019-07-31 12:45 UTC (permalink / raw)
  To: Kevin Laatz
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, Karlsson, Magnus, Jakub Kicinski,
	Jonathan Lemon, Saeed Mahameed, Maxim Mikityanskiy,
	Stephen Hemminger, Bruce Richardson, ciara.loftus,
	intel-wired-lan, bpf
In-Reply-To: <20190730085400.10376-4-kevin.laatz@intel.com>

On Tue, 30 Jul 2019 at 19:43, Kevin Laatz <kevin.laatz@intel.com> wrote:
>
> This patch adds a 'flags' field to the umem_config and umem_reg structs.
> This will allow for more options to be added for configuring umems.
>
> The first use for the flags field is to add a flag for unaligned chunks
> mode. These flags can either be user-provided or filled with a default.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>
> ---
> v2:
>   - Removed the headroom check from this patch. It has moved to the
>     previous patch.
>
> v4:
>   - modified chunk flag define
> ---
>  tools/include/uapi/linux/if_xdp.h | 9 +++++++--
>  tools/lib/bpf/xsk.c               | 3 +++
>  tools/lib/bpf/xsk.h               | 2 ++
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
> index faaa5ca2a117..a691802d7915 100644
> --- a/tools/include/uapi/linux/if_xdp.h
> +++ b/tools/include/uapi/linux/if_xdp.h
> @@ -17,6 +17,10 @@
>  #define XDP_COPY       (1 << 1) /* Force copy-mode */
>  #define XDP_ZEROCOPY   (1 << 2) /* Force zero-copy mode */
>
> +/* Flags for xsk_umem_config flags */
> +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG_SHIFT 15
> +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << XDP_UMEM_UNALIGNED_CHUNK_FLAG_SHIFT)
> +
>  struct sockaddr_xdp {
>         __u16 sxdp_family;
>         __u16 sxdp_flags;
> @@ -49,8 +53,9 @@ struct xdp_mmap_offsets {
>  #define XDP_OPTIONS                    8
>
>  struct xdp_umem_reg {
> -       __u64 addr; /* Start of packet data area */
> -       __u64 len; /* Length of packet data area */
> +       __u64 addr;     /* Start of packet data area */
> +       __u64 len:48;   /* Length of packet data area */
> +       __u64 flags:16; /* Flags for umem */

So, the flags member moved from struct sockaddr_xdp to struct
xdp_umem_reg. Makes sense. However, I'm not a fan of the bitfield. Why
not just add the flags member after the last member (headroom) and
deal with it in xsk.c/xsk_setsockopt and libbpf? The bitfield
preserves the size, but makes it hard to read/error prone IMO.


Björn


>         __u32 chunk_size;
>         __u32 headroom;
>  };
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 5007b5d4fd2c..5e7e4d420ee0 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -116,6 +116,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
>                 cfg->comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>                 cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
>                 cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM;
> +               cfg->flags = XSK_UMEM__DEFAULT_FLAGS;
>                 return;
>         }
>
> @@ -123,6 +124,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
>         cfg->comp_size = usr_cfg->comp_size;
>         cfg->frame_size = usr_cfg->frame_size;
>         cfg->frame_headroom = usr_cfg->frame_headroom;
> +       cfg->flags = usr_cfg->flags;
>  }
>
>  static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
> @@ -182,6 +184,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
>         mr.len = size;
>         mr.chunk_size = umem->config.frame_size;
>         mr.headroom = umem->config.frame_headroom;
> +       mr.flags = umem->config.flags;
>
>         err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr));
>         if (err) {
> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> index 833a6e60d065..44a03d8c34b9 100644
> --- a/tools/lib/bpf/xsk.h
> +++ b/tools/lib/bpf/xsk.h
> @@ -170,12 +170,14 @@ LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
>  #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
>  #define XSK_UMEM__DEFAULT_FRAME_SIZE     (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
>  #define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
> +#define XSK_UMEM__DEFAULT_FLAGS 0
>
>  struct xsk_umem_config {
>         __u32 fill_size;
>         __u32 comp_size;
>         __u32 frame_size;
>         __u32 frame_headroom;
> +       __u32 flags;
>  };
>
>  /* Flags for the libbpf_flags field. */
> --
> 2.17.1
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply

* Re: [PATCH V2 4/9] vhost: reset invalidate_count in vhost_set_vring_num_addr()
From: Jason Gunthorpe @ 2019-07-31 12:41 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, kvm, virtualization, netdev, linux-kernel, linux-mm
In-Reply-To: <20190731084655.7024-5-jasowang@redhat.com>

On Wed, Jul 31, 2019 at 04:46:50AM -0400, Jason Wang wrote:
> The vhost_set_vring_num_addr() could be called in the middle of
> invalidate_range_start() and invalidate_range_end(). If we don't reset
> invalidate_count after the un-registering of MMU notifier, the
> invalidate_cont will run out of sync (e.g never reach zero). This will
> in fact disable the fast accessor path. Fixing by reset the count to
> zero.
> 
> Reported-by: Michael S. Tsirkin <mst@redhat.com>

Did Michael report this as well?

> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
>  drivers/vhost/vhost.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 2a3154976277..2a7217c33668 100644
> +++ b/drivers/vhost/vhost.c
> @@ -2073,6 +2073,10 @@ static long vhost_vring_set_num_addr(struct vhost_dev *d,
>  		d->has_notifier = false;
>  	}
>  
> +	/* reset invalidate_count in case we are in the middle of
> +	 * invalidate_start() and invalidate_end().
> +	 */
> +	vq->invalidate_count = 0;
>  	vhost_uninit_vq_maps(vq);
>  #endif
>  

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox