Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next 08/13] net: sched: rename tcf_block_get{_ext}() and tcf_block_put{_ext}()
From: Vlad Buslov @ 2018-09-12  8:24 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller, Stephen Hemminger, Kirill Tkhai, Paul E. McKenney,
	Nicolas Dichtel, Leon Romanovsky, Greg KH, mark.rutland,
	Florian Westphal, David Ahern, lucien xin, Jakub Kicinski,
	Christian Brauner, Jiri Benc
In-Reply-To: <CAM_iQpVMU7nPgkw+N_V6ukrH5rx-37+nd+++wj47JRA33RqMUQ@mail.gmail.com>


On Fri 07 Sep 2018 at 20:09, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Sep 6, 2018 at 12:59 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>> Functions tcf_block_get{_ext}() and tcf_block_put{_ext}() actually
>> attach/detach block to specific Qdisc besides just taking/putting
>> reference. Rename them according to their purpose.
>
> Where exactly does it attach to?
>
> Each qdisc provides a pointer to a pointer of a block, like
> &cl->block. It is where the result is saved to. It takes a parameter
> of Qdisc* merely for read-only purpose.

tcf_block_attach_ext() passes qdisc parameter to tcf_block_owner_add()
which saves qdisc to new tcf_block_owner_item and adds the item to
block's owner list. I proposed several naming options for these
functions to Jiri on internal review and he suggested "attach" as better
option.

>
> So, renaming it to *attach() is even confusing, at least not
> any better. Please find other names or leave them as they are.

What would you recommend?

^ permalink raw reply

* Re: [PATCH] net: ipv4: Use BUG_ON directly instead of a if condition followed by BUG
From: zhong jiang @ 2018-09-12 13:25 UTC (permalink / raw)
  To: David Miller; +Cc: edumazet, kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <20180911.235417.1481257495387934935.davem@davemloft.net>

On 2018/9/12 14:54, David Miller wrote:
> From: zhong jiang <zhongjiang@huawei.com>
> Date: Mon, 10 Sep 2018 22:38:02 +0800
>
>> The if condition can be removed if we use BUG_ON directly.
>> The issule is detected with the help of Coccinelle.
>>
>> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> You keep asking if this is worthy to do.
>
> I think the most worthy thing to do is actually build test your
> patches.
 I am sorry for that.  :-[

Thanks,
zhong jiang

^ permalink raw reply

* Re: [RFC] netlink: add NLA_REJECT policy type
From: Johannes Berg @ 2018-09-12  8:21 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: linux-wireless, netdev
In-Reply-To: <20180912081621.GC29691@unicorn.suse.cz>

On Wed, 2018-09-12 at 10:16 +0200, Michal Kubecek wrote:
> On Wed, Sep 12, 2018 at 09:32:45AM +0200, Johannes Berg wrote:
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > In some situations some netlink attributes may be used for output
> > only (kernel->userspace) or may be reserved for future use. It's
> > then helpful to be able to prevent userspace from using them in
> > messages sent to the kernel, since they'd otherwise be ignored and
> > any future will become impossible if this happens.
> > 
> > Add NLA_REJECT to the policy which does nothing but reject (with
> > EINVAL) validation of any messages containing this attribute.
> > 
> > The specific case I have in mind now is a shared nested attribute
> > containing request/response data, and it would be pointless and
> > potentially confusing to have userspace include response data in
> > the messages that actually contain a request.
> 
> I find this feature very useful. Actually, I was a bit surprised when
> I found I can't mark an attribute "forbidden" using policy.

:-)

> IMHO it would be even nicer if one could also specify an error message
> to use in extack if NLA_REJECT is applied; the easiest way would be
> using .validation_data and passing extack to validate_nla() but I'm not
> sure if it wouldn't qualify as an abuse.

I think it's fine - validation data is by nature validation type
dependent, so we can document it here. I'll send a v2.

johannes

^ permalink raw reply

* Re: [RFC] netlink: add NLA_REJECT policy type
From: Michal Kubecek @ 2018-09-12  8:16 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, netdev, Johannes Berg
In-Reply-To: <20180912073245.8047-1-johannes@sipsolutions.net>

On Wed, Sep 12, 2018 at 09:32:45AM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> In some situations some netlink attributes may be used for output
> only (kernel->userspace) or may be reserved for future use. It's
> then helpful to be able to prevent userspace from using them in
> messages sent to the kernel, since they'd otherwise be ignored and
> any future will become impossible if this happens.
> 
> Add NLA_REJECT to the policy which does nothing but reject (with
> EINVAL) validation of any messages containing this attribute.
> 
> The specific case I have in mind now is a shared nested attribute
> containing request/response data, and it would be pointless and
> potentially confusing to have userspace include response data in
> the messages that actually contain a request.

I find this feature very useful. Actually, I was a bit surprised when
I found I can't mark an attribute "forbidden" using policy.

IMHO it would be even nicer if one could also specify an error message
to use in extack if NLA_REJECT is applied; the easiest way would be
using .validation_data and passing extack to validate_nla() but I'm not
sure if it wouldn't qualify as an abuse.

Michal Kubecek

^ permalink raw reply

* Re: bpf: btf: Change how section is supported in btf_header
From: Dmitry Vyukov @ 2018-09-12  8:10 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20180911181010.vqbp2pxba2k5mshc@kafai-mbp.dhcp.thefacebook.com>

On Tue, Sep 11, 2018 at 8:10 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Tue, Sep 11, 2018 at 06:40:05PM +0200, Dmitry Vyukov wrote:
>> Hi Martin,
>>
>> I am looking at the subj commit:
>>
>>  static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t)
>> @@ -1754,9 +1756,9 @@ static int btf_check_all_metas(struct
>> btf_verifier_env *env)
>>         struct btf_header *hdr;
>>         void *cur, *end;
>>
>> -       hdr = btf->hdr;
>> +       hdr = &btf->hdr;
>>         cur = btf->nohdr_data + hdr->type_off;
>> -       end = btf->nohdr_data + hdr->str_off;
>> +       end = btf->nohdr_data + hdr->type_len;
>>
>> Shouldn't this be:
>>
>> +       end = cur + hdr->type_len;
>>
>> ? Or otherwise I am having trouble understanding meaning of fields.
> You are correct.  Thanks for pointing this out.
> Do you want to post an offical patch for the bpf branch?

no :)

Only if you want to send a syzkaller patch with btf descriptions ;)

>> On a related note, what's between header and type_off? Is type_off
>> supposed to be 0 always?
> type section is always the first section for now (i.e. immediately after
> the header).  Some other sections could be introduced later and it could
> be located before the type section such that the type_off will not be 0.

I see. Thanks.

^ permalink raw reply

* Re: Re: [PATCH net-next v2 0/5] virtio: support packed ring
From: Michael S. Tsirkin @ 2018-09-12 13:06 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
	jfreimann
In-Reply-To: <20180910030053.GA15645@debian>

On Mon, Sep 10, 2018 at 11:00:53AM +0800, Tiwei Bie wrote:
> On Fri, Sep 07, 2018 at 09:00:49AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 07, 2018 at 09:22:25AM +0800, Tiwei Bie wrote:
> > > On Mon, Aug 27, 2018 at 05:00:40PM +0300, Michael S. Tsirkin wrote:
> > > > Are there still plans to test the performance with vost pmd?
> > > > vhost doesn't seem to show a performance gain ...
> > > > 
> > > 
> > > I tried some performance tests with vhost PMD. In guest, the
> > > XDP program will return XDP_DROP directly. And in host, testpmd
> > > will do txonly fwd.
> > > 
> > > When burst size is 1 and packet size is 64 in testpmd and
> > > testpmd needs to iterate 5 Tx queues (but only the first two
> > > queues are enabled) to prepare and inject packets, I got ~12%
> > > performance boost (5.7Mpps -> 6.4Mpps). And if the vhost PMD
> > > is faster (e.g. just need to iterate the first two queues to
> > > prepare and inject packets), then I got similar performance
> > > for both rings (~9.9Mpps) (packed ring's performance can be
> > > lower, because it's more complicated in driver.)
> > > 
> > > I think packed ring makes vhost PMD faster, but it doesn't make
> > > the driver faster. In packed ring, the ring is simplified, and
> > > the handling of the ring in vhost (device) is also simplified,
> > > but things are not simplified in driver, e.g. although there is
> > > no desc table in the virtqueue anymore, driver still needs to
> > > maintain a private desc state table (which is still managed as
> > > a list in this patch set) to support the out-of-order desc
> > > processing in vhost (device).
> > > 
> > > I think this patch set is mainly to make the driver have a full
> > > functional support for the packed ring, which makes it possible
> > > to leverage the packed ring feature in vhost (device). But I'm
> > > not sure whether there is any other better idea, I'd like to
> > > hear your thoughts. Thanks!
> > 
> > Just this: Jens seems to report a nice gain with virtio and
> > vhost pmd across the board. Try to compare virtio and
> > virtio pmd to see what does pmd do better?
> 
> The virtio PMD (drivers/net/virtio) in DPDK doesn't need to share
> the virtio ring operation code with other drivers and is highly
> optimized for network. E.g. in Rx, the Rx burst function won't
> chain descs.
> So the ID management for the Rx ring can be quite
> simple and straightforward, we just need to initialize these IDs
> when initializing the ring and don't need to change these IDs
> in data path anymore (the mergable Rx code in that patch set
> assumes the descs will be written back in order, which should be
> fixed. I.e., the ID in the desc should be used to index vq->descx[]).
> The Tx code in that patch set also assumes the descs will be
> written back by device in order, which should be fixed.
> 
> But in kernel virtio driver, the virtio_ring.c is very generic.
> The enqueue (virtqueue_add()) and dequeue (virtqueue_get_buf_ctx())
> functions need to support all the virtio devices and should be
> able to handle all the possible cases that may happen. So although
> the packed ring can be very efficient in some cases, currently
> the room to optimize the performance in kernel's virtio_ring.c
> isn't that much. If we want to take the fully advantage of the
> packed ring's efficiency, we need some further e.g. API changes
> in virtio_ring.c, which shouldn't be part of this patch set. So
> I still think this patch set is mainly to make the kernel virtio
> driver to have a full functional support of the packed ring, and
> we can't expect impressive performance boost with it.

So what are the cases that make things complex?
Maybe we should drop support for them completely.


> > 
> > 
> > > 
> > > > 
> > > > On Wed, Jul 11, 2018 at 10:27:06AM +0800, Tiwei Bie wrote:
> > > > > Hello everyone,
> > > > > 
> > > > > This patch set implements packed ring support in virtio driver.
> > > > > 
> > > > > Some functional tests have been done with Jason's
> > > > > packed ring implementation in vhost:
> > > > > 
> > > > > https://lkml.org/lkml/2018/7/3/33
> > > > > 
> > > > > Both of ping and netperf worked as expected.
> > > > > 
> > > > > v1 -> v2:
> > > > > - Use READ_ONCE() to read event off_wrap and flags together (Jason);
> > > > > - Add comments related to ccw (Jason);
> > > > > 
> > > > > RFC (v6) -> v1:
> > > > > - Avoid extra virtio_wmb() in virtqueue_enable_cb_delayed_packed()
> > > > >   when event idx is off (Jason);
> > > > > - Fix bufs calculation in virtqueue_enable_cb_delayed_packed() (Jason);
> > > > > - Test the state of the desc at used_idx instead of last_used_idx
> > > > >   in virtqueue_enable_cb_delayed_packed() (Jason);
> > > > > - Save wrap counter (as part of queue state) in the return value
> > > > >   of virtqueue_enable_cb_prepare_packed();
> > > > > - Refine the packed ring definitions in uapi;
> > > > > - Rebase on the net-next tree;
> > > > > 
> > > > > RFC v5 -> RFC v6:
> > > > > - Avoid tracking addr/len/flags when DMA API isn't used (MST/Jason);
> > > > > - Define wrap counter as bool (Jason);
> > > > > - Use ALIGN() in vring_init_packed() (Jason);
> > > > > - Avoid using pointer to track `next` in detach_buf_packed() (Jason);
> > > > > - Add comments for barriers (Jason);
> > > > > - Don't enable RING_PACKED on ccw for now (noticed by Jason);
> > > > > - Refine the memory barrier in virtqueue_poll();
> > > > > - Add a missing memory barrier in virtqueue_enable_cb_delayed_packed();
> > > > > - Remove the hacks in virtqueue_enable_cb_prepare_packed();
> > > > > 
> > > > > RFC v4 -> RFC v5:
> > > > > - Save DMA addr, etc in desc state (Jason);
> > > > > - Track used wrap counter;
> > > > > 
> > > > > RFC v3 -> RFC v4:
> > > > > - Make ID allocation support out-of-order (Jason);
> > > > > - Various fixes for EVENT_IDX support;
> > > > > 
> > > > > RFC v2 -> RFC v3:
> > > > > - Split into small patches (Jason);
> > > > > - Add helper virtqueue_use_indirect() (Jason);
> > > > > - Just set id for the last descriptor of a list (Jason);
> > > > > - Calculate the prev in virtqueue_add_packed() (Jason);
> > > > > - Fix/improve desc suppression code (Jason/MST);
> > > > > - Refine the code layout for XXX_split/packed and wrappers (MST);
> > > > > - Fix the comments and API in uapi (MST);
> > > > > - Remove the BUG_ON() for indirect (Jason);
> > > > > - Some other refinements and bug fixes;
> > > > > 
> > > > > RFC v1 -> RFC v2:
> > > > > - Add indirect descriptor support - compile test only;
> > > > > - Add event suppression supprt - compile test only;
> > > > > - Move vring_packed_init() out of uapi (Jason, MST);
> > > > > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > > > > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > > > > - Avoid using '%' operator (Jason);
> > > > > - Rename free_head -> next_avail_idx (Jason);
> > > > > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > > > > - Some other refinements and bug fixes;
> > > > > 
> > > > > Thanks!
> > > > > 
> > > > > Tiwei Bie (5):
> > > > >   virtio: add packed ring definitions
> > > > >   virtio_ring: support creating packed ring
> > > > >   virtio_ring: add packed ring support
> > > > >   virtio_ring: add event idx support in packed ring
> > > > >   virtio_ring: enable packed ring
> > > > > 
> > > > >  drivers/s390/virtio/virtio_ccw.c   |   14 +
> > > > >  drivers/virtio/virtio_ring.c       | 1365 ++++++++++++++++++++++------
> > > > >  include/linux/virtio_ring.h        |    8 +-
> > > > >  include/uapi/linux/virtio_config.h |    3 +
> > > > >  include/uapi/linux/virtio_ring.h   |   43 +
> > > > >  5 files changed, 1157 insertions(+), 276 deletions(-)
> > > > > 
> > > > > -- 
> > > > > 2.18.0
> > > > 
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > > 

^ permalink raw reply

* Re: [PATCH net] Revert "rtnetlink: add rtnl_link_state check in rtnl_configure_link"
From: mcbirnie.l @ 2018-09-12  7:56 UTC (permalink / raw)
  To: netdev; +Cc: Liam McBirnie
In-Reply-To: <20180911231110.3506-1-mcbirnie.l@gmail.com>

From: Liam McBirnie <liam.mcbirnie@boeing.com>

Please reject this patch.
The author of the original commit is working on a better patch.
https://bugzilla.kernel.org/show_bug.cgi?id=201071

^ permalink raw reply

* Re: [PATCH net-next v2 1/5] virtio: add packed ring definitions
From: Michael S. Tsirkin @ 2018-09-12 12:53 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
	jfreimann
In-Reply-To: <20180910021322.GA10912@debian>

On Mon, Sep 10, 2018 at 10:13:22AM +0800, Tiwei Bie wrote:
> On Fri, Sep 07, 2018 at 09:51:23AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jul 11, 2018 at 10:27:07AM +0800, Tiwei Bie wrote:
> > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > ---
> > >  include/uapi/linux/virtio_config.h |  3 +++
> > >  include/uapi/linux/virtio_ring.h   | 43 ++++++++++++++++++++++++++++++
> > >  2 files changed, 46 insertions(+)
> > > 
> > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > index 449132c76b1c..1196e1c1d4f6 100644
> > > --- a/include/uapi/linux/virtio_config.h
> > > +++ b/include/uapi/linux/virtio_config.h
> > > @@ -75,6 +75,9 @@
> > >   */
> > >  #define VIRTIO_F_IOMMU_PLATFORM		33
> > >  
> > > +/* This feature indicates support for the packed virtqueue layout. */
> > > +#define VIRTIO_F_RING_PACKED		34
> > > +
> > >  /*
> > >   * Does the device support Single Root I/O Virtualization?
> > >   */
> > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > index 6d5d5faa989b..0254a2ba29cf 100644
> > > --- a/include/uapi/linux/virtio_ring.h
> > > +++ b/include/uapi/linux/virtio_ring.h
> > > @@ -44,6 +44,10 @@
> > >  /* This means the buffer contains a list of buffer descriptors. */
> > >  #define VRING_DESC_F_INDIRECT	4
> > >  
> > > +/* Mark a descriptor as available or used. */
> > > +#define VRING_DESC_F_AVAIL	(1ul << 7)
> > > +#define VRING_DESC_F_USED	(1ul << 15)
> > > +
> > >  /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > >   * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > >   * will still kick if it's out of buffers. */
> > > @@ -53,6 +57,17 @@
> > >   * optimization.  */
> > >  #define VRING_AVAIL_F_NO_INTERRUPT	1
> > >  
> > > +/* Enable events. */
> > > +#define VRING_EVENT_F_ENABLE	0x0
> > > +/* Disable events. */
> > > +#define VRING_EVENT_F_DISABLE	0x1
> > > +/*
> > > + * Enable events for a specific descriptor
> > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > + */
> > > +#define VRING_EVENT_F_DESC	0x2
> > > +
> > >  /* We support indirect buffer descriptors */
> > >  #define VIRTIO_RING_F_INDIRECT_DESC	28
> > >  
> > 
> > These are for the packed ring, right? Pls prefix accordingly.
> 
> How about something like this:
> 
> #define VRING_PACKED_DESC_F_AVAIL	(1u << 7)
> #define VRING_PACKED_DESC_F_USED	(1u << 15)

_F_ should be a bit number, not a shifted value.
Yes I know current virtio_ring.h is inconsistent in
that respect. changes welcome though we need to
maintain the old names for the sake of old apps.

> 
> #define VRING_PACKED_EVENT_F_ENABLE	0x0
> #define VRING_PACKED_EVENT_F_DISABLE	0x1
> #define VRING_PACKED_EVENT_F_DESC	0x2

These are values, I think they should not have _F_.
Yes I know current virtio_ring.h is inconsistent in
that respect.


> 
> > Also, you likely need macros for the wrap counters.
> 
> How about something like this:
> 
> #define VRING_PACKED_EVENT_WRAP_COUNTER_SHIFT	15

Given it's a single bit, maybe even
VRING_PACKED_EVENT_F_WRAP_CTR

> #define VRING_PACKED_EVENT_WRAP_COUNTER_MASK	\
> 			(1u << VRING_PACKED_WRAP_COUNTER_SHIFT)
> #define VRING_PACKED_EVENT_OFFSET_MASK	\
> 			((1u << VRING_PACKED_WRAP_COUNTER_SHIFT) - 1)

I'm not sure the mask is justified.

> > 
> > > @@ -171,4 +186,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > >  	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > >  }
> > >  
> > > +struct vring_packed_desc_event {
> > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > +	__virtio16 off_wrap;
> > > +	/* Descriptor Ring Change Event Flags. */
> > > +	__virtio16 flags;
> > > +};
> > > +
> > > +struct vring_packed_desc {
> > > +	/* Buffer Address. */
> > > +	__virtio64 addr;
> > > +	/* Buffer Length. */
> > > +	__virtio32 len;
> > > +	/* Buffer ID. */
> > > +	__virtio16 id;
> > > +	/* The flags depending on descriptor type. */
> > > +	__virtio16 flags;
> > > +};
> > 
> > Don't use __virtioXX types, just __leXX ones.
> 
> Got it, will do that.
> 
> > 
> > > +
> > > +struct vring_packed {
> > > +	unsigned int num;
> > > +
> > > +	struct vring_packed_desc *desc;
> > > +
> > > +	struct vring_packed_desc_event *driver;
> > > +
> > > +	struct vring_packed_desc_event *device;
> > > +};
> > > +
> > >  #endif /* _UAPI_LINUX_VIRTIO_RING_H */
> > > -- 
> > > 2.18.0

^ permalink raw reply

* Re: [PATCH net-next v2 2/5] virtio_ring: support creating packed ring
From: Michael S. Tsirkin @ 2018-09-12 12:45 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
	jfreimann
In-Reply-To: <20180910022837.GA11822@debian>

On Mon, Sep 10, 2018 at 10:28:37AM +0800, Tiwei Bie wrote:
> On Fri, Sep 07, 2018 at 10:03:24AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jul 11, 2018 at 10:27:08AM +0800, Tiwei Bie wrote:
> > > This commit introduces the support for creating packed ring.
> > > All split ring specific functions are added _split suffix.
> > > Some necessary stubs for packed ring are also added.
> > > 
> > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > 
> > I'd rather have a patch just renaming split functions, then
> > add all packed stuff in as a separate patch on top.
> 
> Sure, I will do that.
> 
> > 
> > 
> > > ---
> > >  drivers/virtio/virtio_ring.c | 801 +++++++++++++++++++++++------------
> > >  include/linux/virtio_ring.h  |   8 +-
> > >  2 files changed, 546 insertions(+), 263 deletions(-)
> > > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 814b395007b2..c4f8abc7445a 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -60,11 +60,15 @@ struct vring_desc_state {
> > >  	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
> > >  };
> > >  
> > > +struct vring_desc_state_packed {
> > > +	int next;			/* The next desc state. */
> > 
> > So this can go away with IN_ORDER?
> 
> Yes. If IN_ORDER is negotiated, next won't be needed anymore.
> Currently, IN_ORDER isn't included in this patch set, because
> some changes for split ring are needed to make sure that it
> will use the descs in the expected order. After that,
> optimizations can be done for both of split ring and packed
> ring respectively.
> 
> > 
> > > +};
> > > +
> > >  struct vring_virtqueue {
> > >  	struct virtqueue vq;
> > >  
> > > -	/* Actual memory layout for this queue */
> > > -	struct vring vring;
> > > +	/* Is this a packed ring? */
> > > +	bool packed;
> > >  
> > >  	/* Can we use weak barriers? */
> > >  	bool weak_barriers;
> > > @@ -86,11 +90,39 @@ struct vring_virtqueue {
> > >  	/* Last used index we've seen. */
> > >  	u16 last_used_idx;
> > >  
> > > -	/* Last written value to avail->flags */
> > > -	u16 avail_flags_shadow;
> > > +	union {
> > > +		/* Available for split ring */
> > > +		struct {
> > > +			/* Actual memory layout for this queue. */
> > > +			struct vring vring;
> > >  
> > > -	/* Last written value to avail->idx in guest byte order */
> > > -	u16 avail_idx_shadow;
> > > +			/* Last written value to avail->flags */
> > > +			u16 avail_flags_shadow;
> > > +
> > > +			/* Last written value to avail->idx in
> > > +			 * guest byte order. */
> > > +			u16 avail_idx_shadow;
> > > +		};
> > 
> > Name this field split so it's easier to detect misuse of e.g.
> > packed fields in split code?
> 
> Good point, I'll do that.
> 
> > 
> > > +
> > > +		/* Available for packed ring */
> > > +		struct {
> > > +			/* Actual memory layout for this queue. */
> > > +			struct vring_packed vring_packed;
> > > +
> > > +			/* Driver ring wrap counter. */
> > > +			bool avail_wrap_counter;
> > > +
> > > +			/* Device ring wrap counter. */
> > > +			bool used_wrap_counter;
> > > +
> > > +			/* Index of the next avail descriptor. */
> > > +			u16 next_avail_idx;
> > > +
> > > +			/* Last written value to driver->flags in
> > > +			 * guest byte order. */
> > > +			u16 event_flags_shadow;
> > > +		};
> > > +	};
> [...]
> > > +
> > > +/*
> > > + * The layout for the packed ring is a continuous chunk of memory
> > > + * which looks like this.
> > > + *
> > > + * struct vring_packed {
> > > + *	// The actual descriptors (16 bytes each)
> > > + *	struct vring_packed_desc desc[num];
> > > + *
> > > + *	// Padding to the next align boundary.
> > > + *	char pad[];
> > > + *
> > > + *	// Driver Event Suppression
> > > + *	struct vring_packed_desc_event driver;
> > > + *
> > > + *	// Device Event Suppression
> > > + *	struct vring_packed_desc_event device;
> > > + * };
> > > + */
> > 
> > Why not just allocate event structures separately?
> > Is it a win to have them share a cache line for some reason?
> 
> Will do that.
> 
> > 
> > > +static inline void vring_init_packed(struct vring_packed *vr, unsigned int num,
> > > +				     void *p, unsigned long align)
> > > +{
> > > +	vr->num = num;
> > > +	vr->desc = p;
> > > +	vr->driver = (void *)ALIGN(((uintptr_t)p +
> > > +		sizeof(struct vring_packed_desc) * num), align);
> > > +	vr->device = vr->driver + 1;
> > > +}
> > 
> > What's all this about alignment? Where does it come from?
> 
> It comes from the `vring_align` parameter of vring_create_virtqueue()
> and vring_new_virtqueue():
> 
> https://github.com/torvalds/linux/blob/a49a9dcce802/drivers/virtio/virtio_ring.c#L1061
> https://github.com/torvalds/linux/blob/a49a9dcce802/drivers/virtio/virtio_ring.c#L1123

Note the TODO - we just never got to fixing it. It would be great to fix
this for virtio 1 rings, but if not - at least let's not add this
stuff for packed rings.

> Should I just ignore it in packed ring?
> 
> CCW defined this:
> 
> #define KVM_VIRTIO_CCW_RING_ALIGN 4096
> 
> I'm not familiar with CCW. Currently, in this patch set, packed ring
> isn't enabled on CCW dues to some legacy accessors are not implemented
> in packed ring yet.


Then you need to take steps not to negotiate this feature bit for
ccw drivers.

> > 
> > > +
> > > +static inline unsigned vring_size_packed(unsigned int num, unsigned long align)
> > > +{
> > > +	return ((sizeof(struct vring_packed_desc) * num + align - 1)
> > > +		& ~(align - 1)) + sizeof(struct vring_packed_desc_event) * 2;
> > > +}
> [...]
> > > @@ -1129,10 +1388,17 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,
> > >  				      void (*callback)(struct virtqueue *vq),
> > >  				      const char *name)
> > >  {
> > > -	struct vring vring;
> > > -	vring_init(&vring, num, pages, vring_align);
> > > -	return __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
> > > -				     notify, callback, name);
> > > +	union vring_union vring;
> > > +	bool packed;
> > > +
> > > +	packed = virtio_has_feature(vdev, VIRTIO_F_RING_PACKED);
> > > +	if (packed)
> > > +		vring_init_packed(&vring.vring_packed, num, pages, vring_align);
> > > +	else
> > > +		vring_init(&vring.vring_split, num, pages, vring_align);
> > 
> > 
> > vring_init in the UAPI header is more or less a bug.
> > I'd just stop using it, keep it around for legacy userspace.
> 
> Got it. I'd like to do that. Thanks.
> 
> > 
> > > +
> > > +	return __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
> > > +				     context, notify, callback, name);
> > >  }
> > >  EXPORT_SYMBOL_GPL(vring_new_virtqueue);
> > >  
> > > @@ -1142,7 +1408,9 @@ void vring_del_virtqueue(struct virtqueue *_vq)
> > >  
> > >  	if (vq->we_own_ring) {
> > >  		vring_free_queue(vq->vq.vdev, vq->queue_size_in_bytes,
> > > -				 vq->vring.desc, vq->queue_dma_addr);
> > > +				 vq->packed ? (void *)vq->vring_packed.desc :
> > > +					      (void *)vq->vring.desc,
> > > +				 vq->queue_dma_addr);
> > >  	}
> > >  	list_del(&_vq->list);
> > >  	kfree(vq);
> > > @@ -1184,7 +1452,7 @@ unsigned int virtqueue_get_vring_size(struct virtqueue *_vq)
> > >  
> > >  	struct vring_virtqueue *vq = to_vvq(_vq);
> > >  
> > > -	return vq->vring.num;
> > > +	return vq->packed ? vq->vring_packed.num : vq->vring.num;
> > >  }
> > >  EXPORT_SYMBOL_GPL(virtqueue_get_vring_size);
> > >  
> > > @@ -1227,6 +1495,10 @@ dma_addr_t virtqueue_get_avail_addr(struct virtqueue *_vq)
> > >  
> > >  	BUG_ON(!vq->we_own_ring);
> > >  
> > > +	if (vq->packed)
> > > +		return vq->queue_dma_addr + ((char *)vq->vring_packed.driver -
> > > +				(char *)vq->vring_packed.desc);
> > > +
> > >  	return vq->queue_dma_addr +
> > >  		((char *)vq->vring.avail - (char *)vq->vring.desc);
> > >  }
> > > @@ -1238,11 +1510,16 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *_vq)
> > >  
> > >  	BUG_ON(!vq->we_own_ring);
> > >  
> > > +	if (vq->packed)
> > > +		return vq->queue_dma_addr + ((char *)vq->vring_packed.device -
> > > +				(char *)vq->vring_packed.desc);
> > > +
> > >  	return vq->queue_dma_addr +
> > >  		((char *)vq->vring.used - (char *)vq->vring.desc);
> > >  }
> > >  EXPORT_SYMBOL_GPL(virtqueue_get_used_addr);
> > >  
> > > +/* Only available for split ring */
> > >  const struct vring *virtqueue_get_vring(struct virtqueue *vq)
> > >  {
> > >  	return &to_vvq(vq)->vring;
> > > diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> > > index fab02133a919..992142b35f55 100644
> > > --- a/include/linux/virtio_ring.h
> > > +++ b/include/linux/virtio_ring.h
> > > @@ -60,6 +60,11 @@ static inline void virtio_store_mb(bool weak_barriers,
> > >  struct virtio_device;
> > >  struct virtqueue;
> > >  
> > > +union vring_union {
> > > +	struct vring vring_split;
> > > +	struct vring_packed vring_packed;
> > > +};
> > > +
> > >  /*
> > >   * Creates a virtqueue and allocates the descriptor ring.  If
> > >   * may_reduce_num is set, then this may allocate a smaller ring than
> > > @@ -79,7 +84,8 @@ struct virtqueue *vring_create_virtqueue(unsigned int index,
> > >  
> > >  /* Creates a virtqueue with a custom layout. */
> > >  struct virtqueue *__vring_new_virtqueue(unsigned int index,
> > > -					struct vring vring,
> > > +					union vring_union vring,
> > > +					bool packed,
> > >  					struct virtio_device *vdev,
> > >  					bool weak_barriers,
> > >  					bool ctx,
> > > -- 
> > > 2.18.0

^ permalink raw reply

* Re: Fw: Payment approval and SEPA direct debit mandate
From: a.hodin @ 2018-09-12  7:09 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 1020 bytes --]

Hello,

As outlined by telephone from your office, enclosed the undersigned 
purchase contract for ELW And payment receipt.

wie telefonisch besprochen von deinem Büro, anbei der Unterzeichnete 
Kaufvertrag zum ELW Und Zahlungsbeleg.

Mit freundlichen Grüßen


I.A. Florian Carolus

Klaase GmbH & Co. KG
Spedition & Logistik
Zum Kratt 8 - 10
D-24787 Fockbek

Tel. +49 (0) 4331-43817-19
Fax +49 (0) 4331-43817-25
info@klaasegmbh.com
www.spedition-kraase.de



Personenbezogene Daten, die im Rahmen dieser Kommunikation ausgetauscht 
werden, unterliegen den Datenschutzbestimmungen der EU-DSGVO.
Weitere Informationen und Ihre Rechte können Sie unter nachfolgendem 
Link abrufen: https://www.spedition-braase.de/datenschutz.html

Steuer-Nr.: 2828202105 UST-ID.Nr.: DE 135375677
Kommanditgesellschaft Sitz 24787 Fockbek, Handelsregister: Amtsgericht 
Kiel, HRA 311RD
Komplementär: FD Verwaltungs-GmbH Sitz 24787 Fockbek, Handelsregister: 
Amtsgericht Kiel, HRB 631RD
Geschäftsführung: Fritz Dethlefsen, Dirk Zernitz

[-- Attachment #2: Invoice Documents GmbH--RDPNMAZSFS-ZTJP-12-09 PDF.zip --]
[-- Type: application/zip, Size: 1166 bytes --]

^ permalink raw reply

* Re: [PATCH] net: ethernet: Use DIV_ROUND_UP instead of reimplementing its function
From: Tariq Toukan @ 2018-09-12 12:14 UTC (permalink / raw)
  To: zhong jiang, davem, claudiu.manoil, saeedm, derek.chickles
  Cc: leon, jdmason, netdev, linux-kernel
In-Reply-To: <1536671295-14432-1-git-send-email-zhongjiang@huawei.com>



On 11/09/2018 4:08 PM, zhong jiang wrote:
> DIV_ROUND_UP has implemented the code-opened function. Therefore, just
> replace the implementation with DIV_ROUND_UP.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
>   drivers/net/ethernet/broadcom/sb1250-mac.c         | 2 +-
>   drivers/net/ethernet/cavium/liquidio/octeon_droq.c | 2 +-
>   drivers/net/ethernet/freescale/gianfar_ethtool.c   | 2 +-
>   drivers/net/ethernet/ibm/emac/mal.h                | 2 +-
>   drivers/net/ethernet/mellanox/mlx4/alloc.c         | 2 +-
>   drivers/net/ethernet/mellanox/mlx4/icm.c           | 2 +-
>   drivers/net/ethernet/mellanox/mlx5/core/srq.c      | 2 +-
>   drivers/net/ethernet/neterion/s2io.c               | 2 +-
>   8 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c
> index ef4a0c3..1082529 100644
> --- a/drivers/net/ethernet/broadcom/sb1250-mac.c
> +++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
> @@ -156,7 +156,7 @@ enum sbmac_state {
>   			  (d)->sbdma_dscrtable : (d)->f+1)
>   
>   
> -#define NUMCACHEBLKS(x) (((x)+SMP_CACHE_BYTES-1)/SMP_CACHE_BYTES)
> +#define NUMCACHEBLKS(x) DIV_ROUND_UP(x, SMP_CACHE_BYTES)
>   
>   #define SBMAC_MAX_TXDESCR	256
>   #define SBMAC_MAX_RXDESCR	256
> diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_droq.c b/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
> index a71dbb7..b57008d 100644
> --- a/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
> +++ b/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
> @@ -530,7 +530,7 @@ void octeon_droq_check_oom(struct octeon_droq *droq)
>   static inline u32
>   octeon_droq_get_bufcount(u32 buf_size, u32 total_len)
>   {
> -	return ((total_len + buf_size - 1) / buf_size);
> +	return DIV_ROUND_UP(total_len, buf_size);
>   }
>   
>   static int
> diff --git a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
> index 395a526..b536eb0 100644
> --- a/drivers/net/ethernet/freescale/gianfar_ethtool.c
> +++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c
> @@ -230,7 +230,7 @@ static unsigned int gfar_usecs2ticks(struct gfar_private *priv,
>   
>   	/* Make sure we return a number greater than 0
>   	 * if usecs > 0 */
> -	return (usecs * 1000 + count - 1) / count;
> +	return DIV_ROUND_UP(usecs * 1000, count);
>   }
>   
>   /* Convert ethernet clock ticks to microseconds */
> diff --git a/drivers/net/ethernet/ibm/emac/mal.h b/drivers/net/ethernet/ibm/emac/mal.h
> index eeade2e..e4c20f0 100644
> --- a/drivers/net/ethernet/ibm/emac/mal.h
> +++ b/drivers/net/ethernet/ibm/emac/mal.h
> @@ -136,7 +136,7 @@ static inline int mal_rx_size(int len)
>   
>   static inline int mal_tx_chunks(int len)
>   {
> -	return (len + MAL_MAX_TX_SIZE - 1) / MAL_MAX_TX_SIZE;
> +	return DIV_ROUND_UP(len, MAL_MAX_TX_SIZE);
>   }
>   
>   #define MAL_CHAN_MASK(n)	(0x80000000 >> (n))
> diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c b/drivers/net/ethernet/mellanox/mlx4/alloc.c
> index 4bdf250..deef5a9 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
> @@ -614,7 +614,7 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
>   		int i;
>   
>   		buf->direct.buf = NULL;
> -		buf->nbufs	= (size + PAGE_SIZE - 1) / PAGE_SIZE;
> +		buf->nbufs      = DIV_ROUND_UP(size, PAGE_SIZE);
>   		buf->npages	= buf->nbufs;
>   		buf->page_shift  = PAGE_SHIFT;
>   		buf->page_list   = kcalloc(buf->nbufs, sizeof(*buf->page_list),
> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
> index 7262c63..4b43511 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
> @@ -406,7 +406,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
>   	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
>   	if (WARN_ON(!obj_per_chunk))
>   		return -EINVAL;
> -	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
> +	num_icm = DIV_ROUND_UP(nobj, obj_per_chunk);
>   
>   	table->icm      = kvcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
>   	if (!table->icm)
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/srq.c b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
> index 23cc337..7e20666 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/srq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
> @@ -73,7 +73,7 @@ static int get_pas_size(struct mlx5_srq_attr *in)
>   	u32 rq_sz	  = 1 << (log_srq_size + 4 + log_rq_stride);
>   	u32 page_size	  = 1 << log_page_size;
>   	u32 rq_sz_po      = rq_sz + (page_offset * po_quanta);
> -	u32 rq_num_pas	  = (rq_sz_po + page_size - 1) / page_size;
> +	u32 rq_num_pas    = DIV_ROUND_UP(rq_sz_po, page_size);
>   
>   	return rq_num_pas * sizeof(u64);
>   }
> diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
> index b8983e7..f980f10 100644
> --- a/drivers/net/ethernet/neterion/s2io.c
> +++ b/drivers/net/ethernet/neterion/s2io.c
> @@ -491,7 +491,7 @@ static void do_s2io_copy_mac_addr(struct s2io_nic *sp, int offset, u64 mac_addr)
>   };
>   
>   /* A simplifier macro used both by init and free shared_mem Fns(). */
> -#define TXD_MEM_PAGE_CNT(len, per_each) ((len+per_each - 1) / per_each)
> +#define TXD_MEM_PAGE_CNT(len, per_each) DIV_ROUND_UP(len, per_each)
>   
>   /* netqueue manipulation helper functions */
>   static inline void s2io_stop_all_tx_queue(struct s2io_nic *sp)
> 

For the Mellanox parts:
Acked-by: Tariq Toukan <tariqt@mellanox.com>

^ permalink raw reply

* Re: [Patch net v2] rds: fix two RCU related problems
From: David Miller @ 2018-09-12  7:10 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, sowmini.varadhan, santosh.shilimkar, rds-devel
In-Reply-To: <20180911012726.5353-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 10 Sep 2018 18:27:26 -0700

> When a rds sock is bound, it is inserted into the bind_hash_table
> which is protected by RCU. But when releasing rds sock, after it
> is removed from this hash table, it is freed immediately without
> respecting RCU grace period. This could cause some use-after-free
> as reported by syzbot.
> 
> Mark the rds sock with SOCK_RCU_FREE before inserting it into the
> bind_hash_table, so that it would be always freed after a RCU grace
> period.
> 
> The other problem is in rds_find_bound(), the rds sock could be
> freed in between rhashtable_lookup_fast() and rds_sock_addref(),
> so we need to extend RCU read lock protection in rds_find_bound()
> to close this race condition.
> 
> Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com
> Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
> Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Cc: rds-devel@oss.oracle.com
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH][net-next][v2] netlink: remove hash::nelems check in netlink_insert
From: David Miller @ 2018-09-12  7:08 UTC (permalink / raw)
  To: lirongqing; +Cc: netdev
In-Reply-To: <1536627901-18256-1-git-send-email-lirongqing@baidu.com>

From: Li RongQing <lirongqing@baidu.com>
Date: Tue, 11 Sep 2018 09:05:01 +0800

> The type of hash::nelems has been changed from size_t to atom_t
> which in fact is int, so not need to check if BITS_PER_LONG, that
> is bit number of size_t, is bigger than 32
> 
> and rht_grow_above_max() will be called to check if hashtable is
> too big, ensure it can not bigger than 1<<31
> 
> Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next v1] net/tls: Fixed return value when tls_complete_pending_work() fails
From: David Miller @ 2018-09-12  7:04 UTC (permalink / raw)
  To: vakul.garg; +Cc: netdev, borisp, aviadye, davejwatson, doronrk
In-Reply-To: <20180910172346.13181-1-vakul.garg@nxp.com>

From: Vakul Garg <vakul.garg@nxp.com>
Date: Mon, 10 Sep 2018 22:53:46 +0530

> In tls_sw_sendmsg() and tls_sw_sendpage(), the variable 'ret' has
> been set to return value of tls_complete_pending_work(). This allows
> return of proper error code if tls_complete_pending_work() fails.
> 
> Fixes: 3c4d7559159b ("tls: kernel TLS support")
> Signed-off-by: Vakul Garg <vakul.garg@nxp.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net] geneve: add ttl inherit support
From: Jiri Benc @ 2018-09-12  6:53 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: netdev, Pravin Shelar, David S. Miller, Stefano Brivio
In-Reply-To: <1536717861-21590-1-git-send-email-liuhangbin@gmail.com>

On Wed, 12 Sep 2018 10:04:21 +0800, Hangbin Liu wrote:
> Similar with commit 72f6d71e491e6 ("vxlan: add ttl inherit support"),
> currently ttl == 0 means "use whatever default value" on geneve instead
> of inherit inner ttl. To respect compatibility with old behavior, let's
> add a new IFLA_GENEVE_TTL_INHERIT for geneve ttl inherit support.
> 
> Reported-by: Jianlin Shi <jishi@redhat.com>
> Suggested-by: Jiri Benc <jbenc@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Reviewed-by: Jiri Benc <jbenc@redhat.com>

Thanks, Hangbin!

 Jiri

^ permalink raw reply

* Re: [PATCH net-next v3 0/7] net: aquantia: implement WOL and EEE support
From: David Miller @ 2018-09-12  6:48 UTC (permalink / raw)
  To: igor.russkikh; +Cc: netdev
In-Reply-To: <cover.1536572107.git.igor.russkikh@aquantia.com>

From: Igor Russkikh <igor.russkikh@aquantia.com>
Date: Mon, 10 Sep 2018 12:39:27 +0300

> This is v3 of WOL/EEE functionality patch for atlantic driver.
> 
> In this patchset Yana Esina and Nikita Danilov implemented:
> 
> - Upload function to interact with FW memory
> - Definitions and structures necessary for the correct operation of Wake ON Lan
> - The functionality Wake On Lan via ethtool (Magic packet is supported)
> - The functionality for Energy-Efficient Ethernet configuration via ethtool
> 
> Version 3:
> - use ETH_ALEN instead of raw number
> 
> Version 2 has the following fixes:
> - patchset reorganized to extract renaming and whitespace fixes into separate
>   patches
> - some of magic numbers replaced with defines
> - reverse christmas tree applied

Series applied, thanks.

> Discussion outcome regarding driver version bumps was not finished
> (here https://patchwork.ozlabs.org/patch/954905/)
> David, could you suggest the best way to proceed on this?

Having a channel for your driver that is outside of upstream and Linux
distribution packages creates lots of problems.

When a user reports a problem with an upstream kernel, that verion
dictates which driver source was being used.  There is not confusion
or ambiguity.

For a distribution kernel, the distributor hashes out which driver
they published in their kernel package when evaluating a bug reported
to them.

None of these two entities is ready to evaluate and handle properly
your custom scheme.

So generally I frown against separate distribution schemes.  It is
in the final analysis an inferior experience for the user because
you basically narrow all of their support channels for problems
down to you and you alone.  The whole idea is to make it work the
opposite way.

So in the upstream tree, really, the driver version is pretty useless.

^ permalink raw reply

* Re: [PATCH net-next 0/8] bnxt_en: devlink param updates
From: Vasundhara Volam @ 2018-09-12  6:40 UTC (permalink / raw)
  To: Jiří Pírko
  Cc: jakub.kicinski, David Miller, michael.chan@broadcom.com, Netdev,
	alexander.duyck
In-Reply-To: <20180911115126.GF25110@nanopsycho>

On Tue, Sep 11, 2018 at 5:25 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Tue, Sep 11, 2018 at 01:33:51PM CEST, jakub.kicinski@netronome.com wrote:
> >On Tue, 11 Sep 2018 14:14:57 +0530, Vasundhara Volam wrote:
> >> This patchset adds support for 4 generic and 1 driver-specific devlink
> >> parameters.
> >>
> >> Also, this patchset adds support to return proper error code if
> >> HWRM_NVM_GET/SET_VARIABLE commands return error code
> >> HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED.
> >>
> >> Vasundhara Volam (8):
> >>   devlink: Add generic parameter hw_tc_offload
> >
> >Much like Jiri, I can't help but wonder why do you need this?
> >
> >>   devlink: Add generic parameter ignore_ari
> >>   devlink: Add generic parameter msix_vec_per_pf_max
> >>   devlink: Add generic parameter msix_vec_per_pf_min
> >
> >IMHO more structured API would be preferable if possible.  The string
> >keys won't scale if you want to set the parameters per PF, and
> >creating more structured API for PCIe which is a relatively slow
> >moving HW spec seems tractable.
> >
> >Not to mention the question Alex posed before about where this knobs
> >should actually live.  I'm personally fine with devlink.
> >
> >>   bnxt_en: Use hw_tc_offload and ignore_ari devlink parameters
> >>   bnxt_en: return proper error when FW returns
> >>     HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
> >>   bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink
> >>     params.
> >>   bnxt_en: Add a driver specific devlink parameter.
> >
> >The details about the device specific devlink parameter are very much
> >lacking.  Why does the patch subject not mention any specifics?  What's
> >GRE in the first place?
> >
> >You really need to provide more details and docs for all these
> >parameters.
>
> We really need Documentation/networking/devlink-params.txt to describe
> these.
>
Sure. I will add the documentation in my next version of the patchset
for all generic and driver specific parameters.

^ permalink raw reply

* Re: [PATCH net-next 0/8] bnxt_en: devlink param updates
From: Vasundhara Volam @ 2018-09-12  6:39 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: David Miller, michael.chan@broadcom.com, Netdev, alexander.duyck
In-Reply-To: <20180911133215.5798ed2a@cakuba>

On Tue, Sep 11, 2018 at 5:04 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Tue, 11 Sep 2018 14:14:57 +0530, Vasundhara Volam wrote:
> > This patchset adds support for 4 generic and 1 driver-specific devlink
> > parameters.
> >
> > Also, this patchset adds support to return proper error code if
> > HWRM_NVM_GET/SET_VARIABLE commands return error code
> > HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED.
> >
> > Vasundhara Volam (8):
> >   devlink: Add generic parameter hw_tc_offload
>
> Much like Jiri, I can't help but wonder why do you need this?
There is a request from our customer for a way to toggle tc_offload
feature in our adapter.
>
> >   devlink: Add generic parameter ignore_ari
> >   devlink: Add generic parameter msix_vec_per_pf_max
> >   devlink: Add generic parameter msix_vec_per_pf_min
>
> IMHO more structured API would be preferable if possible.  The string
> keys won't scale if you want to set the parameters per PF, and
> creating more structured API for PCIe which is a relatively slow
> moving HW spec seems tractable.
Sorry, could you please suggest an example? We will try to adapt.
>
> Not to mention the question Alex posed before about where this knobs
> should actually live.  I'm personally fine with devlink.
Okay
>
> >   bnxt_en: Use hw_tc_offload and ignore_ari devlink parameters
> >   bnxt_en: return proper error when FW returns
> >     HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
> >   bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink
> >     params.
> >   bnxt_en: Add a driver specific devlink parameter.
>
> The details about the device specific devlink parameter are very much
> lacking.  Why does the patch subject not mention any specifics?  What's
> GRE in the first place?
>
> You really need to provide more details and docs for all these
> parameters.
Sure, I will add all the documentation in my next version of the patch. Thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/3] liquidio: Removed droq lock from Rx path
From: David Miller @ 2018-09-12  6:37 UTC (permalink / raw)
  To: felix.manlunas
  Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
	intiyaz.basha
In-Reply-To: <20180910063322.GA4011@felix-thinkpad.cavium.com>

From: Felix Manlunas <felix.manlunas@cavium.com>
Date: Sun, 9 Sep 2018 23:33:22 -0700

> Series of patches for removing droq lock from Rx Path.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] tcp: rate limit synflood warnings further
From: David Miller @ 2018-09-12  6:35 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, eric.dumazet, willemb
In-Reply-To: <20180909231212.212470-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Sun,  9 Sep 2018 19:12:12 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> Convert pr_info to net_info_ratelimited to limit the total number of
> synflood warnings.
> 
> Commit 946cedccbd73 ("tcp: Change possible SYN flooding messages")
> rate limits synflood warnings to one per listener.
> 
> Workloads that open many listener sockets can still see a high rate of
> log messages. Syzkaller is one frequent example.
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Applied, thanks Willem.

Is this stable material?

^ permalink raw reply

* Re: [PATCH net-next 1/8] devlink: Add generic parameter hw_tc_offload
From: Jakub Kicinski @ 2018-09-12  6:34 UTC (permalink / raw)
  To: Vasundhara Volam
  Cc: Jiří Pírko, David Miller,
	michael.chan@broadcom.com, Netdev
In-Reply-To: <CAACQVJrNyDXCuEFRVWcVh9tpb4_8WpLJ3U+ufm8TXcVYncHD6A@mail.gmail.com>

On Wed, 12 Sep 2018 11:47:59 +0530, Vasundhara Volam wrote:
> On Tue, Sep 11, 2018 at 3:25 PM Jiri Pirko <jiri@resnulli.us> wrote:
> >
> > Tue, Sep 11, 2018 at 10:44:58AM CEST, vasundhara-v.volam@broadcom.com wrote:  
> > >hw_tc_offload - Enable/Disable TC flower offload in the device.
> > >
> > >Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
> > >---
> > > include/net/devlink.h | 4 ++++
> > > net/core/devlink.c    | 5 +++++
> > > 2 files changed, 9 insertions(+)
> > >
> > >diff --git a/include/net/devlink.h b/include/net/devlink.h
> > >index b9b89d6..a0e9ce9 100644
> > >--- a/include/net/devlink.h
> > >+++ b/include/net/devlink.h
> > >@@ -362,6 +362,7 @@ enum devlink_param_generic_id {
> > >       DEVLINK_PARAM_GENERIC_ID_MAX_MACS,
> > >       DEVLINK_PARAM_GENERIC_ID_ENABLE_SRIOV,
> > >       DEVLINK_PARAM_GENERIC_ID_REGION_SNAPSHOT,
> > >+      DEVLINK_PARAM_GENERIC_ID_HW_TC_OFFLOAD,  
> >
> > Could you please describe why do you need this here and why the
> > tc_offload flag in ethtool is not enough. How do you imagine the user
> > should use them together?  
> Jiri, tc_offload flag in ethtool will modify feature in driver at
> runtime. But I am adding
> tc_offload param here to toggle this feature in NVM Config of our
> adapter, whose configuration
> mode is permanent and will be effective only with a reboot of the server.
> 
> User has to turn on tc_offload feature in NVM config of the adapter and then
> enabling the tc_offload flag in ethtool will completely enable the
> feature in driver.

Thanks for explaining, however, I don't think we have trouble
understanding *what* you are doing, but rather *why* you're doing it.

^ permalink raw reply

* Re: [PATCH v2 net] MIPS: lantiq: dma: add dev pointer
From: David Miller @ 2018-09-12  6:33 UTC (permalink / raw)
  To: hauke
  Cc: netdev, andrew, f.fainelli, john, linux-mips, hauke.mehrtens,
	paul.burton
In-Reply-To: <20180909192623.14998-1-hauke@hauke-m.de>

From: Hauke Mehrtens <hauke@hauke-m.de>
Date: Sun,  9 Sep 2018 21:26:23 +0200

> dma_zalloc_coherent() now crashes if no dev pointer is given.
> Add a dev pointer to the ltq_dma_channel structure and fill it in the
> driver using it.
> 
> This fixes a bug introduced in kernel 4.19.
> 
> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net] powerpc: use big endian to hash len and proto in csum_ipv6_magic
From: Xin Long @ 2018-09-12  6:27 UTC (permalink / raw)
  To: davem
  Cc: network dev, linuxppc-dev, Christophe Leroy, Michael Ellerman,
	Roopa Prabhu
In-Reply-To: <20180911.230105.563027666901362955.davem@davemloft.net>

On Wed, Sep 12, 2018 at 2:01 PM David Miller <davem@davemloft.net> wrote:
>
> From: Xin Long <lucien.xin@gmail.com>
> Date: Sat,  8 Sep 2018 18:15:12 +0800
>
> > The function csum_ipv6_magic doesn't convert len and proto to big
> > endian before doing ipv6 csum hash, which is not consistent with
> > RFC and other arches.
> >
> > Jianlin found it when ICMPv6 packets from other hosts were dropped
> > in the powerpc64 system.
> >
> > This patch is to fix it by using instruction 'lwbrx' to do this
> > conversion in powerpc32/64 csum_ipv6_magic.
> >
> > Fixes: e9c4943a107b ("powerpc: Implement csum_ipv6_magic in assembly")
> > Reported-by: Jianlin Shi <jishi@redhat.com>
> > Signed-off-by: Xin Long <lucien.xin@gmail.com>
>
> Xin, please address the feedback you were given.
Christophe posted another one,
https://lore.kernel.org/patchwork/patch/983905/

Sorry, I didn't notice netdev wasn't in its CC-list.

^ permalink raw reply

* Re: [PATCH net] powerpc: use big endian to hash len and proto in csum_ipv6_magic
From: Christophe LEROY @ 2018-09-12  6:27 UTC (permalink / raw)
  To: David Miller, lucien.xin; +Cc: netdev, linuxppc-dev, mpe, roopa
In-Reply-To: <20180911.230105.563027666901362955.davem@davemloft.net>


Le 12/09/2018 à 08:01, David Miller a écrit :
> From: Xin Long <lucien.xin@gmail.com>
> Date: Sat,  8 Sep 2018 18:15:12 +0800
> 
>> The function csum_ipv6_magic doesn't convert len and proto to big
>> endian before doing ipv6 csum hash, which is not consistent with
>> RFC and other arches.
>>
>> Jianlin found it when ICMPv6 packets from other hosts were dropped
>> in the powerpc64 system.
>>
>> This patch is to fix it by using instruction 'lwbrx' to do this
>> conversion in powerpc32/64 csum_ipv6_magic.
>>
>> Fixes: e9c4943a107b ("powerpc: Implement csum_ipv6_magic in assembly")
>> Reported-by: Jianlin Shi <jishi@redhat.com>
>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> 
> Xin, please address the feedback you were given.

I submitted an alternative fix, and Lucien Xin gave its Tested-by:

See https://patchwork.ozlabs.org/patch/967868/

Christophe

^ permalink raw reply

* Re: [PATCH net-next 1/8] devlink: Add generic parameter hw_tc_offload
From: Vasundhara Volam @ 2018-09-12  6:17 UTC (permalink / raw)
  To: Jiří Pírko; +Cc: David Miller, michael.chan@broadcom.com, Netdev
In-Reply-To: <20180911095130.GC25110@nanopsycho>

On Tue, Sep 11, 2018 at 3:25 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Tue, Sep 11, 2018 at 10:44:58AM CEST, vasundhara-v.volam@broadcom.com wrote:
> >hw_tc_offload - Enable/Disable TC flower offload in the device.
> >
> >Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
> >---
> > include/net/devlink.h | 4 ++++
> > net/core/devlink.c    | 5 +++++
> > 2 files changed, 9 insertions(+)
> >
> >diff --git a/include/net/devlink.h b/include/net/devlink.h
> >index b9b89d6..a0e9ce9 100644
> >--- a/include/net/devlink.h
> >+++ b/include/net/devlink.h
> >@@ -362,6 +362,7 @@ enum devlink_param_generic_id {
> >       DEVLINK_PARAM_GENERIC_ID_MAX_MACS,
> >       DEVLINK_PARAM_GENERIC_ID_ENABLE_SRIOV,
> >       DEVLINK_PARAM_GENERIC_ID_REGION_SNAPSHOT,
> >+      DEVLINK_PARAM_GENERIC_ID_HW_TC_OFFLOAD,
>
> Could you please describe why do you need this here and why the
> tc_offload flag in ethtool is not enough. How do you imagine the user
> should use them together?
Jiri, tc_offload flag in ethtool will modify feature in driver at
runtime. But I am adding
tc_offload param here to toggle this feature in NVM Config of our
adapter, whose configuration
mode is permanent and will be effective only with a reboot of the server.

User has to turn on tc_offload feature in NVM config of the adapter and then
enabling the tc_offload flag in ethtool will completely enable the
feature in driver.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox