* Re: [PATCH net-next] net/ipv6: Make ipv6_route_table_template static
From: David Miller @ 2018-10-11 5:26 UTC (permalink / raw)
To: dsahern; +Cc: netdev, dsahern
In-Reply-To: <20181008210634.19645-1-dsahern@kernel.org>
From: David Ahern <dsahern@kernel.org>
Date: Mon, 8 Oct 2018 14:06:34 -0700
> From: David Ahern <dsahern@gmail.com>
>
> ipv6_route_table_template is exported but there are no users outside
> of route.c. Make it static.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] rtnetlink: Update comment in rtnl_stats_dump regarding strict data checking
From: David Miller @ 2018-10-11 5:26 UTC (permalink / raw)
To: dsahern; +Cc: netdev, dsahern
In-Reply-To: <20181008205807.18727-1-dsahern@kernel.org>
From: David Ahern <dsahern@kernel.org>
Date: Mon, 8 Oct 2018 13:58:07 -0700
> From: David Ahern <dsahern@gmail.com>
>
> The NLM_F_DUMP_PROPER_HDR netlink flag was replaced by a setsockopt.
> Update the comment in rtnl_stats_dump.
>
> Fixes: 841891ec0c65 ("rtnetlink: Update rtnl_stats_dump for strict data checking")
> Reported-by: Christian Brauner <christian@brauner.io>
> Signed-off-by: David Ahern <dsahern@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] rtnetlink: Move ifm in valid_fdb_dump_legacy to closer to use
From: David Miller @ 2018-10-11 5:26 UTC (permalink / raw)
To: dsahern; +Cc: netdev, dsahern
In-Reply-To: <20181008205724.13030-1-dsahern@kernel.org>
From: David Ahern <dsahern@kernel.org>
Date: Mon, 8 Oct 2018 13:57:24 -0700
> From: David Ahern <dsahern@gmail.com>
>
> Move setting of local variable ifm to after the message parsing in
> valid_fdb_dump_legacy. Avoid potential future use of unchecked variable.
>
> Fixes: 8dfbda19a21b ("rtnetlink: Move input checking for rtnl_fdb_dump to helper")
> Reported-by: Christian Brauner <christian@brauner.io>
> Signed-off-by: David Ahern <dsahern@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next 0/3] mlxsw: selftests: Few small updates
From: David Miller @ 2018-10-11 5:23 UTC (permalink / raw)
To: idosch; +Cc: netdev, jiri, petrm, nird
In-Reply-To: <20181008184945.788-1-idosch@mellanox.com>
From: Ido Schimmel <idosch@mellanox.com>
Date: Mon, 8 Oct 2018 18:50:38 +0000
> First patch fixes a typo in mlxsw.
>
> Second patch fixes a race in a recent test.
>
> Third patch makes a recent test executable.
Series applied, thanks Ido.
^ permalink raw reply
* Re: [PATCH net] rds: RDS (tcp) hangs on sendto() to unresponding address
From: David Miller @ 2018-10-11 5:21 UTC (permalink / raw)
To: ka-cheong.poon; +Cc: netdev, santosh.shilimkar, rds-devel
In-Reply-To: <1539015431-26974-1-git-send-email-ka-cheong.poon@oracle.com>
From: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Date: Mon, 8 Oct 2018 09:17:11 -0700
> In rds_send_mprds_hash(), if the calculated hash value is non-zero and
> the MPRDS connections are not yet up, it will wait. But it should not
> wait if the send is non-blocking. In this case, it should just use the
> base c_path for sending the message.
>
> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Applied.
^ permalink raw reply
* Re: BUG: corrupted list in p9_read_work
From: Dmitry Vyukov @ 2018-10-11 12:33 UTC (permalink / raw)
To: Dominique Martinet
Cc: Leon Romanovsky, syzbot, David Miller, Eric Van Hensbergen, LKML,
Latchesar Ionkov, netdev, Ron Minnich, syzkaller-bugs,
v9fs-developer
In-Reply-To: <20181010155814.GC20918@nautica>
On Wed, Oct 10, 2018 at 5:58 PM, Dominique Martinet
<asmadeus@codewreck.org> wrote:
> Dmitry Vyukov wrote on Wed, Oct 10, 2018:
>> > The problem is that you can't just give the client a file like trans fd;
>> > you'd need to open an ""rdma socket"" (simplifying wording a bit), and
>> > afaik there is no standard tool for it ; or rather, the problem is that
>> > RDMA is packet based so even if there were you can't just write stuff
>> > in a fd and hope it'll work, so you need a server.
>> >
>> > If you're interested, 9p is trivial enough that I could provide you with
>> > a trivial server that works like your file (just need to reimplement
>> > something that parses header to packetize it properly; so you could
>> > write to its stdin for example) ; that'd require some setup in the VM
>> > (configure rxe and install that tool), but it would definitely be
>> > possible.
>> > What do you think ?
>>
>> I would like to hear more details.
>> Opening a socket is not a problem. Why do we need a tool for this?
>
> Sorry, that's my head thinking unixy and piping things :)
>
>> I don't understand the problem with "packet-based" and what does it
>> mean to have a separate server? Any why?
>
> Packet-based means you can't just read/write in a fd and expect the
> other side to know where to cut the packets to send it to the client,
> but if we do it internally there's no problem. We know where to cut.
Ah, OK. This should not be a problem because the descriptions:
https://github.com/google/syzkaller/blob/master/sys/linux/9p.txt#L70
know what a packet is. So we always give write a single packet.
>> We definitely don't want to involve a separate third-party server,
>> that's very problematic for multiple reasons. But we can have a chunk
>> of custom C code inside of syzkaller.
>> What exactly setup we need?
>
> The setup itself isn't that bad, it's actually pretty much trivial - on
> a fedora VM I just had to run 'rxe_cfg start ens3' (virtio interface
> name) and then the infiniband tools are happy e.g. ibv_devinfo should
> list an interface if you have the userspace library that should have
> come with rxe_cfg.
> (specifically, my VM uses /etc/libibverbs.d/rxe.driver to point to the
> lib, and /usr/lib64/libibverbs/librxe-rdmav16.so the lib itself)
>
> Once tools like ibv_devinfo list the interface, it means syzkaller can
> use it, and very probably means the kernel can as well; that's it.
>
>
>> I guess it will make things simpler if you provide some kind of "hello
>> world" C program that mounts 9p/rdma. I don't need exact messages
>> (they will be same as with pipe transport, right?) nor actual server
>> implementation, but just the place where to inject these packets.
>
> That's still the tricky part, I'm afraid... Making a separate server
> would have been easy because I could have reused some of my junk for the
> actual connection handling (some rdma helper library I wrote ages
> ago[1]), but if you're going to just embed C code you'll probably want
> something lower level? I've never seen syzkaller use any library call
> but I'm not even sure I would know how to create a qp without
> libibverbs, would standard stuff be OK ?
Raw syscalls preferably.
What does 'rxe_cfg start ens3' do on syscall level? Some netlink?
Any libraries and utilities are hell pain in linux world. Will it work
in Android userspace? gVisor? Who will explain all syzkaller users
where they get this for their who-knows-what distro, which is 10 years
old because of corp policies, and debug how their version of the
library has a slightly incompatible version?
For example, after figuring out that rxe_cfg actually comes from
rdma-core (which is a separate delight on linux), my debian
destribution failed to install it because of some conflicts around
/etc/modprobe.d/mlx4.conf, and my ubuntu distro does not know about
such package. And we've just started :)
Syscalls tend to be simpler and more reliable. If it gives ENOSUPP,
ok, that's it. If it works, great, we can use it.
^ permalink raw reply
* Re: Re: [PATCH net-next v2 0/5] virtio: support packed ring
From: Tiwei Bie @ 2018-10-11 12:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jason Wang, virtualization, linux-kernel, netdev, virtio-dev,
wexu, jfreimann
In-Reply-To: <20181010103335-mutt-send-email-mst@kernel.org>
On Wed, Oct 10, 2018 at 10:36:26AM -0400, Michael S. Tsirkin wrote:
> On Thu, Sep 13, 2018 at 05:47:29PM +0800, Jason Wang wrote:
> > On 2018年09月13日 16:59, Tiwei Bie wrote:
> > > > If what you say is true then we should take a careful look
> > > > and not supporting these generic things with packed layout.
> > > > Once we do support them it will be too late and we won't
> > > > be able to get performance back.
> > > I think it's a good point that we don't need to support
> > > everything in packed ring (especially these which would
> > > hurt the performance), as the packed ring aims at high
> > > performance. I'm also wondering about the features. Is
> > > there any possibility that we won't support the out of
> > > order processing (at least not by default) in packed ring?
> > > If I didn't miss anything, the need to support out of order
> > > processing in packed ring will make the data structure
> > > inside the driver not cache friendly which is similar to
> > > the case of the descriptor table in the split ring (the
> > > difference is that, it only happens in driver now).
> >
> > Out of order is not the only user, DMA is another one. We don't have used
> > ring(len), so we need to maintain buffer length somewhere even for in order
> > device.
>
> For a bunch of systems dma unmap is a nop so we do not really
> need to maintain it. It's a question of an API to detect that
> and optimize for it. I posted a proposed patch for that -
> want to try using that?
Yeah, definitely!
>
> > But if it's not too late, I second for a OUT_OF_ORDER feature.
> > Starting from in order can have much simpler code in driver.
> >
> > Thanks
>
> It's tricky to change the flag polarity because of compatibility
> with legacy interfaces. Why is this such a big deal?
>
> Let's teach drivers about IN_ORDER, then if devices
> are in order it will get enabled by default.
Yeah, make sense.
Besides, I have done some further profiling and debugging
both in kernel driver and DPDK vhost. Previously I was mislead
by a bug in vhost code. I will send a patch to fix that bug.
With that bug fixed, the performance of packed ring in the
test between kernel driver and DPDK vhost is better now.
I will send a new series soon. Thanks!
>
> --
> MST
^ permalink raw reply
* Re: net/tipc: recursive locking in tipc_link_reset
From: Dmitry Vyukov @ 2018-10-11 12:11 UTC (permalink / raw)
To: Ying Xue; +Cc: Jon Maloy, David Miller, netdev, tipc-discussion, LKML
In-Reply-To: <97d8c196-cd11-8859-fc14-ed12191f52af@windriver.com>
On Thu, Oct 11, 2018 at 2:03 PM, Ying Xue <ying.xue@windriver.com> wrote:
>>> Hi,
>>>
>>> I am getting the following error while booting the latest kernel on
>>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>>
>>> Since this happens during boot, this makes LOCKDEP completely
>>> unusable, does not allow to discover any other locking issues and
>>> masks all new bugs being introduced into kernel.
>>> Please fix asap.
>>> Thanks
>> -parthasarathy.bhuvaragan address as it gives me bounces
>> but this is highly likely due to:
>>
>> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
>> Author: Parthasarathy Bhuvaragan
>> Date: Tue Sep 25 22:09:10 2018 +0200
>>
>> tipc: lock wakeup & inputq at tipc_link_reset()
>>
>>
>
> Dmitry, I agree with you. The complaint should be caused by the commit
> above. Please try to verify the patch:
> https://patchwork.ozlabs.org/patch/982447.
I trust you for testing ;)
Thanks for the quick fix!
^ permalink raw reply
* Re: net/tipc: recursive locking in tipc_link_reset
From: Ying Xue @ 2018-10-11 12:05 UTC (permalink / raw)
To: Jon Maloy, Dmitry Vyukov, parthasarathy.bhuvaragan@ericsson.com,
David Miller, netdev, tipc-discussion@lists.sourceforge.net, LKML
In-Reply-To: <DM5PR15MB15134EF6AA09E8268FA58D519AE10@DM5PR15MB1513.namprd15.prod.outlook.com>
Jon, please help to review the patch:
https://patchwork.ozlabs.org/patch/982447.
Thanks,
Ying
On 10/11/2018 06:55 PM, Jon Maloy wrote:
> Hi Dmitry,
> Yes, we are aware of this, the kernel test robot warned us about this a few days ago.
> I am looking into it.
>
> ///jon
^ permalink raw reply
* Re: net/tipc: recursive locking in tipc_link_reset
From: Ying Xue @ 2018-10-11 12:03 UTC (permalink / raw)
To: Dmitry Vyukov, Jon Maloy, David Miller, netdev, tipc-discussion,
LKML
In-Reply-To: <CACT4Y+Ym3_NuqWrgHXWfTahWTm17xi_2gjy8hvuDeyUCPuYXUQ@mail.gmail.com>
On 10/11/2018 03:59 PM, Dmitry Vyukov wrote:
> On Thu, Oct 11, 2018 at 9:55 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> Hi,
>>
>> I am getting the following error while booting the latest kernel on
>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>
>> Since this happens during boot, this makes LOCKDEP completely
>> unusable, does not allow to discover any other locking issues and
>> masks all new bugs being introduced into kernel.
>> Please fix asap.
>> Thanks
> -parthasarathy.bhuvaragan address as it gives me bounces
> but this is highly likely due to:
>
> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
> Author: Parthasarathy Bhuvaragan
> Date: Tue Sep 25 22:09:10 2018 +0200
>
> tipc: lock wakeup & inputq at tipc_link_reset()
>
>
Dmitry, I agree with you. The complaint should be caused by the commit
above. Please try to verify the patch:
https://patchwork.ozlabs.org/patch/982447.
Thanks,
Ying
^ permalink raw reply
* Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Samuel Mendoza-Jonas @ 2018-10-11 4:25 UTC (permalink / raw)
To: Justin.Lee1, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <d97bb52a060a40cf8e03a49eb82a285f@AUSX13MPS302.AMER.DELL.COM>
On Wed, 2018-10-10 at 22:36 +0000, Justin.Lee1@Dell.com wrote:
> Hi Samuel,
>
> I am still testing your change and have some comments below.
>
> Thanks,
> Justin
>
>
> > This patch extends the ncsi-netlink interface with two new commands and
> > three new attributes to configure multiple packages and/or channels at
> > once, and configure specific failover modes.
> >
> > NCSI_CMD_SET_PACKAGE mask and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > of packages or channels allowed to be configured with the
> > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > respectively. If one of these whitelists is set only packages or
> > channels matching the whitelist are considered for the channel queue in
> > ncsi_choose_active_channel().
> >
> > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > multiple packages or channels may be configured simultaneously. NCSI
> > hardware arbitration (HWA) must be available in order to enable
> > multi-package mode. Multi-channel mode is always available.
> >
> > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > NCSI_CMD_SET_CHANNEL_MASK command the it sets the preferred channel as
> > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > channel and channel whitelist defines a primary channel and the allowed
> > failover channels.
> > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > channel is configured for Tx/Rx and the other channels are enabled only
> > for Rx.
> >
> > Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> > ---
> > include/uapi/linux/ncsi.h | 16 +++
> > net/ncsi/internal.h | 11 +-
> > net/ncsi/ncsi-aen.c | 2 +-
> > net/ncsi/ncsi-manage.c | 138 ++++++++++++++++--------
> > net/ncsi/ncsi-netlink.c | 217 +++++++++++++++++++++++++++++++++-----
> > net/ncsi/ncsi-rsp.c | 2 +-
> > 6 files changed, 312 insertions(+), 74 deletions(-)
> >
> > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > index 4c292ecbb748..035fba1693f9 100644
> > --- a/include/uapi/linux/ncsi.h
> > +++ b/include/uapi/linux/ncsi.h
> > @@ -23,6 +23,13 @@
> > * optionally the preferred NCSI_ATTR_CHANNEL_ID.
> > * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
> > * Requires NCSI_ATTR_IFINDEX.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + * Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + * Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > + * NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > + * the primary channel.
> > * @NCSI_CMD_MAX: highest command number
> > */
>
> There are some typo in the description.
> * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> * Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
> * Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> * NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> * the primary channel.
Haha, yes I threw that in at the end, thanks for catching it.
>
> > enum ncsi_nl_commands {
> > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> > NCSI_CMD_PKG_INFO,
> > NCSI_CMD_SET_INTERFACE,
> > NCSI_CMD_CLEAR_INTERFACE,
> > + NCSI_CMD_SET_PACKAGE_MASK,
> > + NCSI_CMD_SET_CHANNEL_MASK,
> >
> > __NCSI_CMD_AFTER_LAST,
> > NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> > * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> > * @NCSI_ATTR_PACKAGE_ID: package ID
> > * @NCSI_ATTR_CHANNEL_ID: channel ID
> > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled with
> > + * NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> > * @NCSI_ATTR_MAX: highest attribute number
> > */
> > enum ncsi_nl_attrs {
> > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> > NCSI_ATTR_PACKAGE_LIST,
> > NCSI_ATTR_PACKAGE_ID,
> > NCSI_ATTR_CHANNEL_ID,
> > + NCSI_ATTR_MULTI_FLAG,
> > + NCSI_ATTR_PACKAGE_MASK,
> > + NCSI_ATTR_CHANNEL_MASK,
>
> Is there a case that we might set these two masks at the same time?
> If not, maybe we can just have one generic MASK attribute.
>
I thought of this too: not yet, but I wonder if we might in the future.
I'll have a think about it.
> >
> > __NCSI_ATTR_AFTER_LAST,
> > NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > index 3d0a33b874f5..8437474d0a78 100644
> > --- a/net/ncsi/internal.h
> > +++ b/net/ncsi/internal.h
> > @@ -213,6 +213,10 @@ struct ncsi_package {
> > unsigned int channel_num; /* Number of channels */
> > struct list_head channels; /* List of chanels */
> > struct list_head node; /* Form list of packages */
> > +
> > + bool multi_channel; /* Enable multiple channels */
> > + u32 channel_whitelist; /* Channels to configure */
> > + struct ncsi_channel *preferred_channel; /* Primary channel */
> > };
> >
> > struct ncsi_request {
> > @@ -280,8 +284,6 @@ struct ncsi_dev_priv {
> > unsigned int package_num; /* Number of packages */
> > struct list_head packages; /* List of packages */
> > struct ncsi_channel *hot_channel; /* Channel was ever active */
> > - struct ncsi_package *force_package; /* Force a specific package */
> > - struct ncsi_channel *force_channel; /* Force a specific channel */
> > struct ncsi_request requests[256]; /* Request table */
> > unsigned int request_id; /* Last used request ID */
> > #define NCSI_REQ_START_IDX 1
> > @@ -294,6 +296,9 @@ struct ncsi_dev_priv {
> > struct list_head node; /* Form NCSI device list */
> > #define NCSI_MAX_VLAN_VIDS 15
> > struct list_head vlan_vids; /* List of active VLAN IDs */
> > +
> > + bool multi_package; /* Enable multiple packages */
> > + u32 package_whitelist; /* Packages to configure */
> > };
> >
> > struct ncsi_cmd_arg {
> > @@ -345,6 +350,8 @@ struct ncsi_request *ncsi_alloc_request(struct ncsi_dev_priv *ndp,
> > void ncsi_free_request(struct ncsi_request *nr);
> > struct ncsi_dev *ncsi_find_dev(struct net_device *dev);
> > int ncsi_process_next_channel(struct ncsi_dev_priv *ndp);
> > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > + struct ncsi_channel *channel);
> >
> > /* Packet handlers */
> > u32 ncsi_calculate_checksum(unsigned char *data, int len);
> > diff --git a/net/ncsi/ncsi-aen.c b/net/ncsi/ncsi-aen.c
> > index 65f47a648be3..eac56aee30c4 100644
> > --- a/net/ncsi/ncsi-aen.c
> > +++ b/net/ncsi/ncsi-aen.c
> > @@ -86,7 +86,7 @@ static int ncsi_aen_handler_lsc(struct ncsi_dev_priv *ndp,
> > !(state == NCSI_CHANNEL_ACTIVE && !(data & 0x1)))
> > return 0;
> >
> > - if (state == NCSI_CHANNEL_ACTIVE)
> > + if (state == NCSI_CHANNEL_ACTIVE && ncsi_channel_is_last(ndp, nc))
> > ndp->flags |= NCSI_DEV_RESHUFFLE;
> >
> > ncsi_stop_channel_monitor(nc);
> > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > index 665bee25ec44..6a55df700bcb 100644
> > --- a/net/ncsi/ncsi-manage.c
> > +++ b/net/ncsi/ncsi-manage.c
> > @@ -27,6 +27,24 @@
> > LIST_HEAD(ncsi_dev_list);
> > DEFINE_SPINLOCK(ncsi_dev_lock);
> >
> > +/* Returns true if the given channel is the last channel available */
> > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > + struct ncsi_channel *channel)
> > +{
> > + struct ncsi_package *np;
> > + struct ncsi_channel *nc;
> > +
> > + NCSI_FOR_EACH_PACKAGE(ndp, np)
> > + NCSI_FOR_EACH_CHANNEL(np, nc) {
> > + if (nc == channel)
> > + continue;
> > + if (nc->state == NCSI_CHANNEL_ACTIVE)
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > +
> > static void ncsi_report_link(struct ncsi_dev_priv *ndp, bool force_down)
> > {
> > struct ncsi_dev *nd = &ndp->ndev;
> > @@ -266,6 +284,7 @@ struct ncsi_package *ncsi_add_package(struct ncsi_dev_priv *ndp,
> > np->ndp = ndp;
> > spin_lock_init(&np->lock);
> > INIT_LIST_HEAD(&np->channels);
> > + np->channel_whitelist = UINT_MAX;
> >
> > spin_lock_irqsave(&ndp->lock, flags);
> > tmp = ncsi_find_package(ndp, id);
> > @@ -633,6 +652,34 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
> > return 0;
> > }
> >
> > +/* Determine if a given channel should be the Tx channel */
> > +bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc)
> > +{
> > + struct ncsi_package *np = nc->package;
> > + struct ncsi_channel *channel;
> > + struct ncsi_channel_mode *ncm;
> > +
> > + NCSI_FOR_EACH_CHANNEL(np, channel) {
> > + ncm = &channel->modes[NCSI_MODE_TX_ENABLE];
> > + /* Another channel is already Tx */
> > + if (ncm->enable)
> > + return false;
> > + }
> > +
> > + if (!np->preferred_channel)
> > + return true;
> > +
> > + if (np->preferred_channel == nc)
> > + return true;
> > +
> > + /* The preferred channel is not in the queue and not active */
> > + if (list_empty(&np->preferred_channel->link) &&
> > + np->preferred_channel->state != NCSI_CHANNEL_ACTIVE)
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > {
> > struct ncsi_dev *nd = &ndp->ndev;
> > @@ -745,18 +792,22 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > } else if (nd->state == ncsi_dev_state_config_ebf) {
> > nca.type = NCSI_PKT_CMD_EBF;
> > nca.dwords[0] = nc->caps[NCSI_CAP_BC].cap;
> > - nd->state = ncsi_dev_state_config_ecnt;
> > + if (ncsi_channel_is_tx(ndp, nc))
> > + nd->state = ncsi_dev_state_config_ecnt;
> > + else
> > + nd->state = ncsi_dev_state_config_ec;
> > #if IS_ENABLED(CONFIG_IPV6)
> > if (ndp->inet6_addr_num > 0 &&
> > (nc->caps[NCSI_CAP_GENERIC].cap &
> > NCSI_CAP_GENERIC_MC))
> > nd->state = ncsi_dev_state_config_egmf;
> > - else
> > - nd->state = ncsi_dev_state_config_ecnt;
> > } else if (nd->state == ncsi_dev_state_config_egmf) {
> > nca.type = NCSI_PKT_CMD_EGMF;
> > nca.dwords[0] = nc->caps[NCSI_CAP_MC].cap;
> > - nd->state = ncsi_dev_state_config_ecnt;
> > + if (ncsi_channel_is_tx(ndp, nc))
> > + nd->state = ncsi_dev_state_config_ecnt;
> > + else
> > + nd->state = ncsi_dev_state_config_ec;
> > #endif /* CONFIG_IPV6 */
> > } else if (nd->state == ncsi_dev_state_config_ecnt) {
> > nca.type = NCSI_PKT_CMD_ECNT;
> > @@ -840,43 +891,35 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> >
> > static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > {
> > - struct ncsi_package *np, *force_package;
> > - struct ncsi_channel *nc, *found, *hot_nc, *force_channel;
> > + struct ncsi_package *np;
> > + struct ncsi_channel *nc, *found, *hot_nc;
> > struct ncsi_channel_mode *ncm;
> > - unsigned long flags;
> > + unsigned long flags, cflags;
> > + bool with_link;
> >
> > spin_lock_irqsave(&ndp->lock, flags);
> > hot_nc = ndp->hot_channel;
> > - force_channel = ndp->force_channel;
> > - force_package = ndp->force_package;
> > spin_unlock_irqrestore(&ndp->lock, flags);
> >
> > - /* Force a specific channel whether or not it has link if we have been
> > - * configured to do so
> > - */
> > - if (force_package && force_channel) {
> > - found = force_channel;
> > - ncm = &found->modes[NCSI_MODE_LINK];
> > - if (!(ncm->data[2] & 0x1))
> > - netdev_info(ndp->ndev.dev,
> > - "NCSI: Channel %u forced, but it is link down\n",
> > - found->id);
> > - goto out;
> > - }
> > -
> > - /* The search is done once an inactive channel with up
> > - * link is found.
> > + /* By default the search is done once an inactive channel with up
> > + * link is found, unless a preferred channel is set.
> > + * If multi_package or multi_channel are configured all channels in the
> > + * whitelist with link are added to the channel queue.
> > */
> > found = NULL;
> > + with_link = false;
> > NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > - if (ndp->force_package && np != ndp->force_package)
> > + if (!(ndp->package_whitelist & (0x1 << np->id)))
> > continue;
> > NCSI_FOR_EACH_CHANNEL(np, nc) {
> > - spin_lock_irqsave(&nc->lock, flags);
> > + if (!(np->channel_whitelist & (0x1 << nc->id)))
> > + continue;
> > +
> > + spin_lock_irqsave(&nc->lock, cflags);
> >
> > if (!list_empty(&nc->link) ||
> > nc->state != NCSI_CHANNEL_INACTIVE) {
> > - spin_unlock_irqrestore(&nc->lock, flags);
> > + spin_unlock_irqrestore(&nc->lock, cflags);
> > continue;
> > }
> >
> > @@ -888,32 +931,42 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> >
> > ncm = &nc->modes[NCSI_MODE_LINK];
> > if (ncm->data[2] & 0x1) {
>
> This data will not be updated if the channel monitor for it is not running.
> If I move the cable from the current configured channel to the other channel,
> NC-SI module will not detect the link status as the other channel is not configured
> and AEN will not happen.
> Is it per design that NC-SI module will always use the first interface with the link?
We'll check the link state of every channel at init, and per-package on
suspend events, but you're right that we only get AENs right now from
configured channels. There's probably no reason why we could at least
enable AENs on all channels even if we don't use them for network, maybe
during the package probe step.
>
> > - spin_unlock_irqrestore(&nc->lock, flags);
> > found = nc;
> > - goto out;
> > + with_link = true;
> > +
> > + spin_lock_irqsave(&ndp->lock, flags);
> > + list_add_tail_rcu(&found->link,
> > + &ndp->channel_queue);
> > + spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > + netdev_dbg(ndp->ndev.dev,
> > + "NCSI: Channel %u added to queue (link %s)\n",
> > + found->id,
> > + ncm->data[2] & 0x1 ? "up" : "down");
> > }
> > + spin_unlock_irqrestore(&nc->lock, cflags);
> >
> > - spin_unlock_irqrestore(&nc->lock, flags);
> > + if (with_link && !np->multi_channel)
> > + break;
> > }
> > + if (with_link && !ndp->multi_package)
> > + break;
> > }
> >
> > - if (!found) {
> > + if (!with_link && found) {
> > + netdev_info(ndp->ndev.dev,
> > + "NCSI: No channel with link found, configuring channel %u\n",
> > + found->id);
> > + spin_lock_irqsave(&ndp->lock, flags);
> > + list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > + spin_unlock_irqrestore(&ndp->lock, flags);
> > + } else if (!found) {
> > netdev_warn(ndp->ndev.dev,
> > - "NCSI: No channel found with link\n");
> > + "NCSI: No channel found to configure!\n");
> > ncsi_report_link(ndp, true);
> > return -ENODEV;
> > }
> >
> > - ncm = &found->modes[NCSI_MODE_LINK];
> > - netdev_dbg(ndp->ndev.dev,
> > - "NCSI: Channel %u added to queue (link %s)\n",
> > - found->id, ncm->data[2] & 0x1 ? "up" : "down");
> > -
> > -out:
> > - spin_lock_irqsave(&ndp->lock, flags);
> > - list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > - spin_unlock_irqrestore(&ndp->lock, flags);
> > -
> > return ncsi_process_next_channel(ndp);
> > }
> >
> > @@ -1428,6 +1481,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
> > INIT_LIST_HEAD(&ndp->channel_queue);
> > INIT_LIST_HEAD(&ndp->vlan_vids);
> > INIT_WORK(&ndp->work, ncsi_dev_work);
> > + ndp->package_whitelist = UINT_MAX;
> >
> > /* Initialize private NCSI device */
> > spin_lock_init(&ndp->lock);
> > diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
> > index 32cb7751d216..33a091e6f466 100644
> > --- a/net/ncsi/ncsi-netlink.c
> > +++ b/net/ncsi/ncsi-netlink.c
>
> Is the following missed in the patch?
> static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
> ...
> [NCSI_ATTR_MULTI_FLAG] = { .type = NLA_FLAG },
> [NCSI_ATTR_PACKAGE_MASK] = { .type = NLA_U32 },
> [NCSI_ATTR_CHANNEL_MASK] = { .type = NLA_U32 },
Oops, added.
>
> > @@ -67,7 +67,7 @@ static int ncsi_write_channel_info(struct sk_buff *skb,
> > nla_put_u32(skb, NCSI_CHANNEL_ATTR_LINK_STATE, m->data[2]);
> > if (nc->state == NCSI_CHANNEL_ACTIVE)
> > nla_put_flag(skb, NCSI_CHANNEL_ATTR_ACTIVE);
> > - if (ndp->force_channel == nc)
> > + if (nc == nc->package->preferred_channel)
> > nla_put_flag(skb, NCSI_CHANNEL_ATTR_FORCED);
> >
> > nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MAJOR, nc->version.version);
> > @@ -112,7 +112,7 @@ static int ncsi_write_package_info(struct sk_buff *skb,
> > if (!pnest)
> > return -ENOMEM;
> > nla_put_u32(skb, NCSI_PKG_ATTR_ID, np->id);
> > - if (ndp->force_package == np)
> > + if ((0x1 << np->id) == ndp->package_whitelist)
> > nla_put_flag(skb, NCSI_PKG_ATTR_FORCED);
> > cnest = nla_nest_start(skb, NCSI_PKG_ATTR_CHANNEL_LIST);
> > if (!cnest) {
> > @@ -288,45 +288,54 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > package = NULL;
> >
> > - spin_lock_irqsave(&ndp->lock, flags);
> > -
> > NCSI_FOR_EACH_PACKAGE(ndp, np)
> > if (np->id == package_id)
> > package = np;
> > if (!package) {
> > /* The user has set a package that does not exist */
> > - spin_unlock_irqrestore(&ndp->lock, flags);
> > return -ERANGE;
> > }
> >
> > channel = NULL;
> > - if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > - /* Allow any channel */
> > - channel_id = NCSI_RESERVED_CHANNEL;
> > - } else {
> > + if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > NCSI_FOR_EACH_CHANNEL(package, nc)
> > - if (nc->id == channel_id)
> > + if (nc->id == channel_id) {
> > channel = nc;
> > + break;
> > + }
> > + if (!channel) {
> > + netdev_info(ndp->ndev.dev,
> > + "NCSI: Channel %u does not exist!\n",
> > + channel_id);
> > + return -ERANGE;
> > + }
> > }
> >
> > - if (channel_id != NCSI_RESERVED_CHANNEL && !channel) {
> > - /* The user has set a channel that does not exist on this
> > - * package
> > - */
> > - spin_unlock_irqrestore(&ndp->lock, flags);
> > - netdev_info(ndp->ndev.dev, "NCSI: Channel %u does not exist!\n",
> > - channel_id);
> > - return -ERANGE;
> > - }
> > -
> > - ndp->force_package = package;
> > - ndp->force_channel = channel;
> > + spin_lock_irqsave(&ndp->lock, flags);
> > + ndp->package_whitelist = 0x1 << package->id;
> > + ndp->multi_package = false;
> > spin_unlock_irqrestore(&ndp->lock, flags);
> >
> > - netdev_info(ndp->ndev.dev, "Set package 0x%x, channel 0x%x%s as preferred\n",
> > - package_id, channel_id,
> > - channel_id == NCSI_RESERVED_CHANNEL ? " (any)" : "");
> > + spin_lock_irqsave(&package->lock, flags);
> > + package->multi_channel = false;
> > + if (channel) {
> > + package->channel_whitelist = 0x1 << channel->id;
> > + package->preferred_channel = channel;
> > + } else {
> > + /* Allow any channel */
> > + package->channel_whitelist = UINT_MAX;
> > + package->preferred_channel = NULL;
> > + }
> > + spin_unlock_irqrestore(&package->lock, flags);
> > +
> > + if (channel)
> > + netdev_info(ndp->ndev.dev,
> > + "Set package 0x%x, channel 0x%x as preferred\n",
> > + package_id, channel_id);
> > + else
> > + netdev_info(ndp->ndev.dev, "Set package 0x%x as preferred\n",
> > + package_id);
> >
> > /* Bounce the NCSI channel to set changes */
> > ncsi_stop_dev(&ndp->ndev);
> > @@ -338,6 +347,7 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > {
> > struct ncsi_dev_priv *ndp;
> > + struct ncsi_package *np;
> > unsigned long flags;
> >
> > if (!info || !info->attrs)
> > @@ -351,11 +361,19 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > if (!ndp)
> > return -ENODEV;
> >
> > - /* Clear any override */
> > + /* Reset any whitelists and disable multi mode */
> > spin_lock_irqsave(&ndp->lock, flags);
> > - ndp->force_package = NULL;
> > - ndp->force_channel = NULL;
> > + ndp->package_whitelist = UINT_MAX;
> > + ndp->multi_package = false;
> > spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > + NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > + spin_lock_irqsave(&np->lock, flags);
> > + np->multi_channel = false;
> > + np->channel_whitelist = UINT_MAX;
> > + np->preferred_channel = NULL;
> > + spin_unlock_irqrestore(&np->lock, flags);
> > + }
> > netdev_info(ndp->ndev.dev, "NCSI: Cleared preferred package/channel\n");
> >
> > /* Bounce the NCSI channel to set changes */
> > @@ -365,6 +383,137 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > return 0;
> > }
> >
> > +static int ncsi_set_package_mask_nl(struct sk_buff *msg,
> > + struct genl_info *info)
> > +{
> > + struct ncsi_dev_priv *ndp;
> > + unsigned long flags;
> > + int rc;
> > +
> > + if (!info || !info->attrs)
> > + return -EINVAL;
> > +
> > + if (!info->attrs[NCSI_ATTR_IFINDEX])
> > + return -EINVAL;
> > +
> > + if (!info->attrs[NCSI_ATTR_PACKAGE_MASK])
> > + return -EINVAL;
> > +
> > + ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > + nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > + if (!ndp)
> > + return -ENODEV;
> > +
> > + spin_lock_irqsave(&ndp->lock, flags);
> > + ndp->package_whitelist =
> > + nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_MASK]);
> > +
> > + if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > + if (ndp->flags & NCSI_DEV_HWA) {
> > + ndp->multi_package = true;
> > + rc = 0;
> > + } else {
> > + netdev_err(ndp->ndev.dev,
> > + "NCSI: Can't use multiple packages without HWA\n");
> > + rc = -EPERM;
> > + }
> > + } else {
> > + rc = 0;
> > + }
> > +
> > + spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > + if (!rc) {
> > + /* Bounce the NCSI channel to set changes */
> > + ncsi_stop_dev(&ndp->ndev);
> > + ncsi_start_dev(&ndp->ndev);
>
> Is it possible to delay the restart? If we have two packages, we might send
> set_package_mask command once and set_channel_mask command twice.
> We will see the unnecessary reconfigurations in a very short period time.
We could probably do with a better way of bouncing the link here too. I
suppose we could schedule the actual link 'bounce' to occur a second or
two later, and delay if we receive new configuration in that time.
Perhaps something outside of the scope of this patch though.
>
> > + }
> > +
> > + return rc;
> > +}
> > +
> > +static int ncsi_set_channel_mask_nl(struct sk_buff *msg,
> > + struct genl_info *info)
> > +{
> > + struct ncsi_package *np, *package;
> > + struct ncsi_channel *nc, *channel;
> > + struct ncsi_dev_priv *ndp;
> > + unsigned long flags;
> > + u32 package_id, channel_id;
> > +
> > + if (!info || !info->attrs)
> > + return -EINVAL;
> > +
> > + if (!info->attrs[NCSI_ATTR_IFINDEX])
> > + return -EINVAL;
> > +
> > + if (!info->attrs[NCSI_ATTR_PACKAGE_ID])
> > + return -EINVAL;
> > +
> > + if (!info->attrs[NCSI_ATTR_CHANNEL_MASK])
> > + return -EINVAL;
> > +
> > + ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > + nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > + if (!ndp)
> > + return -ENODEV;
> > +
> > + package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > + package = NULL;
> > + NCSI_FOR_EACH_PACKAGE(ndp, np)
> > + if (np->id == package_id) {
> > + package = np;
> > + break;
> > + }
> > + if (!package)
> > + return -ERANGE;
> > +
> > + spin_lock_irqsave(&package->lock, flags);
> > +
> > + channel = NULL;
> > + if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > + channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > + NCSI_FOR_EACH_CHANNEL(np, nc)
> > + if (nc->id == channel_id) {
> > + channel = nc;
> > + break;
> > + }
> > + if (!channel) {
> > + spin_unlock_irqrestore(&package->lock, flags);
> > + return -ERANGE;
> > + }
> > + netdev_dbg(ndp->ndev.dev,
> > + "NCSI: Channel %u set as preferred channel\n",
> > + channel->id);
> > + }
> > +
> > + package->channel_whitelist =
> > + nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_MASK]);
> > + if (package->channel_whitelist == 0)
> > + netdev_dbg(ndp->ndev.dev,
> > + "NCSI: Package %u set to all channels disabled\n",
> > + package->id);
> > +
> > + package->preferred_channel = channel;
> > +
> > + if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > + package->multi_channel = true;
> > + netdev_info(ndp->ndev.dev,
> > + "NCSI: Multi-channel enabled on package %u\n",
> > + package_id);
> > + } else {
> > + package->multi_channel = false;
> > + }
> > +
> > + spin_unlock_irqrestore(&package->lock, flags);
> > +
> > + /* Bounce the NCSI channel to set changes */
> > + ncsi_stop_dev(&ndp->ndev);
> > + ncsi_start_dev(&ndp->ndev);
>
> Same question as set_package_mask function.
> Is it possible to delay the restart? If we have two packages, we might send
> set_package_mask command once and set_channel_mask command twice.
> We will see the unnecessary reconfigurations in a very short period time.
>
> > +
> > + return 0;
> > +}
> > +
> > static const struct genl_ops ncsi_ops[] = {
> > {
> > .cmd = NCSI_CMD_PKG_INFO,
> > @@ -385,6 +534,18 @@ static const struct genl_ops ncsi_ops[] = {
> > .doit = ncsi_clear_interface_nl,
> > .flags = GENL_ADMIN_PERM,
> > },
> > + {
> > + .cmd = NCSI_CMD_SET_PACKAGE_MASK,
> > + .policy = ncsi_genl_policy,
> > + .doit = ncsi_set_package_mask_nl,
> > + .flags = GENL_ADMIN_PERM,
> > + },
> > + {
> > + .cmd = NCSI_CMD_SET_CHANNEL_MASK,
> > + .policy = ncsi_genl_policy,
> > + .doit = ncsi_set_channel_mask_nl,
> > + .flags = GENL_ADMIN_PERM,
> > + },
> > };
> >
> > static struct genl_family ncsi_genl_family __ro_after_init = {
> > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > index d66b34749027..02ce7626b579 100644
> > --- a/net/ncsi/ncsi-rsp.c
> > +++ b/net/ncsi/ncsi-rsp.c
> > @@ -241,7 +241,7 @@ static int ncsi_rsp_handler_dcnt(struct ncsi_request *nr)
> > if (!ncm->enable)
> > return 0;
> >
> > - ncm->enable = 1;
> > + ncm->enable = 0;
> > return 0;
> > }
> >
> > --
> > 2.19.0
>
>
^ permalink raw reply
* [PATCH net] tipc: eliminate possible recursive locking detected by LOCKDEP
From: Ying Xue @ 2018-10-11 11:57 UTC (permalink / raw)
To: jon.maloy, dvyukov
Cc: netdev, tipc-discussion, davem, linux-kernel,
parthasarathy.bhuvaragan
When booting kernel with LOCKDEP option, below warning info was found:
WARNING: possible recursive locking detected
4.19.0-rc7+ #14 Not tainted
--------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
but task is already holding lock:
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&list->lock)->rlock#4);
lock(&(&list->lock)->rlock#4);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by swapper/0/1:
#0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
#1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
spin_lock_bh include/linux/spinlock.h:334 [inline]
#1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1af/0x295 lib/dump_stack.c:113
print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
check_deadlock kernel/locking/lockdep.c:1803 [inline]
validate_chain kernel/locking/lockdep.c:2399 [inline]
__lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
spin_lock_bh include/linux/spinlock.h:334 [inline]
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
tipc_init_net+0x472/0x610 net/tipc/core.c:82
ops_init+0xf7/0x520 net/core/net_namespace.c:129
__register_pernet_operations net/core/net_namespace.c:940 [inline]
register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
tipc_init+0x83/0x104 net/tipc/core.c:140
do_one_initcall+0x109/0x70a init/main.c:885
do_initcall_level init/main.c:953 [inline]
do_initcalls init/main.c:961 [inline]
do_basic_setup init/main.c:979 [inline]
kernel_init_freeable+0x4bd/0x57f init/main.c:1144
kernel_init+0x13/0x180 init/main.c:1063
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
The reason why the noise above was complained by LOCKDEP is because we
nested to hold l->wakeupq.lock and l->inputq->lock in tipc_link_reset
function. In fact it's unnecessary to move skb buffer from l->wakeupq
queue to l->inputq queue while holding the two locks at the same time.
Instead, we can move skb buffers in l->wakeupq queue to a temporary
list first and then move the buffers of the temporary list to l->inputq
queue, which is also safe for us.
Fixes: 3f32d0be6c16 ("tipc: lock wakeup & inputq at tipc_link_reset()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
net/tipc/link.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index fb886b5..1d21ae4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -843,14 +843,21 @@ static void link_prepare_wakeup(struct tipc_link *l)
void tipc_link_reset(struct tipc_link *l)
{
+ struct sk_buff_head list;
+
+ __skb_queue_head_init(&list);
+
l->in_session = false;
l->session++;
l->mtu = l->advertised_mtu;
+
spin_lock_bh(&l->wakeupq.lock);
+ skb_queue_splice_init(&l->wakeupq, &list);
+ spin_unlock_bh(&l->wakeupq.lock);
+
spin_lock_bh(&l->inputq->lock);
- skb_queue_splice_init(&l->wakeupq, l->inputq);
+ skb_queue_splice_init(&list, l->inputq);
spin_unlock_bh(&l->inputq->lock);
- spin_unlock_bh(&l->wakeupq.lock);
__skb_queue_purge(&l->transmq);
__skb_queue_purge(&l->deferdq);
--
2.7.4
^ permalink raw reply related
* Re: [PATCH net-next v5] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Samuel Mendoza-Jonas @ 2018-10-11 3:55 UTC (permalink / raw)
To: Justin.Lee1, joel
Cc: linux-aspeed, netdev, openbmc, amithash, christian, vijaykhemka
In-Reply-To: <245b32d9a3dd463f9334b9704b639ec4@AUSX13MPS302.AMER.DELL.COM>
On Wed, 2018-10-10 at 18:11 +0000, Justin.Lee1@Dell.com wrote:
<snip>
> +
> + len = nla_len(info->attrs[NCSI_ATTR_DATA]);
> + if (len < sizeof(struct ncsi_pkt_hdr)) {
> + netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
> + package_id);
> + ret = -EINVAL;
> + goto out_netlink;
> + } else {
> + data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
> + }
I only just noticed this, the call to nla_len() can cause a null-dereference if
the NCSI_ATTR_DATA attribute isn't present; we need to make sure it exists
before accessing it in info->attrs.
eg:
root@ozrom2-bmc:~# ./ncsi-netlink -l 2 -p 0 -c 0 --cmd
[ 81.399837] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 81.409092] pgd = ddaa9fa6
[ 81.413084] [00000000] *pgd=9702c831, *pte=00000000, *ppte=00000000
[ 81.420729] Internal error: Oops: 17 [#1] ARM
[ 81.426447] CPU: 0 PID: 1028 Comm: ncsi-netlink Not tainted 4.18.8-sammj-00144-gbc129f31bfa5 #12
...
[ 81.874434] Kernel panic - not syncing: Fatal exception
Cheers,
Sam
^ permalink raw reply
* RE: net/tipc: recursive locking in tipc_link_reset
From: Jon Maloy @ 2018-10-11 10:55 UTC (permalink / raw)
To: Dmitry Vyukov, parthasarathy.bhuvaragan@ericsson.com,
David Miller, Ying Xue, netdev,
tipc-discussion@lists.sourceforge.net, LKML
Cc: Parthasarathy Bhuvaragan
In-Reply-To: <CACT4Y+bcxEz=P2E5TRj6A1qeCOQ+WtWR44WGLzqdDttLzD393A@mail.gmail.com>
Hi Dmitry,
Yes, we are aware of this, the kernel test robot warned us about this a few days ago.
I am looking into it.
///jon
> -----Original Message-----
> From: Dmitry Vyukov <dvyukov@google.com>
> Sent: October 11, 2018 3:55 AM
> To: parthasarathy.bhuvaragan@ericsson.com; Jon Maloy
> <jon.maloy@ericsson.com>; David Miller <davem@davemloft.net>; Ying Xue
> <ying.xue@windriver.com>; netdev <netdev@vger.kernel.org>; tipc-
> discussion@lists.sourceforge.net; LKML <linux-kernel@vger.kernel.org
> Subject: net/tipc: recursive locking in tipc_link_reset
>
> Hi,
>
> I am getting the following error while booting the latest kernel on
> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>
> Since this happens during boot, this makes LOCKDEP completely unusable,
> does not allow to discover any other locking issues and masks all new bugs
> being introduced into kernel.
> Please fix asap.
> Thanks
>
>
> WARNING: possible recursive locking detected 4.19.0-rc7+ #14 Not tainted
> --------------------------------------------
> swapper/0/1 is trying to acquire lock:
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
>
> but task is already holding lock:
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&(&list->lock)->rlock#4);
> lock(&(&list->lock)->rlock#4);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 2 locks held by swapper/0/1:
> #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
> register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> spin_lock_bh include/linux/spinlock.h:334 [inline]
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14 Hardware name:
> QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1af/0x295 lib/dump_stack.c:113 print_deadlock_bug
> kernel/locking/lockdep.c:1759 [inline] check_deadlock
> kernel/locking/lockdep.c:1803 [inline] validate_chain
> kernel/locking/lockdep.c:2399 [inline]
> __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
> lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
> __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
> _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168 spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
> tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
> tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
> tipc_init_net+0x472/0x610 net/tipc/core.c:82
> ops_init+0xf7/0x520 net/core/net_namespace.c:129
> __register_pernet_operations net/core/net_namespace.c:940 [inline]
> register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
> register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
> tipc_init+0x83/0x104 net/tipc/core.c:140 do_one_initcall+0x109/0x70a
> init/main.c:885 do_initcall_level init/main.c:953 [inline] do_initcalls
> init/main.c:961 [inline] do_basic_setup init/main.c:979 [inline]
> kernel_init_freeable+0x4bd/0x57f init/main.c:1144
> kernel_init+0x13/0x180 init/main.c:1063
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
^ permalink raw reply
* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: ake @ 2018-10-11 10:22 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
linux-kernel
In-Reply-To: <dd530053-efab-b661-3423-1faaab8fb12f@redhat.com>
On 2018年10月11日 18:44, Jason Wang wrote:
>
>
> On 2018年10月11日 15:51, Ake Koomsin wrote:
>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>> disabled the virtio tx before going to suspend to avoid a use after free.
>> However, after resuming, it causes the virtio_net device to lose its
>> network connectivity.
>>
>> To solve the issue, we need to enable tx after resuming.
>>
>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during
>> reset")
>> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
>> ---
>> drivers/net/virtio_net.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index dab504ec5e50..3453d80f5f81 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct
>> virtio_device *vdev)
>> }
>> netif_device_attach(vi->dev);
>> + netif_start_queue(vi->dev);
>
> I believe this is duplicated with netif_tx_wake_all_queues() in
> netif_device_attach() above?
Thank you for your review.
If both netif_tx_wake_all_queues() and netif_start_queue() result in
clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
conditions in netif_device_attach() is not satisfied? Without
netif_start_queue(), the virtio_net device does not resume properly
after waking up.
Is it better to report this as a bug first? If I am to do more
investigation, what areas should I look into?
Best Regards
Ake Koomsin
^ permalink raw reply
* [PATCH] iwlwifi: fix spelling mistake "registrating" -> "registering"
From: Colin King @ 2018-10-11 9:57 UTC (permalink / raw)
To: Johannes Berg, Emmanuel Grumbach, Luca Coelho,
Intel Linux Wireless, Kalle Valo, David S . Miller,
linux-wireless, netdev
Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
Trivial fix to spelling mistake in IWL_ERR error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
drivers/net/wireless/intel/iwlwifi/dvm/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/main.c b/drivers/net/wireless/intel/iwlwifi/dvm/main.c
index 1088ff036e13..a5041d92babc 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/main.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/main.c
@@ -1054,7 +1054,7 @@ static void iwl_bg_restart(struct work_struct *data)
ieee80211_restart_hw(priv->hw);
else
IWL_ERR(priv,
- "Cannot request restart before registrating with mac80211\n");
+ "Cannot request restart before registering with mac80211\n");
} else {
WARN_ON(1);
}
--
2.17.1
^ permalink raw reply related
* Re: [PATCH][next] ath10k: fix out of bound read on array ath10k_rates
From: Kalle Valo @ 2018-10-11 9:49 UTC (permalink / raw)
To: Colin King
Cc: linux-wireless, Sriram R, netdev, kernel-janitors, linux-kernel,
ath10k, David S . Miller
In-Reply-To: <20181005215609.13935-1-colin.king@canonical.com>
Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> An out-of-bounds read on array ath10k_rates is occurring because
> the maximum number of elements is currently based on the size of
> the array and not the number of elements in the array. Fix this
> by using ARRAY_SIZE instead of sizeof.
>
> Detected by CoverityScan, CID#1473918 ("Out-of-bounds read")
>
> Fixes: f279294e9ee2 ("ath10k: add support for configuring management packet rate")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
This is already fixed in ath-next.
error: patch failed: drivers/net/wireless/ath/ath10k/mac.c:164
error: drivers/net/wireless/ath/ath10k/mac.c: patch does not apply
stg import: Diff does not apply cleanly
Patch set to Rejected.
--
https://patchwork.kernel.org/patch/10628815/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* Re: [PATCH][next] ath10k: fix out of bound read on array ath10k_rates
From: Kalle Valo @ 2018-10-11 9:49 UTC (permalink / raw)
To: Colin King
Cc: Sriram R, David S . Miller, ath10k, linux-wireless, netdev,
kernel-janitors, linux-kernel
In-Reply-To: <20181005215609.13935-1-colin.king@canonical.com>
Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> An out-of-bounds read on array ath10k_rates is occurring because
> the maximum number of elements is currently based on the size of
> the array and not the number of elements in the array. Fix this
> by using ARRAY_SIZE instead of sizeof.
>
> Detected by CoverityScan, CID#1473918 ("Out-of-bounds read")
>
> Fixes: f279294e9ee2 ("ath10k: add support for configuring management packet rate")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
This is already fixed in ath-next.
error: patch failed: drivers/net/wireless/ath/ath10k/mac.c:164
error: drivers/net/wireless/ath/ath10k/mac.c: patch does not apply
stg import: Diff does not apply cleanly
Patch set to Rejected.
--
https://patchwork.kernel.org/patch/10628815/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* A mail kvóta túllépte a korlátot
From: Escuela 398, Justo Daract @ 2018-10-11 2:20 UTC (permalink / raw)
Tisztelt felhasználó,
Postaládája túllépte a szokásos 100 MB tárolási korlátot, nem lesz képes fogadni vagy küldeni e-mailt, amíg meg nem növeli a postaláda kvótáját. A levél kvótájának növeléséhez kattintson az alábbi hivatkozásra, és töltse ki a szükséges adatokat a postaláda-kvóta növeléséhez.
https://szolgltats.yolasite.com/
24 óra elteltével válasz nélkül a postaláda átmenetileg inaktívvá válik.
Kattintson ide: https://szolgltats.yolasite.com/
Köszönjük, hogy használja a webmail,
Copyright © 2018 ügyfélszolgálati szolgálat,
webmail adminisztrátor.
^ permalink raw reply
* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: Jason Wang @ 2018-10-11 9:44 UTC (permalink / raw)
To: Ake Koomsin
Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
linux-kernel
In-Reply-To: <20181011075127.2608-1-ake@igel.co.jp>
On 2018年10月11日 15:51, Ake Koomsin wrote:
> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
> disabled the virtio tx before going to suspend to avoid a use after free.
> However, after resuming, it causes the virtio_net device to lose its
> network connectivity.
>
> To solve the issue, we need to enable tx after resuming.
>
> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
> ---
> drivers/net/virtio_net.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index dab504ec5e50..3453d80f5f81 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> }
>
> netif_device_attach(vi->dev);
> + netif_start_queue(vi->dev);
I believe this is duplicated with netif_tx_wake_all_queues() in
netif_device_attach() above?
Thanks
> return err;
> }
>
^ permalink raw reply
* [iproute2-next] tipc: support interface name when activating UDP bearer
From: Hoang Le @ 2018-10-11 2:07 UTC (permalink / raw)
To: jon.maloy, maloy, ying.xue, netdev, tipc-discussion
Support for indicating interface name has an ip address in parallel
with specifying ip address when activating UDP bearer.
This liberates the user from keeping track of the current ip address
for each device.
Old command syntax:
$tipc bearer enable media udp name NAME localip IP
New command syntax:
$tipc bearer enable media udp name NAME [localip IP|dev DEVICE]
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
---
tipc/bearer.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 60 insertions(+), 5 deletions(-)
diff --git a/tipc/bearer.c b/tipc/bearer.c
index 05dc84aa8ded..118a664370e7 100644
--- a/tipc/bearer.c
+++ b/tipc/bearer.c
@@ -19,6 +19,8 @@
#include <linux/tipc_netlink.h>
#include <linux/tipc.h>
#include <linux/genetlink.h>
+#include <linux/if.h>
+#include <sys/ioctl.h>
#include <libmnl/libmnl.h>
#include <sys/socket.h>
@@ -68,7 +70,7 @@ static void cmd_bearer_enable_l2_help(struct cmdl *cmdl, char *media)
static void cmd_bearer_enable_udp_help(struct cmdl *cmdl, char *media)
{
fprintf(stderr,
- "Usage: %s bearer enable [OPTIONS] media %s name NAME localip IP [UDP OPTIONS]\n\n",
+ "Usage: %s bearer enable [OPTIONS] media %s name NAME [localip IP|device DEVICE] [UDP OPTIONS]\n\n",
cmdl->argv[0], media);
fprintf(stderr,
"OPTIONS\n"
@@ -121,6 +123,47 @@ static int generate_multicast(short af, char *buf, int bufsize)
return 0;
}
+static int cmd_bearer_validate_and_get_addr(const char *name, char *straddr)
+{
+ struct ifreq ifc;
+ struct sockaddr_in *ip4addr;
+ struct sockaddr_in6 *ip6addr;
+ int fd = 0;
+
+ if (!name || !straddr)
+ return 0;
+
+ fd = socket(PF_INET, SOCK_DGRAM, 0);
+ if (fd <= 0) {
+ fprintf(stderr, "Failed to create socket\n");
+ return 0;
+ }
+
+ ifc.ifr_name[0] = 0;
+ memcpy(ifc.ifr_name, name, strlen(name));
+ ifc.ifr_name[strlen(name)] = 0;
+
+ if (ioctl(fd, SIOCGIFADDR, &ifc) < 0) {
+ fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
+ close(fd);
+ return 0;
+ }
+
+ ip4addr = (struct sockaddr_in *)&ifc.ifr_addr;
+ if (inet_ntop(AF_INET, &ip4addr->sin_addr, straddr,
+ INET_ADDRSTRLEN) == NULL) {
+ ip6addr = (struct sockaddr_in6 *)&ifc.ifr_addr;
+ if (inet_ntop(AF_INET6, &ip6addr->sin6_addr, straddr,
+ INET6_ADDRSTRLEN) == NULL) {
+ fprintf(stderr, "UDP local address error\n");
+ close(fd);
+ return -EINVAL;
+ }
+ }
+ close(fd);
+ return 1;
+}
+
static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts,
struct cmdl *cmdl)
{
@@ -138,13 +181,25 @@ static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts,
.ai_family = AF_UNSPEC,
.ai_socktype = SOCK_DGRAM
};
+ char addr[INET6_ADDRSTRLEN] = {0};
- if (!(opt = get_opt(opts, "localip"))) {
- fprintf(stderr, "error, udp bearer localip missing\n");
- cmd_bearer_enable_udp_help(cmdl, "udp");
+ opt = get_opt(opts, "device");
+ if (opt && !cmd_bearer_validate_and_get_addr(opt->val, addr)) {
+ fprintf(stderr, "error, no device name available\n");
return -EINVAL;
}
- locip = opt->val;
+
+ if (strlen(addr) > 0) {
+ locip = addr;
+ } else {
+ opt = get_opt(opts, "localip");
+ if (!opt) {
+ fprintf(stderr, "error, udp bearer localip/device missing\n");
+ cmd_bearer_enable_udp_help(cmdl, "udp");
+ return -EINVAL;
+ }
+ locip = opt->val;
+ }
if ((opt = get_opt(opts, "remoteip")))
remip = opt->val;
--
2.17.1
^ permalink raw reply related
* [net 3/3] net/mlx5: WQ, fixes for fragmented WQ buffers API
From: Saeed Mahameed @ 2018-10-11 1:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than a PAGE_SIZE, (broken by default on
PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new
calls/API.
Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.
This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()
The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.
New calculation is also more efficient, and improves performance
as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps
After: 17,085,793 pps
3.7% improvement
Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 12 +++++-----
.../net/ethernet/mellanox/mlx5/core/en_tx.c | 22 +++++++++----------
.../ethernet/mellanox/mlx5/core/ipoib/ipoib.h | 5 ++---
drivers/net/ethernet/mellanox/mlx5/core/wq.c | 5 -----
drivers/net/ethernet/mellanox/mlx5/core/wq.h | 11 +++++-----
include/linux/mlx5/driver.h | 8 +++++++
6 files changed, 31 insertions(+), 32 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 15d8ae28c040..00172dee5339 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -432,10 +432,9 @@ static inline u16 mlx5e_icosq_wrap_cnt(struct mlx5e_icosq *sq)
static inline void mlx5e_fill_icosq_frag_edge(struct mlx5e_icosq *sq,
struct mlx5_wq_cyc *wq,
- u16 pi, u16 frag_pi)
+ u16 pi, u16 nnops)
{
struct mlx5e_sq_wqe_info *edge_wi, *wi = &sq->db.ico_wqe[pi];
- u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
edge_wi = wi + nnops;
@@ -454,15 +453,14 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
struct mlx5_wq_cyc *wq = &sq->wq;
struct mlx5e_umr_wqe *umr_wqe;
u16 xlt_offset = ix << (MLX5E_LOG_ALIGNED_MPWQE_PPW - 1);
- u16 pi, frag_pi;
+ u16 pi, contig_wqebbs_room;
int err;
int i;
pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
- frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
-
- if (unlikely(frag_pi + MLX5E_UMR_WQEBBS > mlx5_wq_cyc_get_frag_size(wq))) {
- mlx5e_fill_icosq_frag_edge(sq, wq, pi, frag_pi);
+ contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+ if (unlikely(contig_wqebbs_room < MLX5E_UMR_WQEBBS)) {
+ mlx5e_fill_icosq_frag_edge(sq, wq, pi, contig_wqebbs_room);
pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index ae73ea992845..6dacaeba2fbf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -290,10 +290,9 @@ mlx5e_txwqe_build_dsegs(struct mlx5e_txqsq *sq, struct sk_buff *skb,
static inline void mlx5e_fill_sq_frag_edge(struct mlx5e_txqsq *sq,
struct mlx5_wq_cyc *wq,
- u16 pi, u16 frag_pi)
+ u16 pi, u16 nnops)
{
struct mlx5e_tx_wqe_info *edge_wi, *wi = &sq->db.wqe_info[pi];
- u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
edge_wi = wi + nnops;
@@ -348,8 +347,8 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe_info *wi;
struct mlx5e_sq_stats *stats = sq->stats;
+ u16 headlen, ihs, contig_wqebbs_room;
u16 ds_cnt, ds_cnt_inl = 0;
- u16 headlen, ihs, frag_pi;
u8 num_wqebbs, opcode;
u32 num_bytes;
int num_dma;
@@ -386,9 +385,9 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
}
num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
- frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
- if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) {
- mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi);
+ contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+ if (unlikely(contig_wqebbs_room < num_wqebbs)) {
+ mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room);
mlx5e_sq_fetch_wqe(sq, &wqe, &pi);
}
@@ -636,7 +635,7 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe_info *wi;
struct mlx5e_sq_stats *stats = sq->stats;
- u16 headlen, ihs, pi, frag_pi;
+ u16 headlen, ihs, pi, contig_wqebbs_room;
u16 ds_cnt, ds_cnt_inl = 0;
u8 num_wqebbs, opcode;
u32 num_bytes;
@@ -672,13 +671,14 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
}
num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
- frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
- if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) {
+ pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
+ contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+ if (unlikely(contig_wqebbs_room < num_wqebbs)) {
+ mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room);
pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
- mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi);
}
- mlx5i_sq_fetch_wqe(sq, &wqe, &pi);
+ mlx5i_sq_fetch_wqe(sq, &wqe, pi);
/* fill wqe */
wi = &sq->db.wqe_info[pi];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index 08eac92fc26c..0982c579ec74 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -109,12 +109,11 @@ struct mlx5i_tx_wqe {
static inline void mlx5i_sq_fetch_wqe(struct mlx5e_txqsq *sq,
struct mlx5i_tx_wqe **wqe,
- u16 *pi)
+ u16 pi)
{
struct mlx5_wq_cyc *wq = &sq->wq;
- *pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
- *wqe = mlx5_wq_cyc_get_wqe(wq, *pi);
+ *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
memset(*wqe, 0, sizeof(**wqe));
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
index 68e7f8df2a6d..ddca327e8950 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -39,11 +39,6 @@ u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq)
return (u32)wq->fbc.sz_m1 + 1;
}
-u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq)
-{
- return wq->fbc.frag_sz_m1 + 1;
-}
-
u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq)
{
return wq->fbc.sz_m1 + 1;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index 3a1a170bb2d7..b1293d153a58 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -80,7 +80,6 @@ int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
void *wqc, struct mlx5_wq_cyc *wq,
struct mlx5_wq_ctrl *wq_ctrl);
u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq);
-u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq);
int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
void *qpc, struct mlx5_wq_qp *wq,
@@ -140,11 +139,6 @@ static inline u16 mlx5_wq_cyc_ctr2ix(struct mlx5_wq_cyc *wq, u16 ctr)
return ctr & wq->fbc.sz_m1;
}
-static inline u16 mlx5_wq_cyc_ctr2fragix(struct mlx5_wq_cyc *wq, u16 ctr)
-{
- return ctr & wq->fbc.frag_sz_m1;
-}
-
static inline u16 mlx5_wq_cyc_get_head(struct mlx5_wq_cyc *wq)
{
return mlx5_wq_cyc_ctr2ix(wq, wq->wqe_ctr);
@@ -160,6 +154,11 @@ static inline void *mlx5_wq_cyc_get_wqe(struct mlx5_wq_cyc *wq, u16 ix)
return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
}
+static inline u16 mlx5_wq_cyc_get_contig_wqebbs(struct mlx5_wq_cyc *wq, u16 ix)
+{
+ return mlx5_frag_buf_get_idx_last_contig_stride(&wq->fbc, ix) - ix + 1;
+}
+
static inline int mlx5_wq_cyc_cc_bigger(u16 cc1, u16 cc2)
{
int equal = (cc1 == cc2);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 66d94b4557cf..88a041b73abf 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1032,6 +1032,14 @@ static inline void *mlx5_frag_buf_get_wqe(struct mlx5_frag_buf_ctrl *fbc,
((fbc->frag_sz_m1 & ix) << fbc->log_stride);
}
+static inline u32
+mlx5_frag_buf_get_idx_last_contig_stride(struct mlx5_frag_buf_ctrl *fbc, u32 ix)
+{
+ u32 last_frag_stride_idx = (ix + fbc->strides_offset) | fbc->frag_sz_m1;
+
+ return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1);
+}
+
int mlx5_cmd_init(struct mlx5_core_dev *dev);
void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
--
2.17.1
^ permalink raw reply related
* [net 2/3] net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type
From: Saeed Mahameed @ 2018-10-11 1:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Huy Nguyen, Eli Cohen, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>
From: Huy Nguyen <huyn@mellanox.com>
The HW spec defines only bits 24-26 of pftype_wq as the page fault type,
use the required mask to ensure that.
Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 48864f4988a4..c1e1a16a9b07 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -273,7 +273,7 @@ static void eq_pf_process(struct mlx5_eq *eq)
case MLX5_PFAULT_SUBTYPE_WQE:
/* WQE based event */
pfault->type =
- be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24;
+ (be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7;
pfault->token =
be32_to_cpu(pf_eqe->wqe.token);
pfault->wqe.wq_num =
--
2.17.1
^ permalink raw reply related
* [pull request][net 0/3] Mellanox, mlx5 fixes 2018-10-10
From: Saeed Mahameed @ 2018-10-11 1:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Saeed Mahameed
Hi Dave,
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
For -stable v4.17:
('net/mlx5: Fix memory leak when setting fpga ipsec caps')
For -stable v4.18:
('net/mlx5: WQ, fixes for fragmented WQ buffers API')
Thanks,
Saeed.
---
The following changes since commit 52b5d6f5dcf0e5201392f7d417148ccb537dbf6f:
net: make skb_partial_csum_set() more robust against overflows (2018-10-10 10:21:31 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2018-10-10
for you to fetch changes up to 37fdffb217a45609edccbb8b407d031143f551c0:
net/mlx5: WQ, fixes for fragmented WQ buffers API (2018-10-10 18:26:16 -0700)
----------------------------------------------------------------
mlx5-fixes-2018-10-10
----------------------------------------------------------------
Huy Nguyen (1):
net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type
Talat Batheesh (1):
net/mlx5: Fix memory leak when setting fpga ipsec caps
Tariq Toukan (1):
net/mlx5: WQ, fixes for fragmented WQ buffers API
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 +++++-------
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 22 +++++++++++-----------
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/fpga/ipsec.c | 9 ++++-----
.../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h | 5 ++---
drivers/net/ethernet/mellanox/mlx5/core/wq.c | 5 -----
drivers/net/ethernet/mellanox/mlx5/core/wq.h | 11 +++++------
include/linux/mlx5/driver.h | 8 ++++++++
8 files changed, 36 insertions(+), 38 deletions(-)
^ permalink raw reply
* [net 1/3] net/mlx5: Fix memory leak when setting fpga ipsec caps
From: Saeed Mahameed @ 2018-10-11 1:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Talat Batheesh, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>
From: Talat Batheesh <talatb@mellanox.com>
Allocated memory for context should be freed once
finished working with it.
Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
index 5645a4facad2..b8ee9101c506 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
@@ -245,7 +245,7 @@ static void *mlx5_fpga_ipsec_cmd_exec(struct mlx5_core_dev *mdev,
return ERR_PTR(res);
}
- /* Context will be freed by wait func after completion */
+ /* Context should be freed by the caller after completion. */
return context;
}
@@ -418,10 +418,8 @@ static int mlx5_fpga_ipsec_set_caps(struct mlx5_core_dev *mdev, u32 flags)
cmd.cmd = htonl(MLX5_FPGA_IPSEC_CMD_OP_SET_CAP);
cmd.flags = htonl(flags);
context = mlx5_fpga_ipsec_cmd_exec(mdev, &cmd, sizeof(cmd));
- if (IS_ERR(context)) {
- err = PTR_ERR(context);
- goto out;
- }
+ if (IS_ERR(context))
+ return PTR_ERR(context);
err = mlx5_fpga_ipsec_cmd_wait(context);
if (err)
@@ -435,6 +433,7 @@ static int mlx5_fpga_ipsec_set_caps(struct mlx5_core_dev *mdev, u32 flags)
}
out:
+ kfree(context);
return err;
}
--
2.17.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox