Netdev List


* RE: [Intel-wired-lan] [PATCH] e1000e: Ignore TSYNCRXCTL when getting I219 clock attributes
From: Brown, Aaron F @ 2018-05-23  0:44 UTC (permalink / raw)
  To: Benjamin Poirier, Kirsher, Jeffrey T
  Cc: ehabkost@redhat.com, netdev@vger.kernel.org, jayanth@goubiq.com,
	linux-kernel@vger.kernel.org, Bart.VanAssche@wdc.com,
	postmodern.mod3@gmail.com, Achim Mildenberger,
	intel-wired-lan@lists.osuosl.org, olouvignes@gmail.com
In-Reply-To: <20180510072835.5549-1-bpoirier@suse.com>

> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Benjamin Poirier
> Sent: Thursday, May 10, 2018 12:29 AM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>
> Cc: ehabkost@redhat.com; netdev@vger.kernel.org; jayanth@goubiq.com;
> linux-kernel@vger.kernel.org; Bart.VanAssche@wdc.com;
> postmodern.mod3@gmail.com; Achim Mildenberger
> <admin@fph.physik.uni-karlsruhe.de>; intel-wired-lan@lists.osuosl.org;
> olouvignes@gmail.com
> Subject: [Intel-wired-lan] [PATCH] e1000e: Ignore TSYNCRXCTL when getting
> I219 clock attributes
> 
> There have been multiple reports of crashes that look like
> kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50
> [...]
> kernel: Call Trace:
> kernel:  [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e]
> kernel:  [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80
> [e1000e]
> kernel:  [<ffffffff810992c5>] process_one_work+0x155/0x440
> kernel:  [<ffffffff81099e16>] worker_thread+0x116/0x4b0
> kernel:  [<ffffffff8109f422>] kthread+0xd2/0xf0
> kernel:  [<ffffffff8163184f>] ret_from_fork+0x3f/0x70
> 
> These can be traced back to the fact that e1000e_systim_reset() skips the
> timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL, which
> leads to a null deref in timecounter_read().
> 
> Commit 83129b37ef35 ("e1000e: fix systim issues", v4.2-rc1) reworked
> e1000e_get_base_timinca() in such a way that it can return -EINVAL for
> e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL.
> 
> Some experimentation has shown that on I219 (e1000_pch_spt, "MAC: 12")
> adapters, the E1000_TSYNCRXCTL_SYSCFI flag is unstable; TSYNCRXCTL reads
> sometimes don't have the SYSCFI bit set. Retrying the read shortly after
> finds the bit to be set. This was observed at boot (probe) but also link up
> and link down.
> 
> Moreover, the phc (PTP Hardware Clock) seems to operate normally even after
> reads where SYSCFI=0. Therefore, remove this register read and
> unconditionally set the clock parameters.
> 
> Reported-by: Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de>
> Message-Id: <20180425065243.g5mqewg5irkwgwgv@f2>
> Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1075876
> Fixes: 83129b37ef35 ("e1000e: fix systim issues")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)

Tested-by: Aaron Brown <aaron.f.brown@intel.com>
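
The failure mode described in the patch can be sketched in userspace C. This is not the driver code: the struct layouts, fake_get_base_timinca() and both reset functions are simplified stand-ins invented for illustration, and the unstable SYSCFI read is modeled by a plain flag.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's cyclecounter/timecounter. */
struct cyclecounter {
	uint64_t (*read)(const struct cyclecounter *cc);
};

struct timecounter {
	const struct cyclecounter *cc;	/* stays NULL until initialised */
};

static uint64_t fake_read(const struct cyclecounter *cc)
{
	(void)cc;
	return 42;
}

static struct cyclecounter fake_cc = { .read = fake_read };

/* Hypothetical model of the pre-patch check: fails when SYSCFI reads 0. */
static int fake_get_base_timinca(int syscfi_set)
{
	return syscfi_set ? 0 : -EINVAL;
}

/* Pre-patch shape: an -EINVAL return skips the timecounter setup. */
static int systim_reset_old(struct timecounter *tc, int syscfi_set)
{
	int ret = fake_get_base_timinca(syscfi_set);

	if (ret)
		return ret;	/* tc->cc is left NULL */
	tc->cc = &fake_cc;
	return 0;
}

/* Post-patch shape: the unstable bit is ignored, setup always happens. */
static int systim_reset_new(struct timecounter *tc, int syscfi_set)
{
	(void)syscfi_set;
	tc->cc = &fake_cc;
	return 0;
}

/* Reader that reports the problem instead of dereferencing NULL. */
static int timecounter_read_checked(const struct timecounter *tc, uint64_t *out)
{
	if (!tc->cc)
		return -EFAULT;	/* the kernel would oops at this point */
	*out = tc->cc->read(tc->cc);
	return 0;
}
```

With the old shape, a single SYSCFI=0 read leaves the timecounter unusable for the later overflow work; with the new shape the read result no longer matters.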


* Re: [PATCH 08/31] aio: implement IOCB_CMD_POLL
From: Al Viro @ 2018-05-23  0:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Avi Kivity, linux-aio, linux-fsdevel, netdev, linux-api,
	linux-kernel
In-Reply-To: <20180522220524.GE30522@ZenIV.linux.org.uk>

On Tue, May 22, 2018 at 11:05:24PM +0100, Al Viro wrote:
> > +{
> > +	struct aio_kiocb *iocb = container_of(req, struct aio_kiocb, poll);
> > +
> > +	fput(req->file);
> > +	aio_complete(iocb, mangle_poll(mask), 0);
> > +}
> 
> Careful.
> 
> > +static int aio_poll_cancel(struct kiocb *iocb)
> > +{
> > +	struct aio_kiocb *aiocb = container_of(iocb, struct aio_kiocb, rw);
> > +	struct poll_iocb *req = &aiocb->poll;
> > +	struct wait_queue_head *head = req->head;
> > +	bool found = false;
> > +
> > +	spin_lock(&head->lock);
> > +	found = __aio_poll_remove(req);
> > +	spin_unlock(&head->lock);
> 
> What's to guarantee that req->head has not been freed by that point?
> Look: wakeup finds ->ctx_lock held, so it leaves the sucker on the
> list, removes it from queue and schedules the call of __aio_poll_complete().
> Which gets executed just as we hit aio_poll_cancel(), starting with fput().
> 
> You really want to do aio_complete() before fput().  That way you know that
> req->wait is alive and well at least until iocb gets removed from the list.

Oh, bugger...

wakeup
	removed from queue
	schedule __aio_poll_complete()

cancel
	grab ctx->lock
	remove from list
work
	aio_complete()
		check if it's in the list
		it isn't, move on to free the sucker
cancel
	call ->ki_cancel()
	BOOM

Looks like we want to call ->ki_cancel() *BEFORE* removing from the list,
as well as doing fput() after aio_complete().  The same ordering, BTW, goes
for aio_read() et al.

Look:
CPU1:	io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
CPU2:	aio_rw_complete() on that iocb.  Since the sucker is not in the list
anymore, we do NOT spin on ->ctx_lock and proceed to free iocb
CPU1:	pass freed iocb to ->ki_cancel().  BOOM.

and if we have fput() done first (in aio_rw_complete()) we are vulnerable to
CPU1:	io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
CPU2:	aio_rw_complete() on that iocb. fput() done, opening us to rmmod.
CPU1:	call ->ki_cancel(), which points to freed memory now.  BOOM.
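
The two orderings can be replayed deterministically in a single-threaded toy model. Everything here is hypothetical: the flags stand in for list membership and for freeing the iocb, and the "other CPU" is just a function call placed in the race window. It only illustrates why cancelling while still on the list closes the window.

```c
#include <stdbool.h>

/* Toy request: flags stand in for list membership and kfree(). */
struct toy_req {
	bool on_list;
	bool freed;
	bool cancel_saw_freed;	/* set if the cancel hook ran on a freed req */
};

/* Completion side: frees at once if the request is already off the list. */
static bool toy_complete(struct toy_req *req)
{
	if (req->on_list)
		return false;	/* real code would wait for ->ctx_lock here */
	req->freed = true;
	return true;
}

/* Stand-in for calling ->ki_cancel(). */
static void toy_cancel(struct toy_req *req)
{
	if (req->freed)
		req->cancel_saw_freed = true;	/* use-after-free in real code */
}

/* Racy ordering: unlink, completion sneaks in, then cancel. */
static bool run_unlink_then_cancel(void)
{
	struct toy_req req = { .on_list = true };

	req.on_list = false;
	toy_complete(&req);	/* other CPU runs in the window */
	toy_cancel(&req);
	return req.cancel_saw_freed;
}

/* Proposed ordering: cancel while still on the list, then unlink. */
static bool run_cancel_then_unlink(void)
{
	struct toy_req req = { .on_list = true };

	toy_complete(&req);	/* other CPU is held off: still on the list */
	toy_cancel(&req);
	req.on_list = false;
	return req.cancel_saw_freed;
}
```

run_unlink_then_cancel() reports the BOOM case; run_cancel_then_unlink() does not.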


* Re: [PATCH 08/31] aio: implement IOCB_CMD_POLL
From: Al Viro @ 2018-05-23  0:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Avi Kivity, linux-aio, linux-fsdevel, netdev, linux-api,
	linux-kernel
In-Reply-To: <20180523004530.GG30522@ZenIV.linux.org.uk>

On Wed, May 23, 2018 at 01:45:30AM +0100, Al Viro wrote:

> Oh, bugger...
> 
> wakeup
> 	removed from queue
> 	schedule __aio_poll_complete()
> 
> cancel
> 	grab ctx->lock
> 	remove from list
> work
> 	aio_complete()
> 		check if it's in the list
> 		it isn't, move on to free the sucker
> cancel
> 	call ->ki_cancel()
> 	BOOM
> 
> Looks like we want to call ->ki_cancel() *BEFORE* removing from the list,
> as well as doing fput() after aio_complete().  The same ordering, BTW, goes
> for aio_read() et al.
> 
> Look:
> CPU1:	io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
> CPU2:	aio_rw_complete() on that iocb.  Since the sucker is not in the list
> anymore, we do NOT spin on ->ctx_lock and proceed to free iocb
> CPU1:	pass freed iocb to ->ki_cancel().  BOOM.

BTW, it seems that the mainline is vulnerable to this one.  I might be
missing something, but...


* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Huy Nguyen @ 2018-05-23  1:01 UTC (permalink / raw)
  To: Jakub Kicinski, Saeed Mahameed, David S. Miller, netdev,
	Jiri Pirko, Or Gerlitz, Parav Pandit
In-Reply-To: <20180522113202.54c43440@cakuba>

Dear Jakub, PSB.

On 5/22/2018 1:32 PM, Jakub Kicinski wrote:
> On Tue, 22 May 2018 10:36:17 -0500, Huy Nguyen wrote:
>> On 5/22/2018 12:20 AM, Jakub Kicinski wrote:
>>> On Mon, 21 May 2018 14:04:57 -0700, Saeed Mahameed wrote:
>>>> From: Huy Nguyen <huyn@mellanox.com>
>>>>
>>>> In this patch, we add dcbnl buffer attribute to allow the user to
>>>> change the NIC's buffer configuration, such as the priority
>>>> to buffer mapping and the buffer size of an individual buffer.
>>>>
>>>> This attribute combined with the pfc attribute allows an advanced user to
>>>> fine tune the qos setting for a specific priority queue. For example,
>>>> the user can give a dedicated buffer to one or more priorities, or
>>>> give a large buffer to certain priorities.
>>>>
>>>> We present a use case scenario where the dcbnl buffer attribute configured
>>>> by an advanced user helps reduce the latency of messages of different sizes.
>>>>
>>>> Scenario description:
>>>> On ConnectX-5, we run latency sensitive traffic with
>>>> small/medium message sizes ranging from 64B to 256KB and bandwidth sensitive
>>>> traffic with large message sizes of 512KB and 1MB. We group small, medium,
>>>> and large message sizes into their own pfc enabled priorities as follows.
>>>>     Priorities 1 & 2 (64B, 256B and 1KB)
>>>>     Priorities 3 & 4 (4KB, 8KB, 16KB, 64KB, 128KB and 256KB)
>>>>     Priorities 5 & 6 (512KB and 1MB)
>>>>
>>>> By default, ConnectX-5 maps all pfc enabled priorities to a single
>>>> lossless fixed buffer size of 50% of total available buffer space. The
>>>> other 50% is assigned to the lossy buffer. Using the dcbnl buffer attribute,
>>>> we create three equal size lossless buffers. Each buffer has 25% of total
>>>> available buffer space. Thus, the lossy buffer size is reduced to 25%. Priority
>>>> to lossless buffer mappings are set as follows.
>>>>     Priorities 1 & 2 on lossless buffer #1
>>>>     Priorities 3 & 4 on lossless buffer #2
>>>>     Priorities 5 & 6 on lossless buffer #3
>>>>
>>>> We observe improvements in latency for small and medium message sizes
>>>> as follows. Please note that the bandwidth for large message sizes is
>>>> reduced but the total bandwidth remains the same.
>>>>     256B message size (42% latency reduction)
>>>>     4K message size (21% latency reduction)
>>>>     64K message size (16% latency reduction)
>>>>
>>>> Signed-off-by: Huy Nguyen <huyn@mellanox.com>
>>>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>>> On a cursory look this bears a lot of resemblance to devlink shared
>>> buffer configuration ABI.  Did you look into using that?
>>>
>>> Just to be clear devlink shared buffer ABIs don't require representors
>>> and "switchdev mode".
>>> .
>> [HQN] Dear Jakub, there are several reasons why the devlink shared buffer
>> ABI cannot be used:
>> 1. The devlink shared buffer ABI is written based on the switch CLI,
>> which you can find out more about
>> at this link: https://community.mellanox.com/docs/DOC-2558.
> Devlink API accommodates requirements of simpler (SwitchX2?) and more
> advanced schemes (present in Spectrum).  The simpler/basic static
> threshold configurations is exactly what you are doing here, AFAIU.
[HQN] The devlink API is tailored specifically for switches. We don't
configure thresholds explicitly; that is done via PFC. Once PFC is
enabled on a priority, the threshold is set up based on our proprietary
formula, which was tested rigorously for performance.
>> 2. The dcbnl interfaces have been used for QoS settings.
> QoS settings != shared buffer configuration.
[HQN] I think we have different definitions of "shared buffer". Please
refer to the switch CLI link below, which explains in detail what
"shared buffer" means in a switch. Our NIC does not support a "shared
buffer".
https://community.mellanox.com/docs/DOC-2591

>
>> In a NIC, the buffer configuration is tied to priority (ETS PFC).
> Some customers use DCB, a lot (most?) of them don't.  I don't think the
> "this is a logical extension of a commonly used API" really stands here.
[HQN] DCBNL is being actively used. The whole point of this patch
is to tie the buffer configuration to IEEE priorities and the IEEE PFC
configuration.

A more ambitious future goal is to have the switch configure the NIC's
buffer size and buffer mapping via TLV packets and this DCBNL interface.
But we won't go that far here.
>
>> The buffer configuration is not tied to a port as in a switch.
> It's tied to a port and TCs, you just have one port but still have 8
> TCs exactly like a switch...
[HQN] No. Our buffers are tied to priorities, not to TCs.
>> 3. Shared buffer, alpha, threshold are switch specific terms.
> IDK how talking about alpha is relevant, it's just one threshold type
> the API supports.  As far as shared buffer and threshold I don't know
> if these are switch terms (or how "switch" differs from "NIC" at that
> level) - I personally find carving shared buffer into pools very
> intuitive.
[HQN] Yes, I understand your point too. The NIC's buffer shares some
characteristics with the switch's buffer settings. But this DCB buffer
setting is meant to improve performance and to work together with the
PFC setting. We would like to keep all the qos settings under DCB
Netlink, as they were designed to be.

>
> Could you give examples of commands/configs one can use with your new
> ABI?
[HQN] The plan is to add support in lldptool once the kernel code is
accepted. To test the kernel code, I am using small python scripts that
work on top of the netlink library. The format will look like this,
similar to other options in lldptool:
     priority2buffer: 0,2,5,7,1,2,3,6
         maps priorities 0,1,2,3,4,5,6,7 to buffers 0,2,5,7,1,2,3,6
     buffer_size: 87296,87296,0,87296,0,0,0,0
         sets the receive buffer size for buffers 0,1,2,3,4,5,6,7 respectively
>    How does one query the total size of the buffer to be carved?
[HQN] This is not necessary. If the total size is too big, an error will be
returned via the DCB netlink interface.
>
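
As a rough illustration of the two lists quoted above and of the "error if the total is too big" behaviour, here is a small userspace sketch. The struct, the function name, the choice of -EINVAL and the total passed in are all assumptions made for the example; the thread does not show the actual kernel structures or the driver's limits.

```c
#include <errno.h>
#include <stdint.h>

#define NUM_PRIO	8
#define NUM_BUFFER	8

/* Hypothetical container for the two lists in the lldptool-style example. */
struct buf_cfg {
	uint8_t  prio2buffer[NUM_PRIO];		/* index: priority, value: buffer */
	uint32_t buffer_size[NUM_BUFFER];	/* receive buffer sizes in bytes */
};

/* Reject mappings to nonexistent buffers and sizes beyond the total. */
static int buf_cfg_check(const struct buf_cfg *cfg, uint64_t total_bytes)
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < NUM_PRIO; i++)
		if (cfg->prio2buffer[i] >= NUM_BUFFER)
			return -EINVAL;

	for (i = 0; i < NUM_BUFFER; i++)
		sum += cfg->buffer_size[i];

	/* Assumed error code; the mail only says "an error is returned". */
	return sum > total_bytes ? -EINVAL : 0;
}
```

With the sizes from the example (three buffers of 87296 bytes), the check passes for any total of at least 261888 bytes.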


* Re: [PATCH net-next v2] net: sched: don't disable bh when accessing action idr
From: Cong Wang @ 2018-05-23  1:10 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	Jiri Pirko, LKML
In-Reply-To: <1526932984-11544-1-git-send-email-vladbu@mellanox.com>

On Mon, May 21, 2018 at 1:03 PM, Vlad Buslov <vladbu@mellanox.com> wrote:
> Initial net_device implementation used ingress_lock spinlock to synchronize
> ingress path of device. This lock was used in both process and bh context.
> In some code paths action map lock was obtained while holding ingress_lock.
> Commit e1e992e52faa ("[NET_SCHED] protect action config/dump from irqs")
> modified actions to always disable bh, while using action map lock, in
> order to prevent deadlock on ingress_lock in softirq. This lock was removed
> from net_device, so disabling bh, while accessing action map, is no longer
> necessary.
>
> Replace all action idr spinlock usage with regular calls that do not
> disable bh.

While your patch is probably fine, the above justification does not seem right.

In the past, tc actions could be released in BH context because tc
filters used call_rcu(). However, I moved them to a workqueue recently.
So before my change I don't think you could remove the BH protection,
otherwise it would race with idr_remove()...


* Re: [PATCH net] net: phy: broadcom: Fix bcm_write_exp()
From: Florian Fainelli @ 2018-05-23  1:20 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, arunp, David S. Miller, Ray Jui, Scott Branden, Jon Mason,
	maintainer:BROADCOM IPROC ARM ARCHITECTURE,
	moderated list:BROADCOM IPROC ARM ARCHITECTURE, open list
In-Reply-To: <20180523001503.GA16062@lunn.ch>

Hi Andrew,

On 05/22/2018 05:15 PM, Andrew Lunn wrote:
> On Tue, May 22, 2018 at 05:04:49PM -0700, Florian Fainelli wrote:
>> On newer PHYs, we need to select the expansion register to write with
>> setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
>> to being migrated to generic code under bcm-phy-lib.c which
>> unfortunately used the older implementation from the BCM54xx days.
> 
> Hi Florian
> 
> Does selecting the expansion register affect access to the standard
> registers? Does this need locking like the Marvell PHY has when
> changing pages?

We should probably convert this to the page accessors since the
expansion, misc and other shadow 0x1c accesses are all indirection
layers to poke into a different address space of the PHY. That would be
a separate fix though for a number of reasons.
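
For reference, the "bits [11:8] set to 0xf" selection mentioned in the patch description amounts to the following. The macro names and the 8-bit register mask are made up for this sketch; only the 0xf in bits [11:8] comes from the thread.

```c
#include <stdint.h>

#define EXP_SEL_ER	0x0f00u	/* bits [11:8] = 0xf, per the description */
#define EXP_REG_MASK	0x00ffu	/* assumed width of the register index */

/* Value to write to the expansion select register for a given index. */
static uint16_t exp_sel_value(uint16_t reg)
{
	return (uint16_t)(EXP_SEL_ER | (reg & EXP_REG_MASK));
}
```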
-- 
Florian


* Re: [PATCH net-next 1/2] tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
From: Neal Cardwell @ 2018-05-23  1:37 UTC (permalink / raw)
  To: kbuild test robot
  Cc: Eric Dumazet, kbuild-all, David Miller, Netdev, Van Jacobson,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet
In-Reply-To: <201805230804.KP8LODgK%fengguang.wu@intel.com>

On Tue, May 22, 2018 at 8:31 PM kbuild test robot <lkp@intel.com> wrote:

> Hi Eric,

> Thank you for the patch! Yet something to improve:

> [auto build test ERROR on net/master]
> [also build test ERROR on v4.17-rc6 next-20180517]
> [cannot apply to net-next/master]
> [if your patch is applied to the wrong git tree, please drop us a note to
> help improve the system]

> url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-max_quickacks-param-to-tcp_incr_quickack-and-tcp_enter_quickack_mode/20180523-075103
> config: i386-randconfig-x012-201820 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
>          # save the attached .config to linux build tree
>          make ARCH=i386

> All errors (new ones prefixed by >>):

>     net//ipv4/tcp_input.c: In function 'tcp_data_queue':
> >> net//ipv4/tcp_input.c:4656:2: error: too few arguments to function 'tcp_enter_quickack_mode'
>       tcp_enter_quickack_mode(sk);
>       ^~~~~~~~~~~~~~~~~~~~~~~
>     net//ipv4/tcp_input.c:199:13: note: declared here
>      static void tcp_enter_quickack_mode(struct sock *sk, unsigned int max_quickacks)
>                  ^~~~~~~~~~~~~~~~~~~~~~~
...

For the record, this is an error in the tool, rather than the patch. The
tool seems to be using a stale net-next tree for building this patch.

The compile error is here in line 4656:

> ^1da177e4 Linus Torvalds           2005-04-16  4652     /* Out of window. F.e. zero window probe. */
> ^1da177e4 Linus Torvalds           2005-04-16  4653     if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt + tcp_receive_window(tp)))
> ^1da177e4 Linus Torvalds           2005-04-16  4654             goto out_of_window;
> ^1da177e4 Linus Torvalds           2005-04-16  4655
> 463c84b97 Arnaldo Carvalho de Melo 2005-08-09 @4656     tcp_enter_quickack_mode(sk);
> ^1da177e4 Linus Torvalds           2005-04-16  4657
> ^1da177e4 Linus Torvalds           2005-04-16  4658     if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
> ^1da177e4 Linus Torvalds           2005-04-16  4659             /* Partial packet, seq < rcv_next < end_seq */
...

But that line is not in net-next any more, after Eric's recent net-next
commit:

a3893637e1eb0e ("tcp: do not force quickack when receiving out-of-order
packets")

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/net/ipv4/tcp_input.c?id=a3893637e1eb0ef5eb1bbc52b3a8d2dfa317a35d

That commit removed that line:

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0bf032839548f..f5622b2506651 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4715,8 +4715,6 @@ drop:
         if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt + tcp_receive_window(tp)))
                 goto out_of_window;

-       tcp_enter_quickack_mode(sk);
-
         if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
                 /* Partial packet, seq < rcv_next < end_seq */
                 SOCK_DEBUG(sk, "partial packet: rcv_next %X seq %X - %X\n",

cheers,
neal


* Re: [RFC V4 PATCH 7/8] vhost: packed ring support
From: Jason Wang @ 2018-05-23  1:39 UTC (permalink / raw)
  To: Wei Xu; +Cc: mst, kvm, virtualization, netdev, linux-kernel, jfreimann,
	tiwei.bie
In-Reply-To: <20180522165448.GA13523@wei-ubt>



On 2018-05-23 00:54, Wei Xu wrote:
> On Wed, May 16, 2018 at 08:32:20PM +0800, Jason Wang wrote:
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/net.c   |   3 +-
>>   drivers/vhost/vhost.c | 539 ++++++++++++++++++++++++++++++++++++++++++++++----
>>   drivers/vhost/vhost.h |   8 +-
>>   3 files changed, 513 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 8304c30..f2a0f5b 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -1358,6 +1382,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>   			break;
>>   		}
>>   		vq->last_avail_idx = s.num;
>> +		if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>> +			vq->avail_wrap_counter = s.num >> 31;
>>   		/* Forget the cached index value. */
>>   		vq->avail_idx = vq->last_avail_idx;
>>   		break;
>> @@ -1366,6 +1392,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>   		s.num = vq->last_avail_idx;
>>   		if (copy_to_user(argp, &s, sizeof s))
>>   			r = -EFAULT;
>> +		if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>> +			s.num |= vq->avail_wrap_counter << 31;
>>   		break;
>>   	case VHOST_SET_VRING_ADDR:
>>   		if (copy_from_user(&a, argp, sizeof a)) {
> 'last_used_idx' also needs to be saved/restored here.
>
> I have figured out the root cause of broken device after reloading
> 'virtio-net' module, all indices have been reset for a reloading but
> 'last_used_idx' is not properly reset in this case. This confuses
> handle_rx()/tx().
>
> Wei
>

Good catch, so we probably need a new ioctl to sync between qemu and vhost.

Something like VHOST_SET/GET_USED_BASE.

Thanks
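
The quoted hunks carry the wrap counter in bit 31 of the 32-bit vring base value. A round trip of that layout looks roughly like the sketch below. The helper names are invented, and masking the index to the low 31 bits is an assumption of this sketch rather than something the quoted patch does.

```c
#include <stdbool.h>
#include <stdint.h>

#define WRAP_SHIFT	31
#define IDX_MASK	((UINT32_C(1) << WRAP_SHIFT) - 1)

/* Combine a ring index and a wrap counter into one 32-bit value. */
static uint32_t vring_base_pack(uint32_t idx, bool wrap)
{
	return (idx & IDX_MASK) | ((uint32_t)wrap << WRAP_SHIFT);
}

/* Recover the ring index from the low 31 bits. */
static uint32_t vring_base_idx(uint32_t num)
{
	return num & IDX_MASK;
}

/* Recover the wrap counter from bit 31. */
static bool vring_base_wrap(uint32_t num)
{
	return (num >> WRAP_SHIFT) != 0;
}
```

Using unsigned types throughout avoids shifting a signed value into the sign bit.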


* Re: [PATCH v3] mlx4_core: allocate ICM memory in page size chunks
From: Qing Huang @ 2018-05-23  1:41 UTC (permalink / raw)
  To: Tariq Toukan, Eric Dumazet, davem, haakon.bugge, yanjun.zhu
  Cc: netdev, linux-rdma, linux-kernel, gi-oh.kim
In-Reply-To: <35ba0f14-7b24-96ff-6b2d-610a4b2980c2@mellanox.com>



On 5/22/2018 8:33 AM, Tariq Toukan wrote:
>
>
> On 18/05/2018 12:45 AM, Qing Huang wrote:
>>
>>
>> On 5/17/2018 2:14 PM, Eric Dumazet wrote:
>>> On 05/17/2018 01:53 PM, Qing Huang wrote:
>>>> When a system is under memory pressure (high usage with fragments),
>>>> the original 256KB ICM chunk allocations will likely trigger kernel
>>>> memory management to enter slow path doing memory compact/migration
>>>> ops in order to complete high order memory allocations.
>>>>
>>>> When that happens, user processes calling uverb APIs may get stuck
>>>> for more than 120s easily even though there are a lot of free pages
>>>> in smaller chunks available in the system.
>>>>
>>>> Syslog:
>>>> ...
>>>> Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
>>>> oracle_205573_e:205573 blocked for more than 120 seconds.
>>>> ...
>>>>
>>> NACK on this patch.
>>>
>>> You have been asked repeatedly to use kvmalloc()
>>>
>>> This is not a minor suggestion.
>>>
>>> Take a look at
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d8c13f2271ec5178c52fbde072ec7b562651ed9d
>>>
>>
>> Would you please take a look at how table->icm is being used in the
>> mlx4 driver? It's metadata used for referencing individual pointer
>> variables, not a data frag or in/out buffer. It has no need for
>> contiguous physical memory.
>>
>> Thanks.
>>
>
> NACK.
>
> This would cause a degradation when iterating the entries of table->icm.
> For example, in mlx4_table_get_range.
E.g.
int mlx4_table_get_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
                          u32 start, u32 end)
{
         int inc = MLX4_TABLE_CHUNK_SIZE / table->obj_size;
         int err;
         u32 i;

         for (i = start; i <= end; i += inc) {
                 err = mlx4_table_get(dev, table, i);
                 if (err)
                         goto fail;
         }

         return 0;
...
}

E.g. an mtt obj is 8 bytes, so a 4KB ICM block holds 512 mtt objects.
So you have to allocate more than 512 mtt objects before the table->icm
pointer increments by 1 to fetch the next pointer value.  With 512
8-byte pointers per 4KB page, 256K mtt objects are needed before the
table->icm pointer traverses a page boundary in the call stacks.
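
The arithmetic can be checked with a few lines of C. The helper names are made up, and the 4KB page and 8-byte pointer sizes are assumptions (a typical 64-bit system).

```c
#include <stdint.h>

/* Objects covered by one ICM chunk. */
static uint64_t objs_per_chunk(uint64_t chunk_bytes, uint64_t obj_bytes)
{
	return chunk_bytes / obj_bytes;
}

/* Objects covered before the table->icm pointer array crosses a page. */
static uint64_t objs_per_ptr_page(uint64_t page_bytes, uint64_t ptr_bytes,
				  uint64_t chunk_bytes, uint64_t obj_bytes)
{
	return (page_bytes / ptr_bytes) * objs_per_chunk(chunk_bytes, obj_bytes);
}
```

objs_per_chunk(4096, 8) gives the 512 objects per chunk, and objs_per_ptr_page(4096, 8, 4096, 8) gives 512 * 512 = 262144, i.e. the 256K figure.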

Considering mlx4_table_get_range() is only used in the control path, there
is no significant gain from using kvzalloc vs. vzalloc for table->icm.

Anyway, if a user makes sure the mlx4 driver is loaded very early and
doesn't remove and reload it afterwards, we should have enough (and not
be wasting) contiguous physical memory for the table->icm allocation. I
will use kvzalloc to replace vzalloc and send a V4 patch.

Thanks,
Qing


>
> Thanks,
> Tariq
>
>>> And you'll understand some people care about this.
>>>
>>> Strongly.
>>>
>>> Thanks.
>>>
>>


* YAaioRace (was Re: [PATCH 08/31] aio: implement IOCB_CMD_POLL)
From: Al Viro @ 2018-05-23  1:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avi Kivity, linux-aio, linux-fsdevel, netdev, linux-api,
	linux-kernel, Kent Overstreet, Christoph Hellwig
In-Reply-To: <20180523004904.GH30522@ZenIV.linux.org.uk>

On Wed, May 23, 2018 at 01:49:04AM +0100, Al Viro wrote:

> > Looks like we want to call ->ki_cancel() *BEFORE* removing from the list,
> > as well as doing fput() after aio_complete().  The same ordering, BTW, goes
> > for aio_read() et al.
> > 
> > Look:
> > CPU1:	io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
> > CPU2:	aio_rw_complete() on that iocb.  Since the sucker is not in the list
> > anymore, we do NOT spin on ->ctx_lock and proceed to free iocb
> > CPU1:	pass freed iocb to ->ki_cancel().  BOOM.
> 
> BTW, it seems that the mainline is vulnerable to this one.  I might be
> missing something, but...

It is, but with a different attack vector - io_cancel(2) won't do it (it
does not remove from the list at all), but io_destroy(2) bloody well will.

IMO, we need this in mainline; unless somebody has a problem with it, to
#fixes it goes:

fix io_destroy()/aio_complete() race

If io_destroy() gets to cancelling everything that can be cancelled and
gets to kiocb_cancel() calling the function the driver has left in ->ki_cancel,
it becomes vulnerable to a race with IO completion.  At that point req
is already taken off the list and aio_complete() does *NOT* spin until
we (in free_ioctx_users()) release ->ctx_lock.  As a result, it proceeds
to kiocb_free(), freeing req just as it gets passed to ->ki_cancel().

Fix is simple - remove from the list after the call of kiocb_cancel().  All
instances of ->ki_cancel() already have to cope with being called with
iocb still on the list - that's what happens in io_cancel(2).

Cc: stable@kernel.org
Fixes: 0460fef2a921 "aio: use cancellation list lazily"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/aio.c b/fs/aio.c
index 8061d9787e54..49f53516eef0 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -634,9 +634,8 @@ static void free_ioctx_users(struct percpu_ref *ref)
 	while (!list_empty(&ctx->active_reqs)) {
 		req = list_first_entry(&ctx->active_reqs,
 				       struct aio_kiocb, ki_list);
-
-		list_del_init(&req->ki_list);
 		kiocb_cancel(req);
+		list_del_init(&req->ki_list);
 	}

 	spin_unlock_irq(&ctx->ctx_lock);


* Re: [PATCH] e1000: check the return of pci_get_drvdata() in e1000_remove()
From: Stephen Hemminger @ 2018-05-23  1:47 UTC (permalink / raw)
  To: Bo Chen; +Cc: jeffrey.t.kirsher, davem, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20180523001743.8492-1-chenbo@pdx.edu>

On Tue, 22 May 2018 17:17:43 -0700
Bo Chen <chenbo@pdx.edu> wrote:

> This check on pci_get_drvdata() prevents potential invalid pointer dereferences,
> and is a common practice in *_remove() functions from other drivers, such as
> 'intel/e100.c', 'amd/pcnet32.c', 'realtek/8139too.c', and 'broadcom/tg3.c'.
> 
> Signed-off-by: Bo Chen <chenbo@pdx.edu>

Why check for something that can never be true?
You are creating dead code paths that can never be exercised.


* Re: [PATCH net-next] cxgb4: do L1 config when module is inserted
From: kbuild test robot @ 2018-05-23  1:47 UTC (permalink / raw)
  To: Ganesh Goudar
  Cc: kbuild-all, netdev, davem, nirranjan, indranil, venkatesh,
	Ganesh Goudar, Casey Leedom
In-Reply-To: <1526894143-4986-1-git-send-email-ganeshgr@chelsio.com>

[-- Attachment #1: Type: text/plain, Size: 4618 bytes --]

Hi Ganesh,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ganesh-Goudar/cxgb4-do-L1-config-when-module-is-inserted/20180523-085637
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All error/warnings (new ones prefixed by >>):

   drivers/scsi/csiostor/csio_hw.c: In function 'fwcaps16_to_caps32':
>> drivers/scsi/csiostor/csio_hw.c:1490:17: error: 'FW_PORT_CAP_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDI_S'?
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1490:17: note: each undeclared identifier is reported only once for each function it appears in
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
>> drivers/scsi/csiostor/csio_hw.c:1491:15: error: 'FW_PORT_CAP32_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDIX'?
        caps32 |= FW_PORT_CAP32_##__cap; \
                  ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~

vim +1490 drivers/scsi/csiostor/csio_hw.c

e1735d9a Varun Prakash 2018-03-11  1477  
e1735d9a Varun Prakash 2018-03-11  1478  /**
e1735d9a Varun Prakash 2018-03-11  1479   *      fwcaps16_to_caps32 - convert 16-bit Port Capabilities to 32-bits
e1735d9a Varun Prakash 2018-03-11  1480   *      @caps16: a 16-bit Port Capabilities value
e1735d9a Varun Prakash 2018-03-11  1481   *
e1735d9a Varun Prakash 2018-03-11  1482   *      Returns the equivalent 32-bit Port Capabilities value.
e1735d9a Varun Prakash 2018-03-11  1483   */
e1735d9a Varun Prakash 2018-03-11  1484  fw_port_cap32_t fwcaps16_to_caps32(fw_port_cap16_t caps16)
e1735d9a Varun Prakash 2018-03-11  1485  {
e1735d9a Varun Prakash 2018-03-11  1486  	fw_port_cap32_t caps32 = 0;
e1735d9a Varun Prakash 2018-03-11  1487  
e1735d9a Varun Prakash 2018-03-11  1488  	#define CAP16_TO_CAP32(__cap) \
e1735d9a Varun Prakash 2018-03-11  1489  		do { \
e1735d9a Varun Prakash 2018-03-11 @1490  			if (caps16 & FW_PORT_CAP_##__cap) \
e1735d9a Varun Prakash 2018-03-11 @1491  				caps32 |= FW_PORT_CAP32_##__cap; \
e1735d9a Varun Prakash 2018-03-11  1492  		} while (0)
e1735d9a Varun Prakash 2018-03-11  1493  
e1735d9a Varun Prakash 2018-03-11  1494  	CAP16_TO_CAP32(SPEED_100M);
e1735d9a Varun Prakash 2018-03-11  1495  	CAP16_TO_CAP32(SPEED_1G);
e1735d9a Varun Prakash 2018-03-11  1496  	CAP16_TO_CAP32(SPEED_25G);
e1735d9a Varun Prakash 2018-03-11  1497  	CAP16_TO_CAP32(SPEED_10G);
e1735d9a Varun Prakash 2018-03-11  1498  	CAP16_TO_CAP32(SPEED_40G);
e1735d9a Varun Prakash 2018-03-11  1499  	CAP16_TO_CAP32(SPEED_100G);
e1735d9a Varun Prakash 2018-03-11  1500  	CAP16_TO_CAP32(FC_RX);
e1735d9a Varun Prakash 2018-03-11  1501  	CAP16_TO_CAP32(FC_TX);
e1735d9a Varun Prakash 2018-03-11  1502  	CAP16_TO_CAP32(ANEG);
e1735d9a Varun Prakash 2018-03-11 @1503  	CAP16_TO_CAP32(MDIX);
e1735d9a Varun Prakash 2018-03-11  1504  	CAP16_TO_CAP32(MDIAUTO);
e1735d9a Varun Prakash 2018-03-11  1505  	CAP16_TO_CAP32(FEC_RS);
e1735d9a Varun Prakash 2018-03-11  1506  	CAP16_TO_CAP32(FEC_BASER_RS);
e1735d9a Varun Prakash 2018-03-11  1507  	CAP16_TO_CAP32(802_3_PAUSE);
e1735d9a Varun Prakash 2018-03-11  1508  	CAP16_TO_CAP32(802_3_ASM_DIR);
e1735d9a Varun Prakash 2018-03-11  1509  
e1735d9a Varun Prakash 2018-03-11  1510  	#undef CAP16_TO_CAP32
e1735d9a Varun Prakash 2018-03-11  1511  
e1735d9a Varun Prakash 2018-03-11  1512  	return caps32;
e1735d9a Varun Prakash 2018-03-11  1513  }
e1735d9a Varun Prakash 2018-03-11  1514  

:::::: The code at line 1490 was first introduced by commit
:::::: e1735d9a98ab5593484bbba1933e362a261e0de0 scsi: csiostor: add support for 32 bit port capabilities

:::::: TO: Varun Prakash <varun@chelsio.com>
:::::: CC: Martin K. Petersen <martin.petersen@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 53396 bytes --]

^ permalink raw reply

* Re: KASAN: use-after-free Read in remove_wait_queue (2)
From: Eric Biggers @ 2018-05-23  3:29 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: linux-ppp, Paul Mackerras, netdev, linux-kernel, syzkaller-bugs,
	syzbot, viro
In-Reply-To: <20180518160223.GF1534@alphalink.fr>

On Fri, May 18, 2018 at 06:02:23PM +0200, Guillaume Nault wrote:
> On Sun, May 13, 2018 at 11:11:55PM -0700, Eric Biggers wrote:
> > [+ppp list and maintainer]
> > 
> > This is a bug in ppp_generic.c; it still happens on Linus' tree and it's easily
> > reproducible, see program below.  The bug is that the PPPIOCDETACH ioctl doesn't
> > consider that the file can still be attached to epoll instances even when
> > ->f_count == 1.
> 
> Right. What would it take to remove the file for the epoll instances?
> Sorry for the naive question, but I'm not familiar with VFS and didn't
> find a helper function we could call.
> 

There is eventpoll_release_file(), but it's not exported to modules.  It might
work to call it, but it seems like a hack.

> > Also, the reproducer doesn't test this but I think ppp_poll(),
> > ppp_read(), and ppp_write() can all race with PPPIOCDETACH, causing
> > use-after-frees as well.
> 
> I also believe so. ppp_release() resets ->private_data, and even though
> functions like ppp_read() test ->private_data before executing, there's
> no synchronisation mechanism to ensure that the update is visible
> before the unit or channel is destroyed. Is that the kind of race you
> had in mind?

Yes, though after looking into it more I *think* these additional races aren't
actually possible, due to the 'f_count < 2' check.  These races could only
happen with a shared fd table, but in that case fdget() would increment f_count
for the duration of each operation, resulting in 'f_count >= 2' if both ioctl()
and something else is running on the same file concurrently.

Note that this also means PPPIOCDETACH doesn't work at all if called from a
multithreaded application...
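
The reasoning above can be sketched as a small userspace model. This is not kernel code: `model_file`, `model_fdget()` and the other names are hypothetical stand-ins, and the only behaviour modelled is that fdget() takes a file reference when the fd table is shared.

```c
#include <stdbool.h>

/* Toy stand-in for struct file; only the reference count is modelled. */
struct model_file {
    long f_count;
};

/*
 * Models fdget(): a reference is taken only when the fd table is shared
 * with another thread, because only then can the fd be closed concurrently.
 */
static void model_fdget(struct model_file *f, bool fd_table_shared)
{
    if (fd_table_shared)
        f->f_count++;
}

/* Models the PPPIOCDETACH test: proceed only if f_count < 2. */
static bool model_detach_allowed(const struct model_file *f)
{
    return f->f_count < 2;
}

/* Returns whether a detach issued through ioctl() would proceed. */
static bool model_ioctl_detach(bool fd_table_shared)
{
    struct model_file f = { .f_count = 1 };   /* one open fd */

    model_fdget(&f, fd_table_shared);         /* taken on ioctl() entry */
    return model_detach_allowed(&f);
}
```

In this model a single-threaded caller passes the check, while a caller with a shared fd table always sees a count of 2 and is refused, which matches both observations above.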

> 
> > Any chance that PPPIOCDETACH can simply be removed,
> > given that it's apparently been "deprecated" for 16 years?
> > Does anyone use it?
> 
> The only users I'm aware of are pppd versions older than ppp-2.4.2
> (released in November 2003). But even at that time, there were issues
> with PPPIOCDETACH as pppd didn't seem to react properly when this call
> failed. An Internet search on the "PPPIOCDETACH file->f_count=" kernel
> log string, or on the "Couldn't release PPP unit: Invalid argument"
> error message of pppd, returns several related bug reports.
> 
> Originally, PPPIOCDETACH never failed, but testing ->f_count was
> later introduced to fix crashes when the file descriptor had been
> duplicated. It seems that this was motivated by polling issues too.
> 
> Long story short, it looks like PPPIOCDETACH never has worked well
> and we have at least two more bugs to fix. Given how it has proven
> fragile, I wouldn't be surprised if there were even more lurking
> around. I'd say that it's probably safer to drop it than to add more
> workarounds and play whack-a-mole with those bugs.

IMO, if we can get away with removing it without any users noticing, that would
be much better than trying to fix it with a VFS-level hack, and probably missing
some cases.  I'll send a patch to get things started...

- Eric

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: do L1 config when module is inserted
From: kbuild test robot @ 2018-05-23  3:29 UTC (permalink / raw)
  To: Ganesh Goudar
  Cc: kbuild-all, netdev, davem, nirranjan, indranil, venkatesh,
	Ganesh Goudar, Casey Leedom
In-Reply-To: <1526894143-4986-1-git-send-email-ganeshgr@chelsio.com>

[-- Attachment #1: Type: text/plain, Size: 4438 bytes --]

Hi Ganesh,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ganesh-Goudar/cxgb4-do-L1-config-when-module-is-inserted/20180523-085637
config: x86_64-rhel (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/scsi/csiostor/csio_hw.c: In function 'fwcaps16_to_caps32':
>> drivers/scsi/csiostor/csio_hw.c:1490:17: error: 'FW_PORT_CAP_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDI_V'?
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1490:17: note: each undeclared identifier is reported only once for each function it appears in
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1491:15: error: 'FW_PORT_CAP32_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDIX'?
        caps32 |= FW_PORT_CAP32_##__cap; \
                  ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~

vim +1490 drivers/scsi/csiostor/csio_hw.c

e1735d9a Varun Prakash 2018-03-11  1477  
e1735d9a Varun Prakash 2018-03-11  1478  /**
e1735d9a Varun Prakash 2018-03-11  1479   *      fwcaps16_to_caps32 - convert 16-bit Port Capabilities to 32-bits
e1735d9a Varun Prakash 2018-03-11  1480   *      @caps16: a 16-bit Port Capabilities value
e1735d9a Varun Prakash 2018-03-11  1481   *
e1735d9a Varun Prakash 2018-03-11  1482   *      Returns the equivalent 32-bit Port Capabilities value.
e1735d9a Varun Prakash 2018-03-11  1483   */
e1735d9a Varun Prakash 2018-03-11  1484  fw_port_cap32_t fwcaps16_to_caps32(fw_port_cap16_t caps16)
e1735d9a Varun Prakash 2018-03-11  1485  {
e1735d9a Varun Prakash 2018-03-11  1486  	fw_port_cap32_t caps32 = 0;
e1735d9a Varun Prakash 2018-03-11  1487  
e1735d9a Varun Prakash 2018-03-11  1488  	#define CAP16_TO_CAP32(__cap) \
e1735d9a Varun Prakash 2018-03-11  1489  		do { \
e1735d9a Varun Prakash 2018-03-11 @1490  			if (caps16 & FW_PORT_CAP_##__cap) \
e1735d9a Varun Prakash 2018-03-11  1491  				caps32 |= FW_PORT_CAP32_##__cap; \
e1735d9a Varun Prakash 2018-03-11  1492  		} while (0)
e1735d9a Varun Prakash 2018-03-11  1493  
e1735d9a Varun Prakash 2018-03-11  1494  	CAP16_TO_CAP32(SPEED_100M);
e1735d9a Varun Prakash 2018-03-11  1495  	CAP16_TO_CAP32(SPEED_1G);
e1735d9a Varun Prakash 2018-03-11  1496  	CAP16_TO_CAP32(SPEED_25G);
e1735d9a Varun Prakash 2018-03-11  1497  	CAP16_TO_CAP32(SPEED_10G);
e1735d9a Varun Prakash 2018-03-11  1498  	CAP16_TO_CAP32(SPEED_40G);
e1735d9a Varun Prakash 2018-03-11  1499  	CAP16_TO_CAP32(SPEED_100G);
e1735d9a Varun Prakash 2018-03-11  1500  	CAP16_TO_CAP32(FC_RX);
e1735d9a Varun Prakash 2018-03-11  1501  	CAP16_TO_CAP32(FC_TX);
e1735d9a Varun Prakash 2018-03-11  1502  	CAP16_TO_CAP32(ANEG);
e1735d9a Varun Prakash 2018-03-11  1503  	CAP16_TO_CAP32(MDIX);
e1735d9a Varun Prakash 2018-03-11  1504  	CAP16_TO_CAP32(MDIAUTO);
e1735d9a Varun Prakash 2018-03-11  1505  	CAP16_TO_CAP32(FEC_RS);
e1735d9a Varun Prakash 2018-03-11  1506  	CAP16_TO_CAP32(FEC_BASER_RS);
e1735d9a Varun Prakash 2018-03-11  1507  	CAP16_TO_CAP32(802_3_PAUSE);
e1735d9a Varun Prakash 2018-03-11  1508  	CAP16_TO_CAP32(802_3_ASM_DIR);
e1735d9a Varun Prakash 2018-03-11  1509  
e1735d9a Varun Prakash 2018-03-11  1510  	#undef CAP16_TO_CAP32
e1735d9a Varun Prakash 2018-03-11  1511  
e1735d9a Varun Prakash 2018-03-11  1512  	return caps32;
e1735d9a Varun Prakash 2018-03-11  1513  }
e1735d9a Varun Prakash 2018-03-11  1514  
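
The errors above come from the way the macro builds identifier names by token pasting: every name passed to it must have a matching constant under both prefixes. A minimal sketch of the same pattern, using hypothetical `DEMO_` constants that are not the driver's real values:

```c
#include <stdint.h>

/* Hypothetical 16-bit and 32-bit capability bits, for illustration only. */
#define DEMO_CAP_ANEG      0x0001u
#define DEMO_CAP_MDIAUTO   0x0002u
#define DEMO_CAP32_ANEG    0x00010000u
#define DEMO_CAP32_MDIAUTO 0x00020000u

static uint32_t demo_caps16_to_caps32(uint16_t caps16)
{
    uint32_t caps32 = 0;

    /* ## pastes the argument onto both prefixes at preprocessing time. */
    #define DEMO_CAP16_TO_CAP32(cap) \
        do { \
            if (caps16 & DEMO_CAP_##cap) \
                caps32 |= DEMO_CAP32_##cap; \
        } while (0)

    DEMO_CAP16_TO_CAP32(ANEG);
    DEMO_CAP16_TO_CAP32(MDIAUTO);
    /* DEMO_CAP16_TO_CAP32(MDIX) would not compile: DEMO_CAP_MDIX is undefined. */

    #undef DEMO_CAP16_TO_CAP32

    return caps32;
}
```

Because the names only exist after preprocessing, searching the tree for the full identifier does not find the macro invocation; searching for the suffix (here `MDIX`) does.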

:::::: The code at line 1490 was first introduced by commit
:::::: e1735d9a98ab5593484bbba1933e362a261e0de0 scsi: csiostor: add support for 32 bit port capabilities

:::::: TO: Varun Prakash <varun@chelsio.com>
:::::: CC: Martin K. Petersen <martin.petersen@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 40459 bytes --]

^ permalink raw reply

* Re: [PATCH net-next v2 00/10] qed*: Add support for management firmware TLV request.
From: David Miller @ 2018-05-23  3:31 UTC (permalink / raw)
  To: sudarsana.kalluru; +Cc: netdev, Ariel.Elior, chad.dupuis, manish.rangankar
In-Reply-To: <20180522072846.2454-1-sudarsana.kalluru@cavium.com>

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>
Date: Tue, 22 May 2018 00:28:36 -0700

> From: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
> 
> Management firmware (MFW) requires config and state information from
> the driver. It queries this via TLV (type-length-value) request wherein
> mfw specifies the list of required TLVs. Driver fills the TLV data
> and responds back to MFW.
> This patch series adds qed/qede/qedf/qedi driver implementation for
> supporting the TLV queries from MFW.
> 
> Changes from previous versions:
> -------------------------------
> v2: Split patch (2) into multiple simpler patches.
> v2: Update qed_tlv_parsed_buf->p_val datatype to void pointer to avoid
>     bunch of unnecessary typecasts.
> 
> Please consider applying this series to "net-next".

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] e1000: check the return of pci_get_drvdata() in e1000_remove()
From: Bo Chen @ 2018-05-23  3:31 UTC (permalink / raw)
  Cc: netdev, linux-kernel
In-Reply-To: <20180522184702.64cd62f6@xeon-e3>

Re-send to mailing lists as the previous email was rejected because of
not using plain text.

-----

Hi Stephen,

Thanks for the quick reply and your comments. I am new to network
drivers, and certainly trust your assessment.

I was assuming 'pdev->dev.p' can be NULL with a 'kzalloc()' failure in
'pci_set_drvdata()' invoked by 'e1000_probe()', in which case the
'pci_get_drvdata()' will return a NULL pointer. But I did not trace
back to confirm whether 'pdev->dev.p' has to be valid before
'e1000_probe()' is invoked. Maybe this is what I am missing? Please
excuse my newbie questions and mistakes.

Here is some context about why I started this patch. I am building a
tool to perform "grey-box" fault injection on linux-kernel-module
binaries. My tool has a set of kernel API functions for which faults
can be generated, for example making '__kmalloc()' return NULL. My
tool fired an alarm (a kernel panic) when a fault on
'dev_get_drvdata()' was injected into 'e1000.ko' (other drivers, such
as e100.ko and pcnet32.ko, were fine). It seems that I
was wrong to assume 'dev_get_drvdata()' can return NULL and inject
faults from it. Do you have any suggestions about how I can avoid such
wrong assumptions?

Thanks again for your time and attention. I hope this patch is not
wasting too much effort from the netdev community. Any other comments
or suggestions will be appreciated.

Best Regards,
Bo Chen

On Tue, May 22, 2018 at 6:47 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 22 May 2018 17:17:43 -0700
> Bo Chen <chenbo@pdx.edu> wrote:
>
> > This check on pci_get_drvdata() prevents potential invalid pointer dereferences,
> > and is a common practice in *_remove() functions from other drivers, such as
> > 'intel/e100.c', 'amd/pcnet32.c', 'realtek/8139too.c', and 'broadcom/tg3.c'.
> >
> > Signed-off-by: Bo Chen <chenbo@pdx.edu>
>
> Why check for something that can never be true.
> You are creating dead code paths that can never be exercised.

^ permalink raw reply

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Y Song @ 2018-05-23  3:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <CAH3MdRWgruVq+3r+2pHTah-c2zTO03vPkepjWDZ0_KrYcroy9A@mail.gmail.com>

I did a quick prototype and the above interface seems to work fine.

The kernel change:
===============

[yhs@localhost bpf-next]$ git diff
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 97446bbe2ca5..669b7383fddb 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1976,7 +1976,8 @@ union bpf_attr {
        FN(fib_lookup),                 \
        FN(sock_hash_update),           \
        FN(msg_redirect_hash),          \
-       FN(sk_redirect_hash),
+       FN(sk_redirect_hash),           \
+       FN(get_current_cgroup_id),

 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ce2cbbff27e4..e11e3298f911 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -493,6 +493,21 @@ static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = {
        .arg2_type      = ARG_ANYTHING,
 };

+BPF_CALL_0(bpf_get_current_cgroup_id)
+{
+       struct cgroup *cgrp = task_dfl_cgroup(current);
+       if (!cgrp)
+               return -EINVAL;
+
+       return cgrp->kn->id.id;
+}
+
+static const struct bpf_func_proto bpf_get_current_cgroup_id_proto = {
+       .func           = bpf_get_current_cgroup_id,
+       .gpl_only       = false,
+       .ret_type       = RET_INTEGER,
+};
+
 BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
           const void *, unsafe_ptr)
 {
@@ -563,6 +578,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
                return &bpf_get_prandom_u32_proto;
        case BPF_FUNC_probe_read_str:
                return &bpf_probe_read_str_proto;
+       case BPF_FUNC_get_current_cgroup_id:
+               return &bpf_get_current_cgroup_id_proto;
        default:
                return NULL;
        }

The following program can be used to print out a cgroup id given a cgroup path.
[yhs@localhost cg]$ cat get_cgroup_id.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    int dirfd, err, flags, mount_id, fhsize;
    struct file_handle *fhp;
    char *pathname;

    if (argc != 2) {
        printf("usage: %s <cgroup_path>\n", argv[0]);
        return 1;
    }

    pathname = argv[1];
    dirfd = AT_FDCWD;
    flags = 0;

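    /* Probe with an empty handle; this call is expected to fail and
     * report the required size in fhp->handle_bytes. */
    fhsize = sizeof(*fhp);
    fhp = malloc(fhsize);
    if (!fhp)
        return 1;
    fhp->handle_bytes = 0;

    err = name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags);
    if (err >= 0) {
        printf("error\n");
        return 1;
    }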

    fhsize = sizeof(struct file_handle) + fhp->handle_bytes;
    fhp = realloc(fhp, fhsize);
    if (!fhp)
        return 1;

    err = name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags);
    if (err < 0)
        perror("name_to_handle_at");
    else {
        int i;

        printf("dir = %s, mount_id = %d\n", pathname, mount_id);
        printf("handle_bytes = %d, handle_type = %d\n", fhp->handle_bytes,
            fhp->handle_type);
        if (fhp->handle_bytes != 8)
            return 1;

        printf("cgroup_id = 0x%llx\n", *(unsigned long long *)fhp->f_handle);
    }

    return 0;
}
[yhs@localhost cg]$

Given a cgroup path, the user can get the cgroup_id and use it in their bpf
program for filtering purposes.
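
The program above reads the id by casting `f_handle` to `unsigned long long *`. A possible alternative is to copy the bytes out with memcpy(), which avoids any alignment or aliasing questions. The helper below is a hypothetical sketch (the name `demo_handle_to_id` is not from the patch) and assumes, as the program does, that the handle is exactly 8 bytes in host byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copies an 8-byte opaque handle into a 64-bit id.
 * Returns 0 on success, -1 if the handle is not exactly 8 bytes.
 */
static int demo_handle_to_id(const unsigned char *handle, unsigned int len,
                             uint64_t *id)
{
    if (len != sizeof(*id))
        return -1;
    memcpy(id, handle, sizeof(*id));   /* host byte order, as in the cast */
    return 0;
}
```

In the program above this would be called as `demo_handle_to_id(fhp->f_handle, fhp->handle_bytes, &id)`, folding the `handle_bytes != 8` check into the helper.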

I ran a simple program t.c
   int main() { while(1) sleep(1); return 0; }
in the cgroup v2 directory /home/yhs/tmp/yhs
   none on /home/yhs/tmp type cgroup2 (rw,relatime,seclabel)

$ ./get_cgroup_id /home/yhs/tmp/yhs
dir = /home/yhs/tmp/yhs, mount_id = 124
handle_bytes = 8, handle_type = 1
cgroup_id = 0x1000006b2

// the below command to get cgroup_id from the kernel for the
// process compiled with t.c and ran under /home/yhs/tmp/yhs:
$ sudo ./trace.py -p 4067 '__x64_sys_nanosleep "cgid = %llx", $cgid'
PID     TID     COMM            FUNC             -
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
^C[yhs@localhost tools]$

The kernel and user space cgids match. Will provide a
formal patch later.
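
For readers wondering how the 64-bit value relates to the inode number, here is a rough sketch. It rests on an assumption not stated in this thread: that `kn->id` is a union of a 64-bit `id` with a pair of 32-bit fields, `ino` first and `generation` second, so that on a little-endian machine the inode number is the low half. The function names are hypothetical.

```c
#include <stdint.h>

/* Low 32 bits of the id: the inode number on a little-endian machine. */
static uint32_t demo_cgid_ino(uint64_t id)
{
    return (uint32_t)(id & 0xffffffffu);
}

/* High 32 bits of the id: the generation number on a little-endian machine. */
static uint32_t demo_cgid_generation(uint64_t id)
{
    return (uint32_t)(id >> 32);
}
```

Under that assumption, the value 0x1000006b2 printed above would split into inode 0x6b2 and generation 1, which fits the point quoted below that the inode number alone does not identify a cgroup.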




On Mon, May 21, 2018 at 5:24 PM, Y Song <ys114321@gmail.com> wrote:
> On Mon, May 21, 2018 at 9:26 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>> On Sun, May 13, 2018 at 07:33:18PM +0200, Alban Crequy wrote:
>>>
>>> +BPF_CALL_2(bpf_get_current_cgroup_ino, u32, hierarchy, u64, flags)
>>> +{
>>> +     // TODO: pick the correct hierarchy instead of the mem controller
>>> +     struct cgroup *cgrp = task_cgroup(current, memory_cgrp_id);
>>> +
>>> +     if (unlikely(!cgrp))
>>> +             return -EINVAL;
>>> +     if (unlikely(hierarchy))
>>> +             return -EINVAL;
>>> +     if (unlikely(flags))
>>> +             return -EINVAL;
>>> +
>>> +     return cgrp->kn->id.ino;
>>
>> ino only is not enough to identify cgroup. It needs generation number too.
>> I don't quite see how hierarchy and flags can be used in the future.
>> Also why limit it to memcg?
>>
>> How about something like this instead:
>>
>> BPF_CALL_0(bpf_get_current_cgroup_id)
>> {
>>         struct cgroup *cgrp = task_dfl_cgroup(current);
>>
>>         return cgrp->kn->id.id;
>> }
>> The user space can use fhandle api to get the same 64-bit id.
>
> I think this should work. This will also be useful to bcc, as user
> space can encode the desired id in the bpf program and compare that id
> to the current cgroup id, so we can have cgroup level tracing (esp.
> stat collection) support. To cope with cgroup hierarchy, the user can
> use a cgroup-array based approach or explicitly compare against
> multiple cgroup ids.

^ permalink raw reply related

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Alexei Starovoitov @ 2018-05-23  3:35 UTC (permalink / raw)
  To: Y Song
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <CAH3MdRVdfw52atavT3KL8MpPw7zDM_hR6aUcqDP1PogLn_sH+w@mail.gmail.com>

On Tue, May 22, 2018 at 08:33:24PM -0700, Y Song wrote:
> +       struct cgroup *cgrp = task_dfl_cgroup(current);
> +       if (!cgrp)
> +               return -EINVAL;

Why is this check needed?

^ permalink raw reply

* [PATCH] ppp: remove the PPPIOCDETACH ioctl
From: Eric Biggers @ 2018-05-23  3:59 UTC (permalink / raw)
  To: linux-ppp, Paul Mackerras
  Cc: netdev, linux-fsdevel, linux-kernel, Guillaume Nault,
	syzkaller-bugs, Eric Biggers
In-Reply-To: <20180523032958.GE658@sol.localdomain>

From: Eric Biggers <ebiggers@google.com>

The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
before f_count has reached 0, which is fundamentally a bad idea.  It
does check 'f_count < 2', which excludes concurrent operations on the
file since they would only be possible with a shared fd table, in which
case each fdget() would take a file reference.  However, it fails to
account for the fact that even with 'f_count == 1' the file can still be
linked into epoll instances.  As reported by syzbot, this can trivially
be used to cause a use-after-free.

Yet, the only known user of PPPIOCDETACH is pppd versions older than
ppp-2.4.2, which was released almost 15 years ago (November 2003).
Also, PPPIOCDETACH apparently stopped working reliably at around the
same time, when the f_count check was added to the kernel, e.g. see
https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
check makes PPPIOCDETACH only work in single-threaded applications; it
always fails if called from a multithreaded application.

All pppd versions released in the last 15 years just close() the file
descriptor instead.

Therefore, instead of hacking around this bug by exporting epoll
internals to modules, and probably missing other related bugs, just
remove the PPPIOCDETACH ioctl and see if anyone actually notices.

Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 Documentation/networking/ppp_generic.txt |  6 -----
 drivers/net/ppp/ppp_generic.c            | 29 ------------------------
 fs/compat_ioctl.c                        |  1 -
 include/uapi/linux/ppp-ioctl.h           |  1 -
 4 files changed, 37 deletions(-)

diff --git a/Documentation/networking/ppp_generic.txt b/Documentation/networking/ppp_generic.txt
index 091d20273dcb..61daf4b39600 100644
--- a/Documentation/networking/ppp_generic.txt
+++ b/Documentation/networking/ppp_generic.txt
@@ -300,12 +300,6 @@ unattached instance are:
 The ioctl calls available on an instance of /dev/ppp attached to a
 channel are:
 
-* PPPIOCDETACH detaches the instance from the channel.  This ioctl is
-  deprecated since the same effect can be achieved by closing the
-  instance.  In order to prevent possible races this ioctl will fail
-  with an EINVAL error if more than one file descriptor refers to this
-  instance (i.e. as a result of dup(), dup2() or fork()).
-
 * PPPIOCCONNECT connects this channel to a PPP interface.  The
   argument should point to an int containing the interface unit
   number.  It will return an EINVAL error if the channel is already
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index dc7c7ec43202..dce8812fe802 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -603,35 +603,6 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		goto out;
 	}
 
-	if (cmd == PPPIOCDETACH) {
-		/*
-		 * We have to be careful here... if the file descriptor
-		 * has been dup'd, we could have another process in the
-		 * middle of a poll using the same file *, so we had
-		 * better not free the interface data structures -
-		 * instead we fail the ioctl.  Even in this case, we
-		 * shut down the interface if we are the owner of it.
-		 * Actually, we should get rid of PPPIOCDETACH, userland
-		 * (i.e. pppd) could achieve the same effect by closing
-		 * this fd and reopening /dev/ppp.
-		 */
-		err = -EINVAL;
-		if (pf->kind == INTERFACE) {
-			ppp = PF_TO_PPP(pf);
-			rtnl_lock();
-			if (file == ppp->owner)
-				unregister_netdevice(ppp->dev);
-			rtnl_unlock();
-		}
-		if (atomic_long_read(&file->f_count) < 2) {
-			ppp_release(NULL, file);
-			err = 0;
-		} else
-			pr_warn("PPPIOCDETACH file->f_count=%ld\n",
-				atomic_long_read(&file->f_count));
-		goto out;
-	}
-
 	if (pf->kind == CHANNEL) {
 		struct channel *pch;
 		struct ppp_channel *chan;
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index ef80085ed564..8285b570d635 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -917,7 +917,6 @@ COMPATIBLE_IOCTL(PPPIOCSDEBUG)
 /* PPPIOCGIDLE is translated */
 COMPATIBLE_IOCTL(PPPIOCNEWUNIT)
 COMPATIBLE_IOCTL(PPPIOCATTACH)
-COMPATIBLE_IOCTL(PPPIOCDETACH)
 COMPATIBLE_IOCTL(PPPIOCSMRRU)
 COMPATIBLE_IOCTL(PPPIOCCONNECT)
 COMPATIBLE_IOCTL(PPPIOCDISCONN)
diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
index b19a9c249b15..d46caf217ea4 100644
--- a/include/uapi/linux/ppp-ioctl.h
+++ b/include/uapi/linux/ppp-ioctl.h
@@ -106,7 +106,6 @@ struct pppol2tp_ioc_stats {
 #define PPPIOCGIDLE	_IOR('t', 63, struct ppp_idle) /* get idle time */
 #define PPPIOCNEWUNIT	_IOWR('t', 62, int)	/* create new ppp unit */
 #define PPPIOCATTACH	_IOW('t', 61, int)	/* attach to ppp unit */
-#define PPPIOCDETACH	_IOW('t', 60, int)	/* detach from ppp unit/chan */
 #define PPPIOCSMRRU	_IOW('t', 59, int)	/* set multilink MRU */
 #define PPPIOCCONNECT	_IOW('t', 58, int)	/* connect channel to unit */
 #define PPPIOCDISCONN	_IO('t', 57)		/* disconnect channel */
-- 
2.17.0
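
For userspace that still relies on the removed ioctl, the comment deleted above suggests closing the fd and reopening /dev/ppp instead. A minimal sketch of that replacement follows; `demo_close_and_reopen` is a hypothetical helper name, and the path is a parameter so the function can be tried on any device node.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Drops the current instance by closing it, then opens a fresh one.
 * Returns the new descriptor, or -1 with errno set by close() or open().
 */
static int demo_close_and_reopen(int fd, const char *path)
{
    if (close(fd) < 0)
        return -1;
    return open(path, O_RDWR);
}
```

A caller would use something like `fd = demo_close_and_reopen(fd, "/dev/ppp");` where it previously issued PPPIOCDETACH, and would then need to redo any attach or connect step on the new descriptor.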

^ permalink raw reply related

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Y Song @ 2018-05-23  4:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <20180523033550.z3tqo4lhd3zrmtdu@ast-mbp>

On Tue, May 22, 2018 at 8:35 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Tue, May 22, 2018 at 08:33:24PM -0700, Y Song wrote:
>> +       struct cgroup *cgrp = task_dfl_cgroup(current);
>> +       if (!cgrp)
>> +               return -EINVAL;
>
> Why is this check needed?

No reason :-) Originally I was concerned that cgrp could be NULL.
Looking at the code, it SEEMS to me that it cannot be NULL, but I am not
100% sure (as I am not a cgroup expert). Since you are asking, that
probably means it cannot be NULL, so I will remove the check in the
formal upstream patch.

^ permalink raw reply

* [PATCH] net: phy: replace bool members in struct phy_device with bit-fields
From: Heiner Kallweit @ 2018-05-23  6:05 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev@vger.kernel.org

In struct phy_device we have a number of flags being defined as type
bool. Similar to e.g. struct pci_dev we can save some space by using
bit-fields.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 include/linux/phy.h | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 073235e70..6cd090984 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -406,13 +406,17 @@ struct phy_device {
 	u32 phy_id;
 
 	struct phy_c45_device_ids c45_ids;
-	bool is_c45;
-	bool is_internal;
-	bool is_pseudo_fixed_link;
-	bool has_fixups;
-	bool suspended;
-	bool sysfs_links;
-	bool loopback_enabled;
+	unsigned is_c45:1;
+	unsigned is_internal:1;
+	unsigned is_pseudo_fixed_link:1;
+	unsigned has_fixups:1;
+	unsigned suspended:1;
+	unsigned sysfs_links:1;
+	unsigned loopback_enabled:1;
+
+	unsigned autoneg:1;
+	/* The most recently read link state */
+	unsigned link:1;
 
 	enum phy_state state;
 
@@ -429,9 +433,6 @@ struct phy_device {
 	int pause;
 	int asym_pause;
 
-	/* The most recently read link state */
-	int link;
-
 	/* Enabled Interrupts */
 	u32 interrupts;
 
@@ -444,8 +445,6 @@ struct phy_device {
 	/* Energy efficient ethernet modes which should be prohibited */
 	u32 eee_broken_modes;
 
-	int autoneg;
-
 	int link_timeout;
 
 #ifdef CONFIG_LED_TRIGGER_PHY
-- 
2.17.0
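
The space saving, and one behavioural difference of the conversion, can be illustrated with a cut-down sketch. The two structs below are hypothetical and much smaller than struct phy_device; exact sizes depend on the ABI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "before" layout: flags as bool and int members. */
struct demo_flags_plain {
    bool is_c45;
    bool is_internal;
    bool suspended;
    int autoneg;
    int link;
};

/* Hypothetical "after" layout: the same flags as 1-bit fields. */
struct demo_flags_packed {
    unsigned is_c45:1;
    unsigned is_internal:1;
    unsigned suspended:1;
    unsigned autoneg:1;
    unsigned link:1;
};

/* Stores v into the 1-bit link field and returns what is read back. */
static unsigned demo_store_link(unsigned v)
{
    struct demo_flags_packed f = { 0 };

    f.link = v;   /* only the lowest bit of v survives */
    return f.link;
}
```

On common ABIs the packed struct fits in a single unsigned int, while the plain one needs three bools plus two ints. The trade-off is that a 1-bit field keeps only the lowest bit of what is assigned, so code that stored values other than 0 or 1 into an `int` member such as `link` or `autoneg` would behave differently, and the address of a bit-field cannot be taken.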

^ permalink raw reply related

* [PATCH net-next] cxgb4: Add new T6 device ids
From: Ganesh Goudar @ 2018-05-23  6:06 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar

Add 0x6088 and 0x6089 device ids for new T6 cards.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
index adacc63..c7f8d04 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
@@ -212,6 +212,8 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN
 	CH_PCI_ID_TABLE_FENTRY(0x6085), /* Custom T6240-SO */
 	CH_PCI_ID_TABLE_FENTRY(0x6086), /* Custom T6225-SO-CR */
 	CH_PCI_ID_TABLE_FENTRY(0x6087), /* Custom T6225-CR */
+	CH_PCI_ID_TABLE_FENTRY(0x6088), /* Custom T62100-CR */
+	CH_PCI_ID_TABLE_FENTRY(0x6089), /* Custom T62100-KR */
 CH_PCI_DEVICE_ID_TABLE_DEFINE_END;
 
 #endif /* __T4_PCI_ID_TBL_H__ */
-- 
2.1.0

^ permalink raw reply related

* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Or Gerlitz @ 2018-05-23  6:15 UTC (permalink / raw)
  To: Huy Nguyen; +Cc: Linux Netdev List
In-Reply-To: <1576e986-6e04-63f3-3569-a105493929a6@mellanox.com>

On Wed, May 23, 2018 at 4:01 AM, Huy Nguyen <huyn@mellanox.com> wrote:
> Dear Jakub, PSB.
> On 5/22/2018 1:32 PM, Jakub Kicinski wrote:

>> Devlink API accommodates requirements of simpler (SwitchX2?) and more
>> advanced schemes (present in Spectrum).  The simpler/basic static
>> threshold configurations is exactly what you are doing here, AFAIU.

> [HQN] Devlink API is tailored specifically for switch. We don't configure
> threshold configuration explicitly. It is done via PFC. Once PFC is enabled
> on priority, threshold is setup based on our proprietary formula that was
> tested rigorously for performance.

Huy, please do not prefix your reply lines with your name. It's not needed
and is confusing; the email clients used by people on this list do the job.

^ permalink raw reply

* Re: [net-next] i40iw/i40e: Remove link dependency on i40e
From: Christoph Hellwig @ 2018-05-23  6:19 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jason Gunthorpe, davem, dledford, Sindhu Devale, netdev,
	linux-rdma, nhorman, sassmann, jogreene, Shiraz Saleem
In-Reply-To: <079ceee3bc8cd0ea50dd7ddc12b27512ca5ac49e.camel@intel.com>

On Tue, May 22, 2018 at 02:04:06PM -0700, Jeff Kirsher wrote:
> > Why would you want to do this? The rdma driver is non-functional
> > without the ethernet driver, so why on earth would we want to defeat
> > the module dependency mechanism?
> 
> This change is driven by the OSVs like Red Hat, where customers were
> updating the i40e driver, which in turn broke i40iw.

Doctor it hurts when I do this..

There is no reason to make a mess of our drivers because people are
doing things they should have never done and that aren't supported
in Linux.

If Intel didn't offer any out of tree drivers I'm pretty sure no
customer would even attempt this.  So fix this where the problem is.

^ permalink raw reply

* [PATCH net-next v3 0/7] Add support for QCA8334 switch
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac

This series adds support for the QCA8334 ethernet switch to the
qca8k driver. It is a four-port variant of the already supported
seven-port QCA8337. The register map is the same for the whole family,
and all chips have the same device ID.

The major part of this series enhances the CPU port setting. Currently the
CPU port is not set to any sensible defaults compatible with the xGMII
interface. This series forces the CPU port to its maximum bandwidth and
also allows the new defaults to be adjusted using a fixed-link device tree
sub-node.

Alongside these changes I fixed two checkpatch warnings regarding SPDX and
redundant parentheses.

Changes in v3:
 - Rebased on latest net-next/master.
 - Corrected fixed-link documentation.

Michal Vokáč (7):
  net: dsa: qca8k: Add QCA8334 binding documentation
  net: dsa: qca8k: Add support for QCA8334 switch
  net: dsa: qca8k: Enable RXMAC when bringing up a port
  net: dsa: qca8k: Force CPU port to its highest bandwidth
  net: dsa: qca8k: Allow overwriting CPU port setting
  net: dsa: qca8k: Replace GPL boilerplate by SPDX
  net: dsa: qca8k: Remove redundant parentheses

 .../devicetree/bindings/net/dsa/qca8k.txt          | 23 +++++++-
 drivers/net/dsa/qca8k.c                            | 64 ++++++++++++++++++----
 drivers/net/dsa/qca8k.h                            |  7 ++-
 3 files changed, 79 insertions(+), 15 deletions(-)

-- 
2.7.4

^ permalink raw reply
