Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] tun: use per cpu variables for stats accounting
From: Paolo Abeni @ 2016-04-15 12:30 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, mst, hannes, ebiederm, gkurz, jasowang
In-Reply-To: <20160414.225617.2128747969495613941.davem@davemloft.net>

On Thu, 2016-04-14 at 22:56 -0400, David Miller wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Wed, 13 Apr 2016 10:52:20 +0200
> 
> > Currently the tun device accounting uses dev->stats without applying any
> > kind of protection, regardless that accounting happens in preemptible
> > process context.
> > This patch move the tun stats to a per cpu data structure, and protect
> > the updates with  u64_stats_update_begin()/u64_stats_update_end() or
> > this_cpu_inc according to the stat type. The per cpu stats are
> > aggregated by the newly added ndo_get_stats64 ops.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
> Ok this seems reasonable, applied, thanks.
> 
> I guess most applications use tuntap by having two threads, one
> for transmit and one for receive processing?

Probably. I guess that an user space application can leverage multiple
transmit threads to improve the throughput.

Paolo

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Rob Herring @ 2016-04-15 12:35 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Vikram Sethi, Florian Fainelli, netdev,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
	linux-arm-msm, Sagar Dharia, Shanker Donthineni,
	Greg Kroah-Hartman, Christopher Covington, Gilad Avidov,
	Andrew Lunn, Bjorn Andersson, Mark Langsdorf, Jon Masters,
	Andy Gross, David S. Miller
In-Reply-To: <57102920.7000104@codeaurora.org>

On Thu, Apr 14, 2016 at 6:34 PM, Timur Tabi <timur@codeaurora.org> wrote:
> Vikram Sethi wrote:
>>>>
>>>> >>     retval = dma_coerce_mask_and_coherent(&pdev->dev,
>>>> >> DMA_BIT_MASK(64));
>>>> >>     if (retval) {
>>>> >>         dev_err(&pdev->dev, "failed to set DMA mask err %d\n",
>>>> >> retval);
>>>> >>         goto err_res;
>>>> >>     }
>>
>> How can you set the mask to 64 bits when the EMAC IP on FSM9900 and
>> QDF2432 can only do 32 bit DMA?
>> The mask in that API is a bit mask describing which bits of an address
>> your device supports.
>
>
> Vikram, Shanker, and I discussed this offline, and came to a consensus.
>
> The FSM9900 is a 32-bit platform, so the kernel will never create a DMA
> address above 4GB. Even if the driver sets the mask to 64 bits, it will
> technically work.  However, the mask should be set to 32 because all address
> buses are 32 bits.
>
> The QDF2432 is different.  Although it's an ARM64 platform, we have the
> unfortunate situation that only 32 bits of that address is wired to the rest
> of the chip.  So even though the Emac can handle 64-bit bus addresses, if it
> actually attempts to DMA above 4GB, the address will get truncated and
> corrupt memory.  The mask needs to be set to 32.
>
> There may or may not be other ARM64 chips from us that won't have this
> problem in the future, so these hypothetical chips would have a mask of 64.
>
> So I think the solution is to create a device tree (and ACPI) property that
> holds the mask.
>
>         dma-mask = <0 0xffffffff>;
>
> or
>
>         dma-mask = <0xffffffff 0xffffffff>;

No. See dma-ranges.

Rob

>
> The driver will then do this:
>
>         u64 dma_mask;
>         device_property_read_u64(&pdev->dev, "dma-mask", &dma_mask);
>         dma_coerce_mask_and_coherent(&pdev->dev, dma_mask);
>
> What I'm not sure yet is whether I should call
> dma_coerce_mask_and_coherent() or dma_set_coherent_mask().
>
> --
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
> Forum, a Linux Foundation collaborative project.

^ permalink raw reply

* Re: Deleting child qdisc doesn't reset parent to default qdisc?
From: Jamal Hadi Salim @ 2016-04-15 12:42 UTC (permalink / raw)
  To: Eric Dumazet, Jiri Kosina; +Cc: Phil Sutter, netdev, linux-kernel
In-Reply-To: <1460656170.10638.61.camel@edumazet-glaptop3.roam.corp.google.com>

On 16-04-14 01:49 PM, Eric Dumazet wrote:

> And what would be the chosen behavior ?
>

TBF is probably a bad example because it started life as
a classless qdisc. There was only one built-in fifo queue
that was shaped. Then someone made it classful and changed
this behavior. To me it sounds reasonable to have the
default behavior restored. At minimal consistency.

> Relying on TBF installing a bfifo for you at delete would be hazardous.
>
> For example CBQ got it differently than HFSC
>
> If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC
> falls back to noop_qdisc, without warning the user :(
>
> At least always using noop_qdisc is consistent. No magic there.
>
> Doing 'unification' right now would break existing scripts.
>
> This is too late, I am afraid.

Sigh. So rant:
IMO, we should let any new APIs and API updates stay longer
in discussion. Or better mark them as unstable for sometime.
The excuse that "it is out in the wild therefore cant be changed"
is harmful because the timeline is "forever" whereas
patches are applied after a short period of posting
and discussions and sometimes not involving the right people.
It is like having a jury issuing a death sentence after 1 week of
deliberation. You cant take it back after execution.

cheers,
jamal

^ permalink raw reply

* Re: [PATCH 00/28] Optimise page alloc/free fast paths v3
From: Jesper Dangaard Brouer @ 2016-04-15 12:44 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Vlastimil Babka, Linux-MM, LKML, brouer,
	netdev@vger.kernel.org
In-Reply-To: <1460710760-32601-1-git-send-email-mgorman@techsingularity.net>

On Fri, 15 Apr 2016 09:58:52 +0100
Mel Gorman <mgorman@techsingularity.net> wrote:

> There were no further responses to the last series but I kept going and
> added a few more small bits. Most are basic micro-optimisations.  The last
> two patches weaken debugging checks to improve performance at the cost of
> delayed detection of some use-after-free and memory corruption bugs. If
> they make people uncomfortable, they can be dropped and the rest of the
> series stands on its own.
> 
> Changelog since v2
> o Add more micro-optimisations
> o Weak debugging checks in favor of speed
> 
[...]
> 
> The overall impact on a page allocator microbenchmark for a range of orders

I also micro benchmarked this patchset.  Avail via Mel Gorman's kernel tree:
 http://git.kernel.org/cgit/linux/kernel/git/mel/linux.git
tested branch mm-vmscan-node-lru-v5r9 which also contain the node-lru series.

Tool:
 https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench01.c
Run as:
 modprobe page_bench01; rmmod page_bench01 ; dmesg | tail -n40 | grep 'alloc_pages order'

Results kernel 4.6.0-rc1 :

 alloc_pages order:0(4096B/x1) 272 cycles per-4096B 272 cycles
 alloc_pages order:1(8192B/x2) 395 cycles per-4096B 197 cycles
 alloc_pages order:2(16384B/x4) 433 cycles per-4096B 108 cycles
 alloc_pages order:3(32768B/x8) 503 cycles per-4096B 62 cycles
 alloc_pages order:4(65536B/x16) 682 cycles per-4096B 42 cycles
 alloc_pages order:5(131072B/x32) 910 cycles per-4096B 28 cycles
 alloc_pages order:6(262144B/x64) 1384 cycles per-4096B 21 cycles
 alloc_pages order:7(524288B/x128) 2335 cycles per-4096B 18 cycles
 alloc_pages order:8(1048576B/x256) 4108 cycles per-4096B 16 cycles
 alloc_pages order:9(2097152B/x512) 8398 cycles per-4096B 16 cycles

After Mel Gorman's optimizations, results from mm-vmscan-node-lru-v5r::

 alloc_pages order:0(4096B/x1) 231 cycles per-4096B 231 cycles
 alloc_pages order:1(8192B/x2) 351 cycles per-4096B 175 cycles
 alloc_pages order:2(16384B/x4) 357 cycles per-4096B 89 cycles
 alloc_pages order:3(32768B/x8) 397 cycles per-4096B 49 cycles
 alloc_pages order:4(65536B/x16) 481 cycles per-4096B 30 cycles
 alloc_pages order:5(131072B/x32) 652 cycles per-4096B 20 cycles
 alloc_pages order:6(262144B/x64) 1054 cycles per-4096B 16 cycles
 alloc_pages order:7(524288B/x128) 1852 cycles per-4096B 14 cycles
 alloc_pages order:8(1048576B/x256) 3156 cycles per-4096B 12 cycles
 alloc_pages order:9(2097152B/x512) 6790 cycles per-4096B 13 cycles



I've also started doing some parallel concurrency testing workloads[1]
 [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench03.c

Order-0 pages scale nicely:

Results kernel 4.6.0-rc1 :
 Parallel-CPUs:1 page order:0(4096B/x1) ave 274 cycles per-4096B 274 cycles
 Parallel-CPUs:2 page order:0(4096B/x1) ave 283 cycles per-4096B 283 cycles
 Parallel-CPUs:3 page order:0(4096B/x1) ave 284 cycles per-4096B 284 cycles
 Parallel-CPUs:4 page order:0(4096B/x1) ave 288 cycles per-4096B 288 cycles
 Parallel-CPUs:5 page order:0(4096B/x1) ave 417 cycles per-4096B 417 cycles
 Parallel-CPUs:6 page order:0(4096B/x1) ave 503 cycles per-4096B 503 cycles
 Parallel-CPUs:7 page order:0(4096B/x1) ave 567 cycles per-4096B 567 cycles
 Parallel-CPUs:8 page order:0(4096B/x1) ave 620 cycles per-4096B 620 cycles

And even better with you changes! :-))) This is great work!

Results from mm-vmscan-node-lru-v5r:
 Parallel-CPUs:1 page order:0(4096B/x1) ave 246 cycles per-4096B 246 cycles
 Parallel-CPUs:2 page order:0(4096B/x1) ave 251 cycles per-4096B 251 cycles
 Parallel-CPUs:3 page order:0(4096B/x1) ave 254 cycles per-4096B 254 cycles
 Parallel-CPUs:4 page order:0(4096B/x1) ave 258 cycles per-4096B 258 cycles
 Parallel-CPUs:5 page order:0(4096B/x1) ave 313 cycles per-4096B 313 cycles
 Parallel-CPUs:6 page order:0(4096B/x1) ave 369 cycles per-4096B 369 cycles
 Parallel-CPUs:7 page order:0(4096B/x1) ave 379 cycles per-4096B 379 cycles
 Parallel-CPUs:8 page order:0(4096B/x1) ave 399 cycles per-4096B 399 cycles


It does not seem that higher order page scale... and your patches does
not change this pattern.

Example order-3 pages, which is often used in the network stack:

Results kernel 4.6.0-rc1 ::
 Parallel-CPUs:1 page order:3(32768B/x8) ave 524 cycles per-4096B 65 cycles
 Parallel-CPUs:2 page order:3(32768B/x8) ave 2131 cycles per-4096B 266 cycles
 Parallel-CPUs:3 page order:3(32768B/x8) ave 3885 cycles per-4096B 485 cycles
 Parallel-CPUs:4 page order:3(32768B/x8) ave 4520 cycles per-4096B 565 cycles
 Parallel-CPUs:5 page order:3(32768B/x8) ave 5604 cycles per-4096B 700 cycles
 Parallel-CPUs:6 page order:3(32768B/x8) ave 7125 cycles per-4096B 890 cycles
 Parallel-CPUs:7 page order:3(32768B/x8) ave 7883 cycles per-4096B 985 cycles
 Parallel-CPUs:8 page order:3(32768B/x8) ave 9364 cycles per-4096B 1170 cycles

Results from mm-vmscan-node-lru-v5r:
 Parallel-CPUs:1 page order:3(32768B/x8) ave 421 cycles per-4096B 52 cycles
 Parallel-CPUs:2 page order:3(32768B/x8) ave 2236 cycles per-4096B 279 cycles
 Parallel-CPUs:3 page order:3(32768B/x8) ave 3408 cycles per-4096B 426 cycles
 Parallel-CPUs:4 page order:3(32768B/x8) ave 4687 cycles per-4096B 585 cycles
 Parallel-CPUs:5 page order:3(32768B/x8) ave 5972 cycles per-4096B 746 cycles
 Parallel-CPUs:6 page order:3(32768B/x8) ave 7349 cycles per-4096B 918 cycles
 Parallel-CPUs:7 page order:3(32768B/x8) ave 8436 cycles per-4096B 1054 cycles
 Parallel-CPUs:8 page order:3(32768B/x8) ave 9589 cycles per-4096B 1198 cycles

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench03.c

 for ORDER in $(seq 0 5) ; do \
    for X in $(seq 1 8) ; do \
       modprobe page_bench03 page_order=$ORDER parallel_cpus=$X run_flags=$((2#100)); \
       rmmod page_bench03 ; dmesg | tail -n 3 | grep Parallel-CPUs ; \
    done; \
 done

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 00/28] Optimise page alloc/free fast paths v3
From: Mel Gorman @ 2016-04-15 13:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Andrew Morton, Vlastimil Babka, Linux-MM, LKML,
	netdev@vger.kernel.org
In-Reply-To: <20160415144402.5fbe7a1e@redhat.com>

On Fri, Apr 15, 2016 at 02:44:02PM +0200, Jesper Dangaard Brouer wrote:
> On Fri, 15 Apr 2016 09:58:52 +0100
> Mel Gorman <mgorman@techsingularity.net> wrote:
> 
> > There were no further responses to the last series but I kept going and
> > added a few more small bits. Most are basic micro-optimisations.  The last
> > two patches weaken debugging checks to improve performance at the cost of
> > delayed detection of some use-after-free and memory corruption bugs. If
> > they make people uncomfortable, they can be dropped and the rest of the
> > series stands on its own.
> > 
> > Changelog since v2
> > o Add more micro-optimisations
> > o Weak debugging checks in favor of speed
> > 
> [...]
> > 
> > The overall impact on a page allocator microbenchmark for a range of orders
> 
> I also micro benchmarked this patchset.  Avail via Mel Gorman's kernel tree:
>  http://git.kernel.org/cgit/linux/kernel/git/mel/linux.git
> tested branch mm-vmscan-node-lru-v5r9 which also contain the node-lru series.
> 
> Tool:
>  https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench01.c
> Run as:
>  modprobe page_bench01; rmmod page_bench01 ; dmesg | tail -n40 | grep 'alloc_pages order'
> 

Thanks Jesper.

> Results kernel 4.6.0-rc1 :
> 
>  alloc_pages order:0(4096B/x1) 272 cycles per-4096B 272 cycles
>  alloc_pages order:1(8192B/x2) 395 cycles per-4096B 197 cycles
>  alloc_pages order:2(16384B/x4) 433 cycles per-4096B 108 cycles
>  alloc_pages order:3(32768B/x8) 503 cycles per-4096B 62 cycles
>  alloc_pages order:4(65536B/x16) 682 cycles per-4096B 42 cycles
>  alloc_pages order:5(131072B/x32) 910 cycles per-4096B 28 cycles
>  alloc_pages order:6(262144B/x64) 1384 cycles per-4096B 21 cycles
>  alloc_pages order:7(524288B/x128) 2335 cycles per-4096B 18 cycles
>  alloc_pages order:8(1048576B/x256) 4108 cycles per-4096B 16 cycles
>  alloc_pages order:9(2097152B/x512) 8398 cycles per-4096B 16 cycles
> 
> After Mel Gorman's optimizations, results from mm-vmscan-node-lru-v5r::
> 
>  alloc_pages order:0(4096B/x1) 231 cycles per-4096B 231 cycles
>  alloc_pages order:1(8192B/x2) 351 cycles per-4096B 175 cycles
>  alloc_pages order:2(16384B/x4) 357 cycles per-4096B 89 cycles
>  alloc_pages order:3(32768B/x8) 397 cycles per-4096B 49 cycles
>  alloc_pages order:4(65536B/x16) 481 cycles per-4096B 30 cycles
>  alloc_pages order:5(131072B/x32) 652 cycles per-4096B 20 cycles
>  alloc_pages order:6(262144B/x64) 1054 cycles per-4096B 16 cycles
>  alloc_pages order:7(524288B/x128) 1852 cycles per-4096B 14 cycles
>  alloc_pages order:8(1048576B/x256) 3156 cycles per-4096B 12 cycles
>  alloc_pages order:9(2097152B/x512) 6790 cycles per-4096B 13 cycles
> 

This is broadly in line with expectations. order-0 sees the biggest
boost because that's what the series focused on. High-order allocations
see some benefits but they're still going through the slower paths of
the allocator so it's less obvious.

I'm glad to see this independently verified.

> 
> I've also started doing some parallel concurrency testing workloads[1]
>  [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench03.c
> 
> Order-0 pages scale nicely:
> 
> Results kernel 4.6.0-rc1 :
>  Parallel-CPUs:1 page order:0(4096B/x1) ave 274 cycles per-4096B 274 cycles
>  Parallel-CPUs:2 page order:0(4096B/x1) ave 283 cycles per-4096B 283 cycles
>  Parallel-CPUs:3 page order:0(4096B/x1) ave 284 cycles per-4096B 284 cycles
>  Parallel-CPUs:4 page order:0(4096B/x1) ave 288 cycles per-4096B 288 cycles
>  Parallel-CPUs:5 page order:0(4096B/x1) ave 417 cycles per-4096B 417 cycles
>  Parallel-CPUs:6 page order:0(4096B/x1) ave 503 cycles per-4096B 503 cycles
>  Parallel-CPUs:7 page order:0(4096B/x1) ave 567 cycles per-4096B 567 cycles
>  Parallel-CPUs:8 page order:0(4096B/x1) ave 620 cycles per-4096B 620 cycles
> 
> And even better with you changes! :-))) This is great work!
> 
> Results from mm-vmscan-node-lru-v5r:
>  Parallel-CPUs:1 page order:0(4096B/x1) ave 246 cycles per-4096B 246 cycles
>  Parallel-CPUs:2 page order:0(4096B/x1) ave 251 cycles per-4096B 251 cycles
>  Parallel-CPUs:3 page order:0(4096B/x1) ave 254 cycles per-4096B 254 cycles
>  Parallel-CPUs:4 page order:0(4096B/x1) ave 258 cycles per-4096B 258 cycles
>  Parallel-CPUs:5 page order:0(4096B/x1) ave 313 cycles per-4096B 313 cycles
>  Parallel-CPUs:6 page order:0(4096B/x1) ave 369 cycles per-4096B 369 cycles
>  Parallel-CPUs:7 page order:0(4096B/x1) ave 379 cycles per-4096B 379 cycles
>  Parallel-CPUs:8 page order:0(4096B/x1) ave 399 cycles per-4096B 399 cycles
> 

Excellent, thanks!

> 
> It does not seem that higher order page scale... and your patches does
> not change this pattern.
> 
> Example order-3 pages, which is often used in the network stack:
> 

Unfortunately, this lack of scaling is expected. All the high-order
allocations bypass the per-cpu allocator so multiple parallel requests
will contend on the zone->lock. Technically, the per-cpu allocator could
handle high-order pages but failures would require IPIs to drain the
remote lists and the memory footprint would be high. Whatever about the
memory footprint, sending IPIs on every allocation failure is going to
cause undesirable latency spikes.

The original design of the per-cpu allocator assumed that high-order
allocations were rare. This expectation is partially violated by SLUB
using high-order pages, the network layer using compound pages and also
by the test case unfortunately.

I'll put some thought into how it could be improved on the flight over to
LSF/MM but right now, I'm not very optimistic that a solution will be simple.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* [patch net-next 0/2] mlxsw: spectrum_buffers: couple of cosmetic patches
From: Jiri Pirko @ 2016-04-15 13:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, David.Laight

From: Jiri Pirko <jiri@mellanox.com>

As suggested by David Laight

Jiri Pirko (2):
  mlxsw: spectrum_buffers: Use designated initializers for mlxsw_sp_pbs
  mlxsw: spectrum_buffers: Use MLXSW_SP_PB_UNUSED define for unused pb

 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

-- 
2.5.5

^ permalink raw reply

* [patch net-next 1/2] mlxsw: spectrum_buffers: Use designated initializers for mlxsw_sp_pbs
From: Jiri Pirko @ 2016-04-15 13:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, David.Laight
In-Reply-To: <1460725778-14655-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index f2e073a..64166dd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -162,16 +162,8 @@ static int mlxsw_sp_sb_pm_occ_query(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 }
 
 static const u16 mlxsw_sp_pbs[] = {
-	2 * MLXSW_SP_BYTES_TO_CELLS(ETH_FRAME_LEN),
-	0,
-	0,
-	0,
-	0,
-	0,
-	0,
-	0,
-	0, /* Unused */
-	2 * MLXSW_SP_BYTES_TO_CELLS(MLXSW_PORT_MAX_MTU),
+	[0] = 2 * MLXSW_SP_BYTES_TO_CELLS(ETH_FRAME_LEN),
+	[9] = 2 * MLXSW_SP_BYTES_TO_CELLS(MLXSW_PORT_MAX_MTU),
 };
 
 #define MLXSW_SP_PBS_LEN ARRAY_SIZE(mlxsw_sp_pbs)
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 2/2] mlxsw: spectrum_buffers: Use MLXSW_SP_PB_UNUSED define for unused pb
From: Jiri Pirko @ 2016-04-15 13:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, David.Laight
In-Reply-To: <1460725778-14655-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 64166dd..a3720a0 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -167,6 +167,7 @@ static const u16 mlxsw_sp_pbs[] = {
 };
 
 #define MLXSW_SP_PBS_LEN ARRAY_SIZE(mlxsw_sp_pbs)
+#define MLXSW_SP_PB_UNUSED 8
 
 static int mlxsw_sp_port_pb_init(struct mlxsw_sp_port *mlxsw_sp_port)
 {
@@ -176,7 +177,7 @@ static int mlxsw_sp_port_pb_init(struct mlxsw_sp_port *mlxsw_sp_port)
 	mlxsw_reg_pbmc_pack(pbmc_pl, mlxsw_sp_port->local_port,
 			    0xffff, 0xffff / 2);
 	for (i = 0; i < MLXSW_SP_PBS_LEN; i++) {
-		if (i == 8)
+		if (i == MLXSW_SP_PB_UNUSED)
 			continue;
 		mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl, i, mlxsw_sp_pbs[i]);
 	}
-- 
2.5.5

^ permalink raw reply related

* Re: Fwd: Re: Section 4 No. 9,10 Failed was occurred by IPv6 Ready Logo Conformance Test
From: Hagen Paul Pfeifer @ 2016-04-15 13:59 UTC (permalink / raw)
  To: hannes, rongqing.li, netdev, Yuki Machida, davem
In-Reply-To: <5710AAA3.2020703@jp.fujitsu.com>

> On April 15, 2016 at 10:47 AM Yuki Machida <machida.yuki@jp.fujitsu.com> wrote:
> 
> >> commit 9d289715eb5c252ae15bd547cb252ca547a3c4f2
> >> Author: Hagen Paul Pfeifer <hagen@jauu.net>
> >> Date: Thu Jan 15 22:34:25 2015 +0100
> >>
> >>       ipv6: stop sending PTB packets for MTU < 1280
> >>
> >>       Reduce the attack vector and stop generating IPv6 Fragment Header for
> >>       paths with an MTU smaller than the minimum required IPv6 MTU
> >>       size (1280 byte) - called atomic fragments.
> >>
> >>       See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1]
> >>       for more information and how this "feature" can be misused.
> >>
> >>       [1]
> >> https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00
> >>
> >>       Signed-off-by: Fernando Gont <fgont@si6networks.com>
> >>       Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
> >>       Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >>       Signed-off-by: David S. Miller <davem@davemloft.net>
> > 
> > I will try.
> 
> I confirmed that v4.1.20 revert above patch is passed Section 4 No. 9 and 10 testcases
> in IPv6 Ready Logo Conformance Test.
> I can't immediately revert above patch from v4.6-rc1 by implementation has changed.

is it to please a conforming test tool or fix "revert 9d289715eb5c2" a real problem? If so: which problem do you have with 9d289715eb5c2 or draft-ietf-6man-deprecate-atomfrag-generation-06?


Hagen

^ permalink raw reply

* Re: [PATCH v2 01/15] wcn36xx: Clean up wcn36xx_smd_send_beacon
From: Kalle Valo @ 2016-04-15 14:25 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Eugene Krasnikov, Pontus Fuchs, wcn36xx, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <20160414005935.GO391@tuxbot>

Bjorn Andersson <bjorn.andersson@linaro.org> writes:

> On Sun 03 Apr 15:16 PDT 2016, Bjorn Andersson wrote:
>
>> From: Pontus Fuchs <pontus.fuchs@gmail.com>
>> 
>> Needed for coming improvements. No functional changes.
>> 
>
> Kalle, Eugene,
>
> Have you picked up these patches yet?
>
> As I was debugging a firmware crash when trying to start hostap on the
> DragonBoard410c I found an issue with this patch, would like to know if
> I should send an incremental patch or resend this one.

I haven't applied these yet, so please resend the whole series as v3.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 2/2] net-ath9k_htc: Replace a variable initialisation by an assignment in ath9k_htc_set_channel()
From: Julian Calaby @ 2016-04-15 14:34 UTC (permalink / raw)
  To: Kalle Valo
  Cc: ath9k-devel, linux-wireless, netdev, QCA ath9k Development, LKML,
	SF Markus Elfring, kernel-janitors, Julia Lawall
In-Reply-To: <87ega7qdjg.fsf@kamboji.qca.qualcomm.com>

Hi Kalle,

On Fri, Apr 15, 2016 at 10:09 PM, Kalle Valo <kvalo@codeaurora.org> wrote:
> Julian Calaby <julian.calaby@gmail.com> writes:
>
>> Hi Kalle,
>>
>> On Sat, Jan 2, 2016 at 5:25 AM, SF Markus Elfring
>> <elfring@users.sourceforge.net> wrote:
>>> From: Markus Elfring <elfring@users.sourceforge.net>
>>> Date: Fri, 1 Jan 2016 19:09:32 +0100
>>>
>>> Replace an explicit initialisation for one local variable at the beginning
>>> by a conditional assignment.
>>>
>>> Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
>>
>> This looks sane to me.
>>
>> Reviewed-by: Julian Calaby <julian.calaby@gmail.com>
>
> Before I commit I'll just change the commit title to:
>
> ath9k_htc: Replace a variable initialisation by an assignment in ath9k_htc_set_channel()

Sounds good to me.

Thanks,

-- 
Julian Calaby

Email: julian.calaby@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

^ permalink raw reply

* Re: Deleting child qdisc doesn't reset parent to default qdisc?
From: Eric Dumazet @ 2016-04-15 14:58 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: Jiri Kosina, Phil Sutter, netdev, linux-kernel
In-Reply-To: <5710E1C1.2090209@mojatatu.com>

On Fri, 2016-04-15 at 08:42 -0400, Jamal Hadi Salim wrote:
> On 16-04-14 01:49 PM, Eric Dumazet wrote:
> 
> > And what would be the chosen behavior ?
> >
> 
> TBF is probably a bad example because it started life as
> a classless qdisc. There was only one built-in fifo queue
> that was shaped. Then someone made it classful and changed
> this behavior. To me it sounds reasonable to have the
> default behavior restored. At minimal consistency.

Then you need to save the initial qdisc (bfifo for TBF) in a special
place, to make sure the delete operation is guaranteed to succeed.

Or fail the delete if the bfifo can not be allocated.

I can tell that determinism if far more interesting than usability for
some users occasionally playing with tc.

Surely the silent fallback to noop_qdisc is wrong.

Anyway, we probably need to improve our ability to understand qdisc
hierarchies. Having some hidden qdiscs is the real problem here.

We need to add some hash table so that qdisc_match_from_root() does not
have to scan hundred of qdiscs.

^ permalink raw reply

* RE: [patch net-next 05/18] mlxsw: spectrum_buffers: Push out indexes and direction out of SB structs
From: David Laight @ 2016-04-15 15:16 UTC (permalink / raw)
  To: 'Jiri Pirko'
  Cc: netdev@vger.kernel.org, davem@davemloft.net, idosch@mellanox.com,
	eladr@mellanox.com, yotamg@mellanox.com, ogerlitz@mellanox.com,
	roopa@cumulusnetworks.com, nikolay@cumulusnetworks.com,
	jhs@mojatatu.com, john.fastabend@gmail.com, rami.rosen@intel.com,
	gospo@cumulusnetworks.com, stephen@networkplumber.org,
	sfeldma@gmail.com
In-Reply-To: <20160415085259.GB1911@nanopsycho.orion>

From: Jiri Pirko
> Sent: 15 April 2016 09:53
..
> >> @@ -106,10 +96,9 @@ static int mlxsw_sp_port_pb_init(struct mlxsw_sp_port *mlxsw_sp_port)
> >>  	mlxsw_reg_pbmc_pack(pbmc_pl, mlxsw_sp_port->local_port,
> >>  			    0xffff, 0xffff / 2);
> >>  	for (i = 0; i < MLXSW_SP_PBS_LEN; i++) {
> >
> >I'd rather see an explicit ARRAY_COUNT(mlxsw_sp_pbs) than some 'randon' constant.
> 
> See "#define MLXSW_SP_PBS_LEN ARRAY_SIZE(mlxsw_sp_pbs)"

Imagine I'm reading the code quickly.
If the for() loop uses ARRAY_COUNT(mlxsw_sp_pbs) then (provided I grok ARRAY_COUNT())
I know that the array bounds are honoured.
When it uses MLXSW_SP_PBS_LEN I have to search for the definition in order
to check that it is the correct value/constant for that array.

	David

^ permalink raw reply

* Re: [PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
From: Alexander Duyck @ 2016-04-15 15:39 UTC (permalink / raw)
  To: KY Srinivasan
  Cc: David Miller, Netdev, linux-kernel@vger.kernel.org,
	devel@linuxdriverproject.org, olaf@aepfle.de, Robo Bot,
	Jason Wang, eli@mellanox.com, jackm@mellanox.com,
	yevgenyp@mellanox.com, John Ronciak, intel-wired-lan
In-Reply-To: <SN2PR03MB19191AD0BEAEAF61499157D2A0680@SN2PR03MB1919.namprd03.prod.outlook.com>

On Thu, Apr 14, 2016 at 7:49 PM, KY Srinivasan <kys@microsoft.com> wrote:
>
>
>> -----Original Message-----
>> From: Alexander Duyck [mailto:alexander.duyck@gmail.com]
>> Sent: Thursday, April 14, 2016 4:18 PM
>> To: KY Srinivasan <kys@microsoft.com>
>> Cc: David Miller <davem@davemloft.net>; Netdev
>> <netdev@vger.kernel.org>; linux-kernel@vger.kernel.org;
>> devel@linuxdriverproject.org; olaf@aepfle.de; Robo Bot
>> <apw@canonical.com>; Jason Wang <jasowang@redhat.com>;
>> eli@mellanox.com; jackm@mellanox.com; yevgenyp@mellanox.com; John
>> Ronciak <john.ronciak@intel.com>; intel-wired-lan <intel-wired-
>> lan@lists.osuosl.org>
>> Subject: Re: [PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts
>> (Hyper-V)
>>
>> On Thu, Apr 14, 2016 at 4:55 PM, K. Y. Srinivasan <kys@microsoft.com>
>> wrote:
>> > On Hyper-V, the VF/PF communication is a via software mediated path
>> > as opposed to the hardware mailbox. Make the necessary
>> > adjustments to support Hyper-V.
>> >
>> > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
>> > ---
>> >  drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |   11 ++
>> >  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |   56 ++++++---
>> >  drivers/net/ethernet/intel/ixgbevf/mbx.c          |   12 ++
>> >  drivers/net/ethernet/intel/ixgbevf/vf.c           |  138
>> +++++++++++++++++++++
>> >  drivers/net/ethernet/intel/ixgbevf/vf.h           |    2 +
>> >  5 files changed, 201 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> > index 5ac60ee..f8d2a0b 100644
>> > --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> > +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> > @@ -460,9 +460,13 @@ enum ixbgevf_state_t {
>> >
>> >  enum ixgbevf_boards {
>> >         board_82599_vf,
>> > +       board_82599_vf_hv,
>> >         board_X540_vf,
>> > +       board_X540_vf_hv,
>> >         board_X550_vf,
>> > +       board_X550_vf_hv,
>> >         board_X550EM_x_vf,
>> > +       board_X550EM_x_vf_hv,
>> >  };
>> >
>> >  enum ixgbevf_xcast_modes {
>> > @@ -477,6 +481,13 @@ extern const struct ixgbevf_info
>> ixgbevf_X550_vf_info;
>> >  extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info;
>> >  extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops;
>> >
>> > +
>> > +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info;
>> > +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info;
>> > +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info;
>> > +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info;
>> > +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops;
>> > +
>> >  /* needed by ethtool.c */
>> >  extern const char ixgbevf_driver_name[];
>> >  extern const char ixgbevf_driver_version[];
>> > diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> > index 007cbe0..4a0ffac 100644
>> > --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> > +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> > @@ -49,6 +49,7 @@
>> >  #include <linux/if.h>
>> >  #include <linux/if_vlan.h>
>> >  #include <linux/prefetch.h>
>> > +#include <asm/mshyperv.h>
>> >
>> >  #include "ixgbevf.h"
>> >
>> > @@ -62,10 +63,14 @@ static char ixgbevf_copyright[] =
>> >         "Copyright (c) 2009 - 2015 Intel Corporation.";
>> >
>> >  static const struct ixgbevf_info *ixgbevf_info_tbl[] = {
>> > -       [board_82599_vf] = &ixgbevf_82599_vf_info,
>> > -       [board_X540_vf]  = &ixgbevf_X540_vf_info,
>> > -       [board_X550_vf]  = &ixgbevf_X550_vf_info,
>> > -       [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info,
>> > +       [board_82599_vf]        = &ixgbevf_82599_vf_info,
>> > +       [board_82599_vf_hv]     = &ixgbevf_82599_vf_hv_info,
>> > +       [board_X540_vf]         = &ixgbevf_X540_vf_info,
>> > +       [board_X540_vf_hv]      = &ixgbevf_X540_vf_hv_info,
>> > +       [board_X550_vf]         = &ixgbevf_X550_vf_info,
>> > +       [board_X550_vf_hv]      = &ixgbevf_X550_vf_hv_info,
>> > +       [board_X550EM_x_vf]     = &ixgbevf_X550EM_x_vf_info,
>> > +       [board_X550EM_x_vf_hv]  = &ixgbevf_X550EM_x_vf_hv_info,
>> >  };
>> >
>> >  /* ixgbevf_pci_tbl - PCI Device ID Table
>> > @@ -78,9 +83,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] =
>> {
>> >   */
>> >  static const struct pci_device_id ixgbevf_pci_tbl[] = {
>> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf },
>> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV),
>> board_82599_vf_hv },
>> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf },
>> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV),
>> board_X540_vf_hv },
>> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf },
>> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV),
>> board_X550_vf_hv },
>> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF),
>> board_X550EM_x_vf },
>> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV),
>> board_X550EM_x_vf_hv},
>> >         /* required last entry */
>> >         {0, }
>> >  };
>> > @@ -1809,12 +1818,13 @@ static int ixgbevf_vlan_rx_add_vid(struct
>> net_device *netdev,
>> >  {
>> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
>> >         struct ixgbe_hw *hw = &adapter->hw;
>> > -       int err;
>> > +       int err = 0;
>> >
>> >         spin_lock_bh(&adapter->mbx_lock);
>> >
>> >         /* add VID to filter table */
>> > -       err = hw->mac.ops.set_vfta(hw, vid, 0, true);
>> > +       if (hw->mac.ops.set_vfta)
>> > +               err = hw->mac.ops.set_vfta(hw, vid, 0, true);
>> >
>> >         spin_unlock_bh(&adapter->mbx_lock);
>> >
>> > @@ -1835,12 +1845,13 @@ static int ixgbevf_vlan_rx_kill_vid(struct
>> net_device *netdev,
>> >  {
>> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
>> >         struct ixgbe_hw *hw = &adapter->hw;
>> > -       int err;
>> > +       int err = 0;
>> >
>> >         spin_lock_bh(&adapter->mbx_lock);
>> >
>> >         /* remove VID from filter table */
>> > -       err = hw->mac.ops.set_vfta(hw, vid, 0, false);
>> > +       if (hw->mac.ops.set_vfta)
>> > +               err = hw->mac.ops.set_vfta(hw, vid, 0, false);
>> >
>> >         spin_unlock_bh(&adapter->mbx_lock);
>> >
>> > @@ -1873,14 +1884,16 @@ static int ixgbevf_write_uc_addr_list(struct
>> net_device *netdev)
>> >                 struct netdev_hw_addr *ha;
>> >
>> >                 netdev_for_each_uc_addr(ha, netdev) {
>> > -                       hw->mac.ops.set_uc_addr(hw, ++count, ha->addr);
>> > +                       if (hw->mac.ops.set_uc_addr)
>> > +                               hw->mac.ops.set_uc_addr(hw, ++count, ha->addr);
>> >                         udelay(200);
>> >                 }
>> >         } else {
>> >                 /* If the list is empty then send message to PF driver to
>> >                  * clear all MAC VLANs on this VF.
>> >                  */
>> > -               hw->mac.ops.set_uc_addr(hw, 0, NULL);
>> > +               if (hw->mac.ops.set_uc_addr)
>> > +                       hw->mac.ops.set_uc_addr(hw, 0, NULL);
>> >         }
>> >
>> >         return count;
>>
>> So if I am understanding this correctly it looks like you cannot read
>> or write any addresses for this device.  Would I be correct in
>> assuming that you get to have the one unicast address that is provided
>> by the PF and that is it?
>
> That is what I have been told by the Windows folks at Intel.

Yeah, so I didn't see the other half of this patchset that has you
using a software interface for the multicast and broadcast traffic.
What I would recommend doing is just writing up a stub function you
can put in vf.c that will allow you to avoid having to add all these
if checks to the code.

>> If so we may want to look at possibly returning some sort of error on
>> these calls so that we can configure the device to indicate that we
>> cannot support any of these filters.
>
> I will do that. So, I am going to check the device ID and return an error.
> Would IXGBE_NOT_IMPLEMENTED be appropriate?

I'd say don't bother returning an error.  You can probably just stub
out the function since if I recall correctly it is a void function
anyway and doesn't provide a return value.

>>
>> > @@ -1908,10 +1921,13 @@ static void ixgbevf_set_rx_mode(struct
>> net_device *netdev)
>> >
>> >         spin_lock_bh(&adapter->mbx_lock);
>> >
>> > -       hw->mac.ops.update_xcast_mode(hw, netdev, xcast_mode);
>> > +       if (hw->mac.ops.update_mc_addr_list)
>> > +               if (hw->mac.ops.update_xcast_mode)
>> > +                       hw->mac.ops.update_xcast_mode(hw, netdev, xcast_mode);
>> >
>> >         /* reprogram multicast list */
>> > -       hw->mac.ops.update_mc_addr_list(hw, netdev);
>> > +       if (hw->mac.ops.update_mc_addr_list)
>> > +               hw->mac.ops.update_mc_addr_list(hw, netdev);
>> >
>> >         ixgbevf_write_uc_addr_list(netdev);
>> >

Same thing for the multicast list as well.

>> > @@ -2074,10 +2090,13 @@ static void ixgbevf_up_complete(struct
>> ixgbevf_adapter *adapter)
>> >
>> >         spin_lock_bh(&adapter->mbx_lock);
>> >
>> > -       if (is_valid_ether_addr(hw->mac.addr))
>> > -               hw->mac.ops.set_rar(hw, 0, hw->mac.addr, 0);
>> > -       else
>> > -               hw->mac.ops.set_rar(hw, 0, hw->mac.perm_addr, 0);
>> > +       if (is_valid_ether_addr(hw->mac.addr)) {
>> > +               if (hw->mac.ops.set_rar)
>> > +                       hw->mac.ops.set_rar(hw, 0, hw->mac.addr, 0);
>> > +       } else {
>> > +               if (hw->mac.ops.set_rar)
>> > +                       hw->mac.ops.set_rar(hw, 0, hw->mac.perm_addr, 0);
>> > +       }
>> >
>> >         spin_unlock_bh(&adapter->mbx_lock);
>> >
>>
>> Same here.  We shouldn't let the user set a MAC address that we cannot
>> support.  We should be returning an error.
>
> Yes; I will return an error.

This is the one that needs to return an error.  That way we should be
able to prevent MAC address changes.

>>
>> > @@ -3672,14 +3691,15 @@ static int ixgbevf_set_mac(struct net_device
>> *netdev, void *p)
>> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
>> >         struct ixgbe_hw *hw = &adapter->hw;
>> >         struct sockaddr *addr = p;
>> > -       int err;
>> > +       int err = 0;
>> >
>> >         if (!is_valid_ether_addr(addr->sa_data))
>> >                 return -EADDRNOTAVAIL;
>> >
>> >         spin_lock_bh(&adapter->mbx_lock);
>> >
>> > -       err = hw->mac.ops.set_rar(hw, 0, addr->sa_data, 0);
>> > +       if (hw->mac.ops.set_rar)
>> > +               err = hw->mac.ops.set_rar(hw, 0, addr->sa_data, 0);
>> >
>> >         spin_unlock_bh(&adapter->mbx_lock);
>> >
>>
>> Specifically here.  If hw->mac.ops.set_rar is NULL we cannot allow any
>> MAC address change so we should probably return "-EADDRNOTAVAIL".
>
> Will do.
>>
>> > diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.c
>> b/drivers/net/ethernet/intel/ixgbevf/mbx.c
>> > index dc68fea..298a0da 100644
>> > --- a/drivers/net/ethernet/intel/ixgbevf/mbx.c
>> > +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.c
>> > @@ -346,3 +346,15 @@ const struct ixgbe_mbx_operations
>> ixgbevf_mbx_ops = {
>> >         .check_for_rst  = ixgbevf_check_for_rst_vf,
>> >  };
>> >
>> > +/**
>> > + * Mailbox operations when running on Hyper-V.
>> > + * On Hyper-V, PF/VF communiction is not through the
>> > + * hardware mailbox; this communication is through
>> > + * a software mediated path.
>> > + * Most mail box operations are noop while running on
>> > + * Hyper-V.
>> > + */
>> > +const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops = {
>> > +       .init_params    = ixgbevf_init_mbx_params_vf,
>> > +       .check_for_rst  = ixgbevf_check_for_rst_vf,
>> > +};
>>
>> I'd say if you aren't going to use a mailbox then don't initialize it.
>> The code was based off of the same code that runs the ixgbe driver.
>> It should be able to function without a mailbox provided the necessary
>> calls are updated in the ixgbe_mac_operations that are used by the
>> hyperv VF.
>>
> Ok; I will change the code.
>
>> > diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c
>> b/drivers/net/ethernet/intel/ixgbevf/vf.c
>> > index 4d613a4..92397fd 100644
>> > --- a/drivers/net/ethernet/intel/ixgbevf/vf.c
>> > +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c
>> > @@ -27,6 +27,13 @@
>> >  #include "vf.h"
>> >  #include "ixgbevf.h"
>> >
>> > +/*
>> > + * On Hyper-V, to reset, we need to read from this offset
>> > + * from the PCI config space. This is the mechanism used on
>> > + * Hyper-V to support PF/VF communication.
>> > + */
>> > +#define IXGBE_HV_RESET_OFFSET           0x201
>> > +
>> >  /**
>> >   *  ixgbevf_start_hw_vf - Prepare hardware for Tx/Rx
>> >   *  @hw: pointer to hardware structure
>> > @@ -126,6 +133,23 @@ static s32 ixgbevf_reset_hw_vf(struct ixgbe_hw
>> *hw)
>> >  }
>> >
>> >  /**
>> > + * Hyper-V variant; the VF/PF communication is through the PCI
>> > + * config space.
>> > + */
>> > +static s32 ixgbevf_hv_reset_hw_vf(struct ixgbe_hw *hw)
>> > +{
>> > +       int i;
>> > +       struct ixgbevf_adapter *adapter = hw->back;
>> > +
>> > +       for (i = 0; i < 6; i++)
>> > +               pci_read_config_byte(adapter->pdev,
>> > +                                    (i + IXGBE_HV_RESET_OFFSET),
>> > +                                    &hw->mac.perm_addr[i]);
>> > +
>> > +       return 0;
>> > +}
>> > +
>>
>> This bit can be problematic.  What about the case where PCI_MMCONFIG
>> is not defined.  In such a case it will kill this function without
>> reporting an error other than maybe a MAC address that is all 0s or
>> all FF's.
>>
>> You might want to add some sort of check here with some message if
>> such a situation occurs just so it can be easier to debug.
>
> I am a little confused here. This function will only execute when we are handling Hyper-V
> device IDs (while running on Hyper-V). Hyper-V PCI pass-through driver will support the
> config space access.

The kernel has a configuration option called CONFIG_PCI_MMCONFIG.  On
x86 it is usually enabled by default, but it can be disabled.

Right now this bit of code is dependent on it being enabled in order
to function correctly.  Otherwise you will only have access to the
first 256 bytes of the PCI configuration space.  You might want to
explore the possibility of a Kconfig option that would allow you to
only support the HyperV VFs if CONFIG_PCI_MMCONFIG is enabled.

>>
>> > +/**
>> >   *  ixgbevf_stop_hw_vf - Generic stop Tx/Rx units
>> >   *  @hw: pointer to hardware structure
>> >   *
>> > @@ -656,6 +680,68 @@ out:
>> >  }
>> >
>> >  /**
>> > + * Hyper-V variant; there is no mailbox communication.
>> > + */
>> > +static s32 ixgbevf_hv_check_mac_link_vf(struct ixgbe_hw *hw,
>> > +                                       ixgbe_link_speed *speed,
>> > +                                       bool *link_up,
>> > +                                       bool autoneg_wait_to_complete)
>> > +{
>> > +       struct ixgbe_mbx_info *mbx = &hw->mbx;
>> > +       struct ixgbe_mac_info *mac = &hw->mac;
>> > +       s32 ret_val = 0;
>> > +       u32 links_reg;
>> > +
>> > +       /* If we were hit with a reset drop the link */
>> > +       if (!mbx->ops.check_for_rst(hw) || !mbx->timeout)
>> > +               mac->get_link_status = true;
>> > +
>> > +       if (!mac->get_link_status)
>> > +               goto out;
>> > +
>> > +       /* if link status is down no point in checking to see if pf is up */
>> > +       links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
>> > +       if (!(links_reg & IXGBE_LINKS_UP))
>> > +               goto out;
>> > +
>> > +       /* for SFP+ modules and DA cables on 82599 it can take up to 500usecs
>> > +        * before the link status is correct
>> > +        */
>> > +       if (mac->type == ixgbe_mac_82599_vf) {
>> > +               int i;
>> > +
>> > +               for (i = 0; i < 5; i++) {
>> > +                       udelay(100);
>> > +                       links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
>> > +
>> > +                       if (!(links_reg & IXGBE_LINKS_UP))
>> > +                               goto out;
>> > +               }
>> > +       }
>> > +
>> > +       switch (links_reg & IXGBE_LINKS_SPEED_82599) {
>> > +       case IXGBE_LINKS_SPEED_10G_82599:
>> > +               *speed = IXGBE_LINK_SPEED_10GB_FULL;
>> > +               break;
>> > +       case IXGBE_LINKS_SPEED_1G_82599:
>> > +               *speed = IXGBE_LINK_SPEED_1GB_FULL;
>> > +               break;
>> > +       case IXGBE_LINKS_SPEED_100_82599:
>> > +               *speed = IXGBE_LINK_SPEED_100_FULL;
>> > +               break;
>> > +       }
>> > +
>> > +       /* if we passed all the tests above then the link is up and we no
>> > +        * longer need to check for link
>> > +        */
>> > +       mac->get_link_status = false;
>> > +
>> > +out:
>> > +       *link_up = !mac->get_link_status;
>> > +       return ret_val;
>> > +}
>> > +
>>
>> How does this handle the PF resetting?  Seems like you are going to be
>> generating Tx hangs in such a case.
>
> I am not sure how the Windows PF driver communicates this information.
> Do you have any suggestions for this.

You might want to take a look at the fm10k_get_host_state_generic
function.  You could probably do something like the trick we did there
with the TXDCTL in order to detect cases where the PF has reset the VF
so that you could then reset your guest once the PF has come back up.
That way at least you would report link down instead of Tx hang if the
host has reset you.

>>
>> > +/**
>> >   *  ixgbevf_rlpml_set_vf - Set the maximum receive packet length
>> >   *  @hw: pointer to the HW structure
>> >   *  @max_size: value to assign to max frame size
>> > @@ -663,6 +749,19 @@ out:
>> >  void ixgbevf_rlpml_set_vf(struct ixgbe_hw *hw, u16 max_size)
>> >  {
>> >         u32 msgbuf[2];
>> > +       u32 reg;
>> > +
>> > +       if (x86_hyper == &x86_hyper_ms_hyperv) {
>> > +               /*
>> > +                * If we are on Hyper-V, we implement
>> > +                * this functionality differently.
>> > +                */
>> > +               reg =  IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(0));
>> > +               /* CRC == 4 */
>> > +               reg |= ((max_size + 4) | IXGBE_RXDCTL_RLPML_EN);
>> > +               IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(0), reg);
>> > +               return;
>> > +       }
>> >
>> >         msgbuf[0] = IXGBE_VF_SET_LPE;
>> >         msgbuf[1] = max_size;
>>
>> You would be better off just implementing a hyperv version of this
>> function and avoiding the mailbox call entirely.
> Ok; will do.
>
>>
>> > @@ -679,6 +778,16 @@ int ixgbevf_negotiate_api_version(struct
>> ixgbe_hw *hw, int api)
>> >         int err;
>> >         u32 msg[3];
>> >
>> > +       if (x86_hyper == &x86_hyper_ms_hyperv) {
>> > +               /*
>> > +                * Hyper-V only supports api version ixgbe_mbox_api_10
>> > +                */
>> > +               if (api != ixgbe_mbox_api_10)
>> > +                       return IXGBE_ERR_INVALID_ARGUMENT;
>> > +
>> > +               return 0;
>> > +       }
>> > +
>> >         /* Negotiate the mailbox API version */
>> >         msg[0] = IXGBE_VF_API_NEGOTIATE;
>> >         msg[1] = api;
>>
>> Same here.  Just implement a hyperv version of this function instead
>> of splicing into the existing call.  Also you will need to wrap this
>> code up so that we can build on all architectures, not just x86.
>
> Ok; will do.
>>
>> > @@ -776,22 +885,51 @@ static const struct ixgbe_mac_operations
>> ixgbevf_mac_ops = {
>> >         .set_vfta               = ixgbevf_set_vfta_vf,
>> >  };
>> >
>> > +static const struct ixgbe_mac_operations ixgbevf_hv_mac_ops = {
>> > +       .init_hw                = ixgbevf_init_hw_vf,
>> > +       .reset_hw               = ixgbevf_hv_reset_hw_vf,
>> > +       .start_hw               = ixgbevf_start_hw_vf,
>> > +       .get_mac_addr           = ixgbevf_get_mac_addr_vf,
>> > +       .stop_adapter           = ixgbevf_stop_hw_vf,
>> > +       .setup_link             = ixgbevf_setup_mac_link_vf,
>> > +       .check_link             = ixgbevf_hv_check_mac_link_vf,
>> > +};
>>
>> You might want to consider implementing a set_rar call that will
>> return an error if you try to change the address to anything that is
>> not the permanent addr.
>
> Ok; will do.
>>
>> >  const struct ixgbevf_info ixgbevf_82599_vf_info = {
>> >         .mac = ixgbe_mac_82599_vf,
>> >         .mac_ops = &ixgbevf_mac_ops,
>> >  };
>> >
>> > +const struct ixgbevf_info ixgbevf_82599_vf_hv_info = {
>> > +       .mac = ixgbe_mac_82599_vf,
>> > +       .mac_ops = &ixgbevf_hv_mac_ops,
>> > +};
>> > +
>> >  const struct ixgbevf_info ixgbevf_X540_vf_info = {
>> >         .mac = ixgbe_mac_X540_vf,
>> >         .mac_ops = &ixgbevf_mac_ops,
>> >  };
>> >
>> > +const struct ixgbevf_info ixgbevf_X540_vf_hv_info = {
>> > +       .mac = ixgbe_mac_X540_vf,
>> > +       .mac_ops = &ixgbevf_hv_mac_ops,
>> > +};
>> > +
>> >  const struct ixgbevf_info ixgbevf_X550_vf_info = {
>> >         .mac = ixgbe_mac_X550_vf,
>> >         .mac_ops = &ixgbevf_mac_ops,
>> >  };
>> >
>> > +const struct ixgbevf_info ixgbevf_X550_vf_hv_info = {
>> > +       .mac = ixgbe_mac_X550_vf,
>> > +       .mac_ops = &ixgbevf_hv_mac_ops,
>> > +};
>> > +
>> >  const struct ixgbevf_info ixgbevf_X550EM_x_vf_info = {
>> >         .mac = ixgbe_mac_X550EM_x_vf,
>> >         .mac_ops = &ixgbevf_mac_ops,
>> >  };
>> > +
>> > +const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info = {
>> > +       .mac = ixgbe_mac_X550EM_x_vf,
>> > +       .mac_ops = &ixgbevf_hv_mac_ops,
>> > +};
>> > diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.h
>> b/drivers/net/ethernet/intel/ixgbevf/vf.h
>> > index ef9f773..7242a97 100644
>> > --- a/drivers/net/ethernet/intel/ixgbevf/vf.h
>> > +++ b/drivers/net/ethernet/intel/ixgbevf/vf.h
>> > @@ -32,7 +32,9 @@
>> >  #include <linux/interrupt.h>
>> >  #include <linux/if_ether.h>
>> >  #include <linux/netdevice.h>
>> > +#include <asm/hypervisor.h>
>> >
>> > +#include <asm/hypervisor.h>
>>
>> I don't think you need to include this twice.  Also this is a arch
>> specific header file.  You might want to move this to vf.c since that
>> is where you are using the code it provides.  Then you can probably
>> wrap it in an X86 build check so that you don't break the build on
>> other architectures.
>
> I will fix this. Thank you for your detailed comments.

Yeah, if we can get away from including this entirely it would be
preferred.  Odds are we probably don't need it since the device ID is
unique to HyperV hosts anyway.

I'll keep an eye out for the next patch set.

- Alex

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Timur Tabi @ 2016-04-15 15:44 UTC (permalink / raw)
  To: Rob Herring
  Cc: Vikram Sethi, Florian Fainelli, netdev,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
	linux-arm-msm, Sagar Dharia, Shanker Donthineni,
	Greg Kroah-Hartman, Christopher Covington, Gilad Avidov,
	Andrew Lunn, Bjorn Andersson, Mark Langsdorf, Jon Masters,
	Andy Gross, David S. Miller
In-Reply-To: <CAL_JsqLJffKzvMZTSLf+OESufDjrcMBd27fP5yh1R61=irXxcg@mail.gmail.com>

Rob Herring wrote:
>> >
>> >         dma-mask = <0 0xffffffff>;
>> >
>> >or
>> >
>> >         dma-mask = <0xffffffff 0xffffffff>;
> No. See dma-ranges.

How exactly should I use dma-ranges?  I can't find any other drivers 
that queries that property and uses the result to call dma_set_mask.  I 
thought the dma-ranges property is intended to specify address 
translation.  I don't need to translate any address, I just need to know 
a single number.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation collaborative project.

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Rob Herring @ 2016-04-15 15:59 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Vikram Sethi, Florian Fainelli, netdev,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
	linux-arm-msm, Sagar Dharia, Shanker Donthineni,
	Greg Kroah-Hartman, Christopher Covington, Gilad Avidov,
	Andrew Lunn, Bjorn Andersson, Mark Langsdorf, Jon Masters,
	Andy Gross, David S. Miller
In-Reply-To: <57110C70.9090007@codeaurora.org>

On Fri, Apr 15, 2016 at 10:44 AM, Timur Tabi <timur@codeaurora.org> wrote:
> Rob Herring wrote:
>>>
>>> >
>>> >         dma-mask = <0 0xffffffff>;
>>> >
>>> >or
>>> >
>>> >         dma-mask = <0xffffffff 0xffffffff>;
>>
>> No. See dma-ranges.
>
>
> How exactly should I use dma-ranges?  I can't find any other drivers that
> queries that property and uses the result to call dma_set_mask.  I thought
> the dma-ranges property is intended to specify address translation.  I don't
> need to translate any address, I just need to know a single number.

You may only care about the size, but the binding has to handle the
more complex case. Here's an example

<0x0 0x2 0x0 0x1 0x0>

dma address 0 (cell 0) maps to cpu (parent) address 0x2_00000000 (cell
1-2) and the range/size is 4G (cell 3-4).

If you have the same base address, then use the same address. The core
will calculate the mask based on the size. IIRC, we also handle ~0 as
a special case to support 4G for #size-cell=1.

Rob

^ permalink raw reply

* RE: [PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
From: KY Srinivasan @ 2016-04-15 16:01 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, linux-kernel@vger.kernel.org,
	devel@linuxdriverproject.org, olaf@aepfle.de, Robo Bot,
	Jason Wang, eli@mellanox.com, jackm@mellanox.com,
	yevgenyp@mellanox.com, John Ronciak, intel-wired-lan
In-Reply-To: <CAKgT0Uf3saWQn9kdDzJcKJQnZ2Q7r3pSkGVi9-DxHzqofN8vFQ@mail.gmail.com>



> -----Original Message-----
> From: Alexander Duyck [mailto:alexander.duyck@gmail.com]
> Sent: Friday, April 15, 2016 8:40 AM
> To: KY Srinivasan <kys@microsoft.com>
> Cc: David Miller <davem@davemloft.net>; Netdev
> <netdev@vger.kernel.org>; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; Robo Bot
> <apw@canonical.com>; Jason Wang <jasowang@redhat.com>;
> eli@mellanox.com; jackm@mellanox.com; yevgenyp@mellanox.com; John
> Ronciak <john.ronciak@intel.com>; intel-wired-lan <intel-wired-
> lan@lists.osuosl.org>
> Subject: Re: [PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts
> (Hyper-V)
> 
> On Thu, Apr 14, 2016 at 7:49 PM, KY Srinivasan <kys@microsoft.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Alexander Duyck [mailto:alexander.duyck@gmail.com]
> >> Sent: Thursday, April 14, 2016 4:18 PM
> >> To: KY Srinivasan <kys@microsoft.com>
> >> Cc: David Miller <davem@davemloft.net>; Netdev
> >> <netdev@vger.kernel.org>; linux-kernel@vger.kernel.org;
> >> devel@linuxdriverproject.org; olaf@aepfle.de; Robo Bot
> >> <apw@canonical.com>; Jason Wang <jasowang@redhat.com>;
> >> eli@mellanox.com; jackm@mellanox.com; yevgenyp@mellanox.com;
> John
> >> Ronciak <john.ronciak@intel.com>; intel-wired-lan <intel-wired-
> >> lan@lists.osuosl.org>
> >> Subject: Re: [PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts
> >> (Hyper-V)
> >>
> >> On Thu, Apr 14, 2016 at 4:55 PM, K. Y. Srinivasan <kys@microsoft.com>
> >> wrote:
> >> > On Hyper-V, the VF/PF communication is a via software mediated path
> >> > as opposed to the hardware mailbox. Make the necessary
> >> > adjustments to support Hyper-V.
> >> >
> >> > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> >> > ---
> >> >  drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |   11 ++
> >> >  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |   56 ++++++---
> >> >  drivers/net/ethernet/intel/ixgbevf/mbx.c          |   12 ++
> >> >  drivers/net/ethernet/intel/ixgbevf/vf.c           |  138
> >> +++++++++++++++++++++
> >> >  drivers/net/ethernet/intel/ixgbevf/vf.h           |    2 +
> >> >  5 files changed, 201 insertions(+), 18 deletions(-)
> >> >
> >> > diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> >> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> >> > index 5ac60ee..f8d2a0b 100644
> >> > --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> >> > +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> >> > @@ -460,9 +460,13 @@ enum ixbgevf_state_t {
> >> >
> >> >  enum ixgbevf_boards {
> >> >         board_82599_vf,
> >> > +       board_82599_vf_hv,
> >> >         board_X540_vf,
> >> > +       board_X540_vf_hv,
> >> >         board_X550_vf,
> >> > +       board_X550_vf_hv,
> >> >         board_X550EM_x_vf,
> >> > +       board_X550EM_x_vf_hv,
> >> >  };
> >> >
> >> >  enum ixgbevf_xcast_modes {
> >> > @@ -477,6 +481,13 @@ extern const struct ixgbevf_info
> >> ixgbevf_X550_vf_info;
> >> >  extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info;
> >> >  extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops;
> >> >
> >> > +
> >> > +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info;
> >> > +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info;
> >> > +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info;
> >> > +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info;
> >> > +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops;
> >> > +
> >> >  /* needed by ethtool.c */
> >> >  extern const char ixgbevf_driver_name[];
> >> >  extern const char ixgbevf_driver_version[];
> >> > diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> >> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> >> > index 007cbe0..4a0ffac 100644
> >> > --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> >> > +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> >> > @@ -49,6 +49,7 @@
> >> >  #include <linux/if.h>
> >> >  #include <linux/if_vlan.h>
> >> >  #include <linux/prefetch.h>
> >> > +#include <asm/mshyperv.h>
> >> >
> >> >  #include "ixgbevf.h"
> >> >
> >> > @@ -62,10 +63,14 @@ static char ixgbevf_copyright[] =
> >> >         "Copyright (c) 2009 - 2015 Intel Corporation.";
> >> >
> >> >  static const struct ixgbevf_info *ixgbevf_info_tbl[] = {
> >> > -       [board_82599_vf] = &ixgbevf_82599_vf_info,
> >> > -       [board_X540_vf]  = &ixgbevf_X540_vf_info,
> >> > -       [board_X550_vf]  = &ixgbevf_X550_vf_info,
> >> > -       [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info,
> >> > +       [board_82599_vf]        = &ixgbevf_82599_vf_info,
> >> > +       [board_82599_vf_hv]     = &ixgbevf_82599_vf_hv_info,
> >> > +       [board_X540_vf]         = &ixgbevf_X540_vf_info,
> >> > +       [board_X540_vf_hv]      = &ixgbevf_X540_vf_hv_info,
> >> > +       [board_X550_vf]         = &ixgbevf_X550_vf_info,
> >> > +       [board_X550_vf_hv]      = &ixgbevf_X550_vf_hv_info,
> >> > +       [board_X550EM_x_vf]     = &ixgbevf_X550EM_x_vf_info,
> >> > +       [board_X550EM_x_vf_hv]  = &ixgbevf_X550EM_x_vf_hv_info,
> >> >  };
> >> >
> >> >  /* ixgbevf_pci_tbl - PCI Device ID Table
> >> > @@ -78,9 +83,13 @@ static const struct ixgbevf_info
> *ixgbevf_info_tbl[] =
> >> {
> >> >   */
> >> >  static const struct pci_device_id ixgbevf_pci_tbl[] = {
> >> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf },
> >> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV),
> >> board_82599_vf_hv },
> >> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf },
> >> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV),
> >> board_X540_vf_hv },
> >> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf },
> >> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV),
> >> board_X550_vf_hv },
> >> >         {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF),
> >> board_X550EM_x_vf },
> >> > +       {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV),
> >> board_X550EM_x_vf_hv},
> >> >         /* required last entry */
> >> >         {0, }
> >> >  };
> >> > @@ -1809,12 +1818,13 @@ static int ixgbevf_vlan_rx_add_vid(struct
> >> net_device *netdev,
> >> >  {
> >> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
> >> >         struct ixgbe_hw *hw = &adapter->hw;
> >> > -       int err;
> >> > +       int err = 0;
> >> >
> >> >         spin_lock_bh(&adapter->mbx_lock);
> >> >
> >> >         /* add VID to filter table */
> >> > -       err = hw->mac.ops.set_vfta(hw, vid, 0, true);
> >> > +       if (hw->mac.ops.set_vfta)
> >> > +               err = hw->mac.ops.set_vfta(hw, vid, 0, true);
> >> >
> >> >         spin_unlock_bh(&adapter->mbx_lock);
> >> >
> >> > @@ -1835,12 +1845,13 @@ static int ixgbevf_vlan_rx_kill_vid(struct
> >> net_device *netdev,
> >> >  {
> >> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
> >> >         struct ixgbe_hw *hw = &adapter->hw;
> >> > -       int err;
> >> > +       int err = 0;
> >> >
> >> >         spin_lock_bh(&adapter->mbx_lock);
> >> >
> >> >         /* remove VID from filter table */
> >> > -       err = hw->mac.ops.set_vfta(hw, vid, 0, false);
> >> > +       if (hw->mac.ops.set_vfta)
> >> > +               err = hw->mac.ops.set_vfta(hw, vid, 0, false);
> >> >
> >> >         spin_unlock_bh(&adapter->mbx_lock);
> >> >
> >> > @@ -1873,14 +1884,16 @@ static int ixgbevf_write_uc_addr_list(struct
> >> net_device *netdev)
> >> >                 struct netdev_hw_addr *ha;
> >> >
> >> >                 netdev_for_each_uc_addr(ha, netdev) {
> >> > -                       hw->mac.ops.set_uc_addr(hw, ++count, ha->addr);
> >> > +                       if (hw->mac.ops.set_uc_addr)
> >> > +                               hw->mac.ops.set_uc_addr(hw, ++count, ha->addr);
> >> >                         udelay(200);
> >> >                 }
> >> >         } else {
> >> >                 /* If the list is empty then send message to PF driver to
> >> >                  * clear all MAC VLANs on this VF.
> >> >                  */
> >> > -               hw->mac.ops.set_uc_addr(hw, 0, NULL);
> >> > +               if (hw->mac.ops.set_uc_addr)
> >> > +                       hw->mac.ops.set_uc_addr(hw, 0, NULL);
> >> >         }
> >> >
> >> >         return count;
> >>
> >> So if I am understanding this correctly it looks like you cannot read
> >> or write any addresses for this device.  Would I be correct in
> >> assuming that you get to have the one unicast address that is provided
> >> by the PF and that is it?
> >
> > That is what I have been told by the Windows folks at Intel.
> 
> Yeah, so I didn't see the other half of this patchset that has you
> using a software interface for the multicast and broadcast traffic.
> What I would recommend doing is just writing up a stub function you
> can put in vf.c that will allow you to avoid having to add all these
> if checks to the code.
> 
> >> If so we may want to look at possibly returning some sort of error on
> >> these calls so that we can configure the device to indicate that we
> >> cannot support any of these filters.
> >
> > I will do that. So, I am going to check the device ID and return an error.
> > Would IXGBE_NOT_IMPLEMENTED be appropriate?
> 
> I'd say don't bother returning an error.  You can probably just stub
> out the function since if I recall correctly it is a void function
> anyway and doesn't provide a return value.
> 
> >>
> >> > @@ -1908,10 +1921,13 @@ static void ixgbevf_set_rx_mode(struct
> >> net_device *netdev)
> >> >
> >> >         spin_lock_bh(&adapter->mbx_lock);
> >> >
> >> > -       hw->mac.ops.update_xcast_mode(hw, netdev, xcast_mode);
> >> > +       if (hw->mac.ops.update_mc_addr_list)
> >> > +               if (hw->mac.ops.update_xcast_mode)
> >> > +                       hw->mac.ops.update_xcast_mode(hw, netdev,
> xcast_mode);
> >> >
> >> >         /* reprogram multicast list */
> >> > -       hw->mac.ops.update_mc_addr_list(hw, netdev);
> >> > +       if (hw->mac.ops.update_mc_addr_list)
> >> > +               hw->mac.ops.update_mc_addr_list(hw, netdev);
> >> >
> >> >         ixgbevf_write_uc_addr_list(netdev);
> >> >
> 
> Same thing for the multicast list as well.
> 
> >> > @@ -2074,10 +2090,13 @@ static void ixgbevf_up_complete(struct
> >> ixgbevf_adapter *adapter)
> >> >
> >> >         spin_lock_bh(&adapter->mbx_lock);
> >> >
> >> > -       if (is_valid_ether_addr(hw->mac.addr))
> >> > -               hw->mac.ops.set_rar(hw, 0, hw->mac.addr, 0);
> >> > -       else
> >> > -               hw->mac.ops.set_rar(hw, 0, hw->mac.perm_addr, 0);
> >> > +       if (is_valid_ether_addr(hw->mac.addr)) {
> >> > +               if (hw->mac.ops.set_rar)
> >> > +                       hw->mac.ops.set_rar(hw, 0, hw->mac.addr, 0);
> >> > +       } else {
> >> > +               if (hw->mac.ops.set_rar)
> >> > +                       hw->mac.ops.set_rar(hw, 0, hw->mac.perm_addr, 0);
> >> > +       }
> >> >
> >> >         spin_unlock_bh(&adapter->mbx_lock);
> >> >
> >>
> >> Same here.  We shouldn't let the user set a MAC address that we cannot
> >> support.  We should be returning an error.
> >
> > Yes; I will return an error.
> 
> This is the one that needs to return an error.  That way we should be
> able to prevent MAC address changes.
> 
> >>
> >> > @@ -3672,14 +3691,15 @@ static int ixgbevf_set_mac(struct net_device
> >> *netdev, void *p)
> >> >         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
> >> >         struct ixgbe_hw *hw = &adapter->hw;
> >> >         struct sockaddr *addr = p;
> >> > -       int err;
> >> > +       int err = 0;
> >> >
> >> >         if (!is_valid_ether_addr(addr->sa_data))
> >> >                 return -EADDRNOTAVAIL;
> >> >
> >> >         spin_lock_bh(&adapter->mbx_lock);
> >> >
> >> > -       err = hw->mac.ops.set_rar(hw, 0, addr->sa_data, 0);
> >> > +       if (hw->mac.ops.set_rar)
> >> > +               err = hw->mac.ops.set_rar(hw, 0, addr->sa_data, 0);
> >> >
> >> >         spin_unlock_bh(&adapter->mbx_lock);
> >> >
> >>
> >> Specifically here.  If hw->mac.ops.set_rar is NULL we cannot allow any
> >> MAC address change so we should probably return "-EADDRNOTAVAIL".
> >
> > Will do.
> >>
> >> > diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.c
> >> b/drivers/net/ethernet/intel/ixgbevf/mbx.c
> >> > index dc68fea..298a0da 100644
> >> > --- a/drivers/net/ethernet/intel/ixgbevf/mbx.c
> >> > +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.c
> >> > @@ -346,3 +346,15 @@ const struct ixgbe_mbx_operations
> >> ixgbevf_mbx_ops = {
> >> >         .check_for_rst  = ixgbevf_check_for_rst_vf,
> >> >  };
> >> >
> >> > +/**
> >> > + * Mailbox operations when running on Hyper-V.
> >> > + * On Hyper-V, PF/VF communiction is not through the
> >> > + * hardware mailbox; this communication is through
> >> > + * a software mediated path.
> >> > + * Most mail box operations are noop while running on
> >> > + * Hyper-V.
> >> > + */
> >> > +const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops = {
> >> > +       .init_params    = ixgbevf_init_mbx_params_vf,
> >> > +       .check_for_rst  = ixgbevf_check_for_rst_vf,
> >> > +};
> >>
> >> I'd say if you aren't going to use a mailbox then don't initialize it.
> >> The code was based off of the same code that runs the ixgbe driver.
> >> It should be able to function without a mailbox provided the necessary
> >> calls are updated in the ixgbe_mac_operations that are used by the
> >> hyperv VF.
> >>
> > Ok; I will change the code.
> >
> >> > diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c
> >> b/drivers/net/ethernet/intel/ixgbevf/vf.c
> >> > index 4d613a4..92397fd 100644
> >> > --- a/drivers/net/ethernet/intel/ixgbevf/vf.c
> >> > +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c
> >> > @@ -27,6 +27,13 @@
> >> >  #include "vf.h"
> >> >  #include "ixgbevf.h"
> >> >
> >> > +/*
> >> > + * On Hyper-V, to reset, we need to read from this offset
> >> > + * from the PCI config space. This is the mechanism used on
> >> > + * Hyper-V to support PF/VF communication.
> >> > + */
> >> > +#define IXGBE_HV_RESET_OFFSET           0x201
> >> > +
> >> >  /**
> >> >   *  ixgbevf_start_hw_vf - Prepare hardware for Tx/Rx
> >> >   *  @hw: pointer to hardware structure
> >> > @@ -126,6 +133,23 @@ static s32 ixgbevf_reset_hw_vf(struct
> ixgbe_hw
> >> *hw)
> >> >  }
> >> >
> >> >  /**
> >> > + * Hyper-V variant; the VF/PF communication is through the PCI
> >> > + * config space.
> >> > + */
> >> > +static s32 ixgbevf_hv_reset_hw_vf(struct ixgbe_hw *hw)
> >> > +{
> >> > +       int i;
> >> > +       struct ixgbevf_adapter *adapter = hw->back;
> >> > +
> >> > +       for (i = 0; i < 6; i++)
> >> > +               pci_read_config_byte(adapter->pdev,
> >> > +                                    (i + IXGBE_HV_RESET_OFFSET),
> >> > +                                    &hw->mac.perm_addr[i]);
> >> > +
> >> > +       return 0;
> >> > +}
> >> > +
> >>
> >> This bit can be problematic.  What about the case where PCI_MMCONFIG
> >> is not defined.  In such a case it will kill this function without
> >> reporting an error other than maybe a MAC address that is all 0s or
> >> all FF's.
> >>
> >> You might want to add some sort of check here with some message if
> >> such a situation occurs just so it can be easier to debug.
> >
> > I am a little confused here. This function will only execute when we are
> handling Hyper-V
> > device IDs (while running on Hyper-V). Hyper-V PCI pass-through driver will
> support the
> > config space access.
> 
> The kernel has a configuration option called CONFIG_PCI_MMCONFIG.  On
> x86 it is usually enabled by default, but it can be disabled.
> 
> Right now this bit of code is dependent on it being enabled in order
> to function correctly.  Otherwise you will only have access to the
> first 256 bytes of the PCI configuration space.  You might want to
> explore the possibility of a Kconfig option that would allow you to
> only support the HyperV VFs if CONFIG_PCI_MMCONFIG is enabled.

Our PCI pss-through driver depends on this functionality. I will make sure we
Make that dependency more explicit.
> 
> >>
> >> > +/**
> >> >   *  ixgbevf_stop_hw_vf - Generic stop Tx/Rx units
> >> >   *  @hw: pointer to hardware structure
> >> >   *
> >> > @@ -656,6 +680,68 @@ out:
> >> >  }
> >> >
> >> >  /**
> >> > + * Hyper-V variant; there is no mailbox communication.
> >> > + */
> >> > +static s32 ixgbevf_hv_check_mac_link_vf(struct ixgbe_hw *hw,
> >> > +                                       ixgbe_link_speed *speed,
> >> > +                                       bool *link_up,
> >> > +                                       bool autoneg_wait_to_complete)
> >> > +{
> >> > +       struct ixgbe_mbx_info *mbx = &hw->mbx;
> >> > +       struct ixgbe_mac_info *mac = &hw->mac;
> >> > +       s32 ret_val = 0;
> >> > +       u32 links_reg;
> >> > +
> >> > +       /* If we were hit with a reset drop the link */
> >> > +       if (!mbx->ops.check_for_rst(hw) || !mbx->timeout)
> >> > +               mac->get_link_status = true;
> >> > +
> >> > +       if (!mac->get_link_status)
> >> > +               goto out;
> >> > +
> >> > +       /* if link status is down no point in checking to see if pf is up */
> >> > +       links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
> >> > +       if (!(links_reg & IXGBE_LINKS_UP))
> >> > +               goto out;
> >> > +
> >> > +       /* for SFP+ modules and DA cables on 82599 it can take up to
> 500usecs
> >> > +        * before the link status is correct
> >> > +        */
> >> > +       if (mac->type == ixgbe_mac_82599_vf) {
> >> > +               int i;
> >> > +
> >> > +               for (i = 0; i < 5; i++) {
> >> > +                       udelay(100);
> >> > +                       links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
> >> > +
> >> > +                       if (!(links_reg & IXGBE_LINKS_UP))
> >> > +                               goto out;
> >> > +               }
> >> > +       }
> >> > +
> >> > +       switch (links_reg & IXGBE_LINKS_SPEED_82599) {
> >> > +       case IXGBE_LINKS_SPEED_10G_82599:
> >> > +               *speed = IXGBE_LINK_SPEED_10GB_FULL;
> >> > +               break;
> >> > +       case IXGBE_LINKS_SPEED_1G_82599:
> >> > +               *speed = IXGBE_LINK_SPEED_1GB_FULL;
> >> > +               break;
> >> > +       case IXGBE_LINKS_SPEED_100_82599:
> >> > +               *speed = IXGBE_LINK_SPEED_100_FULL;
> >> > +               break;
> >> > +       }
> >> > +
> >> > +       /* if we passed all the tests above then the link is up and we no
> >> > +        * longer need to check for link
> >> > +        */
> >> > +       mac->get_link_status = false;
> >> > +
> >> > +out:
> >> > +       *link_up = !mac->get_link_status;
> >> > +       return ret_val;
> >> > +}
> >> > +
> >>
> >> How does this handle the PF resetting?  Seems like you are going to be
> >> generating Tx hangs in such a case.
> >
> > I am not sure how the Windows PF driver communicates this information.
> > Do you have any suggestions for this.
> 
> You might want to take a look at the fm10k_get_host_state_generic
> function.  You could probably do something like the trick we did there
> with the TXDCTL in order to detect cases where the PF has reset the VF
> so that you could then reset your guest once the PF has come back up.
> That way at least you would report link down instead of Tx hang if the
> host has reset you.

I will take a look at your example; thank you.
> 
> >>
> >> > +/**
> >> >   *  ixgbevf_rlpml_set_vf - Set the maximum receive packet length
> >> >   *  @hw: pointer to the HW structure
> >> >   *  @max_size: value to assign to max frame size
> >> > @@ -663,6 +749,19 @@ out:
> >> >  void ixgbevf_rlpml_set_vf(struct ixgbe_hw *hw, u16 max_size)
> >> >  {
> >> >         u32 msgbuf[2];
> >> > +       u32 reg;
> >> > +
> >> > +       if (x86_hyper == &x86_hyper_ms_hyperv) {
> >> > +               /*
> >> > +                * If we are on Hyper-V, we implement
> >> > +                * this functionality differently.
> >> > +                */
> >> > +               reg =  IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(0));
> >> > +               /* CRC == 4 */
> >> > +               reg |= ((max_size + 4) | IXGBE_RXDCTL_RLPML_EN);
> >> > +               IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(0), reg);
> >> > +               return;
> >> > +       }
> >> >
> >> >         msgbuf[0] = IXGBE_VF_SET_LPE;
> >> >         msgbuf[1] = max_size;
> >>
> >> You would be better off just implementing a hyperv version of this
> >> function and avoiding the mailbox call entirely.
> > Ok; will do.
> >
> >>
> >> > @@ -679,6 +778,16 @@ int ixgbevf_negotiate_api_version(struct
> >> ixgbe_hw *hw, int api)
> >> >         int err;
> >> >         u32 msg[3];
> >> >
> >> > +       if (x86_hyper == &x86_hyper_ms_hyperv) {
> >> > +               /*
> >> > +                * Hyper-V only supports api version ixgbe_mbox_api_10
> >> > +                */
> >> > +               if (api != ixgbe_mbox_api_10)
> >> > +                       return IXGBE_ERR_INVALID_ARGUMENT;
> >> > +
> >> > +               return 0;
> >> > +       }
> >> > +
> >> >         /* Negotiate the mailbox API version */
> >> >         msg[0] = IXGBE_VF_API_NEGOTIATE;
> >> >         msg[1] = api;
> >>
> >> Same here.  Just implement a hyperv version of this function instead
> >> of splicing into the existing call.  Also you will need to wrap this
> >> code up so that we can build on all architectures, not just x86.
> >
> > Ok; will do.
> >>
> >> > @@ -776,22 +885,51 @@ static const struct ixgbe_mac_operations
> >> ixgbevf_mac_ops = {
> >> >         .set_vfta               = ixgbevf_set_vfta_vf,
> >> >  };
> >> >
> >> > +static const struct ixgbe_mac_operations ixgbevf_hv_mac_ops = {
> >> > +       .init_hw                = ixgbevf_init_hw_vf,
> >> > +       .reset_hw               = ixgbevf_hv_reset_hw_vf,
> >> > +       .start_hw               = ixgbevf_start_hw_vf,
> >> > +       .get_mac_addr           = ixgbevf_get_mac_addr_vf,
> >> > +       .stop_adapter           = ixgbevf_stop_hw_vf,
> >> > +       .setup_link             = ixgbevf_setup_mac_link_vf,
> >> > +       .check_link             = ixgbevf_hv_check_mac_link_vf,
> >> > +};
> >>
> >> You might want to consider implementing a set_rar call that will
> >> return an error if you try to change the address to anything that is
> >> not the permanent addr.
> >
> > Ok; will do.
> >>
> >> >  const struct ixgbevf_info ixgbevf_82599_vf_info = {
> >> >         .mac = ixgbe_mac_82599_vf,
> >> >         .mac_ops = &ixgbevf_mac_ops,
> >> >  };
> >> >
> >> > +const struct ixgbevf_info ixgbevf_82599_vf_hv_info = {
> >> > +       .mac = ixgbe_mac_82599_vf,
> >> > +       .mac_ops = &ixgbevf_hv_mac_ops,
> >> > +};
> >> > +
> >> >  const struct ixgbevf_info ixgbevf_X540_vf_info = {
> >> >         .mac = ixgbe_mac_X540_vf,
> >> >         .mac_ops = &ixgbevf_mac_ops,
> >> >  };
> >> >
> >> > +const struct ixgbevf_info ixgbevf_X540_vf_hv_info = {
> >> > +       .mac = ixgbe_mac_X540_vf,
> >> > +       .mac_ops = &ixgbevf_hv_mac_ops,
> >> > +};
> >> > +
> >> >  const struct ixgbevf_info ixgbevf_X550_vf_info = {
> >> >         .mac = ixgbe_mac_X550_vf,
> >> >         .mac_ops = &ixgbevf_mac_ops,
> >> >  };
> >> >
> >> > +const struct ixgbevf_info ixgbevf_X550_vf_hv_info = {
> >> > +       .mac = ixgbe_mac_X550_vf,
> >> > +       .mac_ops = &ixgbevf_hv_mac_ops,
> >> > +};
> >> > +
> >> >  const struct ixgbevf_info ixgbevf_X550EM_x_vf_info = {
> >> >         .mac = ixgbe_mac_X550EM_x_vf,
> >> >         .mac_ops = &ixgbevf_mac_ops,
> >> >  };
> >> > +
> >> > +const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info = {
> >> > +       .mac = ixgbe_mac_X550EM_x_vf,
> >> > +       .mac_ops = &ixgbevf_hv_mac_ops,
> >> > +};
> >> > diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.h
> >> b/drivers/net/ethernet/intel/ixgbevf/vf.h
> >> > index ef9f773..7242a97 100644
> >> > --- a/drivers/net/ethernet/intel/ixgbevf/vf.h
> >> > +++ b/drivers/net/ethernet/intel/ixgbevf/vf.h
> >> > @@ -32,7 +32,9 @@
> >> >  #include <linux/interrupt.h>
> >> >  #include <linux/if_ether.h>
> >> >  #include <linux/netdevice.h>
> >> > +#include <asm/hypervisor.h>
> >> >
> >> > +#include <asm/hypervisor.h>
> >>
> >> I don't think you need to include this twice.  Also this is a arch
> >> specific header file.  You might want to move this to vf.c since that
> >> is where you are using the code it provides.  Then you can probably
> >> wrap it in an X86 build check so that you don't break the build on
> >> other architectures.
> >
> > I will fix this. Thank you for your detailed comments.
> 
> Yeah, if we can get away from including this entirely it would be
> preferred.  Odds are we probably don't need it since the device ID is
> unique to HyperV hosts anyway.
> 
> I'll keep an eye out for the next patch set.

Thank you; really appreciate all your help.

Regards,

K. Y
> 
> - Alex

^ permalink raw reply

* Re: [RFC PATCH 0/2] selinux: avoid nf hooks overhead when not needed
From: Casey Schaufler @ 2016-04-15 15:54 UTC (permalink / raw)
  To: Paolo Abeni, Paul Moore
  Cc: Florian Westphal, linux-security-module, David S. Miller,
	James Morris, Andreas Gruenbacher, Stephen Smalley, netdev,
	selinux
In-Reply-To: <1460713133.7410.2.camel@redhat.com>

On 4/15/2016 2:38 AM, Paolo Abeni wrote:
> On Thu, 2016-04-14 at 18:53 -0400, Paul Moore wrote:
>> On Tue, Apr 12, 2016 at 4:52 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>>> Will be ok if we post a v2 version of this series, removing the hooks
>>> de-registration bits, but preserving the selinux nf-hooks and
>>> socket_sock_rcv_skb() on-demand/delayed registration ? Will that fit
>>> with the post-init read only memory usage that you are planning ?
>> The work Florian and and I were talking about would be limited just to
>> the netfilter hooks, the LSM hooks, e.g. socket_sock_rcv_skb() and
>> friends, would remain as they are today.  What what we discussing was
>> defaulting to not registering the netfilter hooks until it became
>> necessary due to a labeled networking configuration or the
>> always_check_network policy capability; the registration of the
>> netfilter hooks would be permanent, you could not unregister the hooks
>> at that point, you would need to reboot.  Does that make sense?
> Yes, AFAIC it makes sense. I'll try to follow this route for an eventual
> v2.
>
>> As far as Casey's concerns, I don't think the work we are talking
>> about for the v2 patchset would have any effect on the socket/sock
>> security blobs as you really can't manage those adequately from the
>> netfilter hooks; you most likely will reference them and perhaps even
>> update the data within, but not allocate or free the blobs.  Besides,
>> even in some weird case you were alloc/free'ing security blobs in the
>> netfilter hooks, we can deal with that on a per-LSM basis if/when the
>> full fledged stacking patches are merged; everything we are talking
>> about is a hidden implementation detail so changing it in the future
>> shouldn't be a problem.
> Casey, are you ok with the above?

Yes. My concern is with the security module hooks. Altering
the netfilter hooks is a separate issue, and I don't have
trouble with that.

I also would not expect to see an LSM doing blob allocation
during socket delivery, but hey, it *is* networking code,
and stranger things happen all the time.

> Thank you,
>
> Paolo
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* Re: [PATCH pci] pci: Add helper function to set VPD size
From: Steve Wise @ 2016-04-15 16:12 UTC (permalink / raw)
  To: bhelgaas
  Cc: 'Hariprasad Shenai', davem, linux-pci, netdev, leedom,
	nirranjan, santosh
In-Reply-To: <00d801d1967c$68e9d100$3abd7300$@opengridcomputing.com>

On 4/14/2016 1:35 PM, Steve Wise wrote:
>> The fix is to add a PCI helper function to set the VPD size, so the
>> driver can expicitly set the exact size of the VPD.
>>
>> Fixes commit 104daa71b396 ("PCI: Determine actual VPD size on first access")
>>
>> Signed-off-by: Casey Leedom <leedom@chelsio.com>
>> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
> Looks good!
>    
> Tested-by: Steve Wise <swise@opengridcomputing.com>
>
>

Hey Bjorn,

Will this make the next 4.6-rc?

Thanks,

Steve.

^ permalink raw reply

* Re: [PATCH pci] pci: Add helper function to set VPD size
From: Casey Leedom @ 2016-04-15 16:25 UTC (permalink / raw)
  To: SWise OGC, bhelgaas@google.com
  Cc: Hariprasad S, davem@davemloft.net, linux-pci@vger.kernel.org,
	netdev@vger.kernel.org, Nirranjan Kirubaharan, Santosh Rastapur,
	Ariel Elior, Hannes Reinecke
In-Reply-To: <571112D2.4010801@opengridcomputing.com>

| From: Steve Wise <swise@opengridcomputing.com>
| Sent: Friday, April 15, 2016 9:12 AM
|
| On 4/14/2016 1:35 PM, Steve Wise wrote:
| >> The fix is to add a PCI helper function to set the VPD size, so the
| >> driver can expicitly set the exact size of the VPD.
| >>
| >> Fixes commit 104daa71b396 ("PCI: Determine actual VPD size on first
| >> access")
| >>
| >> Signed-off-by: Casey Leedom <leedom@chelsio.com>
| >> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| > Looks good!
|
| Hey Bjorn,
|
| Will this make the next 4.6-rc?

  Without a fix of some sort, cxgb4 is completely dead in the water as of the 4.6 series.  I'm also worried about the bnx2x driver which seems to be doing something similar to our cxgb4 driver.  (I.e. there isn't just a single VPD Data Structure hosted at the beginning of the VPD Space.)

Casey

^ permalink raw reply

* Re: [PATCH pci] pci: Add helper function to set VPD size
From: Bjorn Helgaas @ 2016-04-15 16:30 UTC (permalink / raw)
  To: Steve Wise
  Cc: bhelgaas, 'Hariprasad Shenai', davem, linux-pci, netdev,
	leedom, nirranjan, santosh
In-Reply-To: <571112D2.4010801@opengridcomputing.com>

On Fri, Apr 15, 2016 at 11:12:02AM -0500, Steve Wise wrote:
> On 4/14/2016 1:35 PM, Steve Wise wrote:
> >>The fix is to add a PCI helper function to set the VPD size, so the
> >>driver can expicitly set the exact size of the VPD.
> >>
> >>Fixes commit 104daa71b396 ("PCI: Determine actual VPD size on first access")
> >>
> >>Signed-off-by: Casey Leedom <leedom@chelsio.com>
> >>Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
> >Looks good!
> >Tested-by: Steve Wise <swise@opengridcomputing.com>
> >
> >
> 
> Hey Bjorn,
> 
> Will this make the next 4.6-rc?

No, it's too late for me to review it and merge it for v4.6-rc4.  But
there's still time for later RCs.

Bjorn

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Bjorn Andersson @ 2016-04-15 16:44 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Vikram Sethi, Florian Fainelli, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	sdharia-sgV2jX0FEOL9JmXXK+q4OQ, Shanker Donthineni,
	Greg Kroah-Hartman, cov-sgV2jX0FEOL9JmXXK+q4OQ,
	gavidov-sgV2jX0FEOL9JmXXK+q4OQ, Rob Herring, andrew-g2DYL2Zd6BY,
	Mark Langsdorf, Jon Masters, Andy Gross, David S. Miller
In-Reply-To: <57102920.7000104-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

On Thu 14 Apr 16:34 PDT 2016, Timur Tabi wrote:

[..]
> So I think the solution is to create a device tree (and ACPI) property that
> holds the mask.
> 
> 	dma-mask = <0 0xffffffff>;
> 
> or
> 
> 	dma-mask = <0xffffffff 0xffffffff>;
> 
> The driver will then do this:
> 
> 	u64 dma_mask;
> 	device_property_read_u64(&pdev->dev, "dma-mask", &dma_mask);
> 	dma_coerce_mask_and_coherent(&pdev->dev, dma_mask);
> 
> What I'm not sure yet is whether I should call
> dma_coerce_mask_and_coherent() or dma_set_coherent_mask().
> 

For platform devices being populated via from DT you will pass:
of_platform_bus_create()
  of_platform_device_create_pdata()
    of_dma_configure()

Which calls of_dma_get_range() to acquire this information from the
dma-ranges property and set up the dma ops and properties.

Regards,
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [patch net-next] devlink: fix sb register stub in case devlink is disabled
From: David Miller @ 2016-04-15 16:57 UTC (permalink / raw)
  To: jiri; +Cc: netdev, idosch, eladr, yotamg, ogerlitz, fengguang.wu
In-Reply-To: <1460704628-3662-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 15 Apr 2016 09:17:08 +0200

> From: Jiri Pirko <jiri@mellanox.com>
> 
> Reported-by: kbuild test robot <fengguang.wu@intel.com>
> Fixes: bf7974710a40 ("devlink: add shared buffer configuration")
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Applied, thanks Jiri.

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Timur Tabi @ 2016-04-15 17:00 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Vikram Sethi, Florian Fainelli, netdev, linux-kernel, devicetree,
	linux-arm-msm, sdharia, Shanker Donthineni, Greg Kroah-Hartman,
	cov, gavidov, Rob Herring, andrew, Mark Langsdorf, Jon Masters,
	Andy Gross, David S. Miller
In-Reply-To: <20160415164433.GV391@tuxbot>

Bjorn Andersson wrote:
> For platform devices being populated via from DT you will pass:
> of_platform_bus_create()
>    of_platform_device_create_pdata()
>      of_dma_configure()
>
> Which calls of_dma_get_range() to acquire this information from the
> dma-ranges property and set up the dma ops and properties.

This seems excessive.  I have to create a platform bus just to configure 
the DMA mask?  Most drivers just call dma_set_mask and give it a number, 
and that's not device-tree specific.  I also need to come up with a way 
to get this to work on ACPI.

I just seems like a lot of work only because I need to determine at 
runtime what my DMA mask is.  I also don't see any drivers that call 
of_dma_configure().

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation collaborative project.

^ permalink raw reply

* Re: [patch net-next 0/2] mlxsw: spectrum_buffers: couple of cosmetic patches
From: David Miller @ 2016-04-15 17:03 UTC (permalink / raw)
  To: jiri; +Cc: netdev, idosch, eladr, yotamg, ogerlitz, David.Laight
In-Reply-To: <1460725778-14655-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 15 Apr 2016 15:09:36 +0200

> As suggested by David Laight

Series applied.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox