* [PATCH V4 4/7] mmc: sdhci: add 32-bit block count support for v4 mode
From: Chunyan Zhang @ 2018-07-24  2:51 UTC (permalink / raw)
  To: Ulf Hansson, Adrian Hunter
  Cc: linux-mmc, linux-kernel, Orson Zhai, Baolin Wang, Billows Wu,
	Jason Wu, zhang.lyra
In-Reply-To: <1532340508-8749-5-git-send-email-zhang.chunyan@linaro.org>

Host Controller Version 4.10 redefines the SDMA System Address register
as the 32-bit Block Count register for v4 mode, and SDMA uses the ADMA
System Address register (05Fh-058h) instead if v4 mode is enabled. Also,
when the 32-bit block count is used, the 16-bit Block Count register
needs to be set to zero.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
---
 drivers/mmc/host/sdhci.c | 14 +++++++++++++-
 drivers/mmc/host/sdhci.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 920d8ec..c272a2b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1070,7 +1070,19 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
 	/* Set the DMA boundary value and block size */
 	sdhci_writew(host, SDHCI_MAKE_BLKSZ(host->sdma_boundary, data->blksz),
 		     SDHCI_BLOCK_SIZE);
-	sdhci_writew(host, data->blocks, SDHCI_BLOCK_COUNT);
+
+	/*
+	 * For Version 4.10 onwards, if v4 mode is enabled, the 16-bit Block
+	 * Count register needs to be set to zero so that the 32-bit Block
+	 * Count register is selected.
+	 */
+	if (host->version >= SDHCI_SPEC_410 && host->v4_mode) {
+		if (sdhci_readw(host, SDHCI_BLOCK_COUNT))
+			sdhci_writew(host, 0, SDHCI_BLOCK_COUNT);
+		sdhci_writew(host, data->blocks, SDHCI_32BIT_BLK_CNT);
+	} else {
+		sdhci_writew(host, data->blocks, SDHCI_BLOCK_COUNT);
+	}
 }
 
 static inline bool sdhci_auto_cmd12(struct sdhci_host *host,
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 23318ff..81aae07 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -28,6 +28,7 @@
 
 #define SDHCI_DMA_ADDRESS	0x00
 #define SDHCI_ARGUMENT2		SDHCI_DMA_ADDRESS
+#define SDHCI_32BIT_BLK_CNT	SDHCI_DMA_ADDRESS
 
 #define SDHCI_BLOCK_SIZE	0x04
 #define  SDHCI_MAKE_BLKSZ(dma, blksz) (((dma & 0x7) << 12) | (blksz & 0xFFF))
-- 
2.7.4
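
The register selection described in the commit message can be tried outside
the kernel with a small user-space model. This is only a sketch: struct
fake_host, fake_readw(), fake_writew() and the SPEC_410 value are
hypothetical stand-ins, not the driver's accessors, and only the two
register offsets are taken from sdhci.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register offsets as defined in drivers/mmc/host/sdhci.h. */
#define SDHCI_DMA_ADDRESS	0x00
#define SDHCI_32BIT_BLK_CNT	SDHCI_DMA_ADDRESS
#define SDHCI_BLOCK_COUNT	0x06

/* Assumed encoding of "spec version 4.10" for this sketch only. */
#define SPEC_410		4

/* Hypothetical stand-in for the controller's MMIO window. */
struct fake_host {
	uint8_t regs[16];
	int version;
	bool v4_mode;
};

/* Little-endian 16-bit write into the fake register file. */
static void fake_writew(struct fake_host *h, uint16_t val, int reg)
{
	h->regs[reg] = (uint8_t)(val & 0xff);
	h->regs[reg + 1] = (uint8_t)(val >> 8);
}

/* Little-endian 16-bit read from the fake register file. */
static uint16_t fake_readw(const struct fake_host *h, int reg)
{
	return (uint16_t)(h->regs[reg] | (h->regs[reg + 1] << 8));
}

/* Same branch structure as the hunk in sdhci_prepare_data(). */
static void set_block_count(struct fake_host *h, uint16_t blocks)
{
	if (h->version >= SPEC_410 && h->v4_mode) {
		/* The 16-bit register must read zero in this mode. */
		if (fake_readw(h, SDHCI_BLOCK_COUNT))
			fake_writew(h, 0, SDHCI_BLOCK_COUNT);
		fake_writew(h, blocks, SDHCI_32BIT_BLK_CNT);
	} else {
		fake_writew(h, blocks, SDHCI_BLOCK_COUNT);
	}
}
```

With v4 mode set, a stale value at offset 0x06 is cleared and the count
lands at offset 0x00; otherwise only offset 0x06 is written.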

^ permalink raw reply related

* Re: [PATCH] block: zram: Replace GFP_ATOMIC with GFP_KERNEL
From: Sergey Senozhatsky @ 2018-07-24  2:52 UTC (permalink / raw)
  To: Jia-Ju Bai
  Cc: minchan, ngupta, sergey.senozhatsky.work, axboe, linux-kernel,
	linux-block, Andrew Morton
In-Reply-To: <20180723141304.3300-1-baijiaju1990@gmail.com>

On (07/23/18 22:13), Jia-Ju Bai wrote:
> read_from_bdev_async() and write_to_bdev() are never called in atomic
> context. They call bio_alloc() with GFP_ATOMIC, which is not necessary.
> GFP_ATOMIC can be replaced with GFP_KERNEL.

[..]

> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 0f3fadd71230..b958ed0b8c35 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -450,7 +450,7 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
>  {
>  	struct bio *bio;
>  
> -	bio = bio_alloc(GFP_ATOMIC, 1);
> +	bio = bio_alloc(GFP_KERNEL, 1);
>  	if (!bio)
>  		return -ENOMEM;
>  
> @@ -538,7 +538,7 @@ static int write_to_bdev(struct zram *zram, struct bio_vec *bvec,
>  	struct bio *bio;
>  	unsigned long entry;
>  
> -	bio = bio_alloc(GFP_ATOMIC, 1);
> +	bio = bio_alloc(GFP_KERNEL, 1);
>  	if (!bio)
>  		return -ENOMEM;

I think the intent here is different and is not related to atomic
contexts.

Consider the following
  OOM -> swapout -> __zram_bvec_write() -> write_to_bdev() -> bio_alloc(GFP_KERNEL) -> [OOM?]

So maybe we can do a bit better than GFP_ATOMIC (NOIO, etc.), but in general,
I believe, we can't use GFP_KERNEL [at least in write_to_bdev()].

	-ss
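
To make the GFP_ATOMIC / GFP_NOIO / GFP_KERNEL distinction above concrete,
here is a rough user-space model of how those masks are composed. The bit
values and the X_ prefix are invented for illustration (the real definitions
are in include/linux/gfp.h, and the kernel's GFP_ATOMIC carries more bits
than shown); only the composition of the masks is meant to match.

```c
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's. */
#define X_GFP_HIGH		0x01u	/* may dip into reserves */
#define X_GFP_KSWAPD_RECLAIM	0x02u	/* may wake kswapd */
#define X_GFP_DIRECT_RECLAIM	0x04u	/* caller may reclaim (and sleep) */
#define X_GFP_IO		0x08u	/* reclaim may start physical I/O */
#define X_GFP_FS		0x10u	/* reclaim may call into the FS */

#define X_GFP_RECLAIM	(X_GFP_DIRECT_RECLAIM | X_GFP_KSWAPD_RECLAIM)

#define X_GFP_ATOMIC	(X_GFP_HIGH | X_GFP_KSWAPD_RECLAIM)
#define X_GFP_NOIO	(X_GFP_RECLAIM)
#define X_GFP_KERNEL	(X_GFP_RECLAIM | X_GFP_IO | X_GFP_FS)

/* True if an allocation with this mask may block in direct reclaim. */
static bool may_direct_reclaim(unsigned int gfp)
{
	return (gfp & X_GFP_DIRECT_RECLAIM) != 0;
}

/* True if reclaim on behalf of this allocation may issue I/O. */
static bool may_start_io(unsigned int gfp)
{
	return (gfp & X_GFP_IO) != 0;
}
```

In this model GFP_NOIO sits between the two: it may sleep in direct reclaim
like GFP_KERNEL, but it cannot start new I/O, which is the property that
matters when the allocation is itself on the swap-out path.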

^ permalink raw reply

* Re: [PATCH v3 bpf-next 2/8] veth: Add driver XDP
From: Toshiaki Makita @ 2018-07-24  1:47 UTC (permalink / raw)
  To: Jakub Kicinski, Toshiaki Makita
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer
In-Reply-To: <20180723172324.248c4dc9@cakuba.netronome.com>

Hi Jakub,

Thanks for reviewing!

On 2018/07/24 9:23, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:02 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>
>> This is the basic implementation of veth driver XDP.
>>
>> Incoming packets are sent from the peer veth device in the form of skb,
>> so this is generally doing the same thing as generic XDP.
>>
>> This itself is not so useful, but a starting point to implement other
>> useful veth XDP features like TX and REDIRECT.
>>
>> This introduces NAPI when XDP is enabled, because XDP now heavily
>> relies on NAPI context. Use ptr_ring to emulate NIC ring. Tx function
>> enqueues packets to the ring and peer NAPI handler drains the ring.
>>
>> Currently only one ring is allocated for each veth device, so it does
>> not scale on multiqueue env. This can be resolved by allocating rings
>> on the per-queue basis later.
>>
>> Note that NAPI is not used but netif_rx is used when XDP is not loaded,
>> so this does not change the default behaviour.
>>
>> v3:
>> - Fix race on closing the device.
>> - Add extack messages in ndo_bpf.
>>
>> v2:
>> - Squashed with the patch adding NAPI.
>> - Implement adjust_tail.
>> - Don't acquire consumer lock because it is guarded by NAPI.
>> - Make poll_controller noop since it is unnecessary.
>> - Register rxq_info on enabling XDP rather than on opening the device.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
>> +static struct sk_buff *veth_xdp_rcv_skb(struct veth_priv *priv,
>> +					struct sk_buff *skb)
>> +{
>> +	u32 pktlen, headroom, act, metalen;
>> +	void *orig_data, *orig_data_end;
>> +	int size, mac_len, delta, off;
>> +	struct bpf_prog *xdp_prog;
>> +	struct xdp_buff xdp;
>> +
>> +	rcu_read_lock();
>> +	xdp_prog = rcu_dereference(priv->xdp_prog);
>> +	if (unlikely(!xdp_prog)) {
>> +		rcu_read_unlock();
>> +		goto out;
>> +	}
>> +
>> +	mac_len = skb->data - skb_mac_header(skb);
>> +	pktlen = skb->len + mac_len;
>> +	size = SKB_DATA_ALIGN(VETH_XDP_HEADROOM + pktlen) +
>> +	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>> +	if (size > PAGE_SIZE)
>> +		goto drop;
>> +
>> +	headroom = skb_headroom(skb) - mac_len;
>> +	if (skb_shared(skb) || skb_head_is_locked(skb) ||
>> +	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
>> +		struct sk_buff *nskb;
>> +		void *head, *start;
>> +		struct page *page;
>> +		int head_off;
>> +
>> +		page = alloc_page(GFP_ATOMIC);
>> +		if (!page)
>> +			goto drop;
>> +
>> +		head = page_address(page);
>> +		start = head + VETH_XDP_HEADROOM;
>> +		if (skb_copy_bits(skb, -mac_len, start, pktlen)) {
>> +			page_frag_free(head);
>> +			goto drop;
>> +		}
>> +
>> +		nskb = veth_build_skb(head,
>> +				      VETH_XDP_HEADROOM + mac_len, skb->len,
>> +				      PAGE_SIZE);
>> +		if (!nskb) {
>> +			page_frag_free(head);
>> +			goto drop;
>> +		}
> 
>> +static int veth_enable_xdp(struct net_device *dev)
>> +{
>> +	struct veth_priv *priv = netdev_priv(dev);
>> +	int err;
>> +
>> +	if (!xdp_rxq_info_is_reg(&priv->xdp_rxq)) {
>> +		err = xdp_rxq_info_reg(&priv->xdp_rxq, dev, 0);
>> +		if (err < 0)
>> +			return err;
>> +
>> +		err = xdp_rxq_info_reg_mem_model(&priv->xdp_rxq,
>> +						 MEM_TYPE_PAGE_SHARED, NULL);
> 
> nit: doesn't matter much but looks like a mix of MEM_TYPE_PAGE_SHARED
>      and MEM_TYPE_PAGE_ORDER0

Actually I'm not sure when to use MEM_TYPE_PAGE_ORDER0. It seems a page
allocated by alloc_page() can be freed by page_frag_free() and it is
more lightweight than put_page(), isn't it?
virtio_net is doing it in a similar way.

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v7] tests/kms_rotation_crc: Move platform checks to one place for non exhaust fence cases
From: Dhinakaran Pandiyan @ 2018-07-24  2:52 UTC (permalink / raw)
  To: Radhakrishna Sripada, igt-dev; +Cc: Daniel Vetter, intel-gfx
In-Reply-To: <20180723182545.19461-1-radhakrishna.sripada@intel.com>

On Mon, 2018-07-23 at 11:25 -0700, Radhakrishna Sripada wrote:
> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
> 
> Cleanup the testcases by moving the platform checks to a single
> function.
> 
> The earlier version of the patch is posted here [1]
> 
> v2: Make use of the property enums to get the supported rotations
> v3: Move hardcodings to a single function(Ville)
> v4: Include the cherryview exception for reflect subtest(Maarten)
> v5: Rebase and move the check from CNL to ICL for reflect-x case
> v6: Fix the CI regression
> v7: rebase
> 
> [1]: https://patchwork.freedesktop.org/patch/209647/
> 

Oh well, I wrote my comments below and then read this link. Please add
new test requirements in separate patches. Only have the code movement
here.

> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Mika Kahola <mika.kahola@intel.com>
> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> ---
>  tests/kms_rotation_crc.c | 35 ++++++++++++++++++++---------------
>  1 file changed, 20 insertions(+), 15 deletions(-)
> 
> diff --git a/tests/kms_rotation_crc.c b/tests/kms_rotation_crc.c
> index 6cb5858adb0f..f20b8a6d4ba1 100644
> --- a/tests/kms_rotation_crc.c
> +++ b/tests/kms_rotation_crc.c
> @@ -43,6 +43,7 @@ typedef struct {
>  	uint32_t override_fmt;
>  	uint64_t override_tiling;
>  	int devid;
> +	int gen;
>  } data_t;
>  
>  typedef struct {
> @@ -284,6 +285,17 @@ static void prepare_fbs(data_t *data,
> igt_output_t *output,
>  		igt_plane_set_position(plane, data->pos_x, data-
> >pos_y);
>  }
>  
> +static void igt_check_rotation(data_t *data)
> +{
> +	if (data->rotation & (IGT_ROTATION_90 | IGT_ROTATION_270))
> +		igt_require(data->gen >= 9);
> +	if (data->rotation & IGT_REFLECT_X)
> +		igt_require(data->gen >= 11 ||

This check used to be igt_require(gen >= 10

> +			    (IS_CHERRYVIEW(data->devid) && (data-
> >rotation & IGT_ROTATION_0)));

There was also a check for tiling format 
-                                   (IS_CHERRYVIEW(data.devid) &&
reflect_x->rot == IGT_ROTATION_0
-                                    && reflect_x->tiling ==
LOCAL_I915_FORMAT_MOD_X_TILED));


> +	if (data->rotation & IGT_ROTATION_180)
> +		igt_require(data->gen >= 4);

Doesn't look like this requirement was in the test earlier.

> +}
> +
>  static void test_single_case(data_t *data, enum pipe pipe,
>  			     igt_output_t *output, igt_plane_t
> *plane,
>  			     enum rectangle_type rect,
> @@ -352,15 +364,18 @@ static void test_plane_rotation(data_t *data,
> int plane_type, bool test_bad_form
>  
>  	igt_display_require_output(display);
>  
> +	igt_check_rotation(data);
> +
>  	for_each_pipe_with_valid_output(display, pipe, output) {
>  		igt_plane_t *plane;
>  		int i, j;
>  
> -		if (IS_CHERRYVIEW(data->devid) && pipe != PIPE_B)
> -			continue;
> -
>  		igt_output_set_pipe(output, pipe);
>  
> +		if (IS_CHERRYVIEW(data->devid) && (data->rotation &
> IGT_REFLECT_X) &&
> +		    pipe != kmstest_pipe_to_index('B'))
> +			continue;
> +

Why do this? 

>  		plane = igt_output_get_plane_type(output,
> plane_type);
>  		igt_require(igt_plane_has_prop(plane,
> IGT_PLANE_ROTATION));
>  
> @@ -521,14 +536,13 @@ igt_main
>  	};
>  
>  	data_t data = {};
> -	int gen = 0;
>  
>  	igt_skip_on_simulation();
>  
>  	igt_fixture {
>  		data.gfx_fd = drm_open_driver_master(DRIVER_INTEL);
>  		data.devid = intel_get_drm_devid(data.gfx_fd);
> -		gen = intel_gen(data.devid);
> +		data.gen = intel_gen(data.devid);
>  
>  		kmstest_set_vt_graphics_mode();
>  
> @@ -541,16 +555,12 @@ igt_main
>  		igt_subtest_f("%s-rotation-%s",
>  			      plane_test_str(subtest->plane),
>  			      rot_test_str(subtest->rot)) {
> -			igt_require(!(subtest->rot &
> -				    (IGT_ROTATION_90 |
> IGT_ROTATION_270)) ||
> -				    gen >= 9);
>  			data.rotation = subtest->rot;
>  			test_plane_rotation(&data, subtest->plane,
> false);
>  		}
>  	}
>  
>  	igt_subtest_f("sprite-rotation-90-pos-100-0") {
> -		igt_require(gen >= 9);
>  		data.rotation = IGT_ROTATION_90;
>  		data.pos_x = 100,
>  		data.pos_y = 0;
> @@ -560,7 +570,6 @@ igt_main
>  	data.pos_y = 0;
>  
>  	igt_subtest_f("bad-pixel-format") {
> -		igt_require(gen >= 9);
>  		data.rotation = IGT_ROTATION_90;
>  		data.override_fmt = DRM_FORMAT_RGB565;
>  		test_plane_rotation(&data, DRM_PLANE_TYPE_PRIMARY,
> true);
> @@ -568,7 +577,6 @@ igt_main
>  	data.override_fmt = 0;
>  
>  	igt_subtest_f("bad-tiling") {
> -		igt_require(gen >= 9);
>  		data.rotation = IGT_ROTATION_90;
>  		data.override_tiling =
> LOCAL_I915_FORMAT_MOD_X_TILED;
>  		test_plane_rotation(&data, DRM_PLANE_TYPE_PRIMARY,
> true);
> @@ -579,9 +587,6 @@ igt_main
>  		igt_subtest_f("primary-%s-reflect-x-%s",
>  			      tiling_test_str(reflect_x->tiling),
>  			      rot_test_str(reflect_x->rot)) {
> -			igt_require(gen >= 10 ||
> -				    (IS_CHERRYVIEW(data.devid) &&
> reflect_x->rot == IGT_ROTATION_0
> -				     && reflect_x->tiling ==
> LOCAL_I915_FORMAT_MOD_X_TILED));
>  			data.rotation = (IGT_REFLECT_X | reflect_x-
> >rot);
>  			data.override_tiling = reflect_x->tiling;
>  			test_plane_rotation(&data,
> DRM_PLANE_TYPE_PRIMARY, false);
> @@ -596,7 +601,7 @@ igt_main
>  		enum pipe pipe;
>  		igt_output_t *output;
>  
> -		igt_require(gen >= 9);
> +		igt_require(data.gen >= 9);
>  		igt_display_require_output(&data.display);
>  
>  		for_each_pipe_with_valid_output(&data.display, pipe,
> output) {

^ permalink raw reply

* Re: [PATCH 1/5] f2fs: clear victim_secmap when section has full valid blocks
From: Chao Yu @ 2018-07-24  2:52 UTC (permalink / raw)
  To: Yunlong Song, jaegeuk, chao, yunlong.song
  Cc: miaoxie, bintian.wang, shengyong1, heyunlei, linux-f2fs-devel,
	linux-kernel
In-Reply-To: <1532355022-163029-2-git-send-email-yunlong.song@huawei.com>

On 2018/7/23 22:10, Yunlong Song wrote:
> Without this patch, f2fs only clears victim_secmap when it finds out
> that the section has no valid blocks at all, but forgets to clear the
> victim_secmap when the whole section has full valid blocks.
> 
> Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
> ---
>  fs/f2fs/segment.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index cfff7cf..255bff5 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -776,7 +776,9 @@ static void __remove_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
>  		if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
>  			dirty_i->nr_dirty[t]--;
>  
> -		if (get_valid_blocks(sbi, segno, true) == 0)
> +		if (get_valid_blocks(sbi, segno, true) == 0 ||
> +			get_valid_blocks(sbi, segno, true) ==
> +			(sbi->segs_per_sec << sbi->log_blocks_per_seg))

BLKS_PER_SEC(sbi)?

Thanks,

>  			clear_bit(GET_SEC_FROM_SEG(sbi, segno),
>  						dirty_i->victim_secmap);
>  	}
> 
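
For reference, the suggested macro and the open-coded shift should give the
same number as long as blocks_per_seg == 1 << log_blocks_per_seg. A tiny
user-space check, where struct sb_model is a hypothetical cut-down stand-in
for f2fs_sb_info and the macro body is written from memory of the f2fs
headers (an assumption, please double check against the tree):

```c
/* Hypothetical subset of struct f2fs_sb_info, for illustration only. */
struct sb_model {
	unsigned int segs_per_sec;
	unsigned int log_blocks_per_seg;
	unsigned int blocks_per_seg;	/* expected: 1 << log_blocks_per_seg */
};

/* Assumed to mirror the f2fs definition. */
#define BLKS_PER_SEC(sbi) ((sbi)->segs_per_sec * (sbi)->blocks_per_seg)

/* The expression used in the patch under review. */
static unsigned int blks_per_sec_open_coded(const struct sb_model *sbi)
{
	return sbi->segs_per_sec << sbi->log_blocks_per_seg;
}
```

With segs_per_sec = 4 and log_blocks_per_seg = 9, both forms evaluate to
2048 blocks per section.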

^ permalink raw reply

* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Toshiaki Makita @ 2018-07-24  2:53 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
	mst
In-Reply-To: <CAMDZJNXWs+yqAcZ-7gW6RQjen0-mzfJ-Ar-O0_wttse2A-3-HQ@mail.gmail.com>

On 2018/07/24 2:31, Tonghao Zhang wrote:
> On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
> <toshiaki.makita1@gmail.com> wrote:
>>
>> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
>>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
>>> <makita.toshiaki@lab.ntt.co.jp> wrote:
>>>>
>>>> On 2018/07/22 3:04, xiangxia.m.yue@gmail.com wrote:
>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>
>>>>> Factor out the generic busy polling logic, which will be
>>>>> used in the tx path in the next patch. With this patch,
>>>>> qemu can also set the busyloop_timeout for the rx queue separately.
>>>>>
>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>> ---
>>>> ...
>>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
>>>>> +                                      struct vhost_virtqueue *rvq,
>>>>> +                                      struct vhost_virtqueue *tvq,
>>>>> +                                      bool rx)
>>>>> +{
>>>>> +     struct socket *sock = rvq->private_data;
>>>>> +
>>>>> +     if (rx) {
>>>>> +             if (!vhost_vq_avail_empty(&net->dev, tvq)) {
>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>> +             } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
>>>>> +                     vhost_disable_notify(&net->dev, tvq);
>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>> +             }
>>>>> +     } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>> +                 !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>> +             vhost_poll_queue(&rvq->poll);
>>>>
>>>> Now we wait for vq_avail for rx as well, I think you cannot skip
>>>> vhost_enable_notify() on tx. Probably you might want to do:
>>> I think vhost_enable_notify is needed.
>>>
>>>> } else if (sock && sk_has_rx_data(sock->sk)) {
>>>>          if (!vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>                  vhost_poll_queue(&rvq->poll);
>>>>          } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
>>>>                  vhost_disable_notify(&net->dev, rvq);
>>>>                  vhost_poll_queue(&rvq->poll);
>>>>          }
>>>> }
>>> As Jason noted in an earlier review, we only want an rx kick when a
>>> packet is pending at the socket but we're out of available buffers.
>>> So we just enable notify, but don't poll it?
>>>
>>>          } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>                      !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>                  vhost_poll_queue(&rvq->poll);
>>>          } else {
>>>                  vhost_enable_notify(&net->dev, rvq);
>>>          }
>>
>> When vhost_enable_notify() returns true the avail becomes non-empty
>> while we are enabling notify. We may delay the rx process if we don't
>> check the return value of vhost_enable_notify().
> I got it thanks.
>>>> Also, isn't it better to take care of vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
>>> I can't see why it is better, but if necessary, we can do it.
>>
>> The reason is pretty simple... we are busypolling the socket so we don't
>> need rx wakeups during it?
> OK, but one question: what about rx? Do we use
> vhost_net_disable_vq/vhost_net_enable_vq on rx?

If we are busypolling the sock tx buf? I'm not sure if polling it
improves the performance.

-- 
Toshiaki Makita


^ permalink raw reply

* Re: [PATCH] xfs_db: add -d to short help for write command
From: Darrick J. Wong @ 2018-07-24  1:49 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs
In-Reply-To: <fba33a51-50c1-ade6-79d2-f7ceaceb372f@redhat.com>

On Mon, Jul 23, 2018 at 06:25:27PM -0700, Eric Sandeen wrote:
> And note in the man page that -c and -d are exclusive.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
> 
> diff --git a/db/write.c b/db/write.c
> index 5ef76bcd..bbaa609d 100644
> --- a/db/write.c
> +++ b/db/write.c
> @@ -38,7 +38,7 @@ static int	write_f(int argc, char **argv);
>  static void     write_help(void);
>  
>  static const cmdinfo_t	write_cmd =
> -	{ "write", NULL, write_f, 0, -1, 0, N_("[-c] [field or value]..."),
> +	{ "write", NULL, write_f, 0, -1, 0, N_("[-c|-d] [field or value]..."),
>  	  N_("write value to disk"), write_help };
>  
>  void
> diff --git a/man/man8/xfs_db.8 b/man/man8/xfs_db.8
> index 10f2beb9..a1ee3514 100644
> --- a/man/man8/xfs_db.8
> +++ b/man/man8/xfs_db.8
> @@ -837,7 +837,7 @@ and
>  bits respectively, and their string equivalent reported
>  (but no modifications are made).
>  .TP
> -.BI "write [\-c] [\-d] [" "field value" "] ..."
> +.BI "write [\-c|\-d] [" "field value" "] ..."
>  Write a value to disk.
>  Specific fields can be set in structures (struct mode),
>  or a block can be set to data values (data mode),
> 

^ permalink raw reply

* [PATCH bpf-next] bpf: btf: fix inconsistent IS_ERR and PTR_ERR
From: YueHaibing @ 2018-07-24  2:55 UTC (permalink / raw)
  To: ast, daniel, quentin.monnet, jakub.kicinski, bhole_prashant_q7,
	osk
  Cc: linux-kernel, netdev, davem, YueHaibing

Fix inconsistent IS_ERR and PTR_ERR in get_btf,
the proper pointer to be passed as argument is '*btf'

This issue was detected with the help of Coccinelle.

Fixes: 2d3feca8c44f ("bpf: btf: print map dump and lookup with btf info")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 tools/bpf/bpftool/map.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 9c81918..0ee3ba4 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -230,7 +230,7 @@ static int get_btf(struct bpf_map_info *map_info, struct btf **btf)
 
 	*btf = btf__new((__u8 *)btf_info.btf, btf_info.btf_size, NULL);
 	if (IS_ERR(*btf)) {
-		err = PTR_ERR(btf);
+		err = PTR_ERR(*btf);
 		*btf = NULL;
 	}
 
-- 
2.7.0



^ permalink raw reply related

* [LTP] [RFC PATCH 1/1] pounder21: Remove
From: Li Wang @ 2018-07-24  2:57 UTC (permalink / raw)
  To: ltp
In-Reply-To: <1784802899.35188200.1532349973680.JavaMail.zimbra@redhat.com>

On Mon, Jul 23, 2018 at 8:46 PM, Jan Stancek <jstancek@redhat.com> wrote:

>
>
> ----- Original Message -----
> > It is obsolete and not being built.
>
> I think Li runs this internally.
>

I occasionally use the pounder suite as a stress test, running it for more
than 24 hours.

But I'm OK with removing it from the latest LTP, because it has had no
substantive updates for a long time and the version I use is always
ltp-full-20150903.


>
> [adding Hanns/IBM]
>
> I've seen pounder mentioned in RH BZs, though I'm not sure
> if LTP is the origin of pounder version that's used.
>
> Regards,
> Jan
>



-- 
Regards,
Li Wang

^ permalink raw reply

* Re: [PATCH v3 bpf-next 3/8] veth: Avoid drops by oversized packets when XDP is enabled
From: Toshiaki Makita @ 2018-07-24  1:56 UTC (permalink / raw)
  To: Jakub Kicinski, Toshiaki Makita
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer
In-Reply-To: <20180723172707.74a8acfa@cakuba.netronome.com>

On 2018/07/24 9:27, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:03 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>
>> All oversized packets including GSO packets are dropped if XDP is
>> enabled on receiver side, so don't send such packets from peer.
>>
>> Drop TSO and SCTP fragmentation features so that veth devices themselves
>> segment packets with XDP enabled. Also cap MTU accordingly.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Is there any precedent for fixing up features and MTU like this?  Most
> drivers just refuse to install the program if settings are incompatible.

I don't know of any precedent. I can instead refuse the program at install
time when features and MTU are not appropriate. Is that preferred?
Note that with current implementation wanted_features are not touched so
features will be restored when the XDP program is removed. MTU will not
be restored though, as I do not remember the original MTU.


>> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
>> index 78fa08cb6e24..f5b72e937d9d 100644
>> --- a/drivers/net/veth.c
>> +++ b/drivers/net/veth.c
>> @@ -542,6 +542,23 @@ static int veth_get_iflink(const struct net_device *dev)
>>  	return iflink;
>>  }
>>  
>> +static netdev_features_t veth_fix_features(struct net_device *dev,
>> +					   netdev_features_t features)
>> +{
>> +	struct veth_priv *priv = netdev_priv(dev);
>> +	struct net_device *peer;
>> +
>> +	peer = rtnl_dereference(priv->peer);
>> +	if (peer) {
>> +		struct veth_priv *peer_priv = netdev_priv(peer);
>> +
>> +		if (peer_priv->_xdp_prog)
>> +			features &= ~NETIF_F_GSO_SOFTWARE;
>> +	}
>> +
>> +	return features;
>> +}
>> +
>>  static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
>>  {
>>  	struct veth_priv *peer_priv, *priv = netdev_priv(dev);
>> @@ -591,14 +608,33 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>>  				goto err;
>>  			}
>>  		}
>> +
>> +		if (!old_prog) {
>> +			peer->hw_features &= ~NETIF_F_GSO_SOFTWARE;
>> +			peer->max_mtu = PAGE_SIZE - VETH_XDP_HEADROOM -
>> +				peer->hard_header_len -
>> +				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>> +			if (peer->mtu > peer->max_mtu)
>> +				dev_set_mtu(peer, peer->max_mtu);
>> +		}
>>  	}
>>  
>>  	if (old_prog) {
>> -		if (!prog && dev->flags & IFF_UP)
>> -			veth_disable_xdp(dev);
>> +		if (!prog) {
>> +			if (dev->flags & IFF_UP)
>> +				veth_disable_xdp(dev);
>> +
>> +			if (peer) {
>> +				peer->hw_features |= NETIF_F_GSO_SOFTWARE;
>> +				peer->max_mtu = ETH_MAX_MTU;
>> +			}
>> +		}
>>  		bpf_prog_put(old_prog);
>>  	}
>>  
>> +	if ((!!old_prog ^ !!prog) && peer)
>> +		netdev_update_features(peer);
>> +
>>  	return 0;
>>  err:
>>  	priv->_xdp_prog = old_prog;
>> @@ -643,6 +679,7 @@ static const struct net_device_ops veth_netdev_ops = {
>>  	.ndo_poll_controller	= veth_poll_controller,
>>  #endif
>>  	.ndo_get_iflink		= veth_get_iflink,
>> +	.ndo_fix_features	= veth_fix_features,
>>  	.ndo_features_check	= passthru_features_check,
>>  	.ndo_set_rx_headroom	= veth_set_rx_headroom,
>>  	.ndo_bpf		= veth_xdp,
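
For reference, the max_mtu expression in the hunk above is just the space
left in one page after the headroom, the link-layer header and the aligned
skb_shared_info. A minimal userspace sketch of that arithmetic follows. The
helper names are hypothetical, and the numbers one would plug in (4096-byte
page, 256-byte headroom, 14-byte header, 320-byte shared info, 64-byte
alignment) are assumptions for illustration, not values taken from this
thread.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t x, size_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper mirroring the max_mtu expression in the hunk:
 * page size minus headroom, minus link-layer header, minus the
 * aligned size of the shared info area at the end of the buffer.
 */
static size_t xdp_max_mtu(size_t page_size, size_t headroom,
			  size_t hard_header_len, size_t shinfo_size,
			  size_t cache_line)
{
	return page_size - headroom - hard_header_len -
	       align_up(shinfo_size, cache_line);
}
```

With the assumed numbers this leaves 3506 bytes, so any peer MTU above that
would be clamped by the dev_set_mtu() call.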

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [PATCH v3 bpf-next 5/8] veth: Add ndo_xdp_xmit
From: Toshiaki Makita @ 2018-07-24  1:59 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: kbuild test robot, kbuild-all, netdev, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer
In-Reply-To: <201807240711.kbMKpUVb%fengguang.wu@intel.com>

On 2018/07/24 9:19, kbuild test robot wrote:
> Hi Toshiaki,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on bpf-next/master]
> 
> url:    https://github.com/0day-ci/linux/commits/Toshiaki-Makita/veth-Driver-XDP/20180724-065517
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> config: i386-randconfig-x001-201829 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=i386 
> 
> All errors (new ones prefixed by >>):
> 
>    In file included from include/linux/kernel.h:10:0,
>                     from include/linux/list.h:9,
>                     from include/linux/timer.h:5,
>                     from include/linux/netdevice.h:28,
>                     from drivers//net/veth.c:11:
>    drivers//net/veth.c: In function 'veth_xdp_xmit':
>>> drivers//net/veth.c:300:16: error: implicit declaration of function 'xdp_ok_fwd_dev' [-Werror=implicit-function-declaration]
>       if (unlikely(xdp_ok_fwd_dev(rcv, frame->len) ||

This is because this series depends on commit d8d7218ad842 ("xdp:
XDP_REDIRECT should check IFF_UP and MTU") which is currently in DaveM's
net-next tree, as I noted in the cover letter.
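
For anyone building without that commit, the helper only checks that the
target device is up and that the frame fits its MTU. A rough userspace
model is below. The struct, the constants and the function name are
stand-ins, and the VLAN header slack is how I recall the check rather than
something shown in this thread, so please refer to the commit itself for
the real code.

```c
#include <assert.h>
#include <errno.h>

#define FAKE_IFF_UP    0x1u	/* stand-in for IFF_UP */
#define FAKE_VLAN_HLEN 4u	/* stand-in for VLAN_HLEN (assumed slack) */

/* Hypothetical reduction of struct net_device to the fields checked. */
struct fake_netdev {
	unsigned int flags;
	unsigned int mtu;
	unsigned int hard_header_len;
};

/* Returns 0 if forwarding is allowed, a negative errno otherwise. */
static int model_ok_fwd_dev(const struct fake_netdev *fwd, unsigned int pktlen)
{
	assert(fwd);

	if (!(fwd->flags & FAKE_IFF_UP))
		return -ENETDOWN;

	if (pktlen > fwd->mtu + fwd->hard_header_len + FAKE_VLAN_HLEN)
		return -EMSGSIZE;

	return 0;
}
```

A non-zero return is what makes veth_xdp_xmit() count the frame as a drop.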

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [PATCH] tpm: add support for partial reads
From: Jason Gunthorpe @ 2018-07-24  2:05 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: James Bottomley, Jarkko Sakkinen, linux-integrity,
	linux-security-module, linux-kernel
In-Reply-To: <ee0f0ccf-f757-a7d4-bf55-316f81fb490b@intel.com>

On Mon, Jul 23, 2018 at 04:42:38PM -0700, Tadeusz Struk wrote:
> On 07/23/2018 03:08 PM, Jason Gunthorpe wrote:
> > On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote:
> >> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote:
> >>> The proposed patch doesn't clear the data_pending if the entire buffer
> >>> is not consumed, so of course it is ABI breaking, that really isn't OK.
> >> The data_pending will be cleared by the timeout handler if the user doesn't
> >> read the response fully before the timeout expires. This is the same situation
> >> as if the user did not read the response at all.
> > That causes write() to fail with EBUSY
> > 
> > NAK from me on breaking the ABI like this
> 
> What if we introduce this new behavior only for the non-blocking mode
> as James suggested? Or do you have some other suggestions?

I think you should do it entirely in userspace.

But something sensible linked to O_NONBLOCK could be OK.
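
To illustrate the userspace option: read the whole response once, then
serve it out in pieces from a local buffer. This is only a sketch, the
struct and function names are made up, and the 4096-byte buffer size is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace-side holder for one complete response. */
struct resp_buf {
	unsigned char data[4096];
	size_t len;	/* bytes valid in data */
	size_t off;	/* bytes already handed out */
};

/* Store a full response, e.g. the result of one read() on the device. */
static void resp_fill(struct resp_buf *rb, const void *src, size_t len)
{
	assert(len <= sizeof(rb->data));
	memcpy(rb->data, src, len);
	rb->len = len;
	rb->off = 0;
}

/* Hand out up to want bytes; returns the count copied, 0 once drained. */
static size_t resp_read(struct resp_buf *rb, void *dst, size_t want)
{
	size_t left = rb->len - rb->off;
	size_t n = want < left ? want : left;

	memcpy(dst, rb->data + rb->off, n);
	rb->off += n;
	return n;
}
```

The kernel still sees one full read per command, so the existing
write()/read() ABI is untouched.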

Jason

^ permalink raw reply

* [PATCH v2 2/2] PCI: NVMe device specific reset quirk
From: Alex Williamson @ 2018-07-24  3:16 UTC (permalink / raw)

In-Reply-To: <CAK9iUCPG-H3vWjiEnjAgr_nWEf=0Sn+AvRt68_Sg16Zw8TobRg@mail.gmail.com>

On Mon, 23 Jul 2018 19:20:41 -0700
Sinan Kaya <Okaya@kernel.org> wrote:

> On 7/23/18, Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Mon, 23 Jul 2018 17:40:02 -0700
> > Sinan Kaya <okaya@kernel.org> wrote:
> >  
> >> On 7/23/2018 5:13 PM, Alex Williamson wrote:  
> >> > + * The NVMe specification requires that controllers support PCIe FLR, but
> >> > + * some Samsung SM961/PM961 controllers fail to recover after FLR (-1
> >> > + * config space) unless the device is quiesced prior to FLR.
> >>
> >> Does disabling the memory bit in PCI config space as part of the FLR
> >> reset function help? (like the very first thing)  
> >
> > No, it does not.  I modified this to only clear PCI_COMMAND_MEMORY and
> > call pcie_flr(), the Samsung controller dies just as it did previously.
> >  
> >> Can we do that in the pcie_flr() function to cover other endpoint types
> >> that might be pushing traffic while code is trying to do a reset?  
> >
> > Do you mean PCI_COMMAND_MASTER rather than PCI_COMMAND_MEMORY?  
> 
> Yes
> 
> >  I tried
> > that too, it doesn't work either.  I'm not really sure the theory
> > behind clearing memory, clearing busmaster to stop DMA seems like a
> > sane thing to do, but doesn't help here.  
> 
> Let me explain what I guessed. You might be able to fill in the blanks
> where I am completely off.
> 
> We do vfio initiated flr reset immediately following guest machine
> shutdown. The card could be fully enabled and pushing traffic to the
> system at this moment.
> 
> I don't know if vfio does any device disable or not.

Yes, pci_clear_master() is the very first thing we do in
vfio_pci_disable(), well before we try to reset the device.
 
> FLR is supposed to reset the endpoint but endpoint doesn't recover per
> your report.
> 
> Having vendor specific reset routines for PCIE endpoints defeats the
> purpose of FLR.
> 
> Since the adapter is fully functional, I suggested turning off bus
> master and memory enable bits to stop endpoint from sending packets.
> 
> But, this is not helping either.
> 
> Those sleep statements looked very fragile to be honest.
> 
> I was curious if there is something else that we could do for other endpoints.
> 
> No objections otherwise.

I certainly agree that it would be nice if FLR was more robust on these
devices, but if all devices behaved within the specs we wouldn't have
these quirks to start with ;)  Just as you're suggesting maybe we could
disable busmaster before FLR, which is reasonable but doesn't work
here, I'm basically moving that to a class specific action, quiesce the
controller at the NVMe level rather than PCI level.  Essentially that's
why I thought it reasonable to apply to all NVMe class devices rather
than create just a quirk that delays after FLR for Intel and another
that disables the NVMe controller just for Samsung.  Once I decide to
apply to the whole class, then I need to bring in the device specific
knowledge already found in the native nvme driver for the delay between
clearing the enable bit and checking the ready status bit.  If it's
fragile, then the bare metal nvme driver has the same frailty.  For the
delay I added, all I can say is that it works for me and improves the
usability of the device for this purpose.  I know that 200ms is too
low, ISTR the issue was fixed at 210-220ms, so 250ms provides some
headroom and I've not seen any issues there.  If we want to make it 500
or 1000ms, that's fine by me, I expect it'd work, it's just unnecessary
until we find devices that need longer delays.  Thanks,
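
To make the quiesce step concrete, here is a userspace model of "clear the
enable bit, then poll the ready status bit". It runs against a fake device,
not real MMIO. The struct, the function names and the simulated latency are
hypothetical. The bit positions (CC.EN and CSTS.RDY both bit 0) follow the
NVMe register layout. A real implementation would sleep between polls and
bound the loop by the controller's advertised timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NVME_CC_EN    (1u << 0)	/* CC.EN, controller enable */
#define NVME_CSTS_RDY (1u << 0)	/* CSTS.RDY, controller ready */

/* Hypothetical stand-in for the mapped controller registers. */
struct fake_nvme {
	uint32_t cc;
	uint32_t csts;
	unsigned int polls_until_idle;	/* simulated shutdown latency */
};

/* Simulated CSTS read: RDY drops some polls after EN has been cleared. */
static uint32_t fake_read_csts(struct fake_nvme *dev)
{
	if (!(dev->cc & NVME_CC_EN)) {
		if (dev->polls_until_idle)
			dev->polls_until_idle--;
		else
			dev->csts &= ~NVME_CSTS_RDY;
	}
	return dev->csts;
}

/* Clear CC.EN, then poll CSTS.RDY at most max_polls times. */
static bool fake_nvme_quiesce(struct fake_nvme *dev, unsigned int max_polls)
{
	assert(dev);

	dev->cc &= ~NVME_CC_EN;
	while (max_polls--) {
		if (!(fake_read_csts(dev) & NVME_CSTS_RDY))
			return true;
		/* a real implementation would sleep here between polls */
	}
	return false;	/* controller never reported not-ready */
}
```

Only once this reports success would the quirk go on to issue the FLR.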

Alex

^ permalink raw reply

* [qemu-upstream-4.11-testing test] 125508: regressions - FAIL
From: osstest service owner @ 2018-07-24  3:16 UTC (permalink / raw)
  To: xen-devel, osstest-admin

flight 125508 qemu-upstream-4.11-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/125508/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-xl             <job status>                 broken  in 125498
 test-arm64-arm64-libvirt-xsm    <job status>                 broken  in 125498
 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install     fail REGR. vs. 124797

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds     16 guest-start/debian.repeat  fail pass in 125498

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm 4 host-install(4) broken in 125498 blocked in 124797
 test-arm64-arm64-xl       4 host-install(4) broken in 125498 blocked in 124797
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-i386-xl-pvshim    12 guest-start                  fail   never pass
 test-amd64-i386-libvirt      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl          13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check    fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check    fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check        fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check    fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop              fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop             fail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check        fail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop             fail never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check    fail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop              fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install         fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install        fail never pass

version targeted for testing:
 qemuu                20c76f9a5fbf16d58c6add2ace2ff0fabd785926
baseline version:
 qemuu                43139135a8938de44f66333831d3a8655d07663a

Last test of basis   124797  2018-06-28 16:27:31 Z   25 days
Testing same since   125273  2018-07-17 11:38:59 Z    6 days    5 attempts

------------------------------------------------------------
People who touched revisions under test:
  Alberto Garcia <berto@igalia.com>
  Alexandro Sanchez Bach <alexandro@phi.nz>
  Anthony PERARD <anthony.perard@citrix.com>
  Brijesh Singh <brijesh.singh@amd.com>
  Bruce Rogers <brogers@suse.com>
  Christian Borntraeger <borntraeger@de.ibm.com>
  Cornelia Huck <cohuck@redhat.com>
  Daniel P. Berrange <berrange@redhat.com>
  Daniel P. Berrangé <berrange@redhat.com>
  David Gibson <david@gibson.dropbear.id.au>
  Dr. David Alan Gilbert <dgilbert@redhat.com>
  Eduardo Habkost <ehabkost@redhat.com>
  Eric Blake <eblake@redhat.com>
  Fam Zheng <famz@redhat.com>
  Geert Uytterhoeven <geert+renesas@glider.be>
  Gerd Hoffmann <kraxel@redhat.com>
  Greg Kurz <groug@kaod.org>
  Halil Pasic <pasic@linux.ibm.com>
  Henry Wertz <hwertz10@gmail.com>
  Jack Schwartz <jack.schwartz@oracle.com>
  Jan Kiszka <jan.kiszka@siemens.com>
  Jason Andryuk <jandryuk@gmail.com>
  Jason Wang <jasowang@redhat.com>
  Jeff Cody <jcody@redhat.com>
  Jintack Lim <jintack@cs.columbia.edu>
  John Snow <jsnow@redhat.com>
  John Thomson <git@johnthomson.fastmail.com.au>
  Kevin Wolf <kwolf@redhat.com>
  KONRAD Frederic <frederic.konrad@adacore.com>
  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Laszlo Ersek <lersek@redhat.com>
  Laurent Vivier <laurent@vivier.eu>
  Laurent Vivier <lvivier@redhat.com>
  linzhecheng <linzhecheng@huawei.com>
  Marc-André Lureau <marcandre.lureau@redhat.com>
  Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Max Filippov <jcmvbkbc@gmail.com>
  Max Reitz <mreitz@redhat.com>
  Michael Roth <mdroth@linux.vnet.ibm.com>
  Michael S. Tsirkin <mst@redhat.com>
  Michael Walle <michael@walle.cc>
  Michal Privoznik <mprivozn@redhat.com>
  Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
  Nia Alarie <nia.alarie@gmail.com>
  Olaf Hering <olaf@aepfle.de>
  Paolo Bonzini <pbonzini@redhat.com>
  Peter Lieven <pl@kamp.de>
  Peter Maydell <peter.maydell@linaro.org>
  Peter Xu <peterx@redhat.com>
  Philippe Mathieu-Daudé <f4bug@amsat.org>
  Prasad Singamsetty <prasad.singamsetty@oracle.com>
  Prasad Singamsetty <prasad.singamsety@oracle.com>
  R. Nageswara Sastry <nasastry@in.ibm.com>
  Richard Henderson <richard.henderson@linaro.org>
  Ross Lagerwall <ross.lagerwall@citrix.com>
  Shannon Zhao <zhaoshenglong@huawei.com>
  Stefan Berger <stefanb@linux.vnet.ibm.com>
  Stefan Hajnoczi <stefanha@redhat.com>
  Thomas Huth <thuth@redhat.com>
  Tiwei Bie <tiwei.bie@intel.com>
  Victor Kamensky <kamensky@cisco.com>
  Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>

jobs:
 build-amd64-xsm                                              pass    
 build-arm64-xsm                                              pass    
 build-i386-xsm                                               pass    
 build-amd64                                                  pass    
 build-arm64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-arm64-libvirt                                          pass    
 build-armhf-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-pvops                                            pass    
 build-arm64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 test-amd64-amd64-xl                                          pass    
 test-arm64-arm64-xl                                          pass    
 test-armhf-armhf-xl                                          pass    
 test-amd64-i386-xl                                           pass    
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm           pass    
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm            pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm                pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm                 pass    
 test-amd64-amd64-libvirt-xsm                                 pass    
 test-arm64-arm64-libvirt-xsm                                 pass    
 test-amd64-i386-libvirt-xsm                                  pass    
 test-amd64-amd64-xl-xsm                                      pass    
 test-arm64-arm64-xl-xsm                                      pass    
 test-amd64-i386-xl-xsm                                       pass    
 test-amd64-amd64-qemuu-nested-amd                            fail    
 test-amd64-amd64-xl-pvhv2-amd                                pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           fail    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass    
 test-amd64-i386-freebsd10-amd64                              pass    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-i386-xl-qemuu-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-ws16-amd64                         fail    
 test-amd64-i386-xl-qemuu-ws16-amd64                          fail    
 test-armhf-armhf-xl-arndale                                  pass    
 test-amd64-amd64-xl-credit2                                  pass    
 test-arm64-arm64-xl-credit2                                  pass    
 test-armhf-armhf-xl-credit2                                  pass    
 test-armhf-armhf-xl-cubietruck                               pass    
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict        fail    
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict         fail    
 test-amd64-i386-freebsd10-i386                               pass    
 test-amd64-amd64-xl-qemuu-win10-i386                         fail    
 test-amd64-i386-xl-qemuu-win10-i386                          fail    
 test-amd64-amd64-qemuu-nested-intel                          pass    
 test-amd64-amd64-xl-pvhv2-intel                              pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-amd64-libvirt                                     pass    
 test-armhf-armhf-libvirt                                     pass    
 test-amd64-i386-libvirt                                      pass    
 test-amd64-amd64-xl-multivcpu                                pass    
 test-armhf-armhf-xl-multivcpu                                pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         pass    
 test-amd64-amd64-libvirt-pair                                pass    
 test-amd64-i386-libvirt-pair                                 pass    
 test-amd64-amd64-amd64-pvgrub                                pass    
 test-amd64-amd64-i386-pvgrub                                 pass    
 test-amd64-amd64-xl-pvshim                                   pass    
 test-amd64-i386-xl-pvshim                                    fail    
 test-amd64-amd64-pygrub                                      pass    
 test-amd64-amd64-xl-qcow2                                    pass    
 test-armhf-armhf-libvirt-raw                                 pass    
 test-amd64-i386-xl-raw                                       pass    
 test-amd64-amd64-xl-rtds                                     pass    
 test-armhf-armhf-xl-rtds                                     fail    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow             pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow              pass    
 test-amd64-amd64-xl-shadow                                   pass    
 test-amd64-i386-xl-shadow                                    pass    
 test-amd64-amd64-libvirt-vhd                                 pass    
 test-armhf-armhf-xl-vhd                                      pass    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

broken-job test-arm64-arm64-xl broken
broken-job test-arm64-arm64-libvirt-xsm broken

Not pushing.

(No revision log; it would be 3001 lines long.)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply

* Re: [PATCH v3 bpf-next 5/8] veth: Add ndo_xdp_xmit
From: Toshiaki Makita @ 2018-07-24  2:11 UTC (permalink / raw)
  To: Jakub Kicinski, Toshiaki Makita
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, tariqt
In-Reply-To: <20180723180246.1836bc11@cakuba.netronome.com>

On 2018/07/24 10:02, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:05 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>
>> This allows NIC's XDP to redirect packets to veth. The destination veth
>> device enqueues redirected packets to the napi ring of its peer, then
>> they are processed by XDP on its peer veth device.
>> This can be thought of as calling another XDP program from an XDP program
>> using REDIRECT, when the peer enables driver XDP.
>>
>> Note that when the peer veth device does not set driver xdp, redirected
>> packets will be dropped because the peer is not ready for NAPI.
> 
> Often we can't redirect to devices which don't have an XDP program
> installed.  In your case we can't redirect unless the peer of the
> target doesn't have a program installed?  :(

Right. I tried to avoid this case by converting xdp_frames to skb but
realized that should not be done.
https://patchwork.ozlabs.org/patch/903536/

> Perhaps it is time to reconsider what Saeed once asked for, a flag or
> attribute to enable being the destination of a XDP_REDIRECT.

Yes, something will be necessary. Jesper said Tariq had some ideas to
implement it.

> 
>> v2:
>> - Drop the part converting xdp_frame into skb when XDP is not enabled.
>> - Implement bulk interface of ndo_xdp_xmit.
>> - Implement XDP_XMIT_FLUSH bit and drop ndo_xdp_flush.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> ---
>>  drivers/net/veth.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 45 insertions(+)
>>
>> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
>> index 4be75c58bc6a..57187e955fea 100644
>> --- a/drivers/net/veth.c
>> +++ b/drivers/net/veth.c
>> @@ -17,6 +17,7 @@
>>  #include <net/rtnetlink.h>
>>  #include <net/dst.h>
>>  #include <net/xfrm.h>
>> +#include <net/xdp.h>
>>  #include <linux/veth.h>
>>  #include <linux/module.h>
>>  #include <linux/bpf.h>
>> @@ -125,6 +126,11 @@ static void *veth_ptr_to_xdp(void *ptr)
>>  	return (void *)((unsigned long)ptr & ~VETH_XDP_FLAG);
>>  }
>>  
>> +static void *veth_xdp_to_ptr(void *ptr)
>> +{
>> +	return (void *)((unsigned long)ptr | VETH_XDP_FLAG);
>> +}
>> +
>>  static void veth_ptr_free(void *ptr)
>>  {
>>  	if (veth_is_xdp_frame(ptr))
>> @@ -267,6 +273,44 @@ static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
>>  	return skb;
>>  }
>>  
>> +static int veth_xdp_xmit(struct net_device *dev, int n,
>> +			 struct xdp_frame **frames, u32 flags)
>> +{
>> +	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> +	struct net_device *rcv;
>> +	int i, drops = 0;
>> +
>> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> +		return -EINVAL;
>> +
>> +	rcv = rcu_dereference(priv->peer);
>> +	if (unlikely(!rcv))
>> +		return -ENXIO;
>> +
>> +	rcv_priv = netdev_priv(rcv);
>> +	/* xdp_ring is initialized on receive side? */
>> +	if (!rcu_access_pointer(rcv_priv->xdp_prog))
>> +		return -ENXIO;
>> +
>> +	spin_lock(&rcv_priv->xdp_ring.producer_lock);
>> +	for (i = 0; i < n; i++) {
>> +		struct xdp_frame *frame = frames[i];
>> +		void *ptr = veth_xdp_to_ptr(frame);
>> +
>> +		if (unlikely(xdp_ok_fwd_dev(rcv, frame->len) ||
>> +			     __ptr_ring_produce(&rcv_priv->xdp_ring, ptr))) {
> 
> Would you mind sparing a few more words on how this is safe vs the
> .ndo_close() on the peer?  Personally I'm a bit uncomfortable with the
> IFF_UP check in xdp_ok_fwd_dev(), I'm not sure what's supposed to
> guarantee the device doesn't go down right after that check, or is
> already down, but netdev->flags are not atomic...  
> 
>> +			xdp_return_frame_rx_napi(frame);
>> +			drops++;
>> +		}
>> +	}
>> +	spin_unlock(&rcv_priv->xdp_ring.producer_lock);
>> +
>> +	if (flags & XDP_XMIT_FLUSH)
>> +		__veth_xdp_flush(rcv_priv);
>> +
>> +	return n - drops;
>> +}
>> +
>>  static struct sk_buff *veth_xdp_rcv_one(struct veth_priv *priv,
>>  					struct xdp_frame *frame)
>>  {
>> @@ -760,6 +804,7 @@ static const struct net_device_ops veth_netdev_ops = {
>>  	.ndo_features_check	= passthru_features_check,
>>  	.ndo_set_rx_headroom	= veth_set_rx_headroom,
>>  	.ndo_bpf		= veth_xdp,
>> +	.ndo_xdp_xmit		= veth_xdp_xmit,
>>  };
>>  
>>  #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
> 
> 
> 
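
As background for the veth_xdp_to_ptr()/veth_ptr_to_xdp() pair in the patch:
the ring stores both skbs and xdp_frames, and the low pointer bit tells them
apart. Below is a minimal userspace sketch of that tagging. The function
names are made up, and the flag being bit 0 is an assumption, since
VETH_XDP_FLAG itself is not defined in the quoted hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XDP_PTR_FLAG ((uintptr_t)1)	/* assumed: low bit marks an xdp_frame */

/* Mark a pointer as an xdp_frame before putting it on the ring. */
static void *tag_xdp_ptr(void *ptr)
{
	/* The tag only works if bit 0 of the real pointer is always clear. */
	assert(((uintptr_t)ptr & XDP_PTR_FLAG) == 0);
	return (void *)((uintptr_t)ptr | XDP_PTR_FLAG);
}

/* Consumer side: does this ring entry carry the xdp_frame tag? */
static bool ptr_is_xdp(const void *ptr)
{
	return ((uintptr_t)ptr & XDP_PTR_FLAG) != 0;
}

/* Strip the tag to recover the original pointer. */
static void *untag_xdp_ptr(void *ptr)
{
	return (void *)((uintptr_t)ptr & ~XDP_PTR_FLAG);
}
```

This relies on both object types being at least 2-byte aligned, which is
why one shared ring can carry either kind without a separate type field.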

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [PATCH v2 2/2] PCI: NVMe device specific reset quirk
From: Alex Williamson @ 2018-07-24  3:16 UTC (permalink / raw)
  To: Sinan Kaya; +Cc: linux-pci, linux-kernel, linux-nvme
In-Reply-To: <CAK9iUCPG-H3vWjiEnjAgr_nWEf=0Sn+AvRt68_Sg16Zw8TobRg@mail.gmail.com>

On Mon, 23 Jul 2018 19:20:41 -0700
Sinan Kaya <Okaya@kernel.org> wrote:

> On 7/23/18, Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Mon, 23 Jul 2018 17:40:02 -0700
> > Sinan Kaya <okaya@kernel.org> wrote:
> >  
> >> On 7/23/2018 5:13 PM, Alex Williamson wrote:  
> >> > + * The NVMe specification requires that controllers support PCIe FLR, but
> >> > + * some Samsung SM961/PM961 controllers fail to recover after FLR (-1
> >> > + * config space) unless the device is quiesced prior to FLR.
> >>
> >> Does disabling the memory bit in PCI config space as part of the FLR
> >> reset function help? (like the very first thing)  
> >
> > No, it does not.  I modified this to only clear PCI_COMMAND_MEMORY and
> > call pcie_flr(), the Samsung controller dies just as it did previously.
> >  
> >> Can we do that in the pcie_flr() function to cover other endpoint types
> >> that might be pushing traffic while code is trying to do a reset?  
> >
> > Do you mean PCI_COMMAND_MASTER rather than PCI_COMMAND_MEMORY?  
> 
> Yes
> 
> >  I tried
> > that too, it doesn't work either.  I'm not really sure the theory
> > behind clearing memory, clearing busmaster to stop DMA seems like a
> > sane thing to do, but doesn't help here.  
> 
> Let me explain what I guessed. You might be able to fill in the blanks
> where I am completely off.
> 
> We do vfio initiated flr reset immediately following guest machine
> shutdown. The card could be fully enabled and pushing traffic to the
> system at this moment.
> 
> I don't know if vfio does any device disable or not.

Yes, pci_clear_master() is the very first thing we do in
vfio_pci_disable(), well before we try to reset the device.
 
> FLR is supposed to reset the endpoint but endpoint doesn't recover per
> your report.
> 
> Having vendor specific reset routines for PCIE endpoints defeats the
> purpose of FLR.
> 
> Since the adapter is fully functional, i suggested turning off bus
> master and memory enable bits to stop endpoint from sending packets.
> 
> But, this is not helping either.
> 
> Those sleep statements looked very fragile to be honest.
> 
> I was curious if there is something else that we could do for other endpoints.
> 
> No objections otherwise.

I certainly agree that it would be nice if FLR was more robust on these
devices, but if all devices behaved within the specs we wouldn't have
these quirks to start with ;)  Just as you're suggesting maybe we could
disable busmaster before FLR, which is reasonable but doesn't work
here, I'm basically moving that to a class specific action, quiesce the
controller at the NVMe level rather than PCI level.  Essentially that's
why I thought it reasonable to apply to all NVMe class devices rather
than create just a quirk that delays after FLR for Intel and another
that disables the NVMe controller just for Samsung.  Once I decide to
apply to the whole class, then I need to bring in the device specific
knowledge already found in the native nvme driver for the delay between
clearing the enable bit and checking the ready status bit.  If it's
fragile, then the bare metal nvme driver has the same frailty.  For the
delay I added, all I can say is that it works for me and improves the
usability of the device for this purpose.  I know that 200ms is too
low, ISTR the issue was fixed at 210-220ms, so 250ms provides some
headroom and I've not seen any issues there.  If we want to make it 500
or 1000ms, that's fine by me, I expect it'd work, it's just unnecessary
until we find devices that need longer delays.  Thanks,

Alex
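
For illustration, the class-level quiesce described above amounts to
clearing the enable bit (CC.EN) and polling the ready status bit
(CSTS.RDY) until it clears, before the FLR is issued.  The following is
only a minimal user-space sketch against a simulated controller, not the
quirk itself: struct fake_nvme, fake_read_csts() and quiesce() are
hypothetical names, and the poll budget stands in for the real delay.

```c
#include <stdbool.h>
#include <stdint.h>

#define NVME_CC_EN    (1u << 0)   /* Controller Configuration: Enable */
#define NVME_CSTS_RDY (1u << 0)   /* Controller Status: Ready */

/* Simulated controller; a real device would be accessed through MMIO. */
struct fake_nvme {
	uint32_t cc;
	uint32_t csts;
	int polls_until_idle;   /* reads needed before RDY drops (assumption) */
};

/* Simulated CSTS read: RDY drops some number of reads after EN is cleared. */
static uint32_t fake_read_csts(struct fake_nvme *dev)
{
	if (!(dev->cc & NVME_CC_EN) && dev->polls_until_idle > 0 &&
	    --dev->polls_until_idle == 0)
		dev->csts &= ~NVME_CSTS_RDY;
	return dev->csts;
}

/* Clear CC.EN, then poll CSTS.RDY; returns true once the controller is idle. */
static bool quiesce(struct fake_nvme *dev, int max_polls)
{
	dev->cc &= ~NVME_CC_EN;
	for (int i = 0; i < max_polls; i++) {
		if (!(fake_read_csts(dev) & NVME_CSTS_RDY))
			return true;
		/* real code would sleep between polls here */
	}
	return false;   /* timed out; the caller decides whether to reset anyway */
}
```

A caller would only proceed to the reset once quiesce() returns true, or
fall back to whatever policy it prefers on a timeout.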

* Re: Incorrect name of PCM
From: Christopher Head @ 2018-07-24  3:17 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel
In-Reply-To: <s5hin55zw8b.wl-tiwai@suse.de>


[-- Attachment #1.1.1: Type: text/plain, Size: 1295 bytes --]

On Mon, 23 Jul 2018 19:40:36 +0200
Takashi Iwai <tiwai@suse.de> wrote:

> OK, then another possibility is a BIOS bug.  BIOS declares the pin as
> HDMI incorrectly although it's a SPDIF.

That would appear to be the case. According to the VT1708S datasheet,
the typical application is for digital output widget node 0x12,
attached to pin complex node 0x20, to be used for S/PDIF, and digital
output widget node 0x15, attached to pin complex node 0x21, to be used
for HDMI. The datasheet’s default values for the Configuration Default
words for the two pin complex nodes agree with that configuration. This
also appears to be how my motherboard is configured.

However, I have attached /proc/asound/card0/codec#0; this file states
that node 0x20 is HDMI and 0x21 is S/PDIF, and having decoded the raw
words based on the Intel HDA specification revision 1.0a, I agree that
the kernel is decoding them correctly. I assume the kernel driver
doesn’t change these words, which means the information I’m seeing
there, since it’s not the codec default, must have been put there by
the BIOS, apparently erroneously.
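
As a rough sketch of the decoding I did by hand: the field layout below
is the Configuration Default layout from the HDA specification, while
struct pin_cfg and decode_pin_cfg() are illustrative names of my own,
not kernel APIs.

```c
#include <stdint.h>

/* Fields of a 32-bit HDA Configuration Default word. */
struct pin_cfg {
	unsigned port_conn;   /* bits 31:30, port connectivity */
	unsigned location;    /* bits 29:24 */
	unsigned device;      /* bits 23:20, default device */
	unsigned conn_type;   /* bits 19:16, connection type */
	unsigned color;       /* bits 15:12 */
	unsigned misc;        /* bits 11:8 */
	unsigned assoc;       /* bits 7:4, default association */
	unsigned seq;         /* bits 3:0, sequence */
};

/* Split a raw Configuration Default word into its fields. */
static struct pin_cfg decode_pin_cfg(uint32_t w)
{
	struct pin_cfg c = {
		.port_conn = (w >> 30) & 0x3,
		.location  = (w >> 24) & 0x3f,
		.device    = (w >> 20) & 0xf,
		.conn_type = (w >> 16) & 0xf,
		.color     = (w >> 12) & 0xf,
		.misc      = (w >> 8) & 0xf,
		.assoc     = (w >> 4) & 0xf,
		.seq       = w & 0xf,
	};
	return c;
}
```

Applied to the two words in the attachment, node 0x20 (0x185600f0) gives
location 0x18 and device 0x5, and node 0x21 (0x074511f0) gives location
0x07 and device 0x4, which is what the kernel prints as "Digital Out at
Int HDMI" and "SPDIF Out at Ext Rear Panel" respectively.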

Is this something that the ALSA project wants to (or even can) add a
quirk for? I am already running the most recent BIOS available.
-- 
Christopher Head

[-- Attachment #1.1.2: codec#0 --]
[-- Type: application/octet-stream, Size: 11233 bytes --]

Codec: VIA VT1708S
Address: 0
AFG Function Id: 0x1 (unsol 0)
Vendor Id: 0x11060397
Subsystem Id: 0x10438415
Revision Id: 0x100000
No Modem Function Group found
Default PCM:
    rates [0x0]:
    bits [0x0]:
    formats [0x0]:
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
GPIO: io=1, o=0, i=0, unsolicited=1, wake=0
  IO[0]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
Node 0x10 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Front Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Device: name="VT1708S Analog", type="Audio", device=0
  Amp-Out caps: ofs=0x2a, nsteps=0x2a, stepsize=0x05, mute=0
  Amp-Out vals:  [0x00 0x00]
  Converter: stream=0, channel=0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x11 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Surround Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x2a, nsteps=0x2a, stepsize=0x05, mute=0
  Amp-Out vals:  [0x00 0x00]
  Converter: stream=0, channel=0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x12 [Audio Output] wcaps 0x611: Stereo Digital
  Control: name="IEC958 Playback Con Mask", index=0, device=0
  Control: name="IEC958 Playback Pro Mask", index=0, device=0
  Control: name="IEC958 Playback Default", index=0, device=0
  Control: name="IEC958 Playback Switch", index=0, device=0
  Control: name="IEC958 Default PCM Playback Switch", index=0, device=0
  Device: name="VT1708S Digital", type="HDMI", device=3
  Converter: stream=0, channel=0
  Digital: Enabled GenLevel
  Digital category: 0x2
  IEC Coding Type: 0x0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x13 [Audio Input] wcaps 0x10051b: Stereo Amp-In
  Control: name="Capture Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=0, ofs=0
  Control: name="Capture Switch", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=0, ofs=0
  Device: name="VT1708S Analog", type="Audio", device=0
  Amp-In caps: ofs=0x0b, nsteps=0x1f, stepsize=0x05, mute=1
  Amp-In vals:  [0x12 0x12]
  Converter: stream=0, channel=0
  SDI-Select: 0
  PCM:
    rates [0x560]: 44100 48000 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x17
Node 0x14 [Audio Input] wcaps 0x10051b: Stereo Amp-In
  Amp-In caps: ofs=0x0b, nsteps=0x1f, stepsize=0x05, mute=1
  Amp-In vals:  [0x8b 0x8b]
  Converter: stream=0, channel=0
  SDI-Select: 0
  PCM:
    rates [0x560]: 44100 48000 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x1e
Node 0x15 [Audio Output] wcaps 0x611: Stereo Digital
  Converter: stream=0, channel=0
  Digital: Enabled GenLevel
  Digital category: 0x2
  IEC Coding Type: 0x0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x16 [Audio Mixer] wcaps 0x20050b: Stereo Amp-In
  Control: name="Rear Mic Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=2, ofs=0
  Control: name="Rear Mic Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=2, ofs=0
  Control: name="Front Mic Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=4, ofs=0
  Control: name="Front Mic Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=4, ofs=0
  Control: name="Line Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=3, ofs=0
  Control: name="Line Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=3, ofs=0
  Amp-In caps: ofs=0x17, nsteps=0x1f, stepsize=0x05, mute=1
  Amp-In vals:  [0x17 0x17] [0x80 0x80] [0x80 0x80] [0x80 0x80] [0x80 0x80] [0x80 0x80] [0x80 0x80]
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 7
     0x10 0x1f 0x1a 0x1b 0x1e 0x1d 0x25
Node 0x17 [Audio Selector] wcaps 0x300501: Stereo
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 6
     0x1f 0x1a 0x1b 0x1e* 0x1d 0x16
Node 0x18 [Audio Selector] wcaps 0x30050d: Stereo Amp-Out
  Control: name="Surround Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x80 0x80]
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x11
Node 0x19 [Pin Complex] wcaps 0x400581: Stereo
  Pincap 0x00000014: OUT Detect
  Pin Default 0x410110f0: [N/A] Line Out at Ext Rear
    Conn = 1/8, Color = Black
    DefAssociation = 0xf, Sequence = 0x0
  Pin-ctls: 0x00:
  Unsolicited: tag=00, enabled=0
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x18
Node 0x1a [Pin Complex] wcaps 0x400581: Stereo
  Control: name="Rear Mic Boost Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=0, ofs=0
  Pincap 0x00002334: IN OUT Detect
    Vref caps: HIZ 50 100
  Pin Default 0x01a19036: [Jack] Mic at Ext Rear
    Conn = 1/8, Color = Pink
    DefAssociation = 0x3, Sequence = 0x6
  Pin-ctls: 0x21: IN VREF_50
  Unsolicited: tag=03, enabled=1
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x26
Node 0x1b [Pin Complex] wcaps 0x400581: Stereo
  Pincap 0x00002334: IN OUT Detect
    Vref caps: HIZ 50 100
  Pin Default 0x0181303e: [Jack] Line In at Ext Rear
    Conn = 1/8, Color = Blue
    DefAssociation = 0x3, Sequence = 0xe
  Pin-ctls: 0x20: IN VREF_HIZ
  Unsolicited: tag=05, enabled=1
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x18
Node 0x1c [Pin Complex] wcaps 0x40058d: Stereo Amp-Out
  Control: name="Front Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x80 0x80]
  Pincap 0x0001001c: OUT HP EAPD Detect
  EAPD 0x2: EAPD
  Pin Default 0x01014010: [Jack] Line Out at Ext Rear
    Conn = 1/8, Color = Green
    DefAssociation = 0x1, Sequence = 0x0
  Pin-ctls: 0x40: OUT
  Unsolicited: tag=01, enabled=1
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x16
Node 0x1d [Pin Complex] wcaps 0x40058d: Stereo Amp-Out
  Control: name="Headphone Playback Switch", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x00 0x00]
  Pincap 0x0000233c: IN OUT HP Detect
    Vref caps: HIZ 50 100
  Pin Default 0x0221401f: [Jack] HP Out at Ext Front
    Conn = 1/8, Color = Green
    DefAssociation = 0x1, Sequence = 0xf
  Pin-ctls: 0xc0: OUT HP VREF_HIZ
  Unsolicited: tag=02, enabled=1
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 2
     0x16 0x25*
Node 0x1e [Pin Complex] wcaps 0x40058d: Stereo Amp-Out
  Control: name="Front Mic Boost Volume", index=0, device=0
    ControlAmp: chs=3, dir=In, idx=0, ofs=0
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x80 0x80]
  Pincap 0x0000233c: IN OUT HP Detect
    Vref caps: HIZ 50 100
  Pin Default 0x02a19037: [Jack] Mic at Ext Front
    Conn = 1/8, Color = Pink
    DefAssociation = 0x3, Sequence = 0x7
  Pin-ctls: 0x21: IN VREF_50
  Unsolicited: tag=04, enabled=1
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 2
     0x16* 0x25
Node 0x1f [Pin Complex] wcaps 0x400401: Stereo
  Pincap 0x00000020: IN
  Pin Default 0x503701f0: [N/A] CD at Int N/A
    Conn = Analog, Color = Unknown
    DefAssociation = 0xf, Sequence = 0x0
    Misc = NO_PRESENCE
  Pin-ctls: 0x00:
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x20 [Pin Complex] wcaps 0x400701: Stereo Digital
  Pincap 0x00000010: OUT
  Pin Default 0x185600f0: [Jack] Digital Out at Int HDMI
    Conn = Digital, Color = Unknown
    DefAssociation = 0xf, Sequence = 0x0
  Pin-ctls: 0x40: OUT
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x12
Node 0x21 [Pin Complex] wcaps 0x400701: Stereo Digital
  Pincap 0x00000010: OUT
  Pin Default 0x074511f0: [Jack] SPDIF Out at Ext Rear Panel
    Conn = Optical, Color = Black
    DefAssociation = 0xf, Sequence = 0x0
    Misc = NO_PRESENCE
  Pin-ctls: 0x40: OUT
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x15
Node 0x22 [Pin Complex] wcaps 0x400581: Stereo
  Pincap 0x00000014: OUT Detect
  Pin Default 0x410160f0: [N/A] Line Out at Ext Rear
    Conn = 1/8, Color = Orange
    DefAssociation = 0xf, Sequence = 0x0
  Pin-ctls: 0x00:
  Unsolicited: tag=00, enabled=0
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x26
Node 0x23 [Pin Complex] wcaps 0x400581: Stereo
  Pincap 0x00000014: OUT Detect
  Pin Default 0x410120f0: [N/A] Line Out at Ext Rear
    Conn = 1/8, Color = Grey
    DefAssociation = 0xf, Sequence = 0x0
  Pin-ctls: 0x00:
  Unsolicited: tag=00, enabled=0
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x27
Node 0x24 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Center Playback Volume", index=0, device=0
    ControlAmp: chs=1, dir=Out, idx=0, ofs=0
  Control: name="LFE Playback Volume", index=0, device=0
    ControlAmp: chs=2, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x2a, nsteps=0x2a, stepsize=0x05, mute=0
  Amp-Out vals:  [0x00 0x00]
  Converter: stream=0, channel=0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x25 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Headphone Playback Volume", index=0, device=0
    ControlAmp: chs=3, dir=Out, idx=0, ofs=0
  Device: name="VT1708S Alt Analog", type="Audio", device=2
  Amp-Out caps: ofs=0x2a, nsteps=0x2a, stepsize=0x05, mute=0
  Amp-Out vals:  [0x1c 0x1c]
  Converter: stream=0, channel=0
  PCM:
    rates [0x5e0]: 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
Node 0x26 [Audio Selector] wcaps 0x30050d: Stereo Amp-Out
  Control: name="Center Playback Switch", index=0, device=0
    ControlAmp: chs=1, dir=Out, idx=0, ofs=0
  Control: name="LFE Playback Switch", index=0, device=0
    ControlAmp: chs=2, dir=Out, idx=0, ofs=0
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x80 0x80]
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x24
Node 0x27 [Audio Selector] wcaps 0x30050d: Stereo Amp-Out
  Amp-Out caps: ofs=0x00, nsteps=0x00, stepsize=0x00, mute=1
  Amp-Out vals:  [0x80 0x80]
  Power states:  D0 D1 D2 D3
  Power: setting=D0, actual=D0
  Connection: 1
     0x25

* Re: [PATCH net-next] tcp: ack immediately when a cwr packet arrives
From: Neal Cardwell @ 2018-07-24  2:15 UTC (permalink / raw)
  To: Lawrence Brakmo; +Cc: Netdev, Kernel Team, ast, Yuchung Cheng, Eric Dumazet
In-Reply-To: <20180724004939.2874202-1-brakmo@fb.com>

On Mon, Jul 23, 2018 at 8:49 PM Lawrence Brakmo <brakmo@fb.com> wrote:
>
> We observed high 99 and 99.9% latencies when doing RPCs with DCTCP. The
> problem is triggered when the last packet of a request arrives CE
> marked. The reply will carry the ECE mark causing TCP to shrink its cwnd
> to 1 (because there are no packets in flight). When the 1st packet of
> the next request arrives, the ACK was sometimes delayed even though it
> is CWR marked, adding up to 40ms to the RPC latency.
>
> This patch insures that CWR marked data packets arriving will be acked
> immediately.
...
> Modified based on comments by Neal Cardwell <ncardwell@google.com>
>
> Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
> ---
>  net/ipv4/tcp_input.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Seems like a nice mechanism to have, IMHO.

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks!
neal

* Re: [PATCH bpf-next] bpf: btf: fix inconsistent IS_ERR and PTR_ERR
From: David Miller @ 2018-07-24  3:20 UTC (permalink / raw)
  To: yuehaibing
  Cc: ast, daniel, quentin.monnet, jakub.kicinski, bhole_prashant_q7,
	osk, linux-kernel, netdev
In-Reply-To: <20180724025524.22012-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 24 Jul 2018 10:55:24 +0800

> Fix inconsistent IS_ERR and PTR_ERR in get_btf,
> the proper pointer to be passed as argument is '*btf'
> 
> This issue was detected with the help of Coccinelle.
> 
> Fixes: 2d3feca8c44f ("bpf: btf: print map dump and lookup with btf info")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Acked-by: David S. Miller <davem@davemloft.net>

* [PATCH 00/20] PIDTYPE_TGID removal of fork restarts
From: Eric W. Biederman @ 2018-07-24  3:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, Andrew Morton, linux-kernel, Wen Yang, majiang
In-Reply-To: <877em2jxyr.fsf_-_@xmission.com>


This took longer than I thought to address all of the issues and double
check I am not missing something.  I have split off a few of the patches
so now the patch series appears longer.  It now covers less ground.

I realized while reviewing the group signals that for none of them is
siginfo important.  Which means by slightly lowering our quality of
implementation in delivering those signals to a brand new process (by
not queueing siginfo) I can collect them all in a sigset and the code
is no more difficult than a sequence counter.  Which means it is
straightforward to completely eliminate restarts from fork.

The implementation of PIDTYPE_TGID remains the same.  How it gets used
has changed to guarantee that looking up a thread group by the pid of
one of its threads and sending it a signal continues to work exactly
the same as before.

Please take a look and verify that I have caught everything.  I think I
have but if not please let me know.

Thank you in advance,
Eric

Eric W. Biederman (20):
      pids: Initialize leader_pid in init_task
      pids: Move task_pid_type into sched/signal.h
      pids: Compute task_tgid using signal->leader_pid
      kvm: Don't open code task_pid in kvm_vcpu_ioctl
      pids: Move the pgrp and session pid pointers from task_struct to signal_struct
      pid: Implement PIDTYPE_TGID
      signal: Use PIDTYPE_TGID to clearly store where file signals will be sent
      posix-timers: Noralize good_sigevent
      signal: Pass pid and pid type into send_sigqueue
      signal: Pass pid type into group_send_sig_info
      signal: Pass pid type into send_sigio_to_task & send_sigurg_to_task
      signal: Pass pid type into do_send_sig_info
      signal: Push pid type down into send_signal
      signal: Push pid type down into __send_signal
      signal: Push pid type down into complete_signal.
      fork: Move and describe why the code examines PIDNS_ADDING
      fork: Unconditionally exit if a fatal signal is pending
      signal: Add calculate_sigpending()
      fork: Have new threads join on-going signal group stops
      signal: Don't restart fork when signals come in.

 arch/ia64/kernel/asm-offsets.c       |  4 +-
 arch/ia64/kernel/fsys.S              | 12 ++---
 arch/s390/kernel/perf_cpum_sf.c      |  2 +-
 drivers/net/tun.c                    |  2 +-
 drivers/platform/x86/thinkpad_acpi.c |  1 +
 drivers/tty/sysrq.c                  |  2 +-
 drivers/tty/tty_io.c                 |  2 +-
 fs/autofs/autofs_i.h                 |  1 +
 fs/exec.c                            |  1 +
 fs/fcntl.c                           | 72 +++++++++++++--------------
 fs/fuse/file.c                       |  1 +
 fs/locks.c                           |  2 +-
 fs/notify/dnotify/dnotify.c          |  3 +-
 fs/notify/fanotify/fanotify.c        |  1 +
 include/linux/init_task.h            |  9 ----
 include/linux/pid.h                  | 11 +----
 include/linux/sched.h                | 31 +++---------
 include/linux/sched/signal.h         | 49 +++++++++++++++++--
 include/linux/signal.h               |  6 ++-
 include/net/scm.h                    |  1 +
 init/init_task.c                     | 12 +++--
 kernel/events/core.c                 |  2 +-
 kernel/exit.c                        | 12 ++---
 kernel/fork.c                        | 70 +++++++++++++++++++--------
 kernel/pid.c                         | 42 ++++++++--------
 kernel/signal.c                      | 94 ++++++++++++++++++++++++++----------
 kernel/time/itimer.c                 |  5 +-
 kernel/time/posix-cpu-timers.c       |  2 +-
 kernel/time/posix-timers.c           | 21 ++++----
 mm/oom_kill.c                        |  4 +-
 virt/kvm/kvm_main.c                  |  2 +-
 31 files changed, 282 insertions(+), 197 deletions(-)

* [PATCH 01/20] pids: Initialize leader_pid in init_task
From: Eric W. Biederman @ 2018-07-24  3:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, Andrew Morton, linux-kernel, Wen Yang, majiang,
	Eric W. Biederman
In-Reply-To: <87efft5ncd.fsf_-_@xmission.com>

This is cheap and no cost so we might as well.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 init/init_task.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/init/init_task.c b/init/init_task.c
index 74f60baa2799..7914ffb8dc73 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -33,6 +33,7 @@ static struct signal_struct init_signals = {
 	},
 #endif
 	INIT_CPU_TIMERS(init_signals)
+	.leader_pid = &init_struct_pid,
 	INIT_PREV_CPUTIME(init_signals)
 };
 
-- 
2.17.1


* [PATCH 02/20] pids: Move task_pid_type into sched/signal.h
From: Eric W. Biederman @ 2018-07-24  3:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, Andrew Morton, linux-kernel, Wen Yang, majiang,
	Eric W. Biederman
In-Reply-To: <87efft5ncd.fsf_-_@xmission.com>

The function is general and inline so there is no need
to hide it inside of exit.c

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/sched/signal.h | 8 ++++++++
 kernel/exit.c                | 8 --------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 113d1ad1ced7..d8ef0a3d2e7e 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -556,6 +556,14 @@ extern bool current_is_single_threaded(void);
 typedef int (*proc_visitor)(struct task_struct *p, void *data);
 void walk_process_tree(struct task_struct *top, proc_visitor, void *);
 
+static inline
+struct pid *task_pid_type(struct task_struct *task, enum pid_type type)
+{
+	if (type != PIDTYPE_PID)
+		task = task->group_leader;
+	return task->pids[type].pid;
+}
+
 static inline int get_nr_threads(struct task_struct *tsk)
 {
 	return tsk->signal->nr_threads;
diff --git a/kernel/exit.c b/kernel/exit.c
index c3c7ac560114..16432428fc6c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1001,14 +1001,6 @@ struct wait_opts {
 	int			notask_error;
 };
 
-static inline
-struct pid *task_pid_type(struct task_struct *task, enum pid_type type)
-{
-	if (type != PIDTYPE_PID)
-		task = task->group_leader;
-	return task->pids[type].pid;
-}
-
 static int eligible_pid(struct wait_opts *wo, struct task_struct *p)
 {
 	return	wo->wo_type == PIDTYPE_MAX ||
-- 
2.17.1

