All of lore.kernel.org

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v4 4/8] mtd: nand: atmel: Avoid ECC errors when leaving backup mode
From: Boris Brezillon @ 2017-10-05  7:22 UTC (permalink / raw)
  To: Romain Izard
  Cc: Michael Turquette, Stephen Boyd, Lee Jones, Wenyou Yang, Josh Wu,
	Richard Weinberger, David Woodhouse, Brian Norris, Marek Vasut,
	Cyrille Pitchen, Thierry Reding, Richard Genoud,
	Greg Kroah-Hartman, Jiri Slaby, Alan Stern, Ludovic Desroches,
	Nicolas Ferre, Alexandre Belloni, linux-clk, linux-kernel,
	linux-mtd, linux-pwm, linux-serial, linux-usb
In-Reply-To: <20170928094627.31017-5-romain.izard.pro@gmail.com>

On Thu, 28 Sep 2017 11:46:23 +0200
Romain Izard <romain.izard.pro@gmail.com> wrote:

> During backup mode, the contents of all registers will be cleared as the
> SoC will be completely powered down. For a product that boots on NAND
> Flash memory, the bootloader will obviously use the related controller
> to read the Flash and correct any detected error in the memory, before
> handling back control to the kernel's resuming entry point.
> 
> But it does not clean the NAND controller registers after use and on its
> side the kernel driver expects the error locator to be powered down and
> in a clean state. Add a resume hook for the PMECC error locator, and
> reset its registers.
> 
> Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>

Applied.

Thanks,

Boris

> ---
> Changes in v3:
> * keep the PMECC disabled when not in use, and use atmel_pmecc_resume to
>   reset the controller after the bootloader has left it enabled.
> 
> Changes in v4:
> * export atmel_pmecc_reset instead of atmel_pmecc_resume
> * use the correct pointer in atmel_nand_controller_resume
> 
>  drivers/mtd/nand/atmel/nand-controller.c |  3 +++
>  drivers/mtd/nand/atmel/pmecc.c           | 17 +++++++++--------
>  drivers/mtd/nand/atmel/pmecc.h           |  1 +
>  3 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mtd/nand/atmel/nand-controller.c b/drivers/mtd/nand/atmel/nand-controller.c
> index f25eca79f4e5..8afcff9a66ea 100644
> --- a/drivers/mtd/nand/atmel/nand-controller.c
> +++ b/drivers/mtd/nand/atmel/nand-controller.c
> @@ -2530,6 +2530,9 @@ static __maybe_unused int atmel_nand_controller_resume(struct device *dev)
>  	struct atmel_nand_controller *nc = dev_get_drvdata(dev);
>  	struct atmel_nand *nand;
>  
> +	if (nc->pmecc)
> +		atmel_pmecc_reset(nc->pmecc);
> +
>  	list_for_each_entry(nand, &nc->chips, node) {
>  		int i;
>  
> diff --git a/drivers/mtd/nand/atmel/pmecc.c b/drivers/mtd/nand/atmel/pmecc.c
> index 146af8218314..0a3f12141c45 100644
> --- a/drivers/mtd/nand/atmel/pmecc.c
> +++ b/drivers/mtd/nand/atmel/pmecc.c
> @@ -765,6 +765,13 @@ void atmel_pmecc_get_generated_eccbytes(struct atmel_pmecc_user *user,
>  }
>  EXPORT_SYMBOL_GPL(atmel_pmecc_get_generated_eccbytes);
>  
> +void atmel_pmecc_reset(struct atmel_pmecc *pmecc)
> +{
> +	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +}
> +EXPORT_SYMBOL_GPL(atmel_pmecc_reset);
> +
>  int atmel_pmecc_enable(struct atmel_pmecc_user *user, int op)
>  {
>  	struct atmel_pmecc *pmecc = user->pmecc;
> @@ -797,10 +804,7 @@ EXPORT_SYMBOL_GPL(atmel_pmecc_enable);
>  
>  void atmel_pmecc_disable(struct atmel_pmecc_user *user)
>  {
> -	struct atmel_pmecc *pmecc = user->pmecc;
> -
> -	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> -	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	atmel_pmecc_reset(user->pmecc);
>  	mutex_unlock(&user->pmecc->lock);
>  }
>  EXPORT_SYMBOL_GPL(atmel_pmecc_disable);
> @@ -855,10 +859,7 @@ static struct atmel_pmecc *atmel_pmecc_create(struct platform_device *pdev,
>  
>  	/* Disable all interrupts before registering the PMECC handler. */
>  	writel(0xffffffff, pmecc->regs.base + ATMEL_PMECC_IDR);
> -
> -	/* Reset the ECC engine */
> -	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> -	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	atmel_pmecc_reset(pmecc);
>  
>  	return pmecc;
>  }
> diff --git a/drivers/mtd/nand/atmel/pmecc.h b/drivers/mtd/nand/atmel/pmecc.h
> index a8ddbfca2ea5..817e0dd9fd15 100644
> --- a/drivers/mtd/nand/atmel/pmecc.h
> +++ b/drivers/mtd/nand/atmel/pmecc.h
> @@ -61,6 +61,7 @@ atmel_pmecc_create_user(struct atmel_pmecc *pmecc,
>  			struct atmel_pmecc_user_req *req);
>  void atmel_pmecc_destroy_user(struct atmel_pmecc_user *user);
>  
> +void atmel_pmecc_reset(struct atmel_pmecc *pmecc);
>  int atmel_pmecc_enable(struct atmel_pmecc_user *user, int op);
>  void atmel_pmecc_disable(struct atmel_pmecc_user *user);
>  int atmel_pmecc_wait_rdy(struct atmel_pmecc_user *user);

^ permalink raw reply

* Re: [PATCH v4 4/8] mtd: nand: atmel: Avoid ECC errors when leaving backup mode
From: Boris Brezillon @ 2017-10-05  7:22 UTC (permalink / raw)
  To: Romain Izard
  Cc: Michael Turquette, Stephen Boyd, Lee Jones, Wenyou Yang, Josh Wu,
	Richard Weinberger, David Woodhouse, Brian Norris, Marek Vasut,
	Cyrille Pitchen, Thierry Reding, Richard Genoud,
	Greg Kroah-Hartman, Jiri Slaby, Alan Stern, Ludovic Desroches,
	Nicolas Ferre, Alexandre Belloni
In-Reply-To: <20170928094627.31017-5-romain.izard.pro@gmail.com>

On Thu, 28 Sep 2017 11:46:23 +0200
Romain Izard <romain.izard.pro@gmail.com> wrote:

> During backup mode, the contents of all registers will be cleared as the
> SoC will be completely powered down. For a product that boots on NAND
> Flash memory, the bootloader will obviously use the related controller
> to read the Flash and correct any detected error in the memory, before
> handling back control to the kernel's resuming entry point.
> 
> But it does not clean the NAND controller registers after use and on its
> side the kernel driver expects the error locator to be powered down and
> in a clean state. Add a resume hook for the PMECC error locator, and
> reset its registers.
> 
> Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>

Applied.

Thanks,

Boris

> ---
> Changes in v3:
> * keep the PMECC disabled when not in use, and use atmel_pmecc_resume to
>   reset the controller after the bootloader has left it enabled.
> 
> Changes in v4:
> * export atmel_pmecc_reset instead of atmel_pmecc_resume
> * use the correct pointer in atmel_nand_controller_resume
> 
>  drivers/mtd/nand/atmel/nand-controller.c |  3 +++
>  drivers/mtd/nand/atmel/pmecc.c           | 17 +++++++++--------
>  drivers/mtd/nand/atmel/pmecc.h           |  1 +
>  3 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mtd/nand/atmel/nand-controller.c b/drivers/mtd/nand/atmel/nand-controller.c
> index f25eca79f4e5..8afcff9a66ea 100644
> --- a/drivers/mtd/nand/atmel/nand-controller.c
> +++ b/drivers/mtd/nand/atmel/nand-controller.c
> @@ -2530,6 +2530,9 @@ static __maybe_unused int atmel_nand_controller_resume(struct device *dev)
>  	struct atmel_nand_controller *nc = dev_get_drvdata(dev);
>  	struct atmel_nand *nand;
>  
> +	if (nc->pmecc)
> +		atmel_pmecc_reset(nc->pmecc);
> +
>  	list_for_each_entry(nand, &nc->chips, node) {
>  		int i;
>  
> diff --git a/drivers/mtd/nand/atmel/pmecc.c b/drivers/mtd/nand/atmel/pmecc.c
> index 146af8218314..0a3f12141c45 100644
> --- a/drivers/mtd/nand/atmel/pmecc.c
> +++ b/drivers/mtd/nand/atmel/pmecc.c
> @@ -765,6 +765,13 @@ void atmel_pmecc_get_generated_eccbytes(struct atmel_pmecc_user *user,
>  }
>  EXPORT_SYMBOL_GPL(atmel_pmecc_get_generated_eccbytes);
>  
> +void atmel_pmecc_reset(struct atmel_pmecc *pmecc)
> +{
> +	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +}
> +EXPORT_SYMBOL_GPL(atmel_pmecc_reset);
> +
>  int atmel_pmecc_enable(struct atmel_pmecc_user *user, int op)
>  {
>  	struct atmel_pmecc *pmecc = user->pmecc;
> @@ -797,10 +804,7 @@ EXPORT_SYMBOL_GPL(atmel_pmecc_enable);
>  
>  void atmel_pmecc_disable(struct atmel_pmecc_user *user)
>  {
> -	struct atmel_pmecc *pmecc = user->pmecc;
> -
> -	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> -	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	atmel_pmecc_reset(user->pmecc);
>  	mutex_unlock(&user->pmecc->lock);
>  }
>  EXPORT_SYMBOL_GPL(atmel_pmecc_disable);
> @@ -855,10 +859,7 @@ static struct atmel_pmecc *atmel_pmecc_create(struct platform_device *pdev,
>  
>  	/* Disable all interrupts before registering the PMECC handler. */
>  	writel(0xffffffff, pmecc->regs.base + ATMEL_PMECC_IDR);
> -
> -	/* Reset the ECC engine */
> -	writel(PMECC_CTRL_RST, pmecc->regs.base + ATMEL_PMECC_CTRL);
> -	writel(PMECC_CTRL_DISABLE, pmecc->regs.base + ATMEL_PMECC_CTRL);
> +	atmel_pmecc_reset(pmecc);
>  
>  	return pmecc;
>  }
> diff --git a/drivers/mtd/nand/atmel/pmecc.h b/drivers/mtd/nand/atmel/pmecc.h
> index a8ddbfca2ea5..817e0dd9fd15 100644
> --- a/drivers/mtd/nand/atmel/pmecc.h
> +++ b/drivers/mtd/nand/atmel/pmecc.h
> @@ -61,6 +61,7 @@ atmel_pmecc_create_user(struct atmel_pmecc *pmecc,
>  			struct atmel_pmecc_user_req *req);
>  void atmel_pmecc_destroy_user(struct atmel_pmecc_user *user);
>  
> +void atmel_pmecc_reset(struct atmel_pmecc *pmecc);
>  int atmel_pmecc_enable(struct atmel_pmecc_user *user, int op);
>  void atmel_pmecc_disable(struct atmel_pmecc_user *user);
>  int atmel_pmecc_wait_rdy(struct atmel_pmecc_user *user);

^ permalink raw reply

* [RFH] 4.9.52: task blocked for more than 300 seconds - xen_clocksource_get_cycles?
From: Philipp Hahn @ 2017-10-05  7:22 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, xen-devel

Hello,

in a VM running linux-4.9.52 on Debian-Wheezy with Xen-4.1 I observed
several stuck processes: Here is one (truncated) dump of the Linux
kernel messages:

>  [<ffffffff8160db6d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160e032>] ? schedule+0x32/0x80
>  [<ffffffff8161150c>] ? schedule_timeout+0x1ec/0x360
>  [<ffffffff8130f277>] ? __blk_mq_run_hw_queue+0x327/0x3e0
*** see below
>  [<ffffffff8101b7f1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d8b4>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb6f7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160e8b7>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160e3ac>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160e50e>] ? out_of_line_wait_on_bit+0x7e/0xa0
>  [<ffffffff810bba20>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffffc00abd18>] ? jbd2_journal_commit_transaction+0xd48/0x17e0 [jbd2]
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff810e67fd>] ? try_to_del_timer_sync+0x4d/0x80
>  [<ffffffffc00b099d>] ? kjournald2+0xdd/0x280 [jbd2]
>  [<ffffffff810bb9e0>] ? wake_up_atomic_t+0x30/0x30
>  [<ffffffffc00b08c0>] ? commit_timeout+0x10/0x10 [jbd2]
>  [<ffffffff81097660>] ? kthread+0xf0/0x110
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff81097570>] ? kthread_park+0x60/0x60
>  [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30
> NMI backtrace for cpu 2
> CPU: 2 PID: 35 Comm: khungtaskd Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>  0000000000000000 ffffffff81331935 0000000000000000 0000000000000002
>  ffffffff81335e60 0000000000000002 ffffffff8104cb70 ffff8801f0c90e80
>  ffffffff81335f6a ffff8801f0c90e80 00000000003fffbc ffffffff81128048
> Call Trace:
>  [<ffffffff81331935>] ? dump_stack+0x5c/0x77
>  [<ffffffff81335e60>] ? nmi_cpu_backtrace+0x90/0xa0
>  [<ffffffff8104cb70>] ? irq_force_complete_move+0x140/0x140
>  [<ffffffff81335f6a>] ? nmi_trigger_cpumask_backtrace+0xfa/0x130
>  [<ffffffff81128048>] ? watchdog+0x2b8/0x330
>  [<ffffffff81127d90>] ? reset_hung_task_detector+0x10/0x10
>  [<ffffffff81097660>] ? kthread+0xf0/0x110
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff81097570>] ? kthread_park+0x60/0x60
>  [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30
> Sending NMI from CPU 2 to CPUs 0-1,3:
> NMI backtrace for cpu 1
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffff8801f4a02ec0 task.stack: ffffc90040ca4000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffc90040ca7ed0  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff8801f4a02ec0 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8801f4a02ec0 R15: ffff8801f4a02ec0
> FS:  00007f23ac595700(0000) GS:ffff8801f5a80000(0000) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00007f52537d6d46 CR3: 00000001bba23000 CR4: 0000000000002660
> Stack:
>  ffff8801bb832201 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffff8801f4a02ec0 0000000000000001 ffffffff810bc280 ffff8801f4a02ec0
>  ffff8801f4a02ec0 c0a995961d41240f addce6dcadd009c9 0000000000000000
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
> NMI backtrace for cpu 3
> CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffff8801f4a24f00 task.stack: ffffc90040cb4000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffc90040cb7ed0  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff8801f4a24f00 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8801f4a24f00 R15: ffff8801f4a24f00
> FS:  00007f1a2af19700(0000) GS:ffff8801f5b80000(0000) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00007f4a5416b000 CR3: 00000001d83ec000 CR4: 0000000000002660
> Stack:
>  0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffff8801f4a24f00 0000000000000003 ffffffff810bc280 ffff8801f4a24f00
>  ffff8801f4a24f00 77816deb133b9979 addce6dcadd009c9 0000000000000000
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
> NMI backtrace for cpu 0
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffffffff81c0e540 task.stack: ffffffff81c00000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffffff81c03e90  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffffffff81c0e540 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffffffff81c0e540 R15: ffffffff81c0e540
> FS:  00007f977d3c2700(0000) GS:ffff8801f5a00000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055ae9becabc8 CR3: 00000001d6a40000 CR4: 0000000000002660
> Stack:
>  0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffffffff81c0e540 0000000000000000 ffffffff810bc280 ffffffff81c0e540
>  ffffffff81c0e540 ebeac214e330bd3f addce6dcadd009c9 ffffffffffffffff
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
>  [<ffffffff81d4bf84>] ? start_kernel+0x46d/0x48d
>  [<ffffffff81d510e6>] ? xen_start_kernel+0x52e/0x538
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 


Looking at the dis-assembly of xen_clocksource_get_cycles() in
arch/x86/xen/time.c I see no path how that should call
__blk_mq_run_hw_queue():

> 00000000000001a0 <xen_clocksource_get_cycles> mov    %gs:0x0(%rip),%rdi        # 00000000000001a8 <xen_clocksource_get_cycles+0x8>
> 00000000000001a8 <xen_clocksource_get_cycles+0x8> add    $0x20,%rdi
>         ret = pvclock_clocksource_read(src);
> 00000000000001ac <xen_clocksource_get_cycles+0xc> callq  00000000000001b1 <xen_clocksource_get_cycles+0x11>
> }
> 
> static cycle_t xen_clocksource_get_cycles(struct clocksource *cs)
> {
>         return xen_clocksource_read();
> }
> 00000000000001b1 <xen_clocksource_get_cycles+0x11> retq   
> 00000000000001b2 <xen_clocksource_get_cycles+0x12> data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
> 
> static void xen_read_wallclock(struct timespec *ts)
> {
>         struct shared_info *s = HYPERVISOR_shared_info;
>         struct pvclock_wall_clock *wall_clock = &(s->wc);
> 00000000000001c0 <xen_get_wallclock> mov    0x0(%rip),%rax        # 00000000000001c7 <xen_get_wallclock+0x7>


Here's another dump, which diverges from xen_clocksource_get_cycles() to
some completely other function:

> INFO: task btrfs-transacti:522 blocked for more than 300 seconds.
> btrfs-transacti D    0   522      2 0x00000000
>  ffff8801f3836000 0000000000000000 ffff8801f4a24f00 ffff8801f32da1c0
>  ffff8801f5b18780 ffffc9004199fa40 ffffffff8160cd2d 0000000000000000
>  ffff8801741a58a8 0000000000000102 000000000000000e ffff8801f32da1c0
> Call Trace:
>  [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>  [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
***
>  [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffff8117e18e>] ? find_get_pages_tag+0x15e/0x300
>  [<ffffffff8117d016>] ? wait_on_page_bit+0x86/0xa0
>  [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffff8117d107>] ? __filemap_fdatawait_range+0xd7/0x150
>  [<ffffffff8117d18f>] ? filemap_fdatawait_range+0xf/0x30
>  [<ffffffffc0175643>] ? btrfs_wait_ordered_range+0x73/0x110 [btrfs]
>  [<ffffffffc01a2a0d>] ? btrfs_wait_cache_io+0x5d/0x1f0 [btrfs]
>  [<ffffffffc0141a36>] ? btrfs_write_dirty_block_groups+0x106/0x380 [btrfs]
>  [<ffffffffc0140c5d>] ? btrfs_run_delayed_refs+0x1fd/0x2b0 [btrfs]
>  [<ffffffffc01551d7>] ? commit_cowonly_roots+0x257/0x2f0 [btrfs]
>  [<ffffffffc0157a24>] ? btrfs_commit_transaction+0x4e4/0xa40 [btrfs]
>  [<ffffffffc015801d>] ? start_transaction+0x9d/0x4a0 [btrfs]
>  [<ffffffffc01523a2>] ? transaction_kthread+0x1b2/0x1f0 [btrfs]
>  [<ffffffffc01521f0>] ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
>  [<ffffffff810975c0>] ? kthread+0xf0/0x110
>  [<ffffffff8102476b>] ? __switch_to+0x2bb/0x700
>  [<ffffffff810974d0>] ? kthread_park+0x60/0x60
>  [<ffffffff81611d35>] ? ret_from_fork+0x25/0x30

And another one:
> INFO: task smbd:20101 blocked for more than 300 seconds.
> smbd            D    0 20101   1714 0x00000000
>  ffff8801f01cac00 0000000000000000 ffff8801f4a241c0 ffff88007f9a5240
>  ffff8801f5b98780 ffffc90049a33bc0 ffffffff8160cd2d ffff8800f78bb5b0
>  fd2236313fb5274b 000000008dc3c1a0 ffff8801f3ef0a40 ffff88007f9a5240
> Call Trace:
>  [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>  [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
>  [<ffffffff81336486>] ? __radix_tree_lookup+0x76/0xe0
***
>  [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffffc00f290b>] ? __jbd2_journal_file_buffer+0xcb/0x180 [jbd2]
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d6ce>] ? out_of_line_wait_on_bit+0x7e/0xa0
>  [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffffc00f2bc8>] ? do_get_write_access+0x208/0x420 [jbd2]
>  [<ffffffffc00f2e0e>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
>  [<ffffffffc02c78a6>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
>  [<ffffffffc02a3913>] ? ext4_orphan_add+0xd3/0x230 [ext4]
>  [<ffffffffc0296b4a>] ? ext4_mark_inode_dirty+0x6a/0x200 [ext4]
>  [<ffffffffc02a4b8a>] ? ext4_unlink+0x36a/0x380 [ext4]
>  [<ffffffff81211677>] ? vfs_unlink+0xe7/0x180
>  [<ffffffff81214439>] ? do_unlinkat+0x289/0x300
>  [<ffffffff81611abb>] ? system_call_fast_compare_end+0xc/0x9b

This does not look normal to me or did I miss something?

Where can I get more information on why there is no progress for 300s,
what should I do to debug which task is waiting for what?

The traces of the of other CPUs look normal to me: the one posted first
above is the shortest, in all other cases they were sooner or later
waiting for IO (my interpretation, but I can post them if necessary.)

This problem occurs since the upgrade of the Linux kernel inside the VM
from 4.1.x to 4.9.32 and now 4.9.52.

Any help is appreciated.
Philipp Hahn
-- 
Philipp Hahn
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
Tel.: +49 421 22232-0
Fax : +49 421 22232-99
hahn@univention.de

http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply

* [RFH] 4.9.52: task blocked for more than 300 seconds - xen_clocksource_get_cycles?
From: Philipp Hahn @ 2017-10-05  7:22 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, xen-devel

Hello,

in a VM running linux-4.9.52 on Debian-Wheezy with Xen-4.1 I observed
several stuck processes: Here is one (truncated) dump of the Linux
kernel messages:

>  [<ffffffff8160db6d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160e032>] ? schedule+0x32/0x80
>  [<ffffffff8161150c>] ? schedule_timeout+0x1ec/0x360
>  [<ffffffff8130f277>] ? __blk_mq_run_hw_queue+0x327/0x3e0
*** see below
>  [<ffffffff8101b7f1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d8b4>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb6f7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160e8b7>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160e3ac>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160e50e>] ? out_of_line_wait_on_bit+0x7e/0xa0
>  [<ffffffff810bba20>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffffc00abd18>] ? jbd2_journal_commit_transaction+0xd48/0x17e0 [jbd2]
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff810e67fd>] ? try_to_del_timer_sync+0x4d/0x80
>  [<ffffffffc00b099d>] ? kjournald2+0xdd/0x280 [jbd2]
>  [<ffffffff810bb9e0>] ? wake_up_atomic_t+0x30/0x30
>  [<ffffffffc00b08c0>] ? commit_timeout+0x10/0x10 [jbd2]
>  [<ffffffff81097660>] ? kthread+0xf0/0x110
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff81097570>] ? kthread_park+0x60/0x60
>  [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30
> NMI backtrace for cpu 2
> CPU: 2 PID: 35 Comm: khungtaskd Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>  0000000000000000 ffffffff81331935 0000000000000000 0000000000000002
>  ffffffff81335e60 0000000000000002 ffffffff8104cb70 ffff8801f0c90e80
>  ffffffff81335f6a ffff8801f0c90e80 00000000003fffbc ffffffff81128048
> Call Trace:
>  [<ffffffff81331935>] ? dump_stack+0x5c/0x77
>  [<ffffffff81335e60>] ? nmi_cpu_backtrace+0x90/0xa0
>  [<ffffffff8104cb70>] ? irq_force_complete_move+0x140/0x140
>  [<ffffffff81335f6a>] ? nmi_trigger_cpumask_backtrace+0xfa/0x130
>  [<ffffffff81128048>] ? watchdog+0x2b8/0x330
>  [<ffffffff81127d90>] ? reset_hung_task_detector+0x10/0x10
>  [<ffffffff81097660>] ? kthread+0xf0/0x110
>  [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>  [<ffffffff81097570>] ? kthread_park+0x60/0x60
>  [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30
> Sending NMI from CPU 2 to CPUs 0-1,3:
> NMI backtrace for cpu 1
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffff8801f4a02ec0 task.stack: ffffc90040ca4000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffc90040ca7ed0  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff8801f4a02ec0 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8801f4a02ec0 R15: ffff8801f4a02ec0
> FS:  00007f23ac595700(0000) GS:ffff8801f5a80000(0000) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00007f52537d6d46 CR3: 00000001bba23000 CR4: 0000000000002660
> Stack:
>  ffff8801bb832201 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffff8801f4a02ec0 0000000000000001 ffffffff810bc280 ffff8801f4a02ec0
>  ffff8801f4a02ec0 c0a995961d41240f addce6dcadd009c9 0000000000000000
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
> NMI backtrace for cpu 3
> CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffff8801f4a24f00 task.stack: ffffc90040cb4000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffc90040cb7ed0  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff8801f4a24f00 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8801f4a24f00 R15: ffff8801f4a24f00
> FS:  00007f1a2af19700(0000) GS:ffff8801f5b80000(0000) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00007f4a5416b000 CR3: 00000001d83ec000 CR4: 0000000000002660
> Stack:
>  0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffff8801f4a24f00 0000000000000003 ffffffff810bc280 ffff8801f4a24f00
>  ffff8801f4a24f00 77816deb133b9979 addce6dcadd009c9 0000000000000000
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
> NMI backtrace for cpu 0
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
> task: ffffffff81c0e540 task.stack: ffffffff81c00000
> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
> RSP: e02b:ffffffff81c03e90  EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffffffff81c0e540 RCX: ffffffff810013aa
> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: ffffffff81c0e540 R15: ffffffff81c0e540
> FS:  00007f977d3c2700(0000) GS:ffff8801f5a00000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055ae9becabc8 CR3: 00000001d6a40000 CR4: 0000000000002660
> Stack:
>  0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>  ffffffff81c0e540 0000000000000000 ffffffff810bc280 ffffffff81c0e540
>  ffffffff81c0e540 ebeac214e330bd3f addce6dcadd009c9 ffffffffffffffff
> Call Trace:
>  [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>  [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
>  [<ffffffff81d4bf84>] ? start_kernel+0x46d/0x48d
>  [<ffffffff81d510e6>] ? xen_start_kernel+0x52e/0x538
> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 


Looking at the dis-assembly of xen_clocksource_get_cycles() in
arch/x86/xen/time.c I see no path how that should call
__blk_mq_run_hw_queue():

> 00000000000001a0 <xen_clocksource_get_cycles> mov    %gs:0x0(%rip),%rdi        # 00000000000001a8 <xen_clocksource_get_cycles+0x8>
> 00000000000001a8 <xen_clocksource_get_cycles+0x8> add    $0x20,%rdi
>         ret = pvclock_clocksource_read(src);
> 00000000000001ac <xen_clocksource_get_cycles+0xc> callq  00000000000001b1 <xen_clocksource_get_cycles+0x11>
> }
> 
> static cycle_t xen_clocksource_get_cycles(struct clocksource *cs)
> {
>         return xen_clocksource_read();
> }
> 00000000000001b1 <xen_clocksource_get_cycles+0x11> retq   
> 00000000000001b2 <xen_clocksource_get_cycles+0x12> data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
> 
> static void xen_read_wallclock(struct timespec *ts)
> {
>         struct shared_info *s = HYPERVISOR_shared_info;
>         struct pvclock_wall_clock *wall_clock = &(s->wc);
> 00000000000001c0 <xen_get_wallclock> mov    0x0(%rip),%rax        # 00000000000001c7 <xen_get_wallclock+0x7>


Here's another dump, which diverges from xen_clocksource_get_cycles() to
some completely other function:

> INFO: task btrfs-transacti:522 blocked for more than 300 seconds.
> btrfs-transacti D    0   522      2 0x00000000
>  ffff8801f3836000 0000000000000000 ffff8801f4a24f00 ffff8801f32da1c0
>  ffff8801f5b18780 ffffc9004199fa40 ffffffff8160cd2d 0000000000000000
>  ffff8801741a58a8 0000000000000102 000000000000000e ffff8801f32da1c0
> Call Trace:
>  [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>  [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
***
>  [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffff8117e18e>] ? find_get_pages_tag+0x15e/0x300
>  [<ffffffff8117d016>] ? wait_on_page_bit+0x86/0xa0
>  [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffff8117d107>] ? __filemap_fdatawait_range+0xd7/0x150
>  [<ffffffff8117d18f>] ? filemap_fdatawait_range+0xf/0x30
>  [<ffffffffc0175643>] ? btrfs_wait_ordered_range+0x73/0x110 [btrfs]
>  [<ffffffffc01a2a0d>] ? btrfs_wait_cache_io+0x5d/0x1f0 [btrfs]
>  [<ffffffffc0141a36>] ? btrfs_write_dirty_block_groups+0x106/0x380 [btrfs]
>  [<ffffffffc0140c5d>] ? btrfs_run_delayed_refs+0x1fd/0x2b0 [btrfs]
>  [<ffffffffc01551d7>] ? commit_cowonly_roots+0x257/0x2f0 [btrfs]
>  [<ffffffffc0157a24>] ? btrfs_commit_transaction+0x4e4/0xa40 [btrfs]
>  [<ffffffffc015801d>] ? start_transaction+0x9d/0x4a0 [btrfs]
>  [<ffffffffc01523a2>] ? transaction_kthread+0x1b2/0x1f0 [btrfs]
>  [<ffffffffc01521f0>] ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
>  [<ffffffff810975c0>] ? kthread+0xf0/0x110
>  [<ffffffff8102476b>] ? __switch_to+0x2bb/0x700
>  [<ffffffff810974d0>] ? kthread_park+0x60/0x60
>  [<ffffffff81611d35>] ? ret_from_fork+0x25/0x30

And another one:
> INFO: task smbd:20101 blocked for more than 300 seconds.
> smbd            D    0 20101   1714 0x00000000
>  ffff8801f01cac00 0000000000000000 ffff8801f4a241c0 ffff88007f9a5240
>  ffff8801f5b98780 ffffc90049a33bc0 ffffffff8160cd2d ffff8800f78bb5b0
>  fd2236313fb5274b 000000008dc3c1a0 ffff8801f3ef0a40 ffff88007f9a5240
> Call Trace:
>  [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>  [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
>  [<ffffffff81336486>] ? __radix_tree_lookup+0x76/0xe0
***
>  [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>  [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>  [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>  [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>  [<ffffffffc00f290b>] ? __jbd2_journal_file_buffer+0xcb/0x180 [jbd2]
>  [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>  [<ffffffff8160d6ce>] ? out_of_line_wait_on_bit+0x7e/0xa0
>  [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>  [<ffffffffc00f2bc8>] ? do_get_write_access+0x208/0x420 [jbd2]
>  [<ffffffffc00f2e0e>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
>  [<ffffffffc02c78a6>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
>  [<ffffffffc02a3913>] ? ext4_orphan_add+0xd3/0x230 [ext4]
>  [<ffffffffc0296b4a>] ? ext4_mark_inode_dirty+0x6a/0x200 [ext4]
>  [<ffffffffc02a4b8a>] ? ext4_unlink+0x36a/0x380 [ext4]
>  [<ffffffff81211677>] ? vfs_unlink+0xe7/0x180
>  [<ffffffff81214439>] ? do_unlinkat+0x289/0x300
>  [<ffffffff81611abb>] ? system_call_fast_compare_end+0xc/0x9b

This does not look normal to me or did I miss something?

Where can I get more information on why there is no progress for 300s,
what should I do to debug which task is waiting for what?

The traces of the of other CPUs look normal to me: the one posted first
above is the shortest, in all other cases they were sooner or later
waiting for IO (my interpretation, but I can post them if necessary.)

This problem occurs since the upgrade of the Linux kernel inside the VM
from 4.1.x to 4.9.32 and now 4.9.52.

Any help is appreciated.
Philipp Hahn
-- 
Philipp Hahn
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
Tel.: +49 421 22232-0
Fax : +49 421 22232-99
hahn@univention.de

http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

^ permalink raw reply

* Re: [PATCH v4 13/14] platform/x86: dell-smbios-wmi: introduce userspace interface
From: Greg KH @ 2017-10-05  7:23 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: dvhart, Andy Shevchenko, LKML, platform-driver-x86,
	Andy Lutomirski, quasisec, pali.rohar, rjw, mjg59, hch
In-Reply-To: <ad918eaaa2c9cab878d92be4691cb67f44d13681.1507156392.git.mario.limonciello@dell.com>

On Wed, Oct 04, 2017 at 05:48:39PM -0500, Mario Limonciello wrote:
> This userspace character device will be used to perform SMBIOS calls
> from any applications.
> 
> It provides an ioctl that will allow passing the 32k WMI calling
> interface buffer between userspace and kernel space.

{sigh}  Did you really test this?  It feels like it wasn't, due to the
api you are using here.  Did you run it with a 32bit userspace and 64bit
kernel?  32bit kernel/userspace?  How well did your userspace developer
fall down crying when you tried that?  :)

> This character device is intended to deprecate the dcdbas kernel module
> and the interface that it provides to userspace.

At least that driver has a well-documented api to userspace, you are
throwing all of that away here, are you _sure_ you want to do that?
Seems like you just made things much harder.

> It's important for the driver to provide a R/W ioctl to ensure that
> two competing userspace processes don't race to provide or read each
> others data.

The whole goal of this patch is to provide that ioctl, right?  So of
course it is "important" :)

> The API for interacting with this interface is defined in documentation
> as well as a uapi header provides the format of the structures.

Ok, let's _just_ review that api please:

> diff --git a/include/uapi/linux/dell-smbios-wmi.h b/include/uapi/linux/dell-smbios-wmi.h
> new file mode 100644
> index 000000000000..0d0d09b04021
> --- /dev/null
> +++ b/include/uapi/linux/dell-smbios-wmi.h
> @@ -0,0 +1,25 @@
> +#ifndef _UAPI_DELL_SMBIOS_WMI_H_
> +#define _UAPI_DELL_SMBIOS_WMI_H_
> +
> +#include <linux/ioctl.h>
> +#include <linux/wmi.h>
> +
> +struct wmi_calling_interface_buffer {
> +	u16 class;
> +	u16 select;
> +	u32 input[4];
> +	u32 output[4];
> +	u32 argattrib;
> +	u32 blength;
> +	u8 *data;
> +} __packed;

{sigh}

For structures that cross the user/kernel boundry, you _HAVE_ to use the
correct types.  For some things, that is easy, u16 needs to be __u16,
u32 needs to be __u32, but u8*?  Hah, good luck!  Remember what I
mentioned above about 32/64 bit issues?

Why do you need a pointer here at all?  You are providing a huge chunk
of memory to the ioctl, what's the use of a pointer?  How are you
dereferenceing this pointer (remember, it's a userspace pointer, which
you are not saying here, so odds are the kernel code is wrong...)

> +struct wmi_smbios_ioctl {
> +	u32 length;

__u32.

And why not __u64?  Is 32 bits always going to be ok?

> +	struct wmi_calling_interface_buffer *buf;

Another pointer?  2 pointer dereferences in the same ioctl structure?
Crazy, you are wanting to make your life harder than it has to be...

> +};
> +
> +/* only offers on the single instance */
> +#define DELL_WMI_SMBIOS_CMD		WMI_IOWR(0)

I don't understand the comment, please explain it better.

Please sit down and work out your api here, I don't think you have
thought it through properly, given the number of pointers alone.

thanks,

greg k-h

^ permalink raw reply

* [pci:pci/misc 7/9] arch/alpha/kernel/pci.c:260:3: error: implicit declaration of function 'pdev_save_srm_config'
From: kbuild test robot @ 2017-10-05  7:22 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: kbuild-all, linux-pci

[-- Attachment #1: Type: text/plain, Size: 2409 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git pci/misc
head:   51d7f00f1169641c5d0655a1453f2164eecd4fe0
commit: a05fa0a27a11250ff5dc9321a2e1e1feaa37d345 [7/9] alpha/PCI: Make pdev_save_srm_config() static
config: alpha-allnoconfig (attached as .config)
compiler: alpha-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout a05fa0a27a11250ff5dc9321a2e1e1feaa37d345
        # save the attached .config to linux build tree
        make.cross ARCH=alpha 

All errors (new ones prefixed by >>):

   arch/alpha/kernel/pci.c: In function 'pcibios_fixup_bus':
>> arch/alpha/kernel/pci.c:260:3: error: implicit declaration of function 'pdev_save_srm_config' [-Werror=implicit-function-declaration]
      pdev_save_srm_config(dev);
      ^~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/pdev_save_srm_config +260 arch/alpha/kernel/pci.c

^1da177e Linus Torvalds     2005-04-16  249  
f8d6c8d9 Greg Kroah-Hartman 2012-12-21  250  void pcibios_fixup_bus(struct pci_bus *bus)
^1da177e Linus Torvalds     2005-04-16  251  {
237865f1 Bjorn Helgaas      2015-09-15  252  	struct pci_dev *dev = bus->self;
237865f1 Bjorn Helgaas      2015-09-15  253  
237865f1 Bjorn Helgaas      2015-09-15  254  	if (pci_has_flag(PCI_PROBE_ONLY) && dev &&
237865f1 Bjorn Helgaas      2015-09-15  255  	    (dev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
237865f1 Bjorn Helgaas      2015-09-15  256  		pci_read_bridge_bases(bus);
237865f1 Bjorn Helgaas      2015-09-15  257  	}
^1da177e Linus Torvalds     2005-04-16  258  
^1da177e Linus Torvalds     2005-04-16  259  	list_for_each_entry(dev, &bus->devices, bus_list) {
^1da177e Linus Torvalds     2005-04-16 @260  		pdev_save_srm_config(dev);
^1da177e Linus Torvalds     2005-04-16  261  	}
^1da177e Linus Torvalds     2005-04-16  262  }
^1da177e Linus Torvalds     2005-04-16  263  

:::::: The code at line 260 was first introduced by commit
:::::: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2

:::::: TO: Linus Torvalds <torvalds@ppc970.osdl.org>
:::::: CC: Linus Torvalds <torvalds@ppc970.osdl.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 4928 bytes --]

^ permalink raw reply

* Re: [PATCH v2] mtd: nand: pxa3xx_nand: Update Kconfig information
From: Boris Brezillon @ 2017-10-05  7:23 UTC (permalink / raw)
  To: Gregory CLEMENT
  Cc: Richard Weinberger, David Woodhouse, Brian Norris, Marek Vasut,
	Cyrille Pitchen, linux-mtd, linux-kernel, Thomas Petazzoni,
	Antoine Tenart, Miquèl Raynal, Nadav Haklai, Shadi Ammouri,
	Yehuda Yitschak, Omri Itach, Hanna Hawa, Igal Liberman,
	Marcin Wojtas
In-Reply-To: <20170929133424.7205-1-gregory.clement@free-electrons.com>

On Fri, 29 Sep 2017 15:34:24 +0200
Gregory CLEMENT <gregory.clement@free-electrons.com> wrote:

> More and more SoCs use the pxa3xx_nand driver for their controller but
> the list of them was not updated. This patch add the last SoCs using the
> driver.
> 

Applied.

Thanks,

Boris

> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
> ---
> Changelog v1->v2:
> 
> - Improve wording thanks to Thomas Petazzoni
> 
>  drivers/mtd/nand/Kconfig | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
> index 3f2036f31da4..bb48aafed9a2 100644
> --- a/drivers/mtd/nand/Kconfig
> +++ b/drivers/mtd/nand/Kconfig
> @@ -317,8 +317,11 @@ config MTD_NAND_PXA3xx
>  	tristate "NAND support on PXA3xx and Armada 370/XP"
>  	depends on PXA3xx || ARCH_MMP || PLAT_ORION || ARCH_MVEBU
>  	help
> +
>  	  This enables the driver for the NAND flash device found on
> -	  PXA3xx processors (NFCv1) and also on Armada 370/XP (NFCv2).
> +	  PXA3xx processors (NFCv1) and also on 32-bit Armada
> +	  platforms (XP, 370, 375, 38x, 39x) and 64-bit Armada
> +	  platforms (7K, 8K) (NFCv2).
>  
>  config MTD_NAND_SLC_LPC32XX
>  	tristate "NXP LPC32xx SLC Controller"

^ permalink raw reply

* Re: [PATCH] mwifiex: Use put_unaligned_le32
From: Kalle Valo @ 2017-10-05  7:23 UTC (permalink / raw)
  To: Himanshu Jha
  Cc: amitkarwar, nishants, gbhat, huxm, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1507141686-5178-1-git-send-email-himanshujha199640@gmail.com>

Himanshu Jha <himanshujha199640@gmail.com> writes:

> Use put_unaligned_le32 rather than using byte ordering function and
> memcpy which makes code clear.
> Also, add the header file where it is declared.
>
> Done using Coccinelle and semantic patch used is :
>
> @ rule1 @
> identifier tmp; expression ptr,x; type T;
> @@
>
> - tmp = cpu_to_le32(x);
>
>   <+... when != tmp
> - memcpy(ptr, (T)&tmp, ...);
> + put_unaligned_le32(x,ptr);
>   ...+>
>
> @ depends on rule1 @
> type j; identifier tmp;
> @@
>
> - j tmp;
>   ...when != tmp
>
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
> ---
>  drivers/net/wireless/marvell/mwifiex/cmdevt.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/cmdevt.c b/drivers/net/wireless/marvell/mwifiex/cmdevt.c
> index 0edc5d6..e28e119 100644
> --- a/drivers/net/wireless/marvell/mwifiex/cmdevt.c
> +++ b/drivers/net/wireless/marvell/mwifiex/cmdevt.c
> @@ -17,6 +17,7 @@
>   * this warranty disclaimer.
>   */
>  
> +#include <linux/unaligned/access_ok.h>

I don't think this is correct. Should it be asm/unaligned.h?

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH v3] mtd: nand: denali: fix setup_data_interface to meet tCCS delay
From: Boris Brezillon @ 2017-10-05  7:23 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-mtd, Cyrille Pitchen, linux-kernel, Marek Vasut,
	Brian Norris, Richard Weinberger, David Woodhouse
In-Reply-To: <1506694377-8531-1-git-send-email-yamada.masahiro@socionext.com>

On Fri, 29 Sep 2017 23:12:57 +0900
Masahiro Yamada <yamada.masahiro@socionext.com> wrote:

> The WE_2_RE register specifies the number of clock cycles inserted
> between the rising edge of #WE and the falling edge of #RE.
> 
> The current setup_data_interface implementation takes care of tWHR,
> but tCCS is missing.  Wait for max(tCSS, tWHR) to meet the spec.
> 
> With setup_data_interface() properly programmed, the Denali NAND
> controller can observe the timing, so NAND_WAIT_TCCS flag is unneeded.
> Clarify this in the comment block.

Applied.

Thanks,

Boris

> 
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> 
> Changes in v3:
>   - Remove comment abount NAND_WAIT_TWHR because 1/2 seems NACK
> 
> Changes in v2:
>   - newly added
> 
>  drivers/mtd/nand/denali.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/denali.c b/drivers/mtd/nand/denali.c
> index 0b268ec..5124f8a 100644
> --- a/drivers/mtd/nand/denali.c
> +++ b/drivers/mtd/nand/denali.c
> @@ -1004,8 +1004,14 @@ static int denali_setup_data_interface(struct mtd_info *mtd, int chipnr,
>  	tmp |= FIELD_PREP(RE_2_RE__VALUE, re_2_re);
>  	iowrite32(tmp, denali->reg + RE_2_RE);
>  
> -	/* tWHR -> WE_2_RE */
> -	we_2_re = DIV_ROUND_UP(timings->tWHR_min, t_clk);
> +	/*
> +	 * tCCS, tWHR -> WE_2_RE
> +	 *
> +	 * With WE_2_RE properly set, the Denali controller automatically takes
> +	 * care of the delay; the driver need not set NAND_WAIT_TCCS.
> +	 */
> +	we_2_re = DIV_ROUND_UP(max(timings->tCCS_min, timings->tWHR_min),
> +			       t_clk);
>  	we_2_re = min_t(int, we_2_re, TWHR2_AND_WE_2_RE__WE_2_RE);
>  
>  	tmp = ioread32(denali->reg + TWHR2_AND_WE_2_RE);

^ permalink raw reply

* Re: [PATCH -next v2] mtd: nand: Add support for Toshiba BENAND (Built-in ECC NAND)
From: KOBAYASHI Yoshitake @ 2017-10-05  7:24 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: richard, linux-kernel, marek.vasut, linux-mtd, cyrille.pitchen,
	computersforpeace, dwmw2
In-Reply-To: <20170926231549.40c09c5d@bbrezillon>

On 2017/09/27 6:15, Boris Brezillon wrote:
> On Tue, 26 Sep 2017 18:18:04 +0900
> KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp> wrote:
> 
>> On 2017/09/21 16:28, Boris Brezillon wrote:
>>> On Thu, 21 Sep 2017 14:32:02 +0900
>>> KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp> wrote:
>>>   
>>>> This patch enables support for Toshiba BENAND.
>>>> The current implementation does not support vondor specific command  
>>>
>>> 					       ^ vendor
>>>   
>>>> TOSHIBA_NAND_CMD_ECC_STATUS. I would like to add the command, when
>>>> the exec_op() [1] infrastructure is implemented.  
>>>
>>> It's not a good idea to reference a branch that is likely to disappear
>>> in a commit message. Just say that you can't properly support the
>>> TOSHIBA_NAND_CMD_ECC_STATUS operation right and that it might be
>>> addressed in the future.  
>>
>> Thanks. I'll change the comment.
>>
>>>> +	 */
>>>> +	if (status & NAND_STATUS_FAIL) {
>>>> +		/* uncorrectable */
>>>> +		mtd->ecc_stats.failed++;
>>>> +	} else if (status & TOSHIBA_NAND_STATUS_REWRITE_RECOMMENDED) {
>>>> +		/* correctable */
>>>> +		max_bitflips = mtd->bitflip_threshold;
>>>> +		mtd->ecc_stats.corrected += max_bitflips;
>>>> +	}  
>>>
>>> Is this working correctly when you read more than one ECC chunk? The
>>> ECC step size is 512 bytes and the page is bigger than that, which means
>>> you have more than one ECC chunk per page. What happens to the
>>> NAND_STATUS_FAIL flag if the first chunk is uncorrectable but
>>> following ones are correctable (or do not contain bitflips at all)?   
>>
>> As you mentioned, the ECC step size is 512 byte. But when 0x70 command is used
>> a simplified ECC status per page is generated.
> 
> I'm fine with that as long as uncorrectable errors are detected
> correctly and TOSHIBA_NAND_STATUS_REWRITE_RECOMMENDED is set as soon as
> one of the ECC chunk exposes too many bitflips.
> 
>> In case the first chunk is uncorrectable but the following ones are correctable,
>> the 0x70 command can only check the status of the uncorrectable one.
> 
> Sounds good.
> 
>> Each ECC chunk status can be checked by using the 0x7A command.
>>
>>>> @@ -39,9 +105,43 @@ static void toshiba_nand_decode_id(struct nand_chip *chip)
>>>>  
>>>>  static int toshiba_nand_init(struct nand_chip *chip)
>>>>  {
>>>> +	struct mtd_info *mtd = nand_to_mtd(chip);
>>>> +
>>>>  	if (nand_is_slc(chip))
>>>>  		chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
>>>>  
>>>> +	if (nand_is_slc(chip) && (chip->id.data[4] & 0x80)) {
>>>> +		/* BENAND */
>>>> +
>>>> +		/*
>>>> +		 * We can't disable the internal ECC engine, the user
>>>> +		 * has to use on-die ECC, there is no alternative.
>>>> +		 */
>>>> +		if (chip->ecc.mode != NAND_ECC_ON_DIE) {
>>>> +			pr_err("On-die ECC should be selected.\n");
>>>> +			return -EINVAL;
>>>> +		}  
>>>
>>> According to your previous explanation that's not exactly true. Since
>>> ECC bytes are stored in a separate area, the user can decide to use
>>> another mode without trouble. Just skip the BENAND initialization when
>>> mode != NAND_ECC_ON_DIE and we should be good, or am I missing something?  
>>
>> I am asking to product department to confirm it.
> 
> I'm almost sure this is the case ;-).

According to the command sequence written in BENAND's datasheet, the status
of the internal ECC must be checked after reading. To do that, ecc.mode has been
set to NAND_ECC_ON_DIE and the status of the internal ECC is checked through 
the 0x70 or 0x7A command. That's the reason we are returning EINVAL here.

-- Yoshi

^ permalink raw reply

* Re: Bug#877735: busybox: m68k build broken due to "-Os" in CFLAGS
From: John Paul Adrian Glaubitz @ 2017-10-05  7:25 UTC (permalink / raw)
  To: Debian m68k; +Cc: Linux/m68k, 877735
In-Reply-To: <150715899716.21351.14956363415364529265.reportbug@stretch64.physik.fu-berlin.de>

Hi!

On 10/05/2017 01:16 AM, John Paul Adrian Glaubitz wrote:
> Since busybox is very important to boot the system and get
> debian-installer working, I would like to ask to have the
> change from [1] reverted for m68k until we have fixed the
> potential bug in gcc which most likely mis-compiled busybox.

On second thought, I would actually recommended to revert this change
for all architectures. Size isn't so much a constraint anymore these
days, you reduce the binary by about 200k. I don't think this is worth
the risk of breaking something as fundemantal as busybox.

Adrian

-- 
  .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
   `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

^ permalink raw reply

* Re: [PATCH] mtd: nand: sh_flctl: Use of_device_get_match_data() helper
From: Boris Brezillon @ 2017-10-05  7:24 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Richard Weinberger, David Woodhouse, Brian Norris, Marek Vasut,
	Cyrille Pitchen, linux-mtd, linux-renesas-soc
In-Reply-To: <1507119543-12723-1-git-send-email-geert+renesas@glider.be>

On Wed,  4 Oct 2017 14:19:03 +0200
Geert Uytterhoeven <geert+renesas@glider.be> wrote:

> Use the of_device_get_match_data() helper instead of open coding.
> While at it, make config const so the cast can be dropped.
> 

Applied.

Thanks,

Boris

> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
>  drivers/mtd/nand/sh_flctl.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/mtd/nand/sh_flctl.c b/drivers/mtd/nand/sh_flctl.c
> index e7f3c98487e620bc..3c5008a4f5f33acc 100644
> --- a/drivers/mtd/nand/sh_flctl.c
> +++ b/drivers/mtd/nand/sh_flctl.c
> @@ -1094,14 +1094,11 @@ MODULE_DEVICE_TABLE(of, of_flctl_match);
>  
>  static struct sh_flctl_platform_data *flctl_parse_dt(struct device *dev)
>  {
> -	const struct of_device_id *match;
> -	struct flctl_soc_config *config;
> +	const struct flctl_soc_config *config;
>  	struct sh_flctl_platform_data *pdata;
>  
> -	match = of_match_device(of_flctl_match, dev);
> -	if (match)
> -		config = (struct flctl_soc_config *)match->data;
> -	else {
> +	config = of_device_get_match_data(dev);
> +	if (!config) {
>  		dev_err(dev, "%s: no OF configuration attached\n", __func__);
>  		return NULL;
>  	}

^ permalink raw reply

* [Buildroot] [PATCH 2/2] dnsmasq: Disable CHAOS by default
From: Jeroen Roovers @ 2017-10-05  7:25 UTC (permalink / raw)
  To: buildroot

dnsmasq exposes its version and some statistics through DNS, a features
that is configurable only at build time and is enabled by default.
---
 package/dnsmasq/Config.in  | 6 ++++++
 package/dnsmasq/dnsmasq.mk | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/package/dnsmasq/Config.in b/package/dnsmasq/Config.in
index 8921fc629d..ea733d964e 100644
--- a/package/dnsmasq/Config.in
+++ b/package/dnsmasq/Config.in
@@ -27,6 +27,12 @@ config BR2_PACKAGE_DNSMASQ_DNSSEC
 	help
 	  Enable DNSSEC validation and caching support.
 
+config BR2_PACKAGE_DNSMASQ_CHAOS
+	bool "Enable identification support"
+	default n
+	help
+	  Report *.bind CHAOS info to clients.
+
 config BR2_PACKAGE_DNSMASQ_IDN
 	bool "IDN support"
 	depends on BR2_USE_WCHAR
diff --git a/package/dnsmasq/dnsmasq.mk b/package/dnsmasq/dnsmasq.mk
index 9f96030eb6..dd2b5eb615 100644
--- a/package/dnsmasq/dnsmasq.mk
+++ b/package/dnsmasq/dnsmasq.mk
@@ -30,6 +30,10 @@ ifneq ($(BR2_PACKAGE_DNSMASQ_TFTP),y)
 DNSMASQ_COPTS += -DNO_TFTP
 endif
 
+ifneq ($(BR2_PACKAGE_DNSMASQ_CHAOS),y)
+DNSMASQ_COPTS += -DNO_ID
+endif
+
 # NLS requires IDN so only enable it (i18n) when IDN is true
 ifeq ($(BR2_PACKAGE_DNSMASQ_IDN),y)
 DNSMASQ_DEPENDENCIES += libidn $(TARGET_NLS_DEPENDENCIES)
-- 
2.14.2

^ permalink raw reply related

* [PATCH v6] brcmfmac: add CLM download support
From: Wright Feng @ 2017-10-05  7:31 UTC (permalink / raw)
  To: arend.vanspriel, franky.lin, hante.meuleman, kvalo, chi-hsien.lin
  Cc: wright.feng, linux-wireless, brcm80211-dev-list.pdl,
	Chung-Hsien Hsu

From: Chung-Hsien Hsu <stanley.hsu@cypress.com>

The firmware for brcmfmac devices includes information regarding
regulatory constraints. For certain devices this information is kept
separately in a binary form that needs to be downloaded to the device.
This patch adds support to download this so-called CLM blob file. It
uses the same naming scheme as the other firmware files with extension
of .clm_blob.

The CLM blob file is optional. If the file does not exist, the download
process will be bypassed. It will not affect the driver loading.

Signed-off-by: Chung-Hsien Hsu <stanley.hsu@cypress.com>
---
v2: Revise commit message to describe in more detail
v3: Add error handling in brcmf_c_get_clm_name function
v4: Correct the length of dload_buf in brcmf_c_download function
v5: Remove unnecessary cast and alignment
v6: Add debug log for the case of no CLM file present
---
 .../net/wireless/broadcom/brcm80211/brcmfmac/bus.h |  10 ++
 .../wireless/broadcom/brcm80211/brcmfmac/common.c  | 162 +++++++++++++++++++++
 .../wireless/broadcom/brcm80211/brcmfmac/core.c    |   2 +
 .../wireless/broadcom/brcm80211/brcmfmac/core.h    |   2 +
 .../broadcom/brcm80211/brcmfmac/fwil_types.h       |  31 ++++
 .../wireless/broadcom/brcm80211/brcmfmac/pcie.c    |  19 +++
 .../wireless/broadcom/brcm80211/brcmfmac/sdio.c    |  19 +++
 .../net/wireless/broadcom/brcm80211/brcmfmac/usb.c |  18 +++
 8 files changed, 263 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
index b55c329..df42e09 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
@@ -71,6 +71,7 @@ struct brcmf_bus_dcmd {
  * @wowl_config: specify if dongle is configured for wowl when going to suspend
  * @get_ramsize: obtain size of device memory.
  * @get_memdump: obtain device memory dump in provided buffer.
+ * @get_fwname: obtain firmware name.
  *
  * This structure provides an abstract interface towards the
  * bus specific driver. For control messages to common driver
@@ -87,6 +88,8 @@ struct brcmf_bus_ops {
 	void (*wowl_config)(struct device *dev, bool enabled);
 	size_t (*get_ramsize)(struct device *dev);
 	int (*get_memdump)(struct device *dev, void *data, size_t len);
+	int (*get_fwname)(struct device *dev, uint chip, uint chiprev,
+			  unsigned char *fw_name);
 };
 
 
@@ -214,6 +217,13 @@ int brcmf_bus_get_memdump(struct brcmf_bus *bus, void *data, size_t len)
 	return bus->ops->get_memdump(bus->dev, data, len);
 }
 
+static inline
+int brcmf_bus_get_fwname(struct brcmf_bus *bus, uint chip, uint chiprev,
+			 unsigned char *fw_name)
+{
+	return bus->ops->get_fwname(bus->dev, chip, chiprev, fw_name);
+}
+
 /*
  * interface functions from common layer
  */
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
index 7a2b495..5397727 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
@@ -18,6 +18,7 @@
 #include <linux/string.h>
 #include <linux/netdevice.h>
 #include <linux/module.h>
+#include <linux/firmware.h>
 #include <brcmu_wifi.h>
 #include <brcmu_utils.h>
 #include "core.h"
@@ -28,6 +29,7 @@
 #include "tracepoint.h"
 #include "common.h"
 #include "of.h"
+#include "firmware.h"
 
 MODULE_AUTHOR("Broadcom Corporation");
 MODULE_DESCRIPTION("Broadcom 802.11 wireless LAN fullmac driver.");
@@ -104,12 +106,140 @@ void brcmf_c_set_joinpref_default(struct brcmf_if *ifp)
 		brcmf_err("Set join_pref error (%d)\n", err);
 }
 
+static int brcmf_c_download(struct brcmf_if *ifp, u16 flag,
+			    struct brcmf_dload_data_le *dload_buf,
+			    u32 len)
+{
+	s32 err;
+
+	flag |= (DLOAD_HANDLER_VER << DLOAD_FLAG_VER_SHIFT);
+	dload_buf->flag = cpu_to_le16(flag);
+	dload_buf->dload_type = cpu_to_le16(DL_TYPE_CLM);
+	dload_buf->len = cpu_to_le32(len);
+	dload_buf->crc = cpu_to_le32(0);
+	len = sizeof(*dload_buf) + len - 1;
+
+	err = brcmf_fil_iovar_data_set(ifp, "clmload", dload_buf, len);
+
+	return err;
+}
+
+static int brcmf_c_get_clm_name(struct brcmf_if *ifp, u8 *clm_name)
+{
+	struct brcmf_bus *bus = ifp->drvr->bus_if;
+	struct brcmf_rev_info *ri = &ifp->drvr->revinfo;
+	u8 fw_name[BRCMF_FW_NAME_LEN];
+	u8 *ptr;
+	size_t len;
+	s32 err;
+
+	memset(fw_name, 0, BRCMF_FW_NAME_LEN);
+	err = brcmf_bus_get_fwname(bus, ri->chipnum, ri->chiprev, fw_name);
+	if (err) {
+		brcmf_err("get firmware name failed (%d)\n", err);
+		goto done;
+	}
+
+	/* generate CLM blob file name */
+	ptr = strrchr(fw_name, '.');
+	if (!ptr) {
+		err = -ENOENT;
+		goto done;
+	}
+
+	len = ptr - fw_name + 1;
+	if (len + strlen(".clm_blob") > BRCMF_FW_NAME_LEN) {
+		err = -E2BIG;
+	} else {
+		strlcpy(clm_name, fw_name, len);
+		strlcat(clm_name, ".clm_blob", BRCMF_FW_NAME_LEN);
+	}
+done:
+	return err;
+}
+
+static int brcmf_c_process_clm_blob(struct brcmf_if *ifp)
+{
+	struct device *dev = ifp->drvr->bus_if->dev;
+	struct brcmf_dload_data_le *chunk_buf;
+	const struct firmware *clm = NULL;
+	u8 clm_name[BRCMF_FW_NAME_LEN];
+	u32 chunk_len;
+	u32 datalen;
+	u32 cumulative_len;
+	u16 dl_flag = DL_BEGIN;
+	u32 status;
+	s32 err;
+
+	brcmf_dbg(TRACE, "Enter\n");
+
+	memset(clm_name, 0, BRCMF_FW_NAME_LEN);
+	err = brcmf_c_get_clm_name(ifp, clm_name);
+	if (err) {
+		brcmf_err("get CLM blob file name failed (%d)\n", err);
+		return err;
+	}
+
+	err = request_firmware(&clm, clm_name, dev);
+	if (err) {
+		if (err == -ENOENT) {
+			brcmf_dbg(INFO, "continue with CLM data currently present in firmware\n");
+			return 0;
+		}
+		brcmf_err("request CLM blob file failed (%d)\n", err);
+		return err;
+	}
+
+	chunk_buf = kzalloc(sizeof(*chunk_buf) + MAX_CHUNK_LEN - 1, GFP_KERNEL);
+	if (!chunk_buf) {
+		err = -ENOMEM;
+		goto done;
+	}
+
+	datalen = clm->size;
+	cumulative_len = 0;
+	do {
+		if (datalen > MAX_CHUNK_LEN) {
+			chunk_len = MAX_CHUNK_LEN;
+		} else {
+			chunk_len = datalen;
+			dl_flag |= DL_END;
+		}
+		memcpy(chunk_buf->data, clm->data + cumulative_len, chunk_len);
+
+		err = brcmf_c_download(ifp, dl_flag, chunk_buf, chunk_len);
+
+		dl_flag &= ~DL_BEGIN;
+
+		cumulative_len += chunk_len;
+		datalen -= chunk_len;
+	} while ((datalen > 0) && (err == 0));
+
+	if (err) {
+		brcmf_err("clmload (%zu byte file) failed (%d); ",
+			  clm->size, err);
+		/* Retrieve clmload_status and print */
+		err = brcmf_fil_iovar_int_get(ifp, "clmload_status", &status);
+		if (err)
+			brcmf_err("get clmload_status failed (%d)\n", err);
+		else
+			brcmf_dbg(INFO, "clmload_status=%d\n", status);
+		err = -EIO;
+	}
+
+	kfree(chunk_buf);
+done:
+	release_firmware(clm);
+	return err;
+}
+
 int brcmf_c_preinit_dcmds(struct brcmf_if *ifp)
 {
 	s8 eventmask[BRCMF_EVENTING_MASK_LEN];
 	u8 buf[BRCMF_DCMD_SMLEN];
 	struct brcmf_rev_info_le revinfo;
 	struct brcmf_rev_info *ri;
+	char *clmver;
 	char *ptr;
 	s32 err;
 
@@ -148,6 +278,13 @@ int brcmf_c_preinit_dcmds(struct brcmf_if *ifp)
 	}
 	ri->result = err;
 
+	/* Do any CLM downloading */
+	err = brcmf_c_process_clm_blob(ifp);
+	if (err < 0) {
+		brcmf_err("download CLM blob file failed, %d\n", err);
+		goto done;
+	}
+
 	/* query for 'ver' to get version info from firmware */
 	memset(buf, 0, sizeof(buf));
 	strcpy(buf, "ver");
@@ -167,6 +304,31 @@ int brcmf_c_preinit_dcmds(struct brcmf_if *ifp)
 	ptr = strrchr(buf, ' ') + 1;
 	strlcpy(ifp->drvr->fwver, ptr, sizeof(ifp->drvr->fwver));
 
+	/* Query for 'clmver' to get CLM version info from firmware */
+	memset(buf, 0, sizeof(buf));
+	err = brcmf_fil_iovar_data_get(ifp, "clmver", buf, sizeof(buf));
+	if (err) {
+		if (err == -23) {
+			brcmf_dbg(INFO, "clmver iovar unsupported, %d\n", err);
+		} else {
+			brcmf_err("retrieving clmver failed, %d\n", err);
+			goto done;
+		}
+	} else {
+		clmver = (char *)buf;
+		/* store CLM version for adding it to revinfo debugfs file */
+		memcpy(ifp->drvr->clmver, clmver, sizeof(ifp->drvr->clmver));
+
+		/* Replace all newline/linefeed characters with space
+		 * character
+		 */
+		ptr = clmver;
+		while ((ptr = strnchr(ptr, '\n', sizeof(buf))) != NULL)
+			*ptr = ' ';
+
+		brcmf_dbg(INFO, "CLM version = %s\n", clmver);
+	}
+
 	/* set mpc */
 	err = brcmf_fil_iovar_int_set(ifp, "mpc", 1);
 	if (err) {
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index 511d190..7414dff 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -941,6 +941,8 @@ static int brcmf_revinfo_read(struct seq_file *s, void *data)
 	seq_printf(s, "anarev: %u\n", ri->anarev);
 	seq_printf(s, "nvramrev: %08x\n", ri->nvramrev);
 
+	seq_printf(s, "clmrev: %s\n", bus_if->drvr->clmver);
+
 	return 0;
 }
 
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
index a4dd313..ae39128 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
@@ -141,6 +141,8 @@ struct brcmf_pub {
 	struct notifier_block inetaddr_notifier;
 	struct notifier_block inet6addr_notifier;
 	struct brcmf_mp_device *settings;
+
+	u8 clmver[BRCMF_DCMD_SMLEN];
 };
 
 /* forward declarations */
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
index 9a1eb5a..fffe347 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
@@ -147,6 +147,21 @@
 #define BRCMF_MFP_CAPABLE		1
 #define BRCMF_MFP_REQUIRED		2
 
+/* MAX_CHUNK_LEN is the maximum length for data passing to firmware in each
+ * ioctl. It is relatively small because firmware has small maximum size input
+ * playload restriction for ioctls.
+ */
+#define MAX_CHUNK_LEN			1400
+
+#define DLOAD_HANDLER_VER		1	/* Downloader version */
+#define DLOAD_FLAG_VER_MASK		0xf000	/* Downloader version mask */
+#define DLOAD_FLAG_VER_SHIFT		12	/* Downloader version shift */
+
+#define DL_BEGIN			0x0002
+#define DL_END				0x0004
+
+#define DL_TYPE_CLM			2
+
 /* join preference types for join_pref iovar */
 enum brcmf_join_pref_types {
 	BRCMF_JOIN_PREF_RSSI = 1,
@@ -835,4 +850,20 @@ struct brcmf_gtk_keyinfo_le {
 	u8 replay_counter[BRCMF_RSN_REPLAY_LEN];
 };
 
+/**
+ * struct brcmf_dload_data_le - data passing to firmware for downloading
+ * @flag: flags related to download data.
+ * @dload_type: type of download data.
+ * @len: length in bytes of download data.
+ * @crc: crc of download data.
+ * @data: download data.
+ */
+struct brcmf_dload_data_le {
+	__le16 flag;
+	__le16 dload_type;
+	__le32 len;
+	__le32 crc;
+	u8 data[1];
+};
+
 #endif /* FWIL_TYPES_H_ */
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
index f878706..b370ee7 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
@@ -1350,6 +1350,24 @@ static int brcmf_pcie_get_memdump(struct device *dev, void *data, size_t len)
 	return 0;
 }
 
+static int brcmf_pcie_get_fwname(struct device *dev, u32 chip, u32 chiprev,
+				 u8 *fw_name)
+{
+	struct brcmf_bus *bus_if = dev_get_drvdata(dev);
+	struct brcmf_pciedev *buspub = bus_if->bus_priv.pcie;
+	struct brcmf_pciedev_info *devinfo = buspub->devinfo;
+	int ret = 0;
+
+	if (devinfo->fw_name[0] != '\0')
+		strlcpy(fw_name, devinfo->fw_name, BRCMF_FW_NAME_LEN);
+	else
+		ret = brcmf_fw_map_chip_to_name(chip, chiprev,
+						brcmf_pcie_fwnames,
+						ARRAY_SIZE(brcmf_pcie_fwnames),
+						fw_name, NULL);
+
+	return ret;
+}
 
 static const struct brcmf_bus_ops brcmf_pcie_bus_ops = {
 	.txdata = brcmf_pcie_tx,
@@ -1359,6 +1377,7 @@ static int brcmf_pcie_get_memdump(struct device *dev, void *data, size_t len)
 	.wowl_config = brcmf_pcie_wowl_config,
 	.get_ramsize = brcmf_pcie_get_ramsize,
 	.get_memdump = brcmf_pcie_get_memdump,
+	.get_fwname = brcmf_pcie_get_fwname,
 };
 
 
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index b1789b1..3a71ca0 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -3972,6 +3972,24 @@ static void brcmf_sdio_buscore_write32(void *ctx, u32 addr, u32 val)
 	}
 }
 
+static int brcmf_sdio_get_fwname(struct device *dev, u32 chip, u32 chiprev,
+				 u8 *fw_name)
+{
+	struct brcmf_bus *bus_if = dev_get_drvdata(dev);
+	struct brcmf_sdio_dev *sdiodev = bus_if->bus_priv.sdio;
+	int ret = 0;
+
+	if (sdiodev->fw_name[0] != '\0')
+		strlcpy(fw_name, sdiodev->fw_name, BRCMF_FW_NAME_LEN);
+	else
+		ret = brcmf_fw_map_chip_to_name(chip, chiprev,
+						brcmf_sdio_fwnames,
+						ARRAY_SIZE(brcmf_sdio_fwnames),
+						fw_name, NULL);
+
+	return ret;
+}
+
 static const struct brcmf_bus_ops brcmf_sdio_bus_ops = {
 	.stop = brcmf_sdio_bus_stop,
 	.preinit = brcmf_sdio_bus_preinit,
@@ -3982,6 +4000,7 @@ static void brcmf_sdio_buscore_write32(void *ctx, u32 addr, u32 val)
 	.wowl_config = brcmf_sdio_wowl_config,
 	.get_ramsize = brcmf_sdio_bus_get_ramsize,
 	.get_memdump = brcmf_sdio_bus_get_memdump,
+	.get_fwname = brcmf_sdio_get_fwname,
 };
 
 static void brcmf_sdio_firmware_callback(struct device *dev, int err,
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c
index 8f20a4b..9622dd9 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c
@@ -1128,12 +1128,30 @@ static void brcmf_usb_wowl_config(struct device *dev, bool enabled)
 		device_set_wakeup_enable(devinfo->dev, false);
 }
 
+static int brcmf_usb_get_fwname(struct device *dev, u32 chip, u32 chiprev,
+				u8 *fw_name)
+{
+	struct brcmf_usbdev_info *devinfo = brcmf_usb_get_businfo(dev);
+	int ret = 0;
+
+	if (devinfo->fw_name[0] != '\0')
+		strlcpy(fw_name, devinfo->fw_name, BRCMF_FW_NAME_LEN);
+	else
+		ret = brcmf_fw_map_chip_to_name(chip, chiprev,
+						brcmf_usb_fwnames,
+						ARRAY_SIZE(brcmf_usb_fwnames),
+						fw_name, NULL);
+
+	return ret;
+}
+
 static const struct brcmf_bus_ops brcmf_usb_bus_ops = {
 	.txdata = brcmf_usb_tx,
 	.stop = brcmf_usb_down,
 	.txctl = brcmf_usb_tx_ctlpkt,
 	.rxctl = brcmf_usb_rx_ctlpkt,
 	.wowl_config = brcmf_usb_wowl_config,
+	.get_fwname = brcmf_usb_get_fwname,
 };
 
 static int brcmf_usb_bus_setup(struct brcmf_usbdev_info *devinfo)
-- 
1.9.1

^ permalink raw reply related

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
From: Xiao Yang @ 2017-10-05  7:27 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: Amir Goldstein, Theodore Ts'o, Eryu Guan, Josef Bacik,
	fstests, Ext4, Vijay Chidambaram
In-Reply-To: <CAFk8rvYvrGo6SjA6e0yj0XL_BARMmZoofB5kESM4EY4S2n-==w@mail.gmail.com>

On 2017/09/30 22:15, Ashlie Martinez wrote:
> Hi Xiao,
>
> I am a student at the University of Texas at Austin. Some researchers
> in the computer science department at UT, myself included, have
> recently been working to develop a file system crash consistency test
> harness called CrashMonkey [1][2]. I have been working on the
> CrashMonkey project since it was started late last year. With
> CrashMonkey we have also been able to reproduce the incorrect i_size
> error you noted but we have not been able to reproduce the other
> output that Amir found. CrashMonkey works by logging and replaying
> operations for a workload, so it should not be sensitive to
> differences in timing that could be caused by things like KVM+virtio.
> I also did a few experiments with Amir's new xfstests test 456 (both
> with and without KVM and virtio) and I was unable to reproduce the
> output noted in the xfstest. I have not spent a lot of time looking
> into the cause of the bug that Amir found and it is rather unfortunate
> that I was unable to reproduce it with either xfstests or CrashMonkey.
Hi Ashlie,

Thanks for your detailed comments.

1) Do you think the output that Amir noted in xfstests is a false positive?

2) About the output that both i and you reproduced,  did you look into 
it and find its root cause?

Thanks,
Xiao Yang
> At any rate, CrashMonkey is still under development, so it does have
> some caveats. First, we are running with a fixed random seed in our
> default RandomPermuter (used to generate crash states) to aid
> development. Second, the branch with the reproduction of this ext4
> regression bug in CrashMonkey [3] will yield a few false positives due
> to the way CrashMonkey works and how fsx runs. These false positives
> are due to CrashMonkey generating crash states where the directories
> for files used for the test have not be fsync-ed in the file system.
> The top of the README in the CrashMonkey branch with this bug
> reproduction outlines how we determined these were false positives
>
> [1] https://github.com/utsaslab/crashmonkey
> [2] https://www.usenix.org/conference/hotstorage17/program/presentation/martinez
> [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug
>
>
> On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein<amir73il@gmail.com>  wrote:
>> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com>  wrote:
>>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>>> This test is motivated by a bug found in ext4 during random crash
>>>> consistency tests.
>>>>
>>>> This test uses device mapper flakey target to demonstrate the bug
>>>> found using device mapper log-writes target.
>>>>
>>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com>
>>>> ---
>>>>
>>>> Ted,
>>>>
>>>> While working on crash consistency xfstests [1], I stubmled on what
>>>> appeared to be an ext4 crash consistency bug.
>>>>
>>>> The tests I used rely on the log-writes dm target code written
>>>> by Josef Bacik, which had little exposure to the wide community
>>>> as far as I know.  I wanted to prove to myself that the found
>>>> inconsistency was not due to a test bug, so I bisected the failed
>>>> test to the minimal operations that trigger the failure and wrote
>>>> a small independent test to reproduce the issue using dm flakey target.
>>>>
>>>> The following fsck error is reliably reproduced by replaying some fsx ops
>>>> on overlapping file regions, then emulating a crash, followed by mount,
>>>> umount and fsck -nf:
>>>>
>>>>    ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>>    1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>>    2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>>    3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>>    4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>>    5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>>    6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>>    All 7 operations completed A-OK!
>>>>    _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>>>>    *** fsck.ext4 output ***
>>>>    fsck from util-linux 2.27.1
>>>>    e2fsck 1.42.13 (17-May-2015)
>>>>    Pass 1: Checking inodes, blocks, and sizes
>>>>    Inode 12, end of extent exceeds allowed value
>>>>            (logical block 33, physical block 33441, len 7)
>>>>    Clear? no
>>>>    Inode 12, i_blocks is 184, should be 128.  Fix? no
>>> Hi Amir,
>>>
>>> I always get the following output when running your xfstests test case 501.
>> Now merged as test generic/456
>>
>>> ---------------------------------------------------------------------------
>>> e2fsck 1.42.9 (28-Dec-2013)
>>> Pass 1: Checking inodes, blocks, and sizes
>>> Inode 12, i_size is 147456, should be 163840. Fix? no
>>> ---------------------------------------------------------------------------
>>>
>>> Could you tell me how to get the expected output as you reported?
>> I can't say I am doing anything special, but I can say that I get the
>> same output as you did when running the test inside kvm-xfstests.
>> Actually, I could not reproduce ANY of the the crash consistency bugs
>> inside kvm-xfstests. Must be something to do with different timing of
>> IO with KVM+virtio disks??
>>
>> When running on my laptop (Ubuntu 16.04 with latest kernel)
>> on a 10G SSD volume, I always get the error reported above.
>> I just re-verified with latest stable e2fsprogs (1.43.6).
>>
>> Amir.
>
> .
>




^ permalink raw reply

* Re: [RFC PATCH 1/2] kbuild: Add a cache for generated variables
From: Masahiro Yamada @ 2017-10-05  7:26 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Michal Marek, Guenter Roeck, Brian Norris, Marcin Nowakowski,
	Matthias Kaehlcke, Behan Webster, Arnd Bergmann, Mark Charlebois,
	Cao jin, Linux Kbuild mailing list, open list:DOCUMENTATION,
	Jonathan Corbet, Linux Kernel Mailing List, Ingo Molnar
In-Reply-To: <CAD=FV=XQfLf+5ZPe4Vfqn1kpdom16YT7Nj=mw2Qydk1hmbL=pg@mail.gmail.com>

Hi Douglas,


2017-10-05 7:38 GMT+09:00 Doug Anderson <dianders@chromium.org>:
> Hi,
>
> On Tue, Oct 3, 2017 at 9:05 PM, Masahiro Yamada
> <yamada.masahiro@socionext.com> wrote:
>> Thanks for the patches.  The idea is interesting.
>>
>> I am not a Chrome developer, but cc-option could be improved somehow.
>>
>>
>> I examined two approaches to mitigate the pain.
>>
>> [1] Skip cc-option completely when we run non-build targets
>>     such as "make help", "make clean", etc.
>>
>> [2] Cache the result of cc-option like this patch
>>
>>
>> I wrote some patches for [1]
>> https://patchwork.kernel.org/patch/9983825/
>> https://patchwork.kernel.org/patch/9983829/
>> https://patchwork.kernel.org/patch/9983833/
>> https://patchwork.kernel.org/patch/9983827/
>>
>> Comments are welcome.  :)
>
> OK, I'll take a look at them.  I'm not massively familiar with the
> kernel Makefiles, but hopefully I can figure some of it out.  :-P
>
>
>> [1] does not solve the slowness in the incremental build.
>> Instead, it makes non-build targets faster
>> from a clean source tree because cc-option is zero cost.
>>
>>
>> [2] solves the issues in the incremental build.
>> One funny thing is, it computes cc-option and store the cache file
>> even for "make clean", where we know the cache file will be
>> immediately deleted.
>>
>>
>> We can apply [1] or [2], or maybe both of them
>> because [1] and [2] are trying to solve the different issues.
>
> Yeah, I'm much more interested in [2] than [1].  I run incremental
> builds almost constantly and hate the slowness.  :(  I can certainly
> appreciate that the inefficient compiler setup in Chrome OS isn't your
> problem and that an extra .5 seconds for an incremental build for most
> people isn't that huge, though.  ...but I'l probably move forward with
> [2] since it still helps me a lot and still should help others.
> Solving [1] seems worthwhile too, though...


I agree.

[2] is more helpful because developers spend most of time
for the incremental build.

[1] can improve it even more, but it just addresses some minor issues.



>
>> The cache approach seemed clever, but I do not like the implementation.
>> The code is even more unreadable with lots of escape sequence.
>>
>>
>> Here is my suggestion for simpler implementation (including bike-shed)
>>
>>
>> Implement a new function "shell-cache".  It works like $(shell ...)
>>
>> The difference is
>> $(call shell-cached,...) returns the cached result, or run the command
>> if not cached.
>>
>>
>>
>> Also, add try-run-cached.  This is a cached variant of try-run.
>>
>>
>> I just played with it, and seems working.
>> (I did not have spend much time for testing, more wider test is needed.)
>>
>>
>> The code is like something like this:
>>
>>
>>
>> make-cache := $(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD)/,$(if
>> $(obj),$(obj)/)).cache.mk
>>
>> -include $(make-cache)
>
> Thanks, this is much better!  :)
>
>
>> define __run-and-store
>> ifeq ($(origin $(1)),undefined)
>>   $$(eval $(1) := $$(shell $$2))
>>   $$(shell echo '$(1) := $$($(1))' >> $(make-cache))
>> endif
>> endef
>
> I like your idea.  Essentially you're saying that we can just defer
> the shell command, which was the really essential part that needed to
> be deferred.  Nice!  It seems much nicer / cleaner.  Still a tiny bit
> of voodoo, but with all of your improvements the voodoo is at least
> contained to a very small part of the file.
>
> OK, things seem pretty nice with this and I'll submit out a new patch.
> I'm a bit conflicted about whether I should put myself as authorship
> or you, since mostly the patch is your code now.  ;-)  For now I'll
> leave my authorship but feel free to claim it and change me to just
> "Suggested-by".  :-)


No worry.
Without your patches, I would have never come up with this.
The code improvement happened in general review process.



> NOTE: one thing I noticed about your example code is that you weren't
> always consistent about $(1) vs. $1.  I've changed things to be
> consistent at the expense of extra parenthesis.  If you hate it, let
> me know.

I accept your choice.
Keeping the consistent coding style is good.


If we decide to use $1 instead of $(1),
we should convert all instances tree-wide
for consistency.



>>
>> __set-and-store takes two arguments,
>> but here, it is called with one argument __cached_$(1)
>>
>> Probably, $$(try-run, ...) will be passed as $2,
>> but I do not know why this works.
>
> Whew!  It's not just me that's confused by all this...  ;-)  It looks
> like you continued to use this in your suggestion, so I guess you
> thought it was useful, at least...  Yeah, it's a bit of magic.  The
> goal was to have one less set of evaluating and passing going around.
>
> So really '__set-and-store' takes _1_ argument.  It outputs a string
> where it uses this argument.  Also part of the output string is a
> reference to $(2).  This will refer to the caller's second variable.
>
> ===
>
> Maybe a "simpler" example?
>
> define example
> $$(eval $(1) := $$(2))
> endef
>
> ex_usage = $(eval $(call example,$(1)))$($(1))
>
> .PHONY: ex
> ex:
>   @echo $(call ex_usage,myvar,myval)
>   @echo $(myvar)
>
> If you do "make ex" you'll see "myval" printed twice.
>
> --
>
> Walking through that, let's first remove the "call" which we can do in
> this case pretty easily because there are no "ifdefs", unlike the real
> code.  Since the first argument to "example" is $(1), the above is the
> same as:
>
> ex_usage = $(eval $$(eval $(1) := $$(2)))$($(1))
>
> ...now we need to resolve that recursively where $(1) is 'myvar' and
> $(2) is 'myval'.  So we get:
>
> 1. $(eval $$(eval $(1) := $$(2)))$($(1))
> 2. $(eval $$(eval myvar := $$(2)))$(myvar)
> 3. $(eval myvar := $(2))$(myvar)
> 4. $(eval myvar := myval)$(myvar)


This example is very helpful!

Now I understood this.
Thanks for clear explanation!



> --
>
> An alternate version of my "simpler" example that doesn't use this magic:
>
> define example
> $$(eval $(1) := $(2))
> endef
>
> ex_usage = $(eval $(call example,$(1),$(2)))$($(1))
>
> .PHONY: ex
> ex:
>   @echo $(call ex_usage,myvar,myval)
>   @echo $(myvar)
>
> ...that works OK for the simple example case but you get into weird
> esoteric issues because of the extra eval.  You'll find that the two
> definitions behave differently for:
>
>   @echo $(call ex_usage,myvar,my$$$$(cat /proc/cmdline)val)
>
> ...in that example the "intention" is that bash should be passed the
> string: echo my$(cat /proc/cmdline)val, so it escapes the "$" once to
> "defer" it and twice because it wants the "$" to go through to bash.


You are right.

If we want to get the same result by using the alternate version,
we need one more level of escaping, so end up with crazy escaping:

      @echo $(call ex_usage,myvar,my$$$$$$$$(cat /proc/cmdline)val)

So, the magic is unclear, but very helpful.



>>>
>>>  # __cc-option
>>>  # Usage: MY_CFLAGS += $(call __cc-option,$(CC),$(MY_CFLAGS),-march=winchip-c6,-march=i586)
>>> -__cc-option = $(call try-run,\
>>> -       $(1) -Werror $(2) $(3) -c -x c /dev/null -o "$$TMP",$(3),$(4))
>>> +__cc-option = \
>>> +       $(call cached-val,$(call __sanitize-opt,__cc_opt_$(1)_$(2)_$(3)_$(4)),\
>>> +       $$(call try-run,\
>>> +       $(call __comma,$(1)) -Werror \
>>> +       $(call __comma,$(2)) \
>>> +       $(call __comma,$(3)) -c -x c /dev/null \
>>> +       -o "$$$$TMP",$(call __comma,$(strip $(3))),$(call __comma,$(strip $(4)))))
>>
>>
>> I did not follow how this was working here...
>
> Hoo boy.  Let's see...  Which part were you curious about?  I can try
> to explain the bits and you can tell me if I got them all.  Note that
> many of these aren't super important anymore since they're not needed
> with your improvements:

Right, all of these are not necessary any more.
I did not mean that you should explain all of these.

I did not want read them,
so I started to consider simpler idea that I can understand.

Sorry if you used much time for me,
but this discussion helped me a lot to understand it deeply.

>
>
> * $(call __sanitize-opt,__cc_opt_$(1)_$(2)_$(3)_$(4)):
>
> Tries to make a variable name that will be different if you have any
> different parameters.

Yes, this is an easy part.

>
> * $$(call try-run
>
> The "$$" around this call is because it's deferred and a general
> concept for "cached-val".  If (and only if) we decide that we need to
> recompute this value then we'll run this through an "eval".  By having
> the $$ then this action won't be taken unless the string is eval-ed.
> Note that plenty of stuff _will_ still happen right away, though.  All
> of the $(call __comma,$(1)) are _not_ deferred.  That's fine since
> that doesn't lead to a fork-exec and thus isn't "expensive".

Actually, I could not understand how this worked in v1,
but with the help of your simplified example above,
the intention is clear.  Thanks.



> *  $(call __comma,$(1))
>
> I'll admit that I didn't dig all the way into this.  I found at least
> one case where "make" was incorrectly mixing up arguments due to the
> "eval" and the "__comma" fixed it.  I decided it was important to do
> this for all arguments.  Possibly other things might need to be
> escaped too, but I didn't find any cases that needed it...  ...but no
> longer needed with your improvements...

OK, this is probably for corner cases, not that important part.

>
> * $$$$TMP
>
> An extra level of escaping is needed since it goes through an extra
> "eval".  This means $$ => $$$$

This is now clear to me
with the simplified example and its alternate version.


>
> * $(strip $(3))
>
> This was important because otherwise the cache otherwise wasn't
> completely "stable".  If you ran "make" the first time you'd get a
> cache, then running "make" the 2nd time would give you extra entries
> in the cache.  After the 3rd time you were OK, but it seemed nice to
> make it stable faster.
>
> To understand this, first understand that some people call __cc_option
> with a space after the ','.  Make treats this as important, thus these
> two are different:
>
> val := $(call cc-option, -opt1, -opt2)
> val := $(call cc-option,-opt1,-opt2)
>
> In one case val would be " -opt1" or " -opt2".  In the other case
> "-opt1" or "-opt2".
>
> However, when we cache things the space suddenly becomes irrelevant.
> That's because we've end up with something like:
>
>   __cached_result :=  -opt1
>
> ...and even though there are two spaces after the ":=" make throws that away.
>
> The result of all of that is that we'd get slightly different numbers
> of spaces after the cache was populated.  That could probably be
> ignored except that some flags are additive and previous flags are
> passed as arguments to later calls to __cc-option.  That means that
> previous flags are passed to __sanitize-opt.  ...and in __sanitize-opt
> spaces are significant.  ...so you'd come up with a slightly different
> variable name on the first run and the subsequent runs...
>
> Probably waaaay more than you expected for a simple $(strip) call, huh?

Understood.
The whitespace problem is nasty.


> In any case this doesn't seem to matter anymore with your improvements
> (I didn't dig into why), though I have noticed that the cache does
> still return slightly different spacing for some of the results...
>

As far as I tested, I always see only one space after ":=" in v2.

I did not consider this deeply,
but something is working nicely behind the scene.





-- 
Best Regards
Masahiro Yamada

^ permalink raw reply

* [kernel-hardening] Re: [PATCH RFC v4 1/3] gcc-plugins: Add STACKLEAK erasing the kernel stack at the end of syscalls
From: Ingo Molnar @ 2017-10-05  7:27 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, keescook, pageexec, spender, tycho,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Andy Lutomirski, x86,
	Linus Torvalds, Andy Lutomirski, Borislav Petkov, Thomas Gleixner,
	H. Peter Anvin, Peter Zijlstra
In-Reply-To: <1507157703-14972-2-git-send-email-alex.popov@linux.com>


* Alexander Popov <alex.popov@linux.com> wrote:

> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 8a13d46..06bc57b 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -75,6 +75,71 @@
>  #endif
>  .endm
>  
> +.macro erase_kstack
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	call erase_kstack
> +#endif
> +.endm
> +
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +/* For the detailed comments, see erase_kstack in entry_64.S */
> +ENTRY(erase_kstack)
> +	pushl	%edi
> +	pushl	%ecx
> +	pushl	%eax
> +	pushl	%ebp
> +
> +	movl	PER_CPU_VAR(current_task), %ebp
> +	mov	TASK_lowest_stack(%ebp), %edi
> +	mov	$STACKLEAK_POISON, %eax
> +	std
> +
> +1:
> +	mov	%edi, %ecx
> +	and	$THREAD_SIZE_asm - 1, %ecx
> +	shr	$2, %ecx
> +	repne	scasl
> +	jecxz	2f
> +
> +	cmp	$2*16, %ecx
> +	jc	2f
> +
> +	mov	$2*16, %ecx
> +	repe	scasl
> +	jecxz	2f
> +	jne	1b
> +
> +2:
> +	cld
> +	or	$2*4, %edi
> +	mov	%esp, %ecx
> +	sub	%edi, %ecx
> +
> +	cmp	$THREAD_SIZE_asm, %ecx
> +	jb	3f
> +	ud2
> +
> +3:
> +	shr	$2, %ecx
> +	rep	stosl
> +
> +	/*
> +	 * TODO: sp0 on x86_32 is not reliable, right?
> +	 * Doubt because of the definition of cpu_current_top_of_stack
> +	 * in arch/x86/kernel/cpu/common.c.
> +	 */
> +	mov	TASK_thread_sp0(%ebp), %edi
> +	sub	$128, %edi
> +	mov	%edi, TASK_lowest_stack(%ebp)
> +
> +	popl	%ebp
> +	popl	%eax
> +	popl	%ecx
> +	popl	%edi
> +	ret
> +ENDPROC(erase_kstack)
> +#endif
> +
>  /*
>   * User gs save/restore
>   *
> @@ -445,6 +510,8 @@ ENTRY(entry_SYSENTER_32)
>  	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
>  		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
>  
> +	erase_kstack
> +
>  /* Opportunistic SYSEXIT */
>  	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
>  	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
> @@ -531,6 +598,8 @@ ENTRY(entry_INT80_32)
>  	call	do_int80_syscall_32
>  .Lsyscall_32_done:
>  
> +	erase_kstack
> +
>  restore_all:
>  	TRACE_IRQS_IRET
>  .Lrestore_all_notrace:
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 4916725..189d843 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -59,6 +59,90 @@ END(native_usergs_sysret64)
>  #endif
>  .endm
>  
> +.macro erase_kstack
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	call erase_kstack
> +#endif
> +.endm
> +
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +ENTRY(erase_kstack)
> +	pushq	%rdi
> +	pushq	%rcx
> +	pushq	%rax
> +	pushq	%r11
> +
> +	movq	PER_CPU_VAR(current_task), %r11
> +	mov	TASK_lowest_stack(%r11), %rdi
> +	mov	$STACKLEAK_POISON, %rax
> +	std
> +
> +	/*
> +	 * Let's search for the poison value in the stack.
> +	 * Start from the lowest_stack and go to the bottom (see std above).
> +	 */
> +1:
> +	mov	%edi, %ecx
> +	and	$THREAD_SIZE_asm - 1, %ecx
> +	shr	$3, %ecx
> +	repne	scasq
> +	jecxz	2f	/* Didn't find it. Go to poisoning. */
> +
> +	/*
> +	 * Found the poison value in the stack. Go to poisoning if there are
> +	 * less than 16 qwords left.
> +	 */
> +	cmp	$2*8, %ecx
> +	jc	2f
> +
> +	/*
> +	 * Check that 16 further qwords contain poison (avoid false positives).
> +	 * If so, the part of the stack below the address in %rdi is likely
> +	 * to be poisoned. Otherwise we need to search deeper.
> +	 */
> +	mov	$2*8, %ecx
> +	repe	scasq
> +	jecxz	2f	/* Poison the upper part of the stack. */
> +	jne	1b	/* Search deeper. */
> +
> +2:
> +	/*
> +	 * Prepare the counter for poisoning the kernel stack between
> +	 * %rdi and %rsp. Two qwords at the bottom of the stack are reserved
> +	 * and should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
> +	 */
> +	cld
> +	or	$2*8, %rdi
> +	mov	%esp, %ecx
> +	sub	%edi, %ecx
> +
> +	/* Check that the counter value is sane. */
> +	cmp	$THREAD_SIZE_asm, %rcx
> +	jb	3f
> +	ud2
> +
> +3:
> +	/*
> +	 * So let's write the poison value to the kernel stack. Start from the
> +	 * address in %rdi and move up (see cld above) to the address in %rsp
> +	 * (not included, used memory).
> +	 */
> +	shr	$3, %ecx
> +	rep	stosq
> +
> +	/* Set the lowest_stack value to the top_of_stack - 256. */
> +	mov	TASK_thread_sp0(%r11), %rdi
> +	sub	$256, %rdi
> +	mov	%rdi, TASK_lowest_stack(%r11)
> +
> +	popq	%r11
> +	popq	%rax
> +	popq	%rcx
> +	popq	%rdi
> +	ret
> +ENDPROC(erase_kstack)

A couple of (first round) review observations:

- Why is the erase_kstack() function written in assembly, instead of plain C?
  The complexity and fragility of this patch could be reduced if it was moved to C.

- The GCC plugin adds instrumentation in form of extra 'track_stack()' and
  'check_alloca()' calls. Could you please provide a frequency analysis of the
  impact of this: x86-64 defconfig vmlinux size before/after the patch, and the
  number of instrumentation function calls inserted, compared to the number of
  functions?

- Is there a debug facility to query the current (latest) lowest_stack value of
  any task in the system, via /proc? Observing this threshold over time would give 
  a good idea about the typical work the clearing function has to perform for
  every system call.

- Please clean up the GCC plugin code to follow proper kernel coding style.
  The '//' comment lines in particular are a big eyesore, plus there are a lot of 
  other stylistic variations as well that make the code unnecessarily difficult to 
  read.

- Also, this patch is way too big - there's no reason why the GCC plugin and
  the stack erasure features should be introduced in the same patch, etc.

Thanks,

	Ingo

^ permalink raw reply

* Inward remittance - RM 15,800
From: CIMB Clicks Team @ 2017-10-05  6:14 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 125 bytes --]

We have received a credit instruction of RM 15,800 to your account

View attachment to complete the payment.

Thank you

[-- Attachment #2: Clicks_Remittance.docx --]
[-- Type: application/octet-stream, Size: 43388 bytes --]

^ permalink raw reply

* [PATCH] INSTALL: Update dependency list and configure with libxtables support
From: Harsha Sharma @ 2017-10-05  7:31 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, linux-kernel, outreachy-kernel, Harsha Sharma

Add configure with lixtables in INSTALL and required dependencies for
the same

Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com>
---
 INSTALL | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/INSTALL b/INSTALL
index 3e9a6ad..04981f1 100644
--- a/INSTALL
+++ b/INSTALL
@@ -18,6 +18,12 @@ Installation instructions for nftables
 
   - libreadline
 
+  - pkg-config
+
+  - libtool
+
+  - optional: libxtables: required for libxtables support
+
   - optional: docbook2x: required for building man-page
 
   - optional: docbook-utils: required for building PDF man-page
@@ -51,6 +57,11 @@ Installation instructions for nftables
 	having no other use for libgmp.
 	Note: This decreases the debugging verbosity in some files.
 
+ --with-xtables
+
+	For libxtables support which allow input of tables with -f
+ 	option in nft.
+
  Suggested configuration options: --prefix=/ --datarootdir=/usr/share
 
  Run "make" to compile nftables, "make install" to install it in the
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH -next v2] mtd: nand: Add support for Toshiba BENAND (Built-in ECC NAND)
From: Boris Brezillon @ 2017-10-05  7:31 UTC (permalink / raw)
  To: KOBAYASHI Yoshitake
  Cc: richard, linux-kernel, marek.vasut, linux-mtd, cyrille.pitchen,
	computersforpeace, dwmw2
In-Reply-To: <fa307950-6085-acf9-aae3-725b9c2a60af@toshiba.co.jp>

On Thu, 5 Oct 2017 16:24:08 +0900
KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp> wrote:

> On 2017/09/27 6:15, Boris Brezillon wrote:
> > On Tue, 26 Sep 2017 18:18:04 +0900
> > KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp> wrote:
> >   
> >> On 2017/09/21 16:28, Boris Brezillon wrote:  
> >>> On Thu, 21 Sep 2017 14:32:02 +0900
> >>> KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp> wrote:
> >>>     
> >>>> This patch enables support for Toshiba BENAND.
> >>>> The current implementation does not support vondor specific command    
> >>>
> >>> 					       ^ vendor
> >>>     
> >>>> TOSHIBA_NAND_CMD_ECC_STATUS. I would like to add the command, when
> >>>> the exec_op() [1] infrastructure is implemented.    
> >>>
> >>> It's not a good idea to reference a branch that is likely to disappear
> >>> in a commit message. Just say that you can't properly support the
> >>> TOSHIBA_NAND_CMD_ECC_STATUS operation right and that it might be
> >>> addressed in the future.    
> >>
> >> Thanks. I'll change the comment.
> >>  
> >>>> +	 */
> >>>> +	if (status & NAND_STATUS_FAIL) {
> >>>> +		/* uncorrectable */
> >>>> +		mtd->ecc_stats.failed++;
> >>>> +	} else if (status & TOSHIBA_NAND_STATUS_REWRITE_RECOMMENDED) {
> >>>> +		/* correctable */
> >>>> +		max_bitflips = mtd->bitflip_threshold;
> >>>> +		mtd->ecc_stats.corrected += max_bitflips;
> >>>> +	}    
> >>>
> >>> Is this working correctly when you read more than one ECC chunk? The
> >>> ECC step size is 512 bytes and the page is bigger than that, which means
> >>> you have more than one ECC chunk per page. What happens to the
> >>> NAND_STATUS_FAIL flag if the first chunk is uncorrectable but
> >>> following ones are correctable (or do not contain bitflips at all)?     
> >>
> >> As you mentioned, the ECC step size is 512 byte. But when 0x70 command is used
> >> a simplified ECC status per page is generated.  
> > 
> > I'm fine with that as long as uncorrectable errors are detected
> > correctly and TOSHIBA_NAND_STATUS_REWRITE_RECOMMENDED is set as soon as
> > one of the ECC chunk exposes too many bitflips.
> >   
> >> In case the first chunk is uncorrectable but the following ones are correctable,
> >> the 0x70 command can only check the status of the uncorrectable one.  
> > 
> > Sounds good.
> >   
> >> Each ECC chunk status can be checked by using the 0x7A command.
> >>  
> >>>> @@ -39,9 +105,43 @@ static void toshiba_nand_decode_id(struct nand_chip *chip)
> >>>>  
> >>>>  static int toshiba_nand_init(struct nand_chip *chip)
> >>>>  {
> >>>> +	struct mtd_info *mtd = nand_to_mtd(chip);
> >>>> +
> >>>>  	if (nand_is_slc(chip))
> >>>>  		chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
> >>>>  
> >>>> +	if (nand_is_slc(chip) && (chip->id.data[4] & 0x80)) {
> >>>> +		/* BENAND */
> >>>> +
> >>>> +		/*
> >>>> +		 * We can't disable the internal ECC engine, the user
> >>>> +		 * has to use on-die ECC, there is no alternative.
> >>>> +		 */
> >>>> +		if (chip->ecc.mode != NAND_ECC_ON_DIE) {
> >>>> +			pr_err("On-die ECC should be selected.\n");
> >>>> +			return -EINVAL;
> >>>> +		}    
> >>>
> >>> According to your previous explanation that's not exactly true. Since
> >>> ECC bytes are stored in a separate area, the user can decide to use
> >>> another mode without trouble. Just skip the BENAND initialization when
> >>> mode != NAND_ECC_ON_DIE and we should be good, or am I missing something?    
> >>
> >> I am asking to product department to confirm it.  
> > 
> > I'm almost sure this is the case ;-).  
> 
> According to the command sequence written in BENAND's datasheet, the status
> of the internal ECC must be checked after reading. To do that, ecc.mode has been
> set to NAND_ECC_ON_DIE and the status of the internal ECC is checked through 
> the 0x70 or 0x7A command. That's the reason we are returning EINVAL here.

But the status will anyway be retrieved, and what's the point of
checking the ECC flags if the user wants to use its own ECC engine? I
mean, since you have the whole OOB area exposed why would you prevent
existing setup from working (by existing setup I mean those that already
have a BENAND but haven't modified their driver to accept ON_DIE_ECC).

Maybe I'm missing something, but AFAICT it's safe to allow users to
completely ignore the on-die ECC engine and use their own, even if
that means duplicating the work since on-die ECC cannot be disabled on
BENAND devices.

^ permalink raw reply

* [PATCH] nspr, nss: Use BUILD_CC instead of hardcoded "gcc"
From: Nikolay Merinov @ 2017-10-05  7:25 UTC (permalink / raw)
  To: openembedded-core

Recipes nspr_4.16.bb and nss_3.31.1.bb ignored BUILD_CC and it's
BUILD_CFLAGS and tried to compile with hardcoded "gcc" instead. As
result build for this recipes will fail if host use different name for
compiler or require any flags.

Signed-off-by: Nikolay Merinov <n.merinov@inango-systems.com>
---
 meta/recipes-support/nspr/nspr_4.16.bb | 2 +-
 meta/recipes-support/nss/nss_3.31.1.bb | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/meta/recipes-support/nspr/nspr_4.16.bb b/meta/recipes-support/nspr/nspr_4.16.bb
index bb178fe712..78ef994ffd 100644
--- a/meta/recipes-support/nspr/nspr_4.16.bb
+++ b/meta/recipes-support/nspr/nspr_4.16.bb
@@ -155,7 +155,7 @@ PACKAGECONFIG ??= "${@bb.utils.filter('DISTRO_FEATURES', 'ipv6', d)}"
 PACKAGECONFIG[ipv6] = "--enable-ipv6,--disable-ipv6,"
 
 do_compile_prepend() {
-	oe_runmake CROSS_COMPILE=1 CFLAGS="-DXP_UNIX" LDFLAGS="" CC=gcc -C config export
+	oe_runmake CROSS_COMPILE=1 CFLAGS="-DXP_UNIX ${BUILD_CFLAGS}" LDFLAGS="" CC="${BUILD_CC}" -C config export
 }
 
 do_compile_append() {
diff --git a/meta/recipes-support/nss/nss_3.31.1.bb b/meta/recipes-support/nss/nss_3.31.1.bb
index 955862fddb..588708fc31 100644
--- a/meta/recipes-support/nss/nss_3.31.1.bb
+++ b/meta/recipes-support/nss/nss_3.31.1.bb
@@ -66,8 +66,8 @@ do_compile_prepend_class-native() {
 
 do_compile() {
     export CROSS_COMPILE=1
-    export NATIVE_CC="gcc"
-    export NATIVE_FLAGS="${HOST_CFLAGS}"
+    export NATIVE_CC="${BUILD_CC}"
+    export NATIVE_FLAGS="${BUILD_CFLAGS}"
     export BUILD_OPT=1
 
     export FREEBL_NO_DEPEND=1
@@ -118,7 +118,7 @@ do_install_prepend_class-nativesdk() {
 
 do_install() {
     export CROSS_COMPILE=1
-    export NATIVE_CC="gcc"
+    export NATIVE_CC="${BUILD_CC}"
     export BUILD_OPT=1
 
     export FREEBL_NO_DEPEND=1
-- 
2.14.2



^ permalink raw reply related

* Re: [PATCH v4 13/14] platform/x86: dell-smbios-wmi: introduce userspace interface
From: Greg KH @ 2017-10-05  7:33 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: dvhart, Andy Shevchenko, LKML, platform-driver-x86,
	Andy Lutomirski, quasisec, pali.rohar, rjw, mjg59, hch
In-Reply-To: <ad918eaaa2c9cab878d92be4691cb67f44d13681.1507156392.git.mario.limonciello@dell.com>

On Wed, Oct 04, 2017 at 05:48:39PM -0500, Mario Limonciello wrote:
> +static long dell_smbios_wmi_ioctl(struct file *filp, unsigned int cmd,
> +	unsigned long arg)
> +{
> +	void __user *p = (void __user *) arg;
> +	struct wmi_smbios_ioctl *input;
> +	struct wmi_smbios_priv *priv;
> +	struct wmi_device *wdev;
> +	size_t ioctl_size;
> +	int ret = 0;
> +
> +	switch (cmd) {
> +	/* we only operate on first instance */
> +	case DELL_WMI_SMBIOS_CMD:
> +		wdev = get_first_wmi_device();
> +		if (!wdev) {
> +			pr_err("No WMI devices bound\n");

dev_err(), you are a driver, never use "raw" pr_ calls.

> +			return -ENODEV;
> +		}
> +		ioctl_size = sizeof(struct wmi_smbios_ioctl);
> +		priv = dev_get_drvdata(&wdev->dev);
> +		input = kmalloc(ioctl_size, GFP_KERNEL);
> +		if (!input)
> +			return -ENOMEM;
> +		mutex_lock(&wmi_mutex);
> +		if (!access_ok(VERIFY_READ, p, ioctl_size)) {

Hm, any time I see an access_ok() call, I get scared.  You should almost
never need to make that call if you are using the correct kernel apis.

> +			pr_err("Unsafe userspace pointer passed\n");

dev_err().

> +			return -EFAULT;

Memory leak!


> +		}
> +		if (copy_from_user(input, p, ioctl_size)) {
> +			ret = -EFAULT;

So, why did you call access_ok() followed by copy_from_user()?
copy_from/to() handle all of that for you automatically.

> +			goto fail_smbios_cmd;
> +		}
> +		if (input->length != priv->buffer_size) {
> +			pr_err("Got buffer size %d expected %d\n",
> +				input->length, priv->buffer_size);

length is user provided, it can be whatever anyone sets it to.  I don't
understand this error.

> +			ret = -EINVAL;
> +			goto fail_smbios_cmd;
> +		}
> +		if (!access_ok(VERIFY_WRITE, input->buf, priv->buffer_size)) {
> +			pr_err("Unsafe userspace pointer passed\n");

Again, don't need this.

> +			ret = -EFAULT;
> +			goto fail_smbios_cmd;
> +		}
> +		if (copy_from_user(priv->buf, input->buf, priv->buffer_size)) {

Wait, input->buf is a user pointer?  Ick, see my previous email about
your crazy api here being a mess.  This should not be needed.

And as you "know" the buffer size already, why do you have userspace
specify it?  What good is it?

> +			ret = -EFAULT;
> +			goto fail_smbios_cmd;
> +		}
> +		ret = run_smbios_call(wdev);

No other checking of the values in the structure?  You just "trust"
userspace to get it all right?  Hah!

> +		if (ret != 0)
> +			goto fail_smbios_cmd;

You didn't run this through checkpatch :(


> +		if (copy_to_user(input->buf, priv->buf, priv->buffer_size))
> +			ret = -EFAULT;
> +fail_smbios_cmd:
> +		kfree(input);
> +		mutex_unlock(&wmi_mutex);
> +		break;
> +	default:
> +		pr_err("unsupported ioctl: %d.\n", cmd);
> +		ret = -ENOIOCTLCMD;
> +	}
> +	return ret;
> +}
> +
> +static ssize_t buffer_size_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	struct wmi_smbios_priv *priv = dev_get_drvdata(dev);
> +
> +	return sprintf(buf, "%d\n", priv->buffer_size);
> +}
> +static DEVICE_ATTR_RO(buffer_size);
> +
> +static struct attribute *smbios_wmi_attrs[] = {
> +	&dev_attr_buffer_size.attr,
> +	NULL
> +};
> +
> +static const struct attribute_group smbios_wmi_attribute_group = {
> +	.attrs = smbios_wmi_attrs,
> +};
> +
>  static int dell_smbios_wmi_probe(struct wmi_device *wdev)
>  {
>  	struct wmi_smbios_priv *priv;
> @@ -127,6 +209,11 @@ static int dell_smbios_wmi_probe(struct wmi_device *wdev)
>  	if (!priv->buf)
>  		return -ENOMEM;
>  
> +	ret = sysfs_create_group(&wdev->dev.kobj,
> +				 &smbios_wmi_attribute_group);

Hint, if a driver ever makes a call to sysfs_*(), something is wrong, it
should never be needed.

Also, you just raced with userspace and lost :(

There is a way to fix all of this, in a simple way, I'll leave that as
an exercise for the reader, I've reviewed enough of this code for
today...

> +static const struct file_operations dell_smbios_wmi_fops = {
> +	.owner		= THIS_MODULE,

And who uses that field?  Hint, no one is, which is another issue that I
forgot to review in your previous patch where you use this structure.
What is protecting this module from being unloaded while the ioctl call
is running?  (hint, nothing...)

I need more coffee...

greg k-h

^ permalink raw reply

* [PATCH] ARM: dts: omap3: Replace deprecated mcp prefix
From: Lars Poeschel @ 2017-10-05  7:33 UTC (permalink / raw)
  To: Benoît Cousson, Tony Lindgren, Rob Herring, Mark Rutland,
	Russell King, linux-omap, devicetree, linux-arm-kernel,
	linux-kernel
  Cc: Lars Poeschel

The devicetree prefix mcp is deprecated in favour of microchip. Thus
this replaces mcp with microchip for the mcp23017 gpio expander chip.

Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
---
 arch/arm/boot/dts/omap3-lilly-a83x.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
index fa611a5e4850..343a36d8031d 100644
--- a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
+++ b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
@@ -257,7 +257,7 @@
 	pinctrl-names = "default";
 	pinctrl-0 = <&i2c3_pins>;
 		gpiom1: gpio@20 {
-			compatible = "mcp,mcp23017";
+			compatible = "microchip,mcp23017";
 			gpio-controller;
 			#gpio-cells = <2>;
 			reg = <0x20>;
-- 
2.14.2

^ permalink raw reply related

* [PATCH] ARM: dts: omap3: Replace deprecated mcp prefix
From: Lars Poeschel @ 2017-10-05  7:33 UTC (permalink / raw)
  To: linux-arm-kernel

The devicetree prefix mcp is deprecated in favour of microchip. Thus
this replaces mcp with microchip for the mcp23017 gpio expander chip.

Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
---
 arch/arm/boot/dts/omap3-lilly-a83x.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
index fa611a5e4850..343a36d8031d 100644
--- a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
+++ b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi
@@ -257,7 +257,7 @@
 	pinctrl-names = "default";
 	pinctrl-0 = <&i2c3_pins>;
 		gpiom1: gpio at 20 {
-			compatible = "mcp,mcp23017";
+			compatible = "microchip,mcp23017";
 			gpio-controller;
 			#gpio-cells = <2>;
 			reg = <0x20>;
-- 
2.14.2

^ permalink raw reply related

* Re: RISC-V Linux Port v9
From: Arnd Bergmann @ 2017-10-05  7:34 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Stephen Rothwell, Olof Johansson, Linux Kernel Mailing List,
	patches-q3qR2WxjNRFS9aJRtSZj7A, Rob Herring, Mark Rutland,
	albert-SpMDHPYPyPbQT0dZR+AlfA, Masahiro Yamada, Michal Marek,
	Will Deacon, Peter Zijlstra, Boqun Feng, Oleg Nesterov,
	Ingo Molnar, DTML
In-Reply-To: <mhng-fa7412d4-47d2-47e5-b36b-a237dfb04c9d@palmer-si-x1c4>

On Thu, Oct 5, 2017 at 2:21 AM, Palmer Dabbelt <palmer-96lFi9zoCfxBDgjK7y7TUQ@public.gmane.org> wrote:
> On Tue, 26 Sep 2017 23:08:02 PDT (-0700), Arnd Bergmann wrote:
>> On Tue, Sep 26, 2017 at 6:56 PM, Palmer Dabbelt <palmer-96lFi9zoCfxBDgjK7y7TUQ@public.gmane.org> wrote:
>>> As per suggestions on our v8 patch set, I've split the core architecture code
>>> out from our drivers and would like to submit this patch set to be included
>>> into linux-next, with the goal being to be merged in during the next merge
>>> window.  This patch set is based on 4.14-rc2, but if it's better to have it
>>> based on something else then I can change it around.
>>
>> -rc2 is good, just don't rebase it any more. I'd suggest that at the point this
>> becomes part of linux-next, you stop modifying the patches further and
>> move to adding any additional changes as patches on top.
>
> Sounds good.  I've gotten a kernel.org account now, so I've gone ahead and
> signed a "for-linux-next" tag that contains this patch set.  I'm going to treat
> what's here as an official pull request into linux-next and therefor I won't be
> rewriting history any more.  If I understand everything correctly, once I'm in
> linux-next I'm meant to update that tag with commits that are ready to go?
>
> Is there anything further I should do in order to get that tag merged into
> linux-next?

Please be aware that Stephen has announced that there won't be any
linux-next trees until the end of the month, which will be the kernel
summit in Prague.

https://lkml.org/lkml/2017/9/29/13

It may be worth sending him a request to include your tree when
he returns, I assume there will be a long email backlog and he might
miss it otherwise.

>>>  * I cleaned up the defconfigs -- there's actually now just one, and it's
>>>    empty.  For now I think we're OK with what the kernel sets as defaults, but
>>>    I anticipate we'll begin to expand this as people start to use the port
>>>    more.
>>
>> The kernel defaults are not really as sensible as one would hope. Maybe
>> go through your previous defconfig once more and pick up the items that
>> made sense.
>
> I was a bit surprised at the defaults: for example, I'd expect things like
> CONFIG_PCI and CONFIG_NET to be enabled by default.  I guess I just assumed
> that since technically we have a working kernel without those that it was fine
> to just stick with the defaults.

Some of the defaults are really pretty random and are only like this for
historic reasons.

> Looking at our old defconfig, I'd pick
>
>   CONFIG_PCI=y
>   CONFIG_NAMESPACES=y
>   CONFIG_NET=y
>   CONFIG_UNIX=y
>   CONFIG_INET=y
>   CONFIG_DEVTMPFS=y
>   CONFIG_EXT2_FS=y
>   CONFIG_TMPFS=y
>
> does that seem reasonable?

Mostly yes, but please disable ext2 and use ext4 instead.

       Arnd
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.