linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* kmsg_dump() sleeping while atomic
@ 2019-04-23 21:56 Scott Wood
  2019-04-24 14:36 ` [PATCH] printk: kmsg_dump: remove mutex usage John Ogness
  0 siblings, 1 reply; 5+ messages in thread
From: Scott Wood @ 2019-04-23 21:56 UTC (permalink / raw)
  To: John Ogness, Sebastian Andrzej Siewior; +Cc: linux-rt-users

I got an oops on v5.0.7-rt5 (due to kexec code not properly checking for
disabled EFI, patch coming) and after the primary oops, I got this:

[  111.575331] 055: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
[  111.584457] 055: in_atomic(): 0, irqs_disabled(): 1, pid: 13289, name: kexec
[  111.591502] 055: INFO: lockdep is turned off.
[  111.595862] 055: irq event stamp: 481304
[  111.599786] 055: hardirqs last  enabled at (481303): [<ffffffff811853b0>] ktime_get_coarse_real_ts64+0x60/0xc0
[  111.609779] 055: hardirqs last disabled at (481304): [<ffffffff81001ea1>] trace_hardirqs_off_thunk+0x1a/0x1c
[  111.619599] 055: softirqs last  enabled at (0): [<ffffffff810d13e3>] copy_process.part.30+0x833/0x2680
[  111.628900] 055: softirqs last disabled at (0): [<0000000000000000>]           (null)
[  111.636723] 055: CPU: 55 PID: 13289 Comm: kexec Tainted: G      D     E     5.0.7-rt5+ #3
[  111.644896] 055: Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
[  111.655582] 055: Call Trace:
[  111.658468] 055:  dump_stack+0x5e/0x8b
[  111.662223] 055:  ___might_sleep+0x177/0x1e0
[  111.666494] 055:  __rt_mutex_lock_state+0x1f/0x80
[  111.671202] 055:  kmsg_dump+0x88/0x160
[  111.674951] 055:  ? kmsg_dump+0x1f/0x160
[  111.678880] 055:  oops_end+0x59/0xd0
[  111.682459] 055:  no_context+0x185/0x540
[  111.686383] 055:  ? __do_page_fault+0x1da/0x530
[  111.690918] 055:  do_page_fault+0x2d/0x290
[  111.695017] 055:  page_fault+0x1e/0x30
[  111.698769] 055: RIP: 0010:          (null)
[  111.702956] 055: Code: Bad RIP value.
[  111.706618] 055: RSP: 0018:ffffc900754d3ea8 EFLAGS: 00010246
[  111.712278] 055: 
[  111.714213] 055: RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffc900754d3eb8
[  111.721777] 055: RDX: 0000000000000000 RSI: ffffc900754d3ec0 RDI: ffffc900754d3ed2
[  111.729342] 055: RBP: ffffc900754d3ef8 R08: ffffc900754d3eb7 R09: 0000000000000000
[  111.736910] 055: R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000e84950
[  111.744476] 055: R13: 0000000037f54730 R14: 0000000000000000 R15: 0000000000000000
[  111.752045] 055:  ? arch_ima_get_secureboot+0x11e/0x190
[  111.757269] 055:  ? selinux_kernel_read_file+0x20/0x20
[  111.762406] 055:  ? ima_load_data+0x46/0xa0
[  111.766593] 055:  ? __x64_sys_kexec_load+0x44/0xd0
[  111.771384] 055:  ? do_syscall_64+0x5a/0x260
[  111.775657] 055:  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  111.791742] 055: RIP: 0010:          (null)
[  111.795930] 055: Code: Bad RIP value.
[  111.799595] 055: RSP: 0018:ffffc900754d3ea8 EFLAGS: 00010246
[  111.805254] 055: 
[  111.807185] 055: RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffc900754d3eb8
[  111.814750] 055: RDX: 0000000000000000 RSI: ffffc900754d3ec0 RDI: ffffc900754d3ed2
[  111.822309] 055: RBP: ffffc900754d3ef8 R08: ffffc900754d3eb7 R09: 0000000000000000
[  111.829873] 055: R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000e84950
[  111.837440] 055: R13: 0000000037f54730 R14: 0000000000000000 R15: 0000000000000000
[  111.845008] 055: FS:  00007fe31bffa740(0000) GS:ffff8897de200000(0000) knlGS:0000000000000000
[  111.853525] 055: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  111.859704] 055: CR2: ffffffffffffffd6 CR3: 00000017d47da004 CR4: 00000000007606e0
[  111.867272] 055: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  111.874837] 055: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  111.882401] 055: PKRU: 55555554
[  111.885540] 055: Kernel panic - not syncing: Fatal exception
[  111.891314] 055: Kernel Offset: disabled

...plus a bunch of "unexpected reschedule of offline CPU" messages.

kmsg_dump() calls kmsg_dump_rewind(), which calls mutex_lock().  This would
be broken even on non-RT.

It's not clear what kmsg_dump_lock is even supposed to be protecting --
kmsg_dump_get_buffer() does not take the lock, and the only caller to the
locked version of kmsg_dump_get_line() is in arch/um/kernel/kmsg_dump.c.

-Scott



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] printk: kmsg_dump: remove mutex usage
  2019-04-23 21:56 kmsg_dump() sleeping while atomic Scott Wood
@ 2019-04-24 14:36 ` John Ogness
  2019-04-24 20:07   ` Scott Wood
  2019-04-30 15:10   ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 5+ messages in thread
From: John Ogness @ 2019-04-24 14:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Scott Wood, linux-rt-users

The kmsg dumper can be called from any context, but the dumping
helpers were using a mutex to synchronize the iterator against
concurrent dumps.

Rather than trying to synchronize the iterator, use a local copy
of the iterator during the dump. Then no synchronization is
required.

Reported-by: Scott Wood <swood@redhat.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 I tested this patch by creating a custom dumper and triggering
 it from NMI context.

 kernel/printk/printk.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 997d07b6bf97..7d3522e0bcec 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -359,8 +359,6 @@ static u64 syslog_seq;
 static size_t syslog_partial;
 static bool syslog_time;
 
-static DEFINE_MUTEX(kmsg_dump_lock);
-
 /* the last printk record at the last 'clear' command */
 static u64 clear_seq;
 
@@ -2820,6 +2818,7 @@ module_param_named(always_kmsg_dump, always_kmsg_dump, bool, S_IRUGO | S_IWUSR);
  */
 void kmsg_dump(enum kmsg_dump_reason reason)
 {
+	struct kmsg_dumper dumper_local;
 	struct kmsg_dumper *dumper;
 
 	if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
@@ -2830,16 +2829,18 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 		if (dumper->max_reason && reason > dumper->max_reason)
 			continue;
 
-		/* initialize iterator with data about the stored records */
-		dumper->active = true;
+		/*
+		 * use a local copy to avoid modifying the
+		 * iterator used by any other cpus/contexts
+		 */
+		memcpy(&dumper_local, dumper, sizeof(dumper_local));
 
-		kmsg_dump_rewind(dumper);
+		/* initialize iterator with data about the stored records */
+		dumper_local.active = true;
+		kmsg_dump_rewind(&dumper_local);
 
 		/* invoke dumper which will iterate over records */
-		dumper->dump(dumper, reason);
-
-		/* reset iterator */
-		dumper->active = false;
+		dumper_local.dump(&dumper_local, reason);
 	}
 	rcu_read_unlock();
 }
@@ -2951,9 +2952,7 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog,
 {
 	bool ret;
 
-	mutex_lock(&kmsg_dump_lock);
 	ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
-	mutex_unlock(&kmsg_dump_lock);
 
 	return ret;
 }
@@ -3105,9 +3104,7 @@ void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
  */
 void kmsg_dump_rewind(struct kmsg_dumper *dumper)
 {
-	mutex_lock(&kmsg_dump_lock);
 	kmsg_dump_rewind_nolock(dumper);
-	mutex_unlock(&kmsg_dump_lock);
 }
 EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] printk: kmsg_dump: remove mutex usage
  2019-04-24 14:36 ` [PATCH] printk: kmsg_dump: remove mutex usage John Ogness
@ 2019-04-24 20:07   ` Scott Wood
  2019-04-25  6:51     ` John Ogness
  2019-04-30 15:10   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 5+ messages in thread
From: Scott Wood @ 2019-04-24 20:07 UTC (permalink / raw)
  To: John Ogness, Sebastian Andrzej Siewior; +Cc: linux-rt-users

On Wed, 2019-04-24 at 16:36 +0200, John Ogness wrote:
> @@ -2830,16 +2829,18 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>  		if (dumper->max_reason && reason > dumper->max_reason)
>  			continue;
>  
> -		/* initialize iterator with data about the stored records */
> -		dumper->active = true;
> +		/*
> +		 * use a local copy to avoid modifying the
> +		 * iterator used by any other cpus/contexts
> +		 */
> +		memcpy(&dumper_local, dumper, sizeof(dumper_local));
>  
> -		kmsg_dump_rewind(dumper);
> +		/* initialize iterator with data about the stored records */
> +		dumper_local.active = true;
> +		kmsg_dump_rewind(&dumper_local);
>  
>  		/* invoke dumper which will iterate over records */
> -		dumper->dump(dumper, reason);
> -
> -		/* reset iterator */
> -		dumper->active = false;
> +		dumper_local.dump(&dumper_local, reason);
>  	}

When would a dumper (or anything else that checks it) ever see active be
false?

>  	rcu_read_unlock();
>  }
> @@ -2951,9 +2952,7 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper,
> bool syslog,
>  {
>  	bool ret;
>  
> -	mutex_lock(&kmsg_dump_lock);
>  	ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
> -	mutex_unlock(&kmsg_dump_lock);
>  
>  	return ret;
>  }
> @@ -3105,9 +3104,7 @@ void kmsg_dump_rewind_nolock(struct kmsg_dumper
> *dumper)
>   */
>  void kmsg_dump_rewind(struct kmsg_dumper *dumper)
>  {
> -	mutex_lock(&kmsg_dump_lock);
>  	kmsg_dump_rewind_nolock(dumper);
> -	mutex_unlock(&kmsg_dump_lock);
>  }
>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  

Any reason not to get rid of the wrappers now that the lock's gone?

-Scott



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] printk: kmsg_dump: remove mutex usage
  2019-04-24 20:07   ` Scott Wood
@ 2019-04-25  6:51     ` John Ogness
  0 siblings, 0 replies; 5+ messages in thread
From: John Ogness @ 2019-04-25  6:51 UTC (permalink / raw)
  To: Scott Wood; +Cc: Sebastian Andrzej Siewior, linux-rt-users

On 2019-04-24, Scott Wood <swood@redhat.com> wrote:
> On Wed, 2019-04-24 at 16:36 +0200, John Ogness wrote:
>> @@ -2830,16 +2829,18 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>>  		if (dumper->max_reason && reason > dumper->max_reason)
>>  			continue;
>>  
>> -		/* initialize iterator with data about the stored records */
>> -		dumper->active = true;
>> +		/*
>> +		 * use a local copy to avoid modifying the
>> +		 * iterator used by any other cpus/contexts
>> +		 */
>> +		memcpy(&dumper_local, dumper, sizeof(dumper_local));
>>  
>> -		kmsg_dump_rewind(dumper);
>> +		/* initialize iterator with data about the stored records */
>> +		dumper_local.active = true;
>> +		kmsg_dump_rewind(&dumper_local);
>>  
>>  		/* invoke dumper which will iterate over records */
>> -		dumper->dump(dumper, reason);
>> -
>> -		/* reset iterator */
>> -		dumper->active = false;
>> +		dumper_local.dump(&dumper_local, reason);
>>  	}
>
> When would a dumper (or anything else that checks it) ever see active be
> false?

A valid question. Originally I wanted to completely remove the active
flag. But really the rt patchset is not the place for these kinds of
changes. I am currently reworking printk (for mainline) and I will
evaluate the purpose/usefulness of the active flag for that work.

>>  	rcu_read_unlock();
>>  }
>> @@ -2951,9 +2952,7 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper,
>> bool syslog,
>>  {
>>  	bool ret;
>>  
>> -	mutex_lock(&kmsg_dump_lock);
>>  	ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
>> -	mutex_unlock(&kmsg_dump_lock);
>>  
>>  	return ret;
>>  }
>> @@ -3105,9 +3104,7 @@ void kmsg_dump_rewind_nolock(struct kmsg_dumper
>> *dumper)
>>   */
>>  void kmsg_dump_rewind(struct kmsg_dumper *dumper)
>>  {
>> -	mutex_lock(&kmsg_dump_lock);
>>  	kmsg_dump_rewind_nolock(dumper);
>> -	mutex_unlock(&kmsg_dump_lock);
>>  }
>>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>>  
>
> Any reason not to get rid of the wrappers now that the lock's gone?

I wanted my patch to be as less intrusive as possible. For my mainline
work I will look into eliminating the wrappers.

John Ogness

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] printk: kmsg_dump: remove mutex usage
  2019-04-24 14:36 ` [PATCH] printk: kmsg_dump: remove mutex usage John Ogness
  2019-04-24 20:07   ` Scott Wood
@ 2019-04-30 15:10   ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-30 15:10 UTC (permalink / raw)
  To: John Ogness; +Cc: Scott Wood, linux-rt-users

On 2019-04-24 16:36:04 [+0200], John Ogness wrote:
> The kmsg dumper can be called from any context, but the dumping
> helpers were using a mutex to synchronize the iterator against
> concurrent dumps.
> 
> Rather than trying to synchronize the iterator, use a local copy
> of the iterator during the dump. Then no synchronization is
> required.
> 
> Reported-by: Scott Wood <swood@redhat.com>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Applied.

Sebastian

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-04-30 15:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-04-23 21:56 kmsg_dump() sleeping while atomic Scott Wood
2019-04-24 14:36 ` [PATCH] printk: kmsg_dump: remove mutex usage John Ogness
2019-04-24 20:07   ` Scott Wood
2019-04-25  6:51     ` John Ogness
2019-04-30 15:10   ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).