Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc

public inbox for linux-bcache@vger.kernel.org

* Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Sverd Johnsen @ 2017-10-14 11:14 UTC
  To: linux-bcache

This is on 4.13.5. It happens sometimes at boot; I just reboot and it
works fine. No other problems.

[   40.391116] BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
[   40.391663] IP: _raw_spin_lock_irqsave+0x12/0x30
[   40.392152] PGD 0
[   40.392153] P4D 0
[   40.392658]
[   40.393070] bcache: bch_journal_replay() journal replay done, 21 keys in 10 entries, seq 34810
[   40.393427] bcache: register_cache() registered cache device sdc4
[   40.394669] Oops: 0002 [#1] PREEMPT SMP
[   40.395174] Modules linked in: tun(+) vhost tap kvm md_mod bcache intel_cstate snd_hda_codec_realtek intel_uncore snd_hda_codec_generic efi_pstore snd_hda_codec_hdmi intel_rapl_perf snd_hda_intel snd_hda_codec efivars snd_hwdep snd_hda_core mei_me input_leds mei led_class snd_pcm tpm_crb efivarfs algif_skcipher af_alg psmouse atkbd libps2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr shpchp fan thermal battery i8042 tpm_tis tpm_tis_core tpm acpi_pad vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[   40.396917] CPU: 2 PID: 501 Comm: bcache_allocato Not tainted 4.13.5-5-ph #1
[   40.397507] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD3/Z170X-UD3-CF, BIOS F22 03/06/2017
[   40.398095] task: ffff9efb793e5100 task.stack: ffffb546410d8000
[   40.398687] RIP: 0010:_raw_spin_lock_irqsave+0x12/0x30
[   40.399280] RSP: 0018:ffffb546410dbd60 EFLAGS: 00010046
[   40.399883] R10: ffffb5464114d000 R11: ffff9efb74fe27f8 R12: ffff9efb7b69a028
[   40.399883] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
[   40.399883] RBP: 00000000000006bc R08: ffffffffc0474800 R09: 000000000000000c
[   40.399883] RDX: 0000000000000001 RSI: 0000000000000003 RDI: 00000000000006bc
[   40.399884] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.399884] FS:  0000000000000000(0000) GS:ffff9efb8ed00000(0000) knlGS:0000000000000000
[   40.399884] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000003
[   40.399885] CR2: 00000000000006bc CR3: 0000000435928000 CR4: 00000000003406e0
[   40.399885] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   40.399885] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   40.399886] Call Trace:
[   40.399888]  ? try_to_wake_up+0x39/0x350
[   40.399891]  ? bch_bucket_alloc+0x9a/0x280 [bcache]
[   40.399892]  ? wait_woken+0x80/0x80
[   40.399893]  ? bch_prio_write+0x189/0x330 [bcache]
[   40.399895]  ? bch_allocator_thread+0x57b/0xc90 [bcache]
[   40.399896]  ? __schedule+0x18e/0x5d0
[   40.399897]  ? bch_invalidate_one_bucket+0x70/0x70 [bcache]
[   40.399898]  ? kthread+0x10e/0x130
[   40.399899]  ? kthread_create_on_node+0x60/0x60
[   40.399900]  ? ret_from_fork+0x22/0x30
[   40.399900] Code: f5 27 87 64 74 02 f3 c3 e8 08 75 86 ff c3 90 66 2e 0f 1f 84 00 00 00 00 00 53 9c 5b fa 65 ff 05 d5 27 87 64 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 2a 82 90 ff 48
[   40.399909] CR2: 00000000000006bc
[   40.399909] RIP: _raw_spin_lock_irqsave+0x12/0x30 RSP: ffffb546410dbd60
[   40.399911] ---[ end trace 3b309679f786fde8 ]---
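
In the oops above, CR2, RDI and RBP all hold 0x6bc, and the marked bytes
<f0> 0f b1 17 decode to "lock cmpxchg %edx,(%rdi)", so the lock word
itself sits at address 0x6bc. A small non-zero fault address like this
is what results when a lock embedded in a structure is taken through a
NULL structure pointer. The sketch below shows only that arithmetic, in
user-space C. struct demo_object and its padding are hypothetical,
chosen so that the member lands at offset 0x6bc; the trace alone does
not say which kernel structure was involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel object with an embedded lock.
 * The padding exists only to place the lock at offset 0x6bc. */
struct demo_object {
	unsigned char other_fields[0x6bc];
	uint32_t lock;	/* stand-in for a 4-byte spinlock word */
};

static_assert(offsetof(struct demo_object, lock) == 0x6bc,
	      "lock must sit at offset 0x6bc for this illustration");

/* Address that would be touched when locking through `base`.
 * Pure integer arithmetic: nothing is dereferenced here. */
uintptr_t demo_lock_address(uintptr_t base)
{
	return base + offsetof(struct demo_object, lock);
}
```

With a NULL base, demo_lock_address(0) evaluates to 0x6bc, the same
value the report shows in CR2.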

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Coly Li @ 2017-10-14 11:42 UTC
  To: Sverd Johnsen; +Cc: linux-bcache

On 2017/10/14 7:14 PM, Sverd Johnsen wrote:
> This is on 4.13.5. Happens sometime at boot, I just reboot and it
> works fine. No other problems.
> 
> [snip]

Hi Sverd,

From a quick glance at the code, c->data_bucket_lock, as used in
bch_alloc_sectors(), looks very suspicious. c->data_bucket_lock is
initialized in bch_open_buckets_alloc(), which is called after
kobject_init() when a cache set is allocated in bch_cache_set_alloc().

Cache and cached devices are registered through a sysfs entry, so it is
possible that a registration request is written to
/sys/fs/bcache/register before the spin lock c->data_bucket_lock is
initialized, which then triggers a NULL dereference on the spin lock.
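
The ordering concern described above can be sketched in plain
user-space C. This is not bcache code: struct demo_set and the demo_*
functions are hypothetical, and which kernel call actually makes an
object reachable from user space is not modelled. A single-threaded
model like this only shows where the window sits; it does not reproduce
the race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a cache set: one flag for "lock has been
 * initialized" and one for "reachable from user space". */
struct demo_set {
	bool lock_ready;
	bool published;
};

enum demo_result { DEMO_UNREACHABLE, DEMO_OK, DEMO_UNINIT_LOCK };

/* Stand-in for a request arriving from user space. */
enum demo_result demo_request(const struct demo_set *s)
{
	if (!s->published)
		return DEMO_UNREACHABLE;
	return s->lock_ready ? DEMO_OK : DEMO_UNINIT_LOCK;
}

/* Publish first, initialize later: a request can land in between. */
enum demo_result demo_publish_first(void)
{
	struct demo_set s = { false, false };
	enum demo_result seen;

	s.published = true;
	seen = demo_request(&s);	/* arrives inside the window */
	s.lock_ready = true;
	return seen;
}

/* Initialize everything, publish last: no window is left. */
enum demo_result demo_publish_last(void)
{
	struct demo_set s = { false, false };

	s.lock_ready = true;
	s.published = true;
	return demo_request(&s);
}

/* Usage check, callable from any main(). */
void demo_order_selfcheck(void)
{
	assert(demo_publish_first() == DEMO_UNINIT_LOCK);
	assert(demo_publish_last() == DEMO_OK);
}
```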

Normally this won't happen if the commands are typed by a human. Do you
use a script to start bcache automatically? If so, I can check further
to confirm whether my guess is correct.

Thanks.

-- 
Coly Li

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Sverd Johnsen @ 2017-10-14 12:04 UTC
  To: Coly Li; +Cc: linux-bcache

Yes. bcache-tools ships some udev rules and associated helper utilities,
which are used here.

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Coly Li @ 2017-10-14 16:22 UTC
  To: Sverd Johnsen; +Cc: linux-bcache

[-- Attachment #1: Type: text/plain, Size: 1804 bytes --]

On 2017/10/14 8:04 PM, Sverd Johnsen wrote:
> Yes. bcache-tools ships some udev rules associated helper utils that
> are used here.
> 

Hi Sverd,

Could you test the attached patch? It is an attempt to avoid the NULL
dereference; let's see whether it works.

Thanks in advance.

Coly Li


[-- Attachment #2: initiate_kobj_late.patch --]
[-- Type: text/plain, Size: 2867 bytes --]

[Patch] bcache: initialize cached device kobjects last

Bcache cached device sysfs entries are initialized earlier than other
related kernel data structures, so it is possible for a cached device
to be started by a write to its sysfs entry while its kernel resources,
e.g. the allocator thread or the data bucket spin_lock, are not yet
initialized. This kind of race triggers a kernel panic through a NULL
dereference.

This patch moves the creation of the related kobjects to make sure that
all necessary kernel resources of the cached device are initialized
before its sysfs entries become available to user space.

Signed-off-by: Coly Li <colyli@suse.de>
---
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fc0a31b13ac4..e1c02d869e0d 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1104,12 +1104,10 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 	INIT_LIST_HEAD(&dc->list);
 	closure_init(&dc->disk.cl, NULL);
 	set_closure_fn(&dc->disk.cl, cached_dev_flush, system_wq);
-	kobject_init(&dc->disk.kobj, &bch_cached_dev_ktype);
 	INIT_WORK(&dc->detach, cached_dev_detach_finish);
 	sema_init(&dc->sb_write_mutex, 1);
 	INIT_LIST_HEAD(&dc->io_lru);
 	spin_lock_init(&dc->io_lock);
-	bch_cache_accounting_init(&dc->accounting, &dc->disk.cl);
 
 	dc->sequential_cutoff		= 4 << 20;
 
@@ -1138,6 +1136,10 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 
 	bch_cached_dev_request_init(dc);
 	bch_cached_dev_writeback_init(dc);
+	bch_cache_accounting_init(&dc->accounting, &dc->disk.cl);
+
+	kobject_init(&dc->disk.kobj, &bch_cached_dev_ktype);
+
 	return 0;
 }
 
@@ -1467,11 +1469,6 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	closure_set_stopped(&c->cl);
 	closure_put(&c->cl);
 
-	kobject_init(&c->kobj, &bch_cache_set_ktype);
-	kobject_init(&c->internal, &bch_cache_set_internal_ktype);
-
-	bch_cache_accounting_init(&c->accounting, &c->cl);
-
 	memcpy(c->sb.set_uuid, sb->set_uuid, 16);
 	c->sb.block_size	= sb->block_size;
 	c->sb.bucket_size	= sb->bucket_size;
@@ -1534,6 +1531,10 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	c->congested_write_threshold_us	= 20000;
 	c->error_limit	= 8 << IO_ERROR_SHIFT;
 
+	kobject_init(&c->kobj, &bch_cache_set_ktype);
+	kobject_init(&c->internal, &bch_cache_set_internal_ktype);
+	bch_cache_accounting_init(&c->accounting, &c->cl);
+
 	return c;
 err:
 	bch_cache_set_unregister(c);
@@ -1815,7 +1816,6 @@ static int cache_alloc(struct cache *ca)
 	struct bucket *b;
 
 	__module_get(THIS_MODULE);
-	kobject_init(&ca->kobj, &bch_cache_ktype);
 
 	bio_init(&ca->journal.bio, ca->journal.bio.bi_inline_vecs, 8);
 
@@ -1839,6 +1839,8 @@ static int cache_alloc(struct cache *ca)
 	for_each_bucket(b, ca)
 		atomic_set(&b->pin, 0);
 
+	kobject_init(&ca->kobj, &bch_cache_ktype);
+
 	return 0;
 }
 

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Michael Lyle @ 2017-10-14 21:23 UTC
  To: Coly Li, Sverd Johnsen; +Cc: linux-bcache

Hey Coly--

On 10/14/2017 09:22 AM, Coly Li wrote:
> On 2017/10/14 8:04 PM, Sverd Johnsen wrote:
>> Yes. bcache-tools ships some udev rules associated helper utils that
>> are used here.
>>
> 
> Hi Sverd,
> 
> Is it possible for you to test the attached patch ? This is an effort to
> avoid NULL dereference issue, let's try whether it works.
> 
> Thanks in advance.
> 
> Coly Li

It looks like the patch moves some kobject_init() calls around, and it's
a better factoring.  But for sysfs to be accessed while the structures
aren't available, doesn't the access have to come after the
kobject_add() calls, which happen much later?

Isn't this reported issue the one that you fixed earlier in
"bcache: check ca->alloc_thread initialized before wake up it"?

It seems like we have a lot of issues with bringing up devices and
tearing them down/detaching.

Mike

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Coly Li @ 2017-10-14 22:02 UTC
  To: Michael Lyle; +Cc: Sverd Johnsen, linux-bcache

On 2017/10/15 5:23 AM, Michael Lyle wrote:
> Hey Coly--
> 
> On 10/14/2017 09:22 AM, Coly Li wrote:
>> On 2017/10/14 8:04 PM, Sverd Johnsen wrote:
>>> Yes. bcache-tools ships some udev rules associated helper utils that
>>> are used here.
>>>
>>
>> Hi Sverd,
>>
>> Is it possible for you to test the attached patch ? This is an effort to
>> avoid NULL dereference issue, let's try whether it works.
>>
>> Thanks in advance.
>>
>> Coly Li
> 
> It looks like in the patch that you move some kobject_init's around and
> it's a better factoring.  But for the sysfs to be accessed when the
> structures aren't available, doesn't it have to be after the
> kobject_adds, which is much later?
> 

Hi Mike,

I checked the code; moving the creation of the sysfs objects causes no
obvious logic change. I also ran a test for this change: a bash script
echoes 1 into the stop files and then re-registers the cache and cached
devices.

The dmesg output shows:
[ 5646.739946] bcache: bch_journal_replay() journal replay done, 3406 keys in 8 entries, seq 15762
[ 5646.740120] bcache: register_cache() registered cache device nvme1n1p1
[ 5647.074204] bcache: register_bdev() registered backing device md0
[ 5647.145510] bcache: bch_cached_dev_attach() Caching md0 as bcache0 on set 76abb9d9-aea3-4e1a-9b4a-46a217963834
[ 5664.285232] bcache: bcache_device_free() bcache0 stopped
[ 5664.322575] bcache: cache_set_free() Cache set 76abb9d9-aea3-4e1a-9b4a-46a217963834 unregistered
[ 5674.928248] bcache: bch_journal_replay() journal replay done, 4063 keys in 11 entries, seq 15859
[ 5674.928413] bcache: register_cache() registered cache device nvme1n1p1
[ 5675.257664] bcache: register_bdev() registered backing device md0
[ 5675.347494] bcache: bch_cached_dev_attach() Caching md0 as bcache0 on set 76abb9d9-aea3-4e1a-9b4a-46a217963834
[ 5692.665452] bcache: bcache_device_free() bcache0 stopped

I ran the test for a while and nothing seemed broken, so I posted this
draft version.

> Isn't this reported issue the one that you fixed earlier in
> "bcache: check ca->alloc_thread initialized before wake up it"?
> 

They are different NULL dereference locations: my previous fix addresses
a NULL dereference when waking up the allocator thread, while this one
is a NULL dereference on the data bucket spinlock.

This NULL spinlock is a bit more complex. I could not find a proper way
to handle it in the callers that reference the data bucket spinlock, so
I changed the order of the function calls to make sure the lock is
initialized before anything references it.

> It seems like we have a lot of issues with bringing up devices and
> tearing them down/detaching.

Yes. If you look at Liang Chen's patch "bcache: explicitly destroy mutex
while exiting", you will find that he moves
sysfs_create_files(bcache_kobj, files) after bch_request_init() for a
similar reason.

-- 
Coly Li

* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Michael Lyle @ 2017-10-14 22:34 UTC
  To: Coly Li; +Cc: Sverd Johnsen, linux-bcache

On 10/14/2017 03:02 PM, Coly Li wrote:
[snip]

Hey Coly,--

My bad: I misread the trace and your commit log.  This makes more sense.

Does this re-ordering also prevent the other alloc_thread issue from
being triggered?

I feel like we should maybe be holding a big lock that excludes sysfs
actions during these critical times-- maybe make the register_lock
bigger-- so that it is not as critical to handle these dependencies.  It
will be a little less responsive/concurrent during registration but this
is just a boot-time or shutdown issue, so...

Mike


* Re: Comm: bcache_allocato BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
From: Coly Li @ 2017-10-15 10:27 UTC
  To: Michael Lyle; +Cc: Sverd Johnsen, linux-bcache

On 2017/10/15 6:34 AM, Michael Lyle wrote:
> On 10/14/2017 03:02 PM, Coly Li wrote:
> [snip]
> 
> Hey Coly,--
> 
> My bad: I misread the trace and your commit log.  This makes more sense.
> 
> Does this re-ordering also prevent the other alloc_thread issue from
> being triggered?
> 

So far my answer is: not yet. I am not sure that creating the sysfs
entries late prevents this kind of NULL dereference. This is a draft
patch to try out; there might be some other cause I haven't caught yet.

When waking up the allocator thread, if the thread pointer is not
initialized, it is safe to skip the wake-up and try again next time. But
this NULL dereference happens in the bucket allocation code path, where
simply ignoring it or returning an error may cause bcache to fail to
start.
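
The "skip the wake-up if the thread pointer is not set" idea can be
sketched as follows. This is a user-space illustration under stated
assumptions, not the actual fix: struct demo_task, struct demo_cache
and demo_wake_allocator() are hypothetical names, and the counter only
stands in for a real wake-up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's task_struct or bcache's
 * struct cache. */
struct demo_task {
	int wakeups;	/* counts wake-ups instead of scheduling */
};

struct demo_cache {
	struct demo_task *alloc_thread;	/* NULL until the thread exists */
};

/* Wake the allocator only if it exists.
 * Returns 1 if it was woken, 0 if the wake-up was skipped. */
int demo_wake_allocator(struct demo_cache *ca)
{
	if (ca->alloc_thread == NULL)
		return 0;	/* too early: skip, a later call retries */
	ca->alloc_thread->wakeups++;
	return 1;
}

/* Usage check, callable from any main(). */
void demo_wake_selfcheck(void)
{
	struct demo_task t = { 0 };
	struct demo_cache ca = { NULL };

	assert(demo_wake_allocator(&ca) == 0);	/* skipped, no crash */
	ca.alloc_thread = &t;
	assert(demo_wake_allocator(&ca) == 1);
	assert(t.wakeups == 1);
}
```

A skipped wake-up costs nothing here because a later call can retry. An
allocation path has no such cheap fallback, which is the contrast drawn
above.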


> I feel like we should maybe be holding a big lock that excludes sysfs
> actions during these critical times-- maybe make the register_lock
> bigger-- so that it is not as critical to handle these dependencies.  It
> will be a little less responsive/concurrent during registration but this
> is just a boot-time or shutdown issue, so...

Noted. I will keep this suggestion in mind; I need to look closely at
the code dependencies. I will post a status update later.

Thanks.

Coly Li

-- 
Coly Li
