All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario
@ 2024-01-24  7:24 Maksim Kiselev
  2024-01-24  7:24 ` [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity Maksim Kiselev
  2024-01-24 15:30 ` [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Jens Axboe
  0 siblings, 2 replies; 5+ messages in thread
From: Maksim Kiselev @ 2024-01-24  7:24 UTC (permalink / raw)
  Cc: Justin Sanders, Jens Axboe, linux-block, linux-kernel,
	Maksim Kiselev

Hi all,

I'm using an AoE device on board with D1 SoC (riscv arch).
When I enabled CONFIG_PROVE_LOCKING option, I got the warning below.
After some investigations, I made a not very elegant, but working solution.
The patch sent with next message.

I would be glad to know what you think about this.

[ 3039.629033] aoe: 0050b6b92f30 e1.1 v4019 has 264192 sectors                                            
[ 3039.637315]                                                                                            
[ 3039.638831] =====================================================                                      
[ 3039.644921] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected                                
[ 3039.651534] 6.7.0-next-20240119-00003-gc826d21919ac-dirty #7 Not tainted                               
[ 3039.658233] -----------------------------------------------------                                      
[ 3039.664322] kworker/0:0/8 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:                                
[ 3039.670940] ffffffd803082a60 (&bdev->bd_size_lock){+.+.}-{2:2}, at: bdev_set_nr_sectors+0x20/0x44      
[ 3039.679854]                                                                                            
[ 3039.679854] and this task is already holding:                                                          
[ 3039.685684] ffffffd803b44238 (&d->lock){..-.}-{2:2}, at: aoeblk_gdalloc+0xe8/0x322                     
[ 3039.693298] which would create a new lock dependency:                                                  
[ 3039.698349]  (&d->lock){..-.}-{2:2} -> (&bdev->bd_size_lock){+.+.}-{2:2}                               
[ 3039.705089]                                                                                            
[ 3039.705089] but this new dependency connects a SOFTIRQ-irq-safe lock:                                  
[ 3039.713002]  (&d->lock){..-.}-{2:2}                                                                    
[ 3039.713023]                                                                                            
[ 3039.713023] ... which became SOFTIRQ-irq-safe at:                                                      
[ 3039.722682]   __lock_acquire+0x730/0x170c                                                              
[ 3039.726704]   lock_acquire+0xb2/0x21a                                                                  
[ 3039.730375]   _raw_spin_lock_irqsave+0x3a/0x64                                                         
[ 3039.734833]   aoecmd_cfg_rsp+0xc6/0x43a                                                                
[ 3039.738683]   aoenet_rcv+0x19a/0x1c2                                                                   
[ 3039.742267]   __netif_receive_skb_list_core+0x13c/0x146                                                
[ 3039.747510]   netif_receive_skb_list_internal+0x19e/0x2ae                                              
[ 3039.752920]   napi_complete_done+0x4a/0x16a                                                            
[ 3039.757115]   r8152_poll+0x20c/0x79c                                                                   
[ 3039.760700]   __napi_poll.constprop.0+0x26/0x19a                                                       
[ 3039.765328]   net_rx_action+0x10e/0x226                                                                
[ 3039.769175]   __do_softirq+0x180/0x3a4                                                                 
[ 3039.772937]   irq_exit_rcu+0x7e/0xa6                                                                   
[ 3039.776524]   handle_riscv_irq+0x40/0x4e                                                               
[ 3039.780466]   call_on_irq_stack+0x1c/0x28                                                              
[ 3039.784486]                                                                                            
[ 3039.784486] to a SOFTIRQ-irq-unsafe lock:                                                              
[ 3039.789970]  (&bdev->bd_size_lock){+.+.}-{2:2}                                                         
[ 3039.789992]                                                                                            
[ 3039.789992] ... which became SOFTIRQ-irq-unsafe at:                                                    
[ 3039.800777] ...                                                                                        
[ 3039.800783]   __lock_acquire+0x3a2/0x170c                                                              
[ 3039.806548]   lock_acquire+0xb2/0x21a                                                                  
[ 3039.810221]   _raw_spin_lock+0x2c/0x40                                                                 
[ 3039.813980]   bdev_set_nr_sectors+0x20/0x44                                                            
[ 3039.818173]   set_capacity+0xe/0x16                                                                    
[ 3039.821680]   zram_add+0xdc/0x1fa                                                                      
[ 3039.825006]   zram_init+0xf8/0x124                                                                     
[ 3039.828422]   do_one_initcall+0x4e/0x1ca                                                               
[ 3039.832356]   kernel_init_freeable+0x1be/0x218                                                         
[ 3039.836812]   kernel_init+0x1e/0x10a                                                                   
[ 3039.840405]   ret_from_fork+0xe/0x1c                                                                   
[ 3039.843991]                                                                                            
[ 3039.843991] other info that might help us debug this:                                                  
[ 3039.843991]                                                                                            
[ 3039.851990]  Possible interrupt unsafe locking scenario:                                               
[ 3039.851990]                                                                                            
[ 3039.858773]        CPU0                    CPU1                                                        
[ 3039.863302]        ----                    ----                                                        
[ 3039.867829]   lock(&bdev->bd_size_lock);                                                               
[ 3039.871762]                                local_irq_disable();                                        
[ 3039.877678]                                lock(&d->lock);                                             
[ 3039.883172]                                lock(&bdev->bd_size_lock);                                  
[ 3039.889619]   <Interrupt>                                                                              
[ 3039.892241]     lock(&d->lock);                                                                        
[ 3039.895393]                                                                                            
[ 3039.895393]  *** DEADLOCK ***                                                                          
[ 3039.895393]                                                                                            
[ 3039.901309] 3 locks held by kworker/0:0/8:                                                             
[ 3039.905409]  #0: ffffffd802a95530 ((wq_completion)aoe_wq){+.+.}-{0:0}, at: process_one_work+0x110/0x3ce
[ 3039.914849]  #1: ffffffc800043db8 ((work_completion)(&d->work)){+.+.}-{0:0}, at: process_one_work+0x110
/0x3ce                                                                                                    
[ 3039.924799]  #2: ffffffd803b44238 (&d->lock){..-.}-{2:2}, at: aoeblk_gdalloc+0xe8/0x322                
[ 3039.932848]                                                                                            
[ 3039.932848] the dependencies between SOFTIRQ-irq-safe lock and the holding lock:                       
[ 3039.941715] -> (&d->lock){..-.}-{2:2} {                                                                
[ 3039.945576]    IN-SOFTIRQ-W at:                                                                        
[ 3039.948724]                     __lock_acquire+0x730/0x170c                                            
[ 3039.954307]                     lock_acquire+0xb2/0x21a                                                
[ 3039.959538]                     _raw_spin_lock_irqsave+0x3a/0x64                                       
[ 3039.965554]                     aoecmd_cfg_rsp+0xc6/0x43a                                              
[ 3039.970966]                     aoenet_rcv+0x19a/0x1c2                                                 
[ 3039.976112]                     __netif_receive_skb_list_core+0x13c/0x146                              
[ 3039.982915]                     netif_receive_skb_list_internal+0x19e/0x2ae                            
[ 3039.989885]                     napi_complete_done+0x4a/0x16a                                          
[ 3039.995641]                     r8152_poll+0x20c/0x79c                                                 
[ 3040.000788]                     __napi_poll.constprop.0+0x26/0x19a                                     
[ 3040.006977]                     net_rx_action+0x10e/0x226                                              
[ 3040.012384]                     __do_softirq+0x180/0x3a4                                               
[ 3040.017706]                     irq_exit_rcu+0x7e/0xa6                                                 
[ 3040.022854]                     handle_riscv_irq+0x40/0x4e                                             
[ 3040.028355]                     call_on_irq_stack+0x1c/0x28                                            
[ 3040.033937]    INITIAL USE at:                                                                         
[ 3040.036999]                    __lock_acquire+0x3ae/0x170c                                             
[ 3040.042492]                    lock_acquire+0xb2/0x21a                                                 
[ 3040.047638]                    _raw_spin_lock_irqsave+0x3a/0x64                                        
[ 3040.053565]                    aoecmd_cfg_rsp+0xc6/0x43a                                               
[ 3040.058893]                    aoenet_rcv+0x19a/0x1c2                                                  
[ 3040.063951]                    __netif_receive_skb_list_core+0x13c/0x146                               
[ 3040.070663]                    netif_receive_skb_list_internal+0x19e/0x2ae                             
[ 3040.077547]                    napi_complete_done+0x4a/0x16a                                           
[ 3040.083215]                    r8152_poll+0x20c/0x79c                                                  
[ 3040.088273]                    __napi_poll.constprop.0+0x26/0x19a                                      
[ 3040.094376]                    net_rx_action+0x10e/0x226                                               
[ 3040.099696]                    __do_softirq+0x180/0x3a4                                                
[ 3040.104931]                    irq_exit_rcu+0x7e/0xa6                                                  
[ 3040.109992]                    handle_riscv_irq+0x40/0x4e                                              
[ 3040.115402]                    call_on_irq_stack+0x1c/0x28                                             
[ 3040.120896]  }                                                                                         
[ 3040.122565]  ... key      at: [<ffffffff822deea8>] __key.2+0x0/0x10                                    
[ 3040.128848]                                                                                            
[ 3040.128848] the dependencies between the lock to be acquired                                           
[ 3040.128856]  and SOFTIRQ-irq-unsafe lock:                                                              
[ 3040.139994] -> (&bdev->bd_size_lock){+.+.}-{2:2} {                                                     
[ 3040.144811]    HARDIRQ-ON-W at:                                                                        
[ 3040.147960]                     __lock_acquire+0x390/0x170c                                            
[ 3040.153542]                     lock_acquire+0xb2/0x21a                                                
[ 3040.158774]                     _raw_spin_lock+0x2c/0x40                                               
[ 3040.164093]                     bdev_set_nr_sectors+0x20/0x44                                          
[ 3040.169848]                     set_capacity+0xe/0x16                                                  
[ 3040.174912]                     zram_add+0xdc/0x1fa                                                    
[ 3040.179798]                     zram_init+0xf8/0x124                                                   
[ 3040.184775]                     do_one_initcall+0x4e/0x1ca                                             
[ 3040.190269]                     kernel_init_freeable+0x1be/0x218                                       
[ 3040.196288]                     kernel_init+0x1e/0x10a                                                 
[ 3040.201439]                     ret_from_fork+0xe/0x1c                                                 
[ 3040.206587]    SOFTIRQ-ON-W at:                                                                        
[ 3040.209735]                     __lock_acquire+0x3a2/0x170c                                            
[ 3040.215316]                     lock_acquire+0xb2/0x21a                                                
[ 3040.220546]                     _raw_spin_lock+0x2c/0x40                                               
[ 3040.225866]                     bdev_set_nr_sectors+0x20/0x44                                          
[ 3040.231618]                     set_capacity+0xe/0x16                                                  
[ 3040.236678]                     zram_add+0xdc/0x1fa                                                    
[ 3040.241563]                     zram_init+0xf8/0x124                                                   
[ 3040.246536]                     do_one_initcall+0x4e/0x1ca                                             
[ 3040.252028]                     kernel_init_freeable+0x1be/0x218                                       
[ 3040.258044]                     kernel_init+0x1e/0x10a                                                 
[ 3040.263193]                     ret_from_fork+0xe/0x1c                                                 
[ 3040.268341]    INITIAL USE at:                                                                         
[ 3040.271402]                    __lock_acquire+0x3ae/0x170c                                             
[ 3040.276896]                    lock_acquire+0xb2/0x21a                                                 
[ 3040.282041]                    _raw_spin_lock+0x2c/0x40                                                
[ 3040.287273]                    bdev_set_nr_sectors+0x20/0x44                                           
[ 3040.292939]                    set_capacity+0xe/0x16                                                   
[ 3040.297913]                    zram_add+0xdc/0x1fa                                                     
[ 3040.302712]                    zram_init+0xf8/0x124                                                    
[ 3040.307600]                    do_one_initcall+0x4e/0x1ca                                              
[ 3040.313005]                    kernel_init_freeable+0x1be/0x218                                        
[ 3040.318936]                    kernel_init+0x1e/0x10a                                                  
[ 3040.323998]                    ret_from_fork+0xe/0x1c                                                  
[ 3040.329059]  }                                                                                         
[ 3040.330728]  ... key      at: [<ffffffff822d4d18>] __key.5+0x0/0x10                                    
[ 3040.337010]  ... acquired at:                                                                          
[ 3040.339980]    check_prev_add+0xf8/0xb40                                                               
[ 3040.343913]    __lock_acquire+0xd0e/0x170c                                                             
[ 3040.348017]    lock_acquire+0xb2/0x21a                                                                 
[ 3040.351775]    _raw_spin_lock+0x2c/0x40                                                                
[ 3040.355620]    bdev_set_nr_sectors+0x20/0x44                                                           
[ 3040.359899]    set_capacity+0xe/0x16                                                                   
[ 3040.363486]    +0x180/0x322                                                              
[ 3040.367514]    aoecmd_sleepwork+0x2e/0x7c                                                              
[ 3040.371536]    process_one_work+0x176/0x3ce                                                            
[ 3040.375740]    worker_thread+0x19a/0x348                                                               
[ 3040.379674]    kthread+0xd0/0xec                                                                       
[ 3040.382918]    ret_from_fork+0xe/0x1c                                                                  
[ 3040.386596]                                                                                            
[ 3040.388091]                                                                                            
[ 3040.388091] stack backtrace:                                                                           
[ 3040.392449] CPU: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.7.0-next-20240119-00003-gc826d21919ac-dirty #7
[ 3040.401929] Hardware name: Sipeed Lichee RV Dock (DT)                                                  
[ 3040.406983] Workqueue: aoe_wq aoecmd_sleepwork                                                         
[ 3040.411450] Call Trace:                                                                                
[ 3040.413901] [<ffffffff800067b2>] dump_backtrace+0x1c/0x24                                              
[ 3040.419316] [<ffffffff809d176a>] show_stack+0x2c/0x38                                                  
[ 3040.424382] [<ffffffff809de7c2>] dump_stack_lvl+0x2e/0x48                                              
[ 3040.429798] [<ffffffff809de7f0>] dump_stack+0x14/0x1c                                                  
[ 3040.434862] [<ffffffff800511fc>] check_irq_usage+0x786/0x798                                           
[ 3040.440531] [<ffffffff80051306>] check_prev_add+0xf8/0xb40                                             
[ 3040.446026] [<ffffffff80053b22>] __lock_acquire+0xd0e/0x170c                                           
[ 3040.451693] [<ffffffff80054cc4>] lock_acquire+0xb2/0x21a                                               
[ 3040.457013] [<ffffffff809e91be>] _raw_spin_lock+0x2c/0x40                                              
[ 3040.462424] [<ffffffff8032468c>] bdev_set_nr_sectors+0x20/0x44                                         
[ 3040.468268] [<ffffffff8033ff0c>] set_capacity+0xe/0x16                                                 
[ 3040.473422] [<ffffffff805d32e2>] aoeblk_gdalloc+0x180/0x322                                            
[ 3040.479013] [<ffffffff805d5208>] aoecmd_sleepwork+0x2e/0x7c                                            
[ 3040.484599] [<ffffffff8002b612>] process_one_work+0x176/0x3ce                                          
[ 3040.490361] [<ffffffff8002ba04>] worker_thread+0x19a/0x348                                             
[ 3040.495857] [<ffffffff80033d08>] kthread+0xd0/0xec                                                     
[ 3040.500661] [<ffffffff809e9ebe>] ret_from_fork+0xe/0x1c                                                
[ 3040.538247]  etherd/e1.1: p1

Maksim Kiselev (1):
  aoe: avoid potential deadlock at set_capacity

 drivers/block/aoe/aoeblk.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity
  2024-01-24  7:24 [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Maksim Kiselev
@ 2024-01-24  7:24 ` Maksim Kiselev
  2024-01-24  9:18   ` Christoph Hellwig
  2024-01-24 15:30 ` [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Jens Axboe
  1 sibling, 1 reply; 5+ messages in thread
From: Maksim Kiselev @ 2024-01-24  7:24 UTC (permalink / raw)
  Cc: Justin Sanders, Jens Axboe, linux-block, linux-kernel,
	Maksim Kiselev

Move set_capacity() outside of the section procected by (&d->lock).
To avoid possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
[1] lock(&bdev->bd_size_lock);
                                local_irq_disable();
                            [2] lock(&d->lock);
                            [3] lock(&bdev->bd_size_lock);
   <Interrupt>
[4]  lock(&d->lock);

  *** DEADLOCK ***

Where [1](&bdev->bd_size_lock) hold by zram_add()->set_capacity().
[2]lock(&d->lock) hold by aoeblk_gdalloc(). And aoeblk_gdalloc()
is trying to acquire [3](&bdev->bd_size_lock) at set_capacity() call.
In this situation an attempt to acquire [4]lock(&d->lock) from
aoecmd_cfg_rsp() will lead to deadlock.

So the simplest solution is breaking lock dependency
[2](&d->lock) -> [3](&bdev->bd_size_lock) by moving set_capacity()
outside.

Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com>
---
 drivers/block/aoe/aoeblk.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index d2dbf8aaccb5..b1b47d88f5db 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -333,6 +333,7 @@ aoeblk_gdalloc(void *vp)
 	struct gendisk *gd;
 	mempool_t *mp;
 	struct blk_mq_tag_set *set;
+	sector_t ssize;
 	ulong flags;
 	int late = 0;
 	int err;
@@ -396,7 +397,7 @@ aoeblk_gdalloc(void *vp)
 	gd->minors = AOE_PARTITIONS;
 	gd->fops = &aoe_bdops;
 	gd->private_data = d;
-	set_capacity(gd, d->ssize);
+	ssize = d->ssize;
 	snprintf(gd->disk_name, sizeof gd->disk_name, "etherd/e%ld.%d",
 		d->aoemajor, d->aoeminor);
 
@@ -405,6 +406,8 @@ aoeblk_gdalloc(void *vp)
 
 	spin_unlock_irqrestore(&d->lock, flags);
 
+	set_capacity(gd, ssize);
+
 	err = device_add_disk(NULL, gd, aoe_attr_groups);
 	if (err)
 		goto out_disk_cleanup;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity
  2024-01-24  7:24 ` [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity Maksim Kiselev
@ 2024-01-24  9:18   ` Christoph Hellwig
  2024-01-24 10:23     ` Maxim Kiselev
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2024-01-24  9:18 UTC (permalink / raw)
  To: Maksim Kiselev; +Cc: Justin Sanders, Jens Axboe, linux-block, linux-kernel

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Just curious: what is your rason for using aeo over nbd?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity
  2024-01-24  9:18   ` Christoph Hellwig
@ 2024-01-24 10:23     ` Maxim Kiselev
  0 siblings, 0 replies; 5+ messages in thread
From: Maxim Kiselev @ 2024-01-24 10:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Justin Sanders, Jens Axboe, linux-block, linux-kernel

Hi, Christoph

ср, 24 янв. 2024 г. в 12:18, Christoph Hellwig <hch@infradead.org>:
>
> Looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Just curious: what is your rason for using aeo over nbd?

The main reason why I use aoe, because it's not require a userspace
client. All magic can be done by the kernel itself :)
Just enough to specify an aoe interface via cmdline or bootparams.

Cheers,
Maksim

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario
  2024-01-24  7:24 [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Maksim Kiselev
  2024-01-24  7:24 ` [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity Maksim Kiselev
@ 2024-01-24 15:30 ` Jens Axboe
  1 sibling, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2024-01-24 15:30 UTC (permalink / raw)
  To: Maksim Kiselev; +Cc: Justin Sanders, linux-block, linux-kernel


On Wed, 24 Jan 2024 10:24:35 +0300, Maksim Kiselev wrote:
> I'm using an AoE device on board with D1 SoC (riscv arch).
> When I enabled CONFIG_PROVE_LOCKING option, I got the warning below.
> After some investigations, I made a not very elegant, but working solution.
> The patch sent with next message.
> 
> I would be glad to know what you think about this.
> 
> [...]

Applied, thanks!

[1/1] aoe: avoid potential deadlock at set_capacity
      commit: e169bd4fb2b36c4b2bee63c35c740c85daeb2e86

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-01-24 15:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-24  7:24 [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Maksim Kiselev
2024-01-24  7:24 ` [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity Maksim Kiselev
2024-01-24  9:18   ` Christoph Hellwig
2024-01-24 10:23     ` Maxim Kiselev
2024-01-24 15:30 ` [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.