From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Joe Jin"
Subject: Re: [PATCH] Fix blkback/blktap sysfs read bug.
Date: Thu, 21 Jan 2010 15:49:28 +0800
Message-ID: <20100121074928.GA31296@joejin-pc.cn.oracle.com>
References: <20100119141338.GA22249@joejin-pc.cn.oracle.com>
 <4B55E9E7020000780002AC17@vpn.id2.novell.com>
 <20100120020605.GA25697@joejin-pc.cn.oracle.com>
 <4B56C2DB020000780002AE45@vpn.id2.novell.com>
 <20100120105136.GA6801@joejin-pc.cn.oracle.com>
 <4B56F1B6020000780002AEFA@vpn.id2.novell.com>
 <20100120114518.GA10851@joejin-pc.cn.oracle.com>
 <1264040197.12544.3679.camel@agari.van.xensource.com>
 <20100121031307.GA29727@joejin-pc.cn.oracle.com>
 <1264058809.6898.181.camel@ramone.somacoma.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1264058809.6898.181.camel@ramone.somacoma.net>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Daniel Stodden
Cc: "xen-devel@lists.xensource.com", "greg.marsden@oracle.com", Joe Jin, Jan Beulich, "deepak.patel@oracle.com", Keir Fraser
List-Id: xen-devel@lists.xenproject.org

> Your patch will work okay on 2.6.18.
>
> But collisions will deadlock after 2.6.23
>
> Found an old stack trace:
>
> [2009-07-08 06:15:08 UTC] INFO: task xb.00021.xvdd:30039 blocked for more than 120 seconds.
> [2009-07-08 06:15:08 UTC] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [2009-07-08 06:15:08 UTC] c7adfe0c 00000246 00000000 00000000 5b88fd4f 00000256 c7ade000 c7adfdc8
> [2009-07-08 06:15:08 UTC] c0107c1b 38c984a4 00000256 c7adfddc ed629578 ed6293f0 ed629578 c16bdb00
> [2009-07-08 06:15:08 UTC] 00000000 eea0d500 c16bdb34 002dc05a 00000000 00000005 0024c31c e20a8ff0
> [2009-07-08 06:15:08 UTC] Call Trace:
> [2009-07-08 06:15:08 UTC] [] ? local_clock+0x3b/0x90
> [2009-07-08 06:15:08 UTC] [] schedule_timeout+0x75/0xc0
> [2009-07-08 06:15:08 UTC] [] ? pick_next_task_fair+0x91/0xd0
> [2009-07-08 06:15:08 UTC] [] wait_for_common+0xa9/0x1c0
> [2009-07-08 06:15:08 UTC] [] ? default_wake_function+0x0/0x10
> [2009-07-08 06:15:08 UTC] [] wait_for_completion+0x12/0x20
> [2009-07-08 06:15:08 UTC] [] sysfs_addrm_finish+0x1e7/0x230
> [2009-07-08 06:15:08 UTC] [] sysfs_hash_and_remove+0x45/0x70
> [2009-07-08 06:15:08 UTC] [] remove_files+0x1b/0x30
> [2009-07-08 06:15:08 UTC] [] sysfs_remove_group+0x36/0xc0
> [2009-07-08 06:15:08 UTC] [] ? __blkdev_put+0x14f/0x160
> [2009-07-08 06:15:08 UTC] [] xenvbd_sysfs_delif+0x2c/0x60
> [2009-07-08 06:15:08 UTC] [] blkback_close+0x46/0x70
> [2009-07-08 06:15:08 UTC] [] blkif_schedule+0x583/0x5b0
> [2009-07-08 06:15:08 UTC] [] ? pick_next_task_fair+0x91/0xd0
> [2009-07-08 06:15:08 UTC] [] ? autoremove_wake_function+0x0/0x50
> [2009-07-08 06:15:08 UTC] [] ? blkif_schedule+0x0/0x5b0
> [2009-07-08 06:15:08 UTC] [] kthread+0x42/0x70
> [2009-07-08 06:15:08 UTC] [] ? kthread+0x0/0x70
> [2009-07-08 06:15:08 UTC] [] kernel_thread_helper+0x7/0x10
>
> The reason is in sysfs_deactivate(), which will sync callers against any
> remaining thread in .show()
>  - show() hangs on the lock
>  - the lock holder in sysfs_remove_group(),
>    waiting for show() to complete.
>
> Pardon me -- I'm not entirely sure where/how these patches are currently
> submitted and merged. I suppose yours are only for linux-2.6.18.hg, not
> e.g. pvops? Then sorry for any confusion.
>

Daniel,

Thanks a lot for your comments, they really help. Yes, my patch is based on
the linux-2.6.18.hg branch. As Jan pointed out in a previous email, this
should be a sysfs issue; it looks like sysfs in later kernels has fixed it?

Thanks,
Joe
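
For anyone following the thread, the collision Daniel describes above can be modelled in user space. What follows is only a rough sketch, not kernel code and not taken from either patch: `simulate`, `lock`, `show_done` and the timeout are all hypothetical names introduced for illustration. The lock stands in for whatever lock the patched show() takes, and the event stands in for the completion that sysfs_deactivate() waits on. A real kernel wait has no timeout; the timeout here exists only so the model terminates instead of hanging.

```python
import threading


def simulate(drop_lock_before_wait, timeout=0.5):
    """Return True if the simulated removal saw show() finish in time."""
    lock = threading.Lock()             # hypothetical lock that show() also takes
    show_entered = threading.Event()    # a reader is already inside show()
    remover_locked = threading.Event()  # the remover now holds the lock
    show_done = threading.Event()       # models the completion the remover waits on
    result = []

    def show():
        show_entered.set()
        remover_locked.wait()           # force the collision
        with lock:                      # blocks while the remover holds the lock
            pass
        show_done.set()

    def remove():
        show_entered.wait()
        lock.acquire()
        remover_locked.set()
        if drop_lock_before_wait:
            # Release first, then wait for the reader to drain.
            lock.release()
            result.append(show_done.wait(timeout))
        else:
            # Wait for show() while holding the lock it needs. In the kernel
            # this wait would never return; here it times out.
            result.append(show_done.wait(timeout))
            lock.release()

    threads = [threading.Thread(target=show), threading.Thread(target=remove)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


if __name__ == "__main__":
    print("wait while holding lock:", simulate(False))
    print("drop lock before wait: ", simulate(True))
```

In this model the first call times out because neither side can make progress, which is the user-space analogue of the hung task in the trace. The second call completes because the remover no longer holds the lock when it waits. Whether dropping the lock is actually safe in blkback depends on what the lock protects, which this sketch does not address.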