From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ross Lagerwall Date: Tue, 26 Mar 2019 18:49:41 +0000 Subject: [Cluster-devel] [PATCH 1/2] gfs2: Fix occasional glock use-after-free In-Reply-To: References: <20190131105543.15421-1-ross.lagerwall@citrix.com> <20190131105543.15421-2-ross.lagerwall@citrix.com> Message-ID: <9b63ba9f-fda7-91db-80b8-60bddf6733f1@citrix.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On 1/31/19 5:18 PM, Andreas Gruenbacher wrote: > Hi Ross, > > On Thu, 31 Jan 2019 at 11:56, Ross Lagerwall wrote: >> Each gfs2_bufdata stores a reference to a glock but the reference count >> isn't incremented. This causes an occasional use-after-free of the >> glock. Fix by taking a reference on the glock during allocation and >> dropping it when freeing. >> >> Found by KASAN: >> >> BUG: KASAN: use-after-free in revoke_lo_after_commit+0x8e/0xe0 [gfs2] >> Write of size 4 at addr ffff88801aff6134 by task kworker/0:2H/20371 >> >> CPU: 0 PID: 20371 Comm: kworker/0:2H Tainted: G O 4.19.0+0 #1 >> Hardware name: Dell Inc. PowerEdge R805/0D456H, BIOS 4.2.1 04/14/2010 >> Workqueue: glock_workqueue glock_work_func [gfs2] >> Call Trace: >> dump_stack+0x71/0xab >> print_address_description+0x6a/0x270 >> kasan_report+0x258/0x380 >> ? revoke_lo_after_commit+0x8e/0xe0 [gfs2] >> revoke_lo_after_commit+0x8e/0xe0 [gfs2] >> gfs2_log_flush+0x511/0xa70 [gfs2] >> ? gfs2_log_shutdown+0x1f0/0x1f0 [gfs2] >> ? __brelse+0x48/0x50 >> ? gfs2_log_commit+0x4de/0x6e0 [gfs2] >> ? gfs2_trans_end+0x18d/0x340 [gfs2] >> gfs2_ail_empty_gl+0x1ab/0x1c0 [gfs2] >> ? inode_go_dump+0xe0/0xe0 [gfs2] >> ? inode_go_sync+0xe4/0x220 [gfs2] >> inode_go_sync+0xe4/0x220 [gfs2] >> do_xmote+0x12b/0x290 [gfs2] >> glock_work_func+0x6f/0x160 [gfs2] >> process_one_work+0x461/0x790 >> worker_thread+0x69/0x6b0 >> ? process_one_work+0x790/0x790 >> kthread+0x1ae/0x1d0 >> ? kthread_create_worker_on_cpu+0xc0/0xc0 >> ret_from_fork+0x22/0x40 > > thanks for tracking this down, very interesting. > > The consistency model here is that every buffer head that a struct > gfs2_bufdata object is attached to is protected by a glock. Before a > glock can be released, all the buffers under that glock have to be > flushed out and released; this is what allows another node to access > the same on-disk location without causing inconsistencies. When there > is a bufdata object that points to a glock that has already been > freed, this consistency model is broken. Taking an additional refcount > as this patch does may make the use-after-free go away, but it doesn't > fix the underlying problem. So I think we'll need a different fix > here. > > Did you observe this problem in a real-world scenario, or with KASAN > only? It might be that we're looking at a small race that is unlikely > to trigger in the field. In any case, I think we need to understand > better what't actually going on. > I finally got time to look into this further. The following is what happens as far as I can tell though my understanding is a bit limited at this point: 1. A file is opened, truncated and closed which results in a metadata write so a gfs2_bufdata object is created and associated with the inode glock. 2. The gfs2_bufdata is written out and placed on an AIL list. 3. The VFS layer calls gfs2_evict_inode() and the inode is evicted which drops a reference on the glock. It takes case 3 (i_nlink > 0) so no log flush or ail_flush happens. 4. The gfs2_bufdata object is however still on one of the AIL lists and the next gfs2_log_flush() begins to write a revoke entry. 5. At about the same time the glock is freed. At the point of freeing, gl_revokes is 1 (should probably have a GLOCK_BUG_ON for this). 6. gfs2_log_flush() continues and calls revoke_lo_after_commit() and uses the freed glock (stack trace above). Should evict_inode call gfs2_ail_flush() or something so that the revoke is written before it drops its reference to the glock? Or is there something else that is meant to keep the glock around if the inode is evicted while there is a linked gfs2_bufdata sitting on some AIL list? Let me know if any more specific information is needed since I now have a test setup that can usually reproduce this within 10 minutes. Thanks, -- Ross Lagerwall