From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andreas Gruenbacher
Date: Fri, 15 Mar 2019 21:58:12 +0100
Subject: [Cluster-devel] [PATCH] gfs2: Prevent writeback in gfs2_file_write_iter
In-Reply-To: <05f91ec0-106f-703f-042b-88d2f65f112e@citrix.com>
References: <05f91ec0-106f-703f-042b-88d2f65f112e@citrix.com>
 <20190313171322.23308-1-agruenba@redhat.com>
Message-ID: <20190315205812.22727-1-agruenba@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi Ross,

On Thu, 14 Mar 2019 at 12:18, Ross Lagerwall wrote:
> On 3/13/19 5:13 PM, Andreas Gruenbacher wrote:
> > Hi Edwin,
> >
> > On Wed, 6 Mar 2019 at 12:08, Edwin Török wrote:
> >> Hello,
> >>
> >> I've been trying to debug a GFS2 deadlock that we see in our lab
> >> quite frequently with a 4.19 kernel. With 4.4 and older kernels we
> >> were not able to reproduce this.
> >> See below for lockdep dumps and stacktraces.
> >
> > thanks for the thorough bug report. Does the below fix work for you?
>
> Hi Andreas,
>
> I've tested the patch and it doesn't fix the issue. As far as I can see,
> current->backing_dev_info is not used by any of the code called from
> balance_dirty_pages_ratelimited(), so I don't see how it could work.

Yes, I see now.

> I found a way of consistently reproducing the issue almost immediately
> (tested with the latest master commit):
>
> # cat a.py
> import os
>
> fd = os.open("f", os.O_CREAT|os.O_TRUNC|os.O_WRONLY)
>
> for i in range(1000):
>     os.mkdir("xxx" + str(i), 0777)
>
> buf = 'x' * 4096
>
> while True:
>     count = os.write(fd, buf)
>     if count <= 0:
>         break
>
> # cat b.py
> import os
> while True:
>     os.mkdir("x", 0777)
>     os.rmdir("x")
>
> # echo 8192 > /proc/sys/vm/dirty_bytes
> # cd /gfs2mnt
> # (mkdir tmp1; cd tmp1; python2 ~/a.py) &
> # (mkdir tmp2; cd tmp2; python2 ~/a.py) &
> # (mkdir tmp3; cd tmp3; python2 ~/b.py) &
>
> This should deadlock almost immediately. One of the processes will be
> waiting in balance_dirty_pages() and holding sd_log_flush_lock, and
> several others will be waiting for sd_log_flush_lock.

This doesn't work for me: the python processes don't even start properly
when dirty_bytes is set so low.

> I came up with the following patch which seems to resolve the issue by
> failing to write the inode if it can't take the lock, but it seems
> like a dirty workaround rather than a proper fix:
>
> [...]

Looking at ext4_dirty_inode, it seems that we should just be able to bail
out of gfs2_write_inode and return 0 when PF_MEMALLOC is set in
current->flags. Also, we should probably add the current->flags checks
from xfs_do_writepage to gfs2_writepage_common.

So what do you get with the below patch?
Thanks,
Andreas

---
 fs/gfs2/aops.c  | 7 +++++++
 fs/gfs2/super.c | 4 ++++
 2 files changed, 11 insertions(+)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 05dd78f..694ff91 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -102,6 +102,13 @@ static int gfs2_writepage_common(struct page *page,
 	pgoff_t end_index = i_size >> PAGE_SHIFT;
 	unsigned offset;
 
+	/* (see xfs_do_writepage) */
+	if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
+			 PF_MEMALLOC))
+		goto redirty;
+	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
+		goto redirty;
+
 	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
 		goto out;
 	if (current->journal_info)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index ca71163..540535c 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -756,6 +756,10 @@ static int gfs2_write_inode(struct inode *inode, struct writeback_control *wbc)
 	int ret = 0;
 	bool flush_all = (wbc->sync_mode == WB_SYNC_ALL || gfs2_is_jdata(ip));
 
+	/* (see ext4_dirty_inode) */
+	if (current->flags & PF_MEMALLOC)
+		return 0;
+
 	if (flush_all)
 		gfs2_log_flush(GFS2_SB(inode), ip->i_gl,
 			       GFS2_LOG_HEAD_FLUSH_NORMAL |
-- 
1.8.3.1