From: Jeff Layton <jlayton@redhat.com>
To: penglaiyxy <penglaiyxy@gmail.com>
Cc: Ilya Dryomov <idryomov@gmail.com>,
ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Re: [PATCH] libceph: force GFP_NOIO for socket allocations
Date: Wed, 22 Mar 2017 22:26:11 -0400
Message-ID: <1490235971.3921.7.camel@redhat.com>
In-Reply-To: <201703230858170977508@gmail.com>

I think you're correct that NOFS would have prevented the recursion
shown in the stack trace below.

However, if you (for instance) had a userland program that was
accessing the krbd device directly with buffered I/O, then I think you
could still end up deadlocked here.

NOIO is more restrictive than NOFS and should prevent that situation in
addition to the one in the patch description.

-- Jeff

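To make the NOFS/NOIO distinction above concrete, here is a minimal userspace sketch of how a scoped NOIO flag narrows an allocation mask. It is not kernel code: every `MODEL_*` and `model_*` name is hypothetical, and the bit values are illustrative. It assumes only the documented relationship that GFP_KERNEL permits reclaim to do both I/O and filesystem work, GFP_NOFS drops the filesystem permission, and GFP_NOIO drops both.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's reclaim permission bits. */
#define MODEL_GFP_IO      0x1u  /* reclaim may start physical I/O */
#define MODEL_GFP_FS      0x2u  /* reclaim may call into filesystem code */
#define MODEL_GFP_RECLAIM 0x4u  /* reclaim is allowed at all */

#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)

/* Hypothetical per-task state standing in for a task flag such as PF_MEMALLOC_NOIO. */
struct model_task {
	bool noio;
};

/* Enter a NOIO scope and return the previous state so scopes can nest. */
static unsigned int model_noio_save(struct model_task *t)
{
	unsigned int old = t->noio ? 1u : 0u;

	t->noio = true;
	return old;
}

/* Leave the scope by putting back whatever state was saved. */
static void model_noio_restore(struct model_task *t, unsigned int old)
{
	t->noio = (old != 0);
}

/* Inside a NOIO scope, strip both the I/O and the FS permission from the mask. */
static unsigned int model_effective_gfp(const struct model_task *t,
					unsigned int gfp)
{
	if (t->noio)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}
```

In this model, a callee that hard-codes `MODEL_GFP_KERNEL` is effectively limited to `MODEL_GFP_NOIO` while the scope is active, which is the effect the patch relies on around sock_create_kern(). A plain NOFS mask would still carry the I/O bit, so reclaim could still issue block I/O.
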
On Thu, 2017-03-23 at 08:58 +0800, penglaiyxy wrote:
>
> How about using GFP_NOFS instead?
>
> penglaiyxy
>
> From: Jeff Layton
> Date: 2017-03-23 04:49
> To: Ilya Dryomov; ceph-devel
> Subject: Re: [PATCH] libceph: force GFP_NOIO for socket allocations
> On Wed, 2017-03-22 at 12:12 +0100, Ilya Dryomov wrote:
> > sock_alloc_inode() allocates socket+inode and socket_wq with
> > GFP_KERNEL, which is not allowed on the writeback path:
> >
> > Workqueue: ceph-msgr con_work [libceph]
> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
> > Call Trace:
> > [<ffffffff816dd629>] schedule+0x29/0x70
> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> > [<ffffffff81086335>] flush_work+0x165/0x250
> > [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
> > [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
> > [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
> > [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
> > [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
> > [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> > [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
> > [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> > [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
> > [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
> > [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
> > [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
> > [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
> > [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
> > [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
> > [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
> > [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
> > [<ffffffff8115af70>] shrink_slab+0x100/0x140
> > [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
> > [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
> > [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
> > [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
> > [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
> > [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
> > [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
> > [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
> > [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
> > [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
> > [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
> > [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
> > [<ffffffff811d8566>] alloc_inode+0x26/0xa0
> > [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
> > [<ffffffff815b933e>] sock_alloc+0x1e/0x80
> > [<ffffffff815ba855>] __sock_create+0x95/0x220
> > [<ffffffff815baa04>] sock_create_kern+0x24/0x30
> > [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
> > [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
> > [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> > [<ffffffff8108561b>] worker_thread+0x11b/0x530
> > [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> > [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> >
> > Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
> >
> > Cc: stable@vger.kernel.org # 3.10+, needs backporting
> > Link: http://tracker.ceph.com/issues/19309
> > Reported-by: Sergey Jerusalimov <wintchester@gmail.com>
> > Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> > ---
> > net/ceph/messenger.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> > index 38dcf1eb427d..f76bb3332613 100644
> > --- a/net/ceph/messenger.c
> > +++ b/net/ceph/messenger.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/kthread.h>
> >  #include <linux/net.h>
> >  #include <linux/nsproxy.h>
> > +#include <linux/sched/mm.h>
> >  #include <linux/slab.h>
> >  #include <linux/socket.h>
> >  #include <linux/string.h>
> > @@ -469,11 +470,16 @@ static int ceph_tcp_connect(struct ceph_connection *con)
> >  {
> >  	struct sockaddr_storage *paddr = &con->peer_addr.in_addr;
> >  	struct socket *sock;
> > +	unsigned int noio_flag;
> >  	int ret;
> >
> >  	BUG_ON(con->sock);
> > +
> > +	/* sock_create_kern() allocates with GFP_KERNEL */
> > +	noio_flag = memalloc_noio_save();
> >  	ret = sock_create_kern(read_pnet(&con->msgr->net), paddr->ss_family,
> >  			       SOCK_STREAM, IPPROTO_TCP, &sock);
> > +	memalloc_noio_restore(noio_flag);
> >  	if (ret)
> >  		return ret;
> >  	sock->sk->sk_allocation = GFP_NOFS;
>
> Reviewed-by: Jeff Layton <jlayton@redhat.com>
--
Jeff Layton <jlayton@redhat.com>