Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations

linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Michal Hocko <mhocko@kernel.org>
To: Ilya Dryomov <idryomov@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org,
	Sergey Jerusalimov <wintchester@gmail.com>,
	Jeff Layton <jlayton@redhat.com>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
Date: Thu, 30 Mar 2017 16:36:52 +0200	[thread overview]
Message-ID: <20170330143652.GA4326@dhcp22.suse.cz> (raw)
In-Reply-To: <CAOi1vP9Mdk+meGj39+wccBa6HN07y-pDxMLJj_xmAYBsRSoy1g@mail.gmail.com>

On Thu 30-03-17 15:48:42, Ilya Dryomov wrote:
> On Thu, Mar 30, 2017 at 1:21 PM, Michal Hocko <mhocko@kernel.org> wrote:
[...]
> > familiar with Ceph at all but does any of its (slab) shrinkers generate
> > IO to recurse back?
> 
> We don't register any custom shrinkers.  This is XFS on top of rbd,
> a ceph-backed block device.

OK, that was the part I was missing. So you depend on the XFS to make a
forward progress here.

> >> Well,
> >> it's got to go through the same ceph_connection:
> >>
> >> rbd_queue_workfn
> >>   ceph_osdc_start_request
> >>     ceph_con_send
> >>       mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out
> >>
> >> Now if that was a GFP_NOIO allocation, we would simply block in the
> >> allocator.  The placement algorithm distributes objects across the OSDs
> >> in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
> >> that OSD, some other I/Os for other OSDs would complete in the meantime
> >> and free up memory.  If we are under the kind of memory pressure that
> >> makes GFP_NOIO allocations block for an extended period of time, we are
> >> bound to have a lot of pre-open sockets, as we would have done at least
> >> some flushing by then.
> >
> > How is this any different from xfs waiting for its IO to be done?
> 
> I feel like we are talking past each other here.  If the worker in
> question isn't deadlocked, it will eventually get its socket and start
> flushing I/O.  If it has deadlocked, it won't...

But if the allocation is stuck then the holder of the lock cannot make
a forward progress and it is effectivelly deadlocked because other IO
depends on the lock it holds. Maybe I just ask bad questions but what
makes GFP_NOIO different from GFP_KERNEL here. We know that the later
might need to wait for an IO to finish in the shrinker but it itself
doesn't get the lock in question directly. The former depends on the
allocator forward progress as well and that in turn wait for somebody
else to proceed with the IO. So to me any blocking allocation while
holding a lock which blocks further IO to complete is simply broken.
-- 
Michal Hocko
SUSE Labs

next prev parent reply	other threads:[~2017-03-30 14:37 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20170328122559.966310440@linuxfoundation.org>
     [not found] ` <20170328122601.905696872@linuxfoundation.org>
     [not found]   ` <20170328124312.GE18241@dhcp22.suse.cz>
     [not found]     ` <CAOi1vP-TeEwNM8n=Z5b6yx1epMDVJ4f7+S1poubA7zfT7L0hQQ@mail.gmail.com>
     [not found]       ` <20170328133040.GJ18241@dhcp22.suse.cz>
     [not found]         ` <CAOi1vP-doHSj8epQ1zLBnEi8QM4Eb7nFb5uo-XeUquZUkhacsg@mail.gmail.com>
2017-03-29 10:41           ` [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations Michal Hocko
2017-03-29 10:55             ` Michal Hocko
2017-03-29 11:10               ` Ilya Dryomov
2017-03-29 11:16                 ` Michal Hocko
2017-03-29 14:25                   ` Ilya Dryomov
2017-03-30  6:25                     ` Michal Hocko
2017-03-30 10:02                       ` Ilya Dryomov
2017-03-30 11:21                         ` Michal Hocko
2017-03-30 13:48                           ` Ilya Dryomov
2017-03-30 14:36                             ` Michal Hocko [this message]
2017-03-30 15:06                               ` Ilya Dryomov
2017-03-30 16:12                                 ` Michal Hocko
2017-03-30 17:19                                   ` Ilya Dryomov
2017-03-30 18:44                                     ` Michal Hocko
2017-03-30 13:53                       ` Ilya Dryomov
2017-03-30 13:59                         ` Michal Hocko
2017-03-29 11:05             ` Brian Foster
2017-03-29 11:14               ` Ilya Dryomov
2017-03-29 11:18                 ` Michal Hocko
2017-03-29 11:49                   ` Brian Foster
2017-03-29 14:30                     ` Ilya Dryomov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170330143652.GA4326@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=idryomov@gmail.com \
    --cc=jlayton@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wintchester@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).