public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jim Schutt" <jaschut@sandia.gov>
To: "Alex Elder" <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] libceph: ceph_pagelist_append might sleep while atomic
Date: Tue, 14 May 2013 11:44:22 -0600	[thread overview]
Message-ID: <519277F6.4060808@sandia.gov> (raw)
In-Reply-To: <519269F2.1000907@inktank.com>

On 05/14/2013 10:44 AM, Alex Elder wrote:
> On 05/09/2013 09:42 AM, Jim Schutt wrote:
>> Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc while
>> holding a lock, but it's spoiled because ceph_pagelist_addpage() always
>> calls kmap(), which might sleep.  Here's the result:
> 
> I finally took a close look at this today, Jim.  Sorry
> for the delay.
> 

No worries - thanks for taking a look.

> The issue is formatting the reconnect message--which will
> hold an arbitrary amount of data and therefore which we'll
> need to do some allocation (and kmap) for--in the face of
> having to hold the flock spinlock while doing so.
> 
> And as you found, ceph_pagelist_addpage(), which is called
> by ceph_pagelist_append(), calls kmap() even if it doesn't
> need to allocate anything.  This means that despite reserving
> the pages, those pages are in the free list and because they'll
> need to be the subject of kmap() their preallocation doesn't
> help.
> 
> Your solution was to pre-allocate a buffer, format the locks
> into that buffer while holding the lock, then append the
> buffer contents to a pagelist after releasing the lock.  You
> check for a changing (increasing) lock count while you format
> the locks, which is good.
> 
> So...  Given that, I think your change looks good.  It's a shame
> we can't format directly into the pagelist buffer but this won't
> happen much so it's not a big deal.  I have a few small suggestions,
> below.
> 
> I do find some byte order bugs though.   They aren't your doing,
> but I think they ought to be fixed first, as a separate patch
> that would precede this one.  The bug is that the lock counts
> that are put into the buffer (num_fcntl_locks and num_flock_locks)
> are not properly byte-swapped.  I'll point it out inline
> in your code, below.
> 
> I'll say that what you have is OK.  Consider my suggestions, and
> if you choose not to fix the byte order bugs, please let me know.

I'll happily fix up a v2 series with your suggestions addressed.
Thanks for catching those issues.  Stay tuned...

Thanks -- Jim



      reply	other threads:[~2013-05-14 17:44 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-09 14:42 [PATCH] libceph: ceph_pagelist_append might sleep while atomic Jim Schutt
2013-05-14 16:44 ` Alex Elder
2013-05-14 17:44   ` Jim Schutt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519277F6.4060808@sandia.gov \
    --to=jaschut@sandia.gov \
    --cc=ceph-devel@vger.kernel.org \
    --cc=elder@inktank.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox