From: "Jim Schutt" <jaschut@sandia.gov>
To: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] libceph: ceph_pagelist_append might sleep while atomic
Date: Tue, 14 May 2013 11:44:22 -0600 [thread overview]
Message-ID: <519277F6.4060808@sandia.gov> (raw)
In-Reply-To: <519269F2.1000907@inktank.com>
On 05/14/2013 10:44 AM, Alex Elder wrote:
> On 05/09/2013 09:42 AM, Jim Schutt wrote:
>> Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc while
>> holding a lock, but it's spoiled because ceph_pagelist_addpage() always
>> calls kmap(), which might sleep. Here's the result:
>
> I finally took a close look at this today, Jim. Sorry
> for the delay.
>
No worries - thanks for taking a look.
> The issue is formatting the reconnect message--which will
> hold an arbitrary amount of data and therefore which we'll
> need to do some allocation (and kmap) for--in the face of
> having to hold the flock spinlock while doing so.
>
> And as you found, ceph_pagelist_addpage(), which is called
> by ceph_pagelist_append(), calls kmap() even if it doesn't
> need to allocate anything. This means that despite reserving
> the pages, those pages are in the free list and because they'll
> need to be the subject of kmap() their preallocation doesn't
> help.
>
> Your solution was to pre-allocate a buffer, format the locks
> into that buffer while holding the lock, then append the
> buffer contents to a pagelist after releasing the lock. You
> check for a changing (increasing) lock count while you format
> the locks, which is good.
>
> So... Given that, I think your change looks good. It's a shame
> we can't format directly into the pagelist buffer but this won't
> happen much so it's not a big deal. I have a few small suggestions,
> below.
>
> I do find some byte order bugs though. They aren't your doing,
> but I think they ought to be fixed first, as a separate patch
> that would precede this one. The bug is that the lock counts
> that are put into the buffer (num_fcntl_locks and num_flock_locks)
> are not properly byte-swapped. I'll point it out inline
> in your code, below.
>
> I'll say that what you have is OK. Consider my suggestions, and
> if you choose not to fix the byte order bugs, please let me know.
I'll happily fix up a v2 series with your suggestions addressed.
Thanks for catching those issues. Stay tuned...
Thanks -- Jim
prev parent reply other threads:[~2013-05-14 17:44 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-09 14:42 [PATCH] libceph: ceph_pagelist_append might sleep while atomic Jim Schutt
2013-05-14 16:44 ` Alex Elder
2013-05-14 17:44 ` Jim Schutt [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=519277F6.4060808@sandia.gov \
--to=jaschut@sandia.gov \
--cc=ceph-devel@vger.kernel.org \
--cc=elder@inktank.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.