From: Wei Wang <wei.w.wang@intel.com>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: mst@redhat.com, virtio-dev@lists.oasis-open.org,
linux-kernel@vger.kernel.org, qemu-devel@nongnu.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
linux-mm@kvack.org, mhocko@kernel.org, akpm@linux-foundation.org,
mawilcox@microsoft.com, david@redhat.com,
cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
willy@infradead.org, liliang.opensource@gmail.com,
yang.zhang.wz@gmail.com, quan.xu@aliyun.com
Subject: Re: [Qemu-devel] [PATCH v16 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG
Date: Wed, 11 Oct 2017 11:16:55 +0800
Message-ID: <59DD8D27.5010601@intel.com>
In-Reply-To: <201710110226.v9B2QGdx019779@www262.sakura.ne.jp>
On 10/11/2017 10:26 AM, Tetsuo Handa wrote:
> Wei Wang wrote:
>> On 10/10/2017 09:09 PM, Tetsuo Handa wrote:
>>> Wei Wang wrote:
>>>>> And even if we could remove balloon_lock, you still cannot use
>>>>> __GFP_DIRECT_RECLAIM at xb_set_page(). I think you will need to use
>>>>> "whether it is safe to wait" flag from
>>>>> "[PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()" .
>>>> Without the lock being held, why couldn't we use __GFP_DIRECT_RECLAIM at
>>>> xb_set_page()?
>>> Because of dependency shown below.
>>>
>>> leak_balloon()
>>>   xb_set_page()
>>>     xb_preload(GFP_KERNEL)
>>>       kmalloc(GFP_KERNEL)
>>>         __alloc_pages_may_oom()
>>>           Takes oom_lock
>>>           out_of_memory()
>>>             blocking_notifier_call_chain()
>>>               leak_balloon()
>>>                 xb_set_page()
>>>                   xb_preload(GFP_KERNEL)
>>>                     kmalloc(GFP_KERNEL)
>>>                       __alloc_pages_may_oom()
>>>                         Fails to take oom_lock and loop forever
>> __alloc_pages_may_oom() uses mutex_trylock(&oom_lock).
> Yes. But this mutex_trylock(&oom_lock) is semantically mutex_lock(&oom_lock)
> because __alloc_pages_slowpath() will continue looping until
> mutex_trylock(&oom_lock) succeeds (or somebody releases memory).
>
>> I think the second __alloc_pages_may_oom() will not continue since the
>> first one is in progress.
> The second __alloc_pages_may_oom() will be called repeatedly because
> __alloc_pages_slowpath() will continue looping (unless somebody releases
> memory).
>
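To check my understanding of why the trylock behaves like a blocking lock here, below is a small user-space model, not kernel code. Every model_* name is made up for illustration, and the retry budget exists only so that the example terminates; as described above, the real slow path has no such bound.

```c
#include <assert.h>
#include <stdbool.h>

/* All model_* names are hypothetical; this is not kernel code. */
static bool model_oom_lock_held;   /* stands in for oom_lock */
static int model_retry_budget;     /* artificial bound so the demo terminates */
static int model_failed_trylocks;  /* how often the nested caller spun */

static bool model_trylock(void)
{
	if (model_oom_lock_held)
		return false;
	model_oom_lock_held = true;
	return true;
}

static void model_unlock(void)
{
	assert(model_oom_lock_held);   /* only the holder may unlock */
	model_oom_lock_held = false;
}

static bool model_alloc(int depth);

/* Stands in for the OOM notifier chain calling leak_balloon(), which allocates. */
static void model_oom_notifier(int depth)
{
	(void)model_alloc(depth + 1);
}

/* Stands in for the allocator slow path: keep retrying until the lock is won. */
static bool model_alloc(int depth)
{
	while (model_retry_budget > 0) {
		model_retry_budget--;
		if (!model_trylock()) {
			model_failed_trylocks++;
			continue;   /* "somebody else is handling OOM", so retry */
		}
		if (depth == 0)
			model_oom_notifier(depth);   /* allocates again, lock still held */
		model_unlock();
		return true;
	}
	return false;   /* budget exhausted; the real loop has no such exit */
}
```

Calling model_alloc(0) with a budget of 1000 lets the outer call win the lock on its first attempt, after which the nested call burns the remaining 999 attempts on failed trylocks. Without the budget, the nested call would never return, which is the lockup described above.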
OK, I see, thanks. So the point is that the OOM code path should not
allocate memory, and the old leak_balloon() (without the F_SG feature)
doesn't need xb_preload(). I think one solution would be to let the OOM
notifier use the old leak_balloon() code path, and we can add one more
parameter to leak_balloon() to control that:

leak_balloon(struct virtio_balloon *vb, size_t num, bool oom)
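A rough user-space sketch of that dispatch, only to illustrate the idea. struct balloon_model, leak_sg(), leak_legacy() and leak_balloon_model() are hypothetical stand-ins, not the driver's code; malloc() stands in for xb_preload(GFP_KERNEL), and the allocs counter is there only to show which path was taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct virtio_balloon. */
struct balloon_model {
	size_t num_pages;   /* pages currently held by the balloon */
	bool sg_enabled;    /* models VIRTIO_BALLOON_F_SG being negotiated */
	size_t allocs;      /* allocations made while deflating */
};

/* Models the F_SG path: tracking the pages needs a bookkeeping allocation. */
static size_t leak_sg(struct balloon_model *vb, size_t num)
{
	void *bookkeeping = malloc(64);   /* stands in for xb_preload(GFP_KERNEL) */

	if (!bookkeeping)
		return 0;
	vb->allocs++;
	if (num > vb->num_pages)
		num = vb->num_pages;
	vb->num_pages -= num;
	free(bookkeeping);
	return num;
}

/* Models the old path: no allocation is needed to release pages. */
static size_t leak_legacy(struct balloon_model *vb, size_t num)
{
	if (num > vb->num_pages)
		num = vb->num_pages;
	vb->num_pages -= num;
	return num;
}

/* The extra "oom" argument forces the allocation-free path. */
static size_t leak_balloon_model(struct balloon_model *vb, size_t num, bool oom)
{
	assert(vb != NULL);
	if (vb->sg_enabled && !oom)
		return leak_sg(vb, num);
	return leak_legacy(vb, num);
}
```

With sg_enabled set, a call with oom == true releases pages without touching the allocator, while the same call with oom == false goes through the allocating path.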
>>> By the way, is xb_set_page() safe?
>>> Sleeping in the kernel with preemption disabled is a bug, isn't it?
>>> __radix_tree_preload() returns 0 with preemption disabled upon success.
>>> xb_preload() disables preemption if __radix_tree_preload() fails.
>>> Then, kmalloc() is called with preemption disabled, isn't it?
>>> But xb_set_page() calls xb_preload(GFP_KERNEL) which might sleep with
>>> preemption disabled.
>> Yes, I think that should not be expected, thanks.
>>
>> I plan to change it like this:
>>
>> bool xb_preload(gfp_t gfp)
>> {
>>         if (!this_cpu_read(ida_bitmap)) {
>>                 struct ida_bitmap *bitmap = kmalloc(sizeof(*bitmap), gfp);
>>
>>                 if (!bitmap)
>>                         return false;
>>                 bitmap = this_cpu_cmpxchg(ida_bitmap, NULL, bitmap);
>>                 kfree(bitmap);
>>         }
> Excuse me, but you are allocating per-CPU memory when running CPU might
> change at this line? What happens if running CPU has changed at this line?
> Will it work even with new CPU's ida_bitmap == NULL ?
>
Yes, it will be detected in xb_set_bit(): when ida_bitmap == NULL on the
new CPU, xb_set_bit() will return -EAGAIN to the caller, and the caller
should restart from xb_preload().
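The caller-side retry could look roughly like the user-space model below. All model_* names are hypothetical: the per-CPU ida_bitmap is modeled as a plain array indexed by a fake CPU number, the migration is simulated by a counter, and, as a simplification, the model never consumes the preloaded bitmap.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-ins for the per-CPU ida_bitmap and the running CPU. */
static void *model_ida_bitmap[MODEL_NR_CPUS];
static int model_cpu;
static int model_migrations_left;   /* simulated migrations still pending */

/* Models xb_preload(): fill the current CPU's slot, then maybe migrate. */
static bool model_xb_preload(void)
{
	if (!model_ida_bitmap[model_cpu]) {
		void *bitmap = malloc(128);

		if (!bitmap)
			return false;
		model_ida_bitmap[model_cpu] = bitmap;
	}
	/* Simulate the task moving to another CPU right after the allocation. */
	if (model_migrations_left > 0) {
		model_migrations_left--;
		model_cpu = (model_cpu + 1) % MODEL_NR_CPUS;
	}
	return true;
}

/* Models xb_set_bit(): refuse with -EAGAIN if this CPU has no preload. */
static int model_xb_set_bit(unsigned long *word, unsigned int bit)
{
	assert(bit < sizeof(*word) * CHAR_BIT);
	if (!model_ida_bitmap[model_cpu])
		return -EAGAIN;
	*word |= 1UL << bit;
	return 0;
}

/* Caller: restart from the preload step whenever -EAGAIN comes back. */
static int model_set_bit_retry(unsigned long *word, unsigned int bit, int *attempts)
{
	int ret;

	do {
		(*attempts)++;
		if (!model_xb_preload())
			return -ENOMEM;
		ret = model_xb_set_bit(word, bit);
	} while (ret == -EAGAIN);
	return ret;
}
```

With one simulated migration, the first attempt preloads CPU 0, lands on CPU 1 with an empty slot and gets -EAGAIN; the second attempt preloads CPU 1 and succeeds.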
Best,
Wei