From: Wei Wang <wei.w.wang@intel.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, willy@infradead.org
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
kvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,
mhocko@kernel.org, akpm@linux-foundation.org,
mawilcox@microsoft.com, david@redhat.com,
cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com
Subject: Re: [Qemu-devel] [virtio-dev] Re: [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG
Date: Sun, 24 Dec 2017 16:16:20 +0800 [thread overview]
Message-ID: <5A3F6254.7070306@intel.com> (raw)
In-Reply-To: <5A3F5A4A.1070009@intel.com>
On 12/24/2017 03:42 PM, Wei Wang wrote:
> On 12/24/2017 12:45 PM, Tetsuo Handa wrote:
>> Matthew Wilcox wrote:
>>>> + unsigned long pfn = page_to_pfn(page);
>>>> + int ret;
>>>> +
>>>> + *pfn_min = min(pfn, *pfn_min);
>>>> + *pfn_max = max(pfn, *pfn_max);
>>>> +
>>>> + do {
>>>> + if (xb_preload(GFP_NOWAIT | __GFP_NOWARN) < 0)
>>>> + return -ENOMEM;
>>>> +
>>>> + ret = xb_set_bit(&vb->page_xb, pfn);
>>>> + xb_preload_end();
>>>> + } while (unlikely(ret == -EAGAIN));
>>> OK, so you don't need a spinlock because you're under a mutex? But you
>>> can't allocate memory because you're in the balloon driver, and so a
>>> GFP_KERNEL allocation might recurse into your driver?
>> Right. We can't (directly or indirectly) depend on
>> __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
>> allocations because the balloon driver needs to handle OOM notifier
>> callback.
>>
>>> Would GFP_NOIO
>>> do the job? I'm a little hazy on exactly how the balloon driver works.
>> GFP_NOIO implies __GFP_DIRECT_RECLAIM. In the worst case, it can
>> lockup due to
>> the too small to fail memory allocation rule. GFP_NOIO |
>> __GFP_NORETRY would work
>> if there is really a guarantee that GFP_NOIO | __GFP_NORETRY never
>> depend on
>> __GFP_DIRECT_RECLAIM && !__GFP_NORETRY allocations, which is too
>> subtle for me to
>> validate. The direct reclaim dependency is too complicated to validate.
>> I consider that !__GFP_DIRECT_RECLAIM is the future-safe choice.
>
> What's the problem with (or how is it better than) the "GFP_NOWAIT |
> __GFP_NOWARN" we are using here?
>
>
>>> If you can't preload with anything better than that, I think that
>>> xb_set_bit() should attempt an allocation with GFP_NOWAIT |
>>> __GFP_NOWARN,
>>> and then you can skip the preload; it has no value for you.
>> Yes, that's why I suggest directly using kzalloc() at xb_set_bit().
>
> It has some possibilities to remove that preload if we also do the
> bitmap allocation in the xb_set_bit():
>
> bitmap = rcu_dereference_raw(*slot);
> if (!bitmap) {
> bitmap = this_cpu_xchg(ida_bitmap, NULL);
> if (!bitmap) {
> bitmap = kmalloc(sizeof(*bitmap), gfp);
> if (!bitmap)
> return -ENOMEM;
> }
> }
>
> But why not just follow the radix tree implementation style that puts
> the allocation in preload, which would be invoked with a more relaxed
> gfp in other use cases?
> Its usage in virtio_balloon is just a little special that we need to
> put the allocation within the balloon_lock, which doesn't give us the
> benefit of using a relaxed gfp in preload, but it doesn't prevent us
> from living with the current APIs (i.e. the preload + xb_set pattern).
> On the other side, if we do it as above, we have more things that need
> to consider. For example, what if the a use case just want the radix
> tree implementation style, which means it doesn't want allocation
> within xb_set(), then would we be troubled with how to avoid the
> allocation path in that case?
>
> So, I think it is better to stick with the convention by putting the
> allocation in preload. Breaking the convention should show obvious
> advantages, IMHO.
>
>
>>
>>>> @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct
>>>> virtio_balloon *vb, size_t num)
>>>> while ((page = balloon_page_pop(&pages))) {
>>>> balloon_page_enqueue(&vb->vb_dev_info, page);
>>>> + if (use_sg) {
>>>> + if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
>>>> + __free_page(page);
>>>> + continue;
>>>> + }
>>>> + } else {
>>>> + set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>>>> + }
>>> Is this the right behaviour?
>> I don't think so. In the worst case, we can set no bit using
>> xb_set_page().
>
>>
>>> If we can't record the page in the xb,
>>> wouldn't we rather send it across as a single page?
>>>
>> I think that we need to be able to fallback to !use_sg path when OOM.
>
> I also have different thoughts:
>
> 1) For OOM, we have leak_balloon_sg_oom (oom has nothing to do with
> fill_balloon), which does not use xbitmap to record pages, thus no
> memory allocation.
>
> 2) If the memory is already under pressure, it is pointless to
> continue inflating memory to the host. We need to give thanks to the
> memory allocation failure reported by xbitmap, which gets us a chance
> to release the inflated pages that have been demonstrated to cause the
> memory pressure of the guest.
>
Forgot to add my conclusion: I think the above behavior is correct.
Best,
Wei
next prev parent reply other threads:[~2017-12-24 8:14 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-19 12:17 [Qemu-devel] [PATCH v20 0/7] Virtio-balloon Enhancement Wei Wang
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 1/7] xbitmap: Introduce xbitmap Wei Wang
2017-12-19 15:58 ` Philippe Ombredanne
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 2/7] xbitmap: potential improvement Wei Wang
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 3/7] xbitmap: add more operations Wei Wang
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG Wei Wang
2017-12-24 3:21 ` Matthew Wilcox
2017-12-24 4:45 ` Tetsuo Handa
2017-12-24 7:42 ` Wei Wang
2017-12-24 8:16 ` Wei Wang [this message]
2017-12-25 14:51 ` Tetsuo Handa
2017-12-26 3:06 ` Wei Wang
2017-12-26 10:38 ` Tetsuo Handa
2017-12-26 11:36 ` Wei Wang
2017-12-26 13:40 ` Tetsuo Handa
2018-01-02 13:24 ` Matthew Wilcox
2018-01-03 2:29 ` Tetsuo Handa
2018-01-03 9:00 ` Wei Wang
2018-01-03 10:19 ` Tetsuo Handa
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 5/7] mm: support reporting free page blocks Wei Wang
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 6/7] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ Wei Wang
2017-12-19 12:17 ` [Qemu-devel] [PATCH v20 7/7] virtio-balloon: don't report free pages when page poisoning is enabled Wei Wang
2017-12-19 14:05 ` [Qemu-devel] [PATCH v20 0/7] Virtio-balloon Enhancement Tetsuo Handa
2017-12-19 14:40 ` Matthew Wilcox
2017-12-20 2:33 ` Tetsuo Handa
2017-12-19 18:08 ` Michael S. Tsirkin
2017-12-20 10:34 ` Wei Wang
2017-12-20 12:25 ` Matthew Wilcox
2017-12-20 16:13 ` Wang, Wei W
2017-12-20 17:10 ` Matthew Wilcox
2017-12-21 2:49 ` Wei Wang
2017-12-21 12:14 ` Matthew Wilcox
2017-12-21 12:56 ` Tetsuo Handa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A3F6254.7070306@intel.com \
--to=wei.w.wang@intel.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=amit.shah@redhat.com \
--cc=cornelia.huck@de.ibm.com \
--cc=david@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=liliang.opensource@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mawilcox@microsoft.com \
--cc=mgorman@techsingularity.net \
--cc=mhocko@kernel.org \
--cc=mst@redhat.com \
--cc=nilal@redhat.com \
--cc=pbonzini@redhat.com \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=qemu-devel@nongnu.org \
--cc=quan.xu0@gmail.com \
--cc=riel@redhat.com \
--cc=virtio-dev@lists.oasis-open.org \
--cc=virtualization@lists.linux-foundation.org \
--cc=willy@infradead.org \
--cc=yang.zhang.wz@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).