linux-mm.kvack.org archive mirror
From: Wei Wang <wei.w.wang@intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-mm@kvack.org, mst@redhat.com, mawilcox@microsoft.com,
	akpm@linux-foundation.org, virtio-dev@lists.oasis-open.org,
	david@redhat.com, cornelia.huck@de.ibm.com,
	mgorman@techsingularity.net, aarcange@redhat.com,
	amit.shah@redhat.com, pbonzini@redhat.com,
	liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
	quan.xu@aliyun.com
Subject: Re: [PATCH v13 4/5] mm: support reporting free page blocks
Date: Thu, 03 Aug 2017 20:11:58 +0800
Message-ID: <5983130E.2070806@intel.com>
In-Reply-To: <20170803112831.GN12521@dhcp22.suse.cz>

On 08/03/2017 07:28 PM, Michal Hocko wrote:
> On Thu 03-08-17 19:27:19, Wei Wang wrote:
>> On 08/03/2017 06:44 PM, Michal Hocko wrote:
>>> On Thu 03-08-17 18:42:15, Wei Wang wrote:
>>>> On 08/03/2017 05:11 PM, Michal Hocko wrote:
>>>>> On Thu 03-08-17 14:38:18, Wei Wang wrote:
>>> [...]
>>>>>> +static int report_free_page_block(struct zone *zone, unsigned int order,
>>>>>> +				  unsigned int migratetype, struct page **page)
>>>>> This is just too ugly and wrong actually. Never provide struct page
>>>>> pointers outside of the zone->lock. What I've had in mind was to simply
>>>>> walk free lists of the suitable order and call the callback for each one.
>>>>> Something as simple as
>>>>>
>>>>> 	for (i = 0; i < MAX_NR_ZONES; i++) {
>>>>> 		struct zone *zone = &pgdat->node_zones[i];
>>>>>
>>>>> 		if (!populated_zone(zone))
>>>>> 			continue;
>>>>> 		spin_lock_irqsave(&zone->lock, flags);
>>>>> 		for (order = min_order; order < MAX_ORDER; ++order) {
>>>>> 			struct free_area *free_area = &zone->free_area[order];
>>>>> 			enum migratetype mt;
>>>>> 			struct page *page;
>>>>>
>>>>> 			if (!free_area->nr_pages)
>>>>> 				continue;
>>>>>
>>>>> 			for_each_migratetype_order(order, mt) {
>>>>> 				list_for_each_entry(page,
>>>>> 						&free_area->free_list[mt], lru) {
>>>>>
>>>>> 					pfn = page_to_pfn(page);
>>>>> 					visit(opaque2, pfn, 1<<order);
>>>>> 				}
>>>>> 			}
>>>>> 		}
>>>>>
>>>>> 		spin_unlock_irqrestore(&zone->lock, flags);
>>>>> 	}
>>>>>
>>>>> [...]
>>>> I think the above would hold the lock for too long. That's why we
>>>> prefer to take one free page block each time, and taking it one by one
>>>> also doesn't make a difference, in terms of the performance that we
>>>> need.
>>> I think you should start with a simple approach and improve incrementally
>>> if this turns out to be not optimal. I really detest taking struct pages
>>> outside of the lock. You never know what might happen after the lock is
>>> dropped. E.g. can you race with the memory hotremove?
>>
>> The caller won't use pages returned from the function, so I think there
>> shouldn't be an issue or race if the returned pages are used (i.e. not free
>> anymore) or simply gone due to hotremove.
> No, this is just too error prone. Consider that struct page pointer
> itself could get invalid in the meantime. Please always keep robustness
> in mind first. Optimizations are nice but it is not even clear whether
> the simple variant will cause any problems.


How about this:

for_each_populated_zone(zone) {
        for_each_migratetype_order_decend(min_order, order, type) {
                do {
      =>                spin_lock_irqsave(&zone->lock, flags);
                        ret = report_free_page_block(zone, order, type,
                                                     &page);
                        if (!ret) {
                                pfn = page_to_pfn(page);
                                nr_pages = 1 << order;
                                visit(opaque1, pfn, nr_pages);
                        }
      =>                spin_unlock_irqrestore(&zone->lock, flags);
                } while (!ret);
        }
}

This way, the lock is still held for only one free page block at a time,
while every access to the struct page happens under the lock.
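
To make the locking pattern concrete, here is a minimal userspace sketch.
It is not kernel code and not part of the patch: struct toy_zone,
toy_report_one(), toy_walk(), the toy spinlock built on C11 atomic_flag,
and the TOY_* limits are all hypothetical names made up for illustration.
The sketch also assumes that a plain array index is enough to remember the
position between two lock acquisitions, which would still need to be
worked out for the real free lists.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_MAX_ORDER  4	/* hypothetical; not the kernel's MAX_ORDER */
#define TOY_MAX_BLOCKS 8

/* Toy stand-in for a zone: one array of free-block PFNs per order. */
struct toy_zone {
	atomic_flag lock;	/* plays the role of zone->lock */
	unsigned long pfn[TOY_MAX_ORDER][TOY_MAX_BLOCKS];
	size_t nr_free[TOY_MAX_ORDER];
};

typedef void (*visit_fn)(void *opaque, unsigned long pfn,
			 unsigned long nr_pages);

static void toy_lock(struct toy_zone *z)
{
	while (atomic_flag_test_and_set(&z->lock))
		;	/* spin until the flag was clear */
}

static void toy_unlock(struct toy_zone *z)
{
	atomic_flag_clear(&z->lock);
}

/*
 * Report at most one block of the given order. The callback runs while
 * the lock is held. *cursor remembers how far the previous call got.
 * Returns 0 if a block was reported, -1 if this order is exhausted.
 */
static int toy_report_one(struct toy_zone *z, unsigned int order,
			  size_t *cursor, visit_fn visit, void *opaque)
{
	int ret = -1;

	assert(order < TOY_MAX_ORDER);
	toy_lock(z);
	if (*cursor < z->nr_free[order]) {
		visit(opaque, z->pfn[order][*cursor], 1UL << order);
		(*cursor)++;
		ret = 0;
	}
	toy_unlock(z);
	return ret;
}

/* Walk from the highest order down to min_order, one lock hold per block. */
static unsigned long toy_walk(struct toy_zone *z, unsigned int min_order,
			      visit_fn visit, void *opaque)
{
	unsigned long reported = 0;

	for (unsigned int order = TOY_MAX_ORDER; order-- > min_order; ) {
		size_t cursor = 0;

		while (toy_report_one(z, order, &cursor, visit, opaque) == 0)
			reported++;
	}
	return reported;
}

/* Example callback: adds up the number of pages it is told about. */
static void toy_count_pages(void *opaque, unsigned long pfn,
			    unsigned long nr_pages)
{
	(void)pfn;
	*(unsigned long *)opaque += nr_pages;
}
```

The lock is dropped and retaken between any two blocks, so other lock
users can get in after every report, and the callback never sees an entry
outside of the lock.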



Best,
Wei