From: Andrea Arcangeli <andrea@suse.de>
To: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: VMA_MERGING_FIXUP and patch
Date: Mon, 22 Mar 2004 18:52:16 +0100 [thread overview]
Message-ID: <20040322175216.GJ3649@dualathlon.random> (raw)
In-Reply-To: <Pine.LNX.4.44.0403221640230.11645-100000@localhost.localdomain>
On Mon, Mar 22, 2004 at 05:05:33PM +0000, Hugh Dickins wrote:
> Just a reminder that you still have several #if VMA_MERGING_FIXUPs
> in your 2.6.5-rc2-aa1 tree, so mprotects and mremaps are not merging
> vmas at all.
>
> I can understand if you'd prefer to leave the mremaps that way,
> at least for now. (I do have code to decide whether any page is
> shared, will post later in anobjrmap 7/6, you could use the same
> to allow mremap vma merging if unproblematic.) But I think you
> ought to get to merging the mprotects, aren't there apps which
> will give you a frightening number of vmas unless merged?
agreed, enabling again the merging in mprotect is quite high prio,
but really some of those apps have never been able to merge in mprotect
because we would never been able to merge the file mappings after a
vma_split. The pros is that after we fixup, the file mappings will merge
fine too (the cons is that the merging code will have to be more
complicated than it was before ;).
> Here's some minor updates (no hurry) to objrmap.c,
> mirroring recentish s390 mods to mainline rmap.c:
> the page_test_and_clear_dirty I mentioned before.
this is greatly appreciated, I didn't notice this change (I got
uninteresting rejects in mm/rmap.c.rej too often and by mistake I
stopped looking at them at some point).
> Hmm, I wonder, is that safe to be calling set_page_dirty
> from inside the page rmap lock? Andrew?
I agree, I don't feel safe running it there indeed (it will not hang
immediatly but it enforces an ordering of other spinlocks, the less
enforced ordering the better), I'd prefer to set a local
variable and run it after the page_map_unlock. Actually re-reading
page->mapcount lockless should be ok too, if we don't read zero due a
race, it means somebody else will run the unmap or remove_page_rmap
again. All it matters is that the very last one unmapping the page reads
zero and that's guaranteed even without the spinlock.
> --- 2.6.5-rc2-aa1/mm/objrmap.c 2004-03-22 11:38:55.000000000 +0000
> +++ linux/mm/objrmap.c 2004-03-22 16:34:29.421216936 +0000
> @@ -212,7 +212,7 @@ int fastcall page_referenced(struct page
> BUG_ON(!page->mapping);
>
> if (page_test_and_clear_young(page))
> - mark_page_accessed(page);
> + referenced++;
>
> if (TestClearPageReferenced(page))
> referenced++;
I definitely agree with this one.
what about this?
thanks for spotting these s390 bugs!
--- x/mm/vmscan.c.~1~ 2004-03-21 15:21:42.000000000 +0100
+++ x/mm/vmscan.c 2004-03-22 18:39:05.551793088 +0100
@@ -313,16 +313,14 @@ shrink_list(struct list_head *page_list,
if (page_mapped(page) && mapping) {
switch (try_to_unmap(page)) {
case SWAP_FAIL:
- page_map_unlock(page);
goto activate_locked;
case SWAP_AGAIN:
- page_map_unlock(page);
goto keep_locked;
case SWAP_SUCCESS:
; /* try to free the page below */
}
- }
- page_map_unlock(page);
+ } else
+ page_map_unlock(page);
/*
* If the page is dirty, only perform writeback if that write
--- x/mm/memory.c.~1~ 2004-03-21 15:21:42.000000000 +0100
+++ x/mm/memory.c 2004-03-22 13:40:26.852849384 +0100
@@ -324,9 +324,11 @@ skip_copy_pte_range:
* Device driver pages must not be
* tracked by the VM for unmapping.
*/
- BUG_ON(!page_mapped(page));
- BUG_ON(!page->mapping);
- page_add_rmap(page, vma, address, PageAnon(page));
+ if (likely(page_mapped(page) && page->mapping))
+ page_add_rmap(page, vma, address, PageAnon(page));
+ else
+ printk("Badness in %s at %s:%d\n",
+ __FUNCTION__, __FILE__, __LINE__);
} else {
BUG_ON(page_mapped(page));
BUG_ON(page->mapping);
@@ -1429,7 +1431,9 @@ retry:
* real anonymous pages, they're "device" reserved pages instead.
*/
reserved = !!(vma->vm_flags & VM_RESERVED);
- WARN_ON(reserved == pageable);
+ if (unlikely(reserved == pageable))
+ printk("Badness in %s at %s:%d\n",
+ __FUNCTION__, __FILE__, __LINE__);
/*
* Should we do an early C-O-W break?
--- x/mm/objrmap.c.~1~ 2004-03-21 15:21:42.000000000 +0100
+++ x/mm/objrmap.c 2004-03-22 18:41:56.434814920 +0100
@@ -48,6 +48,12 @@ static inline void validate_anon_vma_fin
#endif
}
+static pte_t *
+find_pte_nonlinear(struct mm_struct * mm, struct vm_area_struct *vma, struct page *page)
+{
+
+}
+
/**
* find_pte - Find a pte pointer given a vma and a struct page.
* @vma: the vma to search
@@ -62,7 +68,7 @@ static inline void validate_anon_vma_fin
*
* It is the caller's responsibility to unmap the pte if it is returned.
*/
-static inline pte_t *
+static pte_t *
find_pte(struct vm_area_struct *vma, struct page *page, unsigned long *addr)
{
struct mm_struct *mm = vma->vm_mm;
@@ -72,6 +78,9 @@ find_pte(struct vm_area_struct *vma, str
unsigned long loffset;
unsigned long address;
+ if (vma->vm_flags & VM_NONLINEAR)
+ return find_pte_nonlinear(mm, vma, page);
+
loffset = (page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT));
address = vma->vm_start + ((loffset - vma->vm_pgoff) << PAGE_SHIFT);
if (address < vma->vm_start || address >= vma->vm_end)
@@ -120,6 +129,13 @@ page_referenced_one(struct vm_area_struc
pte_t *pte;
int referenced = 0;
+ /*
+ * Tracking the referenced info is too expensive
+ * for nonlinear mappings.
+ */
+ if (vma->vm_flags & VM_NONLINEAR)
+ goto out;
+
if (!spin_trylock(&mm->page_table_lock))
goto out;
@@ -212,7 +228,7 @@ int fastcall page_referenced(struct page
BUG_ON(!page->mapping);
if (page_test_and_clear_young(page))
- mark_page_accessed(page);
+ referenced++;
if (TestClearPageReferenced(page))
referenced++;
@@ -344,6 +360,10 @@ void fastcall page_remove_rmap(struct pa
out_unlock:
page_map_unlock(page);
+
+ if (page_test_and_clear_dirty(page) && !page_mapped(page))
+ set_page_dirty(page);
+
return;
}
@@ -523,6 +543,11 @@ int fastcall try_to_unmap(struct page *
dec_page_state(nr_mapped);
ret = SWAP_SUCCESS;
}
+ page_map_unlock(page);
+
+ if (page_test_and_clear_dirty(page) && ret == SWAP_SUCCESS)
+ set_page_dirty(page);
+
return ret;
}
next prev parent reply other threads:[~2004-03-22 17:51 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-03-22 17:05 VMA_MERGING_FIXUP and patch Hugh Dickins
2004-03-22 17:52 ` Andrea Arcangeli [this message]
2004-03-22 19:02 ` Hugh Dickins
2004-03-22 19:58 ` Andrea Arcangeli
2004-03-23 21:44 ` nonlinear swapping w/o pte_chains [Re: VMA_MERGING_FIXUP and patch] Andrea Arcangeli
2004-03-24 2:35 ` Andrea Arcangeli
2004-03-24 4:38 ` Andrea Arcangeli
2004-03-24 10:12 ` Hugh Dickins
2004-03-24 12:18 ` Hugh Dickins
2004-03-24 14:47 ` Andrea Arcangeli
2004-03-24 14:37 ` Andrea Arcangeli
2004-03-24 18:42 ` Andrea Arcangeli
2004-03-22 19:57 ` VMA_MERGING_FIXUP and patch Andrew Morton
2004-03-22 20:05 ` Andrea Arcangeli
2004-03-22 20:33 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20040322175216.GJ3649@dualathlon.random \
--to=andrea@suse.de \
--cc=akpm@osdl.org \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox