linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@linux.intel.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 05/22] Add vm_replace_mixed()
Date: Fri, 25 Jul 2014 15:44:50 -0400	[thread overview]
Message-ID: <20140725194450.GJ6754@linux.intel.com> (raw)
In-Reply-To: <20140723155500.GA12790@node.dhcp.inet.fi>

[-- Attachment #1: Type: text/plain, Size: 5869 bytes --]

On Wed, Jul 23, 2014 at 06:55:00PM +0300, Kirill A. Shutemov wrote:
> >         update_hiwater_rss(mm);
> 
> No: you cannot end up with lower rss after replace, iiuc.

Actually, you can ... when we replace a real page with a PFN, our rss
decreases.

> Do you mean you pointed to new file all the time? O_CREAT doesn't truncate
> file if it exists, iirc.

It was pointing to a new file.  Still not sure why that one failed to trigger
the problem.  The slightly modified version attached triggered the problem
*just fine* :-)

I've attached all the patches in my tree so far.  For the v9 patch kit,
I'll keep patch 3 as a separate patch, but roll patches 1, 2 and 4 into
other patches.

I am seeing something odd though.  When I run double-map with debugging
printks inserted in strategic spots in the kernel, I see four calls to
do_dax_fault().  The first two, as expected, are the loads from the two
mapped addresses.  The third is via mkwrite, but then the fourth time
I get a regular page fault for write, and I don't understand why I get it.

Any ideas?


[  880.966632] do_dax_fault: fault a8 page =           (null) bh state 0 written
 0 addr 7ff598835000
[  880.966637] dax_load_hole: page = ffffea0002784730
[  882.114387] do_dax_fault: fault a8 page = ffffea0002784730 bh state 0 written 0 addr 7ff598834000
[  882.114389] dax_load_hole: page = ffffea0002784730
[  882.780013] do_dax_fault: fault 5 page = ffffea0002784730 bh state 0 written 0 addr 7ff598835000
[  882.780095] insert_pfn: pte = 8000000108200225
[  882.780096] do_dax_fault: page = ffffea0002784730 pfn = 108200 error = 0
[  882.780098] CPU: 1 PID: 1511 Comm: double-map Not tainted 3.16.0-rc6+ #89
[  882.780099] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F6 08/03/2013
[  882.780100]  0000000000000000 ffff8800b41e3ba0 ffffffff815c6e73 ffffea0002784730
[  882.780102]  ffff8800b41e3c88 ffffffff8124c319 00007ff598835000 ffff8800b2436ac0
[  882.780104]  0000000000000005 ffffffffa01e3020 0000000000000004 ffff880000000005
[  882.780106] Call Trace:
[  882.780110]  [<ffffffff815c6e73>] dump_stack+0x4d/0x66
[  882.780114]  [<ffffffff8124c319>] do_dax_fault+0x569/0x6f0
[  882.780133]  [<ffffffff8124c4df>] dax_fault+0x3f/0x90
[  882.780136]  [<ffffffff81023b2e>] ? native_sched_clock+0x2e/0xb0
[  882.780137]  [<ffffffff8124c53e>] dax_mkwrite+0xe/0x10
[  882.780143]  [<ffffffffa01db955>] ext4_dax_mkwrite+0x15/0x20 [ext4]
[  882.780146]  [<ffffffff811ab627>] do_page_mkwrite+0x47/0xb0
[  882.780148]  [<ffffffff811ad7f2>] do_wp_page+0x6e2/0x990
[  882.780150]  [<ffffffff811aff1b>] handle_mm_fault+0x6ab/0xf70
[  882.780154]  [<ffffffff81062d2c>] __do_page_fault+0x1ec/0x5b0
[  882.780163]  [<ffffffff81063112>] do_page_fault+0x22/0x30
[  882.780165]  [<ffffffff815d1b78>] page_fault+0x28/0x30
[  882.780204] do_dax_fault: fault a9 page =           (null) bh state 20 written 1 addr 7ff598835000
[  882.780206] insert_pfn: pte = 8000000108200225
[  882.780207] do_dax_fault: page =           (null) pfn = 108200 error = 0
[  882.780208] CPU: 1 PID: 1511 Comm: double-map Not tainted 3.16.0-rc6+ #89
[  882.780208] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F6 08/03/2013
[  882.780209]  0000000000000000 ffff8800b41e3bc0 ffffffff815c6e73 0000000000000000
[  882.780211]  ffff8800b41e3ca8 ffffffff8124c319 00007ff598835000 ffff8800b41e3c08
[  882.780213]  ffff8800a2a60608 ffffffffa01e3020 0000000000000000 ffff8800000000a9
[  882.780214] Call Trace:
[  882.780216]  [<ffffffff815c6e73>] dump_stack+0x4d/0x66
[  882.780218]  [<ffffffff8124c319>] do_dax_fault+0x569/0x6f0
[  882.780232]  [<ffffffff8124c4df>] dax_fault+0x3f/0x90
[  882.780238]  [<ffffffffa01db975>] ext4_dax_fault+0x15/0x20 [ext4]
[  882.780240]  [<ffffffff811ab6d1>] __do_fault+0x41/0xd0
[  882.780241]  [<ffffffff811ae8a5>] do_shared_fault.isra.56+0x35/0x220
[  882.780243]  [<ffffffff811afb73>] handle_mm_fault+0x303/0xf70
[  882.780246]  [<ffffffff81062d2c>] __do_page_fault+0x1ec/0x5b0
[  882.780254]  [<ffffffff81063112>] do_page_fault+0x22/0x30
[  882.780255]  [<ffffffff815d1b78>] page_fault+0x28/0x30


diff --git a/fs/dax.c b/fs/dax.c
index b4fdfd9..4b0f928 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -257,6 +257,7 @@ static int dax_load_hole(struct address_space *mapping, struct page *page,
 	if (!page)
 		page = find_or_create_page(mapping, vmf->pgoff,
 						GFP_KERNEL | __GFP_ZERO);
+printk("%s: page = %p\n", __func__, page);
 	if (!page)
 		return VM_FAULT_OOM;
 	/* Recheck i_size under page lock to avoid truncate race */
@@ -332,6 +333,7 @@ static int do_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
 	if (error || bh.b_size < PAGE_SIZE)
 		goto sigbus;
 
+printk("%s: fault %x page = %p bh state %lx written %d addr %lx\n", __func__, vmf->flags, page, bh.b_state, buffer_written(&bh), vaddr);
 	if (!buffer_written(&bh) && !vmf->cow_page) {
 		if (vmf->flags & FAULT_FLAG_WRITE) {
 			error = get_block(inode, block, &bh, 1);
@@ -372,6 +374,8 @@ static int do_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
 	error = dax_get_pfn(&bh, &pfn, blkbits);
 	if (error > 0)
 		error = vm_replace_mixed(vma, vaddr, pfn);
+printk("%s: page = %p pfn = %lx error = %d\n", __func__, page, pfn, error);
+if ((vmf->flags & FAULT_FLAG_WRITE) || !(vmf->flags & FAULT_FLAG_USER)) dump_stack();
 
 	if (!page) {
 		mutex_unlock(&mapping->i_mmap_mutex);
diff --git a/mm/memory.c b/mm/memory.c
index a8e17ce..189716c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1614,6 +1614,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 	/* Ok, finally just insert the thing.. */
 	entry = pte_mkspecial(pfn_pte(pfn, prot));
 	set_pte_at(mm, addr, pte, entry);
+printk("%s: pte = %llx\n", __func__, pte_val(entry));
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
 
 	retval = 0;

[-- Attachment #2: 0001-dax-Only-unlock-the-i_mmap_mutex-if-we-locked-it.patch --]
[-- Type: text/x-diff, Size: 0 bytes --]



  parent reply	other threads:[~2014-07-25 19:45 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-22 19:47 [PATCH v8 00/22] Support ext4 on NV-DIMMs Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 01/22] Fix XIP fault vs truncate race Matthew Wilcox
2014-07-23 11:21   ` Kirill A. Shutemov
2014-07-22 19:47 ` [PATCH v8 02/22] Allow page fault handlers to perform the COW Matthew Wilcox
2014-07-23 11:23   ` Kirill A. Shutemov
2014-07-22 19:47 ` [PATCH v8 03/22] axonram: Fix bug in direct_access Matthew Wilcox
2014-07-23 11:24   ` Kirill A. Shutemov
2014-07-22 19:47 ` [PATCH v8 04/22] Change direct_access calling convention Matthew Wilcox
2014-07-30 16:03   ` Boaz Harrosh
2014-07-30 16:12     ` Boaz Harrosh
2014-07-30 20:34       ` Matthew Wilcox
2014-07-31 10:16         ` Boaz Harrosh
2014-07-30 19:45     ` Matthew Wilcox
2014-07-31 10:11       ` Boaz Harrosh
2014-07-31 14:13         ` Matthew Wilcox
2014-07-31 15:28           ` Boaz Harrosh
2014-07-31 17:19             ` Matthew Wilcox
2014-07-31 18:04               ` Boaz Harrosh
2014-07-31 20:30                 ` Zwisler, Ross
2014-08-01 18:45                   ` Zwisler, Ross
2014-07-22 19:47 ` [PATCH v8 05/22] Add vm_replace_mixed() Matthew Wilcox
2014-07-23  9:10   ` Jan Kara
2014-07-23 11:45   ` Kirill A. Shutemov
2014-07-23 13:52     ` Matthew Wilcox
2014-07-23 14:20       ` Kirill A. Shutemov
2014-07-23 14:27         ` Matthew Wilcox
2014-07-23 15:55           ` Kirill A. Shutemov
2014-07-24  1:36             ` Zhang, Tianfei
2014-07-25 19:44             ` Matthew Wilcox [this message]
2014-07-28 13:25               ` Kirill A. Shutemov
2014-07-29  1:55                 ` Zhang, Tianfei
2014-07-22 19:47 ` [PATCH v8 06/22] Introduce IS_DAX(inode) Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 07/22] Add copy_to_iter(), copy_from_iter() and iov_iter_zero() Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 08/22] Replace XIP read and write with DAX I/O Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 09/22] Replace ext2_clear_xip_target with dax_clear_blocks Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 10/22] Replace the XIP page fault handler with the DAX page fault handler Matthew Wilcox
2014-07-23 12:10   ` Kirill A. Shutemov
2014-07-23 13:55     ` Matthew Wilcox
2014-07-23 16:57   ` Boaz Harrosh
2014-07-23 19:57     ` Matthew Wilcox
2014-07-22 19:47 ` [PATCH v8 11/22] Replace xip_truncate_page with dax_truncate_page Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 12/22] Replace XIP documentation with DAX documentation Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 13/22] Remove get_xip_mem Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 14/22] ext2: Remove ext2_xip_verify_sb() Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 15/22] ext2: Remove ext2_use_xip Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 16/22] ext2: Remove xip.c and xip.h Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 17/22] Remove CONFIG_EXT2_FS_XIP and rename CONFIG_FS_XIP to CONFIG_FS_DAX Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 18/22] ext2: Remove ext2_aops_xip Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 19/22] Get rid of most mentions of XIP in ext2 Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 20/22] xip: Add xip_zero_page_range Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 21/22] ext4: Add DAX functionality Matthew Wilcox
2014-07-22 19:48 ` [PATCH v8 22/22] brd: Rename XIP to DAX Matthew Wilcox
2014-07-23 12:30 ` [PATCH v8 00/22] Support ext4 on NV-DIMMs Kirill A. Shutemov
2014-07-23 13:59   ` Matthew Wilcox
2014-07-23 15:58 ` Boaz Harrosh
2014-07-23 19:50   ` Matthew Wilcox
2014-07-24 18:51     ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140725194450.GJ6754@linux.intel.com \
    --to=willy@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).