All of lore.kernel.org
 help / color / mirror / Atom feed
From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: Michael Holzheu <holzheu@linux.vnet.ibm.com>,
	Alexey Dobriyan <adobriyan@gmail.com>
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	"David S. Miller" <davem@davemloft.net>,
	Vivek Goyal <vgoyal@redhat.com>, Jan Willeke <willeke@de.ibm.com>
Subject: Re: mmap for /proc/vmcore broken since 3.12-rc1
Date: Mon, 07 Oct 2013 11:42:45 +0900	[thread overview]
Message-ID: <52521FA5.3040101@jp.fujitsu.com> (raw)
In-Reply-To: <524D0ADF.2010507@jp.fujitsu.com>

(2013/10/03 15:12), HATAYAMA Daisuke wrote:
> (2013/10/02 21:03), Michael Holzheu wrote:
>> Hello Alexey,
>>
>> Looks like the following commit broke mmap for /proc/vmcore:
>>
>> commit c4fe24485729fc2cbff324c111e67a1cc2f9adea
>> Author: Alexey Dobriyan <adobriyan@gmail.com>
>> Date:   Tue Aug 20 22:17:24 2013 +0300
>>
>>      sparc: fix PCI device proc file mmap(2)
>>
>> Because /proc/vmcore (fs/proc/vmcore.c) does not implement the
>> get_unmapped_area() fops function mmap now always returns EIO.
>>
>> Michael
>>
>
> I confirmed the bug on v3.12-rc3. According to makedumpfile's log,
> mmap failed on /proc/vmcore.
>
> mem_map (271)
>    mem_map    : ffffea001da40000
>    pfn_start  : 878000
>    pfn_end    : 880000
> Kernel can't mmap vmcore, using reads.
> STEP [Excluding unnecessary pages] : 1.268799 seconds
> STEP [Excluding unnecessary pages] : 1.268756 seconds
> STEP [Copying data               ] : 44.847924 seconds
> Writing erase info...
>
> I'll post a patch later.
>

I've not completed this. I thought it was short task but after I
tried to fix, makedumpfile became frequently failing with -ENOMEM and
I'm not sure why even now.

Here's current progress.

First, on v3.12-rc3 mmap() on /proc/vmcore fails while returning -EIO.
This is due to the commit c4fe24485729fc2cbff324c111e67a1cc2f9adea,
just as reported by Holzheu, where proc_reg_get_unmapped_area was
newly added to proc_reg_file_ops_no_compat file operations as
get_unmapped_area method. Looking at get_unmapped_area function,
it calls current->mm->get_unmapped_area at default, but calls
f_ops->get_unmapped_area_function if it's assigned.

         get_area = current->mm->get_unmapped_area;
         if (file && file->f_op && file->f_op->get_unmapped_area)
                 get_area = file->f_op->get_unmapped_area;
         addr = get_area(file, addr, len, pgoff, flags);
         if (IS_ERR_VALUE(addr))
                 return addr;

For regular files in procfs, proc_reg_file_ops_no_compat is used
first and then this behaves as wrapper.

static unsigned long proc_reg_get_unmapped_area(struct file *file, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
{
         struct proc_dir_entry *pde = PDE(file_inode(file));
         int rv = -EIO;
         unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
         if (use_pde(pde)) {
                 get_unmapped_area = pde->proc_fops->get_unmapped_area;
                 if (get_unmapped_area)
                         rv = get_unmapped_area(file, orig_addr, len, pgoff, flags);
                 unuse_pde(pde);
         }
         return rv;
}

Since this was added in proc_reg_file_ops_no_compat, proc_reg_get_unmapped_area
is used in get_unmapped_area now and it always returns -EIO since proc_vmcore_operations
has no get_unmapped_area method now.

So, immediate fix idea is to define get_unmapped_area method in proc_vmcore_operations
and to design it so that it just calls current->mm->get_unmapped_area.

---
  fs/proc/vmcore.c | 13 +++++++++++++
  1 file changed, 13 insertions(+)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 9100d69..9583419 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -412,10 +412,23 @@ static int mmap_vmcore(struct file *file, struct vm_area_struct *vma)
  }
  #endif

+static unsigned long
+get_unmapped_area_vmcore(struct file *filp, unsigned long addr,
+                        unsigned long len, unsigned long pgoff,
+                        unsigned long flags)
+{
+#ifdef CONFIG_MMU
+       return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+#else
+       return -EIO;
+#endif
+}
+
  static const struct file_operations proc_vmcore_operations = {
         .read           = read_vmcore,
         .llseek         = default_llseek,
         .mmap           = mmap_vmcore,
+       .get_unmapped_area = get_unmapped_area_vmcore,
  };

  static struct vmcore* __init get_new_element(void)
--
1.8.3.1

However, after applying this patch, makedumpfile now somehow fails returning -ENOMEM
frequently. It's about 50/128 on my box.

Searching for where to return -ENOMEM in mmap path by printk debug, I found instance
of get_unmapped_area returns kernel-space address:

         get_area = current->mm->get_unmapped_area;
         if (file && file->f_op && file->f_op->get_unmapped_area)
                 get_area = file->f_op->get_unmapped_area;
         addr = get_area(file, addr, len, pgoff, flags);
         if (IS_ERR_VALUE(addr))
                 return addr;

         if (addr > TASK_SIZE - len)   <---- Here
                 return -ENOMEM;

The log is:

kdump:/# cd /mnt/
kdump:/mnt# for ((i=0; i<128; ++i)) ; do
> makedumpfile -f -p -d 31 /proc/vmcore vmcore-pd31
> done
The kernel version is not supported.
The created dumpfile may be incomplete.
cyclic buffer size has been changed: 65535 => 64512
[   49.462536] addr: 0xffffffff8ef28000
[   49.463686] TASK_SIZE: 0x007ffffffff000
[   49.464952] len: 0x00000000400000

Note that makedumpfile tries to mmap some area in 4MiB size here,
get_unmapped_area tries to find some area to cover such requested 4MiB size
within user-space address space limit but it returns somehow kernel-space
address.

The actual instance of current->mm->get_unmapped_area on my environment
is arch_get_unmapped_area_topdown.

Finally, note: I tried to run makedumpfile using mmap on /proc/vmcore 128-times. Then,
- v3.12-rc3 returns -EIO in every case (trivial),
- v3.12-rc3 with the above patch applied returns -ENOMEM at 50/128,
- v3.12-rc3 with commit c4fe24485729fc2cbff324c111e67a1cc2f9adea reverted
   works well in every case, and
- v3.11.1-201.fc19.x86_64 works well in every case.

So, I suspect procfs wrapper affects arch_get_unmapped_region_topdown?

Any comments are helpful.

-- 
Thanks.
HATAYAMA, Daisuke


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

WARNING: multiple messages have this Message-ID (diff)
From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: Michael Holzheu <holzheu@linux.vnet.ibm.com>,
	Alexey Dobriyan <adobriyan@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Vivek Goyal <vgoyal@redhat.com>, Jan Willeke <willeke@de.ibm.com>,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org
Subject: Re: mmap for /proc/vmcore broken since 3.12-rc1
Date: Mon, 07 Oct 2013 11:42:45 +0900	[thread overview]
Message-ID: <52521FA5.3040101@jp.fujitsu.com> (raw)
In-Reply-To: <524D0ADF.2010507@jp.fujitsu.com>

(2013/10/03 15:12), HATAYAMA Daisuke wrote:
> (2013/10/02 21:03), Michael Holzheu wrote:
>> Hello Alexey,
>>
>> Looks like the following commit broke mmap for /proc/vmcore:
>>
>> commit c4fe24485729fc2cbff324c111e67a1cc2f9adea
>> Author: Alexey Dobriyan <adobriyan@gmail.com>
>> Date:   Tue Aug 20 22:17:24 2013 +0300
>>
>>      sparc: fix PCI device proc file mmap(2)
>>
>> Because /proc/vmcore (fs/proc/vmcore.c) does not implement the
>> get_unmapped_area() fops function mmap now always returns EIO.
>>
>> Michael
>>
>
> I confirmed the bug on v3.12-rc3. According to makedumpfile's log,
> mmap failed on /proc/vmcore.
>
> mem_map (271)
>    mem_map    : ffffea001da40000
>    pfn_start  : 878000
>    pfn_end    : 880000
> Kernel can't mmap vmcore, using reads.
> STEP [Excluding unnecessary pages] : 1.268799 seconds
> STEP [Excluding unnecessary pages] : 1.268756 seconds
> STEP [Copying data               ] : 44.847924 seconds
> Writing erase info...
>
> I'll post a patch later.
>

I've not completed this. I thought it was short task but after I
tried to fix, makedumpfile became frequently failing with -ENOMEM and
I'm not sure why even now.

Here's current progress.

First, on v3.12-rc3 mmap() on /proc/vmcore fails while returning -EIO.
This is due to the commit c4fe24485729fc2cbff324c111e67a1cc2f9adea,
just as reported by Holzheu, where proc_reg_get_unmapped_area was
newly added to proc_reg_file_ops_no_compat file operations as
get_unmapped_area method. Looking at get_unmapped_area function,
it calls current->mm->get_unmapped_area at default, but calls
f_ops->get_unmapped_area_function if it's assigned.

         get_area = current->mm->get_unmapped_area;
         if (file && file->f_op && file->f_op->get_unmapped_area)
                 get_area = file->f_op->get_unmapped_area;
         addr = get_area(file, addr, len, pgoff, flags);
         if (IS_ERR_VALUE(addr))
                 return addr;

For regular files in procfs, proc_reg_file_ops_no_compat is used
first and then this behaves as wrapper.

static unsigned long proc_reg_get_unmapped_area(struct file *file, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
{
         struct proc_dir_entry *pde = PDE(file_inode(file));
         int rv = -EIO;
         unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
         if (use_pde(pde)) {
                 get_unmapped_area = pde->proc_fops->get_unmapped_area;
                 if (get_unmapped_area)
                         rv = get_unmapped_area(file, orig_addr, len, pgoff, flags);
                 unuse_pde(pde);
         }
         return rv;
}

Since this was added in proc_reg_file_ops_no_compat, proc_reg_get_unmapped_area
is used in get_unmapped_area now and it always returns -EIO since proc_vmcore_operations
has no get_unmapped_area method now.

So, immediate fix idea is to define get_unmapped_area method in proc_vmcore_operations
and to design it so that it just calls current->mm->get_unmapped_area.

---
  fs/proc/vmcore.c | 13 +++++++++++++
  1 file changed, 13 insertions(+)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 9100d69..9583419 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -412,10 +412,23 @@ static int mmap_vmcore(struct file *file, struct vm_area_struct *vma)
  }
  #endif

+static unsigned long
+get_unmapped_area_vmcore(struct file *filp, unsigned long addr,
+                        unsigned long len, unsigned long pgoff,
+                        unsigned long flags)
+{
+#ifdef CONFIG_MMU
+       return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+#else
+       return -EIO;
+#endif
+}
+
  static const struct file_operations proc_vmcore_operations = {
         .read           = read_vmcore,
         .llseek         = default_llseek,
         .mmap           = mmap_vmcore,
+       .get_unmapped_area = get_unmapped_area_vmcore,
  };

  static struct vmcore* __init get_new_element(void)
--
1.8.3.1

However, after applying this patch, makedumpfile now somehow fails returning -ENOMEM
frequently. It's about 50/128 on my box.

Searching for where to return -ENOMEM in mmap path by printk debug, I found instance
of get_unmapped_area returns kernel-space address:

         get_area = current->mm->get_unmapped_area;
         if (file && file->f_op && file->f_op->get_unmapped_area)
                 get_area = file->f_op->get_unmapped_area;
         addr = get_area(file, addr, len, pgoff, flags);
         if (IS_ERR_VALUE(addr))
                 return addr;

         if (addr > TASK_SIZE - len)   <---- Here
                 return -ENOMEM;

The log is:

kdump:/# cd /mnt/
kdump:/mnt# for ((i=0; i<128; ++i)) ; do
> makedumpfile -f -p -d 31 /proc/vmcore vmcore-pd31
> done
The kernel version is not supported.
The created dumpfile may be incomplete.
cyclic buffer size has been changed: 65535 => 64512
[   49.462536] addr: 0xffffffff8ef28000
[   49.463686] TASK_SIZE: 0x007ffffffff000
[   49.464952] len: 0x00000000400000

Note that makedumpfile tries to mmap some area in 4MiB size here,
get_unmapped_area tries to find some area to cover such requested 4MiB size
within user-space address space limit but it returns somehow kernel-space
address.

The actual instance of current->mm->get_unmapped_area on my environment
is arch_get_unmapped_area_topdown.

Finally, note: I tried to run makedumpfile using mmap on /proc/vmcore 128-times. Then,
- v3.12-rc3 returns -EIO in every case (trivial),
- v3.12-rc3 with the above patch applied returns -ENOMEM at 50/128,
- v3.12-rc3 with commit c4fe24485729fc2cbff324c111e67a1cc2f9adea reverted
   works well in every case, and
- v3.11.1-201.fc19.x86_64 works well in every case.

So, I suspect procfs wrapper affects arch_get_unmapped_region_topdown?

Any comments are helpful.

-- 
Thanks.
HATAYAMA, Daisuke


  reply	other threads:[~2013-10-07  2:46 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-02 12:03 mmap for /proc/vmcore broken since 3.12-rc1 Michael Holzheu
2013-10-02 12:03 ` Michael Holzheu
2013-10-03  6:12 ` HATAYAMA Daisuke
2013-10-03  6:12   ` HATAYAMA Daisuke
2013-10-07  2:42   ` HATAYAMA Daisuke [this message]
2013-10-07  2:42     ` HATAYAMA Daisuke
2013-10-08 12:49     ` Alexey Dobriyan
2013-10-08 12:49       ` Alexey Dobriyan
2013-10-09 10:14       ` HATAYAMA Daisuke
2013-10-09 10:14         ` HATAYAMA Daisuke
2013-10-12 20:32         ` Alexey Dobriyan
2013-10-12 20:32           ` Alexey Dobriyan
2013-10-14  4:52           ` HATAYAMA Daisuke
2013-10-14  4:52             ` HATAYAMA Daisuke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52521FA5.3040101@jp.fujitsu.com \
    --to=d.hatayama@jp.fujitsu.com \
    --cc=adobriyan@gmail.com \
    --cc=davem@davemloft.net \
    --cc=holzheu@linux.vnet.ibm.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vgoyal@redhat.com \
    --cc=willeke@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.