All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wen Congyang <wency@cn.fujitsu.com>
To: Stefan Hajnoczi <stefanha@gmail.com>, David Weber <wb@munzinger.de>
Cc: linux-ext4@vger.kernel.org, qemu-devel@nongnu.org
Subject: Re: Strange problems with lseek in qemu-img map
Date: Thu, 4 Jun 2015 11:52:02 +0800	[thread overview]
Message-ID: <556FCB62.80200@cn.fujitsu.com> (raw)
In-Reply-To: <20150603140607.GC20862@stefanha-thinkpad.redhat.com>

Cc: ext4 maillist

On 06/03/2015 10:06 PM, Stefan Hajnoczi wrote:
> On Tue, Jun 02, 2015 at 02:54:17PM +0200, David Weber wrote:
>> Testcase:
>> # qemu-img create test 500G
>> # time qemu-img map test
>>
>> Systems:
>> O3-3: Kubuntu 15.04 Workstation with stock-kernel 3.19.0-18-generic and stock 
>> qemu 2.2.0
>> Dinah: Ubuntu Server 15.04 with stock-kernel 3.19.0-18-generic and stock qemu 
>> 2.2.0
> 
> These systems have the same kernel but for some reason O3-3 completes
> quickly while Dinah takes a long time in lseek(fd, offset, SEEK_DATA).
> It looks like the file is empty (the syscall keeps returning ENXIO
> because there are no allocated blocks in the file where qemu-img
> probes).
> 
>> Result on O3-3:
>> root@o3-3:~# qemu-img create test 500G
>> Formatting 'test', fmt=raw size=536870912000 
>> root@o3-3:~# time qemu-img map test
>> Offset          Length          Mapped to       File
>>
>> real    0m0.049s
>> user    0m0.048s
>> sys     0m0.000s
>>
>> Result on dinah:
>> root@dinah:~# qemu-img create test 500G
>> Formatting 'test', fmt=raw size=536870912000 
>> root@dinah:~# time qemu-img map test
>> Offset          Length          Mapped to       File
>> ^C
>>
>> real    0m41.862s
>> user    0m0.004s
>> sys     0m0.068s
>> (Stopped with ^C)
>>
>> Strace on O3-3:
>> https://gist.github.com/anonymous/f221035e9176f7c71c74
>>
>> Strace on dinah:
>> https://gist.github.com/anonymous/40b42888a65478c90b32
>>
>> A git bisect between 1.7 and master revealed 
>> 7c15903789953ead14a417882657d52dc0c19a24 "block/raw-posix: use seek_hole ahead 
>> of fiemap" as bad but this is not the real problem.
>> I also tried to switch from btrfs to ext4 but it didn't change anything.
>>
>> At this point, I was pretty sure that was just stupit and missing something 
>> trivial.
>> I then startet a fedora 22 live system and I saw the same problem. It happens 
>> on both the ramdisk and a ext4 filesystem.
> 
> "it" == qemu-img map hangs or takes a very long time?
> 
> Can you post a shell script that reproduces this with a ramdisk?  That
> seems like the easiest way to get people debugging it.

I think it is ext4's problem. I add some printk in ext4_seek_data():
[  335.579506] ext4_seek_data(): isize: 7d00000000, offset: 0, maxsize: ffffffff000
[  335.579512] ext4_seek_data(): blkbits: 12, start: 0, end: 7d00000
[  340.672400] ext4_seek_data(): loop count: 131072001
[  340.672402] ext4_seek_data() returns -ENXIO
[  340.672447] ext4_seek_data(): isize: 7d00000000, offset: 40000000, maxsize: ffffffff000
[  340.672449] ext4_seek_data(): blkbits: 12, start: 40000, end: 7d00000
[  345.701852] ext4_seek_data(): loop count: 130809857
[  345.701853] ext4_seek_data() returns -ENXIO
[  345.701891] ext4_seek_data(): isize: 7d00000000, offset: 80000000, maxsize: ffffffff000
[  345.701893] ext4_seek_data(): blkbits: 12, start: 80000, end: 7d00000
[  350.718479] ext4_seek_data(): loop count: 130547713
[  350.718480] ext4_seek_data() returns -ENXIO
[  350.718507] ext4_seek_data(): isize: 7d00000000, offset: c0000000, maxsize: ffffffff000
[  350.718508] ext4_seek_data(): blkbits: 12, start: c0000, end: 7d00000
[  355.729692] ext4_seek_data(): loop count: 130285569
[  355.729693] ext4_seek_data() returns -ENXIO
[  355.729732] ext4_seek_data(): isize: 7d00000000, offset: 100000000, maxsize: ffffffff000
[  355.729734] ext4_seek_data(): blkbits: 12, start: 100000, end: 7d00000
[  360.728206] ext4_seek_data(): loop count: 130023425
[  360.728207] ext4_seek_data() returns -ENXIO

The diff:
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 0613c25..9b334cc 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -453,12 +453,16 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
        loff_t dataoff, isize;
        int blkbits;
        int ret = 0;
+       unsigned long count = 0;
 
        mutex_lock(&inode->i_mutex);
 
        isize = i_size_read(inode);
+       pr_info("%s(): isize: %llx, offset: %llx, maxsize: %llx\n",
+               __func__, isize, offset, maxsize);
        if (offset >= isize) {
                mutex_unlock(&inode->i_mutex);
+               pr_info("%s() returns -ENXIO(offset is too large)\n", __func__);
                return -ENXIO;
        }
 
@@ -467,8 +471,11 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
        last = start;
        end = isize >> blkbits;
        dataoff = offset;
+       pr_info("%s(): blkbits: %d, start: %x, end: %x\n",
+               __func__, blkbits, start, end);
 
        do {
+               count++;
                map.m_lblk = last;
                map.m_len = end - last + 1;
                ret = ext4_map_blocks(NULL, inode, &map, 0);
@@ -508,8 +515,12 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
 
        mutex_unlock(&inode->i_mutex);
 
-       if (dataoff > isize)
+       pr_info("%s(): loop count: %ld\n", __func__, count);
+
+       if (dataoff > isize) {
+               pr_info("%s() returns -ENXIO\n", __func__);
                return -ENXIO;
+       }
 
        return vfs_setpos(file, dataoff, maxsize);
 }

We will call the syscall lseek(..., SEEK_DATA) 500 times.

Thanks
Wen Congyang

> 
> Stefan
> 

WARNING: multiple messages have this Message-ID (diff)
From: Wen Congyang <wency@cn.fujitsu.com>
To: Stefan Hajnoczi <stefanha@gmail.com>, David Weber <wb@munzinger.de>
Cc: linux-ext4@vger.kernel.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Strange problems with lseek in qemu-img map
Date: Thu, 4 Jun 2015 11:52:02 +0800	[thread overview]
Message-ID: <556FCB62.80200@cn.fujitsu.com> (raw)
In-Reply-To: <20150603140607.GC20862@stefanha-thinkpad.redhat.com>

Cc: ext4 maillist

On 06/03/2015 10:06 PM, Stefan Hajnoczi wrote:
> On Tue, Jun 02, 2015 at 02:54:17PM +0200, David Weber wrote:
>> Testcase:
>> # qemu-img create test 500G
>> # time qemu-img map test
>>
>> Systems:
>> O3-3: Kubuntu 15.04 Workstation with stock-kernel 3.19.0-18-generic and stock 
>> qemu 2.2.0
>> Dinah: Ubuntu Server 15.04 with stock-kernel 3.19.0-18-generic and stock qemu 
>> 2.2.0
> 
> These systems have the same kernel but for some reason O3-3 completes
> quickly while Dinah takes a long time in lseek(fd, offset, SEEK_DATA).
> It looks like the file is empty (the syscall keeps returning ENXIO
> because there are no allocated blocks in the file where qemu-img
> probes).
> 
>> Result on O3-3:
>> root@o3-3:~# qemu-img create test 500G
>> Formatting 'test', fmt=raw size=536870912000 
>> root@o3-3:~# time qemu-img map test
>> Offset          Length          Mapped to       File
>>
>> real    0m0.049s
>> user    0m0.048s
>> sys     0m0.000s
>>
>> Result on dinah:
>> root@dinah:~# qemu-img create test 500G
>> Formatting 'test', fmt=raw size=536870912000 
>> root@dinah:~# time qemu-img map test
>> Offset          Length          Mapped to       File
>> ^C
>>
>> real    0m41.862s
>> user    0m0.004s
>> sys     0m0.068s
>> (Stopped with ^C)
>>
>> Strace on O3-3:
>> https://gist.github.com/anonymous/f221035e9176f7c71c74
>>
>> Strace on dinah:
>> https://gist.github.com/anonymous/40b42888a65478c90b32
>>
>> A git bisect between 1.7 and master revealed 
>> 7c15903789953ead14a417882657d52dc0c19a24 "block/raw-posix: use seek_hole ahead 
>> of fiemap" as bad but this is not the real problem.
>> I also tried to switch from btrfs to ext4 but it didn't change anything.
>>
>> At this point, I was pretty sure that was just stupit and missing something 
>> trivial.
>> I then startet a fedora 22 live system and I saw the same problem. It happens 
>> on both the ramdisk and a ext4 filesystem.
> 
> "it" == qemu-img map hangs or takes a very long time?
> 
> Can you post a shell script that reproduces this with a ramdisk?  That
> seems like the easiest way to get people debugging it.

I think it is ext4's problem. I add some printk in ext4_seek_data():
[  335.579506] ext4_seek_data(): isize: 7d00000000, offset: 0, maxsize: ffffffff000
[  335.579512] ext4_seek_data(): blkbits: 12, start: 0, end: 7d00000
[  340.672400] ext4_seek_data(): loop count: 131072001
[  340.672402] ext4_seek_data() returns -ENXIO
[  340.672447] ext4_seek_data(): isize: 7d00000000, offset: 40000000, maxsize: ffffffff000
[  340.672449] ext4_seek_data(): blkbits: 12, start: 40000, end: 7d00000
[  345.701852] ext4_seek_data(): loop count: 130809857
[  345.701853] ext4_seek_data() returns -ENXIO
[  345.701891] ext4_seek_data(): isize: 7d00000000, offset: 80000000, maxsize: ffffffff000
[  345.701893] ext4_seek_data(): blkbits: 12, start: 80000, end: 7d00000
[  350.718479] ext4_seek_data(): loop count: 130547713
[  350.718480] ext4_seek_data() returns -ENXIO
[  350.718507] ext4_seek_data(): isize: 7d00000000, offset: c0000000, maxsize: ffffffff000
[  350.718508] ext4_seek_data(): blkbits: 12, start: c0000, end: 7d00000
[  355.729692] ext4_seek_data(): loop count: 130285569
[  355.729693] ext4_seek_data() returns -ENXIO
[  355.729732] ext4_seek_data(): isize: 7d00000000, offset: 100000000, maxsize: ffffffff000
[  355.729734] ext4_seek_data(): blkbits: 12, start: 100000, end: 7d00000
[  360.728206] ext4_seek_data(): loop count: 130023425
[  360.728207] ext4_seek_data() returns -ENXIO

The diff:
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 0613c25..9b334cc 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -453,12 +453,16 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
        loff_t dataoff, isize;
        int blkbits;
        int ret = 0;
+       unsigned long count = 0;
 
        mutex_lock(&inode->i_mutex);
 
        isize = i_size_read(inode);
+       pr_info("%s(): isize: %llx, offset: %llx, maxsize: %llx\n",
+               __func__, isize, offset, maxsize);
        if (offset >= isize) {
                mutex_unlock(&inode->i_mutex);
+               pr_info("%s() returns -ENXIO(offset is too large)\n", __func__);
                return -ENXIO;
        }
 
@@ -467,8 +471,11 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
        last = start;
        end = isize >> blkbits;
        dataoff = offset;
+       pr_info("%s(): blkbits: %d, start: %x, end: %x\n",
+               __func__, blkbits, start, end);
 
        do {
+               count++;
                map.m_lblk = last;
                map.m_len = end - last + 1;
                ret = ext4_map_blocks(NULL, inode, &map, 0);
@@ -508,8 +515,12 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize)
 
        mutex_unlock(&inode->i_mutex);
 
-       if (dataoff > isize)
+       pr_info("%s(): loop count: %ld\n", __func__, count);
+
+       if (dataoff > isize) {
+               pr_info("%s() returns -ENXIO\n", __func__);
                return -ENXIO;
+       }
 
        return vfs_setpos(file, dataoff, maxsize);
 }

We will call the syscall lseek(..., SEEK_DATA) 500 times.

Thanks
Wen Congyang

> 
> Stefan
> 

  reply	other threads:[~2015-06-04  3:52 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-02 12:54 [Qemu-devel] Strange problems with lseek in qemu-img map David Weber
2015-06-03 14:06 ` Stefan Hajnoczi
2015-06-04  3:52   ` Wen Congyang [this message]
2015-06-04  3:52     ` Wen Congyang
2015-06-04  1:19 ` Wen Congyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556FCB62.80200@cn.fujitsu.com \
    --to=wency@cn.fujitsu.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=wb@munzinger.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.