Flexible I/O Tester development
* I/O is issued twice at scsi level
@ 2012-12-01  8:31 Hiroyuki Yamada
  2012-12-01 14:26 ` Georg Schönberger
  0 siblings, 1 reply; 8+ messages in thread
From: Hiroyuki Yamada @ 2012-12-01  8:31 UTC (permalink / raw)
  To: fio

Hi,

I am using fio for benchmarking random read IOPS of files.
(Test configuration is listed at the bottom.)

I traced the I/Os issued by fio with systemtap and
noticed that the number of I/Os at the SCSI level is twice the
number of I/Os at the VFS level.
The I/O size is 4KB at both the SCSI level and the VFS level, so the
measured performance is simply halved.
I also tried other benchmarking tools and the same issue happened,
so it is not a fio-specific issue.
Still, I am wondering whether any of you knows the reason or has some hints.


Test configuration.
=================
ioengine=psync
rw=randread
numjobs=1
blocksize=4096
filename=file_morethan_100G
thread
runtime=60
randrepeat=0
=================
(I drop the page cache before every measurement.)
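
The thread does not say how the page cache was cleared. A minimal sketch of one common way to do it on Linux, assuming root privileges, is below. The helper name drop_page_cache and its path parameter are illustrative and not part of fio.

```python
import os


def drop_page_cache(path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the kernel to drop its clean caches.

    Writing "3" drops the page cache plus dentries and inodes.
    Root is required when `path` is the real procfs file.
    """
    os.sync()  # write back dirty pages first so they become droppable
    with open(path, "w") as f:
        f.write("3\n")


# Typical use before each benchmark run (as root):
#     drop_page_cache()
```

The path is a parameter only so the helper can be exercised against a scratch file without root.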


Thanks,
Hiroyuki


* Re: I/O is issued twice at scsi level
  2012-12-01  8:31 I/O is issued twice at scsi level Hiroyuki Yamada
@ 2012-12-01 14:26 ` Georg Schönberger
  2012-12-01 14:51   ` Hiroyuki Yamada
  0 siblings, 1 reply; 8+ messages in thread
From: Georg Schönberger @ 2012-12-01 14:26 UTC (permalink / raw)
  To: Hiroyuki Yamada; +Cc: fio

This is very interesting, as I am currently investigating a 50% gap in 4k random read IOPS for the same device (a SCSI SSD) on two different systems, one running Ubuntu 12.04 and one running CentOS.

Can you provide some more information about your platform?

Thanks, Georg


* Re: I/O is issued twice at scsi level
  2012-12-01 14:26 ` Georg Schönberger
@ 2012-12-01 14:51   ` Hiroyuki Yamada
  2012-12-02  9:23     ` Hiroyuki Yamada
  0 siblings, 1 reply; 8+ messages in thread
From: Hiroyuki Yamada @ 2012-12-01 14:51 UTC (permalink / raw)
  To: Georg Schönberger; +Cc: fio

Hi Georg,

I am using CentOS 5.7 and 5.8,
with an ext3 filesystem on LVM.
The issue also happens without LVM, so I think LVM is not the cause.

I changed the I/O size at the application level to 16KB; then
both a 16KB I/O and a 4KB I/O are issued at the SCSI level, as follows.
(SYSPREAD is the application-level I/O and SCSI is the SCSI I/O dispatch,
both as reported by systemtap.)

=============================================
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 128137183232
SCSI random(8472) 0 1 0 0 start-sector: 226321183 size: 4096 bufflen 4096 FROM_DEVICE 1354354008068009
SCSI random(8472) 0 1 0 0 start-sector: 226323431 size: 16384 bufflen 16384 FROM_DEVICE 1354354008075927
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 21807710208
SCSI random(8472) 0 1 0 0 start-sector: 1889888935 size: 4096 bufflen 4096 FROM_DEVICE 1354354008085128
SCSI random(8472) 0 1 0 0 start-sector: 1889891823 size: 16384 bufflen 16384 FROM_DEVICE 1354354008097161
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 139365318656
SCSI random(8472) 0 1 0 0 start-sector: 254092663 size: 4096 bufflen 4096 FROM_DEVICE 1354354008100633
SCSI random(8472) 0 1 0 0 start-sector: 254094879 size: 16384 bufflen 16384 FROM_DEVICE 1354354008111723
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 60304424960
SCSI random(8472) 0 1 0 0 start-sector: 58119807 size: 4096 bufflen 4096 FROM_DEVICE 1354354008120469
SCSI random(8472) 0 1 0 0 start-sector: 58125415 size: 16384 bufflen 16384 FROM_DEVICE 1354354008126343
=============================================

Do you have any idea what's going on?
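
One way to read the trace is to pair the two SCSI reads that follow each pread and look at how far apart they start. The sketch below does that for the lines quoted in this message. The regular expression and the helper name sector_gaps are illustrative and assume the systemtap output format shown here.

```python
import re

# Trace lines copied from this message, with wrapped lines rejoined.
TRACE = """\
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 128137183232
SCSI random(8472) 0 1 0 0 start-sector: 226321183 size: 4096 bufflen 4096 FROM_DEVICE 1354354008068009
SCSI random(8472) 0 1 0 0 start-sector: 226323431 size: 16384 bufflen 16384 FROM_DEVICE 1354354008075927
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 21807710208
SCSI random(8472) 0 1 0 0 start-sector: 1889888935 size: 4096 bufflen 4096 FROM_DEVICE 1354354008085128
SCSI random(8472) 0 1 0 0 start-sector: 1889891823 size: 16384 bufflen 16384 FROM_DEVICE 1354354008097161
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 139365318656
SCSI random(8472) 0 1 0 0 start-sector: 254092663 size: 4096 bufflen 4096 FROM_DEVICE 1354354008100633
SCSI random(8472) 0 1 0 0 start-sector: 254094879 size: 16384 bufflen 16384 FROM_DEVICE 1354354008111723
SYSPREAD random(8472) 3, 0x16fc5200, 16384, 60304424960
SCSI random(8472) 0 1 0 0 start-sector: 58119807 size: 4096 bufflen 4096 FROM_DEVICE 1354354008120469
SCSI random(8472) 0 1 0 0 start-sector: 58125415 size: 16384 bufflen 16384 FROM_DEVICE 1354354008126343
"""

SCSI_RE = re.compile(r"start-sector: (\d+) size: (\d+)")


def sector_gaps(trace):
    """Return (size of first read, sector distance to second read) per pread."""
    gaps = []
    pending = []  # SCSI reads seen since the last SYSPREAD line
    for line in trace.splitlines():
        if line.startswith("SYSPREAD"):
            pending = []
            continue
        match = SCSI_RE.search(line)
        if match:
            pending.append((int(match.group(1)), int(match.group(2))))
            if len(pending) == 2:
                (first_start, first_size), (second_start, _) = pending
                gaps.append((first_size, second_start - first_start))
    return gaps


for size, gap in sector_gaps(TRACE):
    print(f"{size}-byte read lands {gap} sectors before the data read")
```

In every pair the 4096-byte read comes first and starts a few thousand sectors before the 16384-byte data read, so the extra read is near the requested data rather than at some fixed place on the disk.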




* Re: I/O is issued twice at scsi level
  2012-12-01 14:51   ` Hiroyuki Yamada
@ 2012-12-02  9:23     ` Hiroyuki Yamada
  2012-12-03 13:43       ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Hiroyuki Yamada @ 2012-12-02  9:23 UTC (permalink / raw)
  To: Georg Schönberger; +Cc: fio

I figured out what is going on, but I don't know what it is for.

The ext3 filesystem places one 4KB block in front of each 4096KB (8192 sectors) of data.
Visually, the data is laid out as follows.

|4KB|4096KB|4KB|4096KB|4KB|4096KB| ...

Only the 4096KB areas are accessible by application programs.
When a 4096KB area is accessed for the first time,
the OS first reads the 4KB block just before that area
and then reads the requested data in the 4096KB area.

When accessing a large file (compared to the DRAM size) randomly,
each I/O has little chance of hitting the page cache,
so every I/O request comes together with a 4KB I/O.

The question is: what is the 4KB data for?
Is it location metadata for the filesystem?
Is there any way I can remove it?
Or is there any way I can clear only the 4096KB areas?

Any comments and advice are appreciated.

(I tested on many machines with many kernel versions. This happens on
all machines.)

Thanks.


* Re: I/O is issued twice at scsi level
  2012-12-02  9:23     ` Hiroyuki Yamada
@ 2012-12-03 13:43       ` Jens Axboe
  2012-12-03 13:59         ` Andrey Kuzmin
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2012-12-03 13:43 UTC (permalink / raw)
  To: Hiroyuki Yamada; +Cc: Georg Schönberger, fio

Quick guess - it's updating the mtime/atime on the inode?




-- 
Jens Axboe



* Re: I/O is issued twice at scsi level
  2012-12-03 13:43       ` Jens Axboe
@ 2012-12-03 13:59         ` Andrey Kuzmin
  2012-12-03 17:09           ` Hiroyuki Yamada
  0 siblings, 1 reply; 8+ messages in thread
From: Andrey Kuzmin @ 2012-12-03 13:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Hiroyuki Yamada, Georg Schönberger, fio

And where is the corresponding updated atime write? Buffered?
Regards,
Andrey



* Re: I/O is issued twice at scsi level
  2012-12-03 13:59         ` Andrey Kuzmin
@ 2012-12-03 17:09           ` Hiroyuki Yamada
  2012-12-03 18:33             ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Hiroyuki Yamada @ 2012-12-03 17:09 UTC (permalink / raw)
  To: Andrey Kuzmin; +Cc: Jens Axboe, Georg Schönberger, fio

Sorry for the update.
I figured it out.
It's from ext3's indirect block mapping.
There is one mapping block for every 1024 blocks,
so accessing a large file randomly has a high chance of reading both the
mapping block and the requested block.

Ext4 has a different and more efficient addressing scheme called extents (an extent tree), and
with it we can avoid the issue.
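
The arithmetic behind "one mapping block for every 1024 blocks" can be sketched as follows. It assumes the usual ext3 layout with 4KB blocks: 12 direct pointers in the inode, then single, double and triple indirect blocks holding 32-bit block numbers. The function names and the 100 GiB file size are illustrative stand-ins for the "more than 100G" test file.

```python
BLOCK_SIZE = 4096        # bytes per filesystem block (4KB, as in the thread)
PTRS = BLOCK_SIZE // 4   # 32-bit block numbers per indirect block: 1024
DIRECT = 12              # direct block pointers stored in the inode


def indirection_depth(offset):
    """Return how many indirect blocks are walked to map a byte offset."""
    block = offset // BLOCK_SIZE
    if block < DIRECT:
        return 0
    block -= DIRECT
    for depth in (1, 2, 3):
        if block < PTRS ** depth:
            return depth
        block -= PTRS ** depth
    raise ValueError("offset beyond the triple-indirect range")


def leaf_mapping_blocks(file_size):
    """Approximate number of last-level indirect blocks for a file."""
    data_blocks = -(-file_size // BLOCK_SIZE)   # ceiling division
    indirect_data = max(0, data_blocks - DIRECT)
    return -(-indirect_data // PTRS)


# One last-level indirect block covers 1024 * 4KB = 4096KB of file data.
print(PTRS * BLOCK_SIZE // 1024, "KB mapped per indirect block")
# Offset taken from the first SYSPREAD line of the trace.
print(indirection_depth(128137183232), "levels of indirection")
print(leaf_mapping_blocks(100 * 2**30), "last-level mapping blocks")
```

Under these assumptions a 100 GiB file has about 25600 last-level mapping blocks. The few upper-level indirect blocks are likely to be cached quickly, but a random read into a cold cache will usually miss its last-level block, which would account for the one extra 4KB read per request.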


* Re: I/O is issued twice at scsi level
  2012-12-03 17:09           ` Hiroyuki Yamada
@ 2012-12-03 18:33             ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2012-12-03 18:33 UTC (permalink / raw)
  To: Hiroyuki Yamada; +Cc: Andrey Kuzmin, Georg Schönberger, fio

On 2012-12-03 18:09, Hiroyuki Yamada wrote:
> Sorry for the update.
> I figured it out.
> It's from ext3's indirect block mapping.
> There is one mapping block for every 1024 blocks,
> so accessing a large file randomly has a high chance of reading both the
> mapping block and the requested block.
> 
> Ext4 has a different and more efficient addressing scheme called extents (an extent tree), and
> with it we can avoid the issue.

Ah, yes that makes sense. I thought we were talking about extra writes,
but it seems I misread the log since it clearly states FROM_DEVICE
transfers.

-- 
Jens Axboe

