* Drop in Iops with fsync when using NVMe as cache
@ 2017-02-21 16:48 shiva rkreddy
  2017-02-22  9:40 ` Vojtech Pavlik
  0 siblings, 1 reply; 5+ messages in thread
From: shiva rkreddy @ 2017-02-21 16:48 UTC
  To: Eric Wheeler; +Cc: linux-bcache

Kernel version: 4.4.0-62

Backing Devices: Seagate Enterprise 7.2K rpm 2TB SAS (ST2000NX0433)

Cache Device: Intel DC P3700 NVMe 1.6TB

bcache cache mode: writeback

# make-bcache --block 4k  --bucket 2M  -B /dev/sdb  -C /dev/nvme0n1p2


I created the backing and cache devices with the command above. I was
expecting a very high number of iops both with and without fio's fsync
option.


fio command without fsync:

# fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
-bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based

iops : 35k

fio command with fsync:

# fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
-bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based -fsync=1

iops: 8.1k
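
One rough way to read these two figures: with iodepth=1 only one write is in flight at a time, so the average time per write is approximately 1/IOPS. The sketch below converts the two reported numbers into per-write latencies. The helper name `lat_us` is made up for illustration, and the result is only an estimate derived from the rounded figures above, not a measurement.

```shell
#!/bin/sh
# Average time per I/O in microseconds, assuming exactly one I/O in flight
# (iodepth=1), so that latency is roughly 1 / IOPS.
# lat_us is a hypothetical helper name, not part of fio.
lat_us() {
    awk -v iops="$1" 'BEGIN { printf "%.1f\n", 1000000 / iops }'
}

no_fsync=$(lat_us 35000)    # figure reported without fsync
with_fsync=$(lat_us 8100)   # figure reported with -fsync=1

echo "without fsync: ${no_fsync} us per write"
echo "with fsync:    ${with_fsync} us per write"

# Extra time per write once every write is followed by an fsync.
awk -v a="$with_fsync" -v b="$no_fsync" \
    'BEGIN { printf "fsync overhead: %.1f us per write\n", a - b }'
```

Under that assumption, each write takes about 29 us without fsync and about 123 us with it, so the fsync path adds roughly 95 us per write.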

Attempted the following combinations and saw the same results:

1. Block sizes of 512 and 4k; bucket sizes of 512k, 2M and 4M for the bcache devices.
2. The fio -rw=write option also showed similar results.
3. bcache writeback_percent 10 or 50; sequential_cutoff: 64M; read_ahead_kb: 4k
4. Captured blktrace for a single io and that didn't show anything interesting.

I'm quite surprised by the drop in iops with fsync turned on. Is this
expected or am I missing some basic setting?
I'd appreciate any help.
Thanks,
Shiva


* Re: Drop in Iops with fsync when using NVMe as cache
  2017-02-21 16:48 Drop in Iops with fsync when using NVMe as cache shiva rkreddy
@ 2017-02-22  9:40 ` Vojtech Pavlik
  2017-02-22 15:47   ` shiva rkreddy
  0 siblings, 1 reply; 5+ messages in thread
From: Vojtech Pavlik @ 2017-02-22  9:40 UTC
  To: shiva rkreddy; +Cc: Eric Wheeler, linux-bcache

On Tue, Feb 21, 2017 at 10:48:06AM -0600, shiva rkreddy wrote:

> fio command without fsync:
> 
> # fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
> -bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based
> 
> iops : 35k
> 
> fio command with fsync:
> 
> fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
> -bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based -fsync=1
> 
> iops: 8.1k

> I'm quite surprised by the drop in iops with fsync turned on. Is this
> expected or am I missing some basic setting?

It's not uncommon that fsync would have a huge performance impact.
Without fsync, most of the data never hits the storage and is only
staying in the system memory.

May I suggest that you try to measure the performance of the same tests
when the filesystem is created on the NVMe device directly, without
using bcache? You're likely to observe a similar pattern.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: Drop in Iops with fsync when using NVMe as cache
  2017-02-22  9:40 ` Vojtech Pavlik
@ 2017-02-22 15:47   ` shiva rkreddy
  2017-03-01  0:55     ` Eric Wheeler
  0 siblings, 1 reply; 5+ messages in thread
From: shiva rkreddy @ 2017-02-22 15:47 UTC
  To: Vojtech Pavlik; +Cc: Eric Wheeler, linux-bcache

I've tried fio directly on the NVMe device, without a filesystem. The
drop with fsync is not that significant: 44313 vs 42713 iops on a 30s
randwrite run with iodepth=1.


# fio -filename=/dev/nvme0n1 -direct=1 -ioengine=libaio -rw=randwrite
-bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based
mytest: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0KB/177.6MB/0KB /s] [0/45.5K/0 iops]
[eta 00m:00s]
mytest: (groupid=0, jobs=1): err= 0: pid=2131: Thu Feb  9 18:56:01 2017
  write: io=5193.2MB, bw=177253KB/s, iops=44313, runt= 30001msec

# fio -filename=/dev/nvme0n1 -direct=1 -ioengine=libaio -rw=randwrite
-bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based -fsync=1
mytest: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0KB/167.4MB/0KB /s] [0/42.9K/0 iops]
[eta 00m:00s]
mytest: (groupid=0, jobs=1): err= 0: pid=2136: Thu Feb  9 19:04:54 2017
  write: io=5005.5MB, bw=170853KB/s, iops=42713, runt= 30000msec
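
To put the two cases side by side, the relative drop can be computed from the iops figures in this thread. This is plain arithmetic on the reported numbers; `pct_drop` is a hypothetical helper name, and the bcache figures are the rounded 35k and 8.1k values from the first message.

```shell
#!/bin/sh
# pct_drop BASE WITH_FSYNC: percentage of IOPS lost when fsync is enabled.
# pct_drop is a hypothetical helper name used only for this illustration.
pct_drop() {
    awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (base - cur) * 100 / base }'
}

echo "raw NVMe: $(pct_drop 44313 42713)% fewer iops with fsync"
echo "bcache:   $(pct_drop 35000 8100)% fewer iops with fsync"
```

That is a drop of about 3.6% on the raw NVMe device versus about 76.9% through bcache.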




* Re: Drop in Iops with fsync when using NVMe as cache
  2017-02-22 15:47   ` shiva rkreddy
@ 2017-03-01  0:55     ` Eric Wheeler
  2017-03-01  3:00       ` shiva rkreddy
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Wheeler @ 2017-03-01  0:55 UTC
  To: shiva rkreddy; +Cc: Vojtech Pavlik, linux-bcache

On Wed, 22 Feb 2017, shiva rkreddy wrote:
> >> fio command without fsync:
> >>
> >> # fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
> >> -bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based
> >>
> >> iops : 35k
> >>
> >> fio command with fsync:
> >>
> >> fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite
> >> -bs=4k -name=mytest -iodepth=1 -runtime=30 -time_based -fsync=1

Try -runtime=25 since 30s is the default writeback delay.  More below.
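
A minimal sketch of that suggestion, assuming the backing device exposes its writeback delay in seconds as /sys/block/bcache0/bcache/writeback_delay. The helper name `runtime_below_delay` and the 5-second margin are illustrative assumptions; the script only prints the fio command and does not run it.

```shell
#!/bin/sh
# Pick a fio runtime shorter than bcache's writeback delay.
# Assumes writeback_delay holds a whole number of seconds; falls back to
# the 30s default mentioned above when the file is not readable.
# runtime_below_delay is a hypothetical helper name.
runtime_below_delay() {
    file="$1"
    margin="${2:-5}"
    delay=30
    if [ -r "$file" ]; then
        delay=$(cat "$file")
    fi
    runtime=$((delay - margin))
    # Never return less than one second.
    if [ "$runtime" -lt 1 ]; then
        runtime=1
    fi
    echo "$runtime"
}

rt=$(runtime_below_delay /sys/block/bcache0/bcache/writeback_delay)
echo "fio -filename=/dev/bcache0 -direct=1 -ioengine=libaio -rw=randwrite" \
     "-bs=4k -name=mytest -iodepth=1 -runtime=$rt -time_based -fsync=1"
```

With the default 30s delay this yields -runtime=25, matching the value suggested above.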

> >>
> >> iops: 8.1k
> >
> >> I'm quite surprised by the drop in iops with fsync turned on. Is this
> >> expected or am I missing some basic setting?
> >
> > It's not uncommon that fsync would have a huge performance impact.
> > Without fsync, most of the data never hits the storage and is only
> > staying in the system memory.
> >
> > May I suggest that you try to measure the performance of the same tests
> > when the filesystem is created on the NVMe device directly, without
> > using bcache? You're likely to observe a similar pattern.
>
> I've tried fio directly on the NVMe device, without a filesystem. The
> drop with fsync is not that significant: 44313 vs 42713 iops on a 30s
> randwrite run with iodepth=1.

Try using `make-bcache --data-offset X ...` to align your backing device. 
It defaults to an 8k offset which may not be optimal. By the way, what is 
your backing device /dev/sdb?
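
As a sketch of how X might be chosen: to my understanding make-bcache takes --data-offset in 512-byte sectors (the 8k default being 16 sectors), so check `make-bcache --help` before relying on this. The 64 KiB stripe unit and single data disk below are assumed example geometry, not values from this thread, and `aligned_offset` is a hypothetical helper. The script only prints the command, since re-running make-bcache destroys existing data.

```shell
#!/bin/sh
# aligned_offset STRIPE_KIB DISKS: smallest multiple of the full stripe
# width, in 512-byte sectors, that is at least the 16-sector (8 KiB)
# default offset. aligned_offset is a hypothetical helper name.
aligned_offset() {
    width=$(( $1 * $2 * 2 ))    # full stripe width in sectors (1 KiB = 2 sectors)
    min=16                      # default data offset in sectors
    echo $(( (min + width - 1) / width * width ))
}

# Assumed example geometry: 64 KiB stripe unit, 1 data disk.
off=$(aligned_offset 64 1)
echo "make-bcache --data-offset $off --block 4k --bucket 2M" \
     "-B /dev/sdb -C /dev/nvme0n1p2"
```

For the assumed geometry this prints an offset of 128 sectors (64 KiB); the right value depends on the actual stripe size configured on the RAID controller.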

Try these, too:

echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
echo 10000000 > /sys/block/bcache0/bcache/cache/congested_read_threshold_us 
echo 10000000 > /sys/block/bcache0/bcache/cache/congested_write_threshold_us
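
Before changing these, it may help to record the current values so they can be restored after the test. A minimal sketch, assuming the same sysfs paths as the echo commands above; `show_tunables` is a hypothetical helper name, and files that are not present are reported as unavailable rather than treated as an error.

```shell
#!/bin/sh
# Print the current value of each tunable mentioned above.
# BASE defaults to the bcache0 sysfs directory used in this thread.
# show_tunables is a hypothetical helper name.
show_tunables() {
    base="${1:-/sys/block/bcache0/bcache}"
    for f in sequential_cutoff \
             cache/congested_read_threshold_us \
             cache/congested_write_threshold_us; do
        if [ -r "$base/$f" ]; then
            echo "$f=$(cat "$base/$f")"
        else
            echo "$f=unavailable"
        fi
    done
}

show_tunables
```

Saving this output to a file before running the echo commands gives a record to revert to if the new settings do not help.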





--
Eric Wheeler




* Re: Drop in Iops with fsync when using NVMe as cache
  2017-03-01  0:55     ` Eric Wheeler
@ 2017-03-01  3:00       ` shiva rkreddy
  0 siblings, 0 replies; 5+ messages in thread
From: shiva rkreddy @ 2017-03-01  3:00 UTC
  To: Eric Wheeler; +Cc: Vojtech Pavlik, linux-bcache

The backing device is a 7.2K SAS disk in RAID0 (MegaRAID SAS controller).

Thanks for the comment on data-offset. I will give that a try.

