* blog article about f2fs on smr drives
@ 2015-10-08 13:56 Marc Lehmann
2015-10-09 0:45 ` Jaegeuk Kim
2015-10-19 10:43 ` Chao Yu
0 siblings, 2 replies; 8+ messages in thread
From: Marc Lehmann @ 2015-10-08 13:56 UTC (permalink / raw)
To: linux-f2fs-devel
Hi!
I wrote a short and very preliminary blog article about how to get SMR drives
to work, fast, with f2fs.
http://blog.schmorp.de/2015-10-08-smr-archive-drives-fast-now.html
This is just a heads up - due to external events, I haven't been able to
look into new developments (such as background_gc=sync), which look very
exciting, but I think this information needs to get out, as there are a
lot of people who suffer badly with these drives, mostly due to kernel
issues, but of course also due to the filesystem situation.
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-10-08 13:56 Marc Lehmann
@ 2015-10-09 0:45 ` Jaegeuk Kim
2015-11-14 19:21 ` Marc Lehmann
2015-10-19 10:43 ` Chao Yu
1 sibling, 1 reply; 8+ messages in thread
From: Jaegeuk Kim @ 2015-10-09 0:45 UTC (permalink / raw)
To: Marc Lehmann; +Cc: linux-f2fs-devel
On Thu, Oct 08, 2015 at 03:56:35PM +0200, Marc Lehmann wrote:
> Hi!
>
> I wrote a short and very preliminary blog article about how to get SMR drives
> to work, fast, with f2fs.
>
> http://blog.schmorp.de/2015-10-08-smr-archive-drives-fast-now.html
>
> This is just a heads up - due to external events, I haven't been able to
> look into new developments (such as background_gc=sync), which look very
> exciting, but I think this information needs to get out, as there are a
> lot of people who suffer badly with these drives, mostly due to kernel
> issues, but of course also due to the filesystem situation.
Cool, and a very interesting topic to me!
BTW, it might be best to publish this kind of investigation as a paper or
presentation later.
Thanks,
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-10-08 13:56 Marc Lehmann
2015-10-09 0:45 ` Jaegeuk Kim
@ 2015-10-19 10:43 ` Chao Yu
1 sibling, 0 replies; 8+ messages in thread
From: Chao Yu @ 2015-10-19 10:43 UTC (permalink / raw)
To: 'Marc Lehmann', linux-f2fs-devel
Hello,
> -----Original Message-----
> From: Marc Lehmann [mailto:schmorp@schmorp.de]
> Sent: Thursday, October 08, 2015 9:57 PM
> To: linux-f2fs-devel@lists.sourceforge.net
> Subject: [f2fs-dev] blog article about f2fs on smr drives
>
> Hi!
>
> I wrote a short and very preliminary blog article about how to get SMR drives
> to work, fast, with f2fs.
>
> http://blog.schmorp.de/2015-10-08-smr-archive-drives-fast-now.html
Really nice experience sharing!
>
> This is just a heads up - due to external events, I haven't been able to
> look into new developments (such as background_gc=sync), which look very
> exciting, but I think this information needs to get out, as there are a
> lot of people who suffer badly with these drives, mostly due to kernel
> issues, but of course also due to the filesystem situation.
I went through your IO trace log and found an issue which may slow down
applications, so I wrote a patch to optimize that path, mainly for SMR drives.
I think we can tune /sys/fs/f2fs/(device)/ra_nid_pages to avoid long latency
when creating nodes in f2fs.
If you want to test the new tunable parameter, the latest git tree is
preferred. :)
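As a rough illustration of the tuning step mentioned above, the following
Python sketch reads and updates the tunable through sysfs. The helper names
and the device name "sdX" are hypothetical, the (device) part of the path is
assumed to be the block device name of the mounted filesystem, and writing the
file normally requires root. It is a sketch under those assumptions, not the
procedure used by the f2fs developers.

```python
from pathlib import Path

# Assumption: f2fs exposes its per-filesystem tunables below this directory,
# as in the path /sys/fs/f2fs/(device)/ra_nid_pages quoted in this thread.
SYSFS_ROOT = Path("/sys/fs/f2fs")


def tunable_path(device, name="ra_nid_pages", root=SYSFS_ROOT):
    """Build the sysfs path of a tunable for one (hypothetical) device name."""
    return Path(root) / device / name


def read_tunable(path):
    """Return the current integer value stored in the tunable file."""
    return int(Path(path).read_text().strip())


def write_tunable(path, value):
    """Write a new non-negative integer value and return what reads back."""
    if value < 0:
        raise ValueError("value must be non-negative")
    Path(path).write_text(f"{value}\n")
    return read_tunable(path)


if __name__ == "__main__":
    path = tunable_path("sdX")  # "sdX" is a placeholder, not a real device
    if path.exists():
        print(f"{path} = {read_tunable(path)}")
    else:
        print(f"{path} not found (no f2fs mounted on that device?)")
```

Reading the value back after each write is a cheap way to confirm that the
kernel accepted the setting before starting a long benchmark run.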
Thanks,
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-10-09 0:45 ` Jaegeuk Kim
@ 2015-11-14 19:21 ` Marc Lehmann
0 siblings, 0 replies; 8+ messages in thread
From: Marc Lehmann @ 2015-11-14 19:21 UTC (permalink / raw)
To: linux-f2fs-devel
So sorry again for regularly going into hiding, but I am now back on the SMR
drive problems :)
So, first, a question - how many of the changes are now in 4.3? (Unfortunately,
it seems 4.3 is still not stable with these drives, still leaving only 3.18 as
a full option.)
Next, I am trying to replicate my tests with faster data, and I get rather
erratic results, e.g. (tar|buffer|tar, average file size 266MB, the source
volume is capable of delivering sustained 200MB/s, so no bottleneck on the
read side):
summary: 2380.2 GB in 15 h 58 min 30.0 sec - average of 42.4 MB/s
While there are stretches at >>100MB/s, most of the time, this is at less,
and often for long stretches at ~10MB/s, which is the reason the end
result is (relatively) bad.
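The summary line above can be checked with a few lines of Python. The function
names are made up for this illustration, and the two-rate model (the run
spends all of its time at either 100 MB/s or 10 MB/s) is a deliberate
simplification of the ">>100MB/s" and "~10MB/s" stretches described above, not
a measurement.

```python
def average_mb_per_s(gigabytes, hours, minutes, seconds, mb_per_gb=1024):
    """Average throughput in MB/s for a transfer of the given size and duration."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return gigabytes * mb_per_gb / elapsed


def slow_time_fraction(average, fast, slow):
    """Fraction of the run spent at the slow rate in a two-rate model.

    Solves average = slow * f + fast * (1 - f) for f.
    """
    return (fast - average) / (fast - slow)


if __name__ == "__main__":
    avg = average_mb_per_s(2380.2, 15, 58, 30.0)
    print(f"average: {avg:.1f} MB/s")
    # Illustrative assumption: only two rates, 100 MB/s and 10 MB/s.
    print(f"time at slow rate: {slow_time_fraction(avg, 100.0, 10.0):.0%}")
```

With 1024 MB per GB the average rounds to the 42.4 MB/s reported above; with
1000 MB per GB it would be about 41.4 MB/s, so the summary presumably uses
binary units. Under the two-rate assumption, roughly two thirds of the run
time would have been spent in the slow stretches.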
And lastly, is there a document describing the implementation of
encryption in the fs, and the goals (privacy? integrity? both?)
(If there isn't, I plan to review the encryption design in f2fs myself).
On Thu, Oct 08, 2015 at 05:45:11PM -0700, Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> Cool, and a very interesting topic to me!
>
> BTW, it might be best to publish this kind of investigation as a paper or
> presentation later.
Maybe, I wonder how one would go about that (I was never involved with a
serious paper, and I am unlikely to give presentations on that, but yes, I
think somebody should :).
On Mon, Oct 19, 2015 at 06:43:36PM +0800, Chao Yu <chao2.yu@samsung.com> wrote:
> I went through your IO trace log and found an issue which may slow down
> applications, so I wrote a patch to optimize that path, mainly for SMR drives.
> I think we can tune /sys/fs/f2fs/(device)/ra_nid_pages to avoid long latency
> when creating nodes in f2fs.
Maybe that could help with my problem - a git pull of the 3.18 branch seems
to have your changes in it - anything in specific that I should do?
from the Changes, it doesn't quite sound as if it will be a big help, as
there is very little read traffic overall, but I didn't analyze the IO
trace :)
it does sound like something that could be very useful for rotational disks
in general, under normal conditions (reading), though.
> If you want to test the new tunable parameter, the latest git tree is
> preferred. :)
Sure, will be happy to do that, what should I tune how? And thanks for
working on this!
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
@ 2015-11-16 4:31 Chao Yu
2015-11-16 23:52 ` Marc Lehmann
0 siblings, 1 reply; 8+ messages in thread
From: Chao Yu @ 2015-11-16 4:31 UTC (permalink / raw)
To: 'Marc Lehmann', linux-f2fs-devel
Hi Marc,
> -----Original Message-----
> From: Marc Lehmann [mailto:schmorp@schmorp.de]
> Sent: Sunday, November 15, 2015 3:22 AM
> To: linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] blog article about f2fs on smr drives
>
> So sorry again for regularly going into hiding, but I am now back on the SMR
> drive problems :)
>
> So, first, a question - how many of the changes are now in 4.3? (Unfortunately,
> it seems 4.3 is still not stable with these drives, still leaving only 3.18 as
> a full option.)
I think you can find the change information at the link below:
http://sourceforge.net/p/linux-f2fs/mailman/message/34427519/
>
> Next, I am trying to replicate my tests with faster data, and I get rather
> erratic results, e.g. (tar|buffer|tar, average file size 266MB, the source
> volume is capable of delivering sustained 200MB/s, so no bottleneck on the
> read side):
>
> summary: 2380.2 GB in 15 h 58 min 30.0 sec - average of 42.4 MB/s
>
> While there are stretches at >>100MB/s, most of the time, this is at less,
> and often for long stretches at ~10MB/s, which is the reason the end
> result is (relatively) bad.
Could you please share the IO trace log with us?
>
> And lastly, is there a document describing the implementation of
> encryption in the fs, and the goals (privacy? integrity? both?)
The feature was ported from ext4; please refer to the following article:
https://lwn.net/Articles/639427/
>
> (If there isn't, I plan to review the encryption design in f2fs myself).
>
> On Thu, Oct 08, 2015 at 05:45:11PM -0700, Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> > Cool, and a very interesting topic to me!
> >
> > BTW, it might be best to publish this kind of investigation as a paper or
> > presentation later.
>
> Maybe, I wonder how one would go about that (I was never involved with a
> serious paper, and I am unlikely to give presentations on that, but yes, I
> think somebody should :).
>
> On Mon, Oct 19, 2015 at 06:43:36PM +0800, Chao Yu <chao2.yu@samsung.com> wrote:
> > I went through your IO trace log and found an issue which may slow down
> > applications, so I wrote a patch to optimize that path, mainly for SMR drives.
> > I think we can tune /sys/fs/f2fs/(device)/ra_nid_pages to avoid long latency
> > when creating nodes in f2fs.
>
> Maybe that could help with my problem - a git pull of the 3.18 branch seems
> to have your changes in it - anything in specific that I should do?
As I checked, related patches were merged into linux-3.18 branch in Jaegeuk's
git tree. It's OK to use that tree.
>
> from the Changes, it doesn't quite sound as if it will be a big help, as
> there is very little read traffic overall, but I didn't analyze the IO
> trace :)
Actually, as I tested on a flash device, it does help to improve performance in
workloads that create nodes aggressively. This is because we added asynchronous
readahead to mitigate small synchronous random reads, which may sometimes block
all applications. Considering that rotational devices have worse synchronous
random read performance, I expect a better result on SMR.
>
> it does sound like something that could be very useful for rotational disks
> in general, under normal conditions (reading), though.
I hope it will be useful in a node allocation storm workload rather than in
read workloads.
>
> > If you want to test the new tunable parameter, the latest git tree is
> > preferred. :)
>
> Sure, will be happy to do that, what should I tune how? And thanks for
> working on this!
IMO, one way is to run the test with the default value and then increase the
value geometrically to see how it affects the IO.
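The geometric sweep suggested above could be planned with something like the
following Python sketch. The function name, the factor of 2, and the starting
value of 4 in the demo are assumptions made for illustration; the real default
should be read from the tunable on the test system. Because a default of 0
cannot be multiplied up, the sketch continues from 1 in that case.

```python
def sweep_values(default, factor=2, steps=5):
    """Return the default value followed by `steps` geometrically larger ones."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    if default < 0:
        raise ValueError("default must be non-negative")
    values = [default]
    current = default
    for _ in range(steps):
        # A zero default would stay zero forever, so restart the series at 1.
        current = current * factor if current > 0 else 1
        values.append(current)
    return values


if __name__ == "__main__":
    # 4 is a made-up starting point, not a documented default.
    for value in sweep_values(4, factor=2, steps=4):
        print(f"run benchmark with ra_nid_pages = {value}")
```

Each value would then be written to the tunable before one benchmark run, so
that the throughput of the runs can be compared against the baseline taken at
the default.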
Thanks,
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-11-16 4:31 blog article about f2fs on smr drives Chao Yu
@ 2015-11-16 23:52 ` Marc Lehmann
2015-11-17 0:05 ` Jaegeuk Kim
2015-11-17 11:35 ` Chao Yu
0 siblings, 2 replies; 8+ messages in thread
From: Marc Lehmann @ 2015-11-16 23:52 UTC (permalink / raw)
To: Chao Yu; +Cc: linux-f2fs-devel
On Mon, Nov 16, 2015 at 12:31:06PM +0800, Chao Yu <chao2.yu@samsung.com> wrote:
> I think you can find the change information at the link below:
>
> http://sourceforge.net/p/linux-f2fs/mailman/message/34427519/
Right, I saw the merge requests, but I don't know if these have been accepted
for 4.3. I assume they are, thanks for finding them again for me.
> > While there are stretches at >>100MB/s, most of the time, this is at less,
> > and often for long stretches at ~10MB/s, which is the reason the end
> > result is (relatively) bad.
>
> Could you please share the IO trace log with us?
I am already working on it, I just didn't expect it to be a problem. Also,
I did it without your patch.
> > And lastly, is there a document describing the implementation of
> > encryption in the fs, and the goals (privacy? integrity? both?)
>
> The feature was ported from ext4; please refer to the following article:
>
> https://lwn.net/Articles/639427/
I see, so it inherits all the security issues of that design.
> Actually, as I tested on a flash device, it does help to improve performance in
> workloads that create nodes aggressively. This is because we added asynchronous
> readahead to mitigate small synchronous random reads, which may sometimes block
> all applications. Considering that rotational devices have worse synchronous
> random read performance, I expect a better result on SMR.
Ok, that could help, as even a relatively small number of random reads could
cause performance regressions (did the original 3.18 code not yet do this?)
I was a bit confused by the use of SMR, as SMRs don't suffer more from
random reads than other rotational devices (in fact, they can suffer less,
if the data is still in the journal).
> > Sure, will be happy to do that, what should I tune how? And thanks for
> > working on this!
>
> IMO, one way is to run the test with the default value and then increase the
> value geometrically to see how it affects the IO.
I am not sure I can make a large number of tests, I will try to do smaller
tests and see if I can make more of them and see a difference.
Thanks again!
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-11-16 23:52 ` Marc Lehmann
@ 2015-11-17 0:05 ` Jaegeuk Kim
2015-11-17 11:35 ` Chao Yu
1 sibling, 0 replies; 8+ messages in thread
From: Jaegeuk Kim @ 2015-11-17 0:05 UTC (permalink / raw)
To: Marc Lehmann; +Cc: linux-f2fs-devel
On Tue, Nov 17, 2015 at 12:52:47AM +0100, Marc Lehmann wrote:
> On Mon, Nov 16, 2015 at 12:31:06PM +0800, Chao Yu <chao2.yu@samsung.com> wrote:
> > I think you can find the change information at the link below:
> >
> > http://sourceforge.net/p/linux-f2fs/mailman/message/34427519/
>
> Right, I saw the merge requests, but I don't know if these have been accepted
> for 4.3. I assume they are, thanks for finding them again for me.
Of course, those were merged.
>
> > > While there are stretches at >>100MB/s, most of the time, this is at less,
> > > and often for long stretches at ~10MB/s, which is the reason the end
> > > result is (relatively) bad.
> >
> > Could you please share the IO trace log with us?
>
> I am already working on it, I just didn't expect it to be a problem. Also,
> I did it without your patch.
>
> > > And lastly, is there a document describing the implementation of
> > > encryption in the fs, and the goals (privacy? integrity? both?)
> >
> > The feature was ported from ext4; please refer to the following article:
> >
> > https://lwn.net/Articles/639427/
>
> I see, so it inherits all the security issues of that design.
>
> > Actually, as I tested on a flash device, it does help to improve performance in
> > workloads that create nodes aggressively. This is because we added asynchronous
> > readahead to mitigate small synchronous random reads, which may sometimes block
> > all applications. Considering that rotational devices have worse synchronous
> > random read performance, I expect a better result on SMR.
>
> Ok, that could help, as even a relatively small number of random reads could
> cause performance regressions (did the original 3.18 code not yet do this?)
>
> I was a bit confused by the use of SMR, as SMRs don't suffer more from
> random reads than other rotational devices (in fact, they can suffer less,
> if the data is still in the journal).
The important point here is that such read operations to build free nids
hurt the concurrency of filesystem operations.
>
> > > Sure, will be happy to do that, what should I tune how? And thanks for
> > > working on this!
> >
> > IMO, one way is to run the test with the default value and then increase the
> > value geometrically to see how it affects the IO.
>
> I am not sure I can make a large number of tests, I will try to do smaller
> tests and see if I can make more of them and see a difference.
Thanks,
>
> Thanks again!
------------------------------------------------------------------------------
* Re: blog article about f2fs on smr drives
2015-11-16 23:52 ` Marc Lehmann
2015-11-17 0:05 ` Jaegeuk Kim
@ 2015-11-17 11:35 ` Chao Yu
1 sibling, 0 replies; 8+ messages in thread
From: Chao Yu @ 2015-11-17 11:35 UTC (permalink / raw)
To: 'Marc Lehmann'; +Cc: linux-f2fs-devel
Hi Marc,
> -----Original Message-----
> From: Marc Lehmann [mailto:schmorp@schmorp.de]
> Sent: Tuesday, November 17, 2015 7:53 AM
> To: Chao Yu
> Cc: linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] blog article about f2fs on smr drives
>
> On Mon, Nov 16, 2015 at 12:31:06PM +0800, Chao Yu <chao2.yu@samsung.com> wrote:
> > I think you can find the change information at the link below:
> >
> > http://sourceforge.net/p/linux-f2fs/mailman/message/34427519/
>
> Right, I saw the merge requests, but I don't know if these have been accepted
> for 4.3. I assume they are, thanks for finding them again for me.
>
> > > While there are stretches at >>100MB/s, most of the time, this is at less,
> > > and often for long stretches at ~10MB/s, which is the reason the end
> > > result is (relatively) bad.
> >
> > Could you please share the IO trace log with us?
>
> I am already working on it, I just didn't expect it to be a problem. Also,
> I did it without your patch.
>
> > > And lastly, is there a document describing the implementation of
> > > encryption in the fs, and the goals (privacy? integrity? both?)
> >
> > The feature was ported from ext4; please refer to the following article:
> >
> > https://lwn.net/Articles/639427/
>
> I see, so it inherits all the security issues of that design.
>
> > Actually, as I tested on a flash device, it does help to improve performance in
> > workloads that create nodes aggressively. This is because we added asynchronous
> > readahead to mitigate small synchronous random reads, which may sometimes block
> > all applications. Considering that rotational devices have worse synchronous
> > random read performance, I expect a better result on SMR.
>
> Ok, that could help, as even a relatively small number of random reads could
> cause performance regressions (did the original 3.18 code not yet do this?)
As Jaegeuk mentioned, the serious problem here is that all node allocators,
which may generate more dirty data later, will be blocked by it, so the
continuous write stream will be broken and IO to the disk may drop after a
while.
>
> I was a bit confused by the use of SMR, as SMRs don't suffer more from
> random reads than other rotational devices (in fact, they can suffer less,
> if the data is still in the journal).
You mean the journal region on the disk, right? Read IO still goes to that
region on disk. So why does SMR suffer less than other rotational devices? Or
maybe readahead in that region can help when most of the random reads go to
the journal.
>
> > > Sure, will be happy to do that, what should I tune how? And thanks for
> > > working on this!
> >
> > IMO, one way is to run the test with the default value and then increase the
> > value geometrically to see how it affects the IO.
>
> I am not sure I can make a large number of tests, I will try to do smaller
> tests and see if I can make more of them and see a difference.
That's OK, one thing I should mention is that small files are preferred in our
test. :)
Thanks,
>
> Thanks again!
------------------------------------------------------------------------------