* [PATCH] ceph: set io_pages bdi hint
From: Andreas Gerstmayr @ 2016-12-30 5:37 UTC
To: ceph-devel
Cc: andreas.gerstmayr, Andreas Gerstmayr, Yan, Zheng, Sage Weil, Ilya Dryomov

This patch sets the io_pages bdi hint based on the rsize mount option.
Without this patch large buffered reads (request size > max readahead)
are processed sequentially in chunks of the readahead size (i.e. read
requests are sent out up to the readahead size, then the
do_generic_file_read() function waits until the first page is received).

This patch removes this cap and enables parallel reads up to the
specified maximum read size mount option (rsize).

Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
---

Feedback is appreciated. Maybe we should apply a sensible default value
for rsize instead of unlimited?

Please note: This patch depends on commit #9491ae4, which is not yet
merged in the testing branch of the ceph-client repository (this commit
is included in kernel version 4.10-rc1).


 fs/ceph/super.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 6bd20d7..3c50477 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -952,6 +952,13 @@ static int ceph_register_bdi(struct super_block *sb,
 		fsc->backing_dev_info.ra_pages =
 			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
 
+	if (fsc->mount_options->rsize)
+		fsc->backing_dev_info.io_pages =
+			(fsc->mount_options->rsize + PAGE_SIZE - 1)
+			>> PAGE_SHIFT;
+	else
+		fsc->backing_dev_info.io_pages = ULONG_MAX;
+
 	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
 			   atomic_long_inc_return(&bdi_seq));
 	if (!err)
-- 
1.8.3.1

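Editor's illustration: the byte-to-page rounding in the hunk above can be tried out in userspace. The following is a minimal sketch, not kernel code. It assumes 4 KiB pages, and the names `rsize_to_io_pages`, `SKETCH_PAGE_SHIFT` and `SKETCH_PAGE_SIZE` are hypothetical stand-ins introduced here; only the arithmetic and the "0 means unlimited" convention are taken from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper modelling the v1 hunk: round rsize (in bytes) up to
 * a whole number of pages. rsize == 0 means "no maximum", which the patch
 * maps to ULONG_MAX.
 */
unsigned long rsize_to_io_pages(int rsize)
{
	/* The mount option is stored as an int; negative values are not modelled. */
	assert(rsize >= 0);

	if (rsize)
		return ((unsigned long)rsize + SKETCH_PAGE_SIZE - 1)
			>> SKETCH_PAGE_SHIFT;
	return ULONG_MAX;
}
```

Under the 4 KiB assumption, an rsize of 4097 bytes rounds up to 2 pages, and 64 MB corresponds to 16384 pages.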
* Re: [PATCH] ceph: set io_pages bdi hint
From: Yan, Zheng @ 2017-01-04 3:25 UTC
To: Andreas Gerstmayr
Cc: ceph-devel, andreas.gerstmayr, Sage Weil, Ilya Dryomov

> On 30 Dec 2016, at 13:37, Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc> wrote:
>
> This patch sets the io_pages bdi hint based on the rsize mount option.
> Without this patch large buffered reads (request size > max readahead)
> are processed sequentially in chunks of the readahead size (i.e. read
> requests are sent out up to the readahead size, then the
> do_generic_file_read() function waits until the first page is received).
>
> This patch removes this cap and enables parallel reads up to the
> specified maximum read size mount option (rsize).
>
> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
> ---
>
> Feedback is appreciated. Maybe we should apply a sensible default value
> for rsize instead of unlimited?
>
> Please note: This patch depends on commit #9491ae4, which is not yet
> merged in the testing branch of the ceph-client repository (this commit
> is included in kernel version 4.10-rc1).
>
>
>  fs/ceph/super.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index 6bd20d7..3c50477 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -952,6 +952,13 @@ static int ceph_register_bdi(struct super_block *sb,
>  		fsc->backing_dev_info.ra_pages =
>  			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
>
> +	if (fsc->mount_options->rsize)
> +		fsc->backing_dev_info.io_pages =
> +			(fsc->mount_options->rsize + PAGE_SIZE - 1)
> +			>> PAGE_SHIFT;
> +	else
> +		fsc->backing_dev_info.io_pages = ULONG_MAX;
> +

Unlimited by default does not seem like a good idea. I think we should
set CEPH_RSIZE_DEFAULT to a reasonable value (such as 64M).

Regards
Yan, Zheng

>  	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
>  			   atomic_long_inc_return(&bdi_seq));
>  	if (!err)
> --
> 1.8.3.1
>

* [PATCH v2] ceph: set io_pages bdi hint
From: Andreas Gerstmayr @ 2017-01-05 13:23 UTC
To: ceph-devel
Cc: andreas.gerstmayr, Andreas Gerstmayr, Yan, Zheng, Sage Weil, Ilya Dryomov

This patch sets the io_pages bdi hint based on the rvsize mount option.
Without this patch large buffered reads (request size > max readahead)
are processed sequentially in chunks of the readahead size (i.e. read
requests are sent out up to the readahead size, then the
do_generic_file_read() function waits until the first page is received).

With this patch read requests are sent out up to the size specified in
the new rvsize mount option at once (default: 64 MB).

Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
---

Thanks for your review.
On second thought, I think I should not reuse the rsize mount option
(maximum read size per OSD request), therefore I created a new mount
option rvsize with a default value of 64 MB (as you suggested).

(Note: This patch depends on kernel version 4.10-rc1)


 Documentation/filesystems/ceph.txt |  4 ++++
 fs/ceph/super.c                    | 15 +++++++++++++++
 fs/ceph/super.h                    |  8 +++++---
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
index f5306ee..65171e1 100644
--- a/Documentation/filesystems/ceph.txt
+++ b/Documentation/filesystems/ceph.txt
@@ -104,6 +104,10 @@ Mount Options
   rasize=X
 	Specify the maximum readahead.
 
+  rvsize=X
+	Specify the maximum volume of read requests sent out at once.
+	The default is 64 MB.
+
   mount_timeout=X
 	Specify the timeout value for mount (in seconds), in the case
 	of a non-responsive Ceph file system. The default is 30
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 6bd20d7..71bed5a 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -111,6 +111,7 @@ enum {
 	Opt_wsize,
 	Opt_rsize,
 	Opt_rasize,
+	Opt_rvsize,
 	Opt_caps_wanted_delay_min,
 	Opt_caps_wanted_delay_max,
 	Opt_cap_release_safety,
@@ -149,6 +150,7 @@ enum {
 	{Opt_wsize, "wsize=%d"},
 	{Opt_rsize, "rsize=%d"},
 	{Opt_rasize, "rasize=%d"},
+	{Opt_rvsize, "rvsize=%d"},
 	{Opt_caps_wanted_delay_min, "caps_wanted_delay_min=%d"},
 	{Opt_caps_wanted_delay_max, "caps_wanted_delay_max=%d"},
 	{Opt_cap_release_safety, "cap_release_safety=%d"},
@@ -233,6 +235,9 @@ static int parse_fsopt_token(char *c, void *private)
 	case Opt_rasize:
 		fsopt->rasize = intval;
 		break;
+	case Opt_rvsize:
+		fsopt->rvsize = intval;
+		break;
 	case Opt_caps_wanted_delay_min:
 		fsopt->caps_wanted_delay_min = intval;
 		break;
@@ -381,6 +386,7 @@ static int parse_mount_options(struct ceph_mount_options **pfsopt,
 
 	fsopt->rsize = CEPH_RSIZE_DEFAULT;
 	fsopt->rasize = CEPH_RASIZE_DEFAULT;
+	fsopt->rvsize = CEPH_RVSIZE_DEFAULT;
 	fsopt->snapdir_name = kstrdup(CEPH_SNAPDIRNAME_DEFAULT, GFP_KERNEL);
 	if (!fsopt->snapdir_name) {
 		err = -ENOMEM;
@@ -495,6 +501,8 @@ static int ceph_show_options(struct seq_file *m, struct dentry *root)
 		seq_printf(m, ",rsize=%d", fsopt->rsize);
 	if (fsopt->rasize != CEPH_RASIZE_DEFAULT)
 		seq_printf(m, ",rasize=%d", fsopt->rasize);
+	if (fsopt->rvsize != CEPH_RVSIZE_DEFAULT)
+		seq_printf(m, ",rvsize=%d", fsopt->rvsize);
 	if (fsopt->congestion_kb != default_congestion_kb())
 		seq_printf(m, ",write_congestion_kb=%d", fsopt->congestion_kb);
 	if (fsopt->caps_wanted_delay_min != CEPH_CAPS_WANTED_DELAY_MIN_DEFAULT)
@@ -952,6 +960,13 @@ static int ceph_register_bdi(struct super_block *sb,
 		fsc->backing_dev_info.ra_pages =
 			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
 
+	if (fsc->mount_options->rvsize)
+		fsc->backing_dev_info.io_pages =
+			(fsc->mount_options->rvsize + PAGE_SIZE - 1)
+			>> PAGE_SHIFT;
+	else
+		fsc->backing_dev_info.io_pages = ULONG_MAX;
+
 	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
 			   atomic_long_inc_return(&bdi_seq));
 	if (!err)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 3373b61..676ef6d 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -45,8 +45,9 @@
 #define ceph_test_mount_opt(fsc, opt) \
 	(!!((fsc)->mount_options->flags & CEPH_MOUNT_OPT_##opt))
 
-#define CEPH_RSIZE_DEFAULT             0           /* max read size */
-#define CEPH_RASIZE_DEFAULT            (8192*1024) /* readahead */
+#define CEPH_RSIZE_DEFAULT             0           /* max read size per osd request */
+#define CEPH_RASIZE_DEFAULT            (8192*1024) /* max readahead */
+#define CEPH_RVSIZE_DEFAULT            (64*1024*1024) /* max volume of read requests sent out at once */
 #define CEPH_MAX_READDIR_DEFAULT        1024
 #define CEPH_MAX_READDIR_BYTES_DEFAULT  (512*1024)
 #define CEPH_SNAPDIRNAME_DEFAULT        ".snap"
@@ -56,8 +57,9 @@ struct ceph_mount_options {
 	int sb_flags;
 
 	int wsize;            /* max write size */
-	int rsize;            /* max read size */
+	int rsize;            /* max read size per osd request */
 	int rasize;           /* max readahead */
+	int rvsize;           /* max volume of read requests sent out at once */
 	int congestion_kb;    /* max writeback in flight */
 	int caps_wanted_delay_min, caps_wanted_delay_max;
 	int cap_release_safety;
-- 
1.8.3.1

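Editor's illustration: the commit message above describes large buffered reads being issued in sequential chunks, with a wait between chunks. A deliberately simplified model of that description, not the kernel's readahead code, is to count how many chunks a read needs for a given window size. The function names `div_round_up` and `read_batches` are hypothetical, and the model assumes every chunk except possibly the last is exactly one window long.

```c
#include <assert.h>
#include <stddef.h>

/* Ceiling division for sizes; den must be non-zero. */
size_t div_round_up(size_t num, size_t den)
{
	assert(den != 0);
	return num / den + (num % den != 0);
}

/*
 * Simplified model (an assumption, not kernel code): a buffered read of
 * read_bytes is sent out in batches of at most window_bytes, and each
 * batch is followed by a wait before the next one is issued.
 * Returns the number of batches, i.e. the number of sequential waits.
 */
size_t read_batches(size_t read_bytes, size_t window_bytes)
{
	return div_round_up(read_bytes, window_bytes);
}
```

In this model a 64 MB read takes 8 batches with an 8 MB window and a single batch with the proposed 64 MB default, which is the effect the patch is after.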
* Re: [PATCH v2] ceph: set io_pages bdi hint
From: Ilya Dryomov @ 2017-01-07 16:31 UTC
To: Andreas Gerstmayr
Cc: Ceph Development, andreas.gerstmayr, Yan, Zheng, Sage Weil

On Thu, Jan 5, 2017 at 4:23 PM, Andreas Gerstmayr
<andreas.gerstmayr@catalysts.cc> wrote:
> This patch sets the io_pages bdi hint based on the rvsize mount option.
> Without this patch large buffered reads (request size > max readahead)
> are processed sequentially in chunks of the readahead size (i.e. read
> requests are sent out up to the readahead size, then the
> do_generic_file_read() function waits until the first page is received).
>
> With this patch read requests are sent out up to the size specified in
> the new rvsize mount option at once (default: 64 MB).
>
> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
> ---
>
> Thanks for your review.
> On second thought, I think I should not reuse the rsize mount option
> (maximum read size per OSD request), therefore I created a new mount
> option rvsize with a default value of 64 MB (as you suggested).
>
> (Note: This patch depends on kernel version 4.10-rc1)

I'll defer to Zheng's judgement, but a separate mount option for this
seems overkill to me. We should be able to work something out between
the existing rsize and rasize.

Thanks,

                Ilya

* Re: [PATCH v2] ceph: set io_pages bdi hint
From: Yan, Zheng @ 2017-01-09 1:54 UTC
To: Andreas Gerstmayr
Cc: Ilya Dryomov, Ceph Development, andreas.gerstmayr, Sage Weil

> On 8 Jan 2017, at 00:31, Ilya Dryomov <idryomov@gmail.com> wrote:
>
> On Thu, Jan 5, 2017 at 4:23 PM, Andreas Gerstmayr
> <andreas.gerstmayr@catalysts.cc> wrote:
>> This patch sets the io_pages bdi hint based on the rvsize mount option.
>> Without this patch large buffered reads (request size > max readahead)
>> are processed sequentially in chunks of the readahead size (i.e. read
>> requests are sent out up to the readahead size, then the
>> do_generic_file_read() function waits until the first page is received).
>>
>> With this patch read requests are sent out up to the size specified in
>> the new rvsize mount option at once (default: 64 MB).
>>
>> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
>> ---
>>
>> Thanks for your review.
>> On second thought, I think I should not reuse the rsize mount option
>> (maximum read size per OSD request), therefore I created a new mount
>> option rvsize with a default value of 64 MB (as you suggested).
>>
>> (Note: This patch depends on kernel version 4.10-rc1)
>
> I'll defer to Zheng's judgement, but a separate mount option for this
> seems overkill to me. We should be able to work something out between
> the existing rsize and rasize.

I agree with Ilya. I think we can use rsize here.

Regards
Yan, Zheng

>
> Thanks,
>
>                 Ilya

* Re: [PATCH v2] ceph: set io_pages bdi hint
From: Andreas Gerstmayr @ 2017-01-09 9:29 UTC
To: Yan, Zheng
Cc: Ilya Dryomov, Ceph Development, andreas.gerstmayr, Sage Weil

On 09.01.2017 at 02:54, Yan, Zheng wrote:
>
>> On 8 Jan 2017, at 00:31, Ilya Dryomov <idryomov@gmail.com> wrote:
>>
>> On Thu, Jan 5, 2017 at 4:23 PM, Andreas Gerstmayr
>> <andreas.gerstmayr@catalysts.cc> wrote:
>>> This patch sets the io_pages bdi hint based on the rvsize mount option.
>>> Without this patch large buffered reads (request size > max readahead)
>>> are processed sequentially in chunks of the readahead size (i.e. read
>>> requests are sent out up to the readahead size, then the
>>> do_generic_file_read() function waits until the first page is received).
>>>
>>> With this patch read requests are sent out up to the size specified in
>>> the new rvsize mount option at once (default: 64 MB).
>>>
>>> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
>>> ---
>>>
>>> Thanks for your review.
>>> On second thought, I think I should not reuse the rsize mount option
>>> (maximum read size per OSD request), therefore I created a new mount
>>> option rvsize with a default value of 64 MB (as you suggested).
>>>
>>> (Note: This patch depends on kernel version 4.10-rc1)
>>
>> I'll defer to Zheng's judgement, but a separate mount option for this
>> seems overkill to me. We should be able to work something out between
>> the existing rsize and rasize.
>
> I agree with Ilya. I think we can use rsize here.

But then we are using a single config option for two different purposes?
- to specify the maximum size of a single read request to an OSD
- to specify the maximum cumulative size of read requests sent out at once

In general the latter will be a multiple of the former.


Regards,
Andreas

* Re: [PATCH v2] ceph: set io_pages bdi hint
From: Yan, Zheng @ 2017-01-10 6:42 UTC
To: Andreas Gerstmayr
Cc: Ilya Dryomov, Ceph Development, andreas.gerstmayr, Sage Weil

> On 9 Jan 2017, at 17:29, Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc> wrote:
>
> On 09.01.2017 at 02:54, Yan, Zheng wrote:
>>
>>> On 8 Jan 2017, at 00:31, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>
>>> On Thu, Jan 5, 2017 at 4:23 PM, Andreas Gerstmayr
>>> <andreas.gerstmayr@catalysts.cc> wrote:
>>>> This patch sets the io_pages bdi hint based on the rvsize mount option.
>>>> Without this patch large buffered reads (request size > max readahead)
>>>> are processed sequentially in chunks of the readahead size (i.e. read
>>>> requests are sent out up to the readahead size, then the
>>>> do_generic_file_read() function waits until the first page is received).
>>>>
>>>> With this patch read requests are sent out up to the size specified in
>>>> the new rvsize mount option at once (default: 64 MB).
>>>>
>>>> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
>>>> ---
>>>>
>>>> Thanks for your review.
>>>> On second thought, I think I should not reuse the rsize mount option
>>>> (maximum read size per OSD request), therefore I created a new mount
>>>> option rvsize with a default value of 64 MB (as you suggested).
>>>>
>>>> (Note: This patch depends on kernel version 4.10-rc1)
>>>
>>> I'll defer to Zheng's judgement, but a separate mount option for this
>>> seems overkill to me. We should be able to work something out between
>>> the existing rsize and rasize.
>>
>> I agree with Ilya. I think we can use rsize here.
>
> But then we are using a single config option for two different purposes?
> - to specify the maximum size of a single read request to an OSD
> - to specify the maximum cumulative size of read requests sent out at
>   once

Limiting the maximum size of a single request does not make much sense.
The only case I can think of is a system with limited memory. In that
case, it does not make sense to send parallel requests.

Regards
Yan, Zheng

>
> In general the latter will be a multiple of the former.
>
>
> Regards,
> Andreas

* [PATCH v3] ceph: set io_pages bdi hint
From: Andreas Gerstmayr @ 2017-01-10 12:56 UTC
To: ceph-devel
Cc: andreas.gerstmayr, Andreas Gerstmayr, Yan, Zheng, Sage Weil, Ilya Dryomov, linux-kernel

This patch sets the io_pages bdi hint based on the rsize mount option.
Without this patch large buffered reads (request size > max readahead)
are processed sequentially in chunks of the readahead size (i.e. read
requests are sent out up to the readahead size, then the
do_generic_file_read() function waits until the first page is received).

With this patch read requests are sent out at once up to the size
specified in the rsize mount option (default: 64 MB).

Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
---

Thanks for your input.

Changes in v3:
 - set default rsize to 64 MB
 - sanity check of the rsize mount option

(Note: This patch depends on kernel version 4.10-rc1)


 fs/ceph/super.c | 8 ++++++++
 fs/ceph/super.h | 4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 6bd20d7..a0a0b6d 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -952,6 +952,14 @@ static int ceph_register_bdi(struct super_block *sb,
 		fsc->backing_dev_info.ra_pages =
 			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
 
+	if (fsc->mount_options->rsize > fsc->mount_options->rasize &&
+	    fsc->mount_options->rsize >= PAGE_SIZE)
+		fsc->backing_dev_info.io_pages =
+			(fsc->mount_options->rsize + PAGE_SIZE - 1)
+			>> PAGE_SHIFT;
+	else if (fsc->mount_options->rsize == 0)
+		fsc->backing_dev_info.io_pages = ULONG_MAX;
+
 	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
 			   atomic_long_inc_return(&bdi_seq));
 	if (!err)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 3373b61..88b2e6e 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -45,8 +45,8 @@
 #define ceph_test_mount_opt(fsc, opt) \
 	(!!((fsc)->mount_options->flags & CEPH_MOUNT_OPT_##opt))
 
-#define CEPH_RSIZE_DEFAULT             0           /* max read size */
-#define CEPH_RASIZE_DEFAULT            (8192*1024) /* readahead */
+#define CEPH_RSIZE_DEFAULT             (64*1024*1024) /* max read size */
+#define CEPH_RASIZE_DEFAULT            (8192*1024) /* max readahead */
 #define CEPH_MAX_READDIR_DEFAULT        1024
 #define CEPH_MAX_READDIR_BYTES_DEFAULT  (512*1024)
 #define CEPH_SNAPDIRNAME_DEFAULT        ".snap"
-- 
1.8.3.1

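Editor's illustration: the v3 sanity check has three outcomes, which is easy to miss when reading the hunk. The sketch below is a userspace model of that branch structure only. It assumes 4 KiB pages; `v3_io_pages`, `SKETCH_PAGE_SHIFT`, `SKETCH_PAGE_SIZE` and the `current_io_pages` parameter are hypothetical names introduced here, with `current_io_pages` standing in for whatever value the bdi already holds.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Userspace model of the v3 hunk:
 *  - rsize larger than rasize and at least one page: use rsize, rounded
 *    up to whole pages
 *  - rsize == 0: no maximum, modelled as ULONG_MAX
 *  - anything else: leave the existing value untouched
 */
unsigned long v3_io_pages(int rsize, int rasize,
			  unsigned long current_io_pages)
{
	/* Negative option values are not modelled in this sketch. */
	assert(rsize >= 0 && rasize >= 0);

	if (rsize > rasize && (unsigned long)rsize >= SKETCH_PAGE_SIZE)
		return ((unsigned long)rsize + SKETCH_PAGE_SIZE - 1)
			>> SKETCH_PAGE_SHIFT;
	if (rsize == 0)
		return ULONG_MAX;
	return current_io_pages;
}
```

With the defaults from this version (rsize 64 MB, rasize 8 MB) the first branch is taken. An rsize that is not larger than rasize, or smaller than one page, falls through to the last case and the hint is left as it was.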
* [PATCH v4] ceph: set io_pages bdi hint
From: Andreas Gerstmayr @ 2017-01-10 13:17 UTC
To: ceph-devel
Cc: andreas.gerstmayr, Andreas Gerstmayr, Yan, Zheng, Sage Weil, Ilya Dryomov, Jonathan Corbet, linux-doc, linux-kernel

This patch sets the io_pages bdi hint based on the rsize mount option.
Without this patch large buffered reads (request size > max readahead)
are processed sequentially in chunks of the readahead size (i.e. read
requests are sent out up to the readahead size, then the
do_generic_file_read() function waits until the first page is received).

With this patch read requests are sent out at once up to the size
specified in the rsize mount option (default: 64 MB).

Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
---

Changes in v4:
 - update documentation

(Note: This patch depends on kernel version 4.10-rc1)


 Documentation/filesystems/ceph.txt | 5 ++---
 fs/ceph/super.c                    | 8 ++++++++
 fs/ceph/super.h                    | 4 ++--
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
index f5306ee..0b302a1 100644
--- a/Documentation/filesystems/ceph.txt
+++ b/Documentation/filesystems/ceph.txt
@@ -98,11 +98,10 @@ Mount Options
 	size.
 
   rsize=X
-	Specify the maximum read size in bytes. By default there is no
-	maximum.
+	Specify the maximum read size in bytes. Default: 64 MB.
 
   rasize=X
-	Specify the maximum readahead.
+	Specify the maximum readahead. Default: 8 MB.
 
   mount_timeout=X
 	Specify the timeout value for mount (in seconds), in the case
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 6bd20d7..a0a0b6d 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -952,6 +952,14 @@ static int ceph_register_bdi(struct super_block *sb,
 		fsc->backing_dev_info.ra_pages =
 			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
 
+	if (fsc->mount_options->rsize > fsc->mount_options->rasize &&
+	    fsc->mount_options->rsize >= PAGE_SIZE)
+		fsc->backing_dev_info.io_pages =
+			(fsc->mount_options->rsize + PAGE_SIZE - 1)
+			>> PAGE_SHIFT;
+	else if (fsc->mount_options->rsize == 0)
+		fsc->backing_dev_info.io_pages = ULONG_MAX;
+
 	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
 			   atomic_long_inc_return(&bdi_seq));
 	if (!err)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 3373b61..88b2e6e 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -45,8 +45,8 @@
 #define ceph_test_mount_opt(fsc, opt) \
 	(!!((fsc)->mount_options->flags & CEPH_MOUNT_OPT_##opt))
 
-#define CEPH_RSIZE_DEFAULT             0           /* max read size */
-#define CEPH_RASIZE_DEFAULT            (8192*1024) /* readahead */
+#define CEPH_RSIZE_DEFAULT             (64*1024*1024) /* max read size */
+#define CEPH_RASIZE_DEFAULT            (8192*1024) /* max readahead */
 #define CEPH_MAX_READDIR_DEFAULT        1024
 #define CEPH_MAX_READDIR_BYTES_DEFAULT  (512*1024)
 #define CEPH_SNAPDIRNAME_DEFAULT        ".snap"
-- 
1.8.3.1

* Re: [PATCH v4] ceph: set io_pages bdi hint
From: Jeff Layton @ 2017-01-10 16:26 UTC
To: Andreas Gerstmayr, ceph-devel
Cc: andreas.gerstmayr, Yan, Zheng, Sage Weil, Ilya Dryomov, Jonathan Corbet, linux-doc, linux-kernel

On Tue, 2017-01-10 at 14:17 +0100, Andreas Gerstmayr wrote:
> This patch sets the io_pages bdi hint based on the rsize mount option.
> Without this patch large buffered reads (request size > max readahead)
> are processed sequentially in chunks of the readahead size (i.e. read
> requests are sent out up to the readahead size, then the
> do_generic_file_read() function waits until the first page is received).
>
> With this patch read requests are sent out at once up to the size
> specified in the rsize mount option (default: 64 MB).
>
> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
> ---
>
> Changes in v4:
>  - update documentation
>
> (Note: This patch depends on kernel version 4.10-rc1)
>
>
>  Documentation/filesystems/ceph.txt | 5 ++---
>  fs/ceph/super.c                    | 8 ++++++++
>  fs/ceph/super.h                    | 4 ++--
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
> index f5306ee..0b302a1 100644
> --- a/Documentation/filesystems/ceph.txt
> +++ b/Documentation/filesystems/ceph.txt
> @@ -98,11 +98,10 @@ Mount Options
>  	size.
>
>    rsize=X
> -	Specify the maximum read size in bytes. By default there is no
> -	maximum.
> +	Specify the maximum read size in bytes. Default: 64 MB.
>
>    rasize=X
> -	Specify the maximum readahead.
> +	Specify the maximum readahead. Default: 8 MB.
>
>    mount_timeout=X
>  	Specify the timeout value for mount (in seconds), in the case
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index 6bd20d7..a0a0b6d 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -952,6 +952,14 @@ static int ceph_register_bdi(struct super_block *sb,
>  		fsc->backing_dev_info.ra_pages =
>  			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
>
> +	if (fsc->mount_options->rsize > fsc->mount_options->rasize &&
> +	    fsc->mount_options->rsize >= PAGE_SIZE)
> +		fsc->backing_dev_info.io_pages =
> +			(fsc->mount_options->rsize + PAGE_SIZE - 1)
> +			>> PAGE_SHIFT;
> +	else if (fsc->mount_options->rsize == 0)
> +		fsc->backing_dev_info.io_pages = ULONG_MAX;
> +
>  	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
>  			   atomic_long_inc_return(&bdi_seq));
>  	if (!err)
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index 3373b61..88b2e6e 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -45,8 +45,8 @@
>  #define ceph_test_mount_opt(fsc, opt) \
>  	(!!((fsc)->mount_options->flags & CEPH_MOUNT_OPT_##opt))
>
> -#define CEPH_RSIZE_DEFAULT             0           /* max read size */
> -#define CEPH_RASIZE_DEFAULT            (8192*1024) /* readahead */
> +#define CEPH_RSIZE_DEFAULT             (64*1024*1024) /* max read size */
> +#define CEPH_RASIZE_DEFAULT            (8192*1024) /* max readahead */
>  #define CEPH_MAX_READDIR_DEFAULT        1024
>  #define CEPH_MAX_READDIR_BYTES_DEFAULT  (512*1024)
>  #define CEPH_SNAPDIRNAME_DEFAULT        ".snap"

Acked-by: Jeff Layton <jlayton@redhat.com>

* Re: [PATCH v4] ceph: set io_pages bdi hint
From: Yan, Zheng @ 2017-01-11 2:43 UTC
To: Andreas Gerstmayr
Cc: ceph-devel, Jeff Layton, andreas.gerstmayr, Sage Weil, Ilya Dryomov, Jonathan Corbet, linux-doc, Linux Kernel Mailing List

> On 10 Jan 2017, at 21:17, Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc> wrote:
>
> This patch sets the io_pages bdi hint based on the rsize mount option.
> Without this patch large buffered reads (request size > max readahead)
> are processed sequentially in chunks of the readahead size (i.e. read
> requests are sent out up to the readahead size, then the
> do_generic_file_read() function waits until the first page is received).
>
> With this patch read requests are sent out at once up to the size
> specified in the rsize mount option (default: 64 MB).
>
> Signed-off-by: Andreas Gerstmayr <andreas.gerstmayr@catalysts.cc>
> ---
>
> Changes in v4:
>  - update documentation
>
> (Note: This patch depends on kernel version 4.10-rc1)
>
>
>  Documentation/filesystems/ceph.txt | 5 ++---
>  fs/ceph/super.c                    | 8 ++++++++
>  fs/ceph/super.h                    | 4 ++--
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
> index f5306ee..0b302a1 100644
> --- a/Documentation/filesystems/ceph.txt
> +++ b/Documentation/filesystems/ceph.txt
> @@ -98,11 +98,10 @@ Mount Options
>  	size.
>
>    rsize=X
> -	Specify the maximum read size in bytes. By default there is no
> -	maximum.
> +	Specify the maximum read size in bytes. Default: 64 MB.
>
>    rasize=X
> -	Specify the maximum readahead.
> +	Specify the maximum readahead. Default: 8 MB.
>
>    mount_timeout=X
>  	Specify the timeout value for mount (in seconds), in the case
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index 6bd20d7..a0a0b6d 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -952,6 +952,14 @@ static int ceph_register_bdi(struct super_block *sb,
>  		fsc->backing_dev_info.ra_pages =
>  			VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
>
> +	if (fsc->mount_options->rsize > fsc->mount_options->rasize &&
> +	    fsc->mount_options->rsize >= PAGE_SIZE)
> +		fsc->backing_dev_info.io_pages =
> +			(fsc->mount_options->rsize + PAGE_SIZE - 1)
> +			>> PAGE_SHIFT;
> +	else if (fsc->mount_options->rsize == 0)
> +		fsc->backing_dev_info.io_pages = ULONG_MAX;
> +
>  	err = bdi_register(&fsc->backing_dev_info, NULL, "ceph-%ld",
>  			   atomic_long_inc_return(&bdi_seq));
>  	if (!err)
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index 3373b61..88b2e6e 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -45,8 +45,8 @@
>  #define ceph_test_mount_opt(fsc, opt) \
>  	(!!((fsc)->mount_options->flags & CEPH_MOUNT_OPT_##opt))
>
> -#define CEPH_RSIZE_DEFAULT             0           /* max read size */
> -#define CEPH_RASIZE_DEFAULT            (8192*1024) /* readahead */
> +#define CEPH_RSIZE_DEFAULT             (64*1024*1024) /* max read size */
> +#define CEPH_RASIZE_DEFAULT            (8192*1024) /* max readahead */
>  #define CEPH_MAX_READDIR_DEFAULT        1024
>  #define CEPH_MAX_READDIR_BYTES_DEFAULT  (512*1024)
>  #define CEPH_SNAPDIRNAME_DEFAULT        ".snap"

Applied, thanks.

Yan, Zheng

> --
> 1.8.3.1
>
