* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
       [not found] <20170601020556.GE16905@yexl-desktop>
@ 2017-06-01 11:41 ` Jeff Layton
  2017-06-01 11:49   ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-01 11:41 UTC (permalink / raw)
  To: kernel test robot, Benjamin Coddington
  Cc: Alexander Viro, bfields, linux-fsdevel, linux-nfs, lkp,
	Christoph Hellwig

On Thu, 2017-06-01 at 10:05 +0800, kernel test robot wrote:
> Greeting,
> 
> FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops
> due to commit:
> 
> commit: 9d21d181d06acab9a8e80eac2ec4eed77b656793 ("fs/locks: Set
> fl_nspid at file_lock allocation")
> url:
> https://github.com/0day-ci/linux/commits/Benjamin-Coddington/fs-locks-Alloc-file_lock-where-practical/20170527-050700
> 

Ouch, that's a rather nasty performance hit. In hindsight, maybe we
shouldn't move those off the stack after all? Heck, if it's that
significant, maybe we should move the F_SETLK callers to allocate
these on the stack as well?

> in testcase: will-it-scale
> on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
> with following parameters:
> 
> 	test: lock1
> 	cpufreq_governor: performance
> 
> test-description: Will It Scale takes a testcase and runs it from 1
> through to n parallel copies to see if the testcase will scale. It
> builds both a process and threads based test in order to see any
> differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
> 
> In addition to that, the commit also has significant impact on the
> following tests:
> 
> +------------------+----------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops -4.9% regression  |
> | test machine     | 16 threads Intel(R) Atom(R) CPU 3958 @ 2.00GHz with 64G memory |
> | test parameters  | cpufreq_governor=performance                                   |
> |                  | mode=process                                                   |
> |                  | nr_task=100%                                                   |
> |                  | test=lock1                                                     |
> +------------------+----------------------------------------------------------------+
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> To reproduce:
> 
>         git clone https://github.com/01org/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> testcase/path_params/tbox_group/run: will-it-scale/lock1-performance/lkp-ivb-d04
> 
> 09790e423b32fba4           9d21d181d06acab9a8e80eac2e
> ----------------           --------------------------
>       0.51          19%          0.60 ±  7%  will-it-scale.scalability
>    2462089         -14%       2114597        will-it-scale.per_process_ops
>    2195246         -26%       1631578        will-it-scale.per_thread_ops
>        350                        356        will-it-scale.time.system_time
>      28.89         -24%         22.06        will-it-scale.time.user_time
>      32.78                      31.97        turbostat.PkgWatt
>      15.58          -5%         14.80        turbostat.CorWatt
>      19284                      18803        vmstat.system.in
>      32208          -4%         31052        vmstat.system.cs
>       1630 ±173%  2e+04         18278 ± 27%  latency_stats.avg.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
>       1630 ±173%  2e+04         18278 ± 27%  latency_stats.max.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
>       1630 ±173%  2e+04         18278 ± 27%  latency_stats.sum.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
>  1.911e+09 ±  6%   163%     5.022e+09 ±  5%  perf-stat.cache-references
>      27.58 ± 12%    17%         32.14 ±  7%  perf-stat.iTLB-load-miss-rate%
>    9881103          -4%       9527607        perf-stat.context-switches
>  9.567e+11 ±  9%   -14%     8.181e+11 ±  9%  perf-stat.dTLB-loads
>   6.85e+11 ±  4%   -16%     5.761e+11 ±  6%  perf-stat.branch-instructions
>  3.469e+12 ±  4%   -17%     2.893e+12 ±  6%  perf-stat.instructions
>       1.24 ±  4%   -19%          1.00        perf-stat.ipc
>       3.18 ±  8%   -62%          1.19 ± 19%  perf-stat.cache-miss-rate%
> 
> [ASCII trend plots omitted: perf-stat.cache-references,
>  will-it-scale.time.user_time, will-it-scale.time.system_time, and
>  will-it-scale.per_thread_ops, each plotting bisect-good (*) against
>  bisect-bad (O) samples over the run series]
> 
> [*] bisect-good sample
> [O] bisect-bad sample
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> Thanks,
> Xiaolong

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
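For context, the on-stack pattern being discussed has roughly the
following shape. This is a minimal sketch, not the actual fs/locks.c
code: locks_init_lock(), flock_to_posix_lock(), and vfs_lock_file()
are real helpers there, but the wrapper itself is illustrative and
error handling is trimmed:

	/* F_SETLK with an on-stack file_lock: no kmem_cache allocation
	 * (and, with the bisected patch, no get_pid/put_pid pair) on
	 * every lock/unlock call */
	static int setlk_on_stack(struct file *filp, struct flock *flock)
	{
		struct file_lock fl;	/* stack, not locks_alloc_lock() */
		int error;

		locks_init_lock(&fl);	/* initialize list heads etc. */
		error = flock_to_posix_lock(filp, &fl, flock);
		if (error)
			return error;
		return vfs_lock_file(filp, F_SETLK, &fl, NULL);
	}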
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-01 11:41 ` [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression Jeff Layton
@ 2017-06-01 11:49   ` Benjamin Coddington
  2017-06-01 12:59     ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-01 11:49 UTC (permalink / raw)
  To: Jeff Layton
  Cc: kernel test robot, Alexander Viro, bfields, linux-fsdevel,
	linux-nfs, lkp, Christoph Hellwig

On 1 Jun 2017, at 7:41, Jeff Layton wrote:

> On Thu, 2017-06-01 at 10:05 +0800, kernel test robot wrote:
>> Greeting,
>>
>> FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops
>> due to commit:
>>
>> commit: 9d21d181d06acab9a8e80eac2ec4eed77b656793 ("fs/locks: Set
>> fl_nspid at file_lock allocation")
>> url:
>> https://github.com/0day-ci/linux/commits/Benjamin-Coddington/fs-locks-Alloc-file_lock-where-practical/20170527-050700
>>
>
> Ouch, that's a rather nasty performance hit. In hindsight, maybe we
> shouldn't move those off the stack after all? Heck, if it's that
> significant, maybe we should move the F_SETLK callers to allocate
> these on the stack as well?

We can do that. But, I think this is picking up the
locks_mandatory_area() allocation which is now removed. The attached
config has CONFIG_MANDATORY_FILE_LOCKING=y, so there's an allocation
on every read/write.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread
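For reference, whether a read or write can reach the
locks_mandatory_area() path at all is gated per-inode, which bears on
how often that allocation could actually fire. The gate looks roughly
like this (recalled from include/linux/fs.h of that era, so treat the
exact form as approximate):

	/* inode marked for mandatory locking: setgid set, group-exec clear */
	static inline int __mandatory_lock(struct inode *ino)
	{
		return (ino->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID;
	}

	/* ...and the filesystem must be mounted with -o mand */
	static inline int mandatory_lock(struct inode *ino)
	{
		return IS_MANDLOCK(ino) && __mandatory_lock(ino);
	}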
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-01 11:49   ` Benjamin Coddington
@ 2017-06-01 12:59     ` Jeff Layton
  2017-06-01 15:14       ` J. Bruce Fields
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-01 12:59 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: kernel test robot, Alexander Viro, bfields, linux-fsdevel,
	linux-nfs, lkp, Christoph Hellwig

On Thu, 2017-06-01 at 07:49 -0400, Benjamin Coddington wrote:
> On 1 Jun 2017, at 7:41, Jeff Layton wrote:
> > On Thu, 2017-06-01 at 10:05 +0800, kernel test robot wrote:
> > > Greeting,
> > > 
> > > FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops
> > > due to commit:
> > > 
> > > commit: 9d21d181d06acab9a8e80eac2ec4eed77b656793 ("fs/locks: Set
> > > fl_nspid at file_lock allocation")
> > > url:
> > > https://github.com/0day-ci/linux/commits/Benjamin-Coddington/fs-locks-Alloc-file_lock-where-practical/20170527-050700
> > > 
> > 
> > Ouch, that's a rather nasty performance hit. In hindsight, maybe we
> > shouldn't move those off the stack after all? Heck, if it's that
> > significant, maybe we should move the F_SETLK callers to allocate
> > these on the stack as well?
> 
> We can do that. But, I think this is picking up the
> locks_mandatory_area() allocation which is now removed. The attached
> config has CONFIG_MANDATORY_FILE_LOCKING=y, so there's an allocation
> on every read/write.

I'm not so sure. That would only be the case if the thing were marked
for mandatory locking (a really rare thing).

The test is really simple and I don't think any read/write activity is
involved:

https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c

...and the 0-day robot bisected it down to this patch, IIUC:

https://github.com/0day-ci/linux/commit/9d21d181d06acab9a8e80eac2ec4eed77b656793

It seems likely that it's the extra get_pid/put_pid in the allocation
and free codepaths. I expected those to be pretty cheap, but maybe
they're not?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
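Reconstructed from the commit title rather than from the patch itself,
the blamed change has roughly this shape (locks_alloc_lock(),
locks_free_lock(), locks_init_lock_heads(), and filelock_cache are
real fs/locks.c names, but the exact diff may differ, and the BUG_ON
sanity checks in the real free path are omitted):

	struct file_lock *locks_alloc_lock(void)
	{
		struct file_lock *fl = kmem_cache_zalloc(filelock_cache,
							 GFP_KERNEL);

		if (fl) {
			locks_init_lock_heads(fl);
			/* new: pin a struct pid reference on *every*
			 * allocation, even for the short-lived
			 * file_locks that only describe a request */
			fl->fl_nspid = get_pid(task_tgid(current));
		}
		return fl;
	}

	void locks_free_lock(struct file_lock *fl)
	{
		locks_release_private(fl);
		/* new: the matching atomic put on every free */
		put_pid(fl->fl_nspid);
		kmem_cache_free(filelock_cache, fl);
	}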
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-01 12:59     ` Jeff Layton
@ 2017-06-01 15:14       ` J. Bruce Fields
  2017-06-01 15:48         ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: J. Bruce Fields @ 2017-06-01 15:14 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Benjamin Coddington, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> On Thu, 2017-06-01 at 07:49 -0400, Benjamin Coddington wrote:
> > On 1 Jun 2017, at 7:41, Jeff Layton wrote:
> > > On Thu, 2017-06-01 at 10:05 +0800, kernel test robot wrote:
> > > > Greeting,
> > > > 
> > > > FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops
> > > > due to commit:
> > > > 
> > > > commit: 9d21d181d06acab9a8e80eac2ec4eed77b656793 ("fs/locks: Set
> > > > fl_nspid at file_lock allocation")
> > > > url:
> > > > https://github.com/0day-ci/linux/commits/Benjamin-Coddington/fs-locks-Alloc-file_lock-where-practical/20170527-050700
> > > 
> > > Ouch, that's a rather nasty performance hit. In hindsight, maybe we
> > > shouldn't move those off the stack after all? Heck, if it's that
> > > significant, maybe we should move the F_SETLK callers to allocate
> > > these on the stack as well?
> > 
> > We can do that. But, I think this is picking up the
> > locks_mandatory_area() allocation which is now removed. The attached
> > config has CONFIG_MANDATORY_FILE_LOCKING=y, so there's an allocation
> > on every read/write.
> 
> I'm not so sure. That would only be the case if the thing were marked
> for mandatory locking (a really rare thing).
> 
> The test is really simple and I don't think any read/write activity is
> involved:
> 
> https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c

So it's just F_WRLCK/F_UNLCK in a loop spread across multiple cores?
I'd think real workloads do some work while holding the lock, and a 15%
regression on just the pure lock/unlock loop might not matter? But best
to be careful, I guess.

--b.

> ...and the 0-day robot bisected it down to this patch, IIUC:
> 
> https://github.com/0day-ci/linux/commit/9d21d181d06acab9a8e80eac2ec4eed77b656793
> 
> It seems likely that it's the extra get_pid/put_pid in the allocation
> and free codepaths. I expected those to be pretty cheap, but maybe
> they're not?

^ permalink raw reply	[flat|nested] 10+ messages in thread
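The test body boils down to the following loop (paraphrased from the
lock1.c linked above; harness details such as the temp-file naming and
the testcase() signature may differ slightly):

	#include <stdlib.h>
	#include <unistd.h>
	#include <fcntl.h>

	void testcase(unsigned long long *iterations)
	{
		char tmpfile[] = "/tmp/willitscale.XXXXXX";
		int fd = mkstemp(tmpfile);
		struct flock lck = {
			.l_whence = SEEK_SET,
			.l_start  = 0,
			.l_len    = 1,
		};

		unlink(tmpfile);	/* file lives only as an open fd */
		while (1) {
			lck.l_type = F_WRLCK;
			fcntl(fd, F_SETLK, &lck);	/* take the write lock */
			lck.l_type = F_UNLCK;
			fcntl(fd, F_SETLK, &lck);	/* and drop it again */
			(*iterations) += 2;	/* each fcntl() is one op */
		}
	}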
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-01 15:14       ` J. Bruce Fields
@ 2017-06-01 15:48         ` Jeff Layton
  2017-06-05 18:34           ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-01 15:48 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Benjamin Coddington, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
> On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> > On Thu, 2017-06-01 at 07:49 -0400, Benjamin Coddington wrote:
> > > On 1 Jun 2017, at 7:41, Jeff Layton wrote:
> > > > On Thu, 2017-06-01 at 10:05 +0800, kernel test robot wrote:
> > > > > Greeting,
> > > > > 
> > > > > FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops
> > > > > due to commit:
> > > > > 
> > > > > commit: 9d21d181d06acab9a8e80eac2ec4eed77b656793 ("fs/locks: Set
> > > > > fl_nspid at file_lock allocation")
> > > > > url:
> > > > > https://github.com/0day-ci/linux/commits/Benjamin-Coddington/fs-locks-Alloc-file_lock-where-practical/20170527-050700
> > > > 
> > > > Ouch, that's a rather nasty performance hit. In hindsight, maybe we
> > > > shouldn't move those off the stack after all? Heck, if it's that
> > > > significant, maybe we should move the F_SETLK callers to allocate
> > > > these on the stack as well?
> > > 
> > > We can do that. But, I think this is picking up the
> > > locks_mandatory_area() allocation which is now removed. The attached
> > > config has CONFIG_MANDATORY_FILE_LOCKING=y, so there's an allocation
> > > on every read/write.
> > 
> > I'm not so sure. That would only be the case if the thing were marked
> > for mandatory locking (a really rare thing).
> > 
> > The test is really simple and I don't think any read/write activity is
> > involved:
> > 
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
> 
> So it's just F_WRLCK/F_UNLCK in a loop spread across multiple cores?
> I'd think real workloads do some work while holding the lock, and a 15%
> regression on just the pure lock/unlock loop might not matter? But best
> to be careful, I guess.
> 
> --b.

Yeah, that's my take.

I was assuming that getting a pid reference would be essentially free,
but it doesn't seem to be.

So, I think we probably want to avoid taking it for a file_lock that we
use to request a lock, but do take it for a file_lock that is used to
record a lock. How best to code that up, I'm not quite sure...

> > ...and the 0-day robot bisected it down to this patch, IIUC:
> > 
> > https://github.com/0day-ci/linux/commit/9d21d181d06acab9a8e80eac2ec4eed77b656793
> > 
> > It seems likely that it's the extra get_pid/put_pid in the allocation
> > and free codepaths. I expected those to be pretty cheap, but maybe
> > they're not?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-01 15:48         ` Jeff Layton
@ 2017-06-05 18:34           ` Benjamin Coddington
  2017-06-05 22:02             ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-05 18:34 UTC (permalink / raw)
  To: Jeff Layton
  Cc: J. Bruce Fields, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On 1 Jun 2017, at 11:48, Jeff Layton wrote:

> On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
>> On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
>>> I'm not so sure. That would only be the case if the thing were
>>> marked for mandatory locking (a really rare thing).
>>>
>>> The test is really simple and I don't think any read/write activity
>>> is involved:
>>>
>>> https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
>>
>> So it's just F_WRLCK/F_UNLCK in a loop spread across multiple cores?
>> I'd think real workloads do some work while holding the lock, and a
>> 15% regression on just the pure lock/unlock loop might not matter?
>> But best to be careful, I guess.
>>
>> --b.
>
> Yeah, that's my take.
>
> I was assuming that getting a pid reference would be essentially free,
> but it doesn't seem to be.
>
> So, I think we probably want to avoid taking it for a file_lock that
> we use to request a lock, but do take it for a file_lock that is used
> to record a lock. How best to code that up, I'm not quite sure...

Maybe as simple as only setting fl_nspid in locks_insert_lock_ctx(),
but that seems to just take us back to the problem of getting the pid
wrong if the lock is inserted later by a different worker than the one
that created the request.

I have a mind now to just drop fl_nspid off the struct file_lock
completely, and instead just carry fl_pid, and when we do F_GETLK, we
can do:

	task = find_task_by_pid_ns(fl_pid, init_pid_ns)
	fl_nspid = task_pid_nr_ns(task, task_active_pid_ns(current))

That moves all the work off into the F_GETLK case, which I think is not
used so much.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread
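Fleshed out a little, that lookup-time translation might look like the
sketch below. find_pid_ns() and pid_vnr() are real kernel primitives;
the lookup needs RCU protection, and returning 0 covers a locker that
has already exited or is not visible in the caller's namespace:

	/* translate a pid number recorded in the init ns into the
	 * F_GETLK (or /proc/locks) reader's pid namespace, doing the
	 * work only when somebody actually asks for it */
	static pid_t locks_translate_pid(int init_nr)
	{
		struct pid *pid;
		pid_t vnr = 0;

		rcu_read_lock();
		pid = find_pid_ns(init_nr, &init_pid_ns);
		if (pid)
			vnr = pid_vnr(pid);	/* nr in current's active ns */
		rcu_read_unlock();
		return vnr;
	}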
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-05 18:34           ` Benjamin Coddington
@ 2017-06-05 22:02             ` Jeff Layton
  2017-06-06 13:00               ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-05 22:02 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: J. Bruce Fields, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On Mon, 2017-06-05 at 14:34 -0400, Benjamin Coddington wrote:
> On 1 Jun 2017, at 11:48, Jeff Layton wrote:
> > On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
> > > On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> > > > I'm not so sure. That would only be the case if the thing were
> > > > marked for mandatory locking (a really rare thing).
> > > > 
> > > > The test is really simple and I don't think any read/write
> > > > activity is involved:
> > > > 
> > > > https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
> > > 
> > > So it's just F_WRLCK/F_UNLCK in a loop spread across multiple cores?
> > > I'd think real workloads do some work while holding the lock, and a
> > > 15% regression on just the pure lock/unlock loop might not matter?
> > > But best to be careful, I guess.
> > > 
> > > --b.
> > 
> > Yeah, that's my take.
> > 
> > I was assuming that getting a pid reference would be essentially free,
> > but it doesn't seem to be.
> > 
> > So, I think we probably want to avoid taking it for a file_lock that
> > we use to request a lock, but do take it for a file_lock that is used
> > to record a lock. How best to code that up, I'm not quite sure...
> 
> Maybe as simple as only setting fl_nspid in locks_insert_lock_ctx(),
> but that seems to just take us back to the problem of getting the pid
> wrong if the lock is inserted later by a different worker than the one
> that created the request.
> 
> I have a mind now to just drop fl_nspid off the struct file_lock
> completely, and instead just carry fl_pid, and when we do F_GETLK, we
> can do:
> 
> 	task = find_task_by_pid_ns(fl_pid, init_pid_ns)
> 	fl_nspid = task_pid_nr_ns(task, task_active_pid_ns(current))
> 
> That moves all the work off into the F_GETLK case, which I think is not
> used so much.

Actually I think what might work best is to:

- have locks_copy_conflock also copy the fl_nspid and take a reference
  to it (as your patch #2 does)

- only set fl_nspid and take a reference there in locks_insert_lock_ctx
  if it's not already set

- allow ->lock operations (like nfs) to set fl_nspid before they call
  locks_lock_inode_wait to set the local lock. Might need to take an
  nspid reference before dispatching an RPC so that you get the right
  thread context.

Would that work?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
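A sketch of what the second bullet above could look like in fs/locks.c
(locks_insert_lock_ctx() and locks_insert_global_locks() are the real
names; the conditional is the new part, and the exact form is only an
illustration):

	static void locks_insert_lock_ctx(struct file_lock *fl,
					  struct list_head *before)
	{
		/* pin a pid reference only when a lock is actually
		 * recorded, and only if the filesystem's ->lock op
		 * (e.g. NFS) hasn't already filled it in from the
		 * original requesting task */
		if (!fl->fl_nspid)
			fl->fl_nspid = get_pid(task_tgid(current));
		list_add_tail(&fl->fl_list, before);
		locks_insert_global_locks(fl);
	}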
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-05 22:02             ` Jeff Layton
@ 2017-06-06 13:00               ` Benjamin Coddington
  2017-06-06 13:15                 ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-06 13:00 UTC (permalink / raw)
  To: Jeff Layton
  Cc: J. Bruce Fields, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On 5 Jun 2017, at 18:02, Jeff Layton wrote:

> On Mon, 2017-06-05 at 14:34 -0400, Benjamin Coddington wrote:
> > On 1 Jun 2017, at 11:48, Jeff Layton wrote:
> > > On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
> > > > On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> > > > > I'm not so sure. That would only be the case if the thing were
> > > > > marked for mandatory locking (a really rare thing).
> > > > > 
> > > > > The test is really simple and I don't think any read/write
> > > > > activity is involved:
> > > > > 
> > > > > https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
> > > > 
> > > > So it's just F_WRLCK/F_UNLCK in a loop spread across multiple cores?
> > > > I'd think real workloads do some work while holding the lock, and a
> > > > 15% regression on just the pure lock/unlock loop might not matter?
> > > > But best to be careful, I guess.
> > > > 
> > > > --b.
> > > 
> > > Yeah, that's my take.
> > > 
> > > I was assuming that getting a pid reference would be essentially
> > > free, but it doesn't seem to be.
> > > 
> > > So, I think we probably want to avoid taking it for a file_lock that
> > > we use to request a lock, but do take it for a file_lock that is used
> > > to record a lock. How best to code that up, I'm not quite sure...
> > 
> > Maybe as simple as only setting fl_nspid in locks_insert_lock_ctx(),
> > but that seems to just take us back to the problem of getting the pid
> > wrong if the lock is inserted later by a different worker than the one
> > that created the request.
> > 
> > I have a mind now to just drop fl_nspid off the struct file_lock
> > completely, and instead just carry fl_pid, and when we do F_GETLK, we
> > can do:
> > 
> > 	task = find_task_by_pid_ns(fl_pid, init_pid_ns)
> > 	fl_nspid = task_pid_nr_ns(task, task_active_pid_ns(current))
> > 
> > That moves all the work off into the F_GETLK case, which I think is
> > not used so much.
> 
> Actually I think what might work best is to:
> 
> - have locks_copy_conflock also copy the fl_nspid and take a reference
>   to it (as your patch #2 does)
> 
> - only set fl_nspid and take a reference there in locks_insert_lock_ctx
>   if it's not already set
> 
> - allow ->lock operations (like nfs) to set fl_nspid before they call
>   locks_lock_inode_wait to set the local lock. Might need to take an
>   nspid reference before dispatching an RPC so that you get the right
>   thread context.
> 
> Would that work?

It would, but I think fl_nspid is completely unnecessary. The reason we
have it is so that we can translate the pid number into other
namespaces, the most common case being that F_GETLK and views of
/proc/locks within a namespace represent the same pid numbers as the
processes in that namespace that are holding the locks.

It is much simpler to just keep using fl_pid as the pid number in the
init namespace, but move the translation of that pid number to lookup
time, rather than creation time.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-06 13:00               ` Benjamin Coddington
@ 2017-06-06 13:15                 ` Jeff Layton
  2017-06-06 13:21                   ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-06 13:15 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: J. Bruce Fields, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On Tue, 2017-06-06 at 09:00 -0400, Benjamin Coddington wrote:
> On 5 Jun 2017, at 18:02, Jeff Layton wrote:
> > On Mon, 2017-06-05 at 14:34 -0400, Benjamin Coddington wrote:
> > > On 1 Jun 2017, at 11:48, Jeff Layton wrote:
> > > > On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
> > > > > On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> > > > > > I'm not so sure. That would only be the case if the thing were
> > > > > > marked for mandatory locking (a really rare thing).
> > > > > > 
> > > > > > The test is really simple and I don't think any read/write
> > > > > > activity is involved:
> > > > > > 
> > > > > > https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
> > > > > 
> > > > > So it's just F_WRLCK/F_UNLCK in a loop spread across multiple
> > > > > cores? I'd think real workloads do some work while holding the
> > > > > lock, and a 15% regression on just the pure lock/unlock loop
> > > > > might not matter? But best to be careful, I guess.
> > > > > 
> > > > > --b.
> > > > 
> > > > Yeah, that's my take.
> > > > 
> > > > I was assuming that getting a pid reference would be essentially
> > > > free, but it doesn't seem to be.
> > > > 
> > > > So, I think we probably want to avoid taking it for a file_lock
> > > > that we use to request a lock, but do take it for a file_lock
> > > > that is used to record a lock. How best to code that up, I'm not
> > > > quite sure...
> > > 
> > > Maybe as simple as only setting fl_nspid in locks_insert_lock_ctx(),
> > > but that seems to just take us back to the problem of getting the
> > > pid wrong if the lock is inserted later by a different worker than
> > > the one that created the request.
> > > 
> > > I have a mind now to just drop fl_nspid off the struct file_lock
> > > completely, and instead just carry fl_pid, and when we do F_GETLK,
> > > we can do:
> > > 
> > > 	task = find_task_by_pid_ns(fl_pid, init_pid_ns)
> > > 	fl_nspid = task_pid_nr_ns(task, task_active_pid_ns(current))
> > > 
> > > That moves all the work off into the F_GETLK case, which I think is
> > > not used so much.
> > 
> > Actually I think what might work best is to:
> > 
> > - have locks_copy_conflock also copy the fl_nspid and take a reference
> >   to it (as your patch #2 does)
> > 
> > - only set fl_nspid and take a reference there in
> >   locks_insert_lock_ctx if it's not already set
> > 
> > - allow ->lock operations (like nfs) to set fl_nspid before they call
> >   locks_lock_inode_wait to set the local lock. Might need to take an
> >   nspid reference before dispatching an RPC so that you get the right
> >   thread context.
> > 
> > Would that work?
> 
> It would, but I think fl_nspid is completely unnecessary. The reason we
> have it is so that we can translate the pid number into other
> namespaces, the most common case being that F_GETLK and views of
> /proc/locks within a namespace represent the same pid numbers as the
> processes in that namespace that are holding the locks.
> 
> It is much simpler to just keep using fl_pid as the pid number in the
> init namespace, but move the translation of that pid number to lookup
> time, rather than creation time.

I think that would also work and I like the idea of getting rid of a
field in file_lock.

So, to be clear:

fl_pid would then store the pid of the process in the init_pid_ns, and
you'd just translate it as appropriate to the requestor's namespace?

If we want to go that route, then you'll probably still need a flag of
some sort to indicate that the fl_pid is to be expressed "as is", for
remote filesystems.

OTOH, if the lock is held remotely, I wonder if we'd be better off
simply reporting the pid as '-1', like we do with OFD locks. Hardly
anything pays attention to l_pid anyway and it's more or less
meaningless once the filesystem extends beyond the machine you're on.

That said, I'd be inclined to do that in a separate set so we could
revert it if it caused problems somewhere.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
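Putting the two options side by side, F_GETLK reporting under this
scheme might look like the following sketch. FL_PID_REMOTE is a
hypothetical flag name and locks_translate_pid() is the lookup-time
helper sketched earlier in the thread; FL_OFDLCK is the real flag used
for OFD locks:

	static pid_t locks_report_pid(struct file_lock *fl)
	{
		if (fl->fl_flags & FL_PID_REMOTE)
			return fl->fl_pid;	/* filesystem-supplied, as-is */
		if (fl->fl_flags & FL_OFDLCK)
			return -1;		/* OFD locks already report -1 */
		/* local POSIX lock: init-ns nr -> caller's ns */
		return locks_translate_pid(fl->fl_pid);
	}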
* Re: [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression
  2017-06-06 13:15                 ` Jeff Layton
@ 2017-06-06 13:21                   ` Benjamin Coddington
  0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-06 13:21 UTC (permalink / raw)
  To: Jeff Layton
  Cc: J. Bruce Fields, kernel test robot, Alexander Viro,
	linux-fsdevel, linux-nfs, lkp, Christoph Hellwig

On 6 Jun 2017, at 9:15, Jeff Layton wrote:

> On Tue, 2017-06-06 at 09:00 -0400, Benjamin Coddington wrote:
>> It would, but I think fl_nspid is completely unnecessary. The reason
>> we have it is so that we can translate the pid number into other
>> namespaces, the most common case being that F_GETLK and views of
>> /proc/locks within a namespace represent the same pid numbers as the
>> processes in that namespace that are holding the locks.
>>
>> It is much simpler to just keep using fl_pid as the pid number in the
>> init namespace, but move the translation of that pid number to lookup
>> time, rather than creation time.
>
> I think that would also work and I like the idea of getting rid of a
> field in file_lock.
>
> So, to be clear:
>
> fl_pid would then store the pid of the process in the init_pid_ns, and
> you'd just translate it as appropriate to the requestor's namespace?

Right!

> If we want to go that route, then you'll probably still need a flag of
> some sort to indicate that the fl_pid is to be expressed "as is", for
> remote filesystems.

Yes, the hack I have now still uses that flag.

> OTOH, if the lock is held remotely, I wonder if we'd be better off
> simply reporting the pid as '-1', like we do with OFD locks. Hardly
> anything pays attention to l_pid anyway and it's more or less
> meaningless once the filesystem extends beyond the machine you're on.
>
> That said, I'd be inclined to do that in a separate set so we could
> revert it if it caused problems somewhere.

I think keeping that change separate is a good idea. It might be best
to just stay away from it entirely unless there's a compelling reason
for it. If l_sysid ever happens, then there's no need to change the
behavior of l_pid back again.

I'll send some patches in a bit when I am happy with them.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-06-06 13:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20170601020556.GE16905@yexl-desktop>
2017-06-01 11:41 ` [lkp-robot] [fs/locks] 9d21d181d0: will-it-scale.per_process_ops -14.1% regression Jeff Layton
2017-06-01 11:49   ` Benjamin Coddington
2017-06-01 12:59     ` Jeff Layton
2017-06-01 15:14       ` J. Bruce Fields
2017-06-01 15:48         ` Jeff Layton
2017-06-05 18:34           ` Benjamin Coddington
2017-06-05 22:02             ` Jeff Layton
2017-06-06 13:00               ` Benjamin Coddington
2017-06-06 13:15                 ` Jeff Layton
2017-06-06 13:21                   ` Benjamin Coddington
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).