linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ma, Yu" <yu.ma@intel.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	tim.c.chen@linux.intel.com, tim.c.chen@intel.com,
	pan.deng@intel.com, tianyou.li@intel.com, yu.ma@intel.com
Subject: Re: [PATCH 3/3] fs/file.c: move sanity_check from alloc_fd() to put_unused_fd()
Date: Tue, 18 Jun 2024 01:22:13 +0800	[thread overview]
Message-ID: <c96d2f39-fa90-41da-985c-116cf4cc967a@intel.com> (raw)
In-Reply-To: <suehfaqsibehz3vws6oourswenaz7bbbn75a7joi5cxi6komyk@3fxp7v3bg5qk>


On 6/17/2024 7:23 PM, Mateusz Guzik wrote:
> On Sun, Jun 16, 2024 at 11:47:40AM +0800, Ma, Yu wrote:
>> On 6/15/2024 12:41 PM, Mateusz Guzik wrote:
>>> So you are moving this to another locked area, but one which does not
>>> execute in the benchmark?
>> The consideration here as mentioned is to reduce the performance impact (if
>> to reserve the sanity check, and have the same functionality) by moving it
>> from critical path to non-critical, as put_unused_fd() is mostly used for
>> error handling when fd is allocated successfully but struct file failed to
>> obtained in the next step.
>>
> As mentioned by Christian in his mail this check can just be removed.

Yes, that's great, I'll update in v2

>
>>          error = -EMFILE;
>>          if (fd < fdt->max_fds) {
> I would likely() on it.

That's better :)

>
>>                  if (~fdt->open_fds[0]) {
>>                          fd = find_next_zero_bit(fdt->open_fds,
>> BITS_PER_LONG, fd);
>>                          goto fastreturn;
>>                  }
>>                  fd = find_next_fd(fdt, fd);
>>          }
>>
>>          if (unlikely(fd >= fdt->max_fds)) {
>>                  error = expand_files(files, fd);
>>                  if (error < 0)
>>                          goto out;
>>                  if (error)
>>                          goto repeat;
>>          }
>> fastreturn:
>>          if (unlikely(fd >= end))
>>                  goto out;
>>          if (start <= files->next_fd)
>>                  files->next_fd = fd + 1;
>>
>>         ....
>>
> This is not a review, but it does read fine.
>
> LTP (https://github.com/linux-test-project/ltp.git) has a bunch of fd
> tests, I would make sure they still pass before posting a v2.
Got it, thanks for the kind reminder.
>
> I would definitely try moving out the lock to its own cacheline --
> currently it resides with the bitmaps (and first 4 fds of the embedded
> array).
>
> I expect it to provide some win on top of the current patchset, but
> whether it will be sufficient to justify it I have no idea.
>
> Something of this sort:
> diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
> index 2944d4aa413b..423cb599268a 100644
> --- a/include/linux/fdtable.h
> +++ b/include/linux/fdtable.h
> @@ -50,11 +50,11 @@ struct files_struct {
>      * written part on a separate cache line in SMP
>      */
>          spinlock_t file_lock ____cacheline_aligned_in_smp;
> -       unsigned int next_fd;
> +       unsigned int next_fd ____cacheline_aligned_in_smp;
>          unsigned long close_on_exec_init[1];
>          unsigned long open_fds_init[1];
>          unsigned long full_fds_bits_init[1];
> -       struct file __rcu * fd_array[NR_OPEN_DEFAULT];
> +       struct file __rcu * fd_array[NR_OPEN_DEFAULT] ____cacheline_aligned_in_smp;
>   };
>
>   struct file_operations;
>
> (feel free to just take)

Thanks Guzik for the thoughtful suggestions, seems we've ever tried to 
separate file_lock and next_fd to different cachelines, but no obvious 
improvement observed, we'll double check and verify these 2 tips to see 
how it goes.

>
> All that said, I have to note blogbench has numerous bugs. For example
> it collects stats by merely bumping a global variable (not even with
> atomics), so some updates are straight up lost.
>
> To my understanding it was meant to test filesystems and I'm guessing
> threading (instead of separate processes) was only used to make it
> easier to share statistics. Given the current scales this
> unintentionally transitioned into bottlenecking on the file_lock
> instead.
>
> There were scalability changes made about 9 years ago and according to
> git log they were benchmarked by Eric Dumazet, while benchmark code was
> not shared. I suggest CC'ing the guy with your v2 and asking if he can
> bench.  Even if he is no longer in position to do it perhaps he can rope
> someone in (or even better share his benchmark).

Good advice, we also noticed the problem for blogbench, and current data 
in commits is collected based on revision to make sure the variables for 
score statistic/reporting are safe, will submit patch to blogbench for 
update. We'll copy Eric for more information on his benchmark and check 
details if it is available.


  reply	other threads:[~2024-06-17 17:22 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-14 16:34 [PATCH 0/3] fs/file.c: optimize the critical section of Yu Ma
2024-06-14 16:34 ` [PATCH 1/3] fs/file.c: add fast path in alloc_fd() Yu Ma
2024-06-15  6:31   ` Mateusz Guzik
2024-06-16  4:01     ` Ma, Yu
2024-06-17 17:49     ` Tim Chen
2024-06-19 10:36   ` David Laight
2024-06-19 17:09     ` Ma, Yu
2024-06-14 16:34 ` [PATCH 2/3] fs/file.c: conditionally clear full_fds Yu Ma
2024-06-14 16:34 ` [PATCH 3/3] fs/file.c: move sanity_check from alloc_fd() to put_unused_fd() Yu Ma
2024-06-15  4:41   ` Mateusz Guzik
2024-06-15  5:07     ` Mateusz Guzik
2024-06-17 17:55       ` Tim Chen
2024-06-17 17:59         ` Mateusz Guzik
2024-06-17 18:04         ` Tim Chen
2024-06-18  8:35           ` Michal Hocko
2024-06-18  9:06             ` Mateusz Guzik
2024-06-18 20:40             ` Tim Chen
2024-06-16  3:47     ` Ma, Yu
2024-06-17 11:23       ` Mateusz Guzik
2024-06-17 17:22         ` Ma, Yu [this message]
2024-06-17  8:36     ` Christian Brauner
2024-06-22 15:49 ` [PATCH v2 0/3] fs/file.c: optimize the critical section of file_lock in Yu Ma
2024-06-22 15:49   ` [PATCH v2 1/3] fs/file.c: add fast path in alloc_fd() Yu Ma
2024-06-25 11:52     ` Jan Kara
2024-06-25 12:53       ` Jan Kara
2024-06-25 15:33         ` Ma, Yu
2024-06-26 11:54           ` Jan Kara
2024-06-26 16:43             ` Tim Chen
2024-06-26 16:52               ` Tim Chen
2024-06-27 12:09                 ` Jan Kara
2024-06-27 12:20                   ` Mateusz Guzik
2024-06-27 16:21                   ` Tim Chen
2024-06-26 19:13             ` Mateusz Guzik
2024-06-27 14:03               ` Jan Kara
2024-06-27 15:33               ` Christian Brauner
2024-06-27 18:27                 ` Ma, Yu
2024-06-27 19:59                   ` Mateusz Guzik
2024-06-28  9:12                     ` Jan Kara
2024-06-29 15:41                       ` Ma, Yu
2024-06-29 15:46                         ` Mateusz Guzik
2024-06-29 14:23                     ` Ma, Yu
2024-06-22 15:49   ` [PATCH v2 2/3] fs/file.c: conditionally clear full_fds Yu Ma
2024-06-25 11:54     ` Jan Kara
2024-06-25 15:41       ` Ma, Yu
2024-06-22 15:49   ` [PATCH v2 3/3] fs/file.c: remove sanity_check from alloc_fd() Yu Ma
2024-06-25 12:08     ` Jan Kara
2024-06-25 13:09       ` Mateusz Guzik
2024-06-25 13:11         ` Mateusz Guzik
2024-06-25 13:30           ` Jan Kara
2024-06-26 13:10             ` Christian Brauner
2024-07-03 14:33 ` [PATCH v3 0/3] fs/file.c: optimize the critical section of file_lock in Yu Ma
2024-07-03 14:33   ` [PATCH v3 1/3] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd() Yu Ma
2024-07-03 14:34     ` Christian Brauner
2024-07-03 14:46       ` Ma, Yu
2024-07-04 10:11       ` Jan Kara
2024-07-04 14:45         ` Ma, Yu
2024-07-04 15:41           ` Jan Kara
2024-07-03 14:33   ` [PATCH v3 2/3] fs/file.c: conditionally clear full_fds Yu Ma
2024-07-03 14:33   ` [PATCH v3 3/3] fs/file.c: add fast path in find_next_fd() Yu Ma
2024-07-03 14:17     ` Mateusz Guzik
2024-07-03 14:28       ` Ma, Yu
2024-07-04 10:07       ` Jan Kara
2024-07-04 10:03     ` Jan Kara
2024-07-04 14:50       ` Ma, Yu
2024-07-04 17:44     ` Mateusz Guzik
2024-07-04 21:55       ` Jan Kara
2024-07-05  7:56         ` Ma, Yu
2024-07-09  8:32           ` Ma, Yu
2024-07-09 10:17             ` Mateusz Guzik
2024-07-10 23:40               ` Tim Chen
2024-07-11  9:27                 ` Ma, Yu
2024-07-13  2:39 ` [PATCH v4 0/3] fs/file.c: optimize the critical section of file_lock in Yu Ma
2024-07-13  2:39   ` [PATCH v4 1/3] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd() Yu Ma
2024-07-16 11:11     ` Jan Kara
2024-07-13  2:39   ` [PATCH v4 2/3] fs/file.c: conditionally clear full_fds Yu Ma
2024-07-13  2:39   ` [PATCH v4 3/3] fs/file.c: add fast path in find_next_fd() Yu Ma
2024-07-16 11:19     ` Jan Kara
2024-07-16 12:37       ` Ma, Yu
2024-07-17 14:50 ` [PATCH v5 0/3] fs/file.c: optimize the critical section of file_lock in Yu Ma
2024-07-17 14:50   ` [PATCH v5 1/3] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd() Yu Ma
2024-08-06 13:44     ` kernel test robot
2024-08-14 21:38     ` Al Viro
2024-08-15  2:49       ` Ma, Yu
2024-08-15  3:45         ` Al Viro
2024-08-15  8:34           ` Ma, Yu
2024-10-31  7:42           ` Mateusz Guzik
2024-10-31 10:14             ` Christian Brauner
2024-07-17 14:50   ` [PATCH v5 2/3] fs/file.c: conditionally clear full_fds Yu Ma
2024-07-17 14:50   ` [PATCH v5 3/3] fs/file.c: add fast path in find_next_fd() Yu Ma
2024-07-19 17:53     ` Mateusz Guzik
2024-07-20 12:57       ` Ma, Yu
2024-07-20 14:22         ` Mateusz Guzik
2024-08-06 13:48     ` kernel test robot
2024-07-22 15:02   ` [PATCH v5 0/3] fs/file.c: optimize the critical section of file_lock in Christian Brauner
2024-08-01 19:13     ` Al Viro
2024-08-02 11:04       ` Christian Brauner
2024-08-02 14:22         ` Al Viro
2024-08-05  6:56           ` Christian Brauner
2024-08-12  1:31             ` Ma, Yu
2024-08-12  2:40               ` Al Viro
2024-08-12 15:09                 ` Ma, Yu
2024-11-06 17:44                 ` Jan Kara
2024-11-06 17:59                   ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c96d2f39-fa90-41da-985c-116cf4cc967a@intel.com \
    --to=yu.ma@intel.com \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=pan.deng@intel.com \
    --cc=tianyou.li@intel.com \
    --cc=tim.c.chen@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).