From: Jan Kara <jack@suse.cz>
To: Wang Shilong <wangshilong1991@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Andrew Perepechko <anserper@yandex.ru>,
Shuichi Ihara <sihara@ddn.com>, Wang Shilong <wshilong@ddn.com>,
Li Xi <lixi@ddn.com>,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: quota: dqio_mutex design
Date: Tue, 8 Aug 2017 18:06:35 +0200 [thread overview]
Message-ID: <20170808160635.GA23565@quack2.suse.cz> (raw)
In-Reply-To: <CAP9B-QknSSZ0wpakdiANmhNBkSA6oaa=vPC_XmwAnus1Qg+n5Q@mail.gmail.com>
Hi,
On Thu 03-08-17 22:39:51, Wang Shilong wrote:
> Please send me patches, we could test and response you!
So I finally have something which isn't obviously wrong (it survives basic
testing and gives me improvements for some workloads). I have pushed out
the patches to:
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git quota_scaling
I'd be happy if you can share your results with my patches. I have not yet
figured out a safe way to reduce the contention on dq_lock during update of
on-disk structure when lot of processes bang single dquot. I have
experimental patch but it didn't bring any benefit in my testing - I'll
rebase it on top of other patches I have send it to you for some testing.
Honza
> On Thu, Aug 3, 2017 at 10:36 PM, Jan Kara <jack@suse.cz> wrote:
> > Hello!
> >
> > On Thu 03-08-17 19:31:04, Wang Shilong wrote:
> >> We DDN is investigating the same issue!
> >>
> >> Some comments comes:
> >>
> >> On Thu, Aug 3, 2017 at 1:52 AM, Andrew Perepechko <anserper@yandex.ru> wrote:
> >> >> On Tue 01-08-17 15:02:42, Jan Kara wrote:
> >> >> > Hi Andrew,
> >> >> >
> >> >> I've been experimenting with this today but this idea didn't bring any
> >> >> benefit in my testing. Was your setup with multiple users or a single user?
> >> >> Could you give some testing to my patches to see whether they bring some
> >> >> benefit to you?
> >> >>
> >> >> Honza
> >> >
> >> > Hi Jan!
> >> >
> >> > My setup was with a single user. Unfortunately, it may take some time before
> >> > I can try a patched kernel other than RHEL6 or RHEL7 with the same test,
> >> > we have a lot of dependencies on these kernels.
> >> >
> >> > The actual test we ran was mdtest.
> >> >
> >> > By the way, we had 15+% performance improvement in creates from the
> >> > change that was discussed earlier in this thread:
> >> >
> >> > EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
> >> > + if (test_bit(DQ_MOD_B, &dquot->dq_flags))
> >> > + return 0;
> >>
> >> I don't think this is right, as far as i understand, journal quota need go
> >> together with quota space change update inside same transaction, this will
> >> break consistency if power off or RO happen.
> >>
> >> Here is some ideas that i have thought:
> >>
> >> 1) switch dqio_mutex to a read/write lock, especially, i think most of
> >> time journal quota updates is in-place update, that means we don't need
> >> change quota tree in memory, firstly try read lock, retry with write lock if
> >> there is real tree change.
> >>
> >> 2)another is similar idea of Andrew's walkaround, but we need make correct
> >> fix, maintain dirty list for per transaction, and gurantee quota updates are
> >> flushed when commit transaction, this might be complex, i am not very
> >> familiar with JBD2 codes.
> >>
> >> It will be really nice if we could fix this regression, as we see 20% performace
> >> regression.
> >
> > So I have couple of patches:
> >
> > 1) I convert dqio_mutex do rw semaphore and use it in exclusive mode only
> > when quota tree is going to change. We also use dq_lock to serialize writes
> > of dquot - you cannot have two writes happening in parallel as that could
> > result in stale data being on disk. This patch brings benefit when there
> > are multiple users - now they don't contend on common lock. It shows
> > advantage in my testing so I plan to merge these patches. When the
> > contention is on a structure for single user this change however doesn't
> > bring much (the performance change is in statistical noise in my testing).
> >
> > 2) I have patches to remove some contention on dq_list_lock by not using
> > dirty list for tracking dquots in ext4 (and thus avoid dq_list_lock
> > completely in quota modification path). This does not bring measurable
> > benefit in my testing even on ramdisk but lockstat data for dq_list_lock
> > looks much better after this - it seems lock contention just shifted to
> > dq_data_lock - I'll try to address that as well and see whether I'll be
> > able to measure some advantage.
> >
> > 3) I have patches to convert dquot dirty bit to sequence counter so that
> > in commit_dqblk() we can check whether dquot state we wanted to write is
> > already on disk. Note that this is different from Andrew's approach in that
> > we do wait for dquot to be actually written before returning. We just don't
> > repeat the write unnecessarily. However this didn't bring any measurable
> > benefit in my testing so unless I'll be able to confirm it benefits some
> > workloads I won't merge this change.
> >
> > If you can experiment with your workloads, I can send you patches. I'd be
> > keen on having some performance data from real setups...
> >
> > Honza
> >
> >>
> >> Thanks,
> >> Shilong
> >>
> >> > dquot_mark_dquot_dirty(dquot);
> >> > return ext4_write_dquot(dquot);
> >> >
> >> > The idea was that if we know that some thread is somewhere between
> >> > mark_dirty and clear_dirty, then we can avoid blocking on dqio_mutex,
> >> > since that thread will update the ondisk dquot for us.
> >> >
> > --
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2017-08-08 16:06 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <10928956.Fla3vXZ7d9@panda>
[not found] ` <20170801130242.GH4215@quack2.suse.cz>
[not found] ` <20170802162552.GA30353@quack2.suse.cz>
[not found] ` <1691224.ooLB1CWbbI@panda>
2017-08-03 11:31 ` quota: dqio_mutex design Wang Shilong
2017-08-03 12:24 ` Andrew Perepechko
2017-08-03 13:19 ` Wang Shilong
2017-08-03 13:41 ` Andrew Perepechko
2017-08-03 13:55 ` Andrew Perepechko
2017-08-03 14:23 ` Jan Kara
2017-08-03 14:36 ` Jan Kara
2017-08-03 14:39 ` Wang Shilong
2017-08-08 16:06 ` Jan Kara [this message]
2017-08-14 3:24 ` Wang Shilong
2017-08-14 3:28 ` Wang Shilong
2017-08-14 3:53 ` Wang Shilong
2017-08-14 8:22 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170808160635.GA23565@quack2.suse.cz \
--to=jack@suse.cz \
--cc=anserper@yandex.ru \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=lixi@ddn.com \
--cc=sihara@ddn.com \
--cc=wangshilong1991@gmail.com \
--cc=wshilong@ddn.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox