* [PATCH] Fix reference counting race on log buffers
@ 2008-07-11 5:01 Dave Chinner
2008-07-11 5:25 ` Timothy Shimmin
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dave Chinner @ 2008-07-11 5:01 UTC (permalink / raw)
To: xfs; +Cc: Dave Chinner
When we release the iclog, we do an atomic_dec_and_lock to determine
if we are the last reference to enable update of log headers and
writeout. however, in xlog_state_get_iclog_space() we need to check
if we have the last reference count there. if we do, we release the
log buffer, otherwise we decrement the reference count.
The issue is that the compare and decrement in
xlog_state_get_iclog_space() is not atomic, so both places can see a
reference count of 2 and neither will release the iclog. That leads
to a filesystem hang.
Close the hole replacing the compare and decrement with
atomic_add_unless() to ensure that they are executed atomically.
Signed-off-by: Dave Chinner <david@fromorbit.com>
---
fs/xfs/xfs_log.c | 15 +++++++++++----
1 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 760d543..0816c5d 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2425,13 +2425,20 @@ restart:
if (iclog->ic_size - iclog->ic_offset < 2*sizeof(xlog_op_header_t)) {
xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
- /* If I'm the only one writing to this iclog, sync it to disk */
- if (atomic_read(&iclog->ic_refcnt) == 1) {
+ /*
+ * If I'm the only one writing to this iclog, sync it to disk.
+ * We need to do an atomic compare and decrement here to avoid
+ * racing with concurrent atomic_dec_and_lock() calls in
+ * xlog_state_release_iclog() when there is more than one
+ * reference to the iclog.
+ */
+ if (!atomic_add_unless(&iclog->ic_refcnt, -1, 1)) {
+ /* we are the only one */
spin_unlock(&log->l_icloglock);
- if ((error = xlog_state_release_iclog(log, iclog)))
+ error = xlog_state_release_iclog(log, iclog);
+ if (error)
return error;
} else {
- atomic_dec(&iclog->ic_refcnt);
spin_unlock(&log->l_icloglock);
}
goto restart;
--
1.5.6
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
2008-07-11 5:01 [PATCH] Fix reference counting race on log buffers Dave Chinner
@ 2008-07-11 5:25 ` Timothy Shimmin
2008-07-11 5:30 ` Timothy Shimmin
2008-07-11 7:19 ` Dave Chinner
2008-07-11 6:28 ` Eric Sandeen
[not found] ` <20080814180603.GA3087@infradead.org>
2 siblings, 2 replies; 7+ messages in thread
From: Timothy Shimmin @ 2008-07-11 5:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Dave,
Yeah, looks good. Thanks.
(My personal preference is for reversing if/else than using !
as it is easier to read but whatever :)
But will need Eric's result before checking it in.
--Tim
Dave Chinner wrote:
> When we release the iclog, we do an atomic_dec_and_lock to determine
> if we are the last reference to enable update of log headers and
> writeout. however, in xlog_state_get_iclog_space() we need to check
> if we have the last reference count there. if we do, we release the
> log buffer, otherwise we decrement the reference count.
>
> The issue is that the compare and decrement in
> xlog_state_get_iclog_space() is not atomic, so both places can see a
> reference count of 2 and neither will release the iclog. That leads
> to a filesystem hang.
>
> Close the hole replacing the compare and decrement with
> atomic_add_unless() to ensure that they are executed atomically.
>
> Signed-off-by: Dave Chinner <david@fromorbit.com>
> ---
> fs/xfs/xfs_log.c | 15 +++++++++++----
> 1 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 760d543..0816c5d 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2425,13 +2425,20 @@ restart:
> if (iclog->ic_size - iclog->ic_offset < 2*sizeof(xlog_op_header_t)) {
> xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
>
> - /* If I'm the only one writing to this iclog, sync it to disk */
> - if (atomic_read(&iclog->ic_refcnt) == 1) {
> + /*
> + * If I'm the only one writing to this iclog, sync it to disk.
> + * We need to do an atomic compare and decrement here to avoid
> + * racing with concurrent atomic_dec_and_lock() calls in
> + * xlog_state_release_iclog() when there is more than one
> + * reference to the iclog.
> + */
> + if (!atomic_add_unless(&iclog->ic_refcnt, -1, 1)) {
> + /* we are the only one */
> spin_unlock(&log->l_icloglock);
> - if ((error = xlog_state_release_iclog(log, iclog)))
> + error = xlog_state_release_iclog(log, iclog);
> + if (error)
> return error;
> } else {
> - atomic_dec(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
> }
> goto restart;
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
2008-07-11 5:25 ` Timothy Shimmin
@ 2008-07-11 5:30 ` Timothy Shimmin
2008-07-11 7:19 ` Dave Chinner
1 sibling, 0 replies; 7+ messages in thread
From: Timothy Shimmin @ 2008-07-11 5:30 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Timothy Shimmin wrote:
> Dave,
>
> Yeah, looks good. Thanks.
> (My personal preference is for reversing if/else than using !
> as it is easier to read but whatever :)
>
> But will need Eric's result before checking it in.
>
And please push straight to Linus as I won't be able to check it in
today as about to go to QLD.
Thanks,
--Tim
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
2008-07-11 5:01 [PATCH] Fix reference counting race on log buffers Dave Chinner
2008-07-11 5:25 ` Timothy Shimmin
@ 2008-07-11 6:28 ` Eric Sandeen
2008-07-11 6:53 ` Mark Goodwin
[not found] ` <20080814180603.GA3087@infradead.org>
2 siblings, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2008-07-11 6:28 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Dave Chinner wrote:
> When we release the iclog, we do an atomic_dec_and_lock to determine
> if we are the last reference to enable update of log headers and
> writeout. however, in xlog_state_get_iclog_space() we need to check
> if we have the last reference count there. if we do, we release the
> log buffer, otherwise we decrement the reference count.
>
> The issue is that the compare and decrement in
> xlog_state_get_iclog_space() is not atomic, so both places can see a
> reference count of 2 and neither will release the iclog. That leads
> to a filesystem hang.
>
> Close the hole replacing the compare and decrement with
> atomic_add_unless() to ensure that they are executed atomically.
>
> Signed-off-by: Dave Chinner <david@fromorbit.com>
Tested-by: Eric Sandeen <sandeen@sandeen.net>
Passes the fs_mark testcase I hit this on, 18 million inodes & counting.
Thanks,
-Eric
> ---
> fs/xfs/xfs_log.c | 15 +++++++++++----
> 1 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 760d543..0816c5d 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2425,13 +2425,20 @@ restart:
> if (iclog->ic_size - iclog->ic_offset < 2*sizeof(xlog_op_header_t)) {
> xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
>
> - /* If I'm the only one writing to this iclog, sync it to disk */
> - if (atomic_read(&iclog->ic_refcnt) == 1) {
> + /*
> + * If I'm the only one writing to this iclog, sync it to disk.
> + * We need to do an atomic compare and decrement here to avoid
> + * racing with concurrent atomic_dec_and_lock() calls in
> + * xlog_state_release_iclog() when there is more than one
> + * reference to the iclog.
> + */
> + if (!atomic_add_unless(&iclog->ic_refcnt, -1, 1)) {
> + /* we are the only one */
> spin_unlock(&log->l_icloglock);
> - if ((error = xlog_state_release_iclog(log, iclog)))
> + error = xlog_state_release_iclog(log, iclog);
> + if (error)
> return error;
> } else {
> - atomic_dec(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
> }
> goto restart;
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
2008-07-11 6:28 ` Eric Sandeen
@ 2008-07-11 6:53 ` Mark Goodwin
0 siblings, 0 replies; 7+ messages in thread
From: Mark Goodwin @ 2008-07-11 6:53 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Dave Chinner, xfs
Eric Sandeen wrote:
> Dave Chinner wrote:
>> When we release the iclog, we do an atomic_dec_and_lock to determine
>> if we are the last reference to enable update of log headers and
>> writeout. however, in xlog_state_get_iclog_space() we need to check
>> if we have the last reference count there. if we do, we release the
>> log buffer, otherwise we decrement the reference count.
>>
>> The issue is that the compare and decrement in
>> xlog_state_get_iclog_space() is not atomic, so both places can see a
>> reference count of 2 and neither will release the iclog. That leads
>> to a filesystem hang.
>>
>> Close the hole replacing the compare and decrement with
>> atomic_add_unless() to ensure that they are executed atomically.
>>
>> Signed-off-by: Dave Chinner <david@fromorbit.com>
>
> Tested-by: Eric Sandeen <sandeen@sandeen.net>
>
> Passes the fs_mark testcase I hit this on, 18 million inodes & counting.
Thanks Eric & Dave. Given the short time available, could one of you
please push this directly to Linus? I don't have anyone here to do that
until Monday, which is probably too late.
Cheers
-- Mark
>
> Thanks,
> -Eric
>
>> ---
>> fs/xfs/xfs_log.c | 15 +++++++++++----
>> 1 files changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
>> index 760d543..0816c5d 100644
>> --- a/fs/xfs/xfs_log.c
>> +++ b/fs/xfs/xfs_log.c
>> @@ -2425,13 +2425,20 @@ restart:
>> if (iclog->ic_size - iclog->ic_offset < 2*sizeof(xlog_op_header_t)) {
>> xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
>>
>> - /* If I'm the only one writing to this iclog, sync it to disk */
>> - if (atomic_read(&iclog->ic_refcnt) == 1) {
>> + /*
>> + * If I'm the only one writing to this iclog, sync it to disk.
>> + * We need to do an atomic compare and decrement here to avoid
>> + * racing with concurrent atomic_dec_and_lock() calls in
>> + * xlog_state_release_iclog() when there is more than one
>> + * reference to the iclog.
>> + */
>> + if (!atomic_add_unless(&iclog->ic_refcnt, -1, 1)) {
>> + /* we are the only one */
>> spin_unlock(&log->l_icloglock);
>> - if ((error = xlog_state_release_iclog(log, iclog)))
>> + error = xlog_state_release_iclog(log, iclog);
>> + if (error)
>> return error;
>> } else {
>> - atomic_dec(&iclog->ic_refcnt);
>> spin_unlock(&log->l_icloglock);
>> }
>> goto restart;
>
>
--
Mark Goodwin markgw@sgi.com
Engineering Manager for XFS and PCP Phone: +61-3-99631937
SGI Australian Software Group Cell: +61-4-18969583
-------------------------------------------------------------
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
2008-07-11 5:25 ` Timothy Shimmin
2008-07-11 5:30 ` Timothy Shimmin
@ 2008-07-11 7:19 ` Dave Chinner
1 sibling, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2008-07-11 7:19 UTC (permalink / raw)
To: Timothy Shimmin; +Cc: xfs
On Fri, Jul 11, 2008 at 03:25:43PM +1000, Timothy Shimmin wrote:
> Dave,
>
> Yeah, looks good. Thanks.
Thanks, Tim - I'll add the right tags and send it to Linus.
> (My personal preference is for reversing if/else than using !
> as it is easier to read but whatever :)
It looked funny with one line in the upper branch, and I added
the comment inside the if() because it wasn't obvious from the
code (either way) which branch was the 'only ref' branch....
> But will need Eric's result before checking it in.
Seems like it's working fine, and i've got a coupl eof runs
through xfsqa now, so it should be good to go.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix reference counting race on log buffers
[not found] ` <48A4F61E.3050106@sgi.com>
@ 2008-08-15 22:07 ` Christoph Hellwig
0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2008-08-15 22:07 UTC (permalink / raw)
To: Lachlan McIlroy; +Cc: Christoph Hellwig, Dave Chinner, xfs
On Fri, Aug 15, 2008 at 01:21:02PM +1000, Lachlan McIlroy wrote:
>> - a different xfs_version.h - IMHO we should just kill this versioning
>> scheme completely..
>> - xfsidbg. Maybe now with kgdb in the kgdb folks get something
>> submitted and then this is taken care off..
> That stuff is there for kdb. Are you suggesting we switch to kgdb and
> drop this file?
No, sorry for the stupid typo. I meant now with kgdb in maybe the kdb
folks can get their act together and try to submit it, at which point
xfsidbg could go upstream.
>> - modular quota. IMHO we should just stop doing this in CVS, it
>> doesn't really buy us anything. If people are okay with that
>> I'll send a patch
> I don't see any problem with this. Seems like a good idea.
Ok, I'll prepare something.
>
>> - dmapi - well, this won't get merged and currently I don't see anyone
>> doing HSM for mainline properly. But I have some ideas how to make
>> it less intrusive at least
> All that would be very helpful.
> Last time I checked the fs/xfs differences between mainline and our
> internal tree resulted in a patch around 470KB. Most of that is changes
> that we don't push to mainline such as fs/xfs/dmapi and fs/xfs/xfsidbg.c.
> The rest I'm working though.
Yeah, the list above should have been complete, I just went through the
diff yesterday.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-08-15 22:05 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-11 5:01 [PATCH] Fix reference counting race on log buffers Dave Chinner
2008-07-11 5:25 ` Timothy Shimmin
2008-07-11 5:30 ` Timothy Shimmin
2008-07-11 7:19 ` Dave Chinner
2008-07-11 6:28 ` Eric Sandeen
2008-07-11 6:53 ` Mark Goodwin
[not found] ` <20080814180603.GA3087@infradead.org>
[not found] ` <48A4DB09.7020702@sgi.com>
[not found] ` <20080815012807.GA547@infradead.org>
[not found] ` <48A4F61E.3050106@sgi.com>
2008-08-15 22:07 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox