From: Lachlan McIlroy <lachlan@sgi.com>
To: Christoph Hellwig <hch@lst.de>
Cc: xfs@oss.sgi.com, xfs-dev <xfs-dev@sgi.com>
Subject: Re: [PATCH] kill superflous buffer locking
Date: Wed, 28 Nov 2007 13:30:18 +1100 [thread overview]
Message-ID: <474CD2BA.8070204@sgi.com> (raw)
In-Reply-To: <4716DD79.6040309@sgi.com>
[-- Attachment #1: Type: text/plain, Size: 9899 bytes --]
Christoph,
We've fixed the source of the assertion (it turned out to be the bugs
in xfs_buf_associate_memory()), so I'm pushing your buffer lock
removal patch back in again.
While looking through it I found a couple of issues:
- It called unlock_page() before the calls to PagePrivate() and
PageUptodate(). I think the page needs to stay locked across
these checks, so I moved the unlock_page() further down.
- Unlocking the pages as we go can cause a double unlock in the
error handling for a NULL page in the XBF_READ_AHEAD case, so I
removed the unlocking loop from that error path.
Would you mind checking these changes?
Lachlan
Lachlan McIlroy wrote:
> Christoph,
>
> We've had to reverse this change because it's caused a regression.
> We haven't been able to identify why we see the following assertion
> trigger with these changes but the assertion goes away without the
> changes. Until we figure out why, we'll have to leave the buffer
> locking in.
>
> <5>XFS mounting filesystem hdb2
> <5>Starting XFS recovery on filesystem: hdb2 (logdev: internal)
> <4>XFS: xlog_recover_process_data: bad clientid
> <4>Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2912
> <0>------------[ cut here ]------------
> <2>kernel BUG at fs/xfs/support/debug.c:81!
> <0>invalid opcode: 0000 [#1]
> <0>SMP
> <4>Modules linked in:
> <0>CPU: 2
> <0>EIP: 0060:[<c028a125>] Not tainted VLI
> <0>EFLAGS: 00010286 (2.6.23-kali-26_xfs-debug #1)
> <0>EIP is at assfail+0x1e/0x22
> <0>eax: 00000043 ebx: f3002a50 ecx: 00000001 edx: 00000086
> <0>esi: f56e2300 edi: f8fa5c28 ebp: efa67ae4 esp: efa67ad4
> <0>ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> <0>Process mount (pid: 15191, ti=efa66000 task=f7b43570 task.ti=efa66000)
> <0>Stack: c05c8bda c05c6762 c05c4750 00000b60 efa67b1c c0269a35 00000004 c05c5903
> <0> f3a14000 efa67ba8 f7e458c0 f8fa5c34 efa67bb8 f8fa6a38 0000000d 00001e00
> <0> f8fa4000 00000000 efa67bf4 c026a566 f8fa4000 00000001 00000651 00000000
> <0>Call Trace:
> <0> [<c0105eb6>] show_trace_log_lvl+0x1a/0x2f
> <0> [<c0105f66>] show_stack_log_lvl+0x9b/0xa3
> <0> [<c0106127>] show_registers+0x1b9/0x28b
> <0> [<c0106312>] die+0x119/0x27b
> <0> [<c04d5748>] do_trap+0x8a/0xa3
> <0> [<c0106733>] do_invalid_op+0x88/0x92
> <0> [<c04d551a>] error_code+0x72/0x78
> <0> [<c0269a35>] xlog_recover_process_data+0x6a/0x1ff
> <0> [<c026a566>] xlog_do_recovery_pass+0x810/0x9f3
> <0> [<c026a7ab>] xlog_do_log_recovery+0x62/0xe2
> <0> [<c026a848>] xlog_do_recover+0x1d/0x187
> <0> [<c026bd17>] xlog_recover+0x88/0x95
> <0> [<c0264d9d>] xfs_log_mount+0x100/0x144
> <0> [<c026ea6f>] xfs_mountfs+0x278/0x639
> <0> [<c0277917>] xfs_mount+0x25c/0x2f7
> <0> [<c0289952>] xfs_fs_fill_super+0xab/0x1fd
> <0> [<c0164677>] get_sb_bdev+0xd6/0x114
> <0> [<c0288c38>] xfs_fs_get_sb+0x21/0x27
> <0> [<c0164181>] vfs_kern_mount+0x41/0x7a
> <0> [<c0164209>] do_kern_mount+0x37/0xbd
> <0> [<c0175abe>] do_mount+0x566/0x5c0
> <0> [<c0175b87>] sys_mount+0x6f/0xa9
> <0> [<c0104e7e>] sysenter_past_esp+0x5f/0x85
> <0> =======================
> <0>Code: 04 24 10 00 00 00 e8 2a e7 03 00 c9 c3 55 89 e5 83 ec 10 89 4c 24 0c 89 54 24 08 89 44 24 04 c7 04 24 da 8b 5c c0 e8 07 bf e9 ff <0f> 0b eb fe 55 83 e0 07 89 e5 57 bf 07 00 00 00 56 89 d6 53 89
> <0>EIP: [<c028a125>] assfail+0x1e/0x22 SS:ESP 0068:efa67ad4
>
> Lachlan
>
> Christoph Hellwig wrote:
>> There is no need to lock any page in xfs_buf.c because we operate
>> on our own address_space and all locking is covered by the buffer
>> semaphore. If we ever switch back to the main block device address_space
>> as suggested e.g. for fsblock with a similar scheme, the locking will
>> have to be totally revised anyway because the current scheme is
>> neither correct nor coherent with itself.
>>
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>
>> Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.c
>> ===================================================================
>> --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-09-23 13:28:00.000000000 +0200
>> +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.c 2007-09-23 14:13:43.000000000 +0200
>> @@ -396,6 +396,7 @@ _xfs_buf_lookup_pages(
>> congestion_wait(WRITE, HZ/50);
>> goto retry;
>> }
>> + unlock_page(page);
>>
>> XFS_STATS_INC(xb_page_found);
>>
>> @@ -405,10 +406,7 @@ _xfs_buf_lookup_pages(
>> ASSERT(!PagePrivate(page));
>> if (!PageUptodate(page)) {
>> page_count--;
>> - if (blocksize >= PAGE_CACHE_SIZE) {
>> - if (flags & XBF_READ)
>> - bp->b_locked = 1;
>> - } else if (!PagePrivate(page)) {
>> + if (blocksize < PAGE_CACHE_SIZE && !PagePrivate(page)) {
>> if (test_page_region(page, offset, nbytes))
>> page_count++;
>> }
>> @@ -418,11 +416,6 @@ _xfs_buf_lookup_pages(
>> offset = 0;
>> }
>>
>> - if (!bp->b_locked) {
>> - for (i = 0; i < bp->b_page_count; i++)
>> - unlock_page(bp->b_pages[i]);
>> - }
>> -
>> if (page_count == bp->b_page_count)
>> bp->b_flags |= XBF_DONE;
>>
>> @@ -747,7 +740,6 @@ xfs_buf_associate_memory(
>> bp->b_page_count = ++i;
>> ptr += PAGE_CACHE_SIZE;
>> }
>> - bp->b_locked = 0;
>>
>> bp->b_count_desired = bp->b_buffer_length = len;
>> bp->b_flags |= XBF_MAPPED;
>> @@ -1093,25 +1085,13 @@ xfs_buf_iostart(
>> return status;
>> }
>>
>> -STATIC_INLINE int
>> -_xfs_buf_iolocked(
>> - xfs_buf_t *bp)
>> -{
>> - ASSERT(bp->b_flags & (XBF_READ | XBF_WRITE));
>> - if (bp->b_flags & XBF_READ)
>> - return bp->b_locked;
>> - return 0;
>> -}
>> -
>> STATIC_INLINE void
>> _xfs_buf_ioend(
>> xfs_buf_t *bp,
>> int schedule)
>> {
>> - if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
>> - bp->b_locked = 0;
>> + if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
>> xfs_buf_ioend(bp, schedule);
>> - }
>> }
>>
>> STATIC int
>> @@ -1146,10 +1126,6 @@ xfs_buf_bio_end_io(
>>
>> if (--bvec >= bio->bi_io_vec)
>> prefetchw(&bvec->bv_page->flags);
>> -
>> - if (_xfs_buf_iolocked(bp)) {
>> - unlock_page(page);
>> - }
>> } while (bvec >= bio->bi_io_vec);
>>
>> _xfs_buf_ioend(bp, 1);
>> @@ -1161,13 +1137,12 @@ STATIC void
>> _xfs_buf_ioapply(
>> xfs_buf_t *bp)
>> {
>> - int i, rw, map_i, total_nr_pages, nr_pages;
>> + int rw, map_i, total_nr_pages, nr_pages;
>> struct bio *bio;
>> int offset = bp->b_offset;
>> int size = bp->b_count_desired;
>> sector_t sector = bp->b_bn;
>> unsigned int blocksize = bp->b_target->bt_bsize;
>> - int locking = _xfs_buf_iolocked(bp);
>>
>> total_nr_pages = bp->b_page_count;
>> map_i = 0;
>> @@ -1190,7 +1165,7 @@ _xfs_buf_ioapply(
>> * filesystem block size is not smaller than the page size.
>> */
>> if ((bp->b_buffer_length < PAGE_CACHE_SIZE) &&
>> - (bp->b_flags & XBF_READ) && locking &&
>> + (bp->b_flags & XBF_READ) &&
>> (blocksize >= PAGE_CACHE_SIZE)) {
>> bio = bio_alloc(GFP_NOIO, 1);
>>
>> @@ -1207,24 +1182,6 @@ _xfs_buf_ioapply(
>> goto submit_io;
>> }
>>
>> - /* Lock down the pages which we need to for the request */
>> - if (locking && (bp->b_flags & XBF_WRITE) && (bp->b_locked == 0)) {
>> - for (i = 0; size; i++) {
>> - int nbytes = PAGE_CACHE_SIZE - offset;
>> - struct page *page = bp->b_pages[i];
>> -
>> - if (nbytes > size)
>> - nbytes = size;
>> -
>> - lock_page(page);
>> -
>> - size -= nbytes;
>> - offset = 0;
>> - }
>> - offset = bp->b_offset;
>> - size = bp->b_count_desired;
>> - }
>> -
>> next_chunk:
>> atomic_inc(&bp->b_io_remaining);
>> nr_pages = BIO_MAX_SECTORS >> (PAGE_SHIFT - BBSHIFT);
>> Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.h
>> ===================================================================
>> --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-09-05 11:17:42.000000000 +0200
>> +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.h 2007-09-23 14:04:36.000000000 +0200
>> @@ -143,7 +143,6 @@ typedef struct xfs_buf {
>> void *b_fspriv2;
>> void *b_fspriv3;
>> unsigned short b_error; /* error code on I/O */
>> - unsigned short b_locked; /* page array is locked */
>> unsigned int b_page_count; /* size of page array */
>> unsigned int b_offset; /* page offset in first page */
>> struct page **b_pages; /* array of page pointers */
>> Index: linux-2.6-xfs/fs/xfs/xfsidbg.c
>> ===================================================================
>> --- linux-2.6-xfs.orig/fs/xfs/xfsidbg.c 2007-09-23 13:33:07.000000000 +0200
>> +++ linux-2.6-xfs/fs/xfs/xfsidbg.c 2007-09-23 14:04:36.000000000 +0200
>> @@ -2110,9 +2110,9 @@ print_xfs_buf(
>> (unsigned long long) bp->b_file_offset,
>> (unsigned long long) bp->b_buffer_length,
>> bp->b_addr);
>> - kdb_printf(" b_bn 0x%llx b_count_desired 0x%lx b_locked %d\n",
>> + kdb_printf(" b_bn 0x%llx b_count_desired 0x%lx\n",
>> (unsigned long long)bp->b_bn,
>> - (unsigned long) bp->b_count_desired, (int)bp->b_locked);
>> + (unsigned long) bp->b_count_desired);
>> kdb_printf(" b_queuetime %ld (now=%ld/age=%ld) b_io_remaining %d\n",
>> bp->b_queuetime, jiffies, bp->b_queuetime + age,
>> bp->b_io_remaining.counter);
>>
>>
>>
>
>
>
[-- Attachment #2: buflock.diff --]
[-- Type: text/x-patch, Size: 4503 bytes --]
--- fs/xfs/linux-2.6/xfs_buf.c_1.249 2007-11-27 15:28:34.000000000 +1100
+++ fs/xfs/linux-2.6/xfs_buf.c 2007-11-28 13:02:38.000000000 +1100
@@ -387,8 +387,6 @@ _xfs_buf_lookup_pages(
if (unlikely(page == NULL)) {
if (flags & XBF_READ_AHEAD) {
bp->b_page_count = i;
- for (i = 0; i < bp->b_page_count; i++)
- unlock_page(bp->b_pages[i]);
return -ENOMEM;
}
@@ -418,24 +416,17 @@ _xfs_buf_lookup_pages(
ASSERT(!PagePrivate(page));
if (!PageUptodate(page)) {
page_count--;
- if (blocksize >= PAGE_CACHE_SIZE) {
- if (flags & XBF_READ)
- bp->b_locked = 1;
- } else if (!PagePrivate(page)) {
+ if (blocksize < PAGE_CACHE_SIZE && !PagePrivate(page)) {
if (test_page_region(page, offset, nbytes))
page_count++;
}
}
+ unlock_page(page);
bp->b_pages[i] = page;
offset = 0;
}
- if (!bp->b_locked) {
- for (i = 0; i < bp->b_page_count; i++)
- unlock_page(bp->b_pages[i]);
- }
-
if (page_count == bp->b_page_count)
bp->b_flags |= XBF_DONE;
@@ -752,7 +743,6 @@ xfs_buf_associate_memory(
bp->b_pages[i] = mem_to_page((void *)pageaddr);
pageaddr += PAGE_CACHE_SIZE;
}
- bp->b_locked = 0;
bp->b_count_desired = len;
bp->b_buffer_length = buflen;
@@ -1099,25 +1089,13 @@ xfs_buf_iostart(
return status;
}
-STATIC_INLINE int
-_xfs_buf_iolocked(
- xfs_buf_t *bp)
-{
- ASSERT(bp->b_flags & (XBF_READ | XBF_WRITE));
- if (bp->b_flags & XBF_READ)
- return bp->b_locked;
- return 0;
-}
-
STATIC_INLINE void
_xfs_buf_ioend(
xfs_buf_t *bp,
int schedule)
{
- if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
- bp->b_locked = 0;
+ if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
xfs_buf_ioend(bp, schedule);
- }
}
STATIC int
@@ -1152,10 +1130,6 @@ xfs_buf_bio_end_io(
if (--bvec >= bio->bi_io_vec)
prefetchw(&bvec->bv_page->flags);
-
- if (_xfs_buf_iolocked(bp)) {
- unlock_page(page);
- }
} while (bvec >= bio->bi_io_vec);
_xfs_buf_ioend(bp, 1);
@@ -1167,13 +1141,12 @@ STATIC void
_xfs_buf_ioapply(
xfs_buf_t *bp)
{
- int i, rw, map_i, total_nr_pages, nr_pages;
+ int rw, map_i, total_nr_pages, nr_pages;
struct bio *bio;
int offset = bp->b_offset;
int size = bp->b_count_desired;
sector_t sector = bp->b_bn;
unsigned int blocksize = bp->b_target->bt_bsize;
- int locking = _xfs_buf_iolocked(bp);
total_nr_pages = bp->b_page_count;
map_i = 0;
@@ -1196,7 +1169,7 @@ _xfs_buf_ioapply(
* filesystem block size is not smaller than the page size.
*/
if ((bp->b_buffer_length < PAGE_CACHE_SIZE) &&
- (bp->b_flags & XBF_READ) && locking &&
+ (bp->b_flags & XBF_READ) &&
(blocksize >= PAGE_CACHE_SIZE)) {
bio = bio_alloc(GFP_NOIO, 1);
@@ -1213,24 +1186,6 @@ _xfs_buf_ioapply(
goto submit_io;
}
- /* Lock down the pages which we need to for the request */
- if (locking && (bp->b_flags & XBF_WRITE) && (bp->b_locked == 0)) {
- for (i = 0; size; i++) {
- int nbytes = PAGE_CACHE_SIZE - offset;
- struct page *page = bp->b_pages[i];
-
- if (nbytes > size)
- nbytes = size;
-
- lock_page(page);
-
- size -= nbytes;
- offset = 0;
- }
- offset = bp->b_offset;
- size = bp->b_count_desired;
- }
-
next_chunk:
atomic_inc(&bp->b_io_remaining);
nr_pages = BIO_MAX_SECTORS >> (PAGE_SHIFT - BBSHIFT);
--- fs/xfs/linux-2.6/xfs_buf.h_1.122 2007-11-27 15:28:36.000000000 +1100
+++ fs/xfs/linux-2.6/xfs_buf.h 2007-11-27 15:21:51.000000000 +1100
@@ -143,7 +143,6 @@ typedef struct xfs_buf {
void *b_fspriv2;
void *b_fspriv3;
unsigned short b_error; /* error code on I/O */
- unsigned short b_locked; /* page array is locked */
unsigned int b_page_count; /* size of page array */
unsigned int b_offset; /* page offset in first page */
struct page **b_pages; /* array of page pointers */
--- fs/xfs/xfsidbg.c_1.340 2007-11-27 15:28:37.000000000 +1100
+++ fs/xfs/xfsidbg.c 2007-11-27 15:28:33.000000000 +1100
@@ -2007,9 +2007,9 @@ print_xfs_buf(
(unsigned long long) bp->b_file_offset,
(unsigned long long) bp->b_buffer_length,
bp->b_addr);
- kdb_printf(" b_bn 0x%llx b_count_desired 0x%lx b_locked %d\n",
+ kdb_printf(" b_bn 0x%llx b_count_desired 0x%lx\n",
(unsigned long long)bp->b_bn,
- (unsigned long) bp->b_count_desired, (int)bp->b_locked);
+ (unsigned long) bp->b_count_desired);
kdb_printf(" b_queuetime %ld (now=%ld/age=%ld) b_io_remaining %d\n",
bp->b_queuetime, jiffies, bp->b_queuetime + age,
bp->b_io_remaining.counter);
Thread overview: 4+ messages
2007-09-24 18:49 [PATCH] kill superflous buffer locking Christoph Hellwig
2007-10-18 4:13 ` Lachlan McIlroy
2007-11-28 2:30 ` Lachlan McIlroy [this message]
2007-11-28 9:48 ` Christoph Hellwig