From: Zach Brown <zach.brown@oracle.com>
To: Veerendra Chandrappa <veerendra.chandrappa@in.ibm.com>
Cc: linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
	linux-kernel@vger.kernel.org, suparna@in.ibm.com,
	xfs@oss.sgi.com
Subject: Re: [RFC 0/5] dio: clean up completion phase of direct_io_worker()
Date: Thu, 21 Sep 2006 11:38:13 -0700
Message-ID: <4512DC15.8050101@oracle.com>
In-Reply-To: <OFBE544A3C.7C1B2C64-ON652571F0.003C21B6-652571F0.003C2DF3@in.ibm.com>


> on EXT2, EXT3 and XFS filesystems. For the EXT2 and EXT3 filesystems the
> tests went okay. But I got stack trace on XFS filesystem and the machine
> went down.

Fantastic, thanks for running these tests.

> kernel BUG at kernel/workqueue.c:113!

> EIP is at queue_work+0x86/0x90

We were able to set the pending bit but then found that list_empty()
failed on the work struct's entry list_head.  Let's call this memory
corruption of some kind.
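
To make that check concrete, here is a rough userspace model of it, not
the kernel's code.  struct work_model, queue_work_model() and the result
enum are made-up names for illustration; the real queue_work() sets the
pending bit atomically and calls BUG_ON() where this sketch returns
WOULD_BUG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

/* Hypothetical, trimmed-down work item: a pending flag plus a list entry. */
struct work_model {
    unsigned long pending;
    struct list_head entry;
};

enum queue_result { QUEUED, ALREADY_PENDING, WOULD_BUG };

static enum queue_result queue_work_model(struct list_head *queue,
                                          struct work_model *work)
{
    if (work->pending)
        return ALREADY_PENDING;
    work->pending = 1;              /* models the pending bit; not atomic here */

    /* An idle work item must have a self-linked entry. */
    if (!list_is_empty(&work->entry))
        return WOULD_BUG;           /* the kernel would BUG here */

    /* Append the entry at the tail of the queue. */
    work->entry.prev = queue->prev;
    work->entry.next = queue;
    queue->prev->next = &work->entry;
    queue->prev = &work->entry;
    return QUEUED;
}

/* Small walk-through of the three outcomes. */
void demo_queue_work_model(void)
{
    struct list_head queue;
    struct work_model good, scribbled;

    init_list_head(&queue);

    good.pending = 0;
    init_list_head(&good.entry);
    assert(queue_work_model(&queue, &good) == QUEUED);
    assert(queue_work_model(&queue, &good) == ALREADY_PENDING);

    /* Pending bit is clear, but the entry holds stale pointers. */
    scribbled.pending = 0;
    scribbled.entry.next = &queue;
    scribbled.entry.prev = &queue;
    assert(queue_work_model(&queue, &scribbled) == WOULD_BUG);
}
```

The third case is the interesting one: the pending bit looks clear, so
the bit can be set, but the entry no longer points at itself.  That is
the kind of state you would expect if the memory had been freed and
reused underneath the work item.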

>  [<c02b43a2>] xfs_finish_ioend+0x20/0x22
>  [<c02b5e2f>] xfs_end_io_direct+0x3c/0x68
>  [<c018e77a>] dio_complete+0xe3/0xfe
>  [<c018e82d>] dio_bio_end_aio+0x98/0xb1
>  [<c016e889>] bio_endio+0x4e/0x78
>  [<c02cdc89>] __end_that_request_first+0xcd/0x416

It was completing an AIO request.

        ret = blockdev_direct_IO_own_locking(rw, iocb, inode,
                iomap.iomap_target->bt_bdev,
                iov, offset, nr_segs,
                xfs_get_blocks_direct,
                xfs_end_io_direct);

        if (unlikely(ret <= 0 && iocb->private))
                xfs_destroy_ioend(iocb->private);

It looks like xfs_vm_direct_io() is destroying the ioend in the case
where direct IO is returning -EIOCBQUEUED.  Later the AIO will complete
and try to call queue_work on the freed ioend.  This wasn't a problem
before when blockdev_direct_IO_*() would just return the number of bytes
in the op that was in flight.  That test should be

        if (unlikely(ret != -EIOCBQUEUED && iocb->private))

I'll update the patch set and send it out.
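
As a sanity check on the two conditions, here is a small standalone
sketch that only compares the tests themselves.  The helper names are
invented for this example, and EIOCBQUEUED is defined by hand because it
is a kernel-internal errno that userspace headers don't provide (529 in
include/linux/errno.h).  It models nothing else about the XFS path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal errno, defined here for the sketch only. */
#define EIOCBQUEUED 529
/* Stand-in for an ordinary I/O error return. */
#define MODEL_EIO 5

/* Existing test: any return <= 0 with an ioend attached destroys it. */
static bool destroys_ioend_old(long ret, const void *ioend)
{
    return ret <= 0 && ioend != NULL;
}

/* Proposed test: only skip the destroy when the AIO is still in flight. */
static bool destroys_ioend_new(long ret, const void *ioend)
{
    return ret != -EIOCBQUEUED && ioend != NULL;
}

void demo_ioend_tests(void)
{
    int dummy_ioend = 0;            /* placeholder for iocb->private */
    const void *ioend = &dummy_ioend;

    /* In-flight AIO: the old test frees the ioend, the new one keeps it. */
    assert(destroys_ioend_old(-EIOCBQUEUED, ioend));
    assert(!destroys_ioend_new(-EIOCBQUEUED, ioend));

    /* A real error with an ioend attached: both tests destroy it. */
    assert(destroys_ioend_old(-MODEL_EIO, ioend));
    assert(destroys_ioend_new(-MODEL_EIO, ioend));

    /* No ioend attached: neither test does anything. */
    assert(!destroys_ioend_old(-EIOCBQUEUED, NULL));
    assert(!destroys_ioend_new(-MODEL_EIO, NULL));
}
```

The first pair of assertions is the bug in miniature: with the old test
a queued AIO is indistinguishable from a failure, so the ioend is torn
down while the completion path still holds a pointer to it.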

This makes me worry that XFS might have other paths that need to know
about the magical -EIOCBQUEUED case, which actually means that an AIO DIO
is in flight.

Could I coerce some XFS guys into investigating whether we might have
other problems with bubbling -EIOCBQUEUED up from
blockdev_direct_IO_own_locking() through to xfs_file_aio_write()'s
caller before xfs_end_io_direct() has been called?

- z