From: Ming Lei <ming.lei@redhat.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Ming Lei <tom.leiming@gmail.com>,
linux-block <linux-block@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
"open list:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
Dave Chinner <dchinner@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>
Subject: Re: block: DMA alignment of IO buffer allocated from slab
Date: Wed, 19 Sep 2018 18:02:57 +0800
Message-ID: <20180919100256.GD23172@ming.t460p>
In-Reply-To: <877ejh3jv0.fsf@vitty.brq.redhat.com>

Hi Vitaly,

On Wed, Sep 19, 2018 at 11:41:07AM +0200, Vitaly Kuznetsov wrote:
> Ming Lei <tom.leiming@gmail.com> writes:
>
> > Hi Guys,
> >
> > Some storage controllers have a DMA alignment limit, which is often set
> > via blk_queue_dma_alignment(), such as 512-byte alignment for the IO
> > buffer.
>
> While most drivers use 512-byte alignment, it is not a universal rule;
> 'git grep' tells me we have:
> ide-cd.c with 32-byte alignment
> ps3disk.c and rsxx/dev.c with variable alignment.
>
> What if our block configuration consists of several devices (in raid
> array, for example) with different requirements, e.g. one requiring
> 512-byte alignment and the other requiring 256?

A 512-byte-aligned buffer is also 256-byte aligned, and the sector size is
512 bytes.
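
To illustrate the point, here is a minimal userspace sketch (not kernel
code). It assumes every alignment requirement is a power of two, in which
case the strictest requirement in a stacked configuration covers all the
weaker ones. The helper names is_aligned() and combined_alignment() are
made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align (a power of two). */
static bool is_aligned(uintptr_t addr, uintptr_t align)
{
	/* The mask trick below is only valid for power-of-two alignments. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical helper: the strictest of two power-of-two alignment
 * requirements. Any address aligned to the result is aligned to both.
 */
static uintptr_t combined_alignment(uintptr_t a, uintptr_t b)
{
	return a > b ? a : b;
}
```

With these helpers, combined_alignment(512, 256) is 512, and any address
for which is_aligned(addr, 512) holds also passes is_aligned(addr, 256).
The reverse is not true: 0x1100 is 256-byte aligned but not 512-byte
aligned.
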
>
> >
> > The block layer currently only checks whether this limit is respected
> > for buffers of pass-through requests, see blk_rq_map_user_iov() and
> > bio_map_user_iov().
> >
> > The userspace buffer for direct IO is checked in the dio path, see
> > do_blockdev_direct_IO().
> > IO buffers from the page cache should be fine wrt. this limit too.
> >
> > However, some file systems, such as XFS, may allocate a single-sector IO
> > buffer via slab. Usually I guess kmalloc-512 should return a 512-aligned
> > buffer. But once KASAN or other slab debug options are enabled, it looks
> > like this isn't true any more: kmalloc-512 may not return a 512-aligned
> > buffer. Data corruption can then be observed because the IO buffer from
> > the fs layer no longer respects the DMA alignment limit.
> >
> > Several related questions follow:
> >
> > 1) Does the kmalloc-N slab guarantee to return an N-byte aligned buffer?
> > If yes, is it a stable rule?
> >
> > 2) If it is a rule for the kmalloc-N slab to return an N-byte aligned
> > buffer, it seems KASAN violates this rule?
>
> (as I was kinda involved in debugging): the issue was observed with the SLUB
> allocator. KASAN is not to blame; everything which requires additional
> metadata space will break this, see e.g. calculate_sizes() in slub.c

A buffer allocated via kmalloc() should be aligned to at least the L1
hardware cache line size.

I have raised the question of whether the kmalloc-512 slab guarantees to
return a 512-byte aligned buffer; let's see what the answer is from the MM
guys :-)

From the Red Hat BZ, my understanding is that this issue is only triggered
when KASAN is enabled. Or have you figured out how to reproduce it without
KASAN involved?
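
For reference, the effect being discussed can be modelled in userspace.
This is a deliberately simplified sketch, not SLUB's real layout logic: it
assumes objects are packed back to back at a fixed stride from an aligned
slab base, and the 64 bytes of per-object metadata used in the note below
is an arbitrary value chosen for illustration. The function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model: object i starts at slab_base + i * stride. */
static uintptr_t object_addr(uintptr_t slab_base, size_t stride, size_t i)
{
	return slab_base + (uintptr_t)(i * stride);
}

/* True if the first nr objects in the modelled slab are all align-aligned. */
static bool all_objects_aligned(uintptr_t slab_base, size_t stride,
				size_t nr, uintptr_t align)
{
	/* align must be a power of two for the mask test to be valid. */
	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = 0; i < nr; i++) {
		if (object_addr(slab_base, stride, i) & (align - 1))
			return false;
	}
	return true;
}
```

In this model, a stride of 512 from a 512-aligned base keeps every object
512-byte aligned. Once the stride grows to 512 + 64 to make room for the
assumed metadata, the second object already sits 64 bytes past a 512-byte
boundary, although every object is still 64-byte aligned.
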
>
> >
> > 3) If slab can't guarantee to return a 512-aligned buffer, how do we fix
> > this data corruption issue?
>
> I'm no expert in the block layer, but in the case of complex block device
> configurations where the bio submitter can't know all the requirements, I
> see no other choice than bouncing.

I guess that might be the last resort, given that the current approach
without bouncing has worked for decades and no one seems to have complained
before.

Thanks,
Ming