From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: linux-block <linux-block@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
"open list\:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
Dave Chinner <dchinner@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Ming Lei <ming.lei@redhat.com>
Subject: Re: block: DMA alignment of IO buffer allocated from slab
Date: Wed, 19 Sep 2018 11:41:07 +0200
Message-ID: <877ejh3jv0.fsf@vitty.brq.redhat.com>
In-Reply-To: <CACVXFVOBq3L_EjSTCoiqUL1PH=HMR5EuNNQV0hNndFpGxmUK6g@mail.gmail.com> (Ming Lei's message of "Wed, 19 Sep 2018 17:15:43 +0800")
Ming Lei <tom.leiming@gmail.com> writes:
> Hi Guys,
>
> Some storage controllers have DMA alignment limit, which is often set via
> blk_queue_dma_alignment(), such as 512-byte alignment for IO buffer.
While most drivers use 512-byte alignment, it is not a universal rule;
'git grep' tells me we have:
ide-cd.c with 32-byte alignment
ps3disk.c and rsxx/dev.c with variable alignment.
What if our block configuration consists of several devices (in a RAID
array, for example) with different requirements, e.g. one requiring
512-byte alignment and the other requiring 256?
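
To make the question concrete, here is a minimal userspace sketch (not
kernel code) of how requirements could be combined when every alignment
is a power of two. It assumes alignments are expressed as masks
(alignment - 1); combine_dma_masks() and buffer_meets_mask() are
hypothetical names used only for illustration. Under that assumption the
strictest mask covers all the others, so a buffer that satisfies 512
also satisfies 256, but the submitter still has to learn the combined
value somehow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if mask has the form 2^k - 1 (e.g. 255, 511). */
static inline bool is_alignment_mask(unsigned int mask)
{
	return (mask & (mask + 1)) == 0;
}

/*
 * Hypothetical helper: combine the alignment masks of two stacked devices.
 * For masks of the form 2^k - 1, OR-ing them yields the larger (stricter) one.
 */
static inline unsigned int combine_dma_masks(unsigned int a, unsigned int b)
{
	assert(is_alignment_mask(a) && is_alignment_mask(b));
	return a | b;
}

/* Hypothetical helper: true if addr satisfies the alignment given by mask. */
static inline bool buffer_meets_mask(uintptr_t addr, unsigned int mask)
{
	return (addr & mask) == 0;
}
```

With masks 511 and 255 the combined mask is 511; an address such as
0x1100 would pass the 256-byte device's check and fail the combined one.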
>
> Block layer now only checks if this limit is respected for buffer of
> pass-through request, see blk_rq_map_user_iov(), bio_map_user_iov().
>
> The userspace buffer for direct IO is checked in dio path, see
> do_blockdev_direct_IO().
> IO buffer from page cache should be fine wrt. this limit too.
>
> However, some file systems, such as XFS, may allocate single sector IO buffer
> via slab. Usually I guess kmalloc-512 should be fine to return
> 512-aligned buffer.
> But once KASAN or other slab debug options are enabled, looks this
> isn't true any more, kmalloc-512 may not return 512-aligned buffer.
> Then data corruption can be observed because the IO buffer from fs
> layer doesn't respect the DMA alignment limit any more.
>
> Follows several related questions:
>
> 1) does kmalloc-N slab guarantee to return N-byte aligned buffer? If
> yes, is it a stable rule?
>
> 2) If it is a rule for kmalloc-N slab to return N-byte aligned buffer,
> seems KASAN violates this rule?
(As I was somewhat involved in debugging:) the issue was observed with the
SLUB allocator. KASAN is not to blame; anything which requires additional
metadata space will break this, see e.g. calculate_sizes() in slub.c.
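
The arithmetic behind this can be sketched as follows. This is a
deliberately simplified model, not the actual logic of calculate_sizes():
it assumes objects are packed back to back in a slab that starts on a
page boundary, and that each object's stride is its size plus some
per-object metadata, rounded up to a small minimum alignment. All
function names and the 16-byte metadata figure are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round x up to a multiple of align; align must be a power of two. */
static inline size_t round_up_pow2(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Simplified model: distance between consecutive objects in a slab. */
static inline size_t object_stride(size_t object_size, size_t metadata,
				   size_t min_align)
{
	return round_up_pow2(object_size + metadata, min_align);
}

/* Offset of the index-th object from the (page-aligned) slab start. */
static inline size_t object_offset(size_t index, size_t stride)
{
	return index * stride;
}

/* True if offset is a multiple of align; align must be a power of two. */
static inline bool offset_is_aligned(size_t offset, size_t align)
{
	return (offset & (align - 1)) == 0;
}
```

In this model a 512-byte object with no metadata has a stride of 512, so
every object lands on a 512-byte boundary. Add 16 bytes of metadata and
the stride becomes 528: the second object then sits at offset 528, which
is 8-byte aligned but no longer 512-byte aligned.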
>
> 3) If slab can't guarantee to return 512-aligned buffer, how to fix
> this data corruption issue?
I'm no expert in the block layer, but for complex block device
configurations where the bio submitter can't know all the requirements, I
see no other choice than bouncing.
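
By bouncing I mean roughly the following, shown here as a userspace
sketch rather than block-layer code. It assumes the required alignment
is a power of two and uses C11 aligned_alloc() in place of whatever
allocator a real implementation would use; struct bounce,
bounce_prepare() and bounce_finish() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct bounce {
	void *io_buf;	/* buffer to hand to the device */
	void *bounce;	/* non-NULL only if a bounce copy was allocated */
};

/*
 * If buf already meets the alignment, use it directly. Otherwise allocate
 * an aligned bounce buffer and copy the data into it (needed for writes).
 * Returns 0 on success, -1 if the bounce buffer could not be allocated.
 */
static inline int bounce_prepare(struct bounce *b, void *buf, size_t len,
				 size_t align)
{
	size_t alloc_len;

	assert(align != 0 && (align & (align - 1)) == 0);
	b->io_buf = buf;
	b->bounce = NULL;
	if (((uintptr_t)buf & (align - 1)) == 0)
		return 0;

	/* aligned_alloc() wants a size that is a multiple of the alignment. */
	alloc_len = (len + align - 1) & ~(align - 1);
	if (alloc_len == 0)
		alloc_len = align;
	b->bounce = aligned_alloc(align, alloc_len);
	if (!b->bounce)
		return -1;
	memcpy(b->bounce, buf, len);
	b->io_buf = b->bounce;
	return 0;
}

/* After the IO completes: copy data back for reads, then free the copy. */
static inline void bounce_finish(struct bounce *b, void *buf, size_t len,
				 int was_read)
{
	if (!b->bounce)
		return;
	if (was_read)
		memcpy(buf, b->bounce, len);
	free(b->bounce);
	b->bounce = NULL;
}
```

The cost is an extra allocation and copy on every misaligned request,
which is why this only makes sense as a fallback when the submitter
cannot be told the requirement up front.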
--
Vitaly