Re: [PATCH V4 00/13] MD: a caching layer for raid5/6

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Shaohua Li <shli@fb.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: linux-raid@vger.kernel.org, songliubraving@fb.com,
	hch@infradead.org, dan.j.williams@intel.com, neilb@suse.de
Subject: Re: [PATCH V4 00/13] MD: a caching layer for raid5/6
Date: Thu, 2 Jul 2015 10:11:18 -0700	[thread overview]
Message-ID: <20150702171104.GA683150@devbig257.prn2.facebook.com> (raw)
In-Reply-To: <20150702032516.GB10378@yliu-dev.sh.intel.com>

On Thu, Jul 02, 2015 at 11:25:16AM +0800, Yuanhan Liu wrote:
> Hi Shaohua,
> 
> I gave it a quick test(dd if=/dev/zero of=/dev/md0) in a KVM, and
> here I met a kernel oops.
> 
>     [   14.768563] BUG: unable to handle kernel NULL pointer dereference at   (null)
>     [   14.769153] IP: [<c1383b13>] xor_sse_3_pf64+0x67/0x207
>     [   14.769585] *pde = 00000000
>     [   14.769822] Oops: 0000 [#1] SMP
>     [   14.770092] Modules linked in:
>     [   14.770349] CPU: 0 PID: 2234 Comm: md0_raid5 Not tainted 4.1.0+ #18
>     [   14.770854] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
>     [   14.771859] task: f62c5e80 ti: f04ec000 task.ti: f04ec000
>     [   14.772287] EIP: 0060:[<c1383b13>] EFLAGS: 00010246 CPU: 0
>     [   14.772327] EIP is at xor_sse_3_pf64+0x67/0x207
>     [   14.772327] EAX: 00000010 EBX: 00000000 ECX: e2a99000 EDX: f48ae000
>     [   14.772327] ESI: f48ae000 EDI: 00000010 EBP: f04edcd8 ESP: f04edccc
>     [   14.772327]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>     [   14.772327] CR0: 80050033 CR2: 00000000 CR3: 22ac5000 CR4: 000006d0
>     [   14.772327] Stack:
>     [   14.772327]  c1c4d96c 00000000 00000001 f04edcf8 c1382bcf 00000000 00000002 c112d2e5
>     [   14.772327]  00000000 f04edd98 00000002 f04edd30 c138554b f6a2e028 00000001 f6f51b30
>     [   14.772327]  00000002 00000000 f48ae000 f6a2e028 00001000 f04edd40 00000002 00000000
>     [   14.772327] Call Trace:
>     [   14.772327]  [<c1382bcf>] xor_blocks+0x3a/0x6d
>     [   14.772327]  [<c112d2e5>] ? page_address+0xb8/0xc0
>     [   14.772327]  [<c138554b>] async_xor+0xf3/0x113
>     [   14.772327]  [<c186d954>] raid_run_ops+0xf73/0x1025
>     [   14.772327]  [<c186d954>] ? raid_run_ops+0xf73/0x1025
>     [   14.772327]  [<c16ce8b9>] handle_stripe+0x120/0x199b
>     [   14.772327]  [<c10c7df4>] ? clockevents_program_event+0xfb/0x118
>     [   14.772327]  [<c10c94f7>] ? tick_program_event+0x5f/0x68
>     [   14.772327]  [<c10be529>] ? hrtimer_interrupt+0xa3/0x134
>     [   14.772327]  [<c16d0366>] handle_active_stripes.isra.37+0x232/0x2b5
>     [   14.772327]  [<c16d0701>] raid5d+0x267/0x44c
>     [   14.772327]  [<c18717e6>] ? schedule_timeout+0x1c/0x14d
>     [   14.772327]  [<c1872011>] ? _raw_spin_lock_irqsave+0x1f/0x3d
>     [   14.772327]  [<c16f684a>] md_thread+0x103/0x112
>     [   14.772327]  [<c10a3146>] ? wait_woken+0x62/0x62
>     [   14.772327]  [<c16f6747>] ? md_wait_for_blocked_rdev+0xda/0xda
>     [   14.772327]  [<c108d44e>] kthread+0xa4/0xa9
>     [   14.772327]  [<c1872401>] ret_from_kernel_thread+0x21/0x30
>     [   14.772327]  [<c108d3aa>] ? kthread_create_on_node+0x104/0x104
>     
> 
> I dug it a while, and came up with following fix. Does it make sense
> to you?
> 
> ----
> diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
> index e1bce26..063ac50 100644
> --- a/crypto/async_tx/async_xor.c
> +++ b/crypto/async_tx/async_xor.c
> @@ -30,6 +30,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/raid/xor.h>
>  #include <linux/async_tx.h>
> +#include <linux/highmem.h>
> 
>  /* do_async_xor - dma map the pages and perform the xor with an engine */
>  static __async_inline struct dma_async_tx_descriptor *
> @@ -127,7 +128,7 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
>         /* convert to buffer pointers */
>         for (i = 0; i < src_cnt; i++)
>                 if (src_list[i])
> -                       srcs[xor_src_cnt++] = page_address(src_list[i]) + offset;
> +                       srcs[xor_src_cnt++] = kmap_atomic(src_list[i]) + offset;
>         src_cnt = xor_src_cnt;
>         /* set destination address */
>         dest_buf = page_address(dest) + offset;
> @@ -140,6 +141,9 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
>                 xor_src_cnt = min(src_cnt, MAX_XOR_BLOCKS);
>                 xor_blocks(xor_src_cnt, len, dest_buf, &srcs[src_off]);
> 
> +               for(i = src_off; i < src_off + xor_src_cnt; i++)
> +                       kunmap_atomic(srcs[i]);
> +
>                 /* drop completed sources */
>                 src_cnt -= xor_src_cnt;
>                 src_off += xor_src_cnt;

Thanks! This is the side effect of skip_copy, I think. skip_copy isn't
enabled by default, so we don't hit it before, while my previous test
was done with x86_64. We should see this with skip_copy enabled even
without the cache patches.

The patch does make sense. There are other async APIs using page_address
too: async_raid6_recov.c, async_pq.c, can you please fix them? You can
check them with raid6.

Thanks,
Shaohua

next prev parent reply	other threads:[~2015-07-02 17:11 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-23 21:37 [PATCH V4 00/13] MD: a caching layer for raid5/6 Shaohua Li
2015-06-23 21:37 ` [PATCH V4 01/13] MD: add a new disk role to present cache device Shaohua Li
2015-06-23 21:37 ` [PATCH V4 02/13] raid5: directly use mddev->queue Shaohua Li
2015-06-23 21:37 ` [PATCH V4 03/13] raid5: cache log handling Shaohua Li
2015-06-23 21:37 ` [PATCH V4 04/13] raid5: cache part of raid5 cache Shaohua Li
2015-06-23 21:37 ` [PATCH V4 05/13] raid5: cache reclaim support Shaohua Li
2015-06-23 21:37 ` [PATCH V4 06/13] raid5: cache IO error handling Shaohua Li
2015-06-23 21:37 ` [PATCH V4 07/13] raid5: cache device quiesce support Shaohua Li
2015-06-23 21:37 ` [PATCH V4 08/13] raid5: cache recovery support Shaohua Li
2015-06-23 21:37 ` [PATCH V4 09/13] raid5: add some sysfs entries Shaohua Li
2015-06-23 21:38 ` [PATCH V4 10/13] raid5: don't allow resize/reshape with cache support Shaohua Li
2015-06-23 21:38 ` [PATCH V4 11/13] raid5: guarantee cache release stripes in correct way Shaohua Li
2015-06-23 21:38 ` [PATCH V4 12/13] raid5: enable cache for raid array with cache disk Shaohua Li
2015-06-23 21:38 ` [PATCH V4 13/13] raid5: skip resync if caching is enabled Shaohua Li
2015-07-02  3:25 ` [PATCH V4 00/13] MD: a caching layer for raid5/6 Yuanhan Liu
2015-07-02 17:11   ` Shaohua Li [this message]
2015-07-03  2:18     ` Yuanhan Liu
2015-07-08  1:56 ` NeilBrown
2015-07-08  5:44   ` Shaohua Li
2015-07-09 23:21     ` NeilBrown
2015-07-10  4:08       ` Shaohua Li
2015-07-10  4:36         ` NeilBrown
2015-07-10  4:52           ` Shaohua Li
2015-07-10  5:10             ` NeilBrown
2015-07-10  5:18               ` Shaohua Li
2015-07-10  6:42                 ` NeilBrown
2015-07-10 17:48                   ` Shaohua Li
2015-07-13 22:22                     ` NeilBrown
2015-07-13 22:35                       ` Shaohua Li
2015-07-15  0:45           ` Shaohua Li
2015-07-15  2:12             ` NeilBrown
2015-07-15  3:16               ` Shaohua Li
2015-07-15  4:06                 ` NeilBrown
2015-07-15 19:49                   ` Shaohua Li
2015-07-15 23:16                     ` NeilBrown
2015-07-16  0:07                       ` Shaohua Li
2015-07-16  1:22                         ` NeilBrown
2015-07-16  4:13                           ` Shaohua Li
2015-07-16  6:07                             ` NeilBrown
2015-07-16 15:07                               ` John Stoffel
2015-07-20  0:03                                 ` NeilBrown
2015-07-20 14:11                                   ` John Stoffel
2015-07-16 17:40                               ` Shaohua Li
2015-07-17  3:47                                 ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150702171104.GA683150@devbig257.prn2.facebook.com \
    --to=shli@fb.com \
    --cc=dan.j.williams@intel.com \
    --cc=hch@infradead.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=songliubraving@fb.com \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.