From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marek Szyprowski
Subject: Re: [PATCH V15 14/18] block: enable multipage bvecs
Date: Thu, 21 Feb 2019 11:22:39 +0100
Message-ID: <9269fbbf-b5dd-6be1-682f-e791847ea00d@samsung.com>
References: <20190215111324.30129-1-ming.lei@redhat.com>
 <20190215111324.30129-15-ming.lei@redhat.com>
 <6c9ae4de-c56f-a2b3-2542-da7d8b95601d@samsung.com>
 <20190221095733.GA12448@ming.t460p>
 <20190221101618.GB12448@ming.t460p>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Return-path:
In-Reply-To: <20190221101618.GB12448@ming.t460p>
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
To: Ming Lei
Cc: Jens Axboe, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Theodore Ts'o, Omar Sandoval, Sagi Grimberg,
 Dave Chinner, Kent Overstreet, Mike Snitzer, dm-devel@redhat.com,
 Alexander Viro, linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
 David Sterba, linux-btrfs@vger.kernel.org, "Darrick J . Wong",
 linux-xfs@vger.kernel.org, Gao Xiang, Christoph Hellwig,
 linux-ext4@vger.kernel.org, Coly Li, linux-bcache@vger.kernel.org,
 Boaz Harrosh, Bob
List-Id: linux-bcache@vger.kernel.org

Hi Ming,

On 2019-02-21 11:16, Ming Lei wrote:
> On Thu, Feb 21, 2019 at 11:08:19AM +0100, Marek Szyprowski wrote:
>> On 2019-02-21 10:57, Ming Lei wrote:
>>> On Thu, Feb 21, 2019 at 09:42:59AM +0100, Marek Szyprowski wrote:
>>>> On 2019-02-15 12:13, Ming Lei wrote:
>>>>> This patch pulls the trigger for multi-page bvecs.
>>>>>
>>>>> Reviewed-by: Omar Sandoval
>>>>> Signed-off-by: Ming Lei
>>>> Since Linux next-20190218 I've observed problems with the block layer on one
>>>> of my test devices (Odroid U3 with EXT4 rootfs on SD card). Bisecting
>>>> this issue led me to this change. This is also the first linux-next
>>>> release with this change merged. The issue is fully reproducible and can
>>>> be observed in the following kernel log:
>>>>
>>>> sdhci: Secure Digital Host Controller Interface driver
>>>> sdhci: Copyright(c) Pierre Ossman
>>>> s3c-sdhci 12530000.sdhci: clock source 2: mmc_busclk.2 (100000000 Hz)
>>>> s3c-sdhci 12530000.sdhci: Got CD GPIO
>>>> mmc0: SDHCI controller on samsung-hsmmc [12530000.sdhci] using ADMA
>>>> mmc0: new high speed SDHC card at address aaaa
>>>> mmcblk0: mmc0:aaaa SL16G 14.8 GiB
>>>>
>>>> ...
>>>>
>>>> EXT4-fs (mmcblk0p2): INFO: recovery required on readonly filesystem
>>>> EXT4-fs (mmcblk0p2): write access will be enabled during recovery
>>>> EXT4-fs (mmcblk0p2): recovery complete
>>>> EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
>>>> VFS: Mounted root (ext4 filesystem) readonly on device 179:2.
>>>> devtmpfs: mounted
>>>> Freeing unused kernel memory: 1024K
>>>> hub 1-3:1.0: USB hub found
>>>> Run /sbin/init as init process
>>>> hub 1-3:1.0: 3 ports detected
>>>> *** stack smashing detected ***: terminated
>>>> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
>>>> CPU: 1 PID: 1 Comm: init Not tainted 5.0.0-rc6-next-20190218 #1546
>>>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>>>> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
>>>> [] (show_stack) from [] (dump_stack+0x90/0xc8)
>>>> [] (dump_stack) from [] (panic+0xfc/0x304)
>>>> [] (panic) from [] (do_exit+0xabc/0xc6c)
>>>> [] (do_exit) from [] (do_group_exit+0x3c/0xbc)
>>>> [] (do_group_exit) from [] (get_signal+0x130/0xbf4)
>>>> [] (get_signal) from [] (do_work_pending+0x130/0x618)
>>>> [] (do_work_pending) from [] (slow_work_pending+0xc/0x20)
>>>> Exception stack(0xe88c3fb0 to 0xe88c3ff8)
>>>> 3fa0:                                     00000000 bea7787c 00000005 b6e8d0b8
>>>> 3fc0: bea77a18 b6f92010 b6e8d0b8 00000001 b6e8d0c8 00000001 b6e8c000 bea77b60
>>>> 3fe0: 00000020 bea77998 ffffffff b6d52368 60000050 ffffffff
>>>> CPU3: stopping
>>>>
>>>> I would like to help debug and fix this issue, but I don't really
>>>> have any idea where to start. Here is some more detailed information about
>>>> my test system:
>>>>
>>>> 1. Board: ARM 32bit Samsung Exynos4412-based Odroid U3 (device tree
>>>> source: arch/arm/boot/dts/exynos4412-odroidu3.dts)
>>>>
>>>> 2. Block device: MMC/SDHCI/SDHCI-S3C with SD card
>>>> (drivers/mmc/host/sdhci-s3c.c driver, sdhci_2 device node in the device
>>>> tree)
>>>>
>>>> 3. Rootfs: Ext4
>>>>
>>>> 4. Kernel config: arch/arm/configs/exynos_defconfig
>>>>
>>>> I can gather more logs if needed, just let me know which kernel option to
>>>> enable. Reverting this commit on top of next-20190218 as well as current
>>>> linux-next (tested with next-20190221) fixes this issue and makes the
>>>> system bootable again.
>>> Could you test the patch in following link and see if it can make a difference?
>>>
>>> https://marc.info/?l=linux-aio&m=155070355614541&w=2
>> I've tested that patch, but it doesn't make any difference on the test
>> system. In the log I see no warning added by it.
> I guess it might be related with memory corruption, could you enable the
> following debug options and post the dmesg log?
>
> CONFIG_DEBUG_STACKOVERFLOW=y
> CONFIG_KASAN=y

It won't be that easy, as none of the above options is available on ARM
32bit. I will try to apply some ARM KASAN patches floating on the net and
let you know the result.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland