From: Rob Harris
Subject: Custom driver FS brokenness at 4GB?
Date: Wed, 27 May 2015 09:56:29 -0400
Message-ID: <5565CD0D.4080408@gmail.com>
To: linux-ext4@vger.kernel.org

Greetings.

I have an odd issue and need some ideas of where to go next -- I'm out of hair to rip out.

I'm writing a custom block device driver talking to some custom RAID hardware (>32TB) using DMA scatter-gather, with no partitions, and I am using make_request() to service all the BIO requests to simplify debugging. I have the driver working to the point where using dd against the block device seems to work fine (I'm setting iflag|oflag=direct to ensure it's writing to the disk). I also have the blk_queue set to only request a single 4k I/O per BIO (again, to simplify debugging for now). Also, again to debug, I have a mutex wrapping the entire make_request call to ensure that only a single request is being serviced at a time. So, this should be as "simple" as I can make the environment to debug this problem.

Once the driver is loaded, when I try to create a file system (ext4, but the same thing happens with xfs) it seems like there is some corruption occurring, but only when I set the size of the block device to 4GB or more.
For instance, when I set the size to 4G, I can mkfs.ext4, but after 2 or 3 mount/umounts the FS refuses to mount anymore and the kernel log complains that the journal is missing. This was discovered running this loop...

    #!/bin/sh
    COUNT=4032
    while [ 1 ] ; do
        figlet ${COUNT}
        ( umount /mnt ; rmmod smc ) || true
        modprobe smc capacity_in_mb=${COUNT} debug=1
        mkfs.ext4 -m 0 /dev/smcd
        mount /dev/smcd /mnt
        cp count_512m.dat /mnt/test
        umount /mnt
        mount /dev/smcd /mnt
        umount /mnt
        mount /dev/smcd /mnt
        cmp count_512m.dat /mnt/test
        umount /mnt
        mount /dev/smcd /mnt    # ***
        sync
        umount /mnt
        mount /dev/smcd /mnt
        sleep 1
        umount /mnt
        COUNT=$(( COUNT + 64 ))
        sleep 1
    done

Sometimes I'll get in the kernel log:

    May 27 09:39:01 febtober kernel: [64547.304695] EXT4-fs (smcd): ext4_check_descriptors: Checksum for group 0 failed (7009!=0)
    May 27 09:39:01 febtober kernel: [64547.305744] EXT4-fs (smcd): group descriptors corrupted!

Other times I'll get:

    May 27 09:46:49 ryftone-smcdrv kernel: [65014.342850] EXT4-fs (smcd): no journal found

I've seen this loop fail as early as COUNT=4096, but as late as COUNT=4220; removing the sync changes the behavior. When it fails, it usually does so on the 3rd mount (***).

FYI, I effectively call:

    set_capacity( disk, capacity_in_mb * 2048 );    ( 2048 * 512b (kernel sector) = 1M )

Another example: if I set the size of the disk to 16G, I can run mkfs.ext4, but the first mount fails and I see:

    May 27 09:07:27 febtober kernel: [62653.269387] EXT4-fs (smcd): ext4_check_descriptors: Block bitmap for group 0 not in group (block 4294967295)!

But, again, if I set the size < 4G, everything seems fine. I can currently dd read and write across that 4G boundary without issue -- it's ONLY the filesystem accesses.

My gut is screaming that there's a 32/64-bit overflow condition somewhere, but for the life of me I can't find it. Is there something I need to set to tell the block layer I have a 64-bit addressable device?
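
To make the suspected 32/64-bit overflow concrete, here is a minimal sketch of how a sector-to-byte-offset conversion can go wrong exactly at 4G. It is not taken from the smc driver: the function names and the buggy narrowing cast are hypothetical, chosen only to show the pattern worth searching for. Note also that the block number 4294967295 in the last log line equals 2^32 - 1, which is what an all-ones 32-bit value looks like.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_SECTOR_SHIFT 9 /* 512-byte kernel sectors */

/* Hypothetical buggy pattern: the sector number is narrowed to 32 bits
 * before the shift, so any offset at or above 4 GiB wraps around. */
static inline uint32_t byte_offset_truncated(uint64_t sector)
{
    return (uint32_t)sector << KERNEL_SECTOR_SHIFT;
}

/* Intended pattern: keep the value 64 bits wide through the shift. */
static inline uint64_t byte_offset_wide(uint64_t sector)
{
    return sector << KERNEL_SECTOR_SHIFT;
}

/* Both versions agree just below 4 GiB and diverge at the boundary. */
static inline void check_4g_boundary(void)
{
    const uint64_t first_sector_at_4g = 8388608ULL; /* 2^32 / 512 */

    assert(byte_offset_wide(first_sector_at_4g - 1) ==
           byte_offset_truncated(first_sector_at_4g - 1));
    assert(byte_offset_wide(first_sector_at_4g) == 4294967296ULL);
    assert(byte_offset_truncated(first_sector_at_4g) == 0);
}
```

A bug of this shape would stay invisible on a device smaller than 4G and would send every access past the boundary back to the start of the device, where the superblock and group descriptors live. Any `unsigned int`, `u32`, or `int` that holds a sector number, byte offset, or DMA address in the request path would be a candidate.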
set_capacity is always the number of LINUX KERNEL sectors (not what I set blk_queue_logical|physical_block_size to), correct?

I'm currently on 3.16.0 (Ubuntu 14.04.2 LTS) if it matters.

Any help/pointers would be greatly appreciated.

--Rob Harris
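
For reference, the capacity arithmetic discussed above can be sketched in plain C. set_capacity() does count 512-byte kernel sectors regardless of the logical block size, so the `capacity_in_mb * 2048` conversion is the right shape. The helper names below are hypothetical, and the 4096-byte logical block size in the usage note is only an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_SECTOR_SIZE 512u
#define SECTORS_PER_MIB (1024u * 1024u / KERNEL_SECTOR_SIZE) /* 2048 */

/* Capacity in 512-byte kernel sectors for a size given in MiB.
 * The operand is widened first so the product is computed in 64 bits. */
static inline uint64_t mib_to_kernel_sectors(uint32_t capacity_in_mb)
{
    return (uint64_t)capacity_in_mb * SECTORS_PER_MIB;
}

/* Map a kernel sector number to an index in logical blocks of the
 * given size (a multiple of 512 bytes). */
static inline uint64_t kernel_sector_to_logical_block(uint64_t sector,
                                                      uint32_t logical_block_size)
{
    assert(logical_block_size >= KERNEL_SECTOR_SIZE);
    assert(logical_block_size % KERNEL_SECTOR_SIZE == 0);
    return sector / (logical_block_size / KERNEL_SECTOR_SIZE);
}
```

With these helpers, 4096 MiB is 8388608 kernel sectors, and kernel sector 8388608 is logical block 1048576 on a device with 4096-byte logical blocks. If capacity_in_mb is a plain int, the multiplication by 2048 only overflows above 1 TiB, so that expression alone would not explain a failure at 4G, though it would matter for the full >32TB device.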