* [NOISE] merge window blues, XFS broken
@ 2014-01-26 19:35 Michael L. Semon
2014-01-27 1:56 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: Michael L. Semon @ 2014-01-26 19:35 UTC (permalink / raw)
To: xfs-oss
Hi! This is more an observation than a bug report, me trying to figure
out what happened on what is now a 3-day-old kernel on 32-bit x86
(Pentium 4). The report is marked as [NOISE] because I can do this...
git pull origin master
git remote update # updates xfs-oss
git reset --hard v3.13
git merge xfs-oss/master
...and the resulting kernel and XFS will be as smooth as silk.
However, if I do this...
git pull origin master
git remote update # at time of pull, "Already up-to-date."
git merge xfs-oss/master
...the resulting XFS will not pass this, for either v4- or v5-
superblock XFS:
mkfs.xfs -f $TEST_DEV # always OK
mount $TEST_DEV $TEST_DIR # may succeed, may fail
ls $TEST_DIR/ # may succeed, may fail
umount $TEST_DEV # always fails
The assertion is this (from notes taken by hand):
Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
Anything after my closing is supporting data. My questions are these:
1) Could the patch "xfs: format log items write directly into the linear
CIL buffer" have thrown off alignment with XFS, when combined with the
kernel changes for v3.14-rc?
2) Can kernel header changes--especially for this kernfs feature that
keeps getting in the way--bleed into structures that XFS uses and cause
chaos?
3) Because the XFS behavior changed due to some kernel change that
probably isn't going away, do I need to keep worrying about this issue?
I tried to use pahole and hoped that it would point out something
obvious. However, after attempting to use Perl to massage the large
amount of data created by pahole, it looks like a little bit changed
for XFS and a lot changed for the kernel. Lack of knowledge keeps that
piece of data that must be aligned from jumping out at me.
In case I botched something in my report, I can reproduce this problem at
will and can try to make a better report next time. A crash dump is
available.
Thanks!
Michael
The stack trace is this (from the "crash" utility, different session):
root@plbearer:/mnt/storage/crashdump# crash vmlinux System.map vmcore
# `crash` initialization snipped
SYSTEM MAP: System.map
DEBUG KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 1
DATE: Fri Jan 24 10:04:35 2014
UPTIME: 00:02:26
LOAD AVERAGE: 0.50, 0.20, 0.08
TASKS: 63
NODENAME: plbearer
RELEASE: 3.13.0+
VERSION: #1 Fri Jan 24 09:57:19 EST 2014
MACHINE: i686 (1794 Mhz)
MEMORY: 1.2 GB
PANIC: "kernel BUG at fs/xfs/xfs_message.c:107!"
PID: 301
COMMAND: "mount"
TASK: c5528c30 [THREAD_INFO: bca5a000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 301 TASK: c5528c30 CPU: 0 COMMAND: "mount"
#0 [bca5bc60] crash_kexec at 7907489a
#1 [bca5bcac] do_invalid_op at 790023c8
#2 [bca5bd48] error_code (via invalid_op) at 7944b2ff
EAX: 00000071 EBX: a40c11bc ECX: 000002ac EDX: c5529020 EBP: bca5bd9c
DS: 007b ESI: 78135a00 ES: 007b EDI: 78135a1c GS: 2342
CS: 0060 EIP: 79175065 ERR: ffffffff EFLAGS: 00010286
#3 [bca5bd7c] assfail at 79175065
#4 [bca5bda0] xfs_buf_item_format at 791cbd67
#5 [bca5bde8] xfs_log_commit_cil at 791cb4c8
#6 [bca5be4c] xfs_trans_commit at 7917c4fe
#7 [bca5be78] xfs_log_sbcount at 79176043
#8 [bca5be8c] xfs_unmountfs at 791760f8
#9 [bca5beb4] xfs_fs_put_super at 79179257
#10 [bca5bec0] generic_shutdown_super at 790d7b5b
#11 [bca5bedc] kill_block_super at 790d89be
#12 [bca5beec] deactivate_locked_super at 790d7814
#13 [bca5befc] deactivate_super at 790d7871
#14 [bca5bf0c] mntput_no_expire at 790ef90b
#15 [bca5bf28] mntput at 790f047f
#16 [bca5bf30] do_mount at 790f1c66
#17 [bca5bf80] sys_mount at 790f26ed
#18 [bca5bfb0] ia32_sysenter_target at 7944b5b1
EAX: 00000015 EBX: 09bd3b70 ECX: 09bd3b80 EDX: 09bd61e0
DS: 007b ESI: c0ed0000 ES: 007b EDI: 00000000
SS: 007b ESP: 77c3d2f0 EBP: 00000000 GS: 0000
CS: 0073 EIP: 6f708424 ERR: 00000015 EFLAGS: 00000246
crash> quit
Bisect led me here:
bde7cff67c39227c6ad503394e19e58debdbc5e3 is the first bad commit
commit bde7cff67c39227c6ad503394e19e58debdbc5e3
Author: Christoph Hellwig <hch@infradead.org>
Date: Fri Dec 13 11:34:02 2013 +1100
xfs: format log items write directly into the linear CIL buffer
# git bisect log
git bisect start
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect good d8ec26d7f8287f5788a494f56e8814210f0e64be
# bad: [12e881fb6be0cada0ed4bebe6806945fb85f170a] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl
git bisect bad 12e881fb6be0cada0ed4bebe6806945fb85f170a
# good: [de4fe30af1620b5117d65489621a5037913e7a92] Merge tag 'staging-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect good de4fe30af1620b5117d65489621a5037913e7a92
# good: [d4371f94bc003e912d4825f5c4bdf57959857073] Merge tag 'sound-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good d4371f94bc003e912d4825f5c4bdf57959857073
# good: [e1ba84597c9012b9f9075aac283ac7537d7561ba] Merge tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
git bisect good e1ba84597c9012b9f9075aac283ac7537d7561ba
# good: [7ebd3faa9b5b42caf2d5aa1352a93dcfa0098011] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 7ebd3faa9b5b42caf2d5aa1352a93dcfa0098011
# bad: [1d32bdafaaa8bcc4c39b41ab9f674887d147f188] Merge tag 'xfs-for-linus-v3.14-rc1' of git://oss.sgi.com/xfs/xfs
git bisect bad 1d32bdafaaa8bcc4c39b41ab9f674887d147f188
# good: [93b05cba8ed52a751da9c4c7da6c97bc514bec77] Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
git bisect good 93b05cba8ed52a751da9c4c7da6c97bc514bec77
# good: [eef334e5776c8ef547ada4cec17549929fe590b4] xfs: assert that we hold the ilock for extent map access
git bisect good eef334e5776c8ef547ada4cec17549929fe590b4
# bad: [46f23adf78545c49591619a615edeec41ed5a549] Merge branch 'xfs-factor-icluster-macros' into for-next
git bisect bad 46f23adf78545c49591619a615edeec41ed5a549
# bad: [ce8e962939ca12218092f8eb3c8cfb196cd8cc51] xfs: remove the dquot log format from the dquot log item
git bisect bad ce8e962939ca12218092f8eb3c8cfb196cd8cc51
# good: [3de559fbd04d67473b9be2bd183823c40c4b7557] xfs: refactor xfs_inode_item_format
git bisect good 3de559fbd04d67473b9be2bd183823c40c4b7557
# bad: [bde7cff67c39227c6ad503394e19e58debdbc5e3] xfs: format log items write directly into the linear CIL buffer
git bisect bad bde7cff67c39227c6ad503394e19e58debdbc5e3
# good: [1234351cba958cd5d4338172ccfc869a687cd736] xfs: introduce xlog_copy_iovec
git bisect good 1234351cba958cd5d4338172ccfc869a687cd736
# first bad commit: [bde7cff67c39227c6ad503394e19e58debdbc5e3] xfs: format log items write directly into the linear CIL buffer
Note: For the data below, I tried my hardest to run both good and bad
kernels with the same kernel config. Kernel source code was generated
by the `git merge xfs-oss/master` methods cited above.
Changes from kernel with working XFS (a) and kernel with broken
XFS (b), looking for size changes:
struct xsave_struct {
a: /* size: 832, cachelines: 13, members: 3 */
b: /* size: 1088, cachelines: 17, members: 6 */
struct perf_event {
a: /* size: 864, cachelines: 14, members: 55 */
b: /* size: 872, cachelines: 14, members: 56 */
struct task_struct {
a: /* size: 2988, cachelines: 47, members: 139 */
b: /* size: 3116, cachelines: 49, members: 143 */
struct zone {
a: /* size: 764, cachelines: 12, members: 29 */
b: /* size: 768, cachelines: 12, members: 30 */
struct pglist_data {
a: /* size: 3236, cachelines: 51, members: 14 */
b: /* size: 3252, cachelines: 51, members: 14 */
struct mm_struct {
a: /* size: 564, cachelines: 9, members: 45 */
b: /* size: 576, cachelines: 9, members: 46 */
struct signal_struct {
a: /* size: 632, cachelines: 10, members: 53 */
b: /* size: 640, cachelines: 10, members: 54 */
struct xfs_dquot {
a: /* size: 516, cachelines: 9, members: 22 */
b: /* size: 492, cachelines: 8, members: 22 */
struct xfs_inode_log_item {
a: /* size: 160, cachelines: 3, members: 11 */
b: /* size: 100, cachelines: 2, members: 8 */
struct xfs_dq_logitem {
a: /* size: 104, cachelines: 2, members: 4 */
b: /* size: 80, cachelines: 2, members: 3 */
struct ftrace_event_call {
a: /* size: 72, cachelines: 2, members: 12 */
b: /* size: 76, cachelines: 2, members: 13 */
struct ftrace_event_file {
a: /* size: 36, cachelines: 1, members: 8 */
b: /* size: 48, cachelines: 1, members: 10 */
struct xfs_qoff_logitem {
a: /* size: 92, cachelines: 2, members: 3 */
b: /* size: 76, cachelines: 2, members: 3 */
Changes from kernel with working XFS (a) and kernel with broken
XFS (b), looking changes in the holes in structures.
struct kernfs_open_file {
a:
b: /* XXX 3 bytes hole, try to pack */
I work with pahole maybe once every three months, and my fuzzy memory, it
seems like there are more holes total on 32-bit x86 than there used to be,
don't know why. As always, maybe there's a glitch in the system that
created all of this. My systems evolve over time and may be in a
substandard state at present.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-26 19:35 [NOISE] merge window blues, XFS broken Michael L. Semon
@ 2014-01-27 1:56 ` Dave Chinner
2014-01-27 7:41 ` Christoph Hellwig
2014-01-27 9:46 ` Michael L. Semon
0 siblings, 2 replies; 10+ messages in thread
From: Dave Chinner @ 2014-01-27 1:56 UTC (permalink / raw)
To: Michael L. Semon; +Cc: xfs-oss
On Sun, Jan 26, 2014 at 02:35:34PM -0500, Michael L. Semon wrote:
> Hi! This is more an observation than a bug report, me trying to figure
> out what happened on what is now a 3-day-old kernel on 32-bit x86
> (Pentium 4). The report is marked as [NOISE] because I can do this...
>
> git pull origin master
> git remote update # updates xfs-oss
> git reset --hard v3.13
> git merge xfs-oss/master
>
> ...and the resulting kernel and XFS will be as smooth as silk.
> However, if I do this...
>
> git pull origin master
> git remote update # at time of pull, "Already up-to-date."
> git merge xfs-oss/master
>
> ...the resulting XFS will not pass this, for either v4- or v5-
> superblock XFS:
>
> mkfs.xfs -f $TEST_DEV # always OK
> mount $TEST_DEV $TEST_DIR # may succeed, may fail
> ls $TEST_DIR/ # may succeed, may fail
> umount $TEST_DEV # always fails
>
> The assertion is this (from notes taken by hand):
>
> Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
Hmmmm - that would point to struct xfs_log_iovec being 12 bytes on
ia32, similarly the struct xfs_log_vec may not be 8 byte aligned
when allocated because pointers only require 4 byte alignement.
Hence the initial data buffer may be 4 byte aligned and so that will
break the alignment requirement.
As a test, can you add pad both structures to be a multiple of 8
bytes in size, and add to them __aligned(8) so that they are
correctly aligned in memory? i.e.
struct xfs_log_vec {
......
int pad;
} __aligned(8);
struct xfs_log_iovec {
......
int pad;
} __aligned(8);
And see if that makes the problem go away?
> I work with pahole maybe once every three months, and my fuzzy memory, it
> seems like there are more holes total on 32-bit x86 than there used to be,
> don't know why.
Check the structs with pahole before and after you make the above
padding change to see if there is a difference in size and alignment
as a result.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-27 1:56 ` Dave Chinner
@ 2014-01-27 7:41 ` Christoph Hellwig
2014-01-27 9:46 ` Michael L. Semon
1 sibling, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2014-01-27 7:41 UTC (permalink / raw)
To: Dave Chinner; +Cc: Michael L. Semon, xfs-oss
On Mon, Jan 27, 2014 at 12:56:14PM +1100, Dave Chinner wrote:
> Hmmmm - that would point to struct xfs_log_iovec being 12 bytes on
> ia32, similarly the struct xfs_log_vec may not be 8 byte aligned
> when allocated because pointers only require 4 byte alignement.
>
> Hence the initial data buffer may be 4 byte aligned and so that will
> break the alignment requirement.
Yeah. Shoulkd be enough to just align the initial lv_buf allocation.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-27 1:56 ` Dave Chinner
2014-01-27 7:41 ` Christoph Hellwig
@ 2014-01-27 9:46 ` Michael L. Semon
2014-01-27 23:30 ` Dave Chinner
1 sibling, 1 reply; 10+ messages in thread
From: Michael L. Semon @ 2014-01-27 9:46 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs-oss
On 01/26/2014 08:56 PM, Dave Chinner wrote:
> On Sun, Jan 26, 2014 at 02:35:34PM -0500, Michael L. Semon wrote:
>> Hi! This is more an observation than a bug report, me trying to figure
>> out what happened on what is now a 3-day-old kernel on 32-bit x86
>> (Pentium 4). The report is marked as [NOISE] because I can do this...
>>
>> git pull origin master
>> git remote update # updates xfs-oss
>> git reset --hard v3.13
>> git merge xfs-oss/master
>>
>> ...and the resulting kernel and XFS will be as smooth as silk.
>> However, if I do this...
>>
>> git pull origin master
>> git remote update # at time of pull, "Already up-to-date."
>> git merge xfs-oss/master
>>
>> ...the resulting XFS will not pass this, for either v4- or v5-
>> superblock XFS:
>>
>> mkfs.xfs -f $TEST_DEV # always OK
>> mount $TEST_DEV $TEST_DIR # may succeed, may fail
>> ls $TEST_DIR/ # may succeed, may fail
>> umount $TEST_DEV # always fails
>>
>> The assertion is this (from notes taken by hand):
>>
>> Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
>
> Hmmmm - that would point to struct xfs_log_iovec being 12 bytes on
> ia32, similarly the struct xfs_log_vec may not be 8 byte aligned
> when allocated because pointers only require 4 byte alignement.
>
> Hence the initial data buffer may be 4 byte aligned and so that will
> break the alignment requirement.
>
> As a test, can you add pad both structures to be a multiple of 8
> bytes in size, and add to them __aligned(8) so that they are
> correctly aligned in memory? i.e.
>
> struct xfs_log_vec {
> ......
> int pad;
> } __aligned(8);
>
> struct xfs_log_iovec {
> ......
> int pad;
> } __aligned(8);
>
> And see if that makes the problem go away?
No. However, I didn't try to pad anywhere except at the end of each struct.
Also, I didn't know how to handle the mix of __aligned(8) within a typedef
definition.
What I did do however, is add a similar pad to xfs_inode_log_item, and that
got both umounts and xfstests working again. There's a new lockdep that I'll
have to submit tomorrow, but otherwise it looks consistent with previous XFS
behavior in very, very little testing.
Suggestions are welcome. If there was a list of what is supposed to be
8-byte aligned, I'd be happy to look further.
>> I work with pahole maybe once every three months, and my fuzzy memory, it
>> seems like there are more holes total on 32-bit x86 than there used to be,
>> don't know why.
>
> Check the structs with pahole before and after you make the above
> padding change to see if there is a difference in size and alignment
> as a result.
>
> Cheers,
>
> Dave.
Everything after my closing is just supporting data, and a verification that
last night's git pull didn't have a non-XFS fix to make the problem go away.
Just grep "pahole reports:" to get to the pahole results.
Thanks, Dave and Christoph!
Michael
# ======================= MERGE ========================
# display kernel commit
root@plbearer:/usr/src/kernel-git/linux# git log | head -n 8
commit b2e448eca1a52fea181905845728ae00a138d84e
Merge: 2d2e7d1 d02b370
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sat Jan 25 15:33:41 2014 -0800
Merge branch 'ipmi' (ipmi patches from Corey Minyard)
Merge ipmi fixes from Corey Minyard:
# merge
root@plbearer:/usr/src/kernel-git/linux# git merge xfs-oss/master
Already up-to-date.
root@plbearer:/usr/src/kernel-git/linux# zcat /proc/config.gz > .config
root@plbearer:/usr/src/kernel-git/linux# make oldconfig
# and so on and so forth...
# ============================== BEFORE ==============================
# pahole reports:
struct xfs_log_iovec {
void * i_addr; /* 0 4 */
int i_len; /* 4 4 */
uint i_type; /* 8 4 */
/* size: 12, cachelines: 1, members: 3 */
/* last cacheline: 12 bytes */
};
struct xfs_inode_log_item {
xfs_log_item_t ili_item; /* 0 68 */
/* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
struct xfs_inode * ili_inode; /* 68 4 */
xfs_lsn_t ili_flush_lsn; /* 72 8 */
xfs_lsn_t ili_last_lsn; /* 80 8 */
short unsigned int ili_lock_flags; /* 88 2 */
short unsigned int ili_logged; /* 90 2 */
unsigned int ili_last_fields; /* 92 4 */
unsigned int ili_fields; /* 96 4 */
/* size: 100, cachelines: 2, members: 8 */
/* last cacheline: 36 bytes */
};
struct xfs_log_vec {
struct xfs_log_vec * lv_next; /* 0 4 */
int lv_niovecs; /* 4 4 */
struct xfs_log_iovec * lv_iovecp; /* 8 4 */
struct xfs_log_item * lv_item; /* 12 4 */
char * lv_buf; /* 16 4 */
int lv_buf_len; /* 20 4 */
int lv_size; /* 24 4 */
/* size: 28, cachelines: 1, members: 7 */
/* last cacheline: 28 bytes */
};
# ============== CHECK TO SEE IF KERNEL STILL SHOWS ISSUE =============
# serial login:
mls@oldsvrhw:~$ cu -l /dev/ttyS0 -s 115200
Connected.
root
Password:
Last login: Sun Jan 26 23:09:13 -0500 2014 on /dev/tty4.
root@plbearer:~# dmesg -c > /dev/null
root@plbearer:~# dmesg --console-on
root@plbearer:~# dmesg --console-level debug
root@plbearer:~# mkfs.xfs -f $TEST_DEV
meta-data=/dev/md3p3 isize=256 agcount=8, agsize=131056 blks
= sectsz=512 attr=2, projid32bit=1, parent=0
= crc=0
data = bsize=4096 blocks=1048448, imaxpct=25
= sunit=16 swidth=32 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=16 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
root@plbearer:~# mount $TEST_DEV $TEST_DIR
[ 87.438116] XFS (md3p3): Mounting Filesystem
[ 87.630320] XFS (md3p3): Ending clean mount
root@plbearer:~# ls $TEST_DIR/
[ 94.140207] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
Entering kdb (current=0xc5298c30, pid 297) Oops: (null)
due to oops @ 0x791752c5
CPU: 0 PID: 297 Comm: ls Not tainted 3.13.0+ #1
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
task: c5298c30 ti: c520e000 task.ti: c520e000
EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
EIP is at assfail+0x2b/0x2d
EAX: 00000071 EBX: c60ba600 ECX: 00000296 EDX: c5299098
ESI: c60ba61c EDI: c60ba600 EBP: c520fe40 ESP: c520fe2c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
CR0: 80050033 CR2: 08f1612c CR3: 4d1f0000 CR4: 000007d0
Stack:
00000000 79570bc8 79576e28 7956946d 00000031 c520fe70 791ce45f c520fe70
7917ceb0 c520fec4 c50d5068 c520fe70 c55d8000 00000000 c50d5068 c607ae30
c60ba600 c520fed4 791cb72b c607af80 c6c01e80 000000d8 c4294000 c520feec
Call Trace:
[<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
[<7917ceb0>] ? kmem_alloc+0x64/0xdf
[<791cb72b>] xfs_log_commit_cil+0x391/0x4c4
[<7917c763>] xfs_trans_commit+0xac/0x230
[<79172cf1>] xfs_vn_update_time+0xdb/0x142
[<79172c16>] ? xfs_setattr_mode.isra.10+0x63/0x63
[<790eb7f2>] update_time+0x1e/0x9e
[<790ed28c>] touch_atime+0xcb/0x103
[<790e5e89>] iterate_dir+0x8f/0x9b
[<790e6041>] SyS_getdents64+0x6d/0xcc
[<790e5d18>] ? filldir+0xc7/0xc7
[<7944e938>] sysenter_do_call+0x12/0x36
Code:
55 89 e5 83 ec 14 3e 8d 74 26 00 89 4c 24 10 89
54 24 0c 89 44 24 08 c7 44 24 04 c8 0b 57 79 c7
04 24 00 00 00 00 e8 ad fd ff ff <0f> 0b 55 89 e5
83 ec 14 3e 8d 74 26 00 c7 44 24 10 01 00 00 00
# ============================== PATCH ==============================
# I present this only so you can point out the error of my ways or
# try this yourself. God help the poor soul who finds this through
# Google and applies it.
>From 7e92813556e5cf3fe466680be268efa9dd7e8776 Mon Sep 17 00:00:00 2001
From: "Michael L. Semon" <mlsemon35@gmail.com>
Date: Mon, 27 Jan 2014 02:33:04 -0500
Subject: [PATCH] xfs: add alignment aids to three log-related structs
A change in the kernel after 3.13 altered the alignment of several
structures, causing all XFS umounts to fail. xfs_inode_log_item was
padded to fix this. Along the way, xfs_log_vec and xfs_log_iovec were
also padded to aid in solving the problem.
---
fs/xfs/xfs_inode_item.h | 7 +++++--
fs/xfs/xfs_log.h | 3 ++-
fs/xfs/xfs_log_format.h | 7 +++++--
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/xfs_inode_item.h b/fs/xfs/xfs_inode_item.h
index 488d812..820aac8 100644
--- a/fs/xfs/xfs_inode_item.h
+++ b/fs/xfs/xfs_inode_item.h
@@ -25,7 +25,7 @@ struct xfs_bmbt_rec;
struct xfs_inode;
struct xfs_mount;
-typedef struct xfs_inode_log_item {
+struct xfs_inode_log_item {
xfs_log_item_t ili_item; /* common portion */
struct xfs_inode *ili_inode; /* inode ptr */
xfs_lsn_t ili_flush_lsn; /* lsn at last flush */
@@ -34,7 +34,10 @@ typedef struct xfs_inode_log_item {
unsigned short ili_logged; /* flushed logged data */
unsigned int ili_last_fields; /* fields when flushed */
unsigned int ili_fields; /* fields to be logged */
-} xfs_inode_log_item_t;
+ int pad; /* for 8-byte alignment */
+} __aligned(8);
+
+typedef struct xfs_inode_log_item xfs_inode_log_item_t;
static inline int xfs_inode_clean(xfs_inode_t *ip)
{
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index b0f4ef7..488833c 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -26,7 +26,8 @@ struct xfs_log_vec {
char *lv_buf; /* formatted buffer */
int lv_buf_len; /* size of formatted buffer */
int lv_size; /* size of allocated lv */
-};
+ int pad; /* for 8-byte alignment */
+} __aligned(8);
#define XFS_LOG_VEC_ORDERED (-1)
diff --git a/fs/xfs/xfs_log_format.h b/fs/xfs/xfs_log_format.h
index f0969c7..d3174cb 100644
--- a/fs/xfs/xfs_log_format.h
+++ b/fs/xfs/xfs_log_format.h
@@ -184,11 +184,14 @@ typedef union xlog_in_core2 {
} xlog_in_core_2_t;
/* not an on-disk structure, but needed by log recovery in userspace */
-typedef struct xfs_log_iovec {
+struct xfs_log_iovec {
void *i_addr; /* beginning address of region */
int i_len; /* length in bytes of region */
uint i_type; /* type of region */
-} xfs_log_iovec_t;
+ int pad; /* for 8-byte alignment */
+} __aligned(8);
+
+typedef struct xfs_log_iovec xfs_log_iovec_t;
/*
--
1.8.4
# ============================== AFTER ===============================
# pahole reports:
struct xfs_log_iovec {
void * i_addr; /* 0 4 */
int i_len; /* 4 4 */
uint i_type; /* 8 4 */
int pad; /* 12 4 */
/* size: 16, cachelines: 1, members: 4 */
/* last cacheline: 16 bytes */
};
struct xfs_inode_log_item {
xfs_log_item_t ili_item; /* 0 68 */
/* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
struct xfs_inode * ili_inode; /* 68 4 */
xfs_lsn_t ili_flush_lsn; /* 72 8 */
xfs_lsn_t ili_last_lsn; /* 80 8 */
short unsigned int ili_lock_flags; /* 88 2 */
short unsigned int ili_logged; /* 90 2 */
unsigned int ili_last_fields; /* 92 4 */
unsigned int ili_fields; /* 96 4 */
int pad; /* 100 4 */
/* size: 104, cachelines: 2, members: 9 */
/* last cacheline: 40 bytes */
};
struct xfs_log_vec {
struct xfs_log_vec * lv_next; /* 0 4 */
int lv_niovecs; /* 4 4 */
struct xfs_log_iovec * lv_iovecp; /* 8 4 */
struct xfs_log_item * lv_item; /* 12 4 */
char * lv_buf; /* 16 4 */
int lv_buf_len; /* 20 4 */
int lv_size; /* 24 4 */
int pad; /* 28 4 */
/* size: 32, cachelines: 1, members: 8 */
/* last cacheline: 32 bytes */
};
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-27 9:46 ` Michael L. Semon
@ 2014-01-27 23:30 ` Dave Chinner
2014-01-28 8:22 ` Michael L. Semon
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2014-01-27 23:30 UTC (permalink / raw)
To: Michael L. Semon; +Cc: xfs-oss
On Mon, Jan 27, 2014 at 04:46:02AM -0500, Michael L. Semon wrote:
> root@plbearer:~# ls $TEST_DIR/
>
> [ 94.140207] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
>
> Entering kdb (current=0xc5298c30, pid 297) Oops: (null)
> due to oops @ 0x791752c5
> CPU: 0 PID: 297 Comm: ls Not tainted 3.13.0+ #1
> Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
> task: c5298c30 ti: c520e000 task.ti: c520e000
> EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
> EIP is at assfail+0x2b/0x2d
> EAX: 00000071 EBX: c60ba600 ECX: 00000296 EDX: c5299098
> ESI: c60ba61c EDI: c60ba600 EBP: c520fe40 ESP: c520fe2c
> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> CR0: 80050033 CR2: 08f1612c CR3: 4d1f0000 CR4: 000007d0
> Stack:
> 00000000 79570bc8 79576e28 7956946d 00000031 c520fe70 791ce45f c520fe70
> 7917ceb0 c520fec4 c50d5068 c520fe70 c55d8000 00000000 c50d5068 c607ae30
> c60ba600 c520fed4 791cb72b c607af80 c6c01e80 000000d8 c4294000 c520feec
> Call Trace:
> [<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
It's not clear to me that there's anything wrong with the inode log
item structure, so I need to know what iovec we tripped over here.
Can you post the disassembly of this function so we can see which
call to xlog_prepare_iovec tripped the assert? i.e.:
gdb> disass xfs_inode_item_format
to give the raw disassembly output, and
gdb> disass /m xfs_inode_item_format
To output the c-code annotated version.
Or even just annotating the code with printk()s prior to each
xlog_prepare_iovec() call in xfs_inode_item_format will do ;)
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-27 23:30 ` Dave Chinner
@ 2014-01-28 8:22 ` Michael L. Semon
2014-01-28 9:55 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: Michael L. Semon @ 2014-01-28 8:22 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs-oss
On 01/27/2014 06:30 PM, Dave Chinner wrote:
> On Mon, Jan 27, 2014 at 04:46:02AM -0500, Michael L. Semon wrote:
>> root@plbearer:~# ls $TEST_DIR/
>>
>> [ 94.140207] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
>>
>> Entering kdb (current=0xc5298c30, pid 297) Oops: (null)
>> due to oops @ 0x791752c5
>> CPU: 0 PID: 297 Comm: ls Not tainted 3.13.0+ #1
>> Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
>> task: c5298c30 ti: c520e000 task.ti: c520e000
>> EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
>> EIP is at assfail+0x2b/0x2d
>> EAX: 00000071 EBX: c60ba600 ECX: 00000296 EDX: c5299098
>> ESI: c60ba61c EDI: c60ba600 EBP: c520fe40 ESP: c520fe2c
>> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> CR0: 80050033 CR2: 08f1612c CR3: 4d1f0000 CR4: 000007d0
>> Stack:
>> 00000000 79570bc8 79576e28 7956946d 00000031 c520fe70 791ce45f c520fe70
>> 7917ceb0 c520fec4 c50d5068 c520fe70 c55d8000 00000000 c50d5068 c607ae30
>> c60ba600 c520fed4 791cb72b c607af80 c6c01e80 000000d8 c4294000 c520feec
>> Call Trace:
>> [<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
>
> It's not clear to me that there's anything wrong with the inode log
> item structure, so I need to know what iovec we tripped over here.
> Can you post the disassembly of this function so we can see which
> call to xlog_prepare_iovec tripped the assert? i.e.:
>
> gdb> disass xfs_inode_item_format
>
> to give the raw disassembly output, and
>
> gdb> disass /m xfs_inode_item_format
>
> To output the c-code annotated version.
>
> Or even just annotating the code with printk()s prior to each
> xlog_prepare_iovec() call in xfs_inode_item_format will do ;)
>
> Cheers,
>
> Dave.
>
OK, I had to generate a new crash for this, so pardon the dust:
# ======= SERIAL SESSION
root@plbearer:/var/lib/xfstests# mkfs.xfs -f $TEST_DEV
meta-data=/dev/md3p3 isize=256 agcount=8, agsize=131056 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0
data = bsize=4096 blocks=1048448, imaxpct=25
= sunit=16 swidth=32 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=16 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
root@plbearer:/var/lib/xfstests# mount $TEST_DEV $TEST_DIR
root@plbearer:/var/lib/xfstests# ls $TEST_DIR/
[ 218.561794] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
Entering kdb (current=0xc5289860, pid 320) Oops: (null)
due to oops @ 0x791752c5
CPU: 0 PID: 320 Comm: ls Not tainted 3.13.0+ #12
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
task: c5289860 ti: c5210000 task.ti: c5210000
EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
EIP is at assfail+0x2b/0x2d
EAX: 00000071 EBX: c5095500 ECX: 000002a3 EDX: c5289cc8
ESI: c509551c EDI: c5095500 EBP: c5211e40 ESP: c5211e2c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
CR0: 80050033 CR2: 0860f12c CR3: 4c2a8000 CR4: 000007d0
Stack:
00000000 79571d00 79577f60 7956a5a5 00000031 c5211e70 791ce45f c5211e70
7917ceb0 c5211ec4 c6356068 c5211e70 c55cc000 00000000 c6356068 c607ab10
c5095500 c5211ed4 791cb72b c607af80 c6c01e80 000000d8 c5121000 c5211eec
Call Trace:
[<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
[<7917ceb0>] ? kmem_alloc+0x64/0xdf
[<791cb72b>] xfs_log_commit_cil+0x391/0x4c4
[<7917c763>] xfs_trans_commit+0xac/0x230
[<79172cf1>] xfs_vn_update_time+0xdb/0x142
[<79172c16>] ? xfs_setattr_mode.isra.10+0x63/0x63
[<790eb7f2>] update_time+0x1e/0x9e
[<790ed28c>] touch_atime+0xcb/0x103
[<790e5e89>] iterate_dir+0x8f/0x9b
[<790e6041>] SyS_getdents64+0x6d/0xcc
[<790e5d18>] ? filldir+0xc7/0xc7
[<7944f1b8>] sysenter_do_call+0x12/0x36
Code:
55 89 e5 83 ec 14 3e 8d 74 26 00 89 4c 24 10 89
54 24 0c 89 44 24 08 c7 44 24 04 00 1d 57 79 c7
04 24 00 00 00 00 e8 ad fd ff ff <0f> 0b 55 89 e5
83 ec 14 3e 8d 74 26 00 c7 44 24 10 01 00 00 00
# ===== CRASH SESSION
root@plbearer:/mnt/storage/crashdump# crash vmlinux System.map vmcore
# setup was snipped
SYSTEM MAP: System.map
DEBUG KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 1
DATE: Mon Jan 27 23:39:03 2014
UPTIME: 00:03:38
LOAD AVERAGE: 0.06, 0.04, 0.02
TASKS: 63
NODENAME: plbearer
RELEASE: 3.13.0+
VERSION: #12 Mon Jan 27 23:30:59 EST 2014
MACHINE: i686 (1794 Mhz)
MEMORY: 1.2 GB
PANIC: "kernel BUG at fs/xfs/xfs_message.c:107!"
PID: 320
COMMAND: "ls"
TASK: c5289860 [THREAD_INFO: c5210000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 320 TASK: c5289860 CPU: 0 COMMAND: "ls"
#0 [c5211d04] crash_kexec at 79074890
#1 [c5211d50] do_invalid_op at 790023c8
#2 [c5211dec] error_code (via invalid_op) at 7944eeef
EAX: 00000071 EBX: c5095500 ECX: 000002a3 EDX: c5289cc8 EBP: c5211e40
DS: 007b ESI: c509551c ES: 007b EDI: c5095500 GS: 2342
CS: 0060 EIP: 791752c5 ERR: ffffffff EFLAGS: 00010286
#3 [c5211e20] assfail at 791752c5
#4 [c5211e44] xfs_inode_item_format at 791ce45a
#5 [c5211e74] xfs_log_commit_cil at 791cb728
#6 [c5211ed8] xfs_trans_commit at 7917c75e
#7 [c5211f04] xfs_vn_update_time at 79172cec
#8 [c5211f28] update_time at 790eb7f0
#9 [c5211f44] touch_atime at 790ed287
#10 [c5211f5c] iterate_dir at 790e5e84
#11 [c5211f78] sys_getdents64 at 790e603c
#12 [c5211fb0] ia32_sysenter_target at 7944f1b1
EAX: 000000dc EBX: 00000003 ECX: 08607128 EDX: 00008000
DS: 007b ESI: 08607128 ES: 007b EDI: 6f790000
SS: 007b ESP: 778278b0 EBP: 00000000 GS: 0000
CS: 0073 EIP: 6f7b2424 ERR: 000000dc EFLAGS: 00000216
crash> gdb disass /m xfs_inode_item_format
Dump of assembler code for function xfs_inode_item_format:
367 {
0x791ce415 <+0>: push %ebp
0x791ce416 <+1>: mov %esp,%ebp
0x791ce418 <+3>: push %edi
0x791ce419 <+4>: push %esi
0x791ce41a <+5>: push %ebx
0x791ce41b <+6>: sub $0x1c,%esp
0x791ce41e <+9>: lea %ds:0x0(%esi,%eiz,1),%esi
0x791ce423 <+14>: mov %eax,-0x1c(%ebp)
0x791ce426 <+17>: mov %edx,%ebx
368 struct xfs_inode_log_item *iip = INODE_ITEM(lip);
369 struct xfs_inode *ip = iip->ili_inode;
0x791ce428 <+19>: mov 0x44(%eax),%eax
0x791ce42b <+22>: mov %eax,-0x14(%ebp)
370 struct xfs_inode_log_format *ilf;
371 struct xfs_log_iovec *vecp = NULL;
0x791ce42e <+25>: movl $0x0,-0x10(%ebp)
372
373 ilf = xlog_prepare_iovec(lv, &vecp, XLOG_REG_TYPE_IFORMAT);
374 ilf->ilf_type = XFS_LI_INODE;
0x791ce464 <+79>: movw $0x123b,(%esi)
375 ilf->ilf_ino = ip->i_ino;
0x791ce469 <+84>: mov -0x14(%ebp),%ecx
0x791ce46c <+87>: mov 0x14(%ecx),%edx
0x791ce46f <+90>: mov 0x10(%ecx),%eax
0x791ce472 <+93>: mov %eax,0xc(%esi)
0x791ce475 <+96>: mov %edx,0x10(%esi)
376 ilf->ilf_blkno = ip->i_imap.im_blkno;
0x791ce478 <+99>: mov 0x1c(%ecx),%edx
0x791ce47b <+102>: mov 0x18(%ecx),%eax
0x791ce47e <+105>: mov %eax,0x24(%esi)
0x791ce481 <+108>: mov %edx,0x28(%esi)
377 ilf->ilf_len = ip->i_imap.im_len;
0x791ce484 <+111>: movzwl 0x20(%ecx),%eax
0x791ce488 <+115>: mov %eax,0x2c(%esi)
378 ilf->ilf_boffset = ip->i_imap.im_boffset;
0x791ce48b <+118>: mov %ecx,%edx
0x791ce48d <+120>: movzwl 0x22(%ecx),%eax
0x791ce491 <+124>: mov %eax,0x30(%esi)
379 ilf->ilf_fields = XFS_ILOG_CORE;
0x791ce494 <+127>: movl $0x1,0x4(%esi)
380 ilf->ilf_size = 2; /* format + core */
0x791ce49b <+134>: movw $0x2,0x2(%esi)
381 xlog_finish_iovec(lv, vecp, sizeof(struct xfs_inode_log_format));
0x791ce4a1 <+140>: mov -0x10(%ebp),%edi
382
383 if (ip->i_d.di_version == 1)
0x791ce4af <+154>: mov %ecx,%eax
0x791ce4b1 <+156>: movzbl 0x13c(%ecx),%ecx
0x791ce4b8 <+163>: cmp $0x1,%cl
0x791ce4bb <+166>: je 0x791ce5c1 <xfs_inode_item_format+428>
0x791ce4c1 <+172>: mov %cl,-0x18(%ebp)
384 xfs_inode_item_format_v1_inode(ip);
0x791ce5c1 <+428>: mov %edx,%edi
0x791ce5c3 <+430>: call 0x791cdc1e <xfs_inode_item_format_v1_inode>
0x791ce5c8 <+435>: movzbl 0x13c(%edi),%eax
0x791ce5cf <+442>: mov %al,-0x18(%ebp)
0x791ce5d2 <+445>: mov -0x10(%ebp),%edi
0x791ce5d5 <+448>: jmp 0x791ce4c4 <xfs_inode_item_format+175>
385 xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ICORE,
386 &ip->i_d,
0x791ce4dd <+200>: mov -0x14(%ebp),%eax
0x791ce4e0 <+203>: add $0x138,%eax
0x791ce4e5 <+208>: mov %eax,-0x18(%ebp)
387 xfs_icdinode_size(ip->i_d.di_version));
388
389 xfs_inode_item_format_data_fork(iip, ilf, lv, &vecp);
0x791ce55d <+328>: lea -0x10(%ebp),%edi
0x791ce560 <+331>: mov %edi,(%esp)
0x791ce563 <+334>: mov %ebx,%ecx
0x791ce565 <+336>: mov %esi,%edx
0x791ce567 <+338>: mov -0x1c(%ebp),%eax
0x791ce56a <+341>: call 0x791cdc99 <xfs_inode_item_format_data_fork>
390 if (XFS_IFORK_Q(ip)) {
0x791ce56f <+346>: mov -0x14(%ebp),%eax
0x791ce572 <+349>: cmpb $0x0,0x18a(%eax)
0x791ce579 <+356>: jne 0x791ce597 <xfs_inode_item_format+386>
391 xfs_inode_item_format_attr_fork(iip, ilf, lv, &vecp);
0x791ce597 <+386>: mov %edi,(%esp)
0x791ce59a <+389>: mov %ebx,%ecx
0x791ce59c <+391>: mov %esi,%edx
0x791ce59e <+393>: mov -0x1c(%ebp),%ebx
0x791ce5a1 <+396>: mov %ebx,%eax
0x791ce5a3 <+398>: call 0x791ce09d <xfs_inode_item_format_attr_fork>
0x791ce5a8 <+403>: mov 0x60(%ebx),%eax
392 } else {
393 iip->ili_fields &=
0x791ce57b <+358>: mov -0x1c(%ebp),%ebx
0x791ce57e <+361>: mov 0x60(%ebx),%eax
0x791ce581 <+364>: and $0xfffffe3f,%eax
0x791ce586 <+369>: mov %eax,0x60(%ebx)
394 ~(XFS_ILOG_ADATA | XFS_ILOG_ABROOT | XFS_ILOG_AEXT);
395 }
396
397 /* update the format with the exact fields we actually logged */
398 ilf->ilf_fields |= (iip->ili_fields & ~XFS_ILOG_TIMESTAMP);
0x791ce589 <+372>: and $0xbf,%ah
0x791ce58c <+375>: or %eax,0x4(%esi)
0x791ce5ab <+406>: and $0xbf,%ah
0x791ce5ae <+409>: or %eax,0x4(%esi)
399 }
0x791ce58f <+378>: add $0x1c,%esp
0x791ce592 <+381>: pop %ebx
0x791ce593 <+382>: pop %esi
0x791ce594 <+383>: pop %edi
0x791ce595 <+384>: pop %ebp
0x791ce596 <+385>: ret
0x791ce5b1 <+412>: add $0x1c,%esp
0x791ce5b4 <+415>: pop %ebx
0x791ce5b5 <+416>: pop %esi
0x791ce5b6 <+417>: pop %edi
0x791ce5b7 <+418>: pop %ebp
0x791ce5b8 <+419>: ret
End of assembler dump.
crash> quit
root@plbearer:/mnt/storage/crashdump# exit
Thanks!
Michael
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-28 8:22 ` Michael L. Semon
@ 2014-01-28 9:55 ` Dave Chinner
2014-01-29 22:31 ` Michael L. Semon
2014-02-12 0:15 ` Michael L. Semon
0 siblings, 2 replies; 10+ messages in thread
From: Dave Chinner @ 2014-01-28 9:55 UTC (permalink / raw)
To: Michael L. Semon; +Cc: xfs-oss
On Tue, Jan 28, 2014 at 03:22:39AM -0500, Michael L. Semon wrote:
> On 01/27/2014 06:30 PM, Dave Chinner wrote:
> > On Mon, Jan 27, 2014 at 04:46:02AM -0500, Michael L. Semon wrote:
> >> root@plbearer:~# ls $TEST_DIR/
> >>
> >> [ 94.140207] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
> >>
> >> Entering kdb (current=0xc5298c30, pid 297) Oops: (null)
> >> due to oops @ 0x791752c5
> >> CPU: 0 PID: 297 Comm: ls Not tainted 3.13.0+ #1
> >> Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
> >> task: c5298c30 ti: c520e000 task.ti: c520e000
> >> EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
> >> EIP is at assfail+0x2b/0x2d
> >> EAX: 00000071 EBX: c60ba600 ECX: 00000296 EDX: c5299098
> >> ESI: c60ba61c EDI: c60ba600 EBP: c520fe40 ESP: c520fe2c
> >> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> >> CR0: 80050033 CR2: 08f1612c CR3: 4d1f0000 CR4: 000007d0
> >> Stack:
> >> 00000000 79570bc8 79576e28 7956946d 00000031 c520fe70 791ce45f c520fe70
> >> 7917ceb0 c520fec4 c50d5068 c520fe70 c55d8000 00000000 c50d5068 c607ae30
> >> c60ba600 c520fed4 791cb72b c607af80 c6c01e80 000000d8 c4294000 c520feec
> >> Call Trace:
> >> [<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
> >
> > It's not clear to me that there's anything wrong with the inode log
> > item structure, so I need to know what iovec we tripped over here.
> > Can you post the disassembly of this function so we can see which
> > call to xlog_prepare_iovec tripped the assert? i.e.:
.....
> OK, I had to generate a new crash for this, so pardon the dust:
Actually, you don't ned to crash a kernel to do this. Just run
'gdb vmlinux' from your build directory....
> crash> gdb disass /m xfs_inode_item_format
> Dump of assembler code for function xfs_inode_item_format:
> 367 {
> 0x791ce415 <+0>: push %ebp
> 0x791ce416 <+1>: mov %esp,%ebp
> 0x791ce418 <+3>: push %edi
> 0x791ce419 <+4>: push %esi
> 0x791ce41a <+5>: push %ebx
> 0x791ce41b <+6>: sub $0x1c,%esp
> 0x791ce41e <+9>: lea %ds:0x0(%esi,%eiz,1),%esi
> 0x791ce423 <+14>: mov %eax,-0x1c(%ebp)
> 0x791ce426 <+17>: mov %edx,%ebx
>
> 368 struct xfs_inode_log_item *iip = INODE_ITEM(lip);
> 369 struct xfs_inode *ip = iip->ili_inode;
> 0x791ce428 <+19>: mov 0x44(%eax),%eax
> 0x791ce42b <+22>: mov %eax,-0x14(%ebp)
>
> 370 struct xfs_inode_log_format *ilf;
> 371 struct xfs_log_iovec *vecp = NULL;
> 0x791ce42e <+25>: movl $0x0,-0x10(%ebp)
>
> 372
> 373 ilf = xlog_prepare_iovec(lv, &vecp, XLOG_REG_TYPE_IFORMAT);
> 374 ilf->ilf_type = XFS_LI_INODE;
> 0x791ce464 <+79>: movw $0x123b,(%esi)
Ok, so xfs_inode_item_format+0x4a is inside the very first call to
preapre the ilf structure. That tells us that the initial
xfs_log_vec/xfs_log_iovec array are resulting in an unaligned
buffer.
Can you try the patch below, Michael?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
xfs: ensure correct log item buffer alignment
From: Dave Chinner <dchinner@redhat.com>
On 32 bit platforms, the log item vector headers are not 64 bit
aligned or sized. hence if we don't take care to align them
correctly or pad the buffer appropriately for 8 byte alignment, we
can end up with alignment issues when accessing the user buffer
directly as a structure.
To solve this, simply pad the buffer headers to 64 bit offset so
that the data section is always 8 byte aligned.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log_cil.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index cdebd83..4ef6fdb 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -205,16 +205,25 @@ xlog_cil_insert_format_items(
/*
* We 64-bit align the length of each iovec so that the start
* of the next one is naturally aligned. We'll need to
- * account for that slack space here.
+ * account for that slack space here. Then round nbytes up
+ * to 64-bit alignment so that the initial buffer alignment is
+ * easy to calculate and verify.
*/
nbytes += niovecs * sizeof(uint64_t);
+ nbytes = round_up(nbytes, sizeof(uint64_t));
/* grab the old item if it exists for reservation accounting */
old_lv = lip->li_lv;
- /* calc buffer size */
- buf_size = sizeof(struct xfs_log_vec) + nbytes +
- niovecs * sizeof(struct xfs_log_iovec);
+ /*
+ * The data buffer needs to start 64-bit aligned, so round up
+ * that space to ensure we can align it appropriately and not
+ * overrun the buffer.
+ */
+ buf_size = nbytes +
+ round_up((sizeof(struct xfs_log_vec) +
+ niovecs * sizeof(struct xfs_log_iovec)),
+ sizeof(uint64_t));
/* compare to existing item size */
if (lip->li_lv && buf_size <= lip->li_lv->lv_size) {
@@ -251,6 +260,8 @@ xlog_cil_insert_format_items(
/* The allocated data region lies beyond the iovec region */
lv->lv_buf_len = 0;
lv->lv_buf = (char *)lv + buf_size - nbytes;
+ ASSERT(IS_ALIGNED((unsigned long)lv->lv_buf, sizeof(uint64_t)));
+
lip->li_ops->iop_format(lip, lv);
insert:
ASSERT(lv->lv_buf_len <= nbytes);
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-28 9:55 ` Dave Chinner
@ 2014-01-29 22:31 ` Michael L. Semon
2014-02-12 0:15 ` Michael L. Semon
1 sibling, 0 replies; 10+ messages in thread
From: Michael L. Semon @ 2014-01-29 22:31 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs-oss
On 01/28/2014 04:55 AM, Dave Chinner wrote:
> Ok, so xfs_inode_item_format+0x4a is inside the very first call to
> preapre the ilf structure. That tells us that the initial
> xfs_log_vec/xfs_log_iovec array are resulting in an unaligned
> buffer.
>
> Can you try the patch below, Michael?
>
> Cheers,
>
> Dave.
To the best of my knowledge, it works fine. It was subjected to some
of the xfstests xfs/* and generic/* series along with some of the
"log" group. In addition, it was run through fs_mark and a homebrew
benchmark idea of running a ( make clean; make ) loop for the kernel
while two fsx processes were running. This was for default (4k,
internal logdev) v5-superblock XFS filesystems. v4-superblock XFS
went through similar testing, to a lesser degree.
I did not know about a message like "log buf needs to be larger than
stripe size" on tests that require v2 logs. [xfs/087 might be one of
those tests.] This was also the case for the test-patch that padded
the structs directly, and this might be an old message.
At the very least, I'm keeping this patch. Your approach to the
solution looks cool. Basically, you're stating that you can have
either xfs_log_vec or xfs_log_iovec in there, and no matter what
their size, they get aligned in the log, correct?
Thanks!
Michael
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-01-28 9:55 ` Dave Chinner
2014-01-29 22:31 ` Michael L. Semon
@ 2014-02-12 0:15 ` Michael L. Semon
2014-02-12 1:55 ` Dave Chinner
1 sibling, 1 reply; 10+ messages in thread
From: Michael L. Semon @ 2014-02-12 0:15 UTC (permalink / raw)
To: Dave Chinner; +Cc: Michael L. Semon, xfs-oss
On Tue, 28 Jan 2014, Dave Chinner wrote:
> On Tue, Jan 28, 2014 at 03:22:39AM -0500, Michael L. Semon wrote:
>> On 01/27/2014 06:30 PM, Dave Chinner wrote:
>>> On Mon, Jan 27, 2014 at 04:46:02AM -0500, Michael L. Semon wrote:
>>>> root@plbearer:~# ls $TEST_DIR/
>>>>
>>>> [ 94.140207] XFS: Assertion failed: IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)), file: fs/xfs/xfs_log.h, line: 49
>>>>
>>>> Entering kdb (current=0xc5298c30, pid 297) Oops: (null)
>>>> due to oops @ 0x791752c5
>>>> CPU: 0 PID: 297 Comm: ls Not tainted 3.13.0+ #1
>>>> Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
>>>> task: c5298c30 ti: c520e000 task.ti: c520e000
>>>> EIP: 0060:[<791752c5>] EFLAGS: 00010286 CPU: 0
>>>> EIP is at assfail+0x2b/0x2d
>>>> EAX: 00000071 EBX: c60ba600 ECX: 00000296 EDX: c5299098
>>>> ESI: c60ba61c EDI: c60ba600 EBP: c520fe40 ESP: c520fe2c
>>>> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>>>> CR0: 80050033 CR2: 08f1612c CR3: 4d1f0000 CR4: 000007d0
>>>> Stack:
>>>> 00000000 79570bc8 79576e28 7956946d 00000031 c520fe70 791ce45f c520fe70
>>>> 7917ceb0 c520fec4 c50d5068 c520fe70 c55d8000 00000000 c50d5068 c607ae30
>>>> c60ba600 c520fed4 791cb72b c607af80 c6c01e80 000000d8 c4294000 c520feec
>>>> Call Trace:
>>>> [<791ce45f>] xfs_inode_item_format+0x4a/0x1c5
>>>
>>> It's not clear to me that there's anything wrong with the inode log
>>> item structure, so I need to know what iovec we tripped over here.
>>> Can you post the disassembly of this function so we can see which
>>> call to xlog_prepare_iovec tripped the assert? i.e.:
> .....
>> OK, I had to generate a new crash for this, so pardon the dust:
>
> Actually, you don't ned to crash a kernel to do this. Just run
> 'gdb vmlinux' from your build directory....
>
>> crash> gdb disass /m xfs_inode_item_format
>> Dump of assembler code for function xfs_inode_item_format:
>> 367 {
>> 0x791ce415 <+0>: push %ebp
>> 0x791ce416 <+1>: mov %esp,%ebp
>> 0x791ce418 <+3>: push %edi
>> 0x791ce419 <+4>: push %esi
>> 0x791ce41a <+5>: push %ebx
>> 0x791ce41b <+6>: sub $0x1c,%esp
>> 0x791ce41e <+9>: lea %ds:0x0(%esi,%eiz,1),%esi
>> 0x791ce423 <+14>: mov %eax,-0x1c(%ebp)
>> 0x791ce426 <+17>: mov %edx,%ebx
>>
>> 368 struct xfs_inode_log_item *iip = INODE_ITEM(lip);
>> 369 struct xfs_inode *ip = iip->ili_inode;
>> 0x791ce428 <+19>: mov 0x44(%eax),%eax
>> 0x791ce42b <+22>: mov %eax,-0x14(%ebp)
>>
>> 370 struct xfs_inode_log_format *ilf;
>> 371 struct xfs_log_iovec *vecp = NULL;
>> 0x791ce42e <+25>: movl $0x0,-0x10(%ebp)
>>
>> 372
>> 373 ilf = xlog_prepare_iovec(lv, &vecp, XLOG_REG_TYPE_IFORMAT);
>> 374 ilf->ilf_type = XFS_LI_INODE;
>> 0x791ce464 <+79>: movw $0x123b,(%esi)
>
> Ok, so xfs_inode_item_format+0x4a is inside the very first call to
> preapre the ilf structure. That tells us that the initial
> xfs_log_vec/xfs_log_iovec array are resulting in an unaligned
> buffer.
>
> Can you try the patch below, Michael?
This patch still works fine. This reply is only in hopes that the
patch doesn't get lost for being in a subthread of a [NOISE] post.
I'm having to use this patch quite a bit in order to use XFS at all
on 32-bit x86.
Note that I'm questioning some of the results I'm getting from git,
so I both upgraded git and reworked my kernel git clone. Should this
issue go away as a result, I'll post it long before 3.14.0 is released.
[I'm not understanding why `git merge xfs-oss/master` is consistently
"Already up-to-date", or why a span of 12 commits in `git log` takes
13 steps and 3100 revisions to bisect. It could all be OK, but...]
Thanks!
Michael
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> xfs: ensure correct log item buffer alignment
>
> From: Dave Chinner <dchinner@redhat.com>
>
> On 32 bit platforms, the log item vector headers are not 64 bit
> aligned or sized. hence if we don't take care to align them
> correctly or pad the buffer appropriately for 8 byte alignment, we
> can end up with alignment issues when accessing the user buffer
> directly as a structure.
>
> To solve this, simply pad the buffer headers to 64 bit offset so
> that the data section is always 8 byte aligned.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_log_cil.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index cdebd83..4ef6fdb 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -205,16 +205,25 @@ xlog_cil_insert_format_items(
> /*
> * We 64-bit align the length of each iovec so that the start
> * of the next one is naturally aligned. We'll need to
> - * account for that slack space here.
> + * account for that slack space here. Then round nbytes up
> + * to 64-bit alignment so that the initial buffer alignment is
> + * easy to calculate and verify.
> */
> nbytes += niovecs * sizeof(uint64_t);
> + nbytes = round_up(nbytes, sizeof(uint64_t));
>
> /* grab the old item if it exists for reservation accounting */
> old_lv = lip->li_lv;
>
> - /* calc buffer size */
> - buf_size = sizeof(struct xfs_log_vec) + nbytes +
> - niovecs * sizeof(struct xfs_log_iovec);
> + /*
> + * The data buffer needs to start 64-bit aligned, so round up
> + * that space to ensure we can align it appropriately and not
> + * overrun the buffer.
> + */
> + buf_size = nbytes +
> + round_up((sizeof(struct xfs_log_vec) +
> + niovecs * sizeof(struct xfs_log_iovec)),
> + sizeof(uint64_t));
>
> /* compare to existing item size */
> if (lip->li_lv && buf_size <= lip->li_lv->lv_size) {
> @@ -251,6 +260,8 @@ xlog_cil_insert_format_items(
> /* The allocated data region lies beyond the iovec region */
> lv->lv_buf_len = 0;
> lv->lv_buf = (char *)lv + buf_size - nbytes;
> + ASSERT(IS_ALIGNED((unsigned long)lv->lv_buf, sizeof(uint64_t)));
> +
> lip->li_ops->iop_format(lip, lv);
> insert:
> ASSERT(lv->lv_buf_len <= nbytes);
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [NOISE] merge window blues, XFS broken
2014-02-12 0:15 ` Michael L. Semon
@ 2014-02-12 1:55 ` Dave Chinner
0 siblings, 0 replies; 10+ messages in thread
From: Dave Chinner @ 2014-02-12 1:55 UTC (permalink / raw)
To: Michael L. Semon; +Cc: xfs-oss
On Tue, Feb 11, 2014 at 07:15:47PM -0500, Michael L. Semon wrote:
> On Tue, 28 Jan 2014, Dave Chinner wrote:
> >Ok, so xfs_inode_item_format+0x4a is inside the very first call to
> >preapre the ilf structure. That tells us that the initial
> >xfs_log_vec/xfs_log_iovec array are resulting in an unaligned
> >buffer.
> >
> >Can you try the patch below, Michael?
>
> This patch still works fine. This reply is only in hopes that the
> patch doesn't get lost for being in a subthread of a [NOISE] post.
> I'm having to use this patch quite a bit in order to use XFS at all
> on 32-bit x86.
It's in xfs-fixes-for-3.14-rc3 anf for-next right now. It'll go
upstream in the next couple of days.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-02-12 1:55 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-26 19:35 [NOISE] merge window blues, XFS broken Michael L. Semon
2014-01-27 1:56 ` Dave Chinner
2014-01-27 7:41 ` Christoph Hellwig
2014-01-27 9:46 ` Michael L. Semon
2014-01-27 23:30 ` Dave Chinner
2014-01-28 8:22 ` Michael L. Semon
2014-01-28 9:55 ` Dave Chinner
2014-01-29 22:31 ` Michael L. Semon
2014-02-12 0:15 ` Michael L. Semon
2014-02-12 1:55 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).