From: "Michael L. Semon" <mlsemon35@gmail.com>
To: xfs-oss <xfs@oss.sgi.com>
Subject: Null pointer dereference while at ACL limit on v5 XFS
Date: Mon, 23 Jun 2014 17:48:31 -0400
Message-ID: <53A8A0AF.9070009@gmail.com>

At the ACL limit of v5-superblock XFS--with a directory filled with both default 
and access ACL entries--I'm getting a null pointer dereference on x86 after 
creating the directory successfully.

Disclaimer:  There are some current issues on 32-bit x86 that, for instance, 
can make badblocks see phantom bad blocks on a read test.  My apologies in 
advance if this bug report turns out to be a false alarm.

My first encounter with this issue involved fsstress.  Here's part of a `crash` 
session from the fsstress run.

root@oldsvrhw:/mnt/crashdump/xfs-fsstress-max-acl-2# crash vmlinux System.map vmcore
crash 7.0.4
# setup was snipped.
DEBUG KERNEL: vmlinux  
    DUMPFILE: vmcore
        CPUS: 1
        DATE: Fri Jun 20 13:04:23 2014
      UPTIME: 00:29:49
LOAD AVERAGE: 1.06, 1.56, 0.75
       TASKS: 78
    NODENAME: oldsvrhw
     RELEASE: 3.16.0-rc1+
     VERSION: #1 SMP Thu Jun 19 20:10:57 EDT 2014
     MACHINE: i686  (730 Mhz)
      MEMORY: 510.4 MB
       PANIC: "Oops: 0000 [#1] SMP DEBUG_PAGEALLOC" (check log for details)
         PID: 41
     COMMAND: "kworker/0:1H"
        TASK: de8f2ac0  [THREAD_INFO: de92e000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> dmesg
# ### excerpt:

# ### mounted $SCRATCH_DEV, applied ACLs to $SCRATCH_MNT/test_dir
[ 1499.886170] XFS (hdc5): Mounting V5 Filesystem
[ 1500.057759] XFS (hdc5): Ending clean mount

# ### ran `fsstress -d $SCRATCH_MNT/test-dir/a -n 10000 -p 16`
# ### BTW, does fsstress trash the existing directory before a run?
[ 1654.043846] fsstress (610) used greatest stack depth: 4956 bytes left
[ 1654.063619] fsstress (615) used greatest stack depth: 4920 bytes left
[ 1654.082220] fsstress (623) used greatest stack depth: 4820 bytes left
[ 1654.087344] fsstress (611) used greatest stack depth: 4800 bytes left
[ 1654.094295] fsstress (614) used greatest stack depth: 4784 bytes left
[ 1654.191650] fsstress (608) used greatest stack depth: 4768 bytes left
[ 1663.452036] perf interrupt took too long (2537 > 2500), lowering kernel.perf_event_max_sample_rate to 50000

# ### This was OK, so I hit Ctrl-c, then ran this (not in child directory):
# ### ran `fsstress -d $SCRATCH_MNT/test-dir -n 10000 -p 16`
[ 1789.338622] BUG: unable to handle kernel NULL pointer dereference at 0000000c
[ 1789.338842] IP: [<c1263048>] xfs_ail_check+0x58/0xc0
[ 1789.338994] *pde = 00000000 
[ 1789.339042] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 1789.339042] CPU: 0 PID: 41 Comm: kworker/0:1H Not tainted 3.16.0-rc1+ #1
[ 1789.339042] Hardware name: Dell Computer Corporation       L733r                          /CA810E                         , BIOS A14 09/05/2001
[ 1789.339042] Workqueue: xfslogd xfs_buf_iodone_work
[ 1789.339042] task: de8f2ac0 ti: de92e000 task.ti: de92e000
[ 1789.339042] EIP: 0060:[<c1263048>] EFLAGS: 00010286 CPU: 0
[ 1789.339042] EIP is at xfs_ail_check+0x58/0xc0
[ 1789.339042] EAX: 00000000 EBX: dde37370 ECX: 0000330a EDX: 0000330a
[ 1789.339042] ESI: 00000001 EDI: 00000001 EBP: de92fc9c ESP: de92fc90
[ 1789.339042]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 1789.339042] CR0: 8005003b CR2: 0000000c CR3: 1c8ef000 CR4: 000007d0
[ 1789.339042] Stack:
[ 1789.339042]  dde37370 ddc4ea80 00000001 de92fcac c12630c3 dde37370 00000012 de92fd04
[ 1789.339042]  c1263d1d 00000000 00000001 00000000 00000000 ddc4ea88 de92fd38 dc8bba28
[ 1789.339042]  ddc4ea80 00000000 0000330a de92fd44 0000001f 00000001 00000012 00003362
[ 1789.339042] Call Trace:
[ 1789.339042]  [<c12630c3>] xfs_ail_delete+0x13/0x60
[ 1789.339042]  [<c1263d1d>] xfs_trans_ail_update_bulk+0xad/0x3c0
[ 1789.339042]  [<c11fbd35>] xfs_trans_committed_bulk+0x255/0x300
[ 1789.339042]  [<c125dcac>] xlog_cil_committed+0x3c/0x160
[ 1789.339042]  [<c1259f8c>] xlog_state_do_callback+0x17c/0x380
[ 1789.339042]  [<c125a253>] xlog_state_done_syncing+0xc3/0xe0
[ 1789.339042]  [<c125a2de>] xlog_iodone+0x6e/0x100
[ 1789.339042]  [<c11dd08b>] xfs_buf_iodone_work+0x5b/0xe0
[ 1789.339042]  [<c1055bc5>] process_one_work+0x1b5/0x570
[ 1789.339042]  [<c1055b48>] ? process_one_work+0x138/0x570
[ 1789.339042]  [<c10560e5>] ? worker_thread+0x165/0x470
[ 1789.339042]  [<c1056077>] worker_thread+0xf7/0x470
[ 1789.339042]  [<c1055f80>] ? process_one_work+0x570/0x570
[ 1789.339042]  [<c105d061>] kthread+0xa1/0xc0
[ 1789.339042]  [<c108509b>] ? trace_hardirqs_on+0xb/0x10
[ 1789.339042]  [<c1500ae1>] ret_from_kernel_thread+0x21/0x30
[ 1789.339042]  [<c105cfc0>] ? insert_kthread_work+0x80/0x80
[ 1789.339042] Code: c1 b8 d8 9e 62 c1 e8 a8 00 f9 ff 8b 43 04 39 c6 74 10 8b 7b 0c 39 78 0c 8b 53 08 8b 48 08 74 43 73 45 8b 03 39 c6 74 24 8b 73 0c <39> 70 0c 8b 53 08 8b 48 08 74 4d 73 14 b9 38 00 00 00 ba e3 a3
[ 1789.339042] EIP: [<c1263048>] xfs_ail_check+0x58/0xc0 SS:ESP 0068:de92fc90
[ 1789.339042] CR2: 000000000000000c

Since then, I've been trying out different ways of reproducing this 
message.

# ------ shortest way found so far ------

For a seed file, use this URL...

https://docs.google.com/file/d/0B41268QKoNjtMEU5UUZvMXF6ZzQ

Hopefully, the order will go like this (from memory):

# get the seed file, then unpack it:
xz -d max_acl_file.xz

mkfs.xfs -f -m crc=1 $SCRATCH_DEV
mount $SCRATCH_DEV $SCRATCH_MNT

mkdir $SCRATCH_MNT/acl-dir

setfacl --set-file=max_acl_file $SCRATCH_MNT/acl-dir

cd $SCRATCH_MNT/acl-dir

# or `touch a b c; mkdir d e f`
mkdir a b c
sync

rm -rv ./*
sync

# ----------------------------------------
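
If the seed file is out of reach, a comparable file could probably be 
generated instead.  The sketch below is only an assumption about what 
such a file looks like, not a copy of max_acl_file:  it writes the 
base entries plus a run of named-user entries, once as access entries 
and once as default entries, in the text format that 
`setfacl --set-file` reads.  The function names, the numeric UIDs 
starting at 10000, and the entry count are all made up for 
illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the four base ACL entries followed by
# $2 named-user entries, each line prefixed with $1 ("" or "default:").
emit_entries() {
    prefix=$1
    count=$2
    printf '%suser::rwx\n%sgroup::r-x\n%smask::rwx\n%sother::r-x\n' \
        "$prefix" "$prefix" "$prefix" "$prefix"
    i=0
    while [ "$i" -lt "$count" ]; do
        # UIDs 10000, 10001, ... are arbitrary; they need not exist.
        printf '%suser:%d:rwx\n' "$prefix" $((10000 + i))
        i=$((i + 1))
    done
}

# Hypothetical helper: emit an access ACL and a default ACL, each with
# the same number of named-user entries.
gen_max_acl() {
    emit_entries '' "$1"
    emit_entries 'default:' "$1"
}

# Usage (the count is an ASSUMPTION: it presumes a per-ACL limit of
# 5461 entries on v5 XFS, minus the 4 base entries; adjust as needed):
#   gen_max_acl 5457 > max_acl_file
```

If setfacl rejects the result, lowering the count until it is accepted 
should find the real limit for the kernel under test.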

That's as short as I can get it...if it works.  If not, keep trying 
different things.  The tests need not be heavy:  a few seconds' worth 
of fs_mark should populate the directory sufficiently.  The `rm -rv ./*` 
is key.  sync is not required; the oops will happen on its own.
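
For the "keep trying" part, the create-and-remove cycle could be 
wrapped in a loop.  This is a minimal sketch, not something I ran in 
this form:  `churn_dir` is a made-up name, and the file and directory 
names follow the `touch a b c; mkdir d e f` variant from the steps 
above.

```shell
#!/bin/sh
# Hypothetical helper: repeat a small populate-then-remove cycle inside
# directory $1, $2 times. Stops early if any step fails.
churn_dir() {
    dir=$1
    rounds=$2
    r=0
    while [ "$r" -lt "$rounds" ]; do
        # The subshell keeps the caller's working directory unchanged,
        # and the && chain keeps rm from running if cd fails.
        (cd "$dir" && touch a b c && mkdir d e f && rm -r ./*) || return 1
        r=$((r + 1))
    done
}

# Usage, assuming the ACLs were already applied to acl-dir:
#   churn_dir "$SCRATCH_MNT/acl-dir" 100
```

On an affected kernel the loop would presumably just stop when the 
oops hits, so the round count only needs to be large enough to get 
there.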

This seems to happen only at a point where one or both ACL limits 
have been hit.  I'm only guessing that when a default entry is made, space 
is allocated for the access entry, and vice versa.

Thanks!

Michael

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 11+ messages
2014-06-23 21:48 Michael L. Semon [this message]
2014-06-23 22:08 ` Null pointer dereference while at ACL limit on v5 XFS Mark Tinguely
2014-06-23 22:13   ` Mark Tinguely
2014-06-24  3:34     ` Michael L. Semon
2014-06-24  4:04       ` Dave Chinner
2014-06-24 13:31         ` Michael L. Semon
2014-07-01 22:27         ` Michael L. Semon
2014-07-03 11:56           ` Jeff Liu
2014-06-24 16:31       ` Mark Tinguely
2014-06-24 18:25         ` Mark Tinguely
2014-06-24  2:18 ` Dave Chinner
