mkfs.xfs pagefault when removed storage during operation

From: Ajeet Yadav @ 2011-02-01 11:06 UTC
  To: xfs

We are testing the stability of mkfs.xfs and xfs_repair, looking for
crashes and other issues, especially with removable devices.
Unfortunately, crashes do occur.
Code inspection shows that in most cases the callers of
libxfs_readbuf() do not handle the error case, i.e. a NULL return value.

Now I need your suggestion.
Should we fix all such callers, or take the simplest route and exit if
read() or write() fails with errno EIO in libxfs_readbufr() and
libxfs_writebufr()?
Fortunately these functions already support exiting on failure via the
flags LIBXFS_EXIT_ON_FAILURE and LIBXFS_B_EXIT, but those flags are
used selectively.
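
To make the "exit on failure" option concrete, here is a minimal sketch of
a read wrapper that either returns the error to the caller or exits,
depending on a flag. This is not the libxfs code: demo_read_block() and
DEMO_EXIT_ON_FAILURE are hypothetical names made up for illustration, and
treating a short read as EIO is an assumption of the sketch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag for this sketch; it only mirrors the idea behind
 * LIBXFS_EXIT_ON_FAILURE and is not the real definition. */
#define DEMO_EXIT_ON_FAILURE 0x1

/*
 * Read exactly len bytes from fd into buf.
 * Returns 0 on success or a positive errno value on failure.
 * If DEMO_EXIT_ON_FAILURE is set, a failure ends the program instead,
 * so callers never see an unusable buffer.
 */
int demo_read_block(int fd, void *buf, size_t len, int flags)
{
    ssize_t n = read(fd, buf, len);
    int error;

    if (n == (ssize_t)len)
        return 0;

    /* Assumption: a short read is reported as an I/O error. */
    error = (n < 0) ? errno : EIO;
    fprintf(stderr, "read failed: %s\n", strerror(error));

    if (flags & DEMO_EXIT_ON_FAILURE)
        exit(1);
    return error;
}
```

Passing the flag everywhere would be the "simplest way" described above;
passing 0 leaves the decision to each caller, which then has to check the
return value.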

The current problem is in the function libxfs_trans_read_buf():

        bp = libxfs_readbuf(dev, blkno, len, flags);
#ifdef XACT_DEBUG
        fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
#endif
        xfs_buf_item_init(bp, tp->t_mountp);
        bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
        bip->bli_recur = 0;
        xfs_trans_add_item(tp, (xfs_log_item_t *)bip);

        /* initialise b_fsprivate2 so we can find it incore */
        XFS_BUF_SET_FSPRIVATE2(bp, tp);
        *bpp = bp;
        return 0;

If libxfs_readbuf() fails due to device removal or another error, bp is
NULL. xfs_buf_item_init(bp, tp->t_mountp) then dereferences bp, and the
following page fault occurs:

mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
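
The other option, handling the NULL return in the caller, could look
roughly like the sketch below. It uses hypothetical stand-in types and
functions (demo_buf, demo_trans, demo_readbuf, demo_trans_read_buf) rather
than the real libxfs ones, and returning EIO is an assumption, not
necessarily what the final fix should return.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the libxfs buffer and transaction types. */
struct demo_buf   { void *fsprivate; void *fsprivate2; };
struct demo_trans { int nitems; };

/* Stand-in for libxfs_readbuf(): returns NULL when the simulated
 * device has been removed. */
struct demo_buf *demo_readbuf(int device_present)
{
    if (!device_present)
        return NULL;
    return calloc(1, sizeof(struct demo_buf));
}

/* Caller-side check: report an error instead of touching a NULL buffer. */
int demo_trans_read_buf(struct demo_trans *tp, int device_present,
                        struct demo_buf **bpp)
{
    struct demo_buf *bp = demo_readbuf(device_present);

    *bpp = NULL;
    if (bp == NULL)
        return EIO;        /* assumption: surface the failure as EIO */

    /* Only reached with a valid buffer, so these accesses are safe. */
    bp->fsprivate2 = tp;
    tp->nitems++;
    *bpp = bp;
    return 0;
}
```

With a check like this the caller gets an error code back and can clean up
or abort in a controlled way, instead of faulting on the first access
through bp.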

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
