* [PATCH] xfs: logsunit rounding causes iclog corruption/crash [not found] <20130302201446.854313570@sgi.com> @ 2013-03-02 20:14 ` Mark Tinguely 2013-03-02 23:05 ` Dave Chinner 0 siblings, 1 reply; 4+ messages in thread From: Mark Tinguely @ 2013-03-02 20:14 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-limit-round-log-buffer-size-to-lsunit.patch --] [-- Type: text/plain, Size: 2544 bytes --] When the iclog buffer size and log stripe unit are both defined and the log stripe unit is less the log buffer size then the buffer is rounded up to the log stripe unit size during the xlog_sync(). This rounding can exceed the iclog buffer length and in xlog_data_pack(): 1) Cause corruption inside the iclog buffer because there will not be enough space for the headers in the front of the iclog buffer for the rounding. 2) Cause corruption in memory that follows the iclog buffer when stamping the lsn in each of the rounded blocks. 3) If CONFIG_XFS_DEBUG is defined will cause a crash in xlog_verify_iclog(). 4) Cause page fault crash if the memory after the buffer is not mapped. This has been found in XFS versions at least as far back as Linux 2.6.32. This patch forces the iclog buffer to be a multiple of the log stripe unit when they are both defined. Example: # mkfs.xfs -l su=192k -f /dev/sda2 # mount -o logbsize=256k /dev/sda3 /scratch # io such as fsstress in /scratch will immediately crash a debug xfs kernel and most like a non-debug xfs kernel. Signed-off-by: Mark Tinguely <tinguely@sgi.com> --- fs/xfs/xfs_super.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) Index: b/fs/xfs/xfs_super.c =================================================================== --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1357,11 +1357,23 @@ xfs_finish_flags( if (mp->m_logbsize <= 0 && mp->m_sb.sb_logsunit > XLOG_BIG_RECORD_BSIZE) { mp->m_logbsize = mp->m_sb.sb_logsunit; - } else if (mp->m_logbsize > 0 && - mp->m_logbsize < mp->m_sb.sb_logsunit) { - xfs_warn(mp, + } else if (mp->m_logbsize > 0) { + if (mp->m_logbsize > mp->m_sb.sb_logsunit) { + int logbsize; + /* round up to the next multiple of logsunit */ + logbsize = roundup(mp->m_logbsize, + mp->m_sb.sb_logsunit); + if (logbsize > XLOG_MAX_RECORD_BSIZE) + /* buffer size too large. round down. */ + logbsize -= mp->m_sb.sb_logsunit; + xfs_warn(mp, "log bufsize rounded from %d to %d", + mp->m_logbsize, logbsize); + mp->m_logbsize = logbsize; + } else if (mp->m_logbsize < mp->m_sb.sb_logsunit) { + xfs_warn(mp, "logbuf size must be greater than or equal to log stripe size"); - return XFS_ERROR(EINVAL); + return XFS_ERROR(EINVAL); + } } } else { /* Fail a mount if the logbuf is larger than 32K */ _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: logsunit rounding causes iclog corruption/crash 2013-03-02 20:14 ` [PATCH] xfs: logsunit rounding causes iclog corruption/crash Mark Tinguely @ 2013-03-02 23:05 ` Dave Chinner 2013-03-03 19:04 ` Mark Tinguely 0 siblings, 1 reply; 4+ messages in thread From: Dave Chinner @ 2013-03-02 23:05 UTC (permalink / raw) To: Mark Tinguely; +Cc: xfs On Sat, Mar 02, 2013 at 02:14:47PM -0600, Mark Tinguely wrote: > When the iclog buffer size and log stripe unit are both defined and > the log stripe unit is less the log buffer size then the buffer is > rounded up to the log stripe unit size during the xlog_sync(). > > This rounding can exceed the iclog buffer length and in xlog_data_pack(): > 1) Cause corruption inside the iclog buffer because there will not be > enough space for the headers in the front of the iclog buffer for > the rounding. > 2) Cause corruption in memory that follows the iclog buffer when > stamping the lsn in each of the rounded blocks. > 3) If CONFIG_XFS_DEBUG is defined will cause a crash in xlog_verify_iclog(). > 4) Cause page fault crash if the memory after the buffer is not mapped. > > This has been found in XFS versions at least as far back as > Linux 2.6.32. > > This patch forces the iclog buffer to be a multiple of the log stripe > unit when they are both defined. > > Example: > # mkfs.xfs -l su=192k -f /dev/sda2 $ sudo mkfs.xfs -l sunit=192k /dev/vdb Specify log sunit in 512-byte blocks, no size suffix .... $ > # mount -o logbsize=256k /dev/sda3 /scratch > # io such as fsstress in /scratch will immediately crash a debug xfs > kernel and most like a non-debug xfs kernel. It's definitely not an immediate crash. xfstests runs for several minutes (including through fsstress workloads) before it finally falls over. I'd suggest a targetted xfstest is needed for this... > Index: b/fs/xfs/xfs_super.c > =================================================================== > --- a/fs/xfs/xfs_super.c > +++ b/fs/xfs/xfs_super.c > @@ -1357,11 +1357,23 @@ xfs_finish_flags( > if (mp->m_logbsize <= 0 && > mp->m_sb.sb_logsunit > XLOG_BIG_RECORD_BSIZE) { > mp->m_logbsize = mp->m_sb.sb_logsunit; > - } else if (mp->m_logbsize > 0 && > - mp->m_logbsize < mp->m_sb.sb_logsunit) { > - xfs_warn(mp, > + } else if (mp->m_logbsize > 0) { > + if (mp->m_logbsize > mp->m_sb.sb_logsunit) { > + int logbsize; > + /* round up to the next multiple of logsunit */ > + logbsize = roundup(mp->m_logbsize, > + mp->m_sb.sb_logsunit); > + if (logbsize > XLOG_MAX_RECORD_BSIZE) > + /* buffer size too large. round down. */ > + logbsize -= mp->m_sb.sb_logsunit; > + xfs_warn(mp, "log bufsize rounded from %d to %d", > + mp->m_logbsize, logbsize); > + mp->m_logbsize = logbsize; > + } else if (mp->m_logbsize < mp->m_sb.sb_logsunit) { > + xfs_warn(mp, > "logbuf size must be greater than or equal to log stripe size"); > - return XFS_ERROR(EINVAL); > + return XFS_ERROR(EINVAL); > + } If the user has specified an invalid log buffer size, then reject it with: logbsize XXX is not an integer multiple of the log stripe unit YYY Rounding means that the user isn't getting what they want and they may not realise it. If they make a mistake, they should be informed and forced to fix it before going any further. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: logsunit rounding causes iclog corruption/crash 2013-03-02 23:05 ` Dave Chinner @ 2013-03-03 19:04 ` Mark Tinguely 2013-03-04 0:31 ` Dave Chinner 0 siblings, 1 reply; 4+ messages in thread From: Mark Tinguely @ 2013-03-03 19:04 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On 03/02/13 17:05, Dave Chinner wrote: > On Sat, Mar 02, 2013 at 02:14:47PM -0600, Mark Tinguely wrote: >> When the iclog buffer size and log stripe unit are both defined and >> the log stripe unit is less the log buffer size then the buffer is >> rounded up to the log stripe unit size during the xlog_sync(). >> >> This rounding can exceed the iclog buffer length and in xlog_data_pack(): >> 1) Cause corruption inside the iclog buffer because there will not be >> enough space for the headers in the front of the iclog buffer for >> the rounding. >> 2) Cause corruption in memory that follows the iclog buffer when >> stamping the lsn in each of the rounded blocks. >> 3) If CONFIG_XFS_DEBUG is defined will cause a crash in xlog_verify_iclog(). >> 4) Cause page fault crash if the memory after the buffer is not mapped. >> >> This has been found in XFS versions at least as far back as >> Linux 2.6.32. >> >> This patch forces the iclog buffer to be a multiple of the log stripe >> unit when they are both defined. >> >> Example: >> # mkfs.xfs -l su=192k -f /dev/sda2 > > $ sudo mkfs.xfs -l sunit=192k /dev/vdb > Specify log sunit in 512-byte blocks, no size suffix > .... > $ > >> # mount -o logbsize=256k /dev/sda3 /scratch >> # io such as fsstress in /scratch will immediately crash a debug xfs >> kernel and most like a non-debug xfs kernel. > > It's definitely not an immediate crash. xfstests runs for several > minutes (including through fsstress workloads) before it finally > falls over. I'd suggest a targetted xfstest is needed for this... > >> Index: b/fs/xfs/xfs_super.c >> =================================================================== >> --- a/fs/xfs/xfs_super.c >> +++ b/fs/xfs/xfs_super.c >> @@ -1357,11 +1357,23 @@ xfs_finish_flags( >> if (mp->m_logbsize<= 0&& >> mp->m_sb.sb_logsunit> XLOG_BIG_RECORD_BSIZE) { >> mp->m_logbsize = mp->m_sb.sb_logsunit; >> - } else if (mp->m_logbsize> 0&& >> - mp->m_logbsize< mp->m_sb.sb_logsunit) { >> - xfs_warn(mp, >> + } else if (mp->m_logbsize> 0) { >> + if (mp->m_logbsize> mp->m_sb.sb_logsunit) { >> + int logbsize; >> + /* round up to the next multiple of logsunit */ >> + logbsize = roundup(mp->m_logbsize, >> + mp->m_sb.sb_logsunit); >> + if (logbsize> XLOG_MAX_RECORD_BSIZE) >> + /* buffer size too large. round down. */ >> + logbsize -= mp->m_sb.sb_logsunit; eeks: if (logbsize != mp->m_logbsize) { >> + xfs_warn(mp, "log bufsize rounded from %d to %d", >> + mp->m_logbsize, logbsize); >> + mp->m_logbsize = logbsize; } >> + } else if (mp->m_logbsize< mp->m_sb.sb_logsunit) { >> + xfs_warn(mp, >> "logbuf size must be greater than or equal to log stripe size"); >> - return XFS_ERROR(EINVAL); >> + return XFS_ERROR(EINVAL); >> + } > > If the user has specified an invalid log buffer size, then reject it > with: > > logbsize XXX is not an integer multiple of the log stripe unit YYY > > Rounding means that the user isn't getting what they want and they > may not realise it. If they make a mistake, they should be informed > and forced to fix it before going any further. > > Cheers, > > Dave. The code already silently changes the log blocksize if mp->m_sb.sb_logsunit > mp->m_logbsize. IMO, it should fix it not a multiple too. --Mark. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: logsunit rounding causes iclog corruption/crash 2013-03-03 19:04 ` Mark Tinguely @ 2013-03-04 0:31 ` Dave Chinner 0 siblings, 0 replies; 4+ messages in thread From: Dave Chinner @ 2013-03-04 0:31 UTC (permalink / raw) To: Mark Tinguely; +Cc: xfs On Sun, Mar 03, 2013 at 01:04:39PM -0600, Mark Tinguely wrote: > On 03/02/13 17:05, Dave Chinner wrote: > >On Sat, Mar 02, 2013 at 02:14:47PM -0600, Mark Tinguely wrote: > >>When the iclog buffer size and log stripe unit are both defined and > >>the log stripe unit is less the log buffer size then the buffer is > >>rounded up to the log stripe unit size during the xlog_sync(). > >> > >>This rounding can exceed the iclog buffer length and in xlog_data_pack(): > >> 1) Cause corruption inside the iclog buffer because there will not be > >> enough space for the headers in the front of the iclog buffer for > >> the rounding. > >> 2) Cause corruption in memory that follows the iclog buffer when > >> stamping the lsn in each of the rounded blocks. > >> 3) If CONFIG_XFS_DEBUG is defined will cause a crash in xlog_verify_iclog(). > >> 4) Cause page fault crash if the memory after the buffer is not mapped. ... > >If the user has specified an invalid log buffer size, then reject it > >with: > > > >logbsize XXX is not an integer multiple of the log stripe unit YYY > > > >Rounding means that the user isn't getting what they want and they > >may not realise it. If they make a mistake, they should be informed > >and forced to fix it before going any further. > > The code already silently changes the log blocksize if > mp->m_sb.sb_logsunit > mp->m_logbsize. Where? It only changes m_logbsize if it is <= 0, which means the user has not specified the value as a mount option and hence we are free to chose whatever value we want. > IMO, it should fix it not a multiple too. ENOPARSE. :/ FWIW, a further argument against the rounding approach is this: the iclogbuf size supplied to mount is constrainted to be a power of 2 size. if (mp->m_logbsize != -1 && mp->m_logbsize != 0 && (mp->m_logbsize < XLOG_MIN_RECORD_BSIZE || mp->m_logbsize > XLOG_MAX_RECORD_BSIZE || !is_power_of_2(mp->m_logbsize))) { xfs_warn(mp, "invalid logbufsize: %d [not 16k,32k,64k,128k or 256k]", mp->m_logbsize); return XFS_ERROR(EINVAL); } This constraint appears to be a feature of the original log stripe unit code that did only support power-of-2 log stripe units. However, that was removed way back in 2004: http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=d3037d91429cc2ea383f8a2736c86ed9f1eec542 That commit introduced the very rounding code in xlog_sync() that is causing the crash you are trying to address right now. i.e. this code: count = XLOG_LSUNITTOB(log, XLOG_BTOLSUNIT(log, count_init)); IOWs, it makes no sense to only allow power-of-2 logbsize options if we then round it to something completely different. The above commit makes it pretty clear that the intent is that logbsize should be an exact integer multiple of the log stripe unit - which it is if no logbsize mount option is given - but our mount option parsing does not reflect this at all. Personally, I'd prefer that logbsize be limited to power-of-2 multiples of the lsunit or XLOG_MIN_RECORD_BSIZE (if lsunit = 0) as allowing arbitrary values to be specified by users leads to a testing and bug triage nightmare. If we are not going to change the power-of-2 logbsize mount option requirement, then I think that the correct solution is to make the log limit the iclogbuf size internally to power-of-2 multiples of the lsunit so that it is not at all externally visible (e.g. /proc/self/mounts shows exactly what came in as a mount option). This problem really has nothing to do with what mount options are specified and passed to the log - the log is separate code with it's own internal constraints and hence should be ensuring that it's setup is consistent with those constraints.... Further, if we really need to know what iclogbuf size the log is using on disk, both the physical iclogbuf size and the size of the current write gets written into every log header block.... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-03-04 0:31 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20130302201446.854313570@sgi.com>
2013-03-02 20:14 ` [PATCH] xfs: logsunit rounding causes iclog corruption/crash Mark Tinguely
2013-03-02 23:05 ` Dave Chinner
2013-03-03 19:04 ` Mark Tinguely
2013-03-04 0:31 ` Dave Chinner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox