public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 13/20] xfs: use bios directly to write log buffers
Date: Tue, 4 Jun 2019 15:54:08 +1000	[thread overview]
Message-ID: <20190604055408.GP29573@dread.disaster.area> (raw)
In-Reply-To: <20190603172945.13819-14-hch@lst.de>

On Mon, Jun 03, 2019 at 07:29:38PM +0200, Christoph Hellwig wrote:
> Currently the XFS logging code uses the xfs_buf structure and
> associated APIs to write the log buffers to disk.  This requires
> various special cases in the log code and is generally not very
> optimal.
> 
> Instead of using a buffer just allocate a kmem_alloc_larger region for
> each log buffer, and use a bio and bio_vec array embedded in the iclog
> structure to write the buffer to disk.  This also allows for using
> the bio split and chaining case to deal with the case of a log
> buffer wrapping around the end of the log.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Overall a good cleanup, comments inline.

FWIW, what does ic_sema protect?  It looks to me like it just
replaces the xfs_buf_lock(), and the only reason we were using that
is to allow unmount to wait for iclogbuf IO completion. Can we just
use a completion for this now?

> ---
>  fs/xfs/xfs_log.c      | 220 ++++++++++++++++++++----------------------
>  fs/xfs/xfs_log_priv.h |  15 +--
>  2 files changed, 112 insertions(+), 123 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 358a19789402..1d4480ea1725 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1239,32 +1239,32 @@ xlog_space_left(
>  }
>  
>  
> -/*
> - * Log function which is called when an io completes.
> - *
> - * The log manager needs its own routine, in order to control what
> - * happens with the buffer after the write completes.
> - */
>  static void
> -xlog_iodone(xfs_buf_t *bp)
> +xlog_ioend_work(
> +	struct work_struct	*work)
>  {
> -	struct xlog_in_core	*iclog = bp->b_log_item;
> -	struct xlog		*l = iclog->ic_log;
> +	struct xlog_in_core     *iclog =
> +		container_of(work, struct xlog_in_core, ic_end_io_work);
> +	struct xlog		*log = iclog->ic_log;
>  	int			aborted = 0;
> +	int			error;
> +
> +	if (is_vmalloc_addr(iclog->ic_data))
> +		invalidate_kernel_vmap_range(iclog->ic_data, iclog->ic_io_size);

Do we need to invalidate here for write only operation?  It's only
when we are bringing new data into memory we have to invalidate the
range, right?  e.g. xfs_buf_bio_end_io() only does invalidation on
read IO. 

> @@ -1475,30 +1473,6 @@ xlog_alloc_log(
>  
>  	xlog_get_iclog_buffer_size(mp, log);
>  
> -	/*
> -	 * Use a NULL block for the extra log buffer used during splits so that
> -	 * it will trigger errors if we ever try to do IO on it without first
> -	 * having set it up properly.
> -	 */
> -	error = -ENOMEM;
> -	bp = xfs_buf_alloc(log->l_targ, XFS_BUF_DADDR_NULL,
> -			   BTOBB(log->l_iclog_size), XBF_NO_IOACCT);
> -	if (!bp)
> -		goto out_free_log;
> -
> -	/*
> -	 * The iclogbuf buffer locks are held over IO but we are not going to do
> -	 * IO yet.  Hence unlock the buffer so that the log IO path can grab it
> -	 * when appropriately.
> -	 */
> -	ASSERT(xfs_buf_islocked(bp));
> -	xfs_buf_unlock(bp);
> -
> -	/* use high priority wq for log I/O completion */
> -	bp->b_ioend_wq = mp->m_log_workqueue;
> -	bp->b_iodone = xlog_iodone;
> -	log->l_xbuf = bp;
> -
>  	spin_lock_init(&log->l_icloglock);
>  	init_waitqueue_head(&log->l_flush_wait);
>  
> @@ -1512,7 +1486,9 @@ xlog_alloc_log(
>  	 */
>  	ASSERT(log->l_iclog_size >= 4096);
>  	for (i=0; i < log->l_iclog_bufs; i++) {

Fix the whitespace while you are touching this code?

> -		*iclogp = kmem_zalloc(sizeof(xlog_in_core_t), KM_MAYFAIL);
> +		*iclogp = kmem_zalloc(struct_size(*iclogp, ic_bvec,
> +				howmany(log->l_iclog_size, PAGE_SIZE)),
> +				KM_MAYFAIL);

That's a bit of a mess - hard to read. It's times like this that I
think generic helpers make the code worse rather than bettter.
Perhaps some slightly different indenting to indicate that the
howmany() function is actually a parameter of the struct_size()
macro?

		*iclogp = kmem_zalloc(struct_size(*iclogp, ic_bvec,
					howmany(log->l_iclog_size, PAGE_SIZE)),
				      KM_MAYFAIL);

> +static void
> +xlog_bio_end_io(
> +	struct bio		*bio)
> +{
> +	struct xlog_in_core	*iclog = bio->bi_private;
> +
> +	queue_work(iclog->ic_log->l_mp->m_log_workqueue,
> +		   &iclog->ic_end_io_work);
> +}

Can we just put a pointer to the wq in the iclog? It only needs to
be set up at init time, then this only needs to be

	queue_work(iclog->ic_wq, &iclog->ic_end_io_work);

> +
> +static void
> +xlog_map_iclog_data(
> +	struct bio		*bio,
> +	void			*data,
> +	size_t			count)
> +{
> +	do {
> +		struct page	*page = kmem_to_page(data);
> +		unsigned int	off = offset_in_page(data);
> +		size_t		len = min_t(size_t, count, PAGE_SIZE - off);
> +
> +		WARN_ON_ONCE(bio_add_page(bio, page, len, off) != len);
> +
> +		data += len;
> +		count -= len;
> +	} while (count);
> +}

Aren't we're always going to be mapping the same pages to the same
bio at the same offsets. The only thing that changes is the length
of the bio and the sector it is addressed to. It seems kind of odd
to have an inline data buffer, bio and biovec all pre-allocated, but
then have to map them into exactly the same state for every IO we do
with them...

> @@ -1786,11 +1772,10 @@ xlog_write_iclog(
>  	 * tearing down the iclogbufs.  Hence we need to hold the buffer lock
>  	 * across the log IO to archieve that.
>  	 */
> -	xfs_buf_lock(bp);
> +	down(&iclog->ic_sema);
>  	if (unlikely(iclog->ic_state & XLOG_STATE_IOERROR)) {
> -		xfs_buf_ioerror(bp, -EIO);
> -		xfs_buf_stale(bp);
> -		xfs_buf_ioend(bp);
> +		xlog_state_done_syncing(iclog, XFS_LI_ABORTED);
> +		up(&iclog->ic_sema);

Hmmm - this open codes the end io error completion. Might be wroth a
comment indicating that this needs to be kept in sync with the io
completion processing?

> diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
> index b5f82cb36202..062ee9c13039 100644
> --- a/fs/xfs/xfs_log_priv.h
> +++ b/fs/xfs/xfs_log_priv.h
> @@ -179,7 +179,6 @@ typedef struct xlog_ticket {
>   *	the iclog.
>   * - ic_forcewait is used to implement synchronous forcing of the iclog to disk.
>   * - ic_next is the pointer to the next iclog in the ring.
> - * - ic_bp is a pointer to the buffer used to write this incore log to disk.
>   * - ic_log is a pointer back to the global log structure.
>   * - ic_callback is a linked list of callback function/argument pairs to be
>   *	called after an iclog finishes writing.
> @@ -206,11 +205,10 @@ typedef struct xlog_in_core {
>  	wait_queue_head_t	ic_write_wait;
>  	struct xlog_in_core	*ic_next;
>  	struct xlog_in_core	*ic_prev;
> -	struct xfs_buf		*ic_bp;
>  	struct xlog		*ic_log;
> -	int			ic_size;
> -	int			ic_offset;
> -	int			ic_bwritecnt;
> +	u32			ic_size;
> +	u32			ic_io_size;
> +	u32			ic_offset;

Can we get a couple of comments here describing the difference
between ic_size, ic_io_size and log->l_iclog_size so I don't have to
go read all the code to find out what they are again in 6 months
time?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2019-06-04  5:54 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-03 17:29 use bios directly in the log code v2 Christoph Hellwig
2019-06-03 17:29 ` [PATCH 01/20] xfs: remove the no-op spinlock_destroy stub Christoph Hellwig
2019-06-03 17:29 ` [PATCH 02/20] xfs: remove the never used _XBF_COMPOUND flag Christoph Hellwig
2019-06-03 17:29 ` [PATCH 03/20] xfs: renumber XBF_WRITE_FAIL Christoph Hellwig
2019-06-03 17:29 ` [PATCH 04/20] xfs: make mem_to_page available outside of xfs_buf.c Christoph Hellwig
2019-06-03 17:29 ` [PATCH 05/20] xfs: reformat xlog_get_lowest_lsn Christoph Hellwig
2019-06-03 17:29 ` [PATCH 06/20] xfs: don't use REQ_PREFLUSH for split log writes Christoph Hellwig
2019-06-04 16:12   ` Brian Foster
2019-06-04 22:45     ` Dave Chinner
2019-06-05 10:51       ` Brian Foster
2019-06-05 15:14         ` Christoph Hellwig
2019-06-05 15:47           ` Brian Foster
2019-06-03 17:29 ` [PATCH 07/20] xfs: factor out log buffer writing from xlog_sync Christoph Hellwig
2019-06-04  2:20   ` Dave Chinner
2019-06-03 17:29 ` [PATCH 08/20] xfs: factor out splitting of an iclog " Christoph Hellwig
2019-06-04  2:21   ` Dave Chinner
2019-06-03 17:29 ` [PATCH 09/20] xfs: factor out iclog size calculation " Christoph Hellwig
2019-06-04  2:23   ` Dave Chinner
2019-06-03 17:29 ` [PATCH 10/20] xfs: update both stat counters together in xlog_sync Christoph Hellwig
2019-06-03 17:29 ` [PATCH 11/20] xfs: remove the syncing argument from xlog_verify_iclog Christoph Hellwig
2019-06-03 17:29 ` [PATCH 12/20] xfs: make use of the l_targ field in struct xlog Christoph Hellwig
2019-06-03 17:29 ` [PATCH 13/20] xfs: use bios directly to write log buffers Christoph Hellwig
2019-06-04  5:54   ` Dave Chinner [this message]
2019-06-04  6:10     ` Christoph Hellwig
2019-06-03 17:29 ` [PATCH 14/20] xfs: move the log ioend workqueue to struct xlog Christoph Hellwig
2019-06-19 12:19   ` Christoph Hellwig
2019-06-19 22:51     ` Darrick J. Wong
2019-06-20  6:08       ` Christoph Hellwig
2019-06-03 17:29 ` [PATCH 15/20] xfs: return an offset instead of a pointer from xlog_align Christoph Hellwig
2019-06-03 17:29 ` [PATCH 16/20] xfs: use bios directly to read and write the log recovery buffers Christoph Hellwig
2019-06-04  6:13   ` Dave Chinner
2019-06-05 15:09     ` Christoph Hellwig
2019-06-03 17:29 ` [PATCH 17/20] xfs: stop using bp naming for " Christoph Hellwig
2019-06-04  6:19   ` Dave Chinner
2019-06-03 17:29 ` [PATCH 18/20] xfs: remove unused buffer cache APIs Christoph Hellwig
2019-06-04  6:24   ` Dave Chinner
2019-06-05 15:12     ` Christoph Hellwig
2019-06-05 21:24       ` Dave Chinner
2019-06-03 17:29 ` [PATCH 19/20] xfs: properly type the b_log_item field in struct xfs_buf Christoph Hellwig
2019-06-04  6:25   ` Dave Chinner
2019-06-03 17:29 ` [PATCH 20/20] xfs: remove the b_io_length " Christoph Hellwig
2019-06-04  6:27   ` Dave Chinner
2019-06-03 17:35 ` use bios directly in the log code v2 Darrick J. Wong
2019-06-03 17:38   ` Christoph Hellwig
2019-06-04 17:25     ` Darrick J. Wong
2019-06-04 17:54       ` Christoph Hellwig
2019-06-04 18:42         ` Brian Foster
2019-06-04 18:58           ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2019-05-23 17:37 Christoph Hellwig
2019-05-23 17:37 ` [PATCH 13/20] xfs: use bios directly to write log buffers Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190604055408.GP29573@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox