From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>,
RDMA mailing list <linux-rdma@vger.kernel.org>,
Erez Alfasi <ereza@mellanox.com>
Subject: Re: [PATCH rdma-next v2 1/4] IB/mlx5: Introduce ODP diagnostic counters
Date: Thu, 10 Oct 2019 12:02:03 +0300
Message-ID: <20191010090203.GJ5855@unreal>
In-Reply-To: <20191008195701.GE22714@mellanox.com>
On Tue, Oct 08, 2019 at 07:57:05PM +0000, Jason Gunthorpe wrote:
> On Sun, Oct 06, 2019 at 06:51:36PM +0300, Leon Romanovsky wrote:
> > From: Erez Alfasi <ereza@mellanox.com>
> >
> > Introduce ODP diagnostic counters and count the following
> > per MR within IB/mlx5 driver:
> > 1) Page faults:
> > Total number of faulted pages.
> > 2) Page invalidations:
> > Total number of pages invalidated by the OS during all
> > invalidation events. The translations can be no longer
> > valid due to either non-present pages or mapping changes.
> > 3) Prefetched pages:
> > When prefetching a page, page fault is generated
> > in order to bring the page to the main memory.
> > The prefetched pages counter will be updated
> > during a page fault flow only if it was derived
> > from prefetching operation.
> >
> > Signed-off-by: Erez Alfasi <ereza@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > drivers/infiniband/hw/mlx5/mlx5_ib.h | 4 ++++
> > drivers/infiniband/hw/mlx5/odp.c | 18 ++++++++++++++++++
> > include/rdma/ib_verbs.h | 6 ++++++
> > 3 files changed, 28 insertions(+)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> > index bf30d53d94dc..5aae05ebf64b 100644
> > +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> > @@ -585,6 +585,9 @@ struct mlx5_ib_dm {
> > IB_ACCESS_REMOTE_READ |\
> > IB_ZERO_BASED)
> >
> > +#define mlx5_update_odp_stats(mr, counter_name, value) \
> > + atomic64_add(value, &((mr)->odp_stats.counter_name))
> > +
> > struct mlx5_ib_mr {
> > struct ib_mr ibmr;
> > void *descs;
> > @@ -622,6 +625,7 @@ struct mlx5_ib_mr {
> > wait_queue_head_t q_leaf_free;
> > struct mlx5_async_work cb_work;
> > atomic_t num_pending_prefetch;
> > + struct ib_odp_counters odp_stats;
> > };
> >
> > static inline bool is_odp_mr(struct mlx5_ib_mr *mr)
> > diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> > index 95cf0249b015..966783bfb557 100644
> > +++ b/drivers/infiniband/hw/mlx5/odp.c
> > @@ -261,6 +261,10 @@ void mlx5_ib_invalidate_range(struct ib_umem_odp *umem_odp, unsigned long start,
> > blk_start_idx = idx;
> > in_block = 1;
> > }
> > +
> > + /* Count page invalidations */
> > + mlx5_update_odp_stats(mr, invalidations,
> > + (idx - blk_start_idx + 1));
>
> I feel like these should be batched and the atomic done once at the
> end of the routine..
We can, but is it worth it?
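For reference, a minimal sketch of what the batched variant could look like. It is written as userspace C11, with <stdatomic.h> standing in for the kernel's atomic64_add(). The struct, the function name and the present[] array are hypothetical and are not taken from the driver. The loop simply counts each present page once, so it illustrates the "accumulate locally, one atomic at the end" pattern rather than the exact accounting in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-MR ODP counters. */
struct odp_stats_sketch {
	atomic_uint_fast64_t invalidations;
};

/*
 * Walk a range of pages and count the ones that are currently mapped
 * (present[idx] != 0). The count is kept in a local variable and
 * published with a single atomic add when the walk is finished,
 * instead of issuing one atomic operation per loop iteration.
 */
static uint64_t invalidate_range_sketch(struct odp_stats_sketch *stats,
					const int *present, size_t npages)
{
	uint64_t invalidations = 0;	/* local, non-atomic accumulator */

	assert(stats != NULL);

	for (size_t idx = 0; idx < npages; idx++) {
		if (present[idx])
			invalidations++;
	}

	/* One atomic update for the whole range. */
	atomic_fetch_add(&stats->invalidations, invalidations);
	return invalidations;
}
```

The trade-off is one extra local variable against one atomic read-modify-write per mapped page in the range.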
>
> > } else {
> > u64 umr_offset = idx & umr_block_mask;
> >
> > @@ -287,6 +291,7 @@ void mlx5_ib_invalidate_range(struct ib_umem_odp *umem_odp, unsigned long start,
> >
> > ib_umem_odp_unmap_dma_pages(umem_odp, start, end);
> >
> > +
> > if (unlikely(!umem_odp->npages && mr->parent &&
> > !umem_odp->dying)) {
> > WRITE_ONCE(umem_odp->dying, 1);
> > @@ -801,6 +806,19 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev,
> > if (ret < 0)
> > goto srcu_unlock;
> >
> > + /*
> > + * When prefetching a page, page fault is generated
> > + * in order to bring the page to the main memory.
> > + * In the current flow, page faults are being counted.
> > + * Prefetched pages counter will be updated as well
> > + * only if the current page fault flow was derived
> > + * from prefetching flow.
> > + */
> > + mlx5_update_odp_stats(mr, faults, ret);
> > +
> > + if (prefetch)
> > + mlx5_update_odp_stats(mr, prefetched, ret);
>
> Hm, I'm about to post a series that eliminates 'prefetch' here..
Jason,
For various reasons, we have been delaying this series for months already.
Let's drop the "prefetch" counter for now and merge everything without
it.
>
> This is also not quite right for prefetch as we are doing a form of
> prefetching in the mlx5_ib_mr_rdma_pfault_handler() too, although it
> is less clear how to count those. Maybe this should be split to SQ/RQ
> faults?
mlx5_ib_mr_rdma_pfault_handler() calls pagefault_single_data_segment()
without MLX5_PF_FLAGS_PREFETCH, so I'm not sure that this counter should
count mlx5_ib_mr_rdma_pfault_handler() page faults.
However, the idea of separating SQ/RQ for everything sounds appealing.
Thanks
>
> Jason