Linux-NVME Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: willy@linux.intel.com (Matthew Wilcox)
Subject: [PATCH] NVMe: Remove superfluous cqe_seen
Date: Thu, 19 Jun 2014 12:59:57 -0400	[thread overview]
Message-ID: <20140619165957.GK12025@linux.intel.com> (raw)
In-Reply-To: <80B89753B40C5141A3E2D53FE7A2A8A9943A7ADC@NTXBOIMBX02.micron.com>

On Thu, May 22, 2014@12:10:19AM +0000, Sam Bradshaw (sbradshaw) wrote:
> Performance problem, though not very easily measured.  At very high iops
> rates, most if not all cqe's are processed via nvme_process_cq() in 
> make_request(), leaving nvme_irq() with no work to do.  Nevertheless, it
> always writes cqe_seen, which invalidates a very hot cacheline.  This
> is somewhat exacerbated when IO submissions originate on a remote node
> relative to the cpu handling the irq.

I was thinking "Hey, we should move cqe_seen to a different cacheline".
So I looked at the cacheline assignments for the different variables,
and cqe_seen is on the same cacheline as cq_head and cq_phase, so that
cacheline is already being dirtied.  Indeed, it's in the same Dword as
cq_phase, so I'd be amazed if the CPU didn't coalesce the two writes.
That might be a more fruitful patch ... rearrange nvme_queue to put
cq_head, cq_phase and cqe_seen in the same Dword, and expect the CPU to
optimise the three assignments into a single Dword store.

I'll let you try it out since you have the setup to benchmark it.  Right now,
this is the layout I see:

        /* --- cacheline 3 boundary (192 bytes) --- */
        u32 *                      q_db;                 /*   192     8 */
        u16                        q_depth;              /*   200     2 */
        u16                        cq_vector;            /*   202     2 */
        u16                        sq_head;              /*   204     2 */
        u16                        sq_tail;              /*   206     2 */
        u16                        cq_head;              /*   208     2 */
        u16                        qid;                  /*   210     2 */
        u8                         cq_phase;             /*   212     1 */
        u8                         cqe_seen;             /*   213     1 */
        u8                         q_suspended;          /*   214     1 */

I notice a 4-byte hole after q_lock, so moving cq_head, cq_phase and
cqe_seen into that space would probably be a good idea (since that
cacheline is definitely dirty).  I really haven't tried to optimise the
frequently-updated parts of the data structure into the same cacheline,
and it should really help your bizarre setup :-).

      parent reply	other threads:[~2014-06-19 16:59 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-21 18:14 [PATCH] NVMe: Remove superfluous cqe_seen Sam Bradshaw
2014-05-21 23:17 ` Matthew Wilcox
2014-05-22  0:10   ` Sam Bradshaw (sbradshaw)
2014-05-23 14:55     ` Matthew Wilcox
2014-06-19 16:59     ` Matthew Wilcox [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140619165957.GK12025@linux.intel.com \
    --to=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox