From: Jiang Liu <liuj97@gmail.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jiang Liu <jiang.liu@huawei.com>,
Vinod Koul <vinod.koul@intel.com>,
Keping Chen <chenkeping@huawei.com>,
"David S. Miller" <davem@davemloft.net>,
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
James Morris <jmorris@namei.org>,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
Patrick McHardy <kaber@trash.net>,
netdev@vger.kernel.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 6/8] dmaengine: enhance network subsystem to support DMA device hotplug
Date: Wed, 25 Apr 2012 23:47:19 +0800 [thread overview]
Message-ID: <4F981C87.60403@gmail.com> (raw)
In-Reply-To: <CABE8wwvW9MUWaqSUCPMr7_yj7XGtSj+X9VE1-xWrEusZ3TMaqQ@mail.gmail.com>
Hi Dan,
Thanks for your great comments about the performance penalty issue. And I'm trying
to refine the implementation to reduce penalty caused by hotplug logic. If the algorithm works
correctly, the optimized hot path code will be:
------------------------------------------------------------------------------
struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type)
{
struct dma_chan *chan = this_cpu_read(channel_table[tx_type]->chan);
this_cpu_inc(dmaengine_chan_ref_count);
if (static_key_false(&dmaengine_quiesce)) {
chan = NULL;
}
return chan;
}
EXPORT_SYMBOL(dma_find_channel);
struct dma_chan *dma_get_channel(struct dma_chan *chan)
{
if (static_key_false(&dmaengine_quiesce))
atomic_inc(&dmaengine_dirty);
this_cpu_inc(dmaengine_chan_ref_count);
return chan;
}
EXPORT_SYMBOL(dma_get_channel);
void dma_put_channel(struct dma_chan *chan)
{
this_cpu_dec(dmaengine_chan_ref_count);
}
EXPORT_SYMBOL(dma_put_channel);
-----------------------------------------------------------------------------
The disassembled code is:
(gdb) disassemble dma_find_channel
Dump of assembler code for function dma_find_channel:
0x0000000000000000 <+0>: push %rbp
0x0000000000000001 <+1>: mov %rsp,%rbp
0x0000000000000004 <+4>: callq 0x9 <dma_find_channel+9>
0x0000000000000009 <+9>: mov %edi,%edi
0x000000000000000b <+11>: mov 0x0(,%rdi,8),%rax
0x0000000000000013 <+19>: mov %gs:(%rax),%rax
0x0000000000000017 <+23>: incq %gs:0x0 //overhead: this_cpu_inc(dmaengine_chan_ref_count)
0x0000000000000020 <+32>: jmpq 0x25 <dma_find_channel+37> //overhead: if (static_key_false(&dmaengine_quiesce)), will be replaced as NOP by jump label
0x0000000000000025 <+37>: pop %rbp
0x0000000000000026 <+38>: retq
0x0000000000000027 <+39>: nopw 0x0(%rax,%rax,1)
0x0000000000000030 <+48>: xor %eax,%eax
0x0000000000000032 <+50>: pop %rbp
0x0000000000000033 <+51>: retq
End of assembler dump.
(gdb) disassemble dma_put_channel // overhead: to decrease channel reference count, 6 instructions
Dump of assembler code for function dma_put_channel:
0x0000000000000070 <+0>: push %rbp
0x0000000000000071 <+1>: mov %rsp,%rbp
0x0000000000000074 <+4>: callq 0x79 <dma_put_channel+9>
0x0000000000000079 <+9>: decq %gs:0x0
0x0000000000000082 <+18>: pop %rbp
0x0000000000000083 <+19>: retq
End of assembler dump.
(gdb) disassemble dma_get_channel
Dump of assembler code for function dma_get_channel:
0x0000000000000040 <+0>: push %rbp
0x0000000000000041 <+1>: mov %rsp,%rbp
0x0000000000000044 <+4>: callq 0x49 <dma_get_channel+9>
0x0000000000000049 <+9>: mov %rdi,%rax
0x000000000000004c <+12>: jmpq 0x51 <dma_get_channel+17>
0x0000000000000051 <+17>: incq %gs:0x0
0x000000000000005a <+26>: pop %rbp
0x000000000000005b <+27>: retq
0x000000000000005c <+28>: nopl 0x0(%rax)
0x0000000000000060 <+32>: lock incl 0x0(%rip) # 0x67 <dma_get_channel+39>
0x0000000000000067 <+39>: jmp 0x51 <dma_get_channel+17>
End of assembler dump.
So for a typical dma_find_channel()/dma_put_channel(), the total overhead
is about 10 instructions and two percpu(local) memory updates. And there's
no shared cache pollution any more. Is this acceptable ff the algorithm
works as expected? I will test the code tomorrow.
For typical systems which don't support DMA device hotplug, the overhead
could be completely removed by condition compilation.
Any comments are welcomed!
Thanks!
--gerry
On 04/24/2012 11:09 AM, Dan Williams wrote:
>>> If you are going to hotplug the entire IOH, then you are probably ok
next prev parent reply other threads:[~2012-04-25 15:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-23 13:51 [PATCH v1 0/8] enhance dmaengine core to support DMA device hotplug Jiang Liu
2012-04-23 13:51 ` [PATCH v1 1/8] dmaengine: enhance DMA channel reference count management Jiang Liu
2012-04-23 13:51 ` [PATCH v1 2/8] dmaengine: rebalance DMA channels when CPU hotplug happens Jiang Liu
2012-04-23 13:51 ` [PATCH v1 3/8] dmaengine: introduce CONFIG_DMA_ENGINE_HOTPLUG for DMA device hotplug Jiang Liu
2012-04-23 13:51 ` [PATCH v1 4/8] dmaengine: use atomic_t for struct dma_chan->client_count field Jiang Liu
2012-04-23 13:51 ` [PATCH v1 5/8] dmaengine: enhance dma_async_device_unregister() to be called by drv->remove() Jiang Liu
2012-04-23 13:51 ` [PATCH v1 6/8] dmaengine: enhance network subsystem to support DMA device hotplug Jiang Liu
2012-04-23 18:30 ` Dan Williams
2012-04-24 2:30 ` Jiang Liu
2012-04-24 3:09 ` Dan Williams
2012-04-24 3:56 ` Jiang Liu
2012-04-25 15:47 ` Jiang Liu [this message]
2012-04-23 13:51 ` [PATCH v1 7/8] dmaengine: enhance ASYNC_TX " Jiang Liu
2012-04-23 13:51 ` [RFC PATCH v1 8/8] dmaengine: assign DMA channel to CPU according to NUMA affinity Jiang Liu
2012-04-23 16:40 ` [PATCH v1 0/8] enhance dmaengine core to support DMA device hotplug Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F981C87.60403@gmail.com \
--to=liuj97@gmail.com \
--cc=chenkeping@huawei.com \
--cc=dan.j.williams@intel.com \
--cc=davem@davemloft.net \
--cc=jiang.liu@huawei.com \
--cc=jmorris@namei.org \
--cc=kaber@trash.net \
--cc=kuznet@ms2.inr.ac.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=vinod.koul@intel.com \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).