From: Boaz Harrosh <openosd@gmail.com>
To: "Wilcox, Matthew R" <matthew.r.wilcox@intel.com>,
Boaz Harrosh <boaz@plexistor.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jens Axboe <axboe@fb.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH] SQUASHME: pmem: no need to copy a page at a time
Date: Mon, 15 Sep 2014 11:47:33 +0300
Message-ID: <5416A7A5.9000601@gmail.com>
In-Reply-To: <100D68C7BA14664A8938383216E40DE0407E3B06@FMSMSX114.amr.corp.intel.com>
No, this can only be PAGE_SIZE at a time, because it comes from a bio
page. It might split because that user page may not be aligned to
a disk 4k boundary. So the original code, copy/pasted from brd, had this
split, since each brd page sits at a random address.
But with pmem we do not need the split; it is moot. Please read all of
the code.
About write-with-cache-through: it is your call. The FS/APP will need to
do fsync, in which case we can implement flush. But I think
write-with-cache-through might be more efficient on most ARCHs.
Your call, please send a patch.
Thanks
Boaz
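
The point about the split can be illustrated with a minimal user-space sketch. It is not the driver code: struct pmem_sim, pmem_sim_addr(), copy_to_pmem_sim() and SIM_PAGE_SIZE are hypothetical stand-ins, and an ordinary static buffer plays the role of the contiguous pmem mapping. Under those assumptions, a copy of up to one page that starts at a sector not aligned to a page crosses a page boundary, yet a single memcpy() is enough because the destination range is contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SHIFT  9
#define SIM_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Hypothetical stand-in for struct pmem_device: one contiguous range. */
struct pmem_sim {
	unsigned char *virt_addr;
	size_t size;
};

/* Byte address of a sector; no rounding down to a page boundary. */
static void *pmem_sim_addr(struct pmem_sim *pmem, uint64_t sector)
{
	size_t offset = (size_t)sector << SECTOR_SHIFT;

	assert(offset < pmem->size);
	return pmem->virt_addr + offset;
}

/* n is at most one page (it comes from one bio page), but the
 * destination may straddle two pages of the contiguous range. */
static void copy_to_pmem_sim(struct pmem_sim *pmem, const void *src,
			     uint64_t sector, size_t n)
{
	assert(n <= SIM_PAGE_SIZE);
	assert(((size_t)sector << SECTOR_SHIFT) + n <= pmem->size);
	memcpy(pmem_sim_addr(pmem, sector), src, n);
}

/* Returns 0 if a page-sized copy starting at sector 7 lands intact. */
static int demo_cross_page_copy(void)
{
	static unsigned char backing[4 * SIM_PAGE_SIZE];
	static unsigned char src[SIM_PAGE_SIZE];
	struct pmem_sim pmem = { backing, sizeof(backing) };
	uint64_t sector = 7;	/* byte offset 3584, 512 bytes before page 1 */
	size_t off = (size_t)sector << SECTOR_SHIFT;
	size_t i;

	memset(backing, 0, sizeof(backing));
	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i % 251 + 1);	/* never zero */

	copy_to_pmem_sim(&pmem, src, sector, sizeof(src));

	if (memcmp(backing + off, src, sizeof(src)) != 0)
		return -1;
	/* Bytes just outside the copied range must be untouched. */
	if (backing[off - 1] != 0 || backing[off + sizeof(src)] != 0)
		return -1;
	return 0;
}
```

Calling demo_cross_page_copy() from a main() should return 0. With brd the same copy would need the two-step split, since the second half of the range lives in a different, separately allocated page.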
On 09/15/2014 03:23 AM, Wilcox, Matthew R wrote:
> Ummpf. No. The current code needs a cond_resched() thrown into that
> loop, but we can't afford the latency of calling memcpy() for a
> potentially 2GB+ write. (current CPUs have memory bandwidth somewhere
> around 20GB/s, iirc, so that would take a tenth of a second; utterly
> unacceptable scheduling latency).
>
> Also, copying to pmem should be using cache-bypassing stores, since
> there's not going to be any discernible benefit to having this data
> in cache. Copying from pmem is arguable ... surely it's being read so
> that it can be used, but on the other hand if we have a 2GB write,
> we're going to blow away the entire LLC.
> ________________________________________
> From: Boaz Harrosh [boaz@plexistor.com]
> Sent: September 14, 2014 9:02 AM
> To: Ross Zwisler
> Cc: Jens Axboe; Wilcox, Matthew R; linux-fsdevel; linux-nvdimm@lists.01.org
> Subject: [PATCH] SQUASHME: pmem: no need to copy a page at a time
>
> With current code structure a single pmem device always spans
> a single contiguous memory range both in the physical as well
> as the Kernel virtual space. So this means a contiguous block's
> sectors means contiguous virtual addressing.
>
> So we do not need to memcpy a page boundary at a time we can
> memcpy any arbitrary memory range.
>
> TODO: Further optimization can be done at callers of
> copy_to/from_pmem
>
> Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
> ---
> drivers/block/pmem.c | 37 +++----------------------------------
> 1 file changed, 3 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
> index 8e15c19..780ed94 100644
> --- a/drivers/block/pmem.c
> +++ b/drivers/block/pmem.c
> @@ -56,12 +56,10 @@ static int pmem_getgeo(struct block_device *bd, struct hd_geometry *geo)
> /*
> * direct translation from (pmem,sector) => void*
> * We do not require that sector be page aligned.
> - * The return value will point to the beginning of the page containing the
> - * given sector, not to the sector itself.
> */
> static void *pmem_lookup_pg_addr(struct pmem_device *pmem, sector_t sector)
> {
> - size_t offset = round_down(sector << SECTOR_SHIFT, PAGE_SIZE);
> + size_t offset = sector << SECTOR_SHIFT;
>
> BUG_ON(offset >= pmem->size);
> return pmem->virt_addr + offset;
> @@ -69,55 +67,26 @@ static void *pmem_lookup_pg_addr(struct pmem_device *pmem, sector_t sector)
>
> /*
> * sector is not required to be page aligned.
> - * n is at most a single page, but could be less.
> */
> static void copy_to_pmem(struct pmem_device *pmem, const void *src,
> sector_t sector, size_t n)
> {
> void *dst;
> - unsigned int offset = (sector & (PAGE_SECTORS - 1)) << SECTOR_SHIFT;
> - size_t copy;
> -
> - BUG_ON(n > PAGE_SIZE);
>
> - copy = min_t(size_t, n, PAGE_SIZE - offset);
> dst = pmem_lookup_pg_addr(pmem, sector);
> - memcpy(dst + offset, src, copy);
> -
> - if (copy < n) {
> - src += copy;
> - sector += copy >> SECTOR_SHIFT;
> - copy = n - copy;
> - dst = pmem_lookup_pg_addr(pmem, sector);
> - memcpy(dst, src, copy);
> - }
> + memcpy(dst, src, n);
> }
>
> /*
> * sector is not required to be page aligned.
> - * n is at most a single page, but could be less.
> */
> static void copy_from_pmem(void *dst, struct pmem_device *pmem,
> sector_t sector, size_t n)
> {
> void *src;
> - unsigned int offset = (sector & (PAGE_SECTORS - 1)) << SECTOR_SHIFT;
> - size_t copy;
> -
> - BUG_ON(n > PAGE_SIZE);
>
> - copy = min_t(size_t, n, PAGE_SIZE - offset);
> src = pmem_lookup_pg_addr(pmem, sector);
> -
> - memcpy(dst, src + offset, copy);
> -
> - if (copy < n) {
> - dst += copy;
> - sector += copy >> SECTOR_SHIFT;
> - copy = n - copy;
> - src = pmem_lookup_pg_addr(pmem, sector);
> - memcpy(dst, src, copy);
> - }
> + memcpy(dst, src, n);
> }
>
> static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
> --
> 1.9.3
>
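
For reference, the latency concern in the quoted reply (a hypothetical 2GB copy at roughly 20GB/s is about 0.1 s inside one memcpy() call) is usually addressed by bounding each copy and offering to reschedule between chunks. The following is only a user-space sketch of that idea, not kernel code: chunked_copy(), yield_fn, count_yield() and COPY_CHUNK are hypothetical names, and the yield callback stands in for what cond_resched() would do in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COPY_CHUNK 4096u   /* assumed chunk size: one page */

/* Hypothetical hook; in the kernel this role is played by cond_resched(). */
typedef void (*yield_fn)(void *ctx);

/* Copy n bytes in chunks of at most COPY_CHUNK, yielding between chunks. */
static void chunked_copy(void *dst, const void *src, size_t n,
			 yield_fn yield, void *ctx)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n > 0) {
		size_t len = n < COPY_CHUNK ? n : COPY_CHUNK;

		memcpy(d, s, len);
		d += len;
		s += len;
		n -= len;
		if (n > 0 && yield)
			yield(ctx);	/* bounded work done; let others run */
	}
}

/* Test hook: counts how many times the copy offered to yield. */
static void count_yield(void *ctx)
{
	++*(unsigned *)ctx;
}

/* Copies three full chunks plus 100 bytes; returns the yield count. */
static unsigned demo_chunked_copy(void)
{
	static unsigned char src[3 * COPY_CHUNK + 100];
	static unsigned char dst[sizeof(src)];
	unsigned yields = 0;
	size_t i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 7);

	chunked_copy(dst, src, sizeof(src), count_yield, &yields);
	assert(memcmp(dst, src, sizeof(src)) == 0);
	return yields;
}
```

Four chunks are copied here, so the hook runs three times. As the reply at the top of this message argues, the copy helpers in this patch only ever see up to PAGE_SIZE bytes per call because the data comes from a bio page, so any such yield point would belong in the caller's loop over bio segments rather than inside copy_to_pmem() or copy_from_pmem().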