* [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage
@ 2013-07-01 11:57 Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 01/13] mm: add PRAM API stubs and Kconfig Vladimir Davydov
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Hi,
This patchset implements persistent over-kexec memory storage or PRAM, which is
intended to be used for saving memory pages of the currently executing kernel
and restoring them after a kexec in the newly booted one. This can be utilized
for speeding up reboot by leaving process memory and/or FS caches in-place. The
patchset introduces the PRAM kernel API serving for that purpose and makes use
of this API to make tmpfs 'persistent', i.e. makes it possible to save tmpfs
tree on unmount and restore it on the next mount even if the system is kexec'd
between the mount and unmount.
For further details, please see below.
-- The problem --
If Ksplice is not available or cannot be applied, a kernel update requires
restarting the system, which implies reinitialization of all running
applications. Since this is a disk-bound operation, it can take quite a lot of
time. What is worse, if the host serves as a web, database, or other kind of
server, then apart from the long downtime the reboot will cause all existing
connections to be dropped, which may not always be tolerated.
Although the kernel boot can be sped up significantly by employing kexec,
which jumps directly to the new kernel skipping the BIOS and boot loader
stages, it does nothing for running applications, which still need to be
restarted.
-- The solution --
There is the rapidly developing criu project (www.criu.org), which aims at
saving running application states to disk to be restored later. It is already
accepted by the community and hopefully it will soon be able to dump and
restore every Linux process. Obviously criu can be used to avoid full
application reinitialization on reboot, but criu'ing may still take a lot of
time. To illustrate, imagine a database server that has cached 100 GB of data
in its internal buffers. Writing the image of that process sequentially at 100
MB/s will take more than 15 minutes. Multiplied by two, since the image must be
read after reboot, it gives half an hour of downtime! The server's clients will
probably time out and disconnect before the system is up and running again,
which cancels all the benefits of criu'ing.
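The downtime estimate can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 1000 MB); the function names are made up for illustration and are not part of the patchset.

```c
#include <assert.h>

/* Seconds needed to move size_gb gigabytes at rate_mb_s megabytes per
 * second, assuming decimal units (1 GB = 1000 MB). */
unsigned long transfer_seconds(unsigned long size_gb, unsigned long rate_mb_s)
{
	assert(rate_mb_s != 0);
	return size_gb * 1000UL / rate_mb_s;
}

/* Dump plus restore: the image is written once and read back once. */
unsigned long dump_restore_seconds(unsigned long size_gb,
				   unsigned long rate_mb_s)
{
	return 2 * transfer_seconds(size_gb, rate_mb_s);
}
```

For 100 GB at 100 MB/s this gives 1000 seconds one way (a little under 17 minutes) and 2000 seconds for the round trip, which matches the "half an hour" figure.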
However, the disk read/write, which is the bottleneck in the criu scheme, can
be avoided if kexec is used for rebooting. The point is that kexec does not reset
the RAM state, leaving all data written to memory intact. This fact is already
utilized by kdump to gather the full memory image on kernel panic. If it were
possible to save arbitrary data and restore them after kexec, it could be
utilized to completely avoid disk accesses when criu'ing.
This patchset implements the kernel API for saving data to be restored after
kexec and employs it to make tmpfs 'persistent' as described below.
-- Usage --
1) Boot kernel with 'pram_banned=MEMRANGE' boot option.
MEMRANGE=MEMMIN-MEMMAX specifies memory range where kexec will load the new
kernel code. It is used to avoid conflicts with persistent memory as
described in implementation details. MEMRANGE=0-128M should be enough.
2) Mount tmpfs with 'pram=NAME' option.
NAME is an arbitrary string specifying persistent memory node. Different
tmpfs trees may be saved to PRAM if different names are passed.
# mkdir -p /mnt/crdump
# mount -t tmpfs -o pram=mytmpfs none /mnt/crdump
3) Checkpoint the process tree you'd want to pass over kexec to tmpfs.
# criu dump -D /mnt/crdump -t $PID
4) Unmount tmpfs.
It will be automatically saved to PRAM on unmount.
# umount /mnt/crdump
5) Load the new kernel image.
Kexec needs some tweaking for PRAM to work. First, one should pass PRAM
super block pfn via 'pram' boot option. The pfn is exported via the sysfs
file /sys/kernel/pram. Second, kexec must be forced to load the kernel code
to MEMRANGE (see step 1).
# kexec --load /vmlinuz --initrd=initrd.img \
--append="$(cat /proc/cmdline | sed -e 's/pram=[^ ]*//g') pram=$(cat /sys/kernel/pram)" \
--mem-min=$MEMMIN --mem-max=$MEMMAX
6) Boot to the new kernel.
# reboot
7) Mount tmpfs with 'pram=NAME' option.
It should find the PRAM node with the tmpfs tree saved on previous unmount
and restore it.
# mount -t tmpfs -o pram=mytmpfs none /mnt/crdump
8) Restore the process saved in step 3.
# criu restore -d -D /mnt/crdump
9) Remove the dump and unmount tmpfs
# rm -rf /mnt/crdump/*
# umount /mnt/crdump
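Step 1 takes a range of the form MEMMIN-MEMMAX, such as 0-128M. How the patchset parses it is not shown in this part of the thread; in-kernel code would normally rely on memparse() for the size suffixes. The userspace sketch below only illustrates the format. parse_size() and parse_memrange() are hypothetical helpers, not functions from the patchset.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end is set past it. */
unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Parse "MIN-MAX"; returns 0 on success, -1 on malformed input. */
int parse_memrange(const char *s, unsigned long long *min,
		   unsigned long long *max)
{
	char *p;

	assert(s && min && max);
	*min = parse_size(s, &p);
	if (p == s || *p != '-')
		return -1;
	s = p + 1;
	*max = parse_size(s, &p);
	if (p == s || *p != '\0' || *max < *min)
		return -1;
	return 0;
}
```

With this sketch, "0-128M" yields the range [0, 134217728], the same bounds that would be passed to kexec as --mem-min and --mem-max in step 5.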
-- Implementation details --
* Saving a memory page simply increments its reference count, so the page
is not freed when the last user puts it. The data saved to PRAM may therefore
still be used as usual.
* To preserve persistent memory in the newly booted kernel, PRAM marks all the
pages saved as reserved at early boot so that they will not be recycled. For
the new kernel to find persistent memory metadata, one should pass PRAM
super block pfn, which is exported via /sys/kernel/pram, in the 'pram' boot
param.
* Since some memory is required for completing boot sequence, PRAM tracks all
memory regions that have ever been reserved by other parts of the kernel and
avoids using them for persistent memory. Since the device configuration
cannot change during kexec, and the newly booted kernel is likely to have
the same set of device drivers, it should work in most cases.
* Since kexec may load the new kernel code to any memory region, it can
destroy persistent memory. To exclude this, kexec should be forced to load
the new kernel code to a memory region that is banned for PRAM. For that
purpose, there is the 'pram_banned' boot param and --mem-min and --mem-max
options of the kexec utility.
* If a conflict still happens, it will be detected and all persistent memory
will be discarded to prevent further errors. Detection is ensured by
checksumming all data saved to PRAM.
* tmpfs is saved to PRAM on unmount and loaded on mount if 'pram=NAME' mount
option is passed. NAME specifies the PRAM node to save data to. This is to
allow saving several tmpfs trees.
* Saving tmpfs to PRAM is rudimentary at present and serves rather as
a proof of concept. Namely, only regular files without multiple hard links
are supported and tmpfs may not be swapped out. If these requirements are
not met, the save to PRAM will be aborted with a message in the kernel log.
This is not very difficult to fix, but at present one should turn off swap
to test the feature.
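The checksum-based conflict detection mentioned in the list above can be pictured as follows. The patch that adds checksumming is not part of this section, so this is only a userspace sketch of the idea: it uses an Adler-32 style sum as a stand-in for whatever checksum the patchset actually uses, and struct saved_page, page_seal() and page_verify() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096

struct saved_page {
	uint32_t csum;		/* checksum of data[] */
	uint8_t data[SKETCH_PAGE_SIZE - sizeof(uint32_t)];
};

/* Adler-32 style checksum over a byte buffer; a stand-in only. */
uint32_t sketch_csum(const uint8_t *buf, size_t len)
{
	uint32_t a = 1, b = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;
		b = (b + a) % 65521;
	}
	return (b << 16) | a;
}

/* Before kexec: record the checksum of the page contents. */
void page_seal(struct saved_page *p)
{
	assert(p != NULL);
	p->csum = sketch_csum(p->data, sizeof(p->data));
}

/* After kexec: returns 1 if the page is intact, 0 if it was overwritten. */
int page_verify(const struct saved_page *p)
{
	assert(p != NULL);
	return p->csum == sketch_csum(p->data, sizeof(p->data));
}
```

If page_verify() fails for any saved page, the safe reaction described above is to drop all persistent memory rather than trust partially corrupted data.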
-- Future plans --
What we'd like to do:
* Implement swap entries 'freezing' to allow saving a swapped out tmpfs.
* Implement full support of tmpfs including saving dirs, special files, etc.
* Implement SPLICE_F_MOVE, SPLICE_F_GIFT flags for splicing data from/to
shmem. This would allow avoiding memory copying on checkpoint/restore.
* Save uptodate fs cache on umount to be restored on mount after kexec.
Thanks,
Vladimir Davydov (13):
mm: add PRAM API stubs and Kconfig
mm: PRAM: implement node load and save functions
mm: PRAM: implement page stream operations
mm: PRAM: implement byte stream operations
mm: PRAM: link nodes by pfn before reboot
mm: PRAM: introduce super block
mm: PRAM: preserve persistent memory at boot
mm: PRAM: checksum saved data
mm: PRAM: ban pages that have been reserved at boot time
mm: PRAM: allow to ban arbitrary memory ranges
mm: PRAM: allow to free persistent memory from userspace
mm: shmem: introduce shmem_insert_page
mm: shmem: enable saving to PRAM
arch/x86/kernel/setup.c | 2 +
arch/x86/mm/init_32.c | 5 +
arch/x86/mm/init_64.c | 5 +
include/linux/pram.h | 62 +++
include/linux/shmem_fs.h | 29 ++
mm/Kconfig | 14 +
mm/Makefile | 1 +
mm/bootmem.c | 4 +
mm/memblock.c | 7 +-
mm/pram.c | 1279 ++++++++++++++++++++++++++++++++++++++++++++++
mm/shmem.c | 97 +++-
mm/shmem_pram.c | 378 ++++++++++++++
12 files changed, 1878 insertions(+), 5 deletions(-)
create mode 100644 include/linux/pram.h
create mode 100644 mm/pram.c
create mode 100644 mm/shmem_pram.c
--
1.7.10.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH RFC 01/13] mm: add PRAM API stubs and Kconfig
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 02/13] mm: PRAM: implement node load and save functions Vladimir Davydov
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Persistent memory subsys or PRAM is intended to be used for saving
memory pages of the currently executing kernel and restoring them after
a kexec in the newly booted one. This can be utilized for speeding up
reboot by leaving process memory and/or FS caches in-place.
The proposed API:
* Persistent memory is divided into nodes, which can be saved or loaded
independently of each other. The nodes are identified by unique name
strings. PRAM node is created (removed) when save (load) is initiated
by calling pram_prepare_save() (pram_prepare_load()), see below.
* For saving/loading data from a PRAM node an instance of the
pram_stream struct is used. The struct is initialized by calling
pram_prepare_save() for saving data or pram_prepare_load() for
loading data. After save (load) is complete, pram_finish_save()
(pram_finish_load()) must be called. If an error occurred during
save, the saved data and the PRAM node may be freed by calling
pram_discard_save() instead of pram_finish_save().
* Each pram_stream has a type, which determines the set of operations
that may be used for saving/loading data. The type is defined by the
pram_stream_type enum. Currently there are two stream types
available: PRAM_PAGE_STREAM to save/load memory pages, and
PRAM_BYTE_STREAM to save/load byte strings. For page streams
pram_save_page() and pram_load_page() may be used, and for byte
streams pram_write() and pram_read() may be used for saving and
loading data respectively.
Thus a sequence of operations for saving/loading data from PRAM should
look like:
* For saving data to PRAM:
/* create PRAM node and initialize stream for saving data to it */
pram_prepare_save()
/* save data to the node */
pram_save_page()[,...] /* for page stream, or
pram_write()[,...] * ... for byte stream */
/* commit the save or discard and delete the node */
pram_finish_save() /* on success, or
pram_discard_save() * ... in case of error */
* For loading data from PRAM:
/* remove PRAM node from the list and initialize stream for
* loading data from it */
pram_prepare_load()
/* load data from the node */
pram_load_page()[,...] /* for page stream, or
pram_read()[,...] * ... for byte stream */
/* free the node */
pram_finish_load()
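For illustration, the two sequences above might look like this from a caller's point of view. Since the real API only exists in the kernel, the sketch defines userspace stand-ins with the same names and signatures as the declarations in include/linux/pram.h, backed by a single in-memory buffer; the stand-in bodies, the one-member struct pram_stream, and the save_blob()/load_blob() helpers are assumptions made for this example only.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-ins for the kernel API: one node, no real PRAM. */
typedef unsigned int gfp_t;
enum pram_stream_type { PRAM_PAGE_STREAM, PRAM_BYTE_STREAM };
struct pram_stream { size_t pos; };

static char store[256];		/* contents of the single mock node */
static size_t store_len;
static int store_valid;		/* set once a save has been committed */

int pram_prepare_save(struct pram_stream *ps, const char *name,
		      enum pram_stream_type type, gfp_t gfp_mask)
{
	(void)name; (void)type; (void)gfp_mask;
	if (store_valid)
		return -EEXIST;
	ps->pos = 0;
	return 0;
}

ssize_t pram_write(struct pram_stream *ps, const void *buf, size_t count)
{
	if (count > sizeof(store) - ps->pos)
		return -ENOMEM;
	memcpy(store + ps->pos, buf, count);
	ps->pos += count;
	return (ssize_t)count;
}

void pram_finish_save(struct pram_stream *ps)
{
	store_len = ps->pos;
	store_valid = 1;
}

void pram_discard_save(struct pram_stream *ps) { ps->pos = 0; }

int pram_prepare_load(struct pram_stream *ps, const char *name,
		      enum pram_stream_type type)
{
	(void)name; (void)type;
	if (!store_valid)
		return -ENOENT;
	store_valid = 0;
	ps->pos = 0;
	return 0;
}

size_t pram_read(struct pram_stream *ps, void *buf, size_t count)
{
	size_t n = store_len - ps->pos;

	if (n > count)
		n = count;
	memcpy(buf, store + ps->pos, n);
	ps->pos += n;
	return n;
}

void pram_finish_load(struct pram_stream *ps) { (void)ps; }

/* Save path: prepare, write, then commit or discard. */
int save_blob(const char *name, const void *buf, size_t len)
{
	struct pram_stream ps;
	ssize_t ret;
	int err;

	assert(name && buf);
	err = pram_prepare_save(&ps, name, PRAM_BYTE_STREAM, 0);
	if (err)
		return err;
	ret = pram_write(&ps, buf, len);
	if (ret < 0) {
		pram_discard_save(&ps);	/* error: drop the node */
		return (int)ret;
	}
	pram_finish_save(&ps);		/* success: commit */
	return 0;
}

/* Load path: prepare, read, finish. */
ssize_t load_blob(const char *name, void *buf, size_t len)
{
	struct pram_stream ps;
	size_t n;
	int err = pram_prepare_load(&ps, name, PRAM_BYTE_STREAM);

	if (err)
		return err;
	n = pram_read(&ps, buf, len);
	pram_finish_load(&ps);
	return (ssize_t)n;
}
```

The point to notice is that every successful pram_prepare_save() is paired with exactly one of pram_finish_save() or pram_discard_save(), and that a node can be loaded only once, because loading consumes it.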
---
include/linux/pram.h | 38 +++++++++++++++
mm/Kconfig | 9 ++++
mm/Makefile | 1 +
mm/pram.c | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 179 insertions(+)
create mode 100644 include/linux/pram.h
create mode 100644 mm/pram.c
diff --git a/include/linux/pram.h b/include/linux/pram.h
new file mode 100644
index 0000000..cf04548
--- /dev/null
+++ b/include/linux/pram.h
@@ -0,0 +1,38 @@
+#ifndef _LINUX_PRAM_H
+#define _LINUX_PRAM_H
+
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/mm_types.h>
+
+struct pram_stream;
+
+#define PRAM_NAME_MAX 256 /* including nul */
+
+enum pram_stream_type {
+ PRAM_PAGE_STREAM,
+ PRAM_BYTE_STREAM,
+};
+
+extern int pram_prepare_save(struct pram_stream *ps,
+ const char *name, enum pram_stream_type type, gfp_t gfp_mask);
+extern void pram_finish_save(struct pram_stream *ps);
+extern void pram_discard_save(struct pram_stream *ps);
+
+extern int pram_prepare_load(struct pram_stream *ps,
+ const char *name, enum pram_stream_type type);
+extern void pram_finish_load(struct pram_stream *ps);
+
+#define PRAM_PAGE_LRU 0x01 /* page is on the LRU */
+
+/* page-stream specific methods */
+extern int pram_save_page(struct pram_stream *ps,
+ struct page *page, int flags);
+extern struct page *pram_load_page(struct pram_stream *ps, int *flags);
+
+/* byte-stream specific methods */
+extern ssize_t pram_write(struct pram_stream *ps,
+ const void *buf, size_t count);
+extern size_t pram_read(struct pram_stream *ps, void *buf, size_t count);
+
+#endif /* _LINUX_PRAM_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 3bea74f..46337e8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -471,3 +471,12 @@ config FRONTSWAP
and swap data is stored as normal on the matching swap device.
If unsure, say Y to enable frontswap.
+
+config PRAM
+ bool "Persistent over-kexec memory storage"
+ default n
+ help
+ This option adds the kernel API that enables saving memory pages of
+ the currently executing kernel and restoring them after a kexec in
+ the newly booted one. This can be utilized for speeding up reboot by
+ leaving process memory and/or FS caches in-place.
diff --git a/mm/Makefile b/mm/Makefile
index 3a46287..33ad952 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -58,3 +58,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
+obj-$(CONFIG_PRAM) += pram.o
diff --git a/mm/pram.c b/mm/pram.c
new file mode 100644
index 0000000..cea0e87
--- /dev/null
+++ b/mm/pram.c
@@ -0,0 +1,131 @@
+#include <linux/err.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/pram.h>
+#include <linux/types.h>
+
+/**
+ * Create a persistent memory node with name @name and initialize stream @ps
+ * for saving data to it.
+ *
+ * @type determines the content type of the newly created node and, as a
+ * result, the set of operations that may be used on the stream as follows:
+ * %PRAM_PAGE_STREAM: page stream, use pram_save_page()
+ * %PRAM_BYTE_STREAM: byte stream, use pram_write()
+ *
+ * @gfp_mask specifies the memory allocation mask to be used when saving data.
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the save has finished, pram_finish_save() (or pram_discard_save() in
+ * case of failure) is to be called.
+ */
+int pram_prepare_save(struct pram_stream *ps,
+ const char *name, enum pram_stream_type type, gfp_t gfp_mask)
+{
+ return -ENOSYS;
+}
+
+/**
+ * Commit the save to persistent memory started with pram_prepare_save().
+ * After the call, the stream may not be used any more.
+ */
+void pram_finish_save(struct pram_stream *ps)
+{
+ BUG();
+}
+
+/**
+ * Cancel the save to persistent memory started with pram_prepare_save() and
+ * destroy the corresponding persistent memory node freeing all data that have
+ * been saved to it.
+ */
+void pram_discard_save(struct pram_stream *ps)
+{
+ BUG();
+}
+
+/**
+ * Remove the persistent memory node with name @name and initialize stream @ps
+ * for loading data from it.
+ *
+ * @type determines the content type of the node to be loaded and, as a result,
+ * the set of operations that may be used on the stream as follows:
+ * %PRAM_PAGE_STREAM: page stream, use pram_load_page()
+ * %PRAM_BYTE_STREAM: byte stream, use pram_read()
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the load has finished, pram_finish_load() is to be called.
+ */
+int pram_prepare_load(struct pram_stream *ps,
+ const char *name, enum pram_stream_type type)
+{
+ return -ENOSYS;
+}
+
+/**
+ * Finish the load from persistent memory started with pram_prepare_load()
+ * freeing the corresponding persistent memory node and all data that have not
+ * been loaded from it.
+ */
+void pram_finish_load(struct pram_stream *ps)
+{
+ BUG();
+}
+
+/**
+ * Save page @page to the persistent memory node associated with stream @ps.
+ * The stream must be initialized with pram_prepare_save().
+ *
+ * @flags determines the page state. If the page is on the lru, @flags should
+ * have the PRAM_PAGE_LRU bit set.
+ *
+ * Returns 0 on success, -errno on failure.
+ */
+int pram_save_page(struct pram_stream *ps, struct page *page, int flags)
+{
+ return -ENOSYS;
+}
+
+/**
+ * Load the next page from the persistent memory node associated with stream
+ * @ps. The stream must be initialized with pram_prepare_load().
+ *
+ * If not NULL, @flags is initialized with the state of the page loaded. If the
+ * page is on the lru, it will have the PRAM_PAGE_LRU bit set.
+ *
+ * Returns the page loaded or NULL if the node is empty.
+ *
+ * Pages are loaded from persistent memory in the same order they were saved.
+ * The page loaded has its refcounter incremented.
+ */
+struct page *pram_load_page(struct pram_stream *ps, int *flags)
+{
+ return NULL;
+}
+
+/**
+ * Copy @count bytes from @buf to the persistent memory node associated with
+ * stream @ps. The stream must be initialized with pram_prepare_save().
+ *
+ * On success, returns the number of bytes written, which is always equal to
+ * @count. On failure, -errno is returned.
+ */
+ssize_t pram_write(struct pram_stream *ps, const void *buf, size_t count)
+{
+ return -ENOSYS;
+}
+
+/**
+ * Copy up to @count bytes from the persistent memory node associated with
+ * stream @ps to @buf. The stream must be initialized with pram_prepare_load().
+ *
+ * Returns the number of bytes read, which may be less than @count if the node
+ * has fewer bytes available.
+ */
+size_t pram_read(struct pram_stream *ps, void *buf, size_t count)
+{
+ return 0;
+}
--
1.7.10.4
* [PATCH RFC 02/13] mm: PRAM: implement node load and save functions
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 01/13] mm: add PRAM API stubs and Kconfig Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 03/13] mm: PRAM: implement page stream operations Vladimir Davydov
` (10 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Persistent memory is divided into nodes, which can be saved and loaded
independently of each other. PRAM nodes are kept on a list and
identified by unique names. Whenever a save operation is initiated by
calling pram_prepare_save(), a new node is created and linked to the
list. When the save operation has been committed by calling
pram_finish_save(), the node becomes loadable. A load operation can
then be initiated by calling pram_prepare_load(), which deletes the node
from the list and prepares the corresponding stream for loading data
from it. After the load has finished, the pram_finish_load()
function must be called to free the node. Nodes are also deleted when a
save operation is discarded, i.e. pram_discard_save() is called instead
of pram_finish_save().
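The node life cycle described above can be modelled in userspace roughly as follows. The PRAM_SAVE, PRAM_LOAD and PRAM_ACCMODE_MASK values and the error codes follow the patch below; the fixed-size slot table, the slot_state enum and the node_* function names are simplifications invented for this sketch (the patch itself allocates one page per node and links nodes through page::lru).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PRAM_SAVE		1
#define PRAM_LOAD		2
#define PRAM_ACCMODE_MASK	3
#define SKETCH_MAX_NODES	4	/* arbitrary limit for the sketch */

enum slot_state { SLOT_FREE, SLOT_LISTED, SLOT_UNLISTED };

struct node {
	enum slot_state slot;	/* stands in for list membership */
	unsigned int flags;
	int type;
	char name[32];
};

static struct node nodes[SKETCH_MAX_NODES];

static struct node *find_node(const char *name)
{
	size_t i;

	for (i = 0; i < SKETCH_MAX_NODES; i++)
		if (nodes[i].slot == SLOT_LISTED &&
		    strcmp(nodes[i].name, name) == 0)
			return &nodes[i];
	return NULL;
}

/* Create a node in the PRAM_SAVE state and put it on the "list". */
int node_prepare_save(const char *name, int type, struct node **out)
{
	size_t i;

	if (strlen(name) >= sizeof(nodes[0].name))
		return -ENAMETOOLONG;
	if (find_node(name))
		return -EEXIST;
	for (i = 0; i < SKETCH_MAX_NODES; i++) {
		if (nodes[i].slot != SLOT_FREE)
			continue;
		nodes[i].slot = SLOT_LISTED;
		nodes[i].flags = PRAM_SAVE;
		nodes[i].type = type;
		strcpy(nodes[i].name, name);
		*out = &nodes[i];
		return 0;
	}
	return -ENOMEM;
}

/* Commit: clear the access mode so the node becomes loadable. */
void node_finish_save(struct node *n)
{
	assert((n->flags & PRAM_ACCMODE_MASK) == PRAM_SAVE);
	n->flags &= ~PRAM_ACCMODE_MASK;
}

/* Take the node off the "list" and mark it as being loaded. */
int node_prepare_load(const char *name, int type, struct node **out)
{
	struct node *n = find_node(name);

	if (!n)
		return -ENOENT;
	if (n->flags & PRAM_ACCMODE_MASK)
		return -EBUSY;		/* save not committed yet */
	if (n->type != type)
		return -EPERM;
	n->slot = SLOT_UNLISTED;	/* no longer findable by name */
	n->flags |= PRAM_LOAD;
	*out = n;
	return 0;
}

/* Free the node once loading is done. */
void node_finish_load(struct node *n)
{
	assert((n->flags & PRAM_ACCMODE_MASK) == PRAM_LOAD);
	n->slot = SLOT_FREE;
	n->flags = 0;
}
```

The asserts play the role of the BUG_ON() checks in the patch: finishing a save on a node that is not in the PRAM_SAVE state, or finishing a load on one that is not in the PRAM_LOAD state, is treated as a programming error.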
---
include/linux/pram.h | 7 ++-
mm/pram.c | 158 ++++++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 159 insertions(+), 6 deletions(-)
diff --git a/include/linux/pram.h b/include/linux/pram.h
index cf04548..5b8c2c1 100644
--- a/include/linux/pram.h
+++ b/include/linux/pram.h
@@ -5,7 +5,12 @@
#include <linux/types.h>
#include <linux/mm_types.h>
-struct pram_stream;
+struct pram_node;
+
+struct pram_stream {
+ gfp_t gfp_mask;
+ struct pram_node *node;
+};
#define PRAM_NAME_MAX 256 /* including nul */
diff --git a/mm/pram.c b/mm/pram.c
index cea0e87..3af2039 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -1,10 +1,75 @@
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/kernel.h>
+#include <linux/list.h>
#include <linux/mm.h>
+#include <linux/mutex.h>
#include <linux/pram.h>
+#include <linux/string.h>
#include <linux/types.h>
+/*
+ * Persistent memory is divided into nodes that can be saved or loaded
+ * independently of each other. The nodes are identified by unique name
+ * strings.
+ *
+ * The structure occupies a memory page.
+ */
+struct pram_node {
+ __u32 flags; /* see PRAM_* flags below */
+ __u32 type; /* data type, see enum pram_stream_type */
+
+ __u8 name[PRAM_NAME_MAX];
+};
+
+#define PRAM_SAVE 1
+#define PRAM_LOAD 2
+#define PRAM_ACCMODE_MASK 3
+
+static LIST_HEAD(pram_nodes); /* linked through page::lru */
+static DEFINE_MUTEX(pram_mutex); /* serializes open/close */
+
+static inline struct page *pram_alloc_page(gfp_t gfp_mask)
+{
+ return alloc_page(gfp_mask);
+}
+
+static inline void pram_free_page(void *addr)
+{
+ free_page((unsigned long)addr);
+}
+
+static inline void pram_insert_node(struct pram_node *node)
+{
+ list_add(&virt_to_page(node)->lru, &pram_nodes);
+}
+
+static inline void pram_delete_node(struct pram_node *node)
+{
+ list_del(&virt_to_page(node)->lru);
+}
+
+static struct pram_node *pram_find_node(const char *name)
+{
+ struct page *page;
+ struct pram_node *node;
+
+ list_for_each_entry(page, &pram_nodes, lru) {
+ node = page_address(page);
+ if (strcmp(node->name, name) == 0)
+ return node;
+ }
+ return NULL;
+}
+
+static void pram_stream_init(struct pram_stream *ps,
+ struct pram_node *node, gfp_t gfp_mask)
+{
+ memset(ps, 0, sizeof(*ps));
+ ps->gfp_mask = gfp_mask;
+ ps->node = node;
+}
+
/**
* Create a persistent memory node with name @name and initialize stream @ps
* for saving data to it.
@@ -18,13 +83,49 @@
*
* Returns 0 on success, -errno on failure.
*
+ * Error values:
+ * %ENAMETOOLONG: name len >= PRAM_NAME_MAX
+ * %ENOMEM: insufficient memory available
+ * %EEXIST: node with specified name already exists
+ *
* After the save has finished, pram_finish_save() (or pram_discard_save() in
* case of failure) is to be called.
*/
int pram_prepare_save(struct pram_stream *ps,
const char *name, enum pram_stream_type type, gfp_t gfp_mask)
{
- return -ENOSYS;
+ struct page *page;
+ struct pram_node *node;
+ int err = 0;
+
+ BUG_ON(type != PRAM_PAGE_STREAM &&
+ type != PRAM_BYTE_STREAM);
+
+ if (strlen(name) >= PRAM_NAME_MAX)
+ return -ENAMETOOLONG;
+
+ page = pram_alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page)
+ return -ENOMEM;
+ node = page_address(page);
+
+ node->flags = PRAM_SAVE;
+ node->type = type;
+ strcpy(node->name, name);
+
+ mutex_lock(&pram_mutex);
+ if (!pram_find_node(name))
+ pram_insert_node(node);
+ else
+ err = -EEXIST;
+ mutex_unlock(&pram_mutex);
+ if (err) {
+ __free_page(page);
+ return err;
+ }
+
+ pram_stream_init(ps, node, gfp_mask);
+ return 0;
}
/**
@@ -33,7 +134,12 @@ int pram_prepare_save(struct pram_stream *ps,
*/
void pram_finish_save(struct pram_stream *ps)
{
- BUG();
+ struct pram_node *node = ps->node;
+
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
+
+ smp_wmb();
+ node->flags &= ~PRAM_ACCMODE_MASK;
}
/**
@@ -43,7 +149,15 @@ void pram_finish_save(struct pram_stream *ps)
*/
void pram_discard_save(struct pram_stream *ps)
{
- BUG();
+ struct pram_node *node = ps->node;
+
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
+
+ mutex_lock(&pram_mutex);
+ pram_delete_node(node);
+ mutex_unlock(&pram_mutex);
+
+ pram_free_page(node);
}
/**
@@ -57,12 +171,42 @@ void pram_discard_save(struct pram_stream *ps)
*
* Returns 0 on success, -errno on failure.
*
+ * Error values:
+ * %ENOENT: node with specified name does not exist
+ * %EBUSY: save to required node has not finished yet
+ * %EPERM: specified type conflicts with type of required node
+ *
* After the load has finished, pram_finish_load() is to be called.
*/
int pram_prepare_load(struct pram_stream *ps,
const char *name, enum pram_stream_type type)
{
- return -ENOSYS;
+ struct pram_node *node;
+ int err = 0;
+
+ mutex_lock(&pram_mutex);
+ node = pram_find_node(name);
+ if (!node) {
+ err = -ENOENT;
+ goto out_unlock;
+ }
+ if (node->flags & PRAM_ACCMODE_MASK) {
+ err = -EBUSY;
+ goto out_unlock;
+ }
+ if (node->type != type) {
+ err = -EPERM;
+ goto out_unlock;
+ }
+ pram_delete_node(node);
+out_unlock:
+ mutex_unlock(&pram_mutex);
+ if (err)
+ return err;
+
+ node->flags |= PRAM_LOAD;
+ pram_stream_init(ps, node, 0);
+ return 0;
}
/**
@@ -72,7 +216,11 @@ int pram_prepare_load(struct pram_stream *ps,
*/
void pram_finish_load(struct pram_stream *ps)
{
- BUG();
+ struct pram_node *node = ps->node;
+
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_LOAD);
+
+ pram_free_page(node);
}
/**
--
1.7.10.4
* [PATCH RFC 03/13] mm: PRAM: implement page stream operations
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 01/13] mm: add PRAM API stubs and Kconfig Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 02/13] mm: PRAM: implement node load and save functions Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 04/13] mm: PRAM: implement byte " Vladimir Davydov
` (9 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Using the pram_save_page() function, one can populate PRAM nodes with
memory pages, which can then be loaded using the pram_load_page()
function. Saving a memory page to PRAM is implemented as storing its pfn
in the PRAM node and incrementing the page's ref count so that it will not
get freed after the last user puts it.
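The storage layout used for this, a chain of page-sized "links" each holding an array of (flags, pfn) entries, can be sketched in userspace as below. This is a simplified model, not the patch code: it chains links with ordinary pointers where the patch stores a pfn, it leaves out reference counting, and the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096

struct entry {
	uint32_t flags;
	uint64_t pfn;		/* 0 marks an unused slot */
};

struct link {
	struct link *next;	/* the patch stores a pfn here instead */
	struct entry entry[];	/* fills the rest of the page */
};

#define LINK_ENTRIES_MAX \
	((SKETCH_PAGE_SIZE - sizeof(struct link)) / sizeof(struct entry))

struct stream {
	struct link *head;	/* first link of the node */
	struct link *link;	/* current link */
	unsigned int index;	/* next entry index in link */
};

/* Append a pfn, allocating a zeroed page-sized link when needed. */
int stream_save(struct stream *s, uint64_t pfn, uint32_t flags)
{
	assert(pfn != 0);
	if (!s->link || s->index >= LINK_ENTRIES_MAX) {
		struct link *l = calloc(1, SKETCH_PAGE_SIZE);

		if (!l)
			return -1;
		if (s->link)
			s->link->next = l;
		else
			s->head = l;
		s->link = l;
		s->index = 0;
	}
	s->link->entry[s->index].flags = flags;
	s->link->entry[s->index].pfn = pfn;
	s->index++;
	return 0;
}

/* Switch the stream from saving to loading from the first link. */
void stream_rewind(struct stream *s)
{
	s->link = s->head;
	s->index = 0;
}

/* Return the next pfn in save order, or 0 when the node is empty. */
uint64_t stream_load(struct stream *s)
{
	struct link *l = s->link;
	uint64_t pfn;

	if (!l)
		return 0;
	pfn = l->entry[s->index].pfn;
	if (!pfn || ++s->index >= LINK_ENTRIES_MAX) {
		s->link = s->head = l->next;
		s->index = 0;
		free(l);	/* this link is exhausted */
	}
	return pfn;
}
```

Because links are zero-filled on allocation, a zero pfn doubles as the end-of-data marker, which is the same convention the patch relies on when it requires the unused tail of a link to be filled with zeros.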
---
include/linux/pram.h | 3 +
mm/pram.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 167 insertions(+), 2 deletions(-)
diff --git a/include/linux/pram.h b/include/linux/pram.h
index 5b8c2c1..dd17316 100644
--- a/include/linux/pram.h
+++ b/include/linux/pram.h
@@ -6,10 +6,13 @@
#include <linux/mm_types.h>
struct pram_node;
+struct pram_link;
struct pram_stream {
gfp_t gfp_mask;
struct pram_node *node;
+ struct pram_link *link; /* current link */
+ unsigned int page_index; /* next page index in link */
};
#define PRAM_NAME_MAX 256 /* including nul */
diff --git a/mm/pram.c b/mm/pram.c
index 3af2039..a443eb0 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -5,19 +5,48 @@
#include <linux/mm.h>
#include <linux/mutex.h>
#include <linux/pram.h>
+#include <linux/sched.h>
#include <linux/string.h>
#include <linux/types.h>
/*
+ * Represents a reference to a data page saved to PRAM.
+ */
+struct pram_entry {
+ __u32 flags; /* see PRAM_PAGE_* flags */
+ __u64 pfn; /* the page frame number */
+};
+
+/*
+ * Keeps references to data pages saved to PRAM.
+ * The structure occupies a memory page.
+ */
+struct pram_link {
+ __u64 link_pfn; /* points to the next link of the node */
+
+ /* the array occupies the rest of the link page; if the link is not
+ * full, the rest of the array must be filled with zeros */
+ struct pram_entry entry[0];
+};
+
+#define PRAM_LINK_ENTRIES_MAX \
+ ((PAGE_SIZE-sizeof(struct pram_link))/sizeof(struct pram_entry))
+
+/*
* Persistent memory is divided into nodes that can be saved or loaded
* independently of each other. The nodes are identified by unique name
* strings.
*
+ * References to data pages saved to a persistent memory node are kept in a
+ * singly-linked list of PRAM link structures (see above); the node holds a
+ * pointer to the head of that list.
+ *
* The structure occupies a memory page.
*/
struct pram_node {
__u32 flags; /* see PRAM_* flags below */
__u32 type; /* data type, see enum pram_stream_type */
+ __u64 link_pfn; /* points to the first link of the node */
__u8 name[PRAM_NAME_MAX];
};
@@ -62,12 +91,46 @@ static struct pram_node *pram_find_node(const char *name)
return NULL;
}
+static void pram_truncate_link(struct pram_link *link)
+{
+ int i;
+ unsigned long pfn;
+ struct page *page;
+
+ for (i = 0; i < PRAM_LINK_ENTRIES_MAX; i++) {
+ pfn = link->entry[i].pfn;
+ if (!pfn)
+ continue;
+ page = pfn_to_page(pfn);
+ put_page(page);
+ }
+}
+
+static void pram_truncate_node(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ pram_truncate_link(link);
+ link_pfn = link->link_pfn;
+ pram_free_page(link);
+ cond_resched();
+ }
+ node->link_pfn = 0;
+
+}
+
static void pram_stream_init(struct pram_stream *ps,
struct pram_node *node, gfp_t gfp_mask)
{
memset(ps, 0, sizeof(*ps));
ps->gfp_mask = gfp_mask;
ps->node = node;
+ if (node->link_pfn)
+ ps->link = pfn_to_kaddr(node->link_pfn);
}
/**
@@ -157,6 +220,7 @@ void pram_discard_save(struct pram_stream *ps)
pram_delete_node(node);
mutex_unlock(&pram_mutex);
+ pram_truncate_node(node);
pram_free_page(node);
}
@@ -220,9 +284,46 @@ void pram_finish_load(struct pram_stream *ps)
BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_LOAD);
+ pram_truncate_node(node);
pram_free_page(node);
}
+/*
+ * Insert page to PRAM node allocating a new PRAM link if necessary.
+ */
+static int __pram_save_page(struct pram_stream *ps,
+ struct page *page, int flags)
+{
+ struct pram_node *node = ps->node;
+ struct pram_link *link = ps->link;
+ struct pram_entry *entry;
+
+ if (!link || ps->page_index >= PRAM_LINK_ENTRIES_MAX) {
+ struct page *link_page;
+ unsigned long link_pfn;
+
+ link_page = pram_alloc_page((ps->gfp_mask & GFP_RECLAIM_MASK) |
+ __GFP_ZERO);
+ if (!link_page)
+ return -ENOMEM;
+
+ link_pfn = page_to_pfn(link_page);
+ if (link)
+ link->link_pfn = link_pfn;
+ else
+ node->link_pfn = link_pfn;
+
+ ps->link = link = page_address(link_page);
+ ps->page_index = 0;
+ }
+
+ get_page(page);
+ entry = &link->entry[ps->page_index++];
+ entry->flags = flags;
+ entry->pfn = page_to_pfn(page);
+ return 0;
+}
+
/**
* Save page @page to the persistent memory node associated with stream @ps.
* The stream must be initialized with pram_prepare_save().
@@ -231,10 +332,66 @@ void pram_finish_load(struct pram_stream *ps)
* have the PRAM_PAGE_LRU bit set.
*
* Returns 0 on success, -errno on failure.
+ *
+ * Error values:
+ * %ENOMEM: insufficient amount of memory available
+ *
+ * Saving a page to persistent memory is simply incrementing its refcount so
+ * that it will not get freed after the last user puts it. That means it is
+ * safe to use the page as usual after it has been saved.
*/
int pram_save_page(struct pram_stream *ps, struct page *page, int flags)
{
- return -ENOSYS;
+ struct pram_node *node = ps->node;
+
+ BUG_ON(node->type != PRAM_PAGE_STREAM);
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
+
+ BUG_ON(PageCompound(page));
+
+ return __pram_save_page(ps, page, flags);
+}
+
+/*
+ * Extract the next page from persistent memory freeing a PRAM link if it
+ * becomes empty.
+ */
+static struct page *__pram_load_page(struct pram_stream *ps, int *flags)
+{
+ struct pram_node *node = ps->node;
+ struct pram_link *link = ps->link;
+ struct pram_entry *entry;
+ struct page *page = NULL;
+ bool eof = false;
+
+ if (!link)
+ return NULL;
+
+ BUG_ON(ps->page_index >= PRAM_LINK_ENTRIES_MAX);
+ entry = &link->entry[ps->page_index];
+ if (entry->pfn) {
+ page = pfn_to_page(entry->pfn);
+ if (flags)
+ *flags = entry->flags;
+ } else
+ eof = true;
+
+ /* clear to avoid double free (see pram_truncate_link()) */
+ memset(entry, 0, sizeof(*entry));
+
+ if (eof || ++ps->page_index >= PRAM_LINK_ENTRIES_MAX) {
+ if (link->link_pfn) {
+ WARN_ON(eof);
+ ps->link = pfn_to_kaddr(link->link_pfn);
+ ps->page_index = 0;
+ } else
+ ps->link = NULL;
+
+ node->link_pfn = link->link_pfn;
+ pram_free_page(link);
+ }
+
+ return page;
}
/**
@@ -251,7 +408,12 @@ int pram_save_page(struct pram_stream *ps, struct page *page, int flags)
*/
struct page *pram_load_page(struct pram_stream *ps, int *flags)
{
- return NULL;
+ struct pram_node *node = ps->node;
+
+ BUG_ON(node->type != PRAM_PAGE_STREAM);
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_LOAD);
+
+ return __pram_load_page(ps, flags);
}
/**
--
1.7.10.4
* [PATCH RFC 04/13] mm: PRAM: implement byte stream operations
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
This patch adds the ability to save arbitrary byte strings to PRAM using
pram_write() to be restored later using pram_read(). These two
operations are implemented on top of pram_save_page() and
pram_load_page() respectively.
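To illustrate the chunking described above, here is a minimal userspace
sketch, not the kernel code. It assumes a fixed array of malloc'ed 4K
buffers in place of PRAM pages; struct sketch_stream, sketch_write() and
sketch_read() are hypothetical names introduced only for this example.
As in the patch, the writer fills one page at a time and accounts the
total in data_len, and the reader stops at data_len rather than at a page
boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_PAGES 16u

/* Hypothetical stand-in for a byte-stream node and its stream state. */
struct sketch_stream {
	unsigned char *pages[SKETCH_MAX_PAGES];	/* data pages, in save order */
	size_t nr_pages;	/* pages allocated by the writer */
	size_t data_len;	/* bytes stored and not yet read */
	size_t write_off;	/* fill level of the last page */
	size_t read_page;	/* next page to be consumed */
	size_t read_off;	/* read position inside that page */
};

/* Append count bytes; returns count, or -1 if no page could be obtained. */
long sketch_write(struct sketch_stream *s, const void *buf, size_t count)
{
	const unsigned char *src = buf;
	size_t written = 0;

	/* saving and loading are separate phases, as in PRAM */
	assert(s->read_page == 0 && s->read_off == 0);

	while (count > 0) {
		size_t chunk;

		/* start a new zeroed page if there is none or the last is full */
		if (s->nr_pages == 0 || s->write_off == SKETCH_PAGE_SIZE) {
			unsigned char *page;

			if (s->nr_pages == SKETCH_MAX_PAGES)
				return -1;
			page = calloc(1, SKETCH_PAGE_SIZE);
			if (!page)
				return -1;
			s->pages[s->nr_pages++] = page;
			s->write_off = 0;
		}

		chunk = SKETCH_PAGE_SIZE - s->write_off;
		if (chunk > count)
			chunk = count;
		memcpy(s->pages[s->nr_pages - 1] + s->write_off, src, chunk);

		src += chunk;
		s->write_off += chunk;
		s->data_len += chunk;
		written += chunk;
		count -= chunk;
	}
	return (long)written;
}

/* Read up to count bytes; returns the number of bytes actually copied. */
size_t sketch_read(struct sketch_stream *s, void *buf, size_t count)
{
	unsigned char *dst = buf;
	size_t done = 0;

	while (count > 0 && s->data_len > 0) {
		size_t chunk = SKETCH_PAGE_SIZE - s->read_off;

		assert(s->read_page < s->nr_pages);
		if (chunk > count)
			chunk = count;
		if (chunk > s->data_len)
			chunk = s->data_len;
		memcpy(dst, s->pages[s->read_page] + s->read_off, chunk);

		dst += chunk;
		s->read_off += chunk;
		s->data_len -= chunk;
		done += chunk;
		count -= chunk;

		/* drop the page once it is exhausted or the data ran out */
		if (s->read_off == SKETCH_PAGE_SIZE || s->data_len == 0) {
			free(s->pages[s->read_page]);
			s->pages[s->read_page++] = NULL;
			s->read_off = 0;
		}
	}
	return done;
}
```

A caller would zero-initialize the structure, issue any number of
sketch_write() calls, and then drain it with sketch_read(); a short read
signals the end of the stream.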
---
include/linux/pram.h | 4 +++
mm/pram.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 88 insertions(+), 2 deletions(-)
diff --git a/include/linux/pram.h b/include/linux/pram.h
index dd17316..61c536c 100644
--- a/include/linux/pram.h
+++ b/include/linux/pram.h
@@ -13,6 +13,10 @@ struct pram_stream {
struct pram_node *node;
struct pram_link *link; /* current link */
unsigned int page_index; /* next page index in link */
+
+ /* byte-stream specific */
+ struct page *data_page;
+ unsigned int data_offset;
};
#define PRAM_NAME_MAX 256 /* including nul */
diff --git a/mm/pram.c b/mm/pram.c
index a443eb0..f7eebe1 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -1,5 +1,6 @@
#include <linux/err.h>
#include <linux/gfp.h>
+#include <linux/highmem.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/mm.h>
@@ -46,6 +47,7 @@ struct pram_link {
struct pram_node {
__u32 flags; /* see PRAM_* flags below */
__u32 type; /* data type, see enum pram_stream_type */
+ __u64 data_len; /* data size, only for byte streams */
__u64 link_pfn; /* points to the first link of the node */
__u8 name[PRAM_NAME_MAX];
@@ -284,6 +286,9 @@ void pram_finish_load(struct pram_stream *ps)
BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_LOAD);
+ if (ps->data_page)
+ put_page(ps->data_page);
+
pram_truncate_node(node);
pram_free_page(node);
}
@@ -422,10 +427,51 @@ struct page *pram_load_page(struct pram_stream *ps, int *flags)
*
* On success, returns the number of bytes written, which is always equal to
* @count. On failure, -errno is returned.
+ *
+ * Error values:
+ * %ENOMEM: insufficient amount of memory available
*/
ssize_t pram_write(struct pram_stream *ps, const void *buf, size_t count)
{
- return -ENOSYS;
+ void *addr;
+ size_t copy_count, write_count = 0;
+ struct pram_node *node = ps->node;
+
+ BUG_ON(node->type != PRAM_BYTE_STREAM);
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
+
+ while (count > 0) {
+ if (!ps->data_page) {
+ struct page *page;
+ int err;
+
+ page = pram_alloc_page((ps->gfp_mask & GFP_RECLAIM_MASK) |
+ __GFP_HIGHMEM | __GFP_ZERO);
+ if (!page)
+ return -ENOMEM;
+ err = __pram_save_page(ps, page, 0);
+ put_page(page);
+ if (err)
+ return err;
+ ps->data_page = page;
+ ps->data_offset = 0;
+ }
+
+ copy_count = min_t(size_t, count, PAGE_SIZE - ps->data_offset);
+ addr = kmap_atomic(ps->data_page);
+ memcpy(addr + ps->data_offset, buf, copy_count);
+ kunmap_atomic(addr);
+
+ buf += copy_count;
+ node->data_len += copy_count;
+ ps->data_offset += copy_count;
+ if (ps->data_offset >= PAGE_SIZE)
+ ps->data_page = NULL;
+
+ write_count += copy_count;
+ count -= copy_count;
+ }
+ return write_count;
}
/**
@@ -437,5 +483,41 @@ ssize_t pram_write(struct pram_stream *ps, const void *buf, size_t count)
*/
size_t pram_read(struct pram_stream *ps, void *buf, size_t count)
{
- return 0;
+ char *addr;
+ size_t copy_count, read_count = 0;
+ struct pram_node *node = ps->node;
+
+ BUG_ON(node->type != PRAM_BYTE_STREAM);
+ BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_LOAD);
+
+ while (count > 0 && node->data_len > 0) {
+ if (!ps->data_page) {
+ struct page *page;
+
+ page = __pram_load_page(ps, NULL);
+ if (!page)
+ break;
+ ps->data_page = page;
+ ps->data_offset = 0;
+ }
+
+ copy_count = min_t(size_t, count, PAGE_SIZE - ps->data_offset);
+ if (copy_count > node->data_len)
+ copy_count = node->data_len;
+ addr = kmap_atomic(ps->data_page);
+ memcpy(buf, addr + ps->data_offset, copy_count);
+ kunmap_atomic(addr);
+
+ buf += copy_count;
+ node->data_len -= copy_count;
+ ps->data_offset += copy_count;
+ if (ps->data_offset >= PAGE_SIZE || !node->data_len) {
+ put_page(ps->data_page);
+ ps->data_page = NULL;
+ }
+
+ read_count += copy_count;
+ count -= copy_count;
+ }
+ return read_count;
}
--
1.7.10.4
* [PATCH RFC 05/13] mm: PRAM: link nodes by pfn before reboot
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Since page structs, which are used for linking PRAM nodes, are cleared
on boot, organize all PRAM nodes into a list singly-linked by pfn before
reboot so that the new kernel can restore the node list.
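The linking step can be pictured with a small userspace sketch. It is
only an illustration under stated assumptions: sketch_mem[] plays the
role of physical memory indexed by pfn, and sketch_link_nodes() and
sketch_walk_nodes() are hypothetical helpers, not functions from this
series. Walking the in-memory list in reverse lets every node record the
pfn of its successor, and the pfn left over at the end is the list head.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_FRAMES 8

/* Hypothetical node: only the field that must survive reboot is kept. */
struct sketch_node {
	unsigned long node_pfn;	/* pfn of the next node, 0 ends the list */
};

/* Fake physical memory: frame number -> node. Frame 0 is never used. */
struct sketch_node sketch_mem[SKETCH_NR_FRAMES];

/*
 * Link the nodes whose frame numbers are listed in order[0..n-1] and
 * return the pfn of the list head. The array is walked backwards so that
 * each node can point at the node that follows it.
 */
unsigned long sketch_link_nodes(const unsigned long *order, size_t n)
{
	unsigned long next_pfn = 0;

	while (n-- > 0) {
		assert(order[n] > 0 && order[n] < SKETCH_NR_FRAMES);
		sketch_mem[order[n]].node_pfn = next_pfn;
		next_pfn = order[n];
	}
	return next_pfn;
}

/* Follow the pfn links from head, storing up to max pfns in out. */
size_t sketch_walk_nodes(unsigned long head, unsigned long *out, size_t max)
{
	size_t n = 0;

	while (head && n < max) {
		out[n++] = head;
		head = sketch_mem[head].node_pfn;
	}
	return n;
}
```

The new kernel only needs the head pfn to rebuild the list, which is why
a pfn of 0 is reserved as the terminator in this sketch.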
---
mm/pram.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/mm/pram.c b/mm/pram.c
index f7eebe1..c7706dc 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -1,11 +1,15 @@
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/highmem.h>
+#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/mm.h>
+#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/notifier.h>
#include <linux/pram.h>
+#include <linux/reboot.h>
#include <linux/sched.h>
#include <linux/string.h>
#include <linux/types.h>
@@ -42,6 +46,9 @@ struct pram_link {
* singly-linked list of PRAM link structures (see above), the node has a
* pointer to the head of.
*
+ * To facilitate data restore in the new kernel, before reboot all PRAM nodes
+ * are organized into a list singly-linked by pfn's (see pram_reboot()).
+ *
* The structure occupies a memory page.
*/
struct pram_node {
@@ -49,6 +56,7 @@ struct pram_node {
__u32 type; /* data type, see enum pram_stream_type */
__u64 data_len; /* data size, only for byte streams */
__u64 link_pfn; /* points to the first link of the node */
+ __u64 node_pfn; /* points to the next node in the node list */
__u8 name[PRAM_NAME_MAX];
};
@@ -57,6 +65,10 @@ struct pram_node {
#define PRAM_LOAD 2
#define PRAM_ACCMODE_MASK 3
+/*
+ * For convenience sake PRAM nodes are kept in an auxiliary doubly-linked list
+ * connected through the lru field of the page struct.
+ */
static LIST_HEAD(pram_nodes); /* linked through page::lru */
static DEFINE_MUTEX(pram_mutex); /* serializes open/close */
@@ -521,3 +533,41 @@ size_t pram_read(struct pram_stream *ps, void *buf, size_t count)
}
return read_count;
}
+
+/*
+ * Build the list of PRAM nodes.
+ */
+static void __pram_reboot(void)
+{
+ struct page *page;
+ struct pram_node *node;
+ unsigned long node_pfn = 0;
+
+ list_for_each_entry_reverse(page, &pram_nodes, lru) {
+ node = page_address(page);
+ if (WARN_ON(node->flags & PRAM_ACCMODE_MASK))
+ continue;
+ node->node_pfn = node_pfn;
+ node_pfn = page_to_pfn(page);
+ }
+}
+
+static int pram_reboot(struct notifier_block *notifier,
+ unsigned long val, void *v)
+{
+ if (val != SYS_RESTART)
+ return NOTIFY_DONE;
+ __pram_reboot();
+ return NOTIFY_OK;
+}
+
+static struct notifier_block pram_reboot_notifier = {
+ .notifier_call = pram_reboot,
+};
+
+static int __init pram_init(void)
+{
+ register_reboot_notifier(&pram_reboot_notifier);
+ return 0;
+}
+module_init(pram_init);
--
1.7.10.4
* [PATCH RFC 06/13] mm: PRAM: introduce super block
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
The PRAM super block is the starting point for restoring persistent
memory. If the kernel locates the super block at boot time, it will
preserve the persistent memory structure from the previous kernel. To
point the kernel to the location of the super block, one should pass its
pfn, in hexadecimal, via the 'pram' boot parameter. For that purpose, the
PRAM super block pfn is exported via /sys/kernel/pram. If none is passed,
persistent memory will not be preserved, and a new super block will be
allocated.
The current patch introduces only super block handling. Memory
preservation will be implemented later.
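As a rough illustration of the hand-over, the userspace sketch below
turns the text read from /sys/kernel/pram into a 'pram=' argument for the
next kernel's command line. sketch_format_pram_arg() is a hypothetical
helper written for this example; it only assumes what the patch shows,
namely that the file contains a hex pfn followed by a newline and that 0
means no super block is available.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Convert the contents of /sys/kernel/pram (hex pfn plus newline) into a
 * "pram=<hex>" string. Returns 0 on success, -1 if the text is not a
 * valid non-zero hex number or if out is too small.
 */
int sketch_format_pram_arg(const char *sysfs_text, char *out, size_t out_len)
{
	char *end;
	unsigned long pfn;
	int n;

	errno = 0;
	pfn = strtoul(sysfs_text, &end, 16);
	if (errno || end == sysfs_text || (*end != '\n' && *end != '\0'))
		return -1;
	if (pfn == 0)
		return -1;	/* 0 is what the kernel prints without a super block */

	n = snprintf(out, out_len, "pram=%lx", pfn);
	if (n < 0 || (size_t)n >= out_len)
		return -1;	/* output was truncated */
	return 0;
}
```

The resulting string would be appended to the command line of the kernel
being kexec'd, so that parse_pram_sb_pfn() can pick it up at early boot.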
---
mm/pram.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 92 insertions(+), 2 deletions(-)
diff --git a/mm/pram.c b/mm/pram.c
index c7706dc..58ae9ed 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -3,15 +3,18 @@
#include <linux/highmem.h>
#include <linux/init.h>
#include <linux/kernel.h>
+#include <linux/kobject.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/notifier.h>
+#include <linux/pfn.h>
#include <linux/pram.h>
#include <linux/reboot.h>
#include <linux/sched.h>
#include <linux/string.h>
+#include <linux/sysfs.h>
#include <linux/types.h>
/*
@@ -66,12 +69,39 @@ struct pram_node {
#define PRAM_ACCMODE_MASK 3
/*
+ * The PRAM super block contains data needed to restore the persistent memory
+ * structure on boot. The pointer to it (pfn) should be passed via the 'pram'
+ * boot param if one wants to restore persistent data saved by the previously
+ * executing kernel. For that purpose the kernel exports the pfn via
+ * /sys/kernel/pram. If none is passed, persistent memory if any will not be
+ * preserved and a new clean page will be allocated for the super block.
+ *
+ * The structure occupies a memory page.
+ */
+struct pram_super_block {
+ __u64 node_pfn; /* points to the first element of
+ * the node list */
+};
+
+static unsigned long __initdata pram_sb_pfn;
+static struct pram_super_block *pram_sb;
+
+/*
* For convenience sake PRAM nodes are kept in an auxiliary doubly-linked list
* connected through the lru field of the page struct.
*/
static LIST_HEAD(pram_nodes); /* linked through page::lru */
static DEFINE_MUTEX(pram_mutex); /* serializes open/close */
+/*
+ * The PRAM super block pfn, see above.
+ */
+static int __init parse_pram_sb_pfn(char *arg)
+{
+ return kstrtoul(arg, 16, &pram_sb_pfn);
+}
+early_param("pram", parse_pram_sb_pfn);
+
static inline struct page *pram_alloc_page(gfp_t gfp_mask)
{
return alloc_page(gfp_mask);
@@ -161,6 +191,7 @@ static void pram_stream_init(struct pram_stream *ps,
* Returns 0 on success, -errno on failure.
*
* Error values:
+ * %ENODEV: PRAM not available
* %ENAMETOOLONG: name len >= PRAM_NAME_MAX
* %ENOMEM: insufficient memory available
* %EEXIST: node with specified name already exists
@@ -175,6 +206,9 @@ int pram_prepare_save(struct pram_stream *ps,
struct pram_node *node;
int err = 0;
+ if (!pram_sb)
+ return -ENODEV;
+
BUG_ON(type != PRAM_PAGE_STREAM &&
type != PRAM_BYTE_STREAM);
@@ -250,6 +284,7 @@ void pram_discard_save(struct pram_stream *ps)
* Returns 0 on success, -errno on failure.
*
* Error values:
+ * %ENODEV: PRAM not available
* %ENOENT: node with specified name does not exist
* %EBUSY: save to required node has not finished yet
* %EPERM: specified type conflicts with type of required node
@@ -262,6 +297,9 @@ int pram_prepare_load(struct pram_stream *ps,
struct pram_node *node;
int err = 0;
+ if (!pram_sb)
+ return -ENODEV;
+
mutex_lock(&pram_mutex);
node = pram_find_node(name);
if (!node) {
@@ -550,6 +588,7 @@ static void __pram_reboot(void)
node->node_pfn = node_pfn;
node_pfn = page_to_pfn(page);
}
+ pram_sb->node_pfn = node_pfn;
}
static int pram_reboot(struct notifier_block *notifier,
@@ -557,7 +596,8 @@ static int pram_reboot(struct notifier_block *notifier,
{
if (val != SYS_RESTART)
return NOTIFY_DONE;
- __pram_reboot();
+ if (pram_sb)
+ __pram_reboot();
return NOTIFY_OK;
}
@@ -565,9 +605,59 @@ static struct notifier_block pram_reboot_notifier = {
.notifier_call = pram_reboot,
};
+static ssize_t show_pram_sb_pfn(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ unsigned long pfn = pram_sb ? PFN_DOWN(__pa(pram_sb)) : 0;
+ return sprintf(buf, "%lx\n", pfn);
+}
+
+static struct kobj_attribute pram_sb_pfn_attr =
+ __ATTR(pram, 0444, show_pram_sb_pfn, NULL);
+
+static struct attribute *pram_attrs[] = {
+ &pram_sb_pfn_attr.attr,
+ NULL,
+};
+
+static struct attribute_group pram_attr_group = {
+ .attrs = pram_attrs,
+};
+
+/* returns non-zero on success */
+static int __init pram_init_sb(void)
+{
+ unsigned long pfn;
+ struct pram_node *node;
+
+ if (!pram_sb) {
+ struct page *page;
+
+ page = pram_alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page) {
+ pr_err("PRAM: Failed to allocate super block\n");
+ return 0;
+ }
+ pram_sb = page_address(page);
+ }
+
+ /* build auxiliary doubly-linked list of nodes connected through
+ * page::lru for convenience sake */
+ pfn = pram_sb->node_pfn;
+ while (pfn) {
+ node = pfn_to_kaddr(pfn);
+ pram_insert_node(node);
+ pfn = node->node_pfn;
+ }
+ return 1;
+}
+
static int __init pram_init(void)
{
- register_reboot_notifier(&pram_reboot_notifier);
+ if (pram_init_sb()) {
+ register_reboot_notifier(&pram_reboot_notifier);
+ sysfs_update_group(kernel_kobj, &pram_attr_group);
+ }
return 0;
}
module_init(pram_init);
--
1.7.10.4
* [PATCH RFC 07/13] mm: PRAM: preserve persistent memory at boot
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Persistent memory preservation is done by reserving memory pages
belonging to PRAM at early boot so that they will not be recycled. If
reserving any page fails for some reason (e.g. the memory region is busy),
all reservations are undone and persistent memory is lost.
Currently, PRAM preservation is only implemented for x86.
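The all-or-nothing behaviour can be sketched in userspace as follows.
This is an illustration only: a bool array stands in for the
memblock/bootmem reservation state, and sketch_reserve_page(),
sketch_unreserve_page() and sketch_reserve_all() are hypothetical names.
The point is the undo step, which releases exactly the pages reserved by
the failing call and leaves earlier reservations alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_FRAMES 16

/* Hypothetical reservation state standing in for memblock/bootmem. */
bool sketch_reserved[SKETCH_NR_FRAMES];
unsigned long sketch_reserved_pages;

int sketch_reserve_page(unsigned long pfn)
{
	if (pfn >= SKETCH_NR_FRAMES)
		return -EINVAL;	/* outside of memory */
	if (sketch_reserved[pfn])
		return -EBUSY;	/* somebody else owns the frame */
	sketch_reserved[pfn] = true;
	sketch_reserved_pages++;
	return 0;
}

void sketch_unreserve_page(unsigned long pfn)
{
	assert(pfn < SKETCH_NR_FRAMES && sketch_reserved[pfn]);
	sketch_reserved[pfn] = false;
	sketch_reserved_pages--;
}

/* Reserve every pfn in pfns[0..n-1], or none of them if any one fails. */
int sketch_reserve_all(const unsigned long *pfns, size_t n)
{
	size_t i;
	int err = 0;

	for (i = 0; i < n; i++) {
		err = sketch_reserve_page(pfns[i]);
		if (err)
			break;
	}
	if (err) {
		/* undo: release only what this call managed to reserve */
		while (i-- > 0)
			sketch_unreserve_page(pfns[i]);
	}
	return err;
}
```

Keeping the counter in step with every reserve and unreserve is what
makes a final consistency check, like the BUG_ON() in pram_reserve(),
meaningful.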
---
arch/x86/kernel/setup.c | 2 +
arch/x86/mm/init_32.c | 4 +
arch/x86/mm/init_64.c | 4 +
include/linux/pram.h | 8 ++
mm/Kconfig | 1 +
mm/pram.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 222 insertions(+)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fae9134..caf1b29 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -69,6 +69,7 @@
#include <linux/crash_dump.h>
#include <linux/tboot.h>
#include <linux/jiffies.h>
+#include <linux/pram.h>
#include <video/edid.h>
@@ -1127,6 +1128,7 @@ void __init setup_arch(char **cmdline_p)
acpi_initrd_override((void *)initrd_start, initrd_end - initrd_start);
#endif
+ pram_reserve();
reserve_crashkernel();
vsmp_init();
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 2d19001..da38426 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -31,6 +31,7 @@
#include <linux/initrd.h>
#include <linux/cpumask.h>
#include <linux/gfp.h>
+#include <linux/pram.h>
#include <asm/asm.h>
#include <asm/bios_ebda.h>
@@ -779,6 +780,9 @@ void __init mem_init(void)
after_bootmem = 1;
+ totalram_pages += pram_reserved_pages;
+ reservedpages -= pram_reserved_pages;
+
codesize = (unsigned long) &_etext - (unsigned long) &_text;
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 474e28f..8aa4bc4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -32,6 +32,7 @@
#include <linux/memory_hotplug.h>
#include <linux/nmi.h>
#include <linux/gfp.h>
+#include <linux/pram.h>
#include <asm/processor.h>
#include <asm/bios_ebda.h>
@@ -1077,6 +1078,9 @@ void __init mem_init(void)
reservedpages = max_pfn - totalram_pages - absent_pages;
after_bootmem = 1;
+ totalram_pages += pram_reserved_pages;
+ reservedpages -= pram_reserved_pages;
+
codesize = (unsigned long) &_etext - (unsigned long) &_text;
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
diff --git a/include/linux/pram.h b/include/linux/pram.h
index 61c536c..b7f2799 100644
--- a/include/linux/pram.h
+++ b/include/linux/pram.h
@@ -47,4 +47,12 @@ extern ssize_t pram_write(struct pram_stream *ps,
const void *buf, size_t count);
extern size_t pram_read(struct pram_stream *ps, void *buf, size_t count);
+#ifdef CONFIG_PRAM
+extern unsigned long pram_reserved_pages;
+extern void pram_reserve(void);
+#else
+#define pram_reserved_pages 0UL
+static inline void pram_reserve(void) { }
+#endif
+
#endif /* _LINUX_PRAM_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 46337e8..f1e11a0 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -474,6 +474,7 @@ config FRONTSWAP
config PRAM
bool "Persistent over-kexec memory storage"
+ depends on X86
default n
help
This option adds the kernel API that enables saving memory pages of
diff --git a/mm/pram.c b/mm/pram.c
index 58ae9ed..380735f 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -1,3 +1,4 @@
+#include <linux/bootmem.h>
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/highmem.h>
@@ -5,6 +6,7 @@
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/list.h>
+#include <linux/memblock.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/mutex.h>
@@ -93,6 +95,8 @@ static struct pram_super_block *pram_sb;
static LIST_HEAD(pram_nodes); /* linked through page::lru */
static DEFINE_MUTEX(pram_mutex); /* serializes open/close */
+unsigned long __initdata pram_reserved_pages;
+
/*
* The PRAM super block pfn, see above.
*/
@@ -102,6 +106,196 @@ static int __init parse_pram_sb_pfn(char *arg)
}
early_param("pram", parse_pram_sb_pfn);
+static void * __init pram_map_meta(unsigned long pfn)
+{
+ if (pfn >= max_low_pfn)
+ return ERR_PTR(-EINVAL);
+ return pfn_to_kaddr(pfn);
+}
+
+static int __init pram_reserve_page(unsigned long pfn)
+{
+ int err = 0;
+ phys_addr_t base, size;
+
+ if (pfn >= max_pfn)
+ return -EINVAL;
+
+ base = PFN_PHYS(pfn);
+ size = PAGE_SIZE;
+
+#ifdef CONFIG_NO_BOOTMEM
+ if (memblock_is_region_reserved(base, size) ||
+ memblock_reserve(base, size) < 0)
+ err = -EBUSY;
+#else
+ err = reserve_bootmem(base, size, BOOTMEM_EXCLUSIVE);
+#endif
+ if (!err)
+ pram_reserved_pages++;
+ return err;
+}
+
+static void __init pram_unreserve_page(unsigned long pfn)
+{
+ free_bootmem(PFN_PHYS(pfn), PAGE_SIZE);
+ pram_reserved_pages--;
+}
+
+static int __init pram_reserve_link(struct pram_link *link)
+{
+ int i;
+ int err = 0;
+
+ for (i = 0; i < PRAM_LINK_ENTRIES_MAX; i++) {
+ struct pram_entry *p = &link->entry[i];
+ if (!p->pfn)
+ break;
+ err = pram_reserve_page(p->pfn);
+ if (err)
+ break;
+ p->flags &= ~PRAM_PAGE_LRU;
+ }
+ if (err) {
+ /* undo */
+ while (--i >= 0)
+ pram_unreserve_page(link->entry[i].pfn);
+ }
+ return err;
+}
+
+static void __init pram_unreserve_link(struct pram_link *link)
+{
+ int i;
+
+ for (i = 0; i < PRAM_LINK_ENTRIES_MAX; i++) {
+ unsigned long pfn = link->entry[i].pfn;
+ if (!pfn)
+ break;
+ pram_unreserve_page(pfn);
+ }
+}
+
+static int __init pram_reserve_node(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+ int err = 0;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ err = pram_reserve_page(link_pfn);
+ if (err)
+ break;
+ link = pram_map_meta(link_pfn);
+ if (IS_ERR(link)) {
+ pram_unreserve_page(link_pfn);
+ err = PTR_ERR(link);
+ break;
+ }
+ err = pram_reserve_link(link);
+ if (err) {
+ pram_unreserve_page(link_pfn);
+ break;
+ }
+ link_pfn = link->link_pfn;
+ }
+ if (err) {
+ /* undo */
+ unsigned long bad_pfn = link_pfn;
+ link_pfn = node->link_pfn;
+ while (link_pfn != bad_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ pram_unreserve_link(link);
+ pram_unreserve_page(link_pfn);
+ link_pfn = link->link_pfn;
+ }
+ }
+ return err;
+}
+
+static void __init pram_unreserve_node(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ pram_unreserve_link(link);
+ pram_unreserve_page(link_pfn);
+ link_pfn = link->link_pfn;
+ }
+}
+
+/*
+ * Mark pages that belong to persistent memory reserved.
+ *
+ * This function should be called at boot time as early as possible to prevent
+ * persistent memory from being recycled.
+ */
+void __init pram_reserve(void)
+{
+ unsigned long node_pfn;
+ struct pram_node *node;
+ int err = 0;
+
+ if (!pram_sb_pfn)
+ return;
+
+ pr_info("PRAM: Examining persistent memory...\n");
+
+ err = pram_reserve_page(pram_sb_pfn);
+ if (err)
+ goto out;
+ pram_sb = pram_map_meta(pram_sb_pfn);
+ if (IS_ERR(pram_sb)) {
+ pram_unreserve_page(pram_sb_pfn);
+ err = PTR_ERR(pram_sb);
+ goto out;
+ }
+
+ node_pfn = pram_sb->node_pfn;
+ while (node_pfn) {
+ err = pram_reserve_page(node_pfn);
+ if (err)
+ break;
+ node = pram_map_meta(node_pfn);
+ if (IS_ERR(node)) {
+ pram_unreserve_page(node_pfn);
+ err = PTR_ERR(node);
+ break;
+ }
+ err = pram_reserve_node(node);
+ if (err) {
+ pram_unreserve_page(node_pfn);
+ break;
+ }
+ node_pfn = node->node_pfn;
+ }
+
+ if (err) {
+ /* undo */
+ unsigned long bad_pfn = node_pfn;
+ node_pfn = pram_sb->node_pfn;
+ while (node_pfn != bad_pfn) {
+ node = pfn_to_kaddr(node_pfn);
+ pram_unreserve_node(node);
+ pram_unreserve_page(node_pfn);
+ node_pfn = node->node_pfn;
+ }
+ pram_unreserve_page(pram_sb_pfn);
+ }
+
+out:
+ if (err) {
+ BUG_ON(pram_reserved_pages > 0);
+ pr_err("PRAM: Reservation failed: %d\n", err);
+ pram_sb = NULL;
+ } else
+ pr_info("PRAM: %lu pages reserved\n", pram_reserved_pages);
+}
+
static inline struct page *pram_alloc_page(gfp_t gfp_mask)
{
return alloc_page(gfp_mask);
@@ -109,6 +303,9 @@ static inline struct page *pram_alloc_page(gfp_t gfp_mask)
static inline void pram_free_page(void *addr)
{
+ /* since early reservations are used for preserving persistent
+ * memory, the page may have the reserved bit set */
+ ClearPageReserved(virt_to_page(addr));
free_page((unsigned long)addr);
}
@@ -146,6 +343,9 @@ static void pram_truncate_link(struct pram_link *link)
if (!pfn)
continue;
page = pfn_to_page(pfn);
+ /* since early reservations are used for preserving persistent
+ * memory, the page may have the reserved bit set */
+ ClearPageReserved(page);
put_page(page);
}
}
@@ -426,6 +626,9 @@ static struct page *__pram_load_page(struct pram_stream *ps, int *flags)
entry = &link->entry[ps->page_index];
if (entry->pfn) {
page = pfn_to_page(entry->pfn);
+ /* since early reservations are used for preserving persistent
+ * memory, the page may have the reserved bit set */
+ ClearPageReserved(page);
if (flags)
*flags = entry->flags;
} else
--
1.7.10.4
* [PATCH RFC 08/13] mm: PRAM: checksum saved data
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Checksum PRAM data pages with crc32c and PRAM metadata pages with crc32
to detect corruption of persistent memory across reboot.
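The metadata check can be illustrated with a userspace sketch. It
assumes a page-sized structure that starts with a magic word and a
checksum, and it covers everything after those 8 bytes, as
pram_meta_csum() does. sketch_crc32() is a plain bitwise CRC-32 written
for this example; it is not meant to reproduce the exact values of the
kernel's crc32() or crc32c() helpers, and struct sketch_meta,
sketch_seal() and sketch_verify() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAGIC 0x7072616Du	/* same value as PRAM_MAGIC */

/* Plain bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
uint32_t sketch_crc32(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical metadata page: magic, csum, then the payload. */
struct sketch_meta {
	uint32_t magic;
	uint32_t csum;
	unsigned char payload[SKETCH_PAGE_SIZE - 8];
};

/* Stamp the page before reboot; magic and csum are not checksummed. */
void sketch_seal(struct sketch_meta *m)
{
	m->magic = SKETCH_MAGIC;
	m->csum = sketch_crc32(m->payload, sizeof(m->payload));
}

/* Returns 1 if the page looks intact, 0 otherwise. */
int sketch_verify(const struct sketch_meta *m)
{
	return m->magic == SKETCH_MAGIC &&
	       m->csum == sketch_crc32(m->payload, sizeof(m->payload));
}
```

Checking the magic first rejects a page that was never sealed without
having to trust its checksum field.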
---
mm/Kconfig | 4 ++
mm/pram.c | 128 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 130 insertions(+), 2 deletions(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index f1e11a0..0a4d4c6 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -475,6 +475,10 @@ config FRONTSWAP
config PRAM
bool "Persistent over-kexec memory storage"
depends on X86
+ select CRC32
+ select LIBCRC32C
+ select CRYPTO_CRC32C
+ select CRYPTO_CRC32C_INTEL
default n
help
This option adds the kernel API that enables saving memory pages of
diff --git a/mm/pram.c b/mm/pram.c
index 380735f..8a66a86 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -1,4 +1,6 @@
#include <linux/bootmem.h>
+#include <linux/crc32.h>
+#include <linux/crc32c.h>
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/highmem.h>
@@ -19,11 +21,14 @@
#include <linux/sysfs.h>
#include <linux/types.h>
+#define PRAM_MAGIC 0x7072616D
+
/*
* Represents a reference to a data page saved to PRAM.
*/
struct pram_entry {
__u32 flags; /* see PRAM_PAGE_* flags */
+ __u32 csum; /* the page csum */
__u64 pfn; /* the page frame number */
};
@@ -32,6 +37,9 @@ struct pram_entry {
* The structure occupies a memory page.
*/
struct pram_link {
+ __u32 magic;
+ __u32 csum;
+
__u64 link_pfn; /* points to the next link of the node */
/* the array occupies the rest of the link page; if the link is not
@@ -57,6 +65,9 @@ struct pram_link {
* The structure occupies a memory page.
*/
struct pram_node {
+ __u32 magic;
+ __u32 csum;
+
__u32 flags; /* see PRAM_* flags below */
__u32 type; /* data type, see enum pram_stream_type */
__u64 data_len; /* data size, only for byte streams */
@@ -81,6 +92,9 @@ struct pram_node {
* The structure occupies a memory page.
*/
struct pram_super_block {
+ __u32 magic;
+ __u32 csum;
+
__u64 node_pfn; /* points to the first element of
* the node list */
};
@@ -106,11 +120,34 @@ static int __init parse_pram_sb_pfn(char *arg)
}
early_param("pram", parse_pram_sb_pfn);
+static u32 pram_data_csum(struct page *page)
+{
+ u32 ret;
+ void *addr;
+
+ addr = kmap_atomic(page);
+ ret = crc32c(0, addr, PAGE_SIZE);
+ kunmap_atomic(addr);
+ return ret;
+}
+
+/* SSE-4.2 crc32c is faster than crc32, but it is not available at early boot */
+static inline u32 pram_meta_csum(void *addr)
+{
+ /* skip magic and csum fields */
+ return crc32(0, addr + 8, PAGE_SIZE - 8);
+}
+
static void * __init pram_map_meta(unsigned long pfn)
{
+ __u32 *p;
+
if (pfn >= max_low_pfn)
return ERR_PTR(-EINVAL);
- return pfn_to_kaddr(pfn);
+ p = pfn_to_kaddr(pfn);
+ if (p[0] != PRAM_MAGIC || p[1] != pram_meta_csum(p))
+ return ERR_PTR(-EINVAL);
+ return p;
}
static int __init pram_reserve_page(unsigned long pfn)
@@ -332,6 +369,65 @@ static struct pram_node *pram_find_node(const char *name)
return NULL;
}
+static void pram_csum_link(struct pram_link *link)
+{
+ int i;
+ struct pram_entry *entry;
+
+ for (i = 0; i < PRAM_LINK_ENTRIES_MAX; i++) {
+ entry = &link->entry[i];
+ if (entry->pfn)
+ entry->csum = pram_data_csum(pfn_to_page(entry->pfn));
+ }
+}
+
+static void pram_csum_node(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ pram_csum_link(link);
+ link_pfn = link->link_pfn;
+ cond_resched();
+ }
+}
+
+static int pram_check_link(struct pram_link *link)
+{
+ int i;
+ struct pram_entry *entry;
+
+ for (i = 0; i < PRAM_LINK_ENTRIES_MAX; i++) {
+ entry = &link->entry[i];
+ if (!entry->pfn)
+ break;
+ if (entry->csum != pram_data_csum(pfn_to_page(entry->pfn)))
+ return -EFAULT;
+ }
+ return 0;
+}
+
+static int pram_check_node(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+ int ret = 0;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ ret = pram_check_link(link);
+ if (ret)
+ break;
+ link_pfn = link->link_pfn;
+ cond_resched();
+ }
+ return ret;
+}
+
static void pram_truncate_link(struct pram_link *link)
{
int i;
@@ -449,6 +545,7 @@ void pram_finish_save(struct pram_stream *ps)
BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
+ pram_csum_node(node);
smp_wmb();
node->flags &= ~PRAM_ACCMODE_MASK;
}
@@ -488,6 +585,7 @@ void pram_discard_save(struct pram_stream *ps)
* %ENOENT: node with specified name does not exist
* %EBUSY: save to required node has not finished yet
* %EPERM: specified type conflicts with type of required node
+ * %EFAULT: node corrupted
*
* After the load has finished, pram_finish_load() is to be called.
*/
@@ -520,6 +618,13 @@ out_unlock:
if (err)
return err;
+ err = pram_check_node(node);
+ if (err) {
+ pram_truncate_node(node);
+ pram_free_page(node);
+ return err;
+ }
+
node->flags |= PRAM_LOAD;
pram_stream_init(ps, node, 0);
return 0;
@@ -775,8 +880,24 @@ size_t pram_read(struct pram_stream *ps, void *buf, size_t count)
return read_count;
}
+static void pram_csum_node_meta(struct pram_node *node)
+{
+ unsigned long link_pfn;
+ struct pram_link *link;
+
+ link_pfn = node->link_pfn;
+ while (link_pfn) {
+ link = pfn_to_kaddr(link_pfn);
+ link->magic = PRAM_MAGIC;
+ link->csum = pram_meta_csum(link);
+ link_pfn = link->link_pfn;
+ }
+ node->magic = PRAM_MAGIC;
+ node->csum = pram_meta_csum(node);
+}
+
/*
- * Build the list of PRAM nodes.
+ * Build the list of PRAM nodes and update metadata csums.
*/
static void __pram_reboot(void)
{
@@ -789,9 +910,12 @@ static void __pram_reboot(void)
if (WARN_ON(node->flags & PRAM_ACCMODE_MASK))
continue;
node->node_pfn = node_pfn;
+ pram_csum_node_meta(node);
node_pfn = page_to_pfn(page);
}
pram_sb->node_pfn = node_pfn;
+ pram_sb->magic = PRAM_MAGIC;
+ pram_sb->csum = pram_meta_csum(pram_sb);
}
static int pram_reboot(struct notifier_block *notifier,
--
1.7.10.4
* [PATCH RFC 09/13] mm: PRAM: ban pages that have been reserved at boot time
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
` (7 preceding siblings ...)
2013-07-01 11:57 ` [PATCH RFC 08/13] mm: PRAM: checksum saved data Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 10/13] mm: PRAM: allow to ban arbitrary memory ranges Vladimir Davydov
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Obviously, not all memory ranges can be used for saving persistent
over-kexec data, because some of them are reserved by the system core
and various device drivers at boot time. If a memory range used for
initialization of a particular device turns out to be busy because PRAM
uses it for storing its data, the device driver initialization stage or
even the whole system boot sequence may fail.
As a workaround the current implementation uses a rather dirty hack. It
tracks all memory regions that have ever been reserved during the boot
sequence and avoids using pages belonging to those regions for storing
persistent data. Since the device configuration cannot change during
kexec and the newly booted kernel is likely to have a similar boot-time
device driver set, this hack should work in most cases.
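
The bookkeeping described above can be modelled in userspace as a small
sorted table of inclusive pfn ranges. The sketch below is an illustration,
not the code from this patch: the names (region_table, table_ban,
table_is_banned) and the capacity are hypothetical, it scans in ascending
order whereas the patch scans from the top, and it assumes pfns stay well
below ULONG_MAX so that "end + 1" cannot wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGIONS 8	/* illustrative capacity, not the patch's MAX_NR_BANNED */

struct region {
	unsigned long start, end;	/* pfn range, inclusive */
};

struct region_table {
	struct region r[MAX_REGIONS];	/* ascending, never overlapping or touching */
	size_t nr;
};

/* Ban [start, end]; returns false only if the table is full. */
static bool table_ban(struct region_table *t,
		      unsigned long start, unsigned long end)
{
	size_t lo = 0, hi;

	assert(start <= end);

	/* skip regions that lie below the new one and do not touch it */
	while (lo < t->nr && t->r[lo].end + 1 < start)
		lo++;

	/* absorb every region that overlaps or touches [start, end] */
	hi = lo;
	while (hi < t->nr && t->r[hi].start <= end + 1) {
		if (t->r[hi].start < start)
			start = t->r[hi].start;
		if (t->r[hi].end > end)
			end = t->r[hi].end;
		hi++;
	}

	if (hi == lo) {
		/* nothing absorbed: open a slot at position lo */
		if (t->nr == MAX_REGIONS)
			return false;
		memmove(&t->r[lo + 1], &t->r[lo],
			(t->nr - lo) * sizeof(t->r[0]));
		t->nr++;
	} else {
		/* hi - lo regions collapse into the single slot lo */
		memmove(&t->r[lo + 1], &t->r[hi],
			(t->nr - hi) * sizeof(t->r[0]));
		t->nr -= hi - lo - 1;
	}
	t->r[lo].start = start;
	t->r[lo].end = end;
	return true;
}

/* Binary search over the sorted table. */
static bool table_is_banned(const struct region_table *t, unsigned long pfn)
{
	size_t l = 0, r = t->nr;	/* half-open search window [l, r) */

	while (l < r) {
		size_t m = l + (r - l) / 2;

		if (pfn < t->r[m].start)
			r = m;
		else if (pfn > t->r[m].end)
			l = m + 1;
		else
			return true;
	}
	return false;
}
```

Keeping the table sorted and merged on insertion is what makes the lookup
a plain binary search; the cost is a memmove() per ban, which is cheap for
a table of a few dozen entries.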
---
arch/x86/mm/init_32.c | 1 +
arch/x86/mm/init_64.c | 1 +
include/linux/pram.h | 4 +
mm/bootmem.c | 4 +
mm/memblock.c | 7 +-
mm/pram.c | 211 ++++++++++++++++++++++++++++++++++++++++++++++++-
6 files changed, 225 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index da38426..67b963a 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -782,6 +782,7 @@ void __init mem_init(void)
totalram_pages += pram_reserved_pages;
reservedpages -= pram_reserved_pages;
+ pram_show_banned();
codesize = (unsigned long) &_etext - (unsigned long) &_text;
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 8aa4bc4..fbe3e17 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1080,6 +1080,7 @@ void __init mem_init(void)
totalram_pages += pram_reserved_pages;
reservedpages -= pram_reserved_pages;
+ pram_show_banned();
codesize = (unsigned long) &_etext - (unsigned long) &_text;
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
diff --git a/include/linux/pram.h b/include/linux/pram.h
index b7f2799..d4f23e3 100644
--- a/include/linux/pram.h
+++ b/include/linux/pram.h
@@ -50,9 +50,13 @@ extern size_t pram_read(struct pram_stream *ps, void *buf, size_t count);
#ifdef CONFIG_PRAM
extern unsigned long pram_reserved_pages;
extern void pram_reserve(void);
+extern void pram_ban_region(unsigned long start, unsigned long end);
+extern void pram_show_banned(void);
#else
#define pram_reserved_pages 0UL
static inline void pram_reserve(void) { }
+static inline void pram_ban_region(unsigned long start, unsigned long end) { }
+static inline void pram_show_banned(void) { }
#endif
#endif /* _LINUX_PRAM_H */
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 2b0bcb0..34d0b42 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -16,6 +16,7 @@
#include <linux/kmemleak.h>
#include <linux/range.h>
#include <linux/memblock.h>
+#include <linux/pram.h>
#include <asm/bug.h>
#include <asm/io.h>
@@ -328,6 +329,9 @@ static int __init __reserve(bootmem_data_t *bdata, unsigned long sidx,
bdebug("silent double reserve of PFN %lx\n",
idx + bdata->node_min_pfn);
}
+
+ pram_ban_region(sidx + bdata->node_min_pfn,
+ eidx + bdata->node_min_pfn - 1);
return 0;
}
diff --git a/mm/memblock.c b/mm/memblock.c
index b8d9147..d2c248e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -19,6 +19,7 @@
#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include <linux/memblock.h>
+#include <linux/pram.h>
static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
@@ -553,13 +554,17 @@ int __init_memblock memblock_free(phys_addr_t base, phys_addr_t size)
int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)
{
struct memblock_type *_rgn = &memblock.reserved;
+ int err;
memblock_dbg("memblock_reserve: [%#016llx-%#016llx] %pF\n",
(unsigned long long)base,
(unsigned long long)base + size,
(void *)_RET_IP_);
- return memblock_add_region(_rgn, base, size, MAX_NUMNODES);
+ err = memblock_add_region(_rgn, base, size, MAX_NUMNODES);
+ if (!err)
+ pram_ban_region(PFN_DOWN(base), PFN_UP(base + size) - 1);
+ return err;
}
/**
diff --git a/mm/pram.c b/mm/pram.c
index 8a66a86..969ff3f 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -17,6 +17,7 @@
#include <linux/pram.h>
#include <linux/reboot.h>
#include <linux/sched.h>
+#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/sysfs.h>
#include <linux/types.h>
@@ -110,6 +111,47 @@ static LIST_HEAD(pram_nodes); /* linked through page::lru */
static DEFINE_MUTEX(pram_mutex); /* serializes open/close */
unsigned long __initdata pram_reserved_pages;
+static bool __meminitdata pram_reservation_in_progress;
+
+/*
+ * Obviously, not all memory ranges can be used for saving persistent
+ * over-kexec data, because some of them are reserved by the system core and
+ * various device drivers at boot time. If a memory range used for
+ * initialization of a particular device turns out to be busy because PRAM uses
+ * it for storing its data, the device driver initialization stage or even the
+ * whole system boot sequence may fail.
+ *
+ * As a workaround the current implementation uses a rather dirty hack. It
+ * tracks all memory regions that have ever been reserved during the boot
+ * sequence and avoids using pages belonging to those regions for storing
+ * persistent data. Since the device configuration cannot change during kexec
+ * and the newly booted kernel is likely to have a similar boot-time device
+ * driver set, this hack should work in most cases.
+ */
+
+/*
+ * Represents a region of memory that PRAM is not allowed to use.
+ */
+struct banned_region {
+ unsigned long start, end; /* pfn, inclusive */
+};
+
+#define MAX_NR_BANNED (32 + MAX_NUMNODES * 2)
+
+static unsigned int nr_banned; /* number of banned regions */
+
+/* banned regions; arranged in ascending order, do not overlap */
+static struct banned_region banned[MAX_NR_BANNED];
+
+/*
+ * If a page allocated for PRAM needs turns out to belong to a banned region,
+ * it is placed to the banned_pages list for next allocation attempts not to
+ * encounter it all over again. The list is shrunk when the system memory is
+ * low.
+ */
+static LIST_HEAD(banned_pages); /* linked through page::lru */
+static DEFINE_SPINLOCK(banned_pages_lock);
+static unsigned long nr_banned_pages;
/*
* The PRAM super block pfn, see above.
@@ -281,6 +323,7 @@ void __init pram_reserve(void)
return;
pr_info("PRAM: Examining persistent memory...\n");
+ pram_reservation_in_progress = true;
err = pram_reserve_page(pram_sb_pfn);
if (err)
@@ -325,6 +368,7 @@ void __init pram_reserve(void)
}
out:
+ pram_reservation_in_progress = false;
if (err) {
BUG_ON(pram_reserved_pages > 0);
pr_err("PRAM: Reservation failed: %d\n", err);
@@ -333,9 +377,114 @@ out:
pr_info("PRAM: %lu pages reserved\n", pram_reserved_pages);
}
+/*
+ * Bans pfn range [start..end] (inclusive) for PRAM.
+ */
+void __meminit pram_ban_region(unsigned long start, unsigned long end)
+{
+ int i, merged = -1;
+
+ if (pram_reservation_in_progress)
+ return;
+
+ /* first try to merge the region with an existing one */
+ for (i = nr_banned - 1; i >= 0 && start <= banned[i].end + 1; i--) {
+ if (end + 1 >= banned[i].start) {
+ start = min(banned[i].start, start);
+ end = max(banned[i].end, end);
+ if (merged < 0)
+ merged = i;
+ } else
+ /* regions are arranged in ascending order and do not
+ * intersect so the merged region cannot jump over its
+ * predecessors */
+ BUG_ON(merged >= 0);
+ }
+
+ i++;
+
+ if (merged >= 0) {
+ banned[i].start = start;
+ banned[i].end = end;
+ /* shift if merged with more than one region */
+ memmove(banned + i + 1, banned + merged + 1,
+ sizeof(*banned) * (nr_banned - merged - 1));
+ nr_banned -= merged - i;
+ return;
+ }
+
+ /* the region does not intersect with anyone existing,
+ * try to create a new one */
+ if (nr_banned == MAX_NR_BANNED) {
+ pr_err("PRAM: Failed to ban %lu-%lu: "
+ "Too many banned regions\n", start, end);
+ return;
+ }
+
+ memmove(banned + i + 1, banned + i,
+ sizeof(*banned) * (nr_banned - i));
+ banned[i].start = start;
+ banned[i].end = end;
+ nr_banned++;
+}
+
+void __init pram_show_banned(void)
+{
+ int i;
+ unsigned long n, total = 0;
+
+ pr_info("PRAM: banned regions:\n");
+ for (i = 0; i < nr_banned; i++) {
+ n = banned[i].end - banned[i].start + 1;
+ pr_info("%4d: [%08lx - %08lx] %ld pages\n",
+ i, banned[i].start, banned[i].end, n);
+ total += n;
+ }
+ pr_info("Total banned: %ld pages in %d regions\n",
+ total, nr_banned);
+}
+
+/*
+ * Returns true if the page may not be used for storing persistent data.
+ */
+static bool pram_page_banned(struct page *page)
+{
+ unsigned long pfn = page_to_pfn(page);
+ int l = 0, r = nr_banned - 1, m;
+
+ /* do binary search */
+ while (l <= r) {
+ m = (l + r) / 2;
+ if (pfn < banned[m].start)
+ r = m - 1;
+ else if (pfn > banned[m].end)
+ l = m + 1;
+ else
+ return true;
+ }
+ return false;
+}
+
static inline struct page *pram_alloc_page(gfp_t gfp_mask)
{
- return alloc_page(gfp_mask);
+ struct page *page;
+ LIST_HEAD(list);
+ unsigned long len = 0;
+
+ page = alloc_page(gfp_mask);
+ gfp_mask |= __GFP_COLD;
+ while (page && pram_page_banned(page)) {
+ len++;
+ list_add(&page->lru, &list);
+ page = alloc_page(gfp_mask);
+ }
+ if (len > 0) {
+ spin_lock(&banned_pages_lock);
+ nr_banned_pages += len;
+ list_splice(&list, &banned_pages);
+ spin_unlock(&banned_pages_lock);
+ }
+ return page;
}
static inline void pram_free_page(void *addr)
@@ -346,6 +495,46 @@ static inline void pram_free_page(void *addr)
free_page((unsigned long)addr);
}
+static void __banned_pages_shrink(unsigned long nr_to_scan)
+{
+ struct page *page;
+
+ if (nr_to_scan <= 0)
+ return;
+
+ while (nr_banned_pages > 0) {
+ BUG_ON(list_empty(&banned_pages));
+ page = list_first_entry(&banned_pages, struct page, lru);
+ list_del(&page->lru);
+ __free_page(page);
+ nr_banned_pages--;
+ nr_to_scan--;
+ if (!nr_to_scan)
+ break;
+ }
+}
+
+static int banned_pages_shrink(struct shrinker *shrink,
+ struct shrink_control *sc)
+{
+ int nr_left = nr_banned_pages;
+
+ if (!sc->nr_to_scan || !nr_left)
+ return nr_left;
+
+ spin_lock(&banned_pages_lock);
+ __banned_pages_shrink(sc->nr_to_scan);
+ nr_left = nr_banned_pages;
+ spin_unlock(&banned_pages_lock);
+
+ return nr_left;
+}
+
+static struct shrinker banned_pages_shrinker = {
+ .shrink = banned_pages_shrink,
+ .seeks = DEFAULT_SEEKS,
+};
+
static inline void pram_insert_node(struct pram_node *node)
{
list_add(&virt_to_page(node)->lru, &pram_nodes);
@@ -650,6 +839,7 @@ void pram_finish_load(struct pram_stream *ps)
/*
* Insert page to PRAM node allocating a new PRAM link if necessary.
+ * It is up to the caller to assert that the page is not banned.
*/
static int __pram_save_page(struct pram_stream *ps,
struct page *page, int flags)
@@ -703,13 +893,28 @@ static int __pram_save_page(struct pram_stream *ps,
int pram_save_page(struct pram_stream *ps, struct page *page, int flags)
{
struct pram_node *node = ps->node;
+ struct page *new = NULL;
+ int err;
BUG_ON(node->type != PRAM_PAGE_STREAM);
BUG_ON((node->flags & PRAM_ACCMODE_MASK) != PRAM_SAVE);
BUG_ON(PageCompound(page));
- return __pram_save_page(ps, page, flags);
+ /* if page is banned, relocate it */
+ if (pram_page_banned(page)) {
+ new = pram_alloc_page(ps->gfp_mask);
+ if (!new)
+ return -ENOMEM;
+ copy_highpage(new, page);
+ page = new;
+ flags &= ~PRAM_PAGE_LRU;
+ }
+
+ err = __pram_save_page(ps, page, flags);
+ if (new)
+ put_page(new);
+ return err;
}
/*
@@ -963,6 +1168,7 @@ static int __init pram_init_sb(void)
page = pram_alloc_page(GFP_KERNEL | __GFP_ZERO);
if (!page) {
pr_err("PRAM: Failed to allocate super block\n");
+ __banned_pages_shrink(ULONG_MAX);
return 0;
}
pram_sb = page_address(page);
@@ -983,6 +1189,7 @@ static int __init pram_init(void)
{
if (pram_init_sb()) {
register_reboot_notifier(&pram_reboot_notifier);
+ register_shrinker(&banned_pages_shrinker);
sysfs_update_group(kernel_kobj, &pram_attr_group);
}
return 0;
--
1.7.10.4
* [PATCH RFC 10/13] mm: PRAM: allow to ban arbitrary memory ranges
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
` (8 preceding siblings ...)
2013-07-01 11:57 ` [PATCH RFC 09/13] mm: PRAM: ban pages that have been reserved at boot time Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 11/13] mm: PRAM: allow to free persistent memory from userspace Vladimir Davydov
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
Banning the memory ranges that were reserved at boot time is not enough
to avoid all conflicts, because kexec may load the new kernel code into
an address range that has never been reserved, possibly overwriting
persistent data.
Fortunately, it is possible to specify a memory range kexec will load
the new kernel code into. Thus, to avoid kexec-vs-PRAM conflicts, it is
enough to disallow for PRAM some memory range large enough to load the
new kernel and make kexec load the new kernel code into that range.
For that purpose, this patch adds the ability to specify arbitrary banned
ranges using the 'pram_banned' boot option.
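
As a rough userspace illustration of the option format, the sketch below
parses a comma separated list of "start-end" byte ranges and converts each
one to an inclusive pfn range. Everything in it is an assumption made for
the example: parse_size() is a hypothetical stand-in for the kernel's
memparse(), the names pfn_range and parse_banned are invented, and 4 KiB
pages are assumed. Note that the sketch advances the cursor past each end
value before it looks for a comma.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

struct pfn_range {
	unsigned long long start, end;	/* inclusive */
};

/* Parse a number with an optional K/M/G suffix; *endp is set past it. */
static unsigned long long parse_size(const char *s, const char **endp)
{
	char *e;
	unsigned long long v;

	if (!isdigit((unsigned char)*s)) {	/* reject signs and blanks */
		*endp = s;
		return 0;
	}
	v = strtoull(s, &e, 0);
	switch (*e) {
	case 'G': case 'g':
		v <<= 30; e++; break;
	case 'M': case 'm':
		v <<= 20; e++; break;
	case 'K': case 'k':
		v <<= 10; e++; break;
	default:
		break;
	}
	*endp = e;
	return v;
}

/* Returns the number of ranges stored in out[], or -1 on a malformed list. */
static int parse_banned(const char *arg, struct pfn_range *out, size_t max)
{
	const char *cur = arg, *tmp;
	size_t n = 0;

	for (;;) {
		unsigned long long start, end;

		start = parse_size(cur, &tmp);
		if (tmp == cur || *tmp != '-')
			return -1;
		cur = tmp + 1;
		end = parse_size(cur, &tmp);
		if (tmp == cur || end <= start || n == max)
			return -1;
		cur = tmp;	/* step past the end value */

		/* round start down and end up, then make end inclusive */
		out[n].start = start >> SKETCH_PAGE_SHIFT;
		out[n].end = ((end + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT) - 1;
		n++;

		if (*cur != ',')
			break;
		cur++;
	}
	return *cur == '\0' ? (int)n : -1;
}
```

With these assumptions, a command line fragment such as
pram_banned=16M-32M would ban pfns 0x1000 through 0x1fff.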
---
mm/pram.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/mm/pram.c b/mm/pram.c
index 969ff3f..3ad769b 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -127,6 +127,15 @@ static bool __meminitdata pram_reservation_in_progress;
* persistent data. Since the device configuration cannot change during kexec
* and the newly booted kernel is likely to have a similar boot-time device
* driver set, this hack should work in most cases.
+ *
+ * This solution has one exception. The point is that kexec may load the new
+ * kernel code to some address range that has never been reserved and thus
+ * banned for PRAM by the current kernel possibly overwriting persistent data.
+ * Fortunately, it is possible to specify an exact range kexec will load the
+ * new kernel code into. Thus, to avoid kexec-vs-PRAM conflicts, one should
+ * disallow for PRAM some memory range large enough to load the new kernel (see
+ * the 'pram_banned' boot param) and make kexec load the new kernel code into
+ * that range.
*/
/*
@@ -378,6 +387,42 @@ out:
}
/*
+ * A comma separated list of memory regions that PRAM is not allowed to use.
+ */
+static int __init parse_pram_banned(char *arg)
+{
+ char *cur = arg, *tmp;
+ unsigned long long start, end;
+
+ do {
+ start = memparse(cur, &tmp);
+ if (cur == tmp) {
+ pr_warning("pram_banned: Memory value expected\n");
+ return -EINVAL;
+ }
+ cur = tmp;
+ if (*cur != '-') {
+ pr_warning("pram_banned: '-' expected\n");
+ return -EINVAL;
+ }
+ cur++;
+ end = memparse(cur, &tmp);
+ if (cur == tmp) {
+ pr_warning("pram_banned: Memory value expected\n");
+ return -EINVAL;
+ }
+ if (end <= start) {
+ pr_warning("pram_banned: end <= start\n");
+ return -EINVAL;
+ }
+ pram_ban_region(PFN_DOWN(start), PFN_UP(end) - 1);
+ } while (*cur++ == ',');
+
+ return 0;
+}
+early_param("pram_banned", parse_pram_banned);
+
+/*
* Bans pfn range [start..end] (inclusive) for PRAM.
*/
void __meminit pram_ban_region(unsigned long start, unsigned long end)
--
1.7.10.4
* [PATCH RFC 11/13] mm: PRAM: allow to free persistent memory from userspace
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
` (9 preceding siblings ...)
2013-07-01 11:57 ` [PATCH RFC 10/13] mm: PRAM: allow to ban arbitrary memory ranges Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 12/13] mm: shmem: introduce shmem_insert_page Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 13/13] mm: shmem: enable saving to PRAM Vladimir Davydov
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
To free all space utilized for persistent memory, one can write 0 to
/sys/kernel/pram. This will destroy all PRAM nodes that are not
currently being read or written.
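
From userspace this amounts to a single write. A minimal sketch in C,
assuming a hypothetical helper name (pram_free_all) and taking the path as
a parameter so that it can be exercised against an ordinary file:

```c
#include <errno.h>
#include <stdio.h>

/*
 * Write "0" to the given file, e.g. "/sys/kernel/pram".
 * Returns 0 on success or a negative errno value.
 */
static int pram_free_all(const char *path)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fputs("0\n", f) == EOF)
		err = -errno;
	/* the write may only reach the kernel on flush, so check fclose() too */
	if (fclose(f) == EOF && !err)
		err = -errno;
	return err;
}
```

Since the store handler in this patch rejects any value other than 0 with
-EINVAL, a caller would typically invoke pram_free_all("/sys/kernel/pram")
and treat a negative return as failure.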
---
mm/pram.c | 39 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 38 insertions(+), 1 deletion(-)
diff --git a/mm/pram.c b/mm/pram.c
index 3ad769b..43ad85f 100644
--- a/mm/pram.c
+++ b/mm/pram.c
@@ -697,6 +697,32 @@ static void pram_truncate_node(struct pram_node *node)
}
+/*
+ * Free all nodes that are not under operation.
+ */
+static void pram_truncate(void)
+{
+ struct page *page, *tmp;
+ struct pram_node *node;
+ LIST_HEAD(dispose);
+
+ mutex_lock(&pram_mutex);
+ list_for_each_entry_safe(page, tmp, &pram_nodes, lru) {
+ node = page_address(page);
+ if (!(node->flags & PRAM_ACCMODE_MASK))
+ list_move(&page->lru, &dispose);
+ }
+ mutex_unlock(&pram_mutex);
+
+ while (!list_empty(&dispose)) {
+ page = list_first_entry(&dispose, struct page, lru);
+ list_del(&page->lru);
+ node = page_address(page);
+ pram_truncate_node(node);
+ pram_free_page(node);
+ }
+}
+
static void pram_stream_init(struct pram_stream *ps,
struct pram_node *node, gfp_t gfp_mask)
{
@@ -1189,8 +1215,19 @@ static ssize_t show_pram_sb_pfn(struct kobject *kobj,
return sprintf(buf, "%lx\n", pfn);
}
+static ssize_t store_pram_sb_pfn(struct kobject *kobj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ int val;
+
+ if (kstrtoint(buf, 0, &val) || val)
+ return -EINVAL;
+ pram_truncate();
+ return count;
+}
+
static struct kobj_attribute pram_sb_pfn_attr =
- __ATTR(pram, 0444, show_pram_sb_pfn, NULL);
+ __ATTR(pram, 0644, show_pram_sb_pfn, store_pram_sb_pfn);
static struct attribute *pram_attrs[] = {
&pram_sb_pfn_attr.attr,
--
1.7.10.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH RFC 12/13] mm: shmem: introduce shmem_insert_page
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
` (10 preceding siblings ...)
2013-07-01 11:57 ` [PATCH RFC 11/13] mm: PRAM: allow to free persistent memory from userspace Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
2013-07-01 11:57 ` [PATCH RFC 13/13] mm: shmem: enable saving to PRAM Vladimir Davydov
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
The function inserts a memory page into a shmem file at an arbitrary
offset. If there is already something at the specified offset (page or
swap), the function fails.
The function will be used by the next patch.
---
include/linux/shmem_fs.h | 3 ++
mm/shmem.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 30aa0dc..da63308 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -62,4 +62,7 @@ static inline struct page *shmem_read_mapping_page(
mapping_gfp_mask(mapping));
}
+extern int shmem_insert_page(struct inode *inode,
+ pgoff_t index, struct page *page, bool on_lru);
+
#endif
diff --git a/mm/shmem.c b/mm/shmem.c
index 1c44af7..71fac31 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -328,6 +328,74 @@ static void shmem_delete_from_page_cache(struct page *page, void *radswap)
BUG_ON(error);
}
+int shmem_insert_page(struct inode *inode,
+ pgoff_t index, struct page *page, bool on_lru)
+{
+ struct address_space *mapping = inode->i_mapping;
+ struct shmem_inode_info *info = SHMEM_I(inode);
+ struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+ gfp_t gfp = mapping_gfp_mask(mapping);
+ int err;
+
+ if (index > (MAX_LFS_FILESIZE >> PAGE_CACHE_SHIFT))
+ return -EFBIG;
+
+ err = -ENOSPC;
+ if (shmem_acct_block(info->flags))
+ goto out;
+ if (sbinfo->max_blocks) {
+ if (percpu_counter_compare(&sbinfo->used_blocks,
+ sbinfo->max_blocks) >= 0)
+ goto out_unacct;
+ percpu_counter_inc(&sbinfo->used_blocks);
+ }
+
+ if (!on_lru) {
+ SetPageSwapBacked(page);
+ __set_page_locked(page);
+ } else
+ lock_page(page);
+
+ err = mem_cgroup_cache_charge(page, current->mm,
+ gfp & GFP_RECLAIM_MASK);
+ if (err)
+ goto out_unlock;
+ err = radix_tree_preload(gfp & GFP_RECLAIM_MASK);
+ if (!err) {
+ err = shmem_add_to_page_cache(page, mapping, index, gfp, NULL);
+ radix_tree_preload_end();
+ }
+ if (err)
+ goto out_uncharge;
+
+ if (!on_lru)
+ lru_cache_add_anon(page);
+
+ spin_lock(&info->lock);
+ info->alloced++;
+ inode->i_blocks += BLOCKS_PER_PAGE;
+ shmem_recalc_inode(inode);
+ spin_unlock(&info->lock);
+
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ set_page_dirty(page);
+
+ unlock_page(page);
+ return 0;
+
+out_uncharge:
+ mem_cgroup_uncharge_cache_page(page);
+out_unlock:
+ unlock_page(page);
+ if (sbinfo->max_blocks)
+ percpu_counter_add(&sbinfo->used_blocks, -1);
+out_unacct:
+ shmem_unacct_blocks(info->flags, 1);
+out:
+ return err;
+}
+
/*
* Like find_get_pages, but collecting swap entries as well as pages.
*/
--
1.7.10.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH RFC 13/13] mm: shmem: enable saving to PRAM
2013-07-01 11:57 [PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage Vladimir Davydov
` (11 preceding siblings ...)
2013-07-01 11:57 ` [PATCH RFC 12/13] mm: shmem: introduce shmem_insert_page Vladimir Davydov
@ 2013-07-01 11:57 ` Vladimir Davydov
12 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2013-07-01 11:57 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, criu, devel, xemul, khorenko
This patch illustrates how the PRAM API can be used to make tmpfs
'persistent'. It adds a 'pram=' option to tmpfs, which specifies the PRAM
node to load the FS tree from and save it to.
If the option is passed on mount, shmem will look for the corresponding
PRAM node and load the FS tree from it. On the subsequent unmount, it
will save the FS tree to that PRAM node.
A typical usage scenario looks like:
# mount -t tmpfs -o pram=mytmpfs none /mnt
# echo something > /mnt/smth
# umount /mnt
<possibly kexec>
# mount -t tmpfs -o pram=mytmpfs none /mnt
# cat /mnt/smth
Each FS tree is saved into two PRAM nodes, one acting as a byte stream
and the other acting as a page stream. The byte stream is used for
saving file metadata (name, permissions, etc.) and data page offsets
while the page stream accommodates file content pages.
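
To make the byte stream layout concrete, here is a userspace sketch of one
file record: a fixed header, the name bytes, one 64-bit page index per
saved page, and a terminating marker. The header fields and the EOF_MARK
value mirror the ones in this patch; the in-memory buffer and the helper
names (bytebuf, put_file, get_file) are hypothetical, and the reader side
is only one plausible way to consume such a record.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EOF_MARK UINT64_MAX	/* same value as ~0ULL */

struct file_header {
	uint32_t mode, uid, gid, namelen;
	uint64_t size, atime, mtime, ctime;
};

/* Stand-in for a PRAM byte stream: a fixed buffer with write/read cursors. */
struct bytebuf {
	unsigned char data[256];
	size_t len, pos;
};

static int buf_write(struct bytebuf *b, const void *src, size_t n)
{
	if (n > sizeof(b->data) - b->len)
		return -1;
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return 0;
}

static int buf_read(struct bytebuf *b, void *dst, size_t n)
{
	if (n > b->len - b->pos)
		return -1;
	memcpy(dst, b->data + b->pos, n);
	b->pos += n;
	return 0;
}

/* Append one record: header, name, page indices, terminator. */
static int put_file(struct bytebuf *b, const struct file_header *hdr,
		    const char *name, const uint64_t *idx, size_t nr)
{
	uint64_t eof = EOF_MARK;
	size_t i;

	if (buf_write(b, hdr, sizeof(*hdr)) ||
	    buf_write(b, name, hdr->namelen))
		return -1;
	for (i = 0; i < nr; i++)
		if (buf_write(b, &idx[i], sizeof(idx[i])))
			return -1;
	return buf_write(b, &eof, sizeof(eof));
}

/* Read one record back; returns the number of page indices, or -1. */
static int get_file(struct bytebuf *b, struct file_header *hdr,
		    char *name, size_t namesz, uint64_t *idx, size_t max)
{
	uint64_t val;
	size_t n = 0;

	if (buf_read(b, hdr, sizeof(*hdr)) || hdr->namelen >= namesz ||
	    buf_read(b, name, hdr->namelen))
		return -1;
	name[hdr->namelen] = '\0';
	for (;;) {
		if (buf_read(b, &val, sizeof(val)))
			return -1;
		if (val == EOF_MARK)
			break;
		if (n == max)
			return -1;
		idx[n++] = val;
	}
	return (int)n;
}
```

Because the page stream preserves order, the n-th index read from the byte
stream pairs with the n-th page loaded from the page stream, so the record
itself never needs to carry page contents.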
The current implementation serves demonstration purposes and is therefore
quite simplified: it supports only regular files in the root directory
without multiple hard links, and it does not save swapped-out files,
aborting if it encounters any. However, it can be extended to fully
support tmpfs.
---
include/linux/shmem_fs.h | 26 ++++
mm/Makefile | 2 +-
mm/shmem.c | 29 +++-
mm/shmem_pram.c | 378 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 430 insertions(+), 5 deletions(-)
create mode 100644 mm/shmem_pram.c
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index da63308..0408421 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -23,6 +23,11 @@ struct shmem_inode_info {
struct inode vfs_inode;
};
+#define SHMEM_PRAM_NAME_MAX 128
+struct shmem_pram_info {
+ char name[SHMEM_PRAM_NAME_MAX];
+};
+
struct shmem_sb_info {
unsigned long max_blocks; /* How many blocks are allowed */
struct percpu_counter used_blocks; /* How many are allocated */
@@ -33,6 +38,7 @@ struct shmem_sb_info {
kgid_t gid; /* Mount gid for root directory */
umode_t mode; /* Mount mode for root directory */
struct mempolicy *mpol; /* default memory policy for mappings */
+ struct shmem_pram_info *pram;
};
static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
@@ -62,7 +68,27 @@ static inline struct page *shmem_read_mapping_page(
mapping_gfp_mask(mapping));
}
+struct pagevec;
+
extern int shmem_insert_page(struct inode *inode,
pgoff_t index, struct page *page, bool on_lru);
+extern unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
+ pgoff_t start, unsigned int nr_pages,
+ struct page **pages, pgoff_t *indices);
+extern void shmem_deswap_pagevec(struct pagevec *pvec);
+
+#ifdef CONFIG_PRAM
+extern int shmem_parse_pram(const char *str, struct shmem_pram_info **pram);
+extern void shmem_show_pram(struct seq_file *seq, struct shmem_pram_info *pram);
+extern void shmem_save_pram(struct super_block *sb);
+extern void shmem_load_pram(struct super_block *sb);
+#else
+static inline int shmem_parse_pram(const char *str,
+ struct shmem_pram_info **pram) { return 1; }
+static inline void shmem_show_pram(struct seq_file *seq,
+ struct shmem_pram_info *pram) { }
+static inline void shmem_save_pram(struct super_block *sb) { }
+static inline void shmem_load_pram(struct super_block *sb) { }
+#endif
#endif
diff --git a/mm/Makefile b/mm/Makefile
index 33ad952..6a8c61d 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -58,4 +58,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
-obj-$(CONFIG_PRAM) += pram.o
+obj-$(CONFIG_PRAM) += pram.o shmem_pram.o
diff --git a/mm/shmem.c b/mm/shmem.c
index 71fac31..2d6b618 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -399,7 +399,7 @@ out:
/*
* Like find_get_pages, but collecting swap entries as well as pages.
*/
-static unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
+unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
pgoff_t start, unsigned int nr_pages,
struct page **pages, pgoff_t *indices)
{
@@ -465,7 +465,7 @@ static int shmem_free_swap(struct address_space *mapping,
/*
* Pagevec may contain swap entries, so shuffle up pages before releasing.
*/
-static void shmem_deswap_pagevec(struct pagevec *pvec)
+void shmem_deswap_pagevec(struct pagevec *pvec)
{
int i, j;
@@ -2535,6 +2535,10 @@ static int shmem_parse_options(char *options, struct shmem_sb_info *sbinfo,
mpol = NULL;
if (mpol_parse_str(value, &mpol))
goto bad_val;
+ } else if (!strcmp(this_char,"pram")) {
+ kfree(sbinfo->pram);
+ if (shmem_parse_pram(value, &sbinfo->pram))
+ goto bad_val;
} else {
printk(KERN_ERR "tmpfs: Bad mount option %s\n",
this_char);
@@ -2561,6 +2565,7 @@ static int shmem_remount_fs(struct super_block *sb, int *flags, char *data)
int error = -EINVAL;
config.mpol = NULL;
+ config.pram = NULL;
if (shmem_parse_options(data, &config, true))
return error;
@@ -2592,6 +2597,9 @@ static int shmem_remount_fs(struct super_block *sb, int *flags, char *data)
mpol_put(sbinfo->mpol);
sbinfo->mpol = config.mpol; /* transfers initial ref */
}
+
+ kfree(sbinfo->pram);
+ sbinfo->pram = config.pram;
out:
spin_unlock(&sbinfo->stat_lock);
return error;
@@ -2615,6 +2623,7 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root)
seq_printf(seq, ",gid=%u",
from_kgid_munged(&init_user_ns, sbinfo->gid));
shmem_show_mpol(seq, sbinfo->mpol);
+ shmem_show_pram(seq, sbinfo->pram);
return 0;
}
#endif /* CONFIG_TMPFS */
@@ -2625,6 +2634,7 @@ static void shmem_put_super(struct super_block *sb)
percpu_counter_destroy(&sbinfo->used_blocks);
mpol_put(sbinfo->mpol);
+ kfree(sbinfo->pram);
kfree(sbinfo);
sb->s_fs_info = NULL;
}
@@ -2838,14 +2848,25 @@ static const struct vm_operations_struct shmem_vm_ops = {
static struct dentry *shmem_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
{
- return mount_nodev(fs_type, flags, data, shmem_fill_super);
+ struct dentry *root;
+
+ root = mount_nodev(fs_type, flags, data, shmem_fill_super);
+ if (!IS_ERR(root))
+ shmem_load_pram(root->d_sb);
+ return root;
+}
+
+static void shmem_kill_sb(struct super_block *sb)
+{
+ shmem_save_pram(sb);
+ kill_litter_super(sb);
}
static struct file_system_type shmem_fs_type = {
.owner = THIS_MODULE,
.name = "tmpfs",
.mount = shmem_mount,
- .kill_sb = kill_litter_super,
+ .kill_sb = shmem_kill_sb,
.fs_flags = FS_USERNS_MOUNT,
};
diff --git a/mm/shmem_pram.c b/mm/shmem_pram.c
new file mode 100644
index 0000000..9a01040
--- /dev/null
+++ b/mm/shmem_pram.c
@@ -0,0 +1,378 @@
+#include <linux/dcache.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/mount.h>
+#include <linux/mutex.h>
+#include <linux/namei.h>
+#include <linux/pagemap.h>
+#include <linux/pagevec.h>
+#include <linux/pram.h>
+#include <linux/seq_file.h>
+#include <linux/shmem_fs.h>
+#include <linux/spinlock.h>
+#include <linux/string.h>
+#include <linux/time.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+
+#define META_ID 1
+#define DATA_ID 2
+
+#define EOF_MARK ((__u64)~0ULL)
+
+struct file_header {
+ __u32 mode;
+ __u32 uid;
+ __u32 gid;
+ __u32 namelen;
+ __u64 size;
+ __u64 atime;
+ __u64 mtime;
+ __u64 ctime;
+};
+
+int shmem_parse_pram(const char *str, struct shmem_pram_info **pram)
+{
+ struct shmem_pram_info *new;
+ size_t len;
+
+ len = strlen(str);
+ if (!len || len >= SHMEM_PRAM_NAME_MAX)
+ return 1;
+ new = kzalloc(sizeof(*new), GFP_KERNEL);
+ if (!new)
+ return 1;
+ strcpy(new->name, str);
+ *pram = new;
+ return 0;
+}
+
+void shmem_show_pram(struct seq_file *seq, struct shmem_pram_info *pram)
+{
+ if (pram)
+ seq_printf(seq, ",pram=%s", pram->name);
+}
+
+static int shmem_pram_name(char *buf, size_t bufsize,
+ struct shmem_sb_info *sbinfo, int id)
+{
+ if (snprintf(buf, bufsize, "shmem-%d-%s", id,
+ sbinfo->pram->name) >= bufsize)
+ return -ENAMETOOLONG;
+ return 0;
+}
+
+static int save_page(struct page *page,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ __u64 val;
+ ssize_t ret;
+ int err = 0;
+
+ if (page) {
+ val = page->index;
+ err = pram_save_page(psdata, page, PRAM_PAGE_LRU);
+ } else
+ val = EOF_MARK;
+ if (!err) {
+ ret = pram_write(psmeta, &val, sizeof(val));
+ if (ret < 0)
+ err = ret;
+ }
+ return err;
+}
+
+static int save_file_content(struct address_space *mapping,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ struct pagevec pvec;
+ pgoff_t indices[PAGEVEC_SIZE];
+ pgoff_t index = 0;
+ struct page *page;
+ int i, err = 0;
+
+ pagevec_init(&pvec, 0);
+ for ( ; ; ) {
+ pvec.nr = shmem_find_get_pages_and_swap(mapping,
+ index, PAGEVEC_SIZE, pvec.pages, indices);
+ if (!pvec.nr)
+ break;
+ for (i = 0; i < pagevec_count(&pvec); i++) {
+ page = pvec.pages[i];
+ index = indices[i];
+
+ if (radix_tree_exceptional_entry(page)) {
+ err = -ENOSYS;
+ break;
+ }
+
+ lock_page(page);
+ if (likely(page->mapping == mapping))
+ err = save_page(page, psmeta, psdata);
+ unlock_page(page);
+ if (err)
+ break;
+ }
+ shmem_deswap_pagevec(&pvec);
+ pagevec_release(&pvec);
+ if (err)
+ break;
+ cond_resched();
+ index++;
+ }
+ if (!err)
+ err = save_page(NULL, psmeta, psdata); /* eof */
+ return err;
+}
+
+static int save_file(struct dentry *dentry,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ struct inode *inode = dentry->d_inode;
+ umode_t mode = inode->i_mode;
+ struct file_header hdr;
+ ssize_t ret;
+
+ if (!S_ISREG(mode))
+ return -ENOSYS;
+ if (inode->i_nlink > 1)
+ return -ENOSYS;
+
+ hdr.mode = mode;
+ hdr.uid = inode->i_uid;
+ hdr.gid = inode->i_gid;
+ hdr.namelen = dentry->d_name.len;
+ hdr.size = i_size_read(inode);
+ hdr.atime = timespec_to_ns(&inode->i_atime);
+ hdr.mtime = timespec_to_ns(&inode->i_mtime);
+ hdr.ctime = timespec_to_ns(&inode->i_ctime);
+
+ ret = pram_write(psmeta, &hdr, sizeof(hdr));
+ if (ret < 0)
+ return ret;
+ ret = pram_write(psmeta, dentry->d_name.name, dentry->d_name.len);
+ if (ret < 0)
+ return ret;
+ return save_file_content(inode->i_mapping, psmeta, psdata);
+}
+
+static int save_tree(struct super_block *sb,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ struct dentry *dentry, *root = sb->s_root;
+ int err = 0;
+
+ mutex_lock(&root->d_inode->i_mutex);
+ spin_lock(&root->d_lock);
+ list_for_each_entry(dentry, &root->d_subdirs, d_u.d_child) {
+ if (d_unhashed(dentry) || !dentry->d_inode)
+ continue;
+ dget(dentry);
+ spin_unlock(&root->d_lock);
+
+ err = save_file(dentry, psmeta, psdata);
+
+ spin_lock(&root->d_lock);
+ dput(dentry);
+ if (err)
+ break;
+ }
+ spin_unlock(&root->d_lock);
+ mutex_unlock(&root->d_inode->i_mutex);
+
+ return err;
+}
+
+void shmem_save_pram(struct super_block *sb)
+{
+ struct shmem_sb_info *sbinfo = sb->s_fs_info;
+ struct pram_stream psmeta, psdata;
+ char *buf;
+ int err = -ENOMEM;
+
+ if (!sbinfo || !sbinfo->pram)
+ return;
+
+ buf = (void *)__get_free_page(GFP_TEMPORARY);
+ if (!buf)
+ goto out;
+
+ err = shmem_pram_name(buf, PAGE_SIZE, sbinfo, META_ID);
+ if (!err)
+ err = pram_prepare_save(&psmeta, buf,
+ PRAM_BYTE_STREAM, GFP_KERNEL);
+ if (err)
+ goto out_free_buf;
+
+ err = shmem_pram_name(buf, PAGE_SIZE, sbinfo, DATA_ID);
+ if (!err)
+ err = pram_prepare_save(&psdata, buf,
+ PRAM_PAGE_STREAM, GFP_HIGHUSER);
+ if (err)
+ goto out_discard_meta_save;
+
+ err = save_tree(sb, &psmeta, &psdata);
+ if (err)
+ goto out_discard_data_save;
+
+ pram_finish_save(&psmeta);
+ pram_finish_save(&psdata);
+ goto out_free_buf;
+
+out_discard_data_save:
+ pram_discard_save(&psdata);
+out_discard_meta_save:
+ pram_discard_save(&psmeta);
+out_free_buf:
+ free_page((unsigned long)buf);
+out:
+ if (err)
+ pr_err("SHMEM: PRAM save failed: %d\n", err);
+}
+
+static struct page *load_page(unsigned long *index, int *flags,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ __u64 val;
+ struct page *page;
+
+ if (pram_read(psmeta, &val, sizeof(val)) != sizeof(val))
+ return ERR_PTR(-EINVAL);
+ if (val == EOF_MARK)
+ return NULL;
+ *index = val;
+ page = pram_load_page(psdata, flags);
+ return page ?: ERR_PTR(-EINVAL);
+}
+
+static int load_file_content(struct address_space *mapping,
+ struct pram_stream *psmeta, struct pram_stream *psdata)
+{
+ struct page *page;
+ unsigned long index;
+ int flags, err;
+
+next:
+ page = load_page(&index, &flags, psmeta, psdata);
+ if (IS_ERR_OR_NULL(page))
+ return PTR_ERR(page);
+ err = shmem_insert_page(mapping->host, index, page,
+ flags & PRAM_PAGE_LRU);
+ put_page(page);
+ if (err)
+ return err;
+ goto next;
+}
+
+static int load_file(struct dentry *parent,
+ struct pram_stream *psmeta, struct pram_stream *psdata,
+ char *buf, size_t bufsize)
+{
+ struct dentry *dentry;
+ struct inode *inode;
+ struct file_header hdr;
+ ssize_t ret;
+ umode_t mode;
+ int namelen;
+ int err;
+
+ ret = pram_read(psmeta, &hdr, sizeof(hdr));
+ if (!ret)
+ return 0;
+ if (ret != sizeof(hdr))
+ return -EINVAL;
+
+ mode = hdr.mode;
+ namelen = hdr.namelen;
+ if (!S_ISREG(mode) || namelen > bufsize)
+ return -EINVAL;
+ if (pram_read(psmeta, buf, namelen) != namelen)
+ return -EINVAL;
+
+ mutex_lock_nested(&parent->d_inode->i_mutex, I_MUTEX_PARENT);
+
+ dentry = lookup_one_len(buf, parent, namelen);
+ if (IS_ERR(dentry)) {
+ err = PTR_ERR(dentry);
+ goto out_unlock;
+ }
+
+ err = vfs_create(parent->d_inode, dentry, mode, NULL);
+ dput(dentry); /* on success shmem pinned it */
+ if (err)
+ goto out_unlock;
+
+ inode = dentry->d_inode;
+ inode->i_mode = mode;
+ inode->i_uid = hdr.uid;
+ inode->i_gid = hdr.gid;
+ inode->i_atime = ns_to_timespec(hdr.atime);
+ inode->i_mtime = ns_to_timespec(hdr.mtime);
+ inode->i_ctime = ns_to_timespec(hdr.ctime);
+ i_size_write(inode, hdr.size);
+
+ err = load_file_content(inode->i_mapping, psmeta, psdata);
+out_unlock:
+ mutex_unlock(&parent->d_inode->i_mutex);
+ if (err)
+ return err;
+ return 1;
+}
+
+static int load_tree(struct super_block *sb,
+ struct pram_stream *psmeta, struct pram_stream *psdata,
+ char *buf, size_t bufsize)
+{
+ int ret;
+
+next:
+ ret = load_file(sb->s_root, psmeta, psdata, buf, bufsize);
+ if (ret <= 0)
+ return ret;
+ goto next;
+}
+
+void shmem_load_pram(struct super_block *sb)
+{
+ struct shmem_sb_info *sbinfo = sb->s_fs_info;
+ struct pram_stream psmeta, psdata;
+ char *buf;
+ int err = -ENOMEM;
+
+ if (!sbinfo->pram)
+ return;
+
+ buf = (void *)__get_free_page(GFP_TEMPORARY);
+ if (!buf)
+ goto out;
+
+ err = shmem_pram_name(buf, PAGE_SIZE, sbinfo, META_ID);
+ if (!err)
+ err = pram_prepare_load(&psmeta, buf, PRAM_BYTE_STREAM);
+ if (err) {
+ if (err == -ENOENT)
+ err = 0;
+ goto out_free_buf;
+ }
+
+ err = shmem_pram_name(buf, PAGE_SIZE, sbinfo, DATA_ID);
+ if (!err)
+ err = pram_prepare_load(&psdata, buf, PRAM_PAGE_STREAM);
+ if (err)
+ goto out_finish_meta_load;
+
+ err = load_tree(sb, &psmeta, &psdata, buf, PAGE_SIZE);
+
+ pram_finish_load(&psdata);
+out_finish_meta_load:
+ pram_finish_load(&psmeta);
+out_free_buf:
+ free_page((unsigned long)buf);
+out:
+ if (err)
+ pr_err("SHMEM: PRAM load failed: %d\n", err);
+}
--
1.7.10.4
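
For readers following the save/load paths above, the layout of one file record in the metadata byte stream can be sketched in userspace C. This is only an illustration under stated assumptions, not code from the patch: struct meta_buf, meta_write(), meta_read(), encode_file() and decode_file() are hypothetical names, and a flat in-memory buffer stands in for the PRAM byte stream. Only struct file_header and EOF_MARK mirror the patch. The page contents themselves would go to the separate page stream in the same order as the indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EOF_MARK UINT64_MAX	/* same value as ((__u64)~0ULL) in the patch */

/* Userspace mirror of the patch's struct file_header (4*4 + 4*8 bytes, no padding). */
struct file_header {
	uint32_t mode, uid, gid, namelen;
	uint64_t size, atime, mtime, ctime;
};

/* Hypothetical stand-in for a PRAM byte stream: flat buffer plus cursors. */
struct meta_buf {
	unsigned char data[4096];
	size_t len;	/* bytes written so far */
	size_t pos;	/* read cursor */
};

/* Append n bytes; returns 0 on success, -1 if the buffer would overflow. */
static int meta_write(struct meta_buf *mb, const void *src, size_t n)
{
	if (n > sizeof(mb->data) - mb->len)
		return -1;
	memcpy(mb->data + mb->len, src, n);
	mb->len += n;
	return 0;
}

/* Read exactly n bytes; returns 0 on success, -1 on a short read. */
static int meta_read(struct meta_buf *mb, void *dst, size_t n)
{
	assert(mb->pos <= mb->len);
	if (n > mb->len - mb->pos)
		return -1;
	memcpy(dst, mb->data + mb->pos, n);
	mb->pos += n;
	return 0;
}

/* Write one record: header, name bytes (no NUL), page indices, EOF mark. */
static int encode_file(struct meta_buf *mb, const struct file_header *hdr,
		       const char *name, const uint64_t *indices, size_t nr)
{
	uint64_t eof = EOF_MARK;
	size_t i;

	if (meta_write(mb, hdr, sizeof(*hdr)) ||
	    meta_write(mb, name, hdr->namelen))
		return -1;
	for (i = 0; i < nr; i++)
		if (meta_write(mb, &indices[i], sizeof(indices[i])))
			return -1;
	return meta_write(mb, &eof, sizeof(eof));
}

/* Read one record back; returns the number of page indices, or -1 if malformed. */
static int decode_file(struct meta_buf *mb, struct file_header *hdr,
		       char *name, size_t namesize,
		       uint64_t *indices, size_t max)
{
	uint64_t val;
	size_t nr = 0;

	if (meta_read(mb, hdr, sizeof(*hdr)))
		return -1;
	if (hdr->namelen >= namesize || meta_read(mb, name, hdr->namelen))
		return -1;
	name[hdr->namelen] = '\0';
	for (;;) {
		if (meta_read(mb, &val, sizeof(val)))
			return -1;
		if (val == EOF_MARK)
			return (int)nr;
		if (nr == max)
			return -1;
		indices[nr++] = val;
	}
}
```

A caller would zero-initialize a struct meta_buf, call encode_file() once per file, and then call decode_file() repeatedly until the read cursor reaches len. The explicit EOF mark is what lets the reader find the end of a file's page list without storing a page count in the header.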