* how to check kernel is configured with preemption or not
From: Greg KH @ 2011-10-24  4:38 UTC (permalink / raw)
  To: kernelnewbies
In-Reply-To: <CAGpXXZK_tJVZdxu=B65uAy225Y9cCS1x8yqhJOeK7OEQEfenxA@mail.gmail.com>

On Sun, Oct 23, 2011 at 11:20:03PM -0400, Greg Freemyer wrote:
> On Sun, Oct 23, 2011 at 1:34 PM, sri <bskmohan@gmail.com> wrote:
> > No, uname did not show anything.
> > Is there any way to get the kernel preemption mode programmatically?
> >
> > Thanks,
> > --Sri
> >
> > On Fri, Oct 21, 2011 at 6:41 PM, Daniel Baluta <daniel.baluta@gmail.com>
> > wrote:
> >>
> >> On Fri, Oct 21, 2011 at 2:28 PM, sri <bskmohan@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > Am using kernel 2.6.18-195(centos 5.5).
> >> > My kernel configs have CONFIG_PREEMPT_NONE=y and
> >> > "CONFIG_PREEMPT_VOLUNTARY is not set".
> >> > How to check that preemption is really in place?
> >> > Is there any way to check my kernel is configured with what preemption
> >> > levels?
> >>
> >> Hmm, uname -a?
> I'm sure it's in /sys somewhere.

I do not think so.

> Remember /sys is part of the official ABI.

As documented in Documentation/ABI/, so perhaps you can read there.

> Also, you can see for sure what your config looks like by looking at
> /proc/config.gz (that file is virtual, but shows the contents of your
> config file as it was at compile time for the running kernel).

Not all distros enable this :(
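
Where /proc/config.gz is missing, many distributions also install the
build configuration as an uncompressed text file, commonly
/boot/config-<release> with <release> taken from uname -r. That path is a
distribution convention assumed here, not a kernel ABI. A minimal sketch of
reading the preemption model from such a file follows; the helper names are
hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* True when string s begins with the string literal lit. */
#define STARTS_WITH(s, lit) (strncmp((s), (lit), sizeof(lit) - 1) == 0)

/*
 * Map one line of a kernel .config to a preemption model name.
 * Returns NULL for lines that do not select a model, including
 * "# CONFIG_... is not set" comments.
 * (Hypothetical helper, not a kernel or libc API.)
 */
const char *preempt_model_from_line(const char *line)
{
	if (STARTS_WITH(line, "CONFIG_PREEMPT_NONE=y"))
		return "none";
	if (STARTS_WITH(line, "CONFIG_PREEMPT_VOLUNTARY=y"))
		return "voluntary";
	if (STARTS_WITH(line, "CONFIG_PREEMPT=y"))
		return "full";
	return NULL;
}

/*
 * Scan an uncompressed config file line by line.
 * Returns "unknown" if the file cannot be opened or selects no model.
 */
const char *preempt_model_from_file(const char *path)
{
	char line[256];
	const char *model = "unknown";
	FILE *fp = fopen(path, "r");

	if (!fp)
		return model;
	while (fgets(line, sizeof(line), fp)) {
		const char *m = preempt_model_from_line(line);

		if (m) {
			model = m;
			break;
		}
	}
	fclose(fp);
	return model;
}
```

A caller would build the path from the release field returned by uname(2)
and pass it to preempt_model_from_file(). This only reports how the kernel
was built; it does not answer the question of why userspace should care.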

I think the question needs to really be stated, why, from userspace,
does it matter if preempt is enabled or not?  This should never be
something that userspace cares about at all.

greg k-h

^ permalink raw reply

* [U-Boot] [PATCH 01/39] DEBUG: Fix debug macros
From: Simon Glass @ 2011-10-24  4:31 UTC (permalink / raw)
  To: u-boot
In-Reply-To: <1319242654-15534-2-git-send-email-marek.vasut@gmail.com>

Hi Marek,

On Fri, Oct 21, 2011 at 5:16 PM, Marek Vasut <marek.vasut@gmail.com> wrote:
> The current implementation of debug doesn't play well with GCC4.6.
> This implementation also fixes GCC4.6 complaints about unused variables
> while maintaining code size.
>
> Signed-off-by: Mike Frysinger <vapier@gentoo.org>
> Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
> Cc: Wolfgang Denk <wd@denx.de>
> Cc: Simon Glass <sjg@chromium.org>
> ---
>  include/common.h |   20 ++++++++++++--------
>  1 files changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/include/common.h b/include/common.h
> index eb19a44..c3b23551 100644
> --- a/include/common.h
> +++ b/include/common.h
> @@ -116,20 +116,24 @@ typedef volatile unsigned char    vu_char;
>  #include <flash.h>
>  #include <image.h>
>
> -#ifdef DEBUG
> -#define debug(fmt,args...)     printf (fmt ,##args)
> -#define debugX(level,fmt,args...) if (DEBUG>=level) printf(fmt,##args);
> -#else
> -#define debug(fmt,args...)
> -#define debugX(level,fmt,args...)
> -#endif /* DEBUG */
> -
>  #ifdef DEBUG
>  # define _DEBUG 1
>  #else
>  # define _DEBUG 0
>  #endif
>
> +#define debug_cond(cond, fmt, args...)         \

Yes this is much nicer. Could perhaps add a little comment about how
to use this and to avoid putting debug() inside #ifdef?

> +       do {                                    \
> +               if (cond)                       \
> +                       printf(fmt, ##args);    \
> +       } while (0)
> +
> +#define debug(fmt, args...)                    \
> +       debug_cond(_DEBUG, fmt, ##args)
> +
> +#define debugX(level, fmt, args...)            \
> +       debug_cond((_DEBUG && DEBUG >= (level)), fmt, ##args)
> +
>  /*
>   * An assertion is run-time check done in debug mode only. If DEBUG is not
>   * defined then it is skipped. If DEBUG is defined and the assertion fails,
> --
> 1.7.6.3
>
>
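
As a possible starting point for the usage comment requested above, here is
a minimal sketch of calling debug() without wrapping it in #ifdef. The
macros are copied from the patch; load_image() and its arguments are
hypothetical and only serve the illustration. Because the condition is a
compile-time constant, the compiler still type-checks the format arguments
and sees the variable as used, while an optimizing build can drop the call.

```c
#include <stdio.h>

/* Build with -DDEBUG to enable output. */
#ifdef DEBUG
# define _DEBUG 1
#else
# define _DEBUG 0
#endif

#define debug_cond(cond, fmt, args...)		\
	do {					\
		if (cond)			\
			printf(fmt, ##args);	\
	} while (0)

#define debug(fmt, args...)			\
	debug_cond(_DEBUG, fmt, ##args)

/* Hypothetical caller: no #ifdef DEBUG around the call or the variable. */
int load_image(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;	/* referenced only by debug() */

	debug("image at %#lx..%#lx\n", addr, end);
	return size != 0;
}
```

One consequence worth stating in such a comment: arguments to debug() are
not evaluated when DEBUG is undefined, so they should not carry side
effects the rest of the code relies on.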

^ permalink raw reply

* [U-Boot] [PATCH 0/8] Add tftpput command for uploading files over network
From: Simon Glass @ 2011-10-24  4:28 UTC (permalink / raw)
  To: u-boot
In-Reply-To: <CAPnjgZ2G_cGq5vRMy=hbqCpEoNyAjHx_AwPOFFSR-3xbmFz2uA@mail.gmail.com>

Hi Albert,

On Sat, Oct 22, 2011 at 9:15 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Albert,
>
> On Sat, Oct 22, 2011 at 1:21 AM, Albert ARIBAUD
> <albert.u.boot@aribaud.net> wrote:
>> On 22/10/2011 06:51, Simon Glass wrote:
>>>
>>> The tftpboot command permits reading of files over a network interface
>>> using the Trivial FTP protocol. This patch series adds the ability to
>>> transfer files the other way.
>>>
>>> Why is this useful?
>>>
>>> - Uploading boot time data to a server
>>> - Uploading profiling information
>>> - Uploading large amounts of data for comparison / checking on a host
>>>     (e.g. use tftpput and ghex2 instead of the 'md' command)
>>
>> Especially I find it interesting for backing up things like MTD and small
>> disk files (not partitions, though). Most of my work currently is trying to
>> bring mainline U-Boot support to existing boards with bad U-Boot
>> implementations, and being able to backup things from U-Boot (as opposed to
>> having to set up NFS root and Linux boot) would definitely be a plus.
>>
>>> Mostly the existing code can be re-used and I have tried to avoid too
>>> much refactoring or cleaning up.
>>
>> :)
>>
>>> The feature is activated by the CONFIG_CMD_TFTPPUT option.
>>>
>>> This has been very lightly tested on a Seaboard with a USB network
>>> adaptor. I don't think it handles block number overflow.
>>
>> What size does this limit transfers to?
>
> I think about 1468 * 65535 - around 95MB - it's fairly easy to fix
> just by copying out the existing tftp get wrap code. I put it in the
> commit message so it wouldn't get lost.
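
The figure quoted above follows from the 16-bit block number in the TFTP
header: 1468 * 65535 = 96,205,380 bytes, roughly 96 MB (about 91.7 MiB). A
small sketch of that arithmetic is below; the function name is hypothetical,
and the block size is passed in rather than assumed.

```c
#include <stdint.h>

/*
 * Largest transfer that fits before a 16-bit TFTP block counter wraps,
 * for a given block size in bytes. Hypothetical helper for illustration.
 */
uint64_t tftp_max_bytes_without_wrap(uint32_t block_size)
{
	const uint64_t max_block = UINT16_MAX;	/* 65535 */

	return max_block * block_size;
}
```

With the 1468-byte block size mentioned in the thread this returns
96205380; with the 512-byte default block size of plain TFTP the ceiling
drops to 33553920 bytes.
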
>
>>
>>> Simon Glass (8):
>>>   Move simple_itoa to vsprintf
>>>   Add setenv_uint() and setenv_addr()
>>>   tftpput: Rename TFTP to TFTPGET
>>>   tftpput: move common code into separate functions
>>>   tftpput: support selecting get/put for tftp
>>>   tftpput: add save_addr and save_size global variables
>>>   tftpput: implement tftp logic
>>>   tftpput: add tftpput command
>>
>> Many U-Boot environments use 'tftp' as a shorthand to tftpboot. Did you
>> verify that this is not broken by the introduction of 'tftpput'?
>>
>> Also, I'd be happy to test this if a branch exists that already holds these
>> commits.
>
> I will see if I can organise one at Denx.

Thanks to Wolfgang I have something you can try:

git clone git://git.denx.de/u-boot-simonglass
cd u-boot-simonglass
git checkout us-tftp

Regards,
Simon

>
> Regards,
> Simon
>
>>
>> Amicalement,
>> --
>> Albert.
>>
>

^ permalink raw reply

* Re: [Qemu-devel] [RFC v2 PATCH 5/4 PATCH] virtio-net: send gratuitous packet when needed
From: Rusty Russell @ 2011-10-24  4:24 UTC (permalink / raw)
  To: Jason Wang, aliguori, quintela, jan.kiszka, mst, qemu-devel,
	blauwirbel
  Cc: pbonzini, kvm, netdev
In-Reply-To: <20111022054311.21798.3340.stgit@dhcp-8-146.nay.redhat.com>

On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> This lets the virtio-net driver send a gratuitous packet via a new
> config bit, VIRTIO_NET_S_ANNOUNCE, on each config update
> interrupt. When the backend sets this bit, the driver schedules
> a workqueue to send a gratuitous packet through NETDEV_NOTIFY_PEERS.
> 
> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

This seems like a huge layering violation.  Imagine this in real
hardware, for example.

There may be a good reason why virtual devices might want this kind of
reconfiguration cheat, which is unnecessary for normal machines, but
it'd have to be spelled out clearly in the spec to justify it...

Cheers,
Rusty.

^ permalink raw reply

* Re: [PATCH v3] virtio: Add platform bus driver for memory mapped virtio device
From: Rusty Russell @ 2011-10-24  2:33 UTC (permalink / raw)
  To: Pawel Moll
  Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	virtualization@lists.linux-foundation.org, Anthony Liguori,
	Michael S.Tsirkin
In-Reply-To: <1319219844.5094.16.camel@hornet.cambridge.arm.com>

On Fri, 21 Oct 2011 18:57:24 +0100, Pawel Moll <pawel.moll@arm.com> wrote:
> On Wed, 2011-10-19 at 03:57 +0100, Rusty Russell wrote:
> > > Oh, really? My host-side implementation is just doing that:
> > > 
> > >         addr += align - 1;
> > >         addr &= ~(align - 1);
> > 
> > OK, so you're assuming power of 2.  Make sure you kill the guest or at
> > least the device if it's not though.
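
The add-and-mask idiom quoted above only rounds correctly when align is a
power of two. For example, with align = 12 and addr = 13 it yields 16, which
is not a multiple of 12. A minimal sketch of the check being asked for is
below; the function names are hypothetical, and a real host implementation
would fail the device rather than assert.

```c
#include <assert.h>
#include <stdint.h>

/* True when x is a nonzero power of two. */
int is_power_of_2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.
 * Only valid for a power-of-two align, so reject anything else first.
 */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(is_power_of_2(align));
	addr += align - 1;
	addr &= ~(align - 1);
	return addr;
}
```
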
> 
> Yep, I have assertions all around such places :-) (it's a non-production
> code yet so I can do that)
> 
> > > \item The dynamic configuration changes, as described in p. 2.4.3
> > > ``Dealing With Configuration Changes'' are not permitted.
> > 
> > This means some devices simply won't work, at least in theory.  Why
> > don't you support this?
> 
> Uh. I simply forgot about it - my Host block device doesn't do that, so
> I ignored that feature initially and then it slipped through cracks. And
> till now I didn't realize that most of the drivers actually use this :-O
> My fault.
> 
> Simple to fix anyway - I'll just add InterruptStatus register and use
> second bit (same with InterruptACK) to get this through. Will be done on
> Monday.
> 
> Any other final complaints regarding the interface while I'm on it? ;-)

No, that's it I think.  Please send a diff for the documentation, since
I'm updating the LyX master and I've already applied your previous
version.

Thanks!
Rusty.

^ permalink raw reply

* [PATCH] Reindent closing bracket using tab instead of spaces
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  4:24 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 I'm not going to convert all leading spaces to tabs. But this one looks just ugly
 because it mis-aligns with the rest of the function.

 wt-status.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 8836a52..70fdb76 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -396,7 +396,7 @@ static void wt_status_collect_changes_worktree(struct wt_status *s)
 	if (s->ignore_submodule_arg) {
 		DIFF_OPT_SET(&rev.diffopt, OVERRIDE_SUBMODULE_CONFIG);
 		handle_ignore_submodules_arg(&rev.diffopt, s->ignore_submodule_arg);
-    }
+	}
 	rev.diffopt.format_callback = wt_status_collect_changed_cb;
 	rev.diffopt.format_callback_data = s;
 	init_pathspec(&rev.prune_data, s->pathspec);
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* Re: [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware
From: Eric W. Biederman @ 2011-10-24  4:27 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Serge E. Hallyn, linux-kernel, akpm, oleg, richard, mikevs,
	segoon, gregkh, dhowells, eparis, netdev
In-Reply-To: <20111024041529.GA23618@hallyn.com>

"Serge E. Hallyn" <serge@hallyn.com> writes:

> Never mind, it all gets a little convoluted, but I see how it works,
> and - when the time comes - how to do it right for userns.  :)  Sorry
> about that.

No problem.

Eric

^ permalink raw reply

* [refpolicy] Error when using refpolicy with apache httpd service
From: Justin Mattock @ 2011-10-24  4:25 UTC (permalink / raw)
  To: refpolicy
In-Reply-To: <1318433954.2238.63.camel@vortex>





----- Original Message -----
From: Guido Trentalancia <guido@trentalancia.com>
To: Dominick Grift <dominick.grift@gmail.com>
Cc: refpolicy <refpolicy@oss.tresys.com>
Sent: Wednesday, October 12, 2011 8:39 AM
Subject: Re: [refpolicy] Error when using refpolicy with apache httpd service

On Wed, 2011-10-12 at 17:15 +0200, Dominick Grift wrote:
> On Thu, 2011-10-13 at 00:08 +0900, Thu?n ?inh wrote:
> > Hi,
> > 
> > 
> > I find it very strange that /sbin/init is labeled bin_t.
> > 
> > /sbin/init points to /bin/systemd.
> > 
> > I checked system/init.fc, which defines:
> > 
> > /sbin/init(ng)? -- gen_context(system_u:object_r:init_exec_t,s0)
> > # because nowadays, /sbin/init is often a symlink to /sbin/upstart
> > /sbin/upstart -- gen_context(system_u:object_r:init_exec_t,s0)
> > 
> > So I changed it to:
> > 
> > /bin/systemd    -- gen_context(system_u:object_r:init_exec_t,s0)
> > /sbin/init      -- gen_context(system_u:object_r:init_exec_t,s0)
> > 
> > Then I ran make, install, load and relabel again.
> > 
> > But after that, /sbin/init is still labeled bin_t, whereas
> > /bin/systemd now has init_exec_t.
> > 
> > That seems strange to me, so I tried to relabel it with the command:
> > 
> > chcon -t init_exec_t /sbin/init 
> 
> The /sbin/init symbolic link can be bin_t, no problem.
> 
> /sbin/systemd though should be type init_exec_t.
> 
> The problem is that reference policy currently does not support systemd.
> 
> systemd is not stable yet.
> 
> refpolicy is waiting until systemd is stable before she will support it,
> because there are too many changes happening to systemd currently.
> 
> You could probably, at least to some extent, work around the issues by
> making init an unconfined domain, but that will probably cause issues as
> well. So if you are not comfortable with selinux you may want to avoid
> that.
> 
> Instead, use the policy provided/supported by your distribution.

Consider Justin Mattock has recently submitted an initial patch (derived
from F15, I suppose) for better supporting systemd in the reference
policy:

18th September 2011
[RFC 1/2]selinux-contrib: add systemd support to refpolicy git
[RFC 2/2] refpolicy: add systemd support to tresys main policy

It's probably worth trying that out (along with the init_systemd
boolean), if it's using systemd...

Regards,

Guido

Yeah, if anybody has the time to go through that patch set, feel free.
Last I remember I was hitting some sandbox error for some reason, then ran
out of time due to external obligations. If the weekend permits, I can have
another go at it. As for the patch, I pretty much just grepped Dan's git tree
for systemd and copied the results to refpolicy, but there is probably more to
it than just grepping.

Justin P. Mattock


^ permalink raw reply

* [PATCH] i2c-designware: Change readl to readw and writel to writew
From: Rajeev Kumar @ 2011-10-24  4:24 UTC (permalink / raw)
  To: baruch-NswTu9S1W3P6gbPvEgmw2w, linux-i2c-u79uwXL29TY76Z2rM5mHXA
  Cc: shiraz.hashim-qxv4g6HH51o, viresh.kumar-qxv4g6HH51o,
	bhupesh.sharma-qxv4g6HH51o, pratyush.anand-qxv4g6HH51o,
	vipin.kumar-qxv4g6HH51o, deepak.sikri-qxv4g6HH51o,
	amit.virdi-qxv4g6HH51o, vipulkumar.samar-qxv4g6HH51o,
	armando.visconti-qxv4g6HH51o, Rajeev Kumar

Since the I2C DesignWare registers are 16 bits wide, we should use
readw/writew.

Signed-off-by: Rajeev Kumar <rajeev-dlh.kumar-qxv4g6HH51o@public.gmane.org>
---
 drivers/i2c/busses/i2c-designware.c |   84 +++++++++++++++++-----------------
 1 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/i2c/busses/i2c-designware.c b/drivers/i2c/busses/i2c-designware.c
index 6eaa681..f3b2bba 100644
--- a/drivers/i2c/busses/i2c-designware.c
+++ b/drivers/i2c/busses/i2c-designware.c
@@ -289,7 +289,7 @@ static void i2c_dw_init(struct dw_i2c_dev *dev)
 	u32 ic_con, hcnt, lcnt;
 
 	/* Disable the adapter */
-	writel(0, dev->base + DW_IC_ENABLE);
+	writew(0, dev->base + DW_IC_ENABLE);
 
 	/* set standard and fast speed deviders for high/low periods */
 
@@ -303,8 +303,8 @@ static void i2c_dw_init(struct dw_i2c_dev *dev)
 				47,	/* tLOW = 4.7 us */
 				3,	/* tf = 0.3 us */
 				0);	/* No offset */
-	writel(hcnt, dev->base + DW_IC_SS_SCL_HCNT);
-	writel(lcnt, dev->base + DW_IC_SS_SCL_LCNT);
+	writew(hcnt, dev->base + DW_IC_SS_SCL_HCNT);
+	writew(lcnt, dev->base + DW_IC_SS_SCL_LCNT);
 	dev_dbg(dev->dev, "Standard-mode HCNT:LCNT = %d:%d\n", hcnt, lcnt);
 
 	/* Fast-mode */
@@ -317,18 +317,18 @@ static void i2c_dw_init(struct dw_i2c_dev *dev)
 				13,	/* tLOW = 1.3 us */
 				3,	/* tf = 0.3 us */
 				0);	/* No offset */
-	writel(hcnt, dev->base + DW_IC_FS_SCL_HCNT);
-	writel(lcnt, dev->base + DW_IC_FS_SCL_LCNT);
+	writew(hcnt, dev->base + DW_IC_FS_SCL_HCNT);
+	writew(lcnt, dev->base + DW_IC_FS_SCL_LCNT);
 	dev_dbg(dev->dev, "Fast-mode HCNT:LCNT = %d:%d\n", hcnt, lcnt);
 
 	/* Configure Tx/Rx FIFO threshold levels */
-	writel(dev->tx_fifo_depth - 1, dev->base + DW_IC_TX_TL);
-	writel(0, dev->base + DW_IC_RX_TL);
+	writew(dev->tx_fifo_depth - 1, dev->base + DW_IC_TX_TL);
+	writew(0, dev->base + DW_IC_RX_TL);
 
 	/* configure the i2c master */
 	ic_con = DW_IC_CON_MASTER | DW_IC_CON_SLAVE_DISABLE |
 		DW_IC_CON_RESTART_EN | DW_IC_CON_SPEED_FAST;
-	writel(ic_con, dev->base + DW_IC_CON);
+	writew(ic_con, dev->base + DW_IC_CON);
 }
 
 /*
@@ -338,7 +338,7 @@ static int i2c_dw_wait_bus_not_busy(struct dw_i2c_dev *dev)
 {
 	int timeout = TIMEOUT;
 
-	while (readl(dev->base + DW_IC_STATUS) & DW_IC_STATUS_ACTIVITY) {
+	while (readw(dev->base + DW_IC_STATUS) & DW_IC_STATUS_ACTIVITY) {
 		if (timeout <= 0) {
 			dev_warn(dev->dev, "timeout waiting for bus ready\n");
 			return -ETIMEDOUT;
@@ -356,24 +356,24 @@ static void i2c_dw_xfer_init(struct dw_i2c_dev *dev)
 	u32 ic_con;
 
 	/* Disable the adapter */
-	writel(0, dev->base + DW_IC_ENABLE);
+	writew(0, dev->base + DW_IC_ENABLE);
 
 	/* set the slave (target) address */
-	writel(msgs[dev->msg_write_idx].addr, dev->base + DW_IC_TAR);
+	writew(msgs[dev->msg_write_idx].addr, dev->base + DW_IC_TAR);
 
 	/* if the slave address is ten bit address, enable 10BITADDR */
-	ic_con = readl(dev->base + DW_IC_CON);
+	ic_con = readw(dev->base + DW_IC_CON);
 	if (msgs[dev->msg_write_idx].flags & I2C_M_TEN)
 		ic_con |= DW_IC_CON_10BITADDR_MASTER;
 	else
 		ic_con &= ~DW_IC_CON_10BITADDR_MASTER;
-	writel(ic_con, dev->base + DW_IC_CON);
+	writew(ic_con, dev->base + DW_IC_CON);
 
 	/* Enable the adapter */
-	writel(1, dev->base + DW_IC_ENABLE);
+	writew(1, dev->base + DW_IC_ENABLE);
 
 	/* Enable interrupts */
-	writel(DW_IC_INTR_DEFAULT_MASK, dev->base + DW_IC_INTR_MASK);
+	writew(DW_IC_INTR_DEFAULT_MASK, dev->base + DW_IC_INTR_MASK);
 }
 
 /*
@@ -420,15 +420,15 @@ i2c_dw_xfer_msg(struct dw_i2c_dev *dev)
 			buf_len = msgs[dev->msg_write_idx].len;
 		}
 
-		tx_limit = dev->tx_fifo_depth - readl(dev->base + DW_IC_TXFLR);
-		rx_limit = dev->rx_fifo_depth - readl(dev->base + DW_IC_RXFLR);
+		tx_limit = dev->tx_fifo_depth - readw(dev->base + DW_IC_TXFLR);
+		rx_limit = dev->rx_fifo_depth - readw(dev->base + DW_IC_RXFLR);
 
 		while (buf_len > 0 && tx_limit > 0 && rx_limit > 0) {
 			if (msgs[dev->msg_write_idx].flags & I2C_M_RD) {
-				writel(0x100, dev->base + DW_IC_DATA_CMD);
+				writew(0x100, dev->base + DW_IC_DATA_CMD);
 				rx_limit--;
 			} else
-				writel(*buf++, dev->base + DW_IC_DATA_CMD);
+				writew(*buf++, dev->base + DW_IC_DATA_CMD);
 			tx_limit--; buf_len--;
 		}
 
@@ -453,7 +453,7 @@ i2c_dw_xfer_msg(struct dw_i2c_dev *dev)
 	if (dev->msg_err)
 		intr_mask = 0;
 
-	writel(intr_mask, dev->base + DW_IC_INTR_MASK);
+	writew(intr_mask, dev->base + DW_IC_INTR_MASK);
 }
 
 static void
@@ -477,10 +477,10 @@ i2c_dw_read(struct dw_i2c_dev *dev)
 			buf = dev->rx_buf;
 		}
 
-		rx_valid = readl(dev->base + DW_IC_RXFLR);
+		rx_valid = readw(dev->base + DW_IC_RXFLR);
 
 		for (; len > 0 && rx_valid > 0; len--, rx_valid--)
-			*buf++ = readl(dev->base + DW_IC_DATA_CMD);
+			*buf++ = readw(dev->base + DW_IC_DATA_CMD);
 
 		if (len > 0) {
 			dev->status |= STATUS_READ_IN_PROGRESS;
@@ -563,7 +563,7 @@ i2c_dw_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
 	/* no error */
 	if (likely(!dev->cmd_err)) {
 		/* Disable the adapter */
-		writel(0, dev->base + DW_IC_ENABLE);
+		writew(0, dev->base + DW_IC_ENABLE);
 		ret = num;
 		goto done;
 	}
@@ -601,47 +601,47 @@ static u32 i2c_dw_read_clear_intrbits(struct dw_i2c_dev *dev)
 	 * in the IC_RAW_INTR_STAT register.
 	 *
 	 * That is,
-	 *   stat = readl(IC_INTR_STAT);
+	 *   stat = readw(IC_INTR_STAT);
 	 * equals to,
-	 *   stat = readl(IC_RAW_INTR_STAT) & readl(IC_INTR_MASK);
+	 *   stat = readw(IC_RAW_INTR_STAT) & readw(IC_INTR_MASK);
 	 *
 	 * The raw version might be useful for debugging purposes.
 	 */
-	stat = readl(dev->base + DW_IC_INTR_STAT);
+	stat = readw(dev->base + DW_IC_INTR_STAT);
 
 	/*
 	 * Do not use the IC_CLR_INTR register to clear interrupts, or
 	 * you'll miss some interrupts, triggered during the period from
-	 * readl(IC_INTR_STAT) to readl(IC_CLR_INTR).
+	 * readw(IC_INTR_STAT) to readw(IC_CLR_INTR).
 	 *
 	 * Instead, use the separately-prepared IC_CLR_* registers.
 	 */
 	if (stat & DW_IC_INTR_RX_UNDER)
-		readl(dev->base + DW_IC_CLR_RX_UNDER);
+		readw(dev->base + DW_IC_CLR_RX_UNDER);
 	if (stat & DW_IC_INTR_RX_OVER)
-		readl(dev->base + DW_IC_CLR_RX_OVER);
+		readw(dev->base + DW_IC_CLR_RX_OVER);
 	if (stat & DW_IC_INTR_TX_OVER)
-		readl(dev->base + DW_IC_CLR_TX_OVER);
+		readw(dev->base + DW_IC_CLR_TX_OVER);
 	if (stat & DW_IC_INTR_RD_REQ)
-		readl(dev->base + DW_IC_CLR_RD_REQ);
+		readw(dev->base + DW_IC_CLR_RD_REQ);
 	if (stat & DW_IC_INTR_TX_ABRT) {
 		/*
 		 * The IC_TX_ABRT_SOURCE register is cleared whenever
 		 * the IC_CLR_TX_ABRT is read.  Preserve it beforehand.
 		 */
-		dev->abort_source = readl(dev->base + DW_IC_TX_ABRT_SOURCE);
-		readl(dev->base + DW_IC_CLR_TX_ABRT);
+		dev->abort_source = readw(dev->base + DW_IC_TX_ABRT_SOURCE);
+		readw(dev->base + DW_IC_CLR_TX_ABRT);
 	}
 	if (stat & DW_IC_INTR_RX_DONE)
-		readl(dev->base + DW_IC_CLR_RX_DONE);
+		readw(dev->base + DW_IC_CLR_RX_DONE);
 	if (stat & DW_IC_INTR_ACTIVITY)
-		readl(dev->base + DW_IC_CLR_ACTIVITY);
+		readw(dev->base + DW_IC_CLR_ACTIVITY);
 	if (stat & DW_IC_INTR_STOP_DET)
-		readl(dev->base + DW_IC_CLR_STOP_DET);
+		readw(dev->base + DW_IC_CLR_STOP_DET);
 	if (stat & DW_IC_INTR_START_DET)
-		readl(dev->base + DW_IC_CLR_START_DET);
+		readw(dev->base + DW_IC_CLR_START_DET);
 	if (stat & DW_IC_INTR_GEN_CALL)
-		readl(dev->base + DW_IC_CLR_GEN_CALL);
+		readw(dev->base + DW_IC_CLR_GEN_CALL);
 
 	return stat;
 }
@@ -666,7 +666,7 @@ static irqreturn_t i2c_dw_isr(int this_irq, void *dev_id)
 		 * Anytime TX_ABRT is set, the contents of the tx/rx
 		 * buffers are flushed.  Make sure to skip them.
 		 */
-		writel(0, dev->base + DW_IC_INTR_MASK);
+		writew(0, dev->base + DW_IC_INTR_MASK);
 		goto tx_aborted;
 	}
 
@@ -747,14 +747,14 @@ static int __devinit dw_i2c_probe(struct platform_device *pdev)
 		goto err_unuse_clocks;
 	}
 	{
-		u32 param1 = readl(dev->base + DW_IC_COMP_PARAM_1);
+		u32 param1 = readw(dev->base + DW_IC_COMP_PARAM_1);
 
 		dev->tx_fifo_depth = ((param1 >> 16) & 0xff) + 1;
 		dev->rx_fifo_depth = ((param1 >> 8)  & 0xff) + 1;
 	}
 	i2c_dw_init(dev);
 
-	writel(0, dev->base + DW_IC_INTR_MASK); /* disable IRQ */
+	writew(0, dev->base + DW_IC_INTR_MASK); /* disable IRQ */
 	r = request_irq(dev->irq, i2c_dw_isr, IRQF_DISABLED, pdev->name, dev);
 	if (r) {
 		dev_err(&pdev->dev, "failure requesting irq %i\n", dev->irq);
@@ -810,7 +810,7 @@ static int __devexit dw_i2c_remove(struct platform_device *pdev)
 	clk_put(dev->clk);
 	dev->clk = NULL;
 
-	writel(0, dev->base + DW_IC_ENABLE);
+	writew(0, dev->base + DW_IC_ENABLE);
 	free_irq(dev->irq, dev);
 	kfree(dev);
 
-- 
1.6.0.2
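
One hunk may deserve a second look: in dw_i2c_probe() the value read from
DW_IC_COMP_PARAM_1 is shifted right by 16 to obtain tx_fifo_depth. A 16-bit
read cannot carry bits 16 to 23, so after this change that field would
always read as zero. The sketch below reproduces the effect in user space
with plain integers standing in for the register. read32() and read16() are
hypothetical stand-ins, not the kernel accessors, and the register value
used in the usage note is made up for the illustration.

```c
#include <stdint.h>

/* Stand-ins for 32-bit and 16-bit reads of the same register value. */
uint32_t read32(uint32_t reg)
{
	return reg;
}

uint16_t read16(uint32_t reg)
{
	return (uint16_t)reg;	/* upper 16 bits are lost */
}

/* Same field extraction as the probe code in the patch. */
unsigned int tx_fifo_depth(uint32_t param1)
{
	return ((param1 >> 16) & 0xff) + 1;
}

unsigned int rx_fifo_depth(uint32_t param1)
{
	return ((param1 >> 8) & 0xff) + 1;
}
```

For a made-up register value of 0x00070f00, the 32-bit read gives a TX depth
of 8 and an RX depth of 16, while the 16-bit read gives a TX depth of 1 and
leaves the RX depth at 16. Whether a given SoC can return the upper half
through a second 16-bit access is outside what this sketch can show.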

^ permalink raw reply related

* Re: [RFD] Network configuration data in sysfs
From: Kirill A. Shutemov @ 2011-10-24  4:24 UTC (permalink / raw)
  To: David Miller
  Cc: kay.sievers, netdev, kuznet, jmorris, yoshfuji, kaber, gregkh,
	gladkov.alexey
In-Reply-To: <20111023.232416.1038111296509565828.davem@davemloft.net>

On Sun, Oct 23, 2011 at 11:24:16PM -0400, David Miller wrote:
> From: "Kirill A. Shutemov" <kirill@shutemov.name>
> Date: Mon, 24 Oct 2011 04:34:07 +0300
> 
> > On Sun, Oct 23, 2011 at 08:49:43PM -0400, David Miller wrote:
> >> From: "Kirill A. Shutemov" <kirill@shutemov.name>
> >> Date: Mon, 24 Oct 2011 02:35:58 +0300
> >> 
> >> > Is there something fundamental preventing us to have sysfs interface for
> >> > network configuration?
> >> 
> >> Netlink already provides everything sysfs would potentially provide as
> >> well as event tracking.
> > 
> > Not everything. You still need special utilities to view/change the
> > configuration.
> 
> You can use netlink to perform any configuration change you want, or
> to view any network configuration setting.

You need /sbin/ip or similar tool to do this, right?

-- 
 Kirill A. Shutemov

^ permalink raw reply

* Re: [Qemu-devel] [PATCH] ppc: Fix up usermode only builds
From: Alexander Graf @ 2011-10-24  4:17 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
In-Reply-To: <1319426704-26850-1-git-send-email-david@gibson.dropbear.id.au>





On 23.10.2011, at 20:25, David Gibson <david@gibson.dropbear.id.au> wrote:

> The recent usage of MemoryRegion in kvm_ppc.h breaks builds with
> CONFIG_USER_ONLY=y.  This patch fixes it.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Ouch. Thanks a lot for the fix!

Alex

> ---
> target-ppc/kvm_ppc.h |    4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index b0d6fb6..f9c0198 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -23,9 +23,11 @@ int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
> int kvmppc_set_interrupt(CPUState *env, int irq, int level);
> void kvmppc_set_papr(CPUState *env);
> int kvmppc_smt_threads(void);
> +#ifndef CONFIG_USER_ONLY
> off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem);
> void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd);
> int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
> +#endif /* !CONFIG_USER_ONLY */
> const ppc_def_t *kvmppc_host_cpu_def(void);
> 
> #else
> @@ -69,6 +71,7 @@ static inline int kvmppc_smt_threads(void)
>     return 1;
> }
> 
> +#ifndef CONFIG_USER_ONLY
> static inline off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
> {
>     return 0;
> @@ -85,6 +88,7 @@ static inline int kvmppc_remove_spapr_tce(void *table, int pfd,
> {
>     return -1;
> }
> +#endif /* !CONFIG_USER_ONLY */
> 
> static inline const ppc_def_t *kvmppc_host_cpu_def(void)
> {
> -- 
> 1.7.6.3
> 

^ permalink raw reply

* [PATCH meta-ti 0/3] Update linux-ti33x-psp and beaglebone-tester to latest, add EEPROM patches.
From: Joel A Fernandes @ 2011-10-24  3:59 UTC (permalink / raw)
  To: meta-ti; +Cc: Joel A Fernandes, jdk

Please apply these 3 patches, which among other things fix the kernel build and add EEPROM support.

All changes have been build-tested and runtime-tested.

Thanks!

Joel A Fernandes (3):
  sdcard_image: Copy uEnv.txt once again as the new bonetester also
    checks for GPIO grounding
  linux-ti33x-psp 3.1rc8: Update to latest SRCREV, add EEPROM patches
  beaglebone-tester: Update to latest and bump PR

 classes/sdcard_image.bbclass                       |    3 +-
 ...heck-return-value-of-omap_mux_init_signal.patch |   34 ++++++++++
 ...ility-to-dynamically-reconfigure-chip-inf.patch |   57 +++++++++++++++++
 ...econfigure-EEPROM-with-new-eeprom_info-in.patch |   65 ++++++++++++++++++++
 recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb  |    7 ++-
 recipes-ti/beagleboard/beaglebone-tester.bb        |    6 +-
 6 files changed, 165 insertions(+), 7 deletions(-)
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0001-am335x-Check-return-value-of-omap_mux_init_signal.patch
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch



^ permalink raw reply

* [PATCH meta-ti 3/3] beaglebone-tester: Update to latest and bump PR
From: Joel A Fernandes @ 2011-10-24  3:59 UTC (permalink / raw)
  To: meta-ti; +Cc: Joel A Fernandes, jdk
In-Reply-To: <1319428763-9677-1-git-send-email-joelagnel@ti.com>

Added:
* PMIC tests
* EEPROM tests
* Memory tests
* GPIO 38 grounding check

Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
---
 recipes-ti/beagleboard/beaglebone-tester.bb |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/recipes-ti/beagleboard/beaglebone-tester.bb b/recipes-ti/beagleboard/beaglebone-tester.bb
index 1716122..9e65d65 100644
--- a/recipes-ti/beagleboard/beaglebone-tester.bb
+++ b/recipes-ti/beagleboard/beaglebone-tester.bb
@@ -7,11 +7,11 @@ LIC_FILES_CHKSUM="file://gpl.txt;md5=5b122a36d0f6dc55279a0ebc69f3c60b"
 # only scripts and data
 inherit allarch
 
-PR = "r1"
+PR = "r2"
 
 SRC_URI = "git://github.com/joelagnel/validation-scripts.git;protocol=git \
           "
-SRCREV = "0806b54c1248b080953402728b0e420243fe844c"
+SRCREV = "fdeaf580e553a0968b777e75306aac3f6a73519e"
 
 S = "${WORKDIR}/git"
 
@@ -44,7 +44,7 @@ FILES_${PN} += "${base_libdir}/systemd \
                 /boot \
                "
 
-RDEPENDS_${PN} = "iputils"
+RDEPENDS_${PN} = "iputils memtester"
 RRECOMMENDS_${PN} = "kernel-module-g-zero \
                      kernel-module-g-file-storage \
                      kernel-module-smsc95xx"
-- 
1.7.0.4



^ permalink raw reply related

* [PATCH meta-ti 2/3] linux-ti33x-psp 3.1rc8: Update to latest SRCREV, add EEPROM patches
From: Joel A Fernandes @ 2011-10-24  3:59 UTC (permalink / raw)
  To: meta-ti; +Cc: Joel A Fernandes, jdk
In-Reply-To: <1319428763-9677-1-git-send-email-joelagnel@ti.com>

* Updated to latest SRCREV and bump PR.

Added following patches being submitted to PSP currently:
* EEPROM patches required to get EEPROM working correctly on BBB without
  breaking support for EVM.
* omap_mux_init_signal patch to safeguard against incorrectly setting up pinmux.

Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
---
 ...heck-return-value-of-omap_mux_init_signal.patch |   34 ++++++++++
 ...ility-to-dynamically-reconfigure-chip-inf.patch |   57 +++++++++++++++++
 ...econfigure-EEPROM-with-new-eeprom_info-in.patch |   65 ++++++++++++++++++++
 recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb  |    7 ++-
 4 files changed, 161 insertions(+), 2 deletions(-)
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0001-am335x-Check-return-value-of-omap_mux_init_signal.patch
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch
 create mode 100644 recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch

diff --git a/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0001-am335x-Check-return-value-of-omap_mux_init_signal.patch b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0001-am335x-Check-return-value-of-omap_mux_init_signal.patch
new file mode 100644
index 0000000..3ce2df8
--- /dev/null
+++ b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0001-am335x-Check-return-value-of-omap_mux_init_signal.patch
@@ -0,0 +1,34 @@
+From b11df2bf8e19b8a4d4e4bb6eae59fde6a1498920 Mon Sep 17 00:00:00 2001
+From: Joel A Fernandes <joelagnel@ti.com>
+Date: Wed, 19 Oct 2011 20:11:00 -0500
+Subject: [PATCH 1/3] am335x: Check return value of omap_mux_init_signal
+
+This helps guard against setting up pin muxmode incorrectly
+
+Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
+---
+ arch/arm/mach-omap2/board-am335xevm.c |    8 +++++---
+ 1 files changed, 5 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm/mach-omap2/board-am335xevm.c b/arch/arm/mach-omap2/board-am335xevm.c
+index 187f758..f959d95 100644
+--- a/arch/arm/mach-omap2/board-am335xevm.c
++++ b/arch/arm/mach-omap2/board-am335xevm.c
+@@ -590,9 +590,11 @@ static void setup_pin_mux(struct pinmux_config *pin_mux)
+ {
+ 	int i;
+ 
+-	for (i = 0; pin_mux->string_name != NULL; pin_mux++)
+-		omap_mux_init_signal(pin_mux->string_name, pin_mux->val);
+-
++	for (i = 0; pin_mux->string_name != NULL; pin_mux++) {
++		if (omap_mux_init_signal(pin_mux->string_name, pin_mux->val) < 0) {
++			printk(KERN_ERR "Failed to setup pinmux for %s\n", pin_mux->string_name);
++		}
++	}
+ }
+ 
+ /*
+-- 
+1.7.4.1
+
diff --git a/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch
new file mode 100644
index 0000000..5d0d580
--- /dev/null
+++ b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch
@@ -0,0 +1,57 @@
+From 8d0697f8962ef52e06012101efdea7713e0e5055 Mon Sep 17 00:00:00 2001
+From: Joel A Fernandes <joelagnel@ti.com>
+Date: Sat, 22 Oct 2011 12:56:44 -0500
+Subject: [PATCH 2/3] at24: Add ability to dynamically reconfigure chip information
+
+As some EEPROMs are used for board name detection, it is not possible to detect
+in advance which EEPROM type is connected without detecting the board first.
+
+In board-am335xevm.c, we use a trial and error approach, and this requires us
+to reconfigure the driver with a new 'eeprom_info' structure different from any
+earlier ones that were passed.
+
+We add new accessor functions to the at24 driver to help with this.
+
+Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
+---
+ drivers/misc/eeprom/at24.c |   11 +++++++++++
+ include/linux/i2c/at24.h   |    3 +++
+ 2 files changed, 14 insertions(+), 0 deletions(-)
+
+diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
+index ab1ad41..41ebc1f 100644
+--- a/drivers/misc/eeprom/at24.c
++++ b/drivers/misc/eeprom/at24.c
+@@ -456,6 +456,17 @@ static ssize_t at24_macc_write(struct memory_accessor *macc, const char *buf,
+ 	return at24_write(at24, buf, offset, count);
+ }
+ 
++struct at24_platform_data *at24_macc_getpdata(struct memory_accessor *macc)
++{
++	struct at24_data *at24 = container_of(macc, struct at24_data, macc);
++	return &at24->chip;
++}
++
++void at24_macc_setpdata(struct memory_accessor *macc, struct at24_platform_data *chip)
++{
++	struct at24_data *at24 = container_of(macc, struct at24_data, macc);
++	at24->chip = *chip;
++}
+ /*-------------------------------------------------------------------------*/
+ 
+ #ifdef CONFIG_OF
+diff --git a/include/linux/i2c/at24.h b/include/linux/i2c/at24.h
+index 8ace930..7872912 100644
+--- a/include/linux/i2c/at24.h
++++ b/include/linux/i2c/at24.h
+@@ -29,4 +29,7 @@ struct at24_platform_data {
+ 	void		*context;
+ };
+ 
++struct at24_platform_data *at24_macc_getpdata(struct memory_accessor *macc);
++void at24_macc_setpdata(struct memory_accessor *macc, struct at24_platform_data *chip);
++
+ #endif /* _LINUX_AT24_H */
+-- 
+1.7.4.1
+
diff --git a/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch
new file mode 100644
index 0000000..9d3bb6e
--- /dev/null
+++ b/recipes-kernel/linux/linux-ti33x-psp-3.0+3.1rc/0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch
@@ -0,0 +1,65 @@
+From 18a4a980113f7b290c5694239b0e9b21fb7fe132 Mon Sep 17 00:00:00 2001
+From: Joel A Fernandes <joelagnel@ti.com>
+Date: Sat, 22 Oct 2011 13:03:08 -0500
+Subject: [PATCH 3/3] am335x-evm: Reconfigure EEPROM with new eeprom_info incase of failure
+
+The earlier bone boards have an 8-bit address capable EEPROM with 2kbit size
+and 16 byte page size. This is very different from the EEPROM on the AM335x
+EVM and causes problems when reading for board detection and other purposes.
+
+We first attempt a read with the original EEPROM settings and, in case of an
+invalid header, we reconfigure the EEPROM driver with bone_eeprom_info and
+perform a restart of the setup function to reread all EEPROM data again this
+time with the correct EEPROM configuration.
+
+This patch is required to get EEPROM reading working correctly on bone board
+without breaking support for EVM.
+
+Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
+---
+ arch/arm/mach-omap2/board-am335xevm.c |   15 +++++++++++++++
+ 1 files changed, 15 insertions(+), 0 deletions(-)
+
+diff --git a/arch/arm/mach-omap2/board-am335xevm.c b/arch/arm/mach-omap2/board-am335xevm.c
+index f959d95..eb18fb9 100644
+--- a/arch/arm/mach-omap2/board-am335xevm.c
++++ b/arch/arm/mach-omap2/board-am335xevm.c
+@@ -1387,6 +1387,8 @@ static void am335x_setup_daughter_board(struct memory_accessor *m, void *c)
+ 	}
+ }
+ 
++static struct at24_platform_data bone_eeprom_info;
++
+ static void am335x_evm_setup(struct memory_accessor *mem_acc, void *context)
+ {
+ 	int ret;
+@@ -1413,6 +1415,11 @@ static void am335x_evm_setup(struct memory_accessor *mem_acc, void *context)
+ 	}
+ 
+ 	if (config.header != AM335X_EEPROM_HEADER) {
++		if (memcmp(at24_macc_getpdata(mem_acc), &bone_eeprom_info,
++		  sizeof(struct at24_platform_data)) != 0) {
++			at24_macc_setpdata(mem_acc, &bone_eeprom_info);
++			return am335x_evm_setup(mem_acc, context);
++		}
+ 		pr_warning("AM335X: wrong header 0x%x, expected 0x%x\n",
+ 			config.header, AM335X_EEPROM_HEADER);
+ 		goto out;
+@@ -1485,6 +1492,14 @@ static struct at24_platform_data am335x_baseboard_eeprom_info = {
+ 	.context        = (void *)NULL,
+ };
+ 
++static struct at24_platform_data bone_eeprom_info = {
++	.byte_len       = (2*1024) / 8,
++	.page_size      = 16,
++	.flags          = 0x0,
++	.setup          = am335x_evm_setup,
++	.context        = (void *)NULL,
++};
++
+ /*
+ * Daughter board Detection.
+ * Every board has a ID memory (EEPROM) on board. We probe these devices at
+-- 
+1.7.4.1
+
diff --git a/recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb b/recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb
index 1c4bee0..bffebea 100644
--- a/recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb
+++ b/recipes-kernel/linux/linux-ti33x-psp_3.0+3.1rc.bb
@@ -10,8 +10,8 @@ S = "${WORKDIR}/git"
 MULTI_CONFIG_BASE_SUFFIX = ""
 
 BRANCH = "master"
-SRCREV = "c7fc664a6a36a4721b43dc287e410a2453f0b782"
-MACHINE_KERNEL_PR_append = "j+gitr${SRCREV}"
+SRCREV = "1955a86594526e18f03c8d62db81119ffc4ccf0f"
+MACHINE_KERNEL_PR_append = "k+gitr${SRCREV}"
 
 COMPATIBLE_MACHINE = "(ti33x)"
 
@@ -27,6 +27,9 @@ SRC_URI += "git://arago-project.org/git/projects/linux-am33x.git;protocol=git;br
 PATCHES_OVER_PSP = " \
 	file://0001-f_rndis-HACK-around-undefined-variables.patch \
 	file://0001-am335x-Add-pin-mux-and-init-for-beaglebone-specific-.patch \
+	file://0001-am335x-Check-return-value-of-omap_mux_init_signal.patch \
+	file://0002-at24-Add-ability-to-dynamically-reconfigure-chip-inf.patch \
+	file://0003-am335x-evm-Reconfigure-EEPROM-with-new-eeprom_info-in.patch \
 	"
 
 SRC_URI += "${@base_contains('DISTRO_FEATURES', 'tipspkernel', "", "${PATCHES_OVER_PSP}", d)}"
-- 
1.7.0.4



^ permalink raw reply related

* [PATCH meta-ti 1/3] sdcard_image: Copy uEnv.txt once again as the new bonetester also checks for GPIO grounding
From: Joel A Fernandes @ 2011-10-24  3:59 UTC (permalink / raw)
  To: meta-ti; +Cc: Joel A Fernandes, jdk
In-Reply-To: <1319428763-9677-1-git-send-email-joelagnel@ti.com>

Reverted commit: 6743f427e7fccf9c5f5a262d982da339f69582cf

Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
---
 classes/sdcard_image.bbclass |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/classes/sdcard_image.bbclass b/classes/sdcard_image.bbclass
index 0c9f457..1570d37 100644
--- a/classes/sdcard_image.bbclass
+++ b/classes/sdcard_image.bbclass
@@ -93,8 +93,7 @@ IMAGE_CMD_sdimg () {
 		suffix=bin
 	fi
 
-	#cp -v ${IMAGE_ROOTFS}/boot/uEnv.txt ${WORKDIR}/tmp-mnt-boot || true
-	cp -v ${IMAGE_ROOTFS}/boot/user.txt ${WORKDIR}/tmp-mnt-boot || true
+	cp -v ${IMAGE_ROOTFS}/boot/{user.txt,uEnv.txt} ${WORKDIR}/tmp-mnt-boot || true
 
 	if [ -e ${IMAGE_ROOTFS}/boot/u-boot.$suffix ] ; then
 		cp -v ${IMAGE_ROOTFS}/boot/{u-boot.$suffix} ${WORKDIR}/tmp-mnt-boot || true
-- 
1.7.0.4



^ permalink raw reply related

* Re: [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware
From: Serge E. Hallyn @ 2011-10-24  4:15 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric W. Biederman, linux-kernel, akpm, oleg, richard, mikevs,
	segoon, gregkh, dhowells, eparis, netdev
In-Reply-To: <20111020141440.GA6201@sergelap>

Quoting Serge E. Hallyn (serge.hallyn@canonical.com):
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> > "Serge E. Hallyn" <serge@hallyn.com> writes:
> > 
> > > Quoting Eric W. Biederman (ebiederm@xmission.com):
> > >> Serge Hallyn <serge@hallyn.com> writes:
> > >> 
> > >> > From: "Serge E. Hallyn" <serge.hallyn@canonical.com>
> > >> >
> > >> > Currently uids are compared without regard for the user namespace.
> > >> > Fix that to prevent tasks in a different user namespace from
> > >> > wrongly matching on SCM_CREDENTIALS.
> > >> >
> > >> > In the past, either your uids had to match, or you had to have
> > >> > CAP_SETXID.  In a namespaced world, you must either (both be in the
> > >> > same user namespace and have your uids match), or you must have
> > >> > CAP_SETXID targeted at the other user namespace.  The latter can
> > >> > happen for instance if uid 500 created a new user namespace and
> > >> > now interacts with uid 0 in it.
> > >> 
> > >> Serge this approach is wrong.
> > >
> > > Thanks for looking, Eric.
> > >
> > >> Because we pass the cred and the pid through the socket, the socket itself
> > >> is just a conduit and should be ignored in this context.
> > >
> > > Ok, that makes sense, but
> > >
> > >> The only interesting test should be are you allowed to impersonate other
> > >> users in your current user namespace.
> > >
> > > Why in your current user namespace?  Shouldn't it be in the
> > > target user ns?  I understand it could be wrong to tie the
> > > user ns owning the socket to the target userns (though I still
> > > kind of like it), but just because I have CAP_SETUID in my
> > > own user_ns doesn't mean I should be able to pose as another
> > > uid in your user_ns.
> > 
> > First and foremost it is important that you be able if you have the
> > capability to impersonate other users in your current user namespace.
> > That is what the capability actually controls.
> > 
> > None of this allows you to impersonate any user in any other user
> > namespace.  The translation between users prevents that.
> > 
> > > (Now I also see that cred_to_ucred() translates to the current
> > > user_ns, so that should have been a hint to me before about
> > > your intent, but I'm not convinced I agree with your intent).
> > >
> > > And you do the same with the pid.  Why is that a valid assumption?
> > 
> > Yes.  Basically all the code does is allow you to impersonate people you
> > would have been able to impersonate before.  If your target is in
> > another namespace you can not fool them.
> > 
> > With pids the logic should be a lot clearer.  Pretend to be a pid you can
> > see in your current pid namespace.  Lookup and convert to struct pid aka
> > the namespace agnostic object.  On output return the pid value that
> 
> No.  That conversion is happening before the user-specified pid is
> set.

Never mind, it all gets a little convoluted, but I see how it works,
and - when the time comes - how to do it right for userns.  :)  Sorry
about that.

thanks,
-serge

^ permalink raw reply

* Re: Question about how to troubleshoot sandybridge kernel oops and subsequent GPU lockup
From: James R. Leu @ 2011-10-24  4:12 UTC (permalink / raw)
  To: intel-gfx
In-Reply-To: <20111024024822.GA5123@mindspring.com>


[-- Attachment #1.1: Type: text/plain, Size: 4353 bytes --]

Hello,

I'm running wow in wine on 64 bit fedora rawhide on a dell vostro 3550
(i5 with integrated GPU).

I'm reliably able to produce 2 types of crashes:
- wow freezes, but I can get to text console, in this case I'm able to
  grab a kernel stack trace  (below) prior to seeing the normal
  [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 452684 at 452608, next 452686)
- the other is a complete freeze of the system, hard reset required, nothing logged to /var/log/messages

Is there any value in my creating a bug report for this? It seems to be a pretty common issue.
Is there any use in my trying different kernel command line options for the i915 driver
or config options to the xorg intel driver?

I have the various git trees pulled out (I was looking for recent changes that might be related
to this issue).  I'm capable of building and installing from these git trees if there are specific
bits that I should test.

Oct 22 20:52:59 localhost kernel: [  939.830806] ------------[ cut here ]------------
Oct 22 20:52:59 localhost kernel: [  939.830814] WARNING: at drivers/gpu/drm/i915/i915_drv.c:372 gen6_gt_force_wake_put+0x29/0x51 [i915]()
Oct 22 20:52:59 localhost kernel: [  939.830816] Hardware name: Vostro 3550
Oct 22 20:52:59 localhost kernel: [  939.830818] Modules linked in: snd_seq_dummy fuse ip6table_filter ip6_tables ebtable_nat ebtables xt_state xt_CHECKSUM iptable_mangle ppdev parport_pc lp parport vboxpci vboxnetadp vboxnetflt vboxdrv bridge stp llc tun rfcomm bnep ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 snd_hda_codec_hdmi snd_hda_codec_idt uvcvideo videodev btusb media bluetooth v4l2_compat_ioctl32 arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iwlagn microcode mac80211 dell_laptop iTCO_wdt r8169 i2c_i801 snd_timer cfg80211 snd mii iTCO_vendor_support dcdbas dell_wmi sparse_keymap soundcore rfkill snd_page_alloc virtio_net kvm_intel kvm binfmt_misc wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
Oct 22 20:52:59 localhost kernel: [  939.830926] Pid: 0, comm: swapper Tainted: G        WC  3.1.0-0.rc10.git0.1.fc17.x86_64 #1
Oct 22 20:52:59 localhost kernel: [  939.830928] Call Trace:
Oct 22 20:52:59 localhost kernel: [  939.830930]  <IRQ>  [<ffffffff8105c3a0>] warn_slowpath_common+0x83/0x9b
Oct 22 20:52:59 localhost kernel: [  939.830941]  [<ffffffff8105c3d2>] warn_slowpath_null+0x1a/0x1c
Oct 22 20:52:59 localhost kernel: [  939.830952]  [<ffffffffa006b624>] gen6_gt_force_wake_put+0x29/0x51 [i915]
Oct 22 20:52:59 localhost kernel: [  939.830963]  [<ffffffffa006f45f>] i915_read32+0x44/0x6b [i915]
Oct 22 20:52:59 localhost kernel: [  939.830975]  [<ffffffffa00724a9>] i915_hangcheck_elapsed+0xe8/0x1f8 [i915]
Oct 22 20:52:59 localhost kernel: [  939.831027]  [<ffffffff81062ddd>] irq_exit+0x5d/0xcf
Oct 22 20:52:59 localhost kernel: [  939.831032]  [<ffffffff8150de91>] smp_apic_timer_interrupt+0x7c/0x8a
Oct 22 20:52:59 localhost kernel: [  939.831036]  [<ffffffff8150bd73>] apic_timer_interrupt+0x73/0x80
Oct 22 20:52:59 localhost kernel: [  939.831038]  <EOI>  [<ffffffff81014ded>] ? paravirt_read_tsc+0x9/0xd
Oct 22 20:52:59 localhost kernel: [  939.831046]  [<ffffffff81297075>] ? intel_idle+0xe5/0x10c
Oct 22 20:52:59 localhost kernel: [  939.831050]  [<ffffffff81297071>] ? intel_idle+0xe1/0x10c
Oct 22 20:52:59 localhost kernel: [  939.831054]  [<ffffffff813e14fe>] cpuidle_idle_call+0x11c/0x1fe
Oct 22 20:52:59 localhost kernel: [  939.831059]  [<ffffffff8100e2ef>] cpu_idle+0xab/0x101
Oct 22 20:52:59 localhost kernel: [  939.831063]  [<ffffffff814df673>] rest_init+0xd7/0xde
Oct 22 20:52:59 localhost kernel: [  939.831067]  [<ffffffff814df59c>] ? csum_partial_copy_generic+0x16c/0x16c
Oct 22 20:52:59 localhost kernel: [  939.831072]  [<ffffffff81d53bb0>] start_kernel+0x3dd/0x3ea
Oct 22 20:52:59 localhost kernel: [  939.831076]  [<ffffffff81d532c4>] x86_64_start_reservations+0xaf/0xb3
Oct 22 20:52:59 localhost kernel: [  939.831081]  [<ffffffff81d53140>] ? early_idt_handlers+0x140/0x140
Oct 22 20:52:59 localhost kernel: [  939.831085]  [<ffffffff81d533ca>] x86_64_start_kernel+0x102/0x111
Oct 22 20:52:59 localhost kernel: [  939.831088] ---[ end trace f5cba358bac6b7e5 ]---

-- 
James R. Leu
jleu@mindspring.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 198 bytes --]


^ permalink raw reply

* Re: [PATCH] x86: Fix S4 regression
From: Yinghai Lu @ 2011-10-24  4:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Linus Torvalds, Rafael J.Wysocki, x86, linux-kernel
In-Reply-To: <4EA4B164.4000009@oracle.com>

On Sun, Oct 23, 2011 at 5:29 PM, Yinghai Lu <yinghai.lu@oracle.com> wrote:
> On 10/23/2011 02:19 PM, Takashi Iwai wrote:
>
>> The commit 4b239f458: [x86-64, mm: Put early page table high] causes
>> a S4 regression since 2.6.39, namely the machine reboots occasionally
>> at S4 resume.  It doesn't happen always, overall rate is about 1/20.
>> But, like other bugs, once when this happens, it continues to happen.
>>
>> This patch fixes the problem by essentially reverting the memory
>> assignment in the older way.
>>
>> Cc: <stable@kernel.org>
>> Signed-off-by: Takashi Iwai <tiwai@suse.de>
>>
>> ---
>> I resend this as a "fix" patch now before it's forgotten and rotten.
>> It's just papering again over the mystery, but IMO better than the
>> hard-reset behavior as of now.  Unfortunately, bisection is pretty
>> much difficult because the bug itself is fairly unstable...
>
>
>
> Did you try checking the several commits that Rafael pointed out?
>
>
> On Wed, Sep 28, 2011 at 12:30 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Wednesday, September 28, 2011, Takashi Iwai wrote:
>>>
>>> If my previous test -- 2.6.37+Yinghai's patches didn't show the
>>> problem -- is correct, it means that some change in 2.6.38 reacted
>>> badly with Yinghai's patches, not about 2.6.39.  I'll check tomorrow
>>> again whether this observation is really correct.
>>
>> Yes, that would be good to know, thanks for doing this!
>>
>> If that turns out to be the case, there are the following commits
>> looking like worth checking:
>>
>> d344e38 x86, nx: Mark the ACPI resume trampoline code as +x
>> 884b821 ACPI: Fix acpi_os_read_memory() and acpi_os_write_memory() (v2)
>> d551d81 ACPI / PM: Call suspend_nvs_free() earlier during resume
>> 2d6d9fd ACPI: Introduce acpi_os_ioremap()
>

Also, can you check whether reverting the following patch helps?

| commit e5f15b45ddf3afa2bbbb10c7ea34fb32b6de0a0e
| Author: Yinghai Lu <yinghai@kernel.org>
| Date:   Fri Feb 18 11:30:30 2011 +0000
|
|    x86: Cleanup highmap after brk is concluded
|
|    Now cleanup_highmap actually is in two steps: one is early in head64.c
|    and only clears above _end; a second one is in init_memory_mapping() and
|    tries to clean from _brk_end to _end.
|    It should check if those boundaries are PMD_SIZE aligned but currently
|    does not.
|    Also init_memory_mapping() is called several times for numa or memory
|    hotplug, so we really should not handle initial kernel mappings there.
|
|    This patch moves cleanup_highmap() down after _brk_end is settled so
|    we can do everything in one step.
|    Also we honor max_pfn_mapped in the implementation of cleanup_highmap.

Thanks

Yinghai Lu

^ permalink raw reply

* Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added
From: Michal Nazarewicz @ 2011-10-24  4:05 UTC (permalink / raw)
  To: Marek Szyprowski, Mel Gorman
  Cc: linux-kernel, linux-arm-kernel, linux-media, linux-mm,
	linaro-mm-sig, Kyungmin Park, Russell King, Andrew Morton,
	KAMEZAWA Hiroyuki, Ankita Garg, Daniel Walker, Arnd Bergmann,
	Jesse Barker, Jonathan Corbet, Shariq Hasnain, Chunsang Jeong,
	Dave Hansen
In-Reply-To: <20111018122109.GB6660@csn.ul.ie>

> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman <mel@csn.ul.ie> wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that, depending
on arguments, it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.

It also accepts a range rather than just assuming a single pageblock;
thus the change moves the range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().

The change also modifies split_free_page() so that it does not try to
change the pageblock's migrate type if the current migrate type is
ISOLATE or CMA.
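
To make the two modes easier to discuss, here is a toy userspace model of the
intended behaviour. It is only a sketch under assumptions, not the kernel
code: the int array, demo_isolate_range() and the "strict" flag (standing in
for a NULL freelist) are hypothetical. chunk[i] > 0 means a free chunk of that
many pages starts at page i, 0 means busy or hole, and a negative value marks
a chunk this function has taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Walk [start, end).  In skip mode (strict == false) busy pages and holes
 * are stepped over.  In strict mode the first busy page or hole makes the
 * function undo what it took so far and return 0.
 * Returns the number of pages taken, which can exceed end - start when the
 * last free chunk extends past end.
 */
static size_t demo_isolate_range(int *chunk, size_t start, size_t end,
				 bool strict, size_t *scanned)
{
	size_t pfn = start, total = 0, seen = 0;

	assert(chunk && start <= end);

	while (pfn < end) {
		int len = chunk[pfn];

		++seen;
		if (len <= 0) {
			if (!strict) {
				++pfn;		/* skip busy page or hole */
				continue;
			}
			/* strict mode: give back everything taken so far */
			for (size_t i = start; i < pfn; ++i)
				if (chunk[i] < 0)
					chunk[i] = -chunk[i];
			return 0;
		}

		chunk[pfn] = -len;		/* mark the chunk as taken */
		total += (size_t)len;
		pfn += (size_t)len;		/* jump past the whole chunk */
	}

	if (scanned)
		*scanned = seen;
	return total;
}
```

With a map of {2, 0, 1, 0, 4, 0, 0, 0}, where page 3 is the only busy page,
skip mode over [0, 8) takes 7 pages, strict mode over the same range returns
0 and restores the map, and strict mode over [4, 6) returns 4 because the
chunk at page 4 runs past the end of the range.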

---
 include/linux/mm.h             |    1 -
 include/linux/page-isolation.h |    4 +-
 mm/compaction.c                |   73 +++--------------------
 mm/internal.h                  |    5 ++
 mm/page_alloc.c                |  128 +++++++++++++++++++++++++---------------
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
-					      unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      gfp_t flags, unsigned migratetype);
+			      unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-				unsigned long blockpfn,
-				struct list_head *freelist)
-{
-	unsigned long zone_end_pfn, end_pfn;
-	int nr_scanned = 0, total_isolated = 0;
-	struct page *cursor;
-
-	/* Get the last PFN we should scan for free pages at */
-	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-	end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-	/* Find the first usable PFN in the block to initialse page cursor */
-	for (; blockpfn < end_pfn; blockpfn++) {
-		if (pfn_valid_within(blockpfn))
-			break;
-	}
-	cursor = pfn_to_page(blockpfn);
-
-	/* Isolate free pages. This assumes the block is valid */
-	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
-		struct page *page = cursor;
-
-		if (!pfn_valid_within(blockpfn))
-			continue;
-		nr_scanned++;
-
-		if (!PageBuddy(page))
-			continue;
-
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-		}
-	}
-
-	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-	return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
 	int migratetype = get_pageblock_migratetype(page);
 
 	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
 	if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
 		return false;
 
-	/* Keep MIGRATE_CMA alone as well. */
-	/*
-	 * XXX Revisit.  We currently cannot let compaction touch CMA
-	 * pages since compaction insists on changing their migration
-	 * type to MIGRATE_MOVABLE (see split_free_page() called from
-	 * isolate_freepages_block() above).
-	 */
-	if (is_migrate_cma(migratetype))
-		return false;
-
 	/* If the page is a large free page, then allow migration */
 	if (PageBuddy(page) && page_order(page) >= pageblock_order)
 		return true;
@@ -149,7 +87,7 @@ static void isolate_freepages(struct zone *zone,
 				struct compact_control *cc)
 {
 	struct page *page;
-	unsigned long high_pfn, low_pfn, pfn;
+	unsigned long high_pfn, low_pfn, pfn, zone_end_pfn, end_pfn;
 	unsigned long flags;
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
@@ -169,6 +107,8 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	high_pfn = min(low_pfn, pfn);
 
+	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+
 	/*
 	 * Isolate free pages until enough are available to migrate the
 	 * pages on cc->migratepages. We stop searching if the migrate
@@ -176,7 +116,7 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
 					pfn -= pageblock_nr_pages) {
-		unsigned long isolated;
+		unsigned isolated, scanned;
 
 		if (!pfn_valid(pfn))
 			continue;
@@ -205,7 +145,10 @@ static void isolate_freepages(struct zone *zone,
 		isolated = 0;
 		spin_lock_irqsave(&zone->lock, flags);
 		if (suitable_migration_target(page)) {
-			isolated = isolate_freepages_block(zone, pfn, freelist);
+			end_pfn = min(pfn + pageblock_nr_pages, zone_end_pfn);
+			isolated = isolate_freepages_range(zone, pfn,
+					end_pfn, freelist, &scanned);
+			trace_mm_compaction_isolate_freepages(scanned, isolated);
 			nr_freepages += isolated;
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/internal.h b/mm/internal.h
index d071d380..4a9bb3f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -263,3 +263,8 @@ extern u64 hwpoison_filter_flags_mask;
 extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
+
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist,
+				 unsigned *scannedp);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df69706..adf3f34 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1300,10 +1300,11 @@ void split_page(struct page *page, unsigned int order)
  * Note: this is probably too low level an operation for use in drivers.
  * Please consult with lkml before using this in your driver.
  */
-int split_free_page(struct page *page)
+static unsigned split_free_page(struct page *page)
 {
 	unsigned int order;
 	unsigned long watermark;
+	struct page *endpage;
 	struct zone *zone;
 
 	BUG_ON(!PageBuddy(page));
@@ -1326,14 +1327,18 @@ int split_free_page(struct page *page)
 	set_page_refcounted(page);
 	split_page(page, order);
 
-	if (order >= pageblock_order - 1) {
-		struct page *endpage = page + (1 << order) - 1;
-		for (; page < endpage; page += pageblock_nr_pages)
-			if (!is_pageblock_cma(page))
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
+	if (order < pageblock_order - 1)
+		goto done;
+
+	endpage = page + (1 << order) - 1;
+	for (; page < endpage; page += pageblock_nr_pages) {
+		int mt = get_pageblock_migratetype(page);
+		/* Don't change CMA nor ISOLATE */
+		if (!is_migrate_cma(mt) && mt != MIGRATE_ISOLATE)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 	}
 
+done:
 	return 1 << order;
 }
 
@@ -5723,57 +5728,76 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
-				       gfp_t flag)
+/**
+ * isolate_freepages_range() - isolate free pages, must hold zone->lock.
+ * @zone:	Zone pages are in.
+ * @start:	The first PFN to start isolating.
+ * @end:	The one-past-last PFN.
+ * @freelist:	A list to save isolated pages to.
+ * @scannedp:	Optional pointer where to save number of scanned pages.
+ *
+ * If @freelist is not provided, holes in range (either non-free pages
+ * or invalid PFN) are considered an error and function undoes its
+ * actions and returns zero.
+ *
+ * If @freelist is provided, function will simply skip non-free and
+ * missing pages and put only the ones isolated on the list.  It will
+ * also call trace_mm_compaction_isolate_freepages() at the end.
+ *
+ * Returns number of isolated pages.  This may be more than end-start
+ * if end fell in the middle of a free page.
+ */
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist, unsigned *scannedp)
 {
-	unsigned long pfn = start, count;
+	unsigned nr_scanned = 0, total_isolated = 0;
+	unsigned long pfn = start;
 	struct page *page;
-	struct zone *zone;
-	int order;
 
 	VM_BUG_ON(!pfn_valid(start));
-	page = pfn_to_page(start);
-	zone = page_zone(page);
 
-	spin_lock_irq(&zone->lock);
+	/* Isolate free pages. This assumes the block is valid */
+	page = pfn_to_page(pfn);
+	while (pfn < end) {
+		unsigned isolated = 1;
 
-	for (;;) {
-		VM_BUG_ON(!page_count(page) || !PageBuddy(page) ||
-			  page_zone(page) != zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
-		list_del(&page->lru);
-		order = page_order(page);
-		count = 1UL << order;
-		zone->free_area[order].nr_free--;
-		rmv_page_order(page);
-		__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+		if (!pfn_valid_within(pfn))
+			goto skip;
+		++nr_scanned;
 
-		pfn += count;
-		if (pfn >= end)
-			break;
-		VM_BUG_ON(!pfn_valid(pfn));
-
-		if (zone_pfn_same_memmap(pfn - count, pfn))
-			page += count;
-		else
-			page = pfn_to_page(pfn);
-	}
+		if (!PageBuddy(page)) {
+skip:
+			if (freelist)
+				goto next;
+			for (; start < pfn; ++start)
+				__free_page(pfn_to_page(start));
+			return 0;
+		}
 
-	spin_unlock_irq(&zone->lock);
+		/* Found a free page, break it into order-0 pages */
+		isolated = split_free_page(page);
+		total_isolated += isolated;
+		if (freelist) {
+			struct page *p = page;
+			unsigned i = isolated;
+			for (; i--; ++p)
+				list_add(&p->lru, freelist);
+		}
 
-	/* After this, pages in the range can be freed one be one */
-	count = pfn - start;
-	pfn = start;
-	for (page = pfn_to_page(pfn); count; --count) {
-		prep_new_page(page, 0, flag);
-		++pfn;
-		if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
-			++page;
+next:		/* Advance to the next page */
+		pfn += isolated;
+		if (zone_pfn_same_memmap(pfn - isolated, pfn))
+			page += isolated;
 		else
 			page = pfn_to_page(pfn);
 	}
 
-	return pfn;
+	if (scannedp)
+		*scannedp = nr_scanned;
+	return total_isolated;
 }
 
 static unsigned long pfn_to_maxpage(unsigned long pfn)
@@ -5837,7 +5861,6 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @flags:	flags passed to alloc_contig_freed_pages().
  * @migratetype:	migratetype of the underlaying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
@@ -5853,9 +5876,10 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * need to be freed with free_contig_pages().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       gfp_t flags, unsigned migratetype)
+		       unsigned migratetype)
 {
 	unsigned long outer_start, outer_end;
+	struct zone *zone;
 	int ret;
 
 	/*
@@ -5910,7 +5934,17 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 			return -EINVAL;
 
 	outer_start = start & (~0UL << ret);
-	outer_end   = alloc_contig_freed_pages(outer_start, end, flags);
+
+	zone = page_zone(pfn_to_page(outer_start));
+	spin_lock_irq(&zone->lock);
+	outer_end = isolate_freepages_range(zone, outer_start, end, NULL, NULL);
+	spin_unlock_irq(&zone->lock);
+
+	if (!outer_end) {
+		ret = -EBUSY;
+		goto done;
+	}
+	outer_end += outer_start;
 
 	/* Free head and tail (if any) */
 	if (start != outer_start)
-- 
1.7.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related

* Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added
From: Michal Nazarewicz @ 2011-10-24  4:05 UTC (permalink / raw)
  To: Marek Szyprowski, Mel Gorman
  Cc: linux-kernel, linux-arm-kernel, linux-media, linux-mm,
	linaro-mm-sig, Kyungmin Park, Russell King, Andrew Morton,
	KAMEZAWA Hiroyuki, Ankita Garg, Daniel Walker, Arnd Bergmann,
	Jesse Barker, Jonathan Corbet, Shariq Hasnain, Chunsang Jeong,
	Dave Hansen
In-Reply-To: <20111018122109.GB6660@csn.ul.ie>

> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman <mel@csn.ul.ie> wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that depending
on arguments it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.
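
To make the two behaviours concrete, here is a rough user-space sketch
(not kernel code).  The toy_* names and the array-of-states model are
made up for illustration only; real free pages have orders and sit on
buddy lists, which this ignores.  The "out" array stands in for the
freelist argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for page state; not a kernel type. */
enum toy_state { TOY_INVALID, TOY_IN_USE, TOY_FREE, TOY_ISOLATED };

/*
 * Toy model of the two modes.  When "out" is non-NULL, holes are skipped
 * and the isolated PFNs are recorded.  When "out" is NULL, the first hole
 * undoes everything done so far and the function returns 0.
 */
static unsigned toy_isolate_range(enum toy_state *pages,
				  size_t start, size_t end,
				  size_t *out, unsigned *scannedp)
{
	unsigned nr_scanned = 0, total_isolated = 0;
	size_t pfn;

	assert(start <= end);

	for (pfn = start; pfn < end; ++pfn) {
		/* Only valid PFNs count as scanned. */
		if (pages[pfn] != TOY_INVALID)
			++nr_scanned;

		if (pages[pfn] != TOY_FREE) {
			size_t undo;

			if (out)
				continue;	/* lenient mode: skip the hole */

			/* Strict mode: give back what was taken so far. */
			for (undo = start; undo < pfn; ++undo)
				pages[undo] = TOY_FREE;
			return 0;
		}

		pages[pfn] = TOY_ISOLATED;
		if (out)
			out[total_isolated] = pfn;
		++total_isolated;
	}

	if (scannedp)
		*scannedp = nr_scanned;
	return total_isolated;
}
```

Passing a non-NULL "out" gives the compaction-style behaviour; passing
NULL gives the all-or-nothing behaviour that CMA needs.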

It also accepts a range rather than just assuming a single pageblock
thus the change moves range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().
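
The range calculation that moves up into the caller is just a clamp of
the pageblock end against the zone end.  A stand-alone sketch, assuming
plain integers in place of struct zone fields (toy_block_end() and its
parameter names are hypothetical):

```c
#include <assert.h>

/*
 * Return the one-past-last PFN to scan for the block starting at "pfn":
 * either the end of the pageblock or the end of the zone, whichever
 * comes first.  Mirrors min(pfn + pageblock_nr_pages, zone_end_pfn).
 */
static unsigned long toy_block_end(unsigned long pfn,
				   unsigned long block_pages,
				   unsigned long zone_start,
				   unsigned long zone_spanned)
{
	unsigned long zone_end = zone_start + zone_spanned;
	unsigned long block_end = pfn + block_pages;

	/* The starting PFN is expected to lie inside the zone. */
	assert(pfn >= zone_start && pfn < zone_end);

	return block_end < zone_end ? block_end : zone_end;
}
```

With 512-page blocks, a block starting at PFN 3840 in a zone spanning
4000 pages from PFN 0 is cut short at 4000 rather than running to 4352.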

The change also modifies split_free_page() so that it does not try to
change pageblock's migrate type if current migrate type is ISOLATE or
CMA.
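
The migrate type rule can be sketched on its own as below.  This is a
user-space illustration only: the toy_mt enum and both helpers are
invented here and do not correspond to the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum toy_mt {
	TOY_MT_UNMOVABLE,
	TOY_MT_MOVABLE,
	TOY_MT_RESERVE,
	TOY_MT_CMA,
	TOY_MT_ISOLATE,
};

/* CMA and ISOLATE pageblocks must keep their migrate type. */
static bool toy_may_retag(enum toy_mt mt)
{
	return mt != TOY_MT_CMA && mt != TOY_MT_ISOLATE;
}

/*
 * Retag every eligible pageblock as movable, leaving CMA and ISOLATE
 * blocks alone.  Returns how many blocks were rewritten.
 */
static size_t toy_retag_blocks(enum toy_mt *blocks, size_t nr_blocks)
{
	size_t i, changed = 0;

	assert(blocks != NULL || nr_blocks == 0);

	for (i = 0; i < nr_blocks; ++i) {
		if (!toy_may_retag(blocks[i]))
			continue;
		blocks[i] = TOY_MT_MOVABLE;
		++changed;
	}
	return changed;
}
```

The point of the guard is that splitting a large free page inside a CMA
or isolated region no longer turns that region into ordinary movable
memory behind the owner's back.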

---
 include/linux/mm.h             |    1 -
 include/linux/page-isolation.h |    4 +-
 mm/compaction.c                |   73 +++--------------------
 mm/internal.h                  |    5 ++
 mm/page_alloc.c                |  128 +++++++++++++++++++++++++---------------
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
-					      unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      gfp_t flags, unsigned migratetype);
+			      unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-				unsigned long blockpfn,
-				struct list_head *freelist)
-{
-	unsigned long zone_end_pfn, end_pfn;
-	int nr_scanned = 0, total_isolated = 0;
-	struct page *cursor;
-
-	/* Get the last PFN we should scan for free pages at */
-	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-	end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-	/* Find the first usable PFN in the block to initialse page cursor */
-	for (; blockpfn < end_pfn; blockpfn++) {
-		if (pfn_valid_within(blockpfn))
-			break;
-	}
-	cursor = pfn_to_page(blockpfn);
-
-	/* Isolate free pages. This assumes the block is valid */
-	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
-		struct page *page = cursor;
-
-		if (!pfn_valid_within(blockpfn))
-			continue;
-		nr_scanned++;
-
-		if (!PageBuddy(page))
-			continue;
-
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-		}
-	}
-
-	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-	return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
 	int migratetype = get_pageblock_migratetype(page);
 
 	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
 	if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
 		return false;
 
-	/* Keep MIGRATE_CMA alone as well. */
-	/*
-	 * XXX Revisit.  We currently cannot let compaction touch CMA
-	 * pages since compaction insists on changing their migration
-	 * type to MIGRATE_MOVABLE (see split_free_page() called from
-	 * isolate_freepages_block() above).
-	 */
-	if (is_migrate_cma(migratetype))
-		return false;
-
 	/* If the page is a large free page, then allow migration */
 	if (PageBuddy(page) && page_order(page) >= pageblock_order)
 		return true;
@@ -149,7 +87,7 @@ static void isolate_freepages(struct zone *zone,
 				struct compact_control *cc)
 {
 	struct page *page;
-	unsigned long high_pfn, low_pfn, pfn;
+	unsigned long high_pfn, low_pfn, pfn, zone_end_pfn, end_pfn;
 	unsigned long flags;
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
@@ -169,6 +107,8 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	high_pfn = min(low_pfn, pfn);
 
+	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+
 	/*
 	 * Isolate free pages until enough are available to migrate the
 	 * pages on cc->migratepages. We stop searching if the migrate
@@ -176,7 +116,7 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
 					pfn -= pageblock_nr_pages) {
-		unsigned long isolated;
+		unsigned isolated, scanned;
 
 		if (!pfn_valid(pfn))
 			continue;
@@ -205,7 +145,10 @@ static void isolate_freepages(struct zone *zone,
 		isolated = 0;
 		spin_lock_irqsave(&zone->lock, flags);
 		if (suitable_migration_target(page)) {
-			isolated = isolate_freepages_block(zone, pfn, freelist);
+			end_pfn = min(pfn + pageblock_nr_pages, zone_end_pfn);
+			isolated = isolate_freepages_range(zone, pfn,
+					end_pfn, freelist, &scanned);
+			trace_mm_compaction_isolate_freepages(scanned, isolated);
 			nr_freepages += isolated;
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/internal.h b/mm/internal.h
index d071d380..4a9bb3f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -263,3 +263,8 @@ extern u64 hwpoison_filter_flags_mask;
 extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
+
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist,
+				 unsigned *scannedp);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df69706..adf3f34 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1300,10 +1300,11 @@ void split_page(struct page *page, unsigned int order)
  * Note: this is probably too low level an operation for use in drivers.
  * Please consult with lkml before using this in your driver.
  */
-int split_free_page(struct page *page)
+static unsigned split_free_page(struct page *page)
 {
 	unsigned int order;
 	unsigned long watermark;
+	struct page *endpage;
 	struct zone *zone;
 
 	BUG_ON(!PageBuddy(page));
@@ -1326,14 +1327,18 @@ int split_free_page(struct page *page)
 	set_page_refcounted(page);
 	split_page(page, order);
 
-	if (order >= pageblock_order - 1) {
-		struct page *endpage = page + (1 << order) - 1;
-		for (; page < endpage; page += pageblock_nr_pages)
-			if (!is_pageblock_cma(page))
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
+	if (order < pageblock_order - 1)
+		goto done;
+
+	endpage = page + (1 << order) - 1;
+	for (; page < endpage; page += pageblock_nr_pages) {
+		int mt = get_pageblock_migratetype(page);
+		/* Don't change CMA nor ISOLATE */
+		if (!is_migrate_cma(mt) && mt != MIGRATE_ISOLATE)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 	}
 
+done:
 	return 1 << order;
 }
 
@@ -5723,57 +5728,76 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
-				       gfp_t flag)
+/**
+ * isolate_freepages_range() - isolate free pages, must hold zone->lock.
+ * @zone:	Zone pages are in.
+ * @start:	The first PFN to start isolating.
+ * @end:	The one-past-last PFN.
+ * @freelist:	A list to save isolated pages to.
+ * @scannedp:	Optional pointer where to save number of scanned pages.
+ *
+ * If @freelist is not provided, holes in range (either non-free pages
+ * or invalid PFN) are considered an error and function undoes its
+ * actions and returns zero.
+ *
+ * If @freelist is provided, function will simply skip non-free and
+ * missing pages and put only the ones isolated on the list.  The caller
+ * is expected to call trace_mm_compaction_isolate_freepages() itself.
+ *
+ * Returns number of isolated pages.  This may be more than end-start
+ * if end fell in the middle of a free page.
+ */
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist, unsigned *scannedp)
 {
-	unsigned long pfn = start, count;
+	unsigned nr_scanned = 0, total_isolated = 0;
+	unsigned long pfn = start;
 	struct page *page;
-	struct zone *zone;
-	int order;
 
 	VM_BUG_ON(!pfn_valid(start));
-	page = pfn_to_page(start);
-	zone = page_zone(page);
 
-	spin_lock_irq(&zone->lock);
+	/* Isolate free pages. This assumes the block is valid */
+	page = pfn_to_page(pfn);
+	while (pfn < end) {
+		unsigned isolated = 1;
 
-	for (;;) {
-		VM_BUG_ON(!page_count(page) || !PageBuddy(page) ||
-			  page_zone(page) != zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
-		list_del(&page->lru);
-		order = page_order(page);
-		count = 1UL << order;
-		zone->free_area[order].nr_free--;
-		rmv_page_order(page);
-		__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+		if (!pfn_valid_within(pfn))
+			goto skip;
+		++nr_scanned;
 
-		pfn += count;
-		if (pfn >= end)
-			break;
-		VM_BUG_ON(!pfn_valid(pfn));
-
-		if (zone_pfn_same_memmap(pfn - count, pfn))
-			page += count;
-		else
-			page = pfn_to_page(pfn);
-	}
+		if (!PageBuddy(page)) {
+skip:
+			if (freelist)
+				goto next;
+			for (; start < pfn; ++start)
+				__free_page(pfn_to_page(start));
+			return 0;
+		}
 
-	spin_unlock_irq(&zone->lock);
+		/* Found a free page, break it into order-0 pages */
+		isolated = split_free_page(page);
+		total_isolated += isolated;
+		if (freelist) {
+			struct page *p = page;
+			unsigned i = isolated;
+			for (; i--; ++p)
+				list_add(&p->lru, freelist);
+		}
 
-	/* After this, pages in the range can be freed one be one */
-	count = pfn - start;
-	pfn = start;
-	for (page = pfn_to_page(pfn); count; --count) {
-		prep_new_page(page, 0, flag);
-		++pfn;
-		if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
-			++page;
+next:		/* Advance to the next page */
+		pfn += isolated;
+		if (zone_pfn_same_memmap(pfn - isolated, pfn))
+			page += isolated;
 		else
 			page = pfn_to_page(pfn);
 	}
 
-	return pfn;
+	if (scannedp)
+		*scannedp = nr_scanned;
+	return total_isolated;
 }
 
 static unsigned long pfn_to_maxpage(unsigned long pfn)
@@ -5837,7 +5861,6 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @flags:	flags passed to alloc_contig_freed_pages().
  * @migratetype:	migratetype of the underlaying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
@@ -5853,9 +5876,10 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * need to be freed with free_contig_pages().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       gfp_t flags, unsigned migratetype)
+		       unsigned migratetype)
 {
 	unsigned long outer_start, outer_end;
+	struct zone *zone;
 	int ret;
 
 	/*
@@ -5910,7 +5934,17 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 			return -EINVAL;
 
 	outer_start = start & (~0UL << ret);
-	outer_end   = alloc_contig_freed_pages(outer_start, end, flags);
+
+	zone = page_zone(pfn_to_page(outer_start));
+	spin_lock_irq(&zone->lock);
+	outer_end = isolate_freepages_range(zone, outer_start, end, NULL, NULL);
+	spin_unlock_irq(&zone->lock);
+
+	if (!outer_end) {
+		ret = -EBUSY;
+		goto done;
+	}
+	outer_end += outer_start;
 
 	/* Free head and tail (if any) */
 	if (start != outer_start)
-- 
1.7.3.1

^ permalink raw reply related

* Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added
From: Michal Nazarewicz @ 2011-10-24  4:05 UTC (permalink / raw)
  To: Marek Szyprowski, Mel Gorman
  Cc: Ankita Garg, Daniel Walker, Russell King, Arnd Bergmann,
	Jesse Barker, Chunsang Jeong, Jonathan Corbet, linux-kernel,
	Dave Hansen, linaro-mm-sig, linux-mm, Kyungmin Park,
	KAMEZAWA Hiroyuki, Shariq Hasnain, Andrew Morton,
	linux-arm-kernel, linux-media
In-Reply-To: <20111018122109.GB6660@csn.ul.ie>

> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman <mel@csn.ul.ie> wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that depending
on arguments it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.

It also accepts a range rather than just assuming a single pageblock
thus the change moves range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().

The change also modifies split_free_page() so that it does not try to
change pageblock's migrate type if current migrate type is ISOLATE or
CMA.

---
 include/linux/mm.h             |    1 -
 include/linux/page-isolation.h |    4 +-
 mm/compaction.c                |   73 +++--------------------
 mm/internal.h                  |    5 ++
 mm/page_alloc.c                |  128 +++++++++++++++++++++++++---------------
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
-					      unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      gfp_t flags, unsigned migratetype);
+			      unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-				unsigned long blockpfn,
-				struct list_head *freelist)
-{
-	unsigned long zone_end_pfn, end_pfn;
-	int nr_scanned = 0, total_isolated = 0;
-	struct page *cursor;
-
-	/* Get the last PFN we should scan for free pages at */
-	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-	end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-	/* Find the first usable PFN in the block to initialse page cursor */
-	for (; blockpfn < end_pfn; blockpfn++) {
-		if (pfn_valid_within(blockpfn))
-			break;
-	}
-	cursor = pfn_to_page(blockpfn);
-
-	/* Isolate free pages. This assumes the block is valid */
-	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
-		struct page *page = cursor;
-
-		if (!pfn_valid_within(blockpfn))
-			continue;
-		nr_scanned++;
-
-		if (!PageBuddy(page))
-			continue;
-
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-		}
-	}
-
-	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-	return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
 	int migratetype = get_pageblock_migratetype(page);
 
 	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
 	if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
 		return false;
 
-	/* Keep MIGRATE_CMA alone as well. */
-	/*
-	 * XXX Revisit.  We currently cannot let compaction touch CMA
-	 * pages since compaction insists on changing their migration
-	 * type to MIGRATE_MOVABLE (see split_free_page() called from
-	 * isolate_freepages_block() above).
-	 */
-	if (is_migrate_cma(migratetype))
-		return false;
-
 	/* If the page is a large free page, then allow migration */
 	if (PageBuddy(page) && page_order(page) >= pageblock_order)
 		return true;
@@ -149,7 +87,7 @@ static void isolate_freepages(struct zone *zone,
 				struct compact_control *cc)
 {
 	struct page *page;
-	unsigned long high_pfn, low_pfn, pfn;
+	unsigned long high_pfn, low_pfn, pfn, zone_end_pfn, end_pfn;
 	unsigned long flags;
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
@@ -169,6 +107,8 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	high_pfn = min(low_pfn, pfn);
 
+	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+
 	/*
 	 * Isolate free pages until enough are available to migrate the
 	 * pages on cc->migratepages. We stop searching if the migrate
@@ -176,7 +116,7 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
 					pfn -= pageblock_nr_pages) {
-		unsigned long isolated;
+		unsigned isolated, scanned;
 
 		if (!pfn_valid(pfn))
 			continue;
@@ -205,7 +145,10 @@ static void isolate_freepages(struct zone *zone,
 		isolated = 0;
 		spin_lock_irqsave(&zone->lock, flags);
 		if (suitable_migration_target(page)) {
-			isolated = isolate_freepages_block(zone, pfn, freelist);
+			end_pfn = min(pfn + pageblock_nr_pages, zone_end_pfn);
+			isolated = isolate_freepages_range(zone, pfn,
+					end_pfn, freelist, &scanned);
+			trace_mm_compaction_isolate_freepages(scanned, isolated);
 			nr_freepages += isolated;
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/internal.h b/mm/internal.h
index d071d380..4a9bb3f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -263,3 +263,8 @@ extern u64 hwpoison_filter_flags_mask;
 extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
+
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist,
+				 unsigned *scannedp);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df69706..adf3f34 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1300,10 +1300,11 @@ void split_page(struct page *page, unsigned int order)
  * Note: this is probably too low level an operation for use in drivers.
  * Please consult with lkml before using this in your driver.
  */
-int split_free_page(struct page *page)
+static unsigned split_free_page(struct page *page)
 {
 	unsigned int order;
 	unsigned long watermark;
+	struct page *endpage;
 	struct zone *zone;
 
 	BUG_ON(!PageBuddy(page));
@@ -1326,14 +1327,18 @@ int split_free_page(struct page *page)
 	set_page_refcounted(page);
 	split_page(page, order);
 
-	if (order >= pageblock_order - 1) {
-		struct page *endpage = page + (1 << order) - 1;
-		for (; page < endpage; page += pageblock_nr_pages)
-			if (!is_pageblock_cma(page))
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
+	if (order < pageblock_order - 1)
+		goto done;
+
+	endpage = page + (1 << order) - 1;
+	for (; page < endpage; page += pageblock_nr_pages) {
+		int mt = get_pageblock_migratetype(page);
+		/* Don't change CMA nor ISOLATE */
+		if (!is_migrate_cma(mt) && mt != MIGRATE_ISOLATE)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 	}
 
+done:
 	return 1 << order;
 }
 
@@ -5723,57 +5728,76 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
-				       gfp_t flag)
+/**
+ * isolate_freepages_range() - isolate free pages, must hold zone->lock.
+ * @zone:	Zone pages are in.
+ * @start:	The first PFN to start isolating.
+ * @end:	The one-past-last PFN.
+ * @freelist:	A list to save isolated pages to.
+ * @scannedp:	Optional pointer where to save number of scanned pages.
+ *
+ * If @freelist is not provided, holes in range (either non-free pages
+ * or invalid PFN) are considered an error and function undoes its
+ * actions and returns zero.
+ *
+ * If @freelist is provided, function will simply skip non-free and
+ * missing pages and put only the ones isolated on the list.  It will
+ * also call trace_mm_compaction_isolate_freepages() at the end.
+ *
+ * Returns number of isolated pages.  This may be more than end-start
+ * if end fell in the middle of a free page.
+ */
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist, unsigned *scannedp)
 {
-	unsigned long pfn = start, count;
+	unsigned nr_scanned = 0, total_isolated = 0;
+	unsigned long pfn = start;
 	struct page *page;
-	struct zone *zone;
-	int order;
 
 	VM_BUG_ON(!pfn_valid(start));
-	page = pfn_to_page(start);
-	zone = page_zone(page);
 
-	spin_lock_irq(&zone->lock);
+	/* Isolate free pages. This assumes the block is valid */
+	page = pfn_to_page(pfn);
+	while (pfn < end) {
+		unsigned isolated = 1;
 
-	for (;;) {
-		VM_BUG_ON(!page_count(page) || !PageBuddy(page) ||
-			  page_zone(page) != zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
-		list_del(&page->lru);
-		order = page_order(page);
-		count = 1UL << order;
-		zone->free_area[order].nr_free--;
-		rmv_page_order(page);
-		__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+		if (!pfn_valid_within(pfn))
+			goto skip;
+		++nr_scanned;
 
-		pfn += count;
-		if (pfn >= end)
-			break;
-		VM_BUG_ON(!pfn_valid(pfn));
-
-		if (zone_pfn_same_memmap(pfn - count, pfn))
-			page += count;
-		else
-			page = pfn_to_page(pfn);
-	}
+		if (!PageBuddy(page)) {
+skip:
+			if (freelist)
+				goto next;
+			for (; start < pfn; ++start)
+				__free_page(pfn_to_page(start));
+			return 0;
+		}
 
-	spin_unlock_irq(&zone->lock);
+		/* Found a free page, break it into order-0 pages */
+		isolated = split_free_page(page);
+		total_isolated += isolated;
+		if (freelist) {
+			struct page *p = page;
+			unsigned i = isolated;
+			for (; i--; ++p)
+				list_add(&p->lru, freelist);
+		}
 
-	/* After this, pages in the range can be freed one be one */
-	count = pfn - start;
-	pfn = start;
-	for (page = pfn_to_page(pfn); count; --count) {
-		prep_new_page(page, 0, flag);
-		++pfn;
-		if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
-			++page;
+next:		/* Advance to the next page */
+		pfn += isolated;
+		if (zone_pfn_same_memmap(pfn - isolated, pfn))
+			page += isolated;
 		else
 			page = pfn_to_page(pfn);
 	}
 
-	return pfn;
+	if (scannedp)
+		*scannedp = nr_scanned;
+	return total_isolated;
 }
 
 static unsigned long pfn_to_maxpage(unsigned long pfn)
@@ -5837,7 +5861,6 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @flags:	flags passed to alloc_contig_freed_pages().
  * @migratetype:	migratetype of the underlaying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
@@ -5853,9 +5876,10 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * need to be freed with free_contig_pages().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       gfp_t flags, unsigned migratetype)
+		       unsigned migratetype)
 {
 	unsigned long outer_start, outer_end;
+	struct zone *zone;
 	int ret;
 
 	/*
@@ -5910,7 +5934,17 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 			return -EINVAL;
 
 	outer_start = start & (~0UL << ret);
-	outer_end   = alloc_contig_freed_pages(outer_start, end, flags);
+
+	zone = page_zone(pfn_to_page(outer_start));
+	spin_lock_irq(&zone->lock);
+	outer_end = isolate_freepages_range(zone, outer_start, end, NULL, NULL);
+	spin_unlock_irq(&zone->lock);
+
+	if (!outer_end) {
+		ret = -EBUSY;
+		goto done;
+	}
+	outer_end += outer_start;
 
 	/* Free head and tail (if any) */
 	if (start != outer_start)
-- 
1.7.3.1

^ permalink raw reply related

* Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added
From: Michal Nazarewicz @ 2011-10-24  4:05 UTC (permalink / raw)
  To: Marek Szyprowski, Mel Gorman
  Cc: linux-kernel, linux-arm-kernel, linux-media, linux-mm,
	linaro-mm-sig, Kyungmin Park, Russell King, Andrew Morton,
	KAMEZAWA Hiroyuki, Ankita Garg, Daniel Walker, Arnd Bergmann,
	Jesse Barker, Jonathan Corbet, Shariq Hasnain, Chunsang Jeong,
	Dave Hansen
In-Reply-To: <20111018122109.GB6660@csn.ul.ie>

> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman <mel@csn.ul.ie> wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that depending
on arguments it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.

It also accepts a range rather than just assuming a single pageblock,
thus the change moves range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().

The change also modifies split_free_page() so that it does not try to
change pageblock's migrate type if current migrate type is ISOLATE or
CMA.
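
To make the two modes easier to discuss, here is a rough userspace
sketch of the scan described above.  It is only a model, not the kernel
code: model_isolate_range(), SLOT_BUSY, SLOT_HOLE and the "slots" array
(which stores the order of a free block at its first PFN) are all
hypothetical names made up for illustration, the "strict" flag stands
in for passing a NULL freelist, and nothing is really isolated, so the
undo step is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical encoding of one PFN slot (not a kernel structure):
 *   >= 0       a free block of 2^value pages starts at this PFN
 *   SLOT_BUSY  the page is valid but not free
 *   SLOT_HOLE  the PFN is invalid (a hole in the memory map)
 */
enum { SLOT_BUSY = -1, SLOT_HOLE = -2 };

/*
 * Model of the range scan.
 *   strict == true:  any busy page or hole is an error, return 0
 *                    (the behaviour wanted by CMA).
 *   strict == false: busy pages and holes are skipped
 *                    (the behaviour wanted by compaction).
 * Returns the number of pages in the free blocks found.  This may
 * exceed end - start when the last free block straddles end.
 */
unsigned model_isolate_range(const int *slots, size_t nslots,
			     size_t start, size_t end, bool strict,
			     unsigned *scannedp)
{
	unsigned scanned = 0, total = 0;
	size_t pfn = start;

	assert(start <= end && end <= nslots);

	while (pfn < end) {
		unsigned step = 1;
		int s = slots[pfn];

		/* Holes are not counted as scanned pages. */
		if (s != SLOT_HOLE)
			scanned++;

		if (s < 0) {
			if (strict)
				return 0;
		} else {
			/* Free block: take all of it and jump past it. */
			step = 1u << s;
			total += step;
		}
		pfn += step;
	}

	if (scannedp)
		*scannedp = scanned;
	return total;
}
```

The point of the single flag is that both callers share one loop and
differ only in what happens when a page cannot be taken.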

---
 include/linux/mm.h             |    1 -
 include/linux/page-isolation.h |    4 +-
 mm/compaction.c                |   73 +++--------------------
 mm/internal.h                  |    5 ++
 mm/page_alloc.c                |  128 +++++++++++++++++++++++++---------------
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
-					      unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      gfp_t flags, unsigned migratetype);
+			      unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-				unsigned long blockpfn,
-				struct list_head *freelist)
-{
-	unsigned long zone_end_pfn, end_pfn;
-	int nr_scanned = 0, total_isolated = 0;
-	struct page *cursor;
-
-	/* Get the last PFN we should scan for free pages at */
-	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-	end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-	/* Find the first usable PFN in the block to initialse page cursor */
-	for (; blockpfn < end_pfn; blockpfn++) {
-		if (pfn_valid_within(blockpfn))
-			break;
-	}
-	cursor = pfn_to_page(blockpfn);
-
-	/* Isolate free pages. This assumes the block is valid */
-	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
-		struct page *page = cursor;
-
-		if (!pfn_valid_within(blockpfn))
-			continue;
-		nr_scanned++;
-
-		if (!PageBuddy(page))
-			continue;
-
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-		}
-	}
-
-	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-	return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
 	int migratetype = get_pageblock_migratetype(page);
 
 	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
 	if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
 		return false;
 
-	/* Keep MIGRATE_CMA alone as well. */
-	/*
-	 * XXX Revisit.  We currently cannot let compaction touch CMA
-	 * pages since compaction insists on changing their migration
-	 * type to MIGRATE_MOVABLE (see split_free_page() called from
-	 * isolate_freepages_block() above).
-	 */
-	if (is_migrate_cma(migratetype))
-		return false;
-
 	/* If the page is a large free page, then allow migration */
 	if (PageBuddy(page) && page_order(page) >= pageblock_order)
 		return true;
@@ -149,7 +87,7 @@ static void isolate_freepages(struct zone *zone,
 				struct compact_control *cc)
 {
 	struct page *page;
-	unsigned long high_pfn, low_pfn, pfn;
+	unsigned long high_pfn, low_pfn, pfn, zone_end_pfn, end_pfn;
 	unsigned long flags;
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
@@ -169,6 +107,8 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	high_pfn = min(low_pfn, pfn);
 
+	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+
 	/*
 	 * Isolate free pages until enough are available to migrate the
 	 * pages on cc->migratepages. We stop searching if the migrate
@@ -176,7 +116,7 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
 					pfn -= pageblock_nr_pages) {
-		unsigned long isolated;
+		unsigned isolated, scanned;
 
 		if (!pfn_valid(pfn))
 			continue;
@@ -205,7 +145,10 @@ static void isolate_freepages(struct zone *zone,
 		isolated = 0;
 		spin_lock_irqsave(&zone->lock, flags);
 		if (suitable_migration_target(page)) {
-			isolated = isolate_freepages_block(zone, pfn, freelist);
+			end_pfn = min(pfn + pageblock_nr_pages, zone_end_pfn);
+			isolated = isolate_freepages_range(zone, pfn,
+					end_pfn, freelist, &scanned);
+			trace_mm_compaction_isolate_freepages(scanned, isolated);
 			nr_freepages += isolated;
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/internal.h b/mm/internal.h
index d071d380..4a9bb3f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -263,3 +263,8 @@ extern u64 hwpoison_filter_flags_mask;
 extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
+
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist,
+				 unsigned *scannedp);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df69706..adf3f34 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1300,10 +1300,11 @@ void split_page(struct page *page, unsigned int order)
  * Note: this is probably too low level an operation for use in drivers.
  * Please consult with lkml before using this in your driver.
  */
-int split_free_page(struct page *page)
+static unsigned split_free_page(struct page *page)
 {
 	unsigned int order;
 	unsigned long watermark;
+	struct page *endpage;
 	struct zone *zone;
 
 	BUG_ON(!PageBuddy(page));
@@ -1326,14 +1327,18 @@ int split_free_page(struct page *page)
 	set_page_refcounted(page);
 	split_page(page, order);
 
-	if (order >= pageblock_order - 1) {
-		struct page *endpage = page + (1 << order) - 1;
-		for (; page < endpage; page += pageblock_nr_pages)
-			if (!is_pageblock_cma(page))
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
+	if (order < pageblock_order - 1)
+		goto done;
+
+	endpage = page + (1 << order) - 1;
+	for (; page < endpage; page += pageblock_nr_pages) {
+		int mt = get_pageblock_migratetype(page);
+		/* Don't change CMA nor ISOLATE */
+		if (!is_migrate_cma(mt) && mt != MIGRATE_ISOLATE)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 	}
 
+done:
 	return 1 << order;
 }
 
@@ -5723,57 +5728,76 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
-				       gfp_t flag)
+/**
+ * isolate_freepages_range() - isolate free pages, must hold zone->lock.
+ * @zone:	Zone pages are in.
+ * @start:	The first PFN to start isolating.
+ * @end:	The one-past-last PFN.
+ * @freelist:	A list to save isolated pages to.
+ * @scannedp:	Optional pointer where to save number of scanned pages.
+ *
+ * If @freelist is not provided, holes in range (either non-free pages
+ * or invalid PFN) are considered an error and function undoes its
+ * actions and returns zero.
+ *
+ * If @freelist is provided, function will simply skip non-free and
+ * missing pages and put only the ones isolated on the list.  It will
+ * also call trace_mm_compaction_isolate_freepages() at the end.
+ *
+ * Returns number of isolated pages.  This may be more than end-start
+ * if end fell in the middle of a free page.
+ */
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist, unsigned *scannedp)
 {
-	unsigned long pfn = start, count;
+	unsigned nr_scanned = 0, total_isolated = 0;
+	unsigned long pfn = start;
 	struct page *page;
-	struct zone *zone;
-	int order;
 
 	VM_BUG_ON(!pfn_valid(start));
-	page = pfn_to_page(start);
-	zone = page_zone(page);
 
-	spin_lock_irq(&zone->lock);
+	/* Isolate free pages. This assumes the block is valid */
+	page = pfn_to_page(pfn);
+	while (pfn < end) {
+		unsigned isolated = 1;
 
-	for (;;) {
-		VM_BUG_ON(!page_count(page) || !PageBuddy(page) ||
-			  page_zone(page) != zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
-		list_del(&page->lru);
-		order = page_order(page);
-		count = 1UL << order;
-		zone->free_area[order].nr_free--;
-		rmv_page_order(page);
-		__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+		if (!pfn_valid_within(pfn))
+			goto skip;
+		++nr_scanned;
 
-		pfn += count;
-		if (pfn >= end)
-			break;
-		VM_BUG_ON(!pfn_valid(pfn));
-
-		if (zone_pfn_same_memmap(pfn - count, pfn))
-			page += count;
-		else
-			page = pfn_to_page(pfn);
-	}
+		if (!PageBuddy(page)) {
+skip:
+			if (freelist)
+				goto next;
+			for (; start < pfn; ++start)
+				__free_page(pfn_to_page(start));
+			return 0;
+		}
 
-	spin_unlock_irq(&zone->lock);
+		/* Found a free page, break it into order-0 pages */
+		isolated = split_free_page(page);
+		total_isolated += isolated;
+		if (freelist) {
+			struct page *p = page;
+			unsigned i = isolated;
+			for (; i--; ++p)
+				list_add(&p->lru, freelist);
+		}
 
-	/* After this, pages in the range can be freed one be one */
-	count = pfn - start;
-	pfn = start;
-	for (page = pfn_to_page(pfn); count; --count) {
-		prep_new_page(page, 0, flag);
-		++pfn;
-		if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
-			++page;
+next:		/* Advance to the next page */
+		pfn += isolated;
+		if (zone_pfn_same_memmap(pfn - isolated, pfn))
+			page += isolated;
 		else
 			page = pfn_to_page(pfn);
 	}
 
-	return pfn;
+	if (scannedp)
+		*scannedp = nr_scanned;
+	return total_isolated;
 }
 
 static unsigned long pfn_to_maxpage(unsigned long pfn)
@@ -5837,7 +5861,6 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @flags:	flags passed to alloc_contig_freed_pages().
  * @migratetype:	migratetype of the underlaying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
@@ -5853,9 +5876,10 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * need to be freed with free_contig_pages().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       gfp_t flags, unsigned migratetype)
+		       unsigned migratetype)
 {
 	unsigned long outer_start, outer_end;
+	struct zone *zone;
 	int ret;
 
 	/*
@@ -5910,7 +5934,17 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 			return -EINVAL;
 
 	outer_start = start & (~0UL << ret);
-	outer_end   = alloc_contig_freed_pages(outer_start, end, flags);
+
+	zone = page_zone(pfn_to_page(outer_start));
+	spin_lock_irq(&zone->lock);
+	outer_end = isolate_freepages_range(zone, outer_start, end, NULL, NULL);
+	spin_unlock_irq(&zone->lock);
+
+	if (!outer_end) {
+		ret = -EBUSY;
+		goto done;
+	}
+	outer_end += outer_start;
 
 	/* Free head and tail (if any) */
 	if (start != outer_start)
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH 2/9] mm: alloc_contig_freed_pages() added
From: Michal Nazarewicz @ 2011-10-24  4:05 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20111018122109.GB6660@csn.ul.ie>

> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman <mel@csn.ul.ie> wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that depending
on arguments it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.

It also accepts a range rather than just assuming a single pageblock,
thus the change moves range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().

The change also modifies split_free_page() so that it does not try to
change pageblock's migrate type if current migrate type is ISOLATE or
CMA.
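
As a rough illustration of the migrate type rule in the last paragraph,
the following userspace sketch retags a run of pageblocks as movable
while leaving CMA and ISOLATE blocks alone.  It is a model under stated
assumptions, not the kernel code: enum model_migratetype,
model_keep_migratetype() and model_retag_pageblocks() are hypothetical
names, and a plain array stands in for the pageblock bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum model_migratetype {
	MT_UNMOVABLE,
	MT_MOVABLE,
	MT_RESERVE,
	MT_CMA,
	MT_ISOLATE,
};

/* CMA and ISOLATE pageblocks must keep their migrate type. */
static bool model_keep_migratetype(enum model_migratetype mt)
{
	return mt == MT_CMA || mt == MT_ISOLATE;
}

/*
 * Retag every pageblock covered by a split free page as movable,
 * except the protected ones.  Returns how many blocks actually
 * changed type.
 */
unsigned model_retag_pageblocks(enum model_migratetype *blocks,
				size_t nblocks)
{
	unsigned changed = 0;
	size_t i;

	assert(blocks != NULL || nblocks == 0);

	for (i = 0; i < nblocks; i++) {
		if (model_keep_migratetype(blocks[i]))
			continue;
		if (blocks[i] != MT_MOVABLE)
			changed++;
		blocks[i] = MT_MOVABLE;
	}
	return changed;
}
```

With the guard in one place, compaction no longer needs its own check
to stay away from CMA pageblocks, which is why the patch drops that
check from suitable_migration_target().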

---
 include/linux/mm.h             |    1 -
 include/linux/page-isolation.h |    4 +-
 mm/compaction.c                |   73 +++--------------------
 mm/internal.h                  |    5 ++
 mm/page_alloc.c                |  128 +++++++++++++++++++++++++---------------
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
-					      unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      gfp_t flags, unsigned migratetype);
+			      unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-				unsigned long blockpfn,
-				struct list_head *freelist)
-{
-	unsigned long zone_end_pfn, end_pfn;
-	int nr_scanned = 0, total_isolated = 0;
-	struct page *cursor;
-
-	/* Get the last PFN we should scan for free pages at */
-	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-	end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-	/* Find the first usable PFN in the block to initialse page cursor */
-	for (; blockpfn < end_pfn; blockpfn++) {
-		if (pfn_valid_within(blockpfn))
-			break;
-	}
-	cursor = pfn_to_page(blockpfn);
-
-	/* Isolate free pages. This assumes the block is valid */
-	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
-		struct page *page = cursor;
-
-		if (!pfn_valid_within(blockpfn))
-			continue;
-		nr_scanned++;
-
-		if (!PageBuddy(page))
-			continue;
-
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-		}
-	}
-
-	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-	return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
 	int migratetype = get_pageblock_migratetype(page);
 
 	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
 	if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
 		return false;
 
-	/* Keep MIGRATE_CMA alone as well. */
-	/*
-	 * XXX Revisit.  We currently cannot let compaction touch CMA
-	 * pages since compaction insists on changing their migration
-	 * type to MIGRATE_MOVABLE (see split_free_page() called from
-	 * isolate_freepages_block() above).
-	 */
-	if (is_migrate_cma(migratetype))
-		return false;
-
 	/* If the page is a large free page, then allow migration */
 	if (PageBuddy(page) && page_order(page) >= pageblock_order)
 		return true;
@@ -149,7 +87,7 @@ static void isolate_freepages(struct zone *zone,
 				struct compact_control *cc)
 {
 	struct page *page;
-	unsigned long high_pfn, low_pfn, pfn;
+	unsigned long high_pfn, low_pfn, pfn, zone_end_pfn, end_pfn;
 	unsigned long flags;
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
@@ -169,6 +107,8 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	high_pfn = min(low_pfn, pfn);
 
+	zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+
 	/*
 	 * Isolate free pages until enough are available to migrate the
 	 * pages on cc->migratepages. We stop searching if the migrate
@@ -176,7 +116,7 @@ static void isolate_freepages(struct zone *zone,
 	 */
 	for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
 					pfn -= pageblock_nr_pages) {
-		unsigned long isolated;
+		unsigned isolated, scanned;
 
 		if (!pfn_valid(pfn))
 			continue;
@@ -205,7 +145,10 @@ static void isolate_freepages(struct zone *zone,
 		isolated = 0;
 		spin_lock_irqsave(&zone->lock, flags);
 		if (suitable_migration_target(page)) {
-			isolated = isolate_freepages_block(zone, pfn, freelist);
+			end_pfn = min(pfn + pageblock_nr_pages, zone_end_pfn);
+			isolated = isolate_freepages_range(zone, pfn,
+					end_pfn, freelist, &scanned);
+			trace_mm_compaction_isolate_freepages(scanned, isolated);
 			nr_freepages += isolated;
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/internal.h b/mm/internal.h
index d071d380..4a9bb3f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -263,3 +263,8 @@ extern u64 hwpoison_filter_flags_mask;
 extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
+
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist,
+				 unsigned *scannedp);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df69706..adf3f34 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1300,10 +1300,11 @@ void split_page(struct page *page, unsigned int order)
  * Note: this is probably too low level an operation for use in drivers.
  * Please consult with lkml before using this in your driver.
  */
-int split_free_page(struct page *page)
+static unsigned split_free_page(struct page *page)
 {
 	unsigned int order;
 	unsigned long watermark;
+	struct page *endpage;
 	struct zone *zone;
 
 	BUG_ON(!PageBuddy(page));
@@ -1326,14 +1327,18 @@ int split_free_page(struct page *page)
 	set_page_refcounted(page);
 	split_page(page, order);
 
-	if (order >= pageblock_order - 1) {
-		struct page *endpage = page + (1 << order) - 1;
-		for (; page < endpage; page += pageblock_nr_pages)
-			if (!is_pageblock_cma(page))
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
+	if (order < pageblock_order - 1)
+		goto done;
+
+	endpage = page + (1 << order) - 1;
+	for (; page < endpage; page += pageblock_nr_pages) {
+		int mt = get_pageblock_migratetype(page);
+		/* Don't change CMA nor ISOLATE */
+		if (!is_migrate_cma(mt) && mt != MIGRATE_ISOLATE)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 	}
 
+done:
 	return 1 << order;
 }
 
@@ -5723,57 +5728,76 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
-				       gfp_t flag)
+/**
+ * isolate_freepages_range() - isolate free pages, must hold zone->lock.
+ * @zone:	Zone pages are in.
+ * @start:	The first PFN to start isolating.
+ * @end:	The one-past-last PFN.
+ * @freelist:	A list to save isolated pages to.
+ * @scannedp:	Optional pointer where to save number of scanned pages.
+ *
+ * If @freelist is not provided, holes in range (either non-free pages
+ * or invalid PFN) are considered an error and function undoes its
+ * actions and returns zero.
+ *
+ * If @freelist is provided, function will simply skip non-free and
+ * missing pages and put only the ones isolated on the list.  It will
+ * also call trace_mm_compaction_isolate_freepages() at the end.
+ *
+ * Returns number of isolated pages.  This may be more than end-start
+ * if end fell in the middle of a free page.
+ */
+unsigned isolate_freepages_range(struct zone *zone,
+				 unsigned long start, unsigned long end,
+				 struct list_head *freelist, unsigned *scannedp)
 {
-	unsigned long pfn = start, count;
+	unsigned nr_scanned = 0, total_isolated = 0;
+	unsigned long pfn = start;
 	struct page *page;
-	struct zone *zone;
-	int order;
 
 	VM_BUG_ON(!pfn_valid(start));
-	page = pfn_to_page(start);
-	zone = page_zone(page);
 
-	spin_lock_irq(&zone->lock);
+	/* Isolate free pages. This assumes the block is valid */
+	page = pfn_to_page(pfn);
+	while (pfn < end) {
+		unsigned isolated = 1;
 
-	for (;;) {
-		VM_BUG_ON(!page_count(page) || !PageBuddy(page) ||
-			  page_zone(page) != zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
-		list_del(&page->lru);
-		order = page_order(page);
-		count = 1UL << order;
-		zone->free_area[order].nr_free--;
-		rmv_page_order(page);
-		__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+		if (!pfn_valid_within(pfn))
+			goto skip;
+		++nr_scanned;
 
-		pfn += count;
-		if (pfn >= end)
-			break;
-		VM_BUG_ON(!pfn_valid(pfn));
-
-		if (zone_pfn_same_memmap(pfn - count, pfn))
-			page += count;
-		else
-			page = pfn_to_page(pfn);
-	}
+		if (!PageBuddy(page)) {
+skip:
+			if (freelist)
+				goto next;
+			for (; start < pfn; ++start)
+				__free_page(pfn_to_page(start));
+			return 0;
+		}
 
-	spin_unlock_irq(&zone->lock);
+		/* Found a free page, break it into order-0 pages */
+		isolated = split_free_page(page);
+		total_isolated += isolated;
+		if (freelist) {
+			struct page *p = page;
+			unsigned i = isolated;
+			for (; i--; ++p)
+				list_add(&p->lru, freelist);
+		}
 
-	/* After this, pages in the range can be freed one be one */
-	count = pfn - start;
-	pfn = start;
-	for (page = pfn_to_page(pfn); count; --count) {
-		prep_new_page(page, 0, flag);
-		++pfn;
-		if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
-			++page;
+next:		/* Advance to the next page */
+		pfn += isolated;
+		if (zone_pfn_same_memmap(pfn - isolated, pfn))
+			page += isolated;
 		else
 			page = pfn_to_page(pfn);
 	}
 
-	return pfn;
+	if (scannedp)
+		*scannedp = nr_scanned;
+	return total_isolated;
 }
 
 static unsigned long pfn_to_maxpage(unsigned long pfn)
@@ -5837,7 +5861,6 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @flags:	flags passed to alloc_contig_freed_pages().
  * @migratetype:	migratetype of the underlaying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
@@ -5853,9 +5876,10 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
  * need to be freed with free_contig_pages().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       gfp_t flags, unsigned migratetype)
+		       unsigned migratetype)
 {
 	unsigned long outer_start, outer_end;
+	struct zone *zone;
 	int ret;
 
 	/*
@@ -5910,7 +5934,17 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 			return -EINVAL;
 
 	outer_start = start & (~0UL << ret);
-	outer_end   = alloc_contig_freed_pages(outer_start, end, flags);
+
+	zone = page_zone(pfn_to_page(outer_start));
+	spin_lock_irq(&zone->lock);
+	outer_end = isolate_freepages_range(zone, outer_start, end, NULL, NULL);
+	spin_unlock_irq(&zone->lock);
+
+	if (!outer_end) {
+		ret = -EBUSY;
+		goto done;
+	}
+	outer_end += outer_start;
 
 	/* Free head and tail (if any) */
 	if (start != outer_start)
-- 
1.7.3.1
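
Illustrative aside, not part of the patch: the kernel-doc comment for
isolate_freepages_range() describes two modes, a strict one (no @freelist,
any hole aborts and undoes the work) and a lenient one (@freelist given,
holes are skipped). The userspace sketch below models only that control
flow, assuming a flat array in place of the memmap. Every toy_* name is
hypothetical, and an output array of PFNs stands in for the list_head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct toy_page {
	bool buddy;      /* set on the first page of a free block */
	unsigned order;  /* block spans 1 << order pages; valid if buddy */
	bool isolated;   /* set once the page has been taken */
};

/* Take a whole free block and report how many order-0 pages it held. */
static unsigned toy_split_free_page(struct toy_page *page)
{
	unsigned i, nr = 1u << page->order;

	page->buddy = false;
	for (i = 0; i < nr; i++)
		page[i].isolated = true;
	return nr;
}

/*
 * out == NULL: strict mode, a non-free page undoes everything, returns 0.
 * out != NULL: lenient mode, non-free pages are skipped and the PFNs of
 *              isolated pages are written to out[].
 * The caller must size map[] (and out[]) to cover a free block that
 * extends past 'end'.
 */
unsigned toy_isolate_range(struct toy_page *map, unsigned long start,
			   unsigned long end, unsigned long *out,
			   unsigned *scannedp)
{
	unsigned nr_scanned = 0, total = 0;
	unsigned long pfn = start;

	assert(start < end);
	while (pfn < end) {
		unsigned i, isolated = 1;

		++nr_scanned;
		if (map[pfn].buddy) {
			isolated = toy_split_free_page(&map[pfn]);
			for (i = 0; out && i < isolated; i++)
				out[total + i] = pfn + i;
			total += isolated;
		} else if (!out) {
			/* Strict mode: hand back every page taken so far. */
			for (; start < pfn; ++start) {
				map[start].isolated = false;
				map[start].buddy = true;
				map[start].order = 0;
			}
			return 0;
		}
		pfn += isolated;
	}

	if (scannedp)
		*scannedp = nr_scanned;
	return total;
}
```

With pages 0-1 free as one order-1 block, page 2 in use and page 3 free,
the lenient call over [0, 4) isolates three pages, while the strict call
over the same range returns zero and leaves nothing isolated. A range that
ends inside a free block returns more than end - start, which is why
alloc_contig_range() adds the return value to outer_start rather than
assuming it stops at end.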


