* [Cluster-devel] (no subject)
@ 2017-10-09 9:12 Andreas Gruenbacher
0 siblings, 0 replies; 3+ messages in thread
From: Andreas Gruenbacher @ 2017-10-09 9:12 UTC (permalink / raw)
To: cluster-devel.redhat.com
Date: Wed, 4 Oct 2017 23:09:38 +0200
Subject: [PATCH] direct-io: Prevent NULL pointer access in submit_page_section
In the code added to submit_page_section() by commit b1058b981, sdio->bio
can be NULL when dio_bio_submit() is called, which leads to a NULL pointer
dereference in dio_bio_submit(). Check for a NULL bio in
submit_page_section() before trying to submit it instead.
Fixes xfstest generic/250 on gfs2.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/direct-io.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 5fa2211e49ae..e0332da392d8 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -838,7 +838,8 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
*/
if (sdio->boundary) {
ret = dio_send_cur_page(dio, sdio, map_bh);
- dio_bio_submit(dio, sdio);
+ if (sdio->bio)
+ dio_bio_submit(dio, sdio);
put_page(sdio->cur_page);
sdio->cur_page = NULL;
}
--
2.13.5
* [Cluster-devel] (no subject)
@ 2015-10-13 10:07 eric
From: eric @ 2015-10-13 10:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi David and list,
I'm working on ocfs2 and have run into a problem with the DLM posix file
lock code. After some investigation, I'd like to share what I found and
get some hints from you.
Environment:
kernel: 3.12.47
FS: OCFS2
stack: pacemaker
cluster: 2 testing nodes, node1, node2
Issue desc:
There is a deadlock test case for file locks in the ocfs2 test suite.
The test first prepares a testing file1 on the shared disk, then on node1
does fcntl(file1, F_SETLKW, {F_WRLCK, SEEK_SET, 0, 0}), and then on node2
sets alarm(10) and also does fcntl(file1, F_SETLKW, {F_WRLCK, SEEK_SET,
0, 0}).
The test expects the alarm timeout to deliver SIGALRM and wake up the
sleeping process, as "man fcntl" says: "If a signal is caught while
waiting, then the call is interrupted and (after the signal handler has
returned) returns immediately (with return value -1 and errno set to
EINTR)".
But the process on node2 was in "Dl" state according to ps, and the
signal was blocked, so the test case hung forever.
Investigations:
* Key debug info:
process stack on node1:
n1:/opt/ocfs2-test/bin # cat /proc/22677/stack
[<ffffffff8104250b>] kvm_clock_get_cycles+0x1b/0x20
[<ffffffff810ba924>] __getnstimeofday+0x34/0xc0
[<ffffffff810ba9ba>] getnstimeofday+0xa/0x30
[<ffffffff811bb30d>] SyS_poll+0x5d/0xf0
[<ffffffff81529809>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
process stack on node2:
n2:~ # cat /proc/1534/stack
[<ffffffffa050fa65>] dlm_posix_lock+0x185/0x380 [dlm]
[<ffffffff811f39ce>] fcntl_setlk+0x12e/0x2d0
[<ffffffff811b8231>] SyS_fcntl+0x261/0x510
[<ffffffff81529809>] system_call_fastpath+0x16/0x1b
[<00007f3f5721eb42>] 0x7f3f5721eb42
[<ffffffffffffffff>] 0xffffffffffffffff
* dlm_posix_lock
By adding printk calls and recompiling the dlm kernel module, I located
where n2 hangs:
dlm_posix_lock -> wait_event_killable
wait_event_killable() puts the process into the TASK_KILLABLE state,
which is like TASK_UNINTERRUPTIBLE except that it can be woken by fatal
signals. In my tests, SIGTERM could wake it up, but SIGALRM could not
(presumably because an unhandled SIGTERM is fatal by default, while the
test installs a handler for SIGALRM, so a pending SIGALRM does not count
as fatal).
Doesn't this go against the posix file lock semantics? Any hints would be
much appreciated! I can provide more info if needed ;-)
Thanks,
Eric
* [Cluster-devel] (no subject)
@ 2010-02-05 5:45 Dave Chinner
From: Dave Chinner @ 2010-02-05 5:45 UTC (permalink / raw)
To: cluster-devel.redhat.com
These patches improve sequential write IO patterns and reduce ordered
write log contention.
The first patch is simply for diagnostic purposes - it enabled me to see
where IO was being dispatched from, and led directly to the fix in the
second patch. The third patch removes the use of WRITE_SYNC_PLUG for
async writes (data, metadata and log), and the fourth moves the AIL
pushing out from under the log lock so that incoming writes can still
proceed while the log is being flushed.
On a local disk where XFS can do 85MB/s sequential write, gfs2 can do:

            cfq      noop
  vanilla   38MB/s   48MB/s
  +2        48MB/s   65MB/s
  +3        48MB/s   65MB/s
  +4        51MB/s   75MB/s
The improvement comes from the disk being IO bound on this workload, so
the improved IO patterns translate directly into more throughput.
On a faster 4-disk dm stripe array on the same machine, where XFS can do
265MB/s (@ 550 iop/s) sequential write, gfs2 can do:

            cfq                  noop
  vanilla   135MB/s @ 400iop/s   130MB/s @ 800iop/s
  +4        135MB/s @ 400iop/s   130MB/s @ 500iop/s
No improvement or degradation in throughput is seen here, as the disks
never become IO bound - the write is CPU bound. However, on the noop
scheduler the same throughput is achieved with fewer iops, reflecting the
improved (larger) IO dispatch patterns.
The patches have not seen much testing, so this is really just a posting
for comments/feedback at this point.