27897 Commits

Author SHA1 Message Date
05c32f47bf dlm: fix race between remove and lookup
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:18:01 -05:00
1d7c484eeb dlm: use idr instead of list for recovered rsbs
When a large number of resources are being recovered,
a linear search of the recover_list takes a long time.
Use an idr in place of a list.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:17:52 -05:00
c04fecb4d9 dlm: use rsbtbl as resource directory
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  It has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:16:19 -05:00
fce667c574 xfs: regression fixes for 3.5-rc7
- Really fix a cursor leak in xfs_alloc_ag_vextent_near
  - Fix a performance regression related to doing allocation in workqueues
  - Prevent recursion in xfs_buf_iorequest which is causing stack overflows
  - Don't call xfs_bdstrat_cb in xfs_buf_iodone callbacks
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQAGW0AAoJENaLyazVq6ZOOyEP/0xLuQKFF71eB/VL8BFylEHY
 H22/aDEWTo9pWDxxwBirVioBeCU07ByEv7zeQM1nqEm9pXESTSzsBUYKX2tWSRzY
 YclXYA0rpdCK2cdXpuz+0kWBFr9Y1Q1BIYNll6C3ZqhADgubAMHa13rKVUQlQqpD
 EZhvGrh42ujRhckwmi1E3+g3Ll79fty47WzyzEOa18ij3LI3q5Dm7WZpZlhv/MVW
 Fj975Q+LdJVchoQZ7gTiddMkZ936TwjLxM4EtWZd46CMG2/YRcPHx2YaItI0k6Xa
 Q34pbUHidZjqzng28iO6Y6BaB/rPLX/f7KcoZib+rc85zt8sShoDFXS9eq+DdTeH
 f5AgkPzpFfr3QpK3Fv5ZjICj5SkC6KhI14qnxLZhVGIWgJLGD8lb0fYDwN2y5BHq
 HmA7G4hALBaZ+TOpXQ2+XfxFyrffkQcM/Ja57yfDj37aOKYxMubAco4JlADBzBkg
 m1gF2QebQrbZ49k9x85vpIvoNFXcAg6FeuesYXciMcORCpMXp4f0oLjGlfwf13Fo
 upZ2AGfcpIcMl3fZzliw64VuJpX4QswTzUZFqXH8+Vn9eB6cwFqA8PJpLxBSshCk
 Y6E+LE6oYbsN+T6R+6Xj0DsPl4OIE0eWlGPaZZu9eWku02e0vZMj57RxUWpNcOmH
 6lq8TJaS/N5Qri51NPVs
 =TP1A
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.5-rc7' of git://oss.sgi.com/xfs/xfs

Pull xfs regression fixes from Ben Myers:
 - Really fix a cursor leak in xfs_alloc_ag_vextent_near
 - Fix a performance regression related to doing allocation in
   workqueues
 - Prevent recursion in xfs_buf_iorequest which is causing stack
   overflows
 - Don't call xfs_bdstrat_cb in xfs_buf_iodone callbacks

* tag 'for-linus-v3.5-rc7' of git://oss.sgi.com/xfs/xfs:
  xfs: do not call xfs_bdstrat_cb in xfs_buf_iodone_callbacks
  xfs: prevent recursion in xfs_buf_iorequest
  xfs: don't defer metadata allocation to the workqueue
  xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near
2012-07-16 10:02:36 -07:00
05d290d66b fifo: Do not restart open() if it already found a partner
If a parent and child process open the two ends of a fifo, and the
child immediately exits, the parent may receive a SIGCHLD before its
open() returns.  In that case, we need to make sure that open() will
return successfully after the SIGCHLD handler returns, instead of
throwing EINTR or being restarted.  Otherwise, the restarted open()
would incorrectly wait for a second partner on the other end.

The following test demonstrates the EINTR that was wrongly thrown from
the parent’s open().  Change .sa_flags = 0 to .sa_flags = SA_RESTART
to see a deadlock instead, in which the restarted open() waits for a
second reader that will never come.  (On my systems, this happens
pretty reliably within about 5 to 500 iterations.  Others report that
it manages to loop ~forever sometimes; YMMV.)

  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <fcntl.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)

  void handler(int signum) {}

  int main()
  {
      struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
      CHECK(sigaction(SIGCHLD, &act, NULL));
      CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
      for (;;) {
          int fd;
          pid_t pid;
          putc('.', stderr);
          CHECK(pid = fork());
          if (pid == 0) {
              CHECK(fd = open("fifo", O_RDONLY));
              _exit(0);
          }
          CHECK(fd = open("fifo", O_WRONLY));
          CHECK(close(fd));
          CHECK(waitpid(pid, NULL, 0));
      }
  }

This is what I suspect was causing the Git test suite to fail in
t9010-svn-fe.sh:

	http://bugs.debian.org/678852

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-16 08:33:14 -07:00
0bdaea9017 VFS: Split inode_permission()
Split inode_permission() into inode- and superblock-dependent parts.

This is aimed at unionmounts where the superblock from the upper layer has to
be checked rather than the superblock from the lower layer as the upper layer
may be writable, thus allowing an unwritable file from the lower layer to be
copied up and modified.

Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com> (Further development)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:38:36 +04:00
9249e17fe0 VFS: Pass mount flags to sget()
Pass mount flags to sget() so that it can use them in initialising a new
superblock before the set function is called.  They could also be passed to the
compare function.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:38:34 +04:00
f015f1267b VFS: Comment mount following code
Add comments describing what the directions "up" and "down" mean and ref count
handling to the VFS mount following family of functions.

Signed-off-by: Valerie Aurora <vaurora@redhat.com> (Original author)
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:38:32 +04:00
be34d1a3bc VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors
copy_tree() can theoretically fail in a case other than ENOMEM, but always
returns NULL which is interpreted by callers as -ENOMEM.  Change it to return
an explicit error.

Also change clone_mnt() for consistency and because union mounts will add new
error cases.

Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
[AV: folded braino fix by Dan Carpenter]

Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Valerie Aurora <valerie.aurora@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:37:27 +04:00
55e4def0a6 VFS: Make chown() and lchown() call fchownat()
Make the chown() and lchown() syscalls jump to the fchownat() syscall with the
appropriate extra arguments.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:54 +04:00
c3c4f69424 do_dentry_open(): close the race with mark_files_ro() in failure exit
we want to take it out of mark_files_ro() reach *before* we start
checking if we ought to drop write access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:50 +04:00
85d7d618c1 mark_files_ro(): don't bother with mntget/mntput
mnt_drop_write_file() is safe under any lock

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:46 +04:00
c4107b3097 notify_change(): check that i_mutex is held
Cc: Djalal Harouni <tixxdz@opendz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:42 +04:00
b5fb63c183 fs: add nd_jump_link
Add a helper that abstracts out the jump to an already parsed struct path
from ->follow_link operation from procfs.  Not only does this clean up
the code by moving the two sides of this game into a single helper, but
it also prepares for making struct nameidata private to namei.c

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:40 +04:00
408ef013cc fs: move path_put on failure out of ->follow_link
Currently the non-nd_set_link based versions of ->follow_link are expected
to do a path_put(&nd->path) on failure.  This calling convention is unexpected,
undocumented and doesn't match what the nd_set_link-based instances do.

Move the path_put out of the only non-nd_set_link based ->follow_link
instance into the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:35 +04:00
ac481d6ca4 debugfs: get rid of useless arguments to debugfs_{mkdir,symlink}
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:30 +04:00
cfa57c11b0 debugfs: fold debugfs_create_by_name() into the only caller
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:25 +04:00
c3b1a35084 debugfs: make sure that debugfs_create_file() gets used only for regulars
It, debugfs_create_dir() and debugfs_create_link() use the common helper
now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:19 +04:00
ee3efa91e2 __d_unalias() should refuse to move mountpoints
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:15 +04:00
e77fb7cef8 sysfs: just use d_materialise_unique()
same as for nfs et.al.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:12 +04:00
469796d105 sysfs: switch to ->s_d_op and ->d_release()
a) ->d_iput() is wrong here - what we do to inode is completely usual, it's
dentry->d_fsdata that we want to drop.  Just use ->d_release().

b) switch to ->s_d_op - no need to play with d_set_d_op()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:06 +04:00
79714f72d3 get rid of kern_path_parent()
all callers want the same thing, actually - a kinda-sorta analog of
kern_path_create().  I.e. they want parent vfsmount/dentry (with
->i_mutex held, to make sure the child dentry is still their child)
+ the child dentry.

Signed-off-by Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:35:02 +04:00
1acf0af9b9 VFS: Fix the banner comment on lookup_open()
Since commit 197e37d9, the banner comment on lookup_open() no longer matches
what the function returns.  It used to return a struct file pointer or NULL and
now it returns an integer and is passed the struct file pointer it is to use
amongst its arguments.  Update the comment to reflect this.

Also add a banner comment to atomic_open().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:57 +04:00
312b63fba9 don't pass nameidata * to vfs_create()
all we want is a boolean flag, same as the method gets now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:50 +04:00
ebfc3b49a7 don't pass nameidata to ->create()
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:47 +04:00
72bd866a01 fs/namei.c: don't pass nameidata to __lookup_hash() and lookup_real()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:40 +04:00
00cd8dd3bf stop passing nameidata to ->lookup()
Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument.  And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:32 +04:00
201f956e43 fs/namei.c: don't pass namedata to lookup_dcache()
just the flags...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:25 +04:00
4ce16ef3fe fs/namei.c: don't pass nameidata to d_revalidate()
since the method wrapped by it doesn't need that anymore...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:21 +04:00
0b728e1911 stop passing nameidata * to ->d_revalidate()
Just the lookup flags.  Die, bastard, die...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:14 +04:00
fa3c56bbda fs/nfs/dir.c: switch to passing nd->flags instead of nd wherever possible
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:07 +04:00
facc3530fb nfs_lookup_verify_inode() - nd is *always* non-NULL here
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:02 +04:00
93420b40bb switch nfs_lookup_check_intent() away from nameidata
just pass the flags

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:57 +04:00
02e5180d99 do_dentry_open(): take initialization of file->f_path to caller
... and get rid of a couple of arguments and a pointless reassignment
in finish_open() case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:54 +04:00
2a027e7a18 fold __dentry_open() into its sole caller
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:52 +04:00
96b7e579ad switch do_dentry_open() to returning int
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:49 +04:00
e45198a6ac make finish_no_open() return int
namely, 1 ;-)  That's what we want to return from ->atomic_open()
instances after finish_no_open().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:45 +04:00
2675a4eb6a fs/namei.c: get do_last() and friends return int
Same conventions as for ->atomic_open().  Trimmed the
forest of labels a bit, while we are at it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:43 +04:00
30d9049474 kill struct opendata
Just pass struct file *.  Methods are happier that way...
There's no need to return struct file * from finish_open() now,
so let it return int.  Next: saner prototypes for parts in
namei.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:39 +04:00
a4a3bdd778 kill opendata->{mnt,dentry}
->filp->f_path is there for purpose...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:37 +04:00
d95852777b make ->atomic_open() return int
Change of calling conventions:
old		new
NULL		1
file		0
ERR_PTR(-ve)	-ve

Caller *knows* that struct file *; no need to return it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:35 +04:00
3d8a00d209 don't modify od->filp at all
make put_filp() conditional on flag set by finish_open()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:33 +04:00
47237687d7 ->atomic_open() prototype change - pass int * instead of bool *
... and let finish_open() report having opened the file via that sucker.
Next step: don't modify od->filp at all.

[AV: FILE_CREATE was already used by cifs; Miklos' fix folded]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:31 +04:00
a8277b9baa vfs: move O_DIRECT check to common code
Perform open_check_o_direct() in a common place in do_last after opening the
file.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:28 +04:00
f60dc3db6e vfs: do_last(): clean up retry
Move the lookup retry logic to the bottom of the function to make the normal
case simpler to read.

Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:26 +04:00
77d660a8a8 vfs: do_last(): clean up bool
Consistently use bool for boolean values in do_last().

Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:24 +04:00
e83db16722 vfs: do_last(): clean up labels
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:21 +04:00
aa4caadb70 vfs: do_last(): clean up error handling
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:20 +04:00
015c3bbcd8 vfs: remove open intents from nameidata
All users of open intents have been converted to use ->atomic_{open,create}.

This patch gets rid of nd->intent.open and related infrastructure.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:18 +04:00
e43ae79c54 9p: implement i_op->atomic_open()
Add an ->atomic_open implementation which replaces the atomic open+create
operation implemented via ->create.  No functionality is changed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:17 +04:00