30831 Commits

Author SHA1 Message Date
3509b678a6 9p: double iput() in ->lookup() if d_materialise_unique() fails
d_materialise_unique() does iput() itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-28 01:21:38 -05:00
2ea03e1d62 9p: v9fs_fid_add() can't fail now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-28 01:18:14 -05:00
aaeb7ecfb4 v9fs: get rid of v9fs_dentry
->d_fsdata can act as hlist_head...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-28 01:13:19 -05:00
2a7d2b96d5 Merge branch 'akpm' (final batch from Andrew)
Merge third patch-bumb from Andrew Morton:
 "This wraps me up for -rc1.
   - Lots of misc stuff and things which were deferred/missed from
     patchbombings 1 & 2.
   - ocfs2 things
   - lib/scatterlist
   - hfsplus
   - fatfs
   - documentation
   - signals
   - procfs
   - lockdep
   - coredump
   - seqfile core
   - kexec
   - Tejun's large IDR tree reworkings
   - ipmi
   - partitions
   - nbd
   - random() things
   - kfifo
   - tools/testing/selftests updates
   - Sasha's large and pointless hlist cleanup"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (163 commits)
  hlist: drop the node parameter from iterators
  kcmp: make it depend on CHECKPOINT_RESTORE
  selftests: add a simple doc
  tools/testing/selftests/Makefile: rearrange targets
  selftests/efivarfs: add create-read test
  selftests/efivarfs: add empty file creation test
  selftests: add tests for efivarfs
  kfifo: fix kfifo_alloc() and kfifo_init()
  kfifo: move kfifo.c from kernel/ to lib/
  arch Kconfig: centralise CONFIG_ARCH_NO_VIRT_TO_BUS
  w1: add support for DS2413 Dual Channel Addressable Switch
  memstick: move the dereference below the NULL test
  drivers/pps/clients/pps-gpio.c: use devm_kzalloc
  Documentation/DMA-API-HOWTO.txt: fix typo
  include/linux/eventfd.h: fix incorrect filename is a comment
  mtd: mtd_stresstest: use prandom_bytes()
  mtd: mtd_subpagetest: convert to use prandom library
  mtd: mtd_speedtest: use prandom_bytes
  mtd: mtd_pagetest: convert to use prandom library
  mtd: mtd_oobtest: convert to use prandom library
  ...
2013-02-27 20:58:09 -08:00
c4d30967f3 9p: turn fid->dlist into hlist
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-27 22:51:08 -05:00
634095dab2 9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-27 22:37:21 -05:00
b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
e8c8d1bc06 idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
They basically mask off the sign bit and operate on the rest, so if
the caller, by accident, passes in a negative number, the sign bit
will be masked off and the remaining part will be used as if that was
the input, which is worse than crashing.

The constant is visible in idr.h and there are several users in the
kernel.

* drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

  Basically used to test if adap->nr is a negative number which isn't
  -1 and returns -EINVAL if so.  idr_alloc() already has negative
  @start checking (w/ WARN_ON_ONCE), so this can go away.

* drivers/infiniband/core/cm.c:cm_alloc_id()
  drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

  Used to wrap cyclic @start.  Can be replaced with max(next, 0).
  Note that this type of cyclic allocation using idr is buggy.  These
  are prone to spurious -ENOSPC failure after the first wraparound.

* fs/super.c:get_anon_bdev()

  The ID allocated from ida is masked off before being tested whether
  it's inside valid range.  ida allocated ID can never be a negative
  number and the masking is unnecessary.

Update idr_*() functions to fail with -EINVAL when negative @id is
specified and update other MAX_IDR_MASK users as described above.

This leaves MAX_IDR_MASK without any user, remove it and relocate
other MAX_IDR_* constants to lib/idr.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Wolfram Sang <wolfram@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
d687031265 nfs4client: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
6b207ba3eb ocfs2: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:19 -08:00
4542da631a inotify: convert to idr_alloc()
Convert to the much saner new idr interface.

Note that the adhoc cyclic id allocation is buggy.  If wraparound
happens, the previous code with idr_get_new_above() may segfault and
the converted code will trigger WARN and return -EINVAL.  Even if it's
fixed to wrap to zero, the code will be prone to unnecessary -ENOSPC
failures after the first wraparound.  We probably need to implement
proper cyclic support in idr.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:19 -08:00
2a86b3e74f dlm: convert to idr_alloc()
Convert to the much saner new idr interface.  Error return values from
recover_idr_add() mix -1 and -errno.  The conversion doesn't change
that but it looks iffy.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:19 -08:00
644e1b90ef inotify: don't use idr_remove_all()
idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
d97bec801d nfs: idr_destroy() no longer needs idr_remove_all()
idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop reference to idr_remove_all().  Note that the code
wasn't completely correct before because idr_remove() on all entries
doesn't necessarily release all idr_layers which could lead to memory
leak.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
a67a380e6f dlm: don't use idr_remove_all()
idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.

The conversion isn't completely trivial for recover_idr_clear() as it's
the only place in kernel which makes legitimate use of idr_remove_all()
w/o idr_destroy().  Replace it with idr_remove() call inside
idr_for_each_entry() loop.  It goes on top so that it matches the
operation order in recover_idr_del().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:13 -08:00
cda95406c8 dlm: use idr_for_each_entry() in recover_idr_clear() error path
Convert recover_idr_clear() to use idr_for_each_entry() instead of
idr_for_each().  It's somewhat less efficient this way but it shouldn't
matter in an error path.  This is to help with deprecation of
idr_remove_all().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:13 -08:00
5e62adef9e fs/seq_file.c:seq_lseek(): fix switch statement indenting
[akpm@linux-foundation.org: checkpatch fixes]
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:12 -08:00
80de7f7ae0 seq-file: use SEEK_ macros instead of hardcoded numbers
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:12 -08:00
c2c1b089b4 fs/proc/vmcore.c: put if tests in the top of the while loop to reduce duplication
In read_vmcore() two `if' tests are duplicated.  Change the position of
them could reduce the duplication.  This change does not affect the
behaviour of the function.

[akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()]
[akpm@linux-foundation.org: s/max_t/min_t/]
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:11 -08:00
87ebdc00ee fs/proc: clean up printks
- use pr_foo() throughout

- remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...")

- nuke a few warnings which I've never seen happen, ever.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:11 -08:00
e579d2c259 coredump: remove redundant defines for dumpable states
The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_*
defines introduced in 54b501992dd2 ("coredump: warn about unsafe
suid_dumpable / core_pattern combo").  Remove the new ones, and use the
prior values instead.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:11 -08:00
b88a105802 fat: mark fs as dirty on mount and clean on umount
There is no documented methods to mark FAT as dirty.  Unofficially MS
started to use reserved Byte in boot sector for this purpose, at least
since Win 2000.  With Win 7 user is warned if fs is dirty and asked to
clean it.

Different versions of Win, handle it in different ways, but always have
same meaning:

- Win 2000 and XP, set it on write operations and
  remove it after operation was finnished
- Win 7, set dirty flag on first write and remove it on umount.

We will do it as follows:

- set dirty flag on mount. If fs was initially dirty, warn user,
  remember it and do not do any changes to boot sector.
- clean it on umount. If fs was initially dirty, leave it dirty.
- do not do any thing if fs mounted read-only.
- TODO: leave fs dirty if we found some error after mount.

Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:11 -08:00
6b46419b04 fat: add extended fileds to struct fat_boot_sector
Later we will need "state" field to check if volume was cleanly unmounted.

Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
899bed05e9 hfsplus: fix issue with unzeroed unused b-tree nodes
The fsck_hfs (under MacOS X) complains about unzeroed unused b-tree nodes
after deletion of folders' tree under Linux.

SYMPTOMS:

  Running Disk Utiltiy's "Verify Disk" on "test" gives the following:
  Verifying volume “Test”
  Checking file systemChecking Journaled HFS Plus volume.
  Checking extents overflow file.
  Checking catalog file.
  Unused node is not erased (node = 3111)
  Checking multi-linked files.
  Checking catalog hierarchy.
  Checking extended attributes file.
  Checking volume bitmap.
  Checking volume information.
  The volume Test was found corrupt and needs to be repaired.
  Error: This disk needs to be repaired. Click Repair Disk.

REPRODUCING PATH:

1. Prepare HFS+ (non-case sensitive) partition (for example, 5GB)
   under MacOS X.
2. Copy linux kernel source tree (for example, 3.7-rc6 version) on
   this partition under MacOS X.
3. Then switch to Linux and mount this prepared partition.
4. Execute `sudo rm -r` under prepared directory with linux kernel
   source tree.
5. Unmount and boot back into OS X.
6. Open up Disk Utility and verify partition.

REPRODUCIBILITY: 100%

FIX:

It is added code of node clearing in hfs_bnode_put() method for the case
when node has flag HFS_BNODE_DELETED.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Kyle Laracey <kalaracey@gmail.com>
Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
324ef39a8a hfsplus: add support of manipulation by attributes file
Add support of manipulation by attributes file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
127e5f5ae5 hfsplus: rework functionality of getting, setting and deleting of extended attributes
Rework functionality of getting, setting and deleting of extended attributes.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
3e05ca20fb hfsplus: add functionality of manipulating by records in attributes tree
Add functionality of manipulating by records in attributes tree.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
9ed083d8cc hfsplus: add on-disk layout declarations related to attributes tree
Add all necessary on-disk layout declarations related to attributes file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
309a85b686 ocfs2: ac->ac_allow_chain_relink=0 won't disable group relink
ocfs2_block_group_alloc_discontig() disables chain relink by setting
ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple
cluster groups.

It doesn't keep the credits for all chain relink,but
ocfs2_claim_suballoc_bits overrides this in this call trace:
ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()->
__ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits()
ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call
ocfs2_search_chain() one time and disable it again, and then we run out
of credits.

Fix is to allow relink by default and disable it in
ocfs2_block_group_alloc_discontig.

Without this patch, End-users will run into a crash due to run out of
credits, backtrace like this:

  RIP: 0010:[<ffffffffa0808b14>]  [<ffffffffa0808b14>]
  jbd2_journal_dirty_metadata+0x164/0x170 [jbd2]
  RSP: 0018:ffff8801b919b5b8  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0
  RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8
  RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0
  R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800
  FS:  00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0)
  Call Trace:
    ocfs2_journal_dirty+0x2f/0x70 [ocfs2]
    ocfs2_relink_block_group+0x111/0x480 [ocfs2]
    ocfs2_search_chain+0x455/0x9a0 [ocfs2]
    ...

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:09 -08:00
32918dd9f1 ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctly
We need to re-initialize the security for a new reflinked inode with its
parent dirs if it isn't specified to be preserved for ocfs2_reflink().
However, the code logic is broken at ocfs2_init_security_and_acl()
although ocfs2_init_security_get() succeed.  As a result,
ocfs2_acl_init() does not involked and therefore the default ACL of
parent dir was missing on the new inode.

Note this was introduced by 9d8f13ba3 ("security: new
security_inode_init_security API adds function callback")

To reproduce:

    set default ACL for the parent dir(ocfs2 in this case):
    $ setfacl -m default:user:jeff:rwx ../ocfs2/
    $ getfacl ../ocfs2/
    # file: ../ocfs2/
    # owner: jeff
    # group: jeff
    user::rwx
    group::r-x
    other::r-x
    default:user::rwx
    default:user:jeff:rwx
    default:group::r-x
    default😷:rwx
    default:other::r-x

    $ touch a
    $ getfacl a
    # file: a
    # owner: jeff
    # group: jeff
    user::rw-
    group::rw-
    other::r--

Before patching, create reflink file b from a, the user
default ACL entry(user:jeff:rwx)was missing:

    $ ./ocfs2_reflink a b
    $ getfacl b
    # file: b
    # owner: jeff
    # group: jeff
    user::rw-
    group::rw-
    other::r--

In this case, the end user can also observed an error message at syslog:

  (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0

After applying this patch, create reflink file c from a:

    $ ./ocfs2_reflink a c
    $ getfacl c
    # file: c
    # owner: jeff
    # group: jeff
    user::rw-
    user:jeff:rwx			#effective:rw-
    group::r-x			#effective:r--
    mask::rw-
    other::r--

Test program:
/* Usage: reflink <source> <dest> */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>

static int
reflink_file(char const *src_name, char const *dst_name,
	     bool preserve_attrs)
{
	int fd;

#ifndef REFLINK_ATTR_NONE
#  define REFLINK_ATTR_NONE 0
#endif
#ifndef REFLINK_ATTR_PRESERVE
#  define REFLINK_ATTR_PRESERVE 1
#endif
#ifndef OCFS2_IOC_REFLINK
	struct reflink_arguments {
		uint64_t old_path;
		uint64_t new_path;
		uint64_t preserve;
	};

#  define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments)
#endif
	struct reflink_arguments args = {
		.old_path = (unsigned long) src_name,
		.new_path = (unsigned long) dst_name,
		.preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE :
					     REFLINK_ATTR_NONE,
	};

	fd = open(src_name, O_RDONLY);
	if (fd < 0) {
		fprintf(stderr, "Failed to open %s: %s\n",
			src_name, strerror(errno));
		return -1;
	}

	if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) {
		fprintf(stderr, "Failed to reflink %s to %s: %s\n",
			src_name, dst_name, strerror(errno));
		return -1;
	}
}

int
main(int argc, char *argv[])
{
	if (argc != 3) {
		fprintf(stdout, "Usage: %s source dest\n", argv[0]);
		return 1;
	}

	return reflink_file(argv[1], argv[2], 0);
}

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Tao Ma <boyu.mt@taobao.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:09 -08:00
f6488c9ba5 nfs: don't allow nfs_find_actor to match inodes of the wrong type
Benny Halevy reported the following oops when testing RHEL6:

<7>nfs_update_inode: inode 892950 mode changed, 0040755 to 0100644
<1>BUG: unable to handle kernel NULL pointer dereference at (null)
<1>IP: [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4>PGD 81448a067 PUD 831632067 PMD 0
<4>Oops: 0000 [#1] SMP
<4>last sysfs file: /sys/kernel/mm/redhat_transparent_hugepage/enabled
<4>CPU 6
<4>Modules linked in: fuse bonding 8021q garp ebtable_nat ebtables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi softdog bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_round_robin dm_multipath objlayoutdriver2(U) nfs(U) lockd fscache auth_rpcgss nfs_acl sunrpc vhost_net macvtap macvlan tun kvm_intel kvm be2net igb dca ptp pps_core microcode serio_raw sg iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 6332, comm: dd Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant DL170e G6  /ProLiant DL170e G6
<4>RIP: 0010:[<ffffffffa02a52c5>]  [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4>RSP: 0018:ffff88081458bb98  EFLAGS: 00010292
<4>RAX: ffffffffa02a52b0 RBX: 0000000000000000 RCX: 0000000000000003
<4>RDX: ffffffffa02e45a0 RSI: ffff88081440b300 RDI: ffff88082d5f5760
<4>RBP: ffff88081458bba8 R08: 0000000000000000 R09: 0000000000000000
<4>R10: 0000000000000772 R11: 0000000000400004 R12: 0000000040000008
<4>R13: ffff88082d5f5760 R14: ffff88082d6e8800 R15: ffff88082f12d780
<4>FS:  00007f728f37e700(0000) GS:ffff8800456c0000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>CR2: 0000000000000000 CR3: 0000000831279000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process dd (pid: 6332, threadinfo ffff88081458a000, task ffff88082fa0e040)
<4>Stack:
<4> 0000000040000008 ffff88081440b300 ffff88081458bbf8 ffffffff81182745
<4><d> ffff88082d5f5760 ffff88082d6e8800 ffff88081458bbf8 ffffffffffffffea
<4><d> ffff88082f12d780 ffff88082d6e8800 ffffffffa02a50a0 ffff88082d5f5760
<4>Call Trace:
<4> [<ffffffff81182745>] __fput+0xf5/0x210
<4> [<ffffffffa02a50a0>] ? do_open+0x0/0x20 [nfs]
<4> [<ffffffff81182885>] fput+0x25/0x30
<4> [<ffffffff8117e23e>] __dentry_open+0x27e/0x360
<4> [<ffffffff811c397a>] ? inotify_d_instantiate+0x2a/0x60
<4> [<ffffffff8117e4b9>] lookup_instantiate_filp+0x69/0x90
<4> [<ffffffffa02a6679>] nfs_intent_set_file+0x59/0x90 [nfs]
<4> [<ffffffffa02a686b>] nfs_atomic_lookup+0x1bb/0x310 [nfs]
<4> [<ffffffff8118e0c2>] __lookup_hash+0x102/0x160
<4> [<ffffffff81225052>] ? selinux_inode_permission+0x72/0xb0
<4> [<ffffffff8118e76a>] lookup_hash+0x3a/0x50
<4> [<ffffffff81192a4b>] do_filp_open+0x2eb/0xdd0
<4> [<ffffffff8104757c>] ? __do_page_fault+0x1ec/0x480
<4> [<ffffffff8119f562>] ? alloc_fd+0x92/0x160
<4> [<ffffffff8117de79>] do_sys_open+0x69/0x140
<4> [<ffffffff811811f6>] ? sys_lseek+0x66/0x80
<4> [<ffffffff8117df90>] sys_open+0x20/0x30
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>Code: 65 48 8b 04 25 c8 cb 00 00 83 a8 44 e0 ff ff 01 5b 41 5c c9 c3 90 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 48 8b 9e a0 00 00 00 <48> 8b 3b e8 13 0c f7 ff 48 89 df e8 ab 3d ec e0 48 83 c4 08 31
<1>RIP  [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4> RSP <ffff88081458bb98>
<4>CR2: 0000000000000000

I think this is ultimately due to a bug on the server. The client had
previously found a directory dentry. It then later tried to do an atomic
open on a new (regular file) dentry. The attributes it got back had the
same filehandle as the previously found directory inode. It then tried
to put the filp because it failed the aops tests for O_DIRECT opens, and
oopsed here because the ctx was still NULL.

Obviously the root cause here is a server issue, but we can take steps
to mitigate this on the client. When nfs_fhget is called, we always know
what type of inode it is. In the event that there's a broken or
malicious server on the other end of the wire, the client can end up
crashing because the wrong ops are set on it.

Have nfs_find_actor check that the inode type is correct after checking
the fileid. The fileid check should rarely ever match, so it should only
rarely ever get to this check. In the case where we have a broken
server, we may see two different inodes with the same i_ino, but the
client should be able to cope with them without crashing.

This should fix the oops reported here:

    https://bugzilla.redhat.com/show_bug.cgi?id=913660

Reported-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-27 17:28:20 -08:00
0b7bc84000 cifs: set MAY_SIGN when sec=krb5
Setting this secFlg allows usage of dfs where some servers require
signing and others don't.

Signed-off-by: Martijn de Gouw <martijn.de.gouw@prodrive.nl>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2013-02-27 16:54:25 -06:00
07b92d0d57 POSIX extensions disabled on client due to illegal O_EXCL flag sent to Samba
Samba rejected libreoffice's attempt to open a file with illegal
O_EXCL (without O_CREAT).  Mask this flag off (as the local
linux file system case does) for this case, so that we
don't have disable Unix Extensions unnecessarily due to
the Samba error (Samba server is also being fixed).

See https://bugzilla.samba.org/show_bug.cgi?id=9519

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2013-02-27 16:54:18 -06:00
ce2ac52105 cifs: ensure that cifs_get_root() only traverses directories
Kjell Braden reported this oops:

[  833.211970] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  833.212816] IP: [<          (null)>]           (null)
[  833.213280] PGD 1b9b2067 PUD e9f7067 PMD 0
[  833.213874] Oops: 0010 [#1] SMP
[  833.214344] CPU 0
[  833.214458] Modules linked in: des_generic md4 nls_utf8 cifs vboxvideo drm snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq bnep rfcomm snd_timer bluetooth snd_seq_device ppdev snd vboxguest parport_pc joydev mac_hid soundcore snd_page_alloc psmouse i2c_piix4 serio_raw lp parport usbhid hid e1000
[  833.215629]
[  833.215629] Pid: 1752, comm: mount.cifs Not tainted 3.0.0-rc7-bisectcifs-fec11dd9a0+ #18 innotek GmbH VirtualBox/VirtualBox
[  833.215629] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[  833.215629] RSP: 0018:ffff8800119c9c50  EFLAGS: 00010282
[  833.215629] RAX: ffffffffa02186c0 RBX: ffff88000c427780 RCX: 0000000000000000
[  833.215629] RDX: 0000000000000000 RSI: ffff88000c427780 RDI: ffff88000c4362e8
[  833.215629] RBP: ffff8800119c9c88 R08: ffff88001fc15e30 R09: 00000000d69515c7
[  833.215629] R10: ffffffffa0201972 R11: ffff88000e8f6a28 R12: ffff88000c4362e8
[  833.215629] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88001181aaa6
[  833.215629] FS:  00007f2986171700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[  833.215629] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  833.215629] CR2: 0000000000000000 CR3: 000000001b982000 CR4: 00000000000006f0
[  833.215629] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  833.215629] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  833.215629] Process mount.cifs (pid: 1752, threadinfo ffff8800119c8000, task ffff88001c1c16f0)
[  833.215629] Stack:
[  833.215629]  ffffffff8116a9b5 ffff8800119c9c88 ffffffff81178075 0000000000000286
[  833.215629]  0000000000000000 ffff88000c4276c0 ffff8800119c9ce8 ffff8800119c9cc8
[  833.215629]  ffffffff8116b06e ffff88001bc6fc00 ffff88000c4276c0 ffff88000c4276c0
[  833.215629] Call Trace:
[  833.215629]  [<ffffffff8116a9b5>] ? d_alloc_and_lookup+0x45/0x90
[  833.215629]  [<ffffffff81178075>] ? d_lookup+0x35/0x60
[  833.215629]  [<ffffffff8116b06e>] __lookup_hash.part.14+0x9e/0xc0
[  833.215629]  [<ffffffff8116b1d6>] lookup_one_len+0x146/0x1e0
[  833.215629]  [<ffffffff815e4f7e>] ? _raw_spin_lock+0xe/0x20
[  833.215629]  [<ffffffffa01eef0d>] cifs_do_mount+0x26d/0x500 [cifs]
[  833.215629]  [<ffffffff81163bd3>] mount_fs+0x43/0x1b0
[  833.215629]  [<ffffffff8117d41a>] vfs_kern_mount+0x6a/0xd0
[  833.215629]  [<ffffffff8117e584>] do_kern_mount+0x54/0x110
[  833.215629]  [<ffffffff8117fdc2>] do_mount+0x262/0x840
[  833.215629]  [<ffffffff81108a0e>] ? __get_free_pages+0xe/0x50
[  833.215629]  [<ffffffff8117f9ca>] ? copy_mount_options+0x3a/0x180
[  833.215629]  [<ffffffff8118075d>] sys_mount+0x8d/0xe0
[  833.215629]  [<ffffffff815ece82>] system_call_fastpath+0x16/0x1b
[  833.215629] Code:  Bad RIP value.
[  833.215629] RIP  [<          (null)>]           (null)
[  833.215629]  RSP <ffff8800119c9c50>
[  833.215629] CR2: 0000000000000000
[  833.238525] ---[ end trace ec00758b8d44f529 ]---

When walking down the path on the server, it's possible to hit a
symlink. The path walking code assumes that the caller will handle that
situation properly, but cifs_get_root() isn't set up for it. This patch
prevents the oops by simply returning an error.

A better solution would be to try and chase the symlinks here, but that's
fairly complicated to handle.

Fixes:

    https://bugzilla.kernel.org/show_bug.cgi?id=53221

Reported-and-tested-by: Kjell Braden <afflux@pentabarf.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2013-02-27 16:35:23 -06:00
6131ffaa1f more file_inode() open-coded instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-27 16:59:05 -05:00
7bf2fbcdf5 This fixes a real brown paper bag bug which causes ext4 to choke on
file systems larger than 512GB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRLmSkAAoJENNvdpvBGATwhf8P/1OqyhnveyeGPGHF3gvufx6W
 QLwVdIAWLIuLmltmgXhah/IugYOYv2msIFkmtpH/exx9mmXSfRpScZ1QQfPoIuW8
 ovZDljbkDUGyYiZPvHBKnek+8EK8zwIq2TQYz1E87rvFqCg97EonwzjeUo6dcggv
 toQvHFBjCAKGUI9qGiTW9Xthqh+6MUYGw9Wh3CtUkOhNMqhyc2U0SxMuqoc32qVZ
 Sj8Yn4Sba5CRnMSqZbFEJh3B6JtDrsw7tTe/W0Oqu6X+z/ro9ky8Gg7OpxTyWzHI
 HyJKYPDwkS+Bj8n4oH3Q8EKs6mA5rzFrwwH0vNA/GErRdga2SEHxoaLhnP8LGhDi
 GrdZGOfOIAC6qwYaNA9G+uk3Zt73N+BurIgbUMExOLRz51ViTw8ep9j0mNvYoZLZ
 S977/SO4axtv34Gmsi3oi3zuy+zmHPRQnLpWo3bghPo0jIuSVl225zqIkkVf8jaE
 zpRUNvycymaCSEMc2daLXhJDKOO1Ll46wwsSZ7ppZqRBMGV4jp2c3I7sG5GNcB4Z
 TmxKoIVhXlCs1sD3cLXNpEwdpSA4NWDLFtfqNPyssyGrzafVIMH1cYUzAvP36Ymx
 mLFtSt5pnEnUF/C8Kb3Yowsh0kOdnkVmadgKdDa05fEmiwLd0oxGFIqed9zhhIcI
 vKEiSNphSdwxXPbGF7ZF
 =cuPi
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 regression fix from Theodore Ts'o:
 "This fixes a real brown paper bag bug which causes ext4 to choke on
  file systems larger than 512GB."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix extent status tree regression for file systems > 512GB
2013-02-27 12:22:30 -08:00
8e919d1304 ext4: fix extent status tree regression for file systems > 512GB
This fixes a regression introduced by commit f7fec032aa782.  The
problem was that the extents status flags caused us to mask out block
numbers smaller than 2**28 blocks.  Since we didn't test with file
systems smaller than 512GB, we didn't notice this during the
development cycle.

A typical failure looks like this:

EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block
152052301: comm ls: bad entry in directory: rec_len is smaller than minimal -
offset=0(0), inode=0, rec_len=0, name_len=0

... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the
inode has block number 688923213.  When viewed in hex, block number
152052301 (from the syslog) is 0x910224D, while block number 688923213
is 0x2910224D.  Note the missing "0x20000000" in the block number.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Verified-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Dave Jones <davej@redhat.com>
Verified-by: Dave Jones <davej@redhat.com>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-02-27 14:54:37 -05:00
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
1b83bef24c libceph: update osd request/reply encoding
Use the new version of the encoding for osd requests and replies.  In the
process, update the way we are tracking request ops and reply lengths and
results in the struct ceph_osd_request.  Update the rbd and fs/ceph users
appropriately.

The main changes are:
 - we keep pointers into the request memory for fields we need to update
   each time the request is sent out over the wire
 - we keep information about the result in an array in the request struct
   where the users can easily get at it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:02:50 -08:00
2169aea649 libceph: calculate placement based on the internal data types
Instead of using the old ceph_object_layout struct, update our internal
ceph_calc_object_layout method to use the ceph_pg type.  This allows us to
pass the full 32-bit precision of the pgid.seed to the callers.  It also
allows some callers to avoid reaching into the request structures for the
struct ceph_object_layout fields.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:02:37 -08:00
4f6a7e5ee1 ceph: update support for PGID64, PGPOOL3, OSDENC protocol features
Support (and require) the PGID64, PGPOOL3, and OSDENC protocol features.
These have been present in ceph.git since v0.42, Feb 2012.  Require these
features to simplify support; nobody is running older userspace.

Note that the new request and reply encoding is still not in place, so the new
code is not yet functional.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:02:25 -08:00
5b191d9914 libceph: decode into cpu-native ceph_pg type
Always decode data into our cpu-native ceph_pg type that has the correct
field widths.  Limit any remaining uses of ceph_pg_v1 to dealing with the
legacy protocol.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:01:57 -08:00
12979354a1 libceph: rename ceph_pg -> ceph_pg_v1
Rename the old version this type to distinguish it from the new version.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:01:41 -08:00
6515925b82 The one new feature added in this patch series is the ability to use
the "punch hole" functionality for inodes that are not using extent
 maps.
 
 In the bug fix category, we fixed some races in the AIO and fstrim
 code, and some potential NULL pointer dereferences and memory leaks in
 error handling code paths.
 
 In the optimization category, we fixed a performance regression in the
 jbd2 layer introduced by commit d9b0193 (introduced in v3.0) which
 shows up in the AIM7 benchmark.  We also further optimized jbd2 by
 minimize the amount of time that transaction handles are held active.
 
 This patch series also features some additional enhancement of the
 extent status tree, which is now used to cache extent information in a
 more efficient/compact form than what we use on-disk.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRLRs7AAoJENNvdpvBGATwNb8QAML+TjGtHlJ1coDUzGT2Cq9R
 yREAzI1N/+Phiohy3O0JNx55uPvYEMx6+zi+JCNSs1/gnf/OWruESTXssRbBv3Yd
 WxfOiCIaK8BbOEGZlMwGsFDCzVNKfvHxRrmyeHtcyUONKLFQUmBcE/woVPHcsvlE
 ya/zGnD2e58NaGwS643bqfvTrVt/azH0U0osNCNwfZepZmboEXK8fzT9b3Auh+1Q
 EI28m0GSRp0V0cgwOEN54EhTtocyS30GN8sbC1K5cFHK8tGLhyVwnvIonyFDI5/D
 GOkEPeRb7v2FwGpAilQ/V0jT++E//7zzyMFwvIY1U6b1dzBFCaJUuLMO1R8xoaoa
 c/Qd3AFIt1anS66qZAnW3m5rRyJgU2YA3VrKJj4q0jPKCh+k3+EqVfNTOB8BPLmC
 oCI/4ApUyHeYDdcErFjW4VDJ5N0debPP4yjma3uUtdM7RvQvMdQECnkAjIDCcGKe
 bMc7dtI9jdUYDCPGDeOjdrvk623QpE7J4Pf6iSQ5WxA4f2QmOQ8uIuGe8CPQSVtQ
 bUYjkthtWX2cX2/kHVvSYx6FzAjkgwmxCpAaiCXtGploxJIDjlWkiTXibkRYPLp4
 jBmQPK8ct8bl98k/i3mdybZnJU2TxWLA45hub0zBYs0aSgi8HzFyd+y8DiCKRS0S
 2sANbrsKG6TCzZ6C6ods
 =KSV1
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Theodore Ts'o:
 "The one new feature added in this patch series is the ability to use
  the "punch hole" functionality for inodes that are not using extent
  maps.

  In the bug fix category, we fixed some races in the AIO and fstrim
  code, and some potential NULL pointer dereferences and memory leaks in
  error handling code paths.

  In the optimization category, we fixed a performance regression in the
  jbd2 layer introduced by commit d9b01934d56a ("jbd: fix fsync() tid
  wraparound bug", introduced in v3.0) which shows up in the AIM7
  benchmark.  We also further optimized jbd2 by minimize the amount of
  time that transaction handles are held active.

  This patch series also features some additional enhancement of the
  extent status tree, which is now used to cache extent information in a
  more efficient/compact form than what we use on-disk."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (65 commits)
  ext4: fix free clusters calculation in bigalloc filesystem
  ext4: no need to remove extent if len is 0 in ext4_es_remove_extent()
  ext4: fix xattr block allocation/release with bigalloc
  ext4: reclaim extents from extent status tree
  ext4: adjust some functions for reclaiming extents from extent status tree
  ext4: remove single extent cache
  ext4: lookup block mapping in extent status tree
  ext4: track all extent status in extent status tree
  ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag
  ext4: rename and improbe ext4_es_find_extent()
  ext4: add physical block and status member into extent status tree
  ext4: refine extent status tree
  ext4: use ERR_PTR() abstraction for ext4_append()
  ext4: refactor code to read directory blocks into ext4_read_dirblock()
  ext4: add debugging context for warning in ext4_da_update_reserve_space()
  ext4: use KERN_WARNING for warning messages
  jbd2: use module parameters instead of debugfs for jbd_debug
  ext4: use module parameters instead of debugfs for mballoc_debug
  ext4: start handle at the last possible moment when creating inodes
  ext4: fix the number of credits needed for acl ops with inline data
  ...
2013-02-26 14:52:45 -08:00
bbbd27e694 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, ext3, udf updates from Jan Kara:
 "Several UDF fixes, a support for UDF extent cache, and couple of ext2
  and ext3 cleanups and minor fixes"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  Ext2: remove the static function release_blocks to optimize the kernel
  Ext2: mark inode dirty after the function dquot_free_block_nodirty is called
  Ext2: remove the overhead check about sb in the function ext2_new_blocks
  udf: Remove unused s_extLength from udf_bitmap
  udf: Make s_block_bitmap standard array
  udf: Fix bitmap overflow on large filesystems with small block size
  udf: add extent cache support in case of file reading
  udf: Write LVID to disk after opening / closing
  Ext3: return ENOMEM rather than EIO if sb_getblk fails
  Ext2: return ENOMEM rather than EIO if sb_getblk fails
  Ext3: use unlikely to improve the efficiency of the kernel
  Ext2: use unlikely to improve the efficiency of the kernel
  Ext3: add necessary check in case IO error happens
  Ext2: free memory allocated and forget buffer head when io error happens
  ext3: Fix memory leak when quota options are specified multiple times
  ext3, ext4, ocfs2: remove unused macro NAMEI_RA_INDEX
2013-02-26 14:51:52 -08:00
a6590b9f01 I's been quite silent and we have only a couple of bug-fixes for the orphans
handling code plus one cosmetic change.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRKy9eAAoJECmIfjd9wqK0zTAP/3aA/0tGCLDUNne+6Di57IhN
 LMn9y6N5hvRAcjPD/ZXnx6G1sOshimiMyUGwFg6fEXEQ4LKqV5YefPYOvnDFDEsz
 r+7zcuUX/9FJTR0yWe2uQfoIfASoJMjOYhL3/YQrutivxUjEvLlY6JIBcoXIfUV4
 n2NyygtY8iFzmG/2WDl1Ku1MSOhWBzpjWc66XJ6XLFjrJhFXnAPZF6wLSzfNzC/U
 V4vkmK6Je78nZ5qkTmtX4SoapZMtfkCPJ4unZJ161hmem+xZx7ZRO+0Vd02WnXUH
 gDK/9BALWp1HRC7pudvzy54DJz1LmncrMJ8+z6be2zHiZh4+ahuxXtu6xypy746P
 Y2g2A7uPsd2/rgfiEgHLbBBHILKt9hK8cQ0DZLIkJv1Gqqpwnb01oqXfyIdriwC2
 6EGdOTK7GTKR30ioxG1dpjotILqkl84DcefKssNjhba2Ez5QhoOdekDaWfGCNp7C
 bj7TZ7760HRCV01l45OqPwBu2Xj+tID2IzD0tvrbDc1SR+LhvWgRcdWHYqN3+Us0
 QKFT5HlX8OsrBHo7u4XCVOLlkHKkg5Iqu/mysw4eCHqfc5xi1IG7YifPNUZ80FFi
 IO1IjVHi5lc0ti2XxuQOwEdIxXhVcybAD1YC+7B5Vim1VvZD1o9b4kdEtrkL5U8h
 gM+AI8hiU8nIyORm0mWV
 =0FV4
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifs

Pull ubifs updates from Artem Bityutskiy:
 "It's been quite silent and we have only a couple of bug-fixes for the
  orphans handling code plus one cosmetic change."

* tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix double free of ubifs_orphan objects
  UBIFS: fix use of freed ubifs_orphan objects
  UBIFS: rename random32() to prandom_u32()
2013-02-26 14:51:14 -08:00
1085db4aa6 f2fs for v3.9
[Major bug fixes]
 o Store device file information correctly
 o Fix -EIO handling with respect to power-off-recovery
 o Allocate blocks with global locks
 o Fix wrong calculation of the SSR cost
 
 [Cleanups]
 o Get rid of fake on-stack dentries
 
 [Enhancement]
 o Support (un)freeze_fs
 o Enhance the f2fs_gc flow
 o Support 32-bit binary execution on 64-bit kernel
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRKr6KAAoJEEAUqH6CSFDSlbMP/AssVlP9k6sxP0vu3FRx13Iw
 gy0dbGvsuO37lWGV3vF4qt/knf91SvUgYQyihQbXfBpmrDvpDui7bpBo6bK/wyf0
 qQNFcqdct37YrrOuglg1vUHODdCi4gVxj6z311ZeDgC2L6IGuedXPKEHc5xy7zSc
 UgUR7+e9Qhm0llMe69GaP1ditQXIN2xAhdZ4SUY5fT+QsHbge8JxyjQRuZCdnq3Y
 QG4J3vAXsSnhk9H5AhXDZdS11iZdcpf0yLC2ATPdaMEv1ZpQg9mM97clUrF2tya0
 XDgS6YYselD94W6Bv4ZtcNX2hUXid1Qp+Zi2DR559W/W351PTQmB6S3NWrBZ1sCF
 Ig8pa+t5eyZ98nrixdK2l/2EnHGDg1HG+zUGNXRJ2Sg+ine10MB3ajYHmHC+NciS
 vmoDDTPO3h0N5gunBaYpSgLVJtr5DVIg6X0ZRHKuue3Chw2xGrN4ESK7kH0hg5bl
 4eFCJVZVwzlGUdnDHCNQam1MG4MVFO5GPtxBz2algKaQbbAgUvQG1tA+0zH47/iz
 ZZVqr7tI7evvyMJU9ep8ksEsSOznyzt8j7z9pjd5PIRifQRNC6Xuewx1uMlAztLj
 XUaG4kU7jQr/opoSi7kVrc67RY+RLRuhfwO8NOebV101O+TgnXVKUOQfQGNKXrud
 LzN7SOJgviYjO04Bb5u6
 =MdRi
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs update from Jaegeuk Kim:
 "[Major bug fixes]
   o Store device file information correctly
   o Fix -EIO handling with respect to power-off-recovery
   o Allocate blocks with global locks
   o Fix wrong calculation of the SSR cost

  [Cleanups]
   o Get rid of fake on-stack dentries

  [Enhancement]
   o Support (un)freeze_fs
   o Enhance the f2fs_gc flow
   o Support 32-bit binary execution on 64-bit kernel"

* tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
  f2fs: avoid build warning
  f2fs: add compat_ioctl to provide backward compatability
  f2fs: fix calculation of max. gc cost in the SSR case
  f2fs: clarify and enhance the f2fs_gc flow
  f2fs: optimize the return condition for has_not_enough_free_secs
  f2fs: make an accessor to get sections for particular block type
  f2fs: mark gc_thread as NULL when thread creation is failed
  f2fs: name gc task as per the block device
  f2fs: remove unnecessary gc option check and balance_fs
  f2fs: remove repeated F2FS_SET_SB_DIRT call
  f2fs: when check superblock failed, try to check another superblock
  f2fs: use F2FS_BLKSIZE to judge bloksize and page_cache_size
  f2fs: add device name in debugfs
  f2fs: stop repeated checking if cp is needed
  f2fs: avoid balanc_fs during evict_inode
  f2fs: remove the use of page_cache_release
  f2fs: fix typo mistake for data_version description
  f2fs: reorganize code for ra_node_page
  f2fs: avoid redundant call to has_not_enough_free_secs in f2fs_gc
  f2fs: add un/freeze_fs into super_operations
  ...
2013-02-26 14:50:15 -08:00
fda2832feb btrfs: cleanup for open-coded alignment
Though most of the btrfs codes are using ALIGN macro for page alignment,
there are still some codes using open-coded alignment like the
following:
------
        u64 mask = ((u64)root->stripesize - 1);
        u64 ret = (val + mask) & ~mask;
------
Or even hidden one:
------
        num_bytes = (end - start + blocksize) & ~(blocksize - 1);
------

Sometimes these open-coded alignment is not so easy to understand for
newbie like me.

This commit changes the open-coded alignment to the ALIGN macro for a
better readability.

Also there is a previous patch from David Sterba with similar changes,
but the patch is for 3.2 kernel and seems not merged.
http://www.spinics.net/lists/linux-btrfs/msg12747.html

Cc: David Sterba <dave@jikos.cz>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-26 11:04:13 -05:00
8c4ce81e91 Btrfs: do not change inode flags in rename
Before we forced to change a file's NOCOW and COMPRESS flag due to
the parent directory's, but this ends up a bad idea, because it
confuses end users a lot about file's NOCOW status, eg. if someone
change a file to NOCOW via 'chattr' and then rename it in the current
directory which is without NOCOW attribute, the file will lose the
NOCOW flag silently.

This diables 'change flags in rename', so from now on we'll only
inherit flags from the parent directory on creation stage while in
other places we can use 'chattr' to set NOCOW or COMPRESS flags.

Reported-by: Marios Titas <redneb8888@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-26 11:01:19 -05:00
2382c5cc7e Btrfs: use reserved space for creating a snapshot
While inserting dir index and updating inode for a snapshot, we'd
add delayed items which consume trans->block_rsv, if we don't have
any space reserved in this trans handle, we either just return or
reserve space again.

But before creating pending snapshots during committing transaction,
we've done a release on this trans handle, so we don't have space reserved
in it at this stage.

What we're using is block_rsv of pending snapshots which has already
reserved well enough space for both inserting dir index and updating
inode, so we need to set trans handle to indicate that we have space
now.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-26 11:00:52 -05:00