raw_smp_processor_id() is the means of avoiding the runtime preemptibility
check.
[akpm@linux-foundation.org: fix printk warning]
Cc: Joe Perches <joe@perches.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using a function for __mlog_printk instead of a macro reduces the object
size of built-in.o by about 190KB, or ~18% overall (x86-64 defconfig
with all ocfs2 options)
$ size fs/ocfs2/built-in.o*
text data bss dec hex filename
870954 118471 134408 1123833 1125f9 fs/ocfs2/built-in.o,new
1064081 118071 134408 1316560 1416d0 fs/ocfs2/built-in.o.old
Miscellanea:
- Move the used-once __mlog_cpu_guess statement expression macro to the
masklog.c file above the use in __mlog_printk function
- Simplify the mlog macro moving the and/or logic and level code into
__mlog_printk
[akpm@linux-foundation.org: export __mlog_printk() to other ocfs2 modules]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
config_item_init() is only used in item.c
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use kvfree() instead of open-coding it.
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) Add TX fast path in mac80211, from Johannes Berg.
2) Add TSO/GRO support to ibmveth, from Thomas Falcon
3) Move away from cached routes in ipv6, just like ipv4, from Martin
KaFai Lau.
4) Lots of new rhashtable tests, from Thomas Graf.
5) Run ingress qdisc lockless, from Alexei Starovoitov.
6) Allow servers to fetch TCP packet headers for SYN packets of new
connections, for fingerprinting. From Eric Dumazet.
7) Add mode parameter to pktgen, for testing receive. From Alexei
Starovoitov.
8) Cache access optimizations via simplifications of build_skb(), from
Alexander Duyck.
9) Move page frag allocator under mm/, also from Alexander.
10) Add xmit_more support to hv_netvsc, from KY Srinivasan.
11) Add a counter guard in case we try to perform endless reclassify
loops in the packet scheduler.
12) Extern flow dissector to be programmable and use it in new "Flower"
classifier. From Jiri Pirko.
13) AF_PACKET fanout rollover fixes, performance improvements, and new
statistics. From Willem de Bruijn.
14) Add netdev driver for GENEVE tunnels, from John W Linville.
15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.
16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.
17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
Borkmann.
18) Add tail call support to BPF, from Alexei Starovoitov.
19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.
20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.
21) Favor even port numbers for allocation to connect() requests, and
odd port numbers for bind(0), in an effort to help avoid
ip_local_port_range exhaustion. From Eric Dumazet.
22) Add Cavium ThunderX driver, from Sunil Goutham.
23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
from Alexei Starovoitov.
24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.
25) Double TCP Small Queues default to 256K to accomodate situations
like the XEN driver and wireless aggregation. From Wei Liu.
26) Add more entropy inputs to flow dissector, from Tom Herbert.
27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
Jonassen.
28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.
29) Track and act upon link status of ipv4 route nexthops, from Andy
Gospodarek.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
bridge: vlan: flush the dynamically learned entries on port vlan delete
bridge: multicast: add a comment to br_port_state_selection about blocking state
net: inet_diag: export IPV6_V6ONLY sockopt
stmmac: troubleshoot unexpected bits in des0 & des1
net: ipv4 sysctl option to ignore routes when nexthop link is down
net: track link-status of ipv4 nexthops
net: switchdev: ignore unsupported bridge flags
net: Cavium: Fix MAC address setting in shutdown state
drivers: net: xgene: fix for ACPI support without ACPI
ip: report the original address of ICMP messages
net/mlx5e: Prefetch skb data on RX
net/mlx5e: Pop cq outside mlx5e_get_cqe
net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
net/mlx5e: Remove extra spaces
net/mlx5e: Avoid TX CQE generation if more xmit packets expected
net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
...
There is a cut and paste error so instead of freeing "head_ref", we free
"ref" twice.
Fixes: 3368d001ba5d ('btrfs: qgroup: Record possible quota-related extent for qgroup.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
As a simple scheme, report every minute if IO is still going on.
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
There is no need to report concurrently.
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
It fills in the generic part of LAYOUTSTATS call. One thing to note
is that we don't really track if IO is continuous or not. So just fake
to use the completed bytes for it.
Still missing flexfiles specific part, which will be included in the next patch.
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
So that we can report cumulative time since the beginning
of statistics collection of the layout.
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
JFFS2
* fix a theoretical unbalanced locking issue; the lock handling was a bit
unclean, but AFAICT, it didn't actually lead to real deadlocks
NAND
* brcmnand driver: new driver supporting NAND controller found originally on
Broadcom STB SoCs (BCM7xxx), but now also found on BCM63xxx, iProc (e.g.,
Cygnus, BCM5301x), BCM3xxx, and more
* Begin factoring out BBT code so it can be shared between traditional
(parallel) NAND drivers and upcoming SPI NAND drivers (WIP)
* Add common DT-based init support, so nand_base can pick up some flash
properties automatically, using established common NAND DT properties
* mxc_nand: support 8-bit ECC
* pxa3xx_nand:
- fix build for ARM64
- use a jiffies-based timeout
SPI NOR
* Add a few new IDs
* Clear out some unnecessary entries
* Make sure SECT_4K flags are correct for all (?) entries
Core
* Fix mtd->usecount race conditions (BUG_ON())
* Switch to modern PM ops
Other
* CFI: save code space by de-inlining large functions
* Clean up some partition parser selection code across several drivers
* Various miscellaneous changes, mostly minor
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJViZdKAAoJEFySrpd9RFgtaqoQAINkYUdxa8mOlmXQSIhWz19K
5FJqJN+/lgrYmIGI1SzVy/TTB/V4hWH+9h1snU98ToaHFACqbKKeo6FLs6GRd+WJ
adR9H4PUjtAZOt8trKzzygJnqvRPDWnn6H+0urxFv7kpyPTEorifiiI6+o3AM995
vh+YsLJNFYHucIzi4TVkgwssLYi8I9TOb35pDnW2uT/ikDi+4Uu+Ocq6u7SuMPXV
Zch6DcmBhtTQ9ghBwF/dMMAhyyMLyY4q0+CEzmJGxNU+g2o1njOTUDBv+lLqxxMB
fUXMAPXIT3ndhWjMmzklG7AjOorbg44aG2UlqJWXx00VYJKNf+pljpgdHOXKiuWo
xYkqOYnEM8gZ4qYNInKbbv34Q2+EjCxg+aAWxh+yRCJrfnF3p5/QSMVL2P2JNrc3
Kg8W7pW42xOeGxTYpydBtvpq+41qRtdWMgNp47PHvKF0mhpeSCCvRf/YfU4cNcfv
WfQzBLy/d7RMVFyvACdvzerF/fh9YX3yxV0B0LstytCSZO13617F6HHpm1pYfP0v
d9GpLpdiFktoG/AXpPzlSl992ALf0SQaedamuxApfvLVymDkG9xoO0P5p6SbOiJ0
QnBvEMkswEf7ExPzr5SYufSE93wikAEsAfjsLOHo+FQQ57eaRgpCOZHHgfuwcTra
xUr6u2Rq9iDenhVYDqJU
=JN5N
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20150623' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"JFFS2:
- fix a theoretical unbalanced locking issue; the lock handling was a
bit unclean, but AFAICT, it didn't actually lead to real deadlocks
NAND:
- brcmnand driver: new driver supporting NAND controller found
originally on Broadcom STB SoCs (BCM7xxx), but now also found on
BCM63xxx, iProc (e.g., Cygnus, BCM5301x), BCM3xxx, and more
- begin factoring out BBT code so it can be shared between
traditional (parallel) NAND drivers and upcoming SPI NAND drivers
(WIP)
- add common DT-based init support, so nand_base can pick up some
flash properties automatically, using established common NAND DT
properties
- mxc_nand: support 8-bit ECC
- pxa3xx_nand:
* fix build for ARM64
* use a jiffies-based timeout
SPI NOR:
- add a few new IDs
- clear out some unnecessary entries
- make sure SECT_4K flags are correct for all (?) entries
Core:
- fix mtd->usecount race conditions (BUG_ON())
- switch to modern PM ops
Other:
- CFI: save code space by de-inlining large functions
- clean up some partition parser selection code across several
drivers
- various miscellaneous changes, mostly minor"
* tag 'for-linus-20150623' of git://git.infradead.org/linux-mtd: (57 commits)
mtd: docg3: Fix kasprintf() usage
mtd: docg3: Don't leak docg3->bbt in error path
mtd: nandsim: Fix kasprintf() usage
mtd: cs553x_nand: Fix kasprintf() usage
mtd: r852: Fix device_create_file() usage
mtd: brcmnand: drop unnecessary initialization
mtd: propagate error codes from add_mtd_device()
mtd: diskonchip: remove two-phase partitioning / registration
mtd: dc21285: use raw spinlock functions for nw_gpio_lock
mtd: chips: fixup dependencies, to prevent build error
mtd: cfi_cmdset_0002: Initialize datum before calling map_word_load_partial
mtd: cfi: deinline large functions
mtd: lantiq-flash: use default partition parsers
mtd: plat_nand: use default partition probe
mtd: nand: correct indentation within conditional
mtd: remove incorrect file name
mtd: blktrans: use better error code for unimplemented ioctl()
mtd: maps: Spelling s/reseved/reserved/
mtd: blktrans: change blktrans_getgeo return value
mtd: mxc_nand: generate nand_ecclayout for 8 bit ECC
...
dir_pages was declared in a lot of filesystems.
Use newly dir_pages() from pagemap.h
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
That function was declared in a lot of filesystems to calculate
directory pages.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
list_entry is just a wrapper for container_of, but it is arguably
wrong (and slightly confusing) to use it when the pointed-to struct
member is not a struct list_head. Use container_of directly instead.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Currently XFS calls file_remove_privs() without holding i_mutex. This is
wrong because that function can end up messing with file permissions and
file capabilities stored in xattrs for which we need i_mutex held.
Fix the problem by grabbing iolock exclusively when we will need to
change anything in permissions / xattrs.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Comment in include/linux/security.h says that ->inode_killpriv() should
be called when setuid bit is being removed and that similar security
labels (in fact this applies only to file capabilities) should be
removed at this time as well. However we don't call ->inode_killpriv()
when we remove suid bit on truncate.
We fix the problem by calling ->inode_need_killpriv() and subsequently
->inode_killpriv() on truncate the same way as we do it on file write.
After this patch there's only one user of should_remove_suid() - ocfs2 -
and indeed it's buggy because it doesn't call ->inode_killpriv() on
write. However fixing it is difficult because of special locking
constraints.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Provide function telling whether file_remove_privs() will do anything.
Currently we only have should_remove_suid() and that does something
slightly different.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
file_remove_suid() is a misnomer since it removes also file capabilities
stored in xattrs and sets S_NOSEC flag. Also should_remove_suid() tells
something else than whether file_remove_suid() call is necessary which
leads to bugs.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.
Fix the bug by checking actual mode bits before setting S_NOSEC.
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If posix_acl_create() returns an error code then "*acl" and "*default_acl"
can be uninitialized or point to freed memory. This is a dangerous thing
to do. For example, it causes a problem in ocfs2_reflink():
fs/ocfs2/refcounttree.c:4327 ocfs2_reflink()
error: potentially using uninitialized 'default_acl'.
I've re-written this so we set the pointers to NULL at the start. I've
added a temporary "clone" variable to hold the value of "*acl" until end.
Setting them to NULL means means we don't need the "no_acl" label. We may
as well remove the "apply_umask" stuff forward and remove that label as
well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Newer versions of mount parse the lazytime feature and pass it to the
mount system call via the flags field in the mount system call,
removing the lazytime string from the mount options list. So we need
to check for the presence of MS_LAZYTIME and set it in sb->s_flags in
order for this flag to be set on a remount.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Pull timer updates from Thomas Gleixner:
"A rather largish update for everything time and timer related:
- Cache footprint optimizations for both hrtimers and timer wheel
- Lower the NOHZ impact on systems which have NOHZ or timer migration
disabled at runtime.
- Optimize run time overhead of hrtimer interrupt by making the clock
offset updates smarter
- hrtimer cleanups and removal of restrictions to tackle some
problems in sched/perf
- Some more leap second tweaks
- Another round of changes addressing the 2038 problem
- First step to change the internals of clock event devices by
introducing the necessary infrastructure
- Allow constant folding for usecs/msecs_to_jiffies()
- The usual pile of clockevent/clocksource driver updates
The hrtimer changes contain updates to sched, perf and x86 as they
depend on them plus changes all over the tree to cleanup API changes
and redundant code, which got copied all over the place. The y2038
changes touch s390 to remove the last non 2038 safe code related to
boot/persistant clock"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
clocksource: Increase dependencies of timer-stm32 to limit build wreckage
timer: Minimize nohz off overhead
timer: Reduce timer migration overhead if disabled
timer: Stats: Simplify the flags handling
timer: Replace timer base by a cpu index
timer: Use hlist for the timer wheel hash buckets
timer: Remove FIFO "guarantee"
timers: Sanitize catchup_timer_jiffies() usage
hrtimer: Allow hrtimer::function() to free the timer
seqcount: Introduce raw_write_seqcount_barrier()
seqcount: Rename write_seqcount_barrier()
hrtimer: Fix hrtimer_is_queued() hole
hrtimer: Remove HRTIMER_STATE_MIGRATE
selftest: Timers: Avoid signal deadlock in leap-a-day
timekeeping: Copy the shadow-timekeeper over the real timekeeper last
clockevents: Check state instead of mode in suspend/resume path
selftests: timers: Add leap-second timer edge testing to leap-a-day.c
ntp: Do leapsecond adjustment in adjtimex read path
time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
ntp: Introduce and use SECS_PER_DAY macro instead of 86400
...
The xfs_attr3_root_inactive() call from xfs_attr_inactive() assumes that
attribute blocks exist to invalidate. It is possible to have an
attribute fork without extents, however. Consider the case where the
attribute fork is created towards the beginning of xfs_attr_set() but
some part of the subsequent attribute set fails.
If an inode in such a state hits xfs_attr_inactive(), it eventually
calls xfs_dabuf_map() and possibly xfs_bmapi_read(). The former emits a
filesystem corruption warning, returns an error that bubbles back up to
xfs_attr_inactive(), and leads to destruction of the in-core attribute
fork without an on-disk reset. If the inode happens to make it back
through xfs_inactive() in this state (e.g., via a concurrent bulkstat
that cycles the inode from the reclaim state and releases it), i_afp
might not exist when xfs_bmapi_read() is called and causes a NULL
dereference panic.
A '-p 2' fsstress run to ENOSPC on a relatively small fs (1GB)
reproduces these problems. The behavior is a regression caused by:
6dfe5a0 xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
... which removed logic that avoided the attribute extent truncate when
no extents exist. Restore this logic to ensure the attribute fork is
destroyed and reset correctly if it exists without any allocated
extents.
cc: stable@vger.kernel.org # 3.12 to 4.0.x
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Pull locking updates from Ingo Molnar:
"The main changes are:
- 'qspinlock' support, enabled on x86: queued spinlocks - these are
now the spinlock variant used by x86 as they outperform ticket
spinlocks in every category. (Waiman Long)
- 'pvqspinlock' support on x86: paravirtualized variant of queued
spinlocks. (Waiman Long, Peter Zijlstra)
- 'qrwlock' support, enabled on x86: queued rwlocks. Similar to
queued spinlocks, they are now the variant used by x86:
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
- various lockdep fixlets
- various locking primitives cleanups, further WRITE_ONCE()
propagation"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
locking/lockdep: Remove hard coded array size dependency
locking/qrwlock: Don't contend with readers when setting _QW_WAITING
lockdep: Do not break user-visible string
locking/arch: Rename set_mb() to smp_store_mb()
locking/arch: Add WRITE_ONCE() to set_mb()
rtmutex: Warn if trylock is called from hard/softirq context
arch: Remove __ARCH_HAVE_CMPXCHG
locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
locking/pvqspinlock, x86: Enable PV qspinlock for Xen
locking/pvqspinlock, x86: Enable PV qspinlock for KVM
locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
locking/pvqspinlock: Implement simple paravirt support for the qspinlock
locking/qspinlock: Revert to test-and-set on hypervisors
locking/qspinlock: Use a simple write to grab the lock
locking/qspinlock: Optimize for smaller NR_CPUS
locking/qspinlock: Extract out code snippets for the next patch
locking/qspinlock: Add pending bit
...
Pull vfs updates from Al Viro:
"In this pile: pathname resolution rewrite.
- recursion in link_path_walk() is gone.
- nesting limits on symlinks are gone (the only limit remaining is
that the total amount of symlinks is no more than 40, no matter how
nested).
- "fast" (inline) symlinks are handled without leaving rcuwalk mode.
- stack footprint (independent of the nesting) is below kilobyte now,
about on par with what it used to be with one level of nested
symlinks and ~2.8 times lower than it used to be in the worst case.
- struct nameidata is entirely private to fs/namei.c now (not even
opaque pointers are being passed around).
- ->follow_link() and ->put_link() calling conventions had been
changed; all in-tree filesystems converted, out-of-tree should be
able to follow reasonably easily.
For out-of-tree conversions, see Documentation/filesystems/porting
for details (and in-tree filesystems for examples of conversion).
That has sat in -next since mid-May, seems to survive all testing
without regressions and merges clean with v4.1"
* 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (131 commits)
turn user_{path_at,path,lpath,path_dir}() into static inlines
namei: move saved_nd pointer into struct nameidata
inline user_path_create()
inline user_path_parent()
namei: trim do_last() arguments
namei: stash dfd and name into nameidata
namei: fold path_cleanup() into terminate_walk()
namei: saner calling conventions for filename_parentat()
namei: saner calling conventions for filename_create()
namei: shift nameidata down into filename_parentat()
namei: make filename_lookup() reject ERR_PTR() passed as name
namei: shift nameidata inside filename_lookup()
namei: move putname() call into filename_lookup()
namei: pass the struct path to store the result down into path_lookupat()
namei: uninline set_root{,_rcu}()
namei: be careful with mountpoint crossings in follow_dotdot_rcu()
Documentation: remove outdated information from automount-support.txt
get rid of assorted nameidata-related debris
lustre: kill unused helper
lustre: kill unused macro (LOOKUP_CONTINUE)
...
Remove the hack where we fput the read-specific file in generic code.
Instead we can do it in nfsd4_encode_read as that gets called for all
error cases as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch changes nfs4_preprocess_stateid_op so it always returns
a valid struct file if it has been asked for that. For that we
now allocate a temporary struct file for special stateids, and check
permissions if we got the file structure from the stateid. This
ensures that all callers will get their handling of special stateids
right, and avoids code duplication.
There is a little wart in here because the read code needs to know
if we allocated a file structure so that it can copy around the
read-ahead parameters. In the long run we should probably aim to
cache full file structures used with special stateids instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* bugfixes:
NFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writes
pNFS: Fix a memory leak when attempted pnfs fails
NFS: Ensure that we update the sequence id under the slot table lock
nfs: Initialize cb_sequenceres information before validate_seqid()
nfs: Only update callback sequnce id when CB_SEQUENCE success
NFSv4: nfs4_handle_delegation_recall_error should ignore EAGAIN
If jffs2 can deadlock on overlayfs readdir because it takes the same lock
on ->iterate() as in ->lookup().
Fix by moving whiteout checking outside iterate_dir(). Optimized by
collecting potential whiteouts (DT_CHR) in a temporary list and if
non-empty iterating throug these and checking for a 0/0 chardev.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: 49c21e1cacd7 ("ovl: check whiteout while reading directory")
Reported-by: Roman Yeryomin <leroi.lists@gmail.com>
Allow filesystems with .d_revalidate as lower layer(s), but not as upper
layer.
For local filesystems the rule was that modifications on the layers
directly while being part of the overlay results in undefined behavior.
This can easily be extended to distributed filesystems: we assume the tree
used as lower layer is static, which means ->d_revalidate() should always
return "1". If that is not the case, return -ESTALE, don't try to work
around the modification.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
NFS and other distributed filesystems may place automount points in the
tree. Previoulsy overlayfs refused to mount such filesystems types (based
on the existence of the .d_automount callback), even if the actual export
didn't have any automount points.
It cannot be determined in advance whether the filesystem has automount
points or not. The solution is to allow fs with .d_automount but refuse to
traverse any automount points encountered.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
At LSF we decided that if we truncate up from isize we shouldn't trim
fallocated blocks that were fallocated with KEEP_SIZE and are past the
new i_size. This patch fixes ext4 to do this.
[ Completely reworked patch so that i_disksize would actually get set
when truncating up. Also reworked the code for handling truncate so
that it's easier to handle. -- tytso ]
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Make the error reporting behavior resulting from the unsupported use
of online defrag on files with data journaling enabled consistent with
that implemented for bigalloc file systems. Difference found with
ext4/308.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>