android_kernel_asus_sm8350

Author	SHA1	Message	Date
Sachin Prabhu	348c1bfa84	Move check for prefix path to within cifs_get_root() Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>	2016-09-09 23:58:07 -05:00
Sachin Prabhu	c1d8b24d18	Compare prepaths when comparing superblocks The patch fs/cifs: make share unaccessible at root level mountable makes use of prepaths when any component of the underlying path is inaccessible. When mounting 2 separate shares having different prepaths but are other wise similar in other respects, we end up sharing superblocks when we shouldn't be doing so. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>	2016-09-09 23:58:06 -05:00
Sachin Prabhu	4214ebf465	Fix memory leaks in cifs_do_mount() Fix memory leaks introduced by the patch fs/cifs: make share unaccessible at root level mountable Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb(). Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>	2016-09-09 23:58:06 -05:00
Eric Biggers	002ced4be6	fscrypto: only allow setting encryption policy on directories The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption policy on nondirectory files. This was unintentional, and in the case of nonempty regular files did not behave as expected because existing data was not actually encrypted by the ioctl. In the case of ext4, the user could also trigger filesystem errors in ->empty_dir(), e.g. due to mismatched "directory" checksums when the kernel incorrectly tried to interpret a regular file as a directory. This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with kernels v4.6 and later. It appears that older kernels only permitted directories and that the check was accidentally lost during the refactoring to share the file encryption code between ext4 and f2fs. This patch restores the !S_ISDIR() check that was present in older kernels. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-09-09 23:38:12 -04:00
Eric Biggers	163ae1c6ad	fscrypto: add authorization check for setting encryption policy On an ext4 or f2fs filesystem with file encryption supported, a user could set an encryption policy on any empty directory() to which they had readonly access. This is obviously problematic, since such a directory might be owned by another user and the new encryption policy would prevent that other user from creating files in their own directory (for example). Fix this by requiring inode_owner_or_capable() permission to set an encryption policy. This means that either the caller must own the file, or the caller must have the capability CAP_FOWNER. () Or also on any regular file, for f2fs v4.6 and later and ext4 v4.8-rc1 and later; a separate bug fix is coming for that. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-09-09 23:37:14 -04:00
Dan Williams	ca120cf688	mm: fix show_smap() for zone_device-pmd ranges Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings currently results in the following VM_BUG_ONs: kernel BUG at mm/huge_memory.c:1105! task: ffff88045f16b140 task.stack: ffff88045be14000 RIP: 0010:[<ffffffff81268f9b>] [<ffffffff81268f9b>] follow_trans_huge_pmd+0x2cb/0x340 [..] Call Trace: [<ffffffff81306030>] smaps_pte_range+0xa0/0x4b0 [<ffffffff814c2755>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307656>] show_smap+0xa6/0x2b0 kernel BUG at fs/proc/task_mmu.c:585! RIP: 0010:[<ffffffff81306469>] [<ffffffff81306469>] smaps_pte_range+0x499/0x4b0 Call Trace: [<ffffffff814c2795>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307696>] show_smap+0xa6/0x2b0 These locations are sanity checking page flags that must be set for an anonymous transparent huge page, but are not set for the zone_device pages associated with dax mappings. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-09-09 17:34:45 -07:00
Linus Torvalds	6dc728ccd3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fix from Miklos Szeredi: "This fixes a deadlock when fuse, direct I/O and loop device are combined" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: direct-io: don't dirty ITER_BVEC pages	2016-09-09 13:00:41 -07:00
Linus Torvalds	5c44ad6a35	Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fix from Miklos Szeredi: "This fixes a regression caused by the last pull request" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix workdir creation	2016-09-09 12:56:28 -07:00
Linus Torvalds	f4a9c169c2	Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm not proud of how long it took me to track down that one liner in btrfs_sync_log(), but the good news is the patches I was trying to blame for these problems were actually fine (sorry Filipe)" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns btrfs: do not decrease bytes_may_use when replaying extents	2016-09-09 12:52:31 -07:00
Matt Fleming	22c2b77f41	fs/efivarfs: Fix double kfree() in error path Julia reported that we may double free 'name' in efivarfs_callback(), and that this bug was introduced by commit 0d22f33bc37c ("efi: Don't use spinlocks for efi vars"). Move one of the kfree()s until after the point at which we know we are definitely on the success path. Reported-by: Julia Lawall <julia.lawall@lip6.fr> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Sylvain Chouleur <sylvain.chouleur@gmail.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>	2016-09-09 16:08:48 +01:00
Sylvain Chouleur	21b3ddd39f	efi: Don't use spinlocks for efi vars All efivars operations are protected by a spinlock which prevents interruptions and preemption. This is too restricted, we just need a lock preventing concurrency. The idea is to use a semaphore of count 1 and to have two ways of locking, depending on the context: - In interrupt context, we call down_trylock(), if it fails we return an error - In normal context, we call down_interruptible() We don't use a mutex here because the mutex_trylock() function must not be called from interrupt context, whereas the down_trylock() can. Signed-off-by: Sylvain Chouleur <sylvain.chouleur@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Sylvain Chouleur <sylvain.chouleur@gmail.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>	2016-09-09 16:08:42 +01:00
Geliang Tang	f88baf68eb	ramoops: move spin_lock_init after kmalloc error checking If cxt->pstore.buf allocated failed, no need to initialize cxt->pstore.buf_lock. So this patch moves spin_lock_init() after the error checking. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2016-09-08 15:01:13 -07:00
Andrew Bresticker	d771fdf941	pstore/ram: Use memcpy_fromio() to save old buffer The ramoops buffer may be mapped as either I/O memory or uncached memory. On ARM64, this results in a device-type (strongly-ordered) mapping. Since unnaligned accesses to device-type memory will generate an alignment fault (regardless of whether or not strict alignment checking is enabled), it is not safe to use memcpy(). memcpy_fromio() is guaranteed to only use aligned accesses, so use that instead. Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com> Reviewed-by: Puneet Kumar <puneetster@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org	2016-09-08 15:01:12 -07:00
Furquan Shaikh	7e75678d23	pstore/ram: Use memcpy_toio instead of memcpy persistent_ram_update uses vmap / iomap based on whether the buffer is in memory region or reserved region. However, both map it as non-cacheable memory. For armv8 specifically, non-cacheable mapping requests use a memory type that has to be accessed aligned to the request size. memcpy() doesn't guarantee that. Signed-off-by: Furquan Shaikh <furquan@google.com> Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com> Reviewed-by: Aaron Durbin <adurbin@chromium.org> Reviewed-by: Olof Johansson <olofj@chromium.org> Tested-by: Furquan Shaikh <furquan@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org	2016-09-08 15:01:11 -07:00
Mark Salyzyn	5bf6d1b927	pstore/pmsg: drop bounce buffer Removing a bounce buffer copy operation in the pmsg driver path is always better. We also gain in overall performance by not requesting a vmalloc on every write as this can cause precious RT tasks, such as user facing media operation, to stall while memory is being reclaimed. Added a write_buf_user to the pstore functions, a backup platform write_buf_user that uses the small buffer that is part of the instance, and implemented a ramoops write_buf_user that only supports PSTORE_TYPE_PMSG. Signed-off-by: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2016-09-08 15:01:10 -07:00
Namhyung Kim	79d955af71	pstore/ram: Set pstore flags dynamically The ramoops can be configured to enable each pstore type by setting their size. In that case, it'd be better not to register disabled types in the first place. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>	2016-09-08 15:01:09 -07:00
Namhyung Kim	c950fd6f20	pstore: Split pstore fragile flags This patch adds new PSTORE_FLAGS for each pstore type so that they can be enabled separately. This is a preparation for ongoing virtio-pstore work to support those types flexibly. The PSTORE_FLAGS_FRAGILE is changed to PSTORE_FLAGS_DMESG to preserve the original behavior. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: linux-acpi@vger.kernel.org Cc: linux-efi@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> [kees: retained "FRAGILE" for now to make merges easier] Signed-off-by: Kees Cook <keescook@chromium.org>	2016-09-08 15:01:08 -07:00
Sebastian Andrzej Siewior	d5a9bf0b38	pstore/core: drop cmpxchg based updates I have here a FPGA behind PCIe which exports SRAM which I use for pstore. Now it seems that the FPGA no longer supports cmpxchg based updates and writes back 0xff…ff and returns the same. This leads to crash during crash rendering pstore useless. Since I doubt that there is much benefit from using cmpxchg() here, I am dropping this atomic access and use the spinlock based version. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Rabin Vincent <rabinv@axis.com> Tested-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> [kees: remove "_locked" suffix since it's the only option now] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org	2016-09-08 15:00:47 -07:00
Sebastian Andrzej Siewior	4407de74df	pstore/ramoops: fixup driver removal A basic rmmod ramoops segfaults. Let's see why. Since commit 34f0ec82e0a9 ("pstore: Correct the max_dump_cnt clearing of ramoops") sets ->max_dump_cnt to zero before looping over ->przs but we didn't use it before that either. And since commit ee1d267423a1 ("pstore: add pstore unregister") we free that memory on rmmod. But even then, we looped until a NULL pointer or ERR. I don't see where it is ensured that the last member is NULL. Let's try this instead: simply error recovery and free. Clean up in error case where resources were allocated. And then, in the free path, rely on ->max_dump_cnt in the free path. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org # 4.4.x-	2016-09-08 14:58:00 -07:00
David Howells	248f219cb8	rxrpc: Rewrite the data and ack handling code Rewrite the data and ack handling code such that: (1) Parsing of received ACK and ABORT packets and the distribution and the filing of DATA packets happens entirely within the data_ready context called from the UDP socket. This allows us to process and discard ACK and ABORT packets much more quickly (they're no longer stashed on a queue for a background thread to process). (2) We avoid calling skb_clone(), pskb_pull() and pskb_trim(). We instead keep track of the offset and length of the content of each packet in the sk_buff metadata. This means we don't do any allocation in the receive path. (3) Jumbo DATA packet parsing is now done in data_ready context. Rather than cloning the packet once for each subpacket and pulling/trimming it, we file the packet multiple times with an annotation for each indicating which subpacket is there. From that we can directly calculate the offset and length. (4) A call's receive queue can be accessed without taking locks (memory barriers do have to be used, though). (5) Incoming calls are set up from preallocated resources and immediately made live. They can than have packets queued upon them and ACKs generated. If insufficient resources exist, DATA packet #1 is given a BUSY reply and other DATA packets are discarded). (6) sk_buffs no longer take a ref on their parent call. To make this work, the following changes are made: (1) Each call's receive buffer is now a circular buffer of sk_buff pointers (rxtx_buffer) rather than a number of sk_buff_heads spread between the call and the socket. This permits each sk_buff to be in the buffer multiple times. The receive buffer is reused for the transmit buffer. (2) A circular buffer of annotations (rxtx_annotations) is kept parallel to the data buffer. Transmission phase annotations indicate whether a buffered packet has been ACK'd or not and whether it needs retransmission. Receive phase annotations indicate whether a slot holds a whole packet or a jumbo subpacket and, if the latter, which subpacket. They also note whether the packet has been decrypted in place. (3) DATA packet window tracking is much simplified. Each phase has just two numbers representing the window (rx_hard_ack/rx_top and tx_hard_ack/tx_top). The hard_ack number is the sequence number before base of the window, representing the last packet the other side says it has consumed. hard_ack starts from 0 and the first packet is sequence number 1. The top number is the sequence number of the highest-numbered packet residing in the buffer. Packets between hard_ack+1 and top are soft-ACK'd to indicate they've been received, but not yet consumed. Four macros, before(), before_eq(), after() and after_eq() are added to compare sequence numbers within the window. This allows for the top of the window to wrap when the hard-ack sequence number gets close to the limit. Two flags, RXRPC_CALL_RX_LAST and RXRPC_CALL_TX_LAST, are added also to indicate when rx_top and tx_top point at the packets with the LAST_PACKET bit set, indicating the end of the phase. (4) Calls are queued on the socket 'receive queue' rather than packets. This means that we don't need have to invent dummy packets to queue to indicate abnormal/terminal states and we don't have to keep metadata packets (such as ABORTs) around (5) The offset and length of a (sub)packet's content are now passed to the verify_packet security op. This is currently expected to decrypt the packet in place and validate it. However, there's now nowhere to store the revised offset and length of the actual data within the decrypted blob (there may be a header and padding to skip) because an sk_buff may represent multiple packets, so a locate_data security op is added to retrieve these details from the sk_buff content when needed. (6) recvmsg() now has to handle jumbo subpackets, where each subpacket is individually secured and needs to be individually decrypted. The code to do this is broken out into rxrpc_recvmsg_data() and shared with the kernel API. It now iterates over the call's receive buffer rather than walking the socket receive queue. Additional changes: (1) The timers are condensed to a single timer that is set for the soonest of three timeouts (delayed ACK generation, DATA retransmission and call lifespan). (2) Transmission of ACK and ABORT packets is effected immediately from process-context socket ops/kernel API calls that cause them instead of them being punted off to a background work item. The data_ready handler still has to defer to the background, though. (3) A shutdown op is added to the AF_RXRPC socket so that the AFS filesystem can shut down the socket and flush its own work items before closing the socket to deal with any in-progress service calls. Future additional changes that will need to be considered: (1) Make sure that a call doesn't hog the front of the queue by receiving data from the network as fast as userspace is consuming it to the exclusion of other calls. (2) Transmit delayed ACKs from within recvmsg() when we've consumed sufficiently more packets to avoid the background work item needing to run. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-08 11:10:12 +01:00
David Howells	00e907127e	rxrpc: Preallocate peers, conns and calls for incoming service requests Make it possible for the data_ready handler called from the UDP transport socket to completely instantiate an rxrpc_call structure and make it immediately live by preallocating all the memory it might need. The idea is to cut out the background thread usage as much as possible. [Note that the preallocated structs are not actually used in this patch - that will be done in a future patch.] If insufficient resources are available in the preallocation buffers, it will be possible to discard the DATA packet in the data_ready handler or schedule a BUSY packet without the need to schedule an attempt at allocation in a background thread. To this end: (1) Preallocate rxrpc_peer, rxrpc_connection and rxrpc_call structs to a maximum number each of the listen backlog size. The backlog size is limited to a maxmimum of 32. Only this many of each can be in the preallocation buffer. (2) For userspace sockets, the preallocation is charged initially by listen() and will be recharged by accepting or rejecting pending new incoming calls. (3) For kernel services {,re,dis}charging of the preallocation buffers is handled manually. Two notifier callbacks have to be provided before kernel_listen() is invoked: (a) An indication that a new call has been instantiated. This can be used to trigger background recharging. (b) An indication that a call is being discarded. This is used when the socket is being released. A function, rxrpc_kernel_charge_accept() is called by the kernel service to preallocate a single call. It should be passed the user ID to be used for that call and a callback to associate the rxrpc call with the kernel service's side of the ID. (4) Discard the preallocation when the socket is closed. (5) Temporarily bump the refcount on the call allocated in rxrpc_incoming_call() so that rxrpc_release_call() can ditch the preallocation ref on service calls unconditionally. This will no longer be necessary once the preallocation is used. Note that this does not yet control the number of active service calls on a client - that will come in a later patch. A future development would be to provide a setsockopt() call that allows a userspace server to manually charge the preallocation buffer. This would allow user call IDs to be provided in advance and the awkward manual accept stage to be bypassed. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-08 11:10:12 +01:00
Jaegeuk Kim	68f313935f	f2fs: no need to make zeros beyond i_size We don't need to make zeros beyond i_size, since we already wrote that through NEW_ADDR case. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 18:53:50 -07:00
Chao Yu	7732c26ac3	f2fs: fix to detect temporary name of multimedia file Some applications may create multimeida file with temporary name like '.jpg.tmp' or '.mp4.tmp', then rename to '.jpg' or '.mp4'. Now, f2fs can only detect multimedia filename with specified format: "filename + '.' + extension", so it will make f2fs missing to detect multimedia file with special temporary name, result in failing to set cold flag on file. This patch enhances detection flow for enabling lookup extension in the middle of temporary filename. Reported-by: Xue Liu <liuxueliu.liu@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 18:53:49 -07:00
Chao Yu	6ab2a3085e	f2fs: fix minor typo Correct typo from 'destory' to 'destroy'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 18:53:48 -07:00
Jaegeuk Kim	6bf6b267d2	f2fs: set dentry bits on random location in memory This fixes pointer panic when using inline_dentry, which was triggered when backporting to 3.10. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 18:53:47 -07:00
Chao Yu	c2a080aefa	f2fs: fix to set superblock dirty correctly tests/generic/251 of fstest suit complains us with below message: ------------[ cut here ]------------ invalid opcode: 0000 [#1] PREEMPT SMP CPU: 2 PID: 7698 Comm: fstrim Tainted: G O 4.7.0+ #21 task: e9f4e000 task.stack: e7262000 EIP: 0060:[<f89fcefe>] EFLAGS: 00010202 CPU: 2 EIP is at write_checkpoint+0xfde/0x1020 [f2fs] EAX: f33eb300 EBX: eecac310 ECX: 00000001 EDX: ffff0001 ESI: eecac000 EDI: eecac5f0 EBP: e7263dec ESP: e7263d18 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 80050033 CR2: b76ab01c CR3: 2eb89de0 CR4: 000406f0 Stack: 00000001 a220fb7b e9f4e000 00000002 419ff2d3 b3a05151 00000002 e9f4e5d8 e9f4e000 419ff2d3 b3a05151 eecac310 c10b8154 b3a05151 419ff2d3 c10b78bd e9f4e000 e9f4e000 e9f4e5d8 00000001 e9f4e000 ec409000 eecac2cc eecac288 Call Trace: [<c10b8154>] ? __lock_acquire+0x3c4/0x760 [<c10b78bd>] ? mark_held_locks+0x5d/0x80 [<f8a10632>] f2fs_trim_fs+0x1c2/0x2e0 [f2fs] [<f89e9f56>] f2fs_ioctl+0x6b6/0x10b0 [f2fs] [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20 [<c10b4281>] ? trace_hardirqs_off_caller+0x91/0x120 [<f89e98a0>] ? __exchange_data_block+0xd30/0xd30 [f2fs] [<c120b2e1>] do_vfs_ioctl+0x81/0x7f0 [<c11d57c5>] ? kmem_cache_free+0x245/0x2e0 [<c1217840>] ? get_unused_fd_flags+0x40/0x40 [<c1206eec>] ? putname+0x4c/0x50 [<c11f631e>] ? do_sys_open+0x16e/0x1d0 [<c1001990>] ? do_fast_syscall_32+0x30/0x1c0 [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20 [<c120baa8>] SyS_ioctl+0x58/0x80 [<c1001a01>] do_fast_syscall_32+0xa1/0x1c0 [<c178cc54>] sysenter_past_esp+0x45/0x74 EIP: [<f89fcefe>] write_checkpoint+0xfde/0x1020 [f2fs] SS:ESP 0068:e7263d18 ---[ end trace 4de95d7e6b3aa7c6 ]--- The reason is: with below call stack, we will encounter BUG_ON during doing fstrim. Thread A Thread B - write_checkpoint - do_checkpoint - f2fs_write_inode - update_inode_page - update_inode - set_page_dirty - f2fs_set_node_page_dirty - inc_page_count - percpu_counter_inc - set_sbi_flag(SBI_IS_DIRTY) - clear_sbi_flag(SBI_IS_DIRTY) Thread C Thread D - f2fs_write_node_page - set_node_addr - __set_nat_cache_dirty - nm_i->dirty_nat_cnt++ - do_vfs_ioctl - f2fs_ioctl - f2fs_trim_fs - write_checkpoint - f2fs_bug_on(nm_i->dirty_nat_cnt) Fix it by setting superblock dirty correctly in do_checkpoint and f2fs_write_node_page. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 18:53:47 -07:00
Shuoran Liu	e7ba108a06	f2fs: add roll-forward recovery process for encrypted dentry Add roll-forward recovery process for encrypted dentry, so the first fsync issued to an encrypted file does not need writing checkpoint. This improves the performance of the following test at thousands of small files: open -> write -> fsync -> close Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Acked-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: modify kernel message to show encrypted names] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:40 -07:00
Jaegeuk Kim	bbf156f7af	f2fs: fix lost xattrs of directories This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:39 -07:00
Chao Yu	275b66b09e	f2fs: support async discard Like most filesystems, f2fs will issue discard command synchronously, so when user trigger fstrim through ioctl, multiple discard commands will be issued serially with sync mode, which makes poor performance. In this patch we try to support async discard, so that all discard commands can be issued and be waited for endio in batch to improve performance. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:38 -07:00
Shuoran Liu	167451efb5	f2fs: set encryption name flag in add inline entry path This patch sets encryption name flag in the add inline entry path if filename is encrypted. Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:37 -07:00
Chao Yu	e06f86e61d	f2fs crypto: avoid unneeded memory allocation in ->readdir When decrypting dirents in ->readdir, fscrypt_fname_disk_to_usr won't change content of original encrypted dirent, we don't need to allocate additional buffer for storing mirror of it, so get rid of it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:36 -07:00
Chao Yu	9421d57051	f2fs: fix to do security initialization of encrypted inode with original filename When creating new inode, security_inode_init_security will be called for initializing security info related to the inode, and filename is passed to security module, it helps security module such as SElinux to know which rule or label could be applied for the inode with specified name. Previously, if new inode is created as an encrypted one, f2fs will transfer encrypted filename to security module which may fail the check of security policy belong to the inode. So in order to this issue, alter to transfer original unencrypted filename instead. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:35 -07:00
Chao Yu	7ea984b060	f2fs: do in batch synchronously readahead during GC In order to enhance performance, we try to readahead node page during GC, but before loading node page we should get block address of node page which is stored in NAT table, so synchronously read of single NAT page block our readahead flow. f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xa1e, oldaddr = 0xa1e, newaddr = 0xa1e, rw = READ_SYNC(MP), type = META f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x35e9, oldaddr = 0x72d7a, newaddr = 0x72d7a, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xc1f, oldaddr = 0xc1f, newaddr = 0xc1f, rw = READ_SYNC(MP), type = META f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x389d, oldaddr = 0x72d7d, newaddr = 0x72d7d, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3a82, oldaddr = 0x72d7f, newaddr = 0x72d7f, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3bfa, oldaddr = 0x72d86, newaddr = 0x72d86, rw = READAHEAD ^H, type = NODE This patch adds one phase that do readahead NAT pages in batch before readahead node page for more effeciently. f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0x1952, oldaddr = 0x1952, newaddr = 0x1952, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc34, oldaddr = 0xc34, newaddr = 0xc34, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa33, oldaddr = 0xa33, newaddr = 0xa33, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc30, oldaddr = 0xc30, newaddr = 0xc30, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc32, oldaddr = 0xc32, newaddr = 0xc32, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc26, oldaddr = 0xc26, newaddr = 0xc26, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa2b, oldaddr = 0xa2b, newaddr = 0xa2b, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc23, oldaddr = 0xc23, newaddr = 0xc23, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc24, oldaddr = 0xc24, newaddr = 0xc24, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa10, oldaddr = 0xa10, newaddr = 0xa10, rw = READ_SYNC(MP), type = META f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc2c, oldaddr = 0xc2c, newaddr = 0xc2c, rw = READ_SYNC(MP), type = META f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db7, oldaddr = 0x6be00, newaddr = 0x6be00, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db9, oldaddr = 0x6be17, newaddr = 0x6be17, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dbc, oldaddr = 0x6be1a, newaddr = 0x6be1a, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc3, oldaddr = 0x6be20, newaddr = 0x6be20, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc7, oldaddr = 0x6be24, newaddr = 0x6be24, rw = READAHEAD ^H, type = NODE f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc9, oldaddr = 0x6be25, newaddr = 0x6be25, rw = READAHEAD ^H, type = NODE Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:34 -07:00
Chao Yu	74fa5f3d43	f2fs: schedule in between two continous batch discards In batch discard approach of fstrim will grab/release gc_mutex lock repeatly, it makes contention of the lock becoming more intensive. So after one batch discards were issued in checkpoint and the lock was released, it's better to do schedule() to increase opportunity of grabbing gc_mutex lock for other competitors. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-07 17:27:33 -07:00
Chris Mason	b7f3c7d345	Merge branch 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.8	2016-09-07 12:55:36 -07:00
David Howells	5a42976d4f	rxrpc: Add tracepoint for working out where aborts happen Add a tracepoint for working out where local aborts happen. Each tracepoint call is labelled with a 3-letter code so that they can be distinguished - and the DATA sequence number is added too where available. rxrpc_kernel_abort_call() also takes a 3-letter code so that AFS can indicate the circumstances when it aborts a call. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-07 16:34:40 +01:00
Christophe JAILLET	240c5185c5	jfs: Simplify code Calling 'list_splice' followed by 'INIT_LIST_HEAD' is equivalent to 'list_splice_init'. This has been spotted with the following coccinelle script: ///// @@ expression y,z; @@ - list_splice(y,z); - INIT_LIST_HEAD(y); + list_splice_init(y,z); Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>	2016-09-06 12:17:24 -05:00
Jan Kara	f27792f5b7	udf: Remove useless check in udf_adinicb_write_begin() As Al properly points out, len is guaranteed to be smaller than PAGE_SIZE when we reach udf_adinicb_write_begin() as otherwise we would have converted the file to the normal format. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jan Kara <jack@suse.cz>	2016-09-06 18:04:40 +02:00
Wang Xiaoguang	ce129655c9	btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress In btrfs_async_reclaim_metadata_space(), we use ticket's address to determine whether asynchronous metadata reclaim work is making progress. ticket = list_first_entry(&space_info->tickets, struct reserve_ticket, list); if (last_ticket == ticket) { flush_state++; } else { last_ticket = ticket; flush_state = FLUSH_DELAYED_ITEMS_NR; if (commit_cycles) commit_cycles--; } But indeed it's wrong, we should not rely on local variable's address to do this check, because addresses may be same. In my test environment, I dd one 168MB file in a 256MB fs, found that for this file, every time wait_reserve_ticket() called, local variable ticket's address is same, For above codes, assume a previous ticket's address is addrA, last_ticket is addrA. Btrfs_async_reclaim_metadata_space() finished this ticket and wake up it, then another ticket is added, but with the same address addrA, now last_ticket will be same to current ticket, then current ticket's flush work will start from current flush_state, not initial FLUSH_DELAYED_ITEMS_NR, which may result in some enospc issues(I have seen this in my test machine). Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-09-06 16:31:43 +02:00
Chris Mason	cbd60aa7cd	Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns We use a btrfs_log_ctx structure to pass information into the tree log commit, and get error values out. It gets added to a per log-transaction list which we walk when things go bad. Commit d1433debe added an optimization to skip waiting for the log commit, but didn't take root_log_ctx out of the list. This patch makes sure we remove things before exiting. Signed-off-by: Chris Mason <clm@fb.com> Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4 cc: stable@vger.kernel.org # 3.15+	2016-09-06 05:57:25 -07:00
Wang Xiaoguang	ed7a694839	btrfs: do not decrease bytes_may_use when replaying extents When replaying extents, there is no need to update bytes_may_use in btrfs_alloc_logged_file_extent(), otherwise it'll trigger a WARN_ON about bytes_may_use. Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely") Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-09-05 17:40:41 +02:00
Nicolas Iooss	0f5aa88a7b	ceph: do not modify fi->frag in need_reset_readdir() Commit f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset") modified "if (fpos_frag(new_pos) != fi->frag)" to "if (fi->frag \|= fpos_frag(new_pos))" in need_reset_readdir(), thus replacing a comparison operator with an assignment one. This looks like a typo which is reported by clang when building the kernel with some warning flags: fs/ceph/dir.c:600:22: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses] } else if (fi->frag \|= fpos_frag(new_pos)) { ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ fs/ceph/dir.c:600:22: note: place parentheses around the assignment to silence this warning } else if (fi->frag \|= fpos_frag(new_pos)) { ^ ( ) fs/ceph/dir.c:600:22: note: use '!=' to turn this compound assignment into an inequality comparison } else if (fi->frag \|= fpos_frag(new_pos)) { ^~ != Fixes: f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2016-09-05 14:30:35 +02:00
Miklos Szeredi	e1ff3dd1ae	ovl: fix workdir creation Workdir creation fails in latest kernel. Fix by allowing EOPNOTSUPP as a valid return value from vfs_removexattr(XATTR_NAME_POSIX_ACL_*). Upper filesystem may not support ACL and still be perfectly able to support overlayfs. Reported-by: Martin Ziegler <ziegler@uni-freiburg.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: c11b9fdd6a61 ("ovl: remove posix_acl_default from workdir") Cc: <stable@vger.kernel.org>	2016-09-05 13:55:20 +02:00
Greg Kroah-Hartman	2f5bb02ff2	Merge 4.8-rc5 into driver-core-next We want the sysfs and kernfs in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-05 08:09:04 +02:00
Bhaktipriya Shridhar	434e612003	fs/afs/flock: Remove deprecated create_singlethread_workqueue The workqueue "afs_lock_manager" queues work item &vnode->lock_work, per vnode. Since there can be multiple vnodes and since their work items can be executed concurrently, alloc_workqueue has been used to replace the deprecated create_singlethread_workqueue instance. The WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory pressure because the workqueue is being used on a memory reclaim path. Since there are fixed number of work items, explicit concurrency limit is unnecessary here. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-04 21:41:39 +01:00
Bhaktipriya Shridhar	4c136dae62	fs/afs/callback: Remove deprecated create_singlethread_workqueue The workqueue "afs_callback_update_worker" queues multiple work items viz &vnode->cb_broken_work, &server->cb_break_work which require strict execution ordering. Hence, an ordered dedicated workqueue has been used. Since the workqueue is being used on a memory reclaim path, WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-04 21:41:39 +01:00
Bhaktipriya Shridhar	69ad052aec	fs/afs/rxrpc: Remove deprecated create_singlethread_workqueue The workqueue "afs_async_calls" queues work item &call->async_work per afs_call. Since there could be multiple calls and since these calls can be run concurrently, alloc_workqueue has been used to replace the deprecated create_singlethread_workqueue instance. The WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory pressure because the workqueue is being used on a memory reclaim path. Since there are fixed number of work items, explicit concurrency limit is unnecessary here. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-04 21:41:39 +01:00
Bhaktipriya Shridhar	9ce4d7d385	fs/afs/vlocation: Remove deprecated create_singlethread_workqueue The workqueue "afs_vlocation_update_worker" queues a single work item &afs_vlocation_update and hence it doesn't require execution ordering. Hence, alloc_workqueue has been used to replace the deprecated create_singlethread_workqueue instance. Since the workqueue is being used on a memory reclaim path, WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory pressure. Since there are fixed number of work items, explicit concurrency limit is unnecessary here. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-04 21:41:39 +01:00
Trond Myklebust	334a8f3711	pNFS: Don't forget the layout stateid if there are outstanding LAYOUTGETs If there are outstanding LAYOUTGET rpc calls, then we want to ensure that we keep the layout stateid around so we that don't inadvertently pick up an old/misordered sequence id. The race is as follows: Client Server ====== ====== LAYOUTGET(seqid) LAYOUTGET(seqid) return LAYOUTGET(seqid+1) return LAYOUTGET(seqid+2) process LAYOUTGET(seqid+2) forget layout process LAYOUTGET(seqid+1) If it forgets the layout stateid before processing seqid+1, then the client will not check the layout->plh_barrier, and so will set the stateid with seqid+1. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2016-09-04 12:59:00 -04:00
Linus Torvalds	4b30b6d126	Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm still prepping a set of fixes for btrfs fsync, just nailing down a hard to trigger memory corruption. For now, these are tested and ready." * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix one bug that process may endlessly wait for ticket in wait_reserve_ticket() Btrfs: fix endless loop in balancing block groups Btrfs: kill invalid ASSERT() in process_all_refs()	2016-09-03 12:40:45 -07:00

... 3 4 5 6 7 ...

46108 Commits