android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Paolo Bonzini	2cf7ea9f40	KVM: VMX: hide flexpriority from guest when disabled at the module level As of commit `8d860bbeed` ("kvm: vmx: Basic APIC virtualization controls have three settings"), KVM will disable VIRTUALIZE_APIC_ACCESSES when a nested guest writes APIC_BASE MSR and kvm-intel.flexpriority=0, whereas previously KVM would allow a nested guest to enable VIRTUALIZE_APIC_ACCESSES so long as it's supported in hardware. That is, KVM now advertises VIRTUALIZE_APIC_ACCESSES to a guest but doesn't (always) allow setting it when kvm-intel.flexpriority=0, and may even initially allow the control and then clear it when the nested guest writes APIC_BASE MSR, which is decidedly odd even if it doesn't cause functional issues. Hide the control completely when the module parameter is cleared. reported-by: Sean Christopherson <sean.j.christopherson@intel.com> Fixes: `8d860bbeed` ("kvm: vmx: Basic APIC virtualization controls have three settings") Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-04 13:40:44 +02:00
Sean Christopherson	fd6b6d9b82	KVM: VMX: check for existence of secondary exec controls before accessing Return early from vmx_set_virtual_apic_mode() if the processor doesn't support VIRTUALIZE_APIC_ACCESSES or VIRTUALIZE_X2APIC_MODE, both of which reside in SECONDARY_VM_EXEC_CONTROL. This eliminates warnings due to VMWRITEs to SECONDARY_VM_EXEC_CONTROL (VMCS field 401e) failing on processors without secondary exec controls. Remove the similar check for TPR shadowing as it is incorporated in the flexpriority_enabled check and the APIC-related code in vmx_update_msr_bitmap() is further gated by VIRTUALIZE_X2APIC_MODE. Reported-by: Gerhard Wiesinger <redhat@wiesinger.com> Fixes: `8d860bbeed` ("kvm: vmx: Basic APIC virtualization controls have three settings") Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-04 13:40:21 +02:00
Sean Christopherson	daa07cbc9a	KVM: x86: fix L1TF's MMIO GFN calculation One defense against L1TF in KVM is to always set the upper five bits of the legal physical address in the SPTEs for non-present and reserved SPTEs, e.g. MMIO SPTEs. In the MMIO case, the GFN of the MMIO SPTE may overlap with the upper five bits that are being usurped to defend against L1TF. To preserve the GFN, the bits of the GFN that overlap with the repurposed bits are shifted left into the reserved bits, i.e. the GFN in the SPTE will be split into high and low parts. When retrieving the GFN from the MMIO SPTE, e.g. to check for an MMIO access, get_mmio_spte_gfn() unshifts the affected bits and restores the original GFN for comparison. Unfortunately, get_mmio_spte_gfn() neglects to mask off the reserved bits in the SPTE that were used to store the upper chunk of the GFN. As a result, KVM fails to detect MMIO accesses whose GPA overlaps the repurprosed bits, which in turn causes guest panics and hangs. Fix the bug by generating a mask that covers the lower chunk of the GFN, i.e. the bits that aren't shifted by the L1TF mitigation. The alternative approach would be to explicitly zero the five reserved bits that are used to store the upper chunk of the GFN, but that requires additional run-time computation and makes an already-ugly bit of code even more inscrutable. I considered adding a WARN_ON_ONCE(low_phys_bits-1 <= PAGE_SHIFT) to warn if GENMASK_ULL() generated a nonsensical value, but that seemed silly since that would mean a system that supports VMX has less than 18 bits of physical address space... Reported-by: Sakari Ailus <sakari.ailus@iki.fi> Fixes: d9b47449c1a1 ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Cc: Junaid Shahid <junaids@google.com> Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Reviewed-by: Junaid Shahid <junaids@google.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-01 15:41:00 +02:00
Liran Alon	62cf9bd811	KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS L2 IA32_BNDCFGS should be updated with vmcs12->guest_bndcfgs only when VM_ENTRY_LOAD_BNDCFGS is specified in vmcs12->vm_entry_controls. Otherwise, L2 IA32_BNDCFGS should be set to vmcs01->guest_bndcfgs which is L1 IA32_BNDCFGS. Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-01 15:40:59 +02:00
Liran Alon	503234b3fd	KVM: x86: Do not use kvm_x86_ops->mpx_supported() directly Commit `a87036add0` ("KVM: x86: disable MPX if host did not enable MPX XSAVE features") introduced kvm_mpx_supported() to return true iff MPX is enabled in the host. However, that commit seems to have missed replacing some calls to kvm_x86_ops->mpx_supported() to kvm_mpx_supported(). Complete original commit by replacing remaining calls to kvm_mpx_supported(). Fixes: `a87036add0` ("KVM: x86: disable MPX if host did not enable MPX XSAVE features") Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-01 15:40:57 +02:00
Liran Alon	5f76f6f5ff	KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled Before this commit, KVM exposes MPX VMX controls to L1 guest only based on if KVM and host processor supports MPX virtualization. However, these controls should be exposed to guest only in case guest vCPU supports MPX. Without this change, a L1 guest running with kernel which don't have commit `691bd4340b` ("kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS") asserts in QEMU on the following: qemu-kvm: error: failed to set MSR 0xd90 to 0x0 qemu-kvm: .../qemu-2.10.0/target/i386/kvm.c:1801 kvm_put_msrs: Assertion 'ret == cpu->kvm_msr_buf->nmsrs failed' This is because L1 KVM kvm_init_msr_list() will see that vmx_mpx_supported() (As it only checks MPX VMX controls support) and therefore KVM_GET_MSR_INDEX_LIST IOCTL will include MSR_IA32_BNDCFGS. However, later when L1 will attempt to set this MSR via KVM_SET_MSRS IOCTL, it will fail because !guest_cpuid_has_mpx(vcpu). Therefore, fix the issue by exposing MPX VMX controls to L1 guest only when vCPU supports MPX. Fixes: `36be0b9deb` ("KVM: x86: Add nested virtualization support for MPX") Reported-by: Eyal Moscovici <eyal.moscovici@oracle.com> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-01 15:40:57 +02:00
Paolo Bonzini	4679b61f26	KVM: x86: never trap MSR_KERNEL_GS_BASE KVM has an old optimization whereby accesses to the kernel GS base MSR are trapped when the guest is in 32-bit and not when it is in 64-bit mode. The idea is that swapgs is not available in 32-bit mode, thus the guest has no reason to access the MSR unless in 64-bit mode and 32-bit applications need not pay the price of switching the kernel GS base between the host and the guest values. However, this optimization adds complexity to the code for little benefit (these days most guests are going to be 64-bit anyway) and in fact broke after commit `678e315e78` ("KVM: vmx: add dedicated utility to access guest's kernel_gs_base", 2018-08-06); the guest kernel GS base can be corrupted across SMIs and UEFI Secure Boot is therefore broken (a secure boot Linux guest, for example, fails to reach the login prompt about half the time). This patch just removes the optimization; the kernel GS base MSR is now never trapped by KVM, similarly to the FS and GS base MSRs. Fixes: `678e315e78` Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-24 18:34:13 +02:00
Greg Kroah-Hartman	a27fb6d983	This pull request is slightly bigger than usual at this stage, but I swear I would have sent it the same to Linus! The main cause for this is that I was on vacation until two weeks ago and it took a while to sort all the pending patches between 4.19 and 4.20, test them and so on. It's mostly small bugfixes and cleanups, mostly around x86 nested virtualization. One important change, not related to nested virtualization, is that the ability for the guest kernel to trap CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now masked by default. This is because the feature is detected through an MSR; a very bad idea that Intel seems to like more and more. Some applications choke if the other fields of that MSR are not initialized as on real hardware, hence we have to disable the whole MSR by default, as was the case before Linux 4.12. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJbpPo1AAoJEL/70l94x66DdxgH/is0qe6ZBtzb6Qc0W+8mHHD7 nxIkWAs2V5NsouJ750YwRQ+0Ym407+wlNt30acdBUEoXhrnA5/TvyGq999XvCL96 upWEIxpIgbvTMX/e2nLhe4wQdhsboUK4r0/B9IFgVFYrdCt5uRXjB2G4ewxcqxL/ GxxqrAKhaRsbQG9Xv0Fw5Vohh/Ls6fQDJcyuY1EBnbMpVenq2QDLI6cOAPXncyFb uLN6ov4GNCWIPckwxejri5XhZesUOsafrmn48sApShh4T6TrisrdtSYdzl+DGza+ j5vhIEwdFO5kulZ3viuhqKJOnS2+F6wvfZ75IKT0tEKeU2bi+ifGDyGRefSF6Q0= =YXLw -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Paolo writes: "It's mostly small bugfixes and cleanups, mostly around x86 nested virtualization. One important change, not related to nested virtualization, is that the ability for the guest kernel to trap CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now masked by default. This is because the feature is detected through an MSR; a very bad idea that Intel seems to like more and more. Some applications choke if the other fields of that MSR are not initialized as on real hardware, hence we have to disable the whole MSR by default, as was the case before Linux 4.12." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits) KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs kvm: selftests: Add platform_info_test KVM: x86: Control guest reads of MSR_PLATFORM_INFO KVM: x86: Turbo bits in MSR_PLATFORM_INFO nVMX x86: Check VPID value on vmentry of L2 guests nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv KVM: VMX: check nested state and CR4.VMXE against SMM kvm: x86: make kvm_{load\|put}_guest_fpu() static x86/hyper-v: rename ipi_arg_{ex,non_ex} structures KVM: VMX: use preemption timer to force immediate VMExit KVM: VMX: modify preemption timer bit only when arming timer KVM: VMX: immediately mark preemption timer expired only for zero value KVM: SVM: Switch to bitmap_zalloc() KVM/MMU: Fix comment in walk_shadow_page_lockless_end() kvm: selftests: use -pthread instead of -lpthread KVM: x86: don't reset root in kvm_mmu_setup() kvm: mmu: Don't read PDPTEs when paging is not enabled x86/kvm/lapic: always disable MMIO interface in x2APIC mode KVM: s390: Make huge pages unavailable in ucontrol VMs ...	2018-09-21 16:21:42 +02:00
Liran Alon	26b471c7e2	KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs The handlers of IOCTLs in kvm_arch_vcpu_ioctl() are expected to set their return value in "r" local var and break out of switch block when they encounter some error. This is because vcpu_load() is called before the switch block which have a proper cleanup of vcpu_put() afterwards. However, KVM_{GET,SET}_NESTED_STATE IOCTLs handlers just return immediately on error without performing above mentioned cleanup. Thus, change these handlers to behave as expected. Fixes: `8fcc4b5923` ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Patrick Colp <patrick.colp@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 18:54:08 +02:00
Drew Schmitt	6fbbde9a19	KVM: x86: Control guest reads of MSR_PLATFORM_INFO Add KVM_CAP_MSR_PLATFORM_INFO so that userspace can disable guest access to reads of MSR_PLATFORM_INFO. Disabling access to reads of this MSR gives userspace the control to "expose" this platform-dependent information to guests in a clear way. As it exists today, guests that read this MSR would get unpopulated information if userspace hadn't already set it (and prior to this patch series, only the CPUID faulting information could have been populated). This existing interface could be confusing if guests don't handle the potential for incorrect/incomplete information gracefully (e.g. zero reported for base frequency). Signed-off-by: Drew Schmitt <dasch@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:46 +02:00
Drew Schmitt	d84f1cff90	KVM: x86: Turbo bits in MSR_PLATFORM_INFO Allow userspace to set turbo bits in MSR_PLATFORM_INFO. Previously, only the CPUID faulting bit was settable. But now any bit in MSR_PLATFORM_INFO would be settable. This can be used, for example, to convey frequency information about the platform on which the guest is running. Signed-off-by: Drew Schmitt <dasch@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:46 +02:00
Krish Sadhukhan	ba8e23db59	nVMX x86: Check VPID value on vmentry of L2 guests According to section "Checks on VMX Controls" in Intel SDM vol 3C, the following check needs to be enforced on vmentry of L2 guests: If the 'enable VPID' VM-execution control is 1, the value of the of the VPID VM-execution control field must not be 0000H. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:45 +02:00
Krish Sadhukhan	6de84e581c	nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 According to section "Checks on VMX Controls" in Intel SDM vol 3C, the following check needs to be enforced on vmentry of L2 guests: - Bits 5:0 of the posted-interrupt descriptor address are all 0. - The posted-interrupt descriptor address does not set any bits beyond the processor's physical-address width. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:44 +02:00
Liran Alon	e6c67d8cf1	KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv In case L1 do not intercept L2 HLT or enter L2 in HLT activity-state, it is possible for a vCPU to be blocked while it is in guest-mode. According to Intel SDM 26.6.5 Interrupt-Window Exiting and Virtual-Interrupt Delivery: "These events wake the logical processor if it just entered the HLT state because of a VM entry". Therefore, if L1 enters L2 in HLT activity-state and L2 has a pending deliverable interrupt in vmcs12->guest_intr_status.RVI, then the vCPU should be waken from the HLT state and injected with the interrupt. In addition, if while the vCPU is blocked (while it is in guest-mode), it receives a nested posted-interrupt, then the vCPU should also be waken and injected with the posted interrupt. To handle these cases, this patch enhances kvm_vcpu_has_events() to also check if there is a pending interrupt in L2 virtual APICv provided by L1. That is, it evaluates if there is a pending virtual interrupt for L2 by checking RVI[7:4] > VPPR[7:4] as specified in Intel SDM 29.2.1 Evaluation of Pending Interrupts. Note that this also handles the case of nested posted-interrupt by the fact RVI is updated in vmx_complete_nested_posted_interrupt() which is called from kvm_vcpu_check_block() -> kvm_arch_vcpu_runnable() -> kvm_vcpu_running() -> vmx_check_nested_events() -> vmx_complete_nested_posted_interrupt(). Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:44 +02:00
Paolo Bonzini	5bea5123cb	KVM: VMX: check nested state and CR4.VMXE against SMM VMX cannot be enabled under SMM, check it when CR4 is set and when nested virtualization state is restored. This should fix some WARNs reported by syzkaller, mostly around alloc_shadow_vmcs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:43 +02:00
Sebastian Andrzej Siewior	822f312d47	kvm: x86: make kvm_{load\|put}_guest_fpu() static The functions kvm_load_guest_fpu() kvm_put_guest_fpu() are only used locally, make them static. This requires also that both functions are moved because they are used before their implementation. Those functions were exported (via EXPORT_SYMBOL) before commit `e5bb40251a` ("KVM: Drop kvm_{load,put}_guest_fpu() exports"). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:43 +02:00
Vitaly Kuznetsov	a1efa9b700	x86/hyper-v: rename ipi_arg_{ex,non_ex} structures These structures are going to be used from KVM code so let's make their names reflect their Hyper-V origin. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:42 +02:00
Sean Christopherson	d264ee0c2e	KVM: VMX: use preemption timer to force immediate VMExit A VMX preemption timer value of '0' is guaranteed to cause a VMExit prior to the CPU executing any instructions in the guest. Use the preemption timer (if it's supported) to trigger immediate VMExit in place of the current method of sending a self-IPI. This ensures that pending VMExit injection to L1 occurs prior to executing any instructions in the guest (regardless of nesting level). When deferring VMExit injection, KVM generates an immediate VMExit from the (possibly nested) guest by sending itself an IPI. Because hardware interrupts are blocked prior to VMEnter and are unblocked (in hardware) after VMEnter, this results in taking a VMExit(INTR) before any guest instruction is executed. But, as this approach relies on the IPI being received before VMEnter executes, it only works as intended when KVM is running as L0. Because there are no architectural guarantees regarding when IPIs are delivered, when running nested the INTR may "arrive" long after L2 is running e.g. L0 KVM doesn't force an immediate switch to L1 to deliver an INTR. For the most part, this unintended delay is not an issue since the events being injected to L1 also do not have architectural guarantees regarding their timing. The notable exception is the VMX preemption timer[1], which is architecturally guaranteed to cause a VMExit prior to executing any instructions in the guest if the timer value is '0' at VMEnter. Specifically, the delay in injecting the VMExit causes the preemption timer KVM unit test to fail when run in a nested guest. Note: this approach is viable even on CPUs with a broken preemption timer, as broken in this context only means the timer counts at the wrong rate. There are no known errata affecting timer value of '0'. [1] I/O SMIs also have guarantees on when they arrive, but I have no idea if/how those are emulated in KVM. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> [Use a hook for SVM instead of leaving the default in x86.c - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:42 +02:00
Sean Christopherson	f459a707ed	KVM: VMX: modify preemption timer bit only when arming timer Provide a singular location where the VMX preemption timer bit is set/cleared so that future usages of the preemption timer can ensure the VMCS bit is up-to-date without having to modify unrelated code paths. For example, the preemption timer can be used to force an immediate VMExit. Cache the status of the timer to avoid redundant VMREAD and VMWRITE, e.g. if the timer stays armed across multiple VMEnters/VMExits. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:41 +02:00
Sean Christopherson	4c008127e4	KVM: VMX: immediately mark preemption timer expired only for zero value A VMX preemption timer value of '0' at the time of VMEnter is architecturally guaranteed to cause a VMExit prior to the CPU executing any instructions in the guest. This architectural definition is in place to ensure that a previously expired timer is correctly recognized by the CPU as it is possible for the timer to reach zero and not trigger a VMexit due to a higher priority VMExit being signalled instead, e.g. a pending #DB that morphs into a VMExit. Whether by design or coincidence, commit `f4124500c2` ("KVM: nVMX: Fully emulate preemption timer") special cased timer values of '0' and '1' to ensure prompt delivery of the VMExit. Unlike '0', a timer value of '1' has no has no architectural guarantees regarding when it is delivered. Modify the timer emulation to trigger immediate VMExit if and only if the timer value is '0', and document precisely why '0' is special. Do this even if calibration of the virtual TSC failed, i.e. VMExit will occur immediately regardless of the frequency of the timer. Making only '0' a special case gives KVM leeway to be more aggressive in ensuring the VMExit is injected prior to executing instructions in the nested guest, and also eliminates any ambiguity as to why '1' is a special case, e.g. why wasn't the threshold for a "short timeout" set to 10, 100, 1000, etc... Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:46 +02:00
Andy Shevchenko	a101c9d63e	KVM: SVM: Switch to bitmap_zalloc() Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that it returns pointer of bitmap type instead of opaque void *. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:45 +02:00
Tianyu Lan	9a9845867c	KVM/MMU: Fix comment in walk_shadow_page_lockless_end() kvm_commit_zap_page() has been renamed to kvm_mmu_commit_zap_page() This patch is to fix the commit. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:45 +02:00
Wei Yang	83b20b28c6	KVM: x86: don't reset root in kvm_mmu_setup() Here is the code path which shows kvm_mmu_setup() is invoked after kvm_mmu_create(). Since kvm_mmu_setup() is only invoked in this code path, this means the root_hpa and prev_roots are guaranteed to be invalid. And it is not necessary to reset it again. kvm_vm_ioctl_create_vcpu() kvm_arch_vcpu_create() vmx_create_vcpu() kvm_vcpu_init() kvm_arch_vcpu_init() kvm_mmu_create() kvm_arch_vcpu_setup() kvm_mmu_setup() kvm_init_mmu() This patch set reset_roots to false in kmv_mmu_setup(). Fixes: `50c28f21d0` Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:44 +02:00
Junaid Shahid	d35b34a9a7	kvm: mmu: Don't read PDPTEs when paging is not enabled kvm should not attempt to read guest PDPTEs when CR0.PG = 0 and CR4.PAE = 1. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:43 +02:00
Vitaly Kuznetsov	d176620277	x86/kvm/lapic: always disable MMIO interface in x2APIC mode When VMX is used with flexpriority disabled (because of no support or if disabled with module parameter) MMIO interface to lAPIC is still available in x2APIC mode while it shouldn't be (kvm-unit-tests): PASS: apic_disable: Local apic enabled in x2APIC mode PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set FAIL: apic_disable: *0xfee00030: 50014 The issue appears because we basically do nothing while switching to x2APIC mode when APIC access page is not used. apic_mmio_{read,write} only check if lAPIC is disabled before proceeding to actual write. When APIC access is virtualized we correctly manipulate with VMX controls in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes in x2APIC mode so there's no issue. Disabling MMIO interface seems to be easy. The question is: what do we do with these reads and writes? If we add apic_x2apic_mode() check to apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will go to userspace. When lAPIC is in kernel, Qemu uses this interface to inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This somehow works with disabled lAPIC but when we're in xAPIC mode we will get a real injected MSI from every write to lAPIC. Not good. The simplest solution seems to be to just ignore writes to the region and return ~0 for all reads when we're in x2APIC mode. This is what this patch does. However, this approach is inconsistent with what currently happens when flexpriority is enabled: we allocate APIC access page and create KVM memory region so in x2APIC modes all reads and writes go to this pre-allocated page which is, btw, the same for all vCPUs. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:43 +02:00
Greg Kroah-Hartman	4ca719a338	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Crypto stuff from Herbert: "This push fixes a potential boot hang in ccp and an incorrect CPU capability check in aegis/morus on x86." * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2 crypto: ccp - add timeout support in the SEV command	2018-09-19 08:29:42 +02:00
Paolo Bonzini	1795f81f61	Second set of PPC KVM fixes for 4.19 Two fixes for KVM on POWER machines. Both of these relate to memory corruption and host crashes seen when transparent huge pages are enabled. The first fixes a host crash that can occur when a DMA mapping is removed by the guest and the page mapped was part of a transparent huge page; the second fixes corruption that could occur when a hypervisor page fault for a radix guest is being serviced at the same time that the backing page is being collapsed or split. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJbmFIyAAoJEJ2a6ncsY3Gf+L8H/jRQ0ONUpv2xrgirXdPmfuVv xIVejn5chiygpo3ZY2YkRGjqMoX8usA5pDQONk9duoc48FedSjjmurfAkSA8NESI y6DSRGB6pir/reP/7tBVk0eeeMBjbYnHPA7KfI8ijK424VmRpCT5stiUm7gQvSEm LSRUSLwWKfCCjU78HVtiTuK865WZifrOCy6wiNEl79F1K6T1A+LeGaKrcDLjeK/Q GsNSbwBK37BOvcsm0W1xrlnCmYtR/nVrhjTFMc5noBuc4znQd3wxitgiInFsOH5V LUWL6IStFkbGKSxVZuJilkhVF58AAisrJnwvlZsjrExWYf1J42kbyvVoURt0O8I= =blkZ -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-fixes-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second set of PPC KVM fixes for 4.19 Two fixes for KVM on POWER machines. Both of these relate to memory corruption and host crashes seen when transparent huge pages are enabled. The first fixes a host crash that can occur when a DMA mapping is removed by the guest and the page mapped was part of a transparent huge page; the second fixes corruption that could occur when a hypervisor page fault for a radix guest is being serviced at the same time that the backing page is being collapsed or split.	2018-09-18 15:13:00 +02:00
Paolo Bonzini	cb5fb87a2f	KVM: s390: Fixes for 4.19 - more fallout from the hugetlbfs enablement - bugfix for vma handling -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJbm68DAAoJEBF7vIC1phx8i3cP/R3UwXGrTq3u5+78PfUu+Nvm zTSgHID25/pCsytvRcK7SSJQt+h53RjuiQlOcrtSJZvq3Btt23gl+pFltqvWGZDM PYgy+LaaaoIXNeOJqwkwcnNsq4usQ0N1jQ8EgNUxEc2EBkffA6mKcTS0lWDm47AE Yxsyz0RsvwcbYCOj10s1b4D9h97+ZkqAC2RDnafzQOb54j/aOH7blsStNDXfdkte xHJ7mELOwAWre2QhD7OeTX1aCjBfKUJzWK2sPxm7qt5nEtH18o9F6j2Y/d8dnWg8 kqthyyb+nH0aHgv3G+7sKBOp1Ra7ftHAVrrbh3m9hceaW5+VuEhWSN6zVZKVdWta 4bjiWa9I1gKS/Wz479ztdbpt+koRAYSg5CLHAD7YsBR3hHtnAZwuRO38S4Xg+Tfb Hp4Lhf8gnq9yQqrcyfKWQKtU9mb84oNtEw/wQPUvr2tZyeoB5NbnVy5UDa9V1DRS FnF1/MIZ/MszkwbNEnhk+WWFy41h+uduizfXwQ/YO/xciEEdM2TaDHuLIpBegiSO 4kFfHRSMxqbwNhQSsJI8dIFpSHqXrtn2p6JTsWdi94B0QBYaNm4XMNpgEZIs5CsE aNGhQa+IZ5jk6YsGMx4mqAmV2jN9bMEoKR/SaOvrSkPvAO9QxZP4V6W72YGoi6le RZK8Z+HGxg6qpeaAiG0t =TT85 -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes for 4.19 - more fallout from the hugetlbfs enablement - bugfix for vma handling	2018-09-18 15:12:51 +02:00
Greg Kroah-Hartman	5211da9ca5	Merge gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net Dave writes: "Various fixes, all over the place: 1) OOB data generation fix in bluetooth, from Matias Karhumaa. 2) BPF BTF boundary calculation fix, from Martin KaFai Lau. 3) Don't bug on excessive frags, to be compatible in situations mixing older and newer kernels on each end. From Juergen Gross. 4) Scheduling in RCU fix in hv_netvsc, from Stephen Hemminger. 5) Zero keying information in TLS layer before freeing copies of them, from Sabrina Dubroca. 6) Fix NULL deref in act_sample, from Davide Caratti. 7) Orphan SKB before GRO in veth to prevent crashes with XDP, from Toshiaki Makita. 8) Fix use after free in ip6_xmit, from Eric Dumazet. 9) Fix VF mac address regression in bnxt_en, from Micahel Chan. 10) Fix MSG_PEEK behavior in TLS layer, from Daniel Borkmann. 11) Programming adjustments to r8169 which fix not being to enter deep sleep states on some machines, from Kai-Heng Feng and Hans de Goede. 12) Fix DST_NOCOUNT flag handling for ipv6 routes, from Peter Oskolkov." * gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (45 commits) net/ipv6: do not copy dst flags on rt init qmi_wwan: set DTR for modems in forced USB2 mode clk: x86: Stop marking clocks as CLK_IS_CRITICAL r8169: Get and enable optional ether_clk clock clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail r8169: enable ASPM on RTL8106E r8169: Align ASPM/CLKREQ setting function with vendor driver Revert "kcm: remove any offset before parsing messages" kcm: remove any offset before parsing messages net: ethernet: Fix a unused function warning. net: dsa: mv88e6xxx: Fix ATU Miss Violation tls: fix currently broken MSG_PEEK behavior hv_netvsc: pair VF based on serial number PCI: hv: support reporting serial number as slot information bnxt_en: Fix VF mac address regression. ipv6: fix possible use-after-free in ip6_xmit() net: hp100: fix always-true check for link up state ARM: dts: at91: add new compatibility string for macb on sama5d3 net: macb: disable scatter-gather for macb on sama5d3 net: mvpp2: let phylink manage the carrier state ...	2018-09-18 09:31:53 +02:00
Nicolas Ferre	321cc359d8	ARM: dts: at91: add new compatibility string for macb on sama5d3 We need this new compatibility string as we experienced different behavior for this 10/100Mbits/s macb interface on this particular SoC. Backward compatibility is preserved as we keep the alternative strings. Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:54:40 -07:00
Linus Torvalds	27c5a778df	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingol Molnar: "Misc fixes: - EFI crash fix - Xen PV fixes - do not allow PTI on 2-level 32-bit kernels for now - documentation fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/APM: Fix build warning when PROC_FS is not enabled Revert "x86/mm/legacy: Populate the user page-table with user pgd's" x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3 x86/xen: Disable CPU0 hotplug for Xen PV x86/EISA: Don't probe EISA bus for Xen PV guests x86/doc: Fix Documentation/x86/earlyprintk.txt	2018-09-15 08:02:46 -10:00
Linus Torvalds	c0be92b5b1	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also breakpoint and x86 PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) perf tools: Fix maps__find_symbol_by_name() tools headers uapi: Update tools's copy of linux/if_link.h tools headers uapi: Update tools's copy of linux/vhost.h tools headers uapi: Update tools's copies of kvm headers tools headers uapi: Update tools's copy of drm/drm.h tools headers uapi: Update tools's copy of asm-generic/unistd.h tools headers uapi: Update tools's copy of linux/perf_event.h perf/core: Force USER_DS when recording user stack data perf/UAPI: Clearly mark __PERF_SAMPLE_CALLCHAIN_EARLY as internal use perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs perf annotate: Fix parsing aarch64 branch instructions after objdump update perf probe powerpc: Ignore SyS symbols irrespective of endianness perf event-parse: Use fixed size string for comms perf util: Fix bad memory access in trace info. perf tools: Streamline bpf examples and headers installation perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx() perf arm64: Fix include path for asm-generic/unistd.h perf/hw_breakpoint: Simplify breakpoint enable in perf_event_modify_breakpoint perf/hw_breakpoint: Enable breakpoint in modify_user_hw_breakpoint perf/hw_breakpoint: Remove superfluous bp->attr.disabled = 0 ...	2018-09-15 06:44:32 -10:00
Randy Dunlap	002b87d2aa	x86/APM: Fix build warning when PROC_FS is not enabled Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled: ../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function] static int proc_apm_show(struct seq_file m, void v) Fixes: `3f3942aca6` ("proc: introduce proc_create_single{,_data}") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jiri Kosina <jikos@kernel.org> Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org	2018-09-15 10:16:25 +02:00
Linus Torvalds	eae4f8851f	Xtensa fixes and cleanups for v4.19: - don't allocate memory in platform_setup as the memory allocator is not initialized at that point yet; - remove unnecessary ifeq KBUILD_SRC from arch/xtensa/Makefile; - enable SG chaining in arch/xtensa/Kconfig. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAlucAIETHGpjbXZia2Jj QGdtYWlsLmNvbQAKCRBR+cyR+D+gRNAAD/wME84VqljbhQpzUNmYTb7d5Pst8I3Z Mwbx4DqDaCAPu8trhMUtSeV6QBal9xxlyl0b7KWrkb5uX1L9ZdaanMC1W9be6f7B DiwKLrGOhnv0kyLZ4/Rrni6GN90TUo1/2UKyJfaNKJoaqojbLyOY8vGtdtxcQlvT bQPQ1wPRdfiRTDZkW6dygFF6tY2VT6ao3NqapLhvFUqsm7pzKW5Y1jVkY7PTo8t5 Qk1XyXdCdeoTUqvP12mAe4mst0yGelLPj8phTkXt7OZUXaubky+Rp/enSLhcJdpu HJKFC7lInh/C6IroINFYErndckSK19mOTm2SUq6R5saR8R/6OMniilaVJBvfm1BH HhO/6Gzy6K9XQ0VscNAClxBEU7AKbalvHyfI8WcQC+y5Zqpa9nSk9jG1V9U4mLs/ cnL0w+qJ3nh9mJBB/+14S7GwXITu0DlZEOxhsW6xblr/nfmFmvyFSug7i+WAu2Y3 Y4KD9tM7+tBnztggblUPQQm+orPAfpXZDZkGU+e7du0BWZikwcjWeIar2GtFGODz vtxKIIZyvuAOY4M6Ld+2U9GTcIdOhc+orzJFuNAqbzPnez+0chjFp8OYCj/Fdgj2 YhbiYcMxJRAFdrbcWhHMlmXOCkUrvKyDY+bpx9TzNXyZp0qb290uGshCxM9/q7CR UB/JZ/up28v5jw== =YJfc -----END PGP SIGNATURE----- Merge tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa Pull Xtensa fixes and cleanups from Max Filippov: - don't allocate memory in platform_setup as the memory allocator is not initialized at that point yet; - remove unnecessary ifeq KBUILD_SRC from arch/xtensa/Makefile; - enable SG chaining in arch/xtensa/Kconfig. * tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: enable SG chaining in Kconfig xtensa: remove unnecessary KBUILD_SRC ifeq conditional xtensa: ISS: don't allocate memory in platform_setup	2018-09-14 12:56:42 -10:00
Linus Torvalds	3e153256d9	arm64 fixes - Fix ioport_map() mapping the wrong physical address for some I/O BARs - Remove direct use of "asm goto", since some compilers don't like that - Ensure kimage_voffset is always present in vmcoreinfo PT_NOTE -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJbm68LAAoJELescNyEwWM0Zt4H/2nhsMtmQCLNXvA9gd1SEBba Jrw4J+apYQBG7F7aY5mUZrNtCXECeNr1ukyxTEDMuSB0oTKWTFsQ52Sz4chsjl5u JdNzWRq8LtkWbxxmnzTcYvjHZeDfDx/GQqAHJp047NBI4+6GxXVuEbrPPFDvQuE4 xvMhGpIg9R/jfkhmRKCstalp6aQPDr+Glz/Z+HwyXtuu8A5Q8uCUrlTflHXNVsrn 5GtR69eTseFb/UJtLbi7mesKP/awMm8wBub4wSs/5zKLucgBkTdOCWOjPtQZ6u7F 7hwTBk4ZLihdw2fMNOHGf8iZgPvXV+zVfDPtZWfJGlFJpX8FefbRDVZkSGxXHVw= =dLyK -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The trickle of arm64 fixes continues to come in. Nothing that's the end of the world, but we've got a fix for PCI IO port accesses, an accidental naked "asm goto" and a fix to the vmcoreinfo PT_NOTE merged this time around which we'd like to get sorted before it becomes ABI. - Fix ioport_map() mapping the wrong physical address for some I/O BARs - Remove direct use of "asm goto", since some compilers don't like that - Ensure kimage_voffset is always present in vmcoreinfo PT_NOTE" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: asm-generic: io: Fix ioport_map() for !CONFIG_GENERIC_IOMAP && CONFIG_INDIRECT_PIO arm64: kernel: arch_crash_save_vmcoreinfo() should depend on CONFIG_CRASH_CORE arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto"	2018-09-14 12:42:02 -10:00
Joerg Roedel	61a6bd83ab	Revert "x86/mm/legacy: Populate the user page-table with user pgd's" This reverts commit `1f40a46cf4`. It turned out that this patch is not sufficient to enable PTI on 32 bit systems with legacy 2-level page-tables. In this paging mode the huge-page PTEs are in the top-level page-table directory, where also the mirroring to the user-space page-table happens. So every huge PTE exits twice, in the kernel and in the user page-table. That means that accessed/dirty bits need to be fetched from two PTEs in this mode to be safe, but this is not trivial to implement because it needs changes to generic code just for the sake of enabling PTI with 32-bit legacy paging. As all systems that need PTI should support PAE anyway, remove support for PTI when 32-bit legacy paging is used. Fixes: `7757d607c6` ('x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32') Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: https://lkml.kernel.org/r/1536922754-31379-1-git-send-email-joro@8bytes.org	2018-09-14 17:08:45 +02:00
Ondrej Mosnacek	24568b47d4	crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2 It turns out OSXSAVE needs to be checked only for AVX, not for SSE. Without this patch the affected modules refuse to load on CPUs with SSE2 but without AVX support. Fixes: `877ccce7cb` ("crypto: x86/aegis,morus - Fix and simplify CPUID checks") Cc: <stable@vger.kernel.org> # 4.18 Reported-by: Zdenek Kaspar <zkaspar82@gmail.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-09-14 14:08:27 +08:00
Linus Torvalds	72d4c6e589	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel Pull hexagon fixes from Richard Kuo: "Some fixes for compile warnings" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel: hexagon: modify ffs() and fls() to return int arch/hexagon: fix kernel/dma.c build warning	2018-09-13 16:33:26 -10:00
Linus Torvalds	1d176582c7	s390 fixes for 4.19-rc4 One fix for the zcrypt driver to correctly handle incomplete encryption/decryption operations. A cleanup for the aqmask/apmask parsing to avoid variable length arrays on the stack. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJbmkYUAAoJEDjwexyKj9rgrugH/21Uf1S0mkkRfDeYxD6lJva6 zNmEZ3V+GXc8L/0CisFWQOcIU1fO+jozp9HPGkQxeTvAuqIfVhBRVoMMxiTRaZb3 xdSBel8EvAGxlZq/6eq6fU59HPWGm+N53rC9J5MMQQqgpSmq8F2QeO5CoidflRh8 bdio9cliLsjPu+3P2JU3noolhb/f577J3dgP4gKARRpOfh8vUI7NLU3Mham+7886 ASjG8s/zr9spPnrErJusloOLDJt4M94J8KrIbB/WAT1wZv7GxClaGsCoyCuk4cDt TV8zIgMK9TChedwMOO0T8WMaxq+XJV+iI0dC1eZ7xNUlaIBw+UbifxCG335L3E0= =dHq1 -----END PGP SIGNATURE----- Merge tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: - One fix for the zcrypt driver to correctly handle incomplete encryption/decryption operations. - A cleanup for the aqmask/apmask parsing to avoid variable length arrays on the stack. * tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: remove VLA usage from the AP bus s390/crypto: Fix return code checking in cbc_paes_crypt()	2018-09-13 16:22:24 -10:00
Linus Torvalds	67b076095d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix up several Kconfig dependencies in netfilter, from Martin Willi and Florian Westphal. 2) Memory leak in be2net driver, from Petr Oros. 3) Memory leak in E-Switch handling of mlx5 driver, from Raed Salem. 4) mlx5_attach_interface needs to check for errors, from Huy Nguyen. 5) tipc_release() needs to orphan the sock, from Cong Wang. 6) Need to program TxConfig register after TX/RX is enabled in r8169 driver, not beforehand, from Maciej S. Szmigiero. 7) Handle 64K PAGE_SIZE properly in ena driver, from Netanel Belgazal. 8) Fix crash regression in ip_do_fragment(), from Taehee Yoo. 9) syzbot can create conditions where kernel log is flooded with synflood warnings due to creation of many listening sockets, fix that. From Willem de Bruijn. 10) Fix RCU issues in rds socket layer, from Cong Wang. 11) Fix vlan matching in nfp driver, from Pieter Jansen van Vuuren. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits) nfp: flower: reject tunnel encap with ipv6 outer headers for offloading nfp: flower: fix vlan match by checking both vlan id and vlan pcp tipc: check return value of __tipc_dump_start() s390/qeth: don't dump past end of unknown HW header s390/qeth: use vzalloc for QUERY OAT buffer s390/qeth: switch on SG by default for IQD devices s390/qeth: indicate error when netdev allocation fails rds: fix two RCU related problems r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED erspan: fix error handling for erspan tunnel erspan: return PACKET_REJECT when the appropriate tunnel is not found tcp: rate limit synflood warnings further MIPS: lantiq: dma: add dev pointer netfilter: xt_hashlimit: use s->file instead of s->private netfilter: nfnetlink_queue: Solve the NFQUEUE/conntrack clash for NF_REPEAT netfilter: cttimeout: ctnl_timeout_find_get() returns incorrect pointer to type netfilter: conntrack: timeout interface depend on CONFIG_NF_CONNTRACK_TIMEOUT netfilter: conntrack: reset tcp maxwin on re-register qmi_wwan: Support dynamic config on Quectel EP06 ethernet: renesas: convert to SPDX identifiers ...	2018-09-12 17:32:50 -10:00
Guenter Roeck	cf40361ede	x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3 Commit `eeb89e2bb1` ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()") moved loading the fixmap in efi_call_phys_epilog() after load_cr3() since it was assumed to be more logical. Turns out this is incorrect: In efi_call_phys_prolog(), the gdt with its physical address is loaded first, and when the %cr3 is reloaded in _epilog from initial_page_table to swapper_pg_dir again the gdt is no longer mapped. This results in a triple fault if an interrupt occurs after load_cr3() and before load_fixmap_gdt(0). Calling load_fixmap_gdt(0) first restores the execution order prior to commit `eeb89e2bb1` and fixes the problem. Fixes: `eeb89e2bb1` ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: linux-efi@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1536689892-21538-1-git-send-email-linux@roeck-us.net	2018-09-12 21:53:34 +02:00
Juergen Gross	999696752d	x86/xen: Disable CPU0 hotplug for Xen PV Xen PV guests don't allow CPU0 hotplug, so disable it. Signed-off-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20180912174122.24282-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-09-12 21:15:02 +02:00
Linus Torvalds	96eddb810b	RISC-V: A single fix for 4.19-rc3 This tag contains what I hope to be the last RISC-V patch for 4.19. It fixes a bug in our initramfs support by removing some broken and obselete code. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAluPHykTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQX7iD/41XF5oXHJeRUfhiHVa/kiqFaCw1aEx YPp5escHEypGshWQJGd+ite5cEz0nrggsbmXJrQnfpU8fqgpkvguaIbOb9JAtOdj Y5hjQ5QgiQcsUrLnhy7yK62fpC27WwWPGfT73cLRgir2oDEI3F7CkaA0uX3y2kLF 9TEN2v+DL+89Y/Rq9mzRwwPOryZRNXZkxI6tqTVa7wZZzi7fSUMCG2msjeZRszQe 0IPyBtVR7OECzEaRwSETgC05KTFxCQ2JHMjHz1TatjvJmGU3ToP0uRZ1oYXDXcR3 AM2QfjBQDmBOjRRKBbwaiUzfX209eGrn/JK3j6BZZredX9MCP+qduuQV+7GsvRT2 ryCoWN56AAIMZJvmp57lG9jfDptxnS6zZCqw+mufsD1s3c/78zUv7Q5PPUdcfzuP qt7iVdUUP5QWDFgM0QumeZ9JuekoA0Kpsmg4Nq6M6YHimW63Y2+CJPqPfh1oY93t UoabFgz7FZ0WLo1jHtGVteihq78SKxTe4WYEDzjH++qrVPuYnbNH3Hfqwynj6Wsy fvNxmnjg1AVhD9MPSBJLDbQivxW4pEwuxV99MpwLhVdwGXDTAgt9t9mUP5xAaLna 60jszx1GM8HVMeQ0LNAGRWa8FH0bvn2kpLOBjvdMHl8Y/Oq/IuaINRJCKq59j/4X Qx963ajYY5QUHg== =XdJZ -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V fix from Palmer Dabbelt: "This contains what I hope to be the last RISC-V patch for 4.19. It fixes a bug in our initramfs support by removing some broken and obselete code" * tag 'riscv-for-linus-4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: riscv: Do not overwrite initrd_start and initrd_end	2018-09-12 06:51:27 -10:00
Janosch Frank	40ebdb8e59	KVM: s390: Make huge pages unavailable in ucontrol VMs We currently do not notify all gmaps when using gmap_pmdp_xchg(), due to locking constraints. This makes ucontrol VMs, which is the only VM type that creates multiple gmaps, incompatible with huge pages. Also we would need to hold the guest_table_lock of all gmaps that have this vmaddr maped to synchronize access to the pmd. ucontrol VMs are rather exotic and creating a new locking concept is no easy task. Hence we return EINVAL when trying to active KVM_CAP_S390_HPAGE_1M and report it as being not available when checking for it. Fixes: `a4499382` ("KVM: s390: Add huge page enablement control") Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-Id: <20180801112508.138159-1-frankja@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>	2018-09-12 14:46:37 +02:00
Janosch Frank	1843abd032	s390/mm: Check for valid vma before zapping in gmap_discard Userspace could have munmapped the area before doing unmapping from the gmap. This would leave us with a valid vmaddr, but an invalid vma from which we would try to zap memory. Let's check before using the vma. Fixes: `1e133ab296` ("s390/mm: split arch/s390/mm/pgtable.c") Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20180816082432.78828-1-frankja@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>	2018-09-12 14:46:37 +02:00
Hauke Mehrtens	2d946e5bcd	MIPS: lantiq: dma: add dev pointer dma_zalloc_coherent() now crashes if no dev pointer is given. Add a dev pointer to the ltq_dma_channel structure and fill it in the driver using it. This fixes a bug introduced in kernel 4.19. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-11 23:33:19 -07:00
Max Filippov	4a7f50f78c	xtensa: enable SG chaining in Kconfig Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2018-09-11 22:12:59 -07:00
Masahiro Yamada	8e966fab8e	xtensa: remove unnecessary KBUILD_SRC ifeq conditional You can always prefix variant/platform header search paths with $(srctree)/ because $(srctree) is '.' for in-tree building. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2018-09-11 22:09:48 -07:00
Nicholas Piggin	71d29f43b6	KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping size THP paths can defer splitting compound pages until after the actual remap and TLB flushes to split a huge PMD/PUD. This causes radix partition scope page table mappings to get out of synch with the host qemu page table mappings. This results in random memory corruption in the guest when running with THP. The easiest way to reproduce is use KVM balloon to free up a lot of memory in the guest and then shrink the balloon to give the memory back, while some work is being done in the guest. Cc: David Gibson <david@gibson.dropbear.id.au> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: kvm-ppc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-09-12 08:50:50 +10:00
Alexey Kardashevskiy	425333bf3a	KVM: PPC: Avoid marking DMA-mapped pages dirty in real mode At the moment the real mode handler of H_PUT_TCE calls iommu_tce_xchg_rm() which in turn reads the old TCE and if it was a valid entry, marks the physical page dirty if it was mapped for writing. Since it is in real mode, realmode_pfn_to_page() is used instead of pfn_to_page() to get the page struct. However SetPageDirty() itself reads the compound page head and returns a virtual address for the head page struct and setting dirty bit for that kills the system. This adds additional dirty bit tracking into the MM/IOMMU API for use in the real mode. Note that this does not change how VFIO and KVM (in virtual mode) set this bit. The KVM (real mode) changes include: - use the lowest bit of the cached host phys address to carry the dirty bit; - mark pages dirty when they are unpinned which happens when the preregistered memory is released which always happens in virtual mode; - add mm_iommu_ua_mark_dirty_rm() helper to set delayed dirty bit; - change iommu_tce_xchg_rm() to take the kvm struct for the mm to use in the new mm_iommu_ua_mark_dirty_rm() helper; - move iommu_tce_xchg_rm() to book3s_64_vio_hv.c (which is the only caller anyway) to reduce the real mode KVM and IOMMU knowledge across different subsystems. This removes realmode_pfn_to_page() as it is not used anymore. While we at it, remove some EXPORT_SYMBOL_GPL() as that code is for the real mode only and modules cannot call it anyway. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-09-12 08:49:54 +10:00

1 2 3 4 5 ...

150614 Commits