Commit Graph

267 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
50874c58d8 Merge 6.1.47 into android14-6.1-lts
Changes in 6.1.47
	mmc: sdhci-f-sdh30: Replace with sdhci_pltfm
	cpuidle: psci: Extend information in log about OSI/PC mode
	cpuidle: psci: Move enabling OSI mode after power domains creation
	zsmalloc: consolidate zs_pool's migrate_lock and size_class's locks
	zsmalloc: fix races between modifications of fullness and isolated
	selftests: forwarding: tc_actions: cleanup temporary files when test is aborted
	selftests: forwarding: tc_actions: Use ncat instead of nc
	net/smc: replace mutex rmbs_lock and sndbufs_lock with rw_semaphore
	net/smc: Fix setsockopt and sysctl to specify same buffer size again
	net: phy: at803x: Use devm_regulator_get_enable_optional()
	net: phy: at803x: fix the wol setting functions
	drm/amdgpu: fix calltrace warning in amddrm_buddy_fini
	drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1
	drm/amdgpu: fix memory leak in mes self test
	ASoC: Intel: sof_sdw: add quirk for MTL RVP
	ASoC: Intel: sof_sdw: add quirk for LNL RVP
	PCI: tegra194: Fix possible array out of bounds access
	ASoC: SOF: amd: Add pci revision id check
	drm/stm: ltdc: fix late dereference check
	drm: rcar-du: remove R-Car H3 ES1.* workarounds
	ASoC: amd: vangogh: Add check for acp config flags in vangogh platform
	ARM: dts: imx6dl: prtrvt, prtvt7, prti6q, prtwd2: fix USB related warnings
	ASoC: Intel: sof_sdw_rt_sdca_jack_common: test SOF_JACK_JDSRC in _exit
	ASoC: Intel: sof_sdw: Add support for Rex soundwire
	iopoll: Call cpu_relax() in busy loops
	ASoC: SOF: Intel: fix SoundWire/HDaudio mutual exclusion
	dma-remap: use kvmalloc_array/kvfree for larger dma memory remap
	accel/habanalabs: add pci health check during heartbeat
	HID: logitech-hidpp: Add USB and Bluetooth IDs for the Logitech G915 TKL Keyboard
	iommu/amd: Introduce Disable IRTE Caching Support
	drm/amdgpu: install stub fence into potential unused fence pointers
	drm/amd/display: Apply 60us prefetch for DCFCLK <= 300Mhz
	RDMA/mlx5: Return the firmware result upon destroying QP/RQ
	drm/amd/display: Skip DPP DTO update if root clock is gated
	drm/amd/display: Enable dcn314 DPP RCO
	ASoC: SOF: core: Free the firmware trace before calling snd_sof_shutdown()
	HID: intel-ish-hid: ipc: Add Arrow Lake PCI device ID
	ALSA: hda/realtek: Add quirks for ROG ALLY CS35l41 audio
	smb: client: fix warning in cifs_smb3_do_mount()
	cifs: fix session state check in reconnect to avoid use-after-free issue
	serial: stm32: Ignore return value of uart_remove_one_port() in .remove()
	led: qcom-lpg: Fix resource leaks in for_each_available_child_of_node() loops
	media: v4l2-mem2mem: add lock to protect parameter num_rdy
	media: camss: set VFE bpl_alignment to 16 for sdm845 and sm8250
	usb: gadget: u_serial: Avoid spinlock recursion in __gs_console_push
	usb: gadget: uvc: queue empty isoc requests if no video buffer is available
	media: platform: mediatek: vpu: fix NULL ptr dereference
	thunderbolt: Read retimer NVM authentication status prior tb_retimer_set_inbound_sbtx()
	usb: chipidea: imx: don't request QoS for imx8ulp
	usb: chipidea: imx: add missing USB PHY DPDM wakeup setting
	gfs2: Fix possible data races in gfs2_show_options()
	pcmcia: rsrc_nonstatic: Fix memory leak in nonstatic_release_resource_db()
	thunderbolt: Add Intel Barlow Ridge PCI ID
	thunderbolt: Limit Intel Barlow Ridge USB3 bandwidth
	firewire: net: fix use after free in fwnet_finish_incoming_packet()
	watchdog: sp5100_tco: support Hygon FCH/SCH (Server Controller Hub)
	Bluetooth: L2CAP: Fix use-after-free
	Bluetooth: btusb: Add MT7922 bluetooth ID for the Asus Ally
	ceph: try to dump the msgs when decoding fails
	drm/amdgpu: Fix potential fence use-after-free v2
	fs/ntfs3: Enhance sanity check while generating attr_list
	fs: ntfs3: Fix possible null-pointer dereferences in mi_read()
	fs/ntfs3: Mark ntfs dirty when on-disk struct is corrupted
	ALSA: hda/realtek: Add quirks for Unis H3C Desktop B760 & Q760
	ALSA: hda: fix a possible null-pointer dereference due to data race in snd_hdac_regmap_sync()
	ALSA: hda/realtek: Add quirk for ASUS ROG GX650P
	ALSA: hda/realtek: Add quirk for ASUS ROG GA402X
	ALSA: hda/realtek: Add quirk for ASUS ROG GZ301V
	powerpc/kasan: Disable KCOV in KASAN code
	Bluetooth: MGMT: Use correct address for memcpy()
	ring-buffer: Do not swap cpu_buffer during resize process
	igc: read before write to SRRCTL register
	drm/amd/display: save restore hdcp state when display is unplugged from mst hub
	drm/amd/display: phase3 mst hdcp for multiple displays
	drm/amd/display: fix access hdcp_workqueue assert
	KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
	ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
	fbdev/hyperv-fb: Do not set struct fb_info.apertures
	video/aperture: Only remove sysfb on the default vga pci device
	btrfs: move out now unused BG from the reclaim list
	btrfs: convert btrfs_block_group::needs_free_space to runtime flag
	btrfs: convert btrfs_block_group::seq_zone to runtime flag
	btrfs: fix use-after-free of new block group that became unused
	virtio-mmio: don't break lifecycle of vm_dev
	vduse: Use proper spinlock for IRQ injection
	vdpa/mlx5: Fix mr->initialized semantics
	vdpa/mlx5: Delete control vq iotlb in destroy_mr only when necessary
	cifs: fix potential oops in cifs_oplock_break
	i2c: bcm-iproc: Fix bcm_iproc_i2c_isr deadlock issue
	i2c: hisi: Only handle the interrupt of the driver's transfer
	i2c: tegra: Fix i2c-tegra DMA config option processing
	fbdev: mmp: fix value check in mmphw_probe()
	powerpc/rtas_flash: allow user copy to flash block cache objects
	vdpa: Add features attr to vdpa_nl_policy for nlattr length check
	vdpa: Add queue index attr to vdpa_nl_policy for nlattr length check
	vdpa: Add max vqp attr to vdpa_nl_policy for nlattr length check
	vdpa: Enable strict validation for netlinks ops
	tty: n_gsm: fix the UAF caused by race condition in gsm_cleanup_mux
	tty: serial: fsl_lpuart: Clear the error flags by writing 1 for lpuart32 platforms
	btrfs: fix incorrect splitting in btrfs_drop_extent_map_range
	btrfs: fix BUG_ON condition in btrfs_cancel_balance
	i2c: designware: Correct length byte validation logic
	i2c: designware: Handle invalid SMBus block data response length value
	net: xfrm: Fix xfrm_address_filter OOB read
	net: af_key: fix sadb_x_filter validation
	net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure
	xfrm: fix slab-use-after-free in decode_session6
	ip6_vti: fix slab-use-after-free in decode_session6
	ip_vti: fix potential slab-use-after-free in decode_session6
	xfrm: add NULL check in xfrm_update_ae_params
	xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH
	virtio_net: notify MAC address change on device initialization
	virtio-net: set queues after driver_ok
	net: pcs: Add missing put_device call in miic_create
	net: phy: fix IRQ-based wake-on-lan over hibernate / power off
	selftests: mirror_gre_changes: Tighten up the TTL test match
	drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs
	net: macb: In ZynqMP resume always configure PS GTR for non-wakeup source
	octeon_ep: cancel tx_timeout_task later in remove sequence
	netfilter: nf_tables: fix false-positive lockdep splat
	netfilter: nf_tables: deactivate catchall elements in next generation
	ipvs: fix racy memcpy in proc_do_sync_threshold
	netfilter: nft_dynset: disallow object maps
	net: phy: broadcom: stub c45 read/write for 54810
	team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
	net: openvswitch: reject negative ifindex
	iavf: fix FDIR rule fields masks validation
	i40e: fix misleading debug logs
	net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset
	sfc: don't unregister flow_indr if it was never registered
	sock: Fix misuse of sk_under_memory_pressure()
	net: do not allow gso_size to be set to GSO_BY_FRAGS
	qede: fix firmware halt over suspend and resume
	ice: Block switchdev mode when ADQ is active and vice versa
	bus: ti-sysc: Flush posted write on enable before reset
	arm64: dts: qcom: qrb5165-rb5: fix thermal zone conflict
	arm64: dts: rockchip: Disable HS400 for eMMC on ROCK Pi 4
	arm64: dts: rockchip: Disable HS400 for eMMC on ROCK 4C+
	ARM: dts: imx: align LED node names with dtschema
	ARM: dts: imx6: phytec: fix RTC interrupt level
	arm64: dts: imx8mm: Drop CSI1 PHY reference clock configuration
	ARM: dts: imx: Set default tuning step for imx6sx usdhc
	arm64: dts: imx93: Fix anatop node size
	ASoC: rt5665: add missed regulator_bulk_disable
	ASoC: meson: axg-tdm-formatter: fix channel slot allocation
	ALSA: hda/realtek: Add quirks for HP G11 Laptops
	soc: aspeed: uart-routing: Use __sysfs_match_string
	soc: aspeed: socinfo: Add kfree for kstrdup
	ALSA: hda/realtek - Remodified 3k pull low procedure
	riscv: uaccess: Return the number of bytes effectively not copied
	serial: 8250: Fix oops for port->pm on uart_change_pm()
	ALSA: usb-audio: Add support for Mythware XA001AU capture and playback interfaces.
	cifs: Release folio lock on fscache read hit.
	virtio-net: Zero max_tx_vq field for VIRTIO_NET_CTRL_MQ_HASH_CONFIG case
	arm64: dts: rockchip: Fix Wifi/Bluetooth on ROCK Pi 4 boards
	blk-crypto: dynamically allocate fallback profile
	mmc: wbsd: fix double mmc_free_host() in wbsd_init()
	mmc: block: Fix in_flight[issue_type] value error
	drm/qxl: fix UAF on handle creation
	drm/i915/sdvo: fix panel_type initialization
	drm/amd: flush any delayed gfxoff on suspend entry
	drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
	drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7
	ASoC: amd: vangogh: select CONFIG_SND_AMD_ACP_CONFIG
	drm/amd/display: disable RCO for DCN314
	zsmalloc: allow only one active pool compaction context
	sched/fair: unlink misfit task from cpu overutilized
	sched/fair: Remove capacity inversion detection
	drm/amd/display: Implement workaround for writing to OTG_PIXEL_RATE_DIV register
	hugetlb: do not clear hugetlb dtor until allocating vmemmap
	netfilter: set default timeout to 3 secs for sctp shutdown send and recv state
	arm64/ptrace: Ensure that SME is set up for target when writing SSVE state
	drm/amd/pm: skip the RLC stop when S0i3 suspend for SMU v13.0.4/11
	drm/amdgpu: keep irq count in amdgpu_irq_disable_all
	af_unix: Fix null-ptr-deref in unix_stream_sendpage().
	drm/nouveau/disp: fix use-after-free in error handling of nouveau_connector_create
	net: fix the RTO timer retransmitting skb every 1ms if linear option is enabled
	mmc: f-sdh30: fix order of function calls in sdhci_f_sdh30_remove
	Linux 6.1.47

Change-Id: I7c55c71f43f88a1d44d39c835e3f6e58d4c86279
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-09-13 19:35:46 +00:00
Marc Zyngier
3d2d051be1 KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
[ Upstream commit b321c31c9b7b309dcde5e8854b741c8e6a9a05f0 ]

Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when
running a preemptible kernel, as it is possible that a vCPU is blocked
without requesting a doorbell interrupt.

The issue is that any preemption that occurs between vgic_v4_put() and
schedule() on the block path will mark the vPE as nonresident and *not*
request a doorbell irq. This occurs because when the vcpu thread is
resumed on its way to block, vcpu_load() will make the vPE resident
again. Once the vcpu actually blocks, we don't request a doorbell
anymore, and the vcpu won't be woken up on interrupt delivery.

Fix it by tracking that we're entering WFI, and key the doorbell
request on that flag. This allows us not to make the vPE resident
when going through a preempt/schedule cycle, meaning we don't lose
any state.

Cc: stable@vger.kernel.org
Fixes: 8e01d9a396 ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put")
Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
Suggested-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Co-developed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20230713070657.3873244-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-23 17:52:28 +02:00
Greg Kroah-Hartman
0871e23703 Revert "Revert "KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode""
This reverts commit 9d29ba9c30.

It was perserving the ABI, but that is not needed anymore at this point
in time.

Change-Id: Iae5f3a6a9025e17aa7b1a0fd805c13953bc0c554
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-05-11 05:22:29 +00:00
Greg Kroah-Hartman
9d29ba9c30 Revert "KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode"
This reverts commit eb3df96102.

It breaks the current Android kernel abi.  It will be brought back at
the next KABI break update.

Bug: 161946584
Change-Id: Icbc6a902a085e81fb1ba72666cae32527c2479cc
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-26 10:00:48 +00:00
Greg Kroah-Hartman
0fff48d6fe Merge 6.1.24 into android14-6.1
Changes in 6.1.24
	dm cache: Add some documentation to dm-cache-background-tracker.h
	dm integrity: Remove bi_sector that's only used by commented debug code
	dm: change "unsigned" to "unsigned int"
	dm: fix improper splitting for abnormal bios
	KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode
	KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow
	KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run
	KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
	gpio: GPIO_REGMAP: select REGMAP instead of depending on it
	Drivers: vmbus: Check for channel allocation before looking up relids
	ASoC: SOF: ipc4: Ensure DSP is in D0I0 during sof_ipc4_set_get_data()
	pwm: Make .get_state() callback return an error code
	pwm: hibvt: Explicitly set .polarity in .get_state()
	pwm: cros-ec: Explicitly set .polarity in .get_state()
	pwm: iqs620a: Explicitly set .polarity in .get_state()
	pwm: sprd: Explicitly set .polarity in .get_state()
	pwm: meson: Explicitly set .polarity in .get_state()
	ASoC: codecs: lpass: fix the order or clks turn off during suspend
	KVM: s390: pv: fix external interruption loop not always detected
	wifi: mac80211: fix the size calculation of ieee80211_ie_len_eht_cap()
	wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded sta
	net: qrtr: Fix a refcount bug in qrtr_recvmsg()
	net: phylink: add phylink_expects_phy() method
	net: stmmac: check if MAC needs to attach to a PHY
	net: stmmac: remove redundant fixup to support fixed-link mode
	l2tp: generate correct module alias strings
	wifi: brcmfmac: Fix SDIO suspend/resume regression
	NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
	nfsd: call op_release, even when op_func returns an error
	icmp: guard against too small mtu
	ALSA: hda/hdmi: Preserve the previous PCM device upon re-enablement
	net: don't let netpoll invoke NAPI if in xmit context
	net: dsa: mv88e6xxx: Reset mv88e6393x force WD event bit
	sctp: check send stream number after wait_for_sndbuf
	net: qrtr: Do not do DEL_SERVER broadcast after DEL_CLIENT
	ipv6: Fix an uninit variable access bug in __ip6_make_skb()
	platform/x86: think-lmi: Fix memory leak when showing current settings
	platform/x86: think-lmi: Fix memory leaks when parsing ThinkStation WMI strings
	platform/x86: think-lmi: Clean up display of current_value on Thinkstation
	gpio: davinci: Do not clear the bank intr enable bit in save_context
	gpio: davinci: Add irq chip flag to skip set wake
	net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe
	net: stmmac: fix up RX flow hash indirection table when setting channels
	sunrpc: only free unix grouplist after RCU settles
	NFSD: callback request does not use correct credential for AUTH_SYS
	ice: fix wrong fallback logic for FDIR
	ice: Reset FDIR counter in FDIR init stage
	raw: use net_hash_mix() in hash function
	raw: Fix NULL deref in raw_get_next().
	ping: Fix potentail NULL deref for /proc/net/icmp.
	ethtool: reset #lanes when lanes is omitted
	netlink: annotate lockless accesses to nlk->max_recvmsg_len
	gve: Secure enough bytes in the first TX desc for all TCP pkts
	arm64: compat: Work around uninitialized variable warning
	net: stmmac: check fwnode for phy device before scanning for phy
	cxl/pci: Fix CDAT retrieval on big endian
	cxl/pci: Handle truncated CDAT header
	cxl/pci: Handle truncated CDAT entries
	cxl/pci: Handle excessive CDAT length
	PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y
	PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y
	usb: xhci: tegra: fix sleep in atomic call
	xhci: Free the command allocated for setting LPM if we return early
	xhci: also avoid the XHCI_ZERO_64B_REGS quirk with a passthrough iommu
	usb: cdnsp: Fixes error: uninitialized symbol 'len'
	usb: dwc3: pci: add support for the Intel Meteor Lake-S
	USB: serial: cp210x: add Silicon Labs IFS-USB-DATACABLE IDs
	usb: typec: altmodes/displayport: Fix configure initial pin assignment
	USB: serial: option: add Telit FE990 compositions
	USB: serial: option: add Quectel RM500U-CN modem
	drivers: iio: adc: ltc2497: fix LSB shift
	iio: adis16480: select CONFIG_CRC32
	iio: adc: qcom-spmi-adc5: Fix the channel name
	iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip
	iio: dac: cio-dac: Fix max DAC write value check for 12-bit
	iio: buffer: correctly return bytes written in output buffers
	iio: buffer: make sure O_NONBLOCK is respected
	iio: light: cm32181: Unregister second I2C client if present
	tty: serial: sh-sci: Fix transmit end interrupt handler
	tty: serial: sh-sci: Fix Rx on RZ/G2L SCI
	tty: serial: fsl_lpuart: avoid checking for transfer complete when UARTCTRL_SBK is asserted in lpuart32_tx_empty
	nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()
	nilfs2: fix sysfs interface lifetime
	dt-bindings: serial: renesas,scif: Fix 4th IRQ for 4-IRQ SCIFs
	serial: 8250: Prevent starting up DMA Rx on THRI interrupt
	ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN
	ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr
	ALSA: hda/realtek: Add quirk for Clevo X370SNW
	ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
	x86/acpi/boot: Correct acpi_is_processor_usable() check
	x86/ACPI/boot: Use FADT version to check support for online capable
	KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
	KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
	mm: kfence: fix PG_slab and memcg_data clearing
	mm: kfence: fix handling discontiguous page
	coresight: etm4x: Do not access TRCIDR1 for identification
	coresight-etm4: Fix for() loop drvdata->nr_addr_cmp range bug
	counter: 104-quad-8: Fix race condition between FLAG and CNTR reads
	counter: 104-quad-8: Fix Synapse action reported for Index signals
	blk-mq: directly poll requests
	iio: adc: ad7791: fix IRQ flags
	io_uring: fix return value when removing provided buffers
	io_uring: fix memory leak when removing provided buffers
	scsi: qla2xxx: Fix memory leak in qla2x00_probe_one()
	scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param()
	nvme: fix discard support without oncs
	cifs: sanitize paths in cifs_update_super_prepath.
	block: ublk: make sure that block size is set correctly
	block: don't set GD_NEED_PART_SCAN if scan partition failed
	perf/core: Fix the same task check in perf_event_set_output
	ftrace: Mark get_lock_parent_ip() __always_inline
	ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct()
	fs: drop peer group ids under namespace lock
	can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access
	can: isotp: fix race between isotp_sendsmg() and isotp_release()
	can: isotp: isotp_ops: fix poll() to not report false EPOLLOUT events
	can: isotp: isotp_recvmsg(): use sock_recv_cmsgs() to get SOCK_RXQ_OVFL infos
	ACPI: video: Add auto_detect arg to __acpi_video_get_backlight_type()
	ACPI: video: Make acpi_backlight=video work independent from GPU driver
	ACPI: video: Add acpi_backlight=video quirk for Apple iMac14,1 and iMac14,2
	ACPI: video: Add acpi_backlight=video quirk for Lenovo ThinkPad W530
	net: stmmac: Add queue reset into stmmac_xdp_open() function
	tracing/synthetic: Fix races on freeing last_cmd
	tracing/timerlat: Notify new max thread latency
	tracing/osnoise: Fix notify new tracing_max_latency
	tracing: Free error logs of tracing instances
	ASoC: hdac_hdmi: use set_stream() instead of set_tdm_slots()
	tracing/synthetic: Make lastcmd_mutex static
	zsmalloc: document freeable stats
	mm: vmalloc: avoid warn_alloc noise caused by fatal signal
	wifi: mt76: ignore key disable commands
	ublk: read any SQE values upfront
	drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path
	drm/nouveau/disp: Support more modes by checking with lower bpc
	drm/i915: Fix context runtime accounting
	drm/i915: fix race condition UAF in i915_perf_add_config_ioctl
	ring-buffer: Fix race while reader and writer are on the same page
	mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
	mm/hugetlb: fix uffd wr-protection for CoW optimization path
	maple_tree: fix get wrong data_end in mtree_lookup_walk()
	maple_tree: fix a potential concurrency bug in RCU mode
	blk-throttle: Fix that bps of child could exceed bps limited in parent
	drm/amd/display: Clear MST topology if it fails to resume
	drm/amdgpu: for S0ix, skip SDMA 5.x+ suspend/resume
	drm/amdgpu: skip psp suspend for IMU enabled ASICs mode2 reset
	drm/display/dp_mst: Handle old/new payload states in drm_dp_remove_payload()
	drm/i915/dp_mst: Fix payload removal during output disabling
	drm/bridge: lt9611: Fix PLL being unable to lock
	drm/i915: Use _MMIO_PIPE() for SKL_BOTTOM_COLOR
	drm/i915: Split icl_color_commit_noarm() from skl_color_commit_noarm()
	mm: take a page reference when removing device exclusive entries
	maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()
	maple_tree: fix potential rcu issue
	maple_tree: reduce user error potential
	maple_tree: fix handle of invalidated state in mas_wr_store_setup()
	maple_tree: fix mas_prev() and mas_find() state handling
	maple_tree: be more cautious about dead nodes
	maple_tree: refine ma_state init from mas_start()
	maple_tree: detect dead nodes in mas_start()
	maple_tree: fix freeing of nodes in rcu mode
	maple_tree: remove extra smp_wmb() from mas_dead_leaves()
	maple_tree: add smp_rmb() to dead node detection
	maple_tree: add RCU lock checking to rcu callback functions
	mm: enable maple tree RCU mode by default.
	bpftool: Print newline before '}' for struct with padding only fields
	Linux 6.1.24

Change-Id: I475408e1166927565c7788e7095bdf2cb236c4b2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-22 08:52:25 +00:00
Marc Zyngier
eb3df96102 KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode
[ Upstream commit bead02204e9806807bb290137b1ccabfcb4b16fd ]

Ricardo recently pointed out that the PMU chained counter emulation
in KVM wasn't quite behaving like the one on actual hardware, in
the sense that a chained counter would expose an overflow on
both halves of a chained counter, while KVM would only expose the
overflow on the top half.

The difference is subtle, but significant. What does the architecture
say (DDI0087 H.a):

- Up to PMUv3p4, all counters but the cycle counter are 32bit

- A 32bit counter that overflows generates a CHAIN event on the
  adjacent counter after exposing its own overflow status

- The CHAIN event is accounted if the counter is correctly
  configured (CHAIN event selected and counter enabled)

This all means that our current implementation (which uses 64bit
perf events) prevents us from emulating this overflow on the lower half.

How to fix this? By implementing the above, to the letter.

This largely results in code deletion, removing the notions of
"counter pair", "chained counters", and "canonical counter".
The code is further restructured to make the CHAIN handling similar
to SWINC, as the two are now extremely similar in behaviour.

Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20221113163832.3154370-3-maz@kernel.org
Stable-dep-of: f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:17 +02:00
Fuad Tabba
9fb1fb85fa ANDROID: KVM: arm64: Move some kvm_psci functions to a shared header
Move some PSCI functions and macros to a shared header to be used
by hyp in protected mode.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Change-Id: Ibe84564f423cd0281f3dc33d9801b474fe8f2db9
Signed-off-by: Quentin Perret <qperret@google.com>
2022-12-15 16:12:56 +00:00
Marc Zyngier
4f1e3e2c1e ANDROID: KVM: arm64: Simplify vgic-v3 hypercalls
Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore
hypercalls so that all of the EL2 GICv3 state is covered by a single pair
of hypercalls.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Change-Id: Ifb109d1592a82d0858d5040482d5cf686f9e74e2
Signed-off-by: Quentin Perret <qperret@google.com>
2022-12-15 16:12:52 +00:00
Marc Zyngier
4b85080f4e KVM: arm64: vgic: Consolidate userspace access for base address setting
Align kvm_vgic_addr() with the rest of the code by moving the
userspace accesses into it. kvm_vgic_addr() is also made static.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
9f968c9266 KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting
We carry a legacy interface to set the base addresses for GICv2.
As this is currently plumbed into the same handling code as
the modern interface, it limits the evolution we can make there.

Add a helper dedicated to this handling, with a view of maybe
removing this in the future.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
8794b4f510 Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next
* kvm-arm64/per-vcpu-host-pmu-data:
  : .
  : Pass the host PMU state in the vcpu to avoid the use of additional
  : shared memory between EL1 and EL2 (this obviously only applies
  : to nVHE and Protected setups).
  :
  : Patches courtesy of Fuad Tabba.
  : .
  KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected
  KVM: arm64: Reenable pmu in Protected Mode
  KVM: arm64: Pass pmu events to hyp via vcpu
  KVM: arm64: Repack struct kvm_pmu to reduce size
  KVM: arm64: Wrapper for getting pmu_events

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16 17:48:36 +01:00
Marc Zyngier
ec2cff6cbd Merge branch kvm-arm64/vgic-invlpir into kvmarm-master/next
* kvm-arm64/vgic-invlpir:
  : .
  : Implement MMIO-based LPI invalidation for vGICv3.
  : .
  KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision
  KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation
  KVM: arm64: vgic-v3: Expose GICR_CTLR.RWP when disabling LPIs
  irqchip/gic-v3: Exposes bit values for GICR_CTLR.{IR, CES}

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16 17:48:35 +01:00
Marc Zyngier
0586e28aaa Merge branch kvm-arm64/hcall-selection into kvmarm-master/next
* kvm-arm64/hcall-selection:
  : .
  : Introduce a new set of virtual sysregs for userspace to
  : select the hypercalls it wants to see exposed to the guest.
  :
  : Patches courtesy of Raghavendra and Oliver.
  : .
  KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run
  KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace
  Documentation: Fix index.rst after psci.rst renaming
  selftests: KVM: aarch64: Add the bitmap firmware registers to get-reg-list
  selftests: KVM: aarch64: Introduce hypercall ABI test
  selftests: KVM: Create helper for making SMCCC calls
  selftests: KVM: Rename psci_cpu_on_test to psci_test
  tools: Import ARM SMCCC definitions
  Docs: KVM: Add doc for the bitmap firmware registers
  Docs: KVM: Rename psci.rst to hypercalls.rst
  KVM: arm64: Add vendor hypervisor firmware register
  KVM: arm64: Add standard hypervisor firmware register
  KVM: arm64: Setup a framework for hypercall bitmap firmware registers
  KVM: arm64: Factor out firmware register handling from psci.c

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16 17:47:03 +01:00
Marc Zyngier
20492a62b9 KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected
Moving kvm_pmu_events into the vcpu (and refering to it) broke the
somewhat unusual case where the kernel has no support for a PMU
at all.

In order to solve this, move things around a bit so that we can
easily avoid refering to the pmu structure outside of PMU-aware
code. As a bonus, pmu.c isn't compiled in when HW_PERF_EVENTS
isn't selected.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/202205161814.KQHpOzsJ-lkp@intel.com
2022-05-16 13:42:41 +01:00
Fuad Tabba
84d751a019 KVM: arm64: Pass pmu events to hyp via vcpu
Instead of the host accessing hyp data directly, pass the pmu
events of the current cpu to hyp via the vcpu.

This adds 64 bits (in two fields) to the vcpu that need to be
synced before every vcpu run in nvhe and protected modes.
However, it isolates the hypervisor from the host, which allows
us to use pmu in protected mode in a subsequent patch.

No visible side effects in behavior intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220510095710.148178-4-tabba@google.com
2022-05-15 11:26:41 +01:00
Fuad Tabba
e987a4c60f KVM: arm64: Repack struct kvm_pmu to reduce size
struct kvm_pmu has 2 holes using 10 bytes. This is instantiated
in all vcpus, so it adds up. Repack the structures to remove the
holes.

No functional change intended.

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220510095710.148178-3-tabba@google.com
2022-05-15 11:24:17 +01:00
Marc Zyngier
49a1a2c70a KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision
Since adversising GICR_CTLR.{IC,CES} is directly observable from
a guest, we need to make it selectable from userspace.

For that, bump the default GICD_IIDR revision and let userspace
downgrade it to the previous default. For GICv2, the two distributor
revisions are strictly equivalent.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220405182327.205520-5-maz@kernel.org
2022-05-04 14:09:53 +01:00
Marc Zyngier
4645d11f4a KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation
Since GICv4.1, it has become legal for an implementation to advertise
GICR_{INVLPIR,INVALLR,SYNCR} while having an ITS, allowing for a more
efficient invalidation scheme (no guest command queue contention when
multiple CPUs are generating invalidations).

Provide the invalidation registers as a primitive to their ITS
counterpart. Note that we don't advertise them to the guest yet
(the architecture allows an implementation to do this).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Oliver Upton <oupton@google.com>
Link: https://lore.kernel.org/r/20220405182327.205520-4-maz@kernel.org
2022-05-04 14:09:53 +01:00
Marc Zyngier
94828468a6 KVM: arm64: vgic-v3: Expose GICR_CTLR.RWP when disabling LPIs
When disabling LPIs, a guest needs to poll GICR_CTLR.RWP in order
to be sure that the write has taken effect. We so far reported it
as 0, as we didn't advertise that LPIs could be turned off the
first place.

Start tracking this state during which LPIs are being disabled,
and expose the 'in progress' state via the RWP bit.

We also take this opportunity to disallow enabling LPIs and programming
GICR_{PEND,PROP}BASER while LPI disabling is in progress, as allowed by
the architecture (UNPRED behaviour).

We don't advertise the feature to the guest yet (which is allowed by
the architecture).

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220405182327.205520-3-maz@kernel.org
2022-05-04 14:09:53 +01:00
Raghavendra Rao Ananta
05714cab7d KVM: arm64: Setup a framework for hypercall bitmap firmware registers
KVM regularly introduces new hypercall services to the guests without
any consent from the userspace. This means, the guests can observe
hypercall services in and out as they migrate across various host
kernel versions. This could be a major problem if the guest
discovered a hypercall, started using it, and after getting migrated
to an older kernel realizes that it's no longer available. Depending
on how the guest handles the change, there's a potential chance that
the guest would just panic.

As a result, there's a need for the userspace to elect the services
that it wishes the guest to discover. It can elect these services
based on the kernels spread across its (migration) fleet. To remedy
this, extend the existing firmware pseudo-registers, such as
KVM_REG_ARM_PSCI_VERSION, but by creating a new COPROC register space
for all the hypercall services available.

These firmware registers are categorized based on the service call
owners, but unlike the existing firmware pseudo-registers, they hold
the features supported in the form of a bitmap.

During the VM initialization, the registers are set to upper-limit of
the features supported by the corresponding registers. It's expected
that the VMMs discover the features provided by each register via
GET_ONE_REG, and write back the desired values using SET_ONE_REG.
KVM allows this modification only until the VM has started.

Some of the standard features are not mapped to any bits of the
registers. But since they can recreate the original problem of
making it available without userspace's consent, they need to
be explicitly added to the case-list in
kvm_hvc_call_default_allowed(). Any function-id that's not enabled
via the bitmap, or not listed in kvm_hvc_call_default_allowed, will
be returned as SMCCC_RET_NOT_SUPPORTED to the guest.

Older userspace code can simply ignore the feature and the
hypercall services will be exposed unconditionally to the guests,
thus ensuring backward compatibility.

In this patch, the framework adds the register only for ARM's standard
secure services (owner value 4). Currently, this includes support only
for ARM True Random Number Generator (TRNG) service, with bit-0 of the
register representing mandatory features of v1.0. Other services are
momentarily added in the upcoming patches.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
[maz: reduced the scope of some helpers, tidy-up bitmap max values,
 dropped error-only fast path]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220502233853.1233742-3-rananta@google.com
2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta
85fbe08e4d KVM: arm64: Factor out firmware register handling from psci.c
Common hypercall firmware register handing is currently employed
by psci.c. Since the upcoming patches add more of these registers,
it's better to move the generic handling to hypercall.c for a
cleaner presentation.

While we are at it, collect all the firmware registers under
fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.

No functional change intended.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
[maz: fixed KVM_REG_FEATURE_LEVEL_MASK]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220502233853.1233742-2-rananta@google.com
2022-05-03 14:48:54 +01:00
Marc Zyngier
b57de4ffd7 KVM: arm64: Simplify kvm_cpu_has_pending_timer()
kvm_cpu_has_pending_timer() ends up checking all the possible
timers for a wake-up cause. However, we already check for
pending interrupts whenever we try to wake-up a vcpu, including
the timer interrupts.

Obviously, doing the same work twice is once too many. Reduce
this helper to almost nothing, but keep it around, as we are
going to make use of it soon.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419182755.601427-4-maz@kernel.org
2022-04-20 13:24:44 +01:00
Marc Zyngier
1a48ce9264 Merge branch kvm-arm64/psci-1.1 into kvmarm-master/next
* kvm-arm64/psci-1.1:
  : .
  : Limited PSCI-1.1 support from Will Deacon:
  :
  : This small series exposes the PSCI SYSTEM_RESET2 call to guests, which
  : allows the propagation of a "reset_type" and a "cookie" back to the VMM.
  : Although Linux guests only ever pass 0 for the type ("SYSTEM_WARM_RESET"),
  : the vendor-defined range can be used by a bootloader to provide additional
  : information about the reset, such as an error code.
  : .
  KVM: arm64: Remove unneeded semicolons
  KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field
  KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest
  KVM: arm64: Bump guest PSCI version to 1.1

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-02-25 13:49:48 +00:00
Will Deacon
512865d83f KVM: arm64: Bump guest PSCI version to 1.1
Expose PSCI version v1.1 to the guest by default. The only difference
for now is that an updated version number is reported by PSCI_VERSION.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-2-will@kernel.org
2022-02-21 16:02:55 +00:00
Marc Zyngier
00e6dae00e Merge branch kvm-arm64/pmu-bl into kvmarm-master/next
* kvm-arm64/pmu-bl:
  : .
  : Improve PMU support on heterogeneous systems, courtesy of Alexandru Elisei
  : .
  KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  KVM: arm64: Keep a list of probed PMUs
  KVM: arm64: Keep a per-VM pointer to the default PMU
  perf: Fix wrong name in comment for struct perf_cpu_context
  KVM: arm64: Do not change the PMU event filter after a VCPU has run

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-02-08 17:54:41 +00:00
Alexandru Elisei
db858060b1 KVM: arm64: Keep a list of probed PMUs
The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-5-alexandru.elisei@arm.com
2022-02-08 17:51:21 +00:00
Oliver Upton
dfefa04a90 KVM: arm64: Drop unused param from kvm_psci_version()
kvm_psci_version() consumes a pointer to struct kvm in addition to a
vcpu pointer. Drop the kvm pointer as it is unused. While the comment
suggests the explicit kvm pointer was useful for calling from hyp, there
exist no such callsite in hyp.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220208012705.640444-1-oupton@google.com
2022-02-08 15:25:34 +00:00
Linus Torvalds
79e06c4c49 RISCV:
- Use common KVM implementation of MMU memory caches
 
 - SBI v0.2 support for Guest
 
 - Initial KVM selftests support
 
 - Fix to avoid spurious virtual interrupts after clearing hideleg CSR
 
 - Update email address for Anup and Atish
 
 ARM:
 - Simplification of the 'vcpu first run' by integrating it into
   KVM's 'pid change' flow
 
 - Refactoring of the FP and SVE state tracking, also leading to
   a simpler state and less shared data between EL1 and EL2 in
   the nVHE case
 
 - Tidy up the header file usage for the nvhe hyp object
 
 - New HYP unsharing mechanism, finally allowing pages to be
   unmapped from the Stage-1 EL2 page-tables
 
 - Various pKVM cleanups around refcounting and sharing
 
 - A couple of vgic fixes for bugs that would trigger once
   the vcpu xarray rework is merged, but not sooner
 
 - Add minimal support for ARMv8.7's PMU extension
 
 - Rework kvm_pgtable initialisation ahead of the NV work
 
 - New selftest for IRQ injection
 
 - Teach selftests about the lack of default IPA space and
   page sizes
 
 - Expand sysreg selftest to deal with Pointer Authentication
 
 - The usual bunch of cleanups and doc update
 
 s390:
 - fix sigp sense/start/stop/inconsistency
 
 - cleanups
 
 x86:
 - Clean up some function prototypes more
 
 - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation
 
 - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery
 
 - completely remove potential TOC/TOU races in nested SVM consistency checks
 
 - update some PMCs on emulated instructions
 
 - Intel AMX support (joint work between Thomas and Intel)
 
 - large MMU cleanups
 
 - module parameter to disable PMU virtualization
 
 - cleanup register cache
 
 - first part of halt handling cleanups
 
 - Hyper-V enlightened MSR bitmap support for nested hypervisors
 
 Generic:
 - clean up Makefiles
 
 - introduce CONFIG_HAVE_KVM_DIRTY_RING
 
 - optimize memslot lookup using a tree
 
 - optimize vCPU array usage by converting to xarray
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHhxvsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPZkAf+Nz92UL/5nNGcdHtE4m7AToMmitE9
 bYkesf9BMQvAe5wjkABLuoHGi6ay4jabo4fiGzbdkiK7lO5YgfsWiMB3/MT5fl4E
 jRPzaVQabp3YZLM8UYCBmfUVuRj524S967SfSRe0AvYjDEH8y7klPf4+7sCsFT0/
 Px9Vf2KGuOlf0eM78yKg4rGaF0jS22eLgXm6FfNMY8/e29ZAo/jyUmqBY+Z2xxZG
 aWhceDtSheW1jwLHLj3nOlQJvHTn8LVGXBE/R8Gda3ZjrBV2rKaDi4Fh+HD+dz86
 2zVXwzQ7uck2CMW73GMoXMTWoKSHMyvlBOs1BdvBm4UsnGcXR+q8IFCeuQ==
 =s73m
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "RISCV:

   - Use common KVM implementation of MMU memory caches

   - SBI v0.2 support for Guest

   - Initial KVM selftests support

   - Fix to avoid spurious virtual interrupts after clearing hideleg CSR

   - Update email address for Anup and Atish

  ARM:

   - Simplification of the 'vcpu first run' by integrating it into KVM's
     'pid change' flow

   - Refactoring of the FP and SVE state tracking, also leading to a
     simpler state and less shared data between EL1 and EL2 in the nVHE
     case

   - Tidy up the header file usage for the nvhe hyp object

   - New HYP unsharing mechanism, finally allowing pages to be unmapped
     from the Stage-1 EL2 page-tables

   - Various pKVM cleanups around refcounting and sharing

   - A couple of vgic fixes for bugs that would trigger once the vcpu
     xarray rework is merged, but not sooner

   - Add minimal support for ARMv8.7's PMU extension

   - Rework kvm_pgtable initialisation ahead of the NV work

   - New selftest for IRQ injection

   - Teach selftests about the lack of default IPA space and page sizes

   - Expand sysreg selftest to deal with Pointer Authentication

   - The usual bunch of cleanups and doc update

  s390:

   - fix sigp sense/start/stop/inconsistency

   - cleanups

  x86:

   - Clean up some function prototypes more

   - improved gfn_to_pfn_cache with proper invalidation, used by Xen
     emulation

   - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery

   - completely remove potential TOC/TOU races in nested SVM consistency
     checks

   - update some PMCs on emulated instructions

   - Intel AMX support (joint work between Thomas and Intel)

   - large MMU cleanups

   - module parameter to disable PMU virtualization

   - cleanup register cache

   - first part of halt handling cleanups

   - Hyper-V enlightened MSR bitmap support for nested hypervisors

  Generic:

   - clean up Makefiles

   - introduce CONFIG_HAVE_KVM_DIRTY_RING

   - optimize memslot lookup using a tree

   - optimize vCPU array usage by converting to xarray"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits)
  x86/fpu: Fix inline prefix warnings
  selftest: kvm: Add amx selftest
  selftest: kvm: Move struct kvm_x86_state to header
  selftest: kvm: Reorder vcpu_load_state steps for AMX
  kvm: x86: Disable interception for IA32_XFD on demand
  x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state()
  kvm: selftests: Add support for KVM_CAP_XSAVE2
  kvm: x86: Add support for getting/setting expanded xstate buffer
  x86/fpu: Add uabi_size to guest_fpu
  kvm: x86: Add CPUID support for Intel AMX
  kvm: x86: Add XCR0 support for Intel AMX
  kvm: x86: Disable RDMSR interception of IA32_XFD_ERR
  kvm: x86: Emulate IA32_XFD_ERR for guest
  kvm: x86: Intercept #NM for saving IA32_XFD_ERR
  x86/fpu: Prepare xfd_err in struct fpu_guest
  kvm: x86: Add emulation for IA32_XFD
  x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation
  kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2
  x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM
  x86/fpu: Add guest support to xfd_enable_feature()
  ...
2022-01-16 16:15:14 +02:00
Andy Shevchenko
6c9eeb5f4a KVM: arm64: vgic: Replace kernel.h with the necessary inclusions
arm_vgic.h does not require all the stuff that kernel.h provides.
Replace kernel.h inclusion with the list of what is really being used.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220104151940.55399-1-andriy.shevchenko@linux.intel.com
2022-01-04 17:11:47 +00:00
Sean Christopherson
be399d824b KVM: arm64: Hide kvm_arm_pmu_available behind CONFIG_HW_PERF_EVENTS=y
Move the definition of kvm_arm_pmu_available to pmu-emul.c and, out of
"necessity", hide it behind CONFIG_HW_PERF_EVENTS.  Provide a stub for
the key's wrapper, kvm_arm_support_pmu_v3().  Moving the key's definition
out of perf.c will allow a future commit to delete perf.c entirely.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211111020738.2512932-16-seanjc@google.com
2021-11-17 14:49:11 +01:00
Marc Zyngier
e840f42a49 KVM: arm64: Fix PMU probe ordering
Russell reported that since 5.13, KVM's probing of the PMU has
started to fail on his HW. As it turns out, there is an implicit
ordering dependency between the architectural PMU probing code and
and KVM's own probing. If, due to probe ordering reasons, KVM probes
before the PMU driver, it will fail to detect the PMU and prevent it
from being advertised to guests as well as the VMM.

Obviously, this is one probing too many, and we should be able to
deal with any ordering.

Add a callback from the PMU code into KVM to advertise the registration
of a host CPU PMU, allowing for any probing order.

Fixes: 5421db1be3 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
2021-09-20 12:43:34 +01:00
Marc Zyngier
354920e794 KVM: arm64: vgic: Implement SW-driven deactivation
In order to deal with these systems that do not offer HW-based
deactivation of interrupts, let implement a SW-based approach:

- When the irq is queued into a LR, treat it as a pure virtual
  interrupt and set the EOI flag in the LR.

- When the interrupt state is read back from the LR, force a
  deactivation when the state is invalid (neither active nor
  pending)

Interrupts requiring such treatment get the VGIC_SW_RESAMPLE flag.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-01 10:46:00 +01:00
Marc Zyngier
db75f1a33f KVM: arm64: vgic: move irq->get_input_level into an ops structure
We already have the option to attach a callback to an interrupt
to retrieve its pending state. As we are planning to expand this
facility, move this callback into its own data structure.

This will limit the size of individual interrupts as the ops
structures can be shared across multiple interrupts.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-01 10:45:59 +01:00
Marc Zyngier
f6c3e24fb7 KVM: arm64: vgic: Let an interrupt controller advertise lack of HW deactivation
The vGIC, as architected by ARM, allows a virtual interrupt to
trigger the deactivation of a physical interrupt. This allows
the following interrupt to be delivered without requiring an exit.

However, some implementations have choosen not to implement this,
meaning that we will need some unsavoury workarounds to deal with this.

On detecting such a case, taint the kernel and spit a nastygram.
We'll deal with this in later patches.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-01 10:45:59 +01:00
Marc Zyngier
9a8aae605b Merge branch 'kvm-arm64/kill_oprofile_dependency' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-22 13:41:49 +01:00
Marc Zyngier
5421db1be3 KVM: arm64: Divorce the perf code from oprofile helpers
KVM/arm64 is the sole user of perf_num_counters(), and really
could do without it. Stop using the obsolete API by relying on
the existing probing code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210414134409.1266357-2-maz@kernel.org
2021-04-22 13:32:39 +01:00
Eric Auger
28e9d4bce3 KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace
Commit 23bde34771 ("KVM: arm64: vgic-v3: Drop the
reporting of GICR_TYPER.Last for userspace") temporarily fixed
a bug identified when attempting to access the GICR_TYPER
register before the redistributor region setting, but dropped
the support of the LAST bit.

Emulating the GICR_TYPER.Last bit still makes sense for
architecture compliance though. This patch restores its support
(if the redistributor region was set) while keeping the code safe.

We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
computes whether a redistributor is the highest one of a series
of redistributor contributor pages.

With this new implementation we do not need to have a uaccess
read accessor anymore.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-9-eric.auger@redhat.com
2021-04-06 14:51:38 +01:00
Marc Zyngier
6b5b368fcc KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key
We currently find out about the presence of a HW PMU (or the handling
of that PMU by perf, which amounts to the same thing) in a fairly
roundabout way, by checking the number of counters available to perf.
That's good enough for now, but we will soon need to find about about
that on paths where perf is out of reach (in the world switch).

Instead, let's turn kvm_arm_support_pmu_v3() into a static key.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210209114844.3278746-2-maz@kernel.org
Message-Id: <20210305185254.3730990-5-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-06 04:18:40 -05:00
Marc Zyngier
8cbebc4118 KVM: arm64: Replace KVM_ARM_PMU with HW_PERF_EVENTS
KVM_ARM_PMU only existed for the benefit of 32bit ARM hosts,
and makes no sense now that we are 64bit only. Get rid of it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-01-04 16:50:16 +00:00
Marc Zyngier
17f84520cb Merge remote-tracking branch 'origin/kvm-arm64/misc-5.11' into kvmarm-master/queue
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-12-04 10:12:55 +00:00
Shenming Lu
57e3cebd02 KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit
In order to reduce the impact of the VPT parsing happening on the GIC,
we can split the vcpu reseidency in two phases:

- programming GICR_VPENDBASER: this still happens in vcpu_load()
- checking for the VPT parsing to be complete: this can happen
  on vcpu entry (in kvm_vgic_flush_hwstate())

This allows the GIC and the CPU to work in parallel, rewmoving some
of the entry overhead.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201128141857.983-3-lushenming@huawei.com
2020-11-30 11:18:29 +00:00
Marc Zyngier
7521c3a9e6 KVM: arm64: Get rid of the PMU ready state
The PMU ready state has no user left. Goodbye.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:41:24 +00:00
Linus Torvalds
f9a705ad1c ARM:
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 
 PPC:
 - Fix for running nested guests with in-kernel IRQ chip
 - Fix race condition causing occasional host hard lockup
 - Minor cleanups and bugfixes
 
 x86:
 - allow trapping unknown MSRs to userspace
 - allow userspace to force #GP on specific MSRs
 - INVPCID support on AMD
 - nested AMD cleanup, on demand allocation of nested SVM state
 - hide PV MSRs and hypercalls for features not enabled in CPUID
 - new test for MSR_IA32_TSC writes from host and guest
 - cleanups: MMU, CPUID, shared MSRs
 - LAPIC latency optimizations ad bugfixes
 
 For x86, also included in this pull request is a new alternative and
 (in the future) more scalable implementation of extended page tables
 that does not need a reverse map from guest physical addresses to
 host physical addresses.  For now it is disabled by default because
 it is still lacking a few of the existing MMU's bells and whistles.
 However it is a very solid piece of work and it is already available
 for people to hammer on it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
 Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
 tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
 Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
 STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
 qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
 =AD6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable
  implementation of extended page tables that does not need a reverse
  map from guest physical addresses to host physical addresses.

  For now it is disabled by default because it is still lacking a few of
  the existing MMU's bells and whistles. However it is a very solid
  piece of work and it is already available for people to hammer on it.

  Other updates:

  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation

  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes

  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations ad bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 11:17:56 -07:00
Marc Zyngier
88865beca9 KVM: arm64: Mask out filtered events in PCMEID{0,1}_EL1
As we can now hide events from the guest, let's also adjust its view of
PCMEID{0,1}_EL1 so that it can figure out why some common events are not
counting as they should.

The astute user can still look into the TRM for their CPU and find out
they've been cheated, though. Nobody's perfect.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:39 +01:00
Julien Thierry
95e92e45a4 KVM: arm64: pmu: Make overflow handler NMI safe
kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
NMI context, defer waking the vcpu to an irq_work queue.

A vcpu can be freed while it's not running by kvm_destroy_vm(). Prevent
running the irq_work for a non-existent vcpu by calling irq_work_sync() on
the PMU destroy path.

[Alexandru E.: Added irq_work_sync()]

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20200924110706.254996-6-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 19:00:17 +01:00
Marc Zyngier
41ce82f63c KVM: arm64: timers: Move timer registers to the sys_regs file
Move the timer gsisters to the sysreg file. This will further help when
they are directly changed by a nesting hypervisor in the VNCR page.

This requires moving the initialisation of the timer struct so that some
of the helpers (such as arch_timer_ctx_index) can work correctly at an
early stage.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
3c5ff0c60f KVM: arm64: timers: Rename kvm_timer_sync_hwstate to kvm_timer_sync_user
kvm_timer_sync_hwstate() has nothing to do with the timer HW state,
but more to do with the state of a userspace interrupt controller.
Change the suffix from _hwstate to_user, in keeping with the rest
of the code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Christoffer Dall
fc5d1f1a42 KVM: arm64: vgic-v3: Take cpu_if pointer directly instead of vcpu
If we move the used_lrs field to the version-specific cpu interface
structure, the following functions only operate on the struct
vgic_v3_cpu_if and not the full vcpu:

  __vgic_v3_save_state
  __vgic_v3_restore_state
  __vgic_v3_activate_traps
  __vgic_v3_deactivate_traps
  __vgic_v3_save_aprs
  __vgic_v3_restore_aprs

This is going to be very useful for nested virt, so move the used_lrs
field and change the prototypes and implementations of these functions to
take the cpu_if parameter directly.

No functional change.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Marc Zyngier
bacf2c6054 KVM: arm64: GICv4.1: Allow SGIs to switch between HW and SW interrupts
In order to let a guest buy in the new, active-less SGIs, we
need to be able to switch between the two modes.

Handle this by stopping all guest activity, transfer the state
from one mode to the other, and resume the guest. Nothing calls
this code so far, but a later patch will plug it into the MMIO
emulation.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-20-maz@kernel.org
2020-03-24 12:15:51 +00:00
Marc Zyngier
ae699ad348 irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer
In order to hide some of the differences between v4.0 and v4.1, move
the doorbell management out of the KVM code, and into the GICv4-specific
layer. This allows the calling code to ask for the doorbell when blocking,
and otherwise to leave the doorbell permanently disabled.

This matches the v4.1 code perfectly, and only results in a minor
refactoring of the v4.0 code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-14-maz@kernel.org
2020-03-24 12:15:51 +00:00