Commit Graph

11719 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
2a77668d45 This is the 6.1.33 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmSC5VIACgkQONu9yGCS
 aT5RPhAAiVFNzTuQT4DtPzXUzl9hpNtdtZPVa/z28+SbOZyf2YgyDGXLHvnGbJ/2
 8DWDV9uSsxdX2InNqzD/IbRSiHjXprpDssthq3Qr5aPH7FO76uICWndrCk0dhZsK
 kI/+J7BqS1vgtaxsZeo/IHmMQJ5oEzx/JzvcyK5po0rykNDCxWNnh8cK4YtFOVtk
 eRD8cPWXvJGn88pdPPlQuS75MKBGcAUZLodN//tP+x2bcWzocaTZUCEHL36eLcVc
 0CxPykCpFOcLFLIJWQ+pY2/HR2ynTBxYoaXsTpscR+FKbS+Lz9B6PUoXCvqaV2/e
 lriLjg22lbqxBbBhEk5NLBVozajtU/gNq6pptp/EnZahwjjyavuToZviWf8NWfs0
 2u+zQlolinCKnm+8o18dRn24kI7LbUSD2w+V8FydSQNHMikvu/xHgDdLgzmj2XAf
 ZIAkHdGjRzKL2euDPrp28D5vPfCqDjqT2wUE2vUsc+Ax4k6ewFCPs3cweWD8hoFS
 fAjTC3Q/oNp6eEbWuWJPxl+DW/tD3ezRGeqrRCXQwubcgwB5iaS5ItdCCfG/lfiJ
 PNHf4kpg4FlyBf8aPD+R3QA6KOuS1owNNk3cx72zHs8zPusosHWj9hDrXeYVn06G
 gj1SIoC+jC/L5nbYH9WFLnKm9+EQ28lcp9j7f1PdlDhkcJmzBRY=
 =Qjnb
 -----END PGP SIGNATURE-----

Merge 6.1.33 into android14-6.1-lts

Changes in 6.1.33
	RDMA/bnxt_re: Fix the page_size used during the MR creation
	phy: amlogic: phy-meson-g12a-mipi-dphy-analog: fix CNTL2_DIF_TX_CTL0 value
	RDMA/efa: Fix unsupported page sizes in device
	RDMA/hns: Fix timeout attr in query qp for HIP08
	RDMA/hns: Fix base address table allocation
	RDMA/hns: Modify the value of long message loopback slice
	dmaengine: at_xdmac: fix potential Oops in at_xdmac_prep_interleaved()
	RDMA/bnxt_re: Fix a possible memory leak
	RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
	iommu/rockchip: Fix unwind goto issue
	iommu/amd: Don't block updates to GATag if guest mode is on
	iommu/amd: Handle GALog overflows
	iommu/amd: Fix up merge conflict resolution
	nfsd: make a copy of struct iattr before calling notify_change
	dmaengine: pl330: rename _start to prevent build error
	riscv: Fix unused variable warning when BUILTIN_DTB is set
	net/mlx5: Drain health before unregistering devlink
	net/mlx5: SF, Drain health before removing device
	net/mlx5: fw_tracer, Fix event handling
	net/mlx5e: Don't attach netdev profile while handling internal error
	net: mellanox: mlxbf_gige: Fix skb_panic splat under memory pressure
	netrom: fix info-leak in nr_write_internal()
	af_packet: Fix data-races of pkt_sk(sk)->num.
	tls: improve lockless access safety of tls_err_abort()
	amd-xgbe: fix the false linkup in xgbe_phy_status
	perf ftrace latency: Remove unnecessary "--" from --use-nsec option
	mtd: rawnand: ingenic: fix empty stub helper definitions
	RDMA/irdma: Prevent QP use after free
	RDMA/irdma: Fix Local Invalidate fencing
	af_packet: do not use READ_ONCE() in packet_bind()
	tcp: deny tcp_disconnect() when threads are waiting
	tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set
	net/smc: Scan from current RMB list when no position specified
	net/smc: Don't use RMBs not mapped to new link in SMCRv2 ADD LINK
	net/sched: sch_ingress: Only create under TC_H_INGRESS
	net/sched: sch_clsact: Only create under TC_H_CLSACT
	net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs
	net/sched: Prohibit regrafting ingress or clsact Qdiscs
	net: sched: fix NULL pointer dereference in mq_attach
	net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report
	udp6: Fix race condition in udp6_sendmsg & connect
	nfsd: fix double fget() bug in __write_ports_addfd()
	nvme: fix the name of Zone Append for verbose logging
	net/mlx5e: Fix error handling in mlx5e_refresh_tirs
	net/mlx5: Read embedded cpu after init bit cleared
	iommu/mediatek: Flush IOTLB completely only if domain has been attached
	net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
	tcp: fix mishandling when the sack compression is deferred.
	net: dsa: mv88e6xxx: Increase wait after reset deactivation
	mtd: rawnand: marvell: ensure timing values are written
	mtd: rawnand: marvell: don't set the NAND frequency select
	rtnetlink: call validate_linkmsg in rtnl_create_link
	mptcp: avoid unneeded __mptcp_nmpc_socket() usage
	mptcp: add annotations around msk->subflow accesses
	mptcp: avoid unneeded address copy
	mptcp: simplify subflow_syn_recv_sock()
	mptcp: consolidate passive msk socket initialization
	mptcp: fix data race around msk->first access
	mptcp: add annotations around sk->sk_shutdown accesses
	drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"
	watchdog: menz069_wdt: fix watchdog initialisation
	ALSA: hda: Glenfly: add HD Audio PCI IDs and HDMI Codec Vendor IDs.
	ASoC: Intel: soc-acpi-cht: Add quirk for Nextbook Ares 8A tablet
	drm/amdgpu: Use the default reset when loading or reloading the driver
	mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()
	drm/ast: Fix ARM compatibility
	btrfs: abort transaction when sibling keys check fails for leaves
	ARM: 9295/1: unwind:fix unwind abort for uleb128 case
	hwmon: (k10temp) Add PCI ID for family 19, model 78h
	media: rcar-vin: Select correct interrupt mode for V4L2_FIELD_ALTERNATE
	platform/x86: intel_scu_pcidrv: Add back PCI ID for Medfield
	platform/mellanox: fix potential race in mlxbf-tmfifo driver
	gfs2: Don't deref jdesc in evict
	drm/amdgpu: set gfx9 onwards APU atomics support to be true
	fbdev: imsttfb: Fix use after free bug in imsttfb_probe
	fbdev: modedb: Add 1920x1080 at 60 Hz video mode
	fbdev: stifb: Fix info entry in sti_struct on error path
	nbd: Fix debugfs_create_dir error checking
	block/rnbd: replace REQ_OP_FLUSH with REQ_OP_WRITE
	nvme-pci: add NVME_QUIRK_BOGUS_NID for HS-SSD-FUTURE 2048G
	nvme-pci: add quirk for missing secondary temperature thresholds
	ASoC: amd: yc: Add DMI entry to support System76 Pangolin 12
	ASoC: dwc: limit the number of overrun messages
	um: harddog: fix modular build
	xfrm: Check if_id in inbound policy/secpath match
	ASoC: dt-bindings: Adjust #sound-dai-cells on TI's single-DAI codecs
	ALSA: hda/realtek: Add quirks for ASUS GU604V and GU603V
	ASoC: ssm2602: Add workaround for playback distortions
	media: dvb_demux: fix a bug for the continuity counter
	media: dvb-usb: az6027: fix three null-ptr-deref in az6027_i2c_xfer()
	media: dvb-usb-v2: ec168: fix null-ptr-deref in ec168_i2c_xfer()
	media: dvb-usb-v2: ce6230: fix null-ptr-deref in ce6230_i2c_master_xfer()
	media: dvb-usb-v2: rtl28xxu: fix null-ptr-deref in rtl28xxu_i2c_xfer
	media: dvb-usb: digitv: fix null-ptr-deref in digitv_i2c_xfer()
	media: dvb-usb: dw2102: fix uninit-value in su3000_read_mac_address
	media: netup_unidvb: fix irq init by register it at the end of probe
	media: dvb_ca_en50221: fix a size write bug
	media: ttusb-dec: fix memory leak in ttusb_dec_exit_dvb()
	media: mn88443x: fix !CONFIG_OF error by drop of_match_ptr from ID table
	media: dvb-core: Fix use-after-free due on race condition at dvb_net
	media: dvb-core: Fix use-after-free due to race at dvb_register_device()
	media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*()
	media: dvb-core: Fix use-after-free due to race condition at dvb_ca_en50221
	ASoC: SOF: debug: conditionally bump runtime_pm counter on exceptions
	ASoC: SOF: pcm: fix pm_runtime imbalance in error handling
	ASoC: SOF: sof-client-probes: fix pm_runtime imbalance in error handling
	ASoC: SOF: pm: save io region state in case of errors in resume
	s390/pkey: zeroize key blobs
	s390/topology: honour nr_cpu_ids when adding CPUs
	ACPI: resource: Add IRQ override quirk for LG UltraPC 17U70P
	wifi: rtl8xxxu: fix authentication timeout due to incorrect RCR value
	ARM: dts: stm32: add pin map for CAN controller on stm32f7
	arm64/mm: mark private VM_FAULT_X defines as vm_fault_t
	arm64: vdso: Pass (void *) to virt_to_page()
	wifi: mac80211: simplify chanctx allocation
	wifi: mac80211: consider reserved chanctx for mindef
	wifi: mac80211: recalc chanctx mindef before assigning
	wifi: iwlwifi: mvm: Add locking to the rate read flow
	scsi: core: Decrease scsi_device's iorequest_cnt if dispatch failed
	wifi: b43: fix incorrect __packed annotation
	net: wwan: t7xx: Ensure init is completed before system sleep
	netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT
	nvme-multipath: don't call blk_mark_disk_dead in nvme_mpath_remove_disk
	nvme: do not let the user delete a ctrl before a complete initialization
	ALSA: oss: avoid missing-prototype warnings
	drm/msm: Be more shouty if per-process pgtables aren't working
	atm: hide unused procfs functions
	ceph: silence smatch warning in reconnect_caps_cb()
	drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
	ublk: fix AB-BA lockdep warning
	nvme-pci: Add quirk for Teamgroup MP33 SSD
	block: Deny writable memory mapping if block is read-only
	KVM: arm64: vgic: Fix a circular locking issue
	KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
	KVM: arm64: vgic: Fix locking comment
	media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats
	mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()
	drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
	media: uvcvideo: Don't expose unsupported formats to userspace
	iio: accel: st_accel: Fix invalid mount_matrix on devices without ACPI _ONT method
	iio: adc: mxs-lradc: fix the order of two cleanup operations
	HID: google: add jewel USB id
	HID: wacom: avoid integer overflow in wacom_intuos_inout()
	iio: imu: inv_icm42600: fix timestamp reset
	dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
	iio: light: vcnl4035: fixed chip ID check
	iio: adc: stm32-adc: skip adc-channels setup if none is present
	iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
	iio: dac: mcp4725: Fix i2c_master_send() return value handling
	iio: addac: ad74413: fix resistance input processing
	iio: adc: ad7192: Change "shorted" channels to differential
	iio: adc: stm32-adc: skip adc-diff-channels setup if none is present
	iio: dac: build ad5758 driver when AD5758 is selected
	net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818
	dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type
	usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
	usb: gadget: f_fs: Add unbind event before functionfs_unbind
	md/raid5: fix miscalculation of 'end_sector' in raid5_read_one_chunk()
	misc: fastrpc: return -EPIPE to invocations on device removal
	misc: fastrpc: reject new invocations during device removal
	scsi: stex: Fix gcc 13 warnings
	ata: libata-scsi: Use correct device no in ata_find_dev()
	drm/amdgpu: enable tmz by default for GC 11.0.1
	drm/amd/pm: reverse mclk and fclk clocks levels for SMU v13.0.4
	drm/amd/pm: reverse mclk and fclk clocks levels for vangogh
	drm/amd/pm: resolve reboot exception for si oland
	drm/amd/pm: reverse mclk clocks levels for SMU v13.0.5
	drm/amd/pm: reverse mclk and fclk clocks levels for yellow carp
	drm/amd/pm: reverse mclk and fclk clocks levels for renoir
	x86/mtrr: Revert 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case")
	mmc: vub300: fix invalid response handling
	mmc: pwrseq: sd8787: Fix WILC CHIP_EN and RESETN toggling order
	tty: serial: fsl_lpuart: use UARTCTRL_TXINV to send break instead of UARTCTRL_SBK
	btrfs: fix csum_tree_block page iteration to avoid tripping on -Werror=array-bounds
	phy: qcom-qmp-combo: fix init-count imbalance
	phy: qcom-qmp-pcie-msm8996: fix init-count imbalance
	block: fix revalidate performance regression
	powerpc/iommu: Limit number of TCEs to 512 for H_STUFF_TCE hcall
	iommu/amd: Fix domain flush size when syncing iotlb
	tpm, tpm_tis: correct tpm_tis_flags enumeration values
	riscv: perf: Fix callchain parse error with kernel tracepoint events
	io_uring: undeprecate epoll_ctl support
	selinux: don't use make's grouped targets feature yet
	mtdchar: mark bits of ioctl handler noinline
	tracing/timerlat: Always wakeup the timerlat thread
	tracing/histograms: Allow variables to have some modifiers
	tracing/probe: trace_probe_primary_from_call(): checked list_first_entry
	selftests: mptcp: connect: skip if MPTCP is not supported
	selftests: mptcp: pm nl: skip if MPTCP is not supported
	selftests: mptcp: join: skip if MPTCP is not supported
	selftests: mptcp: sockopt: skip if MPTCP is not supported
	selftests: mptcp: userspace pm: skip if MPTCP is not supported
	mptcp: fix connect timeout handling
	mptcp: fix active subflow finalization
	ext4: add EA_INODE checking to ext4_iget()
	ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find()
	ext4: disallow ea_inodes with extended attributes
	ext4: add lockdep annotations for i_data_sem for ea_inode's
	fbcon: Fix null-ptr-deref in soft_cursor
	serial: 8250_tegra: Fix an error handling path in tegra_uart_probe()
	serial: cpm_uart: Fix a COMPILE_TEST dependency
	powerpc/xmon: Use KSYM_NAME_LEN in array size
	test_firmware: fix a memory leak with reqs buffer
	test_firmware: fix the memory leak of the allocated firmware buffer
	KVM: arm64: Populate fault info for watchpoint
	KVM: x86: Account fastpath-only VM-Exits in vCPU stats
	ksmbd: fix credit count leakage
	ksmbd: fix UAF issue from opinfo->conn
	ksmbd: fix incorrect AllocationSize set in smb2_get_info
	ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate
	ksmbd: fix multiple out-of-bounds read during context decoding
	KEYS: asymmetric: Copy sig and digest in public_key_verify_signature()
	fs/ntfs3: Validate MFT flags before replaying logs
	regmap: Account for register length when chunking
	tpm, tpm_tis: Request threaded interrupt handler
	iommu/amd/pgtbl_v2: Fix domain max address
	drm/amd/display: Have Payload Properly Created After Resume
	xfs: verify buffer contents when we skip log replay
	tls: rx: strp: don't use GFP_KERNEL in softirq context
	arm64: efi: Use SMBIOS processor version to key off Ampere quirk
	selftests: mptcp: diag: skip if MPTCP is not supported
	selftests: mptcp: simult flows: skip if MPTCP is not supported
	selftests: mptcp: join: avoid using 'cmp --bytes'
	ext4: enable the lazy init thread when remounting read/write
	Linux 6.1.33

Note, the following commits were reverted from this merge, due to
conflicts with other KVM patches.  If they are needed later, they can be
brought back in a way that enables them to actually build properly:
	bafe94ac99 ("KVM: arm64: vgic: Fix locking comment")
	150a5f74a5 ("KVM: arm64: vgic: Wrap vgic_its_create() with config_lock")
	4129d71e5b ("KVM: arm64: vgic: Fix a circular locking issue")

Change-Id: I3c4183fbe22b22914ee8985bd6add545abded9d0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-14 19:17:01 +00:00
Greg Kroah-Hartman
3a53767f1f Revert "bpf, sockmap: Pass skb ownership through read_skb"
This reverts commit 4ae2af3e59.

It breaks the Android KABI and will be brought back at a later time when
it is safe to do so.

Bug: 161946584
Change-Id: I86bb2f6a68328744a796648dd8ec907580dd2c0b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-14 16:18:29 +00:00
Greg Kroah-Hartman
51ffabff7c Revert "bpf, sockmap: Handle fin correctly"
This reverts commit 3a2129ebae.

It breaks the Android KABI and will be brought back at a later time when
it is safe to do so.

Bug: 161946584
Change-Id: Ib4d8c98270484b4d2c63e838bbe0d24c00642f87
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-14 16:18:22 +00:00
Greg Kroah-Hartman
3ce63059c1 Revert "bpf, sockmap: TCP data stall on recv before accept"
This reverts commit ab90b68f65.

It breaks the Android KABI and will be brought back at a later time when
it is safe to do so.

Bug: 161946584
Change-Id: If61ba37db2a3ea1642c9e5c9295e2d7d4a94207b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-14 16:18:21 +00:00
Greg Kroah-Hartman
0851b00164 Revert "bpf, sockmap: Incorrectly handling copied_seq"
This reverts commit fe735073a5.

It breaks the Android KABI and will be brought back at a later time when
it is safe to do so.

Bug: 161946584
Change-Id: Iff834f7a4fde71e5c43d477cb9e2adbebd92d389
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-14 16:18:07 +00:00
Greg Kroah-Hartman
26b6ad0f34 Merge 6.1.32 into android14-6.1-lts
Changes in 6.1.32
	inet: Add IP_LOCAL_PORT_RANGE socket option
	ipv{4,6}/raw: fix output xfrm lookup wrt protocol
	firmware: arm_ffa: Fix usage of partition info get count flag
	selftests/bpf: Fix pkg-config call building sign-file
	platform/x86/amd/pmf: Fix CnQF and auto-mode after resume
	tls: rx: device: fix checking decryption status
	tls: rx: strp: set the skb->len of detached / CoW'ed skbs
	tls: rx: strp: fix determining record length in copy mode
	tls: rx: strp: force mixed decrypted records into copy mode
	tls: rx: strp: factor out copying skb data
	tls: rx: strp: preserve decryption status of skbs when needed
	net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register
	gpio-f7188x: fix chip name and pin count on Nuvoton chip
	bpf, sockmap: Pass skb ownership through read_skb
	bpf, sockmap: Convert schedule_work into delayed_work
	bpf, sockmap: Reschedule is now done through backlog
	bpf, sockmap: Improved check for empty queue
	bpf, sockmap: Handle fin correctly
	bpf, sockmap: TCP data stall on recv before accept
	bpf, sockmap: Wake up polling after data copy
	bpf, sockmap: Incorrectly handling copied_seq
	blk-mq: fix race condition in active queue accounting
	vfio/type1: check pfn valid before converting to struct page
	net: page_pool: use in_softirq() instead
	page_pool: fix inconsistency for page_pool_ring_[un]lock()
	net: phy: mscc: enable VSC8501/2 RGMII RX clock
	wifi: rtw89: correct 5 MHz mask setting
	wifi: iwlwifi: mvm: support wowlan info notification version 2
	wifi: iwlwifi: mvm: fix potential memory leak
	RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
	octeontx2-af: Add validation for lmac type
	drm/amd: Don't allow s0ix on APUs older than Raven
	bluetooth: Add cmd validity checks at the start of hci_sock_ioctl()
	Revert "thermal/drivers/mellanox: Use generic thermal_zone_get_trip() function"
	block: fix bio-cache for passthru IO
	cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
	cpufreq: amd-pstate: Add ->fast_switch() callback
	netfilter: ctnetlink: Support offloaded conntrack entry deletion
	tools headers UAPI: Sync the linux/in.h with the kernel sources
	Linux 6.1.32

Change-Id: I70ca0d07b33b26c2ed7613e6532eb9ae845112ee
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-13 20:47:37 +00:00
Greg Kroah-Hartman
03c3264a15 Merge 6.1.31 into android14-6.1-lts
Changes in 6.1.31
	usb: dwc3: fix gadget mode suspend interrupt handler issue
	tpm, tpm_tis: Avoid cache incoherency in test for interrupts
	tpm, tpm_tis: Only handle supported interrupts
	tpm_tis: Use tpm_chip_{start,stop} decoration inside tpm_tis_resume
	tpm, tpm_tis: startup chip before testing for interrupts
	tpm: Re-enable TPM chip boostrapping non-tpm_tis TPM drivers
	tpm: Prevent hwrng from activating during resume
	watchdog: sp5100_tco: Immediately trigger upon starting.
	drm/amd/amdgpu: update mes11 api def
	drm/amdgpu/mes11: enable reg active poll
	skbuff: Proactively round up to kmalloc bucket size
	platform/x86: hp-wmi: Fix cast to smaller integer type warning
	net: dsa: mv88e6xxx: Add RGMII delay to 88E6320
	drm/amd/display: hpd rx irq not working with eDP interface
	ocfs2: Switch to security_inode_init_security()
	arm64: Also reset KASAN tag if page is not PG_mte_tagged
	x86/mm: Avoid incomplete Global INVLPG flushes
	platform/x86/intel/ifs: Annotate work queue on stack so object debug does not complain
	ALSA: hda/ca0132: add quirk for EVGA X299 DARK
	ALSA: hda: Fix unhandled register update during auto-suspend period
	ALSA: hda/realtek: Enable headset onLenovo M70/M90
	SUNRPC: Don't change task->tk_status after the call to rpc_exit_task
	mmc: sdhci-esdhc-imx: make "no-mmc-hs400" works
	mmc: block: ensure error propagation for non-blk
	power: supply: axp288_fuel_gauge: Fix external_power_changed race
	power: supply: bq25890: Fix external_power_changed race
	ASoC: rt5682: Disable jack detection interrupt during suspend
	net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize
	m68k: Move signal frame following exception on 68020/030
	xtensa: fix signal delivery to FDPIC process
	xtensa: add __bswap{si,di}2 helpers
	parisc: Use num_present_cpus() in alternative patching code
	parisc: Handle kgdb breakpoints only in kernel context
	parisc: Fix flush_dcache_page() for usage from irq context
	parisc: Allow to reboot machine after system halt
	parisc: Enable LOCKDEP support
	parisc: Handle kprobes breakpoints only in kernel context
	gpio: mockup: Fix mode of debugfs files
	btrfs: use nofs when cleaning up aborted transactions
	dt-binding: cdns,usb3: Fix cdns,on-chip-buff-size type
	drm/mgag200: Fix gamma lut not initialized.
	drm/radeon: reintroduce radeon_dp_work_func content
	drm/amd/pm: add missing NotifyPowerSource message mapping for SMU13.0.7
	drm/amd/pm: Fix output of pp_od_clk_voltage
	Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
	Revert "android: binder: stop saving a pointer to the VMA"
	binder: add lockless binder_alloc_(set|get)_vma()
	binder: fix UAF caused by faulty buffer cleanup
	binder: fix UAF of alloc->vma in race with munmap()
	selftests/memfd: Fix unknown type name build failure
	drm/amd/amdgpu: limit one queue per gang
	perf/x86/uncore: Correct the number of CHAs on SPR
	x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms
	irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable
	irqchip/mips-gic: Use raw spinlock for gic_lock
	debugobjects: Don't wake up kswapd from fill_pool()
	fbdev: udlfb: Fix endpoint check
	net: fix stack overflow when LRO is disabled for virtual interfaces
	udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
	USB: core: Add routines for endpoint checks in old drivers
	USB: sisusbvga: Add endpoint checks
	media: radio-shark: Add endpoint checks
	ASoC: lpass: Fix for KASAN use_after_free out of bounds
	net: fix skb leak in __skb_tstamp_tx()
	drm: fix drmm_mutex_init()
	selftests: fib_tests: mute cleanup error message
	octeontx2-pf: Fix TSOv6 offload
	bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields
	bpf: fix a memory leak in the LRU and LRU_PERCPU hash maps
	lan966x: Fix unloading/loading of the driver
	ipv6: Fix out-of-bounds access in ipv6_find_tlv()
	cifs: mapchars mount option ignored
	power: supply: leds: Fix blink to LED on transition
	power: supply: mt6360: add a check of devm_work_autocancel in mt6360_charger_probe
	power: supply: bq27xxx: Fix bq27xxx_battery_update() race condition
	power: supply: bq27xxx: Fix I2C IRQ race on remove
	power: supply: bq27xxx: Fix poll_interval handling and races on remove
	power: supply: bq27xxx: Add cache parameter to bq27xxx_battery_current_and_status()
	power: supply: bq27xxx: Move bq27xxx_battery_update() down
	power: supply: bq27xxx: Ensure power_supply_changed() is called on current sign changes
	power: supply: bq27xxx: After charger plug in/out wait 0.5s for things to stabilize
	power: supply: bq25890: Call power_supply_changed() after updating input current or voltage
	power: supply: bq24190: Call power_supply_changed() after updating input current
	power: supply: sbs-charger: Fix INHIBITED bit for Status reg
	optee: fix uninited async notif value
	firmware: arm_ffa: Check if ffa_driver remove is present before executing
	firmware: arm_ffa: Fix FFA device names for logical partitions
	fs: fix undefined behavior in bit shift for SB_NOUSER
	regulator: pca9450: Fix BUCK2 enable_mask
	platform/x86: ISST: Remove 8 socket limit
	coresight: Fix signedness bug in tmc_etr_buf_insert_barrier_packet()
	ARM: dts: imx6qdl-mba6: Add missing pvcie-supply regulator
	x86/pci/xen: populate MSI sysfs entries
	xen/pvcalls-back: fix double frees with pvcalls_new_active_socket()
	x86/show_trace_log_lvl: Ensure stack pointer is aligned, again
	ASoC: Intel: Skylake: Fix declaration of enum skl_ch_cfg
	ASoC: Intel: avs: Fix declaration of enum avs_channel_config
	ASoC: Intel: avs: Access path components under lock
	cxl: Wait Memory_Info_Valid before access memory related info
	sctp: fix an issue that plpmtu can never go to complete state
	forcedeth: Fix an error handling path in nv_probe()
	platform/mellanox: mlxbf-pmc: fix sscanf() error checking
	net/mlx5e: Fix SQ wake logic in ptp napi_poll context
	net/mlx5e: Fix deadlock in tc route query code
	net/mlx5e: Use correct encap attribute during invalidation
	net/mlx5e: do as little as possible in napi poll when budget is 0
	net/mlx5: DR, Fix crc32 calculation to work on big-endian (BE) CPUs
	net/mlx5: Handle pairing of E-switch via uplink un/load APIs
	net/mlx5: DR, Check force-loopback RC QP capability independently from RoCE
	net/mlx5: Fix error message when failing to allocate device memory
	net/mlx5: Collect command failures data only for known commands
	net/mlx5: Devcom, fix error flow in mlx5_devcom_register_device
	net/mlx5: Devcom, serialize devcom registration
	arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay
	firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors
	regulator: mt6359: add read check for PMIC MT6359
	net/smc: Reset connection when trying to use SMCRv2 fails.
	3c589_cs: Fix an error handling path in tc589_probe()
	net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
	Linux 6.1.31

Change-Id: I1043b7dd190672829baaf093f690e70a07c7a6dd
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-13 20:43:51 +00:00
Greg Kroah-Hartman
26c1cc6858 This is the 6.1.30 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRuPHsACgkQONu9yGCS
 aT6USxAAx2uklTRE3mmIS9qytOjb8Z3gsA8LVaaQ3f25CWNiuverNj0mFyNtI9KX
 84ZBS/G8aHA6z0dtdyMupHznHehQp7pVo0LOeVMz2bR+CjkpRQei2NimG8bGRcFK
 W6c40w99lD9dYpaal3yajs+k+LF3BktmBNc0SynCjjyEy4YA5RbWOhtGX6P4VRqs
 sPXcmmAHsqDPLfqsgsHiBNsiw+dCP7jY1a17rTxz1g49/4zS6BEGtxxpU4UZNbph
 rKrX0sgF8UM15IfdFc0CiOXhAcL7QQfUbucJ/94180gclF4j6QqAMueAr6mLWkFd
 Pj7vLn/KD2wA2dzTBekHZ9SYp31xcXomkzfdLoMMnazfy3RL4sO7WhJks0k0T2En
 3LIlsRZx/C2ztf3SLq2z2Bw/ExaefrydLI9cWJBi7CQ5yUVO15edcv40W4pxoMOL
 xFDZhCksC+JNc74HPYKTmg+SJQsxtYeLrwb6zW43aJByY+rls70crfhdS5fORvmH
 G8qDS2PCNAqpulxyxQtYxiIcRiM4SqPskves+3nu7gBFGfsv2AJU1gNCorIpZuW8
 DS2jrMwPv7gH+eUvqrnrtdA+Vk4TYWslg0mPlVNavX98i9/dC9Vjss3yXCYh7Q6u
 0+BpSBLtKM4pahaMgKpYv/V/r+GKvIt7Npki8o/bs1nuykF04aw=
 =hAQM
 -----END PGP SIGNATURE-----

Merge 6.1.30 into android14-6.1-lts

Changes in 6.1.30
	drm/fbdev-generic: prohibit potential out-of-bounds access
	drm/mipi-dsi: Set the fwnode for mipi_dsi_device
	ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings
	net: skb_partial_csum_set() fix against transport header magic value
	net: mdio: mvusb: Fix an error handling path in mvusb_mdio_probe()
	scsi: ufs: core: Fix I/O hang that occurs when BKOPS fails in W-LUN suspend
	tick/broadcast: Make broadcast device replacement work correctly
	linux/dim: Do nothing if no time delta between samples
	net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
	net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
	net: phy: bcm7xx: Correct read from expansion register
	netfilter: nf_tables: always release netdev hooks from notifier
	netfilter: conntrack: fix possible bug_on with enable_hooks=1
	bonding: fix send_peer_notif overflow
	netlink: annotate accesses to nlk->cb_running
	net: annotate sk->sk_err write from do_recvmmsg()
	net: deal with most data-races in sk_wait_event()
	net: add vlan_get_protocol_and_depth() helper
	tcp: add annotations around sk->sk_shutdown accesses
	gve: Remove the code of clearing PBA bit
	ipvlan:Fix out-of-bounds caused by unclear skb->cb
	net: mscc: ocelot: fix stat counter register values
	net: datagram: fix data-races in datagram_poll()
	af_unix: Fix a data race of sk->sk_receive_queue->qlen.
	af_unix: Fix data races around sk->sk_shutdown.
	drm/i915/guc: Don't capture Gen8 regs on Xe devices
	drm/i915: Fix NULL ptr deref by checking new_crtc_state
	drm/i915/dp: prevent potential div-by-zero
	drm/i915: Expand force_probe to block probe of devices as well.
	drm/i915: taint kernel when force probing unsupported devices
	fbdev: arcfb: Fix error handling in arcfb_probe()
	ext4: reflect error codes from ext4_multi_mount_protect() to its callers
	ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled
	ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set
	ext4: allow ext4_get_group_info() to fail
	refscale: Move shutdown from wait_event() to wait_event_idle()
	selftests: cgroup: Add 'malloc' failures checks in test_memcontrol
	rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
	open: return EINVAL for O_DIRECTORY | O_CREAT
	fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode()
	drm/displayid: add displayid_get_header() and check bounds better
	drm/amd/display: populate subvp cmd info only for the top pipe
	drm/amd/display: Correct DML calculation to align HW formula
	platform/x86: x86-android-tablets: Add Acer Iconia One 7 B1-750 data
	drm/amd/display: Enable HostVM based on rIOMMU active
	drm/amd/display: Use DC_LOG_DC in the trasform pixel function
	regmap: cache: Return error in cache sync operations for REGCACHE_NONE
	remoteproc: imx_dsp_rproc: Add custom memory copy implementation for i.MX DSP Cores
	arm64: dts: qcom: msm8996: Add missing DWC3 quirks
	media: cx23885: Fix a null-ptr-deref bug in buffer_prepare() and buffer_finish()
	media: pci: tw68: Fix null-ptr-deref bug in buf prepare and finish
	media: pvrusb2: VIDEO_PVRUSB2 depends on DVB_CORE to use dvb_* symbols
	ACPI: processor: Check for null return of devm_kzalloc() in fch_misc_setup()
	drm/rockchip: dw_hdmi: cleanup drm encoder during unbind
	memstick: r592: Fix UAF bug in r592_remove due to race condition
	arm64: dts: imx8mq-librem5: Remove dis_u3_susphy_quirk from usb_dwc3_0
	firmware: arm_sdei: Fix sleep from invalid context BUG
	ACPI: EC: Fix oops when removing custom query handlers
	drm/amd/display: fixed dcn30+ underflow issue
	remoteproc: stm32_rproc: Add mutex protection for workqueue
	drm/tegra: Avoid potential 32-bit integer overflow
	drm/msm/dp: Clean up handling of DP AUX interrupts
	ACPICA: Avoid undefined behavior: applying zero offset to null pointer
	ACPICA: ACPICA: check null return of ACPI_ALLOCATE_ZEROED in acpi_db_display_objects
	arm64: dts: qcom: sdm845-polaris: Drop inexistent properties
	irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
	ACPI: video: Remove desktops without backlight DMI quirks
	drm/amd/display: Correct DML calculation to follow HW SPEC
	drm/amd: Fix an out of bounds error in BIOS parser
	drm/amdgpu: Fix sdma v4 sw fini error
	media: Prefer designated initializers over memset for subdev pad ops
	media: mediatek: vcodec: Fix potential array out-of-bounds in decoder queue_setup
	wifi: ath: Silence memcpy run-time false positive warning
	bpf: Annotate data races in bpf_local_storage
	wifi: brcmfmac: pcie: Provide a buffer of random bytes to the device
	wifi: brcmfmac: cfg80211: Pass the PMK in binary instead of hex
	ext2: Check block size validity during mount
	scsi: lpfc: Prevent lpfc_debugfs_lockstat_write() buffer overflow
	scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery
	bnxt: avoid overflow in bnxt_get_nvram_directory()
	net: pasemi: Fix return type of pasemi_mac_start_tx()
	net: Catch invalid index in XPS mapping
	netdev: Enforce index cap in netdev_get_tx_queue
	scsi: target: iscsit: Free cmds before session free
	lib: cpu_rmap: Avoid use after free on rmap->obj array entries
	scsi: message: mptlan: Fix use after free bug in mptlan_remove() due to race condition
	gfs2: Fix inode height consistency check
	scsi: ufs: ufs-pci: Add support for Intel Lunar Lake
	ext4: set goal start correctly in ext4_mb_normalize_request
	ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()
	crypto: jitter - permanent and intermittent health errors
	f2fs: Fix system crash due to lack of free space in LFS
	f2fs: fix to drop all dirty pages during umount() if cp_error is set
	f2fs: fix to check readonly condition correctly
	samples/bpf: Fix fout leak in hbm's run_bpf_prog
	bpf: Add preempt_count_{sub,add} into btf id deny list
	md: fix soft lockup in status_resync
	wifi: iwlwifi: pcie: fix possible NULL pointer dereference
	wifi: iwlwifi: add a new PCI device ID for BZ device
	wifi: iwlwifi: pcie: Fix integer overflow in iwl_write_to_user_buf
	wifi: iwlwifi: mvm: fix ptk_pn memory leak
	block, bfq: Fix division by zero error on zero wsum
	wifi: ath11k: Ignore frags from uninitialized peer in dp.
	wifi: iwlwifi: fix iwl_mvm_max_amsdu_size() for MLO
	null_blk: Always check queue mode setting from configfs
	wifi: iwlwifi: dvm: Fix memcpy: detected field-spanning write backtrace
	wifi: ath11k: Fix SKB corruption in REO destination ring
	nbd: fix incomplete validation of ioctl arg
	ipvs: Update width of source for ip_vs_sync_conn_options
	Bluetooth: btusb: Add new PID/VID 04ca:3801 for MT7663
	Bluetooth: Add new quirk for broken local ext features page 2
	Bluetooth: btrtl: add support for the RTL8723CS
	Bluetooth: Improve support for Actions Semi ATS2851 based devices
	Bluetooth: btrtl: check for NULL in btrtl_set_quirks()
	Bluetooth: btintel: Add LE States quirk support
	Bluetooth: hci_bcm: Fall back to getting bdaddr from EFI if not set
	Bluetooth: Add new quirk for broken set random RPA timeout for ATS2851
	Bluetooth: L2CAP: fix "bad unlock balance" in l2cap_disconnect_rsp
	Bluetooth: btrtl: Add the support for RTL8851B
	staging: rtl8192e: Replace macro RTL_PCI_DEVICE with PCI_DEVICE
	HID: apple: Set the tilde quirk flag on the Geyser 4 and later
	staging: axis-fifo: initialize timeouts in init only
	ASoC: amd: yc: Add DMI entries to support HP OMEN 16-n0xxx (8A42)
	HID: logitech-hidpp: Don't use the USB serial for USB devices
	HID: logitech-hidpp: Reconcile USB and Unifying serials
	spi: spi-imx: fix MX51_ECSPI_* macros when cs > 3
	usb: typec: ucsi: acpi: add quirk for ASUS Zenbook UM325
	ALSA: hda: LNL: add HD Audio PCI ID
	ASoC: amd: Add Dell G15 5525 to quirks list
	ASoC: amd: yc: Add ThinkBook 14 G5+ ARP to quirks list for acp6x
	HID: apple: Set the tilde quirk flag on the Geyser 3
	HID: Ignore battery for ELAN touchscreen on ROG Flow X13 GV301RA
	HID: wacom: generic: Set battery quirk only when we see battery data
	usb: typec: tcpm: fix multiple times discover svids error
	serial: 8250: Reinit port->pm on port specific driver unbind
	mcb-pci: Reallocate memory region to avoid memory overlapping
	sched: Fix KCSAN noinstr violation
	lkdtm/stackleak: Fix noinstr violation
	recordmcount: Fix memory leaks in the uwrite function
	soundwire: dmi-quirks: add remapping for Intel 'Rooks County' NUC M15
	phy: st: miphy28lp: use _poll_timeout functions for waits
	soundwire: qcom: gracefully handle too many ports in DT
	soundwire: bus: Fix unbalanced pm_runtime_put() causing usage count underflow
	mfd: intel_soc_pmic_chtwc: Add Lenovo Yoga Book X90F to intel_cht_wc_models
	mfd: dln2: Fix memory leak in dln2_probe()
	mfd: intel-lpss: Add Intel Meteor Lake PCH-S LPSS PCI IDs
	parisc: Replace regular spinlock with spin_trylock on panic path
	platform/x86: Move existing HP drivers to a new hp subdir
	platform/x86: hp-wmi: add micmute to hp_wmi_keymap struct
	drm/amdgpu: drop gfx_v11_0_cp_ecc_error_irq_funcs
	xfrm: don't check the default policy if the policy allows the packet
	Revert "Fix XFRM-I support for nested ESP tunnels"
	drm/msm/dp: unregister audio driver during unbind
	drm/msm/dpu: Assign missing writeback log_mask
	drm/msm/dpu: Move non-MDP_TOP INTF_INTR offsets out of hwio header
	drm/msm/dpu: Remove duplicate register defines from INTF
	dt-bindings: display/msm: dsi-controller-main: Document qcom, master-dsi and qcom, sync-dual-dsi
	platform: Provide a remove callback that returns no value
	ASoC: fsl_micfil: Fix error handler with pm_runtime_enable
	cpupower: Make TSC read per CPU for Mperf monitor
	xfrm: Reject optional tunnel/BEET mode templates in outbound policies
	af_key: Reject optional tunnel/BEET mode templates in outbound policies
	drm/msm: Fix submit error-path leaks
	selftests: seg6: disable DAD on IPv6 router cfg for srv6_end_dt4_l3vpn_test
	selftets: seg6: disable rp_filter by default in srv6_end_dt4_l3vpn_test
	net: fec: Better handle pm_runtime_get() failing in .remove()
	net: phy: dp83867: add w/a for packet errors seen with short cables
	ALSA: firewire-digi00x: prevent potential use after free
	wifi: mt76: connac: fix stats->tx_bytes calculation
	ALSA: hda/realtek: Apply HP B&O top speaker profile to Pavilion 15
	sfc: disable RXFCS and RXALL features by default
	vsock: avoid to close connected socket after the timeout
	tcp: fix possible sk_priority leak in tcp_v4_send_reset()
	serial: arc_uart: fix of_iomap leak in `arc_serial_probe`
	serial: 8250_bcm7271: balance clk_enable calls
	serial: 8250_bcm7271: fix leak in `brcmuart_probe`
	erspan: get the proto with the md version for collect_md
	net: dsa: rzn1-a5psw: enable management frames for CPU port
	net: dsa: rzn1-a5psw: fix STP states handling
	net: dsa: rzn1-a5psw: disable learning for standalone ports
	net: hns3: fix output information incomplete for dumping tx queue info with debugfs
	net: hns3: fix sending pfc frames after reset issue
	net: hns3: fix reset delay time to avoid configuration timeout
	net: hns3: fix reset timeout when enable full VF
	media: netup_unidvb: fix use-after-free at del_timer()
	SUNRPC: double free xprt_ctxt while still in use
	SUNRPC: always free ctxt when freeing deferred request
	SUNRPC: Fix trace_svc_register() call site
	ASoC: mediatek: mt8186: Fix use-after-free in driver remove path
	ASoC: SOF: topology: Fix logic for copying tuples
	drm/exynos: fix g2d_open/close helper function definitions
	net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment()
	virtio-net: Maintain reverse cleanup order
	virtio_net: Fix error unwinding of XDP initialization
	tipc: add tipc_bearer_min_mtu to calculate min mtu
	tipc: do not update mtu if msg_max is too small in mtu negotiation
	tipc: check the bearer min mtu properly when setting it by netlink
	s390/cio: include subchannels without devices also for evaluation
	can: dev: fix missing CAN XL support in can_put_echo_skb()
	net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop()
	net: bcmgenet: Restore phy_stop() depending upon suspend/close
	ice: introduce clear_reset_state operation
	ice: Fix ice VF reset during iavf initialization
	wifi: cfg80211: Drop entries with invalid BSSIDs in RNR
	wifi: mac80211: fortify the spinlock against deadlock by interrupt
	wifi: mac80211: fix min center freq offset tracing
	wifi: mac80211: Abort running color change when stopping the AP
	wifi: iwlwifi: mvm: fix cancel_delayed_work_sync() deadlock
	wifi: iwlwifi: fw: fix DBGI dump
	wifi: iwlwifi: fix OEM's name in the ppag approved list
	wifi: iwlwifi: mvm: fix OEM's name in the tas approved list
	wifi: iwlwifi: mvm: don't trust firmware n_channels
	scsi: storvsc: Don't pass unused PFNs to Hyper-V host
	net: tun: rebuild error handling in tun_get_user
	tun: Fix memory leak for detached NAPI queue.
	cassini: Fix a memory leak in the error handling path of cas_init_one()
	net: dsa: mv88e6xxx: Fix mv88e6393x EPC write command offset
	igb: fix bit_shift to be in [1..8] range
	vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit()
	net: wwan: iosm: fix NULL pointer dereference when removing device
	net: pcs: xpcs: fix C73 AN not getting enabled
	net: selftests: Fix optstring
	netfilter: nf_tables: fix nft_trans type confusion
	netfilter: nft_set_rbtree: fix null deref on element insertion
	bridge: always declare tunnel functions
	ALSA: usb-audio: Add a sample rate workaround for Line6 Pod Go
	USB: usbtmc: Fix direction for 0-length ioctl control messages
	usb-storage: fix deadlock when a scsi command timeouts more than once
	USB: UHCI: adjust zhaoxin UHCI controllers OverCurrent bit value
	usb: dwc3: gadget: Improve dwc3_gadget_suspend() and dwc3_gadget_resume()
	usb: dwc3: debugfs: Resume dwc3 before accessing registers
	usb: gadget: u_ether: Fix host MAC address case
	usb: typec: altmodes/displayport: fix pin_assignment_show
	Revert "usb: gadget: udc: core: Prevent redundant calls to pullup"
	Revert "usb: gadget: udc: core: Invoke usb_gadget_connect only when started"
	xhci-pci: Only run d3cold avoidance quirk for s2idle
	xhci: Fix incorrect tracking of free space on transfer rings
	ALSA: hda: Fix Oops by 9.1 surround channel names
	ALSA: hda: Add NVIDIA codec IDs a3 through a7 to patch table
	ALSA: hda/realtek: Add quirk for Clevo L140AU
	ALSA: hda/realtek: Add a quirk for HP EliteDesk 805
	ALSA: hda/realtek: Add quirk for 2nd ASUS GU603
	ALSA: hda/realtek: Add quirk for HP EliteBook G10 laptops
	ALSA: hda/realtek: Fix mute and micmute LEDs for yet another HP laptop
	can: j1939: recvmsg(): allow MSG_CMSG_COMPAT flag
	can: isotp: recvmsg(): allow MSG_CMSG_COMPAT flag
	can: kvaser_pciefd: Set CAN_STATE_STOPPED in kvaser_pciefd_stop()
	can: kvaser_pciefd: Call request_irq() before enabling interrupts
	can: kvaser_pciefd: Empty SRB buffer in probe
	can: kvaser_pciefd: Clear listen-only bit if not explicitly requested
	can: kvaser_pciefd: Do not send EFLUSH command on TFD interrupt
	can: kvaser_pciefd: Disable interrupts in probe error path
	wifi: rtw88: use work to update rate to avoid RCU warning
	SMB3: Close all deferred handles of inode in case of handle lease break
	SMB3: drop reference to cfile before sending oplock break
	ksmbd: smb2: Allow messages padded to 8byte boundary
	ksmbd: allocate one more byte for implied bcc[0]
	ksmbd: fix wrong UserName check in session_user
	ksmbd: fix global-out-of-bounds in smb2_find_context_vals
	KVM: Fix vcpu_array[0] races
	statfs: enforce statfs[64] structure initialization
	maple_tree: make maple state reusable after mas_empty_area()
	mm: fix zswap writeback race condition
	serial: Add support for Advantech PCI-1611U card
	serial: 8250_exar: Add support for USR298x PCI Modems
	serial: qcom-geni: fix enabling deactivated interrupt
	thunderbolt: Clear registers properly when auto clear isn't in use
	vc_screen: reload load of struct vc_data pointer in vcs_write() to avoid UAF
	ceph: force updating the msg pointer in non-split case
	drm/amd/pm: fix possible power mode mismatch between driver and PMFW
	drm/amdgpu/gmc11: implement get_vbios_fb_size()
	drm/amdgpu/gfx10: Disable gfxoff before disabling powergating.
	drm/amdgpu/gfx11: Adjust gfxoff before powergating on gfx11 as well
	drm/amdgpu: refine get gpu clock counter method
	drm/amdgpu/gfx11: update gpu_clock_counter logic
	dt-bindings: ata: ahci-ceva: Cover all 4 iommus entries
	powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs
	powerpc/iommu: Incorrect DDW Table is referenced for SR-IOV device
	tpm/tpm_tis: Disable interrupts for more Lenovo devices
	powerpc/64s/radix: Fix soft dirty tracking
	nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode()
	s390/dasd: fix command reject error on ESE devices
	s390/crypto: use vector instructions only if available for ChaCha20
	s390/qdio: fix do_sqbs() inline assembly constraint
	arm64: mte: Do not set PG_mte_tagged if tags were not initialized
	rethook: use preempt_{disable, enable}_notrace in rethook_trampoline_handler
	rethook, fprobe: do not trace rethook related functions
	remoteproc: imx_dsp_rproc: Fix kernel test robot sparse warning
	crypto: testmgr - fix RNG performance in fuzz tests
	drm/amdgpu: declare firmware for new MES 11.0.4
	drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
	drm/amdgpu: reserve the old gc_11_0_*_mes.bin
	Linux 6.1.30

Change-Id: I411885affcf017410aab34bf3fba2dde96df6593
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-12 20:48:03 +00:00
Greg Kroah-Hartman
ef75a88787 Merge 6.1.28 into android14-6.1-lts
Changes in 6.1.28
	ASOC: Intel: sof_sdw: add quirk for Intel 'Rooks County' NUC M15
	ASoC: Intel: soc-acpi: add table for Intel 'Rooks County' NUC M15
	ASoC: soc-pcm: fix hw->formats cleared by soc_pcm_hw_init() for dpcm
	x86/hyperv: Block root partition functionality in a Confidential VM
	ASoC: amd: yc: Add DMI entries to support Victus by HP Laptop 16-e1xxx (8A22)
	iio: adc: palmas_gpadc: fix NULL dereference on rmmod
	ASoC: Intel: bytcr_rt5640: Add quirk for the Acer Iconia One 7 B1-750
	ASoC: da7213.c: add missing pm_runtime_disable()
	net: wwan: t7xx: do not compile with -Werror
	selftests mount: Fix mount_setattr_test builds failed
	scsi: mpi3mr: Handle soft reset in progress fault code (0xF002)
	net: sfp: add quirk enabling 2500Base-x for HG MXPD-483II
	platform/x86: thinkpad_acpi: Add missing T14s Gen1 type to s2idle quirk list
	wifi: ath11k: reduce the MHI timeout to 20s
	tracing: Error if a trace event has an array for a __field()
	asm-generic/io.h: suppress endianness warnings for readq() and writeq()
	x86/cpu: Add model number for Intel Arrow Lake processor
	wireguard: timers: cast enum limits members to int in prints
	wifi: mt76: mt7921e: Set memory space enable in PCI_COMMAND if unset
	ASoC: amd: fix ACP version typo mistake
	ASoC: amd: ps: update the acp clock source.
	arm64: Always load shadow stack pointer directly from the task struct
	arm64: Stash shadow stack pointer in the task struct on interrupt
	powerpc/boot: Fix boot wrapper code generation with CONFIG_POWER10_CPU
	PCI: kirin: Select REGMAP_MMIO
	PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock
	PCI: qcom: Fix the incorrect register usage in v2.7.0 config
	phy: qcom-qmp-pcie: sc8180x PCIe PHY has 2 lanes
	IMA: allow/fix UML builds
	usb: gadget: udc: core: Invoke usb_gadget_connect only when started
	usb: gadget: udc: core: Prevent redundant calls to pullup
	usb: dwc3: gadget: Stall and restart EP0 if host is unresponsive
	USB: dwc3: fix runtime pm imbalance on probe errors
	USB: dwc3: fix runtime pm imbalance on unbind
	hwmon: (k10temp) Check range scale when CUR_TEMP register is read-write
	hwmon: (adt7475) Use device_property APIs when configuring polarity
	tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site
	posix-cpu-timers: Implement the missing timer_wait_running callback
	media: ov8856: Do not check for for module version
	blk-stat: fix QUEUE_FLAG_STATS clear
	blk-crypto: don't use struct request_queue for public interfaces
	blk-crypto: add a blk_crypto_config_supported_natively helper
	blk-crypto: move internal only declarations to blk-crypto-internal.h
	blk-crypto: Add a missing include directive
	blk-mq: release crypto keyslot before reporting I/O complete
	blk-crypto: make blk_crypto_evict_key() return void
	blk-crypto: make blk_crypto_evict_key() more robust
	staging: iio: resolver: ads1210: fix config mode
	tty: Prevent writing chars during tcsetattr TCSADRAIN/FLUSH
	xhci: fix debugfs register accesses while suspended
	serial: fix TIOCSRS485 locking
	serial: 8250: Fix serial8250_tx_empty() race with DMA Tx
	serial: max310x: fix IO data corruption in batched operations
	tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
	fs: fix sysctls.c built
	MIPS: fw: Allow firmware to pass a empty env
	ipmi:ssif: Add send_retries increment
	ipmi: fix SSIF not responding under certain cond.
	iio: addac: stx104: Fix race condition when converting analog-to-digital
	iio: addac: stx104: Fix race condition for stx104_write_raw()
	kheaders: Use array declaration instead of char
	wifi: mt76: add missing locking to protect against concurrent rx/status calls
	pwm: meson: Fix axg ao mux parents
	pwm: meson: Fix g12a ao clk81 name
	soundwire: qcom: correct setting ignore bit on v1.5.1
	pinctrl: qcom: lpass-lpi: set output value before enabling output
	ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
	ring-buffer: Sync IRQ works before buffer destruction
	crypto: api - Demote BUG_ON() in crypto_unregister_alg() to a WARN_ON()
	crypto: safexcel - Cleanup ring IRQ workqueues on load failure
	crypto: arm64/aes-neonbs - fix crash with CFI enabled
	crypto: ccp - Don't initialize CCP for PSP 0x1649
	rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
	reiserfs: Add security prefix to xattr name in reiserfs_security_write()
	KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted
	KVM: arm64: Avoid vcpu->mutex v. kvm->lock inversion in CPU_ON
	KVM: arm64: Avoid lock inversion when setting the VM register width
	KVM: arm64: Use config_lock to protect data ordered against KVM_RUN
	KVM: arm64: Use config_lock to protect vgic state
	KVM: arm64: vgic: Don't acquire its_lock before config_lock
	relayfs: fix out-of-bounds access in relay_file_read
	drm/amd/display: Remove stutter only configurations
	drm/amd/display: limit timing for single dimm memory
	drm/amd/display: fix PSR-SU/DSC interoperability support
	drm/amd/display: fix a divided-by-zero error
	KVM: RISC-V: Retry fault if vma_lookup() results become invalid
	ksmbd: fix racy issue under cocurrent smb2 tree disconnect
	ksmbd: call rcu_barrier() in ksmbd_server_exit()
	ksmbd: fix NULL pointer dereference in smb2_get_info_filesystem()
	ksmbd: fix memleak in session setup
	ksmbd: not allow guest user on multichannel
	ksmbd: fix deadlock in ksmbd_find_crypto_ctx()
	ACPI: video: Remove acpi_backlight=video quirk for Lenovo ThinkPad W530
	i2c: omap: Fix standard mode false ACK readings
	riscv: mm: remove redundant parameter of create_fdt_early_page_table
	tracing: Fix permissions for the buffer_percent file
	swsmu/amdgpu_smu: Fix the wrong if-condition
	drm/amd/pm: re-enable the gfx imu when smu resume
	iommu/amd: Fix "Guest Virtual APIC Table Root Pointer" configuration in IRTE
	RISC-V: Align SBI probe implementation with spec
	Revert "ubifs: dirty_cow_znode: Fix memleak in error handling path"
	ubifs: Fix memleak when insert_old_idx() failed
	ubi: Fix return value overwrite issue in try_write_vid_and_data()
	ubifs: Free memory for tmpfile name
	ubifs: Fix memory leak in do_rename
	ceph: fix potential use-after-free bug when trimming caps
	xfs: don't consider future format versions valid
	cxl/hdm: Fail upon detecting 0-sized decoders
	bus: mhi: host: Remove duplicate ee check for syserr
	bus: mhi: host: Use mhi_tryset_pm_state() for setting fw error state
	bus: mhi: host: Range check CHDBOFF and ERDBOFF
	ASoC: dt-bindings: qcom,lpass-rx-macro: correct minItems for clocks
	kunit: improve KTAP compliance of KUnit test output
	kunit: fix bug in the order of lines in debugfs logs
	rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
	selftests/resctrl: Return NULL if malloc_and_init_memory() did not alloc mem
	selftests/resctrl: Move ->setup() call outside of test specific branches
	selftests/resctrl: Allow ->setup() to return errors
	selftests/resctrl: Check for return value after write_schemata()
	selinux: fix Makefile dependencies of flask.h
	selinux: ensure av_permissions.h is built when needed
	tpm, tpm_tis: Do not skip reset of original interrupt vector
	tpm, tpm_tis: Claim locality before writing TPM_INT_ENABLE register
	tpm, tpm_tis: Disable interrupts if tpm_tis_probe_irq() failed
	tpm, tpm_tis: Claim locality before writing interrupt registers
	tpm, tpm: Implement usage counter for locality
	tpm, tpm_tis: Claim locality when interrupts are reenabled on resume
	erofs: stop parsing non-compact HEAD index if clusterofs is invalid
	erofs: initialize packed inode after root inode is assigned
	erofs: fix potential overflow calculating xattr_isize
	drm/rockchip: Drop unbalanced obj unref
	drm/i915/dg2: Drop one PCI ID
	drm/vgem: add missing mutex_destroy
	drm/probe-helper: Cancel previous job before starting new one
	drm/amdgpu: register a vga_switcheroo client for MacBooks with apple-gmux
	tools/x86/kcpuid: Fix avx512bw and avx512lvl fields in Fn00000007
	soc: ti: pm33xx: Fix refcount leak in am33xx_pm_probe
	arm64: dts: renesas: r8a77990: Remove bogus voltages from OPP table
	arm64: dts: renesas: r8a774c0: Remove bogus voltages from OPP table
	arm64: dts: renesas: r9a07g044: Update IRQ numbers for SSI channels
	arm64: dts: renesas: r9a07g054: Update IRQ numbers for SSI channels
	arm64: dts: renesas: r9a07g043: Introduce SOC_PERIPHERAL_IRQ() macro to specify interrupt property
	arm64: dts: renesas: r9a07g043: Update IRQ numbers for SSI channels
	drm/mediatek: dp: Only trigger DRM HPD events if bridge is attached
	drm/msm/disp/dpu: check for crtc enable rather than crtc active to release shared resources
	EDAC/skx: Fix overflows on the DRAM row address mapping arrays
	ARM: dts: qcom-apq8064: Fix opp table child name
	regulator: core: Shorten off-on-delay-us for always-on/boot-on by time since booted
	arm64: dts: ti: k3-am62-main: Fix GPIO numbers in DT
	arm64: dts: ti: k3-am62a7-sk: Fix DDR size to full 4GB
	arm64: dts: ti: k3-j721e-main: Remove ti,strobe-sel property
	arm64: dts: broadcom: bcmbca: bcm4908: fix NAND interrupt name
	arm64: dts: broadcom: bcmbca: bcm4908: fix LED nodenames
	arm64: dts: broadcom: bcmbca: bcm4908: fix procmon nodename
	arm64: dts: qcom: msm8998: Fix stm-stimulus-base reg name
	arm64: dts: qcom: sc7280: fix EUD port properties
	arm64: dts: qcom: sdm845: correct dynamic power coefficients
	arm64: dts: qcom: sdm845: Fix the PCI I/O port range
	arm64: dts: qcom: msm8998: Fix the PCI I/O port range
	arm64: dts: qcom: sc7280: Fix the PCI I/O port range
	arm64: dts: qcom: ipq8074: Fix the PCI I/O port range
	arm64: dts: qcom: ipq6018: Fix the PCI I/O port range
	arm64: dts: qcom: msm8996: Fix the PCI I/O port range
	arm64: dts: qcom: sm8250: Fix the PCI I/O port range
	arm64: dts: qcom: sm8150: Fix the PCI I/O port range
	arm64: dts: qcom: sm8450: Fix the PCI I/O port range
	ARM: dts: qcom: ipq4019: Fix the PCI I/O port range
	ARM: dts: qcom: ipq8064: Fix the PCI I/O port range
	ARM: dts: qcom: sdx55: Fix the unit address of PCIe EP node
	x86/MCE/AMD: Use an u64 for bank_map
	media: bdisp: Add missing check for create_workqueue
	media: platform: mtk-mdp3: Add missing check and free for ida_alloc
	media: amphion: decoder implement display delay enable
	media: av7110: prevent underflow in write_ts_to_decoder()
	firmware: qcom_scm: Clear download bit during reboot
	drm/bridge: adv7533: Fix adv7533_mode_valid for adv7533 and adv7535
	media: max9286: Free control handler
	arm64: dts: ti: k3-am625: Correct L2 cache size to 512KB
	arm64: dts: ti: k3-am62a7: Correct L2 cache size to 512KB
	drm/msm/adreno: drop bogus pm_runtime_set_active()
	drm: msm: adreno: Disable preemption on Adreno 510
	virt/coco/sev-guest: Double-buffer messages
	arm64: dts: qcom: sm8350-microsoft-surface: fix USB dual-role mode property
	drm/amd/display/dc/dce60/Makefile: Fix previous attempt to silence known override-init warnings
	ACPI: processor: Fix evaluating _PDC method when running as Xen dom0
	mmc: sdhci-of-esdhc: fix quirk to ignore command inhibit for data
	arm64: dts: qcom: sm8450: fix pcie1 gpios properties name
	drm: rcar-du: Fix a NULL vs IS_ERR() bug
	ARM: dts: gta04: fix excess dma channel usage
	firmware: arm_scmi: Fix xfers allocation on Rx channel
	perf/arm-cmn: Move overlapping wp_combine field
	ARM: dts: stm32: fix spi1 pin assignment on stm32mp15
	arm64: dts: apple: t8103: Disable unused PCIe ports
	cpufreq: mediatek: fix passing zero to 'PTR_ERR'
	cpufreq: mediatek: fix KP caused by handler usage after regulator_put/clk_put
	cpufreq: mediatek: raise proc/sram max voltage for MT8516
	cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623
	cpufreq: qcom-cpufreq-hw: Revert adding cpufreq qos
	arm64: dts: mediatek: mt8192-asurada: Fix voltage constraint for Vgpu
	ACPI: VIOT: Initialize the correct IOMMU fwspec
	drm/lima/lima_drv: Add missing unwind goto in lima_pdev_probe()
	drm/mediatek: dp: Change the aux retries times when receiving AUX_DEFER
	mailbox: mpfs: switch to txdone_poll
	soc: bcm: brcmstb: biuctrl: fix of_iomap leak
	soc: renesas: renesas-soc: Release 'chipid' from ioremap()
	gpu: host1x: Fix potential double free if IOMMU is disabled
	gpu: host1x: Fix memory leak of device names
	arm64: dts: qcom: sc7280-herobrine-villager: correct trackpad supply
	arm64: dts: qcom: sc7180-trogdor-lazor: correct trackpad supply
	arm64: dts: qcom: sc7180-trogdor-pazquel: correct trackpad supply
	arm64: dts: qcom: msm8994-kitakami: drop unit address from PMI8994 regulator
	arm64: dts: qcom: msm8994-msft-lumia-octagon: drop unit address from PMI8994 regulator
	arm64: dts: qcom: apq8096-db820c: drop unit address from PMI8994 regulator
	drm/ttm: optimize pool allocations a bit v2
	drm/ttm/pool: Fix ttm_pool_alloc error path
	regulator: core: Consistently set mutex_owner when using ww_mutex_lock_slow()
	regulator: core: Avoid lockdep reports when resolving supplies
	x86/apic: Fix atomic update of offset in reserve_eilvt_offset()
	arm64: dts: qcom: msm8994-angler: Fix cont_splash_mem mapping
	arm64: dts: qcom: msm8994-angler: removed clash with smem_region
	arm64: dts: sc7180: Rename qspi data12 as data23
	arm64: dts: sc7280: Rename qspi data12 as data23
	media: mediatek: vcodec: Use 4K frame size when supported by stateful decoder
	media: mediatek: vcodec: Make MM21 the default capture format
	media: mediatek: vcodec: Force capture queue format to MM21
	media: mediatek: vcodec: add params to record lat and core lat_buf count
	media: mediatek: vcodec: using each instance lat_buf count replace core ready list
	media: mediatek: vcodec: move lat_buf to the top of core list
	media: mediatek: vcodec: add core decode done event
	media: mediatek: vcodec: remove unused lat_buf
	media: mediatek: vcodec: making sure queue_work successfully
	media: mediatek: vcodec: change lat thread decode error condition
	media: cedrus: fix use after free bug in cedrus_remove due to race condition
	media: rkvdec: fix use after free bug in rkvdec_remove
	platform/x86/amd/pmf: Move out of BIOS SMN pair for driver probe
	platform/x86/amd: pmc: Don't try to read SMU version on Picasso
	platform/x86/amd: pmc: Hide SMU version and program attributes for Picasso
	platform/x86/amd: pmc: Don't dump data after resume from s0i3 on picasso
	platform/x86/amd: pmc: Move idlemask check into `amd_pmc_idlemask_read`
	platform/x86/amd: pmc: Utilize SMN index 0 for driver probe
	platform/x86/amd: pmc: Move out of BIOS SMN pair for STB init
	media: dm1105: Fix use after free bug in dm1105_remove due to race condition
	media: saa7134: fix use after free bug in saa7134_finidev due to race condition
	media: platform: mtk-mdp3: fix potential frame size overflow in mdp_try_fmt_mplane()
	media: rcar_fdp1: Fix refcount leak in probe and remove function
	media: v4l: async: Return async sub-devices to subnotifier list
	media: hi846: Fix memleak in hi846_init_controls()
	drm/amd/display: Fix potential null dereference
	media: rc: gpio-ir-recv: Fix support for wake-up
	media: venus: dec: Fix handling of the start cmd
	media: venus: dec: Fix capture formats enumeration order
	regulator: stm32-pwr: fix of_iomap leak
	x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()
	arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
	perf/arm-cmn: Fix port detection for CMN-700
	media: mediatek: vcodec: fix decoder disable pm crash
	media: mediatek: vcodec: add remove function for decoder platform driver
	debugobject: Prevent init race with static objects
	drm/i915: Make intel_get_crtc_new_encoder() less oopsy
	tick/common: Align tick period with the HZ tick.
	ACPI: bus: Ensure that notify handlers are not running after removal
	cpufreq: use correct unit when verify cur freq
	rpmsg: glink: Propagate TX failures in intentless mode as well
	hwmon: (pmbus/fsp-3y) Fix functionality bitmask in FSP-3Y YM-2151E
	platform/chrome: cros_typec_switch: Add missing fwnode_handle_put()
	wifi: ath6kl: minor fix for allocation size
	wifi: ath9k: hif_usb: fix memory leak of remain_skbs
	wifi: ath11k: Use platform_get_irq() to get the interrupt
	wifi: ath5k: Use platform_get_irq() to get the interrupt
	wifi: ath5k: fix an off by one check in ath5k_eeprom_read_freq_list()
	wifi: ath11k: fix SAC bug on peer addition with sta band migration
	wifi: brcmfmac: support CQM RSSI notification with older firmware
	wifi: ath6kl: reduce WARN to dev_dbg() in callback
	tools: bpftool: Remove invalid \' json escape
	wifi: rtw88: mac: Return the original error from rtw_pwr_seq_parser()
	wifi: rtw88: mac: Return the original error from rtw_mac_power_switch()
	bpf: take into account liveness when propagating precision
	bpf: fix precision propagation verbose logging
	crypto: qat - fix concurrency issue when device state changes
	scm: fix MSG_CTRUNC setting condition for SO_PASSSEC
	wifi: ath11k: fix deinitialization of firmware resources
	selftests/bpf: Fix a fd leak in an error path in network_helpers.c
	bpf: Remove misleading spec_v1 check on var-offset stack read
	net: pcs: xpcs: remove double-read of link state when using AN
	vlan: partially enable SIOCSHWTSTAMP in container
	net/packet: annotate accesses to po->xmit
	net/packet: convert po->origdev to an atomic flag
	net/packet: convert po->auxdata to an atomic flag
	libbpf: Fix ld_imm64 copy logic for ksym in light skeleton.
	net: dsa: qca8k: remove assignment of an_enabled in pcs_get_state()
	netfilter: keep conntrack reference until IPsecv6 policy checks are done
	bpf: Fix __reg_bound_offset 64->32 var_off subreg propagation
	scsi: target: core: Change the way target_xcopy_do_work() sets restiction on max I/O
	scsi: target: Move sess cmd counter to new struct
	scsi: target: Move cmd counter allocation
	scsi: target: Pass in cmd counter to use during cmd setup
	scsi: target: iscsit: isert: Alloc per conn cmd counter
	scsi: target: iscsit: Stop/wait on cmds during conn close
	scsi: target: Fix multiple LUN_RESET handling
	scsi: target: iscsit: Fix TAS handling during conn cleanup
	scsi: megaraid: Fix mega_cmd_done() CMDID_INT_CMDS
	net: sunhme: Fix uninitialized return code
	f2fs: handle dqget error in f2fs_transfer_project_quota()
	f2fs: fix uninitialized skipped_gc_rwsem
	f2fs: apply zone capacity to all zone type
	f2fs: compress: fix to call f2fs_wait_on_page_writeback() in f2fs_write_raw_pages()
	f2fs: fix scheduling while atomic in decompression path
	crypto: caam - Clear some memory in instantiate_rng
	crypto: sa2ul - Select CRYPTO_DES
	wifi: rtlwifi: fix incorrect error codes in rtl_debugfs_set_write_rfreg()
	wifi: rtlwifi: fix incorrect error codes in rtl_debugfs_set_write_reg()
	scsi: libsas: Add sas_ata_device_link_abort()
	scsi: hisi_sas: Handle NCQ error when IPTT is valid
	wifi: rt2x00: Fix memory leak when handling surveys
	f2fs: fix iostat lock protection
	net: qrtr: correct types of trace event parameters
	selftests: xsk: Use correct UMEM size in testapp_invalid_desc
	selftests: xsk: Disable IPv6 on VETH1
	selftests: xsk: Deflakify STATS_RX_DROPPED test
	selftests/bpf: Wait for receive in cg_storage_multi test
	bpftool: Fix bug for long instructions in program CFG dumps
	crypto: drbg - Only fail when jent is unavailable in FIPS mode
	xsk: Fix unaligned descriptor validation
	f2fs: fix to avoid use-after-free for cached IPU bio
	wifi: iwlwifi: fix duplicate entry in iwl_dev_info_table
	bpf/btf: Fix is_int_ptr()
	scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup()
	net: ethernet: stmmac: dwmac-rk: rework optional clock handling
	net: ethernet: stmmac: dwmac-rk: fix optional phy regulator handling
	wifi: ath11k: fix writing to unintended memory region
	bpf, sockmap: fix deadlocks in the sockhash and sockmap
	nvmet: fix error handling in nvmet_execute_identify_cns_cs_ns()
	nvmet: fix Identify Namespace handling
	nvmet: fix Identify Controller handling
	nvmet: fix Identify Active Namespace ID list handling
	nvmet: fix I/O Command Set specific Identify Controller
	nvme: fix async event trace event
	nvme-fcloop: fix "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage"
	selftests/bpf: Use read_perf_max_sample_freq() in perf_event_stackmap
	selftests/bpf: Fix leaked bpf_link in get_stackid_cannot_attach
	blk-mq: don't plug for head insertions in blk_execute_rq_nowait
	wifi: iwlwifi: debug: fix crash in __iwl_err()
	wifi: iwlwifi: trans: don't trigger d3 interrupt twice
	wifi: iwlwifi: mvm: don't set CHECKSUM_COMPLETE for unsupported protocols
	bpf, sockmap: Revert buggy deadlock fix in the sockhash and sockmap
	f2fs: fix to check return value of f2fs_do_truncate_blocks()
	f2fs: fix to check return value of inc_valid_block_count()
	md/raid10: fix task hung in raid10d
	md/raid10: fix leak of 'r10bio->remaining' for recovery
	md/raid10: fix memleak for 'conf->bio_split'
	md/raid10: fix memleak of md thread
	md/raid10: don't call bio_start_io_acct twice for bio which experienced read error
	wifi: iwlwifi: mvm: don't drop unencrypted MCAST frames
	wifi: iwlwifi: yoyo: skip dump correctly on hw error
	wifi: iwlwifi: yoyo: Fix possible division by zero
	wifi: iwlwifi: mvm: initialize seq variable
	wifi: iwlwifi: fw: move memset before early return
	jdb2: Don't refuse invalidation of already invalidated buffers
	io_uring/rsrc: use nospec'ed indexes
	wifi: iwlwifi: make the loop for card preparation effective
	wifi: mt76: mt7915: expose device tree match table
	wifi: mt76: handle failure of vzalloc in mt7615_coredump_work
	wifi: mt76: add flexible polling wait-interval support
	wifi: mt76: mt7921e: fix probe timeout after reboot
	wifi: mt76: fix 6GHz high channel not be scanned
	mt76: mt7921: fix kernel panic by accessing unallocated eeprom.data
	wifi: mt76: mt7921: fix missing unwind goto in `mt7921u_probe`
	wifi: mt76: mt7921e: improve reliability of dma reset
	wifi: mt76: mt7921e: stop chip reset worker in unregister hook
	wifi: mt76: connac: fix txd multicast rate setting
	wifi: iwlwifi: mvm: check firmware response size
	netfilter: conntrack: restore IPS_CONFIRMED out of nf_conntrack_hash_check_insert()
	netfilter: conntrack: fix wrong ct->timeout value
	wifi: iwlwifi: fw: fix memory leak in debugfs
	ixgbe: Allow flow hash to be set via ethtool
	ixgbe: Enable setting RSS table to default values
	net/mlx5e: Don't clone flow post action attributes second time
	net/mlx5: E-switch, Create per vport table based on devlink encap mode
	net/mlx5: E-switch, Don't destroy indirect table in split rule
	net/mlx5e: Fix error flow in representor failing to add vport rx rule
	net/mlx5: Remove "recovery" arg from mlx5_load_one() function
	net/mlx5: Suspend auxiliary devices only in case of PCI device suspend
	Revert "net/mlx5: Remove "recovery" arg from mlx5_load_one() function"
	net/mlx5: Use recovery timeout on sync reset flow
	net/mlx5e: Nullify table pointer when failing to create
	net: stmmac:fix system hang when setting up tag_8021q VLAN for DSA ports
	bpf: Fix race between btf_put and btf_idr walk.
	bpf: Don't EFAULT for getsockopt with optval=NULL
	netfilter: nf_tables: don't write table validation state without mutex
	net: dpaa: Fix uninitialized variable in dpaa_stop()
	net/sched: sch_fq: fix integer overflow of "credit"
	ipv4: Fix potential uninit variable access bug in __ip_make_skb()
	Revert "Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work"
	netlink: Use copy_to_user() for optval in netlink_getsockopt().
	net: amd: Fix link leak when verifying config failed
	tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
	ipmi: ASPEED_BT_IPMI_BMC: select REGMAP_MMIO instead of depending on it
	ASoC: cs35l41: Only disable internal boost
	drivers: staging: rtl8723bs: Fix locking in _rtw_join_timeout_handler()
	drivers: staging: rtl8723bs: Fix locking in rtw_scan_timeout_handler()
	pstore: Revert pmsg_lock back to a normal mutex
	usb: host: xhci-rcar: remove leftover quirk handling
	usb: dwc3: gadget: Change condition for processing suspend event
	serial: stm32: Re-assert RTS/DE GPIO in RS485 mode only if more data are transmitted
	fpga: bridge: fix kernel-doc parameter description
	iio: light: max44009: add missing OF device matching
	serial: 8250_bcm7271: Fix arbitration handling
	spi: atmel-quadspi: Don't leak clk enable count in pm resume
	spi: atmel-quadspi: Free resources even if runtime resume failed in .remove()
	spi: imx: Don't skip cleanup in remove's error path
	usb: gadget: udc: renesas_usb3: Fix use after free bug in renesas_usb3_remove due to race condition
	ASoC: soc-compress: Inherit atomicity from DAI link for Compress FE
	PCI: imx6: Install the fault handler only on compatible match
	ASoC: es8316: Handle optional IRQ assignment
	linux/vt_buffer.h: allow either builtin or modular for macros
	spi: qup: Don't skip cleanup in remove's error path
	interconnect: qcom: rpm: drop bogus pm domain attach
	spi: fsl-spi: Fix CPM/QE mode Litte Endian
	vmci_host: fix a race condition in vmci_host_poll() causing GPF
	of: Fix modalias string generation
	PCI/EDR: Clear Device Status after EDR error recovery
	ia64: mm/contig: fix section mismatch warning/error
	ia64: salinfo: placate defined-but-not-used warning
	scripts/gdb: bail early if there are no clocks
	scripts/gdb: bail early if there are no generic PD
	HID: amd_sfh: Correct the structure fields
	HID: amd_sfh: Correct the sensor enable and disable command
	HID: amd_sfh: Fix illuminance value
	HID: amd_sfh: Add support for shutdown operation
	HID: amd_sfh: Correct the stop all command
	HID: amd_sfh: Increase sensor command timeout for SFH1.1
	HID: amd_sfh: Handle "no sensors" enabled for SFH1.1
	cacheinfo: Check sib_leaf in cache_leaves_are_shared()
	coresight: etm_pmu: Set the module field
	drm/panel: novatek-nt35950: Improve error handling
	ASoC: fsl_mqs: move of_node_put() to the correct location
	PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
	drm/panel: novatek-nt35950: Only unregister DSI1 if it exists
	spi: cadence-quadspi: fix suspend-resume implementations
	i2c: cadence: cdns_i2c_master_xfer(): Fix runtime PM leak on error path
	i2c: xiic: xiic_xfer(): Fix runtime PM leak on error path
	scripts/gdb: raise error with reduced debugging information
	uapi/linux/const.h: prefer ISO-friendly __typeof__
	sh: sq: Fix incorrect element size for allocating bitmap buffer
	usb: gadget: tegra-xudc: Fix crash in vbus_draw
	usb: chipidea: fix missing goto in `ci_hdrc_probe`
	usb: mtu3: fix kernel panic at qmu transfer done irq handler
	firmware: stratix10-svc: Fix an NULL vs IS_ERR() bug in probe
	tty: serial: fsl_lpuart: adjust buffer length to the intended size
	serial: 8250: Add missing wakeup event reporting
	spi: cadence-quadspi: use macro DEFINE_SIMPLE_DEV_PM_OPS
	staging: rtl8192e: Fix W_DISABLE# does not work after stop/start
	spmi: Add a check for remove callback when removing a SPMI driver
	virtio_ring: don't update event idx on get_buf
	fbdev: mmp: Fix deferred clk handling in mmphw_probe()
	selftests/powerpc/pmu: Fix sample field check in the mmcra_thresh_marked_sample_test
	macintosh/windfarm_smu_sat: Add missing of_node_put()
	powerpc/perf: Properly detect mpc7450 family
	powerpc/mpc512x: fix resource printk format warning
	powerpc/wii: fix resource printk format warnings
	powerpc/sysdev/tsi108: fix resource printk format warnings
	macintosh: via-pmu-led: requires ATA to be set
	powerpc/rtas: use memmove for potentially overlapping buffer copy
	sched/fair: Fix inaccurate tally of ttwu_move_affine
	perf/core: Fix hardlockup failure caused by perf throttle
	Revert "objtool: Support addition to set CFA base"
	riscv: Fix ptdump when KASAN is enabled
	sched/rt: Fix bad task migration for rt tasks
	tracing/user_events: Ensure write index cannot be negative
	clk: at91: clk-sam9x60-pll: fix return value check
	IB/hifi1: add a null check of kzalloc_node in hfi1_ipoib_txreq_init
	RDMA/siw: Fix potential page_array out of range access
	clk: mediatek: mt2712: Add error handling to clk_mt2712_apmixed_probe()
	clk: mediatek: Consistently use GATE_MTK() macro
	clk: mediatek: mt7622: Properly use CLK_IS_CRITICAL flag
	clk: mediatek: mt8135: Properly use CLK_IS_CRITICAL flag
	RDMA/rdmavt: Delete unnecessary NULL check
	clk: qcom: gcc-qcm2290: Fix up gcc_sdcc2_apps_clk_src
	workqueue: Fix hung time report of worker pools
	rtc: omap: include header for omap_rtc_power_off_program prototype
	RDMA/mlx4: Prevent shift wrapping in set_user_sq_size()
	rtc: meson-vrtc: Use ktime_get_real_ts64() to get the current time
	rtc: k3: handle errors while enabling wake irq
	RDMA/erdma: Use fixed hardware page size
	fs/ntfs3: Fix memory leak if ntfs_read_mft failed
	fs/ntfs3: Add check for kmemdup
	fs/ntfs3: Fix OOB read in indx_insert_into_buffer
	fs/ntfs3: Fix slab-out-of-bounds read in hdr_delete_de()
	iommu/mediatek: Set dma_mask for PGTABLE_PA_35_EN
	power: supply: generic-adc-battery: fix unit scaling
	clk: add missing of_node_put() in "assigned-clocks" property parsing
	RDMA/siw: Remove namespace check from siw_netdev_event()
	clk: qcom: gcc-sm6115: Mark RCGs shared where applicable
	power: supply: rk817: Fix low SOC bugs
	RDMA/cm: Trace icm_send_rej event before the cm state is reset
	RDMA/srpt: Add a check for valid 'mad_agent' pointer
	IB/hfi1: Fix SDMA mmu_rb_node not being evicted in LRU order
	IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
	clk: imx: fracn-gppll: fix the rate table
	clk: imx: fracn-gppll: disable hardware select control
	clk: imx: imx8ulp: Fix XBAR_DIVBUS and AD_SLOW clock parents
	NFSv4.1: Always send a RECLAIM_COMPLETE after establishing lease
	iommu/amd: Set page size bitmap during V2 domain allocation
	clk: qcom: lpasscc-sc7280: Skip qdsp6ss clock registration
	clk: qcom: lpassaudiocc-sc7280: Add required gdsc power domain clks in lpass_cc_sc7280_desc
	clk: qcom: gcc-sm8350: fix PCIe PIPE clocks handling
	clk: qcom: dispcc-qcm2290: get rid of test clock
	clk: qcom: dispcc-qcm2290: Remove inexistent DSI1PHY clk
	Input: raspberrypi-ts - fix refcount leak in rpi_ts_probe
	swiotlb: relocate PageHighMem test away from rmem_swiotlb_setup
	swiotlb: fix debugfs reporting of reserved memory pools
	RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
	RDMA/mlx5: Fix flow counter query via DEVX
	SUNRPC: remove the maximum number of retries in call_bind_status
	RDMA/mlx5: Use correct device num_ports when modify DC
	clocksource/drivers/davinci: Fix memory leak in davinci_timer_register when init fails
	openrisc: Properly store r31 to pt_regs on unhandled exceptions
	timekeeping: Fix references to nonexistent ktime_get_fast_ns()
	SMB3: Add missing locks to protect deferred close file list
	SMB3: Close deferred file handles in case of handle lease break
	ext4: fix i_disksize exceeding i_size problem in paritally written case
	ext4: fix use-after-free read in ext4_find_extent for bigalloc + inline
	pinctrl: renesas: r8a779a0: Remove incorrect AVB[01] pinmux configuration
	pinctrl: renesas: r8a779f0: Fix tsn1_avtp_pps pin group
	pinctrl: renesas: r8a779g0: Fix Group 4/5 pin functions
	pinctrl: renesas: r8a779g0: Fix Group 6/7 pin functions
	pinctrl: renesas: r8a779g0: Fix ERROROUTC function names
	leds: TI_LMU_COMMON: select REGMAP instead of depending on it
	pinctrl: ralink: reintroduce ralink,rt2880-pinmux compatible string
	dmaengine: mv_xor_v2: Fix an error code.
	leds: tca6507: Fix error handling of using fwnode_property_read_string
	pwm: mtk-disp: Disable shadow registers before setting backlight values
	pwm: mtk-disp: Configure double buffering before reading in .get_state()
	soundwire: cadence: rename sdw_cdns_dai_dma_data as sdw_cdns_dai_runtime
	soundwire: intel: don't save hw_params for use in prepare
	phy: tegra: xusb: Add missing tegra_xusb_port_unregister for usb2_port and ulpi_port
	phy: ti: j721e-wiz: Fix unreachable code in wiz_mode_select()
	dma: gpi: remove spurious unlock in gpi_ch_init
	dmaengine: dw-edma: Fix to change for continuous transfer
	dmaengine: dw-edma: Fix to enable to issue dma request on DMA processing
	dmaengine: at_xdmac: do not enable all cyclic channels
	pinctrl-bcm2835.c: fix race condition when setting gpio dir
	thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe
	mfd: tqmx86: Do not access I2C_DETECT register through io_base
	mfd: tqmx86: Specify IO port register range more precisely
	mfd: tqmx86: Correct board names for TQMxE39x
	mfd: ocelot-spi: Fix unsupported bulk read
	mfd: arizona-spi: Add missing MODULE_DEVICE_TABLE
	hte: tegra: fix 'struct of_device_id' build error
	hte: tegra-194: Fix off by one in tegra_hte_map_to_line_id()
	ACPI: PM: Do not turn of unused power resources on the Toshiba Click Mini
	PM: hibernate: Turn snapshot_test into global variable
	PM: hibernate: Do not get block device exclusively in test_resume mode
	afs: Fix updating of i_size with dv jump from server
	afs: Fix getattr to report server i_size on dirs, not local size
	afs: Avoid endless loop if file is larger than expected
	parisc: Fix argument pointer in real64_call_asm()
	parisc: Ensure page alignment in flush functions
	ALSA: usb-audio: Add quirk for Pioneer DDJ-800
	ALSA: hda/realtek: Add quirk for ThinkPad P1 Gen 6
	ALSA: hda/realtek: Add quirk for ASUS UM3402YAR using CS35L41
	ALSA: hda/realtek: support HP Pavilion Aero 13-be0xxx Mute LED
	ALSA: hda/realtek: Fix mute and micmute LEDs for an HP laptop
	nilfs2: do not write dirty data after degenerating to read-only
	nilfs2: fix infinite loop in nilfs_mdt_get_block()
	mm: do not reclaim private data from pinned page
	drbd: correctly submit flush bio on barrier
	md/raid10: fix null-ptr-deref in raid10_sync_request
	md/raid5: Improve performance for sequential IO
	kasan: hw_tags: avoid invalid virt_to_page()
	mtd: core: provide unique name for nvmem device, take two
	mtd: core: fix nvmem error reporting
	mtd: core: fix error path for nvmem provider
	mtd: spi-nor: core: Update flash's current address mode when changing address mode
	mailbox: zynqmp: Fix IPI isr handling
	kcsan: Avoid READ_ONCE() in read_instrumented_memory()
	mailbox: zynqmp: Fix typo in IPI documentation
	wifi: rtl8xxxu: RTL8192EU always needs full init
	wifi: rtw89: fix potential race condition between napi_init and napi_enable
	clk: microchip: fix potential UAF in auxdev release callback
	clk: rockchip: rk3399: allow clk_cifout to force clk_cifout_src to reparent
	scripts/gdb: fix lx-timerlist for Python3
	btrfs: scrub: reject unsupported scrub flags
	s390/dasd: fix hanging blockdevice after request requeue
	ia64: fix an addr to taddr in huge_pte_offset()
	mm/mempolicy: correctly update prev when policy is equal on mbind
	vhost_vdpa: fix unmap process in no-batch mode
	dm verity: fix error handling for check_at_most_once on FEC
	dm clone: call kmem_cache_destroy() in dm_clone_init() error path
	dm integrity: call kmem_cache_destroy() in dm_integrity_init() error path
	dm flakey: fix a crash with invalid table line
	dm ioctl: fix nested locking in table_clear() to remove deadlock concern
	dm: don't lock fs when the map is NULL in process of resume
	blk-iocost: avoid 64-bit division in ioc_timer_fn
	cifs: fix potential use-after-free bugs in TCP_Server_Info::hostname
	cifs: protect session status check in smb2_reconnect()
	thunderbolt: Use correct type in tb_port_is_clx_enabled() prototype
	bonding (gcc13): synchronize bond_{a,t}lb_xmit() types
	wifi: ath11k: synchronize ath11k_mac_he_gi_to_nl80211_he_gi()'s return type
	perf auxtrace: Fix address filter entire kernel size
	perf intel-pt: Fix CYC timestamps after standalone CBR
	block/blk-iocost (gcc13): keep large values in a new enum
	sfc (gcc13): synchronize ef100_enqueue_skb()'s return type
	i40e: Remove unused i40e status codes
	i40e: Remove string printing for i40e_status
	i40e: use int for i40e_status
	drm/amd/display (gcc13): fix enum mismatch
	debugobject: Ensure pool refill (again)
	scsi: libsas: Grab the ATA port lock in sas_ata_device_link_abort()
	netfilter: nf_tables: deactivate anonymous set from preparation phase
	Linux 6.1.28

Change-Id: I61b5133e2d051cc2aa39b8c7c1be3fc25da40210
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-09 20:20:52 +00:00
fuyuanli
c3fc733798 tcp: fix mishandling when the sack compression is deferred.
[ Upstream commit 30c6f0bf9579debce27e45fac34fdc97e46acacc ]

In this patch, we mainly try to handle sending a compressed ack
correctly if it's deferred.

Here are more details in the old logic:
When sack compression is triggered in the tcp_compressed_ack_kick(),
if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED
and then defer to the release cb phrase. Later once user releases
the sock, tcp_delack_timer_handler() should send a ack as expected,
which, however, cannot happen due to lack of ICSK_ACK_TIMER flag.
Therefore, the receiver would not sent an ack until the sender's
retransmission timeout. It definitely increases unnecessary latency.

Fixes: 5d9f4262b7 ("tcp: add SACK compression")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: fuyuanli <fuyuanli@didiglobal.com>
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09 10:34:05 +02:00
Cambda Zhu
752836e1a2 tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set
[ Upstream commit 34dfde4ad87b84d21278a7e19d92b5b2c68e6c4d ]

This patch replaces the tp->mss_cache check in getting TCP_MAXSEG
with tp->rx_opt.user_mss check for CLOSE/LISTEN sock. Since
tp->mss_cache is initialized with TCP_MSS_DEFAULT, checking if
it's zero is probably a bug.

With this change, getting TCP_MAXSEG before connecting will return
default MSS normally, and return user_mss if user_mss is set.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Jack Yang <mingliang@linux.alibaba.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/CANn89i+3kL9pYtkxkwxwNMzvC_w3LNUum_2=3u+UyLBmGmifHA@mail.gmail.com/#t
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Link: https://lore.kernel.org/netdev/14D45862-36EA-4076-974C-EA67513C92F6@linux.alibaba.com/
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230527040317.68247-1-cambda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09 10:34:02 +02:00
Eric Dumazet
c2251ce048 tcp: deny tcp_disconnect() when threads are waiting
[ Upstream commit 4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c ]

Historically connect(AF_UNSPEC) has been abused by syzkaller
and other fuzzers to trigger various bugs.

A recent one triggers a divide-by-zero [1], and Paolo Abeni
was able to diagnose the issue.

tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN
and TCP REPAIR mode being not used.

Then later if socket lock is released in sk_wait_data(),
another thread can call connect(AF_UNSPEC), then make this
socket a TCP listener.

When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf()
and attempt a divide by 0 in tcp_rcv_space_adjust() [1]

This patch adds a new socket field, counting number of threads
blocked in sk_wait_event() and inet_wait_for_connect().

If this counter is not zero, tcp_disconnect() returns an error.

This patch adds code in blocking socket system calls, thus should
not hurt performance of non blocking ones.

Note that we probably could revert commit 499350a5a6 ("tcp:
initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore
original tcpi_rcv_mss meaning (was 0 if no payload was ever
received on a socket)

[1]
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740
Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48
RSP: 0018:ffffc900033af660 EFLAGS: 00010206
RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001
RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8
R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344
FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616
tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681
inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670
sock_recvmsg_nosec net/socket.c:1017 [inline]
sock_recvmsg+0xe2/0x160 net/socket.c:1038
____sys_recvmsg+0x210/0x5a0 net/socket.c:2720
___sys_recvmsg+0xf2/0x180 net/socket.c:2762
do_recvmmsg+0x25e/0x6e0 net/socket.c:2856
__sys_recvmmsg net/socket.c:2935 [inline]
__do_sys_recvmmsg net/socket.c:2958 [inline]
__se_sys_recvmmsg net/socket.c:2951 [inline]
__x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5c0108c0f9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9
RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003
RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000
</TASK>

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Paolo Abeni <pabeni@redhat.com>
Diagnosed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09 10:34:02 +02:00
Suren Baghdasaryan
5dd0547a3e UPSTREAM: mm: replace vma->vm_flags direct modifications with modifier calls
Replace direct modifications to vma->vm_flags with calls to modifier
functions to be able to track flag changes and to keep vma locking
correctness.

[akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo]
Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Oskolkov <posk@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 1c71222e5f2393b5ea1a41795c67589eea7e3490)

Bug: 161210518
Change-Id: Ifc352b487db109adab17dd33a83f5c7e68c0bbc6
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-06-07 14:24:57 +00:00
Nicolas Dichtel
1d39b94f8c UPSTREAM: ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()),  is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.

For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.

For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
CC: stable@vger.kernel.org
Change-Id: I168bca9e37561b255268414e2fdfac18cd08381c
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
(cherry picked from commit 3632679d9e4f879f49949bb5b050e0de553e4739)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-07 14:24:50 +00:00
Jakub Sitnicki
9713594a2b UPSTREAM: inet: Add IP_LOCAL_PORT_RANGE socket option
Users who want to share a single public IP address for outgoing connections
between several hosts traditionally reach for SNAT. However, SNAT requires
state keeping on the node(s) performing the NAT.

A stateless alternative exists, where a single IP address used for egress
can be shared between several hosts by partitioning the available ephemeral
port range. In such a setup:

1. Each host gets assigned a disjoint range of ephemeral ports.
2. Applications open connections from the host-assigned port range.
3. Return traffic gets routed to the host based on both, the destination IP
   and the destination port.

An application which wants to open an outgoing connection (connect) from a
given port range today can choose between two solutions:

1. Manually pick the source port by bind()'ing to it before connect()'ing
   the socket.

   This approach has a couple of downsides:

   a) Search for a free port has to be implemented in the user-space. If
      the chosen 4-tuple happens to be busy, the application needs to retry
      from a different local port number.

      Detecting if 4-tuple is busy can be either easy (TCP) or hard
      (UDP). In TCP case, the application simply has to check if connect()
      returned an error (EADDRNOTAVAIL). That is assuming that the local
      port sharing was enabled (REUSEADDR) by all the sockets.

        # Assume desired local port range is 60_000-60_511
        s = socket(AF_INET, SOCK_STREAM)
        s.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
        s.bind(("192.0.2.1", 60_000))
        s.connect(("1.1.1.1", 53))
        # Fails only if 192.0.2.1:60000 -> 1.1.1.1:53 is busy
        # Application must retry with another local port

      In case of UDP, the network stack allows binding more than one socket
      to the same 4-tuple, when local port sharing is enabled
      (REUSEADDR). Hence detecting the conflict is much harder and involves
      querying sock_diag and toggling the REUSEADDR flag [1].

   b) For TCP, bind()-ing to a port within the ephemeral port range means
      that no connecting sockets, that is those which leave it to the
      network stack to find a free local port at connect() time, can use
      the this port.

      IOW, the bind hash bucket tb->fastreuse will be 0 or 1, and the port
      will be skipped during the free port search at connect() time.

2. Isolate the app in a dedicated netns and use the use the per-netns
   ip_local_port_range sysctl to adjust the ephemeral port range bounds.

   The per-netns setting affects all sockets, so this approach can be used
   only if:

   - there is just one egress IP address, or
   - the desired egress port range is the same for all egress IP addresses
     used by the application.

   For TCP, this approach avoids the downsides of (1). Free port search and
   4-tuple conflict detection is done by the network stack:

     system("sysctl -w net.ipv4.ip_local_port_range='60000 60511'")

     s = socket(AF_INET, SOCK_STREAM)
     s.setsockopt(SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1)
     s.bind(("192.0.2.1", 0))
     s.connect(("1.1.1.1", 53))
     # Fails if all 4-tuples 192.0.2.1:60000-60511 -> 1.1.1.1:53 are busy

  For UDP this approach has limited applicability. Setting the
  IP_BIND_ADDRESS_NO_PORT socket option does not result in local source
  port being shared with other connected UDP sockets.

  Hence relying on the network stack to find a free source port, limits the
  number of outgoing UDP flows from a single IP address down to the number
  of available ephemeral ports.

To put it another way, partitioning the ephemeral port range between hosts
using the existing Linux networking API is cumbersome.

To address this use case, add a new socket option at the SOL_IP level,
named IP_LOCAL_PORT_RANGE. The new option can be used to clamp down the
ephemeral port range for each socket individually.

The option can be used only to narrow down the per-netns local port
range. If the per-socket range lies outside of the per-netns range, the
latter takes precedence.

UAPI-wise, the low and high range bounds are passed to the kernel as a pair
of u16 values in host byte order packed into a u32. This avoids pointer
passing.

  PORT_LO = 40_000
  PORT_HI = 40_511

  s = socket(AF_INET, SOCK_STREAM)
  v = struct.pack("I", PORT_HI << 16 | PORT_LO)
  s.setsockopt(SOL_IP, IP_LOCAL_PORT_RANGE, v)
  s.bind(("127.0.0.1", 0))
  s.getsockname()
  # Local address between ("127.0.0.1", 40_000) and ("127.0.0.1", 40_511),
  # if there is a free port. EADDRINUSE otherwise.

[1] https://github.com/cloudflare/cloudflare-blog/blob/232b432c1d57/2022-02-connectx/connectx.py#L116

Reviewed-by: Marek Majkowski <marek@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Change-Id: I06e1860472cd2f90bf030076be0c87b9b775a3df
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 91d0b78c5177f3e42a4d8738af8ac19c3a90d002)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-06-07 14:24:50 +00:00
John Fastabend
fe735073a5 bpf, sockmap: Incorrectly handling copied_seq
[ Upstream commit e5c6de5fa025882babf89cecbed80acf49b987fa ]

The read_skb() logic is incrementing the tcp->copied_seq which is used for
among other things calculating how many outstanding bytes can be read by
the application. This results in application errors, if the application
does an ioctl(FIONREAD) we return zero because this is calculated from
the copied_seq value.

To fix this we move tcp->copied_seq accounting into the recv handler so
that we update these when the recvmsg() hook is called and data is in
fact copied into user buffers. This gives an accurate FIONREAD value
as expected and improves ACK handling. Before we were calling the
tcp_rcv_space_adjust() which would update 'number of bytes copied to
user in last RTT' which is wrong for programs returning SK_PASS. The
bytes are only copied to the user when recvmsg is handled.

Doing the fix for recvmsg is straightforward, but fixing redirect and
SK_DROP pkts is a bit tricker. Build a tcp_psock_eat() helper and then
call this from skmsg handlers. This fixes another issue where a broken
socket with a BPF program doing a resubmit could hang the receiver. This
happened because although read_skb() consumed the skb through sock_drop()
it did not update the copied_seq. Now if a single reccv socket is
redirecting to many sockets (for example for lb) the receiver sk will be
hung even though we might expect it to continue. The hang comes from
not updating the copied_seq numbers and memory pressure resulting from
that.

We have a slight layer problem of calling tcp_eat_skb even if its not
a TCP socket. To fix we could refactor and create per type receiver
handlers. I decided this is more work than we want in the fix and we
already have some small tweaks depending on caller that use the
helper skb_bpf_strparser(). So we extend that a bit and always set
the strparser bit when it is in use and then we can gate the
seq_copied updates on this.

Fixes: 04919bed94 ("tcp: Introduce tcp_read_skb()")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-9-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:19 +02:00
John Fastabend
ab90b68f65 bpf, sockmap: TCP data stall on recv before accept
[ Upstream commit ea444185a6bf7da4dd0df1598ee953e4f7174858 ]

A common mechanism to put a TCP socket into the sockmap is to hook the
BPF_SOCK_OPS_{ACTIVE_PASSIVE}_ESTABLISHED_CB event with a BPF program
that can map the socket info to the correct BPF verdict parser. When
the user adds the socket to the map the psock is created and the new
ops are assigned to ensure the verdict program will 'see' the sk_buffs
as they arrive.

Part of this process hooks the sk_data_ready op with a BPF specific
handler to wake up the BPF verdict program when data is ready to read.
The logic is simple enough (posted here for easy reading)

 static void sk_psock_verdict_data_ready(struct sock *sk)
 {
	struct socket *sock = sk->sk_socket;

	if (unlikely(!sock || !sock->ops || !sock->ops->read_skb))
		return;
	sock->ops->read_skb(sk, sk_psock_verdict_recv);
 }

The oversight here is sk->sk_socket is not assigned until the application
accepts() the new socket. However, its entirely ok for the peer application
to do a connect() followed immediately by sends. The socket on the receiver
is sitting on the backlog queue of the listening socket until its accepted
and the data is queued up. If the peer never accepts the socket or is slow
it will eventually hit data limits and rate limit the session. But,
important for BPF sockmap hooks when this data is received TCP stack does
the sk_data_ready() call but the read_skb() for this data is never called
because sk_socket is missing. The data sits on the sk_receive_queue.

Then once the socket is accepted if we never receive more data from the
peer there will be no further sk_data_ready calls and all the data
is still on the sk_receive_queue(). Then user calls recvmsg after accept()
and for TCP sockets in sockmap we use the tcp_bpf_recvmsg_parser() handler.
The handler checks for data in the sk_msg ingress queue expecting that
the BPF program has already run from the sk_data_ready hook and enqueued
the data as needed. So we are stuck.

To fix do an unlikely check in recvmsg handler for data on the
sk_receive_queue and if it exists wake up data_ready. We have the sock
locked in both read_skb and recvmsg so should avoid having multiple
runners.

Fixes: 04919bed94 ("tcp: Introduce tcp_read_skb()")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-7-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:19 +02:00
John Fastabend
3a2129ebae bpf, sockmap: Handle fin correctly
[ Upstream commit 901546fd8f9ca4b5c481ce00928ab425ce9aacc0 ]

The sockmap code is returning EAGAIN after a FIN packet is received and no
more data is on the receive queue. Correct behavior is to return 0 to the
user and the user can then close the socket. The EAGAIN causes many apps
to retry which masks the problem. Eventually the socket is evicted from
the sockmap because its released from sockmap sock free handling. The
issue creates a delay and can cause some errors on application side.

To fix this check on sk_msg_recvmsg side if length is zero and FIN flag
is set then set return to zero. A selftest will be added to check this
condition.

Fixes: 04919bed94 ("tcp: Introduce tcp_read_skb()")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: William Findlay <will@isovalent.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-6-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:19 +02:00
John Fastabend
4ae2af3e59 bpf, sockmap: Pass skb ownership through read_skb
[ Upstream commit 78fa0d61d97a728d306b0c23d353c0e340756437 ]

The read_skb hook calls consume_skb() now, but this means that if the
recv_actor program wants to use the skb it needs to inc the ref cnt
so that the consume_skb() doesn't kfree the sk_buff.

This is problematic because in some error cases under memory pressure
we may need to linearize the sk_buff from sk_psock_skb_ingress_enqueue().
Then we get this,

 skb_linearize()
   __pskb_pull_tail()
     pskb_expand_head()
       BUG_ON(skb_shared(skb))

Because we incremented users refcnt from sk_psock_verdict_recv() we
hit the bug on with refcnt > 1 and trip it.

To fix lets simply pass ownership of the sk_buff through the skb_read
call. Then we can drop the consume from read_skb handlers and assume
the verdict recv does any required kfree.

Bug found while testing in our CI which runs in VMs that hit memory
constraints rather regularly. William tested TCP read_skb handlers.

[  106.536188] ------------[ cut here ]------------
[  106.536197] kernel BUG at net/core/skbuff.c:1693!
[  106.536479] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[  106.536726] CPU: 3 PID: 1495 Comm: curl Not tainted 5.19.0-rc5 #1
[  106.537023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.16.0-1 04/01/2014
[  106.537467] RIP: 0010:pskb_expand_head+0x269/0x330
[  106.538585] RSP: 0018:ffffc90000138b68 EFLAGS: 00010202
[  106.538839] RAX: 000000000000003f RBX: ffff8881048940e8 RCX: 0000000000000a20
[  106.539186] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff8881048940e8
[  106.539529] RBP: ffffc90000138be8 R08: 00000000e161fd1a R09: 0000000000000000
[  106.539877] R10: 0000000000000018 R11: 0000000000000000 R12: ffff8881048940e8
[  106.540222] R13: 0000000000000003 R14: 0000000000000000 R15: ffff8881048940e8
[  106.540568] FS:  00007f277dde9f00(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
[  106.540954] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.541227] CR2: 00007f277eeede64 CR3: 000000000ad3e000 CR4: 00000000000006e0
[  106.541569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  106.541915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  106.542255] Call Trace:
[  106.542383]  <IRQ>
[  106.542487]  __pskb_pull_tail+0x4b/0x3e0
[  106.542681]  skb_ensure_writable+0x85/0xa0
[  106.542882]  sk_skb_pull_data+0x18/0x20
[  106.543084]  bpf_prog_b517a65a242018b0_bpf_skskb_http_verdict+0x3a9/0x4aa9
[  106.543536]  ? migrate_disable+0x66/0x80
[  106.543871]  sk_psock_verdict_recv+0xe2/0x310
[  106.544258]  ? sk_psock_write_space+0x1f0/0x1f0
[  106.544561]  tcp_read_skb+0x7b/0x120
[  106.544740]  tcp_data_queue+0x904/0xee0
[  106.544931]  tcp_rcv_established+0x212/0x7c0
[  106.545142]  tcp_v4_do_rcv+0x174/0x2a0
[  106.545326]  tcp_v4_rcv+0xe70/0xf60
[  106.545500]  ip_protocol_deliver_rcu+0x48/0x290
[  106.545744]  ip_local_deliver_finish+0xa7/0x150

Fixes: 04919bed94 ("tcp: Introduce tcp_read_skb()")
Reported-by: William Findlay <will@isovalent.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: William Findlay <will@isovalent.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-2-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:18 +02:00
Nicolas Dichtel
3f5413c954 ipv{4,6}/raw: fix output xfrm lookup wrt protocol
[ Upstream commit 3632679d9e4f879f49949bb5b050e0de553e4739 ]

With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()),  is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.

For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.

For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
CC: stable@vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:16 +02:00
Jakub Sitnicki
6728486447 inet: Add IP_LOCAL_PORT_RANGE socket option
[ Upstream commit 91d0b78c5177f3e42a4d8738af8ac19c3a90d002 ]

Users who want to share a single public IP address for outgoing connections
between several hosts traditionally reach for SNAT. However, SNAT requires
state keeping on the node(s) performing the NAT.

A stateless alternative exists, where a single IP address used for egress
can be shared between several hosts by partitioning the available ephemeral
port range. In such a setup:

1. Each host gets assigned a disjoint range of ephemeral ports.
2. Applications open connections from the host-assigned port range.
3. Return traffic gets routed to the host based on both, the destination IP
   and the destination port.

An application which wants to open an outgoing connection (connect) from a
given port range today can choose between two solutions:

1. Manually pick the source port by bind()'ing to it before connect()'ing
   the socket.

   This approach has a couple of downsides:

   a) Search for a free port has to be implemented in the user-space. If
      the chosen 4-tuple happens to be busy, the application needs to retry
      from a different local port number.

      Detecting if 4-tuple is busy can be either easy (TCP) or hard
      (UDP). In TCP case, the application simply has to check if connect()
      returned an error (EADDRNOTAVAIL). That is assuming that the local
      port sharing was enabled (REUSEADDR) by all the sockets.

        # Assume desired local port range is 60_000-60_511
        s = socket(AF_INET, SOCK_STREAM)
        s.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
        s.bind(("192.0.2.1", 60_000))
        s.connect(("1.1.1.1", 53))
        # Fails only if 192.0.2.1:60000 -> 1.1.1.1:53 is busy
        # Application must retry with another local port

      In case of UDP, the network stack allows binding more than one socket
      to the same 4-tuple, when local port sharing is enabled
      (REUSEADDR). Hence detecting the conflict is much harder and involves
      querying sock_diag and toggling the REUSEADDR flag [1].

   b) For TCP, bind()-ing to a port within the ephemeral port range means
      that no connecting sockets, that is those which leave it to the
      network stack to find a free local port at connect() time, can use
      the this port.

      IOW, the bind hash bucket tb->fastreuse will be 0 or 1, and the port
      will be skipped during the free port search at connect() time.

2. Isolate the app in a dedicated netns and use the use the per-netns
   ip_local_port_range sysctl to adjust the ephemeral port range bounds.

   The per-netns setting affects all sockets, so this approach can be used
   only if:

   - there is just one egress IP address, or
   - the desired egress port range is the same for all egress IP addresses
     used by the application.

   For TCP, this approach avoids the downsides of (1). Free port search and
   4-tuple conflict detection is done by the network stack:

     system("sysctl -w net.ipv4.ip_local_port_range='60000 60511'")

     s = socket(AF_INET, SOCK_STREAM)
     s.setsockopt(SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1)
     s.bind(("192.0.2.1", 0))
     s.connect(("1.1.1.1", 53))
     # Fails if all 4-tuples 192.0.2.1:60000-60511 -> 1.1.1.1:53 are busy

  For UDP this approach has limited applicability. Setting the
  IP_BIND_ADDRESS_NO_PORT socket option does not result in local source
  port being shared with other connected UDP sockets.

  Hence relying on the network stack to find a free source port, limits the
  number of outgoing UDP flows from a single IP address down to the number
  of available ephemeral ports.

To put it another way, partitioning the ephemeral port range between hosts
using the existing Linux networking API is cumbersome.

To address this use case, add a new socket option at the SOL_IP level,
named IP_LOCAL_PORT_RANGE. The new option can be used to clamp down the
ephemeral port range for each socket individually.

The option can be used only to narrow down the per-netns local port
range. If the per-socket range lies outside of the per-netns range, the
latter takes precedence.

UAPI-wise, the low and high range bounds are passed to the kernel as a pair
of u16 values in host byte order packed into a u32. This avoids pointer
passing.

  PORT_LO = 40_000
  PORT_HI = 40_511

  s = socket(AF_INET, SOCK_STREAM)
  v = struct.pack("I", PORT_HI << 16 | PORT_LO)
  s.setsockopt(SOL_IP, IP_LOCAL_PORT_RANGE, v)
  s.bind(("127.0.0.1", 0))
  s.getsockname()
  # Local address between ("127.0.0.1", 40_000) and ("127.0.0.1", 40_511),
  # if there is a free port. EADDRINUSE otherwise.

[1] https://github.com/cloudflare/cloudflare-blog/blob/232b432c1d57/2022-02-connectx/connectx.py#L116

Reviewed-by: Marek Majkowski <marek@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05 09:26:16 +02:00
Kuniyuki Iwashima
2a112f0462 udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
commit ad42a35bdfc6d3c0fc4cb4027d7b2757ce665665 upstream.

syzbot reported [0] a null-ptr-deref in sk_get_rmem0() while using
IPPROTO_UDPLITE (0x88):

  14:25:52 executing program 1:
  r0 = socket$inet6(0xa, 0x80002, 0x88)

We had a similar report [1] for probably sk_memory_allocated_add()
in __sk_mem_raise_allocated(), and commit c915fe13cb ("udplite: fix
NULL pointer dereference") fixed it by setting .memory_allocated for
udplite_prot and udplitev6_prot.

To fix the variant, we need to set either .sysctl_wmem_offset or
.sysctl_rmem.

Now UDP and UDPLITE share the same value for .memory_allocated, so we
use the same .sysctl_wmem_offset for UDP and UDPLITE.

[0]:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 6829 Comm: syz-executor.1 Not tainted 6.4.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/28/2023
RIP: 0010:sk_get_rmem0 include/net/sock.h:2907 [inline]
RIP: 0010:__sk_mem_raise_allocated+0x806/0x17a0 net/core/sock.c:3006
Code: c1 ea 03 80 3c 02 00 0f 85 23 0f 00 00 48 8b 44 24 08 48 8b 98 38 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 14 02 48 89 d8 83 e0 07 83 c0 03 38 d0 0f 8d 6f 0a 00 00 8b
RSP: 0018:ffffc90005d7f450 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90004d92000
RDX: 0000000000000000 RSI: ffffffff88066482 RDI: ffffffff8e2ccbb8
RBP: ffff8880173f7000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000030000
R13: 0000000000000001 R14: 0000000000000340 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9800000(0063) knlGS:00000000f7f1cb40
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 000000002e82f000 CR3: 0000000034ff0000 CR4: 00000000003506f0
Call Trace:
 <TASK>
 __sk_mem_schedule+0x6c/0xe0 net/core/sock.c:3077
 udp_rmem_schedule net/ipv4/udp.c:1539 [inline]
 __udp_enqueue_schedule_skb+0x776/0xb30 net/ipv4/udp.c:1581
 __udpv6_queue_rcv_skb net/ipv6/udp.c:666 [inline]
 udpv6_queue_rcv_one_skb+0xc39/0x16c0 net/ipv6/udp.c:775
 udpv6_queue_rcv_skb+0x194/0xa10 net/ipv6/udp.c:793
 __udp6_lib_mcast_deliver net/ipv6/udp.c:906 [inline]
 __udp6_lib_rcv+0x1bda/0x2bd0 net/ipv6/udp.c:1013
 ip6_protocol_deliver_rcu+0x2e7/0x1250 net/ipv6/ip6_input.c:437
 ip6_input_finish+0x150/0x2f0 net/ipv6/ip6_input.c:482
 NF_HOOK include/linux/netfilter.h:303 [inline]
 NF_HOOK include/linux/netfilter.h:297 [inline]
 ip6_input+0xa0/0xd0 net/ipv6/ip6_input.c:491
 ip6_mc_input+0x40b/0xf50 net/ipv6/ip6_input.c:585
 dst_input include/net/dst.h:468 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
 NF_HOOK include/linux/netfilter.h:303 [inline]
 NF_HOOK include/linux/netfilter.h:297 [inline]
 ipv6_rcv+0x250/0x380 net/ipv6/ip6_input.c:309
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5491
 __netif_receive_skb+0x1f/0x1c0 net/core/dev.c:5605
 netif_receive_skb_internal net/core/dev.c:5691 [inline]
 netif_receive_skb+0x133/0x7a0 net/core/dev.c:5750
 tun_rx_batched+0x4b3/0x7a0 drivers/net/tun.c:1553
 tun_get_user+0x2452/0x39c0 drivers/net/tun.c:1989
 tun_chr_write_iter+0xdf/0x200 drivers/net/tun.c:2035
 call_write_iter include/linux/fs.h:1868 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x945/0xd50 fs/read_write.c:584
 ksys_write+0x12b/0x250 fs/read_write.c:637
 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
 __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
 do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
 entry_SYSENTER_compat_after_hwframe+0x70/0x82
RIP: 0023:0xf7f21579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f7f1c590 EFLAGS: 00000282 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 00000000000000c8 RCX: 0000000020000040
RDX: 0000000000000083 RSI: 00000000f734e000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:

Link: https://lore.kernel.org/netdev/CANaxB-yCk8hhP68L4Q2nFOJht8sqgXGGQO2AftpHs0u1xyGG5A@mail.gmail.com/ [1]
Fixes: 850cbaddb5 ("udp: use it's own memory accounting schema")
Reported-by: syzbot+444ca0907e96f7c5e48b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=444ca0907e96f7c5e48b
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230523163305.66466-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30 14:03:20 +01:00
Eric Dumazet
820a60a416 tcp: fix possible sk_priority leak in tcp_v4_send_reset()
[ Upstream commit 1e306ec49a1f206fd2cc89a42fac6e6f592a8cc1 ]

When tcp_v4_send_reset() is called with @sk == NULL,
we do not change ctl_sk->sk_priority, which could have been
set from a prior invocation.

Change tcp_v4_send_reset() to set sk_priority and sk_mark
fields before calling ip_send_unicast_reply().

This means tcp_v4_send_reset() and tcp_v4_send_ack()
no longer have to clear ctl_sk->sk_mark after
their call to ip_send_unicast_reply().

Fixes: f6c0f5d209 ("tcp: honor SO_PRIORITY in TIME_WAIT state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24 17:32:44 +01:00
Eric Dumazet
b4c0af8974 tcp: add annotations around sk->sk_shutdown accesses
[ Upstream commit e14cadfd80d76f01bfaa1a8d745b1db19b57d6be ]

Now sk->sk_shutdown is no longer a bitfield, we can add
standard READ_ONCE()/WRITE_ONCE() annotations to silence
KCSAN reports like the following:

BUG: KCSAN: data-race in tcp_disconnect / tcp_poll

write to 0xffff88814588582c of 1 bytes by task 3404 on cpu 1:
tcp_disconnect+0x4d6/0xdb0 net/ipv4/tcp.c:3121
__inet_stream_connect+0x5dd/0x6e0 net/ipv4/af_inet.c:715
inet_stream_connect+0x48/0x70 net/ipv4/af_inet.c:727
__sys_connect_file net/socket.c:2001 [inline]
__sys_connect+0x19b/0x1b0 net/socket.c:2018
__do_sys_connect net/socket.c:2028 [inline]
__se_sys_connect net/socket.c:2025 [inline]
__x64_sys_connect+0x41/0x50 net/socket.c:2025
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88814588582c of 1 bytes by task 3374 on cpu 0:
tcp_poll+0x2e6/0x7d0 net/ipv4/tcp.c:562
sock_poll+0x253/0x270 net/socket.c:1383
vfs_poll include/linux/poll.h:88 [inline]
io_poll_check_events io_uring/poll.c:281 [inline]
io_poll_task_func+0x15a/0x820 io_uring/poll.c:333
handle_tw_list io_uring/io_uring.c:1184 [inline]
tctx_task_work+0x1fe/0x4d0 io_uring/io_uring.c:1246
task_work_run+0x123/0x160 kernel/task_work.c:179
get_signal+0xe64/0xff0 kernel/signal.c:2635
arch_do_signal_or_restart+0x89/0x2a0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop+0x6f/0xe0 kernel/entry/common.c:168
exit_to_user_mode_prepare+0x6c/0xb0 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x03 -> 0x00

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24 17:32:32 +01:00
Eric Dumazet
65531f5675 net: deal with most data-races in sk_wait_event()
[ Upstream commit d0ac89f6f9879fae316c155de77b5173b3e2c9c9 ]

__condition is evaluated twice in sk_wait_event() macro.

First invocation is lockless, and reads can race with writes,
as spotted by syzbot.

BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect

write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
tcp_disconnect+0x2cd/0xdb0
inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
__sys_shutdown_sock net/socket.c:2343 [inline]
__sys_shutdown net/socket.c:2355 [inline]
__do_sys_shutdown net/socket.c:2363 [inline]
__se_sys_shutdown+0xf8/0x140 net/socket.c:2361
__x64_sys_shutdown+0x31/0x40 net/socket.c:2361
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
__sys_sendto+0x246/0x300 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0x78/0x90 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00000000 -> 0x00000068

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24 17:32:32 +01:00
Ziyang Xuan
fc60067260 ipv4: Fix potential uninit variable access bug in __ip_make_skb()
[ Upstream commit 99e5acae193e369b71217efe6f1dad42f3f18815 ]

Like commit ea30388baebc ("ipv6: Fix an uninit variable access bug in
__ip6_make_skb()"). icmphdr does not in skb linear region under the
scenario of SOCK_RAW socket. Access icmp_hdr(skb)->type directly will
trigger the uninit variable access bug.

Use a local variable icmp_type to carry the correct value in different
scenarios.

Fixes: 96793b4825 ("[IPV4]: Add ICMPMsgStats MIB (RFC 4293)")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11 23:03:26 +09:00
Greg Kroah-Hartman
18c6e1f4af Revert "Revert "raw: Fix NULL deref in raw_get_next().""
This reverts commit cc7a00d2d6.

It was perserving the ABI, but that is not needed anymore at this point
in time.

Change-Id: I30a89a414afcc3db54c040afff3ab067b33170be
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-05-11 05:22:29 +00:00
Greg Kroah-Hartman
01e7770c33 Revert "Revert "raw: use net_hash_mix() in hash function""
This reverts commit 2039635543.

It was perserving the ABI, but that is not needed anymore at this point
in time.

Change-Id: Iaa9f70751453325f43d15bc6e4f6cf5bc68d6ec2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-05-11 05:22:29 +00:00
Greg Kroah-Hartman
55e4f0c551 This is the 6.1.25 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRBFW0ACgkQONu9yGCS
 aT7Jew//Ytw9+JQ71LT1TuJnQ1GayJOL1BW5hgxoYgnBFasWDwsGA9rzHs6KHqHb
 0Vjk7MX7VZB+6zWakOxY5CFVM33J4fS7wY8WE2bj8X3QQhD/J0HQDMdELvSBi3qF
 7xI6sghEQEwOuwAj2+CBqm/q7rA5FTnO1QgJuk/AKJ6PHGRiQeZ7q1zGpFvSaj7S
 cyKvY99RsjnUN+PYk4LE2+u/6DVCqiWYVDQrdjalb9zsrXg4+nmPH6ZJzZX8+bbM
 eM0xAR675V8TXqDi+8bj7tWmiS52XyjYF3Q/bu9BmU67DqslH9FFyVQxhgTHUZpN
 qWXkojEU2djIc3qt7T/bpZS/vD8Kg3Px1CgyIRN8Y5SlZfhZyqVdTZ4AQCtJuLQJ
 wDIdQCLlGzzDNFvbD+LdfJSjZt7Ig1sM/HwtPZhUA9yF0FN1XV3dcESzCOeI0/S7
 ohRh8cs1sidnxrbvVwiVNENSqbJD7G9/9vVjIfyfcnt57q+fs6xCBhpOyNoVOp74
 I5i6ALMcVZoAB50vDjnoGZsSRe9W2AmOV6UMIkVCvRCWYFqBpgVftMTAACNyljni
 UlXmO7aDQj+nbHD/auclFtU02oHQbk62FSrwoWMFS090zWztQqUhgRY7Qnl13yCM
 poEvrKlskXhvunsNtdVmI5O3N2GANWKgGwkyFIiXvgxKkw1qpUo=
 =zeN9
 -----END PGP SIGNATURE-----

Merge 6.1.25 into android14-6.1

Changes in 6.1.25
	Revert "pinctrl: amd: Disable and mask interrupts on resume"
	drm/amd/display: Pass the right info to drm_dp_remove_payload
	ALSA: emu10k1: fix capture interrupt handler unlinking
	ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard
	ALSA: i2c/cs8427: fix iec958 mixer control deactivation
	ALSA: hda: patch_realtek: add quirk for Asus N7601ZM
	ALSA: hda/realtek: Add quirks for Lenovo Z13/Z16 Gen2
	ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex()
	ALSA: emu10k1: don't create old pass-through playback device on Audigy
	ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards
	ALSA: hda/hdmi: disable KAE for Intel DG2
	Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
	Bluetooth: Fix race condition in hidp_session_thread
	bluetooth: btbcm: Fix logic error in forming the board name.
	Bluetooth: Free potentially unfreed SCO connection
	Bluetooth: hci_conn: Fix possible UAF
	btrfs: restore the thread_pool= behavior in remount for the end I/O workqueues
	btrfs: fix fast csum implementation detection
	fbmem: Reject FB_ACTIVATE_KD_TEXT from userspace
	mtdblock: tolerate corrected bit-flips
	mtd: rawnand: meson: fix bitmask for length in command word
	mtd: rawnand: stm32_fmc2: remove unsupported EDO mode
	mtd: rawnand: stm32_fmc2: use timings.mode instead of checking tRC_min
	KVM: arm64: PMU: Restore the guest's EL0 event counting after migration
	fbcon: Fix error paths in set_con2fb_map
	fbcon: set_con2fb_map needs to set con2fb_map!
	drm/i915/dsi: fix DSS CTL register offsets for TGL+
	clk: sprd: set max_register according to mapping range
	RDMA/irdma: Do not generate SW completions for NOPs
	RDMA/irdma: Fix memory leak of PBLE objects
	RDMA/irdma: Increase iWARP CM default rexmit count
	RDMA/irdma: Add ipv4 check to irdma_find_listener()
	IB/mlx5: Add support for 400G_8X lane speed
	RDMA/erdma: Update default EQ depth to 4096 and max_send_wr to 8192
	RDMA/erdma: Inline mtt entries into WQE if supported
	RDMA/erdma: Defer probing if netdevice can not be found
	clk: rs9: Fix suspend/resume
	RDMA/cma: Allow UD qp_type to join multicast only
	bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
	LoongArch, bpf: Fix jit to skip speculation barrier opcode
	dmaengine: apple-admac: Handle 'global' interrupt flags
	dmaengine: apple-admac: Set src_addr_widths capability
	dmaengine: apple-admac: Fix 'current_tx' not getting freed
	9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition
	bpf, arm64: Fixed a BTI error on returning to patched function
	KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
	KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs
	niu: Fix missing unwind goto in niu_alloc_channels()
	tcp: restrict net.ipv4.tcp_app_win
	bonding: fix ns validation on backup slaves
	iavf: refactor VLAN filter states
	iavf: remove active_cvlans and active_svlans bitmaps
	net: openvswitch: fix race on port output
	Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure
	Bluetooth: Fix printing errors if LE Connection times out
	Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt
	Bluetooth: Set ISO Data Path on broadcast sink
	drm/armada: Fix a potential double free in an error handling path
	qlcnic: check pci_reset_function result
	net: wwan: iosm: Fix error handling path in ipc_pcie_probe()
	cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex
	net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()
	sctp: fix a potential overflow in sctp_ifwdtsn_skip
	RDMA/core: Fix GID entry ref leak when create_ah fails
	selftests: openvswitch: adjust datapath NL message declaration
	udp6: fix potential access to stale information
	net: macb: fix a memory corruption in extended buffer descriptor mode
	skbuff: Fix a race between coalescing and releasing SKBs
	libbpf: Fix single-line struct definition output in btf_dump
	ARM: 9290/1: uaccess: Fix KASAN false-positives
	ARM: dts: qcom: apq8026-lg-lenok: add missing reserved memory
	power: supply: rk817: Fix unsigned comparison with less than zero
	power: supply: cros_usbpd: reclassify "default case!" as debug
	power: supply: axp288_fuel_gauge: Added check for negative values
	selftests/bpf: Fix progs/find_vma_fail1.c build error.
	wifi: mwifiex: mark OF related data as maybe unused
	i2c: imx-lpi2c: clean rx/tx buffers upon new message
	i2c: hisi: Avoid redundant interrupts
	efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
	block: ublk_drv: mark device as LIVE before adding disk
	ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG
	drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
	hwmon: (peci/cputemp) Fix miscalculated DTS for SKX
	hwmon: (xgene) Fix ioremap and memremap leak
	verify_pefile: relax wrapper length check
	asymmetric_keys: log on fatal failures in PE/pkcs7
	nvme: send Identify with CNS 06h only to I/O controllers
	wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
	wifi: iwlwifi: mvm: protect TXQ list manipulation
	drm/amdgpu: add mes resume when do gfx post soft reset
	drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
	drm/amdgpu/gfx: set cg flags to enter/exit safe mode
	ACPI: resource: Add Medion S17413 to IRQ override quirk
	x86/hyperv: Move VMCB enlightenment definitions to hyperv-tlfs.h
	KVM: selftests: Move "struct hv_enlightenments" to x86_64/svm.h
	KVM: SVM: Add a proper field for Hyper-V VMCB enlightenments
	x86/hyperv: KVM: Rename "hv_enlightenments" to "hv_vmcb_enlightenments"
	KVM: SVM: Flush Hyper-V TLB when required
	tracing: Add trace_array_puts() to write into instance
	tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
	maple_tree: fix write memory barrier of nodes once dead for RCU mode
	ksmbd: avoid out of bounds access in decode_preauth_ctxt()
	riscv: add icache flush for nommu sigreturn trampoline
	HID: intel-ish-hid: Fix kernel panic during warm reset
	net: sfp: initialize sfp->i2c_block_size at sfp allocation
	net: phy: nxp-c45-tja11xx: add remove callback
	net: phy: nxp-c45-tja11xx: fix unsigned long multiplication overflow
	scsi: ses: Handle enclosure with just a primary component gracefully
	x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
	cgroup: fix display of forceidle time at root
	cgroup/cpuset: Fix partition root's cpuset.cpus update bug
	cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach()
	drm/amd/pm: correct SMU13.0.7 pstate profiling clock settings
	drm/amd/pm: correct SMU13.0.7 max shader clock reporting
	mptcp: use mptcp_schedule_work instead of open-coding it
	mptcp: stricter state check in mptcp_worker
	ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size
	ubi: Fix deadlock caused by recursively holding work_sem
	i2c: mchp-pci1xxxx: Update Timing registers
	powerpc/papr_scm: Update the NUMA distance table for the target node
	sched/fair: Fix imbalance overflow
	x86/rtc: Remove __init for runtime functions
	i2c: ocores: generate stop condition after timeout in polling mode
	cifs: fix negotiate context parsing
	nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQN
	nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD
	cgroup/cpuset: Skip spread flags update on v2
	cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly
	cgroup/cpuset: Add cpuset_can_fork() and cpuset_cancel_fork() methods
	Linux 6.1.25

Change-Id: Ib4d2c49ea9bacb8d8dbdb7b3a4eecce937016427
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-26 13:13:19 +00:00
Greg Kroah-Hartman
2039635543 Revert "raw: use net_hash_mix() in hash function"
This reverts commit 53a0031217.

It breaks the current Android kernel abi.  It will be brought back at
the next KABI break update.

Bug: 161946584
Change-Id: I7fd2655234ff38dfe11a528e71c7772458a36328
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-26 10:00:48 +00:00
Greg Kroah-Hartman
cc7a00d2d6 Revert "raw: Fix NULL deref in raw_get_next()."
This reverts commit b34056bedf.

It breaks the current Android kernel abi.  It will be brought back at
the next KABI break update.

Bug: 161946584
Change-Id: I3664de0db15ba207c8b35530840095083d376dd1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-26 10:00:48 +00:00
YueHaibing
9d7765638f tcp: restrict net.ipv4.tcp_app_win
[ Upstream commit dc5110c2d959c1707e12df5f792f41d90614adaa ]

UBSAN: shift-out-of-bounds in net/ipv4/tcp_input.c:555:23
shift exponent 255 is too large for 32-bit type 'int'
CPU: 1 PID: 7907 Comm: ssh Not tainted 6.3.0-rc4-00161-g62bad54b26db-dirty #206
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x136/0x150
 __ubsan_handle_shift_out_of_bounds+0x21f/0x5a0
 tcp_init_transfer.cold+0x3a/0xb9
 tcp_finish_connect+0x1d0/0x620
 tcp_rcv_state_process+0xd78/0x4d60
 tcp_v4_do_rcv+0x33d/0x9d0
 __release_sock+0x133/0x3b0
 release_sock+0x58/0x1b0

'maxwin' is int, shifting int for 32 or more bits is undefined behaviour.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20 12:35:08 +02:00
Martin KaFai Lau
db9c9086d3 bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
[ Upstream commit 580031ff9952b7dbf48dedba6b56a100ae002bef ]

While reviewing the udp-iter batching patches, noticed the bpf_iter_tcp
calling sock_put() is incorrect. It should call sock_gen_put instead
because bpf_iter_tcp is iterating the ehash table which has the req sk
and tw sk. This patch replaces all sock_put with sock_gen_put in the
bpf_iter_tcp codepath.

Fixes: 04c7820b77 ("bpf: tcp: Bpf iter batching and lock_sock")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230328004232.2134233-1-martin.lau@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20 12:35:08 +02:00
Kuniyuki Iwashima
5a08a32e62 ping: Fix potentail NULL deref for /proc/net/icmp.
[ Upstream commit ab5fb73ffa01072b4d8031cc05801fa1cb653bee ]

After commit dbca1596bb ("ping: convert to RCU lookups, get rid
of rwlock"), we use RCU for ping sockets, but we should use spinlock
for /proc/net/icmp to avoid a potential NULL deref mentioned in
the previous patch.

Let's go back to using spinlock there.

Note we can convert ping sockets to use hlist instead of hlist_nulls
because we do not use SLAB_TYPESAFE_BY_RCU for ping sockets.

Fixes: dbca1596bb ("ping: convert to RCU lookups, get rid of rwlock")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:24 +02:00
Kuniyuki Iwashima
b34056bedf raw: Fix NULL deref in raw_get_next().
[ Upstream commit 0a78cf7264d29abeca098eae0b188a10aabc8a32 ]

Dae R. Jeong reported a NULL deref in raw_get_next() [0].

It seems that the repro was running these sequences in parallel so
that one thread was iterating on a socket that was being freed in
another netns.

  unshare(0x40060200)
  r0 = syz_open_procfs(0x0, &(0x7f0000002080)='net/raw\x00')
  socket$inet_icmp_raw(0x2, 0x3, 0x1)
  pread64(r0, &(0x7f0000000000)=""/10, 0xa, 0x10000000007f)

After commit 0daf07e527 ("raw: convert raw sockets to RCU"), we
use RCU and hlist_nulls_for_each_entry() to iterate over SOCK_RAW
sockets.  However, we should use spinlock for slow paths to avoid
the NULL deref.

Also, SOCK_RAW does not use SLAB_TYPESAFE_BY_RCU, and the slab object
is not reused during iteration in the grace period.  In fact, the
lockless readers do not check the nulls marker with get_nulls_value().
So, SOCK_RAW should use hlist instead of hlist_nulls.

Instead of adding an unnecessary barrier by sk_nulls_for_each_rcu(),
let's convert hlist_nulls to hlist and use sk_for_each_rcu() for
fast paths and sk_for_each() and spinlock for /proc/net/raw.

[0]:
general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
CPU: 2 PID: 20952 Comm: syz-executor.0 Not tainted 6.2.0-g048ec869bafd-dirty #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:read_pnet include/net/net_namespace.h:383 [inline]
RIP: 0010:sock_net include/net/sock.h:649 [inline]
RIP: 0010:raw_get_next net/ipv4/raw.c:974 [inline]
RIP: 0010:raw_get_idx net/ipv4/raw.c:986 [inline]
RIP: 0010:raw_seq_start+0x431/0x800 net/ipv4/raw.c:995
Code: ef e8 33 3d 94 f7 49 8b 6d 00 4c 89 ef e8 b7 65 5f f7 49 89 ed 49 83 c5 98 0f 84 9a 00 00 00 48 83 c5 c8 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 ef e8 00 3d 94 f7 4c 8b 7d 00 48 89 ef
RSP: 0018:ffffc9001154f9b0 EFLAGS: 00010206
RAX: 0000000000000005 RBX: 1ffff1100302c8fd RCX: 0000000000000000
RDX: 0000000000000028 RSI: ffffc9001154f988 RDI: ffffc9000f77a338
RBP: 0000000000000029 R08: ffffffff8a50ffb4 R09: fffffbfff24b6bd9
R10: fffffbfff24b6bd9 R11: 0000000000000000 R12: ffff88801db73b78
R13: fffffffffffffff9 R14: dffffc0000000000 R15: 0000000000000030
FS:  00007f843ae8e700(0000) GS:ffff888063700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055bb9614b35f CR3: 000000003c672000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 seq_read_iter+0x4c6/0x10f0 fs/seq_file.c:225
 seq_read+0x224/0x320 fs/seq_file.c:162
 pde_read fs/proc/inode.c:316 [inline]
 proc_reg_read+0x23f/0x330 fs/proc/inode.c:328
 vfs_read+0x31e/0xd30 fs/read_write.c:468
 ksys_pread64 fs/read_write.c:665 [inline]
 __do_sys_pread64 fs/read_write.c:675 [inline]
 __se_sys_pread64 fs/read_write.c:672 [inline]
 __x64_sys_pread64+0x1e9/0x280 fs/read_write.c:672
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4e/0xa0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x478d29
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f843ae8dbe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 0000000000791408 RCX: 0000000000478d29
RDX: 000000000000000a RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00000000f477909a R08: 0000000000000000 R09: 0000000000000000
R10: 000010000000007f R11: 0000000000000246 R12: 0000000000791740
R13: 0000000000791414 R14: 0000000000791408 R15: 00007ffc2eb48a50
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:read_pnet include/net/net_namespace.h:383 [inline]
RIP: 0010:sock_net include/net/sock.h:649 [inline]
RIP: 0010:raw_get_next net/ipv4/raw.c:974 [inline]
RIP: 0010:raw_get_idx net/ipv4/raw.c:986 [inline]
RIP: 0010:raw_seq_start+0x431/0x800 net/ipv4/raw.c:995
Code: ef e8 33 3d 94 f7 49 8b 6d 00 4c 89 ef e8 b7 65 5f f7 49 89 ed 49 83 c5 98 0f 84 9a 00 00 00 48 83 c5 c8 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 ef e8 00 3d 94 f7 4c 8b 7d 00 48 89 ef
RSP: 0018:ffffc9001154f9b0 EFLAGS: 00010206
RAX: 0000000000000005 RBX: 1ffff1100302c8fd RCX: 0000000000000000
RDX: 0000000000000028 RSI: ffffc9001154f988 RDI: ffffc9000f77a338
RBP: 0000000000000029 R08: ffffffff8a50ffb4 R09: fffffbfff24b6bd9
R10: fffffbfff24b6bd9 R11: 0000000000000000 R12: ffff88801db73b78
R13: fffffffffffffff9 R14: dffffc0000000000 R15: 0000000000000030
FS:  00007f843ae8e700(0000) GS:ffff888063700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f92ff166000 CR3: 000000003c672000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 0daf07e527 ("raw: convert raw sockets to RCU")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Dae R. Jeong <threeearcat@gmail.com>
Link: https://lore.kernel.org/netdev/ZCA2mGV_cmq7lIfV@dragonet/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:23 +02:00
Eric Dumazet
53a0031217 raw: use net_hash_mix() in hash function
[ Upstream commit 6579f5bacc2c4cbc5ef6abb45352416939d1f844 ]

Some applications seem to rely on RAW sockets.

If they use private netns, we can avoid piling all RAW
sockets bound to a given protocol into a single bucket.

Also place (struct raw_hashinfo).lock into its own
cache line to limit false sharing.

Alternative would be to have per-netns hashtables,
but this seems too expensive for most netns
where RAW sockets are not used.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 0a78cf7264d2 ("raw: Fix NULL deref in raw_get_next().")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:23 +02:00
Eric Dumazet
0185e87c69 icmp: guard against too small mtu
[ Upstream commit 7d63b67125382ff0ffdfca434acbc94a38bd092b ]

syzbot was able to trigger a panic [1] in icmp_glue_bits(), or
more exactly in skb_copy_and_csum_bits()

There is no repro yet, but I think the issue is that syzbot
manages to lower device mtu to a small value, fooling __icmp_send()

__icmp_send() must make sure there is enough room for the
packet to include at least the headers.

We might in the future refactor skb_copy_and_csum_bits() and its
callers to no longer crash when something bad happens.

[1]
kernel BUG at net/core/skbuff.c:3343 !
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 15766 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00039-gffe78bbd5121 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:skb_copy_and_csum_bits+0x798/0x860 net/core/skbuff.c:3343
Code: f0 c1 c8 08 41 89 c6 e9 73 ff ff ff e8 61 48 d4 f9 e9 41 fd ff ff 48 8b 7c 24 48 e8 52 48 d4 f9 e9 c3 fc ff ff e8 c8 27 84 f9 <0f> 0b 48 89 44 24 28 e8 3c 48 d4 f9 48 8b 44 24 28 e9 9d fb ff ff
RSP: 0018:ffffc90000007620 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000001e8 RCX: 0000000000000100
RDX: ffff8880276f6280 RSI: ffffffff87fdd138 RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
R10: 00000000000001e8 R11: 0000000000000001 R12: 000000000000003c
R13: 0000000000000000 R14: ffff888028244868 R15: 0000000000000b0e
FS: 00007fbc81f1c700(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2df43000 CR3: 00000000744db000 CR4: 0000000000150ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
icmp_glue_bits+0x7b/0x210 net/ipv4/icmp.c:353
__ip_append_data+0x1d1b/0x39f0 net/ipv4/ip_output.c:1161
ip_append_data net/ipv4/ip_output.c:1343 [inline]
ip_append_data+0x115/0x1a0 net/ipv4/ip_output.c:1322
icmp_push_reply+0xa8/0x440 net/ipv4/icmp.c:370
__icmp_send+0xb80/0x1430 net/ipv4/icmp.c:765
ipv4_send_dest_unreach net/ipv4/route.c:1239 [inline]
ipv4_link_failure+0x5a9/0x9e0 net/ipv4/route.c:1246
dst_link_failure include/net/dst.h:423 [inline]
arp_error_report+0xcb/0x1c0 net/ipv4/arp.c:296
neigh_invalidate+0x20d/0x560 net/core/neighbour.c:1079
neigh_timer_handler+0xc77/0xff0 net/core/neighbour.c:1166
call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
expire_timers+0x29b/0x4b0 kernel/time/timer.c:1751
__run_timers kernel/time/timer.c:2022 [inline]

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+d373d60fddbdc915e666@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230330174502.1915328-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:21 +02:00
Eric Dumazet
9c7d680368 erspan: do not use skb_mac_header() in ndo_start_xmit()
[ Upstream commit 8e50ed774554f93d55426039b27b1e38d7fa64d8 ]

Drivers should not assume skb_mac_header(skb) == skb->data in their
ndo_start_xmit().

Use skb_network_offset() and skb_transport_offset() which
better describe what is needed in erspan_fb_xmit() and
ip6erspan_tunnel_xmit()

syzbot reported:
WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 skb_mac_header include/linux/skbuff.h:2873 [inline]
WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962
Modules linked in:
CPU: 0 PID: 5083 Comm: syz-executor406 Not tainted 6.3.0-rc2-syzkaller-00866-gd4671cb96fa3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:skb_mac_header include/linux/skbuff.h:2873 [inline]
RIP: 0010:ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962
Code: 04 02 41 01 de 84 c0 74 08 3c 03 0f 8e 1c 0a 00 00 45 89 b4 24 c8 00 00 00 c6 85 77 fe ff ff 01 e9 33 e7 ff ff e8 b4 27 a1 f8 <0f> 0b e9 b6 e7 ff ff e8 a8 27 a1 f8 49 8d bf f0 0c 00 00 48 b8 00
RSP: 0018:ffffc90003b2f830 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000000
RDX: ffff888021273a80 RSI: ffffffff88e1bd4c RDI: 0000000000000003
RBP: ffffc90003b2f9d8 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000000 R12: ffff88802b28da00
R13: 00000000000000d0 R14: ffff88807e25b6d0 R15: ffff888023408000
FS: 0000555556a61300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055e5b11eb6e8 CR3: 0000000027c1b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__netdev_start_xmit include/linux/netdevice.h:4900 [inline]
netdev_start_xmit include/linux/netdevice.h:4914 [inline]
__dev_direct_xmit+0x504/0x730 net/core/dev.c:4300
dev_direct_xmit include/linux/netdevice.h:3088 [inline]
packet_xmit+0x20a/0x390 net/packet/af_packet.c:285
packet_snd net/packet/af_packet.c:3075 [inline]
packet_sendmsg+0x31a0/0x5150 net/packet/af_packet.c:3107
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg+0xde/0x190 net/socket.c:747
__sys_sendto+0x23a/0x340 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f123aaa1039
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc15d12058 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f123aaa1039
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000020000040 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f123aa648c0
R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000

Fixes: 1baf5ebf89 ("erspan: auto detect truncated packets.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230320163427.8096-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:49:09 +02:00
Ido Schimmel
1c5642cfa6 ipv4: Fix incorrect table ID in IOCTL path
[ Upstream commit 8a2618e14f81604a9b6ad305d57e0c8da939cd65 ]

Commit f96a3d7455 ("ipv4: Fix incorrect route flushing when source
address is deleted") started to take the table ID field in the FIB info
structure into account when determining if two structures are identical
or not. This field is initialized using the 'fc_table' field in the
route configuration structure, which is not set when adding a route via
IOCTL.

The above can result in user space being able to install two identical
routes that only differ in the table ID field of their associated FIB
info.

Fix by initializing the table ID field in the route configuration
structure in the IOCTL path.

Before the fix:

 # ip route add default via 192.0.2.2
 # route add default gw 192.0.2.2
 # ip -4 r show default
 # default via 192.0.2.2 dev dummy10
 # default via 192.0.2.2 dev dummy10

After the fix:

 # ip route add default via 192.0.2.2
 # route add default gw 192.0.2.2
 SIOCADDRT: File exists
 # ip -4 r show default
 default via 192.0.2.2 dev dummy10

Audited the code paths to ensure there are no other paths that do not
properly initialize the route configuration structure when installing a
route.

Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Fixes: f96a3d7455 ("ipv4: Fix incorrect route flushing when source address is deleted")
Reported-by: gaoxingwang <gaoxingwang1@huawei.com>
Link: https://lore.kernel.org/netdev/20230314144159.2354729-1-gaoxingwang1@huawei.com/
Tested-by: gaoxingwang <gaoxingwang1@huawei.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230315124009.4015212-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:33:50 +01:00
Kuniyuki Iwashima
b339c0af83 tcp: Fix bind() conflict check for dual-stack wildcard address.
[ Upstream commit d9ba9934285514f1f95d96326a82398a22dc77f2 ]

Paul Holzinger reported [0] that commit 5456262d2b ("net: Fix
incorrect address comparison when searching for a bind2 bucket")
introduced a bind() regression.  Paul also gave a nice repro that
calls two types of bind() on the same port, both of which now
succeed, but the second call should fail:

  bind(fd1, ::, port) + bind(fd2, 127.0.0.1, port)

The cited commit added address family tests in three functions to
fix the uninit-value KMSAN report. [1]  However, the test added to
inet_bind2_bucket_match_addr_any() removed a necessary conflict
check; the dual-stack wildcard address no longer conflicts with
an IPv4 non-wildcard address.

If tb->family is AF_INET6 and sk->sk_family is AF_INET in
inet_bind2_bucket_match_addr_any(), we still need to check
if tb has the dual-stack wildcard address.

Note that the IPv4 wildcard address does not conflict with
IPv6 non-wildcard addresses.

[0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
[1]: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/

Fixes: 5456262d2b ("net: Fix incorrect address comparison when searching for a bind2 bucket")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reported-by: Paul Holzinger <pholzing@redhat.com>
Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Paul Holzinger <pholzing@redhat.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:33:46 +01:00
Eric Dumazet
a69b72b57b net: tunnels: annotate lockless accesses to dev->needed_headroom
[ Upstream commit 4b397c06cb987935b1b097336532aa6b4210e091 ]

IP tunnels can apparently update dev->needed_headroom
in their xmit path.

This patch takes care of three tunnels xmit, and also the
core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA()
helpers.

More changes might be needed for completeness.

BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit

read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1:
ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:444 [inline]
ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246

write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0:
ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
__netdev_start_xmit include/linux/netdevice.h:4881 [inline]
netdev_start_xmit include/linux/netdevice.h:4895 [inline]
xmit_one net/core/dev.c:3580 [inline]
dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
__dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
dev_queue_xmit include/linux/netdevice.h:3051 [inline]
neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
neigh_output include/net/neighbour.h:546 [inline]
ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134
__ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227
dst_output include/net/dst.h:444 [inline]
NF_HOOK include/linux/netfilter.h:302 [inline]
mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820
mld_send_cr net/ipv6/mcast.c:2121 [inline]
mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653
process_one_work+0x3e6/0x750 kernel/workqueue.c:2390
worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537
kthread+0x1ac/0x1e0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

value changed: 0x0dd4 -> 0x0e14

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5fa354-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
Workqueue: mld mld_ifc_work

Fixes: 8eb30be035 ("ipv6: Create ip6_tnl_xmit")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:33:46 +01:00
Breno Leitao
9180aa4622 tcp: tcp_make_synack() can be called from process context
[ Upstream commit bced3f7db95ff2e6ca29dc4d1c9751ab5e736a09 ]

tcp_rtx_synack() now could be called in process context as explained in
0a375c8224 ("tcp: tcp_rtx_synack() can be called from process
context").

tcp_rtx_synack() might call tcp_make_synack(), which will touch per-CPU
variables with preemption enabled. This causes the following BUG:

    BUG: using __this_cpu_add() in preemptible [00000000] code: ThriftIO1/5464
    caller is tcp_make_synack+0x841/0xac0
    Call Trace:
     <TASK>
     dump_stack_lvl+0x10d/0x1a0
     check_preemption_disabled+0x104/0x110
     tcp_make_synack+0x841/0xac0
     tcp_v6_send_synack+0x5c/0x450
     tcp_rtx_synack+0xeb/0x1f0
     inet_rtx_syn_ack+0x34/0x60
     tcp_check_req+0x3af/0x9e0
     tcp_rcv_state_process+0x59b/0x2030
     tcp_v6_do_rcv+0x5f5/0x700
     release_sock+0x3a/0xf0
     tcp_sendmsg+0x33/0x40
     ____sys_sendmsg+0x2f2/0x490
     __sys_sendmsg+0x184/0x230
     do_syscall_64+0x3d/0x90

Avoid calling __TCP_INC_STATS() with will touch per-cpu variables. Use
TCP_INC_STATS() which is safe to be called from context switch.

Fixes: 8336886f78 ("tcp: TCP Fast Open Server - support TFO listeners")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230308190745.780221-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:33:43 +01:00
Florian Westphal
079d37e162 netfilter: tproxy: fix deadlock due to missing BH disable
[ Upstream commit 4a02426787bf024dafdb79b362285ee325de3f5e ]

The xtables packet traverser performs an unconditional local_bh_disable(),
but the nf_tables evaluation loop does not.

Functions that are called from either xtables or nftables must assume
that they can be called in process context.

inet_twsk_deschedule_put() assumes that no softirq interrupt can occur.
If tproxy is used from nf_tables its possible that we'll deadlock
trying to aquire a lock already held in process context.

Add a small helper that takes care of this and use it.

Link: https://lore.kernel.org/netfilter-devel/401bd6ed-314a-a196-1cdc-e13c720cc8f2@balasys.hu/
Fixes: 4ed8eb6570 ("netfilter: nf_tables: Add native tproxy support")
Reported-and-tested-by: Major Dávid <major.david@balasys.hu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17 08:50:25 +01:00
Liu Jian
f45cf3ae30 bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
[ Upstream commit d900f3d20cc3169ce42ec72acc850e662a4d4db2 ]

When the buffer length of the recvmsg system call is 0, we got the
flollowing soft lockup problem:

watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768
RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040
RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7
R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800
R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0
FS:  00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tcp_msg_wait_data+0x279/0x2f0
 tcp_bpf_recvmsg_parser+0x3c6/0x490
 inet_recvmsg+0x280/0x290
 sock_recvmsg+0xfc/0x120
 ____sys_recvmsg+0x160/0x3d0
 ___sys_recvmsg+0xf0/0x180
 __sys_recvmsg+0xea/0x1a0
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

The logic in tcp_bpf_recvmsg_parser is as follows:

msg_bytes_ready:
	copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
	if (!copied) {
		wait data;
		goto msg_bytes_ready;
	}

In this case, "copied" always is 0, the infinite loop occurs.

According to the Linux system call man page, 0 should be returned in this
case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly
return. Also modify several other functions with the same problem.

Fixes: 1f5be6b3b0 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: 9825d866ce ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: c5d2177a72 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17 08:50:24 +01:00
Eric Dumazet
95c131b41f tcp: tcp_check_req() can be called from process context
[ Upstream commit 580f98cc33a260bb8c6a39ae2921b29586b84fdf ]

This is a follow up of commit 0a375c8224 ("tcp: tcp_rtx_synack()
can be called from process context").

Frederick Lawler reported another "__this_cpu_add() in preemptible"
warning caused by the same reason.

In my former patch I took care of tcp_rtx_synack()
but forgot that tcp_check_req() also contained some SNMP updates.

Note that some parts of tcp_check_req() always run in BH context,
I added a comment to clarify this.

Fixes: 8336886f78 ("tcp: TCP Fast Open Server - support TFO listeners")
Link: https://lore.kernel.org/netdev/8cd33923-a21d-397c-e46b-2a068c287b03@cloudflare.com/T/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Frederick Lawler <fred@cloudflare.com>
Tested-by: Frederick Lawler <fred@cloudflare.com>
Link: https://lore.kernel.org/r/20230227083336.4153089-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:29 +01:00
Pavel Tikhomirov
512b6c4b83 netfilter: x_tables: fix percpu counter block leak on error path when creating new netns
[ Upstream commit 0af8c09c896810879387decfba8c942994bb61f5 ]

Here is the stack where we allocate percpu counter block:

  +-< __alloc_percpu
    +-< xt_percpu_counter_alloc
      +-< find_check_entry # {arp,ip,ip6}_tables.c
        +-< translate_table

And it can be leaked on this code path:

  +-> ip6t_register_table
    +-> translate_table # allocates percpu counter block
    +-> xt_register_table # fails

there is no freeing of the counter block on xt_register_table fail.
Note: xt_percpu_counter_free should be called to free it like we do in
do_replace through cleanup_entry helper (or in __ip6t_unregister_table).

Probability of hitting this error path is low AFAICS (xt_register_table
can only return ENOMEM here, as it is not replacing anything, as we are
creating new netns, and it is hard to imagine that all previous
allocations succeeded and after that one in xt_register_table failed).
But it's worth fixing even the rare leak.

Fixes: 71ae0dff02 ("netfilter: xtables: use percpu rule counters")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:25 +01:00
Florian Westphal
3dd6ac9733 netfilter: ebtables: fix table blob use-after-free
[ Upstream commit e58a171d35e32e6e8c37cfe0e8a94406732a331f ]

We are not allowed to return an error at this point.
Looking at the code it looks like ret is always 0 at this
point, but its not.

t = find_table_lock(net, repl->name, &ret, &ebt_mutex);

... this can return a valid table, with ret != 0.

This bug causes update of table->private with the new
blob, but then frees the blob right away in the caller.

Syzbot report:

BUG: KASAN: vmalloc-out-of-bounds in __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168
Read of size 4 at addr ffffc90005425000 by task kworker/u4:4/74
Workqueue: netns cleanup_net
Call Trace:
 kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
 __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168
 ebt_unregister_table+0x35/0x40 net/bridge/netfilter/ebtables.c:1372
 ops_exit_list+0xb0/0x170 net/core/net_namespace.c:169
 cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613
...

ip(6)tables appears to be ok (ret should be 0 at this point) but make
this more obvious.

Fixes: c58dd2dd44 ("netfilter: Can't fail and free after table replacement")
Reported-by: syzbot+f61594de72d6705aea03@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:24 +01:00
Pietro Borrello
6965c92ef4 inet: fix fast path in __inet_hash_connect()
[ Upstream commit 21cbd90a6fab7123905386985e3e4a80236b8714 ]

__inet_hash_connect() has a fast path taken if sk_head(&tb->owners) is
equal to the sk parameter.
sk_head() returns the hlist_entry() with respect to the sk_node field.
However entries in the tb->owners list are inserted with respect to the
sk_bind_node field with sk_add_bind_node().
Thus the check would never pass and the fast path never execute.

This fast path has never been executed or tested as this bug seems
to be present since commit 1da177e4c3 ("Linux-2.6.12-rc2"), thus
remove it to reduce code complexity.

Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230112-inet_hash_connect_bind_head-v3-1-b591fd212b93@diag.uniroma1.it
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:49 +01:00
Kevin Yang
0ae9d81109 txhash: fix sk->sk_txrehash default
[ Upstream commit c11204c78d6966c5bda6dd05c3ac5cbb193f93e3 ]

This code fix a bug that sk->sk_txrehash gets its default enable
value from sysctl_txrehash only when the socket is a TCP listener.

We should have sysctl_txrehash to set the default sk->sk_txrehash,
no matter TCP, nor listerner/connector.

Tested by following packetdrill:
  0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
  +0 socket(..., SOCK_DGRAM, IPPROTO_UDP) = 4
  // SO_TXREHASH == 74, default to sysctl_txrehash == 1
  +0 getsockopt(3, SOL_SOCKET, 74, [1], [4]) = 0
  +0 getsockopt(4, SOL_SOCKET, 74, [1], [4]) = 0

Fixes: 26859240e4 ("txhash: Add socket option to control TX hash rethink behavior")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14 19:11:48 +01:00
Al Viro
5a19095103 use less confusing names for iov_iter direction initializers
[ Upstream commit de4eda9de2d957ef2d6a8365a01e26a435e958cb ]

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 6dd88fd59da8 ("vhost-scsi: unbreak any layout for response")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:04 +01:00