From 0d7c3fb4212429b1746bced663d2b556714bf373 Mon Sep 17 00:00:00 2001
From: Alan Stern
Date: Sat, 29 Jul 2023 10:59:38 -0400
Subject: [PATCH 1/9] UPSTREAM: USB: Gadget: core: Help prevent panic during
 UVC unconfigure

Avichal Rakesh reported a kernel panic that occurred when the UVC
gadget driver was removed from a gadget's configuration. The panic
involves a somewhat complicated interaction between the kernel driver
and a userspace component (as described in the Link tag below), but
the analysis did make one thing clear: The Gadget core should
accommodate gadget drivers calling usb_gadget_deactivate() as part of
their unbind procedure.

Currently this doesn't work: gadget_unbind_driver() calls
driver->unbind() while holding the udc->connect_lock mutex, and
usb_gadget_deactivate() attempts to acquire that mutex, which results
in a deadlock.

The simple fix is for gadget_unbind_driver() to release the mutex when
invoking the ->unbind() callback. There is no particular reason for it
to be holding the mutex at that time, and the mutex isn't held while
the ->bind() callback is invoked. So we'll drop the mutex before
performing the unbind callback and reacquire it afterward.

We'll also add a couple of comments to usb_gadget_activate() and
usb_gadget_deactivate(). Because they run in process context they must
not be called from a gadget driver's ->disconnect() callback, which
(according to the kerneldoc for struct usb_gadget_driver in
include/linux/usb/gadget.h) may run in interrupt context. This may
help prevent similar bugs from arising in the future.
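To make the deadlock concrete, here is a minimal userspace analogue of
the pattern (pthread stand-ins and invented function names, not the
kernel code); removing the unlock/relock pair around the callback
reintroduces the hang:

  #include <pthread.h>
  #include <stdio.h>

  static pthread_mutex_t connect_lock = PTHREAD_MUTEX_INITIALIZER;

  /* analogue of usb_gadget_deactivate(): takes the lock itself */
  static void deactivate(void)
  {
  	pthread_mutex_lock(&connect_lock);	/* deadlocks if the caller holds it */
  	puts("deactivated");
  	pthread_mutex_unlock(&connect_lock);
  }

  /* analogue of driver->unbind(): may legitimately call deactivate() */
  static void driver_unbind(void)
  {
  	deactivate();
  }

  /* analogue of gadget_unbind_driver() after the fix */
  static void unbind_driver(void)
  {
  	pthread_mutex_lock(&connect_lock);
  	/* ... teardown that genuinely needs the lock ... */
  	pthread_mutex_unlock(&connect_lock);	/* drop it around the callback */

  	driver_unbind();

  	pthread_mutex_lock(&connect_lock);	/* reacquire for the rest */
  	/* ... stop the UDC ... */
  	pthread_mutex_unlock(&connect_lock);
  }

  int main(void)
  {
  	unbind_driver();
  	return 0;
  }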
Reported-and-tested-by: Avichal Rakesh
Signed-off-by: Alan Stern
Fixes: 286d9975a838 ("usb: gadget: udc: core: Prevent soft_connect_store() race")
Link: https://lore.kernel.org/linux-usb/4d7aa3f4-22d9-9f5a-3d70-1bd7148ff4ba@google.com/
Cc: Badhri Jagan Sridharan
Cc:
Link: https://lore.kernel.org/r/48b2f1f1-0639-46bf-bbfc-98cb05a24914@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman
Bug: 291976100
(cherry picked from commit 65dadb2beeb7360232b09ebc4585b54475dfee06)
Signed-off-by: Avichal Rakesh
(cherry picked from https://android-review.googlesource.com/q/commit:edfcc0980a22fdd21490ba38f4ae24346b562e39)
Merged-In: Icff01d8e88f041af4bda8726242de9cd518a247a
Change-Id: Icff01d8e88f041af4bda8726242de9cd518a247a
---
 drivers/usb/gadget/udc/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 9568b13c05f4..93695ce5fef0 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -795,6 +795,9 @@ EXPORT_SYMBOL_GPL(usb_gadget_disconnect);
  * usb_gadget_activate() is called.  For example, user mode components may
  * need to be activated before the system can talk to hosts.
  *
+ * This routine may sleep; it must not be called in interrupt context
+ * (such as from within a gadget driver's disconnect() callback).
+ *
  * Returns zero on success, else negative errno.
  */
 int usb_gadget_deactivate(struct usb_gadget *gadget)
@@ -833,6 +836,8 @@ EXPORT_SYMBOL_GPL(usb_gadget_deactivate);
  * This routine activates gadget which was previously deactivated with
  * usb_gadget_deactivate() call. It calls usb_gadget_connect() if needed.
  *
+ * This routine may sleep; it must not be called in interrupt context.
+ *
  * Returns zero on success, else negative errno.
  */
 int usb_gadget_activate(struct usb_gadget *gadget)
@@ -1625,7 +1630,11 @@ static void gadget_unbind_driver(struct device *dev)
 	usb_gadget_disable_async_callbacks(udc);
 	if (gadget->irq)
 		synchronize_irq(gadget->irq);
+	mutex_unlock(&udc->connect_lock);
+
 	udc->driver->unbind(gadget);
+
+	mutex_lock(&udc->connect_lock);
 	usb_gadget_udc_stop_locked(udc);
 	mutex_unlock(&udc->connect_lock);

From 64466a748ac89e573955f7e881a242d1ebe1e053 Mon Sep 17 00:00:00 2001
From: Kalesh Singh
Date: Tue, 1 Aug 2023 19:56:02 -0700
Subject: [PATCH 2/9] FROMGIT: Multi-gen LRU: Fix per-zone reclaim

MGLRU has an LRU list for each zone for each type (anon/file) in each
generation:

	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];

The min_seq (oldest generation) can progress independently for each
type but the max_seq (youngest generation) is shared for both anon and
file. This is to maintain a common frame of reference.

In order for eviction to advance the min_seq of a type, all the
per-zone lists in the oldest generation of that type must be empty.

The eviction logic only considers pages from eligible zones for
eviction or promotion.

    scan_folios()
    {
	...
	for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
	    ...
	    sort_folio();	// Promote
	    ...
	    isolate_folio();	// Evict
	}
	...
    }

Consider a system that has the movable zone configured and the default
4 generations. The current state of the system is as shown below
(only illustrating one type for simplicity):

	Type: ANON

	Zone    DMA32     Normal    Movable    Device

	Gen 0       0         0       4GB          0

	Gen 1       0       1GB       1MB          0

	Gen 2     1MB       4GB       1MB          0

	Gen 3     1MB       1MB       1MB          0

Now consider a GFP_KERNEL allocation request (eligible zone index <=
Normal): evict_folios() will return without doing any work, since
there are no pages to scan in the eligible zones of the oldest
generation. Reclaim won't make progress until triggered from a
ZONE_MOVABLE allocation request, which may not happen soon if there is
a lot of free memory in the movable zone. This can lead to OOM kills,
although there is 1GB of pages in the Normal zone of Gen 1 that we
have not yet tried to reclaim.

This issue is not seen in the conventional active/inactive LRU since
there are no per-zone lists.

If there are no (or not enough) folios to scan in the eligible zones,
move folios from the ineligible zones (zone_index > reclaim_idx) to
the next generation. This allows for the progression of min_seq and
reclaiming from the next generation (Gen 1). The sketch after this
message illustrates the resulting scan order.

Qualcomm, Mediatek and raspberrypi [1] discovered this issue
independently.

[1] https://github.com/raspberrypi/linux/issues/5395
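To make the new scan order concrete, here is a small standalone sketch
(the zone indices DMA32=0, Normal=1, Movable=2, Device=3 are assumed
for illustration; they are not the kernel's actual enum values):

  #include <stdio.h>

  #define MAX_NR_ZONES 4	/* assumed: DMA32=0, Normal=1, Movable=2, Device=3 */

  int main(void)
  {
  	int reclaim_idx = 1;	/* GFP_KERNEL: zones above Normal are ineligible */
  	int i;

  	for (i = MAX_NR_ZONES; i > 0; i--) {
  		int zone = (reclaim_idx + i) % MAX_NR_ZONES;

  		printf("scan zone %d (%s)\n", zone,
  		       zone <= reclaim_idx ? "eligible: evict/promote"
  					   : "ineligible: age to next gen");
  	}
  	return 0;	/* order: 1, 0, 3, 2 - eligible zones first */
  }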
Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Change-Id: I5bbf44bd7ffe42f4347df4be59a75c1603c9b947
Signed-off-by: Kalesh Singh
Reported-by: Charan Teja Kalla
Reported-by: Lecopzer Chen
Tested-by: AngeloGioacchino Del Regno [mediatek]
Tested-by: Charan Teja Kalla
Cc: Yu Zhao
Cc: Barry Song
Cc: Brian Geffon
Cc: Jan Alexander Steffens (heftig)
Cc: Matthias Brugger
Cc: Oleksandr Natalenko
Cc: Qi Zheng
Cc: Steven Barrett
Cc: Suleiman Souhlal
Cc: Suren Baghdasaryan
Cc: Aneesh Kumar K V
Cc:
Signed-off-by: Andrew Morton
(cherry picked from commit 1462260adc41c5974362cb54ff577c2a15b8c7b2
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 295078665
Bug: 288383787
Bug: 291719697
Bug: 296020093
Bug: 296152871
Signed-off-by: Kalesh Singh
(cherry picked from commit a7adb988970e13c42f3c7ca4fe157c35e8e885fe)
(cherry picked from commit 00ff53e1e06236335c32a14a7bcb87ea5ce2cc0d)
---
 mm/vmscan.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index cd4323f336a6..cfde79671c01 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4789,7 +4789,8 @@ static int lru_gen_memcg_seg(struct lruvec *lruvec)
  *                          the eviction
  ******************************************************************************/
 
-static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
+static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
+		       int tier_idx)
 {
 	bool success;
 	int gen = folio_lru_gen(folio);
@@ -4839,6 +4840,13 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
 		return true;
 	}
 
+	/* ineligible */
+	if (zone > sc->reclaim_idx) {
+		gen = folio_inc_gen(lruvec, folio, false);
+		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		return true;
+	}
+
 	/* waiting for writeback */
 	if (folio_test_locked(folio) || folio_test_writeback(folio) ||
 	    (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
@@ -4887,7 +4895,8 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 		       int type, int tier, struct list_head *list)
 {
-	int gen, zone;
+	int i;
+	int gen;
 	enum vm_event_item item;
 	int sorted = 0;
 	int scanned = 0;
@@ -4903,9 +4912,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 
 	gen = lru_gen_from_seq(lrugen->min_seq[type]);
 
-	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
+	for (i = MAX_NR_ZONES; i > 0; i--) {
 		LIST_HEAD(moved);
 		int skipped = 0;
+		int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
 		struct list_head *head = &lrugen->folios[gen][type][zone];
 
 		while (!list_empty(head)) {
@@ -4919,7 +4929,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 
 			scanned += delta;
 
-			if (sort_folio(lruvec, folio, tier))
+			if (sort_folio(lruvec, folio, sc, tier))
 				sorted += delta;
 			else if (isolate_folio(lruvec, folio, sc)) {
 				list_add(&folio->lru, list);

From e5908ce872306a87a27f7d1635557477aef03097 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?=
Date: Tue, 13 Jun 2023 10:09:20 +0200
Subject: [PATCH 3/9] UPSTREAM: dma-buf: keep the signaling time of merged
 fences v3
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Some Android CTS tests check that the signaling time of a fence stays
consistent when it is merged with others.

v2: use the current time if the fence is still in the signaling path and
    the timestamp is not yet available.
v3: improve comment, fix one more case to use the correct timestamp
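The timestamp-selection rule is easy to model outside the kernel; a
minimal sketch with stand-in types (the ktime_t typedef, now() and the
fence struct below are simplifications, not the dma_fence API) might
look like:

  #include <stdbool.h>
  #include <stdint.h>
  #include <time.h>

  typedef int64_t ktime_t;	/* stand-in for the kernel's ktime_t */

  struct fence {
  	bool signaled;
  	bool timestamp_set;	/* models DMA_FENCE_FLAG_TIMESTAMP_BIT */
  	ktime_t timestamp;
  };

  static ktime_t now(void)	/* stand-in for ktime_get() */
  {
  	struct timespec ts;

  	clock_gettime(CLOCK_MONOTONIC, &ts);
  	return (ktime_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
  }

  /* Timestamp for the merged stub: the latest recorded signaling time,
   * falling back to the current time when a fence has signaled but not
   * yet published its timestamp. */
  static ktime_t merged_timestamp(const struct fence *f, int n)
  {
  	ktime_t stamp = 0;
  	int i;

  	for (i = 0; i < n; i++) {
  		if (!f[i].signaled)
  			continue;
  		if (f[i].timestamp_set) {
  			if (f[i].timestamp > stamp)
  				stamp = f[i].timestamp;
  		} else {
  			stamp = now();	/* still in the signaling path */
  		}
  	}
  	return stamp;
  }

  int main(void)
  {
  	struct fence f[2] = {
  		{ .signaled = true, .timestamp_set = true, .timestamp = 100 },
  		{ .signaled = true, .timestamp_set = true, .timestamp = 250 },
  	};

  	return merged_timestamp(f, 2) == 250 ? 0 : 1;
  }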
Bug: 286438670
Bug: 296793123
Signed-off-by: Christian König
Reviewed-by: Luben Tuikov
Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com
(cherry picked from commit f781f661e8c99b0cb34129f2e374234d61864e77)
Change-Id: I5cd3178213fc28ac67146f58fddf83f7d482fd76
Signed-off-by: Jindong Yue
---
 drivers/dma-buf/dma-fence-unwrap.c | 26 ++++++++++++++++++++++----
 drivers/dma-buf/dma-fence.c        |  5 +++--
 drivers/gpu/drm/drm_syncobj.c      |  2 +-
 include/linux/dma-fence.h          |  2 +-
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
index 7002bca792ff..c625bb2b5d56 100644
--- a/drivers/dma-buf/dma-fence-unwrap.c
+++ b/drivers/dma-buf/dma-fence-unwrap.c
@@ -66,18 +66,36 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
 {
 	struct dma_fence_array *result;
 	struct dma_fence *tmp, **array;
+	ktime_t timestamp;
 	unsigned int i;
 	size_t count;
 
 	count = 0;
+	timestamp = ns_to_ktime(0);
 	for (i = 0; i < num_fences; ++i) {
-		dma_fence_unwrap_for_each(tmp, &iter[i], fences[i])
-			if (!dma_fence_is_signaled(tmp))
+		dma_fence_unwrap_for_each(tmp, &iter[i], fences[i]) {
+			if (!dma_fence_is_signaled(tmp)) {
 				++count;
+			} else if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
+					    &tmp->flags)) {
+				if (ktime_after(tmp->timestamp, timestamp))
+					timestamp = tmp->timestamp;
+			} else {
+				/*
+				 * Use the current time if the fence is
+				 * currently signaling.
+				 */
+				timestamp = ktime_get();
+			}
+		}
 	}
 
+	/*
+	 * If we couldn't find a pending fence just return a private signaled
+	 * fence with the timestamp of the last signaled one.
+	 */
 	if (count == 0)
-		return dma_fence_get_stub();
+		return dma_fence_allocate_private_stub(timestamp);
 
 	array = kmalloc_array(count, sizeof(*array), GFP_KERNEL);
 	if (!array)
@@ -138,7 +156,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
 	} while (tmp);
 
 	if (count == 0) {
-		tmp = dma_fence_get_stub();
+		tmp = dma_fence_allocate_private_stub(ktime_get());
 		goto return_tmp;
 	}
 
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0de0482cd36e..3855dc747fe5 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -150,10 +150,11 @@ EXPORT_SYMBOL(dma_fence_get_stub);
 
 /**
  * dma_fence_allocate_private_stub - return a private, signaled fence
+ * @timestamp: timestamp when the fence was signaled
  *
  * Return a newly allocated and signaled stub fence.
 */
-struct dma_fence *dma_fence_allocate_private_stub(void)
+struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp)
 {
 	struct dma_fence *fence;
 
@@ -169,7 +170,7 @@ struct dma_fence *dma_fence_allocate_private_stub(void)
 	set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
 		&fence->flags);
 
-	dma_fence_signal(fence);
+	dma_fence_signal_timestamp(fence, timestamp);
 
 	return fence;
 }
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 0c2be8360525..04589a35eb09 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -353,7 +353,7 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence);
  */
 static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-	struct dma_fence *fence = dma_fence_allocate_private_stub();
+	struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get());
 
 	if (IS_ERR(fence))
 		return PTR_ERR(fence);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..be572c3a4dcd 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -584,7 +584,7 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
 }
 
 struct dma_fence *dma_fence_get_stub(void);
-struct dma_fence *dma_fence_allocate_private_stub(void);
+struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp);
 u64 dma_fence_context_alloc(unsigned num);
 
 extern const struct dma_fence_ops dma_fence_array_ops;

From 42f00a4f6661f7f44debf23c39a7c3c6a48bdb14 Mon Sep 17 00:00:00 2001
From: Dan Carpenter
Date: Thu, 6 Jul 2023 15:37:51 +0300
Subject: [PATCH 4/9] UPSTREAM: dma-buf: fix an error pointer vs NULL bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Smatch detected a potential error pointer dereference:

    drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline()
    error: 'fence' dereferencing possible ERR_PTR()

The error pointer comes from dma_fence_allocate_private_stub(). One
caller expected error pointers and one expected NULL pointers. Change
it to return NULL, and update the caller which expected error
pointers, drm_syncobj_assign_null_handle(), to check for NULL instead.
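For reference, the two return conventions in play can be reproduced in
a few lines of userspace C (the real ERR_PTR()/IS_ERR() helpers live in
include/linux/err.h; this is a simplified copy). Note that each
convention's check misses the other's sentinel, which is exactly how
the dereference happened:

  #include <stdio.h>

  #define MAX_ERRNO	4095	/* same bound as include/linux/err.h */

  static inline void *ERR_PTR(long error) { return (void *)error; }
  static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
  static inline int IS_ERR(const void *ptr)
  {
  	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
  }

  int main(void)
  {
  	void *err = ERR_PTR(-12);	/* ERR_PTR convention: encodes -ENOMEM */
  	void *null = NULL;		/* NULL convention: no encoding */

  	printf("IS_ERR(err)  = %d\n", IS_ERR(err));	/* 1 */
  	printf("IS_ERR(null) = %d\n", IS_ERR(null));	/* 0: IS_ERR() misses NULL */
  	printf("!err         = %d\n", !err);		/* 0: !ptr misses ERR_PTRs */
  	return 0;
  }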
Bug: 286438670
Bug: 296793123
Fixes: f781f661e8c9 ("dma-buf: keep the signaling time of merged fences v3")
Signed-off-by: Dan Carpenter
Reviewed-by: Christian König
Reviewed-by: Sumit Semwal
Signed-off-by: Sumit Semwal
Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68262e43@moroto.mountain
(cherry picked from commit 00ae1491f970acc454be0df63f50942d94825860)
Change-Id: I9fe1e61543e84a0f22d8ec26e01d94b809620744
Signed-off-by: Jindong Yue
---
 drivers/dma-buf/dma-fence.c   | 2 +-
 drivers/gpu/drm/drm_syncobj.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3855dc747fe5..eef4786aaf86 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -160,7 +160,7 @@ struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp)
 
 	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
 	if (fence == NULL)
-		return ERR_PTR(-ENOMEM);
+		return NULL;
 
 	dma_fence_init(fence,
 		       &dma_fence_stub_ops,
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 04589a35eb09..e592c5da70ce 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -355,8 +355,8 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
 	struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get());
 
-	if (IS_ERR(fence))
-		return PTR_ERR(fence);
+	if (!fence)
+		return -ENOMEM;
 
 	drm_syncobj_replace_fence(syncobj, fence);
 	dma_fence_put(fence);

From 7e9bdc228e7f13b53bf954a9a7a4e1dae867c6fa Mon Sep 17 00:00:00 2001
From: Xu Yang
Date: Wed, 9 Aug 2023 10:44:31 +0800
Subject: [PATCH 5/9] FROMGIT: BACKPORT: usb: ehci: add workaround for
 chipidea PORTSC.PEC bug

Some NXP processors using the ChipIdea IP have a bug when frame babble
is detected. Per section 4.15.1.1.1 "Serial Bus Babble", a babble
condition also exists if an IN transaction is in progress at the
high-speed SOF2 point; this is called frame babble. The host
controller must disable the port on which frame babble is detected.

The USB controller does disable the port (PE is cleared) and does
assert USBERRINT when frame babble is detected, but PEC is not
asserted. Therefore, the software isn't aware that the port has been
disabled: it keeps sending packets to this port, and all of the
transfers fail.

This workaround first asserts PCD in software when USBERRINT is
detected, and then determines whether a port change has really
occurred by polling the root hub status. Because PEC doesn't get
asserted in our case, this patch also asserts it in software when the
specific conditions are satisfied.
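Those "specific conditions" boil down to a small predicate; a
simplified userspace model (the bit positions follow the usual EHCI
register layout but are illustrative here, not taken from the driver
headers):

  #include <stdbool.h>
  #include <stdint.h>

  #define PORT_PE		(1u << 2)	/* PORTSC: port enabled */
  #define PORT_PEC	(1u << 3)	/* PORTSC: port enable/disable change */
  #define CMD_PSE		(1u << 4)	/* USBCMD: periodic schedule enable */

  /*
   * The port was silently disabled: the schedule is still running and
   * PE dropped to 0 without PEC being set to flag the change.
   */
  static bool has_ci_pec_bug(bool has_fsl_port_bug, uint32_t command,
  			   uint32_t portsc)
  {
  	return has_fsl_port_bug && (command & CMD_PSE) &&
  	       !(portsc & PORT_PEC) && !(portsc & PORT_PE);
  }

  int main(void)
  {
  	uint32_t command = CMD_PSE;
  	uint32_t portsc = 0;	/* PE and PEC both clear */

  	return has_ci_pec_bug(true, command, portsc) ? 0 : 1;
  }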
Bug: 295046582
Bug: 296794033
Signed-off-by: Xu Yang
Acked-by: Peter Chen
Link: https://lore.kernel.org/r/20230809024432.535160-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman
(cherry picked from commit dda4b60ed70bd670eefda081f70c0cb20bbeb1fa
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-next)
[JD: replaced has_ci_pec_bug with existing has_fsl_port_bug to avoid abi breakage]
Change-Id: I7d36cf656efda2dd46c0ddcca252b3de6ea434ee
Signed-off-by: Jindong Yue
---
 drivers/usb/host/ehci-hcd.c |  8 ++++++--
 drivers/usb/host/ehci-hub.c | 10 +++++++++-
 drivers/usb/host/ehci.h     |  9 +++++++++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index a1930db0da1c..68674b19f15d 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -755,10 +755,14 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
 
 	/* normal [4.15.1.2] or error [4.15.1.1] completion */
 	if (likely ((status & (STS_INT|STS_ERR)) != 0)) {
-		if (likely ((status & STS_ERR) == 0))
+		if (likely ((status & STS_ERR) == 0)) {
 			INCR(ehci->stats.normal);
-		else
+		} else {
+			/* Force to check port status */
+			if (ehci->has_fsl_port_bug)
+				status |= STS_PCD;
 			INCR(ehci->stats.error);
+		}
 		bh = 1;
 	}
 
diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
index efe30e3be22f..1aee392e8492 100644
--- a/drivers/usb/host/ehci-hub.c
+++ b/drivers/usb/host/ehci-hub.c
@@ -674,7 +674,8 @@ ehci_hub_status_data (struct usb_hcd *hcd, char *buf)
 
 		if ((temp & mask) != 0 || test_bit(i, &ehci->port_c_suspend)
 				|| (ehci->reset_done[i] && time_after_eq(
-					jiffies, ehci->reset_done[i]))) {
+					jiffies, ehci->reset_done[i]))
+				|| ehci_has_ci_pec_bug(ehci, temp)) {
 			if (i < 7)
 				buf [0] |= 1 << (i + 1);
 			else
@@ -875,6 +876,13 @@ int ehci_hub_control(
 			if (temp & PORT_PEC)
 				status |= USB_PORT_STAT_C_ENABLE << 16;
 
+			if (ehci_has_ci_pec_bug(ehci, temp)) {
+				status |= USB_PORT_STAT_C_ENABLE << 16;
+				ehci_info(ehci,
+					"PE is cleared by HW port:%d PORTSC:%08x\n",
+					wIndex + 1, temp);
+			}
+
 			if ((temp & PORT_OCC) && (!ignore_oc && !ehci->spurious_oc)){
 				status |= USB_PORT_STAT_C_OVERCURRENT << 16;
 
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index ad3f13a3eaf1..4ee0d34323cf 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -707,6 +707,15 @@ ehci_port_speed(struct ehci_hcd *ehci, unsigned int portsc)
  */
 #define ehci_has_fsl_susp_errata(e)	((e)->has_fsl_susp_errata)
 
+/*
+ * Some Freescale/NXP processors using ChipIdea IP have a bug in which
+ * disabling the port (PE is cleared) does not cause PEC to be asserted
+ * when frame babble is detected.
+ */
+#define ehci_has_ci_pec_bug(e, portsc) \
+	((e)->has_fsl_port_bug && ((e)->command & CMD_PSE) \
+		&& !(portsc & PORT_PEC) && !(portsc & PORT_PE))
+
 /*
  * While most USB host controllers implement their registers in
  * little-endian format, a minority (celleb companion chip) implement

From aa4fe52ab59e117d732f1d5d6bb6de312f0fd621 Mon Sep 17 00:00:00 2001
From: Xu Yang
Date: Wed, 9 Aug 2023 14:53:27 +0800
Subject: [PATCH 6/9] FROMGIT: usb: host: ehci-sched: try to turn on io
 watchdog as long as periodic_count > 0

If initially isoc_count = 0 and periodic_count > 0 while the io
watchdog is not started (e.g. it just timed out), then the io watchdog
may not run after isoc urbs are submitted and enable_periodic() is
called. The isoc urbs may then never complete if the controller has
already stopped the periodic schedule.

This change calls turn_on_io_watchdog() on every enable_periodic()
call to ensure the io watchdog functions properly.
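The control-flow change is small; a runnable model of it (made-up
struct, not the driver code) shows that the watchdog is now re-armed
even on the early "schedule already on" path:

  #include <stdio.h>

  struct hc {
  	int periodic_count;
  	int watchdog_on;
  };

  static void turn_on_io_watchdog(struct hc *hc)
  {
  	hc->watchdog_on = 1;	/* stand-in for re-arming the hrtimer */
  }

  static void enable_periodic(struct hc *hc)
  {
  	if (hc->periodic_count++)
  		goto out;	/* previously returned here, skipping the watchdog */

  	/* ... start the periodic schedule ... */
  out:
  	turn_on_io_watchdog(hc);	/* now reached on every call */
  }

  int main(void)
  {
  	struct hc hc = { .periodic_count = 1, .watchdog_on = 0 };

  	enable_periodic(&hc);
  	printf("watchdog_on = %d\n", hc.watchdog_on);	/* 1 even on early path */
  	return 0;
  }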
Bug: 295046582
Bug: 296794033
Signed-off-by: Xu Yang
Reviewed-by: Alan Stern
Link: https://lore.kernel.org/r/20230809065327.952368-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman
(cherry picked from commit c272dabf2d43c3523af1a40be3127e7a1f84540a
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-next)
Change-Id: I0f10ec8bcf0e14269b2a9693617dd83327c26a20
Signed-off-by: Jindong Yue
---
 drivers/usb/host/ehci-sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
index bd542b6fc46b..7e834587e7de 100644
--- a/drivers/usb/host/ehci-sched.c
+++ b/drivers/usb/host/ehci-sched.c
@@ -490,13 +490,14 @@ static int tt_no_collision(
 static void enable_periodic(struct ehci_hcd *ehci)
 {
 	if (ehci->periodic_count++)
-		return;
+		goto out;
 
 	/* Stop waiting to turn off the periodic schedule */
 	ehci->enabled_hrtimer_events &= ~BIT(EHCI_HRTIMER_DISABLE_PERIODIC);
 
 	/* Don't start the schedule until PSS is 0 */
 	ehci_poll_PSS(ehci);
+out:
 	turn_on_io_watchdog(ehci);
 }

From b7b64b06a9b564fa4139e81188e04ec62473f9f2 Mon Sep 17 00:00:00 2001
From: Nhat Pham
Date: Mon, 28 Nov 2022 11:16:12 -0800
Subject: [PATCH 7/9] UPSTREAM: zsmalloc: consolidate zs_pool's migrate_lock
 and size_class's locks

Currently, zsmalloc has a hierarchy of locks, which includes a
pool-level migrate_lock and a lock for each size class. We have to
obtain both locks in the hotpath in most cases anyway, except for
zs_malloc. This exception will no longer exist when we introduce an
LRU into the zs_pool for the new writeback functionality - we will
need to obtain a pool-level lock to synchronize LRU handling even in
zs_malloc.

In preparation for zsmalloc writeback, consolidate these locks into a
single pool-level lock, which drastically reduces the complexity of
synchronization in zsmalloc. A sketch of the before/after locking
shape follows this message.

We have also benchmarked the lock consolidation to see the performance
effect of this change on zram.

First, we ran a synthetic FS workload on a server machine with 36
cores (same machine for all runs), using

    fs_mark -d ../zram1mnt -s 100000 -n 2500 -t 32 -k

before and after for btrfs and ext4 on zram (FS usage is 80%).

Here is the result (unit is file/second):

With lock consolidation (btrfs):
Average: 13520.2, Median: 13531.0, Stddev: 137.5961482019028

Without lock consolidation (btrfs):
Average: 13487.2, Median: 13575.0, Stddev: 309.08283679298665

With lock consolidation (ext4):
Average: 16824.4, Median: 16839.0, Stddev: 89.97388510006668

Without lock consolidation (ext4):
Average: 16958.0, Median: 16986.0, Stddev: 194.7370021336469

As you can see, we observe a 0.3% regression for btrfs, and a 0.9%
regression for ext4. This is a small, barely measurable difference in
my opinion.

For a more realistic scenario, we also tried building the kernel on
zram. Here is the time it takes (in seconds):

With lock consolidation (btrfs):
real
Average: 319.6, Median: 320.0, Stddev: 0.8944271909999159
user
Average: 6894.2, Median: 6895.0, Stddev: 25.528415540334656
sys
Average: 521.4, Median: 522.0, Stddev: 1.51657508881031

Without lock consolidation (btrfs):
real
Average: 319.8, Median: 320.0, Stddev: 0.8366600265340756
user
Average: 6896.6, Median: 6899.0, Stddev: 16.04057355583023
sys
Average: 520.6, Median: 521.0, Stddev: 1.140175425099138

With lock consolidation (ext4):
real
Average: 320.0, Median: 319.0, Stddev: 1.4142135623730951
user
Average: 6896.8, Median: 6878.0, Stddev: 28.621670111997307
sys
Average: 521.2, Median: 521.0, Stddev: 1.7888543819998317

Without lock consolidation (ext4):
real
Average: 319.6, Median: 319.0, Stddev: 0.8944271909999159
user
Average: 6886.2, Median: 6887.0, Stddev: 16.93221781102523
sys
Average: 520.4, Median: 520.0, Stddev: 1.140175425099138

The difference is entirely within the noise of a typical run on zram.
This hardly justifies the complexity of maintaining both the pool lock
and the class lock. In fact, for writeback, we would need to introduce
yet another lock to prevent data races on the pool's LRU, further
complicating the lock handling logic. IMHO, it is just better to
collapse all of these into a single pool-level lock.
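As a rough userspace illustration of what the consolidation does to a
hot path such as zs_free() (pthread stand-ins, not the zsmalloc code):

  #include <pthread.h>

  /* Before: hot paths nested a pool-level rwlock and a per-class lock. */
  struct pool_before {
  	pthread_rwlock_t migrate_lock;	/* pool-level */
  	pthread_mutex_t class_lock;	/* one per size class, in reality */
  };

  static void free_obj_before(struct pool_before *p)
  {
  	pthread_rwlock_rdlock(&p->migrate_lock);
  	/* ... resolve handle to zspage while migration is held off ... */
  	pthread_mutex_lock(&p->class_lock);
  	pthread_rwlock_unlock(&p->migrate_lock);
  	/* ... free the object under the class lock ... */
  	pthread_mutex_unlock(&p->class_lock);
  }

  /* After: one pool-level lock covers both roles. */
  struct pool_after {
  	pthread_mutex_t lock;
  };

  static void free_obj_after(struct pool_after *p)
  {
  	pthread_mutex_lock(&p->lock);
  	/* ... resolve handle and free the object ... */
  	pthread_mutex_unlock(&p->lock);
  }

  int main(void)
  {
  	struct pool_before b = {
  		.migrate_lock = PTHREAD_RWLOCK_INITIALIZER,
  		.class_lock = PTHREAD_MUTEX_INITIALIZER,
  	};
  	struct pool_after a = { .lock = PTHREAD_MUTEX_INITIALIZER };

  	free_obj_before(&b);
  	free_obj_after(&a);
  	return 0;
  }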
Link: https://lkml.kernel.org/r/20221128191616.1261026-4-nphamcs@gmail.com
Change-Id: Ib0eb09d7a69190fc4ffea8f819423c7f66d83379
Signed-off-by: Nhat Pham
Suggested-by: Johannes Weiner
Acked-by: Minchan Kim
Acked-by: Johannes Weiner
Reviewed-by: Sergey Senozhatsky
Cc: Dan Streetman
Cc: Nitin Gupta
Cc: Seth Jennings
Cc: Vitaly Wool
Signed-off-by: Andrew Morton
(cherry picked from commit c0547d0b6a4b637db05406b90ba82e1b2e71de56)
Bug: 297093100
Bug: 297936826
Signed-off-by: Kalesh Singh
---
 mm/zsmalloc.c | 87 ++++++++++++++++++++++-----------------------
 1 file changed, 37 insertions(+), 50 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 649376f17bd9..33d8357fdbf2 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -33,8 +33,7 @@
 /*
  * lock ordering:
  *	page_lock
- *	pool->migrate_lock
- *	class->lock
+ *	pool->lock
  *	zspage->lock
  */
 
@@ -192,7 +191,6 @@ static const int fullness_threshold_frac = 4;
 static size_t huge_class_size;
 
 struct size_class {
-	spinlock_t lock;
 	struct list_head fullness_list[NR_ZS_FULLNESS];
 	/*
	 * Size of objects stored in this class. Must be multiple
@@ -247,8 +245,7 @@ struct zs_pool {
 #ifdef CONFIG_COMPACTION
 	struct work_struct free_work;
 #endif
-	/* protect page/zspage migration */
-	rwlock_t migrate_lock;
+	spinlock_t lock;
 };
 
 struct zspage {
@@ -355,7 +352,7 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 	kmem_cache_free(pool->zspage_cachep, zspage);
 }
 
-/* class->lock(which owns the handle) synchronizes races */
+/* pool->lock(which owns the handle) synchronizes races */
 static void record_obj(unsigned long handle, unsigned long obj)
 {
 	*(unsigned long *)handle = obj;
@@ -452,7 +449,7 @@ static __maybe_unused int is_first_page(struct page *page)
 	return PagePrivate(page);
 }
 
-/* Protected by class->lock */
+/* Protected by pool->lock */
 static inline int get_zspage_inuse(struct zspage *zspage)
 {
 	return zspage->inuse;
@@ -597,13 +594,13 @@ static int zs_stats_size_show(struct seq_file *s, void *v)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		class_almost_full = zs_stat_get(class, CLASS_ALMOST_FULL);
 		class_almost_empty = zs_stat_get(class, CLASS_ALMOST_EMPTY);
 		obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
 		obj_used = zs_stat_get(class, OBJ_USED);
 		freeable = zs_can_compact(class);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		objs_per_zspage = class->objs_per_zspage;
 		pages_used = obj_allocated / objs_per_zspage *
@@ -916,7 +913,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 
 	get_zspage_mapping(zspage, &class_idx, &fg);
 
-	assert_spin_locked(&class->lock);
+	assert_spin_locked(&pool->lock);
 
 	VM_BUG_ON(get_zspage_inuse(zspage));
 	VM_BUG_ON(fg != ZS_EMPTY);
@@ -1247,19 +1244,19 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 	BUG_ON(in_interrupt());
 
 	/* It guarantees it can get zspage from handle safely */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_location(obj, &page, &obj_idx);
 	zspage = get_zspage(page);
 
 	/*
-	 * migration cannot move any zpages in this zspage. Here, class->lock
+	 * migration cannot move any zpages in this zspage. Here, pool->lock
 	 * is too heavy since callers would take some time until they calls
 	 * zs_unmap_object API so delegate the locking from class to zspage
	 * which is smaller granularity.
	 */
 	migrate_read_lock(zspage);
-	read_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	class = zspage_class(pool, zspage);
 	off = (class->size * obj_idx) & ~PAGE_MASK;
@@ -1412,8 +1409,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];
 
-	/* class->lock effectively protects the zpage migration */
-	spin_lock(&class->lock);
+	/* pool->lock effectively protects the zpage migration */
+	spin_lock(&pool->lock);
 	zspage = find_get_zspage(class);
 	if (likely(zspage)) {
 		obj = obj_malloc(pool, zspage, handle);
@@ -1421,12 +1418,12 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, OBJ_USED, 1);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		return handle;
 	}
 
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	zspage = alloc_zspage(pool, class, gfp);
 	if (!zspage) {
@@ -1434,7 +1431,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		return (unsigned long)ERR_PTR(-ENOMEM);
 	}
 
-	spin_lock(&class->lock);
+	spin_lock(&pool->lock);
 	obj = obj_malloc(pool, zspage, handle);
 	newfg = get_fullness_group(class, zspage);
 	insert_zspage(class, zspage, newfg);
@@ -1447,7 +1444,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	return handle;
 }
@@ -1491,16 +1488,14 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 		return;
 
 	/*
-	 * The pool->migrate_lock protects the race with zpage's migration
+	 * The pool->lock protects the race with zpage's migration
 	 * so it's safe to get the page from handle.
 	 */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_page(obj, &f_page);
 	zspage = get_zspage(f_page);
 	class = zspage_class(pool, zspage);
-	spin_lock(&class->lock);
-	read_unlock(&pool->migrate_lock);
 
 	obj_free(class->size, obj);
 	class_stat_dec(class, OBJ_USED, 1);
@@ -1510,7 +1505,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	free_zspage(pool, class, zspage);
 
 out:
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	cache_free_handle(pool, handle);
 }
 EXPORT_SYMBOL_GPL(zs_free);
@@ -1867,16 +1862,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	pool = zspage->pool;
 
 	/*
-	 * The pool migrate_lock protects the race between zpage migration
+	 * The pool's lock protects the race between zpage migration
 	 * and zs_free.
 	 */
-	write_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	class = zspage_class(pool, zspage);
 
-	/*
-	 * the class lock protects zpage alloc/free in the zspage.
-	 */
-	spin_lock(&class->lock);
 	/* the migrate_write_lock protects zpage access via zs_map_object */
 	migrate_write_lock(zspage);
 
@@ -1906,10 +1897,9 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	replace_sub_page(class, zspage, newpage, page);
 	/*
 	 * Since we complete the data copy and set up new zspage structure,
-	 * it's okay to release migration_lock.
+	 * it's okay to release the pool's lock.
 	 */
-	write_unlock(&pool->migrate_lock);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	dec_zspage_isolation(zspage);
 	migrate_write_unlock(zspage);
 
@@ -1964,9 +1954,9 @@ static void async_free_zspage(struct work_struct *work)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		list_splice_init(&class->fullness_list[ZS_EMPTY], &free_pages);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 
 	list_for_each_entry_safe(zspage, tmp, &free_pages, list) {
@@ -1976,9 +1966,9 @@ static void async_free_zspage(struct work_struct *work)
 		get_zspage_mapping(zspage, &class_idx, &fullness);
 		VM_BUG_ON(fullness != ZS_EMPTY);
 		class = pool->size_class[class_idx];
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		__free_zspage(pool, class, zspage);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 };
 
@@ -2039,10 +2029,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 	struct zspage *dst_zspage = NULL;
 	unsigned long pages_freed = 0;
 
-	/* protect the race between zpage migration and zs_free */
-	write_lock(&pool->migrate_lock);
-	/* protect zpage allocation/free */
-	spin_lock(&class->lock);
+	/*
+	 * protect the race between zpage migration and zs_free
+	 * as well as zpage allocation/free
+	 */
+	spin_lock(&pool->lock);
 	while ((src_zspage = isolate_zspage(class, true))) {
 		/* protect someone accessing the zspage(i.e., zs_map_object) */
 		migrate_write_lock(src_zspage);
@@ -2067,7 +2058,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			putback_zspage(class, dst_zspage);
 			migrate_write_unlock(dst_zspage);
 			dst_zspage = NULL;
-			if (rwlock_is_contended(&pool->migrate_lock))
+			if (spin_is_contended(&pool->lock))
 				break;
 		}
 
@@ -2084,11 +2075,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			pages_freed += class->pages_per_zspage;
 		} else
 			migrate_write_unlock(src_zspage);
-		spin_unlock(&class->lock);
-		write_unlock(&pool->migrate_lock);
+		spin_unlock(&pool->lock);
 		cond_resched();
-		write_lock(&pool->migrate_lock);
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 	}
 
 	if (src_zspage) {
@@ -2096,8 +2085,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		migrate_write_unlock(src_zspage);
 	}
 
-	spin_unlock(&class->lock);
-	write_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	return pages_freed;
 }
@@ -2200,7 +2188,7 @@ struct zs_pool *zs_create_pool(const char *name)
 		return NULL;
 
 	init_deferred_free(pool);
-	rwlock_init(&pool->migrate_lock);
+	spin_lock_init(&pool->lock);
 
 	pool->name = kstrdup(name, GFP_KERNEL);
 	if (!pool->name)
@@ -2271,7 +2259,6 @@ struct zs_pool *zs_create_pool(const char *name)
 		class->index = i;
 		class->pages_per_zspage = pages_per_zspage;
 		class->objs_per_zspage = objs_per_zspage;
-		spin_lock_init(&class->lock);
 		pool->size_class[i] = class;
 
 		for (fullness = ZS_EMPTY; fullness < NR_ZS_FULLNESS; fullness++)

From 725fdf9ba3466bf6b4164155529290d78ba0d58b Mon Sep 17 00:00:00 2001
From: Andrew Yang
Date: Fri, 21 Jul 2023 14:37:01 +0800
Subject: [PATCH 8/9] BACKPORT: zsmalloc: fix races between modifications of
 fullness and isolated

We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated ==
0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object()
lately. This issue only occurs when migration and reclamation occur at
the same time.

With our memory stress test, we can reproduce this issue several times
a day. We have no idea why no one else has encountered it; we switched
to the kernel version containing this defect a few months ago.

Since fullness and isolated share the same unsigned int, modifications
of them should be protected by the same lock.
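A small userspace sketch of why that matters (bit widths are
illustrative, loosely modeled on struct zspage):

  #include <pthread.h>
  #include <stdio.h>

  /* fullness and isolated are bitfields packed into one unsigned int,
   * so updating either one is a read-modify-write of the whole word.
   * Two threads updating different fields without a common lock can
   * lose one of the updates. */
  struct zspage_flags {
  	unsigned int fullness : 2;
  	unsigned int isolated : 3;
  };

  static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
  static struct zspage_flags zspage;

  static void set_fullness(unsigned int f)
  {
  	pthread_mutex_lock(&pool_lock);	/* same lock as isolation updates */
  	zspage.fullness = f;
  	pthread_mutex_unlock(&pool_lock);
  }

  static void inc_isolation(void)
  {
  	pthread_mutex_lock(&pool_lock);
  	zspage.isolated++;
  	pthread_mutex_unlock(&pool_lock);
  }

  int main(void)
  {
  	set_fullness(1);
  	inc_isolation();
  	printf("fullness=%u isolated=%u\n", zspage.fullness, zspage.isolated);
  	return 0;
  }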
Since fullness and isolated share the same unsigned int, modifications of them should be protected by the same lock. [andrew.yang@mediatek.com: move comment] Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration") Change-Id: I4aeda0715d65f828bb88ad6fbf36b9927c7a5c4b Signed-off-by: Andrew Yang Reviewed-by: Sergey Senozhatsky Cc: AngeloGioacchino Del Regno Cc: Matthias Brugger Cc: Minchan Kim Cc: Sebastian Andrzej Siewior Cc: Signed-off-by: Andrew Morton (cherry picked from commit 4b5d1e47b69426c0f7491d97d73ad0152d02d437) Bug: 297093100 Bug: 297936826 [ Kalesh Singh - Fix trivial conflicts in zs_page_putback()] Signed-off-by: Kalesh Singh --- mm/zsmalloc.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 33d8357fdbf2..9fb906d56e0b 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -1816,6 +1816,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage, static bool zs_page_isolate(struct page *page, isolate_mode_t mode) { + struct zs_pool *pool; struct zspage *zspage; /* @@ -1826,9 +1827,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode) VM_BUG_ON_PAGE(PageIsolated(page), page); zspage = get_zspage(page); - migrate_write_lock(zspage); + pool = zspage->pool; + spin_lock(&pool->lock); inc_zspage_isolation(zspage); - migrate_write_unlock(zspage); + spin_unlock(&pool->lock); return true; } @@ -1895,12 +1897,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page, kunmap_atomic(s_addr); replace_sub_page(class, zspage, newpage, page); + dec_zspage_isolation(zspage); /* * Since we complete the data copy and set up new zspage structure, * it's okay to release the pool's lock. */ spin_unlock(&pool->lock); - dec_zspage_isolation(zspage); migrate_write_unlock(zspage); get_page(newpage); @@ -1918,14 +1920,16 @@ static int zs_page_migrate(struct page *newpage, struct page *page, static void zs_page_putback(struct page *page) { struct zspage *zspage; + struct zs_pool *pool; VM_BUG_ON_PAGE(!PageMovable(page), page); VM_BUG_ON_PAGE(!PageIsolated(page), page); zspage = get_zspage(page); - migrate_write_lock(zspage); + pool = zspage->pool; + spin_lock(&pool->lock); dec_zspage_isolation(zspage); - migrate_write_unlock(zspage); + spin_unlock(&pool->lock); } static const struct movable_operations zsmalloc_mops = { From 71b43c3e005a15b83c564a835361e2b7aa56e086 Mon Sep 17 00:00:00 2001 From: Kalesh Singh Date: Fri, 25 Aug 2023 09:58:36 -0700 Subject: [PATCH 9/9] ANDROID: GKI: Update ABI for zsmalloc fixes zs_pool->lock was added upstream as a replacement for the size_class locks. The tooling over-cautiously reports this as a ABI breakage but both of these structs (zs_pool and size_class) are internal to zsmalloc.c. Update the ABI to allow these changes. 
Bug: 297093100
Bug: 297936826
Change-Id: Ib9fc5a036f75d89fb6bee4c146034f6c81759e04
Signed-off-by: Kalesh Singh
---
 android/abi_gki_aarch64.stg | 68 ++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/android/abi_gki_aarch64.stg b/android/abi_gki_aarch64.stg
index 8086b4e2dba3..8fcb46f3dacc 100644
--- a/android/abi_gki_aarch64.stg
+++ b/android/abi_gki_aarch64.stg
@@ -90370,10 +90370,9 @@ member {
   offset: 712
 }
 member {
-  id: 0xf667d80f
+  id: 0xf667dcee
   name: "fullness_list"
   type_id: 0xb8bf135c
-  offset: 64
 }
 member {
   id: 0xfeb50ea0
@@ -101850,12 +101849,6 @@ member {
   type_id: 0x4585663f
   offset: 320
 }
-member {
-  id: 0xad7c8a98
-  name: "index"
-  type_id: 0x4585663f
-  offset: 672
-}
 member {
   id: 0xad7c8ba4
   name: "index"
@@ -101868,6 +101861,12 @@ member {
   type_id: 0x4585663f
   offset: 480
 }
+member {
+  id: 0xad7c8d2b
+  name: "index"
+  type_id: 0x4585663f
+  offset: 608
+}
 member {
   id: 0xad7c8d72
   name: "index"
@@ -113342,6 +113341,12 @@ member {
   type_id: 0xf313e71a
   offset: 768
 }
+member {
+  id: 0x2d1fe43b
+  name: "lock"
+  type_id: 0xf313e71a
+  offset: 17536
+}
 member {
   id: 0x2d1fe44c
   name: "lock"
@@ -121005,12 +121010,6 @@ member {
   type_id: 0x2c8b0a9f
   offset: 768
 }
-member {
-  id: 0xdb33fcdf
-  name: "migrate_lock"
-  type_id: 0xf4933b90
-  offset: 17536
-}
 member {
   id: 0x8edaa968
   name: "migrate_page"
@@ -133786,10 +133785,10 @@ member {
   type_id: 0xad7c0a89
 }
 member {
-  id: 0x7a226550
+  id: 0x7a226b7d
   name: "objs_per_zspage"
   type_id: 0x6720d32f
-  offset: 608
+  offset: 544
 }
 member {
   id: 0x33953b25
@@ -138582,10 +138581,10 @@ member {
   bitsize: 1
 }
 member {
-  id: 0x338646f2
+  id: 0x338649f9
   name: "pages_per_zspage"
   type_id: 0x6720d32f
-  offset: 640
+  offset: 576
 }
 member {
   id: 0xf9521fd2
@@ -170127,18 +170126,18 @@ member {
   type_id: 0x6720d32f
   offset: 96
 }
-member {
-  id: 0xd91935d3
-  name: "size"
-  type_id: 0x6720d32f
-  offset: 576
-}
 member {
   id: 0xd9193607
   name: "size"
   type_id: 0x6720d32f
   offset: 896
 }
+member {
+  id: 0xd91937b9
+  name: "size"
+  type_id: 0x6720d32f
+  offset: 512
+}
 member {
   id: 0xd9193b66
   name: "size"
@@ -175840,10 +175839,10 @@ member {
   offset: 896
 }
 member {
-  id: 0xb91e0d04
+  id: 0xb91e0940
   name: "stats"
   type_id: 0x6b61371d
-  offset: 704
+  offset: 640
 }
 member {
   id: 0xb920e0d3
@@ -243356,14 +243355,13 @@ struct_union {
   kind: STRUCT
   name: "size_class"
   definition {
-    bytesize: 136
-    member_id: 0x2d1fec85
-    member_id: 0xf667d80f
-    member_id: 0xd91935d3
-    member_id: 0x7a226550
-    member_id: 0x338646f2
-    member_id: 0xad7c8a98
-    member_id: 0xb91e0d04
+    bytesize: 128
+    member_id: 0xf667dcee
+    member_id: 0xd91937b9
+    member_id: 0x7a226b7d
+    member_id: 0x338649f9
+    member_id: 0xad7c8d2b
+    member_id: 0xb91e0940
   }
 }
 struct_union {
@@ -259130,7 +259128,7 @@ struct_union {
     member_id: 0xb9089225
     member_id: 0x868caa9e
     member_id: 0x8a67a9e5
-    member_id: 0xdb33fcdf
+    member_id: 0x2d1fe43b
   }
 }
 struct_union {